---
title: "kallisto_pgenerosa_pipeline"
author: "Olivia"
date: "2/1/2022"
output: html_document
---

This is for juvenile *P. generosa*, from `Trueseq-stranded-mRNA-libraries-GeoRNA7-G1-NR006_S4_L001_R1_001.fastq.gz`.

```{r, eval=FALSE}
# set the GitHub personal access token for this session (run interactively)
library(credentials)
set_github_pat()
```

In this little workflow, we use a relatively new technique, pseudoalignment and quantification, to process RNA-seq data from two samples. The technical steps are:

1. Use the SRA SDK to download FASTQ files for each sample
2. Build a transcriptome index for Kallisto
3. Pseudoalignment and quantification with Kallisto
4. Read Kallisto output into a SummarizedExperiment object

## Roberts lab

Take our RNA-seq data, map it onto the genome, and describe the gene repertoire expressed at different stages. Do everything on Raven.

1. Identify all relevant datasets
2. QC the data
3. Map to the genome to get expression / count data
4. Functional annotation of genes

```{bash}
# download the geoduck transcriptome from the genomic databank; do this only once
curl --insecure \
  -O https://owl.fish.washington.edu/halfshell/genomic-databank/Pgenerosa_transcriptome_v5.fasta
```

```{bash}
pwd # find where you are in your directory, then move the data to a better location
ls
#mv ../Pgenerosa_transcriptome_v5.fasta ../data/Pgenerosa_transcriptome_v5.fasta
# the original mv had no destination; data/ is assumed here because the
# index step reads data/Pgenerosa_transcriptome_v5.fasta
mv Pgenerosa_transcriptome_v5.fasta data/
```

```{bash}
# download Read 1 and Read 2 listed in the nightingales spreadsheet
cd ~/gitrepos/kallisto/data
curl --insecure \
  -O http://owl.fish.washington.edu/nightingales/P_generosa/Trueseq-stranded-mRNA-libraries-GeoRNA1-A1-NR006_S1_L001_R1_001.fastq.gz
curl --insecure \
  -O http://owl.fish.washington.edu/nightingales/P_generosa/Trueseq-stranded-mRNA-libraries-GeoRNA1-A1-NR006_S1_L001_R2_001.fastq.gz
```

```{bash}
# build the kallisto index from the transcriptome
/home/shared/kallisto/kallisto \
  index -i data/transcriptome_v5.idx data/Pgenerosa_transcriptome_v5.fasta
# press "Open" when the macOS security message pops up
```

```{bash}
/home/shared/kallisto/kallisto quant \
  -i \
  /home/olivia/gitrepos/kallisto/data/transcriptome_v5.idx \
  -o /home/olivia/gitrepos/kallisto/analyses/gonad-Trueseq-stranded-mRNA-libraries-NR006_1 \
  -b 100 \
  /home/olivia/gitrepos/kallisto/data/Trueseq-stranded-mRNA-libraries-GeoRNA1-A1-NR006_S1_L001_R1_001.fastq.gz \
  /home/olivia/gitrepos/kallisto/data/Trueseq-stranded-mRNA-libraries-GeoRNA1-A1-NR006_S1_L001_R2_001.fastq.gz
```

```{bash}
# push the whole output folder to gannet
rsync -avP ~/gitrepos/kallisto/analyses/gonad-Trueseq-stranded-mRNA-libraries-NR006_1 ocattau@gannet.fish.washington.edu:/volume2/web/gigas/analyses/
```
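
A quick sanity check on the quantification output may be useful before or after pushing it to gannet. The sketch below is not part of the original pipeline. It assumes the standard `abundance.tsv` layout that kallisto writes into the `-o` directory (columns `target_id`, `length`, `eff_length`, `est_counts`, `tpm`). The helper name `summarize_abundance` and the three-row toy table are hypothetical, made up only so the block runs without kallisto installed.

```bash
#!/usr/bin/env bash
# Sketch: summarize a kallisto abundance.tsv.
# Assumes the standard header: target_id, length, eff_length, est_counts, tpm.
# summarize_abundance is a hypothetical helper, not a kallisto command.
summarize_abundance() {
  local tsv="$1"
  awk -F'\t' '
    NR == 1 { next }            # skip the header row
    { n++; tpm_sum += $5 }      # count transcripts, add up TPM
    $4 > 0 { expressed++ }      # transcripts with a nonzero estimated count
    END {
      printf "transcripts=%d expressed=%d tpm_sum=%.1f\n", n, expressed, tpm_sum
    }
  ' "$tsv"
}

# Toy table (made-up values) standing in for real kallisto output.
demo_dir="$(mktemp -d)"
{
  printf 'target_id\tlength\teff_length\test_counts\ttpm\n'
  printf 'tx1\t650\t500\t30\t600000\n'
  printf 'tx2\t500\t350\t0\t0\n'
  printf 'tx3\t1150\t1000\t40\t400000\n'
} > "$demo_dir/abundance.tsv"

summarize_abundance "$demo_dir/abundance.tsv"
# prints: transcripts=3 expressed=2 tpm_sum=1000000.0
```

On a real run, point the function at the `abundance.tsv` inside the quant output directory instead of the toy file. A TPM sum far from one million, or an unexpectedly small number of expressed transcripts, would be a reason to recheck the index and the input reads.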