Existing RNA-Seq data were retrieved from the following complete
runs available in the NCBI Sequence Read Archive (SRA):
- SRR19782039: Exposure to Valsartan & Carbamazepine
- SRR16771870: Exposure to a synthetic hormone, 17α-ethinylestradiol (EE2)
- SRR7725722: Diarrhetic Shellfish Poisoning (DSP) toxins associated with Harmful Algal Blooms (HABs)
- SRR13013756: Hypoxia
- Mytilus galloprovincialis Reference Genome
Copy the CDS file from another location on Raven:
cp /home/shared/8TB_HDD_01/sr320/ncbi/ncbi_dataset/data/GCA_900618805.1/cds_from_genomic.fna ../data/cds_from_genomic.fna
head ../data/cds_from_genomic.fna
## >lcl|UYJE01000001.1_cds_VDH88688.1_1 [locus_tag=MGAL_10B017214] [protein=Hypothetical predicted protein] [protein_id=VDH88688.1] [location=complement(join(55975..56224,59152..59365,60239..60337,61332..61522,64535..64608))] [gbkey=CDS]
## ATGAATAGAATTACTGATAGGGACTACGACTACTATGACTTTGAAGATGACAGTGACCACGAGCCTTGCGATAGTTCTGA
## TGATGATATCGAGGTTATTTTACATGGAACACCTGAACAGAAGCGTAAATTACAGACCAAAGTCCAACAAAGACATGATT
## CTTCAAGTGAAGATGACTTTGAAAAGGAAATGAATAATGAACTTAACAAACATATTAAAGGACTGGTAAATGAAAGATCA
## AGTAATGTTGCAGAAACTGTTCAAGGTAGTAGCAAAGCTCAAGACCAAGAGAAACCAACAGAACAACAACAATTTTATGA
## TGATATTTATTACGATTCAGAAGAAGAGGAAATGGTTTTACAAGGTGATGAACGTGTCAAAAGAAGACAACCTGTTCAAA
## GCAATGATGACTTATTGTACGATCCTGACCTAGACGAAGAAGACCAGCGATGGGTTGATGCTGAACGACAAGCTTATCAG
## CTGCCTGTACCCTCAGGATCCAAATCAAAACGTCAAAACAGTGATGCAGTTTTAAACTGTCCCGCTTGTATGACATTACT
## GTGTCTTGATTGTCAGGGGCATGATGTTTATGAAAACCAGTACAGAGCTATGTTTGTTAAGAACTGTCGTGTCGATACAT
## CAGAATTATTAAAACAGCCGTTACAGAAGAAAAAACGTAAAAAAAAACAGAAGACATTGGACACTACAAATAATGAAACA
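Before building an index from this file, it can be useful to confirm that the copy is complete by counting its records. The sketch below counts FASTA header lines (those beginning with `>`); the helper name `count_fasta_records` is hypothetical and not part of the original workflow.

```shell
# Count FASTA records by counting header lines (lines starting with ">").
# count_fasta_records is an illustrative helper, not part of the original notebook.
count_fasta_records() {
  grep -c '^>' "$1"
}

cds=../data/cds_from_genomic.fna
# Only report when the copied file is actually present.
if [ -f "$cds" ]; then
  echo "CDS records: $(count_fasta_records "$cds")"
fi
```

Comparing this count between the source and the copied file is a quick check that the `cp` step did not truncate anything.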
Create the kallisto index used to pseudoalign the short reads to the coding sequences from MGAL_10 (GCA_900618805.1):
/home/shared/kallisto/kallisto index \
-i ../data/MGAL_cds.index \
../data/cds_from_genomic.fna
/home/shared/kallisto/kallisto quant
## kallisto 0.46.1
## Computes equivalence classes for reads and quantifies abundances
##
## Usage: kallisto quant [arguments] FASTQ-files
##
## Required arguments:
## -i, --index=STRING Filename for the kallisto index to be used for
## quantification
## -o, --output-dir=STRING Directory to write output to
##
## Optional arguments:
## --bias Perform sequence based bias correction
## -b, --bootstrap-samples=INT Number of bootstrap samples (default: 0)
## --seed=INT Seed for the bootstrap sampling (default: 42)
## --plaintext Output plaintext instead of HDF5
## --fusion Search for fusions for Pizzly
## --single Quantify single-end reads
## --single-overhang Include reads where unobserved rest of fragment is
## predicted to lie outside a transcript
## --fr-stranded Strand specific reads, first read forward
## --rf-stranded Strand specific reads, first read reverse
## -l, --fragment-length=DOUBLE Estimated average fragment length
## -s, --sd=DOUBLE Estimated standard deviation of fragment length
## (default: -l, -s values are estimated from paired
## end data, but are required when using --single)
## -t, --threads=INT Number of threads to use (default: 1)
## --pseudobam Save pseudoalignments to transcriptome to BAM file
## --genomebam Project pseudoalignments to genome sorted BAM file
## -g, --gtf GTF file for transcriptome information
## (required for --genomebam)
## -c, --chromosomes Tab separated file with chromosome names and lengths
## (optional for --genomebam, but recommended)
ls /home/shared/8TB_HDD_02/cnmntgna/GitHub/chris-musselcon/output/ncbi/
## SRR13013756.fastq
## SRR13013756_fastqc.html
## SRR13013756_fastqc.zip
## SRR16771870_1.fastq
## SRR16771870_1_fastqc.html
## SRR16771870_1_fastqc.zip
## SRR16771870_2.fastq
## SRR16771870_2_fastqc.html
## SRR16771870_2_fastqc.zip
## SRR19782039.fastq
## SRR19782039_fastqc.html
## SRR19782039_fastqc.zip
## SRR7725722_1.fastq
## SRR7725722_1_fastqc.html
## SRR7725722_1_fastqc.zip
## SRR7725722_2.fastq
## SRR7725722_2_fastqc.html
## SRR7725722_2_fastqc.zip
#mkdir ../output
#mkdir ../output/kallisto_01
# The kallisto subcommand (quant) must come before its options.
find /home/shared/8TB_HDD_02/cnmntgna/GitHub/chris-musselcon/output/ncbi/*_1.fastq \
| xargs basename -s _1.fastq | xargs -I{} /home/shared/kallisto/kallisto \
quant \
-i ../data/MGAL_cds.index \
-o ../output/kallisto_01/{} \
-t 4 \
/home/shared/8TB_HDD_02/cnmntgna/GitHub/chris-musselcon/output/ncbi/{}_1.fastq \
/home/shared/8TB_HDD_02/cnmntgna/GitHub/chris-musselcon/output/ncbi/{}_2.fastq
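The paired-end command above selects only files ending in `_1.fastq`, so the two runs that appear as a single FASTQ file in the directory listing (SRR13013756 and SRR19782039) are not quantified by it. According to the `kallisto quant` help text, single-end reads need `--single` together with `-l` and `-s`. A minimal sketch of how those runs might be handled follows. The fragment length mean and standard deviation (200 and 20) are placeholder assumptions, not values from the original analysis, and the helper `is_single_end` is hypothetical.

```shell
# Paths taken from the commands above.
KALLISTO=/home/shared/kallisto/kallisto
FASTQ_DIR=/home/shared/8TB_HDD_02/cnmntgna/GitHub/chris-musselcon/output/ncbi

# ASSUMPTION: placeholder fragment-length mean and SD. Replace them with
# values appropriate to the library preparation of each run.
FRAG_LEN=200
FRAG_SD=20

# Succeeds only for .fastq files that are not one mate of a pair (_1 or _2).
is_single_end() {
  case "$1" in
    *_1.fastq|*_2.fastq) return 1 ;;
    *.fastq) return 0 ;;
    *) return 1 ;;
  esac
}

for fq in "$FASTQ_DIR"/*.fastq; do
  [ -e "$fq" ] || continue          # the glob matched nothing
  is_single_end "$fq" || continue   # skip paired-end mates
  sample=$(basename "$fq" .fastq)
  "$KALLISTO" quant \
    -i ../data/MGAL_cds.index \
    -o ../output/kallisto_01/"$sample" \
    -t 4 \
    --single -l "$FRAG_LEN" -s "$FRAG_SD" \
    "$fq"
done
```

Each sample is written to its own subdirectory of `../output/kallisto_01/`, matching the layout used for the paired-end runs.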