--- author: Sam White toc-title: Contents toc-depth: 5 toc-location: left date: 2018-04-10 20:40:00+00:00 layout: post slug: trimgalorefastqcmultiqc-14bp-trim-c-virginica-mbd-bs-seq-fastq-data title: TrimGalore/FastQC/MultiQC - 14bp Trim C.virginica MBD BS-seq FASTQ data categories: - 2018 - Miscellaneous tags: - Crassostrea virginica - Eastern oyster - FASTQC - MBD-Seq - multiqc - TrimGalore! - ZymoResearch --- [Yesterday, I ran TrimGalore/FastQC/MultiQC on the _Crassostrea virginica_ MBD BS-seq data from ZymoResearch](https://robertslab.github.io/sams-notebook/posts/2018/2018-04-09-trimgalorefastqcmultiqc-auto-trim-c-virginica-mbd-bs-seq-fastq-data/) with the default settings (i.e. "auto-trim"). There was still some variability in the first ~15bp of the reads and Steven wanted to see how a hard trim would change things. I ran [TrimGalore](https://github.com/FelixKrueger/TrimGalore) (using the built-in FastQC option), with a hard trim of the first 14bp of each read and followed up with MultiQC for a summary of the FastQC reports. TrimGalore job script: * [20180410_trimgalore_trim14bp_Cvirginica_MDB.sh](https://owl.fish.washington.edu/Athaliana/20180410_trimgalore_trim14bp_Cvirginica_MBD/20180410_trimgalore_trim14bp_Cvirginica_MDB.sh) Standard error was redirected on the command line to this file: * [20180410_trimgalore_trim14bp_Cvirginica_MBD/stderr.log](https://owl.fish.washington.edu/Athaliana/20180410_trimgalore_trim14bp_Cvirginica_MBD/stderr.log) MD5 checksums were generated on the resulting trimmed FASTQ files: * [20180410_trimgalore_trim14bp_Cvirginica_MBD/checksums.md5](https://owl.fish.washington.edu/Athaliana/20180410_trimgalore_trim14bp_Cvirginica_MBD/checksums.md5) All data was copied to my folder on Owl. Checksums for FASTQ files were verified post-data transfer (data not shown). ##### Results: Output folder: * [20180410_trimgalore_trim14bp_Cvirginica_MBD/](https://owl.fish.washington.edu/Athaliana/20180410_trimgalore_trim14bp_Cvirginica_MBD/) FastQC output folder: * [20180410_trimgalore_trim14bp_Cvirginica_MBD/20180410_fastqc_trimgalore_trim14bp_Cvirginica_MBD/](https://owl.fish.washington.edu/Athaliana/20180410_trimgalore_trim14bp_Cvirginica_MBD/20180410_fastqc_trimgalore_trim14bp_Cvirginica_MBD/) MultiQC output folder: * [20180410_trimgalore_trim14bp_Cvirginica_MBD/20180410_fastqc_trimgalore_trim14bp_Cvirginica_MBD/multiqc_data/](https://owl.fish.washington.edu/Athaliana/20180410_trimgalore_trim14bp_Cvirginica_MBD/20180410_fastqc_trimgalore_trim14bp_Cvirginica_MBD/multiqc_data/) MultiQC HTML report: * [20180410_trimgalore_trim14bp_Cvirginica_MBD/20180410_fastqc_trimgalore_trim14bp_Cvirginica_MBD/multiqc_data/multiqc_report.html](https://owl.fish.washington.edu/Athaliana/20180410_trimgalore_trim14bp_Cvirginica_MBD/20180410_fastqc_trimgalore_trim14bp_Cvirginica_MBD/multiqc_data/multiqc_report.html) OK, this trimming definitely took care of the variability seen in the first ~15bp of all the reads. However, I noticed that the last 2bp of each of the Read 1 seqs all have some wonky stuff going on. I'm guessing I should probably trim that stuff off, too...