--- author: Sam White toc-title: Contents toc-depth: 5 toc-location: left date: 2018-09-11 22:04:24+00:00 layout: post slug: trimgalorefastqcmultiqc-c-virginica-oil-spill-mbdseq-concatenated-sequences title: TrimGalore/FastQC/MultiQC - C.virginica Oil Spill MBDseq Concatenated Sequences categories: - 2018 - LSU C.virginica Oil Spill MBD BS Sequencing tags: - Crassostrea virginica - Eastern oyster - FASTQC - jupyter notebook - multiqc - TrimGalore! - trimming --- [Previously concatenated and analyzed our _Crassostrea virginica_ oil spill MBDseq data with FastQC](https://robertslab.github.io/sams-notebook/posts/2018/2018-09-10-sequencing-data-analysis-c-virginica-oil-spill-mbdseq-concatenation-fastqc/). We decided to try improving things by [running them through TrimGalore! to remove adapters and poor quality sequences](https://github.com/RobertsLab/resources/issues/369). Processed the samples on Roadrunner (Apple Xserve; Ubuntu 16.04) using default TrimGalore! settings. After trimming, TrimGalore! output was summarized using MultiQC. Trimmed FastQ files were then analyzed with FastQC and followed up with MultiQC. Documented in Jupyter Notebook (see below). Jupyter Notebook (GitHub): * [20180911_roadrunner_virginica_trimgalore.ipynb](https://github.com/RobertsLab/code/blob/master/notebooks/sam/20180911_roadrunner_virginica_trimgalore.ipynb) * * * # RESULTS TrimGalore! output folder: * [20180911_virginica_oil_trimgalore_01/](https://owl.fish.washington.edu/Athaliana/20180911_virginica_oil_trimgalore_01/) TrimGalore! MultiQC report (HTML): * [20180911_virginica_oil_trimgalore_01/multiqc_report.html](https://owl.fish.washington.edu/Athaliana/20180911_virginica_oil_trimgalore_01/multiqc_report.html) TrimGalore! FastQC output folder: * [20180911_virginica_oil_trimgalore_01/20180911_virginica_oil_trimmed_fastqc/](https://owl.fish.washington.edu/Athaliana/20180911_virginica_oil_trimgalore_01/20180911_virginica_oil_trimmed_fastqc/) FastQC MultiQC report (HTML: * [20180911_virginica_oil_trimgalore_01/20180911_virginica_oil_trimmed_fastqc/multiqc_report.html](https://owl.fish.washington.edu/Athaliana/20180911_virginica_oil_trimgalore_01/20180911_virginica_oil_trimmed_fastqc/multiqc_report.html) Overall, things look a bit better, but there are still some issues. Will likely eliminate sample `2112_lane_1_TGACCA` from analysis and apply some additional sequence filtering, based on sequence length. * * * ###### SEQUENCE CONTENT PLOT ![](https://user-images.githubusercontent.com/4442039/45500147-d84d6a00-b732-11e8-816d-ff8bde235aba.png) * * * ###### SHORT SEQUENCE CONTAMINATION ![](https://user-images.githubusercontent.com/4514104/45501383-db962500-b735-11e8-895a-2caf8bd8cff9.png) * * * ![](https://user-images.githubusercontent.com/4514104/45501387-dd5fe880-b735-11e8-86a7-9b0efc162a73.png)