---
author: Sam White
toc-title: Contents
toc-depth: 5
toc-location: left
layout: post
title: FastQC-MultiQC - C.bairdi RNAseq Day 12 26 Infected Uninfected
date: '2019-10-24 16:24'
tags:
  - tanner crab
  - RNAseq
  - fastqc
  - multiqc
  - Chionoecetes bairdi
categories:
  - 2019
  - Tanner Crab RNAseq
---
After [receiving the rest of the crab data and concatenating it all together](https://robertslab.github.io/sams-notebook/posts/2019/2019-10-24-Data-Received---C.bairdi-RNAseq-Day9-12-26-Infected-Uninfected/), I ran FastQC and MultiQC on the FastQ files.


---

# RESULTS

Output folder:

- [20191024_cbai_fastqc_multiqc](https://gannet.fish.washington.edu/Atumefaciens/20191024_cbai_fastqc_multiqc)

MultiQC Report (HTML):

- [20191024_cbai_fastqc_multiqc/multiqc_report.html](https://gannet.fish.washington.edu/Atumefaciens/20191024_cbai_fastqc_multiqc/multiqc_report.html)

So, that's done. However, I've noticed that one of the samples (sample ID 329775) only has ~42M reads (circled in red below):

![multiqc read count screencap with sample ID 329775 circled in red](https://github.com/RobertsLab/sams-notebook/blob/master/images/screencaps/20191024_cbai_fastqc_multiqc_read-counts.png?raw=true)

This read count is ~16% lower than what we were quoted. Unfortunately, the quote was for "~50M reads per sample", and that "~" leaves a fair amount of ambiguity. The rest of the samples hover around 5 - 6% below the 50M read mark. Is that acceptable? I don't know. In hindsight, I should've clarified what they meant on the quote. Of course, the other option is that this facility ([Northwest Genomics Center](https://nwgc.gs.washington.edu/)) gets their act together and writes quotes that aren't ambiguous (e.g. promise >= 50M reads; then there's absolutely no confusion about what the customer is supposed to receive)!!

#### UPDATE

After emailing the sequencing facility, it turns out the read count "issue" is a difference in terminology. They're reporting the absolute number of reads generated (I agree with this, btw). So, we have most certainly received >50M reads for each sample. The confusion was related to the way other facilities refer to read counts. Most other facilities will count a read pair (e.g. R1 and R2) as a single read.
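
To make the two counting conventions concrete, here's a quick back-of-the-envelope sketch in Python. It assumes the ~42M that MultiQC shows for sample 329775 is a per-file count (i.e. the same number of sequences in R1 and in R2), which is my reading of the report and not something the facility confirmed. The function and variable names are just for illustration.

```python
QUOTED_READS = 50_000_000  # "~50M reads per sample" from the quote


def shortfall_pct(observed, quoted=QUOTED_READS):
    """Percent by which `observed` falls below `quoted` (negative if above)."""
    return (quoted - observed) / quoted * 100


# Assumption: ~42M is the per-file sequence count for sample 329775,
# so it is also the number of read pairs.
read_pairs = 42_000_000

# Convention used by most other facilities: one R1/R2 pair = one read.
print(f"Pairs counted once: {read_pairs:,} "
      f"({shortfall_pct(read_pairs):.0f}% below the quote)")

# Convention used by this facility: every read counts (R1 and R2 separately).
individual_reads = 2 * read_pairs
print(f"Every read counted: {individual_reads:,} "
      f"(above the quote: {individual_reads >= QUOTED_READS})")
```

Under the first convention the sample looks short; under the second it comfortably clears the quoted 50M.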

Regardless, I still think they need to do away with ambiguity when quoting projects. :)