---
title: "10-FastQC-pre-trim"
output: html_document
date: "2025-02-11"
---
Grace Crandall
Rmd to run FastQC on pre-trimmed RNAseq data from coelomocytes from _P. helianthoides_, _P. ochraceus_, and _D. imbricata_ sampled Summer 2023 for the Multi-species experiment.
All raw files are found on OWL: [http://owl.fish.washington.edu/nightingales/P_helianthoides/](http://owl.fish.washington.edu/nightingales/P_helianthoides/)
The files are:
| sample_ID_PSC | star_ID | species | experiment | treatment | experiment_day | bin_number | sample_date |
|---------------|----------|--------------------------|--------------|-----------|----------------|------------|-------------|
| 517 | Derm_105 | dermasterias_imbricata | multispecies | exposed | 12 | 1 | 8/2/23 |
| 518 | Pis_120 | pisaster_ochraceus | multispecies | exposed | 12 | 1 | 8/2/23 |
| 519 | Pycno_89 | pycnopodia_helianthoides | multispecies | exposed | 12 | 1 | 8/2/23 |
| 523 | Derm_104 | dermasterias_imbricata | multispecies | exposed | 12 | 2 | 8/2/23 |
| 524 | Pis_128 | pisaster_ochraceus | multispecies | exposed | 12 | 2 | 8/2/23 |
| 525 | Pycno_83 | pycnopodia_helianthoides | multispecies | exposed | 12 | 2 | 8/2/23 |
| 529 | Derm_100 | dermasterias_imbricata | multispecies | exposed | 12 | 3 | 8/2/23 |
| 530 | Pis_131 | pisaster_ochraceus | multispecies | exposed | 12 | 3 | 8/2/23 |
| 531 | Pycno_91 | pycnopodia_helianthoides | multispecies | exposed | 12 | 3 | 8/2/23 |
| 535 | Derm_106 | dermasterias_imbricata | multispecies | exposed | 12 | 4 | 8/2/23 |
| 536 | Pis_121 | pisaster_ochraceus | multispecies | exposed | 12 | 4 | 8/2/23 |
| 537 | Pycno_84 | pycnopodia_helianthoides | multispecies | exposed | 12 | 4 | 8/2/23 |
| 547 | Derm_101 | dermasterias_imbricata | multispecies | exposed | 12 | 6 | 8/2/23 |
| 548 | Pis_118 | pisaster_ochraceus | multispecies | exposed | 12 | 6 | 8/2/23 |
| 549 | Pycno_87 | pycnopodia_helianthoides | multispecies | exposed | 12 | 6 | 8/2/23 |
| 559 | Derm_111 | dermasterias_imbricata | multispecies | exposed | 12 | 8 | 8/2/23 |
| 560 | Pis_119 | pisaster_ochraceus | multispecies | exposed | 12 | 8 | 8/2/23 |
| 561 | Pycno_26 | pycnopodia_helianthoides | multispecies | exposed | 12 | 8 | 8/2/23 |
Transferred files to Raven.
- `ssh` into Raven with credentials
- make a directory called `multispecies2023`
- change directories into `multispecies2023` and run the code below to download the raw RNAseq files from OWL to Raven with `wget`:
We want all the files from 2024. Their names all start with `PSC-05`, so the accept pattern is `"PSC-05*"`:
`wget -r --no-directories --no-parent -A "PSC-05*" https://owl.fish.washington.edu/nightingales/P_helianthoides`
Files are now all in `/home/shared/8TB_HDD_02/graceac9/multispecies2023`
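
Before running FastQC, it may be worth confirming that the download completed. The following is a minimal sketch and not part of the original workflow: `check_fastq_dir` is a hypothetical helper that counts the `.fastq.gz` files in a directory and uses `gzip -t` to flag any that are truncated or corrupt.

```bash
# Hypothetical helper (not part of the original workflow):
# count the FastQ files in a directory and test each one's gzip integrity.
check_fastq_dir() {
  local dir="$1"
  local n_ok=0
  local n_bad=0
  local f
  for f in "${dir}"/*.fastq.gz; do
    # Skip the literal pattern if the glob matched nothing
    [ -e "${f}" ] || continue
    if gzip -t "${f}" 2>/dev/null; then
      n_ok=$((n_ok + 1))
    else
      n_bad=$((n_bad + 1))
      echo "CORRUPT: ${f}" >&2
    fi
  done
  echo "ok=${n_ok} bad=${n_bad}"
}

# Usage on Raven (directory taken from this notebook):
# check_fastq_dir /home/shared/8TB_HDD_02/graceac9/multispecies2023
```

Any file reported as `CORRUPT` would need to be downloaded again before it is passed to FastQC.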
# Run `FastQC`
FastQC on Raven lives at: `/home/shared/FastQC-0.12.1/fastqc`
```{bash}
/home/shared/FastQC-0.12.1/fastqc -h
```
Check working directory:
```{bash}
pwd
```
Run FastQC on the untrimmed RNAseq `.fastq.gz` files:
The code below is modified from the Roberts Lab [Code Snippets](https://robertslab.github.io/resources/code_Snippets/)
```{bash}
# Set CPU threads to use
threads=48
# Populate array with FastQ files
fastq_array=(/home/shared/8TB_HDD_02/graceac9/multispecies2023/*.fastq.gz)
# Pass array contents to new variable
fastqc_list=$(echo "${fastq_array[*]}")
# Run FastQC
# NOTE: Do NOT quote ${fastqc_list}
/home/shared/FastQC-0.12.1/fastqc \
--threads ${threads} \
--outdir /home/shared/8TB_HDD_02/graceac9/fastqc \
${fastqc_list}
```
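
The NOTE in the chunk above is needed because `fastqc_list` is a single space-separated string that the shell must split into words. A sketch of an alternative, assuming Bash: expand the array directly with `"${fastq_array[@]}"`, which passes each path as its own argument and can safely be quoted. Here `count_args` is a hypothetical stand-in for FastQC, used only to show the difference, and the example paths are made up.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for a command such as fastqc:
# prints how many arguments it received.
count_args() { echo "$#"; }

# Example array; the second (made-up) path contains a space.
fastq_array=("/tmp/a.fastq.gz" "/tmp/sample b.fastq.gz")

# Approach from the chunk above: flatten to one string, then rely on
# unquoted word splitting.
fastqc_list=$(echo "${fastq_array[*]}")
split_count=$(count_args ${fastqc_list})

# Alternative: expand the array itself, one argument per element.
array_count=$(count_args "${fastq_array[@]}")

echo "unquoted string: ${split_count} arguments"
echo "quoted array:    ${array_count} arguments"
```

With the array form, the FastQC call would end with `"${fastq_array[@]}"` instead of `${fastqc_list}`. If no file name contains whitespace, both forms behave the same.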
FastQC files are in: `/home/shared/8TB_HDD_02/graceac9/fastqc`
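
To confirm that every input file produced a report, the output directory can be compared against the input directory. This is a minimal sketch, assuming FastQC's default output naming (`<name>_fastqc.html` for an input `<name>.fastq.gz`); `missing_fastqc_reports` is a hypothetical helper, not part of FastQC.

```bash
# Hypothetical helper: print the base name of every FastQ file in in_dir
# that has no matching FastQC HTML report in out_dir.
missing_fastqc_reports() {
  local in_dir="$1"
  local out_dir="$2"
  local f
  local base
  for f in "${in_dir}"/*.fastq.gz; do
    # Skip the literal pattern if the glob matched nothing
    [ -e "${f}" ] || continue
    base=$(basename "${f}" .fastq.gz)
    [ -f "${out_dir}/${base}_fastqc.html" ] || echo "${base}"
  done
}

# Usage on Raven (directories taken from this notebook):
# missing_fastqc_reports \
#   /home/shared/8TB_HDD_02/graceac9/multispecies2023 \
#   /home/shared/8TB_HDD_02/graceac9/fastqc
```

No output means every input file has a report.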
In the terminal of the Rproj, run these two commands:
- `eval "$(/opt/anaconda/anaconda3/bin/conda shell.bash hook)"`
- `conda activate`
Then navigate into the directory: `/home/shared/8TB_HDD_02/graceac9/fastqc` and run in terminal: `multiqc .`
The report is generated within seconds.
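
As a possible variation, MultiQC can write the report under a descriptive name directly through its `--filename` option. This is a sketch, assuming `multiqc` is on the `PATH` after `conda activate`; `make_multiqc_report` is a hypothetical wrapper and was not used for the report described here.

```bash
# Hypothetical wrapper: run MultiQC on a directory of FastQC output and
# write the report under a chosen file name in that same directory.
make_multiqc_report() {
  local fastqc_dir="$1"
  local report_name="$2"
  multiqc "${fastqc_dir}" \
    --outdir "${fastqc_dir}" \
    --filename "${report_name}"
}

# Usage (directory and report name taken from this notebook):
# make_multiqc_report /home/shared/8TB_HDD_02/graceac9/fastqc \
#   multiqc_report_untrimmedRNAseqData.html
```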
To view the report, transfer the HTML file to OWL or Gannet.
In the terminal, while still in the directory where the MultiQC report lives, run the following to `rsync` the file to the directory on OWL:
`rsync --archive --progress --verbose multiqc_report.html grace@owl.fish.washington.edu:/volume1/web/gcrandall/multispeciesSSWD/QCreports`
The report now lives on OWL: http://owl.fish.washington.edu/gcrandall/multispeciesSSWD/QCreports/multiqc_report_untrimmedRNAseqData.html
* NOTE: On OWL, I renamed the MultiQC report to `multiqc_report_untrimmedRNAseqData.html` because there will be another report in that directory from the trimmed data.
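
The transfer and the rename could also be done in one step, because `rsync` copies a single source file to whatever file name the destination path ends with. This is a sketch using the host and paths from above; `publish_report` is a hypothetical helper, and it assumes the destination directory already exists on OWL.

```bash
# Hypothetical helper: copy one report to a remote directory under a new name.
publish_report() {
  local src="$1"        # local report file
  local dest_dir="$2"   # destination directory, e.g. user@host:/path
  local dest_name="$3"  # file name to use at the destination
  rsync --archive --progress --verbose "${src}" "${dest_dir}/${dest_name}"
}

# Usage (host, path, and file names taken from this notebook):
# publish_report multiqc_report.html \
#   grace@owl.fish.washington.edu:/volume1/web/gcrandall/multispeciesSSWD/QCreports \
#   multiqc_report_untrimmedRNAseqData.html
```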