--- title: "05-deseq2" output: html_document date: "2024-02-02" --- Rmd to run `DESeq2` comparisons between libraries. Summer 2021 SSWD Challenge Experiments coelomocyte RNAseq libraries aligned to the gene list FASTA from the published _Pycnopodia helianthoides_ genome. Load packages needed to run code: ```{r} library(DESeq2) library(tidyverse) library(pheatmap) library(RColorBrewer) library(data.table) library(ggplot2) ``` # load rounded count matrix of all 32 coelomocyte RNAseq libraries with genome gene list FASTA alignment: ```{r} countmatrix <- read.delim("../data/kallisto_count_matrix_rounded.tab", header = TRUE, sep = '\t') head(countmatrix) ``` # Subset the libraries from "Experiment B" - stars that were part of the final experiment that were injected with their inoculate (either unfiltered raw tissue homogenate from a wasting adult _Pycnopodia helianthoides_, or a heat-killed raw tissue homogenate from a wasting adult _Pycnopodia helianthoides_). Stars H03 and H05 had two time points as controls, so I picked one of each so that it's a balanced 8 vs 8 comparison. | library_ID | star_ID | previous_exposure | treatment | sample_date | experiment_day | |------------|---------|-----------------------------|-------------------|-------------|----------------| | PSC.56 | E06 | 0.45 live inoculate 9/23/21 | control (10/5/21) | 10/14/21 | 9 | | PSC.52 | E11 | 0.45 live inoculate 9/23/21 | control (10/5/21) | 10/14/21 | 9 | | PSC.54 | E18 | 0.45 live inoculate 9/23/21 | control (10/5/21) | 10/14/21 | 9 | | PSC.61 | E09 | 0.45 live inoculate 9/23/21 | control (10/5/21) | 10/15/21 | 10 | | PSC.64 | H05 | NA | control (10/5/21) | 10/15/21 | 10 | | PSC.73 | H03 | NA | control (10/5/21) | 10/16/21 | 11 | | PSC.76 | E10 | 0.45 live inoculate 9/23/21 | control (10/5/21) | 10/17/21 | 12 | | PSC.81 | E07 | 0.45 live inoculate 9/23/21 | control (10/5/21) | 10/18/21 | 13 | | PSC.59 | H01 | NA | exposed (10/5/21) | 10/14/21 | 9 | | PSC.57 | H06 | NA | exposed (10/5/21) | 10/14/21 | 9 | | PSC.69 | H04 | NA | exposed (10/5/21) | 10/15/21 | 10 | | PSC.67 | H09 | NA | exposed (10/5/21) | 10/15/21 | 10 | | PSC.71 | H18 | NA | exposed (10/5/21) | 10/15/21 | 10 | | PSC.75 | H08 | NA | exposed (10/5/21) | 10/16/21 | 11 | | PSC.78 | H10 | NA | exposed (10/5/21) | 10/17/21 | 12 | | PSC.83 | H07 | NA | exposed (10/5/21) | 10/18/21 | 13 | ```{r} expB <- select(countmatrix, "PSC.56", "PSC.52", "PSC.54", "PSC.61", "PSC.64", "PSC.73", "PSC.76", "PSC.81", "PSC.59", "PSC.57", "PSC.69", "PSC.67", "PSC.71", "PSC.75", "PSC.78", "PSC.83") head(expB) ``` 26581 rows, 16 columns Make a data frame for the comparison: ```{r} colData <- data.frame(condition=factor(c("control", "control", "control", "control", "control", "control", "control", "control", "exposed", "exposed", "exposed", "exposed", "exposed", "exposed", "exposed", "exposed")), type=factor(rep("paired-end",16))) rownames(colData) <- colnames(expB) dds <- DESeqDataSetFromMatrix(countData = expB, colData = colData, design = ~ condition) ``` ```{r} dds <- DESeq(dds) res <- results(dds) res <- res[order(rownames(res)), ] ``` From [Bioconductor `DESeq2` Vignette](http://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html): ```{r} vsd <- vst(dds, blind=FALSE) rld <- rlog(dds, blind=FALSE) head(assay(vsd), 3) ``` ```{r} plot <- plotPCA(vsd, intgroup=c("condition")) nudge <- position_nudge(y = 4) plot + geom_text(aes(label = name), position = nudge) ``` ```{r} # Count number of hits with adjusted p-value less then 0.05 dim(res[!is.na(res$padj) & res$padj <= 0.05, ]) ``` 7834 DEGs ```{r} control_v_exposed_fig <- res # The main plot plot(control_v_exposed_fig$baseMean, control_v_exposed_fig$log2FoldChange, pch=20, cex=0.45, ylim=c(-15, 15), log="x", col="darkgray", main="Control vs Exposed (pval % replace(is.na(.), 0) ~ condition, data = vsduvf.transpose.df, method = "eu") permanova.uvf #Significant influence of all factors ``` 0.367 --> NO SIG DIFFERENCE ```{r} # Count number of hits with adjusted p-value less then 0.05 dim(resuvf[!is.na(resuvf$padj) & resuvf$padj <= 0.05, ]) ``` no degs ```{r} unfilt_v_microbef_fig <- resuvf # The main plot plot(unfilt_v_microbef_fig$baseMean, unfilt_v_microbef_fig$log2FoldChange, pch=20, cex=0.45, ylim=c(-15, 15), log="x", col="darkgray", main="Unfiltered vs Microbial Fraction (pval % replace(is.na(.), 0) ~ condition, data = vsduvc.transpose.df, method = "eu") permanova.uvc #Significant influence of all factors ``` 0.222 --> no significant difference ```{r} # Count number of hits with adjusted p-value less then 0.05 dim(resuvc[!is.na(resuvc$padj) & resuvc$padj <= 0.05, ]) ``` 36 degs ```{r} unfilt_v_control_fig <- resuvc # The main plot plot(unfilt_v_control_fig$baseMean, unfilt_v_control_fig$log2FoldChange, pch=20, cex=0.45, ylim=c(-15, 15), log="x", col="darkgray", main="Unfiltered vs Control (pval % replace(is.na(.), 0) ~ condition, data = vsdfvc.transpose.df, method = "eu") permanova.fvc #Significant influence of all factors ``` 0.846 --> no significant difference ```{r} # Count number of hits with adjusted p-value less then 0.05 dim(resfvc[!is.na(resfvc$padj) & resfvc$padj <= 0.05, ]) ``` 4 DEGs ```{r} microbial_v_control_fig <- resfvc # The main plot plot(microbial_v_control_fig$baseMean, microbial_v_control_fig$log2FoldChange, pch=20, cex=0.45, ylim=c(-15, 15), log="x", col="darkgray", main="Microbial Fraction Vs Control (pval