Aidan Coyle

Originally written 2021/03/10

Roberts Lab, UW-SAFS

Written for analysis of Hematodinium differential gene expression

Purpose:

This script produces the CSV that GO-MWU needs: a 2-column table of accession IDs and unadjusted p-values. A header line is required, although its contents are irrelevant.

GO-MWU also requires a second input file, which we already created: a tab-delimited 2-column table of accession IDs and GO terms, with no header and only 1 line per gene.
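To make the two input formats concrete, here is a small illustrative sketch that writes toy versions of both tables to temporary files. Every value is a made-up placeholder (the accession IDs and p-values are not real data), and separating multiple GO terms with semicolons is an assumption that the description above does not spell out.

```r
# Made-up toy values; accession IDs and p-values are placeholders
pval_table <- data.frame(gene = c("ACC0001", "ACC0002"),
                         pval = c(0.012, 0.57))
# Assumption: multiple GO terms for one gene are separated by semicolons
go_table <- data.frame(gene = c("ACC0001", "ACC0002"),
                       go = c("GO:0008150;GO:0003674", "GO:0005575"))

pval_path <- tempfile(fileext = ".csv")
go_path <- tempfile(fileext = ".tab")

# p-value table: comma-separated, header line present
write.csv(pval_table, pval_path, row.names = FALSE, quote = FALSE)

# GO table: tab-separated, no header, one line per gene
write.table(go_table, go_path, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

pval_lines <- readLines(pval_path)
go_lines <- readLines(go_path)
```

Printing `pval_lines` and `go_lines` shows the layout each file should have before it is handed to GO-MWU.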

Explanation of geneIDs_pvals():

This script uses geneIDs_pvals(), a function I wrote for this analysis. It is defined in hematodinium_analysis_functions.R.

Inputs and Outputs

input_file: path to the DESeq2 output file containing transcript IDs and unadjusted p-values

blast_file: path to the table linking transcript IDs to accession IDs. Ideally, use an annotated BLASTx table of the same transcriptome used to create the kallisto index

output_file: path to the output file - a 2-column CSV of accession IDs and unadjusted p-values with a header line. Write it to the directory in which you will run GO-MWU.
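The real geneIDs_pvals() lives in hematodinium_analysis_functions.R and is not reproduced here. As a rough idea of what a function with these inputs and outputs might do, here is a minimal sketch under several assumptions: the DESeq2 file is tab-delimited with a header, transcript IDs in its first column, and a column named `pvalue`; the BLAST table has no header, with transcript IDs in column 1 and subject IDs of the form `sp|ACCESSION|NAME` in column 2. The name `sketch_geneIDs_pvals` and all toy values are hypothetical.

```r
# Hypothetical stand-in for geneIDs_pvals(); the real function is defined in
# hematodinium_analysis_functions.R and may work differently.
sketch_geneIDs_pvals <- function(input_file, blast_file, output_file) {
  # Assumption: tab-delimited DESeq2 table with a header, transcript IDs in
  # the first column, unadjusted p-values in a column named "pvalue"
  deseq <- read.delim(input_file, header = TRUE, stringsAsFactors = FALSE)
  pvals <- data.frame(transcript_id = deseq[[1]], pvalue = deseq$pvalue)

  # Assumption: BLAST tabular output with no header, transcript ID in column 1
  # and a subject ID such as "sp|ACCESSION|NAME" in column 2
  blast <- read.delim(blast_file, header = FALSE, stringsAsFactors = FALSE)
  ids <- data.frame(
    transcript_id = blast[[1]],
    accession = sub("^[^|]*\\|([^|]*)\\|.*$", "\\1", blast[[2]])
  )

  # Keep only transcripts that have both an accession ID and a p-value
  merged <- merge(ids, pvals, by = "transcript_id")
  merged <- merged[!is.na(merged$pvalue), c("accession", "pvalue")]

  # 2-column CSV with a header line
  write.csv(merged, output_file, row.names = FALSE, quote = FALSE)
  invisible(merged)
}

# Toy demonstration with made-up transcripts and placeholder accessions
deseq_path <- tempfile(fileext = ".txt")
blast_path <- tempfile(fileext = ".tab")
out_path <- tempfile(fileext = ".csv")

write.table(data.frame(id = c("tx1", "tx2", "tx3"),
                       pvalue = c(0.01, 0.5, NA)),
            deseq_path, sep = "\t", quote = FALSE, row.names = FALSE)
writeLines(c("tx1\tsp|ACC0001|TOY1_EXAMPLE",
             "tx3\tsp|ACC0003|TOY3_EXAMPLE"),
           blast_path)

toy <- sketch_geneIDs_pvals(deseq_path, blast_path, out_path)
```

In the toy run, tx2 has no BLAST hit and tx3 has no p-value, so only tx1 reaches the output. The sketch does not handle several transcripts mapping to the same accession ID.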

library(tidyverse)

source("hematodinium_analysis_functions.R")

Obtaining CSVs

# Elevated Day 0 vs. Elevated Day 2, indiv. libraries only
geneIDs_pvals(input_file = "../graphs/DESeq2_output/cbai_transcriptomev4.0/elev0_vs_elev2_indiv/AllGenes_wcols.txt",
              blast_file = "../output/BLASTs/uniprot_swissprot/cbai4.0_blastxres.tab",
              output_file = "../scripts/46_running_GO-MWU/cbai4.0_elev0_vs_elev2_indiv_pvals.csv")

# Ambient Day 2 vs. Elevated Day 2, indiv. libraries only
geneIDs_pvals(input_file = "../graphs/DESeq2_output/cbai_transcriptomev4.0/amb2_vs_elev2_indiv/AllGenes_wcols.txt",
              blast_file = "../output/BLASTs/uniprot_swissprot/cbai4.0_blastxres.tab",
              output_file = "../scripts/46_running_GO-MWU/cbai4.0_amb2_vs_elev2_indiv_pvals.csv")
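# Optional sanity check (a hedged sketch, not part of the original analysis).
# check_pval_csv() is a hypothetical helper; it only confirms the layout
# described above: a header line, 2 columns, and numeric p-values in [0, 1].
# Treating NA p-values as an error is an assumption.

```r
# Hypothetical helper, not defined in hematodinium_analysis_functions.R
check_pval_csv <- function(path) {
  tab <- read.csv(path, header = TRUE, stringsAsFactors = FALSE)

  # Exactly two columns: accession IDs and p-values
  stopifnot(ncol(tab) == 2)

  # Second column holds valid, non-missing p-values
  pvals <- tab[[2]]
  stopifnot(is.numeric(pvals),
            !anyNA(pvals),
            all(pvals >= 0 & pvals <= 1))

  # Return the number of genes invisibly
  invisible(nrow(tab))
}

# Demonstration on a made-up file with placeholder accession IDs
demo_path <- tempfile(fileext = ".csv")
writeLines(c("accession,pvalue", "ACC0001,0.01", "ACC0002,0.8"), demo_path)
n_genes <- check_pval_csv(demo_path)
```

To check a real output file, pass one of the output_file paths used above to check_pval_csv().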