---
title: "09-DAVID"
output: html_document
date: "2024-02-29"
---

Rmd for making the input files to put into DAVID to get enrichment.

Goals:

1. Create a file that contains all the UniProt accession IDs from the _P. helianthoides_ genome gene list blastx: `analyses/06-BLAST/summer2021-uniprot_blastx.tab`
2. Create a file that contains all the UniProt accession IDs from:
    - 2a. Annotated DEG list from Experiment A: `analyses/08-deglist_annot/expA_DEG_annot.tab`
    - 2b. Annotated DEG list from Experiment B: `analyses/08-deglist_annot/expB_DEG_annot.tab`

```{r}
library(dplyr)
library(tidyr)
library(tibble)
```

# Create UniProt accession ID file for `analyses/06-BLAST/summer2021-uniprot_blastx.tab`

Read in file:

```{r}
# The blastx output has no header row
phel.blast <- read.delim("../analyses/06-BLAST/summer2021-uniprot_blastx.tab", header = FALSE)
head(phel.blast)
```

Rename columns so that column "V3" becomes "uniprot_acc":

```{r}
cols.phel.blast <- c("V1", "V2", "uniprot_acc", "V4", "V5", "V6", "V7", "V8", "V9", "V10", "V11", "V12", "V13", "V14")
colnames(phel.blast) <- cols.phel.blast
head(phel.blast)
```

Pull out just the "uniprot_acc" column:

```{r}
blast.ua <- select(phel.blast, "uniprot_acc")
head(blast.ua)
```

Write out as file:

```{r}
#write.table(blast.ua, "../analyses/09-DAVID/phel-blast-uniprot_acc.txt", sep = "\t", row.names = FALSE, quote = FALSE, col.names = TRUE)
```

Wrote out 02/29/2024

# Create UniProt accession ID file for Experiment A DEG list

Read in DEG list:

```{r}
expA <- read.delim("../analyses/08-deglist_annot/expA_DEG_annot.tab")
head(expA)
```

Pull out the UniProt accession ID column. Note that there will be blanks, because not every gene ID has an annotation (make a note of the numbers!), and that some gene IDs will have multiple annotations.

Current file contains 44,023 rows

```{r}
expA.ua <- select(expA, "uniprot_acc")
head(expA.ua)
```

Remove rows that are _NA_:

```{r}
expA.ua.nna <- na.omit(expA.ua)
head(expA.ua.nna)
```

Now contains 42,606 rows (1,417 removed)

Write out the table:

```{r}
#write.table(expA.ua.nna, "../analyses/09-DAVID/expA_DEG-uniprot_acc.txt", sep = "\t", row.names = FALSE, quote = FALSE, col.names = TRUE)
```

Wrote out 02/29/2024

# Create UniProt accession ID file for Experiment B DEG list

Read in DEG list:

```{r}
expB <- read.delim("../analyses/08-deglist_annot/expB_DEG_annot.tab")
head(expB)
```

76,506 rows

Subset the UniProt accession ID column:

```{r}
expB.ua <- select(expB, "uniprot_acc")
head(expB.ua)
```

Remove NAs:

```{r}
expB.ua.nna <- na.omit(expB.ua)
head(expB.ua.nna)
```

73,540 rows (removed 2,966 NAs)

Write out table:

```{r}
#write.table(expB.ua.nna, "../analyses/09-DAVID/expB_DEG-uniprot_acc.txt", sep = "\t", row.names = FALSE, quote = FALSE, col.names = TRUE)
```

Wrote out 02/29/2024
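
# Optional: check row counts before and after NA removal

The row counts above were noted by hand. A small helper can report them in one step, which may make the "make a note of the numbers" step easier to repeat. This is only a sketch: `count_acc` and `toy.ua` are hypothetical names that are not part of the original analysis, and the toy data frame stands in for `expA.ua` or `expB.ua` so the chunk runs without the input files.

```{r}
# Hypothetical helper (an assumption, not from the original analysis):
# summarises a one-column accession table before and after dropping NAs.
count_acc <- function(df, col = "uniprot_acc") {
  total <- nrow(df)
  non_na <- sum(!is.na(df[[col]]))
  data.frame(
    total_rows = total,             # rows in the table as read in
    na_rows = total - non_na,       # rows that na.omit() would drop
    non_na_rows = non_na,           # rows left after na.omit()
    unique_acc = length(unique(na.omit(df[[col]])))  # distinct accession IDs
  )
}

# Toy data standing in for expA.ua / expB.ua (made-up accession values)
toy.ua <- data.frame(uniprot_acc = c("P12345", NA, "Q67890", "P12345", NA))
toy.counts <- count_acc(toy.ua)
toy.counts
```

Calling `count_acc(expA.ua)` or `count_acc(expB.ua)` should give the same totals as the notes above if the counts were recorded correctly. Note that `na.omit()` only drops true `NA` values; if unannotated rows were stored as empty strings instead, they would not be counted or removed here.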