---
title: "09-DAVID"
output: html_document
date: "2024-02-29"
---
This Rmd builds the UniProt accession ID files to upload to DAVID for enrichment analysis.
Goals:
1. Create a file that contains all the UniProt accession IDs from the _P. helianthoides_ genome gene list blastx output: `analyses/06-BLAST/summer2021-uniprot_blastx.tab`
2. Create a file that contains all the UniProt accession IDs from each of:
2a. Annotated DEG list from Experiment A: `analyses/08-deglist_annot/expA_DEG_annot.tab`
2b. Annotated DEG list from Experiment B: `analyses/08-deglist_annot/expB_DEG_annot.tab`
```{r}
library(dplyr)
library(tidyr)
library(tibble)
```
# Create UniProt accession ID file for `analyses/06-BLAST/summer2021-uniprot_blastx.tab`
Read in file:
```{r}
# The file is read without a header row, so columns are named V1, V2, ...
phel.blast <- read.delim("../analyses/06-BLAST/summer2021-uniprot_blastx.tab", header = FALSE)
head(phel.blast)
```
Rename the columns so that column `V3` becomes `uniprot_acc`:
```{r}
cols.phel.blast <- c("V1", "V2", "uniprot_acc", "V4", "V5", "V6", "V7", "V8", "V9", "V10", "V11", "V12", "V13", "V14")
colnames(phel.blast) <- cols.phel.blast
head(phel.blast)
```
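Spelling out all 14 names works, but only the third one changes. A minimal sketch of renaming that one column by position, shown on a made-up data frame (`toy.blast` and its values are hypothetical, not taken from the BLAST output):

```{r}
# Toy stand-in for the BLAST table; values are invented for illustration
toy.blast <- data.frame(V1 = c("g1", "g2"),
                        V2 = c("sp", "sp"),
                        V3 = c("ACC0001", "ACC0002"))
# Rename only the third column, leaving the other names as they were read in
colnames(toy.blast)[3] <- "uniprot_acc"
colnames(toy.blast)
```

Renaming by position avoids retyping the untouched names, but it assumes the accession really is in column 3 of the file being read.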
Pull out just the `uniprot_acc` column:
```{r}
blast.ua <- select(phel.blast, "uniprot_acc")
head(blast.ua)
```
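Several genes may hit the same UniProt entry, so the accession column may hold repeats. A minimal sketch of checking for that with `dplyr::distinct()`, on a made-up data frame (`toy.ua` and its values are hypothetical). Whether to drop repeats before uploading is a judgment call that this sketch does not settle:

```{r}
library(dplyr)
# Toy accession column; values are invented for illustration
toy.ua <- data.frame(uniprot_acc = c("ACC0001", "ACC0002", "ACC0001"))
# Keep one row per accession
toy.ua.unique <- distinct(toy.ua, uniprot_acc)
nrow(toy.ua)         # rows before dropping repeats
nrow(toy.ua.unique)  # rows after dropping repeats
```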
Write out as a file:
```{r}
#write.table(blast.ua, "../analyses/09-DAVID/phel-blast-uniprot_acc.txt", sep = "\t", row.names = FALSE, quote = FALSE, col.names = TRUE)
```
Wrote out 02/29/2024.
# Create UniProt accession ID file for Experiment A DEG list
Read in DEG list:
```{r}
expA <- read.delim("../analyses/08-deglist_annot/expA_DEG_annot.tab")
head(expA)
```
Pull out the UniProt accession ID column. There will be blanks (_NA_) because not every gene ID has an annotation, so make a note of the numbers! Also note that some gene IDs will have multiple annotations.
The current file contains 44,023 rows (the sum of the 42,606 rows kept and the 1,417 rows removed in the _NA_ step).
```{r}
expA.ua <- select(expA, "uniprot_acc")
head(expA.ua)
```
Remove rows that are _NA_:
```{r}
expA.ua.nna <- na.omit(expA.ua)
head(expA.ua.nna)
```
The file now contains 42,606 rows (1,417 _NA_ rows removed).
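The row counts noted here can be computed in a chunk rather than read off by eye. A sketch on a made-up data frame (`toy.deg` and its values are hypothetical). It also counts empty strings separately, on the assumption that blanks in a text column may be read in as `""` rather than `NA`, in which case `na.omit()` would leave them in place:

```{r}
# Toy DEG annotation column; values are invented for illustration
toy.deg <- data.frame(uniprot_acc = c("ACC0001", NA, "", "ACC0002"),
                      stringsAsFactors = FALSE)
n.total <- nrow(toy.deg)
# Count true NAs and empty strings separately
n.na    <- sum(is.na(toy.deg$uniprot_acc))
n.blank <- sum(toy.deg$uniprot_acc == "", na.rm = TRUE)
# na.omit() drops only the NA rows; empty strings stay
n.kept  <- nrow(na.omit(toy.deg))
c(total = n.total, na = n.na, blank = n.blank, kept = n.kept)
```

If the blank count is ever above zero on the real data, reading the file with `na.strings = c("", "NA")` in `read.delim()` is one way to turn those blanks into `NA` before calling `na.omit()`.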
Write out the table:
```{r}
#write.table(expA.ua.nna, "../analyses/09-DAVID/expA_DEG-uniprot_acc.txt", sep = "\t", row.names = FALSE, quote = FALSE, col.names = TRUE)
```
Wrote out 02/29/2024.
# Create UniProt accession ID file for Experiment B DEG list
Read in DEG list:
```{r}
expB <- read.delim("../analyses/08-deglist_annot/expB_DEG_annot.tab")
head(expB)
```
The current file contains 76,506 rows.
Pull out the UniProt accession ID column:
```{r}
expB.ua <- select(expB, "uniprot_acc")
head(expB.ua)
```
Remove rows that are _NA_:
```{r}
expB.ua.nna <- na.omit(expB.ua)
head(expB.ua.nna)
```
The file now contains 73,540 rows (2,966 _NA_ rows removed).
Write out table:
```{r}
#write.table(expB.ua.nna, "../analyses/09-DAVID/expB_DEG-uniprot_acc.txt", sep = "\t", row.names = FALSE, quote = FALSE, col.names = TRUE)
```
Wrote out 02/29/2024.
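The select, drop-_NA_, and count steps are the same for both experiments, so they could be wrapped in one function. A sketch only: `get_accessions()` is a hypothetical helper name, and the data frame is made up rather than taken from either DEG list:

```{r}
library(dplyr)
# Hypothetical helper: pull the accession column, drop NA rows, report counts
get_accessions <- function(deg) {
  ua <- select(deg, "uniprot_acc")
  ua.nna <- na.omit(ua)
  message(nrow(ua), " rows in, ", nrow(ua.nna), " rows kept")
  ua.nna
}
# Toy DEG table; values are invented for illustration
toy.expt <- data.frame(gene = c("g1", "g2", "g3"),
                       uniprot_acc = c("ACC0001", NA, "ACC0002"))
toy.acc <- get_accessions(toy.expt)
toy.acc
```

Calling the same helper on `expA` and `expB` would keep the two sections in step and print the row counts that are currently noted by hand.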