FISH 546 Week 5 Presentation

Sarah Yerrace

Invasive Lionfish Gut Content Metabarcoding

Red Lionfish (Pterois volitans) (Include image, CHECK)

One) Clearly Demonstrate Goal

-Start: Raw sequences from COI (mtDNA)

-End: Taxon assignments to sequences

JonahVentures technically does this for me, but I want to do it better with a different reference database.

Two) Methods taken

One of many FASTQ files

@M02551:259:000000000-KC225:1:1101:17112:2083 1:N:0:CTTGACGAGGTT
GGTACTGGTTGAACAGTTTATCCCCCTCTATCAGGCAACCTAGCCCACGCAGGTGCTTCCGTAGACTTAACCATCTTCTCCCTCCACCTGGCAGGTATTTCTTCAATCCTGGGAGCCATCAACTTTATCACTACCATTATTAATATGAAACCCCCTGCCATTTCCCAGTATCAAACCCCTCTTTTCGTATGGGCTGTTCTTATTACTGCCGTACTTCTCCTTCTGTCCCTCCCAGTCTTAGCTGCTGGCA
+
B@AABFFFFFAFFFGGFGGGGGGHGGGGHHHHGHHHCHGHHHHHHHHGGGGGGAFFHHHHGHFGFHGHFFHHHHHHHHHHHHHGGGGHHHCFFFGFHHHHEHHHHFHHHHHGEFFGGCGHF3FHHHFHHHHHBGHFHHHHH3FH3FG3B3GHGGGGEHHHHHHHHHDFF1FG11FGGGGGGHHHHHGHHHG<GGHFHGHHGDGHHHHHHHGGDDHHHHHHHGGGFGGGGGGGGFBFFGGEBFFFGGGFG/
@M02551:259:000000000-KC225:1:1101:18428:2160 1:N:0:CTTGACGAGGTT
GGTACTGGATGAACTGTTTACCCCCCTCTATCAGGCAACCTAGCCCACGCAGGTGCTTCCGTAGACTTAACCATCTTCTCCCTCCACCTGGCAGGTATTTCTTCAATCCTGGGAGCCATCAACTTTATCACTACCATTATTAATATGAAACCCCCTGCCATTTCCCAGTATCAAACCCCTCTTTTCGTACGGGCTGTTCTTATTACTGCCGTACTTCTCCTTCTGTCCCTCCCAGTCTTAGCTGCTGGCA
+

(Show parts of initial data, CHECK)
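
For anyone new to the format: each FASTQ record is four lines (an `@` header, the sequence, a `+` separator, and a quality string). The sketch below shows one way to group those lines into records in base R and decode the quality characters. It assumes Phred+33 encoding, which is typical for recent Illumina output but is not confirmed in these slides. `parse_fastq` and `phred_scores` are helper names made up for this example, and the toy record is a shortened stand-in rather than real data.

```r
# Group FASTQ lines into records of four: header, sequence, "+", quality.
# parse_fastq() and phred_scores() are hypothetical helpers for illustration.
parse_fastq <- function(lines) {
  lines <- lines[nzchar(lines)]           # drop empty lines
  stopifnot(length(lines) %% 4 == 0)      # complete records only
  idx <- seq(1, length(lines), by = 4)    # position of each header line
  data.frame(
    id       = sub("^@", "", lines[idx]), # header without the leading "@"
    sequence = lines[idx + 1],
    quality  = lines[idx + 3],
    stringsAsFactors = FALSE
  )
}

# Convert a quality string to integer scores (ASSUMES Phred+33 encoding).
phred_scores <- function(quality) utf8ToInt(quality) - 33L

# Shortened toy record; a real file could be read with readLines(gzfile(path)).
toy <- c("@read1 1:N:0:CTTGACGAGGTT", "GGTACTGG", "+", "B@AABFFF")
reads <- parse_fastq(toy)
print(reads$id)
print(phred_scores(reads$quality[1]))
```

Under the Phred+33 assumption, higher letters in the quality string mean more confident base calls, which is why the long runs of `H` and `G` in the real reads are a good sign.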

Two) Methods taken

Here’s a sneak peek at my metadata file

  Sample_name Well Set     Locus                    PrimerF
1      test01   A1   1 Leray_COI GGWACWGGWTGAACWGTWTAYCCYCC
2      test02   B1   1 Leray_COI GGWACWGGWTGAACWGTWTAYCCYCC
3      test03   C1   1 Leray_COI GGWACWGGWTGAACWGTWTAYCCYCC
4      test04   D1   1 Leray_COI GGWACWGGWTGAACWGTWTAYCCYCC
5      test05   E1   1 Leray_COI GGWACWGGWTGAACWGTWTAYCCYCC
                     PrimerR i7_Index_Name i5_Index_Name
1 TAIACYTCIGGRTGICCRAARAAYCA            NA            NA
2 TAIACYTCIGGRTGICCRAARAAYCA            NA            NA
3 TAIACYTCIGGRTGICCRAARAAYCA            NA            NA
4 TAIACYTCIGGRTGICCRAARAAYCA            NA            NA
5 TAIACYTCIGGRTGICCRAARAAYCA            NA            NA
                                         file1
1 JV173_UniCOI_Tornabene_S043447.1.R1.fastq.gz
2 JV173_UniCOI_Tornabene_S043448.1.R1.fastq.gz
3 JV173_UniCOI_Tornabene_S043449.1.R1.fastq.gz
4 JV173_UniCOI_Tornabene_S043450.1.R1.fastq.gz
5 JV173_UniCOI_Tornabene_S043451.1.R1.fastq.gz
                                         file2
1 JV173_UniCOI_Tornabene_S043447.1.R2.fastq.gz
2 JV173_UniCOI_Tornabene_S043448.1.R2.fastq.gz
3 JV173_UniCOI_Tornabene_S043449.1.R2.fastq.gz
4 JV173_UniCOI_Tornabene_S043450.1.R2.fastq.gz
5 JV173_UniCOI_Tornabene_S043451.1.R2.fastq.gz

Two) Methods taken

Sample Name  Locus      Well  Set  File 1
Test 1       Leray_COI  A1    1    JV173_UniCOI_Tornabene_S043447.1.R1
Test 2       Leray_COI  B1    1    JV173_UniCOI_Tornabene_S043448.1.R1
Test 3       Leray_COI  C1    1    JV173_UniCOI_Tornabene_S043449.1.R1
Test 4       Leray_COI  D1    1    JV173_UniCOI_Tornabene_S043450.1.R1
Test 5       Leray_COI  E1    1    JV173_UniCOI_Tornabene_S043451.1.R1

(table included, CHECK)

Two) Methods taken

Show core code

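## Run the cutadapt script (see scripts folder) on all the original FASTQ files from the JonahVentures raw sequences to remove the primers at both ends of each sequence

library(stringr)  # provides str_replace()

fastqs <- "../Data/"

# Forward (R1) and reverse (R2) read files, sorted so the pairs line up
F1s <- sort(list.files(fastqs, pattern="R1.fastq", full.names = TRUE))
R1s <- sort(list.files(fastqs, pattern="R2.fastq", full.names = TRUE))

# Strip the R1 suffix (e.g. ".R1.fastq.gz") to get one name per sample
sample.names <- str_replace(basename(F1s), "[._]R1\\.fastq(\\.gz)?$","")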
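
To make the primer-removal step concrete, here is a rough sketch of what trimming the degenerate forward primer means. It is not how cutadapt works internally (cutadapt tolerates mismatches and also handles the reverse primer); it only does exact matching against the IUPAC codes. `primer_to_regex` and `trim_forward_primer` are names invented for this example, and treating `I` (inosine) as "any base" is a simplifying assumption. The primer is the `PrimerF` value from the metadata, and the read is the start of the first FASTQ record shown earlier.

```r
# IUPAC nucleotide codes as regex character classes.
# ASSUMPTION: "I" (inosine) is treated like N, i.e. it matches any base.
iupac <- c(A = "A", C = "C", G = "G", T = "T",
           R = "[AG]", Y = "[CT]", S = "[CG]", W = "[AT]",
           K = "[GT]", M = "[AC]", B = "[CGT]", D = "[AGT]",
           H = "[ACT]", V = "[ACG]", N = "[ACGT]", I = "[ACGT]")

# Turn a degenerate primer into a regular expression (hypothetical helper).
primer_to_regex <- function(primer) {
  bases <- strsplit(toupper(primer), "")[[1]]
  stopifnot(all(bases %in% names(iupac)))
  paste0(iupac[bases], collapse = "")
}

# Remove the primer if the read starts with it; otherwise leave the read alone.
trim_forward_primer <- function(read, primer) {
  sub(paste0("^", primer_to_regex(primer)), "", read)
}

primer_f <- "GGWACWGGWTGAACWGTWTAYCCYCC"           # PrimerF from the metadata
read     <- "GGTACTGGTTGAACAGTTTATCCCCCTCTATCAGG"  # start of the first read
trimmed  <- trim_forward_primer(read, primer_f)
print(trimmed)
```

The first 26 bases of the read match the primer once the `W` and `Y` positions are expanded, so only the amplified COI sequence is left after trimming.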

Three) Preliminary Results

This uses the taxon assignment provided by JonahVentures for just n = 4 of the N = 132 lionfish

prelim <- read.csv("../Output/JVB1606-UniCOI-read-data.csv")
hist(x=prelim$X..match, main= "Sequence Percent Match", xlab="% Match")

Make a plot from code, CHECK, highlight line of code, CHECK
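
Beyond the histogram, one possible next step is to count how many sequences clear a percent-match cutoff. The sketch below uses an invented toy table, not the real JonahVentures output. The column name `X..match` is the one used in the plotting code, while the 97% threshold and the names `toy`, `threshold`, and `high_conf` are assumptions made only for this example.

```r
# Toy stand-in for the read-data table; these values are INVENTED for illustration.
toy <- data.frame(X..match = c(99.5, 97.2, 88.0, 100, 92.4))

# Hypothetical cutoff for a "confident" match; choose per study.
threshold <- 97

# Keep rows at or above the cutoff and report what fraction they represent.
high_conf <- toy[toy$X..match >= threshold, , drop = FALSE]
prop_high <- nrow(high_conf) / nrow(toy)
print(prop_high)
```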

Four) Outline Steps for the Future

Meet with Marta next week

Run BLAST

Remove lionfish reads
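
As a rough idea of what the lionfish-removal step could look like once taxon assignments exist: drop every sequence assigned to the predator itself and keep the rest as candidate prey. The table layout (`sequence_id`, `species`, `reads`) and all of its values are hypothetical placeholders, since the real assignment output is not shown here.

```r
# Hypothetical assignment table; column names and values are placeholders.
assignments <- data.frame(
  sequence_id = c("seq1", "seq2", "seq3", "seq4"),
  species     = c("Pterois volitans", "Hypothetical prey A", NA, "Pterois volitans"),
  reads       = c(1200, 340, 55, 80),
  stringsAsFactors = FALSE
)

# Flag predator (host) sequences; unassigned (NA) rows are kept for later review.
is_lionfish <- !is.na(assignments$species) &
  assignments$species == "Pterois volitans"

prey <- assignments[!is_lionfish, ]
print(prey)
```

Keeping the unassigned rows is a design choice for this sketch: they may be prey that the reference database does not cover, which is worth checking before discarding them.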