--- author: Sam White toc-title: Contents toc-depth: 5 toc-location: left layout: post title: Data Wrangling - Create Pgenerosa_v070 GFFs date: '2019-05-23 15:23' tags: - Panopea generosa - geoduck - jupyter - GFF - Pgenerosa_v070 - v070 categories: - 2019 - Geoduck Genome Sequencing --- Since the [MAKER genome annotation of our Pgenerosa_v070 genome assembly](https://robertslab.github.io/sams-notebook/posts/2019/2019-02-28-Genome-Annotation---Pgenerosa_v070-MAKER-on-Mox/) finally completed last week, I decided to separate out the various features into separate GFF files, as I'm sure we'll need/want them at some point. This was all done in a Jupyter Notebook on my computer (swoose) - see notebook linked below. Created GFF files for: - [CDS (169,461 features)](http://owl.fish.washington.edu/halfshell/genomic-databank/Pgenerosa_v070.CDS.gff) - [exons (175,007 features)](http://owl.fish.washington.edu/halfshell/genomic-databank/Pgenerosa_v070.exon.gff) - [genes (53,036 features)](http://owl.fish.washington.edu/halfshell/genomic-databank/Pgenerosa_v070.gene.gff) - [mRNAs (53,036 features)](http://owl.fish.washington.edu/halfshell/genomic-databank/Pgenerosa_v070.mRNA.gff) Files were `rsync`'d to (each one is linked above): - http://owl.fish.washington.edu/halfshell/genomic-databank Jupyter Notebook (GitHub): [20190523_pgen_Pgenerosa_v070_gffs.ipynb](https://github.com/RobertsLab/code/blob/master/notebooks/sam/20190523_pgen_Pgenerosa_v070_gffs.ipynb) Will add these files to our [Genomic Resources Wiki](https://github.com/RobertsLab/resources/wiki/Genomic-Resources).