# code directory contents:

[01-data-explore-sr.Rmd](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/code/01-data-explore-sr.Rmd)
Steven Roberts... exploring the raw data, aligning with `kallisto`; and other things.
Output directory for this code file: [../output/01-data-explore](https://github.com/grace-ac/project-pycno-multispecies-2023/tree/main/output/01-data-explore)

[02-blast-klonetest.Rmd](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/code/02-blast-klonetest.Rmd)
Steven Roberts... testing things out on klone, I believe.
Output directory for this code file: [../output/02-blast-klonetest](https://github.com/grace-ac/project-pycno-multispecies-2023/tree/main/output/02-blast-klonetest)

[03-der-uniq.Rmd](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/code/03-der-uniq.Rmd)
Steven Roberts... empty Rmd... I will delete after I finish this readme page..., and yet there is output for this code.
Output directory for this code file: [../output/03-der-uniq](https://github.com/grace-ac/project-pycno-multispecies-2023/tree/main/output/03-der-uniq)

[04-pyc-hisat.Rmd](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/code/04-pyc-hisat.Rmd)
Steven Roberts... hisat for pycno.
Output directory for this code file does not exist in the repository.

[05-pis-annot.Rmd](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/code/05-pis-annot.Rmd)
Steven Roberts... I think done on klone, so I don't have access to his results.
Output directory for this code file does not exist in the repository.

[06-Go-3-species.Rmd](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/code/06-Go-3-species.Rmd)
Steven Roberts.
Output directory for this code file: [../output/06-Go-3-species](https://github.com/grace-ac/project-pycno-multispecies-2023/tree/main/output/06-Go-3-species)

[07-transcriptome-compare.Rmd](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/code/07-transcriptome-compare.Rmd)
Steven Roberts. The code doesn't match the output, so I'm not sure what was done to get the output files.
Output directory for this code file: [../output/07-transcriptome-compare](https://github.com/grace-ac/project-pycno-multispecies-2023/tree/main/output/07-transcriptome-compare)

[08-trinity-unmapped.Rmd](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/code/08-trinity-unmapped.Rmd)
Steven Roberts.
No output directory for this code file in the repository.

[09-unmap-blast.Rmd](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/code/09-unmap-blast.Rmd)
Steven Roberts.
Output directory for this code file: [../output/09-unmap-blast](https://github.com/grace-ac/project-pycno-multispecies-2023/tree/main/output/09-unmap-blast)

[10-FastQC-pre-trim.Rmd](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/code/10-FastQC-pre-trim.Rmd)
Grace Crandall. Rmd to run `FastQC` on pre-trimmed RNAseq data.

[11-multispecies-RNAseq-trimming.Rmd](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/code/11-multispecies-RNAseq-trimming.Rmd)
Grace Crandall. Rmd to trim RNAseq data.

[12-hisat2_pycno.Rmd](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/code/12-hisat2_pycno.Rmd)
Grace Crandall. Rmd to align _P. helianthoides_ RNAseq data to the _P. helianthoides_ genome.
- `../data`: [Pycno gene count matrix](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/data/pycno_gene_count_matrix_2023.csv) - 23464 genes
- `../data`: [Pycno transcript count matrix](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/data/pycno_transcript_count_matrix_2023.csv) - 25831 transcripts

[13-hisat2_pisaster.Rmd](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/code/13-hisat2_pisaster.Rmd)
Grace Crandall. Rmd to align _P. ochraceus_ RNAseq data to the _P. ochraceus_ genome. CURRENTLY UNSUCCESSFUL.
- `../data`: [Pisaster gene count matrix](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/data/pisaster_gene_count_matrix_2023.csv) - 32370 lines, but all counts are 0
- `../data`: [Pisaster transcript count matrix](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/data/pisaster_transcript_count_matrix_2023.csv) - 35696 lines, but all counts are 0

[14-hisat2_derm.Rmd](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/code/14-hisat2_derm.Rmd)
Grace Crandall. Rmd to align _D. imbricata_ RNAseq data to the _D. imbricata_ genome. CURRENTLY UNSUCCESSFUL.

[15-BLAST-pycno.Rmd](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/code/15-BLAST-pycno.Rmd)
Grace Crandall. Rmd to `BLAST` the _P. helianthoides_ genome to annotate. Note for future self --> can just `BLAST` the count matrices.
Output directory for this code file: [../output/15-BLAST-pycno](https://github.com/grace-ac/project-pycno-multispecies-2023/tree/main/output/15-BLAST-pycno)

[16-BLAST-pisaster.Rmd](https://github.com/grace-ac/project-pycno-multispecies-2023/blob/main/code/16-BLAST-pisaster.Rmd)
Grace Crandall. Rmd to `BLAST` the _P. ochraceus_ genome to annotate. Note for future self --> can just `BLAST` the count matrices.
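
A quick way to confirm the all-zero Pisaster count matrices noted above is to count the rows in which every sample column is zero. The sketch below is an illustration only, not code from this repository. It assumes the matrices are CSV files with a header row, an ID in the first column, and numeric counts in the remaining columns; the function name `summarize_count_matrix` and the toy data are hypothetical.

```python
import csv
import io


def summarize_count_matrix(handle):
    """Return (n_rows, n_all_zero_rows) for a count matrix.

    Assumed layout: one header row, an ID in the first column,
    and numeric counts in every remaining column.
    """
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row
    n_rows = 0
    n_all_zero = 0
    for row in reader:
        if not row:
            continue  # ignore blank lines
        n_rows += 1
        # A row counts as "all zero" when every sample column is 0.
        if all(float(value) == 0 for value in row[1:]):
            n_all_zero += 1
    return n_rows, n_all_zero


# Toy data standing in for a real matrix (hypothetical IDs and samples).
demo = io.StringIO(
    "gene_id,sample_A,sample_B\n"
    "g1,0,0\n"
    "g2,5,0\n"
    "g3,0,0\n"
)
n_rows, n_all_zero = summarize_count_matrix(demo)
print(f"{n_all_zero} of {n_rows} rows are all zero")
```

To run this on a real file, open it with `open(path, newline="")` and pass the handle to the function, for example with `../data/pisaster_gene_count_matrix_2023.csv` as the path. If every row comes back all zero, the problem is more likely in the alignment or counting step than in the matrix file itself.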

---

Notes that Steven put in:

# NCBI Datasets

https://www.ncbi.nlm.nih.gov/datasets

This zip archive contains an NCBI Datasets Data Package.

NCBI Datasets Data Packages can include sequence, annotation and other data files, and metadata in one or more data report files. Data report files are in JSON Lines format.

---

## FAQs

### Where is the data I requested?

Your data is in the subdirectory `ncbi_dataset/data/` contained within this zip archive.

### I still can't find my data, can you help?

We have identified a bug affecting Mac Safari users. When downloading data from the NCBI Datasets web interface, you may see only this README file after the download has completed (while other files appear to be missing). As a workaround to prevent this issue from recurring, we recommend disabling automatic zip archive extraction in Safari until Apple releases a bug fix. For more information, visit: https://www.ncbi.nlm.nih.gov/datasets/docs/reference-docs/mac-zip-bug/

### How do I work with JSON Lines data reports?

Visit our JSON Lines data report documentation page: https://www.ncbi.nlm.nih.gov/datasets/docs/v2/tutorials/working-with-jsonl-data-reports/

### What is NCBI Datasets?

NCBI Datasets is a resource that lets you easily gather data from across NCBI databases. Find and download gene, transcript, protein and genome sequences, annotation and metadata.

### Where can I find NCBI Datasets documentation?

Visit the NCBI Datasets documentation pages: https://www.ncbi.nlm.nih.gov/datasets/docs/

---

National Center for Biotechnology Information
National Library of Medicine
info@ncbi.nlm.nih.gov
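
For the JSON Lines data reports mentioned in the FAQ, each line of the file is one standalone JSON object, so the file can be read line by line. This is a minimal sketch using only the Python standard library; the helper name `read_jsonl` and the field names in the toy data are hypothetical and are not taken from a real NCBI report.

```python
import io
import json


def read_jsonl(handle):
    """Yield one parsed JSON object per non-empty line."""
    for line in handle:
        line = line.strip()
        if line:
            yield json.loads(line)


# Toy report with made-up fields; real reports have their own schema.
demo = io.StringIO(
    '{"accession": "EXAMPLE_1", "organism": "demo"}\n'
    "\n"
    '{"accession": "EXAMPLE_2"}\n'
)
records = list(read_jsonl(demo))
print(len(records), "records read")
```

For a real report, pass an open file handle from under `ncbi_dataset/data/` instead of the `io.StringIO` object.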