--- author: Sam White toc-title: Contents toc-depth: 5 toc-location: left date: 2017-10-09 17:41:41+00:00 layout: post slug: data-management-convert-oly-pacbio-h5-to-fastq title: Data Management - Convert Oly PacBio H5 to FASTQ categories: - 2017 - Olympia Oyster Genome Sequencing tags: - FASTQ - olympia oyster - Ostrea lurida - PacBio --- After working with all of this Olympia oyster genome sequencing data, I remembered that we had an old, singular PacBio SMRT cell file (from June 2013). This file didn't seem to be included in any recent assemblies of Sean's or mine. This is most likely because we have it in the PacBio H5 format and not in FASTQ. I installed [PacBio's pbh5tools](https://github.com/PacificBiosciences/pbh5tools) on my computer (swoose), converted the file and moved it to owl/nightingales/O_lurida python bash5tools.py /mnt/owl/nightingales/O_lurida/m130619_081336_42134_c100525122550000001823081109281326_s1_p0.bas.h5 --outType fastq I generated an MD5 checksum and appended to the `checksums.md5` file in /owl/nightingales/O_lurida using the following command: md5sum m130619_081336_42134_c100525122550000001823081109281326_s1_p0.fastq | awk '{print $2 " = " $1}' >> checksums.md5 The command above pipes the output to awk to format the output to match the existing format of the `checksums.md5` file (i.e. filename = hash). I've also updated our [Nightingales spreadsheet (Google Sheet)(https://docs.google.com/spreadsheets/d/1_XqIOPVHSBVGscnjzDSWUeRL7HUHXfaHxVzec-I-8Xk/edit?usp=sharing) to reflect this. Will generate updated PacBio assemblies with Canu and/or Racon.