---
author: Sam White
toc-title: Contents
toc-depth: 5
toc-location: left
date: 2018-04-05 14:20:18+00:00
layout: post
slug: gunzip-bgi-hiseq-geoduck-genome-sequencing-data
title: 'Gunzip - BGI HiSeq Geoduck Genome Sequencing Data '
categories:
  - 2018
  - Geoduck Genome Sequencing
tags:
  - BGI
  - geoduck
  - gunzip
  - mox
  - Panopea generosa
---

In preparation to run SpareAssembler, I needed to gunzip the [BGI gzipped FASTQ files from 20180327](https://robertslab.github.io/sams-notebook/posts/2018/2018-03-27-fastqcmultiqc-bgi-geoduck-genome-sequencing-data/).

Ran the following slurm script on our Mox node:


    <code>
    #!/bin/bash
    ## Job Name
    #SBATCH --job-name=20180405_geoduck_bgi_gunzip
    ## Allocation Definition
    #SBATCH --account=srlab
    #SBATCH --partition=srlab
    ## Resources
    ## Nodes (We only get 1, so this is fixed)
    #SBATCH --nodes=1
    ## Walltime (days-hours:minutes:seconds format)
    #SBATCH --time=30-00:00:00
    ## Memory per node
    #SBATCH --mem=500G
    ##turn on e-mail notification
    #SBATCH --mail-type=ALL
    #SBATCH --mail-user=samwhite@uw.edu
    ## Specify the working directory for this job
    #SBATCH --workdir=/gscratch/scrubbed/samwhite/bgi_geoduck
    
    for i in /gscratch/scrubbed/samwhite/bgi_geoduck/*.gz; do
        filename="${i##*/}"
        no_ext="${filename%%.*}"
        gunzip < "$i" > "$no_ext".fastq
    done
    </code>


##### Results:


Completed in ~45mins. Will proceed with massive geoduck genome assembly!