--- author: Sam White toc-title: Contents toc-depth: 5 toc-location: left layout: post title: Data Management - Create Geoduck Bisulfite Genomes with Bismark on Mox date: '2019-02-21 20:47' tags: - Panopea generosa - geoduck - mox - bismark categories: - 2019 - Miscellaneous --- In preparation for some a [quick and "dirty" bisulfite analysis](https://github.com/RobertsLab/resources/issues/589), I needed to make bisulfite genomes with, and for, Bismark to use. SBATCH script contains all details regarding which genome was used as input and which software versions were used to produce the subsequent bisulfite genome: - [20190221_pgenerosa_genome_prep_bismark.sh](http://gannet.fish.washington.edu/Atumefaciens/20190221_pgenerosa_genome_prep_bismark/20190221_pgenerosa_genome_prep_bismark.sh)
#!/bin/bash
## Job Name
#SBATCH --job-name=bismark
## Allocation Definition
#SBATCH --account=coenv
#SBATCH --partition=coenv
## Resources
## Nodes
#SBATCH --nodes=2
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=10-00:00:00
## Memory per node
#SBATCH --mem=120G
##turn on e-mail notification
#SBATCH --mail-type=ALL
#SBATCH --mail-user=samwhite@uw.edu
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/scrubbed/samwhite/outputs/20190221_pgenerosa_genome_prep_bismark
# Load Python Mox module for Python module availability
module load intel-python3_2017
# Document programs in PATH (primarily for program version ID)
date >> system_path.log
echo "" >> system_path.log
echo "System PATH for $SLURM_JOB_ID" >> system_path.log
echo "" >> system_path.log
printf "%0.s-" {1..10} >> system_path.log
echo ${PATH} | tr : \\n >> system_path.log
# Directories and programs
## Programs
bismark_dir="/gscratch/srlab/programs/Bismark-0.19.0"
bowtie2_dir="/gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64"
samtools="/gscratch/srlab/programs/samtools-1.9/samtools"
## Files
genome_v070="/gscratch/scrubbed/samwhite/data/P_generosa/Pgenerosa_v070"
genome_v071="/gscratch/scrubbed/samwhite/data/P_generosa/Pgenerosa_v071"
genome_v073="/gscratch/scrubbed/samwhite/data/P_generosa/Pgenerosa_v073"
# Prepare bisulfite genomes
${bismark_dir}/bismark_genome_preparation \
--path_to_bowtie ${bowtie2_dir} \
--verbose ${genome_v070}
${bismark_dir}/bismark_genome_preparation \
--path_to_bowtie ${bowtie2_dir} \
--verbose ${genome_v071}
${bismark_dir}/bismark_genome_preparation \
--path_to_bowtie ${bowtie2_dir} \
--verbose ${genome_v073}
---
# RESULTS
Output folders were ```rsync```ed to the Owl [genomic databank directory](http://owl.fish.washington.edu/halfshell/genomic-databank/), and gzipped:
- [Pgenerosa_v070_bisulfite.tar.gz](http://owl.fish.washington.edu/halfshell/genomic-databank/Pgenerosa_v070_bisulfite.tar.gz)
- [Pgenerosa_v071_bisulfite.tar.gz](http://owl.fish.washington.edu/halfshell/genomic-databank/Pgenerosa_v071_bisulfite.tar.gz)
- [Pgenerosa_v073_bisulfite.tar.gz](http://owl.fish.washington.edu/halfshell/genomic-databank/Pgenerosa_v073_bisulfite.tar.gz)
These contain all the necessary files for use with Bismark, as well as a ```readme``` file and a copy of the SBATCH script.
Will add to our [Genomic Resources wiki](https://github.com/RobertsLab/resources/wiki/Genomic-Resources)