--- author: Sam White toc-title: Contents toc-depth: 5 toc-location: left layout: post title: BLASTx - Clupea pallasii (Pacific herring) liver and testes transcriptomes on Mox date: '2018-12-12 12:10' tags: - blastx - Clupea pallasii - Pacific herring - transcriptome - liver - testes - Mox categories: - 2018 - Miscellaneous --- Apparently we've had some interest in the Pacific herring transcriptomes (liver and testes) we produced many years ago. As such, [Steven asked that I do a quick BLASTx to help annotate these two transcriptomes](https://github.com/RobertsLab/resources/issues/511). Two FastA files were downloaded from FigShare to Mox with the following commands: Liver transcriptome: ```wget https://s3-eu-west-1.amazonaws.com/pfigshare-u-files/88394/HerringHepaticTranscriptome34300contigs.fa``` Testes transcriptome: ```wget https://s3-eu-west-1.amazonaws.com/pfigshare-u-files/88395/HerringTesticularTranscriptome31545contigs.fa``` Used the UniProtKB database that I downloaded on 20181008 as the blast database. Here are the SBATCH script files used to run these jobs: Liver job: - [20181212_blastx_herring_transcriptomes/20181212_herring_liver_blastx.sh](http://gannet.fish.washington.edu/Atumefaciens/20181212_blastx_herring_transcriptomes/20181212_herring_liver_blastx.sh) Testes job: - [20181212_blastx_herring_transcriptomes/20181212_herring_testes_blastx.sh](http://gannet.fish.washington.edu/Atumefaciens/20181212_blastx_herring_transcriptomes/20181212_herring_testes_blastx.sh) Since the two scripts are fairly short in length, I'll put the full contents below. Liver job:
#!/bin/bash
## Job Name
#SBATCH --job-name=herring_blast
## Allocation Definition
#SBATCH --account=srlab
#SBATCH --partition=srlab
## Resources
## Nodes
#SBATCH --nodes=2
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=02-0:00:10
## Memory per node
#SBATCH --mem=120
##turn on e-mail notification
#SBATCH --mail-type=ALL
#SBATCH --mail-user=samwhite@uw.edu
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/srlab/sam/outputs/20181212_blastx_herring_liver
# Load Python Mox module for Python module availability
module load intel-python3_2017
# Document programs in PATH (primarily for program version ID)
date >> system_path.log
echo "" >> system_path.log
echo "System PATH for $SLURM_JOB_ID" >> system_path.log
echo "" >> system_path.log
printf "%0.s-" {1..10} >> system_path.log
echo ${PATH} | tr : \\n >> system_path.log
# Copy SBATCH script to working directory
cp /gscratch/srlab/sam/sbatch_scripts/20181212_herring_liver_blastx.sh .
# Make blast database available to blast
export BLASTDB=/gscratch/srlab/blastdbs/UniProtKB_20181008
# Set variables
## BLASTx
blastx="/gscratch/srlab/programs/ncbi-blast-2.6.0+/bin/blastx"
## UniProt database
uniprot="/gscratch/srlab/blastdbs/UniProtKB_20181008/20181008_uniprot_sprot.fasta"
liver_fasta="/gscratch/srlab/sam/data/C_pallasii/transcriptomes/HerringHepaticTranscriptome34300contigs.fa"
# Run blastp on UniProt database
${blastx} \
-query ${liver_fasta} \
-db ${uniprot} \
-max_target_seqs 1 \
-outfmt 6 \
-num_threads 56 \
> 20181212.herring.liver.blastx.outfmt6 \
2> 20181212.herring.liver.blastx.err
Testes job:
#!/bin/bash
## Job Name
#SBATCH --job-name=herring_blast
## Allocation Definition
#SBATCH --account=coenv
#SBATCH --partition=coenv
## Resources
## Nodes
#SBATCH --nodes=2
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=02-0:00:10
## Memory per node
#SBATCH --mem=120
##turn on e-mail notification
#SBATCH --mail-type=ALL
#SBATCH --mail-user=samwhite@uw.edu
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/srlab/sam/outputs/20181212_blastx_herring_liver
# Load Python Mox module for Python module availability
module load intel-python3_2017
# Document programs in PATH (primarily for program version ID)
date >> system_path.log
echo "" >> system_path.log
echo "System PATH for $SLURM_JOB_ID" >> system_path.log
echo "" >> system_path.log
printf "%0.s-" {1..10} >> system_path.log
echo ${PATH} | tr : \\n >> system_path.log
# Copy SBATCH script to working directory
cp /gscratch/srlab/sam/sbatch_scripts/20181212_herring_testes_blastx.sh .
# Make blast database available to blast
export BLASTDB=/gscratch/srlab/blastdbs/UniProtKB_20181008
# Set variables
## BLASTx
blastx="/gscratch/srlab/programs/ncbi-blast-2.6.0+/bin/blastx"
## UniProt database
uniprot="/gscratch/srlab/blastdbs/UniProtKB_20181008/20181008_uniprot_sprot.fasta"
testes_fasta="/gscratch/srlab/sam/data/C_pallasii/transcriptomes/HerringTesticularTranscriptome31545contigs.fa"
# Run blastp on UniProt database
${blastx} \
-query ${testes_fasta} \
-db ${uniprot} \
-max_target_seqs 1 \
-outfmt 6 \
-num_threads 56 \
> 20181212.herring.testes.blastx.outfmt6 \
2> 20181212.herring.testes.blastx.err
---
# RESULTS
Output folder:
- [20181212_blastx_herring_transcriptomes/](http://gannet.fish.washington.edu/Atumefaciens/20181212_blastx_herring_transcriptomes/)
Liver BLASTx output file:
- [20181212_blastx_herring_transcriptomes/20181212.herring.liver.blastx.outfmt6](http://gannet.fish.washington.edu/Atumefaciens/20181212_blastx_herring_transcriptomes/20181212.herring.liver.blastx.outfmt6)
Testes BLASTx output file:
- [20181212_blastx_herring_transcriptomes/20181212.herring.testes.blastx.outfmt6](http://gannet.fish.washington.edu/Atumefaciens/20181212_blastx_herring_transcriptomes/20181212.herring.testes.blastx.outfmt6)