--- author: Sam White toc-title: Contents toc-depth: 5 toc-location: left layout: post title: Transcriptome Annotation - Geoduck Ctenidia with BLASTx on Mox date: '2019-03-18 15:37' tags: - geoduck - blastx - Panopea generosa - mox - transcriptome categories: - 2019 - Miscellaneous --- I'll be annotating the [transcriptome assembly (from 20190215)](https://robertslab.github.io/sams-notebook/posts/2019/2019-02-15-Transcriptome-Assembly---Geoduck-Tissue-Specific-Assembly-Ctenidia/) using [Trinotate](https://github.com/Trinotate/Trinotate.github.io/wiki) and part of that process is having BLASTx output for the Trinity assembly, so have run BLASTx on Mox. SBATCH script: - [20190318_blastx_geoduck_ctenidia_RNAseq.sh](https://raw.githubusercontent.com/RobertsLab/sams-notebook/master/sbatch_scripts/20190318_blastx_geoduck_ctenidia_RNAseq.sh)
#!/bin/bash
## Job Name
#SBATCH --job-name=blastx_ctendia
## Allocation Definition
#SBATCH --account=coenv
#SBATCH --partition=coenv
## Resources
## Nodes
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=25-00:00:00
## Memory per node
#SBATCH --mem=120G
##turn on e-mail notification
#SBATCH --mail-type=ALL
#SBATCH --mail-user=samwhite@uw.edu
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/scrubbed/samwhite/outputs/20190318_blastx_geoduck_ctenidia_RNAseq
# Load Python Mox module for Python module availability
module load intel-python3_2017
# Document programs in PATH (primarily for program version ID)
date >> system_path.log
echo "" >> system_path.log
echo "System PATH for $SLURM_JOB_ID" >> system_path.log
echo "" >> system_path.log
printf "%0.s-" {1..10} >> system_path.log
echo ${PATH} | tr : \\n >> system_path.log
wd="$(pwd)"
# Paths to input/output files
blastx_out="${wd}/blastx.outfmt6"
sp_db="/gscratch/srlab/programs/Trinotate-v3.1.1/admin/uniprot_sprot.pep"
trinity_fasta="/gscratch/scrubbed/samwhite/outputs/20190215_trinity_geoduck_ctenidia_RNAseq/trinity_out_dir/Trinity.fasta"
# Paths to programs
blast_dir="/gscratch/srlab/programs/ncbi-blast-2.8.1+/bin"
blastx="${blast_dir}/blastx"
# Run blastx on Trinity fasta
${blastx} \
-query ${trinity_fasta} \
-db ${sp_db} \
-max_target_seqs 1 \
-outfmt 6 \
-evalue 1e-3 \
-num_threads 28 \
> ${blastx_out}
---
# RESULTS
Output folder:
- [20190318_blastx_geoduck_ctenidia_RNAseq](http://gannet.fish.washington.edu/Atumefaciens/20190318_blastx_geoduck_ctenidia_RNAseq)
BLASTx output (output format 6):
- [20190318_blastx_geoduck_ctenidia_RNAseq/blastx.outfmt6](http://gannet.fish.washington.edu/Atumefaciens/20190318_blastx_geoduck_ctenidia_RNAseq/blastx.outfmt6)