Download the UniProt Swiss-Prot database (the reviewed FASTA file from the complete knowledgebase release), rename it with a release tag, and decompress it while keeping the original archive.

    cd ../data
    curl -O https://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.fasta.gz
    mv uniprot_sprot.fasta.gz uniprot_sprot_r2023_01.fasta.gz
    gunzip -k uniprot_sprot_r2023_01.fasta.gz
    ls ../data

Use the NCBI BLAST command ‘makeblastdb’ to create a protein BLAST database from the UniProt FASTA file.

    /home/shared/ncbi-blast-2.11.0+/bin/makeblastdb \
    -in /home/shared/8TB_HDD_02/cvaldi/celeste-tunicate-devo/data/uniprot_sprot_r2023_01.fasta \
    -dbtype prot \
    -out /home/shared/8TB_HDD_02/cvaldi/celeste-tunicate-devo/output/blastdb/uniprot_sprot_r2023_01

Let's look at what's in the sequence.fasta file before we BLAST it.

    head -n 20 ../data/psuedo-alignment/sequence.fasta
    echo "How many sequences are there?"
    grep -c ">" ../data/psuedo-alignment/sequence.fasta

BLAST the reference transcriptome against the UniProt database with blastx, keeping only hits with an e-value of 1E-20 or lower and writing tabular output (-outfmt 6):

    /home/shared/ncbi-blast-2.11.0+/bin/blastx \
    -query ../data/sequence.fasta \
    -db ../output/blastdb/uniprot_sprot_r2023_01 \
    -out ../output/Bsc-uniprot_blastx.tab \
    -evalue 1E-20 \
    -num_threads 20 \
    -max_target_seqs 1 \
    -outfmt 6

Let's take a little peek at the tab file we just created: the first two hits, then the total number of lines.

    head -2 ../output/Bsc-uniprot_blastx.tab
    wc -l ../output/Bsc-uniprot_blastx.tab
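
The line count from `wc -l` is not necessarily the number of annotated transcripts, because a query can appear on more than one row when it has several alignments to its hit. As a rough sketch, the table could be summarized by counting rows, distinct query IDs (column 1), and distinct subject IDs (column 2) of the default `-outfmt 6` layout. The function name `summarize_blast_tab` is a hypothetical helper and not part of the original workflow; the file path is the one used in the blastx step.

```shell
# Hypothetical helper (not part of the original workflow): summarize a
# BLAST tabular (-outfmt 6) file. Default columns are:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
summarize_blast_tab() {
  tab_file="$1"
  # Total rows: one per reported alignment.
  total=$(wc -l < "$tab_file" | tr -d ' ')
  # Distinct query IDs (column 1): queries with at least one hit.
  queries=$(cut -f1 "$tab_file" | sort -u | wc -l | tr -d ' ')
  # Distinct subject IDs (column 2): database entries that were hit.
  subjects=$(cut -f2 "$tab_file" | sort -u | wc -l | tr -d ' ')
  printf 'rows\t%s\nqueries_with_hits\t%s\nunique_subjects\t%s\n' \
    "$total" "$queries" "$subjects"
}

# Run on the blastx output if it exists in the expected location.
if [ -f ../output/Bsc-uniprot_blastx.tab ]; then
  summarize_blast_tab ../output/Bsc-uniprot_blastx.tab
fi
```

Comparing `queries_with_hits` with the sequence count from `grep -c ">"` gives a quick sense of what fraction of the transcriptome received a hit at this e-value cutoff.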