miRanda is target-prediction software used to identify likely miRNA-mRNA interactions.
Inputs:
FASTA of A. pulchra 3’UTRs (Apul_3UTR_1kb.fasta), generated in 05-Apul-annotate-UTRs
FASTA of A. pulchra mature miRNAs (miRNA_mature-Apul.fasta). The miRNAs were identified in 04-Apul-sRNA-discovery-ShortStack, and the mature sequences were isolated for use in 06-Apul-miRNA-mRNA-RNAhybrid
Outputs:
# -sc: score threshold of 100
# -en: energy threshold of -10 kcal/mol
# -strict: demand strict 5' seed pairing
/home/shared/miRanda-3.3a/src/miranda \
../data/06-Apul-miRNA-mRNA-RNAhybrid/miRNA_mature-Apul.fasta \
../output/05-Apul-annotate-UTRs/Apul_3UTR_1kb.fasta \
-sc 100 \
-en -10 \
-strict \
-out ../output/07-Apul-miRNA-mRNA-miRanda/Apul-miRanda-3UTR-strict_all.tab
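Because miRanda scans every query miRNA against every reference sequence, the size of the job grows with the product of the two FASTA record counts. A minimal sketch for checking those counts before launching a long run follows. `count_fasta` is a hypothetical helper, and the two temporary files are made-up stand-ins for the real inputs, not sequences from this project.

```shell
# Hypothetical helper: count FASTA records by counting header lines.
count_fasta() {
  grep -c '^>' "$1"
}

# Made-up stand-ins for the miRNA and 3'UTR FASTA files (placeholder sequences).
demo_mirna=$(mktemp)
demo_utr=$(mktemp)
printf '>miR_A\nACGUACGUACGUACGUACGUA\n>miR_B\nUGCAUGCAUGCAUGCAUGCAUG\n' > "$demo_mirna"
printf '>utr_1\nACGTACGTAC\n>utr_2\nTTGACCATGA\n>utr_3\nGGCATTCAGT\n' > "$demo_utr"

n_mirna=$(count_fasta "$demo_mirna")
n_utr=$(count_fasta "$demo_utr")
echo "miRNAs: $n_mirna, UTRs: $n_utr, pairs to scan: $((n_mirna * n_utr))"
```

Pointing `count_fasta` at the real miRNA and 3’UTR files would give a rough idea of how many scans to expect.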
Let’s look at the output
echo "miranda run finished!"
echo "counting 'Performing Scan' entries in the raw output"
grep -c "Performing Scan" ../output/07-Apul-miRNA-mRNA-miRanda/Apul-miRanda-3UTR-strict_all.tab
echo "Parsing output"
grep -A 1 "Scores for this hit:" ../output/07-Apul-miRNA-mRNA-miRanda/Apul-miRanda-3UTR-strict_all.tab | sort | grep '>' > ../output/07-Apul-miRNA-mRNA-miRanda/Apul-miRanda-3UTR-strict-parsed.txt
echo "counting lines in the raw output"
wc -l ../output/07-Apul-miRNA-mRNA-miRanda/Apul-miRanda-3UTR-strict_all.tab
## miranda run finished!
## counting 'Performing Scan' entries in the raw output
## 1905309
## Parsing output
## counting lines in the raw output
## 16835258 ../output/07-Apul-miRNA-mRNA-miRanda/Apul-miRanda-3UTR-strict_all.tab
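The parsing step above keeps only the tab-separated summary line that follows each `Scores for this hit:` marker. `grep -A 1` prints the marker plus the next line, `sort` orders the result, and `grep '>'` drops the marker lines and the `--` group separators. A small sketch of that pipeline on a made-up excerpt is shown here. `parse_hits` is a hypothetical wrapper, and the mock text only imitates the lines the pipeline relies on, not the full miRanda output.

```shell
# Hypothetical wrapper around the parsing pipeline used in this document.
parse_hits() {
  grep -A 1 "Scores for this hit:" "$1" | sort | grep '>'
}

# Made-up excerpt shaped like raw miRanda output (not real results).
demo_raw=$(mktemp)
{
  printf 'Performing Scan: miR_B vs utr_2\n'
  printf 'Scores for this hit:\n'
  printf '>miR_B\tutr_2\t150.00\t-12.28\t2 20\t975 999\t21\t61.90%%\t76.19%%\n'
  printf '\n'
  printf 'Performing Scan: miR_A vs utr_1\n'
  printf 'Scores for this hit:\n'
  printf '>miR_A\tutr_1\t159.00\t-17.21\t2 21\t826 849\t21\t66.67%%\t80.95%%\n'
} > "$demo_raw"

# One line per hit, sorted by miRNA name.
parse_hits "$demo_raw"
```

Counting the lines of the parsed file (for example with `wc -l`) then gives the number of individual hits, which is a different quantity from the line count of the raw output.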
That is a lot of putative interactions, but we can probably narrow them down. In vertebrates, miRNA-mRNA binding only requires complementarity of a miRNA seed region of ~8 nucleotides, and this requirement is built into miRanda target prediction. In cnidarians, however, miRNA-mRNA binding is believed to require near-complete complementarity of the full mature miRNA, as in plants. Let’s see how many putative interactions are predicted with an alignment length of at least 21 nucleotides (the length of our smallest mature miRNA).
echo "number of putative interactions of at least 21 nucleotides"
awk -F'\t' '$7 >= 21' ../output/07-Apul-miRNA-mRNA-miRanda/Apul-miRanda-3UTR-strict-parsed.txt | wc -l
echo ""
echo "check some:"
awk -F'\t' '$7 >= 21' ../output/07-Apul-miRNA-mRNA-miRanda/Apul-miRanda-3UTR-strict-parsed.txt | head -5
## number of putative interactions of at least 21 nucleotides
## 18420
## 
## check some:
## >Cluster_10452.mature::ptg000020l:10483758-10483779(-)   ntLink_4:203294-204294  159.00  -17.21  2 21    826 849 21  66.67%  80.95%
## >Cluster_10452.mature::ptg000020l:10483758-10483779(-)   ntLink_4:241021-242021  159.00  -17.21  2 21    396 419 21  66.67%  80.95%
## >Cluster_10452.mature::ptg000020l:10483758-10483779(-)   ntLink_6:11395524-11396524  151.00  -13.49  2 21    890 914 22  63.64%  72.73%
## >Cluster_10452.mature::ptg000020l:10483758-10483779(-)   ntLink_6:12015318-12016318  150.00  -12.28  2 20    975 999 21  61.90%  76.19%
## >Cluster_10452.mature::ptg000020l:10483758-10483779(-)   ntLink_6:12240644-12241644  154.00  -18.86  2 21    361 387 24  70.83%  70.83%
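The parsed lines have no header, so it can help to print one hit with labels attached. The sketch below does that for a single made-up line. The labels are assumptions based on how the columns are used in this document (column 7 as alignment length, columns 8 and 9 as the two alignment percentages) and on the `-sc` and `-en` thresholds. `label_hit` is a hypothetical helper.

```shell
# Hypothetical helper: attach assumed column labels to parsed miRanda hit lines.
label_hit() {
  awk -F'\t' '{
    printf "miRNA=%s\ntarget=%s\nscore=%s\nenergy=%s\n", $1, $2, $3, $4
    printf "mirna_pos=%s\nutr_pos=%s\naln_len=%s\n", $5, $6, $7
    printf "pct_1=%s\npct_2=%s\n", $8, $9
  }'
}

# Made-up hit line in the same tab-separated layout as the parsed file.
printf '>miR_A\tutr_1\t159.00\t-17.21\t2 21\t826 849\t21\t66.67%%\t80.95%%\n' | label_hit
```

Note that the two position columns each hold a start and an end separated by a space inside a single tab-delimited field, which is why the alignment length is field 7 rather than field 9.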
The alignment percentages (last two columns) show that this set still includes alignments with multiple mismatches. Let’s filter again to limit the number of permissible mismatches to no more than 3. For an alignment of 21 nucleotides, that corresponds to an alignment rate of at least (21-3)/21 = 85.7%.
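The same arithmetic can be repeated for other alignment lengths, since the percentage equivalent of "at most 3 mismatches" shifts as the alignment gets longer. A minimal sketch, where `min_identity` is a hypothetical helper and the lengths 21 to 24 are chosen only for illustration:

```shell
# Hypothetical helper: minimum percent match for a given alignment length
# and maximum number of mismatches, i.e. 100 * (len - mm) / len.
min_identity() {
  awk -v len="$1" -v mm="$2" 'BEGIN { printf "%.2f\n", 100 * (len - mm) / len }'
}

for len in 21 22 23 24; do
  printf '%s nt, 3 mismatches: %s%%\n' "$len" "$(min_identity "$len" 3)"
done
```

For 21 nt this gives 85.71, and for 24 nt it gives 87.50, so a single fixed percentage cutoff is slightly more permissive in mismatch terms for longer alignments than for shorter ones.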
echo "number of putative interactions of at least 21 nucleotides, with at most 3 mismatches"
awk -F'\t' '$7 >= 21' ../output/07-Apul-miRNA-mRNA-miRanda/Apul-miRanda-3UTR-strict-parsed.txt | awk -F'\t' '$8 >= 85' | wc -l
echo ""
echo "check some:"
awk -F'\t' '$7 >= 21' ../output/07-Apul-miRNA-mRNA-miRanda/Apul-miRanda-3UTR-strict-parsed.txt | awk -F'\t' '$8 >= 85' | head -5
## number of putative interactions of at least 21 nucleotides, with at most 3 mismatches
## 33
## 
## check some:
## >Cluster_14532.mature::ptg000025l:7472581-7472603(-) ptg000007l:4113238-4114238  174.00  -19.59  2 22    352 374 21  85.71%  85.71%
## >Cluster_14532.mature::ptg000025l:7472581-7472603(-) ptg000016l:7511874-7512874  180.00  -21.41  2 22    573 596 21  85.71%  85.71%
## >Cluster_14610.mature::ptg000025l:10668923-10668945(-)   ptg000016l:1601190-1602190  180.00  -19.49  2 22    785 808 21  85.71%  85.71%
## >Cluster_14610.mature::ptg000025l:10668923-10668945(-)   ptg000021l:2346486-2347486  180.00  -16.96  2 22    838 861 21  85.71%  85.71%
## >Cluster_14633.mature::ptg000025l:11346116-11346137(+)   ptg000004l:10107854-10108854    178.00  -17.36  2 21    54 77   21  85.71%  90.48%
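One caveat with the percentage filter: the values in column 8 carry a trailing `%`, so awk treats them as strings, and `$8 >= 85` becomes a string comparison against "85". Under that comparison a value such as `100.00%` sorts below `85` and would be dropped, if any such hits exist. The sketch below strips the `%` and compares numerically. `filter_hits` is a hypothetical helper, and the three hit lines are made up to show the difference, not taken from the results.

```shell
# Hypothetical helper: keep hits with alignment length >= min_len and
# column-8 percentage >= min_pct, comparing the percentage as a number.
filter_hits() {  # usage: filter_hits <min_len> <min_pct> < parsed.txt
  awk -F'\t' -v min_len="$1" -v min_pct="$2" '{
    pct = $8
    sub(/%$/, "", pct)          # drop the trailing percent sign
    if ($7 + 0 >= min_len + 0 && pct + 0 >= min_pct + 0) print
  }'
}

# Made-up parsed lines (not real results): 85.71%, 100.00% and 66.67%.
demo_parsed=$(mktemp)
{
  printf '>miR_A\tutr_1\t174.00\t-19.59\t2 22\t352 374\t21\t85.71%%\t85.71%%\n'
  printf '>miR_B\tutr_2\t190.00\t-25.00\t2 22\t100 122\t21\t100.00%%\t100.00%%\n'
  printf '>miR_C\tutr_3\t159.00\t-17.21\t2 21\t826 849\t21\t66.67%%\t80.95%%\n'
} > "$demo_parsed"

n_string=$(awk -F'\t' '$7 >= 21 && $8 >= 85' "$demo_parsed" | wc -l | tr -d ' ')
n_numeric=$(filter_hits 21 85 < "$demo_parsed" | wc -l | tr -d ' ')
echo "string comparison keeps $n_string hit(s); numeric comparison keeps $n_numeric"
```

On the mock data the string comparison keeps only the 85.71% line, while the numeric version also keeps the 100.00% line. Whether this changes the count reported above depends on whether the real output contains any perfect alignments.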