Program options for transdecoder_predict:

########################################################################################
#
#  Transcriptome Protein Prediction
#
#
#  Required:
#
#   -t                           transcripts.fasta
#
#  Common options:
#
#   --retain_long_orfs_mode      'dynamic' or 'strict' (default: dynamic)
#                                In dynamic mode, sets range according to 1% FDR in random sequence of same GC content.
#
#   --retain_long_orfs_length    under 'strict' mode, retain all ORFs found that are equal to or longer than this many
#                                nucleotides even if no other evidence marks them as coding
#                                (default: 1000000, so essentially turned off by default)
#
#   --retain_pfam_hits           domain table output file from running hmmscan to search Pfam (see transdecoder.github.io for info)
#                                Any ORF with a Pfam domain hit will be retained in the final output.
#
#   --retain_blastp_hits         blastp output in '-outfmt 6' format.
#                                Any ORF with a blast match will be retained in the final output.
#
#   --single_best_only           Retain only the single best ORF per transcript (prioritized by homology, then ORF length)
#
#   --output_dir | -O            output directory from the TransDecoder.LongOrfs step
#                                (default: basename( -t val ) + ".transdecoder_dir")
#
#   -G                           genetic code (default: universal; see PerlDoc; options: Euplotes, Tetrahymena, Candida, Acetabularia, ...)
#
#   --no_refine_starts           start refinement identifies potential start codons for 5' partial ORFs using a PWM, process on by default.
#
##  Advanced options
#
#   -T                           Top longest ORFs to train Markov Model (hexamer stats) (default: 500)
#                                Note, 10x this value are first selected for removing redundancies,
#                                and then this -T value of longest ORFs are selected from the non-redundant set.
#
#  Genetic Codes
#
#   --genetic_code               Universal (default)
#
#   Genetic Codes (derived from: https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi)
#
#      Acetabularia  Candida  Ciliate  Dasycladacean  Euplotid  Hexamita  Mesodinium
#      Mitochondrial-Ascidian  Mitochondrial-Chlorophycean  Mitochondrial-Echinoderm
#      Mitochondrial-Flatworm  Mitochondrial-Invertebrates  Mitochondrial-Protozoan
#      Mitochondrial-Pterobranchia  Mitochondrial-Scenedesmus_obliquus
#      Mitochondrial-Thraustochytrium  Mitochondrial-Trematode  Mitochondrial-Vertebrates
#      Mitochondrial-Yeast  Pachysolen_tannophilus  Peritrich  SR1_Gracilibacteria
#      Tetrahymena  Universal
#
#   --version                    show version (5.5.0)
#
#########################################################################################

----------------------------------------------

Program options for transdecoder_lORFs:

########################################################################################
#
#  Transcriptome Protein Prediction
#
#
#  Required:
#
#   -t                           transcripts.fasta
#
#  Optional:
#
#   --gene_trans_map             gene-to-transcript identifier mapping file (tab-delimited, gene_id<tab>trans_id)
#
#   -m                           minimum protein length (default: 100)
#
#   -G                           genetic code (default: universal; see PerlDoc; options: Euplotes, Tetrahymena, Candida, Acetabularia)
#
#   -S                           strand-specific (only analyzes top strand)
#
#   --output_dir | -O            path to intended output directory (default: basename( -t val ) + ".transdecoder_dir")
#
#   --genetic_code               Universal (default)
#
#   Genetic Codes (derived from: https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi)
#
#      Acetabularia  Candida  Ciliate  Dasycladacean  Euplotid  Hexamita  Mesodinium
#      Mitochondrial-Ascidian  Mitochondrial-Chlorophycean  Mitochondrial-Echinoderm
#      Mitochondrial-Flatworm  Mitochondrial-Invertebrates  Mitochondrial-Protozoan
#      Mitochondrial-Pterobranchia  Mitochondrial-Scenedesmus_obliquus
#      Mitochondrial-Thraustochytrium  Mitochondrial-Trematode  Mitochondrial-Vertebrates
#      Mitochondrial-Yeast  Pachysolen_tannophilus  Peritrich  SR1_Gracilibacteria
#      Tetrahymena  Universal
#
#   --version                    show version tag (5.5.0)
#
#########################################################################################

----------------------------------------------

Program options for hmmscan:

# hmmscan :: search sequence(s) against a profile database
# HMMER 3.2.1 (June 2018); http://hmmer.org/
# Copyright (C) 2018 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Usage: hmmscan [-options] <hmmdb> <seqfile>

Basic options:
  -h               : show brief help on version and usage

Options controlling output:
  -o <f>           : direct output to file <f>, not stdout
  --tblout <f>     : save parseable table of per-sequence hits to file <f>
  --domtblout <f>  : save parseable table of per-domain hits to file <f>
  --pfamtblout <f> : save table of hits and domains to file, in Pfam format
  --acc            : prefer accessions over names in output
  --noali          : don't output alignments, so output is smaller
  --notextw        : unlimit ASCII text output line width
  --textw <n>      : set max width of ASCII text output lines  [120]  (n>=120)

Options controlling reporting thresholds:
  -E <x>           : report models <= this E-value threshold in output  [10.0]  (x>0)
  -T <x>           : report models >= this score threshold in output
  --domE <x>       : report domains <= this E-value threshold in output  [10.0]  (x>0)
  --domT <x>       : report domains >= this score cutoff in output

Options controlling inclusion (significance) thresholds:
  --incE <x>       : consider models <= this E-value threshold as significant
  --incT <x>       : consider models >= this score threshold as significant
  --incdomE <x>    : consider domains <= this E-value threshold as significant
  --incdomT <x>    : consider domains >= this score threshold as significant

Options for model-specific thresholding:
  --cut_ga         : use profile's GA gathering cutoffs to set all thresholding
  --cut_nc         : use profile's NC noise cutoffs to set all thresholding
  --cut_tc         : use profile's TC trusted cutoffs to set all thresholding

Options controlling acceleration heuristics:
  --max            : Turn all heuristic filters off (less speed, more power)
  --F1 <x>         : MSV threshold: promote hits w/ P <= F1  [0.02]
  --F2 <x>         : Vit threshold: promote hits w/ P <= F2  [1e-3]
  --F3 <x>         : Fwd threshold: promote hits w/ P <= F3  [1e-5]
  --nobias         : turn off composition bias filter

Other expert options:
  --nonull2        : turn off biased composition score corrections
  -Z <x>           : set # of comparisons done, for E-value calculation
  --domZ <x>       : set # of significant seqs, for domain E-value calculation
  --seed <n>       : set RNG seed to <n> (if 0: one-time arbitrary seed)  [42]
  --qformat <s>    : assert input <seqfile> is in format <s>: no autodetection
  --cpu <n>        : number of parallel CPU workers to use for multithreads  [2]

----------------------------------------------

Program options for blastp:

USAGE
  blastp [-h] [-help] [-import_search_strategy filename]
    [-export_search_strategy filename] [-task task_name] [-db database_name]
    [-dbsize num_letters] [-gilist filename] [-seqidlist filename]
    [-negative_gilist filename] [-negative_seqidlist filename]
    [-taxids taxids] [-negative_taxids taxids] [-taxidlist filename]
    [-negative_taxidlist filename] [-entrez_query entrez_query]
    [-db_soft_mask filtering_algorithm] [-db_hard_mask filtering_algorithm]
    [-subject subject_input_file] [-subject_loc range] [-query input_file]
    [-out output_file] [-evalue evalue] [-word_size int_value]
    [-gapopen open_penalty] [-gapextend extend_penalty]
    [-qcov_hsp_perc float_value] [-max_hsps int_value]
    [-xdrop_ungap float_value] [-xdrop_gap float_value]
    [-xdrop_gap_final float_value] [-searchsp int_value] [-seg SEG_options]
    [-soft_masking soft_masking] [-matrix matrix_name]
    [-threshold float_value] [-culling_limit int_value]
    [-best_hit_overhang float_value] [-best_hit_score_edge float_value]
    [-subject_besthit] [-window_size int_value] [-lcase_masking]
    [-query_loc range] [-parse_deflines] [-outfmt format] [-show_gis]
    [-num_descriptions int_value] [-num_alignments int_value]
    [-line_length line_length] [-html] [-max_target_seqs num_sequences]
    [-num_threads int_value] [-ungapped] [-remote] [-comp_based_stats compo]
    [-use_sw_tback] [-version]

DESCRIPTION
   Protein-Protein BLAST 2.8.1+

Use '-help' to print detailed descriptions of command line arguments

----------------------------------------------
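
The four programs above are typically chained: the long-ORF step writes candidate peptides, hmmscan and blastp search those peptides, and the predict step takes both result tables through --retain_pfam_hits and --retain_blastp_hits. The sketch below only assembles the command lines and runs nothing. Several details are assumptions not stated in the help text: the executable names TransDecoder.LongOrfs and TransDecoder.Predict (your installation may expose them as transdecoder_lORFs and transdecoder_predict), the peptide file name longest_orfs.pep inside the output directory, the file names pfam.domtblout and blastp.outfmt6, the database names, and the blastp settings -evalue 1e-5 and -max_target_seqs 1, which are illustrative choices.

```python
import shlex
from pathlib import Path


def build_pipeline(transcripts, pfam_db, blast_db, threads=2, min_prot_len=100,
                   longorfs_exe="TransDecoder.LongOrfs",   # assumed executable name
                   predict_exe="TransDecoder.Predict"):    # assumed executable name
    """Return the four command lines (as argument lists) in run order."""
    # Default output directory per the help text:
    # basename(-t val) + ".transdecoder_dir"
    out_dir = Path(transcripts).name + ".transdecoder_dir"
    # Assumption: the long-ORF step writes its candidate peptides here.
    orfs_pep = f"{out_dir}/longest_orfs.pep"
    pfam_tbl = "pfam.domtblout"    # hypothetical output file name
    blast_tbl = "blastp.outfmt6"   # hypothetical output file name

    return [
        # 1. Extract long ORFs (-m is the minimum protein length).
        [longorfs_exe, "-t", transcripts, "-m", str(min_prot_len)],
        # 2. Pfam search; --domtblout is the per-domain table the predict step reads.
        ["hmmscan", "--cpu", str(threads), "--domtblout", pfam_tbl,
         pfam_db, orfs_pep],
        # 3. Protein BLAST; the predict step expects '-outfmt 6'.
        ["blastp", "-query", orfs_pep, "-db", blast_db, "-outfmt", "6",
         "-evalue", "1e-5", "-max_target_seqs", "1",
         "-num_threads", str(threads), "-out", blast_tbl],
        # 4. Final prediction, retaining ORFs with Pfam or BLAST support.
        [predict_exe, "-t", transcripts,
         "--retain_pfam_hits", pfam_tbl,
         "--retain_blastp_hits", blast_tbl],
    ]


if __name__ == "__main__":
    # Database names are placeholders; substitute your own paths.
    for cmd in build_pipeline("transcripts.fasta", "Pfam-A.hmm", "uniprot_sprot.fasta"):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to check paths and thresholds before committing to a long search. Each list can then be passed to subprocess.run(cmd, check=True) in order.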