Program options for hisat2:

HISAT2 version 2.1.0 by Daehwan Kim (infphilo@gmail.com, www.ccb.jhu.edu/people/infphilo)
Usage:
  hisat2 [options]* -x <ht2-idx> {-1 <m1> -2 <m2> | -U <r> | --sra-acc <SRA accession number>} [-S <sam>]

  <ht2-idx>  Index filename prefix (minus trailing .X.ht2).
  <m1>       Files with #1 mates, paired with files in <m2>.
             Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
  <m2>       Files with #2 mates, paired with files in <m1>.
             Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
  <r>        Files with unpaired reads.
             Could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2).
  <SRA accession number>        Comma-separated list of SRA accession numbers, e.g. --sra-acc SRR353653,SRR353654.
  <sam>      File for SAM output (default: stdout)

  <m1>, <m2>, <r> can be comma-separated lists (no whitespace) and can be
  specified many times.  E.g. '-U file1.fq,file2.fq -U file3.fq'.

Options (defaults in parentheses):

 Input:
  -q                 query input files are FASTQ .fq/.fastq (default)
  --qseq             query input files are in Illumina's qseq format
  -f                 query input files are (multi-)FASTA .fa/.mfa
  -r                 query input files are raw one-sequence-per-line
  -c                 <m1>, <m2>, <r> are sequences themselves, not files
  -s/--skip <int>    skip the first <int> reads/pairs in the input (none)
  -u/--upto <int>    stop after first <int> reads/pairs (no limit)
  -5/--trim5 <int>   trim <int> bases from 5'/left end of reads (0)
  -3/--trim3 <int>   trim <int> bases from 3'/right end of reads (0)
  --phred33          qualities are Phred+33 (default)
  --phred64          qualities are Phred+64
  --int-quals        qualities encoded as space-delimited integers
  --sra-acc          SRA accession ID

 Alignment:
  --n-ceil <func>    func for max # non-A/C/G/Ts permitted in aln (L,0,0.15)
  --ignore-quals     treat all quality values as 30 on Phred scale (off)
  --nofw             do not align forward (original) version of read (off)
  --norc             do not align reverse-complement version of read (off)

 Spliced Alignment:
  --pen-cansplice <int>              penalty for a canonical splice site (0)
  --pen-noncansplice <int>           penalty for a non-canonical splice site (12)
  --pen-canintronlen <func>          penalty for long introns (G,-8,1) with canonical splice sites
  --pen-noncanintronlen <func>       penalty for long introns (G,-8,1) with
                                     noncanonical splice sites
  --min-intronlen <int>              minimum intron length (20)
  --max-intronlen <int>              maximum intron length (500000)
  --known-splicesite-infile <path>   provide a list of known splice sites
  --novel-splicesite-outfile <path>  report a list of splice sites
  --novel-splicesite-infile <path>   provide a list of novel splice sites
  --no-temp-splicesite               disable the use of splice sites found
  --no-spliced-alignment             disable spliced alignment
  --rna-strandness <string>          specify strand-specific information (unstranded)
  --tmo                              reports only those alignments within known transcriptome
  --dta                              reports alignments tailored for transcript assemblers
  --dta-cufflinks                    reports alignments tailored specifically for cufflinks
  --avoid-pseudogene                 tries to avoid aligning reads to pseudogenes (experimental option)
  --no-templatelen-adjustment        disables template length adjustment for RNA-seq reads

 Scoring:
  --mp <max>,<min>   max and min penalties for mismatch; lower qual = lower penalty <6,2>
  --sp <max>,<min>   max and min penalties for soft-clipping; lower qual = lower penalty <2,1>
  --no-softclip      no soft-clipping
  --np <int>         penalty for non-A/C/G/Ts in read/ref (1)
  --rdg <int>,<int>  read gap open, extend penalties (5,3)
  --rfg <int>,<int>  reference gap open, extend penalties (5,3)
  --score-min <func> min acceptable alignment score w/r/t read length (L,0.0,-0.2)

 Reporting:
  -k <int>           (default: 5) report up to <int> alns per read

 Paired-end:
  -I/--minins <int>  minimum fragment length (0), only valid with --no-spliced-alignment
  -X/--maxins <int>  maximum fragment length (500), only valid with --no-spliced-alignment
  --fr/--rf/--ff     -1, -2 mates align fw/rev, rev/fw, fw/fw (--fr)
  --no-mixed         suppress unpaired alignments for paired reads
  --no-discordant    suppress discordant alignments for paired reads

 Output:
  -t/--time          print wall-clock time taken by search phases
  --un <path>        write unpaired reads that didn't align to <path>
  --al <path>        write unpaired reads that aligned at least once to <path>
  --un-conc <path>   write pairs that didn't align concordantly to <path>
  --al-conc <path>   write pairs that aligned concordantly at least once to <path>
  (Note: for --un, --al, --un-conc, or --al-conc, add
  '-gz' to the option name, e.g. --un-gz <path>, to gzip compress output,
  or add '-bz2' to bzip2 compress output.)
  --summary-file     print alignment summary to this file.
  --new-summary      print alignment summary in a new style, which is more machine-friendly.
  --quiet            print nothing to stderr except serious errors
  --met-file <path>  send metrics to file at <path> (off)
  --met-stderr       send metrics to stderr (off)
  --met <int>        report internal counters & metrics every <int> secs (1)
  --no-head          suppress header lines, i.e. lines starting with @
  --no-sq            suppress @SQ header lines
  --rg-id <text>     set read group id, reflected in @RG line and RG:Z: opt field
  --rg <text>        add <text> ("lab:value") to @RG line of SAM header.
                     Note: @RG line only printed when --rg-id is set.
  --omit-sec-seq     put '*' in SEQ and QUAL fields for secondary alignments.

 Performance:
  -o/--offrate <int> override offrate of index; must be >= index's offrate
  -p/--threads <int> number of alignment threads to launch (1)
  --reorder          force SAM output order to match order of input reads
  --mm               use memory-mapped I/O for index; many 'hisat2's can share

 Other:
  --qc-filter        filter out reads that are bad according to QSEQ filter
  --seed <int>       seed for random number generator (0)
  --non-deterministic seed rand. gen. arbitrarily instead of using read attributes
  --remove-chrname   remove 'chr' from reference names in alignment
  --add-chrname      add 'chr' to reference names in alignment
  --version          print version information and quit
  -h/--help          print this usage message

----------------------------------------------

Program options for stringtie:

StringTie v2.2.1 usage:

stringtie <in.bam ..> [-G <guide_gff>] [-l <prefix>] [-o <out.gtf>] [-p <cpus>]
 [-v] [-a <min_anchor_len>] [-m <min_len>] [-j <min_anchor_cov>] [-f <min_iso>]
 [-c <min_bundle_cov>] [-g <bdist>] [-u] [-L] [-e] [--viral] [-E <err_margin>]
 [--ptf <f_tab>] [-x <seqid,..>] [-A <gene_abund.out>] [-h] {-B|-b <dir_path>}
 [--mix] [--conservative] [--rf] [--fr]
Assemble RNA-Seq alignments into potential transcripts.
Options:
 --version : print just the version at stdout and exit
 --conservative : conservative transcript assembly, same as -t -c 1.5 -f 0.05
 --mix : both short and long read data alignments are provided
        (long read alignments must be the 2nd BAM/CRAM input file)
 --rf : assume stranded library fr-firststrand
 --fr : assume stranded library fr-secondstrand
 -G reference annotation to use for guiding the assembly process (GTF/GFF)
 --ptf : load point-features from a given 4 column feature file <f_tab>
 -o output path/file name for the assembled transcripts GTF (default: stdout)
 -l name prefix for output transcripts (default: STRG)
 -f minimum isoform fraction (default: 0.01)
 -L long reads processing; also enforces -s 1.5 -g 0 (default:false)
 -R if long reads are provided, just clean and collapse the reads but
    do not assemble
 -m minimum assembled transcript length (default: 200)
 -a minimum anchor length for junctions (default: 10)
 -j minimum junction coverage (default: 1)
 -t disable trimming of predicted transcripts based on coverage
    (default: coverage trimming is enabled)
 -c minimum reads per bp coverage to consider for multi-exon transcript
    (default: 1)
 -s minimum reads per bp coverage to consider for single-exon transcript
    (default: 4.75)
 -v verbose (log bundle processing details)
 -g maximum gap allowed between read mappings (default: 50)
 -M fraction of bundle allowed to be covered by multi-hit reads (default:1)
 -p number of threads (CPUs) to use (default: 1)
 -A gene abundance estimation output file
 -E define window around possibly erroneous splice sites from long reads to
    look out for correct splice sites (default: 25)
 -B enable output of Ballgown table files which will be created in the
    same directory as the output GTF (requires -G, -o recommended)
 -b enable output of Ballgown table files but these files will be
    created under the directory path given as <dir_path>
 -e only estimate the abundance of given reference transcripts (requires -G)
 --viral : only relevant for long reads from viral
    data where splice sites do not follow consensus (default:false)
 -x do not assemble any transcripts on the given reference sequence(s)
 -u no multi-mapping correction (default: correction enabled)
 -h print this usage message and exit
 --ref/--cram-ref reference genome FASTA file for CRAM input

Transcript merge usage mode:
  stringtie --merge [Options] { gtf_list | strg1.gtf ...}
With this option StringTie will assemble transcripts from multiple
input files generating a unified non-redundant set of isoforms. In this mode
the following options are available:
  -G <guide_gff>   reference annotation to include in the merging (GTF/GFF3)
  -o <out_gtf>     output file name for the merged transcripts GTF
                   (default: stdout)
  -m <min_len>     minimum input transcript length to include in the merge
                   (default: 50)
  -c <min_cov>     minimum input transcript coverage to include in the merge
                   (default: 0)
  -F <min_fpkm>    minimum input transcript FPKM to include in the merge
                   (default: 1.0)
  -T <min_tpm>     minimum input transcript TPM to include in the merge
                   (default: 1.0)
  -f <min_iso>     minimum isoform fraction (default: 0.01)
  -g <gap_len>     gap between transcripts to merge together (default: 250)
  -i               keep merged transcripts with retained introns; by default
                   these are not kept unless there is strong evidence for them
  -l
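
Example (illustrative sketch, not part of either program's output): the two option listings above are typically combined by aligning reads with hisat2 --dta, assembling each sample with stringtie, and then unifying the per-sample GTFs with stringtie --merge. The script below only prints the commands (a dry run) so it can be read and tested without either tool installed. All file names, the index prefix, and the thread count are hypothetical. It is also assumed that the SAM output is converted to a coordinate-sorted BAM between the two steps by a separate tool that is not covered in this listing.

```shell
#!/bin/sh
# Dry-run sketch: each function prints a command line instead of running it.
# Every path and sample name used here is a hypothetical placeholder.

hisat2_cmd() {
    # $1 = index prefix, $2 = mate-1 reads, $3 = mate-2 reads,
    # $4 = SAM output file, $5 = number of threads
    # --dta asks for alignments tailored for transcript assemblers.
    printf 'hisat2 -p %s --dta -x %s -1 %s -2 %s -S %s\n' "$5" "$1" "$2" "$3" "$4"
}

stringtie_cmd() {
    # $1 = sorted BAM (assumed to exist already), $2 = guide annotation,
    # $3 = output GTF, $4 = number of threads
    printf 'stringtie %s -G %s -o %s -p %s\n' "$1" "$2" "$3" "$4"
}

stringtie_merge_cmd() {
    # $1 = guide annotation, $2 = merged GTF, remaining args = per-sample GTFs
    guide=$1
    out=$2
    shift 2
    printf 'stringtie --merge -G %s -o %s %s\n' "$guide" "$out" "$*"
}

hisat2_cmd genome_index sample1_R1.fq.gz sample1_R2.fq.gz sample1.sam 4
stringtie_cmd sample1.sorted.bam annotation.gtf sample1.gtf 4
stringtie_merge_cmd annotation.gtf merged.gtf sample1.gtf sample2.gtf
```

Running the script prints three command lines that can be reviewed before being executed for real. Keeping command construction in small functions is one possible design choice: it makes per-sample loops easy to add without repeating option strings.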