Program options for repeat_masker:

Option h is ambiguous (help, hmmer_dir, html)
RepeatMasker version 4.1.0

/gscratch/srlab/programs/RepeatMasker-4.1.0/RepeatMasker - 4.1.0

NAME
    RepeatMasker - Mask repetitive DNA

SYNOPSIS
    RepeatMasker [-options]

DESCRIPTION
    The options are:

    -h(elp)
        Detailed help

    Default settings are for masking all type of repeats in a primate
    sequence.

    -e(ngine) [crossmatch|wublast|abblast|ncbi|rmblast|hmmer]
        Use an alternate search engine to the default. Note: 'ncbi' and
        'rmblast' are both aliases for the rmblastn search engine engine.
        The generic NCBI blastn program is not sensitive enough for use
        with RepeatMasker at this time.

    -pa(rallel) [number]
        The number of sequence batch jobs [50kb minimum] to run in
        parallel. RepeatMasker will fork off this number of parallel
        jobs, each running the search engine specified. For each search
        engine invocation ( where applicable ) a fixed the number of
        cores/threads is used:

            RMBlast     4 cores
            nhmmer      2 cores
            crossmatch  1 core

        To estimate the number of cores a RepeatMasker run will use
        simply multiply the -pa value by the number of cores the
        particular search engine will use.

    -s
        Slow search; 0-5% more sensitive, 2-3 times slower than default

    -q
        Quick search; 5-10% less sensitive, 2-5 times faster than default

    -qq
        Rush job; about 10% less sensitive, 4->10 times faster than
        default (quick searches are fine under most circumstances)

    repeat options

    -nolow
        Does not mask low_complexity DNA or simple repeats

    -noint
        Only masks low complex/simple repeats (no interspersed repeats)

    -norna
        Does not mask small RNA (pseudo) genes

    -alu
        Only masks Alus (and 7SLRNA, SVA and LTR5)(only for primate DNA)

    -div [number]
        Masks only those repeats < x percent diverged from consensus seq

    -lib [filename]
        Allows use of a custom library (e.g. from another species)

    -cutoff [number]
        Sets cutoff score for masking repeats when using -lib
        (default 225)

    -species
        Specify the species or clade of the input sequence.
        The species name must be a valid NCBI Taxonomy Database species
        name and be contained in the RepeatMasker repeat database. Some
        examples are:

            -species human
            -species mouse
            -species rattus
            -species "ciona savignyi"
            -species arabidopsis

        Other commonly used species:

            mammal, carnivore, rodentia, rat, cow, pig, cat, dog, chicken,
            fugu, danio, "ciona intestinalis" drosophila, anopheles, worm,
            diatoaea, artiodactyl, arabidopsis, rice, wheat, and maize

    Contamination options

    -is_only
        Only clips E coli insertion elements out of fasta and .qual files

    -is_clip
        Clips IS elements before analysis (default: IS only reported)

    -no_is
        Skips bacterial insertion element check

    Running options

    -gc [number]
        Use matrices calculated for 'number' percentage background GC
        level

    -gccalc
        RepeatMasker calculates the GC content even for batch
        files/small seqs

    -frag [number]
        Maximum sequence length masked without fragmenting
        (default 60000)

    -nocut
        Skips the steps in which repeats are excised

    -noisy
        Prints search engine progress report to screen (defaults to
        .stderr file)

    -nopost
        Do not postprocess the results of the run ( i.e. call
        ProcessRepeats ). NOTE: This options should only be used when
        ProcessRepeats will be run manually on the results.

    output options

    -dir [directory name]
        Writes output to this directory (default is query file
        directory, "-dir ." will write to current directory).

    -a(lignments)
        Writes alignments in .align output file

    -inv
        Alignments are presented in the orientation of the repeat (with
        option -a)

    -lcambig
        Outputs ambiguous DNA transposon fragments using a lower case
        name. All other repeats are listed in upper case. Ambiguous
        fragments match multiple repeat elements and can only be called
        based on flanking repeat information.
    -small
        Returns complete .masked sequence in lower case

    -xsmall
        Returns repetitive regions in lowercase (rest capitals) rather
        than masked

    -x
        Returns repetitive regions masked with Xs rather than Ns

    -poly
        Reports simple repeats that may be polymorphic (in file.poly)

    -source
        Includes for each annotation the HSP "evidence". Currently this
        option is only available with the "-html" output format listed
        below.

    -html
        Creates an additional output file in xhtml format.

    -ace
        Creates an additional output file in ACeDB format

    -gff
        Creates an additional Gene Feature Finding format output

    -u
        Creates an additional annotation file not processed by
        ProcessRepeats

    -xm
        Creates an additional output file in cross_match format (for
        parsing)

    -no_id
        Leaves out final column with unique ID for each element (was
        default)

    -e(xcln)
        Calculates repeat densities (in .tbl) excluding runs of >=20
        N/Xs in the query

CONFIGURATION OVERRIDES
    -hmmer_dir
        The path to the HMMER profile HMM search software.

    -libdir
        Path to the RepeatMasker libraries directory.

    -abblast_dir
        The path to the installation of the ABBLAST sequence alignment
        program.

    -default_search_engine
        The default search engine to use

    -trf_prgm
        The full path including the name for the TRF program.

    -rmblast_dir
        The path to the installation of the RMBLAST sequence alignment
        program.

    -crossmatch_dir
        The path Phil Green's cross_match program ( phrap program
        suite ).

SEE ALSO
    Crossmatch, ProcessRepeats

COPYRIGHT
    2002-2019 Copyright (C) Institute for Systems Biology 2002-2019
    Developed by Arian Smit and Robert Hubley.

    2000-2001 Copyright (C) Arian Smit 2000-2001.

    1996-1999 Copyright (C) University of Washington, Developed by
    Arian Smit, Philip Green and Colin Wilson of the University of
    Washington Department of Genomics.

AUTHORS
    Arian Smit
    Robert Hubley

----------------------------------------------
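
The -pa(rallel) entry in the help text above says that the total core count of a run can be estimated by multiplying the -pa value by the number of cores each search engine invocation uses. A minimal Python sketch of that arithmetic follows. The function names `estimate_cores` and `max_parallel_jobs` are hypothetical helpers and not part of RepeatMasker. The per-engine core counts are copied from the help text, and the sketch assumes the `-e hmmer` engine corresponds to the "nhmmer" row and that `ncbi` uses the same count as `rmblast`, since the help text describes both as aliases for rmblastn. Engines with no listed count (wublast, abblast) are rejected rather than guessed.

```python
# Cores used per search-engine invocation, taken from the -pa help text.
CORES_PER_ENGINE = {
    "rmblast": 4,
    "ncbi": 4,        # assumption: same as rmblast, since both alias rmblastn
    "hmmer": 2,       # assumption: the "nhmmer" row applies to -e hmmer
    "crossmatch": 1,
}


def _cores_for(engine: str) -> int:
    """Look up the per-invocation core count for an -e(ngine) value."""
    try:
        return CORES_PER_ENGINE[engine]
    except KeyError:
        raise ValueError(
            f"no core count is listed in the help text for engine {engine!r}"
        ) from None


def estimate_cores(pa: int, engine: str) -> int:
    """Estimate total cores for a run: the -pa value times cores per engine."""
    if pa < 1:
        raise ValueError("pa must be a positive integer")
    return pa * _cores_for(engine)


def max_parallel_jobs(available_cores: int, engine: str) -> int:
    """Largest -pa value whose estimated core use fits in available_cores.

    Returns 0 when even a single engine invocation would not fit.
    """
    if available_cores < 0:
        raise ValueError("available_cores must not be negative")
    return available_cores // _cores_for(engine)


if __name__ == "__main__":
    print(estimate_cores(7, "rmblast"))      # 7 jobs * 4 cores each
    print(max_parallel_jobs(28, "rmblast"))  # how many jobs fit on 28 cores
```

For example, `estimate_cores(7, "rmblast")` gives 28, so on a 28-core node a `-pa 7` run with the rmblast engine would be expected to occupy the whole node under the figures quoted in the help text.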