NAME
gt-genomediff - Calculates Kr: pairwise distances between genomes.
SYNOPSIS
gt genomediff [option ...] (INDEX | -indexname NAME SEQFILE SEQFILE [...])
DESCRIPTION
-indextype [...]
specify type of index, one of: esa|pck|encseq. With encseq, the input is an encoded
sequence and an enhanced suffix array is constructed in memory only. (default: encseq)
-indexname [string]
Basename of encseq to construct. (default: undefined)
-unitfile [filename]
specifies genomic units, see below for description. (default: undefined)
-mirrored [yes|no]
virtually append the reverse complement of each sequence (default: no)
-pl [value]
specify prefix length for bucket sort. Recommendation: use without argument; a
reasonable prefix length is then determined automatically. (default: 0)
-dc [value]
specify difference cover value (default: 0)
-memlimit [string]
specify maximal amount of memory to be used during index construction (in bytes, the
keywords MB and GB are allowed) (default: undefined)
-scan [yes|no]
do not load esa index but scan it sequentially. (default: yes)
-thr [value]
threshold for difference (du, dl) in divergence calculation. (default: 1e-9)
-abs_err [value]
absolute error for expected shulen calculation. (default: 1e-5)
-rel_err [value]
relative error for expected shulen calculation. (default: 1e-3)
-M [value]
threshold for minimum logarithm. (default: DBL_MIN)
-v [yes|no]
be verbose (default: no)
-help
display help for basic options and exit
-help+
display help for all options and exit
-version
display version information and exit
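As a minimal sketch of the first SYNOPSIS form, the following composes a call on an existing index using only options listed above. The index basename is a hypothetical placeholder, and the choice of -indextype esa with -v yes is an illustration, not a recommended setting.

```shell
#!/bin/sh
# Hypothetical basename of an index that was built beforehand.
INDEX=myindex

# First SYNOPSIS form: gt genomediff [option ...] INDEX
CMD="gt genomediff -indextype esa -v yes $INDEX"
echo "$CMD"

# Run only if gt is installed; report failure instead of aborting.
if command -v gt >/dev/null 2>&1; then
    $CMD || echo "gt genomediff failed (does the index exist?)"
fi
```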
The genomediff tool only accepts DNA input.
When used with sequence files or an encseq, an enhanced suffix array (ESA) is built in memory.
The ESA is not built in one piece; construction uses -memlimit as a threshold
and builds it in parts, calculating the Shu-length for each part.
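The second SYNOPSIS form, combined with a memory threshold for ESA construction, might look like the sketch below. The file names and index basename are hypothetical, and the value 500MB is an assumed spelling based on the note that the keywords MB and GB are allowed.

```shell
#!/bin/sh
# Hypothetical inputs; replace with your own DNA sequence files.
INDEXNAME=myindex
SEQ1=genome1.fa
SEQ2=genome2.fa

# Second SYNOPSIS form: -indexname NAME SEQFILE SEQFILE [...]
# -memlimit caps memory during index construction (value is an assumption).
CMD="gt genomediff -indexname $INDEXNAME -memlimit 500MB $SEQ1 $SEQ2"
echo "$CMD"

# Run only if gt is installed and both input files exist.
if command -v gt >/dev/null 2>&1 && [ -f "$SEQ1" ] && [ -f "$SEQ2" ]; then
    $CMD
fi
```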
File format for option -unitfile (in Lua syntax):
    units = {
        genome1 = { "path/file1.fa", "file2.fa" },
        genome2 = { "file3.fa", "path/file4.fa" }
    }
Give the path to the files as they were given to the encseq tool! You can use
$ gt encseq info INDEXNAME
to get a list of files in an encoded sequence.
Comment lines in Lua start with -- and will be ignored.
See GTDIR/testdata/genomediff/unitfile1.lua for an example.
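The unit file workflow described above can be sketched as follows: write the Lua file, list the files stored in the encoded sequence to check the paths, then pass the unit file to genomediff. The genome names, paths, the file name units.lua and the index basename are hypothetical placeholders.

```shell
#!/bin/sh
# Write a unit file in the Lua syntax described above.
# Genome names and file paths are hypothetical placeholders.
cat > units.lua <<'EOF'
-- group sequence files into genomic units
units = {
    genome1 = { "path/file1.fa", "file2.fa" },
    genome2 = { "file3.fa", "path/file4.fa" }
}
EOF

INDEXNAME=myindex   # hypothetical basename of an existing encoded sequence

# Run only if gt is installed; report failure instead of aborting.
if command -v gt >/dev/null 2>&1; then
    # List the files in the encoded sequence; paths in units.lua must match.
    gt encseq info "$INDEXNAME" || echo "gt encseq info failed"
    # Compute distances between the units instead of between single files.
    gt genomediff -unitfile units.lua "$INDEXNAME" || echo "gt genomediff failed"
fi
```

The paths inside the unit file must be spelled exactly as they were given to the encseq tool, which is why the listing step comes first.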
Options -pl, -dc and -memlimit are options to influence ESA construction.
REPORTING BUGS
Report bugs to <[email protected]>.