leaff - Online in the Cloud

NAME


leaff - sequence library utilities and applications

SYNOPSIS


leaff [-f fasta-file] [options]

DESCRIPTION


LEAFF (Let's Extract Anything From Fasta) is a utility program for working with
multi-fasta files. In addition to providing random access down to the base level, it
includes several analysis functions.

OPTIONS


SOURCE FILES
-f file: use sequence in 'file' (-F is also allowed for historical reasons)
-A file: read actions from 'file'

SOURCE FILE EXAMINATION
-d: print the number of sequences in the fasta
-i name: print an index, labelling the source 'name'

OUTPUT OPTIONS
-6 <#>: insert a newline every 60 letters
(if the next arg is a number, newlines are inserted every
n letters, e.g., -6 80. Disable line breaks with -6 0,
or just don't use -6!)
-e beg end: Print only the bases from position 'beg' to position 'end'
(space based, relative to the FORWARD sequence!) If
beg == end, then the entire sequence is printed. It is an
error to specify beg > end, or beg > len, or end > len.
-ends n: Print n bases from each end of the sequence. One input
sequence generates two output sequences, with '_5' or '_3'
appended to the ID. If 2n >= length of the sequence, the
sequence itself is printed, no ends are extracted (they
overlap).
-C: complement the sequences
-H: DON'T print the defline
-h: Use the next word as the defline ("-H -H" will reset to the
original defline)
-R: reverse the sequences
-u: uppercase all bases
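
The space-based coordinates used by -e can be read as positions between bases, which is the same convention as a Python slice. The sketch below illustrates that reading together with the effect of combining -R and -C. It is not leaff's implementation: the function names are hypothetical, and the complement table covers only A, C, G and T.

```python
# Illustrative sketch only; 'extract' and 'reverse_complement' are
# hypothetical helpers, not part of leaff.

def extract(seq: str, beg: int, end: int) -> str:
    """Return the bases between space-based positions beg and end."""
    # The option text calls these cases errors.
    if beg > end or beg > len(seq) or end > len(seq):
        raise ValueError("invalid range")
    # The option text says equal endpoints print the entire sequence.
    if beg == end:
        return seq
    return seq[beg:end]

# Assumption: only unambiguous bases are complemented; others pass through.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq: str) -> str:
    """Complement every base (-C), then reverse the result (-R)."""
    return seq.translate(_COMPLEMENT)[::-1]
```

Under this reading, extract(seq, 0, 10) yields the first 10 bases, which matches example 1 in the EXAMPLES section.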

SEQUENCE SELECTION
-G n s l: print n randomly generated sequences, 0 < s <= length <= l
-L s l: print all sequences such that s <= length < l
-N l h: print all sequences such that l <= % N composition < h
(NOTE 0.0 <= l < h < 100.0)
(NOTE that you cannot print sequences with 100% N
This is a useful bug).
-q file: print sequences from the seqid list in 'file'
-r num: print 'num' randomly picked sequences
-s seqid: print the single sequence 'seqid'
-S f l: print all the sequences from ID 'f' to 'l' (inclusive)
-W: print all sequences (do the whole file)
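
The half-open ranges used by -L and -N can be sketched as simple filters. This is a minimal illustration under stated assumptions, not leaff's code: the helper names are hypothetical, and both upper- and lowercase N are counted. It also shows why a sequence that is 100% N can never be selected when h must be below 100.0.

```python
# Illustrative sketch only; these helpers are hypothetical, not part of leaff.

def percent_n(seq: str) -> float:
    """Percentage of the sequence made up of N (either case)."""
    if not seq:
        return 0.0
    return 100.0 * sum(base in "Nn" for base in seq) / len(seq)

def select_by_length(seqs, s, l):
    """Keep sequences with s <= length < l, as described for -L."""
    return [seq for seq in seqs if s <= len(seq) < l]

def select_by_n(seqs, low, high):
    """Keep sequences with low <= %N < high, as described for -N."""
    return [seq for seq in seqs if low <= percent_n(seq) < high]
```

Because the upper bound is exclusive and must be less than 100.0, select_by_n always drops an all-N sequence.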

LONGER HELP
-help analysis
-help examples

ANALYSIS FUNCTIONS
--findduplicates a.fasta
Reports sequences that are present more than once. Output
is a list of pairs of deflines, separated by a newline.

--mapduplicates a.fasta b.fasta
Builds a map of IIDs from a.fasta and b.fasta that have
identical sequences. Format is "IIDa <-> IIDb"

--md5 a.fasta
Don't print the sequence, but print the md5 checksum
(of the entire sequence) followed by the entire defline.
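
As a rough illustration of the output described for --md5, the sketch below computes a checksum over a whole sequence and prints it in front of the defline. The function name, the single-space separator, and hashing the sequence exactly as given (no case folding) are assumptions; leaff's exact output may differ.

```python
import hashlib

# Illustrative sketch only; 'md5_line' is a hypothetical helper.
def md5_line(defline: str, sequence: str) -> str:
    """Return '<md5 hex digest> <defline>' for one sequence."""
    # Assumption: the sequence is hashed as-is, with line breaks removed.
    digest = hashlib.md5(sequence.encode("ascii")).hexdigest()
    return f"{digest} {defline}"
```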

--partition prefix [ n[gmk]bp | n ] a.fasta
--partitionmap [ n[gmk]bp | n ] a.fasta
Partition the sequences into roughly equal size pieces of
size nbp, nkbp, nmbp or ngbp; or into n roughly equal sized
partitions. Sequences larger than the partition size are
in a partition by themselves. --partitionmap writes a
description of the partition to stdout; --partition creates
a fasta file 'prefix-###.fasta' for each partition.
Example: -F some.fasta --partition parts 130mbp
-F some.fasta --partition parts 16

--segment prefix n a.fasta
Splits the sequences into n files, prefix-###.fasta.
Sequences are not reordered; the first n sequences are in
the first file, the next n in the second file, etc.

--gccontent a.fasta
Reports the GC content over a sliding window of
3, 5, 11, 51, 101, 201, 501, 1001, 2001 bp.
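
A sliding-window GC computation can be sketched as follows. This is only an illustration of the general technique: how leaff positions its windows, treats sequence ends, or handles N is not stated here, so the sketch assumes full windows only and counts G and C in either case. The function name is hypothetical.

```python
# Illustrative sketch only; 'gc_windows' is a hypothetical helper.
def gc_windows(seq: str, width: int) -> list:
    """GC fraction of every full window of 'width' bases, left to right."""
    if width <= 0 or width > len(seq):
        return []
    is_gc = [1 if base in "GCgc" else 0 for base in seq]
    count = sum(is_gc[:width])      # GC count in the first window
    fractions = [count / width]
    for i in range(width, len(seq)):
        # Slide by one base: add the entering base, drop the leaving one.
        count += is_gc[i] - is_gc[i - width]
        fractions.append(count / width)
    return fractions
```

Running this once per window size (3, 5, 11, and so on) would give one GC profile per size.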

--testindex a.fasta
Test the index of 'file'. If index is up-to-date, leaff
exits successfully, else, leaff exits with code 1. If an
index file is supplied, that one is tested, otherwise, the
default index file name is used.

--dumpblocks a.fasta
Generates a list of the blocks of N and non-N. Output
format is 'base seq# beg end len'. 'N 84 483 485 2' means
that a block of 2 N's starts at space-based position 483
in sequence ordinal 84. A '.' is the end of sequence
marker.
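
The block listing can be pictured as a run-length scan over each sequence. The sketch below follows the 'base seq# beg end len' layout with space-based positions, so that len equals end minus beg. The function name and the "non-N" label are assumptions; the text above does not say how non-N blocks are labelled, and the '.' end-of-sequence marker is not emitted here.

```python
from itertools import groupby

# Illustrative sketch only; 'n_blocks' is a hypothetical helper.
def n_blocks(seq: str, ordinal: int) -> list:
    """List (kind, seq#, beg, end, len) for alternating N and non-N runs."""
    blocks = []
    pos = 0
    for is_n, run in groupby(seq, key=lambda base: base in "Nn"):
        length = sum(1 for _ in run)
        kind = "N" if is_n else "non-N"   # label for non-N runs is assumed
        blocks.append((kind, ordinal, pos, pos + length, length))
        pos += length
    return blocks
```

For a sequence with ordinal 84 whose only Ns sit at space-based positions 483 to 485, this would include the tuple ("N", 84, 483, 485, 2), mirroring the example line above.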

--errors L N C P a.fasta
For every sequence in the input file, generate new
sequences including simulated sequencing errors.
L -- length of the new sequence. If zero, the length
of the original sequence will be used.
N -- number of subsequences to generate. If L=0, all
subsequences will be the same, and you should use
C instead.
C -- number of copies to generate. Each of the N
subsequences will have C copies, each with different
errors.
P -- probability of an error.

HINT: to simulate ESTs from genes, use L=500, N=10, C=10
-- make C=10 sequencer runs of N=10 EST sequences
of length 500bp each.
to simulate mRNA from genes, use L=0, N=10, C=10
to simulate reads from genomes, use L=800, N=10, C=1
-- of course, N= should be increased to give the
appropriate depth of coverage

--stats a.fasta
Reports size statistics; number, N50, sum, largest.
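
For readers unfamiliar with N50, the sketch below uses the common definition: the length L such that sequences of length L or more contain at least half of all bases. Whether leaff breaks ties or rounds in exactly this way is an assumption, and the function name is hypothetical.

```python
# Illustrative sketch only; 'size_stats' is a hypothetical helper.
def size_stats(lengths) -> dict:
    """Return number, N50, sum and largest for a list of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    n50 = 0
    for length in ordered:
        running += length
        if 2 * running >= total:   # reached at least half of all bases
            n50 = length
            break
    return {
        "number": len(ordered),
        "n50": n50,
        "sum": total,
        "largest": ordered[0] if ordered else 0,
    }
```

With lengths 2, 3, 4, 5 and 6 (20 bases in total), the two longest sequences already hold 11 bases, so the N50 under this definition is 5.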

--seqstore out.seqStore
Converts the input file (-f) to a seqStore file (for instance,
for use with the Celera assembler or sim4db).

NOTES


Please note that options are ORDER DEPENDENT. Sequences are printed whenever a SEQUENCE
SELECTION option occurs on the command line. OUTPUT OPTIONS are not reset when a sequence
is printed.

SEQUENCES are numbered starting at ZERO, not one!

EXAMPLES


1. Print the first 10 bases of the fourth sequence in file 'genes':
leaff -f genes -e 0 10 -s 3

2. Print the first 10 bases of the fourth and fifth sequences:
leaff -f genes -e 0 10 -s 3 -s 4

3. Print the fourth and fifth sequences reverse complemented, and the sixth
sequence forward. The second set of -R -C toggles off reverse-complement:
leaff -f genes -R -C -s 3 -s 4 -R -C -s 5

4. Convert file 'genes' to a seqStore 'genes.seqStore':
leaff -f genes --seqstore genes.seqStore
