cdbfasta - Online in the Cloud


PROGRAM:

NAME


cdbfasta - Creates an index file for records from a multi-fasta file.

DESCRIPTION


Usage:
cdbfasta <fastafile> [-o <index_file>] [-r <record_delimiter>]
         [-z <compressed_db>] [-i] [-m|-n <numkeys>|-f<LIST>|-c|-C]
         [-w <stopwords_list>] [-s <stripendchars>] [-v]

Creates an index file for records from a multi-fasta file. By default (without
-m/-n/-c/-C option), only the first space-delimited token from the defline is used
as a key.
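
The default key rule above can be illustrated with a minimal Python sketch. This is not cdbfasta's own code: the function name default_key is hypothetical, and splitting on any whitespace (rather than on spaces only) is an assumption made for simplicity.

```python
def default_key(defline, delimiter=">"):
    """Return the first space-delimited token of a defline.

    Hypothetical helper that mirrors the documented default behaviour;
    the record delimiter (default '>') is removed before tokenizing.
    """
    text = defline.rstrip("\r\n")
    if text.startswith(delimiter):
        text = text[len(delimiter):]
    tokens = text.split()  # assumption: any run of whitespace separates tokens
    return tokens[0] if tokens else ""


print(default_key(">seq1 some free-text description"))           # seq1
print(default_key(">gi|12345|gb|AB000001.1| Homo sapiens mRNA"))  # gi|12345|gb|AB000001.1|
```

Under this rule, everything after the first space in the defline is ignored unless one of the multi-key or accession options is given.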

<fastafile> is the multi-fasta file to index

-o <index_file> the index file will be named <index_file>; if not given, the index
filename is the database name plus the suffix '.cidx'

-r <record_delimiter> a string of characters at the beginning of a line marking the
start of a record (default: '>')

-Q treat input as fastq format, i.e. with '@' as record delimiter and with records
expected to have at least 4 lines

-z <compressed_db> the database is compressed into the file <compressed_db> before
indexing (<fastafile> can be "-" or "stdin" in order to get the input records from
stdin)

-s <stripendchars> strip extraneous characters from *around* the space-delimited
tokens, for the multi-key options below (-m, -n, -f); the default <stripendchars>
set is: '",`.(){}/[]!:;~|><+-

-m ("multi-key" option) create hash entries pointing to the same record for all
tokens found in the defline

-n <numkeys> same as -m, but only takes the first <numkeys> tokens from the defline

-f<LIST> index *space*-delimited tokens (fields) in the defline as given by LIST of
fields or field ranges (the same syntax as UNIX 'cut')

-w <stopwords_list> exclude from indexing all the words found in the file
<stopwords_list> (for options -m, -n and -f)

-i do case-insensitive indexing (i.e. create additional keys for all-lowercase
tokens used for indexing from the defline)

-c for deflines in the format db1|accession1|db2|accession2|..., only the first
db-accession pair ('db1|accession1') is taken as key

-C like -c, but also subsequent db|accession constructs are indexed, along with the
full (default) token; additionally, all nrdb concatenated accessions found in the
defline are parsed and stored (assuming 0x01 or '^|^' as separators)

-a accession mode: like the -C option, but indexes the 'accession' part for all
'db|accession' constructs found

-A like -a and -C together (both accessions and 'db|accession' constructs are used
as keys)

-v show program version and exit
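
To make the key-selection options above more concrete, here is a rough Python sketch of which keys each mode would yield for a given defline. It is only an illustration of the behaviour described in the option list, not the tool's implementation. The function names token_keys and accession_keys are hypothetical, the order in which stripping, stopword filtering and lowercasing are applied is an assumption, the -f field lists and the nrdb concatenated-accession parsing of -C are left out, and including the full token in -a mode is a guess based on the "like -C" wording.

```python
# Default <stripendchars> set, copied from the -s description above.
DEFAULT_STRIP = "'\",`.(){}/[]!:;~|><+-"


def token_keys(defline, numkeys=None, strip_chars=DEFAULT_STRIP,
               stopwords=frozenset(), case_insensitive=False):
    """Keys for -m (numkeys=None) or -n <numkeys>, with -s, -w and -i applied."""
    tokens = defline.lstrip(">").split()
    if numkeys is not None:
        tokens = tokens[:numkeys]          # -n: only the first <numkeys> tokens
    keys = []
    for token in tokens:
        key = token.strip(strip_chars)     # -s: strip characters around the token
        if not key or key in stopwords:    # -w: skip stopwords (assumed exact match)
            continue
        if key not in keys:
            keys.append(key)
        if case_insensitive and key.lower() not in keys:
            keys.append(key.lower())       # -i: additional all-lowercase key
    return keys


def accession_keys(defline, mode="c"):
    """Keys for the -c, -C, -a and -A modes, read from the first defline token."""
    tokens = defline.lstrip(">").split()
    if not tokens:
        return []
    full = tokens[0]
    fields = full.split("|")
    # Pair up fields as (db, accession); a trailing empty field is ignored.
    pairs = [(fields[i], fields[i + 1])
             for i in range(0, len(fields) - 1, 2)
             if fields[i] and fields[i + 1]]
    if not pairs:
        return [full]                      # assumption: fall back to the default key
    constructs = [db + "|" + acc for db, acc in pairs]
    accessions = [acc for _, acc in pairs]
    if mode == "c":
        keys = constructs[:1]              # only 'db1|accession1'
    elif mode == "C":
        keys = [full] + constructs         # full token plus every db|accession
    elif mode == "a":
        keys = [full] + accessions         # assumption: full token is kept, as in -C
    elif mode == "A":
        keys = [full] + constructs + accessions
    else:
        raise ValueError("mode must be one of 'c', 'C', 'a', 'A'")
    return list(dict.fromkeys(keys))       # drop duplicates, keep order


defline = ">gi|12345|gb|AB000001.1| Homo sapiens mRNA"
print(accession_keys(defline, "c"))  # ['gi|12345']
print(accession_keys(defline, "C"))  # ['gi|12345|gb|AB000001.1|', 'gi|12345', 'gb|AB000001.1']
print(token_keys(">seq1 (Homo sapiens) kinase, putative", numkeys=2))  # ['seq1', 'Homo']
```

The sketch returns lists so the order of keys is easy to inspect; in the real index, each key would simply point to the same record.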
