NAME
predictprotein - analyse protein sequence
SYNOPSIS
predictprotein [--blast-processors] [--num-cpus|c] [--debug|d] [--help] [--make-file|m]
[--makedebug] [--man] [--method] [--dryrun|n] [--numresmax] [--output-dir|o]
[--print-ext-method-map] [--profnumresmin] [--psicexe] [--prot-name|p] [--sequence|seq|s]
[--seqfile] [--spkeyidx] [--target]* [--version|v] [--work-dir|w]
predictprotein [--bigblastdb] [--big80blastdb] [--pfam2db] [--pfam3db] [--prodomblastdb]
[--prositedat] [--prositeconvdat] [--swissblastdb]
predictprotein [--setacl|acl] [--<no>cache-merge] [--<no>force-cache-store]
[--<no>use-cache]
DESCRIPTION
predictprotein runs a set of protein sequence analysis methods:
Standard methods
These methods are run by the default target 'all':
Feature                 Target       Extension                Man page
-------                 ------       ---------                --------
atom mobility           profbval     profbval, profb4snap     profbval(1)
bacterial transmembrane proftmb      proftmb, proftmbdat      proftmb(1)
beta barrels
coiled-coils            coiledcoils  coils, coils_raw         coils-wrap(1)
                                                              ncoils(1)
disulfide bridges       disulfinder  disulfinder              disulfinder(1)
Gene Ontology terms     metastudent  metastudent.BPO.txt,     metastudent(1)
                                     metastudent.CCO.txt,
                                     metastudent.MFO.txt
local alignment         blast        blastPsiOutTmp, chk,     blastpgp(1)
                                     blastPsiMat,
                                     blastPsiAli,
                                     blastpSwissM8            blastall(1)
local complexity        ncbi-seg     segNorm, segNormGCG      ncbi-seg(1)
non-regular secondary   norsp        nors, sumNors            norsp(1)
structure
nuclear localization    predictnls   nls, nlsDat, nlsSum      predictnls(1)
Pfam scan hmmer v2      hmm2pfam     hmm2pfam                 hmm2pfam(1)
Pfam scan hmmer v3      hmm3pfam     hmm3pfam, hmm3pfamTbl,   hmmscan(1)
                                     hmm3pfamDomTbl
PROSITE scan            prosite      prosite                  prosite_scan(1)
protein-protein         profisis     isis                     profisis(1)
interaction sites
secondary structure,    prof         profRdb                  prof(1)
accessibility from
sequence profile
secondary structure,    prof         prof1Rdb                 prof(1)
accessibility from
single sequence
secondary structure,    reprof       reprof                   reprof(1)
accessibility from
single sequence
transmembrane           phd          phdPred, phdRdb          prof(1)
helices
unstructured loops      norsnet      norsnet                  norsnet(1)
Optional methods
These methods are non-redistributable or depend on non-redistributable software (indicated
by '*'). You have to acquire the non-redistributable components yourself before you can
use these methods.
These methods are run by the target 'optional'.
Feature                 Target       Extension                Man page
-------                 ------       ---------                --------
disordered regions      metadisorder mdisorder                metadisorder(1)
subcellular             loctree3     {arch,bact,euka}.lc3     loctree3(1)
                        tmhmm*       tmhmm                    n.a.
protein-RNA,            somena       somena                   somena(1)
protein-DNA
interaction sites
position-specific       psic*        psic, clustalngz         psic(1),
independent counts                                            runNewPSIC(1),
and its base multiple                                         clustalw(1)
alignment
transmembrane helices   tmhmm        tmhmm                    n.a.
                        tmseg        tmseg                    tmseg(1)
functional regions      consurf      _consurf.grades          consurf(1)
Resources
Database                               Cmd line argument
--------                               -----------------
big (Uniprot+PDB) blast database       --bigblastdb
big_80 (big @ 80% sequence identity    --big80blastdb
redundancy level) blast database
swiss blast database                   --swissblastdb
pfam v2 database                       --pfam2db
pfam v3 database                       --pfam3db
prosite_convert.dat                    --prositeconvdat
Resources for optional targets
Database                               Cmd line argument
--------                               -----------------
big (Uniprot+PDB) blast database       --bigblastdb
prosite.dat                            --prositedat
Swiss-Prot keyword-to-accession        --spkeyidx
'index' for loctree
Generating Resources
Courtesy of Wiktor Jurkowski:
* rostlab-data-prosite_convert prosite.dat prosite_convert.dat
* perl /usr/share/loctree/perl/keyindex4loctree.pl < keyindex.txt > keyindex_loctree.txt
* hmmpress Pfam-A.hmm
Output format
Method outputs are deposited into --output-dir. Each method has one or more file name
extensions associated with it, see the table above. Refer to the man page of the
individual methods for further details. Extensions ending with `gz' are compressed with
gzip(1).
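Because some outputs are gzip-compressed and others are not, a small wrapper can make reading them uniform. The following is a minimal sketch, not part of predictprotein: `pp_cat` is a hypothetical helper name, and it assumes that any file whose name ends in `gz` (for example one with the `clustalngz` extension) is a gzip stream.

```shell
# pp_cat FILE
# Print a predictprotein output file to standard output, decompressing it
# when the file name ends in "gz". pp_cat is a hypothetical helper name.
pp_cat() {
    case "$1" in
        *gz) gzip -dc < "$1" ;;   # read via stdin so the suffix does not matter to gzip
        *)   cat < "$1" ;;
    esac
}
```

Assuming output files are named after --prot-name plus the extension, a call could look like `pp_cat /tmp/pp/query.clustalngz`.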
REFERENCES
Rost, B., Yachdav, G., and Liu, J. (2004). The PredictProtein server. Nucleic Acids Res,
32(Web Server issue), W321-6.
In case you find predictprotein and the tools within useful please cite:
* the references for PredictProtein, see above
* the references for the tools you used, see REFERENCES on the man page of the tool
OPTIONS
--blast-processors
Number of processors to use, default = 1
-c, --num-cpus
Number of make(1) jobs to run, default = 1
-d, --debug
--help
Print a brief help message and exit.
-m, --make-file
Makefile to use, default = /usr/share/predictprotein/MakefilePP.mk
--makedebug
debug argument for make, see make(1)
--man
Print this documentation page.
--method
Describes method control parameters and requests methods to run when --target is not
all. Format example:
--method=norsp,win=50
* begin with the method name, e.g. `norsp'
* list method control parameters, e.g. win=50
Not all methods support passing control parameters in this way due to their primitive
command line interfaces.
-n, --dryrun
Do not execute, just show what is about to be run.
--numresmax
Maximum sequence length, default: 6000. Sequences longer than this will make
predictprotein fail with the respective error code, see ERRORS.
-o, --output-dir
Final location of output files; required unless caching is used.
--print-ext-method-map
Print extension-to-method map. Useful as input file for consistency checkers.
Format: <extension><tab><method>.
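The tab-separated map is easy to query with awk(1). The sketch below looks up the method for one extension; `pp_method_for_ext` is a hypothetical helper name, and the only assumption about the input is the two-column format stated above.

```shell
# pp_method_for_ext EXTENSION
# Read "<extension><tab><method>" lines on standard input and print the
# method for EXTENSION. Exit status is non-zero when nothing matches.
pp_method_for_ext() {
    awk -F '\t' -v ext="$1" '
        $1 == ext { print $2; found = 1 }
        END       { exit !found }
    '
}
```

Possible use: `predictprotein --print-ext-method-map | pp_method_for_ext profRdb`.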
--profnumresmin
Minimum sequence length required by prof, default: 17. Sequences shorter than this
will make predictprotein fail with the respective error code, see ERRORS.
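A sequence can be checked against both length limits before a run is started. This is a minimal sketch under stated assumptions: `fasta_length` and `length_ok` are hypothetical helper names, the input is a FASTA file holding a single sequence, and both limits are treated as inclusive (a sequence of exactly 17 or exactly 6000 residues passes).

```shell
# fasta_length FILE
# Count residues in a single-sequence FASTA file: drop header lines that
# start with ">", strip whitespace, and count the remaining characters.
fasta_length() {
    grep -v '^>' "$1" | tr -d ' \t\r\n' | wc -c | tr -d ' '
}

# length_ok FILE [MIN] [MAX]
# Succeed when the sequence length lies within [MIN, MAX]. The defaults
# mirror the documented defaults of --profnumresmin and --numresmax.
length_ok() {
    len=$(fasta_length "$1")
    [ "$len" -ge "${2:-17}" ] && [ "$len" -le "${3:-6000}" ]
}
```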
--psicexe
psic wrapper executable, default: /usr/share/rost-runpsic/runNewPSIC.pl
-p, --prot-name
Base name of result files and protein name in - for example - FASTA files. Default =
`query'.
Valid names are of the character set "[[:alnum:]._-]".
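The character set above can be checked in the shell before calling predictprotein. In this sketch `valid_prot_name` is a hypothetical helper name; rejecting the empty string is an assumption, since the text above only names the allowed characters.

```shell
# valid_prot_name NAME
# Succeed when NAME consists only of characters from [[:alnum:]._-].
# The empty name is rejected as well (an assumption of this sketch).
valid_prot_name() {
    case "$1" in
        ''|*[![:alnum:]._-]*) return 1 ;;
        *)                    return 0 ;;
    esac
}
```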
-s, --seq, --sequence
one letter amino acid sequence input
--seqfile
FASTA amino acid sequence file; if `-', standard input is read
--spkeyidx
Swiss-Prot keyword-to-identifier 'index' file for loctree(1).
--target=string
Method groups to run. Give this argument for each target you need. Default: the
value of `default_targets' in the configuration file; `all' if that is not given.
Some targets of interest:
all
methods that are GPL or redistributable to non-commercial entities
optional
methods that do not fit into all
Look at /usr/share/predictprotein/MakefilePP.mk for a list of targets ("Use the source
Luke").
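Since --target has to be repeated for every target, a script that holds the targets in a list needs to expand them. The following sketch shows one way; `target_args` is a hypothetical helper name and it assumes target names contain no whitespace.

```shell
# target_args TARGET...
# Print "--target NAME" for every argument, separated by single spaces.
target_args() {
    out=""
    for t in "$@"; do
        out="$out${out:+ }--target $t"
    done
    printf '%s\n' "$out"
}
```

The output is meant to be used unquoted so that the shell splits it into words, for example `predictprotein --seqfile in.fasta --output-dir /tmp/pp $(target_args query.profRdb loctree3)`, where `in.fasta` is a placeholder file name.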
-v, --version
Print package version
-w, --work-dir
Working directory, optional
Database options
--bigblastdb
Path to comprehensive blast database
--big80blastdb
Path to comprehensive blast database at 80% sequence identity redundancy level
--pfam2db
Pfam v2 database, e.g. Pfam_ls
--pfam3db
Pfam v3 database, e.g. Pfam-A.hmm
--prodomblastdb
Obsolete. This argument is kept only to maintain compatibility with older versions.
--prositedat
Path to `prosite.dat' file, see
<https://rostlab.org/owiki/index.php/Packages#Resource_definitions>
--prositeconvdat
Path to `prosite_convert.dat' file, see
<https://rostlab.org/owiki/index.php/Packages#Resource_definitions>
--swissblastdb
Path to SwissProt blast database
Cache related options
--acl, --setacl
Set access control lists. Access control lists are set only in case results are
stored in the cache. This option is ineffective otherwise. All previous ACLs are
lost - no merging. The read bit controls browsability of results. Other bits are not
used. E.g.
u:lkajan:4,u:gyachdav:4,g:lkajan:4,o::0
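To inspect an ACL string before passing it on, it can be split into its entries. This sketch assumes, based only on the example above, that entries are comma-separated and that each entry has three colon-separated fields; reading them as type, name and permission bits is an interpretation, not something this page states. `acl_entries` is a hypothetical helper name.

```shell
# acl_entries ACL
# Print one "type name perm" line per comma-separated entry. An empty
# name field (as in "o::0") is shown as "-".
acl_entries() {
    printf '%s\n' "$1" | tr ',' '\n' |
        awk -F ':' '{ printf "%s %s %s\n", $1, ($2 == "" ? "-" : $2), $3 }'
}
```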
--cache-merge
--nocache-merge
Merge/do not merge results into cache. --cache-merge reuses results already in cache;
this turns --use-cache on automatically. --cache-merge is incompatible with
--force-cache-store.
--nocache-merge is the default UNLESS
· --use-cache is on and
· --noforce-cache-store is in effect and
· --target is used and
· the cache is not empty
--cache-merge is silently ignored in case the cache is empty.
--force-cache-store
--noforce-cache-store
Enable/disable forcing storage of results into cache. Implies --use-cache. Default:
--noforce-cache-store
With --noforce-cache-store when predictprotein finds cached results it simply fetches
them from the cache and does no processing (even if the results are incomplete). With
--force-cache-store predictprotein does not fetch anything from the cache but does
store the results, completely replacing what was cached.
--force-cache-store is incompatible with --cache-merge.
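A wrapper script can catch the incompatible combination before predictprotein is started. The sketch below only inspects an argument list; `check_cache_flags` is a hypothetical helper name, and how predictprotein itself reacts to the combination is not described here.

```shell
# check_cache_flags ARG...
# Fail when the argument list contains both --cache-merge and
# --force-cache-store, which are documented as incompatible.
check_cache_flags() {
    merge=0
    force=0
    for a in "$@"; do
        case "$a" in
            --cache-merge)       merge=1 ;;
            --force-cache-store) force=1 ;;
        esac
    done
    [ "$merge" -eq 0 ] || [ "$force" -eq 0 ]
}
```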
--use-cache
--nouse-cache
Use/do not use cache for predictprotein results. Default: --nouse-cache.
Option `use_cache' may be given in configuration files to override default.
ERRORS
253 Sequence is too long, see --numresmax
254 Sequence is too short, shorter than minimum length required by prof. See
--profnumresmin.
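A calling script can translate these codes into messages. In this sketch `pp_explain_status` is a hypothetical helper name; treating 0 as success follows the usual shell convention and is not stated on this page.

```shell
# pp_explain_status STATUS
# Map a predictprotein exit status to a short message.
pp_explain_status() {
    case "$1" in
        0)   echo "success" ;;
        253) echo "sequence too long, see --numresmax" ;;
        254) echo "sequence too short, see --profnumresmin" ;;
        *)   echo "failed with status $1" ;;
    esac
}
```

It would be called directly after a run, for example `predictprotein --seqfile in.fasta --output-dir /tmp/pp; pp_explain_status $?`, where `in.fasta` is a placeholder file name.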
EXAMPLES
predictprotein --seqfile /usr/share/doc/predictprotein/examples/tquick.fasta --output-dir /tmp/pp
predictprotein --seqfile /usr/share/doc/predictprotein/examples/tquick.fasta --output-dir /tmp/pp --target query.profRdb --target loctree3
predictprotein --seqfile /usr/share/doc/predictprotein/examples/tquick.fasta --method=norsp,win=100 --output-dir /tmp/pp
Cache examples
Store results in cache, do not care about storing files in --output-dir:
predictprotein --seqfile /usr/share/doc/predictprotein/examples/tquick.fasta --method=norsp,win=100 --use-cache --setacl g:rostlab:7
Store results in the cache if they are not cached yet; otherwise fetch them from the cache into --output-dir:
predictprotein --seqfile /usr/share/doc/predictprotein/examples/tquick.fasta --method=norsp,win=100 --use-cache --setacl g:rostlab:7 --output-dir /tmp/pp
ENVIRONMENT
PREDICTPROTEINCONF
Location of predictproteinrc configuration file to use, overriding other configuration
files