NAME
rate4site - detector of conserved amino-acid sites
SYNOPSIS
rate4site [OPTIONS] -s <MSA FILE>
DESCRIPTION
The rate of evolution is not constant among amino acid sites: some positions evolve slowly
and are commonly referred to as "conserved", while others evolve rapidly and are referred
to as "variable". The rate variations correspond to different levels of purifying
selection acting on these sites. The purifying selection can be the result of geometrical
constraints on the folding of the protein into its 3D structure, constraints at amino acid
sites involved in enzymatic activity or in ligand binding or, alternatively, at amino acid
sites that take part in protein-protein interactions. Rate4Site calculates the relative
evolutionary rate at each site using a probabilistic evolutionary model. This approach
takes into account both the stochastic process underlying sequence evolution within
protein families and the phylogenetic tree of the proteins in the family. The conservation
score at a site corresponds to the site's evolutionary rate.
METHODOLOGY
The sole obligatory input to Rate4Site is an MSA file. The program then computes a
phylogenetic tree that is consistent with the available MSA (the user can also input a
pre-calculated tree). It then calculates the relative conservation score for each site in
the MSA. This is carried out using either an empirical Bayesian method or a maximum
likelihood method (Pupko et al., 2002). The differences between the two methods are
explained in detail in Mayrose et al. (2004).
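The workflow above (MSA only, or MSA plus a pre-calculated tree, followed by one of the two inference methods) can be sketched as a small Python helper that assembles the argument list. This is only an illustration: build_rate4site_command is a hypothetical name, the flag spellings are copied from the OPTIONS section of this page and have not been checked against any particular build, and passing each file name as a separate token after its flag is an assumption based on the SYNOPSIS.

```python
from typing import List, Optional

# Flag spellings copied verbatim from the OPTIONS section of this page.
INFERENCE_FLAGS = {"bayes": "-Ib", "ml": "-Im"}


def build_rate4site_command(msa_file: str,
                            tree_file: Optional[str] = None,
                            output_file: Optional[str] = None,
                            method: str = "bayes") -> List[str]:
    """Hypothetical helper: assemble an argument list for rate4site.

    Assumes each file name follows its flag as a separate token.
    """
    if method not in INFERENCE_FLAGS:
        raise ValueError(f"unknown inference method: {method!r}")
    cmd = ["rate4site", "-s", msa_file]   # the MSA is the only required input
    if tree_file is not None:
        cmd += ["-t", tree_file]          # optional pre-calculated Newick tree
    if output_file is not None:
        cmd += ["-o", output_file]        # optional results file
    cmd.append(INFERENCE_FLAGS[method])   # empirical Bayes or maximum likelihood
    return cmd


if __name__ == "__main__":
    # File names below are placeholders, not files shipped with the program.
    print(" ".join(build_rate4site_command("family.fasta")))
    print(" ".join(build_rate4site_command("family.fasta", "family.nwk",
                                           "scores.txt", method="ml")))
```

The returned list could be handed to subprocess.run on a system where rate4site is installed; building a list rather than a single string avoids shell quoting problems with file names.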
REFERENCES
Mayrose, I., Graur, D., Ben-Tal, N., and Pupko, T. 2004. Comparison of site-specific rate-
inference methods for protein sequences: empirical Bayesian methods are superior. Mol Biol
Evol 21: 1781-1791.
OPTIONS
-s MSA_FILE
The input sequence file name. The following formats are supported: Mase, Molphy,
Phylip, Clustal, Fasta
-t The input tree file name (in Newick format)
-o OUTPUT_FILE
The results output file
-a Reference sequence name in the MSA. The conservation scores are printed based on the
amino acids in this sequence.
-k The number of discrete Gamma categories
-m Evolutionary model. The following amino-acid models are supported:
DAY (-md), JTT (-mj), REV (-mr), aaJC (-ma), LG (-Ml), WAG (-Mw).
For nucleotides, the following models are supported: JC (-mn), HKY (-Mh), Tamura92 (-Mt), GTR (-Mg).
-b Branch-length optimization flag:
-bn = no branch-length optimization
-bh = optimization using a homogeneous model (no among-site rate variation)
-bg = optimization using a Gamma model
-i Rate inference method flag:
-Im = rates are inferred using the maximum likelihood method
-Ib = rates are inferred using the empirical Bayes method
-z Tree construction method:
-zj = Neighbor-joining tree with Jukes-Cantor distances
-zn = Neighbor-joining tree with maximum likelihood distances
-h Short help message
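Because -s expects an alignment and -a names a sequence inside it, a quick pre-flight check of the input can catch simple mistakes before a run. The sketch below is not part of rate4site. It handles only the Fasta format from the list above, the function names are hypothetical, and treating the first whitespace-delimited token of each header as the sequence name is an assumption.

```python
from typing import Dict, Optional


def read_fasta_alignment(text: str) -> Dict[str, str]:
    """Parse FASTA text into {sequence name: aligned sequence}."""
    seqs: Dict[str, str] = {}
    name: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            header = line[1:].split()
            if not header:
                raise ValueError("empty FASTA header")
            name = header[0]              # assumption: first token is the name
            if name in seqs:
                raise ValueError(f"duplicate sequence name: {name}")
            seqs[name] = ""
        elif name is None:
            raise ValueError("sequence data before the first '>' header")
        else:
            seqs[name] += line            # sequences may span several lines
    return seqs


def check_alignment(seqs: Dict[str, str],
                    reference: Optional[str] = None) -> int:
    """Return the alignment length, or raise ValueError on a problem."""
    if not seqs:
        raise ValueError("no sequences found")
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError(f"sequences differ in length: {sorted(lengths)}")
    if reference is not None and reference not in seqs:
        raise ValueError(f"reference sequence not in alignment: {reference}")
    return lengths.pop()


if __name__ == "__main__":
    # Toy alignment made up for this example.
    msa = ">seqA\nMK-LV\n>seqB example\nMKT\nLV\n"
    print(check_alignment(read_fasta_alignment(msa), reference="seqA"))
```

A name that passes this check would be the value to supply with -a; the other supported formats (Mase, Molphy, Phylip, Clustal) would need their own parsers.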