phytime - Online in the Cloud

Run phytime in OnWorks free hosting provider over Ubuntu Online, Fedora Online, Windows online emulator or MAC OS online emulator

PROGRAM:

NAME

phytime - Bayesian estimation of divergence times from large sequence alignments

DESCRIPTION

Bayesian estimation of divergence times from molecular sequences relies on sophisticated
Markov chain Monte Carlo techniques, and Metropolis-Hastings (MH) samplers have been
successfully used in that context. This approach involves heavy computational burdens that
can hinder the analysis of large phylogenomic data sets. Reliable estimation of divergence
times can also be extremely time consuming, if not impossible, for sequence alignments
that convey weak or conflicting phylogenetic signals, emphasizing the need for more
efficient sampling methods. This article describes a new approach that estimates the
posterior density of substitution rates and node times. The prior distribution of rates
accounts for their potential autocorrelation along lineages, whereas priors on node ages
are modeled with uniform densities. Also, the likelihood function is approximated by a
multivariate normal density. The combination of these components leads to convenient
mathematical simplifications, allowing the posterior distribution of rates and times to be
estimated using a Gibbs sampling algorithm. The analysis of four real-world data sets
shows that this sampler outperforms the standard MH approach and demonstrates the
suitability of this new method for analyzing large and/or difficult data sets.

SYNOPSIS

phytime [command args]

OPTIONS

All the options below are optional except '-i', '-u' and '--calibration'.
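
As a minimal sketch of an invocation that uses only the three required options, the snippet below assembles the argument list in Python and prints the resulting command line. The file names alignment.phy, start_tree.nwk and calib.txt are placeholders assumed for illustration; the command is only printed, not executed.

```python
import shlex


def required_args(seq_file, tree_file, calibration_file):
    """Return a phytime argument list holding only the mandatory options."""
    return [
        "phytime",
        "-i", seq_file,                     # sequence alignment (PHYLIP format)
        "-u", tree_file,                    # starting tree (Newick format)
        "--calibration", calibration_file,  # a priori boundaries for node ages
    ]


# Placeholder file names, assumed for illustration only.
cmd = required_args("alignment.phy", "start_tree.nwk", "calib.txt")
print(shlex.join(cmd))
```

The list form can be handed to subprocess.run as is, which avoids shell quoting problems with unusual file names.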

Command options:

-i (or --input) seq_file_name

seq_file_name is the name of the nucleotide or amino-acid sequence file in PHYLIP
format.

-d (or --datatype) data_type

data_type is 'nt' for nucleotide (default), 'aa' for amino-acid sequences, or
'generic' (use NEXUS file format and the 'symbols' parameter here).

-q (or --sequential)

Changes interleaved format (default) to sequential format.

-m (or --model) model

model : substitution model name.

- Nucleotide-based models : HKY85 (default) | JC69 | K80 | F81 | F84 | TN93 |
GTR | custom (*)

(*) : for the custom option, a string of six digits identifies the model. For
instance, 000000 corresponds to F81 (or JC69 provided the distribution of
nucleotide frequencies is uniform). 012345 corresponds to GTR. This option can
be used for encoding any model that is nested within GTR.

- Amino-acid based models : LG (default) | WAG | JTT | MtREV | Dayhoff | DCMut |
RtREV | CpREV | VT | Blosum62 | MtMam | MtArt | HIVw | HIVb | custom
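
The six-digit string of the custom nucleotide model can be checked before it is passed to -m. The sketch below assumes that each of the six positions stands for one substitution rate and that positions sharing a digit share a rate; this reading is consistent with the two examples above (000000 gives one shared rate, 012345 gives six distinct rates) but is not spelled out in this page. The function name rate_classes is hypothetical.

```python
def rate_classes(code):
    """Group the six positions of a custom model string by shared digit.

    Assumption: positions carrying the same digit share one rate.
    """
    if len(code) != 6 or any(c not in "0123456789" for c in code):
        raise ValueError("custom model must be a string of six digits")
    classes = {}
    for position, digit in enumerate(code):
        classes.setdefault(digit, []).append(position)
    return list(classes.values())


print(len(rate_classes("000000")))  # number of distinct rates for 000000
print(len(rate_classes("012345")))  # number of distinct rates for 012345
```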

--aa_rate_file filename

filename is the name of the file that provides the amino acid substitution rate
matrix in PAML format. It is compulsory to use this option when analysing amino
acid sequences with the `custom' model.

--calibration filename

filename is the name of the calibration file that provides a priori defined
boundaries for node ages. Please read the manual for more information about the
format of this file.

-t (or --ts/tv) ts/tv_ratio

ts/tv_ratio : transition/transversion ratio. DNA sequences only. Can be a fixed
positive value (e.g. 4.0) or e to get the maximum likelihood estimate.

-v (or --pinv) prop_invar

prop_invar : proportion of invariable sites. Can be a fixed value in the [0,1]
range or e to get the maximum likelihood estimate.

-c (or --nclasses) nb_subst_cat

nb_subst_cat : number of relative substitution rate categories. Default :
nb_subst_cat=4. Must be a positive integer.

-a (or --alpha) gamma

gamma : value of the gamma distribution shape parameter. Can be a fixed
positive value or e to get the maximum likelihood estimate.

-u (or --inputtree) user_tree_file

user_tree_file : starting tree filename. The tree must be in Newick format.

--r_seed num

num is the seed used to initiate the random number generator. Must be an integer.

--run_id ID_string

Append the string ID_string at the end of each PhyML output file. This option may
be useful when running simulations involving PhyML.

--quiet

No interactive question (for running in batch mode) and quiet output.

--no_memory_check

No interactive question for memory usage (for running in batch mode). Normal output
otherwise.

--chain_len num

num is the number of generations or runs of the Markov Chain Monte Carlo. Set to
1E+6 by default. Must be an integer.

--sample_freq num

The chain is sampled every num generations. Set to 1E+3 by default. Must be an
integer.
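
The two settings above together determine roughly how many samples a run collects. The short calculation below uses the stated defaults; whether the initial state of the chain is recorded as an extra sample is not stated here, so the count should be read as approximate. Passing the values as plain integers rather than in the 1E+6 notation is an assumption made to respect the "must be an integer" requirement.

```python
chain_len = int(1e6)    # default --chain_len (number of generations)
sample_freq = int(1e3)  # default --sample_freq (sampling interval)

# One sample is kept every sample_freq generations.
n_samples = chain_len // sample_freq
print(n_samples)

# Plain-integer strings suitable for a command line.
mcmc_args = ["--chain_len", str(chain_len), "--sample_freq", str(sample_freq)]
print(mcmc_args)
```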

--no_data

Use this option to sample from the priors only (rather than from the posterior
joint density of the model parameters).

--fastlk

Use the multivariate normal approximation to the likelihood and speed up
calculations.
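
The batch-oriented options can be combined into two runs that are often useful side by side: one sampling from the priors only and one using the fast likelihood approximation. The sketch below only builds the argument lists; the helper name batch_args, the seed value, the run identifiers and the file names are all assumptions chosen for illustration.

```python
def batch_args(base, seed, run_id, prior_only=False, fast_likelihood=False):
    """Extend a base phytime argument list with batch-mode options."""
    args = list(base) + ["--r_seed", str(seed), "--run_id", run_id, "--quiet"]
    if prior_only:
        args.append("--no_data")   # sample from the priors only
    if fast_likelihood:
        args.append("--fastlk")    # multivariate normal approximation
    return args


# Placeholder input files, assumed for illustration only.
base = ["phytime", "-i", "alignment.phy", "-u", "start_tree.nwk",
        "--calibration", "calib.txt"]

prior_run = batch_args(base, seed=42, run_id="prior", prior_only=True)
fast_run = batch_args(base, seed=42, run_id="fastlk", fast_likelihood=True)
print(" ".join(prior_run))
print(" ".join(fast_run))
```

Giving each run its own --run_id keeps the output files apart, and fixing --r_seed makes a run repeatable.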
