NAME


mlpack_gmm_train - Gaussian mixture model (GMM) training

SYNOPSIS


mlpack_gmm_train [-h] [-v] -g int -i string [-m string] [-n int] [-P] [-N double] [-M string] [-p double] [-r] [-S int] [-s int] [-T double] [-t int] [-V]

DESCRIPTION


This program fits a Gaussian mixture model (GMM) to the input data, using the EM
algorithm to find the maximum likelihood estimate of the model parameters. The trained
model may be saved to a file, which will contain information about each Gaussian.
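
As an illustration of the EM procedure mentioned above, here is a minimal one-dimensional sketch in Python. It is not mlpack's implementation: the function name em_gmm_1d, the quantile-based initialization, and the variance floor are assumptions made for this example, and the special case of 0 iterations is not handled. The default arguments simply mirror the documented defaults of --max_iterations and --tolerance.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Log-likelihood of the data under the mixture."""
    return sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in data
    )


def em_gmm_1d(data, k, max_iterations=250, tolerance=1e-10):
    """Fit a k-component 1-D GMM with EM (illustrative sketch only)."""
    n = len(data)
    ordered = sorted(data)
    # Hypothetical initialization: evenly spaced quantiles as starting means.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k

    ll = log_likelihood(data, weights, means, variances)
    for _ in range(max_iterations):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            resp.append([value / total for value in p])

        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # assumed floor to avoid zero variance

        new_ll = log_likelihood(data, weights, means, variances)
        converged = abs(new_ll - ll) < tolerance
        ll = new_ll
        if converged:
            break
    return weights, means, variances, ll


# Usage: two well-separated clusters, around 0 and around 10.
rng = random.Random(0)
sample = ([rng.gauss(0.0, 1.0) for _ in range(100)]
          + [rng.gauss(10.0, 1.0) for _ in range(100)])
weights, means, variances, ll = em_gmm_1d(sample, 2)
print(weights, means, variances, ll)
```

The loop stops either when the change in log-likelihood drops below the tolerance or when the iteration limit is reached, which is the role the two corresponding options play in this sketch.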

If GMM training fails with an error indicating that a covariance matrix could not be
inverted, make sure that the --no_force_positive flag is not specified. Alternatively,
adding a small amount of Gaussian noise (using the --noise parameter) to the entire
dataset may help prevent Gaussians with zero variance in a particular dimension, which is
usually the cause of non-invertible covariance matrices.
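
To see why a small amount of noise helps, the following Python sketch builds a dataset whose second dimension is constant (zero variance) and adds zero-mean Gaussian noise to every coordinate. The helper names add_noise and variance are hypothetical and this is not mlpack code; the only detail taken from this page is that --noise is given as a variance, so the sketch uses its square root as the standard deviation.

```python
import math
import random


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def add_noise(points, noise_variance, seed=42):
    """Add zero-mean Gaussian noise with the given variance to every coordinate."""
    rng = random.Random(seed)
    sigma = math.sqrt(noise_variance)  # standard deviation from variance
    return [[x + rng.gauss(0.0, sigma) for x in point] for point in points]


# The second coordinate is constant, so its variance is exactly zero.
points = [[float(i), 5.0] for i in range(50)]
noisy = add_noise(points, noise_variance=1e-4)

print(variance([p[1] for p in points]))  # zero before adding noise
print(variance([p[1] for p in noisy]))   # small but positive afterwards
```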

If set, the --no_force_positive flag skips the checks made after each iteration of the EM
algorithm to ensure that the covariance matrices are positive definite. Specifying the
flag can reduce runtime, but may also produce covariance matrices that are not positive
definite, which will cause the program to crash.
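
This page does not say how the positive-definiteness check is carried out. One common way to test the property, shown here purely as an assumption-laden sketch in Python, is to attempt a Cholesky factorization: a symmetric matrix is positive definite exactly when every pivot is strictly positive. The function name is_positive_definite is hypothetical.

```python
import math


def is_positive_definite(matrix):
    """Return True if a symmetric matrix admits a Cholesky factorization."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - s
                if pivot <= 0.0:
                    return False  # non-positive pivot: not positive definite
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return True


# A well-behaved covariance matrix.
print(is_positive_definite([[2.0, 0.5], [0.5, 1.0]]))
# Zero variance in the second dimension: singular, so not positive definite.
print(is_positive_definite([[1.0, 0.0], [0.0, 0.0]]))
```

The second matrix corresponds to the situation described earlier, a Gaussian with zero variance in one dimension, whose covariance cannot be inverted.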

Optionally, multiple trials may be performed by specifying the --trials option. The model
with the greatest log-likelihood will be kept.
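
The selection rule for multiple trials can be sketched in a few lines of Python. The function best_of_trials and the stand-in trainer with canned results are hypothetical; a real trainer would run EM from a different random start on each trial.

```python
def best_of_trials(train_once, trials):
    """Run train_once(trial) several times; keep the highest log-likelihood."""
    best_model, best_ll = None, float("-inf")
    for trial in range(trials):
        model, ll = train_once(trial)
        if ll > best_ll:
            best_model, best_ll = model, ll
    return best_model, best_ll


# Stand-in trainer with canned (model, log-likelihood) results.
canned = [("model-a", -512.3), ("model-b", -498.7), ("model-c", -505.1)]


def fake_train(trial):
    return canned[trial]


print(best_of_trials(fake_train, trials=3))
```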

REQUIRED OPTIONS


--gaussians (-g) [int]
Number of Gaussians in the GMM.

--input_file (-i) [string]
File containing the data on which the model will be fit.

OPTIONS


--help (-h)
Default help info.

--info [string]
Get help on a specific module or option. Default value ''.

--input_model_file (-m) [string]
File containing initial input GMM model. Default value ''.

--max_iterations (-n) [int]
Maximum number of iterations of the EM algorithm (passing 0 will run until
convergence). Default value 250.

--no_force_positive (-P)
Do not force the covariance matrices to be positive definite.

--noise (-N) [double]
Variance of zero-mean Gaussian noise to add to data. Default value 0.

--output_model_file (-M) [string]
File to save trained GMM model to. Default value ''.

--percentage (-p) [double]
If using --refined_start, specify the percentage of the dataset used for each
sampling (should be between 0.0 and 1.0). Default value 0.02.

--refined_start (-r)
During the initialization, use refined initial positions for k-means clustering
(Bradley and Fayyad, 1998).

--samplings (-S) [int]
If using --refined_start, specify the number of samplings used for initial points.
Default value 100.

--seed (-s) [int]
Random seed. If 0, 'std::time(NULL)' is used. Default value 0.

--tolerance (-T) [double]
Tolerance for convergence of EM. Default value 1e-10.

--trials (-t) [int]
Number of trials to perform in training GMM. Default value 1.

--verbose (-v)
Display informational messages and the full list of parameters and timers at the
end of execution.

--version (-V)
Display the version of mlpack.
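
As a usage illustration, the following Python sketch assembles a command line from the long options listed above. The helper gmm_train_command and the file names data.csv and gmm.xml are hypothetical; only the option names are taken from this page.

```python
import shlex


def gmm_train_command(input_file, gaussians, output_model_file=None,
                      trials=1, noise=0.0, refined_start=False):
    """Build an argument list for mlpack_gmm_train from documented options."""
    cmd = ["mlpack_gmm_train",
           "--input_file", input_file,
           "--gaussians", str(gaussians)]
    if output_model_file:
        cmd += ["--output_model_file", output_model_file]
    if trials != 1:
        cmd += ["--trials", str(trials)]
    if noise > 0.0:
        cmd += ["--noise", str(noise)]
    if refined_start:
        cmd.append("--refined_start")
    return cmd


command = gmm_train_command("data.csv", 3, output_model_file="gmm.xml", trials=5)
print(shlex.join(command))
# prints: mlpack_gmm_train --input_file data.csv --gaussians 3 --output_model_file gmm.xml --trials 5
```

On a system where mlpack is installed, the resulting list could be passed to subprocess.run(command, check=True).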

ADDITIONAL INFORMATION


For further information, including relevant papers, citations, and theory, consult the
documentation found at http://www.mlpack.org or included with your distribution of mlpack.

mlpack_gmm_train(1)
