mlpack_logistic_regression - Online in the Cloud

NAME


mlpack_logistic_regression - l2-regularized logistic regression and prediction

SYNOPSIS


mlpack_logistic_regression [-h] [-v] [-d double] [-m string] [-l string] [-L double] [-n int] [-O string] [-o string] [-M string] [-s double] [-T string] [-e double] [-t string] [-V]

DESCRIPTION


An implementation of L2-regularized logistic regression using either the L-BFGS optimizer
or SGD (stochastic gradient descent). This solves the regression problem

y = 1 / (1 + e^-(X * b))

where y takes values 0 or 1.
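
As a minimal sketch of the model above (an illustration, not mlpack's implementation), the logistic function can be evaluated in plain Python. The names logistic and predict_probability and the sample coefficients are assumptions made for this example.

```python
import math

def logistic(z):
    """Numerically stable logistic function 1 / (1 + e^-z)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_probability(x, b):
    """Return logistic(x . b) for one data point x and coefficients b."""
    z = sum(xi * bi for xi, bi in zip(x, b))
    return logistic(z)

# Hypothetical coefficients and data point, chosen so that x . b = 0.
b = [0.5, -1.0]
p = predict_probability([2.0, 1.0], b)
print(p)
```

The output of the logistic function always lies between 0 and 1; the class label y is obtained by comparing it against a decision boundary.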

This program allows loading a logistic regression model from a file (-m) or training a
logistic regression model given training data (-t), or both those things at once. In
addition, this program allows classification on a test dataset (-T) and will save the
classification results to the given output file (-o). The logistic regression model itself
may be saved to the file specified with the -M option.

The training data given with the -t option should have class labels as its last dimension
(so, if the training data is in CSV format, labels should be the last column).
Alternately, the -l (--labels_file) option may be used to specify a separate file of
labels.
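
To illustrate the expected layout, the following sketch splits a small CSV training set into predictors and labels, taking the label from the last column. The four data points are made up for the example, and the parsing is ordinary Python rather than mlpack's own loader.

```python
import csv
import io

# Hypothetical training set: two predictors per row, then the 0/1 label.
training_csv = "0.1,1.2,0\n0.4,0.9,0\n2.3,0.2,1\n1.9,0.1,1\n"

predictors, labels = [], []
for row in csv.reader(io.StringIO(training_csv)):
    *features, label = row          # the last column is the class label
    predictors.append([float(v) for v in features])
    labels.append(int(label))

# Only the two-class case is supported, so every label must be 0 or 1.
if not set(labels) <= {0, 1}:
    raise ValueError("labels must be 0 or 1")

print(predictors)
print(labels)
```

If the labels are kept in a separate file instead, the training file would contain only the predictor columns.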

When a model is being trained, there are many options. L2 regularization (to prevent
overfitting) can be specified with the -L (--lambda) option, and the optimizer used to train the
model can be specified with the --optimizer option. Available options are 'sgd'
(stochastic gradient descent) and 'lbfgs' (the L-BFGS optimizer). There are also various
parameters for the optimizer; the --max_iterations parameter specifies the maximum number
of allowed iterations, and the --tolerance (-e) parameter specifies the tolerance for
convergence. For the SGD optimizer, the --step_size parameter controls the step size
taken at each iteration by the optimizer. If the objective function for your data is
oscillating between Inf and 0, the step size is probably too large. There are more
parameters for the SGD and L-BFGS optimizers, but the C++ interface must be used to access
these.
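
The roles of the regularization parameter, step size, iteration limit, and tolerance can be sketched with a small SGD loop. This is an illustration only, not mlpack's optimizer: the objective (negative log-likelihood plus (lambda / 2) * ||b||^2), the cyclic visiting order, the once-per-pass convergence check, and all function names are assumptions made for the example.

```python
import math

def logistic(z):
    """Numerically stable logistic function 1 / (1 + e^-z)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def dot(x, b):
    return sum(xi * bi for xi, bi in zip(x, b))

def objective(X, y, b, lam):
    """Negative log-likelihood plus the L2 penalty (lam / 2) * ||b||^2."""
    total = 0.0
    for xi, yi in zip(X, y):
        z = dot(xi, b)
        # log(1 + e^z) - y * z, written to avoid overflow for large |z|
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - yi * z
    return total + 0.5 * lam * dot(b, b)

def train_sgd(X, y, lam=0.0, step_size=0.01, max_iterations=10000,
              tolerance=1e-10):
    n = len(X)
    b = [0.0] * len(X[0])
    previous = objective(X, y, b, lam)
    iteration = 0
    # max_iterations == 0 means no limit, as with --max_iterations.
    while max_iterations == 0 or iteration < max_iterations:
        xi, yi = X[iteration % n], y[iteration % n]   # fixed cyclic order
        error = logistic(dot(xi, b)) - yi
        # Gradient of this point's loss plus its share of the L2 penalty.
        b = [bj - step_size * (error * xj + lam / n * bj)
             for bj, xj in zip(b, xi)]
        iteration += 1
        if iteration % n == 0:                        # once per full pass
            current = objective(X, y, b, lam)
            if abs(previous - current) < tolerance:
                break
            previous = current
    return b

# Hypothetical data; the first predictor is a constant 1 (intercept term).
X = [[1.0, 0.1], [1.0, 0.4], [1.0, 2.3], [1.0, 1.9]]
y = [0, 0, 1, 1]
start = objective(X, y, [0.0, 0.0], lam=0.1)
b = train_sgd(X, y, lam=0.1, step_size=0.1, max_iterations=2000)
end = objective(X, y, b, lam=0.1)
print(start, end, b)
```

In this sketch a larger step_size makes each update move further along the per-point gradient, which is why an overly large value can make the objective jump around instead of decreasing.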

Optionally, the model can be used to predict the responses for another matrix of data
points, if --test_file is specified. The --test_file option can be specified without
--training_file, so long as an existing logistic regression model is given with
--input_model_file. The output predictions from the logistic regression model are stored
in the file given with --output_file.
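
Prediction can be sketched as evaluating the logistic function for each test point and comparing the result with the decision boundary (see --decision_boundary under OPTIONS). The coefficients, test points, and the function name classify are hypothetical; this is an illustration under those assumptions, not mlpack's code.

```python
import math

def logistic(z):
    """Numerically stable logistic function 1 / (1 + e^-z)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def classify(points, b, decision_boundary=0.5):
    """Return 0 when the logistic output is below the boundary, else 1."""
    predicted = []
    for x in points:
        p = logistic(sum(xi * bi for xi, bi in zip(x, b)))
        predicted.append(0 if p < decision_boundary else 1)
    return predicted

# Hypothetical model; the first predictor is a constant 1 (intercept term).
b = [-2.0, 2.0]
test_points = [[1.0, 0.2], [1.0, 1.0], [1.0, 2.0]]

predictions = classify(test_points, b)                     # boundary 0.5
stricter = classify(test_points, b, decision_boundary=0.9)
print(predictions, stricter)
```

Raising the boundary makes the sketch assign class 1 only to points whose logistic output is at least that value, so fewer points are labeled 1.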

This implementation of logistic regression does not support the general multi-class case
but instead only the two-class case. Any responses must be either 0 or 1.

OPTIONS


--decision_boundary (-d) [double]
Decision boundary for prediction; if the logistic function for a point is less than
the boundary, the class is taken to be 0; otherwise, the class is 1. Default value
0.5.

--help (-h)
Default help info.

--info [string]
Get help on a specific module or option. Default value ''.

--input_model_file (-m) [string]
File containing existing model (parameters). Default value ''.

--labels_file (-l) [string]
A file containing labels (0 or 1) for the points in the training set (y). Default
value ''.

--lambda (-L) [double]
L2-regularization parameter for training. Default value 0.

--max_iterations (-n) [int]
Maximum iterations for optimizer (0 indicates no limit). Default value 10000.

--optimizer (-O) [string]
Optimizer to use for training ('lbfgs' or 'sgd'). Default value 'lbfgs'.

--output_file (-o) [string]
If --test_file is specified, this file is where the predicted responses will be
saved. Default value ''.

--output_model_file (-M) [string]
File to save trained logistic regression model to. Default value ''.

--step_size (-s) [double]
Step size for SGD optimizer. Default value 0.01.

--test_file (-T) [string]
File containing test dataset. Default value ''.

--tolerance (-e) [double]
Convergence tolerance for optimizer. Default value 1e-10.

--training_file (-t) [string]
A file containing the training set (the matrix of predictors, X). Default value ''.

--verbose (-v)
Display informational messages and the full list of parameters and timers at the
end of execution.

--version (-V)
Display the version of mlpack.

ADDITIONAL INFORMATION


For further information, including relevant papers, citations, and theory, consult the
documentation found at http://www.mlpack.org or included with your distribution of
mlpack.

mlpack_logistic_regression(1)
