NAME


mlpack_hoeffding_tree - Hoeffding trees

SYNOPSIS


mlpack_hoeffding_tree [-h] [-v] [-b] [-B int] [-c double] [-i] [-m string] [-l string] [-n int] [-I int] [-N string] [-o int] [-M string] [-s int] [-p string] [-P string] [-T string] [-L string] [-t string] [-V]

DESCRIPTION


This program implements Hoeffding trees, a form of streaming decision tree best suited to
large (or streaming) datasets. This program supports both categorical and numeric data
stored in the ARFF format. Given an input dataset, this program is able to train the tree
with numerous training options, and save the model to a file. The program is also able to
use a trained model or a model from file in order to predict classes for a given test set.

The training file and associated labels are specified with the --training_file and
--labels_file options, respectively. The training file must be in ARFF format. The
training may be performed in batch mode (like a typical decision tree algorithm) by
specifying the --batch_mode option, but this may not be the best option for large
datasets.
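
As a minimal sketch of a training run, the following shell snippet writes a tiny ARFF file and a labels file, then trains in the default streaming mode. The file names toy_train.arff and toy_labels.csv, the data values, and the one-label-per-line layout of the labels file are assumptions made for illustration; only the option names are taken from this page.

```shell
# Hypothetical toy dataset: one numeric and one categorical attribute.
cat > toy_train.arff <<'EOF'
@relation toy
@attribute size numeric
@attribute color {red,blue}
@data
1.0,red
1.5,red
7.0,blue
8.2,blue
EOF

# Assumed layout: one integer class label per line, in the same order as the data rows.
printf '0\n0\n1\n1\n' > toy_labels.csv

if command -v mlpack_hoeffding_tree >/dev/null 2>&1; then
  # Streaming (default) training; add --batch_mode to train in batch instead.
  mlpack_hoeffding_tree --training_file toy_train.arff \
                        --labels_file toy_labels.csv \
                        --verbose || echo "training failed" >&2
else
  echo "mlpack_hoeffding_tree not found; skipping the training run" >&2
fi
```

With only four points and the default --min_samples of 100, the tree is not expected to split; a real dataset would be far larger.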

When a model is trained, it may be saved to a file with the --output_model_file (-M)
option. A model may be loaded from file for further training or testing with the
--input_model_file (-m) option.
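
The save-then-resume workflow described above might look like the sketch below. All file names are hypothetical, and the .xml model extension is an assumption rather than something this page specifies. The commands only run if the program and the input files are present.

```shell
# Hypothetical file names; substitute your own data.
FIRST_DATA=chunk1.arff
FIRST_LABELS=chunk1_labels.csv
MORE_DATA=chunk2.arff
MORE_LABELS=chunk2_labels.csv
MODEL=tree.xml   # the .xml extension is an assumption

if command -v mlpack_hoeffding_tree >/dev/null 2>&1 \
   && [ -f "$FIRST_DATA" ] && [ -f "$MORE_DATA" ]; then
  # Run 1: train on the first chunk and save the tree.
  if mlpack_hoeffding_tree -t "$FIRST_DATA" -l "$FIRST_LABELS" -M "$MODEL"; then
    # Run 2: load the saved tree, train further on the second chunk, save again.
    mlpack_hoeffding_tree -m "$MODEL" -t "$MORE_DATA" -l "$MORE_LABELS" -M "$MODEL" \
      || echo "second training run failed" >&2
  else
    echo "first training run failed" >&2
  fi
  STATUS=ran
else
  STATUS=skipped
fi
echo "status: $STATUS"
```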

A test file may be specified with the --test_file (-T) option, and if performance numbers
are desired for that test set, labels may be specified with the --test_labels_file (-L)
option. Predictions for each test point will be stored in the file specified by
--predictions_file (-p), and probabilities for each prediction will be stored in the file
specified by the --probabilities_file (-P) option.
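
A prediction run with a previously saved model could be sketched as follows. The file names are hypothetical, and the test labels option is only appended when a labels file actually exists, since it is needed only for performance numbers.

```shell
# Hypothetical file names; substitute your own.
MODEL=tree.xml
TEST_DATA=test.arff
TEST_LABELS=test_labels.csv
PREDICTIONS=predictions.csv
PROBABILITIES=probabilities.csv

# Build the argument list; test labels are optional.
set -- --input_model_file "$MODEL" --test_file "$TEST_DATA" \
       --predictions_file "$PREDICTIONS" --probabilities_file "$PROBABILITIES"
if [ -f "$TEST_LABELS" ]; then
  set -- "$@" --test_labels_file "$TEST_LABELS"
fi

if command -v mlpack_hoeffding_tree >/dev/null 2>&1 \
   && [ -f "$MODEL" ] && [ -f "$TEST_DATA" ]; then
  mlpack_hoeffding_tree "$@" || echo "prediction run failed" >&2
  # Show the first few predicted labels if the file was written.
  if [ -f "$PREDICTIONS" ]; then
    head -n 5 "$PREDICTIONS"
  fi
else
  echo "program, model, or test file missing; would run: mlpack_hoeffding_tree $*" >&2
fi
```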

OPTIONS


--batch_mode (-b)
If true, samples will be considered in batch instead of as a stream. This generally
results in better trees but at the cost of memory usage and runtime.

--bins (-B) [int]
If the 'domingos' split strategy is used, this specifies the number of bins for
each numeric split. Default value 10.

--confidence (-c) [double]
Confidence before splitting (between 0 and 1). Default value 0.95.

--help (-h)
Default help info.

--info [string]
Get help on a specific module or option. Default value ''.

--info_gain (-i)
If set, information gain is used instead of Gini impurity for calculating Hoeffding
bounds.

--input_model_file (-m) [string]
File to load trained tree from. Default value ''.

--labels_file (-l) [string]
Labels for training dataset. Default value ''.

--max_samples (-n) [int]
Maximum number of samples before splitting. Default value 5000.

--min_samples (-I) [int]
Minimum number of samples before splitting. Default value 100.

--numeric_split_strategy (-N) [string]
The splitting strategy to use for numeric features: 'domingos' or 'binary'. Default
value 'binary'.

--observations_before_binning (-o) [int]
If the 'domingos' split strategy is used, this specifies the number of samples
observed before binning is performed. Default value 100.

--output_model_file (-M) [string]
File to save trained tree to. Default value ''.

--passes (-s) [int]
Number of passes to take over the dataset. Default value 1.

--predictions_file (-p) [string]
File to output label predictions for test data into. Default value ''.

--probabilities_file (-P) [string]
In addition to predicting labels, provide prediction probabilities in this file.
Default value ''.

--test_file (-T) [string]
File of testing data. Default value ''.

--test_labels_file (-L) [string]
Labels of test data. Default value ''.

--training_file (-t) [string]
Training dataset file. Default value ''.

--verbose (-v)
Display informational messages and the full list of parameters and timers at the
end of execution.

--version (-V)
Display the version of mlpack.
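
Several of the options above can be combined in one training run. The sketch below uses the 'domingos' numeric split strategy together with its two dependent options; the file names and every numeric value are arbitrary illustrations, not recommended settings.

```shell
# Hypothetical inputs and arbitrary example values.
TRAIN_DATA=train.arff
TRAIN_LABELS=labels.csv
STRATEGY=domingos
BINS=20            # only used by the 'domingos' strategy
PRE_BINNING=200    # samples observed before binning ('domingos' only)
CONFIDENCE=0.99
PASSES=3

if command -v mlpack_hoeffding_tree >/dev/null 2>&1 \
   && [ -f "$TRAIN_DATA" ] && [ -f "$TRAIN_LABELS" ]; then
  mlpack_hoeffding_tree --training_file "$TRAIN_DATA" \
                        --labels_file "$TRAIN_LABELS" \
                        --numeric_split_strategy "$STRATEGY" \
                        --bins "$BINS" \
                        --observations_before_binning "$PRE_BINNING" \
                        --confidence "$CONFIDENCE" \
                        --min_samples 200 \
                        --max_samples 10000 \
                        --passes "$PASSES" \
                        --info_gain \
                        --output_model_file tuned_tree.xml \
    || echo "training failed" >&2
else
  echo "program or input files missing; skipping the tuned training run" >&2
fi
```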

ADDITIONAL INFORMATION


For further information, including relevant papers, citations, and theory, consult the
documentation found at http://www.mlpack.org or included with your distribution of mlpack.

mlpack_hoeffding_tree(1)
