- training_filename:
- File path to the training data
- num_examples:
- Number of training examples
- num_features:
- Number of features
- positive:
- Label for positive examples
- testing_filename:
- File path to the testing data
- num_testing_examples:
- Number of testing examples
- max_sample_size:
- Number of examples to scan when generating the heuristic used in Sparrow
- max_bin_size:
- Maximum number of bins for discretizing continuous feature values
- min_gamma:
- Minimum value of the \gamma of the generated tree nodes
- default_gamma:
- Default maximum value of the \gamma for generating tree nodes
- max_trials_before_shrink:
- Maximum number of examples to scan before shrinking the value of \gamma
- min_ess:
- Minimum effective sample size for triggering resampling
- num_iterations:
- Number of boosting iterations
- max_leaves:
- Maximum number of tree leaves in each boosted tree
- channel_size:
- Maximum number of elements in the channel connecting scanner and sampler
- buffer_size:
- Number of examples in the sample set that need to be loaded into memory
- batch_size:
- Number of examples to process in each weak rule update
- serial_sampling:
- Set to true to stop running the sampler in the background of the scanner
- num_examples_per_block:
- Number of examples in each block of the stratified binary file
- disk_buffer_filename:
- File name for the stratified binary file
- num_assigners:
- Number of threads for putting examples back into the correct strata
- num_samplers:
- Number of threads for sampling examples from strata
- network:
- IP addresses of other machines in the network
- port:
- The network port used for parallel training
- local_name:
- Identifier for the local machine
- save_process:
- Flag for keeping all intermediate models during training (for debugging purposes)
- save_interval:
- Number of iterations between persisting models on disk
- debug_mode:
- Flag for activating debug mode
- models_table_filename:
- (For validation only) File names of the models to validate
- incremental_testing:
- Flag indicating if models are trained incrementally
- testing_scores_only:
- Flag for validation mode: set to true to output the raw scores of the testing examples, or to false to print the validation scores without the raw scores
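
With this many parameters, a misspelled name is easy to miss. The following is a minimal Python sketch, not part of Sparrow itself, that compares an already-parsed configuration (assumed here to be a plain `dict`) against the parameter names listed above. The helper name `check_config` is hypothetical, and the list does not say which parameters are required, so "missing" here only means "not set in this config".

```python
# Parameter names copied from the list above; the checking logic is
# an illustrative assumption, not Sparrow's own validation.
KNOWN_PARAMETERS = frozenset([
    "training_filename", "num_examples", "num_features", "positive",
    "testing_filename", "num_testing_examples", "max_sample_size",
    "max_bin_size", "min_gamma", "default_gamma", "max_trials_before_shrink",
    "min_ess", "num_iterations", "max_leaves", "channel_size", "buffer_size",
    "batch_size", "serial_sampling", "num_examples_per_block",
    "disk_buffer_filename", "num_assigners", "num_samplers", "network",
    "port", "local_name", "save_process", "save_interval", "debug_mode",
    "models_table_filename", "incremental_testing", "testing_scores_only",
])


def check_config(config):
    """Return (missing, unknown) parameter names for a parsed config mapping.

    `missing` lists known parameters that the config does not set;
    `unknown` lists keys in the config that are not in the list above
    (most often typos).
    """
    present = set(config)
    missing = sorted(KNOWN_PARAMETERS - present)
    unknown = sorted(present - KNOWN_PARAMETERS)
    return missing, unknown
```

For example, `check_config({"num_iterations": 10, "max_leafs": 4})` reports `max_leafs` as unknown and `max_leaves` among the parameters that are not set.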