The svm_toolkit contains some methods to make this search process simpler. The main addition is Svm.cross_validation_search, which takes four required parameters and an optional fifth:
- a dataset for training, which is an instance of the Problem class.
- a dataset to use for cross-validation, which is also an instance of the Problem class.
- an array of values to try for the cost parameter. It is recommended that these grow exponentially, in powers of two.
- an array of values to try for the gamma parameter. These should also grow exponentially, in powers of two.
- optionally, a fifth parameter: passing true generates a contour plot of the cross-validation performance against the two parameters.
The image below shows an example contour plot. (The contour plot is drawn using PlotPackage.)
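
To make the idea of the search concrete, here is a minimal sketch of a grid search over cost and gamma in plain Ruby. It is not the toolkit's implementation: grid_search and toy_score are hypothetical names introduced only for illustration, and the scoring function is a stand-in for "train a model with these parameters and count the errors on the cross-validation set".

```ruby
# Hypothetical helper: try every (cost, gamma) pair and keep the best one.
# `score` is any callable returning a number where lower is better
# (for example, the number of errors on the cross-validation set).
def grid_search(costs, gammas, score)
  best = nil
  costs.each do |cost|
    gammas.each do |gamma|
      errors = score.call(cost, gamma)
      if best.nil? || errors < best[:errors]
        best = { cost: cost, gamma: gamma, errors: errors }
      end
    end
  end
  best
end

# Exponentially growing candidate values, in powers of two.
costs  = (-5..15).step(4).map { |n| 2.0**n }
gammas = (-15..3).step(6).map { |n| 2.0**n }

# Stand-in scoring function with its minimum at cost = 8, gamma = 0.125.
# A real search would train a model and evaluate it here instead.
toy_score = lambda do |cost, gamma|
  (Math.log2(cost) - 3).abs + (Math.log2(gamma) + 3).abs
end

best = grid_search(costs, gammas, toy_score)
puts "Best cost: #{best[:cost]}, best gamma: #{best[:gamma]}"
```

Searching powers of two keeps the grid small while still covering several orders of magnitude for each parameter.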

Rescaling Features
The Problem class provides a rescale method, which rescales each feature in the current problem to fall in a given range: the default is for the features to end up in the range [0, 1]. For example,
problem.rescale
will make sure the problem's features are all in the range [0, 1].
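
As a rough illustration of what rescaling to a range means, the following plain-Ruby sketch applies min-max scaling to each feature column. It is only an assumption about the general technique, not the code behind Problem's rescale method; rescale_features is a hypothetical helper, and its handling of constant features is a choice made for this example.

```ruby
# Hypothetical helper: linearly rescale each feature (column) to [lower, upper].
# `rows` is an array of feature vectors, all of the same length.
def rescale_features(rows, lower = 0.0, upper = 1.0)
  columns = rows.transpose.map do |column|
    min, max = column.minmax
    span = (max - min).to_f
    column.map do |value|
      # A constant feature has no spread, so map it to the lower bound.
      span.zero? ? lower : lower + (upper - lower) * (value - min) / span
    end
  end
  columns.transpose
end

data = [[1, 10], [2, 20], [3, 40]]
scaled = rescale_features(data)
scaled.each { |row| p row }
```

After scaling, the smallest value of each feature maps to 0.0 and the largest to 1.0, so no single feature dominates the kernel computation merely because of its units.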
Training and Evaluation
Assuming our data are divided into training, cross-validation and test sets, the following program will train, optimise and evaluate an RBF model:
require "svm_toolkit"
# load in datasets to use for training, cross validation, and testing
TrainingData = Problem.from_file "training_set.dat"
CrossValData = Problem.from_file "cross_val_set.dat"
TestData = Problem.from_file "test_set.dat"
# Make sure all features are in range [0, 1]
TrainingData.rescale
CrossValData.rescale
TestData.rescale
# decide on the range of costs and gammas to search over
# (2.0 is used so that negative exponents give Floats rather than Rationals)
Costs = [-5, -3, -1, 0, 1, 3, 5, 8, 10, 13, 15].collect {|n| 2.0**n}
Gammas = [-15, -12, -8, -5, -3, -1, 1, 3, 5, 7, 9].collect {|n| 2.0**n}
# create the best model, and display the contour plot of results
best_model = Svm.cross_validation_search(
  TrainingData,
  CrossValData,
  Costs,
  Gammas,
  true
)
# evaluate model on the test set
puts "Test set errors: #{best_model.evaluate_dataset(TestData)}"
# save the model for later use
best_model.save "model.dat"