Wednesday, August 25, 2010

Using libsvm from jruby

The latest libsvm distribution contains a Java version of the libsvm system. I want to use this from jruby, but there is a problem. The Java version of libsvm has been written based directly on the C code, so its class names begin with lowercase letters (svm_node, svm_model and so on). Ruby expects class names to be constants, which must start with a capital letter, and this upsets the import process.

Correcting this is fairly simple: all the important classes need names that begin with a capital letter. Having done this and recompiled the code, I am making it available here. You can unpack the code (using 'jar xf libsvm.jar') to find the source files. Or, you can just 'require "libsvm"' to use the contents within a jruby program.

Using this from jruby is not as smooth as it might be: some convenience functions to convert ruby data into the SVM format would be helpful, but for now here is a simple example of learning and then categorising some vectors. (The example is adapted from that in a great introduction to SVMs.)

require "java"
require "libsvm"

import "libsvm.Parameter"
import "libsvm.Model"
import "libsvm.Problem"
import "libsvm.Node"
import "libsvm.Svm"

puts "Classification with LIBSVM"
puts "--------------------------"

# Sample dataset: the 'Play Tennis' dataset
# from T. Mitchell, Machine Learning (1997)
# --------------------------------------------
# Labels for each instance in the training set
# 1 = Play, 0 = Not
@@labels = [0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0]

# Recoding the attribute values into range [0, 1]
@@instances = [
[0.0,1.0,1.0,0.0],
[0.0,1.0,1.0,1.0],
[0.5,1.0,1.0,0.0],
[1.0,0.5,1.0,0.0],
[1.0,0.0,0.0,0.0],
[1.0,0.0,0.0,1.0],
[0.5,0.0,0.0,1.0],
[0.0,0.5,1.0,0.0],
[0.0,0.0,0.0,0.0],
[1.0,0.5,0.0,0.0],
[0.0,0.5,0.0,1.0],
[0.5,0.5,1.0,1.0],
[0.5,1.0,0.0,0.0],
[1.0,0.5,1.0,1.0]
]

# create some arbitrary train/test split
@@training_labels = @@labels.slice(0,10)
@@training_instances = @@instances.slice(0,10)
# slice takes (start, length): the last 4 instances form the test set
@@test_labels = @@labels.slice(10,4)
@@test_instances = @@instances.slice(10,4)

# convert vector of values into a vector of Nodes
def convert(values)
  ns = Node[values.size].new
  values.each_with_index do |v, i|
    n = Node.new
    n.index = i
    n.value = v
    ns[i] = n
  end
  ns
end

# Define kernel parameters
# -- changing these makes the difference between something working or not
@@pa = Parameter.new
@@pa.C = 10
@@pa.svm_type = Parameter::NU_SVC
@@pa.degree = 1
@@pa.coef0 = 0
@@pa.eps = 0.001
@@pa.probability = 0
@@pa.nu = 0.5
@@pa.gamma = 100

@@sp = Problem.new

# Add documents to the training set
@@sp.l = @@training_labels.size
@@sp.x = Node[@@training_instances.size][@@training_instances[0].size].new
@@training_instances.each_with_index do |instance, i|
  instance.each_with_index do |v, j|
    n = Node.new
    n.index = j
    n.value = v
    @@sp.x[i][j] = n
  end
end
@@sp.y = Java::double[@@training_labels.size].new
@@training_labels.each_with_index {|v, i| @@sp.y[i] = v}

# Try four different Kernels
@@kernels = [ Parameter::LINEAR, Parameter::POLY, Parameter::RBF, Parameter::SIGMOID ]
@@kernel_names = [ 'Linear', 'Polynomial', 'Radial basis function', 'Sigmoid' ]

@@kernels.each_index do |j|

  # Train a model with this kernel type
  @@pa.kernel_type = @@kernels[j]
  m = Svm.svm_train(@@sp, @@pa)
  errors = 0

  # Test kernel performance on the training set
  @@training_labels.each_index do |i|
    pred = Svm.svm_predict(m, @@sp.x[i])
    puts "Prediction: #{pred}, True label: #{@@training_labels[i]}, Kernel: #{@@kernel_names[j]}"
    errors += 1 if @@training_labels[i] != pred
  end
  puts "Kernel #{@@kernel_names[j]} made #{errors} errors on the training set"

  # Test kernel performance on the test set
  errors = 0
  @@test_labels.each_index do |i|
    pred = Svm.svm_predict(m, convert(@@test_instances[i]))
    puts "\t Prediction: #{pred}, True label: #{@@test_labels[i]}"
    errors += 1 if @@test_labels[i] != pred
  end

  puts "Kernel #{@@kernel_names[j]} made #{errors} errors on the test set \n\n"
end
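
The convenience functions wished for above might look something like the following. This is only a rough sketch in plain Ruby, not part of the jar: SketchNode is a hypothetical stand-in for the Java Node class so that the code runs without libsvm, and the RECODING table is an assumption, inferred by comparing the numeric instances in the listing with the Play Tennis table in Mitchell's book. The names recode and to_nodes are likewise made up for illustration.

```ruby
# Hypothetical stand-in for the Java Node class (index and value fields),
# so the sketch runs without the jar.
SketchNode = Struct.new(:index, :value)

# Assumed recoding of the Play Tennis attributes into the range [0, 1].
RECODING = {
  outlook:     { sunny: 0.0, overcast: 0.5, rain: 1.0 },
  temperature: { cool: 0.0, mild: 0.5, hot: 1.0 },
  humidity:    { normal: 0.0, high: 1.0 },
  wind:        { weak: 0.0, strong: 1.0 }
}.freeze

# Turn one day's attributes into a numeric vector, in the order of RECODING.
# fetch raises KeyError on an unknown attribute or value.
def recode(day)
  RECODING.map { |attribute, codes| codes.fetch(day.fetch(attribute)) }
end

# Turn a numeric vector into a Ruby array of index/value nodes.
def to_nodes(values, node_class = SketchNode)
  values.each_with_index.map do |v, i|
    node = node_class.new
    node.index = i
    node.value = v
    node
  end
end

day = { outlook: :sunny, temperature: :hot, humidity: :high, wind: :weak }
vector = recode(day)
nodes = to_nodes(vector)
puts vector.inspect
puts nodes.map { |n| "#{n.index}:#{n.value}" }.join(" ")
```

Under jruby the real Node class could presumably be passed as the second argument to to_nodes, but the result is still a Ruby array, so it would need converting to a Java array (as the convert function in the listing does) before being handed to svm_predict.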

In terms of performance, you don't appear to lose anything by switching to jruby and the Java version of libsvm. I reran my experiment from an earlier post (with suitable changes to the code): it ran in the same amount of time and, of course, got pretty much the same result.

