Earth – Multivariate adaptive regression splines

There have recently been some additions to the lineup of Orange learners. One of these is Earth: an Orange interface to the Earth library written by Stephen Milborrow, implementing multivariate adaptive regression splines.

So let's take it out for a spin on a simple toy dataset (created using the Paint Data widget in the Orange Canvas):

import Orange
from Orange.regression import earth
import numpy
from matplotlib import pylab as pl

data ="")  # load the painted toy dataset
earth_predictor = earth.EarthLearner(data)

X, Y = data.to_numpy("A/C")

pl.plot(X, Y, ".r")

linspace = numpy.linspace(min(X), max(X), 20)
predictions = [earth_predictor([s, "?"]) for s in linspace]

pl.plot(linspace, predictions, "-b")

which produces the following plot:

Earth predictions

We can also print the model representation using

print earth_predictor

which outputs:

Y =
   +1.198 * max(0, X - 0.485)
   -1.803 * max(0, 0.485 - X)
   -1.321 * max(0, X - 0.283)
   -1.609 * max(0, X - 0.640)
   +1.591 * max(0, X - 0.907)

See reference for full documentation.
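The printed model is simply a sum of hinge functions of the form max(0, X - knot). As a sanity check, the sketch below evaluates that exact expression in plain Python, with the coefficients and knots copied from the printout above (no Orange required; any intercept term is omitted, as in the printout):

```python
def earth_model(x):
    """Evaluate the Earth model printed above: a sum of hinge terms."""
    return (+1.198 * max(0.0, x - 0.485)
            - 1.803 * max(0.0, 0.485 - x)
            - 1.321 * max(0.0, x - 0.283)
            - 1.609 * max(0.0, x - 0.640)
            + 1.591 * max(0.0, x - 0.907))
```

At x = 0.485, for instance, only the third hinge is active, so the prediction is -1.321 * (0.485 - 0.283).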

(Edit: Added link to the dataset file)

Orange 2.5: code conversion

Orange 2.5 unifies Orange's C++ core and Python modules into a single module hierarchy. To use the new module hierarchy, import Orange instead of orange and the accompanying orng* modules. While we will maintain backward compatibility throughout the 2.* releases, we nevertheless suggest that programmers use the new interface. The provided conversion tool can help refactor your code accordingly.

The conversion script resides in Orange's main directory. To refactor code from the "Orange for beginners" tutorial, run python -w -o doc/ofb-rst/code/

The old code

import orange
import orngTest, orngStat, orngTree

# set up the learners
bayes = orange.BayesLearner()
tree = orngTree.TreeLearner(mForPruning=2) = "bayes" = "tree"
learners = [bayes, tree]

# compute accuracies on data
data = orange.ExampleTable("voting")
res = orngTest.crossValidation(learners, data, folds=10)
cm = orngStat.computeConfusionMatrices(res,

is refactored to

import Orange

# set up the learners
bayes = Orange.classification.bayes.NaiveLearner()
tree = Orange.classification.tree.TreeLearner(mForPruning=2) = "bayes" = "tree"
learners = [bayes, tree]

# compute accuracies on data
data ="voting")
res = Orange.evaluation.testing.cross_validation(learners, data, folds=10)
cm = Orange.evaluation.scoring.compute_confusion_matrices(res,

Read more about the refactoring tool on the wiki and on the help page (python --help).

Random forest switches to Simple tree learner by default

Random forest classifiers now use Orange.classification.tree.SimpleTreeLearner by default, which considerably shortens their construction times.

Using a random forest classifier is easy.

import Orange

iris ='iris')
forest = Orange.ensemble.forest.RandomForestLearner(iris, trees=200)
for instance in iris:
    print forest(instance), instance.get_class()

The example above loads the iris dataset and trains a random forest classifier with 200 trees. The classifier is then used to label all training examples, printing its prediction alongside the actual class value.
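A random forest's prediction is a majority vote over the predictions of its trees. The sketch below illustrates just that voting step in plain Python; the "trees" here are trivial stand-in classifiers, not Orange's decision trees:

```python
from collections import Counter

def majority_vote(trees, instance):
    """Return the class label predicted by the most trees."""
    votes = Counter(tree(instance) for tree in trees)
    return votes.most_common(1)[0][0]

# Three stub "trees": two threshold the first feature, one always votes setosa.
trees = [
    lambda inst: "setosa" if inst[0] < 5.0 else "versicolor",
    lambda inst: "setosa" if inst[0] < 5.5 else "versicolor",
    lambda inst: "setosa",
]
```

For an instance with first feature 6.0, two of the three stub trees vote "versicolor", so that is the ensemble's prediction.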

Using SimpleTreeLearner instead of TreeLearner substantially reduces training time. The image below compares construction times of random forest classifiers using SimpleTreeLearner and TreeLearner as the base learner.

Random Forest

By setting the base_learner parameter to TreeLearner, it is possible to revert to the original behaviour:

tree_learner = Orange.classification.tree.TreeLearner()
forest_orig = Orange.ensemble.forest.RandomForestLearner(base_learner=tree_learner)