# Earth – Multivariate adaptive regression splines

There have recently been some additions to the lineup of Orange learners. One of these is Orange.regression.earth.EarthLearner, an Orange interface to the Earth library by Stephen Milborrow, which implements multivariate adaptive regression splines (MARS).

So let's take it out for a spin on a simple toy dataset (data.tab, created with the Paint Data widget in Orange Canvas):

```
import Orange
from Orange.regression import earth
import numpy
from matplotlib import pylab as pl

data = Orange.data.Table("data.tab")
earth_predictor = earth.EarthLearner(data)

X, Y = data.to_numpy("A/C")

pl.plot(X, Y, ".r")

linspace = numpy.linspace(min(X), max(X), 20)
predictions = [earth_predictor([s, "?"]) for s in linspace]

pl.plot(linspace, predictions, "-b")
pl.show()
```

which produces the following plot:

We can also print the model representation using

`print earth_predictor`

which outputs:

```
Y =
1.013
+1.198 * max(0, X - 0.485)
-1.803 * max(0, 0.485 - X)
-1.321 * max(0, X - 0.283)
-1.609 * max(0, X - 0.640)
+1.591 * max(0, X - 0.907)
```
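Each term in the printout is a hinge function, max(0, ±(X − knot)), so the fitted model is just a piecewise-linear sum. As an illustration, the printed model can be re-evaluated in a few lines of plain Python (`hinge` and `predict` are hypothetical helper names, not part of Orange):

```python
def hinge(x, knot, sign=1):
    """The basic MARS basis function: max(0, sign * (x - knot))."""
    return max(0.0, sign * (x - knot))

def predict(x):
    # Coefficients and knots copied from the model printout above.
    return (1.013
            + 1.198 * hinge(x, 0.485)
            - 1.803 * hinge(x, 0.485, sign=-1)
            - 1.321 * hinge(x, 0.283)
            - 1.609 * hinge(x, 0.640)
            + 1.591 * hinge(x, 0.907))

print(predict(0.2))  # only the second hinge is active here
```

At x = 0.2 only the max(0, 0.485 − X) term is nonzero, so the prediction is 1.013 − 1.803 · 0.285.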

See the Orange.regression.earth reference for full documentation.

# Orange 2.5: code conversion

Orange 2.5 unifies Orange’s C++ core and Python modules into a single module hierarchy. To use the new hierarchy, import Orange instead of orange and the accompanying orng* modules. While we will maintain backward compatibility in 2.* releases, we nevertheless suggest that programmers use the new interface. The provided conversion tool can help refactor existing code to use it.

The conversion script, orange2to25.py, resides in Orange’s main directory. For example, to refactor accuracy8.py from the “Orange for beginners” tutorial, run `python orange2to25.py -w -o accuracy8_25.py doc/ofb-rst/code/accuracy8.py`.

The old code

```
import orange
import orngTest, orngStat, orngTree

# set up the learners
bayes = orange.BayesLearner()
tree = orngTree.TreeLearner(mForPruning=2)
bayes.name = "bayes"
tree.name = "tree"
learners = [bayes, tree]

# compute accuracies on data
data = orange.ExampleTable("voting")
res = orngTest.crossValidation(learners, data, folds=10)
cm = orngStat.computeConfusionMatrices(res,
    classIndex=data.domain.classVar.values.index('democrat'))
```

is refactored to

```
import Orange

# set up the learners
bayes = Orange.classification.bayes.NaiveLearner()
tree = Orange.classification.tree.TreeLearner(mForPruning=2)
bayes.name = "bayes"
tree.name = "tree"
learners = [bayes, tree]

# compute accuracies on data
data = Orange.data.Table("voting")
res = Orange.evaluation.testing.cross_validation(learners, data, folds=10)
cm = Orange.evaluation.scoring.compute_confusion_matrices(res,
    classIndex=data.domain.classVar.values.index('democrat'))
```

Read more about the refactoring tool on the wiki and on the help page (`python orange2to25.py --help`).
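The renames applied in the example above are mechanical, as the before/after listings show. As a purely illustrative sketch (the actual tool performs proper source-level refactoring, not naive string replacement, and this table is only a hand-picked subset of the real mapping):

```python
# Hypothetical subset of old->new identifier renames, for illustration only.
RENAMES = {
    "orange.BayesLearner": "Orange.classification.bayes.NaiveLearner",
    "orngTree.TreeLearner": "Orange.classification.tree.TreeLearner",
    "orange.ExampleTable": "Orange.data.Table",
    "orngTest.crossValidation": "Orange.evaluation.testing.cross_validation",
    "orngStat.computeConfusionMatrices":
        "Orange.evaluation.scoring.compute_confusion_matrices",
}

def rename(source):
    """Naively replace each old identifier with its new counterpart."""
    for old, new in RENAMES.items():
        source = source.replace(old, new)
    return source

print(rename("data = orange.ExampleTable('voting')"))
```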

# Random forest switches to Simple tree learner by default

Random forest classifiers now use Orange.classification.tree.SimpleTreeLearner by default, which considerably shortens their construction times.

Using a random forest classifier is easy.

```
import Orange

iris = Orange.data.Table('iris')
forest = Orange.ensemble.forest.RandomForestLearner(iris, trees=200)
for instance in iris:
    print forest(instance), instance.get_class()
```

The example above loads the iris dataset and trains a random forest classifier with 200 trees. The classifier is then used to label all training examples, printing its prediction alongside the actual class value.
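Conceptually, a random forest bags many randomized trees and classifies by majority vote. A toy sketch of that idea in plain Python, using one-split decision stumps with random thresholds in place of Orange's tree learners (all names here are hypothetical, not Orange API):

```python
import random
from collections import Counter

def majority(labels):
    """Most common label in a list (ties broken by first appearance)."""
    return Counter(labels).most_common(1)[0][0]

def train_stump(data):
    """Train one randomized 'tree' (a decision stump) on a bootstrap sample.

    data is a list of (x, label) pairs with scalar x; the stump splits at a
    random threshold and predicts the majority label on each side.
    """
    sample = [random.choice(data) for _ in data]        # bagging
    threshold = random.choice([x for x, _ in sample])   # random split point
    left = [y for x, y in sample if x <= threshold]
    right = [y for x, y in sample if x > threshold]
    left_label = majority(left or [y for _, y in sample])
    right_label = majority(right or [y for _, y in sample])
    return lambda x: left_label if x <= threshold else right_label

def forest_predict(stumps, x):
    """Classify x by majority vote over all stumps in the forest."""
    return majority([stump(x) for stump in stumps])

random.seed(0)
data = ([(i / 10.0, "a") for i in range(5)]
        + [(i / 10.0, "b") for i in range(5, 10)])
stumps = [train_stump(data) for _ in range(500)]
print(forest_predict(stumps, 0.15), forest_predict(stumps, 0.85))
```

Each individual stump is a weak, noisy classifier, but the vote over 500 of them recovers the boundary between the two classes reliably.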

Using SimpleTreeLearner instead of TreeLearner substantially reduces training time. The image below compares the construction times of random forest classifiers using SimpleTreeLearner and TreeLearner as the base learner.

By setting the base_learner parameter to TreeLearner, it is possible to revert to the original behaviour:

```
tree_learner = Orange.classification.tree.TreeLearner()
forest_orig = Orange.ensemble.forest.RandomForestLearner(base_learner=tree_learner)
```