Model replaces Classify and Regression

Have you recently wondered where Classification Tree went? Or what happened to Majority?

Orange 3.4.0 introduced a new widget category, Model, which now contains all supervised learning algorithms in one place and replaces the separate Classify and Regression categories.

This was more than a cosmetic change to the widget hierarchy, however. We wanted to simplify the interface for new users and make it easier to find an appropriate learning algorithm. Moreover, you can now reuse some workflows on data sets with different kinds of target variables, say housing.tab (numeric target) and iris.tab (categorical target)!

Leading up to this change, many algorithms were refactored so that the regression and classification versions of the same method were merged into a single widget (and a single class in the underlying Python API). For example, Classification Tree and Regression Tree have become simply Tree, which can model either categorical or numeric target variables. The same goes for SVM, kNN, Random Forest, …
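To make the idea of a merged learner concrete, here is a minimal sketch in plain Python. It is not Orange's actual code: the class names `MajorityLearner`, `MeanLearner` and `ConstantLearner` and the `fit` method are hypothetical, chosen only for illustration, and the sketch guesses the task from the Python type of the target values, whereas a real library would look at the declared type of the target variable.

```python
from collections import Counter
from statistics import mean


class MajorityLearner:
    """Hypothetical classification baseline: predicts the most frequent class."""

    def fit(self, targets):
        majority = Counter(targets).most_common(1)[0][0]
        return lambda: majority


class MeanLearner:
    """Hypothetical regression baseline: predicts the mean target value."""

    def fit(self, targets):
        average = mean(targets)
        return lambda: average


class ConstantLearner:
    """Hypothetical merged learner that picks a baseline from the target type."""

    def fit(self, targets):
        # Simplifying assumption: numeric Python values mean a regression task.
        is_numeric = all(
            isinstance(t, (int, float)) and not isinstance(t, bool)
            for t in targets
        )
        delegate = MeanLearner() if is_numeric else MajorityLearner()
        return delegate.fit(targets)


learner = ConstantLearner()
species_model = learner.fit(["setosa", "virginica", "virginica"])
price_model = learner.fit([21.0, 24.0, 30.0])

print(species_model())  # virginica
print(price_model())    # 25.0
```

The same `learner` object handles both targets, which is what lets one workflow serve a classification and a regression data set.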

Have you ever searched for a widget by typing its name and been confused by the multiple options appearing in the search box? Now you do not need to decide whether you need Classification SVM or Regression SVM. You can just select SVM and spend the rest of your time doing actual data analysis!

Here is a quick wrap-up:

  • Majority and Mean became Constant.
  • Classification Tree and Regression Tree became Tree. In the same manner, Random Forest and Regression Forest became Random Forest.
  • SVM, SGD, AdaBoost and kNN now work for both classification and regression tasks.
  • Linear Regression only works for regression.
  • Logistic Regression, Naive Bayes and CN2 Rule Induction only work for classification.

Sorry about the last part, we really couldn’t do anything about the very nature of these algorithms! 🙂


The Beauty of Random Forest

It is the time of the year when we adore Christmas trees. But these are not the only trees that we at the Orange team think about. In fact, after an almost life-long professional habit of thinking like a data scientist, when I think about trees I often think about classification and regression trees. And they can be beautiful as well. Not only for their elegance in explaining hidden patterns, but also aesthetically, when rendered in Orange. And even more beautiful than a single tree is Orange’s rendering of a forest, that is, a random forest.

Related: Pythagorean Trees and Forests

Here are six trees in the random forest constructed on the housing data set:

The random forest for the annealing data set includes a set of smaller trees:

A Christmas-lit random forest inferred from the pen digits data set looks somewhat messy as it tries to separate ten different classes:

The power of beauty! No wonder random forests are one of the best machine learning tools. Orange renders them following the idea of Fabian Beck and colleagues, who proposed Pythagoras trees for visualizing hierarchies. The Orange implementation for classification and regression trees was created by Pavlin Policar.