10 Tips and Tricks for Using Orange

TIP #1: Follow tutorials and example workflows to get started.

It’s difficult to start using new software. Where does one start, especially a total novice in data mining? For this exact reason we’ve prepared Getting Started With Orange – YouTube tutorials for complete beginners. Example workflows on the other hand can be accessed via Help – Examples.


TIP #2: Make use of Orange documentation.

You can access it in three ways:

  1. Press F1 when the widget is selected. This will open help screen.
  2. Select Widget – Help when the widget is selected. It works the same as above.
  3. Visit online documentation.


TIP #3: Embed your help screen.

Drag and drop help screen to the side of your Orange canvas. It will become embedded in the canvas. You can also make it narrower, allowing for a full-size analysis while exploring the docs.



TIP #4: Use right-click.

Right-click on the canvas and a widget menu will appear. Start typing the widget you’re looking for and press Enter when the widget becomes the top widget. This will place the widget onto the canvas immediately. You can also navigate the menu with up and down.


TIP #5: Turn off channel names.

Sometimes it is annoying to see channel names above widget links. If you’re already comfortable using Orange, you can turn them off in Options – Settings. Turn off ‘Show channel names between widgets’.


TIP #6: Hide control pane.

Once you’ve set the parameters, you’d probably want to focus just on visualizations. There’s a simple way to do this in Orange. Click on the split between the control pane and visualization pane – you should see a hand appearing instead of a cursor. Click and observe how the control pane gets hidden away neatly. To make it reappear, click the split again.




TIP #7: Label your data.

So you’ve plotted your data, but have no idea what you’re seeing. Use annotation! In some widgets you will see a drop-down menu called Annotation, while in others it will be called a Label. This will mark your data points with suitable labels, making your MDS plots and Scatter Plots much more informative. Scatter Plot also enables you to label only selected points for better clarity.


TIP #8: Find your plot.

Scrolled around and lost the plot? Zoomed in too much? To re-position the plot click ‘Reset zoom’ and the visualization will jump snugly into the visualization pane. Comes in handy when browsing the subsets and trying to see the bigger picture every now and then.




TIP #9: Reset widget settings.

Orange is geared to remember your last settings, thus assisting you in a rapid analysis. However, sometimes you need to start anew. Go to Options – Reset widget settings… and restart Orange. This will return Orange to its original state.


TIP #10: Use Educational add-on.

To learn about how some algorithms work, use Orange3-Educational add-on. It contains 4 widgets that will help you get behind the scenes of some famous algorithms. And since they’re interactive, they’re also a lot of fun!






The Story of Shadow and Orange

This is a long story. I remember when started my PhD in Italy. There I met a researcher and he said to me: »You should do some simulations on x-ray optics beamline.« »Yes, but how should I do that?« He gave me a big tape, it was 1986. I soon realized it was all code. But it was a code called Shadow.

I started to look at the code, to play with it, do some simulations… Soon my boss told me:

»You should do a simulation with asymmetric crystals for monochromators.«

»But asymmetric crystals are not foreseen in this code.«

»Yes, think about how to do it. You should contact Franco Cerrina, he’s the author of Shadow.«

I indeed contacted prof. Cerrina and at that time this was not easy, because there was no direct e-mail. What we had was called a digital deck net, Digital Computers Network. I had to go to another laboratory just to send him an e-mail. Soon, he replied: »Come to see me.« I managed to get some funding to go to the US and for the next two years I spent a good amount of time in Madison, Wisconsin.

I started to work with prof. Cerrina and it was thanks to my work on Shadow that I was called by the European Synchrotron Facility and they offered me a position. But soon I stopped working on Shadow, because I was getting busy with other things.

It was only in 2009 that I contacted prof. Cerrina again. We needed to upgrade our software, so I went back to the US two or three times and started working on what is now Shadow3.


In 2010 I organized a trip to go visit again with my family for the summer. We booked the house, we booked the trip… And it was ten days before the departure that I learned that Cerrina died. And since everything was already organized, we decided to visit the US anyway.

There, I went to Cerrina’s laboratory and met his PhD student, who was keeping his possessions. I said to her:

»Tell me everything you were doing recently and I will try to recover what I can.«

And at that moment, she said many things were on this big old Mac. So I proposed to buy this Mac from her, but my home institution wasn’t happy, they saw no reason to buy a second-hand Mac. Even though it contained some important things Cerrina was working on!

Luckily, I managed to get it and I was able to recover many things from it. Moreover, I kept maintaining the Shadow code, because it is a standard software in the community. At the very beginning, the source was not public. Then it was eventually published, but the code was very complicated and nobody managed to recompile that. Thus I decided to clean the code and finally we completed the new version of Shadow in 2011.


Three years ago it was time to update Shadow again, especially the interface. One day I discovered Orange and I thought ‘it looked nice’. In that exact time I met Luca [Rebuffi] in Trieste. He got so excited about Orange that his PhD project became redesigning Shadow’s interface with Orange! And now we have OASYS, which is an adaptation of Orange for optical physics. So I hope that in the future, we will have many more users and also many more developers helping us bring simple tools to the scientific community.


— Manuel Sanchez del Rio

Intro to Data Mining for Life Scientists

RNA Club Munich has organized Molecular Life of Stem Cells Conference in Ljubljana this past Thursday, Friday and Saturday. They asked us to organize a four-hour workshop on data mining. And here we were: four of us, Ajda, Anze, Marko and myself (Blaz) run a workshop for 25 students with molecular biology and biochemistry background.


We have covered some basic data visualization, modeling (classification) and model scoring, hierarchical clustering and data projection, and finished with a touch of deep-learning by diving into image analysis by deep learning-based embedding.

Related: Data Mining Course at Baylor College of Medicine in Houston

It’s not easy to pack so many new things on data analytics within four hours, but working with Orange helps. This was a hands-on workshop. Students brought their own laptops with Orange and several of its add-ons for bioinformatics and image analytics. We also showed how to prepare one’s own data using Google Forms and designed a questionary, augment it in a class, run it with students and then analyze the questionary with Orange.




The hard part of any short course that includes machine learning is how to explain overfitting. The concept is not trivial for data science newcomers, but it is so important it simply cannot be left out. Luckily, Orange has some cool widgets to help us understanding the overfitting. Below is a workflow we have used. We read some data (this time it was a yeast gene expression data set called brown-selected that comes with Orange), “destroyed the data” by randomly permuting the column with class values, trained a classification tree, and observed near perfect results when the model was checked on the training data.


Sure this works, you are probably saying. The models should have been scored on a separate test set! Exactly, and this is what we have done next with Data Sampler, which lead us to cross-validation and Test & Score widget.

This was a great and interesting short course and we were happy to contribute to the success of the student-run MLSC-2016 conference.

Text Mining: version 0.2.0

Orange3-Text has just recently been polished, updated and enhanced! Our GSoC student Alexey has helped us greatly to achieve another milestone in Orange development and release the latest 0.2.0 version of our text mining add-on. The new release, which is already available on PyPi, includes Wikipedia and SimHash widgets and a rehaul of Bag of Words, Topic Modeling and Corpus Viewer.


Wikipedia widget allows retrieving sources from Wikipedia API and can handle multiple queries. It serves as an easy data gathering source and it’s great for exploring text mining techniques. Here we’ve simply queried Wikipedia for articles on Slovenia and Germany and displayed them in Corpus Viewer.

Query Wikipedia by entering your query word list in the widget. Put each query on a separate line and run Search.


Similarity Hashing widget computes similarity hashes for the given corpus, allowing the user to find duplicates, plagiarism or textual borrowing in the corpus. Here’s an example from Wikipedia, which has a pre-defined structure of articles, making our corpus quite similar. We’ve used Wikipedia widget and retrieved 10 articles for the query ‘Slovenia’. Then we’ve used Similarity Hashing to compute hashes for our text. What we got on the output is a table of 64 binary features (predefined in the SimHash widget), which denote a 64-bit hash size. Then we computed similarities in text by sending Similarity Hashing to Distances. Here we’ve selected cosine row distances and sent the output to Hierarchical Clustering. We can see that we have some similar documents, so we can select and inspect them in Corpus Viewer.

Output of Similarity Hashing widget.
We’ve selected the two most similar documents in Hierarchical Clustering and displayed them in Corpus Viewer.


Topic Modeling now includes three modeling algorithms, namely Latent Semantic Indexing (LSP), Latent Dirichlet Allocation (LDA), and Hierarchical Dirichlet Process (HDP). Let’s query Twitter for the latest tweets from Hillary Clinton and Donald Trump. First we preprocess the data and send the output to Topic Modeling. The widget suggests 10 topics, with the most significant words denoting each topic, and outputs topic probabilities for each document.

We can inspect distances between the topics with Distances (cosine) and Hierarchical Clustering. Seems like topics are not extremely author specific, since Hierarchical Clustering often puts Trump and Clinton in the same cluster. We’ve used Average linkage, but you can play around with different linkages and see if you can get better results.

Example of comparing text by topics.


Now we connect Corpus Viewer to Preprocess Text. This is nothing new, but Corpus Viewer now displays also tokens and POS tags. Enable POS Tagger in Preprocess Text. Now open Corpus Viewer and tick the checkbox Show Tokens & Tags. This will display tagged token at the bottom of each document.

Corpus Viewer can now display tokens and POS tags below each document.


This is just a brief overview of what one can do with the new Orange text mining functionalities. Of course, these are just exemplary workflows. If you did textual analysis with great results using any of these widgets, feel free to share it with us! 🙂

Data Mining Course in Houston #2

This was already the second installment of Introduction to Data Mining Course at Baylor College of Medicine in Houston, Texas. Just like the last year, the course was packed. About 50 graduate students, post-docs and a few faculty attended, making the course one of the largest elective PhD courses from over a hundred offered at this prestigious medical school.


The course was designed for students with little or no experience in data science. It consisted of seven two-hour lectures, each followed by a homework assignment. We (Blaz and Janez) lectured on data visualization, classification, regression, clustering, data projection and image analytics. We paid special attention to the problems of overfitting, use of regularization, and proper ways of testing and scoring of modeling methods.

The course was hands-on. The lectures were practical. They typically started with some data set and explained data mining techniques through designing data analysis workflows in Orange. Besides some standard machine learning and bioinformatics data sets, we have also painted the data to explore, say, the benefits of different classification techniques or design data sets where k-means clustering would fail.

This year, the course benefited from several new Orange widgets. The recently published interactive k-means widget was used to explain the inner working of this clustering algorithm, and polynomial classification widget was helpful in discussion of decision boundaries of classification algorithms. Silhouette plot was used to show how to evaluate and explore the results of clustering. And finally, we explained concepts from deep learning using image embedding to show how already trained networks can be used for clustering and classification of images.

Image Analytics

Visualizing Gradient Descent

This is a guest blog from the Google Summer of Code project.


Gradient Descent was implemented as a part of my Google Summer of Code project and it is available in the Orange3-Educational add-on. It simulates gradient descent for either Logistic or Linear regression, depending on the type of the input data. Gradient descent is iterative approach to optimize model parameters that minimize the cost function. In machine learning, the cost function corresponds to prediction error when the model is used on the training data set.

Gradient Descent widget takes data on input and outputs the model and its coefficients.


The widget displays the value of the cost function given two parameters of the model. For linear regression, we consider feature from the training set with the parameters being the intercept and the slope. For logistic regression, the widget considers two feature and their associated multiplicative parameters, setting the intercept to zero. Screenshot bellow shows gradient descent on a Iris data set, where we consider petal length and sepal width on the input and predict the probability that iris comes from the family of Iris versicolor.


  1. The type of the model used (either Logistic regression or Linear regression)
  2. Input features (one for X and one for Y axis) and the target class
  3. Learning rate is the step size of the gradient descent
  4. In a single iteration step, stochastic approach considers only a single data instance (instead of entire training set). Convergence in terms of iterations steps is slower, and we can instruct the widget to display the progress of optimization only after given number of steps (Step size)
  5. Step through the algorithm (steps can be reverted with step back button)
  6. Run optimization until convergence


Following shows gradient descent for linear regression using The Boston Housing Data Set when trying to predict the median value of a house given its age.


On the left we use the regular and on the right the stochastic gradient descent. While the regular descent goes straight to the target, the path of stochastic is not as smooth.

We can use the widget to simulate some dangerous, unwanted behavior of gradient descent. The following screenshots show two extreme cases with too high learning rate where optimization function never converges, and a low learning rate where convergence is painfully slow.


The two problems as illustrated above are the reason that many implementations of numerical optimization use adaptive learning rates. We can simulate this in the widget by modifying the learning rate for each step of the optimization.

Making recommendations

This is a guest blog from the Google Summer of Code project.


Recommender systems are everywhere, we can find them on YouTube, Amazon, Netflix, iTunes,… This is because they are crucial component in a competitive retail services.

How can I know what you may like if I have almost no information about you? The answer: taking Collaborative filtering (CF) approaches. Basically, this means to combine all the little knowledge we have about users and/or items in order to build a grid of knowledge with which we make recommendation.

To help you with that, Biolab has written Orange3-Recommendation – an add-on for Orange3 to train recommendation models, cross-validate them and make predictions.

Input data

First things first. Orange3-Recommendation can read files in native tab-delimited format, or can load data from any of the major standard spreadsheet file type, like CSV and Excel. Native format starts with a header row with feature (column) names. Second header row gives the attribute type, which can be continuous, discrete, string or time. The third header line contains meta information to identify dependent features (class), irrelevant features (ignore) or meta features (meta).

Here are the first few lines from a data set:

    tid      user        movie       score
    string   discrete    discrete    continuous
    meta     row=1       col=1       class
    1        Breza       HarrySally  2
    2        Dana        Cvetje      5
    3        Cene        Prometheus  5
    4        Ksenija     HarrySally  4
    5        Albert      Matrix      4

The third row is mandatory in this kind of datasets*, in order to know which attributes correspond to the users (row=1) and which ones to the items (col=1). For the case of big datasets, users and items must be specified as continuous attributes due to efficiency issues. (*Note: If the meta attributes row or col, some simple heuristics will be applied: users=column 0, items=column 1, class=last column)

Here are the first few lines from a data set :

    user            movie         score         tid
    continuous      continuous    continuous    time
    row=1           col=1         class         meta
    196             242           3             881250949
    186             302           3             891717742
    22              377           1             878887116
    244             51            2             880606923
    166             346           1             886397596
    298             474           4             884182806

Training a model

This step is pretty simple. To train a model we have to load the data as is described above and connect it to the learner. (Don’t forget to click apply)

data to brismf

If the model uses side information, we only need to add an extra file.


In addition, we can set the parameters of our model by double-clicking it:

Screen Shot 2016-08-22 at 15.49.56

By using a fixed seed, we make random numbers predictable. Therefore, this feature is useful if we want to compare results in a deterministic way.


This is as simple as it seems. The only thing to point out is that side information must be connected to the model.



Still, cross-validation is a robust way to see how our model is performing. I consider that it’s a good idea to check how our model performs with respect to the baseline. This presents a negligible overload* in our pipeline and makes our analysis more solid. (*For 1,000,000 ratings, it can take 0.027s).

We can add a baseline leaner to Test&Score and select the model we want to apply.


Making recommendations

The prediction flow is exactly the same as in Orange3.


Analyzing low-rank matrices



Once we’ve output the low-rank matrices, we can play around the vectors in those matrices to discover hidden relations or understand the known ones. For instance, here we plot vector 1 and 2 from the item-feature matrix by simply connecting Data Table with selected instances to the widget Scatter Plot.

Visualizing vectors

Using similar approaches we can discover pretty interesting things like similarity between movies or users, how movie genres relate with each other, changes in users’ behavior, when the popularity of a movie has been raised due to a commercial campaign,… and many others.

Finally, a simple pipeline to do all of the above can be something like this:


On the left side we connected several models to Test&Score in order to cross-validate them. Later, we trained a SVD++ model, made some predictions, got the low-rank matrices learnt by the model and plotted some vectors of the Item-feature matrix.

Analysis (Advanced users)

Here we’ve made a workflow (which can be downloaded here) to perform a really basic analysis on the results obtained through factoring the user and item feature matrices with BRISMF over the movielens100k dataset. (Note: Once downloaded, set the prepared datasets in the folder ‘orange’. Probably you’re gonna get a couple errors. Don’t worry, it’s normal. To solve it, apply the scripts sequentially but don’t forget previously to select all the rows in the related Table.)

Instead of explaining how this pipeline works, the best thing you can do is to download it and play with it.

Complex flow


One of the analysis you can do, is to plot the most popular movies across two first vectors of the matrix descomposition. Later, you can try to find clusters, tweak it a bit and find crossed relations (e.g. male/female Vs. action/drama).

Cluster movies

Now let’s focus on the scripting part.

Rating models

In this tutorial we are going to train a BRISMF model.

1. First we import Orange and the learner that we want to use:

import Orange
from orangecontrib.recommendation import BRISMFLearner


2. After that, we have to load a dataset:

data = Orange.data.Table('movielens100k.tab')


3. Then we set the learner parameters, and finally we train it passing the dataset as an argument (the returned value will be our model trained):

learner = BRISMFLearner(num_factors=15, num_iter=25, learning_rate=0.07, lmbda=0.1)
recommender = learner(data)


4. Finally, we can make predictions (in this case, for the first three pairs in the dataset):

prediction = recommender(data[:3])
>>> [ 3.79505151 3.75096513 1.293013 ]

Ranking models

At this point we can try something new, let’s make recommendations for a dataset in which only binary relevance is available. For this case, CLiMF is model that will suit our needs.

import Orange
import numpy as np
from orangecontrib.recommendation import CLiMFLearner

# Load data
data = Orange.data.Table('epinions_train.tab')

# Train recommender
learner = CLiMFLearner(num_factors=10, num_iter=10, learning_rate=0.0001, lmbda=0.001)
recommender = learner(data)

# Make recommendations
>>> [ 494,   803,   180, ..., 25520, 25507, 30815]


Later, we can score the model. In this case we’re using the MeanReciprocalRank:

import Orange

# Load test
dataset testdata = Orange.data.Table('epinions_test.tab') 

# Sample users 
num_users = len(recommender.U)
num_samples = min(num_users, 1000) # max. number to sample
users_sampled = np.random.choice(np.arange(num_users), num_samples) 

# Compute Mean Reciprocal Rank (MRR) 
mrr, _ = recommender.compute_mrr(data=testdata, users=users_sampled) 
print('MRR: %.4f' % mrr) 
>>> MRR: 0.3975

SGD optimizers

This add-on includes several configurations that can be used to modify the updates on the low rank matrices during the stochastic gradient descent optimization.

  • SGD: Classical SGD update.
  • Momentum: SGD with inertia.
  • Nesterov momentum: A Momentum that “looks ahead”.
  • AdaGrad: Optimizer that adapts its learning rating during the process.
  • RMSProp: “Leaky” AdaGrad.
  • AdaDelta: Extension of Adagrad that seeks to reduce its aggressive.
  • Adam: Similar to AdaGrad and RMSProp but with an exponentially decaying average of past gradients.
  • Adamax: Similar to Adam, but taking the maximum between the gradient and the velocity.


Do you want to learn more about this? Check our documentation!

Visualization of Classification Probabilities

This is a guest blog from the Google Summer of Code project.


Polynomial Classification widget is implemented as a part of my Google Summer of Code project along with other widgets in educational add-on (see my previous blog). It visualizes probabilities for two-class classification (target vs. rest) using color gradient and contour lines, and it can do so for any Orange learner.

Here is an example workflow. The data comes from the File widget. With no learner on input, the default is Logistic Regression. Widget outputs learners Coefficients, Classifier (model) and Learner.


Polynomial Classification widget works on two continuous features only, all other features are ignored. The screenshot shows plot of classification for an Iris data set .


  1. Set name of the learner. This is the name of learner on output.
  2. Set features that logistic regression is performed on.
  3. Set class that is classified separately from other classes.
  4. Set the degree of a polynom that is used to transform an input data (1 means attributes are not transformed).
  5. Select whether see or not contour lines in chart. The density of contours is regulated by Contour step.


The classification for our case fails in separating Iris-versicolor from the other two classes. This is because logistic regression is a linear classifier, and because there is no linear combination of the chosen two attributes that would make for a good decision boundary. We can change that. Polynomial expansion adds features that are polynomial combinations of original ones. For example, if an input data contains features [a, b], polynomial expansion of degree two generates feature space [1, a, b, a2, a b, b2]. With this expansion, the classification boundary looks great.



Polynomial Classification also works well with other learners. Below we have given it a Classification Tree. This time we have painted the input data using Paint Data, a great data generator used while learning about Orange and data science. The decision boundaries for the tree are all square, a well-known limitation for tree-based learners.



Polynomial expansion if high degrees may be dangerous. Following example shows overfitting when degree is five. See the two outliers, a blue one on the top and the red one at the lower right of the plot? The classifier was unnecessary able to separate the outliers from the pack, something that will become problematic when classifier will be used on the new data.


Overfitting is one of the central problems in machine learning. You are welcome to read our previous blog on this problem and possible solutions.

Interactive k-Means

This is a guest blog from the Google Summer of Code project.


As a part of my Google Summer of Code project I started developing educational widgets and assemble them in an Educational Add-On for Orange. Educational widgets can be used by students to understand how some key data mining algorithms work and by teachers to demonstrate the working of these algorithms.

Here I describe an educational widget for interactive k-means clustering, an algorithm that splits the data into clusters by finding cluster centroids such that the distance between data points and their corresponding centroid is minimized. Number of clusters in k-means algorithm is denoted with k and has to be specified manually.

The algorithm starts by randomly positioning the centroids in the data space, and then improving their position by repetition of the following two steps:

  1. Assign each point to the closest centroid.
  2. Move centroids to the mean position of points assigned to the centroid.

The widget needs the data that can come from File widget, and outputs the information on clusters (Annotated Data) and centroids:


Educational widget for k-means works finds clusters based on two continuous features only, all other features are ignored. The screenshot shows plot of an Iris data set and clustering with k=3. That is partially cheating, because we know that iris data set has three classes, so that we can check if clusters correspond well to original classes:


  1. Select two features that are used in k-means
  2. Set number of centroids
  3. Randomize positions of centroids
  4. Show lines between centroids and corresponding points
  5. Perform the algorithm step by step. Reassign membership connects points to nearest centroid, Recompute centroids moves centroids.
  6. Step back in the algorithm
  7. Set speed of automatic stepping
  8. Perform the whole algorithm as fast preview
  9.  Anytime we can change number of centroids with spinner or with click in desired position in the graph.

If we want to see the correspondence of clusters that are denoted by k-means and classes, we can open Data Table widget where we see that all Iris-setosas are clustered in one cluster and but there are just few Iris-versicolor that are classified is same cluster together with Iris-virginica and vice versa.


Interactive k-means works great in combination with Paint Data. There, we can design data sets where k-mains fails, and observe why.


We could also design data sets where k-means fails under specific initialization of centroids. Ah, I did not tell you that you can freely move the centroids and then restart the algorithm. Below we show the case of centroid initialization and how this leads to non-optimal clustering.


Rule Induction (Part I – Scripting)

This is a guest blog from the Google Summer of Code project.


We’ve all heard the saying, “Rules are meant to be broken.” Regardless of how you might feel about the idea, one thing is certain. Rules must first be learnt. My 2016 Google Summer of Code project revolves around doing just that. I am developing classification rule induction techniques for Orange, and here describing the code currently available in the pull request and that will become part of official distribution in an upcoming release 3.3.8.

Rule induction from examples is recognised as a fundamental component of many machine learning systems. My goal was foremost to implement supervised rule induction algorithms and rule-based classification methods, but also to devise a more general framework of replaceable individual components that users could fine-tune to their needs. To this purpose, separate-and-conquer strategy was applied. In essence, learning instances are covered and removed following a chosen rule. The process is repeated while learning set examples remain. To evaluate found hypotheses and to choose the best rule in each iteration, search heuristics are used (primarily, rule class distribution is the decisive determinant).

The use of the created module is straightforward. New rule induction algorithms can be easily introduced, by either utilising predefined components or developing new ones (these include various search algorithms, search strategies, evaluators, and others). Several well-known rule induction algorithms have already been included. Let’s see how they perform!

Classic CN2 inducer constructs a list of ordered rules (decision list). Here, we load the titanic data set and create a simple classifier, which can already be used to predict data.

import Orange
data = Orange.data.Table('titanic')
learner = Orange.classification.CN2Learner()
classifier = learner(data)

Similarly, a set of unordered rules can be constructed using Unordered CN2 inducer. Rules are learnt for each class individually, in regard to the original learning data. To evaluate found hypotheses, Laplace accuracy measure is used. Having first initialised the learner, we then control the algorithm by modifying its parameters. The underlying components are available to us by accessing the rule finder.

data = Table('iris.tab')
learner = CN2UnorderedLearner()

# consider up to 10 solution streams at one time
learner.rule_finder.search_algorithm.beam_width = 10

# continuous value space is constrained to reduce computation time
learner.rule_finder.search_strategy.bound_continuous = True

# found rules must cover at least 15 examples
learner.rule_finder.general_validator.min_covered_examples = 15

# found rules must combine at most 2 selectors (conditions)
learner.rule_finder.general_validator.max_rule_length = 2

classifier = learner(data)

Induced rules can be quickly reviewed and interpreted. They are each of the form ‘if cond then predict class”. That is, a conjunction of selectors followed by the predicted class.

for rule in classifier.rule_list:
... print(rule, rule.curr_class_dist.tolist())

>>> IF petal length<=3.0 AND sepal width>=2.9 THEN iris=Iris-setosa [49, 0, 0]
>>> IF petal length>=3.0 AND petal length<=4.8 THEN iris=Iris-versicolor [0, 46, 3]
>>> IF petal width>=1.8 AND petal length>=4.9 THEN iris=Iris-virginica [0, 0, 43]
>>> IF TRUE THEN iris=Iris-virginica [50, 50, 50]  # the default rule

If no other rules fire, default rule (majority classification) is used. Specific to each individual rule inducer, the application of the default rule varies.

Though rule learning is most frequently used in the context of predictive induction, it can be adapted to subgroup discovery. In contrast, subgroup discovery aims at learning individual patterns or interesting population subgroups, rather than to maximise classification accuracy. Induced rules prove very valuable in terms of their descriptive power. To this end, CN2-SD algorithms were also implemented.

Hopefully, the addition to the Orange software suite will benefit both novice and expert users looking advance their knowledge in a particular area of study, through a better understanding of given predictions and underlying argumentation.