Making Predictions

One of the cool things about being a data scientist is being able to predict, that is, to say what the outcome will be before we actually know it. I am not talking about verifying your favorite classification algorithm here, and I am not talking about cross-validation, classification accuracy, AUC or anything like that. I am talking about good old prediction. This is where our very own Predictions widget comes in.

Predictions workflow.

 

We will be exploring the Iris data set again, but we’re going to add a little twist to it. Since we’ve worked with it so much already, I’m sure you know all about this data. But now we’ve got three new flowers in the office, and of course there’s no label attached to tell us what species of Iris they are. [sigh…] Obviously, we will be measuring their petals and sepals and comparing the results with our data.

Our new data on three flowers. We used Google Sheets to enter the data, then copied the shareable link and pasted it into the File widget.

 

But surely you don’t want to go through all 150 flowers to properly match the three new Irises? So instead, let’s first train a model on the existing data set. We connect the File widget to the chosen classifier (we went with Classification Tree this time) and feed the trained model into Predictions. Now we write down the measurements for our new flowers in Google Sheets (just like above), load them into Orange with a new File widget and input the fresh data into Predictions. We can observe the predicted class directly in the widget itself.
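For readers who like to see the idea outside the canvas, here is a minimal plain-Python sketch of the same train-then-predict flow. It is not Orange's implementation of Classification Tree or Predictions, just a tiny Gini-based tree written for illustration. The training rows are the first five rows of each species from the Iris data set, and the three "new flowers" are hypothetical measurements made up for this example (the real ones live in the spreadsheet). All function names are our own.

```python
from collections import Counter

# First five rows of each species from the Iris data set:
# (sepal length, sepal width, petal length, petal width), species
TRAIN = [
    ((5.1, 3.5, 1.4, 0.2), "Iris-setosa"),
    ((4.9, 3.0, 1.4, 0.2), "Iris-setosa"),
    ((4.7, 3.2, 1.3, 0.2), "Iris-setosa"),
    ((4.6, 3.1, 1.5, 0.2), "Iris-setosa"),
    ((5.0, 3.6, 1.4, 0.2), "Iris-setosa"),
    ((7.0, 3.2, 4.7, 1.4), "Iris-versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "Iris-versicolor"),
    ((6.9, 3.1, 4.9, 1.5), "Iris-versicolor"),
    ((5.5, 2.3, 4.0, 1.3), "Iris-versicolor"),
    ((6.5, 2.8, 4.6, 1.5), "Iris-versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "Iris-virginica"),
    ((5.8, 2.7, 5.1, 1.9), "Iris-virginica"),
    ((7.1, 3.0, 5.9, 2.1), "Iris-virginica"),
    ((6.3, 2.9, 5.6, 1.8), "Iris-virginica"),
    ((6.5, 3.0, 5.8, 2.2), "Iris-virginica"),
]

# Hypothetical measurements for three unlabeled flowers (an assumption).
NEW_FLOWERS = [(5.0, 3.4, 1.5, 0.2), (6.1, 2.9, 4.6, 1.4), (6.7, 3.1, 5.7, 2.2)]


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini([y for _, y in rows])
    for f in range(len(rows[0][0])):
        values = sorted({x[f] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between two neighboring values
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows):
    """Grow the tree until no split improves purity; leaves hold class probabilities."""
    split = best_split(rows)
    if split is None:
        counts = Counter(y for _, y in rows)
        return {label: c / len(rows) for label, c in counts.items()}
    f, t = split
    return (f, t,
            build_tree([r for r in rows if r[0][f] <= t]),
            build_tree([r for r in rows if r[0][f] > t]))


def predict_proba(tree, x):
    """Walk from the root to a leaf and return that leaf's class probabilities."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree


def predict(tree, x):
    """Return the most probable class for one flower."""
    probs = predict_proba(tree, x)
    return max(probs, key=probs.get)


tree = build_tree(TRAIN)
for flower in NEW_FLOWERS:
    print(flower, predict(tree, flower), predict_proba(tree, flower))
```

With only 15 training rows the tree is crude and its splits depend heavily on which rows happened to be included; training on all 150 flowers would give more sensible thresholds.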

Predictions made by the classification tree.

 

In the left part of the visualization we have the input data set (our measurements) and in the right part the predictions made with the classification tree. By default you see the probabilities for all three class values and the predicted class. You can of course use other classifiers as well – it would probably make sense to first evaluate classifiers on the existing data set, find the best one for your data and then use it on the new data.
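The "evaluate first, then predict" advice can be sketched in plain Python too. The snippet below is an illustration under our own assumptions, not what any Orange widget does internally: it compares a majority-class baseline with a 1-nearest-neighbor classifier using a simple stratified 5-fold cross-validation on the first five rows of each Iris species, then refits the winner on all labeled rows and uses it on one hypothetical new flower. The learner names and the `cross_validate` helper are made up for this example.

```python
from collections import Counter

# First five rows of each species from the Iris data set:
# (sepal length, sepal width, petal length, petal width)
IRIS = {
    "Iris-setosa": [(5.1, 3.5, 1.4, 0.2), (4.9, 3.0, 1.4, 0.2), (4.7, 3.2, 1.3, 0.2),
                    (4.6, 3.1, 1.5, 0.2), (5.0, 3.6, 1.4, 0.2)],
    "Iris-versicolor": [(7.0, 3.2, 4.7, 1.4), (6.4, 3.2, 4.5, 1.5), (6.9, 3.1, 4.9, 1.5),
                        (5.5, 2.3, 4.0, 1.3), (6.5, 2.8, 4.6, 1.5)],
    "Iris-virginica": [(6.3, 3.3, 6.0, 2.5), (5.8, 2.7, 5.1, 1.9), (7.1, 3.0, 5.9, 2.1),
                       (6.3, 2.9, 5.6, 1.8), (6.5, 3.0, 5.8, 2.2)],
}


def majority(train):
    """Baseline: always predict the most frequent class in the training rows."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def nearest_neighbor(train):
    """1-NN: predict the class of the closest training row (squared Euclidean distance)."""
    def predict(x):
        closest = min(train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)))
        return closest[1]
    return predict


def cross_validate(learner, folds=5):
    """Stratified k-fold: fold i holds out the i-th row of every species."""
    correct = total = 0
    for i in range(folds):
        train = [(x, y) for y, rows in IRIS.items()
                 for j, x in enumerate(rows) if j % folds != i]
        test = [(x, y) for y, rows in IRIS.items()
                for j, x in enumerate(rows) if j % folds == i]
        model = learner(train)
        correct += sum(model(x) == y for x, y in test)
        total += len(test)
    return correct / total


LEARNERS = {"majority baseline": majority, "1-nearest neighbor": nearest_neighbor}
scores = {name: cross_validate(learner) for name, learner in LEARNERS.items()}
best_name = max(scores, key=scores.get)

# Refit the winner on all labeled rows before predicting unlabeled flowers.
all_rows = [(x, y) for y, rows in IRIS.items() for x in rows]
best_model = LEARNERS[best_name](all_rows)

new_flower = (6.1, 2.9, 4.6, 1.4)  # hypothetical measurement (an assumption)
print(scores)
print(best_name, "predicts", best_model(new_flower))
```

The key design point is that the accuracy scores come only from rows the model did not see during training, while the final model is trained on everything we have labels for.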

 

14 thoughts on “Making Predictions”

    1. Hi, I would like some help with Orange. I want to train a model on my mobile reviews data set using SVM, but it does not work correctly.

  1. Thanks for your post. I am currently using Test and Score at the same time as the Predictions widget. Will Predictions work without Test and Score, or do I need to run both at the same time? Thanks.

    1. Test&Score and Predictions are two different widgets. Test&Score is meant for evaluating the performance of a model with cross-validation or another method of your choice. Predictions takes a trained model and predicts the class of new data instances, so it works without Test&Score. Please see the widget documentation for details on their use.

  2. I am facing an issue: when I attach the test file to the Predictions widget, it says “One or more predictors failed…” All the columns in my test file are the same as in my training data file. I have also set the target variable in the File widget. Can you please help?

      1. There was indeed a bug in the Predictions widget in Orange 3.4. It is already fixed in version 3.4.2. Please see if everything works for you as well and report if there are still any issues.

        1. I have Orange version 3.4.3, and I am having the same issue. The columns in the test file are the same as in the training data file, and the targets are the same in both as well.
