10 Tips and Tricks for Using Orange

TIP #1: Follow tutorials and example workflows to get started.

It’s difficult to start using new software. Where does one start, especially as a total novice in data mining? For this exact reason we’ve prepared Getting Started With Orange – YouTube tutorials for complete beginners. Example workflows, on the other hand, can be accessed via Help – Examples.

 

TIP #2: Make use of Orange documentation.

You can access it in three ways:

  1. Press F1 when the widget is selected. This will open the help screen.
  2. Select Widget – Help when the widget is selected. It works the same as above.
  3. Visit the online documentation.

 

TIP #3: Embed your help screen.

Drag and drop the help screen to the side of your Orange canvas. It will become embedded in the canvas. You can also make it narrower, allowing for a full-size analysis while exploring the docs.


 

TIP #4: Use right-click.

Right-click on the canvas and a widget menu will appear. Start typing the name of the widget you’re looking for and press Enter once it appears at the top of the list. This will place the widget onto the canvas immediately. You can also navigate the menu with the up and down arrow keys.

 

TIP #5: Turn off channel names.

Sometimes it is annoying to see channel names above widget links. If you’re already comfortable using Orange, you can turn them off in Options – Settings. Turn off ‘Show channel names between widgets’.

 

TIP #6: Hide control pane.

Once you’ve set the parameters, you’ll probably want to focus just on visualizations. There’s a simple way to do this in Orange. Hover over the split between the control pane and the visualization pane – the cursor should change to a hand. Click and observe how the control pane gets hidden away neatly. To make it reappear, click the split again.


 

TIP #7: Label your data.

So you’ve plotted your data, but have no idea what you’re seeing. Use annotation! In some widgets you will see a drop-down menu called Annotation, while in others it will be called Label. This will mark your data points with suitable labels, making your MDS plots and Scatter Plots much more informative. Scatter Plot also enables you to label only selected points for better clarity.

 

TIP #8: Find your plot.

Scrolled around and lost the plot? Zoomed in too much? To re-position the plot, click ‘Reset zoom’ and the visualization will jump snugly into the visualization pane. This comes in handy when browsing subsets and trying to see the bigger picture every now and then.


 

 

TIP #9: Reset widget settings.

Orange is geared to remember your last settings, thus assisting you in a rapid analysis. However, sometimes you need to start anew. Go to Options – Reset widget settings… and restart Orange. This will return Orange to its original state.

 

TIP #10: Use the Educational add-on.

To learn how some algorithms work, use the Orange3-Educational add-on. It contains four widgets that will help you get behind the scenes of some famous algorithms. And since they’re interactive, they’re also a lot of fun!


 

 

 

 

The Story of Shadow and Orange

This is a long story. I remember when I started my PhD in Italy. There I met a researcher and he said to me: »You should do some simulations on an x-ray optics beamline.« »Yes, but how should I do that?« He gave me a big tape; it was 1986. I soon realized it was all code. But it was a code called Shadow.

I started to look at the code, to play with it, do some simulations… Soon my boss told me:

»You should do a simulation with asymmetric crystals for monochromators.«

»But asymmetric crystals are not foreseen in this code.«

»Yes, think about how to do it. You should contact Franco Cerrina, he’s the author of Shadow.«

I did contact Prof. Cerrina, which was not easy at the time, because there was no direct e-mail. What we had was DECnet, the Digital Equipment Corporation network. I had to go to another laboratory just to send him an e-mail. Soon, he replied: »Come to see me.« I managed to get some funding to go to the US and for the next two years I spent a good amount of time in Madison, Wisconsin.

I started to work with Prof. Cerrina and it was thanks to my work on Shadow that I was called by the European Synchrotron Radiation Facility and offered a position. But soon I stopped working on Shadow, because I was getting busy with other things.

It was only in 2009 that I contacted Prof. Cerrina again. We needed to upgrade our software, so I went back to the US two or three times and started working on what is now Shadow3.

 

In 2010 I organized a trip to visit again, this time with my family for the summer. We booked the house, we booked the trip… And it was ten days before the departure that I learned that Cerrina had died. Since everything was already organized, we decided to visit the US anyway.

There, I went to Cerrina’s laboratory and met his PhD student, who was keeping his possessions. I said to her:

»Tell me everything you were doing recently and I will try to recover what I can.«

And at that moment, she said many things were on this big old Mac. So I proposed to buy this Mac from her, but my home institution wasn’t happy; they saw no reason to buy a second-hand Mac. Even though it contained some important things Cerrina was working on!

Luckily, I managed to get it and I was able to recover many things from it. Moreover, I kept maintaining the Shadow code, because it is standard software in the community. At the very beginning, the source was not public. It was eventually published, but the code was very complicated and nobody managed to recompile it. Thus I decided to clean up the code, and we finally completed the new version of Shadow in 2011.

 

Three years ago it was time to update Shadow again, especially the interface. One day I discovered Orange and I thought it looked nice. Around that time I met Luca [Rebuffi] in Trieste. He got so excited about Orange that his PhD project became redesigning Shadow’s interface with Orange! And now we have OASYS, which is an adaptation of Orange for optical physics. So I hope that in the future, we will have many more users and also many more developers helping us bring simple tools to the scientific community.

 

— Manuel Sanchez del Rio

Intro to Data Mining for Life Scientists

RNA Club Munich organized the Molecular Life of Stem Cells Conference in Ljubljana this past Thursday, Friday and Saturday. They asked us to organize a four-hour workshop on data mining. And here we were: four of us, Ajda, Anze, Marko and myself (Blaz), ran a workshop for 25 students with molecular biology and biochemistry backgrounds.


We covered some basic data visualization, modeling (classification) and model scoring, hierarchical clustering and data projection, and finished with a touch of deep learning by diving into image analysis with deep learning-based embedding.


It’s not easy to pack so many new things on data analytics into four hours, but working with Orange helps. This was a hands-on workshop. Students brought their own laptops with Orange and several of its add-ons for bioinformatics and image analytics. We also showed how to prepare one’s own data using Google Forms: we designed a questionnaire, augmented it in class, ran it with the students and then analyzed the questionnaire with Orange.


The hard part of any short course that includes machine learning is how to explain overfitting. The concept is not trivial for data science newcomers, but it is so important that it simply cannot be left out. Luckily, Orange has some cool widgets to help us understand overfitting. Below is the workflow we used. We read some data (this time it was a yeast gene expression data set called brown-selected that comes with Orange), “destroyed the data” by randomly permuting the column with class values, trained a classification tree, and observed near-perfect results when the model was checked on the training data.

yeast-overfitting-distributions

Sure this works, you are probably saying. The models should have been scored on a separate test set! Exactly, and this is what we did next with Data Sampler, which led us to cross-validation and the Test & Score widget.
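
The same effect can be reproduced outside Orange. What follows is a minimal sketch in plain Python, not the workshop's workflow. It makes several assumptions of its own: synthetic data stands in for brown-selected, a memorizing 1-nearest-neighbour classifier stands in for the classification tree, and a single held-out split stands in for Data Sampler and cross-validation. All function and variable names are illustrative.

```python
import random


def nearest_label(train, point):
    """Return the label of the training example closest to `point` (1-NN)."""
    closest = min(
        train,
        key=lambda example: sum((a - b) ** 2 for a, b in zip(example[0], point)),
    )
    return closest[1]


def accuracy(train, test):
    """Fraction of `test` examples whose label the 1-NN model predicts correctly."""
    hits = sum(nearest_label(train, x) == y for x, y in test)
    return hits / len(test)


rng = random.Random(0)

# Synthetic stand-in for a real data set: 400 examples with 5 numeric features.
features = [[rng.random() for _ in range(5)] for _ in range(400)]

# The class (0, 1 or 2) initially depends on the first feature ...
labels = [int(row[0] * 3) for row in features]

# ... and is then "destroyed" by permuting the class column,
# so that it no longer carries any information about the features.
rng.shuffle(labels)

data = list(zip(features, labels))
train, test = data[:200], data[200:]

train_accuracy = accuracy(train, train)  # scored on the data the model memorized
test_accuracy = accuracy(train, test)    # scored on examples the model never saw

print("accuracy on training data:", train_accuracy)
print("accuracy on held-out data:", test_accuracy)
```

On the training data every example is its own nearest neighbour, so the model looks perfect even though the labels are noise. On the held-out half the score should fall to roughly what guessing among three classes would give, which is the gap that a separate test set or cross-validation exposes.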

This was a great and interesting short course and we were happy to contribute to the success of the student-run MLSC-2016 conference.