Unfreezing Orange

Have you ever tried Orange with data big enough that some widgets ran for more than a second? Then you have seen it: Orange froze. While the widget was processing, the interface would not respond to any inputs, and there was no way to stop that widget.

Not all the widgets freeze, though! Some widgets, like Test & Score, k-Means, or Image Embedding, do not block. While they are working, we are free to build other parts of the workflow, and these widgets also show their progress. Some, like Image Embedding, which work with lots of images, even allow interruptions.

Why does Orange freeze? Most widgets process users’ actions directly: after an event (click, pressed key, new input data) some code starts running, and until it finishes, the interface cannot respond to any new events. This is a reasonable approach for short tasks, such as making a selection in a Scatter Plot. But with longer tasks, such as building a Support Vector Machine model on big data, Orange becomes unresponsive.

To make Orange responsive while it is processing, we need to start the task in a new thread. As programmers we have to consider the following:
1. Starting the task. We have to make sure that other (older) tasks are not running.
2. Showing results when the task has finished.
3. Periodic communication between the task and the interface for status reports (progress bars) and task stopping.

Starting the task and showing the results are straightforward and well documented in a tutorial for writing widgets. Periodic communication with stopping is harder: it is completely task-dependent and can be trivial, hard, or even impossible. Periodic communication is, in principle, not essential for responsiveness, but if we do not implement it, we will be unable to stop the running task, and progress bars will not work either.
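The three concerns above can be sketched with nothing but the Python standard library. This is only an illustration, not the code Orange uses: `TaskRunner`, `Cancelled`, and `slow_sum` are hypothetical names invented for this example. The task receives a `report` function that it calls periodically; that one call both reports progress and checks whether the task has been asked to stop.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Cancelled(Exception):
    """Raised inside the worker when the task has been asked to stop."""


class TaskRunner:
    """Hypothetical helper that runs one background task at a time."""

    def __init__(self):
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._cancel = None

    def start(self, task, on_progress):
        # 1. Starting: ask any older task to stop before queuing a new one.
        self.cancel()
        self._cancel = cancel = threading.Event()

        def report(fraction):
            # 3. Periodic communication: called by the task itself.
            if cancel.is_set():
                raise Cancelled()
            # Note: this runs in the worker thread. A real GUI toolkit
            # would need the call forwarded to the interface thread.
            on_progress(fraction)

        # 2. Results: the returned future holds the result (or the error);
        # a callback can be attached with future.add_done_callback().
        return self._executor.submit(task, report)

    def cancel(self):
        if self._cancel is not None:
            self._cancel.set()


def slow_sum(numbers, report):
    """Example task: sums numbers and reports progress after each one."""
    total = 0
    for i, x in enumerate(numbers, start=1):
        total += x
        report(i / len(numbers))
    return total
```

A widget would call `runner.start(lambda report: slow_sum(data, report), update_progress_bar)` and `runner.cancel()` when the user presses a stop button. The catch is visible in the sketch: stopping only works if the task actually calls `report` often enough.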

Taking care of periodic communication was the hardest part of making the Neural Network widget responsive. It would have been easy had we implemented neural networks ourselves. But we use the scikit-learn implementation, which does not expose an option to make additional function calls while fitting the network (we need to run code that communicates with the interface). We had to resort to a trick: we modified fitting so that every change to an attribute called n_iter_ calls a function (see pull request). Not the cleanest solution, but it seems to work.
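The general idea behind such a trick can be shown with a Python property. The sketch below is not the code from the pull request: `ToyEstimator` is a made-up stand-in for a third-party learner (scikit-learn is not used here), and the mixin and attribute names are assumptions chosen for illustration. The learner only updates an iteration counter; the mixin turns each assignment to that counter into a callback call.

```python
class ToyEstimator:
    """Made-up stand-in for a learner that offers no callback hook.
    All it does per step is update a public iteration counter."""

    def __init__(self, max_iter=5):
        self.max_iter = max_iter

    def fit(self):
        self.n_iter_ = 0
        for _ in range(self.max_iter):
            self.n_iter_ += 1  # the only visible side effect of a step
        return self


class IterationCallbackMixin:
    """Hypothetical mixin: every assignment to n_iter_ calls a function."""

    iteration_callback = None

    @property
    def n_iter_(self):
        return self._n_iter

    @n_iter_.setter
    def n_iter_(self, value):
        self._n_iter = value
        if self.iteration_callback is not None:
            # The callback can report progress, or raise to stop fitting.
            self.iteration_callback(value)


class ObservableEstimator(IterationCallbackMixin, ToyEstimator):
    """The mixin comes first so its property intercepts the attribute."""
```

Because the property lives on the class, the unchanged `fit` code goes through the setter without knowing it. Raising an exception from the callback is one way to interrupt the fitting from outside.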

For now, only a few widgets keep the interface responsive while they work. We are still searching for the best way to make existing widgets behave nicely, but responsiveness is now one of our priorities.

GSoC Review: Visualizations with Qt

During the course of this summer, I created a new plotting library for Orange, replacing the use of PyQwt. I can say that I have successfully completed my project, but the library (and especially the visualization widgets) could still use some more work. The new library supports a similar interface, so little change is needed to convert individual widgets, but it also has several advantages over the old implementation:

  • Animations: When using a single curve to show all data points, data changes only move the points instead of replacing them. These moves are now animated, as are color and size changes.
  • Multithreading: All position calculations are done in separate threads, so the interface remains responsive even when a long operation is running in the background.
  • Speed: I removed several occurrences of needlessly clearing and repopulating the graph.
  • Simplicity: Because it was written with Orange in mind, the new library has functions that match Orange’s data structures. This leads to simpler code in widgets using the library, and fewer operations in Python.
  • Appearance: The plot can use the system palette, or a custom color theme. In general, I think it looks much nicer than Qwt-based plots.
  • Documentation: There is extensive API documentation (soon to be available in the Orange 2.5 documentation), as well as two widget examples.

However, there are also disadvantages to using the new library. They are not many, and I’ve been trying to keep them as few and as small as possible, but there still are some.

  • Line rendering: For some reason, whenever lines are rendered on the plot, the whole widget becomes very slow. The effect is even more noticeable when zooming. As far as I can tell, this happens in Qt’s drawing libraries, so there is not much I can do about it.
  • Axis labels: With a large number of long axis labels, the formatting gets rather ugly. This is a minor inconvenience, but it does make the plots look unprofessional.

Fortunately, I have few school obligations this September, so I think I will be able to work on Orange some more, at least until school starts. I have already added gesture support and some minor improvements since the end of the program.

Finally, I’d like to take this opportunity to thank the Orange team, especially my mentor Miha, for accepting me and helping me throughout the summer. It’s been an interesting project, and I’ll be happy to continue working with the same software and the same team.