Orange is Getting Smarter

In the past few months, Orange has been getting smarter and sleeker.

Since version 3.15.0, Orange remembers which distinct widgets users like to connect, adjusting the sorting on the widget search menu accordingly. Additionally, there is a new look for the Edit Links window coming soon.

Orange recently implemented a basic form of opt-in usage tracking, specifically targeting how users add widgets to the canvas.

Word cloud of widget popularity in Orange.

 

The information is collected anonymously, and only from users who have opted in. We will use this data to improve the widget suggestion system. The data also gives us a first insight into how users interact with Orange. Let’s see what we’ve found out from the data recorded in the past few weeks.

 

There are four different ways of adding a widget to the canvas:

  • clicking it in the sidebar,
  • dragging it from the sidebar,
  • searching for it by right-clicking on the canvas,
  • extending the workflow by dragging the communication channel from a widget.

 

A workflow extend action.

 

Among Orange users, the most popular way of adding a new widget is by dragging the communication line from the output widget – we think this is the most efficient way of using Orange too. However, the patterns vary among different widgets.

How users add widgets to canvas, from 20,775 add widget events.

 

Users tend to add root nodes such as File via a click or drag from the sidebar, while adding leaf nodes such as Data Table via extension from another widget.
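
For readers who want to run this kind of breakdown on their own event logs, here is a minimal sketch of how addition types could be tallied per widget. The sample events and the `addition_shares` helper are made up for illustration; they are not the tracked data or the code we used for the plots.

```python
from collections import Counter, defaultdict

# Hypothetical sample of (widget name, addition type) events.
# These values are invented for illustration, not real tracking data.
events = [
    ("File", "Click"),
    ("File", "Drag"),
    ("File", "Click"),
    ("Data Table", "Extend"),
    ("Data Table", "Extend"),
    ("Data Table", "Search"),
]


def addition_shares(events):
    """Return, for each widget, the share of each addition type."""
    counts = defaultdict(Counter)
    for widget, kind in events:
        counts[widget][kind] += 1
    return {
        widget: {kind: n / sum(by_kind.values()) for kind, n in by_kind.items()}
        for widget, by_kind in counts.items()
    }


shares = addition_shares(events)
for widget, by_kind in shares.items():
    print(widget, by_kind)
```

With real data, the same per-widget shares are what the charts for File and Data Table display.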

How users add File to canvas.

How users add Data Table to canvas.

 

The widget popularity contest goes to: Data Table! Rightfully so, one should always check their data with Data Table.

Widget popularity visualization in Box Plot.

 

In 52% of tracked sessions, no widgets were added: the application was simply opened and closed. While some people might really like watching the loading screen, most of these empty sessions are probably explained by the fact that usage is not tracked until the user explicitly opts in.

 

Each bit of collected data comes at a cost to the privacy of the user. Care was put into minimizing the intrusiveness of data collection methods, while maximizing the usefulness of the collected data.

Initially, widget addition events were planned to include a ‘time since application start’ value, in order to be able to plot a user’s actions as a function of time. While this would be cool, it was ultimately decided that its usefulness is outweighed by the privacy cost to users.

 

For the keen, data is gathered per canvas session, in the following structure:

  • Date
  • Orange version
  • Operating system
  • Widget addition events, each entailing:
    • Widget name
    • Type of addition (Click, Drag, Search or Extend)
    • (Other widget name), if type is Extend
    • (Query), if type is Search or Extend
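
To make the structure above concrete, here is a rough sketch of how one session record could be modeled in Python. The class and field names (`CanvasSession`, `WidgetAddition` and so on) and the example values are assumptions made for illustration; they are not Orange's actual implementation or storage format.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional

# The four addition types listed above.
ADDITION_TYPES = ("Click", "Drag", "Search", "Extend")


@dataclass
class WidgetAddition:
    """One widget addition event (hypothetical field names)."""
    widget_name: str
    addition_type: str
    other_widget_name: Optional[str] = None  # only set when type is Extend
    query: Optional[str] = None  # only set when type is Search or Extend

    def __post_init__(self):
        if self.addition_type not in ADDITION_TYPES:
            raise ValueError(f"unknown addition type: {self.addition_type}")
        if self.other_widget_name is not None and self.addition_type != "Extend":
            raise ValueError("other_widget_name is only recorded for Extend")
        if self.query is not None and self.addition_type not in ("Search", "Extend"):
            raise ValueError("query is only recorded for Search or Extend")


@dataclass
class CanvasSession:
    """Data gathered for one canvas session (hypothetical field names)."""
    date: str
    orange_version: str
    operating_system: str
    widget_additions: List[WidgetAddition] = field(default_factory=list)


# Example values are invented for illustration.
session = CanvasSession(
    date="2018-10-01",
    orange_version="3.15.0",
    operating_system="Linux",
    widget_additions=[
        WidgetAddition("File", "Click"),
        WidgetAddition("Data Table", "Extend",
                       other_widget_name="File", query="data"),
    ],
)
record = asdict(session)  # plain nested dicts, ready for serialization
```

Note that the record contains no timestamps beyond the date, in line with the decision to leave out the ‘time since application start’ value.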

Data Mining for Anthropologists?

This weekend we were in Lisbon, Portugal, at the Why the World Needs Anthropologists conference, an event that focuses on applied anthropology, design, and how soft skills can greatly benefit the industry. I was there to hold a workshop on Data Ethnography, an approach that tries to combine methods from data science and anthropology into a fruitful interdisciplinary mix!

Data Ethnography workshop at this year’s Why the World Needs Anthropologists conference.

 

Data ethnography is a novel methodological approach that tries to view social phenomena from two different points of view: qualitative and quantitative. The quantitative approach applies data mining and machine learning methods to anthropological data (say, from sensors, wearables, social media, online fora, field notes and so on) to find interesting patterns and novel information. The qualitative approach uses ethnography to substantiate the analytical findings with context, motivations, values, and other external data to provide a complete account of the studied phenomenon.

At the workshop, I presented a couple of approaches I use in my own research, namely text mining, clustering, visualization of patterns, image analytics, and predictive modeling. Data ethnography can be used not only in its native field of computational anthropology, but also in museology, digital anthropology, medical anthropology, and folkloristics (the list is probably not exhaustive). There are so many options just waiting for researchers to dig in!


However, having data- and tech-savvy anthropologists not only benefits research, but also opens a platform for discussing the ethics of data science, human relationships with technology, and overcoming model bias. Hopefully, the workshop inspired some of the participants to join me on a journey through the amazing expanses of data science.

To get you inspired, here are two contributions that present some options for computational anthropological research: Data Mining Workspace Sensors: A New Approach to Anthropology and Power of Algorithms for Cultural Heritage Classification: The Case of Slovenian Hayracks.