Multi-label classification (and Multi-target prediction) in Orange

The last summer, student Wencan Luo participated in Google Summer of Code to implement Multi-label Classification in Orange. He provided a framework, implemented a few algorithms and some prototype widgets. His work has been “hidden” in our repositories for too long; finally, we have merged part of his code into Orange (widgets are not there yet …) and added a more general support for multi-target prediction.

You can load multi-label tab-delimited data (e.g. emotions.tab) just like any other tab-delimited data:

>>> zoo = Orange.data.Table('zoo')            # single-target
>>> emotions = Orange.data.Table('emotions')  # multi-label

The difference is that now zoo‘s domain has a non-empty class_var field, while a list of emotions‘ labels can be obtained through it’s domain’s class_vars:

>>> zoo.domain.class_var
EnumVariable 'type'
>>> emotions.domain.class_vars
<EnumVariable 'amazed-suprised',
 EnumVariable 'happy-pleased',
 EnumVariable 'relaxing-calm',
 EnumVariable 'quiet-still',
 EnumVariable 'sad-lonely',
 EnumVariable 'angry-aggresive'>

A simple example of a multi-label classification learner is a “binary relevance” learner. Let’s try it out.

>>> learner = Orange.multilabel.BinaryRelevanceLearner()
>>> classifier = learner(emotions)
>>> classifier(emotions[0])
[<orange.Value 'amazed-suprised'='0'>,
 <orange.Value 'happy-pleased'='0'>,
 <orange.Value 'relaxing-calm'='1'>,
 <orange.Value 'quiet-still'='1'>,
 <orange.Value 'sad-lonely'='1'>,
 <orange.Value 'angry-aggresive'='0'>]
>>> classifier(emotions[0], Orange.classification.Classifier.GetProbabilities)
[<1.000, 0.000>, <0.881, 0.119>, <0.000, 1.000>,
 <0.046, 0.954>, <0.000, 1.000>, <1.000, 0.000>]

Real values of label variables of emotions[0] instance can be obtained by calling emotions[0].get_classes(), which is analogous to the get_class method in the single-target case.

For multi-label classification, we can also perform testing like usual, however, specialised evaluation measures have to be used:

>>> test = Orange.evaluation.testing.cross_validation([learner], emotions)
>>> Orange.evaluation.scoring.mlc_hamming_loss(test)
[0.2228780213603148]

In one of the following blog posts, a multi-target regression method PLS that is in the process of implementation will be described.

  • http://twitter.com/colindickson Colin Dickson

    i’m having an issue with the ‘class_var’ and ‘class_vars’ attributes.   i’ve updated to the most recent Orange version, and i’m able to follow with the rest of the example

    $ python
    Python 2.7.2+ (default, Oct  4 2011, 20:03:08) 
    [GCC 4.6.1] on linux2
    Type “help”, “copyright”, “credits” or “license” for more information.
    >>> import Orange
    >>> zoo = Orange.data.Table(‘zoo’)
    >>> zoo.class_var
    Traceback (most recent call last):
      File “”, line 1, in
    AttributeError: ‘orange.ExampleTable’ has no attribute ‘class_var’
    >>> emo = Orange.data.Table(’emotions’)
    >>> emo.class_vars
    Traceback (most recent call last):
      File “”, line 1, in
    AttributeError: ‘orange.ExampleTable’ has no attribute ‘class_vars’

    • matija

      Oh-oh, that’s my mistake, thanks for pointing it out! It should be zoo.domain.class_var and emotions.domain.class_vars! I’ll fix the blog post, too.

  • vino

    how can i add-on the multitarget?