Report is back! (and better than ever)


I’m sure you’d agree that reporting your findings when analyzing the data is crucial. Say you have a couple of interesting predictions that you’ve tested with several methods many times and you’d like to share that with the world. Here’s how.

Save Graph just got company – a Report button! Report works in most widgets, apart from the very obvious ones that simply transmit or display the data (Python Scripting, Edit Domain, Image Viewer, Predictions…).


Why is Report so great?


1. Display data and graphs used in your workflow. Whatever you do with your data will be put in the report at the click of a button.


2. Write comments below each section in your workflow. Put down whatever matters for your research – pitfalls and advantages of a model, why this methodology works, amazing discoveries, etc.


3. Access your workflows. Every step of the analysis recorded in the Report is saved as a workflow and can be accessed by clicking on the Orange icon. Have you spent hours analyzing your data only to find out you made a wrong turn somewhere along the way? No problem. Perhaps you would like to go back and start again from Box Plot? Click on the Orange icon next to Box Plot and you will be taken to the workflow you had when you placed that widget in the report. Completely stress-free!


4. Save your reports. The amazing new report that you just made can be saved as an .html, .pdf or .report file. HTML and PDF are pretty standard, but the report format is probably the best thing since sliced bread. Why? Not only does it save your report for later use, you can also send it to your colleagues, and they will be able to access both your report and the workflows used in the analysis.

5. Open report. To open a saved report file, go to File → Open Report. To view the report you’re working on, go to Options → Show report view or press Shift+R.

NetworkX in Orange

NetworkX, a popular open-source Python library for network analysis, has finally found its way into Orange. It is now used as the base class for network representation in all Orange modules and widgets. This gives the wider network analysis community a productive and fun way to visualize and explore networks using their existing NetworkX scripts. It has never been easier to combine network analysis and visualization with existing machine learning and data discovery methods.

Complete documentation is available in the Orange network headquarters. For a brief overview, take a look at the following example. Suppose we would like to analyse data about patients who have one of two types of leukemia. We have a data set with 72 patients, 4600+ gene expressions and a class variable. We also have a vast network of human genes, in which two genes are connected if they share a biological function. What we would like to examine is a sub-network containing only the several hundred most significantly expressed genes from the data set. To show off a bit, we will also use the Orange Bioinformatics add-on. Here is how we do it:

import Orange
import obiExpression

# load leukemia data set
table = Orange.data.Table("/media/Ox/Projects_Archive/res/BIO/leukemia/leukemiaGSEA.tab")

useAttributeLabels = False
ttest = obiExpression.ExpressionSignificance_TTest(table, useAttributeLabels)

target = [table.domain.classVar(0), table.domain.classVar(1)]

# test for significantly expressed genes
score = ttest(target = target)

# each gene is scored (t-test, p-value)
print(score[0])
# prints: (FloatVariable 'HIST1H4C', (1.8377179790830149, 0.07034778767062116))

# sort by p-value (the second element of each score tuple)
score.sort(key=lambda s: s[1][1])

# select 200 genes with the lowest p-value
important_genes = [gene_var.name for gene_var, s in score[:200]]

# read the gene network (5000+ genes, dense network)
G = Orange.network.readwrite.read('genes_biofunct.gpickle')

items = G.items().filter_bool({'gene': important_genes})
indices = [i for i, present in enumerate(items) if present]

# build a subgraph of the 200 selected genes
G_sub = G.subgraph(indices)
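
If you do not have the leukemia data or the gene network at hand, the ranking and subgraph steps above can be sketched with plain NetworkX. This is only an illustration under stated assumptions: the gene names, scores and edges below are made up, and the graph is built by hand rather than loaded through Orange, so none of the Orange-specific calls are used here.

```python
import networkx as nx

# Hypothetical (gene, (t_statistic, p_value)) pairs; the values are invented.
scores = [
    ("GENE_A", (2.9, 0.004)),
    ("GENE_B", (0.4, 0.690)),
    ("GENE_C", (3.5, 0.001)),
    ("GENE_D", (1.8, 0.070)),
]

# Rank by p-value (ascending) and keep the two most significant genes.
ranked = sorted(scores, key=lambda s: s[1][1])
important_genes = {name for name, _ in ranked[:2]}

# Toy gene network: each node stores its gene name in a 'gene' attribute.
G = nx.Graph()
for node_id, name in enumerate(["GENE_A", "GENE_B", "GENE_C", "GENE_D"]):
    G.add_node(node_id, gene=name)
G.add_edges_from([(0, 1), (0, 2), (1, 3), (2, 3)])

# Keep only the nodes whose gene was selected, then take the induced subgraph.
keep = [n for n, data in G.nodes(data=True) if data["gene"] in important_genes]
G_sub = G.subgraph(keep)

print(sorted(G_sub.nodes()), G_sub.number_of_edges())
```

The induced subgraph keeps only the edges whose both endpoints were selected, which is the same idea as the G.subgraph(indices) call in the Orange script.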

In addition to the power of the scripting environment, we also get the benefits of visual data exploration with Orange widgets. However, the network widgets are currently under heavy development, so expect some bugs if you dare to try them. Coding should be finished in a month or two; check the blog for progress updates. Here is how to open the network in the Nx Explorer widget:

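import sys
# import the QtGui submodule explicitly; a bare "import PyQt4" does not load it
from PyQt4.QtGui import QApplication

# OWNxExplorer must be on the Python path!
import OWNxExplorer

# a QApplication must exist before any widget is created
app = QApplication(sys.argv)
ow = OWNxExplorer.OWNxExplorer()
ow.show()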

# set the network
ow.set_graph(G_sub)
app.exec_()