Single-Cell Data Science for Everyone

Over the past twenty years, molecular biologists have invented technologies that collect abundant experimental data. One such technique is single-cell RNA-seq, which, put very simply, measures the activity of genes in potentially large collections of cells. Interpreting such data can tell us about the heterogeneity of cells, reveal cell types, or provide information on their development.

Typical analysis toolboxes for single-cell data are available in R and Python and, most notably, include Seurat and scanpy, but they lack the interactive visualizations and simplicity of Orange. Since the fall of 2017, we have been developing an extension of Orange for single-cell analysis, which is now (almost) ready. It has even been packaged with its own installer. The first real test of the software was in early 2018, in a one-day workshop at Janelia Research Campus. On March 6, with a much more refined version of the software, we repeated the hands-on workshop at the University of Pavia.

The five-hour workshop covered both the essentials of data analysis and single-cell analytics. The topics included data preprocessing, clustering, and two-dimensional embedding, as well as working with marker genes, differential expression analysis, and interpretation of clusters through gene ontology analysis.
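For readers who prefer to see the first three of these steps in code, here is a minimal sketch in plain NumPy. It is not how Orange or scanpy implement them: the function names (`normalize_log`, `pca_embed`, `kmeans`) are made up for illustration, PCA stands in for the two-dimensional embedding, k-means stands in for the clustering, and the count matrix is synthetic, with two artificial groups of cells that each switch on their own set of five "marker" genes.

```python
import numpy as np


def normalize_log(counts, target_sum=1e4):
    """Scale each cell (row) to the same total count, then log-transform."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid division by zero for empty cells
    return np.log1p(counts / totals * target_sum)


def pca_embed(X, n_components=2):
    """Project cells onto the top principal components using an SVD."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Synthetic data (an assumption for illustration): 60 cells, 20 genes,
# two groups of 30 cells, each with five highly expressed marker genes.
rng = np.random.default_rng(42)
n_cells, n_genes = 60, 20
rates = np.full((2, n_genes), 5.0)
rates[0, :5] = 50.0    # markers of group 0
rates[1, 5:10] = 50.0  # markers of group 1
true_groups = np.repeat([0, 1], n_cells // 2)
counts = rng.poisson(rates[true_groups])

X = normalize_log(counts)        # preprocessing
embedding = pca_embed(X, 2)      # two-dimensional embedding
labels = kmeans(embedding, k=2)  # clustering
```

In the workshop itself these steps were carried out with Orange's visual components rather than with a script; the sketch only shows what kind of computation sits behind them.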

I want to thank Prof. Dr. Riccardo Bellazzi and his team for organizing the workshop, and the Erasmus program for financial support. I have been a frequent guest in Pavia, and I learn something new every time I am there. Besides meeting the new students and colleagues who attended the workshop and hearing about their data analysis challenges, this time I also learned about a dish I had never had before in all my Italian travels. For one of the dinners (thank you, Michela) we had Pizzoccheri. Simply great!