Machine learning for genomics

Jeroen de Ridder

Our Research

Genome conformation. The genome is not a straight line. We are developing computational strategies to exploit measurements of genome conformation in the analysis of genomics data. To this end, we build graph-based data integration strategies and exploit large-scale epigenomics datasets. Recently, we have shown that cancer-causing mutations in the mouse genome are co-localized in 3D hotspots and linked to known cancer genes through long-range chromatin interactions [1]. Together with the de Laat lab (Hubrecht Institute), we are designing computational methods to detect multi-way interactions in the 3D genome.

Non-coding mutations. We work on analytical and computational frameworks for fast, cost-efficient and comprehensive detection and annotation of structural variations in cancer genomes. We focus in particular on previously neglected variations occurring in unexplored regions of the cancer genome: the non-coding genome. With these methods we aim to provide an important component of future genome-first clinical decision making for cancer patients and to drive the discovery of novel cancer genes and mechanisms from modern-day whole-genome sequencing data.

Interpretable machine learning. In this research line we aim to unravel biological mechanisms by investigating how and why trained prediction models fit the data. For instance, we create methods to identify robust gene sets or pathways that differentiate between breast cancer subtypes or cancer treatments. To this end, we employ machine learning models that can exploit existing biological knowledge, such as network- and pathway-based classifiers [3].

Data integration methods. Answering modern biological questions often requires a systems approach, in which multiple genome-wide measurements interrogating multiple biological phenomena are integrated. To enable this, we investigate data integration methodologies, in particular those that exploit graphs and graph mining. For instance, we developed so-called scale-aware graph-topological measures [4] that enable rich descriptions of network architecture, and used these to describe DNA-DNA contact maps in the brain [2].

Key publications
[1] Babaei S., et al., de Ridder J. 3D hotspots of recurrent retroviral insertions reveal long-range interactions with cancer genes. Nature Comm. 2015
[2] Babaei S., et al., de Ridder J.*, Reinders M.* Multi-scale chromatin interactions are predictive for spatial co-expression patterns in the mouse cortex. PLoS Comp. Biol. 2015
[3] Allahyar A., de Ridder J. FERAL: network-based classifier with application to breast cancer outcome prediction. Bioinformatics. 2015
[4] Hulsman M., Dimitrakopoulos C., de Ridder J. Scale-space measures for graph topology link protein network architecture to function. Bioinformatics. 2014
[5] Akhtar W., et al. de Ridder J., …, van Lohuizen M., van Steensel B. Chromatin position effects assayed by thousands of reporters integrated in parallel. Cell. 2013

Research Group

Jeroen de Ridder
Group name: De Ridder Lab
Research field: Machine learning for genomics
Genomics, Machine Learning
Department: Center for Molecular Medicine


Universiteitsweg 100
3584 CG
Office: STR1.305
Building: Stratenum