Sisse Njor

Sisse Njor says: “The purpose of this project is to improve existing analytical tools used to estimate the major benefits and harms of cancer screening.

Researchers worldwide are striving to produce reliable estimates of the major benefits and harms of cancer screening and to agree upon which methods produce reliable estimates. However, the existing analytical tools are increasingly obsolete and require updating. This project will suggest and validate a new method based on existing analytical tools from other research areas, one that will hopefully enable researchers to produce reliable estimates of the major benefits and harms of cancer screening, both for the entire population and for subgroups.

With the increasing move towards individualized screening, it is extremely important to know whether there are subgroups that have only a very small reduction in cancer mortality when participating in screening, or subgroups that have a particularly high risk of overdiagnosis. The new method may provide these answers”.

Sisse Njor has held a Senior Researcher position at Randers Regional Hospital, Department of Public Health Programmes, since 2017. She is also affiliated with Aarhus University, Institute of Clinical Medicine, as an Associate Professor and with the Danish Clinical Quality Program, National Clinical Registries, as an epidemiologist.

Lars Kai Hansen

Can artificial intelligence algorithms learn to communicate in a language we understand?

Lars Kai Hansen says: “Machine learning algorithms are often perceived as complex black boxes, and much research has already gone into opening the black box to explain what has been learned from data. The communication aspects of explainable AI have attracted less attention. The cognitive spaces project aims to relate AI explanations better to given user groups, effectively letting the algorithms speak the user’s language. We will realize the vision by aligning learned representations of data with formal human knowledge graphs. We hope to understand and push the limits of deep learning interactivity through theoretical and experimental analysis and the design of new learning schemes that enable knowledge-aware models and explanations.

Our primary use case concerns cognitive spaces for deeper understanding of electric brainwaves (EEG). These signals are of increasing diagnostic importance and EEG signals play a fundamental role in neuroscience. In an ambitious attempt to understand EEG models better we will use cognitive space methods for real-time “captioning” of the brainwave signal”.

Since 2000, Lars Kai Hansen has been a Professor at the Technical University of Denmark, where he heads the Cognitive Systems group.

Hiren Joshi

Hiren Joshi says: “The “third language of life” after genes and proteins is that of complex sugars. This language (the glycocode) describes myriad ways that organisms have fine-tuned proteins and cellular functions to allow complex life to thrive. We know how hundreds of enzymes generate sugars in cells, but we do not know how the individual cell regulates its enzymes and glycosylation network to make the specific sugars required in health and disease. The goal of this project is to learn how the glycocode is regulated, and in doing so reveal its functions. Using data science, the project team will build a foundational machine learning model for in silico glycoscience: GolgiNet. Capturing the regulatory patterns of cellular glycosylation, GolgiNet will be used to predict biological functions, reveal the sugars of a single cell, and predict the sugar-coated proteins a cell is programmed to make. GolgiNet will transform our ability to understand the third language of life, providing a Rosetta stone to decipher how sugars can mediate biological interactions”.

Hiren Joshi came to the University of Copenhagen in 2012 and has been an Associate Professor at the Copenhagen Center for Glycomics since 2021.

Fernando Racimo

Fernando Racimo says: “The genomes of organisms contain information about their past history: migrations, displacements and expansions of populations can be discerned from the footprints they left in genetic sequences – including our own genomes. Space is thus a crucial dimension of evolution: organisms interact, mate and compete with organisms that are closest to them in their landscape. Yet, tools for analyzing genomes in space are scarce or highly limited in scope.

Which types of genetic patterns are most informative of spatial aspects of the history of a species? And how can we best harness them to better understand the movement and past distribution of those species? To answer these questions, our research program will generate an array of computational tools for simulating, analyzing and modelling genomes on real geographic landscapes. These tools will be applicable to genetic data from both present-day living organisms and extinct populations, allowing us to understand population processes in unprecedented detail.

We will then apply our newly developed methods to a specific case-study: ancient epidemics in recent human prehistory. We will infer the spatial distribution and expansion of ancient pathogens and their hosts, using a combination of present-day and ancient genomic data. We will seek to understand how past epidemics have affected human populations over the last 50,000 years, how humans – in turn – have responded to these epidemics, and how future epidemics might unfold over time, as a consequence of climate change and ecological breakdown”.


Fernando Racimo came from UC Berkeley to the University of Copenhagen in 2017 and is now an Associate Professor at the Globe Institute.

Mikkel Schmidt

Mikkel Schmidt says: “Designing and creating new molecules and materials with specific tailored properties can lead to huge progress in medicine, solar cells, catalysts, and many other scientific areas. But identifying which compounds have the properties we desire is not easy. While quantum mechanical computations can determine many properties in a matter of minutes or hours, the number of possible molecules is so huge that search by trial and error is futile. Based on databases of known compounds and their properties, deep neural networks have proven extremely efficient in predicting properties of new compounds. These neural networks can guide our search, but have no notion of uncertainty and can give misleading results that are difficult to diagnose. To be used efficiently, the neural networks need to know what they don’t know. In this project we will develop methods for uncertainty quantification in deep neural networks aimed at the search for new and exciting materials and molecules”.

Mikkel Schmidt has been an Associate Professor at DTU Compute since 2013.

Erin Gabriel

Erin Gabriel says: “The project aims to develop statistical methods to improve personalized treatment decision-making while considering patient burden and accounting for the shortcomings of the data being used.

As medical data and treatment options grow, a vital question becomes how to use the information available to make the best treatment decision. Methods exist to help select the best treatment for each patient based on patient and disease characteristics. However, these methods often do not consider the burden placed on patients by collecting those characteristics, nor do they account for the potential shortcomings of the data used in the selection. Both issues can lead to the selection of sub-optimal treatments and potential harm to patients. To avoid this, selected decisions should be tested in clinical trials, and the patient burden should always be considered. However, randomized clinical trials can be costly, untimely, or simply impossible. The use of validated surrogate endpoints can make randomized clinical trials feasible, but in the setting of personalized treatment, improved statistical evaluation methods are needed. Regardless of the data collection type, there is also a need for statistical tools that account for patient burden in treatment selection. Finally, when observational data must be used, improved methods are needed for treatment selection that account for biases that may occur due to the lack of randomization”.

Erin Gabriel joined the Biostatistics Section at the University of Copenhagen, Department of Public Health, as an Associate Professor in 2022.

Adam Hulman

Adam Hulman says: “Artificial intelligence enables computer programs to execute human-like tasks such as image and speech recognition, text translation, and more. These applications are based on deep learning, a method that can recognize patterns in large datasets (e.g. millions of images from the internet) and then make predictions for new cases. In this project, deep learning methods will be developed and applied in a clinical setting. Persons with type 1 diabetes visit their physicians regularly for check-ups and screening for complications. Some of them also monitor their health using wearable devices between visits. Combining these data creates a unique opportunity for the development of clinical prediction models that can assist clinicians in tailoring prevention and treatment. However, complex data of different types (tabular, images, time series) collected repeatedly over time call for the development and application of novel deep learning methods”.


Adam Hulman came to Denmark for a Postdoc position at Aarhus University, Department of Public Health in 2015. In 2018 he joined Steno Diabetes Center Aarhus, Aarhus University Hospital, where he has been a Senior Data Scientist since 2020.

Tibor V. Varga

Tibor V. Varga says: “In the EU, health inequalities account for 20% of total healthcare costs and related welfare losses amount to nearly 1 trillion EUR per year. The European Commission considers health inequalities to be one of the greatest challenges facing European healthcare systems. Navigating this challenge requires improved data and smarter methods and tools for evaluating inequalities in health as well as practical ways of narrowing healthcare gaps. The vision of the Algorithmic Fairness in Diabetes Prediction (ALFADIAB) research program is a society where access to healthcare and quality of care do not depend on ethnicity, race, sex, or wealth. Even in Denmark, with its universal healthcare system, this is not yet a reality, and minorities with diabetes, and those who are the poorest, are affected more than others. As an example, immigrants, their descendants, and those who are the poorest have higher rates of developing type 2 diabetes, experience more severe complications (diseases of the heart, eye, and kidney), and benefit less from the Danish healthcare system. In this research program, I will investigate whether established risk prediction models that are used to forecast which individuals are at high risk of diabetes underperform for minorities and those with lower socioeconomic status. By utilizing Danish registry-based data on millions of people, I will assess inequalities in diabetes management and care, and deploy artificial intelligence techniques to develop improved predictive models that are equitable and perform equally well across subgroups”.


Tibor V. Varga has been an Assistant Professor in the Section of Epidemiology at the University of Copenhagen, Department of Public Health, since 2020.

Niklas Pfister

The CausalBiome project will develop a new unified framework for statistical analysis and causal inference on human microbiome data.

Microorganisms, such as bacteria, fungi and viruses, interact in diverse ways with their surroundings. The human body is estimated to be a habitat for more than 10,000 different microbial species, and they have been associated with various health outcomes such as cardiovascular disease, metabolic diseases, obesity, mental illness, and autoimmune disorders. Thanks to recent advances in gene sequencing technology, scientists are now able to directly measure these microbes. However, to understand how they interact with their human host, sophisticated statistical tools are needed to analyze the highly complex data. Unfortunately, current techniques do not offer a unified approach that incorporates all available knowledge into the analysis.

The CausalBiome project will fill this gap by developing novel statistical and data science analysis methods, which will lead to a better understanding of how the microbiome interacts with its host. All results will be made publicly available to help other scientists gain new insights into how microbes affect our health.

Shilpa Garg

The project aims to develop new computational methods for analysing and integrating data from both short-read and long-read DNA/RNA sequencing experiments.

The field of DNA and RNA sequencing has for many years been dominated by the so-called next-generation or second-generation sequencing technologies, which rely on massively parallel sequencing of short reads, and methods for analyzing such data are well developed. Recently, however, new third-generation technologies have emerged which produce much longer reads, enabling scientists to fill gaps and study phenomena such as repetitive sequences and structural variants.

However, computational methods to process and integrate these data types are missing. This project therefore aims to develop efficient, high-quality computational methods and open-source software packages for processing massive datasets for integrative sequencing analysis of complex diseases. Such new methods will significantly improve the understanding of genetic variation in novel megabase-sized repetitive regions and make it possible to study the cell heterogeneity underlying complex diseases.

The computational tools will be useful to large-scale initiatives such as the Human Pangenome Reference Consortium and the Danish National Genome Center, and may yield new insights into complex diseases such as cancer and type 2 diabetes.
