The project aims to develop new computational methods for analysing and integrating data from both short-read and long-read DNA/RNA sequencing experiments.
The field of DNA and RNA sequencing has for many years been dominated by the so-called next-generation or second-generation sequencing technologies, which rely on massively parallel sequencing of short reads, and methods for analysing such data are well developed. Recently, however, new third-generation technologies have emerged that produce much longer reads, enabling scientists to fill gaps and study phenomena such as repetitive sequences and structural variants.
However, computational methods to process and integrate these data types are missing. This project therefore aims to develop efficient, high-quality computational methods and open-source software packages for processing massive datasets for integrative sequencing analysis of complex diseases. Such new methods will significantly improve the understanding of genetic variation in novel megabase-sized repetitive regions and make it possible to study the cell heterogeneity underlying complex diseases.
The computational tools will be useful to large-scale initiatives such as the Human Pangenome Reference Consortium and the Danish National Genome Center, and may yield new insights into complex diseases, such as cancer and Type 2 diabetes.