Twenty years after the first decoding of human DNA, which turned out to be incomplete, a consortium of more than 100 researchers has finally sequenced our DNA from start to finish. In doing so, they revealed the previously inaccessible 8% of the genome, roughly the equivalent of an extra chromosome. These hitherto unexplored genetic territories promise important advances, in particular for better treatment of genetic diseases and a better understanding of the origins of our species. The new work is the subject of eight publications and commentaries in the journal Science.
The hardest puzzle in the world
Imagine a jigsaw puzzle with a thousand pieces, some of them cut so imprecisely that they fit in several different places. Imagine, too, that this puzzle has large areas of solid blue sky, and that there is no picture on the box to guide you. This is precisely the situation in which researchers from Celera Genomics and the Human Genome Project found themselves. They were the authors of the two papers published in 2001 that described the first sequencing of the human genome. “DNA sequencing technologies can only determine the sequence of relatively small DNA fragments,” explains Nicolas Altemose, a researcher at Berkeley (USA) and first author of one of the new publications. “So, to get the sequence of the entire genome, we have to break it down into little pieces, sequence those little pieces, then find where they overlap and put them back together.”
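The "find where they overlap and put them back together" step can be illustrated with a toy example. What follows is only a minimal sketch of a greedy overlap merge, not the method used by the consortium or by the 2001 projects; the function names `overlap` and `greedy_assemble`, the `min_len` threshold, and the short example fragments are all hypothetical and chosen for illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 when the best overlap is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len: int = 3):
    """Repeatedly merge the pair of fragments with the longest overlap.

    Toy illustration only: real assemblers must also handle sequencing
    errors, both DNA strands, and repeated sequences.
    """
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: the pieces stay separate
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Three made-up fragments that overlap end to end
print(greedy_assemble(["ATGCGTA", "GTACGTT", "CGTTAGC"]))
# → ['ATGCGTACGTTAGC']
```

When a stretch of sequence is repeated and is longer than the fragments themselves, several overlaps look equally good and a merge like this can no longer tell where each piece belongs. That is the "blue sky" problem of the puzzle analogy.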
In 2001, 92% of the human genome was described.
Two decades ago, however, sequencing technologies could not decipher long stretches of the vast series of bases (the units that make up DNA) that form our genome. “Until recently (…) sequencers could not read more than 1000 bases at a time,” whereas our DNA contains about three billion of them, notes Deanna M. Church, an American expert in genetics and bioinformatics, in a commentary published in Science. As a result, millions of bases had remained unknown since 2001, and the 169 most repetitive sequences could not be ordered or oriented in the assembly with confidence. “The Human Genome Project has mapped about 92 percent of it. The rest of the sequences were inherently complex and required technological advances not available at the time,” sums up researcher Karen Miga, who participated in this work. But technology is advancing rapidly. As the work published in 2022 shows, the advent of high-accuracy long-read and ultra-long-read sequencing has made it possible to read sequences of 20,000 and more than 100,000 bases, respectively!
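To get a feel for these orders of magnitude, here is a rough back-of-the-envelope calculation. It is only a sketch: it takes the round figure of three billion bases quoted above, assumes reads that tile the genome perfectly end to end, and ignores the overlap and repeated coverage that real sequencing requires, so the numbers are lower bounds. The function name `min_reads` is hypothetical.

```python
import math

GENOME_BASES = 3_000_000_000  # "about three billion" bases, as quoted in the text


def min_reads(read_length: int, genome_bases: int = GENOME_BASES) -> int:
    """Fewest reads needed to tile the genome with no overlap (a lower bound)."""
    return math.ceil(genome_bases / read_length)


for label, length in [
    ("1,000-base reads", 1_000),
    ("20,000-base long reads", 20_000),
    ("100,000-base ultra-long reads", 100_000),
]:
    print(f"{label}: at least {min_reads(length):,} pieces")
# → 1,000-base reads: at least 3,000,000 pieces
# → 20,000-base long reads: at least 150,000 pieces
# → 100,000-base ultra-long reads: at least 30,000 pieces
```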
“It is as if the pieces were now very large, like in a children’s puzzle,” says researcher Winston Timp of Johns Hopkins University (USA), a co-author of this work. “And we found that there are objects on the pieces, say, grass or the sun. It’s not just blue sky.” The assembly is all the more reliable because the new complete human genome comes from a single individual rather than from a patchwork of several, as the previous one did.