Innovative software assembles complete genome sequences in days
Washington, February 18
Researchers have developed and released an innovative software tool to assemble truly complete, gapless genome sequences from a variety of species, according to a new study.
This software, called Verkko, makes the process of assembling complete genome sequences more affordable and accessible, the study said.
Researchers from the US National Institutes of Health (NIH) developed the software.
Verkko, which means “network” in Finnish, grew out of the effort to assemble the first gapless human genome sequence, finished last year by the Telomere-to-Telomere (T2T) consortium, a collaborative project funded by the National Human Genome Research Institute (NHGRI), part of NIH, the study said.
A description of the new software is published in the journal Nature Biotechnology.
“We took everything we learned in the T2T project and automated the process,” said NHGRI associate investigator Sergey Koren, who led the creation of Verkko and is senior author on the paper.
“Now with Verkko, we can essentially push a button and automatically get a complete genome sequence,” said Koren.
The T2T consortium used new DNA sequencing technologies and analytical methods to generate and assemble the remaining 8-10 per cent of the human genome sequence, the study said.
However, the researchers assembled those sequences manually, a process that took the large and highly skilled team several years to complete, they said.
Verkko can finish the same task in a couple of days, they said.
Assembling a genome sequence is like putting together a jigsaw puzzle, and different DNA sequencing technologies generate different types of genomic puzzle pieces.
Some pieces are small and highly detailed, while others are much bigger but blurrier, the researchers said.
Verkko compares and assembles both types of pieces to generate a complete and accurate picture, the study said.
According to the study, Verkko starts by putting together the small, detailed pieces, creating many partially assembled but disconnected segments of sequence.
Verkko then compares these assembled regions with the larger, less precise pieces, which serve as a framework for ordering the more detailed regions, the study said.
The final product is an accurate and complete genome sequence.
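The two-stage idea described above can be illustrated with a toy sketch. It is not Verkko's actual algorithm, which the article does not detail. It assumes error-free short reads that overlap exactly, and a single long read whose only errors are substitutions. The function names (overlap, greedy_merge, best_position, order_contigs) and the example sequences are hypothetical, chosen only for illustration.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len characters exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_merge(reads, min_len=3):
    """Stage 1 (toy): merge accurate short reads into disconnected contigs.

    Repeatedly joins the pair with the longest exact overlap until no
    pair overlaps by at least min_len characters.
    """
    contigs = list(reads)
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


def best_position(contig, long_read):
    """Offset in long_read where contig aligns with the fewest mismatches."""
    best_pos, best_mm = 0, len(contig) + 1
    for start in range(len(long_read) - len(contig) + 1):
        window = long_read[start:start + len(contig)]
        mismatches = sum(x != y for x, y in zip(contig, window))
        if mismatches < best_mm:
            best_pos, best_mm = start, mismatches
    return best_pos


def order_contigs(contigs, long_read):
    """Stage 2 (toy): use the long, noisy read as a framework for ordering."""
    return sorted(contigs, key=lambda c: best_position(c, long_read))


# Hypothetical data: four accurate short reads and one long read that
# spans the whole region but contains two substitution errors.
short_reads = ["GTCCAT", "CATGGA", "ATGCGT", "CGTACC"]
noisy_long_read = "ATGCATACCTGAAGTCCCTGGA"

contigs = greedy_merge(short_reads)
print("contigs after stage 1:", contigs)
print("contigs after stage 2:", order_contigs(contigs, noisy_long_read))
```

In this sketch, the first stage yields two contigs that cannot be joined to each other because no short read covers the stretch between them. The second stage places each contig on the long read despite its errors and sorts them into genome order.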
The researchers tested Verkko with human and non-human genome sequencing data, they said.
The software quickly and precisely assembled the sequences of whole chromosomes, which was once a painstaking feat, the study said.
As Verkko leads to more complete human genome sequences, researchers can better assess human genomic diversity, they said.
With only one gapless human genome sequence, scientists currently lack knowledge about the diversity of many portions of the genome, such as regions of highly repetitive DNA, across the human population, the study said.
Verkko will also accelerate efforts to generate gapless genome sequences of species commonly used in research, such as mice, fruit flies and zebrafish, improving their usefulness to scientists, the study said.
Additionally, generating gapless genome sequences from a variety of plants, animals and other organisms will aid comparative genomics, which examines the differences and similarities among the genomes of diverse species, the study said.
“Verkko can democratize generating gapless genome sequences,” said Adam Phillippy, an NHGRI senior investigator who worked on the T2T project and the development of Verkko.
“This new software will make assembling complete genome sequences as affordable and routine as possible,” said Phillippy.