HudsonAlpha researchers develop algorithms to confirm clinical whole genome sequencing results

Clinical next-generation sequencing is widely used to help make diagnoses in patients with suspected genetic disorders. Identifying the genetic variants can inform a diagnosis for these patients. However, next-generation sequencing methods are known to have errors at many steps of the pipeline.

To reduce the risk of false positive results being returned to patients, the American College of Medical Genetics and Genomics (ACMG) and the College of American Pathologists (CAP) recommend that clinical labs perform additional sequencing confirmation for reported variants, usually by Sanger sequencing.

Confirmatory testing greatly improves the specificity of variant calling, but it increases both the turnaround time and the cost of testing. Members of the HudsonAlpha Institute for Biotechnology’s Clinical Services Lab recently published results in Genetics in Medicine showing that a computer algorithm can predict which variant calls are likely false positives, reducing the need for expensive confirmatory sequencing.

The team, which included James Holt, PhD, Melissa Kelly, PhD, Brett Sundlof, Ghunwa Nakouzi, PhD, David Bick, MD, and Elaine Lyon, PhD, FACMG, sequenced five reference human genome samples and compared the results with an established set of variants for each genome. Then they trained machine learning models to identify variants that were labeled as false positives. 
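The labeling and training steps described above can be sketched roughly as follows. This is a minimal illustration, not the team's published method: the `Call` record, the single `qual` feature, and the one-feature cutoff used in place of a real machine learning model are all assumptions made to keep the example short.

```python
import math
from typing import NamedTuple


class Call(NamedTuple):
    """One variant call; `qual` is a hypothetical stand-in for real model features."""
    chrom: str
    pos: int
    ref: str
    alt: str
    qual: float


def label_calls(calls, truth):
    """Pair each call with a label: True if it is absent from the truth set (false positive)."""
    return [(c, (c.chrom, c.pos, c.ref, c.alt) not in truth) for c in calls]


def fit_cutoff(labeled, min_capture=0.995):
    """Return the lowest quality cutoff that flags at least `min_capture` of false positives.

    A simple one-feature stand-in for a trained classifier (assumption).
    """
    fp_quals = sorted(c.qual for c, is_fp in labeled if is_fp)
    if not fp_quals:
        return float("-inf")  # nothing to capture, so flag nothing
    k = math.ceil(min_capture * len(fp_quals))
    return fp_quals[k - 1]


def needs_confirmation(call, cutoff):
    """Flag a call for confirmatory sequencing when its quality is at or below the cutoff."""
    return call.qual <= cutoff


# Toy data (invented for illustration): an established variant set and a lab's calls.
truth = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "GA")}
calls = [
    Call("chr1", 100, "A", "G", 60.0),
    Call("chr1", 200, "C", "T", 55.0),
    Call("chr2", 50, "G", "GA", 48.0),
    Call("chr1", 300, "T", "C", 12.0),   # not in truth set
    Call("chr2", 75, "AT", "A", 20.0),   # not in truth set
]

labeled = label_calls(calls, truth)
cutoff = fit_cutoff(labeled)
flags = [needs_confirmation(c, cutoff) for c in calls]
```

In this toy data, only the two calls missing from the established set end up flagged, so the three matching calls would skip confirmation. A real model would combine many call-level features rather than a single quality score.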

The models identified 99.5% of the false positive heterozygous single-nucleotide variants (SNVs) and heterozygous insertion/deletion variants (indels). Further, the models reduced confirmatory testing of non-actionable, non-primary SNVs by 85% and of indels by 75%. Applied in clinical practice, the algorithm reduces the overall number of Sanger confirmations by 71%.
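The two kinds of figures reported here, the share of false positives caught and the reduction in confirmatory testing, can be computed along these lines. This is a sketch under stated assumptions: the function names are hypothetical, the baseline is taken to be confirming every reported variant, and the toy lists are invented rather than drawn from the study.

```python
def capture_rate(is_false_positive, flagged):
    """Fraction of true false positives that were flagged for confirmation."""
    hits = [f for fp, f in zip(is_false_positive, flagged) if fp]
    if not hits:
        raise ValueError("no false positives to evaluate")
    return sum(hits) / len(hits)


def confirmation_reduction(flagged):
    """Fraction of variants that skip confirmation, relative to confirming all of them."""
    if not flagged:
        raise ValueError("no variants to evaluate")
    return 1 - sum(flagged) / len(flagged)


# Toy example (invented): 10 variants, 2 of them false positives, 4 flagged in total.
is_false_positive = [True, True] + [False] * 8
flagged = [True, True, True, True] + [False] * 6

caught = capture_rate(is_false_positive, flagged)
reduction = confirmation_reduction(flagged)
```

The two numbers pull against each other: flagging more variants raises the capture rate but shrinks the reduction in confirmatory testing, which is why both are reported together.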

These results are exciting and indicate that a low false positive call rate can be maintained while significantly reducing the need for confirmatory testing, thereby reducing time and cost of clinical next-generation sequencing.