Paired plus-minus sequencing is an ultra-high throughput and accurate method for dual strand sequencing of DNA molecules.

abstract

Distinguishing real biological variation in the form of single-nucleotide variants (SNVs) from errors is a major challenge for genome sequencing technologies. This is particularly true in settings where SNVs occur at low frequency, such as cancer detection through liquid biopsy or human somatic mosaicism. State-of-the-art molecular denoising approaches for DNA sequencing rely on duplex sequencing, where both strands of a single DNA molecule are sequenced to discern true variants from errors arising from single-stranded DNA damage. However, such duplex approaches typically require massive over-sequencing to overcome low capture rates of duplex molecules. To address these challenges, we introduce paired plus-minus sequencing (ppmSeq) technology, in which both DNA strands are partitioned and clonally amplified on sequencing beads through emulsion PCR. In this reaction, both strands of a double-stranded DNA molecule contribute to a single sequencing read, allowing for a duplex yield that scales linearly with sequencing coverage across a wide range of inputs (1.8-98 ng). We benchmarked ppmSeq against current duplex sequencing technologies, demonstrating superior duplex recovery with ppmSeq, with a rate of 44%±5.5% (compared to ∼5-11% for leading duplex technologies). Using both genomic DNA (gDNA) and cell-free DNA, we established error rates for ppmSeq, which had residual SNV detection error rates as low as 7.98×10^-8 for gDNA (using an end-repair protocol with dideoxy nucleotides) and 3.5×10^-7 ± 7.5×10^-8 for cell-free DNA. To test the capabilities of ppmSeq for error-corrected whole-genome sequencing (WGS) for clinical application, we assessed circulating tumor DNA (ctDNA) detection for disease monitoring in cancer patients. We demonstrated that ppmSeq enables powerful tumor-informed ctDNA detection at concentrations of 10^-5 across most cancers, and as low as 10^-7 in cancers with high mutation burden.
We then leveraged genome-wide trinucleotide mutation patterns characteristic of urothelial (APOBEC3-related and platinum exposure-related signatures) and lung (tobacco-exposure-related signatures) cancers to perform tumor-naive ctDNA detection, showing that ppmSeq can identify a disease-specific signal in plasma cell-free DNA without a matched tumor, and that this signal correlates with imaging-based disease metrics. Altogether, ppmSeq provides an error-corrected, cost-efficient and scalable approach for high-fidelity WGS that can be harnessed for challenging clinical applications and emerging frontiers in human somatic genetics where high accuracy is required for mutation identification.
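
The link between mutation burden and the tumor-informed detection limit can be illustrated with a back-of-the-envelope calculation. The sketch below is not the authors' analysis pipeline. It assumes a simple Poisson model in which every read covering a known tumor mutation site carries the variant with probability equal to the tumor fraction, and every such read carries a false variant with probability equal to the residual error rate. The function names, the site counts and the 30x depth are hypothetical; only the cell-free DNA error rate of 3.5×10^-7 comes from the abstract.

```python
import math


def expected_signal_and_noise(n_sites, depth, tumor_fraction, error_rate):
    """Return (expected tumor-derived variant reads, expected error reads).

    Assumption: reads over the n_sites tracked mutation sites are independent,
    and tumor_fraction is the per-read probability of carrying the variant.
    """
    informative_reads = n_sites * depth
    signal = informative_reads * tumor_fraction
    noise = informative_reads * error_rate
    return signal, noise


def detection_probability(n_sites, depth, tumor_fraction):
    """Poisson probability of sampling at least one tumor-derived variant read."""
    signal = n_sites * depth * tumor_fraction
    return 1.0 - math.exp(-signal)


CFDNA_ERROR_RATE = 3.5e-7  # residual cell-free DNA SNV error rate from the abstract
DEPTH = 30                 # hypothetical WGS depth

# Hypothetical scenarios: (label, tracked mutation sites, tumor fraction)
scenarios = [
    ("moderate burden", 10_000, 1e-5),
    ("high burden", 1_000_000, 1e-7),
]

for label, n_sites, tumor_fraction in scenarios:
    signal, noise = expected_signal_and_noise(
        n_sites, DEPTH, tumor_fraction, CFDNA_ERROR_RATE
    )
    p_detect = detection_probability(n_sites, DEPTH, tumor_fraction)
    print(f"{label}: signal={signal:.2f} reads, noise={noise:.2f} reads, "
          f"P(>=1 tumor read)={p_detect:.3f}")
```

Under these illustrative inputs both scenarios yield about three expected tumor-derived reads, which shows how a hundredfold larger set of tracked mutations can offset a hundredfold lower tumor fraction. Expected error reads grow with the same product of sites and depth, so in this simplified model the residual error rate bounds how low a tumor fraction remains distinguishable from noise.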

VIVO Weill Cornell Medical College
