Genomic sleuthing separates 30-year tuberculosis transmission chains
Richard Anthony and colleagues show that tracking newly emerging SNPs with WGS can distinguish transmission chains in a Mycobacterium tuberculosis cluster spanning over 30 years.
Tuberculosis remains a public health challenge, in part because the bacterium that causes it, Mycobacterium tuberculosis, can circulate for many years and infect many people in linked outbreaks called clusters. Whole genome sequencing (WGS) has emerged as a powerful tool for identifying epidemiological links between isolates by comparing single nucleotide polymorphisms (SNPs), which are single-base differences in the DNA. Investigators led by corresponding author Richard Anthony examined a particularly long-lived cluster of M. tuberculosis in the Netherlands that grew to more than 150 cases over more than 30 years. In such extended clusters, the usual approach of applying a predefined SNP threshold to rule recent transmission in or out is of limited use: overall genetic variability is low, so a distance cutoff alone cannot separate the individual chains. To get around that problem, the team focused on newly emerging, or informative, SNPs that appear as the cluster accumulates mutations over time. They examined WGS data to see which genomic positions varied across the cluster, looked for minority variant populations at those positions in other isolates, and used the resulting patterns to build a transmission scheme from the genetic data alone, which they then compared with traditional epidemiological investigations.
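To make the contrast between a distance threshold and informative SNPs concrete, here is a minimal sketch. It is not the authors' pipeline: the isolate names, genomic positions, bases, and the 12-SNP cutoff are all hypothetical values chosen for illustration.

```python
from itertools import combinations

# Hypothetical toy data: the base called at each variable position per isolate.
# Positions, bases, and isolate names are invented for illustration.
REFERENCE = {1001: "G", 2002: "C", 3003: "T"}
ISOLATES = {
    "iso_A": {1001: "G", 2002: "C", 3003: "T"},
    "iso_B": {1001: "A", 2002: "C", 3003: "T"},
    "iso_C": {1001: "A", 2002: "T", 3003: "T"},
    "iso_D": {1001: "G", 2002: "C", 3003: "G"},
}
SNP_THRESHOLD = 12  # assumed cutoff, not taken from the study


def snp_distance(a, b):
    """Count positions at which two isolates carry different bases."""
    return sum(1 for pos in REFERENCE if a[pos] != b[pos])


def pairs_within_threshold(isolates, threshold):
    """Return every isolate pair whose SNP distance is at or below the cutoff."""
    return [
        (x, y)
        for x, y in combinations(sorted(isolates), 2)
        if snp_distance(isolates[x], isolates[y]) <= threshold
    ]


def carriers_by_snp(isolates, reference):
    """Map each non-reference SNP to the sorted list of isolates carrying it."""
    carriers = {}
    for name in sorted(isolates):
        for pos, base in isolates[name].items():
            if base != reference[pos]:
                carriers.setdefault((pos, base), []).append(name)
    return carriers


if __name__ == "__main__":
    # In a low-diversity cluster every pair falls under the cutoff ...
    print(pairs_within_threshold(ISOLATES, SNP_THRESHOLD))
    # ... whereas shared emerging SNPs still split the cluster into branches.
    print(carriers_by_snp(ISOLATES, REFERENCE))
```

In this toy cluster the threshold links all six isolate pairs, while the SNP at position 1001 groups iso_B with iso_C and the SNP at position 3003 sets iso_D apart on its own branch.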
The study analyzed WGS data from 61 sequencing files representing 54 patients in the extended cluster. Genomic positions that varied within the cluster were screened in the other isolates for minority populations of alternative bases, that is, mixed variants that might indicate recent or ongoing transmission events. From this analysis the researchers identified 52 informative SNPs that helped resolve relationships among isolates; eight of these were also detected as mixed variants, meaning the alternative base was present as a minority population in some samples. One emerging SNP was found in the gene dnaA (1199G>A R400H); this mutation has been observed in other transmitted strains, and the authors note that it may be under selection. Using the pattern of filtered SNPs that had accumulated in the cluster, the team generated a transmission scheme based solely on the WGS data and compared it with the scenarios identified through classical epidemiological cluster investigations. They report high concordance between the genetically inferred chains and the traditional epidemiological findings.
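The screening for mixed variants described above can be illustrated with a short sketch. This is an assumption-laden example, not the method used in the study: the read counts, the minimum depth of 20 reads, and the 5% minor-fraction cutoff are hypothetical, as is the function name.

```python
# Assumed thresholds for illustration only; the study's filters are not given here.
MIN_DEPTH = 20             # minimum reads covering the position
MIN_MINOR_FRACTION = 0.05  # below this, the alternative base is treated as noise


def classify_position(counts, ref_base, alt_base,
                      min_depth=MIN_DEPTH, min_fraction=MIN_MINOR_FRACTION):
    """Classify one isolate at one variable position from per-base read counts.

    Returns "low_depth", "reference", "variant", or "mixed".
    """
    ref_reads = counts.get(ref_base, 0)
    alt_reads = counts.get(alt_base, 0)
    depth = ref_reads + alt_reads
    if depth < min_depth:
        return "low_depth"
    alt_fraction = alt_reads / depth
    if alt_fraction < min_fraction:
        return "reference"
    if alt_fraction > 1 - min_fraction:
        return "variant"
    # Both bases are present above the noise floor: a candidate mixed variant.
    return "mixed"


if __name__ == "__main__":
    # Hypothetical read counts at a position where the cluster varies G>A.
    examples = {
        "iso_1": {"G": 100, "A": 0},
        "iso_2": {"G": 90, "A": 10},
        "iso_3": {"G": 1, "A": 99},
        "iso_4": {"G": 5, "A": 5},
    }
    for name, counts in examples.items():
        print(name, classify_position(counts, "G", "A"))
```

An isolate classified as "mixed" at a position where other isolates carry the fixed variant is the kind of signal the article describes as a minority population; in practice the cutoffs would need tuning to sequencing depth and error rate.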
The findings indicate that careful analysis of filtered SNPs that accumulate in the genome of M. tuberculosis in large, long-running clusters can reveal transmission dynamics even when overall genetic variability is limited. By focusing on newly emerging, informative SNPs and on mixed variants that appear as minority populations, researchers can distinguish transmission chains that might otherwise be blurred when relying on a simple SNP distance threshold. The identification of an emerging dnaA (1199G>A R400H) change that appears in other transmitted strains raises the possibility that some mutations may be repeatedly selected during transmission, and therefore could serve as useful markers when reconstructing chains of spread. Overall, the approach described supports and refines traditional epidemiological cluster investigations, offering a genomic layer of evidence that can help public health teams untangle complex, decades-long transmission networks.
Public health teams can use WGS-derived emerging SNPs to refine contact tracing in long-term Mycobacterium tuberculosis clusters. This genomic approach supports more precise epidemiological investigations when standard SNP thresholds are insufficient.
Author: Rina de Zwaan