PAPER 20 Mar 2026 Global

New tool maps mobile DNA in bacteria more precisely

Yang Zhou presents ISdetector, an open-source pipeline that pinpoints insertion sequence locations and related structural changes from short-read data.

Insertion sequences (ISs) are short mobile DNA elements that drive genomic change in bacteria and archaea. Because they move and copy themselves, ISs create genetic plasticity that can alter traits such as drug resistance, virulence, and patterns of spread, all of which matter to researchers tracking outbreaks and studying disease-causing microbes. Yet pinpointing the exact position of an IS in a genome is surprisingly hard with standard high-throughput short-read sequencing. ISs are repetitive, and the structural changes they cause, such as deletions or rearrangements, can confuse plain alignment methods that assume reads match a stable reference. Recognizing this gap as whole-genome sequencing becomes routine for population-level studies, Yang Zhou and colleagues developed ISdetector, a purpose-built pipeline for detecting the precise insertion sites of specific ISs. ISdetector is designed to be robust to repeats and structural variation, producing high-resolution maps of where ISs sit in genomes and how they move. The work aims to give microbiologists a reliable way to follow mobile DNA across many samples using data types already common in labs around the world.

ISdetector combines an IS-clean reference strategy with clustering of IS-relevant signals from soft-clipped reads to locate insertion coordinates accurately. The pipeline works on high-throughput short-read sequencing data and is implemented in Python; it integrates standard tools including BWA, SAMtools, and BLAST+, and uses the Biopython and Pysam libraries for data processing. In benchmarks against existing tools such as ISMapper and MGEFinder, ISdetector showed higher accuracy and robustness. On high-GC-content genomes such as Mycobacterium tuberculosis it achieved an F1 score of 0.91, and on genomes with many ISs such as Shigella sonnei it reached an F1 score of 0.85. Importantly, ISdetector can identify IS movements accompanied by structural variations, including large-scale deletions, that other methods often miss. The implementation supports multi-threading, and running time decreased near-linearly as the thread count increased, making it scalable for studies that process large numbers of samples.
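To give a feel for how soft-clipped reads can point to an insertion coordinate, here is a minimal sketch in plain Python. It is not ISdetector's code. The `Read` record, the function names, and the `window` and `min_support` parameters are all hypothetical, chosen for illustration. The sketch only groups the reference positions where alignments are clipped. A real pipeline would also need to confirm that the clipped bases match the ends of the IS of interest.

```python
import re
from collections import namedtuple

# Hypothetical minimal read record: 0-based leftmost aligned position
# on the reference plus the alignment's CIGAR string.
Read = namedtuple("Read", ["pos", "cigar"])

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def clip_breakpoints(read):
    """Return reference coordinates at which a read is soft-clipped."""
    # Drop hard clips so a soft clip next to one is still seen at the end.
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(read.cigar) if op != "H"]
    if not ops:
        return []
    points = []
    # Leading soft clip: the breakpoint is the first aligned base.
    if ops[0][1] == "S":
        points.append(read.pos)
    # Trailing soft clip: the breakpoint is just past the last aligned base.
    if len(ops) > 1 and ops[-1][1] == "S":
        ref_len = sum(n for n, op in ops if op in "MDN=X")
        points.append(read.pos + ref_len)
    return points


def cluster_breakpoints(reads, window=10, min_support=3):
    """Group nearby breakpoints; keep clusters with enough supporting reads."""
    positions = sorted(p for r in reads for p in clip_breakpoints(r))
    clusters, current = [], []
    for p in positions:
        # Start a new cluster when the gap to the previous point is too large.
        if current and p - current[-1] > window:
            clusters.append(current)
            current = []
        current.append(p)
    if current:
        clusters.append(current)
    # Report (first position, last position, number of supporting clips).
    return [(c[0], c[-1], len(c)) for c in clusters if len(c) >= min_support]


# Toy input, invented for illustration only.
reads = [
    Read(100, "20S80M"),   # clipped on the left at 100
    Read(102, "15S85M"),   # clipped on the left at 102
    Read(50, "52M48S"),    # clipped on the right at 50 + 52 = 102
    Read(500, "100M"),     # fully aligned, no signal
    Read(900, "30S70M"),   # isolated clip, below the support threshold
]
candidates = cluster_breakpoints(reads)
```

With this toy input, the three clips near position 100 form one candidate site, while the single clip at 900 is discarded as unsupported. The support threshold is the main design choice here: it trades sensitivity for fewer spurious calls in repetitive regions.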

The capabilities demonstrated by ISdetector have practical implications for microbial genomics. By pinpointing exact insertion coordinates and recognizing associated structural changes, the pipeline helps researchers better interpret how mobile DNA contributes to phenotypes like drug resistance and virulence and how those traits spread through populations. Its higher accuracy compared with ISMapper and MGEFinder means fewer false leads and a clearer picture of genomic rearrangements, particularly in challenging contexts such as high-GC genomes or organisms with many IS elements. The fact that ISdetector is open-source and integrates widely used tools lowers the barrier for adoption; labs can run it on existing short-read datasets and scale analyses across cohorts thanks to multi-threaded performance. The source code, documentation, and usage instructions are freely available at https://github.com/carolynzy/ISdetector, providing a practical resource for teams conducting population-level genomic surveillance and basic research into genomic plasticity.
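For readers unfamiliar with the F1 scores cited for the benchmarks, the metric is the harmonic mean of precision (the share of reported insertion sites that are real) and recall (the share of real sites that are reported). The short sketch below computes it from counts of true positives, false positives, and false negatives. The function name and the example counts are hypothetical and are not taken from the ISdetector benchmarks.

```python
def f1_score(tp, fp, fn):
    """Compute F1 from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Invented counts for illustration: 8 correct calls, 2 spurious, 2 missed.
example = f1_score(tp=8, fp=2, fn=2)
```

Because the harmonic mean is pulled toward the lower of the two values, a tool cannot reach a high F1 by reporting many sites indiscriminately or by reporting only a few very safe ones.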

Public Health Impact

ISdetector will let researchers map insertion sequences and related structural changes more accurately, improving studies of drug resistance, virulence, and pathogen spread. Its speed and scalability mean large population-level sequencing projects can include reliable IS detection without major added cost.

insertion sequences
whole-genome sequencing
bioinformatics
Mycobacterium tuberculosis
Shigella sonnei
Featured Experts

Author: Yang Zhou
