PAPER 05 Jan 2026 Global

AI ranks compounds for tuberculosis proteins using human chemoproteomics

Adrián Jinich and colleagues developed TubercuProbe, a machine learning tool that leverages human chemoproteomic data to prioritize compound–protein interactions in Mycobacterium tuberculosis.

Tuberculosis research faces a practical bottleneck: powerful chemical biology tools like activity-based protein profiling (ABPP) and residue-specific chemoproteomics have changed how scientists study human proteins, but applying these methods to pathogens such as Mycobacterium tuberculosis (Mtb) is difficult. Biosafety restrictions, limited throughput and the lack of reusable atlases of protein reactivity slow discovery in infectious agents. To address this gap, researchers including Adrián Jinich introduced TubercuProbe, a cross-species machine learning framework designed to carry knowledge from large human chemoproteomic datasets into pathogen work. Rather than running all experiments directly in dangerous or slow-growing organisms, TubercuProbe uses sequence information and pretrained molecular and protein features to rank likely compound–protein interactions in Mtb. The goal is not to replace laboratory validation but to provide a practical, sequence-driven first pass that highlights the most promising compounds and residues for follow-up ABPP and chemoproteomic experiments. By shifting much of the prioritization into computational space, the approach aims to reduce the experimental burden and accelerate the identification of probes that could be used to study or target Mtb virulence proteins.

TubercuProbe combines a graph isomorphism network with edge features (GINE) to encode small-molecule ligands with frozen ESM-C (600M) protein embeddings, and links the two with bidirectional cross-attention. The model was trained on more than 2 million ChEMBL compound–protein pairs, a dataset composed predominantly of human targets, and achieved R² = 0.77 (MSE = 0.45) for continuous affinity prediction. The framework also transferred to a different task, binary cysteine reactivity prediction, where it reached an AUPRC of 0.63 on CysDB. Ablation studies showed that pretrained features remained highly useful across different freezing strategies (ΔAUPRC < 0.03), suggesting the model learns general patterns of protein–molecule interaction that cross species. As a focused demonstration, TubercuProbe was used to prioritize cysteine-reactive electrophiles and molecular glues for three Mtb virulence proteins (PtpB, SapM, and Rv3671c), producing candidate probes for prospective ABPP validation. An orthogonal comparison with Boltz-2 structure predictions showed moderate correlation (Pearson r ≈ 0.69). The authors note that TubercuProbe is a lightweight, sequence-driven ranker intended to guide pre-experimental choices.
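
The idea of linking a ligand encoder and a protein encoder through bidirectional cross-attention can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' implementation: the random matrices merely stand in for GINE atom embeddings and frozen ESM-C residue embeddings, and every name, the toy dimensions, the single attention head, the mean pooling and the linear scoring head are assumptions made for illustration.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head attention: each query row attends over all context rows."""
    q = matmul(queries, w_q)
    k = matmul(context, w_k)
    v = matmul(context, w_v)
    scale = math.sqrt(len(q[0]))
    scores = matmul(q, transpose(k))
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights

def mean_pool(rows):
    """Average a set of row vectors into one vector."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def random_matrix(n_rows, n_cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(n_cols)] for _ in range(n_rows)]

def score_pair(atom_feats, residue_feats, params):
    """Bidirectional cross-attention followed by a linear scoring head."""
    # Direction 1: ligand atoms query the protein residues.
    lig_ctx, lig_w = cross_attention(atom_feats, residue_feats, *params["lig_to_prot"])
    # Direction 2: protein residues query the ligand atoms.
    prot_ctx, prot_w = cross_attention(residue_feats, atom_feats, *params["prot_to_lig"])
    # Concatenate the two pooled views and project to a single score.
    pooled = mean_pool(lig_ctx) + mean_pool(prot_ctx)
    score = sum(w * x for w, x in zip(params["head"], pooled))
    return score, lig_w, prot_w

# Toy sizes (assumed); real embeddings are far larger.
N_ATOMS, N_RESIDUES = 5, 12
D_LIG, D_PROT, D_ATT = 8, 16, 4

rng = random.Random(0)
atom_feats = random_matrix(N_ATOMS, D_LIG, rng)        # stand-in for GINE output
residue_feats = random_matrix(N_RESIDUES, D_PROT, rng)  # stand-in for frozen ESM-C output

# Untrained random weights: (W_q, W_k, W_v) per direction, plus the scoring head.
params = {
    "lig_to_prot": (random_matrix(D_LIG, D_ATT, rng),
                    random_matrix(D_PROT, D_ATT, rng),
                    random_matrix(D_PROT, D_ATT, rng)),
    "prot_to_lig": (random_matrix(D_PROT, D_ATT, rng),
                    random_matrix(D_LIG, D_ATT, rng),
                    random_matrix(D_LIG, D_ATT, rng)),
    "head": [rng.gauss(0.0, 0.1) for _ in range(2 * D_ATT)],
}

score, lig_w, prot_w = score_pair(atom_feats, residue_feats, params)
print(f"score = {score:.4f}")
```

Scoring many candidate compounds against one protein and sorting by `score` would yield the kind of ranking described above; with untrained random weights, as here, the values themselves carry no meaning. The protein-to-ligand attention weights (`prot_w`) show how a per-residue view falls out of the same computation.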

The significance of TubercuProbe lies in its practical contribution to chemoproteomic discovery under biosafety constraints. By transferring large-scale human chemoproteomic knowledge into predictions for pathogens, the tool offers a way to narrow down the vast chemical space before committing to difficult or costly experiments in Mtb. The finding that pretrained protein features are broadly transferable reduces the need for extensive pathogen-specific data to get useful rankings, and the moderate agreement with structure-based predictions from Boltz-2 provides an additional, independent line of support. Importantly, the authors emphasize that effective covalent probes must both reach their target and react there; they discuss future extensions toward multitask learning that jointly predicts non-covalent binding and covalent reactivity, so that prioritized compounds are more likely to succeed in real assays. Overall, TubercuProbe is positioned as a practical pre-experimental filter to speed up ABPP and chemoproteomic discovery in biosafety-restricted systems, while pointing to the next steps needed to convert computational hits into validated chemical probes.

Public Health Impact

TubercuProbe can reduce the time, cost and experimental burden of finding chemical probes for Mycobacterium tuberculosis by prioritizing candidates before lab work. This pre-experimental ranking helps focus scarce biosafety-level experiments on the most promising compounds and residues.

tuberculosis
Mycobacterium tuberculosis
chemoproteomics
machine learning
cysteine reactivity

Author: Abhiram Chalamalasetty
