New hybrid model improves tuberculosis forecasts in Nepal
Dip Bdr. Singh led a study showing a hybrid SARIMA-CNNAR model forecasts Nepal’s monthly TB cases with high accuracy, aiding public health planning.
Tuberculosis remains a major public health challenge in Nepal, where incidence rates are substantially higher than global estimates. To give health authorities earlier and more reliable warning of changing case numbers, researchers led by Dip Bdr. Singh used ten years of routine surveillance data to build a new forecasting tool. The team obtained monthly TB incidence reports from Nepal’s National Tuberculosis Control Center (NTCC) covering January 2015 to December 2024. Over that period, average monthly cases rose from 2,048 in 2015 to 3,447 in 2024, an increase the study reports as 68.4%. Aiming to improve on existing prediction methods, the study combined two complementary modeling approaches: one designed to capture regular seasonal patterns and another able to learn complex, nonlinear behavior in the data. The goal was to produce reliable month-by-month forecasts that could support early warning systems, help allocate scarce resources, and enable targeted public health actions in a resource-limited setting.
The team built a hybrid model that pairs Seasonal Autoregressive Integrated Moving Average (SARIMA) with a Convolutional Neural Network Auto-Regressive (CNNAR) component. In this design SARIMA models the linear seasonal trends while CNNAR learns nonlinear structure from the remaining errors (residuals). Hyperparameters were tuned by grid search with 5-fold cross-validation. The researchers also performed structural break analysis and sensitivity analysis to test robustness. On the 2024 test set the hybrid SARIMA-CNNAR model achieved MAE = 248.35, RMSE = 294.31, MAPE = 7.2%, and R² = 0.79. For comparison, standalone CNNAR scored MAE = 251.08, RMSE = 336.55, MAPE = 7.7%, R² = 0.73; LSTM had MAE = 267.91, RMSE = 324.55, MAPE = 7.5%, R² = 0.75; XGBoost had MAE = 314.74, RMSE = 373.99, MAPE = 8.5%, R² = 0.66; Facebook Prophet had MAE = 371.15, RMSE = 478.40, MAPE = 10.4%, R² = 0.45; and SARIMA alone had MAE = 401.11, RMSE = 503.93, MAPE = 10.99%, R² = 0.39. All models reproduced seasonal peaks in March–May and July–August, and forecasts indicate these seasonal patterns will continue into 2025. Sensitivity testing showed under 5% variation in metrics across parameter settings.
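For readers unfamiliar with the four error measures reported above, the following is a minimal Python sketch of how MAE, RMSE, MAPE, and R² are conventionally computed from a set of observed and forecast values. The function name `forecast_metrics` and the sample numbers are hypothetical illustrations, not the study's code or data, and the formulas are the standard textbook definitions, which the study is assumed (not confirmed) to follow.

```python
import math


def forecast_metrics(actual, predicted):
    """Return MAE, RMSE, MAPE (in percent), and R^2 for two equal-length sequences.

    Hypothetical helper for illustration; uses the standard definitions.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and the same length")

    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    # Mean absolute error: average size of the miss, in cases per month.
    mae = sum(abs(e) for e in errors) / n

    # Root mean squared error: penalizes large misses more heavily than MAE.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Mean absolute percentage error: undefined if any actual value is zero.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n

    # R^2: share of variance in the actual values explained by the forecast.
    # Undefined if all actual values are identical (ss_tot would be zero).
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot

    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}


# Illustrative numbers only; these are NOT taken from the study.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
metrics = forecast_metrics(actual, predicted)
print(metrics)
```

Lower MAE, RMSE, and MAPE indicate smaller forecast errors, while an R² closer to 1 indicates that the forecast tracks more of the month-to-month variation, which is the sense in which the hybrid model's figures compare favorably with the other models listed.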
The study presents the first validated hybrid SARIMA-CNNAR approach applied to national TB data in Nepal and shows that combining linear seasonal modeling with a nonlinear learning component can substantially improve forecast accuracy. For public health officials, more accurate month-by-month predictions mean better timing for distributing diagnostic supplies, adjusting staffing levels during peak months, and focusing awareness campaigns when they will be most effective. Because the model was designed and tested on routine surveillance data and shown to be robust, it can be integrated into national surveillance systems to provide operational intelligence. The authors also highlight that this data-driven hybrid approach is adaptable: with similar time series from other diseases or other countries, the same modeling strategy could strengthen planning where resources are limited and timely decisions matter most.
More accurate TB forecasts can help the NTCC pre-position diagnostics, plan staff, and target outreach before seasonal peaks. The approach is adaptable for other diseases and settings, helping strengthen surveillance where resources are limited.
Author: Dip Bdr. Singh