Advancing pearl millet yield forecasting : comparative analysis of individual and ensemble machine learning approaches over Rajasthan, India

Alsaber, Ahmad and Setiya, Parul and Satpathi, Anurag and Aljamaan, Abrar and Pan, Jiazhu (2025) Advancing pearl millet yield forecasting : comparative analysis of individual and ensemble machine learning approaches over Rajasthan, India. PLoS ONE, 20 (3). e0317602. ISSN 1932-6203 (https://doi.org/10.1371/journal.pone.0317602)

Text. Filename: Alsaber-etal-PLOSOne-2025-Advancing-pearl-millet-yield-forecasting.pdf
Final Published Version
License: Creative Commons Attribution 4.0


Abstract

Pearl millet (Pennisetum glaucum L.) is a resilient crop known for its ability to thrive in arid and semi-arid regions, making it a crucial staple in regions prone to drought. Rajasthan, a state in India, emerged as the top producer of pearl millet. This study enhances yield forecasting for pearl millet using machine learning models across nine districts viz. Jaipur, Ajmer, Jodhpur, Bikaner, Bharatpur, Alwar, Sikar, Jhunjhunu and Nagaur in Rajasthan, India. Data from 1997–2019 (23 years), including yield data from the Directorate of Economics and Statistics and weather data from the NASA POWER web portal, were analysed. The study employed individual machine learning methods (GLM, ELNET, XGB, SVR and RF) and their ensemble combinations (GLM, ELNET, Cubist and RF). Discerning the overall best performing model across all locations remained challenging. For instance, while ensemble models exhibited subpar performance in Barmer and Nagaur, their performance ranged from satisfactory to commendable in other locations. To identify the best model, all models were ranked based on their R² and nRMSE (%) values. Combined average ranks during training and testing revealed the model performance ranking as I-XGB (3.83) > I-GLM (4.28) > E-ELNET (4.32) > I-RF (4.67) > E-GLM (4.88) > I-SVR (4.90) > I-ELNET (4.94) > E-RF (6.03) > E-Cubist (7.15), where I denotes individual model, while E denotes ensemble model. Intriguingly, while individual GLM and XGB models demonstrated superior performance during calibration, they exhibited poorer performance during validation, potentially indicating issues of data overfitting. Hence, the ensemble ELNET approach is recommended for accurate prediction of pearl millet yield, followed by the individual RF model. These performances underscore the importance of tailored model selection based on specific geographic and environmental conditions.
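
As a rough illustration of the rank-based comparison described in the abstract, the sketch below scores a few models on R² and nRMSE (%) and averages their ranks on the two metrics. It is not the authors' procedure: the definition of R² as the coefficient of determination, the normalisation of RMSE by the observed mean, the equal weighting of the two ranks, the tie handling, all function names, and the yield numbers are assumptions made for illustration only.

```python
from math import sqrt


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def nrmse_percent(observed, predicted):
    """RMSE as a percentage of the observed mean (assumed normalisation)."""
    mean_obs = sum(observed) / len(observed)
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return 100.0 * sqrt(mse) / mean_obs


def rank_values(values, higher_is_better):
    """Map each name to a rank (1 = best); tied values share the average rank."""
    ordered = sorted(values, key=values.get, reverse=higher_is_better)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of names whose value ties with ordered[i].
        while j + 1 < len(ordered) and values[ordered[j + 1]] == values[ordered[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for name in ordered[i:j + 1]:
            ranks[name] = shared_rank
        i = j + 1
    return ranks


def average_ranks(metrics):
    """metrics maps model name -> (R², nRMSE %); returns the mean of the two ranks."""
    r2_ranks = rank_values({m: v[0] for m, v in metrics.items()}, True)
    err_ranks = rank_values({m: v[1] for m, v in metrics.items()}, False)
    return {m: (r2_ranks[m] + err_ranks[m]) / 2 for m in metrics}


# Hypothetical yields (kg/ha) and predictions; not data from the study.
observed = [1000.0, 1200.0, 900.0, 1100.0]
predictions = {
    "model_a": [980.0, 1150.0, 950.0, 1120.0],
    "model_b": [1100.0, 1100.0, 1000.0, 1000.0],
    "model_c": [1010.0, 1190.0, 890.0, 1080.0],
}
metrics = {
    name: (r_squared(observed, pred), nrmse_percent(observed, pred))
    for name, pred in predictions.items()
}
for name, rank in sorted(average_ranks(metrics).items(), key=lambda kv: kv[1]):
    print(f"{name}: average rank {rank:.2f}")
```

A lower average rank indicates a better model under this scheme. Averaging the same ranks again over districts and over the training and testing periods would give a single combined figure per model, comparable in form to the values quoted in the abstract.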

ORCID iDs

Alsaber, Ahmad, Setiya, Parul, Satpathi, Anurag, Aljamaan, Abrar and Pan, Jiazhu ORCID: https://orcid.org/0000-0001-7346-2052