DEVELOPING MACHINE LEARNING ALGORITHMS FOR PREDICTING SOYBEAN YIELD BASED ON WEATHER AND SOIL DATA

Authors

  • Muhammad Bilal, Faculty of Agriculture, Gomal University, Dera Ismail Khan 29050, Khyber Pakhtunkhwa, Pakistan
  • Muhammad Danial Ahmad Qureshi, Department of Artificial Intelligence, University of Management & Technology, Lahore, Pakistan

DOI:

https://doi.org/10.64038/eatf.01.2024.2

Keywords:

Soybean Yield Prediction, Machine Learning, Environmental Data Integration, Random Forest, Model Interpretability, Precision Agriculture

Abstract

Accurate forecasting of soybean (Glycine max) yield under variable environmental conditions is essential for optimizing management decisions and ensuring food security. In this study, we developed a hybrid machine learning pipeline that integrates weather data (cumulative growing‐season precipitation, mean temperature, solar radiation, humidity) with soil data (moisture, organic matter, pH, texture), using principal component analysis for dimensionality reduction and recursive feature elimination for feature selection. We trained Random Forest, support vector machine, and deep neural network models on a multi‐year (2021–2023), multi‐region dataset and evaluated performance with five‐fold cross‐validation and independent test sets. Random Forest consistently outperformed the alternatives, with a best test‐set RMSE of 1.15 t/ha, MAE of 0.82 t/ha, and R² of up to 0.87; across the five cross‐validation folds, MAE ranged from 0.83 to 0.90 t/ha. Regional assessments showed that the East region had the highest accuracy and the West region the greatest error variance. SHAP‐based analysis ranked cumulative precipitation and mean temperature as the top drivers of yield variability, a ranking supported by feature‐importance bar charts, scatter plots of predicted versus actual yields, and error‐distribution visualizations. Correlation heatmaps showed low to moderate collinearity among key predictors, which supports the benefit of multi‐modal data fusion. Our approach provides robust, interpretable yield forecasts and can be transferred to other crops and regions with local calibration. As a decision‐support tool, it offers stakeholders a scalable way to enhance soybean productivity under climatic uncertainty.
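
For readers unfamiliar with the evaluation terms used in the abstract, the following is a minimal Python sketch of how RMSE, MAE, and R² are conventionally computed and how a five‐fold split can be formed. It is not the authors' code: the function names (rmse, mae, r2, k_fold_indices) and the yield values are hypothetical, chosen only for illustration, and the study's actual fold assignment is not described on this page.

```python
import math
import random
from statistics import mean


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the yields (e.g. t/ha)."""
    return math.sqrt(mean((a - p) ** 2 for a, p in zip(actual, predicted)))


def mae(actual, predicted):
    """Mean absolute error, in the same units as the yields."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - y_bar) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split.

    Assumption: a plain random shuffle; the study may have split by
    year or region instead.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, test


# Made-up yields in t/ha, purely illustrative (not data from the study).
actual = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 2.5, 4.5, 4.5]

print("RMSE:", rmse(actual, predicted))
print("MAE: ", mae(actual, predicted))
print("R2:  ", r2(actual, predicted))

for train_idx, test_idx in k_fold_indices(10, k=5):
    print("held-out fold:", sorted(test_idx))
```

In a cross‐validated evaluation, a model would be fitted on each train_idx subset and scored on the matching test_idx subset, which gives one value of each metric per fold; a range such as the reported five‐fold MAE is the spread of those per‐fold values.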

Published

2024-06-30

How to Cite

DEVELOPING MACHINE LEARNING ALGORITHMS FOR PREDICTING SOYBEAN YIELD BASED ON WEATHER AND SOIL DATA. (2024). Eco AgriTech Frontiers, 1(01), 29-42. https://doi.org/10.64038/eatf.01.2024.2