YIELD OF MAIN AGRICULTURAL CROPS IN THE PENZA REGION: ANALYSIS OF PREDICTORS AND MEDIUM-TERM FORECAST
Abstract and keywords
Abstract:
Against the backdrop of increasing climatic instability and import substitution policies, medium-term crop yield forecasting is acquiring strategic importance for regional agricultural planning. Penza Oblast, a leading producer of sugar beet and several other crops within the Volga Federal District, lacks verified, systematic forecasting research. The aim of this study is to develop and validate a suite of predictive yield models for five major crops in Penza Oblast (grains and legumes, sugar beet, sunflower, potato, and vegetables) over a forecasting horizon of up to three years. The models are built on Ridge regression with L2 regularisation, exponential observation weighting, and an expanding-window validation scheme. For each crop, an extensive grid search (12,000–25,000 configurations) was conducted over a space of 135 predictors comprising regional statistical data, MODIS NDVI satellite indices, agroclimatic indicators, and ERA5-Land reanalysis data for the period 1990–2025 (n = 36). Automated feature selection produced crop-specific predictor sets of 17–31 variables with low cross-crop overlap (Jaccard coefficient 0.09–0.26) and an agronomically interpretable composition. Backtesting yielded MAPE values ranging from 6.52% (sugar beet) to 15.18% (sunflower). Ridge regression outperformed Random Forest and XGBoost across all crops by 4.5–9.7 percentage points, confirming the advantage of regularised linear models under small-sample conditions. Point and interval forecasts for 2026–2028 (bootstrap, N = 2000) indicate moderate yield growth for grains and sugar beet, with stabilisation projected for the remaining crops. The forecasting framework achieves high to good accuracy (per the Lewis scale) for four of the five crops and is recommended for integration into regional medium-term agricultural planning systems, primarily for regional agricultural authorities and large agro-industrial holdings.
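
For readers unfamiliar with the validation scheme named in the abstract, the following is a minimal sketch of Ridge regression with exponential observation weighting, scored by MAPE under an expanding window, together with the Jaccard coefficient used to compare predictor sets. It is not the authors' implementation: the function names, the decay factor of 0.9, the penalty value, the minimum training length of 20 years, and the synthetic data are all illustrative assumptions, and feature standardisation is omitted for brevity.

```python
import numpy as np

def exp_weights(n, decay=0.9):
    """Observation weights that grow toward the most recent year (last weight = 1)."""
    return decay ** np.arange(n - 1, -1, -1)

def fit_weighted_ridge(X, y, alpha=1.0, decay=0.9):
    """Closed-form weighted ridge fit; the intercept is left unpenalised."""
    w = exp_weights(len(y), decay)
    x_mean = np.average(X, axis=0, weights=w)
    y_mean = np.average(y, weights=w)
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc' W Xc + alpha * I) coef = Xc' W yc
    A = Xc.T @ (Xc * w[:, None]) + alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(A, Xc.T @ (w * yc))
    return coef, y_mean - x_mean @ coef

def expanding_window_mape(X, y, min_train=20, alpha=1.0, decay=0.9):
    """One-step-ahead backtest: train on years [0, t), predict year t, report MAPE in %."""
    errors = []
    for t in range(min_train, len(y)):
        coef, intercept = fit_weighted_ridge(X[:t], y[:t], alpha, decay)
        pred = X[t] @ coef + intercept
        errors.append(abs((y[t] - pred) / y[t]))
    return 100.0 * float(np.mean(errors))

def jaccard(a, b):
    """Jaccard coefficient of two predictor sets: |intersection| / |union|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# Synthetic stand-in for 36 annual observations and 5 predictors (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(36, 5))
beta = np.array([2.0, -1.0, 0.5, 0.0, 1.0])
y_clean = 30.0 + X @ beta
y_noisy = y_clean + rng.normal(scale=1.5, size=36)

mape_noisy = expanding_window_mape(X, y_noisy, alpha=1.0)
mape_clean = expanding_window_mape(X, y_clean, alpha=1e-10)
print(f"Backtest MAPE on noisy synthetic yields: {mape_noisy:.2f}%")
```

Because each fold trains only on years preceding the forecast year, the scheme avoids leaking future information into the fit, which matters with a series as short as n = 36.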

Keywords:
crop yield prediction, medium-term forecast, Penza region, machine learning, Ridge regression, agrometeorological predictors, NDVI, MAPE, bootstrap validation
References

1. Shawon S.M. et al. Crop yield prediction using machine learning: An extensive and systematic literature review // Smart Agricultural Technology. – 2025. – Vol. 10. – Art. 100718. – DOI: https://doi.org/10.1016/j.atech.2024.100718.

2. Schauberger B., Jägermeyr J., Gornott C. A systematic review of local to regional yield forecasting approaches and frequently used data resources // European Journal of Agronomy. – 2020. – Vol. 120. – Art. 126153. – DOI: https://doi.org/10.1016/j.eja.2020.126153.

3. Jorvekar P.P., Wagh S.K., Prasad J.R. Predictive modeling of crop yields: a comparative analysis of regression techniques for agricultural yield prediction // Agricultural Engineering International: CIGR Journal. – 2024. – Vol. 26, No. 2. – P. 125–140.

4. Scherbakov A.S., Bogomazov S.V. Prognozirovanie urozhaynosti ozimoy pshenicy na osnove agroklimaticheskih pokazateley i vegetacionnogo indeksa NDVI v usloviyah Penzenskoy oblasti // Niva Povolzh'ya. – 2025. – № 3(75). – S. 10–11. – DOI: https://doi.org/10.36461/NP.2025.75.3.021.

5. Tindova M.G. Analiz dinamiki vyraschivaniya saharnoy svekly v RF // Ekonomiko-matematicheskie metody analiza deyatel'nosti predpriyatiy APK: materialy VII Mezhdunarodnoy nauchno-prakticheskoy konferencii. – Saratov, 2023. – S. 306–312.

6. Gur'yanova N.M., Mayorkina E.V. Prognozirovanie valovogo sbora maslichnyh kul'tur Penzenskoy oblasti // Surskiy vestnik. – 2020. – № 1(9). – S. 56–61.

7. Samandarzoda I.H. i dr. Sovremennoe sostoyanie i perspektivy razvitiya proizvodstva kartofelya v Penzenskoy oblasti // Niva Povolzh'ya. – 2023. – № 3(67). – s. 4002. – DOI: https://doi.org/10.36461/NP.2023.67.3.021.

8. Tishin M.E., Dubinin A.V. Metodologicheskiy podhod k prognozirovaniyu urozhaynosti sel'skohozyaystvennyh kul'tur na osnove Ridge-regressii (na primere Penzenskoy oblasti) // Surskiy vestnik. – 2026. – № 6. – (v pechati).

9. Hersbach H. et al. The ERA5 global reanalysis // Quarterly Journal of the Royal Meteorological Society. – 2020. – Vol. 146. – P. 1999–2049. – DOI: https://doi.org/10.1002/qj.3803.

10. Lewis C.D. Industrial and business forecasting methods. – London : Butterworth Scientific, 1982. – 143 p.

11. Velde M., Nisini L. Performance of the MARS-crop yield forecasting system for the European Union: Assessing accuracy, in-season, and year-to-year improvements from 1993 to 2015 // Agricultural Systems. – 2019. – Vol. 168. – P. 203–212. – DOI: https://doi.org/10.1016/j.agsy.2018.06.009.

12. Hastie T., Tibshirani R., Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. – 2nd ed. – New York : Springer, 2009. – 745 p. – DOI: https://doi.org/10.1007/978-0-387-84858-7.

13. Valenko D. et al. Agroclimatic Forecasting Under Degraded Sensor Data: A Robustness Benchmark of Machine-Learning Models // Applied Sciences. – 2025. – Vol. 16, No. 10. – Art. 5075. – DOI: https://doi.org/10.3390/app16105075.

14. Meroni M. et al. Yield forecasting with machine learning and small data: What gains for grains? // Agricultural and Forest Meteorology. – 2021. – Vol. 308–309. – Art. 108555. – P. 1–13. – DOI: https://doi.org/10.1016/j.agrformet.2021.108555.

15. Paudel D. et al. Machine learning for large-scale crop yield forecasting // Agricultural Systems. – 2021. – Vol. 187. – Art. 103016. – DOI: https://doi.org/10.1016/j.agsy.2020.103016.
