A Hybrid Approach to Tea Crop Yield Prediction Using Simulation Models and Machine Learning.
Dania Batool, Muhammad Shahbaz, Hafiz Shahzad Asif, Kamran Shaukat, Talha Mahboob Alam, Ibrahim A Hameed, Zeeshan Ramzan, Abdul Waheed, Hanan Aljuaid, Suhuai Luo. Published in: Plants (Basel, Switzerland) (2022)
Tea (Camellia sinensis L.) is one of the most highly consumed beverages globally after water. Several countries import large quantities of tea to meet domestic needs, so accurate and timely prediction of tea yield is critical. Previous studies used statistical, deep learning, and machine learning techniques for tea yield prediction, but crop simulation models have not yet been used. A simulation model therefore still needs to be calibrated for tea yield prediction and compared with these approaches across different data types. This study compares methods for tea yield prediction using the AquaCrop simulation model of the Food and Agriculture Organization (FAO) of the United Nations and machine learning techniques. We used weather, soil, crop, and agro-management data from 2016 to 2019, acquired from tea fields of the National Tea and High-Value Crop Research Institute (NTHRI), Pakistan, to calibrate the AquaCrop model and to train regression algorithms. In calibrating the AquaCrop model, we achieved a mean absolute error (MAE) of 0.45 t/ha, a mean squared error (MSE) of 0.23 (t/ha)², and a root mean square error (RMSE) of 0.48 t/ha. Among the ten regression models, the lowest errors were an MAE of 0.093 t/ha, MSE of 0.015 (t/ha)², and RMSE of 0.120 t/ha using 10-fold cross-validation, and an MAE of 0.123 t/ha, MSE of 0.024 (t/ha)², and RMSE of 0.154 t/ha using the XGBoost regressor with a train-test split. We concluded that the machine learning regression algorithm predicted yield more accurately than the simulation model while using less data. This study provides a technique to improve tea yield prediction by combining different data sources using a crop simulation model and machine learning algorithms.
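The three error metrics and the 10-fold cross-validation protocol named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the helper names (`mae`, `mse`, `rmse`, `k_fold_indices`, `cross_validated_mae`), the mean-of-training-set baseline predictor, and the yield values are all hypothetical and chosen only to show how the quantities are computed.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the yields (e.g. t/ha)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error, in squared yield units."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error: the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds over range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

def cross_validated_mae(y, k):
    """Average MAE across k folds for a baseline that predicts the training mean."""
    scores = []
    for train, test in k_fold_indices(len(y), k):
        baseline = sum(y[i] for i in train) / len(train)
        scores.append(mae([y[i] for i in test], [baseline] * len(test)))
    return sum(scores) / len(scores)

# Hypothetical yields in t/ha; these are NOT data from the study.
observed = [2.0, 2.5, 3.0, 3.5]
predicted = [2.1, 2.4, 3.2, 3.3]

print("MAE :", mae(observed, predicted))
print("MSE :", mse(observed, predicted))
print("RMSE:", rmse(observed, predicted))
print("4-fold baseline MAE:", cross_validated_mae(observed, 4))
```

Because RMSE is the square root of MSE, the reported calibration figures can be checked against each other: the square root of 0.23 is about 0.48. In practice a real regressor would replace the mean baseline, and a library splitter with shuffling would usually replace the contiguous folds used here.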