Estimation performance comparison of machine learning approaches and time series econometric models: evidence from the effect of sector-based energy consumption on CO₂ emissions in the USA.

Talat Ulussever, Serpil Kılıç Depren, Mustafa Tevfik Kartal, Özer Depren

Published in: Environmental science and pollution research international (2023)

Two separate families of analysis methods exist, and studies rely on different data frequencies. Against this background, this study examines the effect of method choice, data frequency, and sector-based energy consumption on carbon dioxide (CO₂) emissions by applying machine learning (ML) algorithms and time series econometric (TS) models side by side. The study focuses on the United States (USA), uses sector-based energy consumption indicators as explanatory variables, draws on monthly and yearly data from January 1973 to December 2021, estimates CO₂ emissions, and compares the estimation performance of the models. The empirical findings reveal that (i) the ML algorithms outperform the TS models based on R² and goodness-of-fit criteria; (ii) the estimation performance of the models increases with high-frequency (i.e., monthly) data; (iii) the ML algorithms perform much better when high-frequency data are used; (iv) some thresholds identify the effects of the sector-based energy consumption indicators on CO₂ emissions; (v) the electric power and transportation sectors are the most important sectors in the estimation of CO₂ emissions for monthly and yearly data, respectively. Hence, the study helps clarify the role of method choice, data frequency, and sector-based energy consumption in the estimation of CO₂ emissions. Based on the results, this study proposes that US policymakers should consider ML algorithms, use higher-frequency data, and include sector-based energy consumption indicators to obtain better estimates of CO₂ emissions.
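The abstract ranks models by R², so a minimal sketch of that comparison may help readers unfamiliar with the criterion. It assumes the standard definition of the coefficient of determination, R² = 1 - SS_res / SS_tot. The function name `r_squared`, the labels `model_a` and `model_b`, and every number below are hypothetical and chosen for illustration only; none of them come from the study, and the study's actual models and evaluation pipeline are not reproduced here.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: squared gaps between observations and predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: squared gaps between observations and their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the observations are constant")
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only; not data or results from the study.
actual = [10.0, 12.0, 14.0, 16.0]
model_a = [10.5, 11.5, 14.5, 15.5]  # hypothetical predictions from one model
model_b = [11.0, 11.0, 15.0, 15.0]  # hypothetical predictions from another model

scores = {
    "model_a": r_squared(actual, model_a),
    "model_b": r_squared(actual, model_b),
}
# The model with the higher R^2 explains more of the variance in the observations.
best = max(scores, key=scores.get)
print(scores, best)
```

In practice the same function would be applied to held-out observations for each candidate model and each data frequency, so that the scores are comparable across methods.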

Keyphrases