


Load forecasting model based on ARIMA and CART
WANG Diangang1, HUANG Lin1, CHANG Jian1, MEI Kejin2, and NIU Xinzheng2

1) State Grid Sichuan Electric Power Company Information and Communication Corporation, Chengdu 610015, Sichuan Province, P.R.China; 2) School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, Sichuan Province, P.R.China

computer application technology; time series; load forecasting; least squares method; autoregressive integrated moving average (ARIMA); classification and regression tree (CART)

DOI: 10.3724/SP.J.1249.2019.03245



Load forecasting for host resources is of great significance to operation and maintenance work. Traditional load forecasting methods usually fit the load data with a linear time series model. In actual operation, however, the load is affected by a complex internal and external environment that includes many nonlinear factors, so a linear time series model cannot adequately characterize the regularities in the load data. To improve model accuracy, this paper proposes decomposing the load information into a linear part and a nonlinear part, and combining the autoregressive integrated moving average (ARIMA) model with the classification and regression tree (CART) model for prediction. Specifically, an ARIMA model improved by the weighted least squares method predicts the linear part, a CART model optimized by boundary determination predicts the nonlinear part, and the two predictions are combined to obtain the overall forecast. Comparison experiments on a real load dataset show that the prediction accuracy of the proposed algorithm is more than 15% higher than that of the traditional method, and that the algorithm adapts well to outliers and to different time intervals.
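The linear-plus-nonlinear decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every name in it (`fit_wls_ar1`, `fit_stump`, `hybrid_forecast`, `decay`) is hypothetical. It rests on three assumptions that the abstract does not confirm: an AR(1) model on the first-differenced series stands in for ARIMA, exponentially decaying weights stand in for the paper's weighted least squares scheme, and a depth-1 regression tree on the previous residual stands in for CART, with the boundary-determination optimization omitted.

```python
def fit_wls_ar1(diffs, decay=0.9):
    """Weighted least squares fit of d[t] ~ phi * d[t-1].

    Assumption: weights decay exponentially with age, so recent points count more.
    """
    num = den = 0.0
    last = len(diffs) - 1
    for t in range(1, len(diffs)):
        w = decay ** (last - t)
        num += w * diffs[t - 1] * diffs[t]
        den += w * diffs[t - 1] ** 2
    return num / den if den else 0.0


def fit_stump(xs, ys):
    """Depth-1 regression tree: pick the split on x that minimises squared error."""
    mean = sum(ys) / len(ys)
    # Start from the no-split model (a single leaf predicting the mean).
    best = (sum((y - mean) ** 2 for y in ys), None, mean, mean)
    for thr in sorted(set(xs))[:-1]:  # the largest value cannot be a split point
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if sse < best[0]:
            best = (sse, thr, ml, mr)
    _, thr, ml, mr = best
    return lambda x: ml if thr is None or x <= thr else mr


def hybrid_forecast(series, decay=0.9):
    """One-step forecast = linear part (AR on differences) + tree on residuals."""
    if len(series) < 5:
        raise ValueError("need at least 5 observations")
    diffs = [b - a for a, b in zip(series, series[1:])]
    phi = fit_wls_ar1(diffs, decay)
    # In-sample one-step linear predictions for t = 2 .. n-1.
    linear = [series[t - 1] + phi * (series[t - 1] - series[t - 2])
              for t in range(2, len(series))]
    # What the linear model fails to explain is treated as the nonlinear part.
    residuals = [series[t] - lin for t, lin in zip(range(2, len(series)), linear)]
    tree = fit_stump(residuals[:-1], residuals[1:])
    next_linear = series[-1] + phi * diffs[-1]
    return next_linear + tree(residuals[-1])
```

On a purely linear ramp such as `list(range(1, 11))`, the residuals are all zero, so the tree contributes nothing and `hybrid_forecast` returns the linear extrapolation. A fuller version would presumably use a complete ARIMA fit and a multi-level tree over richer features.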
