Application of an optimized incremental learning algorithm to app usage prediction

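1)Beijing Institute of Technology, Zhuhai, Zhuhai 519000, Guangdong, China; 2)Faculty of Information Technology, Macau University of Science and Technology, Macao 999078, China; 3)Computer and Information Engineering College, Guizhou University of Commerce, Guiyang 550014, Guizhou, China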

pattern recognition; app usage prediction; clustering; incremental learning; big data

Application of an optimization algorithm based on incremental learning to app usage prediction
HAN Di(1,2), LI Wenting(3), WANG Qingjuan(1), and ZHOU Tianjian(1)

1)Beijing Institute of Technology, Zhuhai, Zhuhai 519000, Guangdong Province, P.R.China; 2)Faculty of Information Technology, Macau University of Science and Technology, Macao 999078, P.R.China; 3)Computer and Information Engineering College, Guizhou University of Commerce, Guiyang 550014, Guizhou Province, P.R.China

pattern recognition; app usage prediction; clustering; incremental learning; big data

DOI: 10.3724/SP.J.1249.2019.01043

Abstract

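As the number of apps on smartphones keeps growing, accurately locating the target app is becoming difficult. Existing algorithms that use historical user data to predict the next app to be used on a phone have two kinds of problems: first, some algorithms do not account for the steady growth of training data, so their prediction accuracy declines over time; second, others do account for incremental data but add the time needed to rebuild the model for that data, which increases the overall time cost. To reduce modeling time, this study proposes the Predictor prediction system, which uses an optimized incremental IkNN model to predict app usage for users. By learning the contextual relationships among app features, we design a cluster effective value (CEV) strategy and adopt a multidimensional feature method to improve classification accuracy and hence prediction accuracy. Experimental results show that the IkNN model with the CEV strategy has more stable prediction accuracy than the default IkNN model, and that its application model, Predictor, reduces modeling time while improving prediction accuracy.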

As the number of apps on smartphones grows, locating the target app accurately becomes increasingly difficult, so predicting the next app to be launched quickly and accurately is increasingly important. Existing algorithms that use historical user data to predict the next app have two kinds of problems. First, some algorithms do not account for the growth of training data over time, so their prediction accuracy declines over time. Second, although other algorithms take incremental data into account, they must rebuild the model when new data arrive, which greatly increases the overall time cost. To reduce the remodeling time, we use an incremental k-nearest neighbors (IkNN) model to implement a prediction system called Predictor. Using the IkNN model to predict the next app revealed a new problem: when the model is built from training data, classification accuracy decreases as the number of features of an app increases. After studying the relationships among the context features of an app, we design a cluster effective value (CEV) that compensates for the errors induced by multidimensional features and thus improves the prediction accuracy. The IkNN model with CEV achieves higher and more stable prediction accuracy than the model without CEV. Large-scale experiments show that Predictor reduces the remodeling time and improves the prediction accuracy.
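
As a rough illustration of why an incremental k-nearest neighbors model can avoid remodeling cost, the sketch below stores labeled context vectors and absorbs new usage records simply by appending them, so nothing is refit. The class name `IncrementalKNN`, the two-dimensional feature encoding (hour of day, previous app id), and the sample data are assumptions made for illustration only; they are not taken from the paper, and the CEV strategy is not reproduced because the abstract does not define it.

```python
import math
from collections import Counter


class IncrementalKNN:
    """Hypothetical minimal incremental kNN: learning is just appending samples."""

    def __init__(self, k=3):
        self.k = k
        self.samples = []  # list of (feature_vector, app_label)

    def partial_fit(self, features, label):
        # Incremental update: store the new record; no model is rebuilt.
        self.samples.append((tuple(features), label))

    def predict(self, features):
        if not self.samples:
            raise ValueError("no training samples yet")
        # Rank stored samples by Euclidean distance to the query, keep the k nearest.
        nearest = sorted(
            self.samples, key=lambda s: math.dist(s[0], features)
        )[: self.k]
        # Majority vote over the labels of the k nearest samples.
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]


# Assumed toy history: (hour of day, previous app id) -> next app launched.
model = IncrementalKNN(k=3)
history = [
    ((8.0, 1.0), "mail"),
    ((8.5, 1.0), "mail"),
    ((9.0, 2.0), "news"),
    ((21.0, 3.0), "video"),
    ((22.0, 3.0), "video"),
]
for x, y in history:
    model.partial_fit(x, y)

morning = model.predict((8.2, 1.0))

# A new usage record arrives later and is absorbed without retraining.
model.partial_fit((21.5, 3.0), "video")
evening = model.predict((21.4, 3.0))
print(morning, evening)
```

In this sketch the cost of an update is constant, while the cost of a prediction grows with the number of stored samples; a practical system would likely need an index or a pruning step, which is outside what the abstract describes.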
