Short text automatic scoring system based on BERT-BiLSTM model
XIA Linzhong, YE Jianfeng, LUO De'an, GUAN Mingxiang, LIU Jun, and CAO Xuemei

Engineering Applications of Artificial Intelligence Technology Laboratory, Shenzhen Institute of Information Technology, Shenzhen 518172, Guangdong Province, P. R. China

signal and information processing; natural language processing; BERT language model; short text automatic scoring; long short-term memory network; quadratic weighted kappa coefficient

DOI: 10.3724/SP.J.1249.2022.03349

Aiming at the problems of sparse features, polysemy, and limited contextual information in short text automatic scoring, a short text automatic scoring model based on BERT-BiLSTM (bidirectional encoder representations from transformers - bidirectional long short-term memory) is proposed. First, the BERT (bidirectional encoder representations from transformers) language model is pre-trained on a large-scale corpus to learn the semantic features of general language. The pre-trained BERT model is then fine-tuned on the short-text dataset of the downstream task to learn the semantic features of short texts and the task-specific meanings of keywords. Next, a BiLSTM (bidirectional long short-term memory) layer captures deeper contextual dependencies, and the resulting feature vectors are finally fed into a Softmax regression model for automatic scoring. Experimental results show that, compared with benchmark models including CNN (convolutional neural networks), CharCNN (character-level CNN), LSTM (long short-term memory), and BERT, the BERT-BiLSTM short text automatic scoring model achieves the best average quadratic weighted kappa coefficient.
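The pipeline described above (pre-trained BERT encoder, BiLSTM over its token representations, Softmax regression over score levels) can be sketched as follows. This is a minimal illustration in PyTorch with the HuggingFace transformers library, not the authors' implementation; the checkpoint name, hidden size, and number of score levels are assumptions made only for demonstration.

```python
# Minimal sketch of a BERT-BiLSTM-Softmax scorer (illustrative, not the paper's code).
import torch
import torch.nn as nn
from transformers import BertModel

class BertBiLSTMScorer(nn.Module):
    def __init__(self, num_scores=4, lstm_hidden=256, bert_name="bert-base-uncased"):
        super().__init__()
        # Pre-trained BERT supplies contextual token embeddings
        # (general-language semantics learned from a large corpus).
        self.bert = BertModel.from_pretrained(bert_name)
        # BiLSTM reads the token embeddings in both directions to capture
        # deeper contextual dependencies across the short answer.
        self.bilstm = nn.LSTM(
            input_size=self.bert.config.hidden_size,
            hidden_size=lstm_hidden,
            batch_first=True,
            bidirectional=True,
        )
        # Softmax regression: a linear layer mapping the sequence
        # representation to a distribution over score levels.
        self.classifier = nn.Linear(2 * lstm_hidden, num_scores)

    def forward(self, input_ids, attention_mask):
        token_repr = self.bert(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        lstm_out, _ = self.bilstm(token_repr)
        # Use the last time step of the BiLSTM output as the text vector.
        logits = self.classifier(lstm_out[:, -1, :])
        # During training one would typically feed the raw logits to
        # nn.CrossEntropyLoss; softmax is applied here to yield score probabilities.
        return torch.softmax(logits, dim=-1)
```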
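The quadratic weighted kappa used for evaluation penalizes disagreements between predicted and human scores in proportion to the square of their distance on the score scale. A hedged sketch of how this metric can be computed with scikit-learn's cohen_kappa_score is shown below; the score lists are made-up examples, not data from the paper.

```python
# Quadratic weighted kappa via scikit-learn (illustrative labels only).
from sklearn.metrics import cohen_kappa_score

human_scores = [0, 1, 2, 2, 3, 1]   # rater-assigned score levels (made up)
model_scores = [0, 1, 2, 3, 3, 1]   # model-predicted score levels (made up)

qwk = cohen_kappa_score(human_scores, model_scores, weights="quadratic")
print(f"quadratic weighted kappa = {qwk:.4f}")
```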