Construction of an education domain sentiment lexicon for massive open online course reviews

Abstract: Existing sentiment analysis tools, when applied to reviews of massive open online courses (MOOC), suffer from insufficient coverage of domain-specific vocabulary and from contextual semantic shift. To address these problems, a method based on multi-feature fusion is proposed for constructing an education domain sentiment lexicon (EDSL). A candidate-word expansion tool combining a cosine similarity algorithm with a polarity orientation pointwise mutual information (POPMI) algorithm was developed; it expands potential domain sentiment words from two angles, semantic similarity in vector space and statistical co-occurrence association between words. A bidirectional encoder representations from transformers (BERT) model was fine-tuned on the education-domain corpus built in this study, yielding a sentiment polarity classifier with deep semantic perception that automatically annotates the sentiment orientation of candidate words. A fusion strategy then integrates general sentiment resources with domain-specific vocabulary to form a highly adaptable EDSL. Experimental results on a real online course review dataset show that, compared with mainstream general sentiment lexicons, the lexicon improves markedly on all four indicators: accuracy, precision, recall and F1 score. EDSL can provide effective affective computing support for intelligent teaching quality assessment and is significant for advancing data-driven teaching improvement.
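The two expansion signals named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact POPMI formula, so the polarity score below is an assumption: it follows the familiar semantic-orientation pattern of summing PMI with positive seed words and subtracting PMI with negative seed words. All function names, the seed lists, and the toy reviews are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def count_cooccurrence(docs):
    """Count document frequency per word and per unordered word pair."""
    doc_freq = Counter()
    pair_freq = Counter()
    for tokens in docs:
        unique = sorted(set(tokens))
        doc_freq.update(unique)
        pair_freq.update(frozenset(pair) for pair in combinations(unique, 2))
    return doc_freq, pair_freq


def pmi(word_a, word_b, doc_freq, pair_freq, n_docs):
    """Pointwise mutual information from document-level co-occurrence counts."""
    joint = pair_freq.get(frozenset((word_a, word_b)), 0)
    if joint == 0:
        # Convention of this sketch only: pairs that never co-occur contribute 0.
        return 0.0
    return math.log2(joint * n_docs / (doc_freq[word_a] * doc_freq[word_b]))


def polarity_orientation(word, pos_seeds, neg_seeds, doc_freq, pair_freq, n_docs):
    """Assumed POPMI-style score: PMI with positive seeds minus PMI with negative seeds."""
    pos = sum(pmi(word, s, doc_freq, pair_freq, n_docs) for s in pos_seeds)
    neg = sum(pmi(word, s, doc_freq, pair_freq, n_docs) for s in neg_seeds)
    return pos - neg


# Toy tokenized reviews, invented purely for illustration.
docs = [
    ["lecture", "clear", "good"],
    ["lecture", "clear", "excellent"],
    ["video", "laggy", "bad"],
    ["video", "laggy", "poor"],
    ["lecture", "good"],
]
pos_seeds = ["good", "excellent"]
neg_seeds = ["bad", "poor"]

doc_freq, pair_freq = count_cooccurrence(docs)
for candidate in ("clear", "laggy"):
    score = polarity_orientation(
        candidate, pos_seeds, neg_seeds, doc_freq, pair_freq, len(docs)
    )
    print(candidate, round(score, 3))
```

In this sketch a positive score suggests a candidate leans positive and a negative score suggests the opposite. In the pipeline the abstract describes, cosine similarity over word embeddings would propose candidates close to known sentiment words, and the fine-tuned classifier, not this score alone, would assign the final polarity label.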

     
