Research on a Semi-automatic Generation Method for Medical Exercise Answer Explanations Based on the Pre-trained Model

Sun Yueping¹,
Wang Juan¹,
Dong Liangguang²,
Liu Yan¹,
Yang Li¹,
Li Jiao¹,
Hou Li¹

1. Institute of Medical Information, Chinese Academy of Medical Sciences/Peking Union Medical College;
2. People’s Medical Publishing House Co., Ltd.

Corresponding author: Hou Li, hou.li@imicams.ac.cn

DOI: 10.12201/bmr.202602.00048

Disclaimer: Papers posted on this preprint system are intended solely for the exchange and sharing of the latest research findings. They have not been peer reviewed, so applying them directly to guide clinical practice is not recommended.

Abstract: Purpose/Significance: Traditional manual methods of writing explanations for medical exercises are inefficient, costly, and hard to trace. This study explores a semi-automated approach based on a pre-trained language model to improve both the efficiency and the quality of answer explanation generation. Method/Process: We propose a hybrid intelligence-augmentation framework built on the MC-BERT model. The framework first automates question structure recognition, knowledge point extraction, and draft explanation generation. A subsequent manual verification step then strictly controls the accuracy and normative quality of the content, yielding a traceable reference explanation corpus. Result/Conclusion: Evaluation on a dedicated medical exercise dataset shows that the framework achieves considerably higher explanation recommendation precision than both the traditional Latent Semantic Indexing (LSI) model and pre-trained baseline models. The results indicate that the method significantly improves generation efficiency and reduces the cost of manual intervention, while keeping the explanations consistent with, and traceable to, the medical curriculum and textbook knowledge system. The proposed reference explanation recommendation framework offers a feasible path toward intelligent medical education.

Keywords: Pre-trained model; Medical exercise answer explanation; QNLI; Semi-automatic generation; Explanation recommendation

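Submitted: 2026-02-11

Copyright: The authors retain sole copyright in this paper; the preprint system holds only the right to archive it permanently. Reuse without permission is not allowed.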

No.	Submission date	ID
1	2025-10-14	10.12201/bmr.202602.00048V1


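Citation format

Sun Yueping, Wang Juan, Dong Liangguang, Liu Yan, Yang Li, Li Jiao, Hou Li. Research on a Semi-automatic Generation Method for Medical Exercise Answer Explanations Based on the Pre-trained Model. 2026. biomedRxiv.202602.00048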
