renhuiling, lixiaoying, wangweijie, wangxu, zhangying. Research on Chinese electronic medical record entity mapping method by fusing similarity algorithm and pre-trained model. 2023. biomedRxiv.202305.00015
Research on Chinese electronic medical record entity mapping method by fusing similarity algorithm and pre-trained model
Corresponding author: renhuiling, ren.huiling@imicams.ac.cn
DOI: 10.12201/bmr.202305.00015
Abstract: Purpose/Significance: To fully explore and utilize the entity resources in Chinese electronic medical records, this study investigates combinations of algorithms and models that suit large-scale entity mapping and do not require manually constructed rule features. Method/Process: Using a self-annotated standard dataset of Chinese electronic medical records, a similarity algorithm and a pre-trained model are fused and applied to the candidate entity generation and entity disambiguation stages of entity mapping, respectively. The performance of the different similarity algorithms and pre-trained models selected in this paper is compared and analyzed in the process. Results/Conclusion: A method that improves the mapping of drug-type entities is proposed. The combination of the Jaccard similarity algorithm and the BERT pre-trained model is ultimately selected; it achieves more than 90% accuracy and 99% recall on the entity mapping task and can efficiently map entities in massive volumes of Chinese electronic medical records.
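The candidate entity generation stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how strings are tokenized, so comparing sets of individual characters is an assumption, and the names `jaccard`, `generate_candidates`, `top_k`, and the sample drug vocabulary are all hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| over the character sets of two strings.

    Character-level sets are an assumption made for this sketch.
    """
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Both strings are empty: define the similarity as 0.0 to avoid dividing by zero.
        return 0.0
    return len(set_a & set_b) / len(union)


def generate_candidates(mention: str, standard_names: list[str], top_k: int = 3):
    """Return up to top_k (name, score) pairs with a non-zero score, best first."""
    scored = [(name, jaccard(mention, name)) for name in standard_names]
    scored = [pair for pair in scored if pair[1] > 0.0]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical standard vocabulary and a mention as it might appear in a record.
    vocabulary = ["布洛芬", "阿莫西林", "阿司匹林"]
    for name, score in generate_candidates("阿司匹林肠溶片", vocabulary, top_k=2):
        print(name, round(score, 3))
```

In the pipeline the abstract describes, a shortlist like this would then go to the entity disambiguation stage, where a BERT-based model picks the final standard entity. That second stage is left out here because the abstract gives no details of how the model is used.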
Key words: Entity mapping; Entity standardization; Similarity algorithm; Electronic medical records
Submit time: 17 May 2023
Copyright: The copyright holder for this preprint is the author/funder, who has granted biomedRxiv a license to display the preprint in perpetuity.
chenjieqing, zhangfeng. Named Entity Recognition in Chinese Electronic Medical Records Using Knowledge Graph Construction. 2023. doi: 10.12201/bmr.202312.00011
Deng Jiale, Hu Zhensheng, Lian Wanmin, Hua Yunpeng, Zhou Yi. Research on entity recognition of liver cancer electronic medical records based on RoBERTa-CRF. 2023. doi: 10.12201/bmr.202303.00027
wuxuehong. A method of recognizing entities from Chinese Electronic Medical Record based on domain word vector combined with word attributes reasoning. 2021. doi: 10.12201/bmr.202109.00016
chenjianqiu, huangxiaofang. Joint extraction of Chinese EMR entity relationship based on bert. 2022. doi: 10.12201/bmr.202206.00003
xiaoxiaoxia. Research on named entity recognition of Chinese medical records based on BERT-BiLSTM-CRF with Chinese radicals. 2023. doi: 10.12201/bmr.202303.00004
Deng Lan, Du Tongzhou. An Efficient, Secure and Multi-keyword Search Scheme on Encrypted Electronic Medical Records. 2021. doi: 10.12201/bmr.202105.00008
xie jia qi. Leveraging Pre-trained Language Model for Consumer Health Question Classification. 2021. doi: 10.12201/bmr.202101.00017
HU Haiyang, ZHAO Congpu, Ma Lian, JIANG Huizhen, ZHANG Jing, ZHU Weiguo. Attention Mechanism And Dilated Convolution Neural Networks for Named Entity Recognition. 2021. doi: 10.12201/bmr.202102.00004
shenrongrong, xiashuaishuai, yanjunfeng. Review on Research of Named Entity Recognition in Chinese Medicine. 2022. doi: 10.12201/bmr.202207.00038
ZHAO Jia-Qi, WANG Xiao-Feng, FAN Yu-Yu, ZHANG Wei, WANG Hui-Xuan, LI Jin-Shan. Research on the Quality and Countermeasures of Electronic Medical Record Data. 2020. doi: 10.12201/bmr.202011.00008
Version 1: submitted 2023-02-20, number bmr.202305.00015V1