
Dynamic Text categorization of search results for medical class recognition in real world evidence studies in the Chinese language

Publication details

Resource type:

Indexed in: EI

Affiliations: [a] R&D Information China, AstraZeneca, 199 Liangjing Road, Pudong, Shanghai, 201203, China [b] Guangdong Provincial Hospital of Chinese Medicine, 111 Dade Road, Guangzhou 510120, China [c] IBM China Company, China
Source:

Keywords: Artificial intelligence; Data mining; Electronic health records; Health services research; Natural language processing

Abstract:
Classifying clinical terms from electronic medical record (EMR) systems is critical for real world evidence (RWE) research. Yet the task is challenging, especially in languages other than English. Clinical research institutes require a cost-effective method to address this challenge. We proposed a software pipeline with two components: a feature generator that gathers descriptive words for the terms by text-segmenting the search results from two search engines, and a learning mechanism that uses machine learning algorithms for classification. Models were trained with training sets of different sizes to determine effectiveness. Models were compared using 10-fold cross-validation or a separately supplied testing set. We applied our pipeline to a Chinese medication term set extracted from a clinical system, and also to a data set of standard medication names. A term-vs.-word frequency matrix was generated based on the Google search results of the term sets. Most models tasked with classifying whether a medication belonged to Western or Chinese medicine achieved high accuracy, especially with a radial basis function (RBF) network. The performance of models trained with training sets of different sizes was not significantly different. When the same approach was applied to the information gathered from another Chinese language search engine (Baidu), better performance was achieved. The results of the other experiments conducted on the medication name set also demonstrate a significant improvement from baseline. Dynamic text categorization with machine learning can be applied to classify clinical terms based on information retrieved from search engines in RWE studies. © 2017 ACM.
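
The two-stage pipeline described in the abstract (a term-vs.-word frequency matrix, followed by a classifier evaluated with k-fold cross-validation) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `SNIPPETS` words are invented toy data standing in for text-segmented search results, the function names are hypothetical, and a multinomial naive Bayes classifier is swapped in for the RBF network and other models the record mentions, purely to keep the example dependency-free.

```python
import math
from collections import Counter

# Hypothetical toy data: words assumed to be already segmented from the
# search-result snippets of each medication term (not data from the paper).
SNIPPETS = {
    "阿司匹林": ["片", "抗血小板", "解热", "镇痛", "片"],
    "布洛芬": ["胶囊", "解热", "镇痛", "抗炎"],
    "二甲双胍": ["片", "降糖", "糖尿病"],
    "黄芪": ["补气", "中药", "饮片"],
    "当归": ["补血", "中药", "饮片"],
    "甘草": ["中药", "调和", "饮片", "补气"],
}
LABELS = {
    "阿司匹林": "western", "布洛芬": "western", "二甲双胍": "western",
    "黄芪": "chinese", "当归": "chinese", "甘草": "chinese",
}


def build_matrix(snippets):
    """Return (vocab, rows); rows[term] is a word-count vector over vocab."""
    vocab = sorted({w for words in snippets.values() for w in words})
    rows = {}
    for term, words in snippets.items():
        counts = Counter(words)
        rows[term] = [counts[w] for w in vocab]
    return vocab, rows


def train_nb(rows, labels, terms):
    """Fit multinomial naive Bayes with add-one smoothing on the given terms."""
    n_words = len(next(iter(rows.values())))
    model = {}
    for c in sorted({labels[t] for t in terms}):
        members = [t for t in terms if labels[t] == c]
        totals = [sum(rows[t][j] for t in members) for j in range(n_words)]
        denom = sum(totals) + n_words
        log_prior = math.log(len(members) / len(terms))
        log_like = [math.log((x + 1) / denom) for x in totals]
        model[c] = (log_prior, log_like)
    return model


def predict(model, vector):
    """Return the class with the highest log-posterior score."""
    def score(c):
        log_prior, log_like = model[c]
        return log_prior + sum(v * ll for v, ll in zip(vector, log_like))
    return max(model, key=score)


def cross_validate(rows, labels, k):
    """k-fold cross-validation accuracy; fold i holds out every k-th term."""
    terms = sorted(rows)
    correct = 0
    for i in range(k):
        test = terms[i::k]
        train = [t for t in terms if t not in test]
        model = train_nb(rows, labels, train)
        correct += sum(predict(model, rows[t]) == labels[t] for t in test)
    return correct / len(terms)


vocab, rows = build_matrix(SNIPPETS)
accuracy = cross_validate(rows, LABELS, k=3)
print(f"{len(rows)} terms x {len(vocab)} words, 3-fold accuracy: {accuracy:.2f}")
```

With a realistic term set, `k=10` would match the 10-fold setup named in the abstract; the toy set is too small for that, so three folds are used here.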

Language:
First author:
First author affiliation: [a] R&D Information China, AstraZeneca, 199 Liangjing Road, Pudong, Shanghai, 201203, China
Corresponding author:
Recommended citation (GB/T 7714):
APA:
MLA:

