Biomedical Evidence Engineering for Data-Driven Discovery.
Academic Article
Overview
abstract
MOTIVATION: With the rapid development of precision medicine, a large amount of health data (such as electronic health records, gene sequencing, medical images, etc.) has been produced. It encourages more and more interest in data-driven insight discovery from these data. A reasonable way to verify the derived insights is by checking evidence from biomedical literature. However, manual verification is inefficient and not scalable. Therefore, an intelligent technique is necessary to solve this problem. RESULTS: This paper introduces a framework for biomedical evidence engineering, addressing this problem more effectively. The framework consists of a biomedical literature retrieval module and an evidence extraction module. The retrieval module ensembles several methods and achieves state-of-the-art performance in biomedical literature retrieval. A BERT-based evidence extraction model is proposed to extract evidence from literature in response to queries. Moreover, we create a dataset with 1 million examples of biomedical evidence, 10,000 of which are manually annotated. AVAILABILITY: Datasets are available at https://github.com/SendongZhao.