
    Entity linking for biomedical literature

    Author
    Zheng, Jin; Howsmon, Daniel; Zhang, Boliang; Hahn, Juergen; McGuinness, Deborah; Hendler, Jim; Ji, Heng
    Date Issued
    2015-05-20
    URI
    https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/1472-6947-15-S1-S4; https://hdl.handle.net/20.500.13015/4485
    Abstract
    Background: The Entity Linking (EL) task links entity mentions in an unstructured document to entities in a knowledge base. Although the problem is well studied in news and social media, it has received little attention in the life science domain. One outcome of tackling the EL problem in the life sciences is to enable scientists to build computational models of biological processes more efficiently. However, simply applying a news-trained entity linker produces inadequate results.
    Methods: Since existing supervised approaches require a large amount of manually labeled training data, which is currently unavailable for the life science domain, we propose a novel unsupervised collective inference approach to link entities from unstructured full texts of biomedical literature to 300 ontologies. The approach leverages the rich semantic information and structures in ontologies for similarity computation and entity ranking.
    Results: Without using any manual annotation, our approach significantly outperforms a state-of-the-art supervised EL method (9% absolute gain in linking accuracy). Furthermore, the state-of-the-art supervised EL method requires 15,000 manually annotated entity mentions for training. These promising results establish a benchmark for the EL task in the life science domain. We also provide in-depth analysis and discussion of both the challenges and the opportunities in automatic knowledge enrichment for scientific literature.
    Conclusions: In this paper, we propose a novel unsupervised collective inference approach to address the EL problem in a new domain. We show that our unsupervised approach is able to outperform a current state-of-the-art supervised approach that has been trained with a large amount of manually labeled data. Life science is an underrepresented domain for applying EL techniques. By providing a small benchmark data set and identifying opportunities, we hope to stimulate discussions across natural language processing and bioinformatics and motivate others to develop techniques for this largely untapped domain.
    Publisher
    BMC Medical Informatics and Decision Making
    Collections
    • Tetherless World Publications
