    Knowledge base construction from scientific literature

    Author
    Wang, Han
    View/Open
    177779_Wang_rpi_0185E_10978.pdf (10.63Mb)
    Other Contributors
    Fox, Peter A.; Hendler, James A.; Ji, Heng; Stephan, Eric; Lewis, Daniel
    Date Issued
    2016-12
    Subject
    Multidisciplinary science
    Degree
    PhD
    Terms of Use
    This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.
    URI
    https://hdl.handle.net/20.500.13015/1818
    Abstract
    Knowledge Bases (KBs) have become a functional utility as a repository of information for both humans and software agents to seek confirmed facts about the world. With the wide-ranging application of KBs, automatically constructing either generic KBs or domain-specific KBs using information extracted from multiple sources such as web pages, reports, and research papers has grown into an interesting task for both academia and industry.
    This dissertation presents SciKB, an end-to-end Knowledge Base Construction system, which takes in a collection of research articles within a certain scientific domain and outputs a domain-specific KB. The resultant KB contains fact triples extracted from the input documents as well as hierarchical clusters of the entities and relations involved in the facts. Each cluster aggregates entities or relations with similar semantic meanings, and the hierarchies serve as an implicit schema of the KB.
    SciKB adopts an open information extraction approach to extract fact triples from the input documents, then jointly learns the distributed representations of the involved entities and relations in an unsupervised fashion, and finally utilizes the obtained representations to organize the entities and relations into hierarchical clusters. Experiments are conducted to evaluate each component of the SciKB pipeline and the results demonstrate its effectiveness in two scientific domains: Biomedical Science and Earth Science.
    Description
    December 2016; School of Science
    Department
    Multidisciplinary Science Program
    Publisher
    Rensselaer Polytechnic Institute, Troy, NY
    Relationships
    Rensselaer Theses and Dissertations Online Collection
    Access
    Users may download and share copies with attribution in accordance with a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License. No commercial use or derivatives are permitted without the explicit approval of the author.
    Collections
    • RPI Theses Online (Complete)
    • RPI Theses Open Access
