    Moving from news to social media: unsupervised knowledge enrichment for event extraction

    Author
    Li, Hao
    View/Open
    176953_Li_rpi_0185E_10804.pdf (2.120Mb)
    Other Contributors
    Ji, Heng; Wallace, William A., 1935-; McGuinness, Deborah L.; Adali, Sibel; Liu, Li (Emily)
    Date Issued
    2015-12
    Subject
    Computer science
    Degree
    PhD
    Terms of Use
    This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.
    URI
    https://hdl.handle.net/20.500.13015/1597
    Abstract
    Event extraction is an important task in Information Extraction (IE), a subfield of Natural Language Processing (NLP). It has been applied to different genres (e.g., news articles, web blogs, and tweets) and various applications (e.g., question answering and information retrieval). The goal of event extraction is to extract structured information about events of interest from unstructured documents. It would be extremely valuable to detect and extract such events automatically and effectively.

    However, identifying and classifying events is challenging for three main reasons. The first is the lack of training data across genres, so traditional supervised systems cannot be easily adapted to new genres. For example, we found that event extraction performed notably worse on web blogs than on newswire texts. Adapting an existing event extractor to another genre usually requires additional annotations.

    The second challenge comes from informal genres such as social media. The context of a social media message is usually short and incomplete (e.g., each tweet is limited to 140 characters). Lacking context, a single tweet usually cannot provide a complete picture of the corresponding events. The third challenge is the informal nature of social media. Social media messages are written in an informal style, which causes NLP tools designed for more formal genres to perform poorly.

    This thesis focuses on tackling these challenges for event extraction in various genres, where the inter-dependencies of various components and subtasks can be found. The main theme of this thesis is to incorporate within-genre knowledge and cross-genre knowledge as two types of background knowledge to boost event extractor performance, instead of conducting event extraction solely on each single document (e.g., a news article sentence or a social media message). We use three genres (news articles, tweets, and Facebook messages) as case studies to demonstrate the effectiveness and efficiency of knowledge enrichment techniques for event extraction tasks.
    Description
    December 2015; School of Science
    Department
    Dept. of Computer Science
    Publisher
    Rensselaer Polytechnic Institute, Troy, NY
    Relationships
    Rensselaer Theses and Dissertations Online Collection
    Access
    Restricted to current Rensselaer faculty, staff and students. Access inquiries may be directed to the Rensselaer Libraries.
    Collections
    • RPI Theses Online (Complete)
