Show simple item record

dc.rights.licenseRestricted to current Rensselaer faculty, staff and students. Access inquiries may be directed to the Rensselaer Libraries.
dc.contributorJi, Heng
dc.contributorWallace, William A., 1935-
dc.contributorMcGuinness, Deborah L.
dc.contributorAdali, Sibel
dc.contributorLiu, Li (Emily)
dc.contributor.authorLi, Hao
dc.date.accessioned2021-11-03T08:32:02Z
dc.date.available2021-11-03T08:32:02Z
dc.date.created2016-02-11T08:09:01Z
dc.date.issued2015-12
dc.identifier.urihttps://hdl.handle.net/20.500.13015/1597
dc.descriptionDecember 2015
dc.descriptionSchool of Science
dc.description.abstractEvent extraction is an important task in Information Extraction (IE), a subfield of Natural Language Processing (NLP). It has been applied to different genres (e.g., news articles, web blogs, tweets) and various applications (e.g., question answering, information retrieval). The goal of event extraction is to extract structured information about events of interest from unstructured documents. Detecting and extracting such events automatically and effectively would be extremely valuable.
dc.description.abstractHowever, identifying and classifying events is challenging for three main reasons. The first challenge is the lack of training data across genres, so traditional supervised systems cannot be easily adapted to new genres. For example, we found that event extraction performed notably worse on web blogs than on newswire texts. Adapting an existing event extractor to another genre usually requires additional annotations.
dc.description.abstractThe second challenge comes from informal genres such as social media. The context of a social media message is usually short and incomplete (e.g., each tweet is limited to 140 characters). Lacking context, a single tweet usually cannot provide a complete picture of the corresponding events. The third challenge is the informal nature of social media. Social media messages are written in an informal style, which causes NLP tools designed for more formal genres to perform poorly.
dc.description.abstractThis thesis focuses on tackling these challenges for event extraction in various genres, where inter-dependencies among components and subtasks can be found. The main theme of this thesis is to incorporate within-genre knowledge and cross-genre knowledge as two types of background knowledge to boost event extractor performance, instead of conducting event extraction solely on each single document (e.g., a news article sentence or a social media message). We use three genres (news articles, tweets, and Facebook messages) as case studies to demonstrate the effectiveness and efficiency of knowledge enrichment techniques for event extraction tasks.
dc.language.isoENG
dc.publisherRensselaer Polytechnic Institute, Troy, NY
dc.relation.ispartofRensselaer Theses and Dissertations Online Collection
dc.subjectComputer science
dc.titleMoving from news to social media: unsupervised knowledge enrichment for event extraction
dc.typeElectronic thesis
dc.typeThesis
dc.digitool.pid176952
dc.digitool.pid176953
dc.digitool.pid176954
dc.rights.holderThis electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.
dc.description.degreePhD
dc.relation.departmentDept. of Computer Science

