Joint information extraction
Authors
Li, Qi
Issue Date
2015-05
Type
Electronic thesis
Thesis
Language
ENG
Keywords
Computer science
Abstract
Focusing on entity mention extraction, relation extraction, and event extraction, the main part of this thesis presents a novel sentence-level joint information extraction (IE) framework based on structured prediction and inexact search. In this framework, the three types of IE components are extracted simultaneously to alleviate the error propagation problem, and various global features can be used to produce more accurate and coherent results. Experimental results on the ACE corpora show that our joint model achieves state-of-the-art performance on each stage of the extraction. We then go beyond the sentence level and improve extraction in the cross-document setting: we use an integer linear programming (ILP) formulation to conduct cross-document inference, so that many spurious results can be filtered out based on the inter-dependencies among facts from different places. Finally, to investigate cross-lingual dependencies, we present a CRF-based joint bilingual name tagger for parallel corpora and demonstrate the application of this method to enhancing name-aware machine translation.
Description
May 2015
School of Science
Publisher
Rensselaer Polytechnic Institute, Troy, NY