dc.contributor.author | Huang, Lifu | |
dc.contributor.author | May, Jonathan | |
dc.contributor.author | Pan, Xiaoman | |
dc.contributor.author | Ji, Heng | |
dc.contributor.author | Ren, Xiang | |
dc.contributor.author | Han, Jiawei | |
dc.contributor.author | Zhao, Lin | |
dc.contributor.author | Hendler, James A. | |
dc.date.accessioned | 2023-01-25T20:57:16Z | |
dc.date.available | 2023-01-25T20:57:16Z | |
dc.date.issued | 2017 | |
dc.identifier.citation | Lifu Huang, Jonathan May, Xiaoman Pan, Heng Ji, Xiang Ren, Jiawei Han, Lin Zhao, and James A. Hendler. Liberal Entity Extraction: Rapid Construction of Fine-Grained Entity Typing Systems. Big Data. Mar 2017. 19-31. http://doi.org/10.1089/big.2017.0012 | en_US |
dc.identifier.uri | https://www.liebertpub.com/doi/10.1089/big.2017.0012 | |
dc.identifier.uri | http://doi.org/10.1089/big.2017.0012 | |
dc.identifier.uri | https://hdl.handle.net/20.500.13015/6408 | |
dc.description.abstract | Automatically recognizing and typing entities in natural language text without prior knowledge (e.g., predefined entity types) is a major challenge in processing such text. Most existing entity typing systems are limited to certain domains, genres, and languages. In this article, we propose a novel unsupervised entity-typing framework that combines symbolic and distributional semantics. We start by learning three types of representations for each entity mention: a general semantic representation, a specific context representation, and a knowledge representation based on knowledge bases. We then develop a novel joint hierarchical clustering and linking algorithm that types all mentions using these representations. This framework does not rely on any annotated data, predefined typing schema, or handcrafted features; therefore, it can be quickly adapted to a new domain, genre, and/or language. Experiments on two genres (news and discussion forum) show performance comparable to that of state-of-the-art supervised typing systems trained on a large amount of labeled data. Results on various languages (English, Chinese, Japanese, Hausa, and Yoruba) and domains (general and biomedical) demonstrate the portability of our framework. | en_US |
dc.publisher | Mary Ann Liebert, Inc. | en_US |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 United States | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/us/ | * |
dc.title | Liberal Entity Extraction: Rapid Construction of Fine-Grained Entity Typing Systems | en_US |
dc.type | Article | en_US |