dc.rights.license | Restricted to current Rensselaer faculty, staff and students. Access inquiries may be directed to the Rensselaer Libraries. | |
dc.contributor | Ji, Heng | |
dc.contributor | Fox, Peter A. | |
dc.contributor | Hendler, James A. | |
dc.contributor | Lin, Chin-Yew | |
dc.contributor | Sun, Yizhou | |
dc.contributor.author | Huang, Hongzhao | |
dc.date.accessioned | 2021-11-03T08:27:09Z | |
dc.date.available | 2021-11-03T08:27:09Z | |
dc.date.created | 2015-06-09T13:58:54Z | |
dc.date.issued | 2015-05 | |
dc.identifier.uri | https://hdl.handle.net/20.500.13015/1496 | |
dc.description | May 2015 | |
dc.description | School of Science | |
dc.description.abstract | Microblogging, a new type of online information-sharing platform built on short messages of up to 140 characters, has grown quickly and received increasing attention in recent years. A microblogging platform (e.g., Twitter) enables both individuals and organizations to disseminate information, from current affairs to breaking news, in a timely fashion, which makes it a valuable knowledge source with very fresh information. For example, during Hurricane Irene in 2011, updates from users living in New York City and transportation and evacuation posts from the government were very useful for people keeping track of the disaster. Therefore, Natural Language Processing (NLP) research on this new genre is needed to assist knowledge mining and discovery. | |
dc.description.abstract | To achieve our goals, we propose to leverage and model heterogeneous information networks (HINs), in contrast to most existing NLP approaches on traditional genres (e.g., news), which explore only a single type of information (e.g., text). Microblogging contains heterogeneous types of information, from social network structures to cross-genre linkages, forming rich HINs. By designing effective approaches to model both unstructured texts and structured HINs, we can incorporate additional evidence from HIN structures beyond texts. In this thesis, we present different approaches to construct HINs from cross-genre, cross-source, and cross-type information by incorporating the existing clean social relations, as well as performing deep content analysis with some well-developed NLP approaches. We also present various effective approaches, including unsupervised propagation, semi-supervised graph regularization, supervised learning-to-rank, and deep neural networks, to model HINs for ranking, classification, and similarity measurement. Our experimental results demonstrate that heterogeneous information network analysis approaches are also powerful in the field of NLP. | |
dc.description.abstract | Unlike semi-structured knowledge bases (e.g., Wikipedia) and traditional news, microblogs tend to be noisy, short, and informal. Information implicitness is also more prominent and pervasive in microblogging. These characteristics pose unique challenges for people reading and understanding microblogs, as well as for many knowledge mining and discovery tasks. To alleviate these problems, in this thesis we propose to filter noisy and uninformative information, enrich short microblogs with background knowledge from knowledge bases such as Wikipedia, and resolve informal and implicit information to its regular referents. | |
dc.language.iso | ENG | |
dc.publisher | Rensselaer Polytechnic Institute, Troy, NY | |
dc.relation.ispartof | Rensselaer Theses and Dissertations Online Collection | |
dc.subject | Computer science | |
dc.title | Modeling heterogeneous networks for information ranking, enrichment and resolution on microblogs | |
dc.type | Electronic thesis | |
dc.type | Thesis | |
dc.digitool.pid | 176074 | |
dc.digitool.pid | 176075 | |
dc.digitool.pid | 176076 | |
dc.rights.holder | This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author. | |
dc.description.degree | PhD | |
dc.relation.department | Dept. of Computer Science | |