
    Modeling heterogeneous networks for information ranking, enrichment and resolution on microblogs

    Author
    Huang, Hongzhao
    View/Open
    176075_Huang_rpi_0185E_10650.pdf (1.961Mb)
    Other Contributors
    Ji, Heng; Fox, Peter A.; Hendler, James A.; Lin, Chin-Yew; Sun, Yizhou
    Date Issued
    2015-05
    Subject
    Computer science
    Degree
    PhD
    Terms of Use
    This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.
    URI
    https://hdl.handle.net/20.500.13015/1496
    Abstract
    Microblogging, a new type of online information-sharing platform built on short messages of up to 140 characters, has grown quickly and received increasing attention in recent years. A microblogging platform (e.g., Twitter) enables both individuals and organizations to disseminate information, from current affairs to breaking news, in a timely fashion, which makes it a valuable source of very fresh information. For example, during Hurricane Irene in 2011, updates from users living in New York City and transportation and evacuation posts from the government were very useful for people keeping track of the disaster. Natural Language Processing (NLP) research on this new genre is therefore needed to assist knowledge mining and discovery.
    To achieve our goals, we propose to leverage and model heterogeneous information networks (HINs), in contrast to most existing NLP approaches on traditional genres (e.g., news), which explore only a single type of information (e.g., text). Microblogging contains heterogeneous types of information, from social network structures to cross-genre linkages, forming rich HINs. By designing effective approaches to model both unstructured texts and structured HINs, we can incorporate additional evidence from HIN structures beyond texts. In this thesis, we present different approaches to construct HINs from cross-genre, cross-source, and cross-type information by incorporating existing clean social relations and by performing deep content analysis with some well-developed NLP approaches. We also present various approaches, including unsupervised propagation, semi-supervised graph regularization, supervised learning-to-rank, and deep neural networks, to model HINs for ranking, classification, and similarity measurement. Our experimental results demonstrate that heterogeneous information network analysis approaches are also powerful in the field of NLP.
    Unlike semi-structured knowledge bases (e.g., Wikipedia) and traditional news, microblogs tend to be noisy, short, and informal, and implicit information is more prominent and pervasive in them. These characteristics pose unique challenges for readers trying to understand microblogs, as well as for many knowledge mining and discovery tasks. To alleviate these problems, in this thesis we propose to filter noisy and uninformative information, enrich short microblogs with background knowledge from knowledge bases such as Wikipedia, and resolve informal and implicit information to its regular referents.
    Description
    May 2015; School of Science
    Department
    Dept. of Computer Science
    Publisher
    Rensselaer Polytechnic Institute, Troy, NY
    Relationships
    Rensselaer Theses and Dissertations Online Collection
    Access
    Restricted to current Rensselaer faculty, staff and students. Access inquiries may be directed to the Rensselaer Libraries.
    Collections
    • RPI Theses Online (Complete)
