Robust news veracity detection
Author
Horne, Benjamin D.
Other Contributors
Adali, Sibel; Patterson, Stacy; Xia, Lirong; Gordon, Tamar; Nevo, Dorit
Date Issued
2020-05
Subject
Computer science
Degree
PhD
Terms of Use
This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author. Attribution-NonCommercial-NoDerivs 3.0 United States
Abstract
The spread of false and misleading news online can have offline impacts. These impacts range widely, including health, public opinion, and safety. While low veracity information is not necessarily a new occurrence, the scale at which it is produced and disseminated is. This production and dissemination scale is partially due to the low barrier to entry into the information ecosystem. Today, anyone can spread information by creating a blog or website which appears to be a proper news source. These seemingly credible sources of information can then obtain wide attention due to social networks and the engagement-based algorithms that curate the media feeds in these networks. In turn, this lack of gate-keeping opens the media ecosystem up for malicious actors to spread targeted disinformation.
Due to the scale of low veracity news, the main question asked in this thesis is: Can we automatically detect low veracity news articles? Specifically, in this thesis, we develop and examine two broad approaches to automatically detecting low veracity news articles. First, we learn from news article text. In this approach, we focus on creating features of high veracity and low veracity news articles through text-based feature engineering. This feature engineering process starts out as an exploration of fact-checked news article data, with the goal of creating features that are interpretable by the eventual human end-user. After an understanding of the feature space is gained, we transfer the methodology to higher-level concepts of news veracity, such as reliability and bias, in order to use large scale data in machine learning tasks. We then test the robustness of these machine learning models in a series of concept drift tests and adversarial attack tests.
Second, we learn from news source behavior and relationships. In this approach, we explore content sharing behavior by both mainstream and alternative news sources. Specifically, we construct content sharing networks from our unstructured news article data and perform an extensive set of qualitative analyses to better understand how content sharing is used maliciously by unreliable news sources. Using this gained understanding and rich network structure, we employ both standard network science metrics and network embedding methods to utilize content sharing networks in the task of automatically detecting low veracity news articles. We then demonstrate how the two approaches proposed in this thesis can be used together to automatically detect low veracity news.
Description
May 2020; School of Science
Department
Dept. of Computer Science
Publisher
Rensselaer Polytechnic Institute, Troy, NY
Relationships
Rensselaer Theses and Dissertations Online Collection
Access
CC BY-NC-ND. Users may download and share copies with attribution in accordance with a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License. No commercial use or derivatives are permitted without the explicit approval of the author.