Getting the Dirt on Big Data
Author
Waterman, K Krasnow; Hendler, James A.
Date Issued
2013-09-10
Terms of Use
Attribution-NonCommercial-NoDerivs 3.0 United States
Full Citation
K. Krasnow Waterman and J. Hendler, Getting the Dirt on Big Data, Big Data, 1(3), 2013.
URI
https://www.liebertpub.com/doi/abs/10.1089/big.2013.0026?journalCode=big; https://doi.org/10.1089/big.2013.0026; https://hdl.handle.net/20.500.13015/6432
Abstract
“Dirty data” – data that is incomplete or incorrect – presents a significant challenge in producing trustworthy data analytics. The old technology adage “garbage in, garbage out” applies to this problem. If you analyze erroneous information, you produce erroneous results. For big data analytics, this means misunderstanding of the broad brushstrokes of the data, its statistical themes and trends. The primary industry approach to solving this challenge has been to propose to correct all of this dirty data, projects that cost millions of dollars and take many years, resulting in an endless game of catch-up. We believe that a better solution is to accept that data is flawed and use related data to refine the analytic results. Using Linked Data to augment and visualize big data, the success of this “broad” data approach can be seen in relatively quick and easy-to-produce examples.
Publisher
Mary Ann Liebert, Inc.