Getting the Dirt on Big Data
Author: Waterman, K Krasnow; Hendler, James A.
Full Citation: K. Krasnow Waterman and J. Hendler, Getting the Dirt on Big Data, Big Data, 1(3), 2013.
URI: https://www.liebertpub.com/doi/abs/10.1089/big.2013.0026?journalCode=big; https://doi.org/10.1089/big.2013.0026; https://hdl.handle.net/20.500.13015/6432
Abstract: “Dirty data” (data that is incomplete or incorrect) presents a significant challenge in producing trustworthy data analytics. The old technology adage “garbage in, garbage out” applies to this problem: if you analyze erroneous information, you produce erroneous results. For big data analytics, this means misunderstanding the broad brushstrokes of the data: its statistical themes and trends. The primary industry approach to this challenge has been to propose correcting all of the dirty data, in projects that cost millions of dollars and take many years, resulting in an endless game of catch-up. We believe that a better solution is to accept that data is flawed and to use related data to refine the analytic results. The success of this “broad” data approach, which uses Linked Data to augment and visualize big data, can be seen in examples that are relatively quick and easy to produce.
Publisher: Mary Ann Liebert, Inc.