The Semantic Data Dictionary Approach to Data Annotation & Integration
Authors
Rashid, Sabbir
Chastain, Katherine
Stingone, Jeanette
McGuinness, Deborah L.
McCusker, Jamie
Issue Date
2017-10-20
Type
Article
Language
en_US
Abstract
A standard approach to describing datasets is through the use of data dictionaries: tables which contain information about the content, description, and format of each data variable. While this approach is helpful for human readability, it is difficult for a machine to understand the meaning behind the data. Consequently, tasks involving the combination of data from multiple sources, such as data integration or schema merging, are not easily automated. In response, we present the Semantic Data Dictionary (SDD) specification, which allows for extension and integration of data from multiple domains using a common metadata standard. We have developed a structure based on the Semanticscience Integrated Ontology’s (SIO) high-level, domain-agnostic conceptualization of scientific data, which is then annotated with more specific terminology from domain-relevant ontologies. The SDD format will make the specification, curation, and search of data much easier than direct search of data dictionaries, both through terminology alignment and through the use of “compositional” classes for column descriptions, which removes the need for a 1:1 mapping from column to class.
Full Citation
Rashid, S. M., Chastain, K., Stingone, J. A., McGuinness, D. L., & McCusker, J. P. (2017). The Semantic Data Dictionary Approach to Data Annotation & Integration. SemSci@ISWC, 2017.
Publisher
CEUR Workshop Proceedings