
dc.contributor.author: Johnson, Matt
dc.contributor.author: Ravi, Meenu
dc.contributor.author: Pinheiro, Paulo
dc.contributor.author: Stingone, Jeanette
dc.contributor.author: McGuinness, Deborah L.
dc.date.accessioned: 2022-02-15T17:29:40Z
dc.date.available: 2022-02-15T17:29:40Z
dc.date.issued: 2020-08-01
dc.identifier.other: 18
dc.identifier.uri: https://ehp.niehs.nih.gov/doi/abs/10.1289/isee.2020.virtual.P-0206
dc.description.abstract: The NIEHS-supported Human Health and Exposure Analysis Resource (HHEAR) Data Center maintains a public-use data repository to promote reuse of environmental health data generated by the HHEAR program. Creating and maintaining this repository requires integrating information from a wide variety of epidemiologic studies. We have developed the Human-Aware Data Acquisition Framework to enable this complex integration, supporting harmonization across multiple studies and enabling meaningful search of, and access to, the data deposited in the HHEAR Data Repository. To integrate data from a new study, investigators engage in an initial, time-consuming effort to link study data to the HHEAR ontology, a controlled vocabulary of environmental and public health terms. This is accomplished by generating a semantic data dictionary (SDD) from the data dictionaries and codebooks provided by HHEAR study investigators. Originally, this was done manually by an expert in both epidemiological terminology and ontological modeling. To make these tools more accessible to environmental health scientists who lack formal ontological training, we have developed an SDD-Editor that simplifies the ontology modeling process. The SDD-Editor reuses elements common to epidemiologic data dictionaries and spreadsheet software, while integrating features needed to form semantic links between public health concepts and existing ontologies. The SDD-Editor suggests potential concept matches for study variables within the SDD, using natural language processing to capture the semantic similarity between data dictionary descriptions and ontology class descriptions. If no suitable suggestion exists, investigators can search for ontology terms using a search engine powered by BioPortal. Once the SDD is complete, a validator checks that it has the correct format and that all classes are valid. By automating parts of the ontology modeling process, the SDD-Editor greatly facilitates the dynamic integration of HHEAR environmental health studies into a single repository, benefiting the scientific community.
dc.relation.uri: https://tw.rpi.edu/project/human-health-exposure-analysis-repository-hhear
dc.subject: Human Health Exposure Analysis Repository (HHEAR)
dc.title: A Semi-Automated Approach to Data Harmonization Across Environmental Health Studies
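
The abstract says the SDD-Editor suggests ontology classes for study variables by comparing data dictionary descriptions with ontology class descriptions, but it does not name the NLP method used. The following is a minimal sketch of that idea only, assuming a plain bag-of-words cosine similarity as a stand-in for whatever the SDD-Editor actually does. The function names, the `ex:` class identifiers, and the class descriptions are all hypothetical and are not taken from the HHEAR ontology.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a description and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest_classes(variable_description, ontology_classes, top_n=3):
    """Rank ontology classes by description similarity to one study variable.

    Returns up to top_n (class_id, score) pairs, best first, dropping
    classes that share no words with the variable description.
    """
    query = Counter(tokenize(variable_description))
    scored = [
        (cosine(query, Counter(tokenize(desc))), class_id)
        for class_id, desc in ontology_classes.items()
    ]
    # Sort by descending score, then by class id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(cid, round(score, 3)) for score, cid in scored[:top_n] if score > 0]


# Hypothetical class identifiers and descriptions, invented for illustration.
ONTOLOGY_CLASSES = {
    "ex:BloodLeadLevel": "concentration of lead measured in a blood sample",
    "ex:MaternalAge": "age of the mother at the time of delivery",
    "ex:BirthWeight": "weight of the infant measured at birth",
}

suggestions = suggest_classes("Lead concentration in maternal blood", ONTOLOGY_CLASSES)
print(suggestions)
```

Word overlap misses synonyms (here "maternal" does not match "mother"), which is one reason a real tool would likely use a richer similarity measure and, as the abstract describes, offer a manual ontology search as a fallback.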


