
    A Semi-Automated Approach to Data Harmonization Across Environmental Health Studies

    Author
    Johnson, Matt; Ravi, Meenu; Pinheiro, Paulo; Stingone, Jeanette; McGuinness, Deborah
    Date Issued
    2020-08-01
    Subject
    Human Health Exposure Analysis Repository (HHEAR)
    URI
    https://ehp.niehs.nih.gov/doi/abs/10.1289/isee.2020.virtual.P-0206
    Abstract
    The NIEHS-supported Human Health and Exposure Analysis Resource (HHEAR) Data Center maintains a public-use data repository to promote reuse of environmental health data generated by the HHEAR program. Creating and maintaining this repository requires integrating information from a wide variety of epidemiologic studies. We have developed the Human Aware Data Acquisition Framework to enable this complex integration, supporting harmonization across multiple studies and enabling meaningful search of, and access to, the data deposited in the HHEAR Data Repository. To integrate data from a new study, investigators engage in an initial, time-consuming effort to link study data to the HHEAR ontology, a controlled vocabulary of environmental and public health terms. This is accomplished by generating a semantic data dictionary (SDD) from the data dictionaries and codebooks provided by HHEAR study investigators. Originally, this was done manually by an expert in both epidemiological terminology and ontological modeling. To make these tools more accessible to environmental health scientists who lack formal ontological training, we have developed an SDD-Editor that simplifies the ontology modeling process. The SDD-Editor reuses elements common to epidemiologic data dictionaries and spreadsheet software, while integrating features needed to form semantic links between public health concepts and existing ontologies. The SDD-Editor suggests potential concept matches for study variables within the SDD, using natural language processing to capture the semantic similarity between data dictionary descriptions and ontology class descriptions. If no suitable suggestion exists, investigators can search for ontology terms using a search engine powered by BioPortal. Once the SDD is complete, a validator checks that it has the correct format and that all of its classes are valid.
By automating parts of the ontology modeling process, the SDD-Editor greatly facilitates the dynamic integration of HHEAR environmental health studies into a single repository, benefiting the scientific community.
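
The suggestion step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which natural language processing method the SDD-Editor uses, so the bag-of-words cosine similarity below is a stand-in chosen for simplicity, not the tool's actual method. The function names (`tokenize`, `cosine_similarity`, `suggest_classes`) and the `ex:` class identifiers and descriptions are hypothetical and are not taken from the HHEAR ontology.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a description and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest_classes(variable_description, ontology_classes, top_n=3):
    """Rank ontology classes by similarity to a variable description.

    ontology_classes maps a class identifier to its text description.
    Returns up to top_n (class_id, score) pairs with a non-zero score,
    best match first.
    """
    query = Counter(tokenize(variable_description))
    scored = [
        (class_id, cosine_similarity(query, Counter(tokenize(description))))
        for class_id, description in ontology_classes.items()
    ]
    scored = [pair for pair in scored if pair[1] > 0]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical ontology classes; identifiers and descriptions are made up
# for illustration and do not come from the HHEAR ontology.
EXAMPLE_CLASSES = {
    "ex:Age": "age of the participant in years",
    "ex:BodyMassIndex": "body mass index of the participant",
    "ex:BloodLeadLevel": "concentration of lead measured in blood",
}

if __name__ == "__main__":
    description = "lead concentration in child blood"
    for class_id, score in suggest_classes(description, EXAMPLE_CLASSES):
        print(f"{class_id}\t{score:.2f}")
```

In this sketch a data dictionary description is compared against every class description and the closest classes are returned as suggestions. A variable with no overlapping terms yields an empty list, which corresponds to the case where an investigator would fall back to a manual ontology search.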
    Relationships
    https://tw.rpi.edu/project/human-health-exposure-analysis-repository-hhear
    Collections
    • Tetherless World Publications
