dc.rights.license: Restricted to current Rensselaer faculty, staff and students. Access inquiries may be directed to the Rensselaer Libraries.
dc.contributor: Fox, Peter A.
dc.contributor: McGuinness, Deborah L.
dc.contributor: Berman, Francine Denise, 1951-
dc.contributor: Newberg, Heidi
dc.contributor.author: Fu, Linyun
dc.date.accessioned: 2021-11-03T08:33:08Z
dc.date.available: 2021-11-03T08:33:08Z
dc.date.created: 2016-02-26T09:54:48Z
dc.date.issued: 2015-12
dc.identifier.uri: https://hdl.handle.net/20.500.13015/1632
dc.description: December 2015
dc.description: School of Science
dc.description.abstract: Provenance is critical for readers of research publications to interpret important content correctly, and it enables them to evaluate the credibility of the reported results by examining the software used, the sources of and changes to the data, and the responsible agents. It would also enable the reader to reproduce the scientific conclusions by following or adapting the process that led to the reported results. However, creating proper provenance for research publications can be costly for authors who lack the necessary knowledge and technical support. First, it requires knowing which logical provenance information to capture for the report-creation process, which imposes extra learning overhead on the authors. Second, it may also require technical knowledge of the physical configuration of the platform on which the program(s) execute, such as the operating system or even the computer hardware, in order to obtain provenance information that is useful for reproducing and validating the content. This usually entails even more learning overhead. Even if the authors already know what provenance should be recorded and how to record it, the recording work usually distracts them from writing the publication, so they are insufficiently motivated to do it.
dc.description.abstract: Existing frameworks and systems for capturing the provenance of computational experiments are either tailored specifically to scientific workflow systems or based on a model that is not detailed enough to reproduce the published results. Authors who are not familiar with any workflow system must learn to use one before they can create provenance detailed enough for reproducibility.
dc.description.abstract: In this thesis, we specify a paradigm for preparing research publications, based on the invocation of operations, that overcomes many of the challenges of provenance capture described above. The paradigm is to create publications on a portable, provenance-aware platform that transparently captures the proper provenance information. The PROV-PUB-O ontology was created to capture proper provenance knowledge for authoring processes based on invocations of operations, and to describe and locate the published results within research publications. To evaluate the usability of PROV-PUB-O, we created the Ontology Usability Scale (OUS), which is the first set of metrics for ontology usability evaluation.
dc.description.abstract: We elaborate a provenance capture framework that enables this paradigm and fulfills the following requirements. First, the captured provenance must be stored in such a way that the reproducibility of the reported results can be decided and the "false paths" in the provenance graph that caused a given result to be irreproducible can be found. Second, the authoring platform must use a front end that supports a variety of the programming languages and modes that researchers actually use to create results, so that the learning overhead is kept to a minimum. Third, provenance capture must require no or minimal involvement from the users. A prototype platform is implemented to demonstrate the specified framework. Chapter 4 of the 2014 U.S. National Climate Assessment report (NCA2014) is our use case, and the prototype is shown to capture the reproduction-enabling provenance of the tables and figures in this chapter.
dc.language.iso: ENG
dc.publisher: Rensselaer Polytechnic Institute, Troy, NY
dc.relation.ispartof: Rensselaer Theses and Dissertations Online Collection
dc.subject: Computer science
dc.title: Automatic provenance capturing for research publications
dc.type: Electronic thesis
dc.type: Thesis
dc.digitool.pid: 177105
dc.digitool.pid: 177108
dc.digitool.pid: 177110
dc.rights.holder: This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.
dc.description.degree: PhD
dc.relation.department: Dept. of Computer Science

