Persona aware strategy towards building a comprehensive semantic data layer: a resource for marginalized US STEM graduate students

Keshan, Neha
Electronic thesis
Computer science
The long-standing problem of “what comes next,” i.e., the problem of finding an ideal career path, has been a barrier to attracting STEM students from marginalized communities. The Survey of Earned Doctorates, conducted by the NSF, shows disproportionate completion rates between marginalized and non-marginalized communities: an approximately 75% increase in science and engineering doctorates, compared with an approximately 5% increase among marginalized community members in the US in 2016. This dissertation aims to understand and mitigate the factors affecting marginalized students in STEM. In this context, marginalized communities are defined as groups of students excluded based on ethnicity, race, language, gender identity, age, physical ability, and/or immigration status. Even though marginalized students might have resources similar to those of their non-marginalized peers (including advisors, institute support programs, and various online resources), they still do not receive the assistance they need to overcome their unique challenges. Apart from social barriers, marginalized students struggle to access the siloed, non-communicative, and incomplete reference points that speak to their experiences. I believe that the challenge of harmonizing these resources can be addressed by combining the social science methodologies currently in use (surveys and interviews) with computer science techniques (web science and artificial intelligence tools) to provide a more concrete and trustworthy solution. This approach mitigates the lack of accessible reference points for marginalized US STEM graduate students by drawing on semantics and on the concept that underpins social machines: “no one knows everything, but everyone knows something.” Because these resources are constantly updated, they are prone to combinatorial explosion and much less amenable to simple integration processes. The use of semantics via an ontology and a knowledge graph therefore becomes vital.
This work focuses on building a persona-aware semantic data layer that uses knowledge graphs to harmonize structured, semi-structured, and unstructured resources. Personas provide a mechanism for evaluating the sufficiency of the generated competency questions. These questions are then used to evaluate the ontologies and knowledge graphs without building an interface. The persona-aware strategy helps ensure that such data layers can be exploited to serve multiple systems and applications. This work provides a pathway to developing a first-of-its-kind, flexible, comprehensive resource through which marginalized US graduate students can access the specific or general information they require and connect with users who are concerned for their welfare or interested in recruiting them. This will help facilitate the entry of students from marginalized communities into more scientific fields, diversifying the graduate pool and leading to more innovative, inclusive, and collaborative scientific progress.
School of Science
Rensselaer Polytechnic Institute, Troy, NY