An experimental study measuring human annotator categorization agreement on commonsense sentences

Authors
Santos, Henrique
Kejriwal, Mayank
Mulvehill, Alice
Forbush, Gretchen
McGuinness, Deborah L.
Issue Date
2021-06-18
Keywords
Machine Common Sense (MCS); Multi-modal Open World Grounded Learning and Inference (MOWGLI)
Abstract
Developing agents capable of commonsense reasoning is an important goal in Artificial Intelligence (AI) research. Because commonsense is broadly defined, a computational theory that can formally categorize the various kinds of commonsense knowledge is critical for enabling fundamental research in this area. In a recent book, Gordon and Hobbs described such a categorization, which they argued to be reasonably complete. However, the theory's reliability has not been independently evaluated through human annotator judgments. This paper describes such an experimental study, in which annotations were elicited across a subset of eight foundational categories proposed in the original Gordon-Hobbs theory. To avoid bias, annotations were elicited on 200 sentences from a commonsense benchmark dataset developed independently by an external organization. The results show that, while humans agree on relatively concrete categories like time and space, they disagree on more abstract concepts. The implications of these findings are briefly discussed.
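The abstract does not state which agreement statistic was used. As a hedged illustration only, the Python sketch below shows one common way to quantify pairwise agreement on a single Gordon-Hobbs category (here, "time") using Cohen's kappa over invented annotator labels; the paper's raw annotations are not reproduced, and with more than two annotators a multi-rater statistic such as Fleiss' kappa would typically be used instead.

    # Illustrative only: hypothetical binary labels from two annotators,
    # indicating whether each of ten sentences involves the "time" category.
    # Neither the labels nor the choice of Cohen's kappa comes from the paper.
    from sklearn.metrics import cohen_kappa_score

    annotator_a = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
    annotator_b = [1, 0, 1, 0, 0, 0, 1, 0, 1, 0]

    # Cohen's kappa corrects raw percent agreement for chance agreement.
    kappa = cohen_kappa_score(annotator_a, annotator_b)
    print(f"Cohen's kappa for the 'time' category: {kappa:.2f}")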
Publisher
Experimental Results
Relationships
https://tw.rpi.edu/project/machine-common-sense-mcs-multi-modal-open-world-grounded-learning-and-inference-mowgli