Beyond Labels: Empowering Human with Natural Language Explanations through a Novel Active-Learning Architecture

Authors
Yao, Bingsheng
Jindal, Ishan
Popa, Lucian
Katsis, Yannis
Ghosh, Sayan
He, Lihong
Lu, Yuxuan
Srivastava, Shashank
Hendler, James A.
Wang, Dakuo
ORCID
Other Contributors
Issue Date
2023-05-23
Keywords
Degree
Terms of Use
Attribution-NonCommercial-NoDerivs 3.0 United States
Full Citation
Yao, B., Jindal, I., Popa, L., Katsis, Y., Ghosh, S., He, L., … Wang, D. (2023). Beyond Labels: Empowering Human with Natural Language Explanations through a Novel Active-Learning Architecture. ArXiv [Cs.CL]. Retrieved from http://arxiv.org/abs/2305.12710
Bingsheng Yao, Ishan Jindal, Lucian Popa, Yannis Katsis, Sayan Ghosh, Lihong He, Yuxuan Lu, Shashank Srivastava, Yunyao Li, James A. Hendler, & Dakuo Wang (2023). Beyond Labels: Empowering Human Annotators with Natural Language Explanations through a Novel Active-Learning Architecture. In Findings of the Association for Computational Linguistics: EMNLP 2023.
Abstract
Data annotation is a costly task; thus, researchers have proposed low-resource learning techniques such as Active-Learning (AL) to support human annotators. Yet existing AL work focuses only on the label and overlooks the natural language explanation of a data point, even though people in real-world settings (e.g., doctors) often need both the labels and the corresponding explanations at the same time. This work proposes a novel AL architecture to support and reduce human annotation of both labels and explanations in low-resource scenarios. Our AL architecture incorporates an explanation-generation model that explicitly generates natural language explanations for the prediction model and to assist human decision-making in the real world. For our AL framework, we design a data diversity-based AL data selection strategy that leverages the explanation annotations. Automated AL simulation evaluations demonstrate that our data selection strategy consistently outperforms a traditional data diversity-based strategy; furthermore, human evaluation demonstrates that humans prefer our generated explanations to those of the SOTA explanation-generation system.
Description
Department
Publisher
Association for Computational Linguistics
Relationships
Access