Neural models for causal information extraction using domain adaptation
Authors
Saha, Anik
Issue Date
2023-08
Type
Electronic thesis
Thesis
Language
en_US
Keywords
Electrical engineering
Alternative Title
Abstract
The task of identifying causally related events or actions in text is an important step toward building a knowledge graph of events and consequences from the vast amount of unlabeled documents available in the digital age. As there has been very little work on this problem, we perform a comparative study of different neural models on four data sets with causal labels. We train sequence tagging and span-based models to extract causally related events from their textual descriptions. Our experiments confirm that large pre-trained language models such as BERT can be fine-tuned on labeled data sets to outperform traditional deep learning models such as LSTMs. Our results show that span-based models are better at classifying spans of words as cause or effect than sequence tagging models using the same pre-trained BERT weights. The length of the spans labeled as causes and effects in a data set also has a significant impact on the advantage of a span-based model. Toward the goal of developing a general-purpose model for extracting causal knowledge from text, we focus on the unsupervised domain adaptation (UDA) scenario, in which a model trained on a source domain is adapted to a new domain without any labels. Several studies on UDA for text classification have shown the effectiveness of adversarial domain adaptation. We investigate the effect of integrating linguistic information into the adversarial domain adaptation framework for the causal information extraction task. We show the advantage of leveraging word dependency relationships when adapting word-based neural models such as LSTMs to new domains. We also find that guiding the adversarial domain classifier with the task classifier's output is more effective than requiring the encoder outputs to have similar distributions in the two domains.
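To make the adversarial domain adaptation setup concrete, the following is a minimal sketch in PyTorch of a gradient reversal layer and a domain-adversarial model head, assuming token-level representations from an encoder such as an LSTM or BERT. The class names, dimensions, and mean-pooled domain input are illustrative assumptions, not the thesis implementation.

```python
# Minimal sketch of adversarial domain adaptation with a gradient reversal
# layer (illustrative only; names and dimensions are assumed, not taken
# from the thesis code).
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; multiplies the gradient by -lambda on backward."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed gradient for x, no gradient for lambd.
        return -ctx.lambd * grad_output, None


class DomainAdversarialExtractor(nn.Module):
    """Task head for cause/effect labels plus an adversarial domain classifier."""

    def __init__(self, hidden_dim, num_labels, lambd=1.0):
        super().__init__()
        self.task_head = nn.Linear(hidden_dim, num_labels)   # cause/effect/other tags
        self.domain_head = nn.Linear(hidden_dim, 2)           # source vs. target domain
        self.lambd = lambd

    def forward(self, token_reps):
        # token_reps: (batch, seq_len, hidden_dim) from an LSTM or BERT encoder.
        task_logits = self.task_head(token_reps)
        # Gradients from the domain loss are reversed before reaching the
        # encoder, pushing it toward domain-invariant representations.
        reversed_reps = GradReverse.apply(token_reps, self.lambd)
        domain_logits = self.domain_head(reversed_reps.mean(dim=1))
        return task_logits, domain_logits
```

The abstract's finding that guiding the domain classifier with the task classifier's output works better than matching encoder output distributions would, in a sketch like this, correspond to feeding a gradient-reversed version of task_logits rather than the encoder representations into domain_head.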
Description
August 2023
School of Engineering
Full Citation
Publisher
Rensselaer Polytechnic Institute, Troy, NY