Natural language understanding with semantic parsing

Liang, Zhicheng
Electronic thesis
Computer science
As a branch of natural language processing (NLP), natural language understanding (NLU) involves transforming natural language into a structured, machine-readable format. Typical target formats vary with the domain problem being solved, and such transformations are usually challenging because of the semantic gap between natural language text inputs and target structured outputs. This dissertation focuses on how semantic parsing, by providing an intermediate representation between text and desired outputs, can aid representative NLU problems that demonstrate key computational aspects of understanding.

Compared to single-relation question answering over a knowledge graph (KG), answering complex questions that involve multiple KG relations imposes the additional difficulty of generating a corresponding SPARQL query, because the implicit but critical graph structure representing the intent of a natural language query (NLQ) must be predicted. We thus start with a domain-specific semantic parsing approach that extracts a semantic query graph as an intermediate representation between the NLQ and SPARQL, which improves overall KG question answering (KGQA) performance as measured by accuracy and F1 score on the answer set.

Next, we explore the use of a more general-purpose semantic parse, abstract meaning representation (AMR), in a task that requires quantitative and logical reasoning capabilities: math word problem solving, which aims to derive math expressions for solving problems described in natural language. We present a novel Graph-to-Sequence/Graph-to-Tree learning approach that leverages AMR to construct a graph modeling the semantic relationships among the quantities appearing in the problem description. An edge-aware graph neural network encoder aggregates features from the AMR-based graph to enrich the text representations used by a sequence/tree decoder to generate math expressions. We show that our approach, with AMR as an intermediate representation between text and expressions, achieves higher logical-form and solution accuracy on the derived expressions than baselines that do not use a semantic parse.

Finally, we identify some limitations of AMR for modeling world state in text-adventure games, a setting that requires inducing a knowledge graph from observations written in natural language. We first present a path-pattern-based approach that leverages AMR for this knowledge graph prediction task, then enrich AMR with VerbNet semantics to collect more informative path patterns for relation prediction. We conduct a detailed comparison study of our AMR- and AMR+VerbNet-driven approaches against representative baselines to assess the value of a semantic parse for this task. As an extension, to introduce commonsense knowledge as another source of AMR enrichment, we further present a preliminary study on mining commonsense knowledge from dictionary term definitions using part-of-speech tag patterns and existing triple scoring models.

Through in-depth evaluation of the value of intermediate semantic representations for deriving target outputs from text inputs, we believe the techniques proposed and discussed in this work can provide guidance for other NLU problems where the semantic gap between text inputs and target outputs is the main challenge.
December 2022
School of Science
Full Citation: Rensselaer Polytechnic Institute, Troy, NY