Human-level natural language understanding: false progress and real challenges

Bignoli, Perrin G.
Issue Date
Electronic thesis
Cognitive science
The field of Natural Language Processing (NLP) studies how utterances in human languages can be understood and generated. Language is typically considered to have three intertwined levels of structure that interact to create meaning: syntax, semantics, and pragmatics. Not only is a large amount of language-specific information required to learn syntactic patterns and semantic constructions, but an even vaster amount of commonsense and world knowledge is required to interpret language at the pragmatic level. Because creating a system that is fully capable of processing language at a human level of proficiency is extremely daunting, researchers have mostly focused on developing constrained models of specific aspects of NLP, such as part-of-speech tagging or named entity resolution. However, a number of more comprehensive NLP platforms, produced by companies such as IBM, Google, Microsoft, and Apple, have taken on the task of providing fully functional, natural-language-driven services.
Based on the level of success reported by researchers, as well as on the advertising campaigns of companies that deal with NLP technology, it would seem that the NLP problem is well on its way to being solved by the techniques the field currently employs. Knowledge-lean, statistical, machine-learning-like approaches now dominate the NLP research arena. These approaches have largely displaced earlier knowledge-rich, expert-system-like approaches because of their greater flexibility and robustness. Because NLP is widely considered to be an "AI-hard" problem, resolving the challenges of building a human-level NLP system would have huge implications for both Cognitive Science and AI in general.
Unfortunately, much of the present optimism in the NLP field is essentially hyperbole. An in-depth analysis of knowledge-lean NLP techniques will demonstrate that they are significantly underpowered for handling tasks such as basic syntactic parsing, let alone full natural language understanding. To make matters worse, the NLP field has been led down this path in large part by the kinds of evaluation methods used to gauge success on individual NLP tasks. Once these metrics are analyzed, it is straightforward to see how they can create confusion and produce misleading results. In light of these findings, this document makes the case for modifying the current knowledge-lean NLP culture with techniques derived from Cognitive Science.
December 2013
School of Humanities, Arts, and Social Sciences
Full Citation
Rensselaer Polytechnic Institute, Troy, NY