Spatial location of binaural signals using cepstral analysis
Authors
Tyler, Jeramey
Issue Date
2023-12
Type
Electronic thesis
Thesis
Language
en_US
Keywords
Cognitive science
Alternative Title
Abstract
As described by the precedence effect, the time delay between a sound (lead) and its reflection (lag) influences our ability to perceive the lead's spatial location. When the lead and lag overlap, we perceive them as a single auditory event called a binaural fusion. Binaural fusions merge the spatial characteristics of the lead and lag, making it difficult to identify their origins. When the sound signal is periodic, like music, the potential for binaural fusions to occur increases dramatically. Precedence effect models of auditory perception have traditionally avoided binaural fusion by using noise signals or impulsive signals, by pre-calculating the signal's impulse response, or by calculating the impulse response after the fact from a discrete signal. Unfortunately, such models are not useful in real-world scenarios, because the impulse response is rarely known beforehand, periodic and impulsive signals often coexist, and sound is continuous. With the increased interest in spatial audio technologies comes an increased demand for precedence effect models suited to real-world applications. In this thesis, we present the cepstral binaural model (CEPBIMO), a perceptual model of the precedence effect that is more resilient to periodicity, reflections, and binaural fusions than existing models. In addition to exhibiting improved performance, CEPBIMO surpasses the abilities of existing models in that it can be applied to a variety of signal types, including periodic signals; it does not require prior knowledge of the impulse response; and it can be applied to a running audio signal in real time. The innovations of CEPBIMO stem from its novel application of cepstral analysis, a signal processing technique used to identify frequencies of periodicity in a signal. From the results of cepstral analysis, a deconvolution filter is created and used to separate sounds from reflections.
For a given binaural signal, CEPBIMO returns a binaural activity map, a visual representation of the acoustic scene. For the experiments, four datasets of 10,000 synthetic binaural signals were generated using various signal types and processed using CEPBIMO. The binaural activity maps produced by CEPBIMO were evaluated using a series of simple convolutional neural networks. Although CEPBIMO was tested in more complex environments than preceding models, its results were more accurate than those produced by other models that were evaluated in more constrained environments.
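The abstract's idea of using cepstral analysis to find a reflection and then building a deconvolution filter from the result can be illustrated with a minimal, single-channel sketch. This is not the CEPBIMO implementation. It assumes one reflection with gain below 1, a white-noise lead, and a recursive inverse filter chosen here for simplicity. All names (`real_cepstrum`, `estimate_echo`, `remove_echo`) and all parameter values are hypothetical.

```python
import cmath
import math
import random


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def real_cepstrum(signal):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    n = len(signal)
    # The small constant guards against log(0); it is an illustrative choice.
    log_mag = [math.log(abs(v) + 1e-12) for v in fft(signal)]
    return [v.real / n for v in fft(log_mag, inverse=True)]


def estimate_echo(signal, min_delay):
    """Return (delay, gain) for the largest cepstral peak at or beyond min_delay.

    Assumes a single echo with gain < 1, for which the real cepstrum
    has a peak of height gain / 2 at the echo delay.
    """
    ceps = real_cepstrum(signal)
    half = len(signal) // 2
    delay = max(range(min_delay, half), key=lambda q: ceps[q])
    return delay, 2.0 * ceps[delay]


def remove_echo(signal, delay, gain):
    """Invert y[n] = x[n] + gain * x[n - delay] with a recursive filter."""
    out = []
    for n, y in enumerate(signal):
        out.append(y - (gain * out[n - delay] if n >= delay else 0.0))
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


# Synthetic signal: white-noise lead plus one attenuated, delayed copy (lag).
rng = random.Random(0)
lead = [rng.gauss(0.0, 1.0) for _ in range(2048)]
true_delay, true_gain = 100, 0.6  # hypothetical values, in samples and linear gain
mixed = [
    x + (true_gain * lead[i - true_delay] if i >= true_delay else 0.0)
    for i, x in enumerate(lead)
]

est_delay, est_gain = estimate_echo(mixed, min_delay=20)
recovered = remove_echo(mixed, est_delay, est_gain)
print("estimated delay (samples):", est_delay)
print("error before / after:", mse(mixed, lead), mse(recovered, lead))
```

The sketch covers only the delay-detection and echo-removal idea on one channel. Binaural processing, periodic leads such as music, multiple reflections, and running real-time operation, which the thesis addresses, are outside its scope.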
Description
December 2023
Full Citation
Publisher
Rensselaer Polytechnic Institute, Troy, NY