At this week’s lab meeting, Bing’er Jiang will present on “Modelling Perceptual Effects of Phonology with Automatic Speech Recognition Systems.”

  • Wednesday, May 13th, at 14:30 UTC-4
  • Via Zoom. Contact Emily for details.

Abstract

This study explores the minimal knowledge a listener needs to compensate for phonological assimilation, one kind of phonological process responsible for variation in speech. We used standard automatic speech recognition models to represent English and French listeners. We found, first, that some types of models show language-specific assimilation patterns comparable to those of human listeners: like English listeners, the models compensate more for place assimilation than for voicing assimilation when trained on English, and like French listeners, they show the opposite pattern when trained on French. Second, the models that best predict the human pattern use contextually sensitive acoustic models and language models, which capture allophony and phonotactics, but do not make use of higher-level knowledge of a lexicon or word boundaries.