This week’s lab meeting will feature a talk from Ben LeBrun.

  • Thursday, March 18, 13:30–14:30 (Montreal time, UTC-4).
  • Meetings are held via Zoom. If you are interested in attending any of the meetings this semester, please take a moment now to register at this link. After approval, you will receive a confirmation email with the link to join.

Abstract

The use of pre-trained Transformer language models (TLMs) has led to significant advances in the field of natural language processing. This success has typically been measured by quantifying model performance on downstream tasks, or through the models’ ability to predict words in large samples of text. However, these benchmarks are biased in favour of frequent natural language constructions, measuring performance on common, recurring patterns in the data. The behaviour of TLMs on the large set of complex and infrequent linguistic constructions is, by comparison, understudied. In this talk, I will present preliminary results exploring GPT-2’s ability to reproduce this long tail of syntactic constructions, and how this ability is modulated by fine-tuning.

Bio

Ben is an undergraduate student double majoring in linguistics and computer science.