This week’s lab meeting will feature a talk by Richard Futrell, entitled “Information-theoretic models of natural language.”
- Thursday, February 25, 13:30–14:30 (Montreal time, UTC-5).
- Meetings are held via Zoom. If you are interested in attending any of the meetings this semester, please take a moment now to register at this link. After approval, you will receive a confirmation email with the link to join.
I claim that human languages can be modeled as information-theoretic codes, that is, systems that maximize information transfer under certain constraints. I argue that the relevant constraints for human language are those involving the cognitive resources used during language production and comprehension. Viewing human language in this way, it is possible to derive and test new quantitative predictions about the statistical, syntactic, and morphemic structure of human languages.
I start by reviewing some of the many ways that natural languages differ from optimal codes as studied in information theory. I argue that one distinguishing characteristic of human languages, as opposed to other natural and artificial codes, is a property I call “information locality”: information about particular aspects of meaning is localized in time within a linguistic utterance. I give evidence for information locality at multiple levels of linguistic structure, including the structure of words and the order of words in sentences.
Next, I state a theorem showing that information locality is a property of any communication system where the encoder and/or decoder are operating incrementally under memory constraints. The theorem yields a new, fully formal, and quantifiable definition of information locality, which leads to new predictions about word order and the structure of words across languages. I test these predictions in broad corpus studies of word order in over 50 languages, and in case studies of the order of morphemes within words in two languages.
Richard Futrell is an Assistant Professor of Language Science at the University of California, Irvine. His research applies information theory to better understand human language and how humans and machines can learn and process it.