Recently, much attention has been paid to the question of whether large language models (LLMs) can serve as theories of language (Piantadosi 2023 and the replies to it by other scholars). Unfortunately, the discussion has remained at an abstract level, and virtually nothing has been said about how LLMs work technically and what their internal organization means for linguistic theory (LT). My research fills this gap. Since the algorithms in different LLMs may differ, I focus on ChatGPT.

ChatGPT has a vocabulary of about 100k tokens. Tokenization makes it possible to represent a large amount of text with a small set of subword units (tokens). Most of these tokens coincide with linguistic units: letters (phonemes), morphemes, or words. ChatGPT thus appears to relate to major claims of the major linguistic theories, as well as to major findings of psycholinguistic research.

The most significant difference between LT and ChatGPT is that LT is level-based: phonology manipulates phonemes, morphology manipulates morphemes, and syntax combines words (and morphemes in Distributed Morphology). The order of phonology, morphology, and syntax in the architecture of the grammar is theory-dependent: phonology and morphology may precede or follow syntax. By contrast, ChatGPT works with linear sequences of tokens, and phonology, morphology, and syntax take place simultaneously. In other words, ChatGPT elevates phonology and morphology to the level of syntax.

*Note*: More on ChatGPT in lingbuzz/008135, "A reply to Moro et alia's claim that LLMs can produce 'impossible' languages".
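To make the tokenization point concrete, here is a minimal sketch in Python, assuming OpenAI's tiktoken library and its cl100k_base encoding (the vocabulary behind ChatGPT models); the example words are illustrative choices, not taken from the paper:

```python
# A minimal sketch of the tokenization described above, assuming
# OpenAI's tiktoken library (pip install tiktoken) and its
# cl100k_base encoding, the vocabulary used by ChatGPT models.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# The vocabulary is on the order of 100k subword units.
print(enc.n_vocab)

# A frequent word is usually a single token, while a rarer or
# morphologically complex word is split into subword pieces that
# often, though not always, coincide with morphemes or letters.
for word in ["cats", "unhappiness", "retokenization"]:
    token_ids = enc.encode(word)
    pieces = [enc.decode([i]) for i in token_ids]
    print(f"{word!r} -> {pieces}")

# A sentence is processed as one flat, linear sequence of such
# tokens; the representation has no separate phonological,
# morphological, or syntactic level.
print(enc.encode("The cats were unhappy."))
```

Inspecting the printed pieces shows how the same token sequence simultaneously encodes letter-, morpheme-, and word-sized units, which is the sense in which phonology and morphology are elevated to the level of syntax.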
Full text at: https://ling.auf.net/lingbuzz/008123
Stela Manova for Gauss:AI