What’s The Deal With Large Language Models?

Transcript

We tend to think of language as a communication tool, designed for transmission. And it is that, even though much of the time, we'd really rather it wasn't. More importantly, though, language is a framework for mapping our own minds.

Without language, we simply wouldn't know what the fuck we are thinking. As the semiotician Ferdinand de Saussure put it: “Without language, thought is a vague, uncharted nebula”.

Some would rather computers did their thinking for them. But for that to work, those computers would have to speak the same language. Unfortunately, natural languages like English do not have straightforward rules like programming languages.

Try explaining to a computer that “shit” means very bad but “the shit” means the very best, which means the knees of a bee.

So we stopped trying to formally teach computers language and started just subjecting them to it. Not just a little bit of language, like the contents of the November/December 2022 issue of Potato Review magazine, but all 762.94 metric fuck tons of it.

By tokenizing and building a map of all our semi-literate memes and shitposts, LLMs can—when prompted—produce language that sounds like one of us. Like a dickhead.
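The tokenize-map-regurgitate move above can be sketched as a toy bigram model. This is a deliberately dumb miniature, not how any real LLM is built (real models learn weighted probabilities over subword tokens with a neural network, not a lookup table); the corpus and function names are invented for illustration.

```python
import random
from collections import defaultdict

# A toy corpus standing in for all 762.94 metric fuck tons of the web.
corpus = "we think in language and language thinks in us and we post in memes".split()

# "Tokenize" (here: split on whitespace) and build a map of which token
# tends to follow which — the crudest possible model of language.
follows = defaultdict(list)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current].append(nxt)

def generate(prompt, length=8, seed=0):
    """When prompted, emit tokens by sampling whatever tended to come next."""
    random.seed(seed)
    out = [prompt]
    for _ in range(length):
        options = follows.get(out[-1])
        if not options:  # dead end: the prompt wandered off the map
            break
        out.append(random.choice(options))
    return " ".join(out)

print(generate("we"))
```

No part of this makes the table intelligent either; it only ever parrots statistics of what it was fed.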

But no part of this process makes the computer intelligent, sentient, or inclined to “kill all humans”.

It works a bit like sampling does for music. Disparate tracks and genres are mashed together according to compatible characteristics like tempo and key. But good sampling is creative, not just probabilistic.

Hence LLM output is less akin to DJ Shadow's alchemical “Mutual Slump” than it is to Kanye West's unnecessary “Stronger”—a track Kanye mixed and remixed over 70 times trying to better the Daft Punk original from which it takes its only notable material. LLMs, like Kanye West, are not self-aware.

The idea of a zombie apocalypse, where the dead come back to terrorise and assimilate the living, has been a longstanding preoccupation. But it's not undead flesh we should be worried about, it's undead information. And that's what generative AI is: not really living, seemingly not dead, a shimmering contortion of what we thought we think and believed we believe.

The more we generate content and the more it is allowed to generate itself, the further the necrosis will spread. Rapidly, generative media will overwhelm the nominally “real” media on which it was originally trained, leaving generative media to reproduce with itself. This incestuous feedback loop will obliterate what remains of culture, leaving us gasping for brains, whatever the fuck they are.
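The incestuous feedback loop has a measurable shape. A minimal sketch, under invented assumptions: treat culture as a bag of distinct tokens, and let each generation be "trained" only on samples drawn from the previous one. Sampling with replacement quietly drops the rare stuff every round, so diversity shrivels generation by generation.

```python
import random

random.seed(1)

# "Culture": 1000 distinct tokens, standing in for nominally real media.
population = list(range(1000))

diversity = []
for gen in range(10):
    # Each generation is built only from samples of the previous one,
    # then becomes the sole source for the next — the feedback loop.
    population = [random.choice(population) for _ in range(1000)]
    diversity.append(len(set(population)))

print(diversity)  # distinct tokens surviving each generation
```

The count of distinct tokens falls every generation: the rare drops out first, then the merely uncommon, until the loop is gasping for brains.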