E242: The TRUTH About Large Language Models and Agentic AI
An interview with Andriy Burkov, author of "The Hundred-Page Language Models Book"
YouTube Link | Podcast Link (all platforms)
Who is Andriy Burkov?
Andriy Burkov is a renowned machine learning expert and leader. He's also the author of (so far) three books on machine learning, including the recently released "The Hundred-Page Language Models Book", which takes curious people from the very basics of language models all the way up to building their own LLM. Andriy is also a formidable online presence and is never afraid to call BS on over-the-top claims about AI capabilities via his punchy social media posts.
I have been following Andriy on LinkedIn for a while and have always found his takes just the right mixture of sassy and informative. He clearly knows his space and isn’t afraid to call people out. Now, I’m not a data scientist, but I have worked on a number of AI-powered products in the past, including some that used some of the precursors to LLMs. I’ve trained my own models and been amazed at what came out of them, so I’m always sceptical of claims of magic. I was glad to have the chance to chat with a proper expert! Check it out now…
Episode highlights:
1. Large Language Models are neither magic nor conscious
LLMs boil down to relatively simple mathematics at an unfathomably large scale. Humans are terrible at visualising big numbers and cannot comprehend the size of the dataset or the number of GPUs that have been used to create the models. You can train the same LLM on a handful of records and get garbage results, or throw millions of dollars at it and get good results, but the fundamentals are identical, and there's no consciousness hiding in between the equations. We see good-looking output, and we think it's talking to us. It isn't.
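To make "simple mathematics, huge scale" concrete, here is a minimal sketch of the same core idea an LLM is built on: estimate the probability of the next token given the previous ones, then emit the most likely token. The corpus, function names, and greedy decoding here are my own illustrative choices, not anything from the episode; a real LLM replaces the count table with a transformer and the two sentences with trillions of tokens, but the fundamentals are identical, and with this little data the output quickly turns to garbage.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for each token, which token follows it — a crude
    stand-in for 'learning P(next token | context)'."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def generate(counts, start, length=5):
    """Greedy decoding: repeatedly take the most probable next token."""
    token, out = start, [start]
    for _ in range(length):
        if token not in counts:
            break
        token = counts[token].most_common(1)[0][0]
        out.append(token)
    return " ".join(out)

# A "handful of records" — far too little data for sensible output.
tiny_corpus = ["the cat sat on the mat", "the dog sat on the rug"]
model = train_bigram(tiny_corpus)
print(generate(model, "the"))
```

Trained on two sentences, it loops and babbles; trained on the internet, the same recipe produces fluent text. Nothing about the mechanism changes — only the scale.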
2. As soon as we saw it was possible to do mathematics on words, LLMs were inevitable
There were language models before LLMs, but the invention of the transformer architecture truly accelerated everything. That said, the fundamentals trace further back to "simpler" algorithms such as word2vec, which proved that linguistic information could be encoded in a numeric format: once the vast majority of a word's meaning could be represented as an embedding, people could run equations on language. After that, it was just a matter of time before these models were scaled up.
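The classic demonstration of "equations on language" is word2vec-style vector arithmetic: king − man + woman lands near queen. The three-dimensional vectors below are hand-picked toy values (roughly: maleness, femaleness, royalty) that I've made up for illustration — real word2vec embeddings have hundreds of dimensions learned from data — but the arithmetic is the same:

```python
import math

# Hand-crafted toy "embeddings"; real ones are learned, not designed.
vectors = {
    "king":  [1.0, 0.0, 1.0],
    "queen": [0.0, 1.0, 1.0],
    "man":   [1.0, 0.0, 0.0],
    "woman": [0.0, 1.0, 0.0],
}

def cosine(a, b):
    """Cosine similarity: how closely two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# "king - man + woman", computed component-wise.
target = [k - m + w for k, m, w in
          zip(vectors["king"], vectors["man"], vectors["woman"])]

# Nearest remaining word (the inputs are excluded, as word2vec demos do).
best = max(
    (w for w in vectors if w not in {"king", "man", "woman"}),
    key=lambda w: cosine(vectors[w], target),
)
print(best)  # queen
```

Once words behave like vectors, the whole toolbox of linear algebra applies to language — which is exactly the door word2vec opened.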
3. LLMs look intelligent because people generally ask about things they already know about
The best way to be disappointed by an LLM's results is to ask detailed questions about something you know deeply. It's quite likely that it'll give good answers to start with, because most people's knowledge is so unoriginal that, somewhere in the LLM's training data, there are documents that talk about the thing you asked about. But as your questions get more specific, the quality will degrade, and the model will confidently keep writing even when it doesn't know the answer. These are not easily solvable problems; they are, in fact, fundamental parts of the design of an LLM.
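One reason "confidently keep writing" is baked in: a language model's output layer always produces a valid probability distribution over tokens and must emit *something*, even when nothing is well supported. The logits below are toy numbers I've invented to stand in for a clueless model; there is no built-in "I don't know" token in the basic design:

```python
import math
import random

def softmax(scores):
    """Turn raw scores into probabilities that always sum to 1."""
    exps = {t: math.exp(s) for t, s in scores.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

# Toy logits: the model has essentially no idea, all options near-tied.
logits = {"Paris": 0.11, "Lyon": 0.10, "Bordeaux": 0.09}
probs = softmax(logits)

# Sampling still fluently commits to one answer, uncertainty and all.
random.seed(0)
choice = random.choices(list(probs), weights=list(probs.values()))[0]
print(choice)
```

The distribution sums to one by construction, so decoding always yields a fluent-looking token — which is why the text reads just as confident whether the model knows the answer or not.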
4. Agentic AI relies on unreliable actors with no true sense of agency
The concept of agents is not new, and people have been talking about them for years. The key aspect of AI agents is that they need self-motivation and goals of their own, rather than being told to have goals and then simulating the desire to achieve them. That's not to say that some agents are not useful in their own right, but the goal of fully autonomous, agentic systems is a long way off, and may not even be solvable.
5. LLMs represent the most incredible technical advance since the personal computer, but people should quit it with their most egregious claims
LLMs are an incredible tool and can open up whole new worlds for people who are able to get the best out of them. There are limits to their utility, and some of their shortcomings are likely unsolvable, but we should not minimise their impact. However, there are unethical people out there making completely unsubstantiated claims based on zero evidence and a fundamental misunderstanding of how these models work. They are scaring the public and encouraging terrible decision-making from the gullible. We need to see through the hype.
Follow Andriy
You can catch up with Andriy here:
Twitter/"X": https://twitter.com/burkov
True Positive Newsletter:
Related episodes you might like:
The Big Pivot to Reinvent Product Management (Yana Welinder, Founder & CEO @ Kraftful)
Reinventing the Future of Customer Success with Human-First AI (Nick Mehta, CEO @ Gainsight)
Most PMs Neglect Data Due To a Lack of Time and Skills (Martijn Moret, CEO @ DataSquirrel.ai)
Although I'm not one of those people who make exaggerated claims about LLMs, I do think he oversimplified a little in places.