
LLMs feel smart, but don't actually understand anything

When you first interact with an LLM (or large language model), it’s hard not to be impressed.

The answers are fluent. The tone feels confident. Sometimes you have the feeling that the system understands what you’re asking, not just the words but also the intent behind them. You might even feel like it already knows where you’re going next.

That feeling is powerful, but it’s also why people start to make mistakes.

At its core, an LLM doesn’t understand language the way humans do. It doesn’t reason in the same way we do. It doesn’t form beliefs, accumulate experiences, or check its output against reality. What it does is simpler, and a little unsettling:

It predicts the next most likely token, given everything it has seen so far.

And that’s it. Everything else (explanations, reasoning-like behavior, step-by-step logic, and even emotional language) emerges from that single mechanism. And understanding this changes how you should use these systems.

The illusion of comprehension

When you ask an LLM a question, what do you think happens?

Most people intuitively imagine something very familiar: the system reads the question, understands what it means, thinks about it, and then produces an answer. That’s what humans do. The output that an LLM creates looks similar, so it’s natural to assume the same process happened internally.

We assume the system must comprehend the question before it can give an answer.

But an LLM doesn’t work that way. It doesn’t start with meaning and then decide what to say. It begins with patterns in sequences of tokens, learned from massive amounts of text.

During training, the model is shown enormous amounts of text and repeatedly asked one question: Given this sequence of tokens, which token is most likely to come next?

It’s never asked whether a sentence is true, whether an explanation is correct, or what the user is trying to achieve. It is only trained to continue text in a way that statistically fits what it has seen before.

Truthful text has different statistical properties than false text. Coherent reasoning has different patterns than rambling. So while the model isn’t trained on truth or logic directly, it learns correlates of these things because they’re embedded in the patterns of human text.
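
To make that concrete, here is a minimal sketch of the training objective in PyTorch. This is my own toy example, not a real transformer: a stand-in network scores every token in a tiny vocabulary, and the loss only measures how much probability landed on the token that actually came next.

```python
# Toy sketch of next-token prediction training (not a real transformer).
import torch
import torch.nn as nn

vocab_size = 8                                # tiny toy vocabulary
context = torch.tensor([[3, 1, 4, 1, 5]])     # token IDs seen so far
target_next = torch.tensor([2])               # the token that actually followed

# Stand-in for the model: an embedding plus a linear "prediction head".
embed = nn.Embedding(vocab_size, 16)
head = nn.Linear(16, vocab_size)

logits = head(embed(context[:, -1]))          # a score for every possible next token
loss = nn.functional.cross_entropy(logits, target_next)

# Training repeats this over billions of positions, nudging the weights so the
# observed next token gets more probability. Truth never appears in the loss.
print(loss.item())
```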

What is a “token”?

I just mentioned that an LLM is trained to see patterns in sequences of tokens. It’s worth zooming in on what a token is, because everything that follows depends on this one concept.

Language models are trained on text, so it’s tempting to think of tokens as words, characters, or sentences. But they’re none of those. A token is a unit of text chosen by the model’s tokenizer. This is a separate component that exists purely to turn raw text into something the model can process.

Before the language model itself is trained, a tokenizer is trained with a very different goal: compression efficiency. It scans massive amounts of text and looks for character sequences that frequently appear together. Common sequences get merged into single tokens. Rare ones get split into smaller pieces. The result is a vocabulary of tens of thousands of tokens that balance two constraints:

  • Common text should require as few tokens as possible
  • Any text should be representable

That’s why a token might be:

  • a whole word like “apple”
  • part of a word like “app” + “le”
  • punctuation
  • whitespace
  • common fragments like “ing” or “tion”

There is nothing semantic about these splits. They’re driven purely by frequency, not by meaning.

Most modern tokenizers use a technique called Byte Pair Encoding. It starts with individual characters, then it repeatedly merges the most common adjacent pairs. If “ing” appears everywhere, it becomes a token. If “under” is common, it becomes another. This process is repeated thousands of times.
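
Here is a heavily simplified sketch of that merge loop, using a toy corpus and a character-level start; a production tokenizer adds byte-level details and many more merges, but the mechanism is the same.

```python
# Toy illustration of byte-pair-encoding style merges (heavily simplified).
from collections import Counter

corpus = ["lower", "lowest", "newer", "wider"]
words = [list(w) for w in corpus]        # start from individual characters

def most_frequent_pair(words):
    pairs = Counter()
    for w in words:
        for a, b in zip(w, w[1:]):
            pairs[(a, b)] += 1
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(words, pair):
    a, b = pair
    merged = []
    for w in words:
        out, i = [], 0
        while i < len(w):
            if i < len(w) - 1 and w[i] == a and w[i + 1] == b:
                out.append(a + b)        # the frequent pair becomes one token
                i += 2
            else:
                out.append(w[i])
                i += 1
        merged.append(out)
    return merged

for _ in range(6):                       # a real tokenizer does tens of thousands of merges
    pair = most_frequent_pair(words)
    words = merge_pair(words, pair)
    print("merged", pair, "->", words)
```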

At runtime, tokenization is purely mechanical. The tokenizer matches the longest tokens it knows. Whitespace and punctuation matter in that matching; even a leading space can change which token is used. “hello” and “ hello” are statistically different patterns.

The model never sees words at all. It only sees token IDs: numbers. When people talk about context windows or token limits, they’re talking about how many of these numeric chunks the model can process at once.
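
You can see this directly if you have OpenAI’s open-source tiktoken package installed. The exact IDs depend on which encoding you pick; the point is that the model only ever receives the numbers.

```python
# pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for text in ["hello", " hello", "The capital of France is"]:
    ids = enc.encode(text)
    pieces = [enc.decode([i]) for i in ids]   # decode each ID back to its text fragment
    print(repr(text), "->", ids, pieces)

# "hello" and " hello" map to different IDs: the leading space is part of the token.
```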

Next-token prediction, step by step

So how does this prediction actually work?

Suppose the model gets this prompt:

The capital of France is

The tokenizer first converts that text into a sequence of token IDs. The model then produces a probability distribution over all possible next tokens:

  Token       Probability
  Paris       0.72
  Lyon        0.05
  a           0.04
  the         0.03
  Marseille   0.02

The model selects from the highest probability tokens, often introducing some randomness to prevent outputs from becoming repetitive. The selected token is appended to the input, and the process repeats.

The input is now The capital of France is Paris.

From there, the next token might be a punctuation mark or a space followed by another sentence. This loop continues until the model predicts an end-of-sequence token.

At no point does the model check whether what it’s generating is true. It’s just probability, applied over and over again.

What we perceive as a fluent paragraph is really hundreds of these tiny predictions chained together. The model doesn’t plan a sentence before writing it. The sentence emerges as the most statistically likely continuation, one token at a time.
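
Schematically, the whole loop looks like this. The next_token_distribution function is my stand-in for the model’s forward pass, with a hard-coded toy table and word-level “tokens” for readability; a real system works on token IDs over a full vocabulary.

```python
# Schematic generation loop: get a distribution, sample, append, repeat.
import random

def next_token_distribution(tokens):
    # Toy stand-in for the model's forward pass. A real model returns a score
    # for every token in its vocabulary, conditioned on the whole context.
    table = {
        ("The", "capital", "of", "France", "is"):
            {"Paris": 0.72, "Lyon": 0.05, "a": 0.04, "the": 0.03, "Marseille": 0.02},
        ("The", "capital", "of", "France", "is", "Paris"):
            {".": 0.9, ",": 0.1},
    }
    return table.get(tuple(tokens), {"<eos>": 1.0})

def generate(tokens, temperature=1.0, max_new_tokens=10):
    for _ in range(max_new_tokens):
        dist = next_token_distribution(tokens)
        # Temperature reshapes the distribution before sampling: lower = greedier,
        # higher = more random. random.choices normalises the weights for us.
        weights = [p ** (1.0 / temperature) for p in dist.values()]
        next_token = random.choices(list(dist.keys()), weights=weights)[0]
        if next_token == "<eos>":
            break
        tokens = tokens + [next_token]   # append the choice and repeat
    return tokens

print(" ".join(generate(["The", "capital", "of", "France", "is"])))
```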

But why does it feel like reasoning?

“The capital of France is” is a simple prompt. But ask a question like “Explain why this approach violates CQRS principles” and the answer doesn’t just sound correct. It unfolds logically. It introduces concepts in the right order, weighs trade-offs, and arrives at a conclusion that feels deliberate.

From the outside, this is indistinguishable from what a human expert might do.

But what’s happening isn’t reasoning in the human sense.

The model isn’t consciously evaluating CQRS principles or checking an architecture against them. It’s predicting what a plausible explanation of a CQRS violation typically looks like, based on patterns in its training data.

Architectural explanations tend to follow a recognizable structure:

  • introduce a principle
  • describe responsibilities
  • explain consequences
  • connect them into a narrative

When the model reproduces that structure, the result resembles reasoning. In learning to predict these patterns well enough, the model develops internal representations that function something like concepts. Research shows that scaled models build internal “world models”; they track entity states, handle spatial reasoning, and maintain consistency across contexts.

These aren’t grounded in sensory experience like human understanding. They’re not updated through real-world interaction. But they’re not purely surface-level either. The model has learned abstract representations that help it predict better, and in many cases, those representations correlate with genuine logical relationships.

The key distinction: this is computation without the kind of understanding humans have. It’s a different kind of processing that can produce reasoning-like outputs without the reasoning process we’re familiar with.

Why “think step by step” seems to help

If you ask a question directly, the response might feel shallow. But add “think step by step” or “explain your reasoning,” and the response becomes more detailed, more structured, sometimes more accurate.

What’s happening?

Phrases like “let’s break it down” or “step by step” are statistically associated with longer explanations, intermediate conclusions, and careful sequencing of ideas. By adding that instruction, you’re steering the model towards a different region of its probability space, one where the next tokens are more likely to resemble methodical explanations.
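
In practice, that steering is nothing more than extra tokens in the prompt. A minimal illustration, with a hypothetical call_llm standing in for whatever client you actually use:

```python
# Illustration only: call_llm is a placeholder, not a real API.
QUESTION = "Does this handler violate CQRS by both updating state and returning a projection?"

direct_prompt = QUESTION
cot_prompt = QUESTION + "\n\nThink step by step and explain your reasoning before answering."

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model client here")

# call_llm(direct_prompt)  -> typically a short verdict
# call_llm(cot_prompt)     -> typically a longer, structured explanation
```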

But recent research suggests something more interesting. The intermediate tokens in chain-of-thought reasoning actually help the model perform computations it couldn’t otherwise. By generating step 1, the model creates context that helps it predict step 2 more accurately. It’s using its own output as a scratchpad for processing that wouldn’t fit in a single forward pass.

This isn’t human-style reasoning, but it’s not purely cosmetic either. The steps are functional, not just stylistic.

The dangerous part is that this technique can just as easily produce confident nonsense. The model isn’t checking whether each step is valid; it’s just following the pattern of what methodical explanations look like. I’ve seen answers so clean and well-structured that I only noticed problems when I tried to apply them. The explanation looked convincing, but the code didn’t behave as promised.

Confident nonsense isn’t a bug

Once you see the system this way, several behaviors make sense.

The model sounds confident even when it’s wrong because confidence itself is just another learned pattern. If authoritative language commonly appears in responses to certain questions, the model reproduces that tone regardless of correctness. Confidence is a stylistic artifact, not a reliability signal.

As models scale and are trained on more data, these hallucinations actually decrease. This might seem contradictory if they’re just pattern-matching machines. But larger models capture more subtle patterns and relationships in the training data. They don’t “know” more facts the way humans do, but they’ve learned richer statistical associations. Where training data overwhelmingly supports one answer, contradictions become statistically unlikely.

The same logic also explains why small changes in wording can lead to different answers. When you rephrase a question, you change the token sequence and therefore the mathematical input to the model. Those two questions may be semantically equivalent to a human, but they’re different number sequences to the model, pushing predictions in different directions.
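
With tiktoken again (encoding-specific IDs, so your numbers may differ), two questions a human would treat as identical turn into two different number sequences:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
a = enc.encode("What is the capital of France?")
b = enc.encode("Which city is France's capital?")
print(a)
print(b)
print("shared IDs:", set(a) & set(b))   # only a little overlap, despite the same meaning
```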

It also explains why the same prompt doesn’t always yield the same response. Most modern language models intentionally introduce a degree of randomness when selecting the next token. Without it, outputs would be repetitive and brittle. The trade-off is that there is no deterministic path toward a single correct answer. The model samples from plausible continuations rather than reasoning towards a single truth.

There is no internal model of the world

An LLM doesn’t maintain a consistent world model in the way humans do. It develops internal representations that function as world models for prediction, but they’re not grounded in sensory experience or updated through interaction with reality.

If context in one conversation suggests the Eiffel Tower is in Paris, and context in another is ambiguous, the model responds based purely on what’s statistically likely given each specific input. There’s no persistent memory flagging inconsistencies across conversations. There’s no notion of having been wrong before.

Consistency appears when the prompt constrains output strongly enough, or when training data contains such overwhelming evidence that contradictory predictions become vanishingly unlikely. But it’s not something the model enforces internally; it’s an emergent property of the statistics.

What about reasoning models?

When people say LLMs can reason, they’re observing the model’s ability to reproduce patterns that correlate with good reasoning in written form. In many domains, especially technical ones, those patterns are sufficient to produce genuinely helpful explanations.

This is why the model can explain concepts clearly, walk through trade-offs, and structure arguments, while simultaneously hallucinating details or breaking down at the edges. What looks like reasoning is sophisticated pattern completion, but the patterns are rich enough to be functionally useful.

Some newer models are explicitly trained to improve this. They generate many reasoning paths, evaluate which ones lead to correct answers, and reinforce those patterns. This is still fundamentally a prediction, but a prediction shaped by feedback about what works.

The result is systems that can solve complex problems through a process that looks like reasoning and often produces the same results as reasoning, even though the underlying mechanism is different from human cognition.

Why does it matter that an LLM doesn’t understand you?

The difference isn’t just an academic nuance. It has real consequences when you start using these systems in production.

If you expect understanding, you will overtrust the output. If you recognize that the system is fundamentally performing sophisticated statistical prediction, you naturally start designing around its limitations.

Designing around these limitations looks like:

  • Verification over trust: Cross-reference outputs against sources of truth. Don’t assume correctness, but validate it.
  • Constrained generation over open-ended: The more you constrain the output format and domain, the more reliable the results. Narrow prompts work better than broad ones.
  • Errors as expected behavior: Don’t treat hallucinations as edge cases to eliminate. Treat them as inherent properties to design around.
  • Deterministic systems around non-deterministic components: Build reliable scaffolding that uses LLM outputs as inputs to deterministic logic, not as final answers (see the sketch after this list).
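
As a sketch of that last bullet, assuming a hypothetical call_llm client and a made-up ticket-classification task: the model proposes, and deterministic code parses, validates, and rejects anything that doesn’t fit the expected shape.

```python
# Sketch: deterministic scaffolding around a non-deterministic component.
import json

ALLOWED_SEVERITIES = {"low", "medium", "high"}

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model client here")

def classify_ticket(ticket_text: str) -> dict:
    prompt = (
        "Classify this support ticket. Respond with JSON containing exactly "
        '{"severity": "low|medium|high", "summary": "<one sentence>"}.\n\n' + ticket_text
    )
    raw = call_llm(prompt)
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        # Malformed output is an expected failure mode, not an edge case.
        raise ValueError("LLM output was not valid JSON")
    if data.get("severity") not in ALLOWED_SEVERITIES:
        raise ValueError(f"unexpected severity: {data.get('severity')!r}")
    if not isinstance(data.get("summary"), str) or not data["summary"].strip():
        raise ValueError("missing summary")
    return data   # only validated output reaches the rest of the system
```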

An LLM can draft explanations, summarize designs, or explore trade-offs, but it should never be the final authority.

That mindset shift is subtle but critical. The most useful question isn’t Does the model understand me? It’s:

Why did the model predict this response?

That question transforms the system from mysterious intelligence into an engineering problem you can reason about.

Conclusion

LLMs are neither magical nor safe by default. They’re powerful tools that work through statistical prediction over learned patterns. Those patterns are rich enough to produce remarkably useful outputs, but they’re not grounded in the kind of understanding that comes from experience or interaction with reality.

Once you adopt this lens, LLMs stop feeling like intelligent beings you must trust or distrust. They become what they are: sophisticated, unreliable, incredibly useful tools you can design around.

They’re predictable enough to be engineered with, as long as you don’t forget what they’re actually doing.
