LLMs Don’t Have to Know the Truth to Be Useful

A line I’ve seen a lot recently goes something like: “LLMs don’t know what’s true.”

I think that’s mostly right. But it can be misleading, because it invites a second conclusion that doesn’t necessarily follow:
“If the model doesn’t internally know what’s true, it can’t be trusted for real work.”

That conclusion sounds sensible, but it’s not how we get reliability in the real world.
And it’s not even how humans usually get to truth.

What follows is my attempt to explain a simple idea:
Trust shouldn’t rely on the model. It should emerge from the loop.

A robot that can talk but can’t look

Imagine a robot that has read a ridiculous number of books. It can talk about almost anything. It can even sound like a teacher.

But it has no eyes. No hands. It can’t do experiments. It can’t check the weather. It can’t open a drawer to see what’s inside.

So when you ask it a question, what is it doing?

In a simplified sense (and in training, literally), it’s doing a superpowered version of:
“In texts like these, what words usually come next?”

That’s not meant as an insult. It’s more like a description of the core mechanism. And it explains something we all recognize:
sometimes the model is correct, sometimes it’s wrong in an obvious way, and sometimes it’s wrong in a way that still sounds oddly convincing.

So yes: an LLM, by itself, isn’t naturally anchored to reality.

The slightly uncomfortable part: we humans aren’t either

Humans have something LLMs don’t:
we’re connected to the world. We can look, touch, measure, test, and learn from consequences.

But it’s also true that a single human mind, on its own, doesn’t reliably produce truth either. If you isolate a person from:

  • other people correcting them,

  • written records,

  • measurements,

  • experiments,

  • real consequences,

…they can stay wrong for a long time. And not in an “I’m lying” way, but in a sincere, confident way.

People do this all the time. We misremember things. We rationalize. We prefer stories that make sense to stories that are accurate. We defend beliefs that are socially important to us.

So when we say that humans “have a concept of truth,” I think that’s real, but it’s not because each individual brain has some perfect truth organ inside it.

We get truth (when we get it) through a larger system.

Truth is what you approach when a system keeps bumping into reality

Consider how we become confident about things in practice:

  • “This bridge design is safe.”

  • “This medicine works.”

  • “This financial report is correct.”

In almost every case, confidence doesn’t come from one smart person thinking really hard. It comes from a process:

  • multiple people look at it,

  • someone checks the numbers,

  • someone tests it,

  • someone keeps records,

  • mistakes create visible failures,

  • the group updates.

Science is the cleanest example of this. It’s not that scientists are unusually wise. It’s that the system has a few features that help truth survive over time: measurement, replication, incentives for correction, and the ability for reality to veto your ideas.

So I think it’s reasonable to say:
Our ability to converge on truth is less a property of an individual mind and more a property of a system with feedback.

The misunderstanding I keep seeing in AI

If you accept that framing, then the common “LLMs don’t know truth” critique is fair, but incomplete.

Because what matters in practice is not “does the model contain truth?” but something closer to:
“Where does this system get its reality feedback?”

A lot of deployments still treat the model like an oracle: ask → answer → hope.
And then we act surprised when it sometimes behaves like a confident improviser.

But we don’t build reliable things in the physical world by relying on one “perfect worker.”
We build reliable things by breaking work into steps and adding checks.

A useful analogy: factories don’t require omniscient workers

A car factory doesn’t work because each worker understands the whole car. A worker might tighten one set of bolts all day. Another might inspect one measurement. Another might test a component. The reliability comes from:

  • decomposition into steps,

  • inspection points,

  • logging defects,

  • and improving the process over time.

No single station “contains the full truth” of whether the car is perfect.

The factory produces reliability as an emergent property.

That analogy isn’t perfect, but I think it’s a better mental model for applied AI than “let’s build a super mind and ask it to be right.”

What we mean by “cognitive production lines” at UNeFAi

This is the motivation behind UNeFAi.

We build what we call cognitive production lines. The idea is simple: instead of treating knowledge work as “ask one model to do everything,” we treat it like a production line (a minimal code sketch follows this list):

  • break the work into stages,

  • make each stage responsible for a specific transformation,

  • add verification where it matters,

  • store intermediate artifacts (so nothing lives only in the model’s head),

  • and learn from outcomes over time.
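To make that concrete, here is a minimal sketch in Python of what such a line could look like. The `Stage` and `Artifact` names and the toy stages are invented for this post, not UNeFAi’s actual implementation; real stages would call models, retrieval, and domain-specific checks.

```python
# A minimal sketch of a cognitive production line, for illustration only.
# Stage/Artifact are invented names; real stages would call models, sources,
# and verifiers rather than toy lambdas.
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Artifact:
    """An intermediate result that gets stored, not left 'in the model's head'."""
    stage: str
    content: str
    verified: bool = False
    notes: list[str] = field(default_factory=list)


@dataclass
class Stage:
    name: str
    transform: Callable[[str], str]                  # the one transformation this stage owns
    verify: Optional[Callable[[str], bool]] = None   # optional check before work moves on


def run_line(stages: list[Stage], initial_input: str) -> list[Artifact]:
    """Run each stage in order, store every artifact, and stop on a failed check."""
    artifacts: list[Artifact] = []
    current = initial_input
    for stage in stages:
        output = stage.transform(current)
        artifact = Artifact(stage=stage.name, content=output)
        artifact.verified = stage.verify(output) if stage.verify else True
        artifacts.append(artifact)
        if not artifact.verified:
            artifact.notes.append("verification failed; route to review")
            break  # stop the line so the error is visible, not propagated downstream
        current = output
    return artifacts


# Toy stages standing in for model calls and source lookups.
line = [
    Stage("clarify_request", lambda x: f"goal: {x}"),
    Stage("draft", lambda x: f"draft based on {x}"),
    Stage("check_claims", lambda x: x, verify=lambda x: "goal:" in x),
]

for a in run_line(line, "summarise Q3 support tickets"):
    print(a.stage, "| verified:", a.verified)
```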

The point is not to deny that models hallucinate. The point is to design a workflow where hallucination is less dangerous by design, because:

  • claims get checked,

  • sources are constrained,

  • errors become visible,

  • and the system improves.

In other words, we try to move from “chatbot as oracle” to “truth-seeking system.”

Specialized learning is easier than general learning (and usually enough)

People sometimes jump from “we need AI agents that learn” to “we need to figure out how to get agents to do general learning.” But most useful work in organizations is not “general intelligence.” It’s domain-specific, process-shaped, and tied to outcomes. That’s why specialized learning is so much easier:

  • the problem space is smaller,

  • success criteria are clearer,

  • feedback is faster,

  • risks can be bounded.

A system that helps a company produce better proposals, respond to support tickets, maintain compliance documentation, or plan maintenance does not need to understand the universe.

It needs to reliably improve on a particular set of tasks, under particular constraints.

That’s a very different engineering challenge — and a much more solvable one.

Why “User Needs First” matters (beyond product slogans)

This is where our “User Needs First” approach fits naturally. If you want a system to learn, you need:

  1. a target,

  2. a signal,

  3. a way to update.

Desired outcomes provide all three.

Instead of asking “is the answer true in some abstract sense?” we ask:

  • what does success look like for this user?

  • what failure modes matter?

  • where should we place checks?

  • what should we log so the system learns?

This turns truth into something operational:
not “does the model know truth,”
but “did the system reliably produce the outcome, and can we see why when it didn’t?”

It’s not perfect, but it’s practical.
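As a rough illustration, here is one way to write “target, signal, update” down as data. The `OutcomeSpec` and `OutcomeLog` shapes are invented for this post; the point is only that outcomes and checks get recorded explicitly, not that this is the schema we actually use.

```python
# An invented, minimal shape for "target, signal, update"; illustration only.
from dataclasses import dataclass


@dataclass
class OutcomeSpec:
    """The target: what success looks like for this user, stated up front."""
    task: str
    success_criteria: list[str]
    failure_modes_that_matter: list[str]


@dataclass
class OutcomeLog:
    """The signal: what actually happened, kept so the system can learn."""
    spec: OutcomeSpec
    checks_passed: dict[str, bool]
    accepted_by_user: bool


def update_process(logs: list[OutcomeLog]) -> list[str]:
    """The update: a naive rule that proposes new checks where failures cluster."""
    suggestions = []
    for log in logs:
        for check, passed in log.checks_passed.items():
            if not passed:
                suggestions.append(f"tighten or move the '{check}' check earlier")
    return suggestions


spec = OutcomeSpec(
    task="answer a support ticket",
    success_criteria=["cites the relevant policy", "customer marks it resolved"],
    failure_modes_that_matter=["quotes a policy that does not exist"],
)
logs = [OutcomeLog(spec, {"policy_citation_found": False}, accepted_by_user=False)]
print(update_process(logs))
```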

A basic example: “write a proposal” as a production line

Suppose a company wants help writing client proposals.

A naive approach: prompt → proposal → send

A production line approach might look like this (a rough code sketch follows the steps):

  1. clarify the desired outcomes (and what success looks like)

  2. pull facts from approved sources (CRM, previous proposals, pricing docs)

  3. draft the proposal in a standard structure

  4. verify claims and numbers against sources

  5. flag risks and missing info

  6. have a human review the high-stakes parts

  7. determine and log the outcome (did it win? what feedback?) so the process improves
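Here is what those steps could look like in code. Every function name (`pull_facts`, `verify_claims`, and so on) and the toy data, including the “Acme” client, are placeholders invented for this post; in a real deployment they would call your CRM, your document store, a model, and your own reviewers.

```python
# A rough, hedged sketch of the proposal line above; all helpers are stand-ins.

def clarify_outcomes(request: str) -> dict:
    # Step 1: turn the request into explicit success criteria.
    return {"request": request, "success": "client accepts scope and price"}

def pull_facts(outcomes: dict) -> dict:
    # Step 2: approved sources only; hard-coded here as a stand-in.
    return {"pricing": {"support_plan": 1200}, "previous_wins": 3}

def draft_proposal(outcomes: dict, facts: dict) -> str:
    # Step 3: draft in a standard structure, filled from retrieved facts.
    return (f"Proposal for: {outcomes['request']}\n"
            f"Price: {facts['pricing']['support_plan']} EUR/month")

def verify_claims(draft: str, facts: dict) -> list[str]:
    # Step 4: every number in the draft must trace back to a source.
    issues = []
    if str(facts["pricing"]["support_plan"]) not in draft:
        issues.append("price does not match pricing source")
    return issues

def flag_risks(draft: str) -> list[str]:
    # Step 5: surface missing info instead of letting the model fill gaps.
    return [] if "Price:" in draft else ["no pricing section"]

def human_review(draft: str, issues: list[str], risks: list[str]) -> bool:
    # Step 6: a person approves the high-stakes parts; stubbed to approve
    # only when nothing was flagged.
    return not issues and not risks

def log_outcome(draft: str, approved: bool) -> None:
    # Step 7: record what happened so the process can improve.
    print({"approved": approved, "draft_length": len(draft)})

outcomes = clarify_outcomes("12-month support contract for Acme")
facts = pull_facts(outcomes)
draft = draft_proposal(outcomes, facts)
issues = verify_claims(draft, facts)
risks = flag_risks(draft)
log_outcome(draft, human_review(draft, issues, risks))
```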

Notice what happened:
we didn’t demand that the model possess a deep internal “concept of truth.”
We designed places where truth can enter:

  • through sources,

  • through checks,

  • through consequences.

Still useful?

A common reaction to all this is: “Fine, but if you still need a human in the loop, aren’t we basically back where we started?”

Not really. The point of a cognitive production line isn’t to replace people. It’s to move the human to the right place in the process, where their judgment actually adds value, and to hand the rest off to the system.

In practice, most knowledge work contains a lot of “mechanical” effort:

  • gathering context from scattered sources,

  • drafting first versions,

  • rewriting for structure and tone,

  • checking consistency,

  • extracting and formatting,

  • preparing summaries and options.

These are exactly the parts machines can do quickly and consistently. The human stays responsible for:

  • final intent (“is this what we actually want to say/do?”),

  • judgment calls and tradeoffs,

  • exception handling,

  • and approving anything high-stakes.

That’s why you can get real productivity gains even with a human in the loop: the human isn’t doing most of the keystrokes or context switching anymore.

There’s another part that’s easy to miss:
In many workflows, I actually expect quality and truthfulness to improve, compared to humans alone. Not because machines are “more truthful,” but because the combination is strong:

  • machines are good at speed, coverage, and consistency,

  • humans are good at judgment, values, and noticing when something is “off,”

  • and the workflow can be designed to force claims through sources and checks.

If you’ve ever done pair programming, it’s a similar idea: two imperfect systems combined with the right process can outperform either one alone.

The short takeaway

So yes — LLMs don’t come with an internal truth-meter.

But I think the more important question is: Where does your system get feedback from reality?

If the answer is “nowhere,” the model will sometimes behave like a smooth talker.

If the answer is “sources, checks, logs, outcomes,” you can get reliability without pretending you have an oracle.

A line I keep coming back to is: Truth isn’t guaranteed by the brain. It’s earned in the loop.

That’s true for humans, and I think it will be true for practical AI too.

What UNeFAi does (in brief)

UNeFAi builds AI systems that translate users’ desired outcomes into structured cognitive production lines: sequences of specialized agents, tools, verification steps, and learning loops. The idea is to produce reliable outcomes by designing the workflow, not by hoping that one day the model can be relied upon to be right.

Curious what cognitive production lines can do for your organization? We’d love to explore it with you.
