There are real costs to not admitting what you don’t know. AI companies are racing to curb a very human flaw as large language models learn to say, ‘I’m not sure.’

LONDON — Confidence is a marvellous servant and a terrible master. Economists have long noted that people who signal certainty earn trust, promotions and airtime. Yet when bravado outruns the facts, errors multiply and costs pile up: bad investments, misdiagnoses, missed warnings. That dynamic isn’t limited to humans anymore. As generative AI spreads through offices and public services, the industry behind it is contending with a distinctly human flaw: overconfidence.
The danger is mundane and therefore pervasive. In boardrooms, the person who answers fastest is often mistaken for the one who knows most. In hospitals, anchoring on an early hunch can blind a clinician to contradictory symptoms. In newsrooms and markets, apparent certainty travels faster than nuance. This is the phenomenon Sarah O’Connor has been writing about: the real, measurable costs of not admitting what you don’t know. The twist in 2025 is that large language models (LLMs) — tools trained to predict the next word — can display the same brittle certainty even when their internal evidence is thin.
Technologists call these failures hallucinations or confabulations. The model generates a fluent answer with the cadence of authority, but the facts are wrong or the citations invented. Because the prose sounds plausible, people accept it. Companies have responded with safety layers and product nudges. Many consumer chatbots now add disclaimers, sprinkle in probability words and sometimes refuse to answer. The hope is not to make models timid, but calibrated — able to surface a great answer when evidence is strong and to step back when it is not.
Calibration is an old statistical idea with new urgency. A system is said to be calibrated when statements of confidence map to reality over time: if it claims to be 80 per cent sure, it should be right about eight times in ten. LLMs were not built with this goal; they are rewarded for fluency. Engineers are now grafting on methods from machine learning and risk management — temperature scaling to tame overconfident probabilities, uncertainty estimation from model ensembles, selective prediction that lets a system abstain — to give models something like the self-knowledge humans lack.
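For the technically minded, here is a minimal sketch in Python (using NumPy; the toy scores, labels and the temperature value are invented for illustration, not drawn from any real model) of the two ideas in miniature: measuring the gap between stated confidence and actual accuracy, and temperature scaling to soften overconfident probabilities.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin answers by stated confidence and compare each bin's average
    confidence with the share of answers that were actually right."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(confidences[in_bin].mean() - correct[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

def temperature_scale(logits, temperature):
    """Divide raw scores by a temperature before the softmax: T > 1 softens
    confidence, T < 1 sharpens it, and the top-ranked answer never changes."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    probs = np.exp(z)
    return probs / probs.sum(axis=-1, keepdims=True)

# Toy data: confident-looking scores graded against labels they know nothing about,
# so accuracy sits near chance while stated confidence runs far higher.
rng = np.random.default_rng(0)
logits = rng.normal(scale=3.0, size=(1000, 4))
labels = rng.integers(0, 4, size=1000)

for temperature in (1.0, 2.5):  # 2.5 is arbitrary; real systems fit T on held-out data
    probs = temperature_scale(logits, temperature)
    confidence = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    print(f"T={temperature}: mean confidence {confidence.mean():.2f}, "
          f"accuracy {correct.mean():.2f}, "
          f"ECE {expected_calibration_error(confidence, correct):.2f}")
```

The point of the exercise is the printed gap: the answers themselves do not change, they simply stop claiming more certainty than the evidence supports.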
The cultural shift inside tech firms is as important as the math. For a decade, the implicit product mantra was speed and delight. Today, leaders speak more about ‘knowing when to say I don’t know’. Teams run red‑teaming exercises focused not only on offensive content but on over‑claiming: does the model invent laws, psychoanalyse a user, or diagnose a disease when it should route to a human? Product managers monitor the rate of safe abstentions the way banks track capital buffers.
Consider corporate search tools rolling out to offices this year. Early pilots earned raves for summarising sprawling document sets, but they also inserted confident fiction — a phantom policy, a misremembered deadline — that sent teams down rabbit holes. Fixes were unglamorous: add document‑level citations by default; require a click to reveal source passages; introduce ‘low confidence’ banners; and wire in escalation paths to subject‑matter experts. Accuracy rose. So did user trust.
Public‑sector deployments face higher stakes. A town hall that uses an assistant to answer planning queries must ensure the bot does not misstate rights or deadlines. Health systems experimenting with draft discharge notes cannot tolerate invented medications. The question is not whether AI can reduce paperwork — it can — but whether its confidence is disciplined by warrants, evidence and oversight. A single high‑profile mistake on a vulnerable case will shape public perceptions for years.
None of this is a plea for machines to mumble. There are domains where assertiveness is valuable: triaging inboxes, drafting first passes, translating instructions. The problem is misplaced certainty — the polished answer delivered with equal conviction whether the model has seen this pattern a thousand times or never. People are prone to the same flaw, which is why checklists, second opinions and pre‑mortems exist. The most successful AI products will borrow those human guardrails and automate them.
Researchers are experimenting with ways to make caution legible. One approach measures agreement among multiple model runs; when outputs diverge wildly, the system flags uncertainty or abstains. Another ties claims to verifiable sources through retrieval, showing the user where each fact came from. A third uses conformal prediction — a method that replaces a single guess with a set of candidate answers statistically guaranteed to contain the truth at a chosen rate — to bound the range of plausible answers. None is a cure‑all, but together they make it harder for glibness to masquerade as knowledge.
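As a rough illustration of the first approach, the Python sketch below assumes the caller has already collected several independently sampled answers to the same prompt; the example answers and the 0.7 agreement threshold are invented for illustration.

```python
from collections import Counter

def agreement_vote(answers, min_agreement=0.7):
    """Given several independently sampled answers to the same question,
    return the majority answer only if enough runs agree; otherwise
    abstain and flag the response for review."""
    top_answer, count = Counter(answers).most_common(1)[0]
    agreement = count / len(answers)
    if agreement >= min_agreement:
        return {"answer": top_answer, "agreement": agreement, "abstained": False}
    return {"answer": None, "agreement": agreement, "abstained": True}

# Five hypothetical runs of the same prompt: four agree, one diverges.
runs = ["31 March", "31 March", "31 March", "31 March", "30 April"]
print(agreement_vote(runs))   # high agreement -> answer returned

# Runs that scatter across dates signal thin evidence.
runs = ["31 March", "30 April", "1 May", "31 March", "15 June"]
print(agreement_vote(runs))   # low agreement -> abstain
```

Production systems compare meanings rather than exact strings, but the principle carries over: disagreement between runs is a cheap, legible signal that the model is guessing.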
There is also a design question: how should a machine express doubt? People bristle at hedging that sounds like buck‑passing, yet appreciate transparency when stakes are clear. The emerging pattern is graded disclosure. For low‑risk tasks, a subtle confidence meter or colour cue may suffice. For moderate risk, products present citations and a ‘double‑check this’ nudge. For high risk — legal, medical, financial advice — the right answer is often a redirect to a qualified professional or a human‑in‑the‑loop review. The art is to avoid desensitising users with constant alarms.
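The tiers described above lend themselves to a simple policy table. Here is a hypothetical sketch in Python; the tier names and presentation choices mirror the paragraph above rather than any shipping product.

```python
# Hypothetical graded-disclosure policy: how much doubt to surface at each risk tier.
DISCLOSURE_POLICY = {
    "low":      {"confidence_meter": True, "citations": False, "double_check_nudge": False, "route_to_human": False},
    "moderate": {"confidence_meter": True, "citations": True,  "double_check_nudge": True,  "route_to_human": False},
    "high":     {"confidence_meter": True, "citations": True,  "double_check_nudge": True,  "route_to_human": True},
}

def disclosure_for(risk_tier: str) -> dict:
    """Return the presentation rules for a risk tier, falling back to the
    most cautious treatment when the tier is unrecognised."""
    return DISCLOSURE_POLICY.get(risk_tier, DISCLOSURE_POLICY["high"])

print(disclosure_for("moderate"))        # citations plus a 'double-check this' nudge
print(disclosure_for("medical-advice"))  # unknown tier defaults to human review
```

The design point is the fallback: anything the system cannot classify gets the most cautious treatment, so the loudest warnings are reserved for the cases that deserve them.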
The economics of overconfidence are under‑appreciated. Time lost to chasing false leads is an invisible tax on productivity. In contact centres, a confident but incorrect answer forces a second call; in software, a misleading code suggestion can propagate bugs across systems. Multiply small frictions by millions of interactions and the cost is material. Companies are beginning to publish ‘hallucination budgets’ the way they publish error budgets in reliability engineering — targets that focus teams on the most expensive failure modes.
Regulators have noticed. Emerging AI rules in the United States and Europe stress transparency around limitations and require providers to evaluate and mitigate foreseeable risks. That does not mean governments can or should micromanage model behaviour, but it does put a premium on record‑keeping: what data supported a claim, what safeguards were active, when and why a system declined to answer. In time, we may see a liability gradient: the higher the confidence displayed to users, the stronger the evidence and accountability expected behind the scenes.
The workplace implications are more intimate. Junior employees who grow up with AI assistants may internalise a style of communication that prizes polish over doubt. Managers will need to reward careful uncertainty just as they reward crisp delivery. A good meeting does not end with the most emphatic voice carrying the day; it ends with a clear plan, explicit unknowns, and a strategy to resolve them. Teaching people to use AI well is partly about teaching them to ask for confidence bounds and to notice when the machine is winging it.
There is a paradox here. Confidence is contagious; doubt rarely is. Teams that admit uncertainty risk appearing indecisive. But the alternative — the ritual performance of certainty — corrodes decision‑making. The way out is to distinguish between poise and proof. Leaders can model the difference by narrating their own uncertainty and setting norms: we publish only with sources; we escalate when evidence is thin; we celebrate near‑misses that were caught by the process. The signal to the market is competence, not bravado.
This summer, several AI providers introduced ‘I don’t know’ modes for enterprise customers. The assistants now ask to search, to retrieve documents, or to involve a colleague when confidence falls below a threshold tuned by the client. Early adopters report fewer spectacular bloopers and more predictable workflows. The trade‑off is speed. That is a fair price. In settings where errors carry legal or financial consequences, a slower, surer assistant is worth more than a fast, flaky one.
Humans will not stop over‑claiming; neither will machines. But we can make it costly to bluff. Product teams can publish calibration plots and abstention rates. CIOs can demand procurement pilots that measure not only accuracy but the quality of uncertainty signals. Journalists can resist the lure of definitive takes when the evidence is genuinely mixed. And the rest of us can practise a phrase that should be stamped above every keyboard: I don’t know yet — here is what I would check.
Overconfidence feels safe in the moment because it silences anxiety. In reality, it is a form of risk‑taking with compound interest. Admitting what we don’t know — and building machines that can do the same — is not weakness. It is the cornerstone of judgement. The strongest signal of intelligence, human or artificial, is not the volume at which it speaks, but the discipline with which it distinguishes knowledge from guesswork.



