AI & Agentic Systems

AI Hallucination Is a Trust Problem, Not a Technical One

Hallucination is framed as a retrieval bug to be engineered away. Lee and See would call it a trust calibration failure, and that framing changes the solution entirely.

2026-05-14 · 6 min read · AI & Agentic Systems · Trust & Security
AI · Part 19 of 51

In early 2023 a lawyer filed a brief in federal court in the Southern District of New York. He had used ChatGPT to research precedent, and the model had cheerfully invented six cases that did not exist, complete with docket numbers, plaintiff names, and plausible-sounding holdings. The lawyer did not verify them because the output looked authoritative. The judge noticed. Sanctions followed. When the story broke that May, the reaction I kept seeing was that we need better models, better retrieval, better grounding. What I could not stop thinking was that the damage had very little to do with the model architecture. It had everything to do with how a human calibrated trust in a system that was intermittently reliable in ways the human could not predict.

The AI industry frames hallucination as a technical bug. Better training data. Retrieval-augmented generation. Tighter prompting. Stricter factuality benchmarks. The assumption is that hallucination lives inside the model, and if we engineer the model better, the problem shrinks. I think this is the wrong frame. Lee and See (2004) would look at the lawyer, the chatbot, the enterprise team that gave up on their AI pilot, and call all of them the same thing: a trust calibration failure. The technical fix misses the mechanism that actually causes the damage.

Lee and See (2004) describe three calibration states. Overtrust, which leads to misuse, is when a person relies on automation beyond its actual capability. This is the lawyer. The ChatGPT output looked like legal writing: it cited real-sounding authorities, and it used the register and format of a federal filing. The lawyer calibrated trust based on surface fluency, not on actual reliability, because surface fluency was the only signal available. The system gave no warning flag, no confidence score, no structural cue that said "I am fabricating right now." The lawyer was not lazy. He was operating with incomplete information about when the system would fail. That is overtrust, and overtrust is a calibration problem, not a retrieval problem.

Undertrust is the mirror image. It happens when people reject automation that could improve their performance, because past failures or opacity have destroyed any basis for calibrated reliance. Lee and See call the resulting behavior disuse. An enterprise team that watched the model hallucinate twice in a demo stops using it for anything, including tasks the model would have done well. The model did not get worse. The calibration broke because failures arrived unpredictably and each failure cost far more trust than each success earned. One hallucinated number in a quarterly report erases whatever trust was built across fifty accurate ones.

Calibrated trust is the goal: reliance matched to actual capability. But calibration requires predictability. You need to know when the system is reliable and when it is not. If the model is right ninety percent of the time but the ten percent of failures are indistinguishable from the successes, calibration is impossible. The user cannot learn the boundary because the boundary is invisible. This is why hallucination is not primarily a model-quality problem. You can improve accuracy from ninety percent to ninety-five percent and the calibration problem does not go away, because the remaining five percent of failures are still unpredictable and the user still cannot tell which output is the bad one.
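The arithmetic behind that claim can be sketched in a few lines. This is a toy model under stated assumptions, not anything taken from Lee and See: the user can afford to hand-check a fixed fraction of outputs, a hand check always catches an error, and the "cue" is a hypothetical signal that flags some outputs as risky. The function names, the 20 percent checking budget, and the 80 percent recall figure are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    checked: float     # fraction of outputs the user verifies by hand
    undetected: float  # fraction of outputs that are wrong and go unnoticed


def without_cue(accuracy: float, check_rate: float) -> Outcome:
    """Failures look like successes, so spot checks land at random."""
    error_rate = 1.0 - accuracy
    return Outcome(check_rate, error_rate * (1.0 - check_rate))


def with_cue(accuracy: float, flag_rate: float, recall: float) -> Outcome:
    """A hypothetical cue flags `flag_rate` of outputs, and `recall` of all
    errors fall inside the flagged set. The user checks only flagged outputs."""
    error_rate = 1.0 - accuracy
    return Outcome(flag_rate, error_rate * (1.0 - recall))


# Same checking effort (20% of outputs) in every row; all numbers are assumed.
for accuracy in (0.90, 0.95):
    blind = without_cue(accuracy, check_rate=0.20)
    cued = with_cue(accuracy, flag_rate=0.20, recall=0.80)
    print(
        f"accuracy={accuracy:.2f}  "
        f"undetected without cue={blind.undetected:.2f}  "
        f"undetected with cue={cued.undetected:.2f}"
    )
```

Under these assumed numbers, the ninety-percent model with a cue lets fewer bad outputs through than the ninety-five-percent model without one, for the same amount of checking. The specific figures mean nothing; the point is that the cue term and the accuracy term are separate levers.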

I kept thinking about Dimoka (2010). She showed that trust and distrust are not opposite ends of one continuum. They involve different neural substrates and they can coexist. I trust ChatGPT to summarize my notes but I do not trust it to answer a legal question, and I hold both these evaluations at the same time without any contradiction. The interface does not surface this distinction. The same model that writes perfectly accurate code can fabricate an API function that does not exist, and the user sees the same tone, the same confidence. There is no structural separation between what the model is competent at and what it is not. The user supplies that separation from judgment alone, which is exhausting and unreliable.

Air Canada learned this in February 2024. A passenger asked the airline's chatbot whether bereavement fares were available, and the chatbot invented a refund policy that did not exist. The passenger relied on it. The airline argued the chatbot was a separate legal entity. The tribunal disagreed and ordered the airline to compensate the passenger. When I read this story, I was not thinking about whether the model was well-tuned. I was thinking about boundary resources. The chatbot sat at the boundary between the organization and the customer. It spoke with the voice of the airline. The customer's calibration was not about the model. It was about the organization that deployed it. If you put a conversational agent at your organizational boundary and give the customer no structural cues about when the agent is authorized to speak and when it is improvising, you have designed a trust problem, not deployed a defective model.

This is where I think the hallucination debate is missing its own mechanism. The industry keeps treating hallucination as a technical problem to be engineered away. Better retrieval. Better fine-tuning. Better benchmarks. The implicit assumption is that once the hallucination rate drops below some threshold, the problem is solved. I think this is wrong because the calibration problem is structural, not statistical. A system that is accurate ninety-nine percent of the time still has a calibration problem if the remaining one percent of failures are catastrophic, unpredictable, and indistinguishable from successes. The user needs to know when to override the system and when to accept it. Without structural cues, even a very accurate model leaves the user in a state of permanent calibration uncertainty, and permanent calibration uncertainty produces the same organizational outcomes as an unreliable model: people either blindly trust the output or abandon the tool entirely.

The interventions that Lee and See describe for each calibration failure are different, and neither one is "make the model more accurate." For overtrust, the intervention is transparency about limitations. Show the user where the model is likely to fail, when confidence is low, what the plausible failure modes are. For undertrust, the intervention is evidence of capability, shown in domains where the model is genuinely reliable and the user can verify it. Neither intervention requires changing the model architecture. Both require changing the information environment in which the user operates. This is what makes hallucination an information systems (IS) problem, not a machine learning (ML) problem. It requires designing the interaction between the person and the system so that calibration is possible. It requires surfacing the information the user needs to know when to trust the output and when to override it. If you work on the model and ignore the interaction, the calibration problem remains regardless of how accurate the model becomes.
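One way to picture "changing the information environment" is a response format that carries its own provenance. The sketch below is a minimal illustration, not a description of any real product: the `Claim` type, the `Support` labels, the `render` function, and the sample policy text and document identifier are all hypothetical. It assumes the deploying system already knows which statements came from a document the user can open and which were generated without one.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Support(Enum):
    SOURCED = "sourced"          # backed by a document the user can open
    UNSUPPORTED = "unsupported"  # generated without a checkable source


@dataclass(frozen=True)
class Claim:
    text: str
    source: Optional[str] = None  # hypothetical document identifier

    @property
    def support(self) -> Support:
        return Support.SOURCED if self.source else Support.UNSUPPORTED


def render(claims: List[Claim]) -> str:
    """Label every claim so the two kinds of output never look the same."""
    lines = []
    for claim in claims:
        if claim.support is Support.SOURCED:
            lines.append(f"[source: {claim.source}] {claim.text}")
        else:
            lines.append(f"[UNVERIFIED, check before relying] {claim.text}")
    return "\n".join(lines)


# Made-up example content, for illustration only.
answer = [
    Claim("Refund requests must be filed before travel.", source="policy-doc-12"),
    Claim("Refunds can also be requested after travel."),
]
print(render(answer))
```

The design choice here is that the label is structural, not tonal. The model's wording can be equally fluent in both lines, but the user sees a different frame around the claim that has nothing behind it, which is the cue the lawyer and the airline passenger never got.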

I have written before about the specification error that happens when researchers measure trust, trustworthiness, reliance, and delegation as if they were the same construct. That problem shows up here too. When an enterprise deploys an AI system and measures adoption, it is measuring a behavioral outcome that trust influences but does not determine. A team might use the system because they have no alternative, without trusting it. Another team might trust the model's output but not delegate high-stakes decisions to it because organizational policy blocks that transfer of authority. If the organization thinks the problem is low adoption and tries to fix it by improving the model, it is treating a delegation or calibration problem as an accuracy problem, and the fix will miss the mechanism.

Bhattacherjee knew this in 2001. The continuance problem is not about whether people start using a system. It is about whether they keep using it after the initial experience. Every SaaS company that measures signup rates and ignores the first-thirty-day experience is replaying the same error: treating adoption as the finish line when the actual mechanism that determines outcomes is what happens between expectation and confirmation. Lee and See gave us the calibration framework. Dimoka gave us the neural separation of trust and distrust. Mayer et al. gave us the distinction between trust and trustworthiness. The IS field built most of the theoretical toolkit for understanding hallucination as a trust problem roughly twenty years ago. The AI industry is still treating it as an engineering one.

I am not saying model quality does not matter. I am saying that model quality alone cannot solve the problem, because the problem is not that the model sometimes fails. The problem is that the user cannot predict when it will fail, and uncertainty about failure is a calibration problem, not a retrieval problem. The most profitable investment a SaaS company can make in its AI product is not a better model. It is better structural information about what the model is likely to get right, what it is likely to get wrong, and what happens in between. That is an IS question. Bhattacherjee asked it about continuance in 2001. Lee and See asked it about automation in 2004. The IS field had the tools. The AI industry just has not picked them up yet.


About the author

Ali Safari
PhD Student in IS, University of North Texas

Researching AI governance, trust in intelligent systems, and agentic AI. Writing while studying for comps.
