The Specification Error in Your Trust Variable

I was reviewing a paper draft last week that measured "trust in AI" by asking participants whether the system was competent. The items were clean, the factor loadings were strong, and the conclusions felt reasonable. The problem is that competence is not trust. Competence is one dimension of trustworthiness, which is a property of the trustee. Trust is a psychological state of the trustor. The study had measured the agent's characteristics and then called them the trustor's willingness to be vulnerable. Mayer et al. (1995) could not have been clearer about this distinction, and I keep seeing it blurred anyway, not just in student papers but in published work whose authors should know better. So I want to lay out what I think is the core problem: four constructs that keep getting collapsed into one variable, and why that collapse is a specification error, not a semantic quibble.

Mayer et al. (1995) define trust as the willingness of one party to be vulnerable to the actions of another party, based on the expectation that the other will perform a particular action important to the trustor, irrespective of the trustor's ability to monitor or control that party. Three things matter in this definition. Trust requires vulnerability, meaning the trustor accepts risk. Trust involves positive expectation, meaning the trustor believes the trustee will act beneficially. Trust involves lack of control, meaning the trustor cannot fully monitor the trustee's behavior. Notice that all three elements sit inside the person doing the trusting. Trust is a psychological state, not a feature of the system being trusted.

Trustworthiness, by contrast, is a property of the trustee. Mayer et al. break it into three dimensions: ability, which I call competence throughout this post (the capacity to perform), benevolence (the willingness to act in the trustor's interest rather than purely in self-interest), and integrity (adherence to principles the trustor finds acceptable). These three are distinct. A system can be competent without being benevolent. A system can have integrity in how it follows rules without being particularly competent at the task. And trust depends on all three, weighted by context. The paper that measures competence items and labels the factor "trust" has committed a specification error because it has measured a subset of trustworthiness and called it the trustor's willingness to be vulnerable. The construct it actually captured sits at a different level and has different antecedents.

McKnight et al. (2002) make the problem even more visible by disaggregating trust into four types. Disposition to trust is a general willingness to depend on others across situations. Institution-based trust is the belief that structural guarantees create safety. Trusting beliefs are perceptions of a specific party's competence, benevolence, and integrity. Trusting intentions are the willingness to depend on that specific party in a specific situation. A study that measures disposition to trust and generalizes to trusting intentions has committed a level error. The antecedents and measurement requirements of these four types are not interchangeable. McKnight et al. built this disaggregation precisely because earlier work kept conflating personality dispositions with situational beliefs and behavioral intentions, and the conflation was producing predictions that did not hold.

So far this might sound like a disagreement about labels. It is not. The error has real consequences when you move from interpersonal trust to trust in automation, and then from trust to delegation. Lee and See (2004) frame the challenge through three calibration states. Overtrust (misuse) means relying on automation beyond its actual capability. Undertrust (disuse) means rejecting automation that could improve performance. Calibrated trust is the goal: reliance matched to actual capability. This reframing changes the intervention. If trust were simply a quantity to maximize, you would always want to increase it. But if overtrust is a real danger, increasing trust is sometimes exactly the wrong move. You want to calibrate it, which means sometimes decreasing trust and sometimes increasing it, depending on the mismatch between perceived and actual capability.
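To make the calibration logic concrete, here is a minimal sketch. The 0-to-1 scales, the tolerance band, and the function names are my own illustrative assumptions, not anything Lee and See (2004) specify. The point is only that the recommended direction of the intervention depends on the sign of the gap, not on the level of trust.

```python
from enum import Enum


class Calibration(Enum):
    OVERTRUST = "overtrust (misuse)"
    UNDERTRUST = "undertrust (disuse)"
    CALIBRATED = "calibrated"


def calibration_state(perceived: float, actual: float, tolerance: float = 0.1) -> Calibration:
    """Compare perceived capability with actual capability.

    Both inputs are assumed to sit on a common 0-1 scale, and `tolerance`
    is an arbitrary band within which the two count as matched.
    """
    gap = perceived - actual
    if gap > tolerance:
        return Calibration.OVERTRUST
    if gap < -tolerance:
        return Calibration.UNDERTRUST
    return Calibration.CALIBRATED


def intervention_direction(state: Calibration) -> str:
    """Map a calibration state to the direction an intervention should push trust."""
    if state is Calibration.OVERTRUST:
        return "decrease trust"
    if state is Calibration.UNDERTRUST:
        return "increase trust"
    return "leave trust alone"


# Hypothetical users of the same system (actual capability 0.6).
for perceived in (0.9, 0.3, 0.65):
    state = calibration_state(perceived, actual=0.6)
    print(perceived, state.value, "->", intervention_direction(state))
```

A "raise trust" intervention applied to all three hypothetical users would help one of them, harm one, and be wasted on the third.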

Dimoka (2010) pushes this further. Her NeuroIS evidence suggests that trust and distrust are not opposite ends of one continuum. They appear to involve different neural substrates, and they can coexist. Reducing distrust is not the same thing as increasing trust. They are separate processes with separate antecedents. A study that reverse-codes distrust items and calls the result "low trust" is not just committing a measurement error. It is missing an entire mechanism. If your AI system is generating both trust (based on demonstrated competence) and distrust (based on perceived opacity or past failures), you need to measure both, because the intervention that reduces distrust (transparency about limitations) is different from the intervention that increases trust (evidence of capability).
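A toy example shows what the reverse-coding shortcut throws away. The 7-point scale, the averaging rule, and the two respondents are hypothetical. They are not drawn from Dimoka's data.

```python
SCALE_MIN, SCALE_MAX = 1, 7  # assumed 7-point Likert scale


def bipolar_score(trust: int, distrust: int) -> float:
    """Collapse two scales into one by reverse-coding distrust and averaging."""
    reversed_distrust = SCALE_MAX + SCALE_MIN - distrust
    return (trust + reversed_distrust) / 2


# Hypothetical respondents.
ambivalent = {"trust": 6, "distrust": 6}   # impressed by capability, worried by opacity
indifferent = {"trust": 4, "distrust": 4}  # no strong view either way

print(bipolar_score(**ambivalent), bipolar_score(**indifferent))
```

Both respondents land on the same collapsed score, even though the first plausibly needs transparency about limitations and the second needs evidence of capability. Keeping the two scales as separate variables preserves that difference.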

Now we get to the construct that creates the most confusion in current AI research. Baird and Maruping (2021) argue that for agentic information systems, delegation is the right central construct, not use. I wrote about this in more detail in my post on why use is the wrong construct for agentic systems. What matters here is the relationship between delegation and trust. Delegation is a behavioral transfer of task rights and responsibilities from a human to an agentic system. It involves appraisal (judging whether the agent can do the task), distribution (allocating subtasks), and coordination (managing interdependencies). Trust can influence delegation. But trust is not delegation. You can delegate without trusting, because your boss told you to, because there is no alternative, or because the task is too small to worry about. You can trust without delegating, because the task is too consequential, or because institutional constraints block transfer. Treating delegation as a straightforward behavioral consequence of trust is the same kind of specification error as measuring competence and calling it trust. The antecedents are different, the mechanism is different, and the level of analysis shifts from a psychological state to a behavioral act embedded in organizational context.

Liu et al. (2025) make this separation empirically concrete. Their hidden Markov model treats willingness to delegate as a latent state distinct from both trust and observable delegation behavior. Willingness is a behavioral disposition to delegate a specific task. Trust is a belief about the agent's reliability and competence. The two can diverge. A manager can trust the AI's diagnostic accuracy but remain unwilling to delegate because the decision is too consequential, or because hospital policy requires human sign-off. That same manager can delegate despite low trust because time pressure leaves no alternative. Liu et al. also separate the observable delegation outcome (accepting or overriding a recommendation, which leaves a digital trace) from the unobservable willingness state (which exists in the manager's mind and does not show up in any log). This three-way distinction, between trust as belief, willingness as latent state, and delegation as observable behavior, is where I think the field needs to land. Each operates at a different level. Each has different antecedents. Each requires different measurement. Conflating them is not a terminological shortcut. It is a specification error that changes what you think you are explaining and what interventions you think will work.
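To show what it means to treat willingness as a latent state behind observable delegation, here is a generic two-state hidden Markov model with a forward filter. This is a textbook sketch, not Liu et al.'s specification: the state labels, the probabilities, and the observation sequence are all invented for illustration, and trust does not appear in it at all.

```python
# Latent states: 0 = unwilling, 1 = willing. Observations: 0 = override, 1 = accept.
# Every number below is a made-up illustrative value.
INITIAL = [0.5, 0.5]
TRANSITION = [
    [0.8, 0.2],  # unwilling -> unwilling / willing
    [0.1, 0.9],  # willing   -> unwilling / willing
]
EMISSION = [
    [0.7, 0.3],  # unwilling: still accepts sometimes (e.g. time pressure)
    [0.2, 0.8],  # willing: still overrides sometimes (e.g. sign-off policy)
]


def forward_filter(observations, initial, transition, emission):
    """Return P(latent state at t | observations up to t) for every t."""
    n = len(initial)
    beliefs = []
    prior = list(initial)
    for t, obs in enumerate(observations):
        if t > 0:
            # Predict: push the previous belief through the transition matrix.
            prev = beliefs[-1]
            prior = [sum(prev[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Update: weight each state by how likely it makes the observation.
        unnormalized = [prior[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalized)
        beliefs.append([u / total for u in unnormalized])
    return beliefs


log_trace = [1, 1, 0, 1]  # accept, accept, override, accept
for obs, belief in zip(log_trace, forward_filter(log_trace, INITIAL, TRANSITION, EMISSION)):
    print("accept  " if obs else "override", f"P(willing) = {belief[1]:.3f}")
```

The log contains only accepts and overrides. Willingness is inferred from them and never observed directly, which is why an accept in the log is not the same thing as a willing manager.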

I want to be precise about what specification error means here, because the term comes from econometrics and I am using it in the IS theory sense. A specification error occurs when the model includes the wrong construct for the theoretical mechanism you are claiming to test. If your theory says trust influences continued use, but your instrument measures trustworthiness, your model is misspecified because you have measured a property of the system when your theory calls for a psychological state of the person. If your theory says delegation is the key dependent variable for agentic AI, but you measure trust and call it delegation, the same problem applies. The coefficients may still be significant, but they are attached to the wrong construct. The intervention you derive from those coefficients will target the wrong variable. You will try to increase trust when the bottleneck is institutional authority, or you will try to demonstrate competence when the bottleneck is task criticality.
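A small simulation illustrates how a coefficient can look healthy while being attached to the wrong construct. Everything here is assumed for the sake of the example: the linear data-generating process, the effect sizes (`alpha` for competence on trust, `beta` for trust on use), and the choice to lump benevolence, integrity, and disposition into noise.

```python
import random


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, seed=0, alpha=0.5, beta=0.8):
    """Assumed world: competence -> trust -> use, with unrelated noise at each step."""
    rng = random.Random(seed)
    competence = [rng.gauss(0, 1) for _ in range(n)]
    # Trust is only partly driven by perceived competence; the noise term
    # stands in for every other antecedent of trust.
    trust = [alpha * c + rng.gauss(0, 1) for c in competence]
    use = [beta * t + rng.gauss(0, 1) for t in trust]
    return ols_slope(trust, use), ols_slope(competence, use)


slope_on_trust, slope_on_competence = simulate()
print(f"use ~ trust:      {slope_on_trust:.2f}")
print(f"use ~ competence: {slope_on_competence:.2f}")
```

In this assumed setup the slope on competence recovers roughly `alpha * beta`, not `beta`. With a sample this large it would easily clear a significance test, but reporting it as "the effect of trust" misstates the size of the effect and, more importantly, the construct an intervention would have to move.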

The four constructs I have been talking about, trust, trustworthiness, reliance, and delegation, operate at different levels and have different antecedents. Trust is a psychological state of the trustor. Trustworthiness is a set of characteristics of the trustee. Reliance is a behavioral pattern of depending on the system, sometimes without the vulnerability that trust requires. Delegation is a behavioral transfer of task rights and responsibilities. You can rely on a system without trusting it, the way you rely on a crosswalk signal without any relationship with it. You can trust a system without delegating to it, the way you might trust a colleague's expertise but not give them decision rights on your project. These are not minor distinctions. They determine what your study actually measures, what predictions you can make, and what interventions you can recommend.