The World Exists Whether You Measure It or Not

I kept reading the same sentence in three papers without fully absorbing it: observed regularities may not reveal underlying causal mechanisms. I had underlined it in Mingers et al. (2013), circled it in Wynn and Williams (2012), and still treated it as a caveat rather than the whole point. Then I realized that this one sentence is the reason a business intelligence (BI) dashboard saves lives in one hospital and does nothing in another, and no amount of variance explained will tell you why.

Critical realism starts from a claim so obvious it sounds banal: the world exists whether you study it or not. Bhaskar (1975) called this the intransitive dimension of reality. Things have structures and powers independent of our knowledge of them. Gravity worked before Newton named it. This is what makes CR different from interpretivism, which treats social reality as constructed through meaning. Interpretivism is powerful for understanding how people make sense of their situation, but it has a hard time saying that something real is happening beneath those interpretations. CR says: people construct meaning, and there are mechanisms operating regardless of whether anyone notices them.

This is also what makes CR different from positivism. Positivism says reality is objective and observable. CR says reality is objective but only partially observable. The positivist sees a correlation between BI dashboard use and patient outcomes and tests whether it is statistically significant. The critical realist asks: what mechanism produced this correlation, and in what context?

Bhaskar's stratified ontology is where this gets concrete. He splits reality into three domains. The real contains mechanisms, structures, and powers: the causal forces that generate events. A hospital's decision-making routines and information norms are mechanisms at the real level. The actual is the set of events that happen when mechanisms are activated: a doctor using the dashboard to triage patients, or a nurse ignoring it because her shift has a different workflow. The empirical is the subset of those events we observe and measure: usage logs, survey responses, quarterly reports. These three domains do not line up neatly. Mechanisms operate whether or not anyone observes the events they generate. And mechanisms can counteract each other. Two hospitals install the same BI dashboard. In one, the decision-making routine puts the dashboard to productive use. In the other, a resource-allocation mechanism suppresses that use because management treats the dashboard as a surveillance tool. The event that shows up in the data is the net result of multiple mechanisms pulling in different directions. The positivist model that reports R-squared is describing the empirical layer. It cannot tell you which mechanism is doing the work.
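A toy sketch can make the three domains tangible. Everything in it is an assumption made for illustration: the mechanism names, the effect sizes, the idea that effects simply add up, and a third hospital (C) that does not appear in the example above. It is not a method from Bhaskar or from Wynn and Williams, only a way to see how two different mechanism configurations can leave the same trace in the data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical contextual conditions at one hospital."""
    has_decision_routine: bool
    used_for_surveillance: bool


# The real: mechanisms with powers that exist whether or not they are observed.
# Effect sizes are invented and treated as additive purely for illustration.
def decision_routine(ctx: Context) -> float:
    return 2.0 if ctx.has_decision_routine else 0.0


def surveillance_suppression(ctx: Context) -> float:
    return -2.0 if ctx.used_for_surveillance else 0.0


MECHANISMS = {
    "decision_routine": decision_routine,
    "surveillance_suppression": surveillance_suppression,
}


def actual_events(ctx: Context) -> dict[str, float]:
    """The actual: what each mechanism contributes once context activates it."""
    return {name: mechanism(ctx) for name, mechanism in MECHANISMS.items()}


def empirical_observation(ctx: Context) -> float:
    """The empirical: only the net outcome reaches the usage logs and reports."""
    return sum(actual_events(ctx).values())


hospital_a = Context(has_decision_routine=True, used_for_surveillance=False)
hospital_b = Context(has_decision_routine=True, used_for_surveillance=True)
hospital_c = Context(has_decision_routine=False, used_for_surveillance=False)

for name, ctx in [("A", hospital_a), ("B", hospital_b), ("C", hospital_c)]:
    print(name, empirical_observation(ctx), actual_events(ctx))
```

In this sketch hospitals B and C record the same net outcome of zero, but for different reasons: in B two mechanisms cancel each other, in C neither is active. A model fitted only to the recorded outcomes cannot tell those two cases apart.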

Wynn and Williams (2012) built CR research design principles for IS on top of this stratified ontology. Their core logic is retroduction: reasoning backward from observed events to the mechanisms that could have generated them. This is neither induction nor deduction. It is a different mode of inference. You observe something that needs explaining, propose hypothetical mechanisms that, if they existed, would produce what you observed, and test those mechanisms against evidence. Mingers et al. (2013) spell out what this means for IS. CR provides ontological depth that positivism cannot deliver, because it distinguishes the real from the actual from the empirical. It respects the social construction of knowledge that interpretivism stresses, because it acknowledges that our access to mechanisms is always theory-laden and fallible. But it does not collapse into relativism, because it insists that there are real mechanisms we can approximate even if we can never fully know them.

The practical consequence is that a critical realist study looks different from a positivist study. You do not start with hypotheses and test them with a survey. You start with a puzzle and gather evidence to identify which mechanisms, in which contextual configurations, could have produced the observed outcome. Wynn and Williams (2012) lay out the method: identify mechanisms, specify contextual conditions, trace how mechanism plus context produces outcome patterns, and evaluate by explanatory power rather than statistical significance. Zachariadis, Scott, and Barrett (2013) extend this to mixed methods, arguing that quantitative methods are largely descriptive because correlations alone cannot uncover mechanisms, while qualitative methods do the heavier work of constructing propositions and identifying how mechanisms interact. In CR, the qualitative work identifies the mechanism, and the quantitative work describes the pattern it produces.

The reason I keep coming back to this is that so many IS studies run the wrong inference. They test whether X predicts Y, find a significant path coefficient, and move on. But the same X produces different Y in different contexts because different mechanisms are at work. Randomized controlled trials in IS illustrate this. You randomize hospitals into treatment and control, roll out a BI dashboard, measure outcomes. If the average treatment effect is positive, you declare success. But CR forces you to ask: which mechanism produced the effect where it worked, and which suppressed it where it failed? The RCT gives you the empirical layer. It tells you that, on average, something happened. It cannot tell you why, because why is a question about mechanisms and contexts, not averages.
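The arithmetic behind that point is easy to sketch. The numbers below are made up, the context labels are hypothetical, and the eight "hospitals" are not data from any study. The sketch only shows how a positive average treatment effect can sit on top of two very different situations.

```python
from statistics import mean

# Invented trial data: (context label, received dashboard?, change in outcome).
hospitals = [
    ("quality_routine", True, 4.0),
    ("quality_routine", True, 4.0),
    ("surveillance", True, 0.0),
    ("surveillance", True, 0.0),
    ("quality_routine", False, 0.0),
    ("quality_routine", False, 0.0),
    ("surveillance", False, 0.0),
    ("surveillance", False, 0.0),
]


def average_effect(rows):
    """Difference in mean outcome between treated and control hospitals."""
    treated = [y for _, is_treated, y in rows if is_treated]
    control = [y for _, is_treated, y in rows if not is_treated]
    return mean(treated) - mean(control)


# The headline number an RCT would report.
ate = average_effect(hospitals)

# The same calculation, split by the (assumed) contextual condition.
contexts = sorted({ctx for ctx, _, _ in hospitals})
effect_by_context = {
    ctx: average_effect([row for row in hospitals if row[0] == ctx])
    for ctx in contexts
}

print("average treatment effect:", ate)
print("effect by context:", effect_by_context)
```

Here the overall effect comes out at 2.0, while the split shows 4.0 in one context and 0.0 in the other. Note that even the split is still a description of the empirical layer. It tells you where to look, not which mechanism did the work; that step still needs the case-level reasoning the paragraph above describes.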

I think this is the most useful thing CR gives the IS field, and also the thing the field resists most. As Orlikowski and Baroudi (1991) documented, IS has been dominated by positivist assumptions for decades. Studies built on the Technology Acceptance Model (TAM) and the Unified Theory of Acceptance and Use of Technology (UTAUT) measure perceptions, model paths, report variance explained, and rarely ask what mechanism connects perceived usefulness to behavioral intention. When the same model produces different results across studies, the positivist response is to add moderators. The CR response is to ask: what mechanism is operating here, and is it the same mechanism across contexts? A theory that specifies mechanism plus context plus outcome does more explanatory work than one that specifies X predicts Y, even if the latter has a higher R-squared.

The biggest trap, and Mingers et al. (2013) warn about this explicitly, is treating CR as a compromise between positivism and interpretivism. It is not a hybrid. It is a distinct position with its own ontology, its own epistemology, and its own logic of inquiry. The stratified ontology is not something positivism and interpretivism can produce by averaging. Retroduction is not a middle ground between deduction and induction. It is a different mode of reasoning. As I have written about paradigms before: a paradigm is not a method, and you cannot mix ontologies. CR is its own commitment.

Where this hits the real world is in implementation failure. Hospital A buys a clinical decision support system and mortality drops. Hospital B buys the same system and nothing changes. The positivist study reports the average effect and tests for heterogeneity. The interpretivist study documents how clinicians made sense of the system. The critical realist study compares the two hospitals and asks: what mechanism, activated in what context, generated this outcome in Hospital A but not Hospital B? Maybe Hospital A had a quality improvement routine that the system amplified. Maybe Hospital B had a surveillance culture that made clinicians game the metrics rather than use them for clinical reasoning. The technology is the same; the contexts activated different mechanisms. Understanding that lets you design better interventions, because you stop treating the technology as if it carries its effect inside it. The effect is co-produced by mechanism and context.

This is also why CR is useful for thinking about AI adoption. The same large language model produces different outcomes in different organizations not because the technology varies but because the organizing mechanisms vary. Trust routines, delegation norms, feedback processes: these are real mechanisms that generate actual events, and only some of those events make it into the empirical layer. Delegation theory captures part of this, but CR gives you the deeper ontology to explain why the same delegation pattern produces different outcomes in different places.

There is a cost. CR research is harder to do well because you cannot hide behind a clean model fit. You have to specify mechanisms, make them plausible, show how they interact with context, and demonstrate that the combination could have produced what you observed. The evaluation standard is explanatory power, not statistical significance. Wynn and Williams (2012) acknowledge this: CR can be vague unless mechanisms and contexts are clearly specified. The cure is not to retreat to positivism. The cure is to do the harder work.

The world exists whether you measure it or not. Most of it exists in layers you cannot measure directly. Critical realism gives you a language for those layers and a method for reasoning about them. Whether IS research takes up that language in large numbers is a different question. The field has been slow, and I understand why: positivism scales, interpretivism respects meaning, and CR demands mechanism-level explanation that requires careful case comparison. But if the question is why the same technology produces different outcomes in different contexts, and every implementation study you have read leaves that question unanswered, then the stratified ontology is not a philosophical luxury. It is the minimum you need to ask the question properly.