What Theory Actually Is (And What It Isn't)

I had been reading IS papers for about eight months before I realized that most of what I was calling theory was not theory at all. I would underline a conceptual framework diagram, circle the hypotheses, highlight the variable names, and file the whole thing under "theoretical contribution" in my notes. Then I read Sutton and Staw (1995) and realized that every single thing I had been pointing to was on their list of what theory is not.

Sutton and Staw name five things that are not theory: references, data, variable lists, diagrams, and hypotheses. Each one can accompany theory, but none of them is theory by itself. References show you read the literature. Data show you collected something. Variable lists show you named things. Diagrams show you drew boxes. Hypotheses show you made predictions. None of these explains why anything happens. They are the furniture of a paper, not its architecture. Theory is the argument that connects constructs to each other through a mechanism that justifies the connection. Without the mechanism, you have a list, not an explanation.

Whetten (1989) makes the same point from the positive side. A theory needs four things: what, how, why, and boundaries. The "what" names the constructs and defines them. The "how" describes the relationships among those constructs. The "why" provides the logic that justifies those relationships, and this is the part that matters most. A paper can have constructs and hypotheses without having theory, because theory requires the argument that explains why the constructs relate as proposed. The boundaries say when, where, and for whom the explanation should hold. Whetten calls the "why" the theoretical glue. Without it, the whole thing falls apart.

I think about this whenever I review a paper that presents a research model with six constructs, five hypotheses, and a diagram showing paths with asterisks, but never explains why those paths should exist. The authors will cite theory, name theory, draw theory, but they will not do the work of theory. The why is missing. Neither Sutton and Staw nor Whetten is making a subtle point. They are saying that most of what gets published under the label of theoretical contribution is missing the one part that makes it theoretical.

Gregor (2006) does something different and I think equally important. Instead of asking what theory is or is not, she asks what kind of theory it is. She identifies five types in IS. Type I is theory for analysis, which classifies and describes phenomena without explaining or predicting. Type II is theory for explanation, which says how and why something happens. Type III is theory for prediction, which forecasts outcomes without explaining the mechanism. Type IV is theory for explanation and prediction, which does both. Type V is theory for design and action, which tells you how to build something or intervene. Gregor also discusses Dubin's precision and power paradoxes: you can predict an outcome precisely without understanding how it is produced, and you can understand a process well without being able to predict its outcomes precisely. Explanation and prediction are separate achievements, and having one does not hand you the other. The typology forces you to choose what kind of contribution you are making, instead of pretending you are making all of them at once.

This matters because different theory types answer different questions, and confusing them produces bad research. A Type I analysis that classifies AI governance mechanisms into five categories is a real contribution, but it is not an explanation. A Type III prediction model that forecasts churn rates from usage data is useful, but it does not tell you why customers leave. A Type V design principle for building transparent AI systems tells you how to intervene, not why transparency works. When reviewers or authors treat these as interchangeable, the result is either overclaiming or underdelivering. Gregor's framework gives you a vocabulary for being honest about what your theory does.

Then there is Mohr (1982), who I think is underread outside of methods seminars but who changes how you read everything after him. Mohr argues that variance theory and process theory operate on different logics and cannot be judged by the same criteria. Variance theory treats the precursor as a necessary and sufficient condition, in the form "if X, then Y." You judge it by effect sizes, variance explained, and falsifiability. Process theory treats the precursor as necessary but not sufficient: the outcome comes from a temporal sequence of events, not from the level of a predictor. You judge it by fidelity to that sequence and narrative coherence. A process study evaluated by variance criteria looks weak, because the predictors will not explain large amounts of variance. A variance study evaluated by process criteria looks ahistorical, because it ignores sequence and timing. The two logics are not competing. They are answering different kinds of questions. But if you mix them without knowing which one you are using, you end up with research that satisfies nobody.
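A toy calculation makes the mismatch concrete. The sketch below assumes a hypothetical population of 1,000 cases in which a precursor X is necessary for an outcome Y (Y never occurs without X) but far from sufficient (Y follows X only one time in ten). The counts and the function name are invented for illustration and do not come from Mohr. For two binary variables, the squared phi coefficient is the share of variance in Y that X explains.

```python
def phi_squared(x_and_y: int, x_not_y: int, y_not_x: int, neither: int) -> float:
    """Squared phi coefficient for a 2x2 table of binary X and binary Y.

    For two binary variables this equals the R-squared from regressing Y on X.
    """
    a, b, c, d = x_and_y, x_not_y, y_not_x, neither
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return (a * d - b * c) ** 2 / denominator


# Hypothetical process-style case: X is necessary but not sufficient.
# 500 cases have X, and only 50 of them reach Y. No case reaches Y without X.
necessary_only = phi_squared(x_and_y=50, x_not_y=450, y_not_x=0, neither=500)

# Hypothetical variance-style case: X is necessary and sufficient.
# Every case with X reaches Y, and no case without X does.
necessary_and_sufficient = phi_squared(x_and_y=500, x_not_y=0, y_not_x=0, neither=500)

print(f"{necessary_only:.3f}")            # 0.053
print(f"{necessary_and_sufficient:.3f}")  # 1.000
```

In the first case X is present in every single instance of Y, yet it accounts for about 5 percent of the variance in Y. By variance criteria that is a weak predictor. By process criteria it is a step the outcome cannot happen without.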

When I wrote about delegation in agentic systems, I was relying on a process logic even though much of the IS tradition treats adoption as a variance question. Liu et al. (2025) model delegation as a dynamic state that shifts over time through feedback loops. You cannot capture that with a single-shot survey construct. Baird and Maruping (2021) identify appraisal, distribution, and coordination as mechanisms, and these unfold across stages. The theoretical logic is process, not variance. Mohr would say that evaluating this work by the proportion of variance explained would miss the point entirely.

Markus and Robey (1988) add the causal dimension. They identify three views of causal agency in IS research. The technological imperative treats IT as an exogenous force that determines organizational outcomes. The organizational imperative treats humans as fully in control of technology. The emergent perspective treats outcomes as arising from the interaction between technology, organizational structures, and human agency. The emergent view is almost always the correct answer in IS because the deterministic alternatives are empirically untenable. Technology never determines outcomes alone, and humans never fully control technology. Causal agency is only one of three dimensions of causal structure that Markus and Robey specify. The other two are logical structure and level of analysis. Logical structure is where Mohr's variance versus process distinction lives. Level of analysis forces you to say whether your theory operates at the individual, group, organization, or societal level, because conclusions valid at one level do not transfer automatically to another.

When I read CARE theory, the dignity framework that Leidner and Tona (2021) built, I noticed something. CARE specifies what (claims, affronts, response, equilibrium), how (digital practices affront dignity, individuals and organizations respond, equilibrium shifts), why (dignity claims create expectations that digital practices can violate), and boundaries (applies to digital contexts where people interact with systems that collect and process their information). It also operates at a specific level (individual and organizational dignity) and takes an emergent causal stance (technology and human agency interact to produce outcomes). These are not incidental features. They are what make it a theory rather than a classification scheme.

Sun et al. (2025) recently added a quality standard that I think should change how we evaluate all of this. They frame contribution quality as a triadic balance of novelty, rigor, and relevance. Novelty alone is not enough. A new label on an old idea does not count. Genuine novelty must appear in constructs, mechanisms, methods, or artifacts, and it must coexist with methodological soundness and practical impact. This reframing matters because IS research has spent decades defending rigor at the expense of relevance, or reaching for relevance at the expense of rigor, or chasing novelty that is superficial. Sun et al. say you have to have all three. Not two out of three. Not novelty that is really just a new context. And not relevance that comes from studying a hot topic without a theoretical mechanism. The triadic standard forces a conversation about what counts as a contribution that the binary rigor-versus-relevance debate never could.