PLS vs. CB-SEM: The Statistics Debate That Will Not Die

The first time I had to choose between PLS-SEM and CB-SEM for a survey-based study, I did what most PhD students do: I asked my advisor, read a methods paper, and looked at what similar papers in the target journal had done. What I found was a field with strong opinions, a long history of argument, and surprisingly little consensus on when each approach is actually appropriate. The debate has been running since at least the late 1990s and it is still generating new papers.

Structural equation modeling (SEM) is the dominant quantitative method in information systems (IS) survey research. The basic idea is that you have theoretical constructs (things like perceived usefulness, trust, or organizational commitment) that you cannot observe directly, and you measure them through observable indicators (survey items). SEM lets you specify relationships between these constructs while accounting for measurement error in the indicators. Two main approaches exist: covariance-based SEM (CB-SEM), implemented in software like AMOS and lavaan, and partial least squares SEM (PLS-SEM), implemented most commonly in SmartPLS.

Herman Wold originally developed PLS in econometrics, and Chin (1998) brought it into IS research with a paper that made the case for its advantages. The argument was that PLS-SEM works with smaller samples, handles non-normally distributed data, and is better suited for predictive and exploratory research where the goal is explaining variance rather than confirming theoretical structure. Chin's framing was influential. IS research adopted PLS-SEM at scale over the following decade, and SmartPLS became a standard tool in the IS methods toolkit.

The core statistical difference matters for understanding what each approach is actually doing. CB-SEM is factor-based: it estimates latent variables by extracting common variance from indicators, and it tests whether the covariance structure implied by your model matches the covariance structure in your data. The fit statistics that CB-SEM produces (RMSEA, CFI, SRMR) evaluate this match. PLS-SEM is composite-based: it constructs latent variable scores as weighted linear combinations of indicators and chooses the weights to maximize explained variance in the endogenous constructs. This means PLS-SEM does not produce model fit statistics in the same sense, which is both a feature (you are not constrained by fit requirements) and a limitation (you cannot test whether your measurement model is correct in the same way).
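
To make the two views concrete, here is a minimal sketch in plain Python. It is not how lavaan or SmartPLS estimate anything: the loadings and weights are fixed, made-up numbers (real software estimates them iteratively), and all data are hypothetical. The first half computes the correlation matrix implied by a one-factor reflective model and compares it to an observed matrix with an SRMR-style residual summary. The second half builds a composite score as a weighted sum of items and reports the variance it explains in an outcome.

```python
import math

def implied_corr_one_factor(loadings):
    """Correlation matrix implied by one reflective factor with
    standardized loadings: off-diagonals are products of loadings."""
    p = len(loadings)
    return [[1.0 if i == j else loadings[i] * loadings[j] for j in range(p)]
            for i in range(p)]

def srmr(observed, implied):
    """Root mean square residual over the lower triangle (diagonal included)."""
    p = len(observed)
    squared = [(observed[i][j] - implied[i][j]) ** 2
               for i in range(p) for j in range(i + 1)]
    return math.sqrt(sum(squared) / len(squared))

def composite_scores(rows, weights):
    """Composite view: each respondent's score is a weighted sum of items."""
    return [sum(w * x for w, x in zip(weights, row)) for row in rows]

def r_squared(x, y):
    """R-squared of a simple regression of y on x (squared Pearson r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Factor-based view: does the model-implied structure match the data?
# Hypothetical observed correlations among three survey items.
observed = [[1.00, 0.58, 0.62],
            [0.58, 1.00, 0.55],
            [0.62, 0.55, 1.00]]
loadings = [0.80, 0.72, 0.77]  # assumed for illustration, not estimated
fit = srmr(observed, implied_corr_one_factor(loadings))

# Composite-based view: how much outcome variance does the composite explain?
# Hypothetical item responses (one row per respondent) and outcome values.
items = [[4, 5, 4], [2, 3, 2], [5, 4, 5], [3, 3, 4], [1, 2, 2]]
outcome = [4.5, 2.0, 5.0, 3.5, 1.5]
weights = [0.4, 0.3, 0.3]  # assumed; PLS would estimate these
scores = composite_scores(items, weights)
r2 = r_squared(scores, outcome)

print(f"SRMR-style misfit: {fit:.4f}")
print(f"R-squared from composite: {r2:.4f}")
```

The point of the contrast is what each half optimizes and reports: the first asks whether a hypothesized structure reproduces the observed correlations, the second asks how well a weighted index predicts something else.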

The reflective versus formative distinction is where the theoretical stakes are highest. Reflective constructs are ones where the indicators reflect an underlying latent variable. If I have a construct called "organizational commitment," and the items I measure are different expressions of that underlying commitment, then the construct causes the indicators. Removing one item should not change the construct. CB-SEM is designed for reflective constructs. Formative constructs are the opposite: the indicators define or constitute the construct. A "socioeconomic status" construct formed from income, education, and occupational prestige is formative. Removing income changes what socioeconomic status means. PLS-SEM handles both. The trouble is that IS researchers often choose their measurement approach based on software habit or sample size rather than the theoretical direction of causality. Lin et al. (2019), in a review of PLS-SEM use in e-learning research, found that model misspecification, where a reflective construct is incorrectly treated as formative or vice versa, leads to inflated or deflated path coefficients and R-squared values.

This is the critique that has accumulated over the years: PLS-SEM has been used in IS research when CB-SEM is more appropriate, because PLS is more forgiving with small samples and tends to produce higher R-squared values. Hair, Ringle, and Sarstedt, who have been both champions of PLS-SEM and critics of its misuse, have published guidelines trying to clarify when PLS-SEM is the right choice. Their consistent argument is that PLS-SEM is appropriate for prediction-oriented research and for models with formative constructs, not as a general substitute for CB-SEM when the sample is small. "My sample is only 100 people" is not, by itself, a justification for PLS-SEM over CB-SEM. The justification should be theoretical and match the purpose of the research.

The Fornell-Larcker criterion is a standard discriminant validity check that appears constantly in IS SEM papers. It tests whether each construct shares more variance with its own indicators than with other constructs in the model. The metric is Average Variance Extracted (AVE): the criterion is met when the square root of a construct's AVE exceeds its correlation with every other construct. Two related thresholds usually appear alongside it as convergent validity and reliability checks: an AVE above 0.5 and composite reliability above 0.7. These thresholds come from psychometric traditions and have been widely adopted in IS as default validity checks. They apply most cleanly to reflective constructs. For formative constructs, you need different validation procedures, including checking for multicollinearity among indicators and ensuring each indicator contributes meaningfully to the construct.
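
The reflective checks above reduce to a few lines of arithmetic on standardized loadings, sketched below in plain Python. The construct names, loadings, and correlation are hypothetical, and the function names are my own rather than any package's API. The sketch uses the usual formulas: AVE as the mean squared loading, composite reliability as the squared sum of loadings over itself plus summed error variance, and the Fornell-Larcker comparison of the square root of AVE against inter-construct correlations.

```python
import math

def ave(loadings):
    """Average Variance Extracted: mean of squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """Composite reliability from standardized loadings, taking each
    indicator's error variance as 1 - loading**2."""
    total = sum(loadings) ** 2
    error = sum(1 - l ** 2 for l in loadings)
    return total / (total + error)

def fornell_larcker(loadings_by_construct, correlations):
    """For each construct, True when sqrt(AVE) exceeds the absolute
    correlation with every other construct it is paired with."""
    root_ave = {name: math.sqrt(ave(l))
                for name, l in loadings_by_construct.items()}
    result = {}
    for name, root in root_ave.items():
        related = [abs(r) for pair, r in correlations.items() if name in pair]
        result[name] = all(root > r for r in related)
    return result

# Hypothetical standardized loadings for two reflective constructs.
loadings = {
    "PU": [0.82, 0.79, 0.85],
    "TRUST": [0.75, 0.80, 0.71],
}
# Hypothetical latent correlation between the two constructs.
correlations = {("PU", "TRUST"): 0.62}

for name, l in loadings.items():
    print(name,
          f"AVE={ave(l):.3f}",
          f"CR={composite_reliability(l):.3f}",
          f"AVE>0.5: {ave(l) > 0.5}",
          f"CR>0.7: {composite_reliability(l) > 0.7}")
print(fornell_larcker(loadings, correlations))
```

None of this applies to formative constructs, where high loadings and internal consistency are not expected in the first place.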

What I find frustrating about this debate is that it often becomes a proxy argument about software preference and reviewer expectations rather than a genuine methodological discussion. I have read papers where the entire justification for PLS-SEM is a half-sentence citing Chin (1998) and noting that the sample size was under 200. That is not a methodological choice. It is a citation that covers a choice made on other grounds. And reviewers at IS journals have sometimes accepted this, which means the practice is reinforced through publication.

Gartner's practitioner research does not use SEM at all. Their surveys produce simpler descriptive statistics, crosstabulations, and trend analyses that practitioners can understand and act on directly. This is not a failure of Gartner's research. It reflects the difference between knowledge produced for academic accumulation and knowledge produced for decision support. A CIO asking "how many of our peers have adopted cloud-native infrastructure?" needs a percentage and a benchmark, not a structural model. Academic IS research asks different questions, which is why SEM matters in our context. But it is worth being honest about what the sophisticated methodology buys us, and whether it is always worth the additional complexity and the risk of misuse.

The practical choice I try to make is to ask what kind of question I am actually answering before I pick an approach. If I am testing whether a well-established theoretical model holds in a new context, CB-SEM with confirmatory factor analysis is appropriate. If I am exploring a new phenomenon and trying to understand what predicts an outcome in a complex model, PLS-SEM makes more sense. If my constructs are genuinely formative, PLS-SEM is the right tool regardless of sample size. The decision should flow from the research question and the theoretical structure of the constructs, not from the software that is most convenient or the one that is most likely to produce publishable results.
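
Written out as a rule, the decision above might look like the following sketch. It is only my own heuristic put into Python, with hypothetical function and argument names, and it is no substitute for a written justification in the methods section. Note what is deliberately missing from the signature: sample size.

```python
def suggest_approach(goal, has_formative_constructs):
    """Heuristic from the paragraph above (an illustration, not a standard).

    goal: "confirm" for testing an established model in a new context,
          "explore" or "predict" for prediction-oriented or exploratory work.
    has_formative_constructs: True if any construct is genuinely formative.
    """
    if has_formative_constructs:
        return "PLS-SEM"
    if goal == "confirm":
        return "CB-SEM"
    if goal in ("explore", "predict"):
        return "PLS-SEM"
    raise ValueError(f"unknown goal: {goal!r}")

print(suggest_approach("confirm", has_formative_constructs=False))
print(suggest_approach("explore", has_formative_constructs=False))
print(suggest_approach("confirm", has_formative_constructs=True))
```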