DeLone and McLean's IS Success Model Works for AI If You Flip Who the User Is

I kept running into the same tension reading DeLone and McLean's IS success model alongside papers about AI system evaluation. The updated (2003) model gives six dimensions: system quality, information quality, service quality, use, user satisfaction, and net benefits. Every time I applied it to an AI system, something felt incomplete. Not wrong. Just incomplete in a way that made me ask who the user actually is.

DeLone and McLean (1992) built the model to answer one question: how well is the system performing across its dimensions? System quality captures reliability, response time, and ease of access. Information quality covers accuracy, completeness, and timeliness. Service quality, which they added in the 2003 update, captures the IT support and customer-service properties of a system. Use and satisfaction feed into each other in feedback loops, and both drive net benefits at the individual and organizational level. The whole structure assumes a human at every point: a human evaluates quality, a human uses the system, a human is satisfied or not, a human experiences the net benefits.

That assumption worked fine for the ERP systems and decision support tools the model was designed to evaluate. It works less well when the system is an AI model that serves not only humans but also other systems. This is where I think the key move is hiding in plain sight.

An AI model has two classes of user. The human user is the person who reads the chatbot response or reviews the AI-generated summary. The system user is the downstream application that calls the model's API, ingests its output, and makes decisions based on that output. Both users experience system quality, information quality, and service quality. Both produce use and satisfaction. Both realize net benefits or incur net costs. The model does not need replacement. It needs a pluralized user construct.
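One way to picture a pluralized user construct is as a data structure that records the same six dimensions once per user class. This is only a sketch: the names `UserClass`, `SuccessAssessment`, and `weakest_dimension` are my own, and the 0-to-1 scores are made-up illustrative values, not measurements.

```python
from dataclasses import dataclass
from enum import Enum


class UserClass(Enum):
    HUMAN = "human"    # reads the chatbot response or the summary
    SYSTEM = "system"  # downstream application that calls the model's API


@dataclass
class SuccessAssessment:
    """The six dimensions, scored from one user class's point of view.

    Scores are hypothetical values in [0, 1]; how to measure them is
    left open here.
    """
    user_class: UserClass
    system_quality: float
    information_quality: float
    service_quality: float
    use: float
    user_satisfaction: float
    net_benefits: float


def weakest_dimension(assessment: SuccessAssessment) -> str:
    """Return the name of the lowest-scoring dimension for this user."""
    scores = {k: v for k, v in vars(assessment).items() if k != "user_class"}
    return min(scores, key=scores.get)


# The same model, assessed separately for each class of user.
human = SuccessAssessment(UserClass.HUMAN, 0.9, 0.8, 0.7, 0.6, 0.75, 0.7)
system = SuccessAssessment(UserClass.SYSTEM, 0.95, 0.4, 0.9, 1.0, 0.5, 0.45)

for a in (human, system):
    print(a.user_class.value, "->", weakest_dimension(a))
```

Keeping one record per user class, rather than averaging them, is the point: the two users can disagree about which dimension is failing.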

Consider a RAG pipeline. System quality for the human user means the chatbot responds quickly and does not crash. System quality for the downstream RAG retriever means the embedding model has low latency and high uptime, because every millisecond of delay adds to the latency of the retrieval and generation stages that follow. Information quality for the human user means the answer is accurate and well sourced. Information quality for the retriever means the embeddings surface relevant chunks with high precision, because irrelevant chunks degrade the generated output before the human ever sees it. When a RAG pipeline fails, the failure often sits in system quality or information quality as the system user experiences it, where the human user cannot see it: the retrieved chunk was wrong, but the generated sentence looks fine. Torres and Sidorova (2019) argued that information quality is not an output property of a system but a property constructed in the process of effective use. That logic applies to both the human's effective use and the downstream system's effective use of the model's output. The representation fidelity question does not change when the consumer of information is another model.
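To make the system user's side of this concrete, here is a minimal sketch of two measurements a retriever-facing evaluation might use: precision at k for information quality, and summed stage latency for system quality. The function names, chunk ids, and latency figures are hypothetical, and I am assuming the stages run sequentially so that their latencies simply add.

```python
def precision_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the top-k retrieved chunks that are actually relevant."""
    top_k = retrieved_ids[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for chunk_id in top_k if chunk_id in relevant_ids)
    return hits / len(top_k)


def end_to_end_latency_ms(stage_latencies_ms):
    """Total latency of a pipeline whose stages run one after another."""
    return sum(stage_latencies_ms.values())


# Hypothetical retrieval result and ground-truth relevance labels.
retrieved = ["c7", "c2", "c9", "c4"]
relevant = {"c2", "c4", "c5"}
print(precision_at_k(retrieved, relevant, k=4))  # 2 of 4 chunks are relevant

# Hypothetical per-stage latencies in milliseconds.
stages = {"embed": 40, "retrieve": 25, "generate": 600}
print(end_to_end_latency_ms(stages))
```

Neither number is visible to the person reading the final answer, which is why it has to be measured from the system user's position in the pipeline.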

This is also why AI monitoring is not optional. DeLone and McLean's model treats system quality as a relatively stable input: a human user forms a judgment about it and either keeps using the system or stops. AI model quality can degrade over time without any human noticing. Data drift moves the inputs the model sees in production away from the distribution it was trained on, so output quality falls even though the model itself has not changed. Concept drift shifts the relationship between input and target, so the mapping the model learned no longer holds. Latency increases as inference loads grow. A human user might not detect the degradation until the output becomes obviously wrong, at which point the net benefits have already eroded. System quality monitoring becomes a non-human feedback loop: the monitoring system evaluates the model's system quality, triggers a retraining pipeline when quality slips, and so restores the model's information quality for both human and system users. The satisfaction dimension in the model maps to the monitoring threshold mechanism. The monitoring system is satisfied when accuracy and latency stay within bounds, and dissatisfied when they drift. Use shapes satisfaction, and satisfaction shapes future use. The feedback loop runs even when no human is in the cycle.
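The threshold mechanism can be sketched in a few lines. This is an illustration under my own assumptions, not a reference design: `Bounds`, `p95`, and `evaluate_window` are hypothetical names, the bounds are arbitrary, and I am assuming labeled outcomes are available for each monitoring window, which is often not true in production.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Bounds:
    min_accuracy: float        # lowest acceptable accuracy in a window
    max_p95_latency_ms: float  # highest acceptable 95th-percentile latency


def p95(values):
    """95th percentile by the nearest-rank method (integer arithmetic)."""
    ordered = sorted(values)
    rank = max(1, (95 * len(ordered) + 99) // 100)  # ceil(0.95 * n)
    return ordered[rank - 1]


def evaluate_window(correct_flags, latencies_ms, bounds):
    """Return the list of violated dimensions; empty means 'satisfied'."""
    violations = []
    accuracy = mean(1.0 if ok else 0.0 for ok in correct_flags)
    if accuracy < bounds.min_accuracy:
        violations.append("accuracy")
    if p95(latencies_ms) > bounds.max_p95_latency_ms:
        violations.append("latency")
    return violations


bounds = Bounds(min_accuracy=0.85, max_p95_latency_ms=250)
correct_flags = [True] * 9 + [False]        # 90% correct in this window
latencies_ms = list(range(100, 300, 10))    # 100, 110, ..., 290

violations = evaluate_window(correct_flags, latencies_ms, bounds)
if violations:
    # In a real setup this is where a retraining or alerting hook would go.
    print("dissatisfied:", violations)
else:
    print("satisfied")
```

An empty list plays the role of satisfaction and a non-empty one plays the role of dissatisfaction, with no human judgment involved at any step.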

The feedback logic cuts in the opposite direction too. DeLone and McLean embedded feedback loops between use and satisfaction, and the model has always treated those as within-human cycles. A human uses a system, finds it useful, uses it more, and gets more benefits. With AI, the loop crosses the human-system boundary in both directions. The AI's output shapes what humans expect from AI. If a chatbot gives confident but wrong answers often enough, the human user adjusts their satisfaction threshold downward and uses the output more cautiously. That changed human behavior then feeds back into what the AI learns whenever the AI is trained on human interaction data, including the corrections, the rephrasings, and the abandoned sessions. The feedback loop does not just cycle within the user. It cycles through the training data and back into the model. The IS success model's feedback loop turns into a coevolutionary loop that Benbya et al. (2020) would recognize as a complex adaptive system property. I am not sure DeLone and McLean anticipated that their feedback loop would one day connect to the data distribution that trains the artifact itself.

Service quality is the dimension that stretches the most under this view. In the 2003 model, service quality means the IT support function, help desks, and system administration that keep the system running for human users. For an AI system, service quality also means the prompt engineering support, the fine-tuning pipeline maintenance, the human feedback annotation, and the incident response when the model produces harmful output. A human user experiences service quality when the model's output is well formatted and the API documentation is clear. A system user experiences service quality when the model provider publishes reliable endpoints, SLAs, and error codes. The net benefits for both users depend on how well service quality serves both classes simultaneously. If the API is reliable but the output is garbage, the system user fails even though the human user would never know. If the output is accurate but the API is down every afternoon, the system user fails even though the model itself is good.

I have written about effective use and information quality in analytics contexts, and the same reasoning applies here. The effective use literature asks whether use is faithful to the domain the system represents, not just whether use happens. When the user is a downstream model, the faithfulness question gets harder because the downstream model does not have a human's judgment about whether the output is faithful. It just takes the output as input and propagates any error forward. Model collapse in recursive AI training is exactly this scenario: one model's output becomes the next model's training data, the system user in the chain is another model, and every degradation in information quality compounds across generations.
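A toy calculation shows why compounding matters. Assume, purely for illustration, that each generation in the chain retains a fixed fraction of the previous generation's information quality. Real model collapse is not this simple, and both `fidelity_after` and the 0.95 retention figure are hypothetical.

```python
def fidelity_after(generations, retention_per_generation, initial_fidelity=1.0):
    """Fidelity left after repeated hand-offs, under a toy multiplicative model.

    Assumes each generation keeps a fixed fraction of the fidelity it
    received. This illustrates compounding only; it is not a model of
    how collapse actually unfolds.
    """
    if not 0.0 <= retention_per_generation <= 1.0:
        raise ValueError("retention_per_generation must be in [0, 1]")
    fidelity = initial_fidelity
    for _ in range(generations):
        fidelity *= retention_per_generation
    return fidelity


# A 5% loss per generation leaves roughly 60% after ten generations.
for n in (1, 5, 10):
    print(n, round(fidelity_after(n, 0.95), 3))
```

A loss that looks negligible at any single hand-off becomes large once no human sits anywhere in the chain to catch it.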

I think the biggest contribution the IS success literature can make to the AI era is a conceptual one. The model does not need to be replaced. The six dimensions hold up. What needs to change is the assumption that the user is always a person. System quality, information quality, and service quality matter for human users and for system users. Use and satisfaction run in parallel tracks for both. Net benefits, and net costs, accumulate across the whole chain. DeLone and McLean gave us the architecture. We just need to populate it with more occupants than they planned for.