
AI Is Not a Technology. It Is a Task-Technology Fit Problem.

Goodhue and Thompson said performance follows fit, not features. AI is a family of tools, and the wrong one for the task will fail the same way every time.

2026-05-14 · 5 min read · AI & Agentic Systems · Comps & Reflections · Sociotechnical Systems
AI series · Part 47 of 51

I was reading Goodhue and Thompson (1995) for the third time, and I kept thinking about every AI pilot I have watched quietly run aground. A company deploys a large language model for customer support triage and the agents stop using it after two weeks. Another company bolts the same model onto its supply chain forecasting and gets results that are worse than the statistical baseline. The vendor calls it a data problem. The IT team calls it a change management problem. Nobody calls it what it actually is: a task-technology fit problem.

Goodhue and Thompson built the task-technology fit (TTF) model around a simple claim that I think has become more relevant than they could have predicted. Technology produces performance when its capabilities fit the requirements of the task. The same tool that improves one person's output can degrade another person's, not because the tool is broken, but because the task demands something the tool was not designed to do. This is not a subtle insight. It is the central reason most AI implementations will fail in 2026 and 2027, and the reason nobody will learn the right lesson from those failures.

The mistake starts with the word AI itself. It is not one technology. The word covers large language models, computer vision systems, predictive models, recommendation algorithms, robotic process automation, and a dozen other approaches that share almost nothing except a marketing label. Treating them as interchangeable is like calling a chainsaw, a hammer, and a level the same tool because they all go in a toolbox. An LLM is a statistical language engine trained on text completion. A computer vision model is a pattern matcher trained on labeled images. A predictive model is a function approximator trained on historical outcomes. Each one fits a different task profile, and when you put the wrong architecture on the wrong task, the TTF model predicts exactly what you will get: no performance gain, or worse, a performance loss.

The 2023 BCG study on generative AI and knowledge worker performance is a nearly perfect TTF case study, even if nobody framed it that way. The researchers found that LLM access improved performance on creative idea generation tasks, where the model's training distribution covers diverse textual patterns and the evaluation criteria reward breadth and fluency. The same LLM access reduced performance on a business problem-solving task that fell outside the model's capabilities, where the right answer required reasoning about context the model had never seen. Same pool of consultants, same LLM, different task. Fit drove performance in one condition, and misfit destroyed it in the other. Goodhue and Thompson could have written the hypothesis themselves.

I see this pattern everywhere now. An LLM is excellent at summarizing a dense document because summarization is a routine analytical task that maps directly onto next-token prediction over a known text corpus. The same LLM is terrible at scheduling a meeting with complex human constraints because scheduling is a coordinating task that requires tacit knowledge about relationships, preferences, power dynamics, and unspoken priorities that no training corpus captures. The model is not failing. It is doing exactly what its architecture can do. The task is wrong for the technology.

The comparison that keeps running through my head is the ERP wave of the 1990s. Companies spent millions on SAP and Oracle implementations, customized the software to match their existing processes, and then wondered why they saw no productivity gain. The answer was that implementing ERP without process redesign forces a misfit between the technology's logic and the organization's actual task structure. The same mechanism is playing out with AI today. A company buys an enterprise AI license, plugs it into existing workflows without analyzing what those workflows actually require, and labels the result a pilot. When the pilot fails, the explanation is always something other than the task itself.

I have written about the technology-organization-environment (TOE) framework before, and that lens explains part of this problem: organizational context matters enormously for whether any technology takes hold. But TTF explains something TOE does not. TOE tells you why an organization adopts a tool. TTF tells you whether the tool will actually work once it is deployed. You can have perfect adoption, perfect champion support, and zero task-technology fit, and the result will be a system that everyone uses and nobody benefits from. I also wrote about effective use being distinct from adoption, and the same logic applies: even high-quality use of the wrong tool for the task is not going to produce the outcome the organization wants.

I think TTF is the most underused diagnostic in AI adoption today. Every board presentation I see starts with the technology: here is what the model can do, here is the benchmark score, here is the competitor using it. Almost none of them start with the task: what is the work that needs doing, what cognitive operations does it require, what information does the performer need, what decisions depend on the output, and which AI architecture actually maps onto those demands. The question should not be "should we adopt AI?" The question should be "what task does this tool fit, and what evidence do we have that the fit exists for this specific group of people in this specific context?"
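
To make the task-first ordering concrete, here is a minimal sketch of a fit check that starts from what the task requires and only then looks at the tool. Everything in it is an assumption for illustration: the dimension names, the 0-to-1 scores, the `Profile` class, and the `fit_score` formula are hypothetical. Goodhue and Thompson measured fit with user questionnaires, not with a formula like this one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """A named set of scores in [0, 1], keyed by dimension name (hypothetical)."""
    name: str
    levels: dict


def fit_score(task: Profile, tool: Profile) -> float:
    """Return 1 minus the mean capability shortfall, taken over the
    dimensions the task actually requires (requirement > 0).

    This is an illustrative scoring rule, not the published TTF instrument.
    """
    required = {dim: need for dim, need in task.levels.items() if need > 0}
    if not required:
        raise ValueError("task has no requirements to fit against")
    # Shortfall is how far the tool falls below what the task needs;
    # surplus capability on a dimension earns no extra credit.
    shortfalls = [
        max(0.0, need - tool.levels.get(dim, 0.0))
        for dim, need in required.items()
    ]
    return 1.0 - sum(shortfalls) / len(shortfalls)


# Made-up scores, chosen only to mirror the examples in this note.
llm = Profile("llm", {"text_fluency": 0.9, "numeric_forecasting": 0.3, "tacit_context": 0.2})
summarize = Profile("summarize a dense document", {"text_fluency": 0.9, "tacit_context": 0.1})
schedule = Profile("schedule a contested meeting", {"text_fluency": 0.3, "tacit_context": 0.9})

for task in (summarize, schedule):
    print(f"{task.name}: fit = {fit_score(task, llm):.2f}")
```

With these invented numbers the summarization task scores higher than the scheduling task for the same tool. The point of the sketch is the order of operations: the task profile is written down first, and the tool is scored against it afterward.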

A company that starts an AI pilot with a task analysis will choose a different tool than one that starts with a technology demo. It will set different success criteria, evaluate different outcomes, and abandon projects earlier when the fit is absent. It will also deploy fewer pilots overall, but the ones it does deploy will produce more of the performance that Goodhue and Thompson described. That trade-off is the one most organizations refuse to make because it sounds like a slowdown. It is not a slowdown. It is the difference between buying a chainsaw to trim a bonsai tree and walking over to the shears.

I am not sure this lesson will stick. The AI market rewards breadth claims, and TTF is a narrowing claim. It says no, this tool is not for every task, you need to specify what you are doing before you know what to build or buy. That is a harder conversation than demoing a new model. But it is also the only conversation that prevents the next cycle of failed pilots from being blamed on the wrong thing.


About the author

Ali Safari
PhD Student in IS, University of North Texas

Researching AI governance, trust in intelligent systems, and agentic AI. Writing while studying for comps.
