Training AI models consumes staggering amounts of energy. Trist and Bamforth showed in 1951 why optimizing only the technical subsystem leads to collapse.
I read something this week that I cannot shake. Training GPT-3 consumed roughly 1,300 megawatt-hours of electricity. That is equivalent to what 130 US homes use in a year. A single ChatGPT query uses about ten times more energy than a Google search. SemiAnalysis projects that by 2027, AI servers could consume as much electricity as the entire country of Argentina. These numbers are startling on their own. What bothers me is not the numbers themselves but the pattern they point to.
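As a quick sanity check on the household comparison, here is a minimal sketch of the arithmetic. It uses only the two rounded figures quoted above; the function name is mine and purely illustrative.

```python
# Back-of-the-envelope check of the "130 US homes" comparison.
# Both inputs are the rounded figures quoted in the text, not new data.
GPT3_TRAINING_MWH = 1_300  # approximate training energy, in megawatt-hours
HOMES = 130                # number of US homes in the comparison


def implied_mwh_per_home(total_mwh: float, homes: int) -> float:
    """Annual electricity use per home implied by the comparison (hypothetical helper)."""
    return total_mwh / homes


per_home_mwh = implied_mwh_per_home(GPT3_TRAINING_MWH, HOMES)
per_home_kwh = per_home_mwh * 1_000  # 1 MWh = 1,000 kWh

print(f"Implied use per home: {per_home_mwh:.1f} MWh/year ({per_home_kwh:,.0f} kWh/year)")
```

The comparison implies about 10 MWh (10,000 kWh) per home per year, which is in the range usually quoted for an average US household, so the two figures are at least consistent with each other.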
The pattern goes like this. An industry identifies a goal that is purely technical: model accuracy, parameter count, capability. It pours resources into optimizing that goal. The technical subsystem improves rapidly and impressively. And the social and environmental subsystems that make the technical achievement sustainable are treated as externalities. Trist and Bamforth (1951) documented this exact sequence seventy-five years ago. They were studying British coal mines where the longwall method had mechanized extraction. The technical redesign split one autonomous team into three specialized shifts. On paper it was more efficient. In practice, absenteeism rose, conflict between shifts spiked, informal coordination collapsed, and productivity fell below pre-mechanization levels. The technical optimization destroyed the social system, and the social collapse dragged the technical performance down with it.
I wrote about this in detail when I traced the longwall method through modern software culture and in a deeper piece about what joint optimization actually requires. The mechanism is the same every time. The coal mine becomes a data center. The social system becomes the communities that host the infrastructure and the climate that absorbs the emissions. The error of treating one subsystem as independently optimizable stays the same.
Ackoff (1971) gave the theoretical language for this error. He argued that decomposing a system into parts and optimizing each part independently can degrade overall system performance, because the performance of a system depends primarily on how its parts interact, not on how well each part performs individually. The AI industry has decomposed the system into compute and environment. Compute is optimized ruthlessly: bigger models, more parameters, faster inference, lower cost per token. The environment is treated as a separate concern to be addressed later through offsets or renewable energy purchases, not as something that must be jointly optimized with model design from the start. Ackoff would recognize this immediately. It is the same mistake he described in 1971, just running on GPUs instead of conveyor belts.
The AI industry is in what I think of as the longwall phase. The technical trajectory is remarkable. Models can reason, generate, translate, and code at levels that would have seemed impossible five years ago. The industry is optimizing for capability and speed, and the results are visible and impressive. But the social and environmental costs are compounding in ways that are not captured in the capability benchmarks. Data centers already consume between one and two percent of global electricity, according to recent industry reports. Every new model generation pushes that number higher. The communities where data centers are built report water stress, grid strain, and rising local electricity costs. These are not minor side effects. They are signals that the social subsystem is under strain, following the exact dynamic Trist and Bamforth documented in 1951.
What makes this case different from the original longwall method is that the AI industry already has its joint optimization response. Model efficiency research, which focuses on smaller models, distillation, quantization, and sparser architectures, is a direct attempt to reduce the environmental cost of AI without abandoning the technical benefits. A smaller model that performs nearly as well as a larger one is a joint optimization outcome: it respects both the technical goal of capability and the environmental constraint of energy cost. Microsoft's carbon-negative pledge, which commits the company to removing more carbon than it emits by 2030, is another move in the same direction. It acknowledges that the industry's growth trajectory cannot continue without accounting for its environmental impact.
I think these responses are real progress, though I am not sure they go deep enough. Efficiency gains do not automatically reduce total energy consumption. They can also make AI cheaper and more accessible, which drives more usage and more total energy demand. This is the Jevons paradox applied to compute: a more efficient model may not reduce energy consumption if it lowers the cost of inference enough to expand the market. And carbon offsets, even aggressive ones, are a financial instrument, not a change in how models are designed. They treat the symptom rather than the interdependence that Ackoff identified.
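The rebound logic is easy to see with a toy calculation. This is a minimal sketch, not a forecast: every number in it is made up for illustration, and the function names are hypothetical. The only assumption is that total energy is per-query energy multiplied by query volume.

```python
# Toy illustration of the Jevons-style rebound described above.
# All inputs are invented for illustration; none are measurements.

def total_energy_wh(energy_per_query_wh: float, queries: float) -> float:
    """Total energy in watt-hours = energy per query x number of queries."""
    return energy_per_query_wh * queries


def rebound_scenario(base_wh: float, base_queries: float,
                     efficiency_gain: float, demand_growth: float) -> tuple[float, float]:
    """Return (energy before, energy after) for a given efficiency gain and demand growth.

    efficiency_gain: factor by which per-query energy falls (2.0 means half the energy).
    demand_growth:   factor by which query volume rises once inference gets cheaper.
    """
    before = total_energy_wh(base_wh, base_queries)
    after = total_energy_wh(base_wh / efficiency_gain, base_queries * demand_growth)
    return before, after


# Hypothetical case: the model becomes 2x more efficient, but usage triples.
before, after = rebound_scenario(base_wh=3.0, base_queries=1e9,
                                 efficiency_gain=2.0, demand_growth=3.0)
print(f"Change in total energy: {after / before:.2f}x")
```

In this sketch total energy rises whenever demand grows by a larger factor than efficiency improves. Efficiency only lowers the total if usage grows more slowly than the per-query saving, which is exactly the condition the market is unlikely to guarantee on its own.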
The deeper issue is a tension between two institutional logics that are pulling in opposite directions. One logic is about capability and speed. The market rewards bigger models, more features, faster deployment, and competitive positioning. The other logic is about sustainability and responsibility. It demands that the industry account for its environmental footprint, respect community impacts, and design within ecological constraints. These two logics are not aligned, and the AI industry is currently optimizing for the first while treating the second as a constraint to be managed rather than a value to be optimized jointly with the first.
Bostrom and Heinen (1977) brought sociotechnical systems theory into the IS field in the very first volume of MIS Quarterly. Their argument was that system failures are almost always sociotechnical problems, not purely technical ones. Sarker et al. (2019) found that only about 13% of IS research actually engages with genuine sociotechnical interaction. The rest treats technology as background or as a deterministic force. The AI energy problem is the same structural pattern at the industry level. The industry is solving the technical problem of model capability while treating the environmental and social consequences as something to be dealt with after the fact.
I think the AI industry is going to hit the productivity collapse that Trist and Bamforth described. Not because models stop improving. They will keep getting better. But the social and environmental systems around them will push back in ways that the current optimization function does not account for. Communities will resist data center construction. Regulators will impose energy caps. The cost of carbon will eventually be priced into the economics of training and inference. These are not anti-technology outcomes. They are the social subsystem expressing its interdependence with the technical one. The question is whether the industry will recognize the pattern before the collapse or after. Joint optimization means designing for both capability and sustainability from the start. The theory has been telling us this since 1951.