When the MIT Media Lab released a paper on LLM-assisted essay writing in 2025, two things happened at once. It introduced a memorable framing, cognitive debt, and it became one of the most widely covered AI-cognition stories of the year. TIME ran it. Forbes ran it. NextGov ran it. Half a dozen Substacks wrote takedowns. A handful of headlines used the word "rot."
The study is real, the signal is real, and the headlines are often louder than the paper. Here is what it actually did, what it found, what it explicitly did not show, and what it should and should not change about how you use AI.
What the study did
The researchers, whose paper is indexed on arXiv as 2506.08872, recruited around 54 students from Boston-area universities and put them through a multi-week essay-writing protocol. Each participant was assigned to one of three conditions:
- LLM-assisted: write essays using an LLM (specifically a ChatGPT-class model).
- Search-engine-assisted: write essays using a standard search engine.
- Brain-only: write essays unaided.
Participants wore EEG caps. The researchers measured neural-connectivity patterns during writing, scored the essays linguistically, and tested participants on their recall of what they had written. Compared with the brain-only group, the LLM-assisted group showed measurably weaker brain-connectivity patterns, lower recall of their own essays, and a small but measurable drop in their reported sense of ownership over the writing.
The researchers introduced the term cognitive debt to describe the gap: consistently delegating a cognitive task to an LLM leaves you with weaker ownership of the result than doing the task yourself would, and that difference has measurable neural and behavioral signatures.
What "cognitive debt" actually means
The phrase is a framing, not a clinical diagnosis. Two analogies help.
First: think of it like technical debt in software. Technical debt is not failure; it is a deferred cost that compounds when you ship fast without cleaning up. Cognitive debt, in the MIT framing, is a similar accounting metaphor: when you let AI carry the cognitive work, you skip the encoding, recall, and sense of authorship that doing the task yourself would have built, and that cost comes due later.
Second: think of it like delegating writing to a co-author. You can read what your co-author wrote, agree with it, and ship it, but you will remember less of it, defend it less confidently in a meeting, and be less able to extend it in other work than if you had drafted it yourself. The MIT result is roughly that this pattern shows up in neural and behavioral measurements when the co-author is an LLM, not just when it is a human.
What the study did not show
This is where the public conversation routinely overshoots. The study did not:
- show that AI use causes general cognitive decline,
- show that AI use causes dementia or other clinical conditions,
- show that the effect transfers to thinking outside the specific essay-writing task,
- show that LLM users are now permanently worse at writing,
- recommend that anyone stop using AI tools.
It is also still a preprint at the time of writing. That matters because the standard for citing it as "peer-reviewed evidence" has not been met — Senwitt deliberately uses careful language ("a 2025 MIT Media Lab study" or "a preprint on LLM-assisted essay writing") rather than overstating peer-review status.
The Conversation made a similar point about an earlier wave of AI-and-students coverage in 2023: the temptation to lean on AI editing tools is real, and the cost is real, but the headlines reliably outrun the evidence. EDUCAUSE Review echoed it from the educator side in late 2025, describing better immediate results and worse underlying thinking, with all the careful qualifications that real research carries.
The headline is reliably louder than the paper. Treat the framing as a serious signal; treat the specific claim as the specific claim.
What this means for how you use AI
If you take the study seriously without overstating it, three things follow.
First, the framing is real. The act of writing something yourself produces a different encoding than the act of reading something AI wrote for you. That is not in dispute. The MIT result quantifies one piece of that gap on one task.
Second, the gap matters most for tasks where you actually care about ownership. Drafting a forgettable internal status email? Let the AI write it. Drafting a piece you'll defend in a meeting, build on later, or be tested on? That's the kind of writing where doing at least the first version yourself leaves the measurable neural signature that handing the draft to AI does not.
Third, the right response is not to avoid AI, but to keep deliberate practice somewhere. This is Senwitt's narrow position. We are not a treatment for cognitive debt and we don't claim to be. We are a daily place where short, deliberate, unaided reps happen — across writing, math, code, memory, reading, and reasoning — so that the thinking skills you actually want to own keep getting practiced, even on days when most of the rest of your work runs through AI.
What to read next
- The cognitive offloading explainer — the academic frame this study sits inside, dating back to the 2011 Google effect paper.
- The skill atrophy deep-dive for developers — the same framing applied to coding, where the Anthropic study quantified a 17% drop in skill formation.
- The Senwitt research hub for the full source list and our standing position on what we will and won't claim.
If you want to be the kind of person who reads cognitive-debt headlines and immediately asks "what did the actual paper measure?", that's the right instinct — and the daily-practice answer is on the Skills page.
Three honest caveats on how the study was set up
A few specifics worth keeping in mind when reading any coverage of the study.
Sample size. The MIT study involved around 54 participants — modest by the standards of large-scale cognitive research. The findings are real signals, not population-level generalizations. Replication in larger samples is exactly what the next research wave needs to settle the magnitude of the effect.
The "LLM-only" condition is unusual. Most real-world AI use is mixed: drafting with AI, editing yourself, switching modes mid-document. The MIT condition was strictly LLM-only for the whole essay. That makes for a cleaner experimental design but a worse model of how people actually use these tools. The real-world effect of mixed use is probably smaller than the LLM-only condition shows.
The participants were students. Younger participants tend to show larger AI-related effects than older participants in the broader research — likely because their skill-formation pathways are still active and more sensitive to substitution. The MIT findings should be read with that population in mind.
None of these caveats mean the study is wrong. They mean the right way to use it is as a strong signal in a specific direction, not as a universal verdict on AI use.
What the next year of research will probably show
Two things to watch for as the cognitive-debt research line matures.
First, peer-reviewed publication. The 2025 work is currently a preprint. Peer review will sharpen the methodology, and the published version may move the headline findings in either direction. Senwitt's editorial position will update accordingly — we will not call this peer-reviewed evidence until it is.
Second, replication with different conditions. Studies with mixed-use protocols (some AI, some unaided), with non-student populations, with different task types (code, math, design, decision-making), and with longer time horizons are all in the pipeline. The aggregate picture in 2027 will be more nuanced than the single MIT result is in 2026. That's how cognitive science works, and it's the right pace at which to update strong claims.
