A low-angle view of stack after stack of books in a towering multi-story library. (Photo: Susan Q Yin, Unsplash)

Cognitive exponents and LLM leverage

I know a few people for whom LLMs have been a near-immediate multiplier of attention and effort. I know a lot for whom LLMs clearly make them worse at thinking and doing things. So: why?

Dec 7, 2025

tech


Heads up: this is a blog post's blog post, which is to say that I'm writing it to try to refine my thinking on the topic. There is no firm conclusion and I'm not trying to sell you anything or convince you of anything; I'm trying to work through whatever's bouncing around my head on this topic.

 

Recently we at The Phone Company had a cross-org AI Day. I will admit: I did not expect a lot out of it from my seat on a WebEx call, and while I can't talk much about the contents, I was pleasantly surprised at the in-depth discussions and some of the thoughtful, virtual-hallway-track conversations I had behind the scenes.

My newly-promoted great-grandboss keynoted the event, and while he touched on a bunch of stuff, one thing stood out. He mentioned a presentation that he drew a lot of his thinking from, a 2016 talk by Dr. Raj Reddy titled "Guardian Angels and Cognition Amplifiers" (warning: PowerPoint). Reddy's a Turing Award winner who's also focused a lot on technology—and specifically AI—in service of society, so that interested me and I went digging. I think it's an interesting presentation, in a history-of-an-alternate-future sort of way. Reddy imagined thousands of autonomous agents working on behalf of each person on the planet Doing All Kinds Of Stuff. (One of my early readers for this post mentioned Charles Stross's books as presenting something similar, but I don't know which direction the arrow of time points.) Monitoring sensors, surfacing information, inferring intentions, and automating daily tasks. Guardian angels warn you about tsunamis. Cognition amplifiers filter your email and reorder your toilet paper before it runs out. (I feel like that's dire enough to be a Guardian Angel task itself, but reasonable people can differ.)

I'd rather live in the world Reddy imagined than in 2025, but you can say that on a lot of axes, so let's bury it. Even though I think the direction Reddy was pointing doesn't match where we're at—or where we're going—the title was evocative and for the last week or so it's been bouncing around in my head. I want to reframe that dichotomy in terms of what I'm actually seeing with LLMs today, as I work on trying to make other people consistently productive with them. But to get there, I'm going to have to doctor one of them and flip them both around. I want to talk today about cognitive amplifiers, cognitive exponents, and maybe just a little bit about how we can start to figure out who they'll really help. (Guardian Angels will be another conversation for another day.)

 

The "cognition amplifier" is, at least in our pervasive-LLM universe, not a great fit. Reddy envisioned AIs that anticipate what you want to do and help you do it with less effort. Think of a caddy, ready to hand you a club and give you advice on what to do with it. But this hasn't happened and I don't see it on the horizon. LLMs have limited context, poor persistence RAG helps a little, but not in a way something you'd want to be as comfortable as your favorite chair should be., and a weird model for scoping out whether you've eaten lunch by 3PM.

Amplifiers take a signal and make it louder. And the read I get from that presentation implies a multiplier effect: no matter where you are on whatever relevant skill curve, whatever kind of subject expertise you bring to the table, you can get capital-S Something out of the robot. A five percent improvement is a five percent improvement no matter where on that curve you are. But where I torture this metaphor until it screams is that amplifiers amplify the noise you don't want, too. Garbage-in, garbage-out was true before somebody thought up a transformer model and still is today, and I'll pound the table to anybody who'll listen that the marketing of LLMs-as-oracles, Sam Altman's pocket full of Ph.D.s who'll give you the right answer from the wrong inputs, has been so destructive as to raise the average user's noise floor to the point where LLMs are mostly junk for all but rote tasks. They don't have to be like that, and I have like 15,000 lines of prompts and workflows to prove it, but right now, they are.

And because it's not a simple coefficient (you're not just stacking one percent and two percent and five percent together to come up with a big win), I want to call it something else. I've taken to referring to the integrated application of LLMs to knowledge work as cognitive exponents, not cognitive amplifiers. That word, if you passed middle-school algebra, should serve as a big flashing warning. As an AI hater turned reluctant user, I'm not going to stand here with my bare face hanging out and tell you that it's going to square or cube somebody's productivity.

But it also doesn't have to square it to be worth something. (To me, I mean, not to OpenAI's valuation.)

 

What we're measuring here is vibes-y, and we'll get back to that a little later, but take it for a moment as a unitless measure of somebody's combination of problem relevance, subject matter expertise in a field, general epistemological aptitude, and prompt/context engineering competency. And let's say, for the sake of it, that the exponent b in x^b is, like, 1.2. Somebody who's around a 3.0 is going to end up at a 3.7 and somebody who's around a 5.0 is going to end up at a 6.9. But somebody who's at a 1.05 is going to end up around a 1.06, and somebody who's at an 0.5 will drop to 0.435—at sufficiently low levels, productivity's going to go down.
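To make the arithmetic concrete, here's a minimal sketch of the two framings. The function names and the 1.05 "five percent improvement" multiplier are mine for illustration, and the 1.2 exponent is the same arbitrary, vibes-y number as above; none of this is a real model, just the shape of the curve.

```python
# Toy comparison: "cognitive amplifier" (flat multiplier) vs. "cognitive
# exponent" (x ** b), applied to an arbitrary, unitless operator-skill score.

def amplified(skill: float, factor: float = 1.05) -> float:
    """Amplifier framing: everyone gets the same relative boost."""
    return skill * factor

def exponentiated(skill: float, b: float = 1.2) -> float:
    """Exponent framing: strong operators pull ahead, weak ones fall behind."""
    return skill ** b

for skill in (0.5, 1.05, 3.0, 5.0):
    print(f"{skill:>4}: x1.05 -> {amplified(skill):.2f}   ^1.2 -> {exponentiated(skill):.2f}")

# The exponent framing gives roughly 0.5 -> 0.44, 1.05 -> 1.06, 3.0 -> 3.74,
# 5.0 -> 6.90: below a score of 1.0 it actually drags productivity down,
# which a flat multiplier never does.
```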

I used arbitrary unitless numbers because, of course, this is vibes-y. But they capture the intuition I have and the strong feeling that working with a lot of people has led me to: there are a lot of people for whom plugging an LLM into a knowledge-work workflow does not move the needle much, and probably also a lot for whom it is a negative. It's really easy to get something out of an LLM, even if it's nonsense, and we're already seeing how easily inattention or inexperience hide behind a wall of wordswordswords. Then other people have to expend their own attention to untangle the mess, in ways that remind me of "there aren't many 10x programmers but there are a lot of -10x programmers".

 

At my day job, the biggest immediate jump I've actually found is in staff engineers who code because they like to code, who can communicate clearly and are comfortable moving up and down the stack. There's some real magic to be found in nailing down a full-pedant-mode specification and dispatching a swarm of tiny robots who don't get tired and who (mostly) follow your rules; they're not fast but they let you split your attention in useful ways, and the results are consistently pretty good. (Haiku 4.5 was a big jump here; it's faster than Sonnet 4.5, it's very good when fed with high-quality context, and Sonnet, or now Opus 4.5, works well to go check its work.) I've shared with this cohort the information corpus and house-style stuff I built for myself in Claude Code, which is intended to provide comprehensive grounding for models and a heavy dose of Do The Right Thing, and the immediate results have been great—though it's important to separate that I'm providing structured workflows and tools, not the subject matter expertise to actually accomplish this task or that task. That's their job, and they're doing great.

I've seen steady improvement in semi-technical product folks, too. I have some folks close at hand for that, and some of my education adventures have been figuring out if any of this can resonate with people who aren't software developers. Synthesis tooling, being able to bolt together Jira and Confluence and Google Docs and the rest, seems like an obvious multiplier there, but the key I'm seeing already is not just uptake of the tools. It's the start of building tools on top of the early examples I provided. I think that's a strong signal of skill mastery, and as aptitude with the tools and the intuition for where they apply grows, I'm pretty confident we'll see further improvements, and I suspect these will have some down-the-line multiplier effects.

 

The prior two cohorts are people I'm working with at least weekly and often more frequently, and we have a private discussion group where no question is too silly to be asked. But it isn't all roses, and things fall off outside of those circles. When LLM-as-oracle thought rules the roost, critical thinking suffers, and I see a lot more of that than I want to. I don't want to grind my own axe too hard, but few things in my professional career have chapped me as much as watching somebody explain a problem, and how to fix it, in detail, only to have the developer on the other side come back with some one-sentence-prompt slop—and I feel like this is more common than anybody really wants to admit.

Cards on the table: that hunch is at odds with some of the top-line reporting you hear about less-skilled developer productivity. But every time I've dug in (like that GitHub Copilot study flogged around earlier this year), eventually it cops to having no way to measure quality, just quantity (number of tasks completed), and to biasing its sample towards junior developers. And, having seen some of the outputs, I want to revisit that after about six months and after a few other folks have built on top of whatever those mostly-autopiloted LLM sessions accomplish.

 

I said earlier that this is all vibes-y, and it is. It also has confounding factors. I've got a toolbox of Claude skills and workflows, built in Claude Code, that the folks I work with regularly can pick over and use as they need, and (not to brag, but) I think those are pretty good. But I don't think they're novel and I don't think they're somehow unique to me. What I've shared out is mostly knowledge-based and...orienting, I guess? Stuff to establish some consistency of output. I think they're good and they're important, but they could be reconstituted by somebody who had the necessary time, energy, and attitude.

Without the operator they're not very useful, though, because the operator's the one who actually knows what they need to make. The operator defines done, not me, and over time it's become more obvious to me that creating a definition of done is a lot harder for some folks than I thought it was.

This loops us back to our original arbitrary metric for "ability to make an LLM do something useful". LLM operation, at least in this knowledge-worker sense, is not one skill; it's a synthesis of a bunch of things, with the suitability of the task also mixed in along the way. I feel like we're making a pretty big mistake trying to treat these things as "coding agents" and bolting them into engineering orgs without first getting product involved. LLMs hallucinate and generally make a mess of things when they don't have a clear definition of their task and as much high-value information and detail as possible; the only way I can see to get that is to have product get there first, and do it effectively. (If you want to feed a coding agent, I can think of few better ways than high-information, detailed, well-oriented specs. Those can be used to factor PRDs and designs out into better, more detailed implementation stories, which can then be consumed by line developers—or, increasingly frequently, just by LLMs themselves.) Knowledge work trickles down into line work, so start with the highest-impact knowledge work!

 

What I can't answer, and what I want us all to be thinking about, is this: how do we find the right people? Because this is when I'd like to say I've got one weird trick to figure that out, but I don't and nobody else does either. The best I've got right now is a gut-check from talking to someone, and that's not bad but it certainly doesn't scale to a company employing sixty thousand keyboard touchers.

Most organizations are trying to democratize access—everybody gets ChatGPT. Which is fine, and maybe can yield some positive surprises, if you are confident that they're a multiplier and not an exponent. (A part of what made me write this post is that I certainly know of a lot of people who think they're getting a lot out of LLMs when the results don't show it.) But I suspect that universal and uncritical access is part of the smokescreen that's fueling claims like "95% of all AI pilots fail"; if your direction is "watch some AI training modules and use it to write some emails", of course it fails. You didn't try hard enough.

We have to be giving the right tools—and I don't think web-based chatbots are the right tool in most circumstances; my PMs are using Claude Code and really liking it!—to the right people, and to do that we need a holistic understanding of the job that may benefit from some LLM elbow grease and a coherent understanding of the people to whom we want to give the tools.

 

Have to think on it.

 

THANKS TO: Eric Boersma, Bojan Rajkovic, Sean Edwards, Gigan, and GLHM for reading early drafts.

–Ed

