Cognition’s CEO targets token leaderboards’ misincentives
Cognition’s Scott Wu calls token spend leaderboards “directionally correct” but warns they can push teams into the wrong behavior—ranking engineers by tokens instead of output. His critique joins other tech leaders who argue AI compute is expensive and that re
For a lot of AI teams, the incentive question has become embarrassingly simple: how many tokens did you use today?
On the “Founders” podcast, released on Sunday, Cognition CEO Scott Wu tried to put that impulse back in context—without pretending it’s wrong. “It is directionally correct,” he said, “but I think there are definitely some places where people have gotten carried away.”
His example was blunt: “People are like, ‘We rank our engineers by how many tokens they’re spending.’ Well, let’s try and rank people by how much output they’re actually producing.”
Wu’s warning lands at an awkward time for the industry. Cognition is among the most visible names in AI software engineering, and it’s built its momentum on the promise that models can turn expensive compute into real work done.
Cognition, founded in 2023 by Wu, Steven Hao, and Walden Yan, is best known for Devin, an autonomous AI software engineer. The San Francisco-based company has drawn investment from Founders Fund, Khosla Ventures, Elad Gil, and Pear VC. In May, it raised more than $1 billion at a $26 billion post-money valuation, making it one of the most valuable AI coding startups globally.
Wu argued on the podcast that compute costs shouldn’t be ignored—but neither should the performance upside. He acknowledged “compute is expensive,” then pointed to a practical test: if engineers can ship three times more than they would without AI, it is “clearly worth it.”
That’s where his critique tightens. He said companies need to tie rewards to less lofty metrics with outcomes that are easier to verify, such as how many tickets are resolved and how much cheaper and faster a project can be completed.
He’s not alone in targeting the leaderboard trend that has come to be described as tokenmaxxing: using tons of AI tools like Claude, Codex, and Cursor to boost productivity and to get ahead on internal AI use dashboards and reviews.
Earlier this month, Jacob Lauritzen, chief technology officer of legal AI startup Legora, said employees should not be rewarded just for using AI. “A lot of people, say, get a leaderboard and bring up token usage at performance reviews,” he said on a podcast. “That leads to tokenmaxxing, which is people just burn tokens just to look good.”
Lauritzen added: “That’s a really stupid way to do anything.”
The pushback extends beyond startups that sell AI coding. At a Bloomberg conference this month, Andrew Feldman, CEO of Cerebras Systems, said giving employees unlimited tokens was “boneheaded from the get-go.” He used an efficiency analogy to make the point: “You don’t need a Ferrari to go to the grocery store, right? Use a lower-cost open source model.” He then framed the lesson in everyday terms: “What we’re learning is how to shop at Costco.”
Wu’s own stance threads a needle: token tracking can be useful, but the incentive design can go off the rails when token burn becomes the scoreboard. If the industry keeps rewarding the wrong number, teams may optimize for consumption instead of results, exactly the kind of “getting carried away” Wu warned about.
Right now, the debate inside companies is less about whether AI usage should be measured than about what measurement should drive. For Cognition’s CEO, tokens aren’t the enemy; bad ranking systems are. Rewards tied to tickets resolved, and to projects that get cheaper and faster, are meant to bring the focus back to shipping work, not lighting up dashboards.
So basically stop counting tokens?? Cool story.
This is why tech is messy now. If they’re ranking by tokens, people will just spam usage and call it work. Also isn’t “tokens” like… money? Not sure.
I listened to like 10 minutes of the podcast clip or whatever and it sounded like he’s saying pay attention to output, not just the bill. But then they raised a billion so idk, seems like they still want more compute anyway? Like output could be anything. “Three times more” is such a vague flex.
Tokens leaderboard sounds like a way to cheat the system, which happens everywhere. My cousin works in AI and they don’t even use “tokens” the way people talk about it on TV, so maybe this is just insider stuff. But the CEO saying compute is expensive… yeah, no kidding. If they’re spending more tokens then maybe the model is just making more errors and they don’t notice until later. Anyway incentives always mess people up.