Neuroscientist spots “curse” in mice, reshapes learning models

Assistant professor Kauê Machado Costa says his experiments often overturn his expectations—then uses those surprises to challenge a two-system view of learning. From a dopamine-based learning signal that appears tied to non-reward prediction errors, to orbito
When Kauê Machado Costa talks about his career, he doesn’t describe it as a straight line of triumphs. He calls it a “slightly cursed” run of predictions.
Over the course of his career, he says, he has made a hypothesis, predicted an outcome, and, rarely, been right. Almost every time, he adds, the experiments come back with the opposite result or something he hadn’t imagined. He jokes about it, but he also treats it like fuel. “That’s why I facetiously called it a curse,” he says, “but actually it’s also a blessing, because it means that my career has been very exciting.”
That mindset now sits behind his research into learning: how the brain updates what matters and what’s likely, and how future behavior gets shaped by past experience. Costa, an assistant professor in the Department of Psychology at the University of Alabama at Birmingham, is one of the 2026 Young American Scientists featured in a segment for “The Young American Scientists,” an editorially independent project produced with financial support from Regeneron.
Costa’s path through experiments has also pushed him into a more uncomfortable lesson: sometimes the model system is the problem, not the hypothesis.
He described a turning point toward the last quarter of his Ph.D., when he worked with a transgenic line widely used in neuroscience research, known as DAT-Cre mice. The line is designed to enable genetic manipulation specifically in dopamine neurons that express DAT, the dopamine transporter, an approach many labs rely on. Costa was testing the effect of knocking out a particular gene.
He ran the experiments blind to genotype, part of a rigor he emphasizes. But when he finally unblinded the data, the control mice were “acting kind of funny,” he says, and their behavior was “severely affecting the conclusions” he was trying to draw. Rather than switch tracks, Costa says he treated the odd behavior as information.
His deeper dive revealed what he calls a strong sex-dependent phenotype in the specific substrains and versions he was using. Native expression of the dopamine transporter was reduced or impaired. In practical terms, especially in females, the mice showed traits that mapped onto a model of attention-deficit/hyperactivity disorder: they were hyperactive and had lower DAT function.
In the paper he and his team wrote up—selected as one of the top 100 neuroscience papers downloaded in the journal that year—Costa said researchers should look into this background effect if they’re trying to detect behavioral effects from a genetic manipulation. The core warning he delivered was simple: the strain itself already carries this background phenotype.
It’s a caution that lands differently given his research philosophy. Costa says he approaches projects by starting with predictions rather than exploring without direction: his lab begins with a hypothesis and a specific, strong prediction. If the prediction is wrong, he argues, the result is still “informative.” Without that, he says, many of his projects would have ended in “dismal failure,” though he adds that experiments can still fail for multiple reasons.
That tension—between what researchers expect and what the data actually show—becomes the backbone of how Costa thinks learning works.
For years, a major framework in learning science has split learning into two camps. One view is that the brain learns the value of cues and actions: how motivationally relevant an outcome is. In today’s computational language, this is often called model-free learning. It relies on updating a value function, without needing a detailed internal representation of the world.
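The value-updating idea can be made concrete with a minimal sketch. It assumes a simple Rescorla-Wagner-style rule, a textbook form of model-free learning and not a method taken from Costa's work; the function name `update_value` and the `learning_rate` parameter are illustrative choices.

```python
def update_value(value, reward, learning_rate=0.1):
    """Return (new_value, prediction_error) for one trial.

    Assumption: a textbook Rescorla-Wagner-style update, used here only
    to illustrate model-free learning. Nothing about the world is stored
    except a single cached value for the cue.
    """
    # Prediction error: what was received minus what was expected.
    prediction_error = reward - value
    # Nudge the cached value toward the received reward.
    new_value = value + learning_rate * prediction_error
    return new_value, prediction_error


# A cue is repeatedly followed by a reward of 1.0.
value = 0.0
for trial in range(50):
    value, error = update_value(value, reward=1.0)
```

As the cue's value approaches the reward, the prediction error shrinks toward zero. That shrinking error is the kind of signal classically attributed to dopamine activity.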
The other view treats learning as building a richer representation: the brain creates a simulation of the external world and learns associations among events, estimating probabilities of what will happen based on what it has observed. This approach is often called model-based learning. It has advantages, like letting you make inferences about events that haven’t occurred, while also demanding more energy and computational resources.
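A model-based learner can be sketched just as briefly. The example below assumes a toy count-based model of which event follows which. The class `TransitionModel`, its method names, and the event labels are hypothetical and are not drawn from Costa's experiments.

```python
class TransitionModel:
    """Toy model of the world: counts of which event follows which.

    Assumption: a simple count-based estimator chosen for illustration.
    """

    def __init__(self):
        self.counts = {}  # event -> {next_event: times observed}

    def observe(self, event, next_event):
        followers = self.counts.setdefault(event, {})
        followers[next_event] = followers.get(next_event, 0) + 1

    def probability(self, event, next_event):
        """Estimated probability that next_event follows event."""
        followers = self.counts.get(event, {})
        total = sum(followers.values())
        if total == 0:
            return 0.0
        return followers.get(next_event, 0) / total

    def two_step_probability(self, start, end):
        """Chain learned links to infer a relation never observed directly."""
        return sum(
            self.probability(start, middle) * self.probability(middle, end)
            for middle in self.counts.get(start, {})
        )


model = TransitionModel()
for _ in range(3):
    model.observe("tone", "light")
model.observe("tone", "nothing")
for _ in range(2):
    model.observe("light", "food")

# "tone" was never paired with "food", but the model can still link them.
tone_to_food = model.two_step_probability("tone", "food")
```

The inference in the last line is the advantage the text describes: the learner estimates an outcome it never experienced. The cost is that it must store and search many more associations than a single cached value.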
Costa’s work, he says, pushes against a tidy split.
He highlights two findings from his research on dopamine signals, which have long been associated with model-free learning. The classic interpretation is that dopamine activity represents prediction errors: the difference between received and expected reward value. But in one set of experiments, Costa says he showed that dopamine signals can represent prediction errors about things that don’t have reward value. In that sense, the signals “approximate much more a model-based prediction error than a model-free prediction error,” he said.
In another study, Costa turned to the orbitofrontal cortex, a part of the frontal cortex located above the eyes in humans. Previous research, he said, had associated this region with model-based learning, including a hypothesis that it stores cognitive “map” representations for task execution.
Costa’s initial hypothesis was that inactivating the orbitofrontal cortex would disrupt model-based learning and trigger a default to model-free strategies—if one system can’t operate, the other fills in.
What he found was different. Model-based learning was disrupted when the orbitofrontal cortex was inactivated, but the disruption was “very specific.” The rats, he says, could still create a model of the world. The problem was the model itself: it became confused and imprecise. From that result, Costa’s proposal shifts the emphasis. The orbitofrontal cortex may not be the switch that enables all model-based learning. Instead, he argues, it may be particularly important for linking specific events to each other. When it isn’t working, models degrade: they become less detailed, less accurate, and more likely to confuse associations.
Costa also connected this idea to mental health. He suggested that maladaptive behavior and disease states may not be explained only by an opposition between model-free and model-based learning. Instead, he says, they may be better understood as the brain deploying models of different structures: some detailed and accurate for a task, others confusing and distorted.
One example he pointed to involves latent inhibition, a process of attentional filtering described by Costa as a way to ignore information that is irrelevant. In a study he published, he said, the orbitofrontal cortex was essential for latent inhibition. People with schizophrenia, he added, have deficits in latent inhibition, meaning they don’t filter out information efficiently and may treat too much of it as relevant. Costa framed the consequence as spurious associations that can contribute to hallucinations and cognitive disorder.
He then floated an alternative lens for schizophrenia: rather than a general malfunction of model-based learning, a disordered model with a particular structure might be generating the symptoms. The approach, he said, fits the goals of computational psychiatry.
He also raised addiction and substance use disorder. He acknowledged a theory that he says has “fallen out of fashion,” which described a transition from model-based to model-free strategy after substance use, away from detailed representations and toward an over-prioritization of rewards and values. Costa said he prefers a different framing: instead of a clean switch between strategies, he argued, drug abuse may lead to the creation of a disordered model.
He connected that idea to behavioral treatments, including the success of contingency management in drug abuse treatment. He said that success is harder to explain if you assume an overdominance of model-free strategies. Costa stressed that these are controversial ideas and said many colleagues disagree—but he framed them as the direction his thinking has been leading based on his previous work.
Taken together, his research threads point toward a central idea: learning might not be a simple contest between two systems. It may be about models of different complexity that recruit different brain areas, depending on the task and the quality of what those circuits are able to represent.
In the end, Costa says his lab is pursuing questions that are both mechanistic and informational. One major goal is figuring out the “informational content” of dopamine teaching signals—whether dopamine truly carries information that extends beyond reward prediction errors.
He wants to know what dimensions of information contribute to those signals.
He’s also investigating interactions between neuromodulators. In a recent study he published with Zhewei Zhang, a fellow postdoc at the NIH, and in work involving his postdoc advisor Geoffrey Schoenbaum, Costa said he found that recording acetylcholine alongside dopamine reveals that their interactions change depending on whether dopamine responds to reward versus motivation-related processes. That led him to ask how different neuromodulators interact in learning, and whether these interactions increase the brain’s “information capacity.”
And then there’s the orbitofrontal cortex, with an even bigger question behind it: what environmental properties or conditions determine when the brain creates more detailed, simpler, more precise, or more generalized models.
Costa ends by returning to the same core question that runs under his lab’s predictions, his dopamine experiments, and his orbitofrontal cortex findings: what mechanisms (informational, cellular, and molecular) determine what the brain learns, and what gets incorporated into the mind when it is exposed to events in the world.
For viewers of the episode, there’s also a quieter takeaway hidden in his opening joke. In a research culture that often rewards certainty, Costa’s “curse” is a reminder that the mind learns to adapt—not only the brain inside the experiment, but the scientist inside the lab.
So the mice are cursed now? I’m sure that checks out.
Wait, dopamine and prediction errors like… that’s basically the brain’s “expectation” thing right? Kinda wild he says his results are backwards a lot, but I mean science is like that I guess. Does this help with humans or just mice lab drama?
I don’t get why people act like it’s some big thing. If you’re testing learning models on mice and they don’t do what you thought, isn’t that just the experiment failing? Also “two-system view” sounds like politics, not neuroscience. Unless it’s saying dopamine controls punishment??
This headline makes it sound like he discovered an actual curse, like witchcraft but for lab rats lol. But then it’s like, dopamine signals and “non-reward prediction errors”?? I swear I’ve heard stuff like that before and it always turns into some confusing model. If his predictions are “opposite” then how is that useful, unless he’s just guessing differently each time. Anyway, sounds like he’s doing the right kind of chaotic research.