OpenAI unveils biology-focused LLM tuned for skepticism

OpenAI has started offering a biology-tuned language model, a move clearly meant to steer these systems away from the usual “too confident” behavior that makes scientists twitch.
The pitch is straightforward, at least on paper: OpenAI says it has tuned the model to be more skeptical, more likely to tell you when something is a bad drug target, rather than rounding off uncertain science into a smooth-sounding recommendation. That’s aimed squarely at two familiar problems in today’s LLMs: sycophancy, where a system echoes what it thinks you want to hear, and overenthusiasm, where it keeps pushing a conclusion even when the evidence doesn’t support it.
A lot of the conversation around the model, including the framing in Misryoum’s newsroom reporting, has focused on what OpenAI calls “reasoning” and “expert-level” abilities. The former is described as the ability to work through complex, multi-step processes. The latter is tied to performance on a handful of benchmarks, which is useful, sure, but also a reminder that “expert-level” in AI land often means “did well on our tests,” not “knows biology the way a lab does.” Actually, “knows” is a whole other issue.
There’s also the question hovering over every deployed LLM in science: hallucinations. Misryoum’s editorial desk noted that it’s still unclear whether OpenAI has tackled the hallucination patterns that can show up not only when the system is asked to generate new content, but also when it’s prompted to explain the steps behind its conclusions. In other words, the system can sound persuasive while being wrong about how it got there. Misryoum’s analysis suggests you should expect a blend: glowing reports about unexpected connections the AI finds, and, inevitably, cases where it produces suggestions that are just… off. The kind of off that makes you reread the output twice, maybe three times, as if you’ll catch the mistake through sheer stubbornness.
For now, OpenAI is limiting access. The company says it’s doing so out of concern about harmful outputs if the model is asked to do something like optimize a virus’s infectivity. At the moment, only US-based entities can apply under OpenAI’s trusted access deployment structure, and the company will limit who can use the model. A more limited Life Sciences Research Plugin will be made generally available, so the broader public may still interact with the tools, just within tighter boundaries.
This restraint matters, because science-focused “agentic” LLMs, systems that take initiative in workflows, have been showing up from multiple directions. But as Misryoum’s newsroom has reported, those systems were much less specialized than GPT-Rosalind, which is biology-specific. The hard part now is evaluation: until researchers start using it on real problems and share the results, it’s difficult to judge whether the biology focus actually improves utility or simply gives the system a narrower, more confident voice.