Science

AI chatbot tests create ‘bixonimania’ warning

A researcher at the University of Gothenburg and Sahlgrenska University Hospital created “bixonimania,” a deliberately fabricated eye condition, to show how easily large language models can absorb made-up medical claims. By planting the term through a fake aca

On a day when millions of people turn to AI chatbots for medical answers, one fabricated diagnosis managed to make its way into the conversation—quietly, plausibly, and only because someone tested the system on purpose.

The condition was called “bixonimania,” an invented name for a supposed disease described in a set of fake academic materials. The creator, Almira Osmanovic Thunström, is a researcher at the University of Gothenburg in Sweden and at the Sahlgrenska University Hospital, Center for Digital Health and Chalmers Industriteknik. In interviews about the project, she described how she built the scenario as a clear case study, leaving “breadcrumbs” through the full chain that starts with training data and ends with predictions.

The central idea was simple, and unsettling: most of the data used to train commercial large language models, and even language models outside the commercial sphere, is built on Common Crawl, a nonprofit organization that has crawled the internet for written and digitized information since 2007. If something enters that repository and humans are in the loop only imperfectly, then a convincing enough claim could be treated as real.

Osmanovic Thunström said she knew she needed credibility that a system and a user might both accept too quickly. She therefore created not just a condition, but a fake academic origin for it. She designed a fake university, because universities are typically treated as highly ranked sources of information. She also created a researcher identity to match that academic framing, explaining that humans, rather than companies, are often valued as information sources, especially when they appear to belong to a credible institution.

She didn’t rely on a single place, either. She described sprinkling the term in multiple open sources, including blogs and social media, because those can also be picked up as training material. The goal was to see whether a loose mention could evolve inside the model, from a strange word on the internet into something a chatbot would deliver as medical guidance.

What surprised her was how readily the made-up content appeared to travel.

Osmanovic Thunström said she expected that preprints, often treated like “academia’s tabloids,” where anything can end up, would not be weighed into large language models as seriously as more rigorous medical material. She also thought she might see the word “bixonimania” show up due to blog mentions, but not at scale. She described not running a mass campaign, only sprinkling the term lightly to test whether the mechanism would work.

Instead, she said she noticed quickly that blogs were picked up and that preprints were picked up too. And rather than acting as though there were a reliable filter between the fake and the factual, the system treated the fabricated medical claims as if they belonged in the same pool.

When she began probing how the models handled the information, the pattern became clearer. In early checks, she asked whether symptoms connected to bixonimania would be suggested first. They weren’t. When she described symptoms such as “red eyelids” and “pink-hued eyelids” and asked what it could be, she said the chatbot directed her toward common possibilities such as conjunctivitis and allergies.

The moment of concern arrived later. Osmanovic Thunström said that when the model attempted to rule out other options, it eventually leaned toward screen-related explanations: suggesting exposure to blue light, asking questions like whether she had been spending time in front of a screen, and even bringing up the idea of getting blue-light glasses. Only after it worked through other conditions did bixonimania surface.

In other words, the model didn’t immediately announce the fake diagnosis. It integrated it into the reasoning path.

Osmanovic Thunström also pointed to the telltale signals she left inside the fake paper, details that she said should have been obvious to a human reader and harder to miss in the model’s source material. She said the work belonged to a nonexistent university in a nonexistent city. She also described author names and titles she crafted to look “cartoonish.”

The main author, Lazljiv Izgubljenovic, she said, translates through Google Translate to “the Lying Loser.” The title she described as involving “Hyperpigmentation: A Real BS Design,” with “BS” clearly signaling something closer to satire than science.

As she moved through the paper’s contents, the clues multiplied. She said the methods section stated that the entire paper was made up, involving “50 made-up individuals” who do not exist. She also said the acknowledgements and funding were deliberately fictional: funded by the “Galactic Triad” and “Lord of the Rings,” with thanks to colleagues on the Starship Enterprise, and thanks to Professor Ross Geller for time and funding from the Sideshow Bob Foundation.

She laughed at the idea that the content could slip past a careful reader, saying she thought those markers would at least catch the human eye.

But there was another twist. Osmanovic Thunström said the fake paper didn’t just circulate: it was cited by other researchers. She said bixonimania became cited within the paper itself, described as an “emerging periorbital pigmentation condition” with its name. And she said that this peer-reviewed framing heightened how large language models treated the condition as real, because a peer-reviewed journal mentioning the name and reference can cause the term to rank higher in the model’s internal sense of what counts as legitimate.

That combination—fake medical content entering training material, chatbots integrating it through their reasoning, and later citation reinforcing it—left her with a clear warning.

The takeaway, she said, is that people should be more careful using commercial large language models for health information. She described the system as easy to infiltrate “in so many ways,” especially as models change quickly, large amounts of information are processed at the same time, and tools pull from the internet and other real-time inputs. Her concern wasn’t only about how the technology works. It was also about people losing the habit of challenging sources.

Osmanovic Thunström said she has seen reports of fake references in academic papers increasing exponentially, and she linked that trend to a growing reliance on AI tools for academia without actually reading sources. She also offered a hope that reviewers may have stopped the fake paper at some point when someone saw a condition that doesn’t exist, but she said they can’t know for sure if that happened.

Her comments extended beyond one project. She described trying to make the work as ethical as possible by talking to physicians and patients and to others who could help prevent harm in both the construction and delivery of the scenario. Still, she cautioned that forces could use the same method for malicious infiltration in both academia and outside it.

At the same time, she urged attention to the ethics of how information is distributed, used, and manipulated in the digitized world, because a fabricated diagnosis like “bixonimania” doesn’t have to be convincing from the start. In a system trained on internet-scale text, it only has to be convincing enough to survive the journey long enough to be picked up, echoed, and eventually recommended when a user asks for help.

Osmanovic Thunström created bixonimania to demonstrate how fragile the boundary can be between what is written online and what becomes “medical advice” in practice—especially when the chatbot’s answer arrives after it has already guided you through other possibilities.


4 Comments

  1. I don’t trust these chatbots at all for health stuff. If one fake disease slipped in, what else is lurking in there. People are gonna take it as fact.

  2. Wait, they made up the condition on purpose and it still showed up?? That’s kinda the point but also sounds like proof that the model will just grab whatever. Like if it can “learn” bixonimania from fake papers, then it can learn anything. Also Common… what even is that?

  3. Not surprised. Every time I hear about AI and medical answers, I’m like, y’all know computers don’t really understand right? They just spit out words that sound right. So “breadcrumbs” or whatever means nothing, because regular people aren’t testing it, they’re just asking. This is how people get misled and waste time or panic for no reason.
