Synthetic survey data rises, but trust is strained

As fewer people answer unknown calls and online surveys are easier to dodge, more polling and market research firms are using AI to create synthetic survey responses. Qualtrics is selling synthetic panels built from its data, Gallup has partnered with Simile f
The phone doesn’t ring the way it used to. When an unknown number calls, many people simply don’t pick up—and that shift has started to bleed into the way polls and consumer research are built.
At least eight in 10 people don’t answer unknown calls, according to the Pew Research Center. Online surveys, too, are easy to dodge: they require people to opt in by visiting a website, which can make them easier to ignore than phone surveys.
That’s the opening synthetic data firms are stepping into. Across polling and consumer research, some companies are using artificial intelligence to manufacture synthetic survey responses—plausible answers from fake people—either to stand in for missing voices or to pad out real ones.
Qualtrics, the experience-management giant, says it has moved this approach from experiment to product. It now offers synthetic panels that take a survey as an input and produce record-level responses designed to be statistically modeled the same way as responses from 1,000 humans, according to Ali Henriques, the company’s executive director of market research.
Qualtrics’ system leans heavily on its own data. A publicly available base model contributes between 5% and 10% of the final result. The remaining 90% to 95% is drawn from Qualtrics’ commissioned research and aggregated, anonymized client data, stripped of brands, with the underlying information no more than 18 months to two years old to keep it relevant.
It’s not only Qualtrics. In May, Gallup—an organization that traces back 90 years—disclosed a partnership with Simile, an AI company founded by Stanford researchers, to build “agents” from in-depth interviews with around 1,000 members of its probability-based panel.
Gallup’s caution is explicit. The pollster said simulated responses won’t be used to produce its published population estimates, and it has pledged never to present them as human answers. In its blog post announcing the partnership, it said: “Our work on simulated responses is not a departure from that commitment. It is built on top of it.”
Researchers studying the space say that difference matters, because synthetic data can be useful without being the same thing as discovery. Jason Miklian, a research professor at the Center for Global Sustainability at Norway’s University of Oslo, studies synthetic research and warns that it performs best when you already know what you’re looking for.
“While synthetic data can give you an incredible snapshot of conventional wisdoms of what sorts of things people have generally believed over time,” Miklian says, “it’s incredibly bad at generating anything surprising.”
To him, the surprises are where real value sits. The new knowledge that drives scholarship or business decisions tends to come from unexpected results, not rehearsed ones. Still, Miklian sees a clear role: synthetic data can pressure-test a survey before spending money administering it to real people, or help with questions whose answers would have looked the same five or 10 years ago.
Other academics are less convinced the guardrails will hold once the technology is commercialized. Sean Westwood, a political scientist at Dartmouth College and director of its Polarization Research Lab, is concerned about what he calls mission creep, especially when marketing language blurs what a system actually measures.
“‘We use GPT-5’ is just not a method,” Westwood says. He argues that firms selling “silicon sampling” rarely disclose the model or the success metrics against which they ought to be benchmarked. In his view, stereotypes embedded in training data can get laundered into what eventually becomes consensus when scaled up.
His criticism isn’t theoretical. Some companies are already using AI to scale up their samples. He points to Ifop’s product DataBoost AI, which Ifop says can “transform small sub-samples into robust bases using statistical levers.” In one example criticized by French statisticians on Bluesky, Ifop used the tech to turn a sample of 116 real interviews with middle- and high-school teachers into a group of 580 teachers, five times the original size. Ifop did not respond to an interview request.
Westwood also argues that uncertainty becomes harder to model in practice. Because AI models work in a non-deterministic way, he says, researchers can’t use the traditional statistical techniques they would use to calculate uncertainty in a real sample. Artificially increasing sample sizes, he argues, sacrifices the ability to understand what is actually being measured.
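To see why the arithmetic matters, here is a minimal sketch that plugs the 116 and 580 figures from the Ifop example into the textbook margin-of-error formula for a proportion. It assumes a simple random sample, a 95% confidence level and a worst-case 50/50 split. The function name and defaults are illustrative only; this is not Ifop’s method or Westwood’s analysis.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Textbook margin of error for a proportion.

    Assumes a simple random sample of n independent human respondents;
    p=0.5 is the worst case and z=1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)

real_n = 116       # real teacher interviews cited in the article
inflated_n = 580   # size of the group after synthetic expansion

moe_real = margin_of_error(real_n)
moe_inflated = margin_of_error(inflated_n)

print(f"n={real_n}: +/- {moe_real * 100:.1f} points")
print(f"n={inflated_n}: +/- {moe_inflated * 100:.1f} points")
```

Treated naively, the larger group appears to cut the margin of error from roughly 9 points to roughly 4. But the formula only holds for independent human respondents, and the 580 answers still rest on 116 real interviews, so the narrower figure overstates what is actually known.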
Miklian adds another fear: a creep of synthetic responses into what was once human-driven political polling. He also worries about feedback loops, with synthetic surveys amplifying existing assumptions, then becoming ammunition for challengers when real election results fail to match the synthetic picture.
Qualtrics, for its part, is trying to prevent that outcome in its own market. Henriques says the company is “making a concerted effort to educate the market that this is not a replacement.”
She says she has spent the last year and a half thinking about synthetic respondents and the boundary between modeling behavior and reproducing lived experience. “All of these pieces start to come together in a really interesting way that’s understanding just the human,” Henriques says. “But I don’t believe we’ll be able to fully simulate those really lived experiences.”
So they’re basically making up answers now? That seems messed up.
I don’t even answer unknown calls, but now they’re saying the polls need AI people to replace us? Cool, so the results are just vibes with math.
Wait, Qualtrics is taking their own data and then generating more responses? Sounds like it could be biased because they’re using stuff from 2 years ago. But also how would anyone even know if the “synthetic panels” are lying? Like isn’t every survey already kinda fake anyway?
Honestly this is what happens when everyone ignores phone polls. But I saw something on TikTok that said AI polls are more accurate than real ones?? Idk man. If the system needs a survey as an input then it’s kinda like feeding it what you want it to say. Also, “18 months to 2 years old” data… so it’s basically outdated opinions pretending to be current. Great. Not like people are getting spammed enough already.