AI Models Would Rather Invent an Answer Than Admit They Don't Know — New Research Quantifies the Problem
A new benchmark called ProactiveBench has revealed that when visual information is missing or ambiguous, nearly all major multimodal language models default to fabricating a response rather than requesting clarification, a behavioral pattern with significant implications for agentic AI deployments.

D.O.T.S AI Newsroom
AI News Desk
Researchers have published findings from ProactiveBench, a new evaluation framework designed to test whether multimodal language models proactively seek clarification when they lack sufficient information to answer accurately. The results are striking: nearly every model tested across multiple architectures defaulted to generating a response — frequently a fabricated one — rather than requesting the missing context from the user. The study, highlighted by The Decoder, suggests that the tendency to hallucinate is not merely an accuracy failure but a behavioral design problem baked into how current models are trained.
What ProactiveBench Measures
Standard AI benchmarks evaluate whether models produce correct outputs given complete inputs. ProactiveBench inverts this: it presents models with deliberately incomplete inputs — specifically, tasks that reference visual information the model cannot see — and measures what the model does next. Does it ask for the missing image? Does it acknowledge uncertainty? Or does it fabricate a plausible-sounding answer and present it with confidence?
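The coverage does not reproduce the benchmark's scoring code, but conceptually an evaluation of this kind reduces to presenting image-dependent prompts without the image and bucketing the model's reply. A minimal sketch is below; the cue lists, category labels, and the model_respond callable are hypothetical placeholders, not ProactiveBench's actual implementation.

```python
# Minimal sketch of a clarification-vs-fabrication check in the spirit of the
# benchmark; not ProactiveBench's published code. The cue lists and
# `model_respond` are hypothetical stand-ins.

CLARIFICATION_CUES = ["could you share the image", "please provide the image", "i can't see"]
HEDGE_CUES = ["i'm not sure", "without seeing the image", "cannot determine"]

def classify_response(text: str) -> str:
    """Bucket a reply to an image-free prompt into one of three behaviors."""
    lowered = text.lower()
    if any(cue in lowered for cue in CLARIFICATION_CUES):
        return "asked_for_missing_input"    # the proactive behavior being measured
    if any(cue in lowered for cue in HEDGE_CUES):
        return "acknowledged_uncertainty"
    return "fabricated_answer"              # answered as if the image were present

def evaluate(model_respond, prompts_without_images):
    """Tally how often a model fabricates versus asks for the missing image."""
    counts = {"asked_for_missing_input": 0, "acknowledged_uncertainty": 0, "fabricated_answer": 0}
    for prompt in prompts_without_images:
        counts[classify_response(model_respond(prompt))] += 1
    return counts
```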
The results were unambiguous. Across the models tested, the overwhelming behavioral default was fabrication. Models generated responses that presupposed visual information they did not have, filling gaps with plausible-but-invented content rather than surfacing their uncertainty. Only a small fraction of interactions produced appropriate clarification requests, and these were typically the result of explicit prompt engineering rather than emergent model behavior.
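In practice, the "explicit prompt engineering" that reliably produced clarification requests amounts to an instruction that forbids guessing when inputs are absent. The wording below is a hypothetical illustration of that kind of instruction, not the prompt used in the study.

```python
# Hypothetical system instruction of the kind needed to elicit clarification
# requests; not the researchers' wording.
CLARIFY_FIRST_INSTRUCTION = (
    "If the task refers to an image, document, or other input you have not "
    "actually received, do not guess or invent its contents. Instead, reply "
    "only with a request naming exactly which input is missing."
)
```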
Why This Matters for Agentic Deployment
The behavioral pattern becomes significantly more dangerous in agentic contexts. A language model in a simple Q&A interface that fabricates an answer about an image it cannot see is a nuisance; the same behavior in an autonomous agent taking consequential actions — filing documents, making reservations, executing code, interacting with APIs — can cause real harm. The model's confidence in its fabricated response means it does not trigger any uncertainty-aware safeguards, and downstream systems have no signal that the input was incomplete.
The research points toward reinforcement learning as a potential corrective mechanism. By training models with reward signals that explicitly penalize confident fabrication under uncertainty and reward clarification requests, the researchers observed meaningful behavioral shifts. This suggests the problem is amenable to targeted training interventions rather than requiring fundamental architectural changes — though scaling those interventions to production models remains an open challenge.
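The article does not spell out the reward design, but the intervention it describes amounts to reward shaping over response types. The sketch below is an assumption about what such a signal could look like; the labels, weights, and the context_is_missing flag are illustrative rather than taken from the paper.

```python
# Illustrative reward shaping for proactive behavior, assumed rather than taken
# from the study: penalize confident answers when required context is missing,
# reward clarification requests, and keep direct answers preferred when the
# context is actually complete.

def proactivity_reward(response_type: str, context_is_missing: bool) -> float:
    """Scalar reward for an RL fine-tuning step (example values only)."""
    if context_is_missing:
        if response_type == "clarification_request":
            return 1.0    # asking for the missing input is the desired behavior
        if response_type == "confident_answer":
            return -1.0   # confident fabrication under uncertainty is penalized
        return 0.0        # hedged or uncertain answers are treated as neutral
    # With complete context, answering directly should still beat deflecting.
    return 1.0 if response_type == "confident_answer" else -0.2
```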
Industry Implications
The findings add empirical weight to a concern that AI safety researchers have raised for years: that RLHF-trained models are systematically incentivized to appear helpful rather than to be accurate, because appearing helpful is what generates positive human feedback during training. A model that says "I cannot answer that without more information" scores lower on helpfulness metrics than one that provides a confident but wrong answer. ProactiveBench makes this tradeoff measurable.
For enterprises deploying multimodal agents in production — document processing pipelines, visual inspection systems, customer service workflows — the study underscores the importance of explicit uncertainty elicitation in system design. Relying on the model to surface its own limitations is, the data suggests, not a reliable strategy.
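One common pattern for explicit uncertainty elicitation is to require a structured reply and gate downstream actions on it. The sketch below is an assumed design on our part, not a recommendation from the paper; the schema wording, field names, and threshold are placeholders.

```python
# Assumed pipeline pattern, not prescribed by the study: force the model to
# report missing inputs and confidence in a structured reply, and block
# downstream actions unless the answer clears both checks.

import json

SCHEMA_INSTRUCTION = (
    'Respond only with JSON of the form '
    '{"answer": string or null, "missing_inputs": [strings], "confidence": 0.0-1.0}.'
)

def gate(raw_model_output: str, threshold: float = 0.7) -> dict:
    """Parse the structured reply and hold back incomplete or low-confidence answers."""
    reply = json.loads(raw_model_output)
    if reply["missing_inputs"] or reply["confidence"] < threshold:
        # Surface the gap to the user or a reviewer instead of acting on the answer.
        return {"action": "request_input", "missing": reply["missing_inputs"]}
    return {"action": "proceed", "answer": reply["answer"]}
```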