Businesses are throwing cash at AI development, but a new survey reveals a major flaw: they aren’t testing it enough.
According to Applause’s latest State of Digital Quality in AI Survey, companies are rushing to build AI-powered tools, yet critical quality assurance (QA) practices are being left behind. The result is glitchy AI that misunderstands users, delivers biased answers, and even makes things up.
More than 70% of software professionals say their companies are developing AI applications, with chatbots and customer support tools leading the charge. But despite the hype, 65% of users report running into AI failures in the past three months. Complaints include vague responses, misunderstood prompts, and even offensive content.
Gen AI productivity boost, but at what cost?
While AI tools like GitHub Copilot and OpenAI Codex are helping developers work faster, boosting productivity by up to 74% for some, many companies have yet to integrate gen AI into their core development tools. Worse, only a third of professionals are using red teaming, a key method for exposing AI flaws before they reach users.
Consumers are getting picky. Nearly a third (30%) have switched AI services, and over a third (34%) use different gen AI tools for different tasks. Demand for multimodal AI, which processes text, images, and voice, is rising fast—78% of users now expect it, up from 62% last year.
Chris Sheehan, EVP of high tech & AI, Applause, said: “Given massive investment in the technology, we’d like to see more developers incorporate AI-powered productivity tools throughout the SDLC, and bolster reliability and safety through rigorous end-to-end testing. Agentic AI is ramping up at a speed and scale we could hardly have imagined, so the risks are now amplified. Our global clients are already ahead of the curve by baking broad AI testing measures into development earlier, from training models with diverse, high-quality datasets to employing testing best practices like red teaming.”