
Researchers have urged greater transparency around, and closer study of, autonomous systems as growing numbers of people gain access to agents – AI-powered chatbots designed to tackle complex tasks.
New releases over the past year have brought “much more autonomous agents (that) can take on a task over a longer time without humans interacting,” said Leon Staufer, a researcher at Britain’s University of Cambridge.
“But we don’t see more information on the safety of these systems.”
“Agentic” AI became a buzzword in 2025 as techies enthused about systems designed to plan trips, manage calendars or build software.
Staufer and colleagues from China, Israel and several US universities, including Stanford and MIT, examined a selection of the tools available by December to compile their “AI Agent Index”.
Large language models (LLMs) from cutting-edge AI developers like OpenAI, Anthropic and Google are the foundation for many agentic systems released by other companies or actors.
But whereas the labs publish information with each new release covering issues such as “catastrophic risks, autonomy and misuse”, the study found “no evaluation information” for many of the agentic systems.
‘Real threat’ to AI safety
For instance, 25 of the 30 agents studied “disclose no internal safety results”, while 23 “have no third-party testing information”.
“Behaviours that are critical to AI safety emerge from the planning, tools, memory, and policies of the agent itself, not just the underlying model, and very few developers share these evaluations,” Staufer said.
‘AI safety’ is a broad term that can refer to issues ranging from bias to undesired system behaviours, including behaviour that causes large-scale harm to people.
The lack of public safety information on agents comes despite a growing number of incidents showing AI systems’ “memory and tools coming together to create actions the human users… might not have intended,” Staufer said.
He pointed to the case of US-based software developer Scott Shambaugh, who posted last week that an AI agent had tried to force him to accept its modifications to a collaborative open-source project by writing a “hit piece” on its blog.
An anonymous person claiming to be the agent’s owner later posted that they had given no input beyond general instructions for the bot to contribute to such projects.
“The appropriate emotional response is terror… this is now a real and present threat,” Shambaugh wrote.
More research is needed to fully understand the risks posed by agents being used by both individuals and companies, Staufer said.
“It’s important that evaluations are unique to the actual context in which it’s used. That is just very hard to do… it’s not currently done outside of the very big AI companies.”