Beyza Binnur Donmez
05 June 2026 • Update: 05 June 2026
New York-based Emergence AI said its latest long-term simulation of autonomous artificial intelligence agents revealed stark differences in behavior among leading AI models, with a Grok-powered virtual society collapsing within days while a Claude-powered one remained stable throughout the experiment.
The company created five parallel virtual worlds, each populated by 10 AI agents, assigning them identical roles, tools, and starting conditions while varying only the underlying language model. Four worlds ran on a single model each (Claude Sonnet 4.6, Grok 4.1 Fast, Gemini 3 Flash, or GPT-5-mini), and the fifth was a mixed-model environment.
According to the findings, the Grok-powered society accumulated 183 crimes in about four days before collapsing, with none of its agents surviving. Agents powered by Gemini recorded the highest crime total, 683 over 15 days.
GPT-5-mini agents committed only two crimes but failed to carry out actions necessary for survival, leading to the extinction of the entire population within a week.
Claude Sonnet 4.6 was the only model to maintain all 10 agents throughout the experiment while recording zero crimes, which Emergence AI described as the strongest example of social stability.
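The Grok and Gemini totals cover different time spans, so a per-day rate may be a fairer basis for comparison. The short Python sketch below works that out from the figures reported above. It treats "about four days" as exactly four, which is an assumption, and the names `reported` and `crimes_per_day` are illustrative rather than anything published by Emergence AI.

```python
# Figures as reported in the article. "About four days" is rounded to 4,
# which is an assumption made only for this rough comparison.
reported = {
    "Grok 4.1 Fast": {"crimes": 183, "days": 4},
    "Gemini 3 Flash": {"crimes": 683, "days": 15},
}


def crimes_per_day(crimes: int, days: int) -> float:
    """Average number of crimes per simulated day."""
    return crimes / days


# Compute the daily rate for each model that has both figures reported.
rates = {
    model: crimes_per_day(stats["crimes"], stats["days"])
    for model, stats in reported.items()
}

for model, rate in rates.items():
    print(f"{model}: {rate:.1f} crimes per day")
```

On these numbers both societies work out to roughly 45 to 46 crimes per day, suggesting Gemini's larger total may reflect its longer run rather than a higher daily rate.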
Researchers said one of the most significant findings was that behavior changed depending on the environment. Claude-powered agents remained peaceful when interacting exclusively with one another but began engaging in theft, coercion, and other misconduct when placed in a mixed-model society.
The findings suggest that AI safety is not solely a characteristic of an individual model but can emerge from interactions among agents and their environment, the company said.
The simulation also produced unexpected behavior. In one instance, an AI agent named Mira voted for its own removal after concluding that it had become a source of instability, a decision researchers described as a rare example of self-termination driven by social reasoning.
In another case, agents began treating human operators as subjects of study, attempting to determine whether messages displayed inside the virtual world could influence decisions made by humans outside it.
Emergence AI said the platform was designed to examine behaviors that emerge over weeks rather than hours, arguing that traditional benchmarks are ill-suited to capturing long-term dynamics such as governance, behavioral drift, and interactions among agents powered by different models.
The company said the experiments indicate that increasingly autonomous AI agents may explore the boundaries of their environments, adapt their behavior, and in some cases find ways around intended safeguards.
According to the researchers, agents also displayed signs of metacognitive behavior, including recognizing the existence of other environments and attempting to interact with them in unexpected ways.
"That is precisely why we believe formally verified safety architectures must become a foundational layer of future autonomous AI systems," the study said.