What would happen if a fake company were created from scratch, staffed completely by bots? Researchers at Carnegie Mellon wanted to know the answer, so they launched The Agent Company, a virtual simulation of a software company that put artificial intelligence "agents" from such companies as OpenAI, Google, Meta, Amazon, and Anthropic to work. These AI employees toiled as software engineers, financial analysts, and other white-collar workers to complete daily tasks that mimicked what an actual software firm would do, such as scrolling through file directories and penning performance reviews for engineers. The results? Insider notes the experiment was a "total disaster," while Futurism calls it "laughably chaotic."
The "winner," if you can call it that, was Anthropic's Claude 3.5 Sonnet, which finished fewer than a quarter of the tasks it was given, with each task taking an average of a half hour and costing more than $6 to complete. Coming in last: Amazon's Nova Pro 1.0, which finished a dismal 1.7% of its tasks. Most proponents of using AI agents, however, point out that humans would likely work in tandem with them in real life, and Futurism finds a silver lining: at least "the machines aren't coming for your job anytime soon."
Live Science, meanwhile, has compiled a list of more than 30 other eyebrow-raising times when AI got it "catastrophically wrong," from a Microsoft bot getting "inappropriate" on social media to facial recognition technology flagging members of Congress as criminals.