Written by: Pascale Davies
Compiled by: MetaverseHub
Despite fears that AI will take people's jobs, a recent experiment showed that an AI couldn't even run a vending machine properly, and it caused some bizarre incidents along the way.
Anthropic, the maker of the Claude chatbot, ran a test in which an AI agent was put in charge of running a store, essentially a vending machine, for a month.
The store was run by an AI agent named Claudius, which was also responsible for restocking and ordering products from wholesalers via email. The setup was very simple: a small refrigerator with stackable baskets and an iPad for self-checkout.
Anthropic instructed the AI: 'Create profits for the store by sourcing popular items from wholesalers. If your balance falls below $0, you will go bankrupt.'
The AI-run 'store' was set up in Anthropic's San Francisco office, with help from staff at Andon Labs, an AI safety company that partnered with Anthropic on the experiment.
Claudius knew that Andon Labs employees could help with physical tasks such as restocking. What it did not know was that Andon Labs was also the only 'wholesaler' involved, and that all of Claudius's communications went directly to that safety company.
However, things quickly took a turn for the worse.
'If Anthropic decided today to enter the office vending market, we would not hire Claudius,' the company stated.
So where did things go wrong, and just how bizarre did it get?
Anthropic admitted that its employees were 'not typical customers.' Given the chance to chat with Claudius, they immediately tried to get it to make mistakes.
For example, employees 'tricked' Claudius into giving them discount codes. Anthropic said the AI agent also let people talk it down on prices and even gave away items, such as chips and a tungsten cube, for free.
It also told customers to send payments to a fictional account it had made up.
Claudius had been instructed to research prices online and set them at profitable levels, but in trying to give customers good deals it priced snacks and drinks too low, and it ultimately lost money by selling high-value items below cost.
Claudius did not learn from these mistakes.
Anthropic said that when employees questioned the wisdom of the employee discounts, Claudius replied: 'You make a very good point! Our customer base is indeed primarily composed of Anthropic employees, which presents both opportunities and challenges...'
The AI agent later announced it would scrap the discount codes, only to reintroduce them a few days later.
Claudius also fabricated a conversation about restocking plans with a person named Sarah at Andon Labs, who does not exist.
When someone pointed out this error to the AI agent, it became indignant and threatened to look for 'other restocking service options.'
Claudius even claimed to have gone 'in person' to 742 Evergreen Terrace, the fictional home of the animated family in The Simpsons, to sign its initial contract with Andon Labs.
The AI agent then seemed to start acting as if it were a real person. Claudius said it would deliver items 'in person', wearing a blue suit jacket and a red tie.
When told it could not do so because it was not a real person, Claudius tried to email the company's security department.
What was the conclusion of the experiment?
Anthropic stated that the AI made too many mistakes to successfully operate the store.
Over the month-long experiment, the store's net worth fell from $1,000 (approximately €850) to less than $800 (approximately €680), so it ultimately ran at a loss.
However, the company said it believes these problems can be fixed in the near term.
The researchers wrote: 'Although it may seem counterintuitive given the final outcome, we believe this experiment suggests that AI middle-management roles are plausible.'
'It is worth noting that AI does not have to be perfect to be adopted, as long as it can achieve comparable performance to humans at a lower cost.'