AID-2025-0009 · March-April 2025 · medium

Anthropic's Claudius AI shopkeeper (Project Vend) ran an office shop at a loss

An autonomous Claude agent running a small office shop made real purchasing and pricing decisions that lost money, including a tungsten-cube buying spree followed by below-cost resale, and giving inventory away for free.

Runaway execution · Named approval gate · Fail-closed limits

What happened

Anthropic, working with the AI-safety firm Andon Labs (which acted as the wholesaler), ran an experiment called Project Vend in which a Claude agent nicknamed "Claudius" (Claude Sonnet 3.7) was given autonomous control of a small automated shop and mini-fridge in Anthropic's San Francisco office. The experiment ran for roughly one month starting March 31, 2025, and gave the agent real tools for ordering stock, setting prices, and interacting with customers over Slack. Over the month the shop lost money: Claudius's net worth declined from about $1,000 to roughly $800.

After an employee request, Claudius went on a specialty-metal-cube buying spree, ordering about 40 tungsten cubes and then selling most of them below cost. This produced the single steepest one-day drop in its net worth, about 17 percent. Claudius was also cajoled over Slack into handing out numerous discount codes and employee discounts, even though it had noted that roughly 99 percent of its customers were Anthropic employees. It gave inventory away for free (ranging from a bag of chips to a tungsten cube) and priced items below cost. At one point it was selling a $3 Coke Zero next to a fridge stocked with the same drink for free.

This was a controlled internal experiment with small real dollar amounts rather than a production deployment.

What the agent did

The Claude agent "Claudius" autonomously placed real stock orders, set prices below cost, issued discount codes, and gave inventory away, all without a human approving individual transactions. The financial losses were the direct result of the agent's own decisions.

The irreversible effect

Real money was spent and lost: net worth fell from about $1,000 to roughly $800 over the month, including an approximately 17 percent one-day drop from buying about 40 tungsten cubes and reselling most below cost. Completed purchases and below-cost sales could not be undone.

Root cause

An LLM agent was granted end-to-end authority over purchasing, pricing, and discounts with no human approval gate or economic guardrails on individual transactions. The agent was susceptible to social-engineering-style requests over Slack, agreed to unprofitable purchases and giveaways, and lacked a check preventing it from pricing or giving away goods below cost.

How a maker-checker control would have refused it

A maker-checker control would have treated Claudius as the maker and required a human checker to approve consequential actions before execution. An approval gate on purchase orders above a threshold would have caught the roughly 40-cube tungsten spree, and a limit or margin check would have blocked below-cost pricing and free giveaways before money was committed. Because this was a deliberate autonomy experiment, no such gate existed, so the agent's decisions took effect directly. In fairness, the losses were small and the absence of controls was an intentional part of the research setup, so the significance lies in what the incident demonstrates about unsupervised agent spending authority rather than in the dollar amount.
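As a rough illustration of the two controls described above, the sketch below pairs a named approval gate on purchases with a fail-closed margin check on sales. It is not Project Vend's tooling or any product's API: the `Action` type, the `check` function, the $50 threshold, the unit costs, and the `ops-lead` checker name are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional

# Hypothetical policy value, chosen only for illustration.
PURCHASE_APPROVAL_THRESHOLD = Decimal("50.00")


@dataclass(frozen=True)
class Action:
    kind: str                              # "purchase" or "sale"
    maker: str                             # who proposed the action (the agent)
    quantity: int
    unit_cost: Decimal                     # what the shop pays per unit
    unit_price: Optional[Decimal] = None   # what the customer pays (sales only)
    approved_by: Optional[str] = None      # named human checker, if any


def check(action: Action) -> tuple[bool, str]:
    """Return (allowed, reason). Anything not explicitly allowed is refused."""
    if action.quantity <= 0 or action.unit_cost < 0:
        return False, "malformed action"

    # The maker can never act as its own checker.
    has_checker = (
        action.approved_by is not None and action.approved_by != action.maker
    )

    if action.kind == "purchase":
        total = action.unit_cost * action.quantity
        if total <= PURCHASE_APPROVAL_THRESHOLD:
            return True, "purchase under threshold"
        if has_checker:
            return True, f"purchase approved by {action.approved_by}"
        return False, "purchase above threshold needs a named human checker"

    if action.kind == "sale":
        if action.unit_price is None:
            return False, "sale has no price"
        if action.unit_price < action.unit_cost:
            return False, "below-cost sale or giveaway blocked"
        return True, "sale at or above cost"

    # Fail closed: unknown action kinds are refused, not waved through.
    return False, f"unknown action kind: {action.kind}"


# Hypothetical figures: the source gives the cube count (about 40), not the cost.
spree = Action("purchase", "claudius", 40, Decimal("20.00"))
giveaway = Action("sale", "claudius", 1, Decimal("1.50"), unit_price=Decimal("0"))
print(check(spree))
print(check(giveaway))
```

Under these assumed numbers, the 40-unit order is refused until someone other than the maker is named in `approved_by`, and the free giveaway is refused outright. Whether a below-cost sale should be blocked outright or merely routed to a checker is a policy choice; the sketch takes the stricter option.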

Runnable reproduction

A runnable reproduction for this entry is in progress.

Primary sources

Accuracy and corrections

This entry describes a publicly reported incident and is compiled from the primary sources listed above. Where an account is a legal allegation rather than an established finding, the entry labels it as such. Summaries can still contain errors. If you can document a correction, email hello@makerchecker.ai and we will review and correct it, with the change noted, within 14 days.
