In Moffatt v Air Canada (BC Civil Resolution Tribunal, February 2024), Air Canada was held liable for negligent misrepresentation after its website chatbot invented a bereavement refund policy that did not exist and a customer relied on it to book travel. The airline was ordered to pay 650.88 Canadian dollars in damages, and its argument that the chatbot was a separate entity responsible for its own words was rejected outright.
In November 2022, Jake Moffatt used Air Canada's website chatbot to ask about fares after the death of his grandmother. The chatbot told him he could book at the regular price and then claim the bereavement rate retroactively by submitting a request within 90 days of the ticket's issue date. He booked on that basis. When he later applied, Air Canada refused, because its actual bereavement policy did not allow retroactive claims after travel (CanLII).
The policy the chatbot described did not exist. Air Canada's defence was, in substance, that the chatbot was a separate entity responsible for its own words and that the correct policy was available elsewhere on the site. The British Columbia Civil Resolution Tribunal rejected that argument. In February 2024 it found negligent misrepresentation and held the airline responsible for information on its own website, whether it came from a static page or a chatbot (McCarthy Tétrault).
The tribunal ordered Air Canada to pay damages of 650.88 Canadian dollars, the difference between the fare paid and the bereavement fare the chatbot had promised, plus pre-judgment interest and tribunal fees. The sum was small. The principle was not. A company was held to a commitment its automated agent made on its behalf, and the existence of a correct policy elsewhere did not undo the promise (Pinsent Masons).
What actually failed: the governance gap
It is tempting to read this as a hallucination story, and the wrong policy was certainly a fabrication. That framing misses where the harm was created. The chatbot answering a question is not the problem. The chatbot making a financial commitment the airline then had to honour is the problem.
There is a clear line between "the bot answered" and "the bot bound the company". Most chatbot output sits on the safe side of that line. Hours of operation, baggage allowances, general guidance: none of it obligates the firm to anything. A statement that creates a refund entitlement is different in kind. It is a consequential action, and in Moffatt the tribunal treated it as one.
The governance gap is that nothing in the path distinguished between those two classes of output. The chatbot could state a routine fact and could create a financial obligation through the same unreviewed channel, with no step that required a human to confirm the second. There was also no durable record of what the chatbot had specifically promised, which is why the dispute turned on a customer screenshot rather than the company's own logs.
How MakerChecker changes the outcome
MakerChecker governs the actions an agent is allowed to take, not the words it generates. The relevant move is to separate answering from committing and to put a gate on the commitment.
Model the conversational responses as a low-risk, granted capability. The agent can answer questions freely. Model any statement that creates a financial obligation as a distinct high-risk skill, refund.commit, that is not granted for autonomous use. A promise of a retroactive bereavement refund is a refund.commit call, and above a defined threshold it routes to an approval gate rather than executing.
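The split between answering and committing can be sketched in a few lines of Python. This is an illustration of the idea, not MakerChecker's actual API or configuration format: the chat.answer skill name, the SkillPolicy type, the route function, and the threshold of 50 are all assumptions made for the example. Only refund.commit comes from the text above. The sketch also assumes that amounts at or below the threshold execute and that unknown skills fail closed.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum


class Decision(Enum):
    EXECUTE = "execute"          # the agent may act on its own
    HOLD_FOR_APPROVAL = "hold"   # routed to the approval gate


@dataclass(frozen=True)
class SkillPolicy:
    risk: str                                   # "low" or "high"
    approval_threshold: Decimal = Decimal("0")  # high-risk amounts above this are held


# Hypothetical policy table. The threshold value is illustrative only.
POLICIES = {
    "chat.answer": SkillPolicy(risk="low"),
    "refund.commit": SkillPolicy(risk="high", approval_threshold=Decimal("50")),
}


def route(skill: str, amount: Decimal = Decimal("0")) -> Decision:
    """Decide whether a proposed action executes or is held for sign-off."""
    policy = POLICIES.get(skill)
    if policy is None:
        # Assumption: a skill with no policy is held rather than executed.
        return Decision.HOLD_FOR_APPROVAL
    if policy.risk == "low":
        return Decision.EXECUTE
    if amount > policy.approval_threshold:
        return Decision.HOLD_FOR_APPROVAL
    return Decision.EXECUTE


if __name__ == "__main__":
    print(route("chat.answer"))                     # Decision.EXECUTE
    print(route("refund.commit", Decimal("500")))   # Decision.HOLD_FOR_APPROVAL
```

The point of the design is that the same agent reaches both outcomes through one routing function, so a financial commitment cannot travel down the path reserved for routine answers.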
Concretely, when the agent proposes a refund commitment of about 650 Canadian dollars, the airline is not yet bound. The proposal is logged as a pending promise and held at a non-bypassable gate that requires named human sign-off before it becomes a commitment the company owes. Segregation of duties applies: the agent that proposes the refund cannot approve it, because forbid_requester excludes the requester from the approving set. A person reviews the proposed terms and either signs or declines. Under that arrangement, a fabricated policy never reaches the customer as a binding entitlement, because no commitment exists until a human signs it.
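A minimal sketch of that gate follows, assuming a simple in-memory model. The forbid_requester name comes from the text above. The ApprovalGate and PendingCommitment classes, the method names, and the identities "agent:chatbot" and "alice" are hypothetical and stand in for whatever the real configuration defines.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PendingCommitment:
    skill: str
    requester: str                    # identity of the agent that proposed it
    terms: str
    status: str = "pending"           # pending -> approved | declined
    decided_by: Optional[str] = None  # named human who signed or declined


class ApprovalGate:
    def __init__(self, approvers, forbid_requester=True):
        self.approvers = set(approvers)
        self.forbid_requester = forbid_requester

    def eligible_approvers(self, proposal):
        """Return who may decide this proposal, minus the requester if forbidden."""
        eligible = set(self.approvers)
        if self.forbid_requester:
            eligible.discard(proposal.requester)
        return eligible

    def decide(self, proposal, approver, approve):
        """Record a sign-off or refusal; nothing is owed until this succeeds."""
        if proposal.status != "pending":
            raise ValueError("proposal has already been decided")
        if approver not in self.eligible_approvers(proposal):
            raise PermissionError(f"{approver} may not decide this proposal")
        proposal.status = "approved" if approve else "declined"
        proposal.decided_by = approver
        return proposal


if __name__ == "__main__":
    gate = ApprovalGate(approvers={"agent:chatbot", "alice"})
    proposal = PendingCommitment(
        skill="refund.commit",
        requester="agent:chatbot",
        terms="retroactive bereavement refund",
    )
    try:
        gate.decide(proposal, "agent:chatbot", approve=True)
    except PermissionError as err:
        print("blocked:", err)
    gate.decide(proposal, "alice", approve=True)
    print(proposal.status, proposal.decided_by)  # approved alice
```

Even when the agent's identity appears in the approver list, the exclusion removes it for its own proposals, so the only route from pending to approved runs through a different, named party.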
Every step is written to a tamper-evident audit log that is hash-chained, signed with Ed25519, and verifiable offline. The record shows what the agent proposed, that it was held pending review, and who signed or refused. If a dispute reaches a tribunal, the company holds its own signed account of what was actually committed and by whom, rather than relying on the customer's screenshot.
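The hash-chaining half of that record can be illustrated with the Python standard library alone. This sketch is not MakerChecker's log format: the entry layout, the function names, and the sample events are assumptions. It also leaves out the Ed25519 signatures, because Python's standard library has no Ed25519 implementation. A real log would additionally sign each entry hash so that a verifier holding only the public key could confirm who wrote it.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, event):
    """Hash an event together with the hash of the entry before it."""
    payload = json.dumps(
        {"prev": prev_hash, "event": event},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log, event):
    """Append an event, linking it to the current tail of the chain."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"prev": prev_hash, "event": event, "hash": entry_hash(prev_hash, event)}
    log.append(entry)
    return entry


def verify(log):
    """Recompute every link offline; any edited or reordered entry fails."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    log = []
    append(log, {"step": "proposed", "skill": "refund.commit", "by": "agent:chatbot"})
    append(log, {"step": "held", "reason": "awaiting human sign-off"})
    append(log, {"step": "declined", "by": "alice"})
    print(verify(log))                  # True
    log[0]["event"]["by"] = "someone else"
    print(verify(log))                  # False
```

Because each hash covers the previous one, rewriting an early entry invalidates every entry after it, which is what makes the record tamper-evident rather than merely append-only.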
What MakerChecker would not fix
MakerChecker would not have stopped the chatbot from inventing the policy. It does not make the model more accurate, does not detect hallucination, and does not judge whether an answer is true. If the agent states a wrong bereavement policy in conversation, that wrong statement still appears in the chat.
This matters for an honest reading of the Moffatt facts. Part of the harm here was content: the customer received incorrect information and relied on it before any commitment was processed. MakerChecker addresses the commitment, not the content. It prevents an unreviewed promise from binding the company, and it records what was promised, so the fabricated answer cannot quietly become an enforceable obligation.
The line is the useful part. The bot can be wrong in conversation and the firm can still correct it. The bot binding the firm to a refund it never offered is the event that needs a gate, and that is the event a gate stops.
See the configuration: examples/rogue-ai/air-canada-chatbot-bereavement-refund-binding