In Mata v. Avianca (2023), ChatGPT-generated fake case citations were submitted in a federal court filing, resulting in Rule 11 sanctions against the lawyers who filed them without independent verification.
In 2023, lawyers representing a plaintiff in a personal injury suit against the airline Avianca filed a brief in the Southern District of New York that cited six judicial decisions. None of the six existed. They had been produced by ChatGPT, which the drafting attorney had used for legal research and treated as a search engine (Wikipedia).
When the court and opposing counsel could not locate the cited authorities, the attorneys were asked to produce them. Rather than withdraw the brief, they submitted excerpts of the supposed opinions. Those excerpts were also fabricated (Seyfarth).
Judge P. Kevin Castel found that the attorneys had acted in bad faith and imposed a $5,000 sanction under Rule 11, jointly and severally, on the two lawyers and their firm (Seyfarth, Mata v. Avianca opinion). The episode became the reference point for what happens when an unverified model output is filed as fact in a regulated proceeding.
What actually failed: the governance gap
The hallucination was the trigger, not the failure. Language models can produce plausible text that is wrong. That is a known property of the tool. The governance failure is that fabricated output travelled from a research tool to a signed federal filing without anyone with authority and accountability verifying it first.
The drafter who generated the citations was, in practice, the only path by which they reached the court. A second attorney signed and filed the brief as counsel of record, but nothing in the workflow required him to verify what he was signing, so there was no effective separation between producing the content and authorising its submission. A consequential, effectively irreversible action, filing on the docket, depended on the person who created the material choosing to check it. When that check did not happen, nothing in the workflow stopped the filing.
There was also no durable record of who approved what. When the citations were challenged, the dispute turned on accounts of who used which tool, who reviewed the draft, and what each person understood. A workflow that records the approving party and the exact version submitted removes that ambiguity. The Avianca workflow produced no such record.
How MakerChecker changes the outcome
MakerChecker governs the action, not the model. It cannot tell whether a citation is real. It can ensure that the act of filing is a separately authorised step performed by a different, named, accountable party.
Model the filing as a role-scoped skill, court.file, granted deny-by-default
and pinned to a version. The drafting agent or associate carries grants for
research and drafting skills. It does not carry a grant that lets it file. Any
attempt to submit a brief is therefore refused at the boundary unless it is
routed through the approved path.
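A minimal Python sketch of what a deny-by-default, version-pinned grant check could look like. This is not MakerChecker's actual configuration or API. The Grant class, the is_allowed function, the role names, every skill name other than court.file, and the version strings are hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """One explicit permission: a role may invoke one skill at one pinned version."""
    role: str
    skill: str
    version: str


# Hypothetical grant table. There is deliberately no ("drafter", "court.file") entry.
GRANTS = {
    Grant("drafter", "research.search", "1.0.0"),
    Grant("drafter", "brief.draft", "1.0.0"),
    Grant("filing_service", "court.file", "2.1.0"),
}


def is_allowed(role: str, skill: str, version: str) -> bool:
    """Deny by default: allow only when an exact matching grant exists."""
    return Grant(role, skill, version) in GRANTS


print(is_allowed("drafter", "brief.draft", "1.0.0"))        # True
print(is_allowed("drafter", "court.file", "2.1.0"))         # False: no grant
print(is_allowed("filing_service", "court.file", "2.0.0"))  # False: wrong version
```

Because the check is a lookup for an exact match, anything not listed is refused, including a skill version that was never reviewed.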
That path runs through a non-bypassable approval gate. A court.file request is
held until a named supervising attorney signs off before anything reaches the
docket. Segregation of duties (forbid_requester) excludes the drafter from
approving their own submission. The party who produced the brief cannot be the
party who authorises it to leave the firm. A second named human has to put their
signature on the record first.
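The gate described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not MakerChecker's implementation: FilingRequest, approve, release, ApprovalError, and the identities associate_a and partner_b are hypothetical. Only the names court.file and forbid_requester come from the text.

```python
from dataclasses import dataclass
from typing import Optional, Set


class ApprovalError(Exception):
    """Raised when an approval or release is not permitted."""


@dataclass
class FilingRequest:
    requester: str                  # who asked for the filing (the drafter)
    skill: str                      # e.g. "court.file"
    document_sha256: str            # digest of the exact version to be filed
    approver: Optional[str] = None  # set only by a successful approve()


def approve(request: FilingRequest, approver: str, supervisors: Set[str],
            forbid_requester: bool = True) -> None:
    """Record an approval, enforcing segregation of duties."""
    if approver not in supervisors:
        raise ApprovalError(f"{approver} is not a supervising attorney")
    if forbid_requester and approver == request.requester:
        raise ApprovalError("requester may not approve their own request")
    request.approver = approver


def release(request: FilingRequest) -> str:
    """Release the action only if a named approver is on record."""
    if request.approver is None:
        raise ApprovalError("request is held: no approval on record")
    return f"released {request.skill} approved by {request.approver}"
```

In this sketch, a request from associate_a cannot be released until someone else in the supervisors set approves it, even if associate_a is also listed as a supervisor. The check compares identities rather than roles, which is the point of forbid_requester.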
The code scenario for this entry is that split. The drafter's court.file request
for the brief is held, forbid_requester bars the drafter from approving it, and
it is released only after a supervising-attorney gate approves a version-pinned
action. Every step (the request, the approval, the identity of the signer, and
the exact version filed) is written to a tamper-evident, Ed25519-signed,
hash-chained audit log that can be verified offline. If a court later asks who
authorised the filing and what was in it, the answer is a signed record rather
than competing recollections.
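To show why a hash-chained log is tamper-evident, here is a minimal Python sketch using only the standard library. It covers the chaining part only. Ed25519 signing is omitted because it needs a third-party library; a real system would additionally sign each entry hash. The event fields, the identities, and the function names are hypothetical and do not reflect MakerChecker's actual audit format.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical event encoding."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: list, event: dict) -> None:
    """Append an event, linking it to the hash of the entry before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev, "hash": entry_hash(prev, event)})


def verify(log: list) -> bool:
    """Recompute every link offline; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"step": "request", "skill": "court.file", "requester": "associate_a"})
append(log, {"step": "approval", "approver": "partner_b", "document_sha256": "abc123"})
append(log, {"step": "filed", "document_sha256": "abc123"})
print(verify(log))  # True

log[1]["event"]["approver"] = "associate_a"  # rewrite history after the fact
print(verify(log))  # False
```

Each entry commits to everything before it, so changing the recorded approver invalidates that entry's hash and every later link. Signing the hashes is what would tie the record to a specific key holder, which this sketch does not attempt.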
This is the control shape the Avianca workflow lacked. It does not make the brief correct. It makes a specific human accountable for releasing it, and it leaves evidence of that decision.
What MakerChecker would not fix
MakerChecker would not have stopped the hallucination. ChatGPT would still have invented the six cases. The product does not read the brief, does not check whether a cited authority exists, and does not judge the quality of legal research. Those remain the responsibility of the lawyers.
It would also not save a workflow where the approver rubber-stamps. If the supervising attorney signs the gate without reading the brief, a fabricated filing still goes out. The gate guarantees that someone with authority is named and on the record. It does not guarantee that they did their job. The control relocates accountability and creates evidence. It does not perform the review.
The honest summary is that gating makes a human accountable and produces a durable record, which is exactly what was missing when the dispute came down to who knew what. It is not a substitute for actually checking the citations. In a regulated filing, that check has to be done by a person, and MakerChecker is the mechanism that puts a named person on the record as answerable for it.
See the configuration: examples/rogue-ai/mata-v-avianca-fabricated-citations-filed