The Cigna PxDx case is the clearest documented example of an automated insurance denial system where the human review step was structurally indistinguishable from no review at all.
In March 2023, ProPublica reported on a Cigna review system known as PxDx. The reporting described software that flagged claims for denial and routed them to company doctors who, according to the account, signed off in batches without opening the patient files. A figure from that reporting became the headline: in a two-month stretch in 2022, Cigna doctors denied more than 300,000 requests, spending an average of about 1.2 seconds on each, as CBS News summarised.
The disclosures led to a California class action. The plaintiffs allege the process let Cigna reject claims without the individual physician review that state law requires, treating the medical sign-off as a formality, according to Healthcare Dive.
In March 2025, a federal judge allowed the class claims to proceed, as reported by Courthouse News. The allegations remain unproven, and Cigna has defended the system as a tool that speeds payment rather than one that denies care. Even so, the case turns on a specific, checkable fact pattern: who reviewed each denial, and for how long.
What actually failed - the governance gap
On paper the process had a human in the loop. A physician signed each denial. The complaint's force comes from what that signature did not represent. A sign-off that takes roughly a second is not a review in any meaningful sense, and when it happens in batches it is a rubber stamp applied to a list the software produced.
The governance gap is not that a model proposed denials. It is that the human checkpoint was structurally indistinguishable from no checkpoint, and nothing in the workflow recorded that fact at the time. There was no per-decision evidence of how long a reviewer spent or whether the proposing system and the approving physician were meaningfully separate. The review time only became visible years later, through reporting and discovery, rather than being captured as the decisions were made.
A control that no one can measure is not a control. The PxDx pattern shows the difference between a checkpoint that exists in a process diagram and one that leaves an attributable record every time it is exercised.
How MakerChecker changes the outcome
MakerChecker would not approve or reject any claim. It governs which actions an agent may take and forces a recorded human decision before a consequential one runs.
Model the automated screen and the binding denial as two separate skills. The screening role gets a low-risk grant to flag a claim. The act of denying coverage is a separate, higher-risk skill that the screening role does not hold.
role: claims-screener
grants:
  - skill: coverage.flag@1      # low risk, propose only
forbidden:
  - skill: coverage.deny        # not granted to the proposer

role: medical-reviewer
grants:
  - skill: coverage.deny@1
    risk: high
    gate:
      approvers: 1              # named physician sign-off
      forbid_requester: true    # SoD: screener cannot self-approve
Two things follow. First, segregation of duties: the system that proposes a
denial cannot finalise it, because forbid_requester blocks the proposer from
approving its own output. Second, every coverage.deny runs through a gate that
records the named reviewer and the elapsed time between the claim being presented
and the sign-off.
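The two properties can be sketched in a few lines of Python. This is an illustration only, not MakerChecker's implementation: the Gate and Decision classes, their method names, and the injectable clock are all hypothetical, chosen to show the shape of a gate that refuses self-approval and records elapsed review time.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    """One recorded sign-off (hypothetical record shape)."""
    claim_id: str
    skill: str
    requester: str          # who proposed the action, e.g. the screening role
    approver: str           # the named reviewer who signed off
    elapsed_seconds: float  # time between presentation and sign-off


class SegregationOfDutiesError(Exception):
    """Raised when the proposer tries to approve its own proposal."""


class Gate:
    """Hypothetical gate: one named approver, and the requester may not approve."""

    def __init__(self, skill, clock=time.monotonic):
        self.skill = skill
        self._clock = clock
        self._pending = {}   # claim_id -> (requester, presented_at)
        self.decisions = []  # every approval leaves a record here

    def present(self, claim_id, requester):
        """Record the moment a proposed action is put in front of a reviewer."""
        self._pending[claim_id] = (requester, self._clock())

    def approve(self, claim_id, approver):
        """Approve a pending action, enforcing the forbid_requester rule."""
        requester, presented_at = self._pending[claim_id]
        if approver == requester:
            raise SegregationOfDutiesError(
                f"{approver} proposed {claim_id} and cannot approve it"
            )
        del self._pending[claim_id]
        decision = Decision(
            claim_id=claim_id,
            skill=self.skill,
            requester=requester,
            approver=approver,
            elapsed_seconds=self._clock() - presented_at,
        )
        self.decisions.append(decision)
        return decision
```

A one-second click-through still succeeds in this sketch. The difference is that it leaves a Decision naming the approver and the elapsed time.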
That second point is where the 1.2-second pattern stops being invisible. The gate writes each decision to a tamper-evident, Ed25519-signed, hash-chained audit log that can be verified offline. A batch of sub-second sign-offs surfaces as a run of near-zero review times attributed to a specific reviewer, queryable on day one rather than reconstructed from depositions. In the PxDx case, regulators and plaintiffs had to assemble that evidence by hand. Here it is a byproduct of the workflow, and it is the same record that would let an honest insurer detect the pattern internally before it became a lawsuit.
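To make "hash-chained" and "verified offline" concrete, here is a minimal sketch of the chaining idea using only Python's standard library. It is an assumption-laden illustration, not MakerChecker's audit format: the entry layout and the function names append_entry and verify_chain are hypothetical, and the Ed25519 signature is deliberately left out because the standard library has no Ed25519 support. In a real system a signature over each entry hash, or over the chain head, would sit on top of this.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, record):
    """Hash the previous entry's hash together with a canonical form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log, record):
    """Append a record, linking it to the hash of the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {
        "record": record,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, record),
    }
    log.append(entry)
    return entry


def verify_chain(log):
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Verification needs nothing but the log itself, which is what makes an offline check possible. Changing a recorded review time after the fact changes that entry's digest, so the recomputed hash no longer matches the stored one.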
The code_scenario in the example routes coverage.deny through this gate and
captures reviewer identity and elapsed time, so batch rubber-stamping shows up as
what it is.
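Once reviewer identity and elapsed time are recorded per decision, spotting the pattern is a simple query. The sketch below assumes the records have been exported as dictionaries with approver and elapsed_seconds keys. That shape, the function name flag_rubber_stamping, and the thresholds (5 seconds, 90 percent, 20 decisions) are all illustrative assumptions, not values from MakerChecker or from the PxDx reporting.

```python
from collections import defaultdict
from statistics import median


def flag_rubber_stamping(decisions, threshold_seconds=5.0, min_share=0.9, min_count=20):
    """Return reviewers whose sign-offs are overwhelmingly faster than the threshold.

    All three thresholds are illustrative and would need tuning per workflow.
    """
    times_by_reviewer = defaultdict(list)
    for decision in decisions:
        times_by_reviewer[decision["approver"]].append(decision["elapsed_seconds"])

    flagged = {}
    for reviewer, times in times_by_reviewer.items():
        fast = sum(1 for t in times if t < threshold_seconds)
        share = fast / len(times)
        # Require a minimum volume so a handful of quick approvals is not flagged.
        if len(times) >= min_count and share >= min_share:
            flagged[reviewer] = {
                "decisions": len(times),
                "fast_share": share,
                "median_seconds": median(times),
            }
    return flagged
```

A flag from a query like this is a prompt for investigation rather than proof of inattention, since some approvals may be legitimately quick.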
What MakerChecker would not fix
This is the honest limit. MakerChecker can require a named human to approve a denial and can record exactly how long that human took. It cannot make the human read the claim or judge it correctly. A reviewer determined to clear a queue can still click through at speed, and the gate will let the action run because a valid approver signed it.
What changes is not the reviewer's diligence. It is the evidence. A one-second approval still happens, but now it is logged, signed, and attributable, which makes a pattern of inattention provable rather than deniable. The deterrent is real, and it is bounded: strong audit mechanics do not substitute for a reviewer who actually reviews. On the core question of forcing genuine human judgment, this is a partial fix, and we will not claim more than that.
For a related case in the same domain, see nH Predict and UnitedHealth's AI Medicare denials. For the mechanics of the checkpoint itself, see human-in-the-loop approval gates.
See the configuration: examples/rogue-ai/cigna-pxdx-batch-rubber-stamp-denials