A device manufacturer receives a complaint: a patient on an infusion pump reported a burn near the cannula site. Somebody has to read that, decide whether it could plausibly be a serious injury, and then decide the question the FDA actually cares about: is this reportable? Under the agency's Medical Device Reporting rule, that second decision runs against a clock that starts when the manufacturer becomes aware of the event, generally 30 calendar days and as few as 5 work days in certain cases. Get it wrong in the direction of silence and you have a regulatory finding waiting to be written up.

This is exactly the kind of high-volume, language-heavy reading that an AI agent is good at. It is also exactly the kind of decision that, done wrong, ends with a warning letter. The instinct to automate the whole thing is understandable. The mistake is automating the decision instead of the triage.

Two decisions hiding in one inbox

Complaint handling looks like one task. It is two, and they have very different risk profiles.

The first is triage: read the complaint, normalize it, extract the device, the event, the patient outcome, the dates. Flag whether it looks like it could involve death, serious injury, or a malfunction that could cause one. This is reading and pattern-matching. An agent can do it fast, consistently, and around the clock, and it never gets bored on the four-hundredth complaint of the week.
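As a rough illustration of what "normalize and extract" could produce, here is a minimal sketch of a draft case record. The class name, field names, and example values are all hypothetical, chosen only to mirror the items named above; this is not MakerChecker's actual schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DraftCase:
    """Structured output of triage. Every field name here is illustrative."""
    device: str
    event_description: str
    patient_outcome: str
    received_date: date
    event_date: Optional[date] = None
    # Triage flags: "could this involve...", never "this is reportable".
    possible_death: bool = False
    possible_serious_injury: bool = False
    possible_malfunction: bool = False

    @property
    def needs_human_review(self) -> bool:
        # Any single flag is enough to route the case to a person.
        return (
            self.possible_death
            or self.possible_serious_injury
            or self.possible_malfunction
        )


case = DraftCase(
    device="infusion pump",
    event_description="burn near the cannula site",
    patient_outcome="burn reported by patient",
    received_date=date(2024, 1, 15),  # made-up date for the example
    possible_serious_injury=True,
)
```

Note that the record carries flags, not a verdict. Nothing in it says "reportable."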

The second is reportability: the formal determination, under 21 CFR Part 803, of whether this event must be reported to the FDA and on what timeline. That is a regulated judgment with legal consequences. It is the decision a quality or regulatory-affairs professional is trained, named, and accountable for. It is not a decision you want a model making silently because it scored the case at 0.71.

MakerChecker is built around keeping those two apart. The agent handles the reading. The human keeps the decision. The system proves the line was never crossed.

What the agent is allowed to do

The temptation with a capable agent is to let scope creep. It read the complaint well, so let it also classify reportability; it classified well, so let it draft the report; it drafted well, so let it file. Each step feels like a small, reasonable extension. Collectively they hand a regulated decision to a system no inspector can hold accountable.

A control plane removes that drift by making capability explicit and deny-by-default. The agent acts as a named identity holding one role, and that role is granted exactly the skills it needs and nothing else, a pattern we describe in the six primitives. For a complaint-triage agent, the grants might be:

read the incoming complaint queue
extract and structure event details into a draft case
flag a potential serious injury or reportable malfunction for human review
propose a Part 803 classification, marked clearly as a proposal

Notice what is not on that list: deciding reportability, filing with the FDA, or closing a case as non-reportable. Those are not configured out by a polite instruction in the prompt. They are simply not granted. The agent cannot open a door it was never given a key to, and every grant is versioned, so you can reconstruct exactly what the triage agent was permitted to do on the date any given complaint came in.
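A minimal sketch of what deny-by-default looks like in code, assuming a role is nothing more than a versioned set of granted skills. The skill identifiers, the version number, and the `Role`, `is_allowed`, and `invoke` names are hypothetical and are not MakerChecker's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """A versioned set of granted skills. Anything not listed is denied."""
    name: str
    version: int
    skills: frozenset


# Hypothetical skill identifiers mirroring the four grants listed above.
TRIAGE_ROLE = Role(
    name="complaint-triage",
    version=3,
    skills=frozenset({
        "complaints.read_queue",
        "cases.draft",
        "cases.flag_for_review",
        "classification.propose",
    }),
)


def is_allowed(role: Role, skill: str) -> bool:
    # Deny by default: only an explicit grant returns True.
    return skill in role.skills


def invoke(role: Role, skill: str) -> str:
    # The check runs before any work happens, so an ungranted
    # skill fails here rather than being filtered afterwards.
    if not is_allowed(role, skill):
        raise PermissionError(
            f"{role.name} v{role.version} has no grant for {skill}"
        )
    return f"{skill} executed as {role.name} v{role.version}"
```

There is no deny list to maintain in this sketch. A skill such as `reportability.decide` fails for the simple reason that nobody ever added it to the set.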

The named human gate

When the agent flags a complaint as potentially reportable, the run does not proceed. It stops at an approval gate, a workflow step that parks the case and demands a named human signature before anything irreversible happens. The reportability call belongs to a person, by design.

A gate is more than a pause. It carries the controls that make the eventual record defensible:

It identifies the signer, a real, named individual, not "the system."
It bars the requester from signing their own request, so the agent that proposed a classification can never be the actor that ratifies it. This is structural segregation of duties, enforced inside the run rather than promised in a policy document.
It can require a quorum of named approvers for the cases that warrant it.
It captures the signer's reason verbatim, so the determination carries its meaning, the why, not just the click.

That last point matters more than it sounds. A reportability decision is a judgment call, and when the FDA reviews your complaint files, the question is rarely "did a human approve this." It is "on what basis." A captured rationale, bound to the named signer and the exact record, is the difference between a defensible determination and a checkbox.
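The gate controls above can be sketched in a few lines. This is an illustration under stated assumptions, not the product's implementation: `ApprovalGate`, `Approval`, and the identity strings are hypothetical names, and a real gate would also authenticate the signer rather than trust a string.

```python
from dataclasses import dataclass, field


@dataclass
class Approval:
    signer: str   # a named individual
    reason: str   # the rationale, stored verbatim


@dataclass
class ApprovalGate:
    requester: str    # identity that proposed the classification
    quorum: int = 1   # number of distinct named approvers required
    approvals: list = field(default_factory=list)

    def sign(self, signer: str, reason: str) -> None:
        # Segregation of duties: the proposer can never ratify.
        if signer == self.requester:
            raise PermissionError("requester cannot sign their own request")
        # A signature without a rationale is just a click.
        if not reason.strip():
            raise ValueError("a rationale is required")
        # Each approver counts once toward the quorum.
        if any(a.signer == signer for a in self.approvals):
            raise ValueError(f"{signer} has already signed")
        self.approvals.append(Approval(signer, reason))

    @property
    def released(self) -> bool:
        # The run stays parked until enough named people have signed.
        return len(self.approvals) >= self.quorum
```

The self-approval check sits inside `sign` itself, so it cannot be skipped by a caller that forgets to ask first.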

Why the audit trail is the point

The reason this architecture exists is not elegance. It is evidence.

When an inspector pulls your medical device reporting files, they are testing two things: that reportable events were reported on time, and that your complaint records are intact and trustworthy. The first is process. The second is the part agents quietly undermine, because an automated pipeline that can write to its own records is a pipeline whose records are hard to trust.

MakerChecker writes every action, every model call, and every human signature to an append-only, hash-chained ledger, each entry cryptographically signed. Change one record after the fact and the chain visibly breaks. The export is verifiable offline, against a published open specification, by someone with no access to your systems and no reason to trust the vendor.
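To see why changing one record breaks the chain, here is a minimal hash-chain sketch using only Python's standard library. It assumes SHA-256 over a canonical JSON encoding and leaves out the per-entry signatures entirely; the entry layout and function names are hypothetical and do not reproduce the published specification.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    # Hash the payload together with the previous entry's hash,
    # so each entry commits to everything that came before it.
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(ledger: list, payload: dict) -> None:
    prev = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append(
        {"prev": prev, "payload": payload, "hash": entry_hash(prev, payload)}
    )


def verify(ledger: list) -> bool:
    # Recompute every hash from the start. Needs only the export itself.
    prev = GENESIS
    for entry in ledger:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Because `verify` needs nothing but the exported entries, a third party can run it offline. In this simplified version, editing any stored payload makes the recomputed hash disagree with the recorded one.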

That maps directly onto the regulations you already live under. Tamper-evident audit trails are 21 CFR 11.10(e). Signatures that carry their meaning are 21 CFR 11.50, and the binding of a signature to the specific record it approves is 21 CFR 11.70. The reportability gate is the human decision Part 803 assumes a person makes. MakerChecker does not invent any of this; it implements, for an AI agent, the controls regulators have demanded of people for decades.

What this is not

This is not a content-safety filter. Guardrail products that ask "is this output dangerous or off-policy?" are useful and complementary. MakerChecker answers a different question: is this actor authorized to take this action, and can you prove it? A complaint-triage agent can produce perfectly safe, well-formed text and still have no business deciding reportability. The guardrail checks the words; the control plane checks the authority.

It is also not a claim that the software is "compliant." No tool makes you compliant. MakerChecker is designed against the rules your auditors already enforce, and it gives you the structural controls and the evidence to stand behind your own determinations. The judgment stays yours. The proof that you made it properly is what the system supplies.

The same shape applies wherever a model reads regulated case data and a human must own the call: adverse-event intake in pharmacovigilance, batch release, cohort identification in a clinical trial. The agent reads. The named human decides. The record proves it.

See how it works, or book a demo to watch an agent get blocked from approving its own work, live.