
Trade surveillance with AI agents

AI agents can clear the market-abuse alert queue at speed. The call to escalate toward a regulatory filing stays with a named supervisor, and provably so.

A surveillance desk lives inside the same arithmetic as every alert-driven function in a bank. A monitoring system fires on a pattern — a spoofed order book, a spike of trading ahead of an announcement, a cluster of wash trades between related accounts — and most of the alerts are noise. An analyst still opens each one, pulls the order history and the comms around it, and decides whether it closes or climbs the chain.

An AI agent is well-suited to the front of that process. It can gather the order trail, line up the chat and email around the trade, summarize the pattern, and rank what looks dangerous. That is the appeal, and it is real. The risk is the quiet next step — letting the agent decide an alert isn't market abuse, close it, and move on. That is not triage. That is a regulated disposition being made by an actor no one authorized to make it.

Where the agent helps, and where it must stop

Market-abuse surveillance has a clean line through it, and it is worth drawing before any agent touches the queue.

| Stage | Who acts | Why |
|---|---|---|
| Pull order history, comms, market context | Agent | Mechanical, high-volume, reviewable |
| Summarize the pattern, score the alert | Agent | Fast, draftable, never final |
| Close obvious noise under policy | Agent, under policy | Reversible, sampled, logged |
| Escalate toward a case or a filing | Named supervisor | Mandated human judgment |

The first three stages are where the speed lives. An agent reads what would have taken an analyst half an hour — the depth-of-book reconstruction, the timeline of who messaged whom before the print — and produces a dossier in seconds. It can dismiss the noise that policy already defines as clearable and rank the rest so the dangerous cases surface first.

The fourth stage is different in kind. Escalating an alert into a formal case, and the eventual decision to notify a regulator about suspected market abuse, is a judgment call with legal weight. So is the inverse — deciding a flagged pattern is not worth pursuing and leaving no record of concern. Both belong to a named supervisor accountable for the call, not to a model that summarized the file.
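The split between the stages can be sketched as a routing rule. This is a minimal illustration, not a real product API: the `Alert` fields, the clearable alert types, and the score ceiling are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical policy values: which alert types policy defines as clearable
# noise, and the score below which the agent may close them.
CLEARABLE_TYPES = frozenset({"duplicate_alert", "test_account"})
NOISE_SCORE_CEILING = 0.2


@dataclass(frozen=True)
class Alert:
    alert_id: str
    alert_type: str
    risk_score: float  # agent-produced score in [0, 1]; a draft, never final


def route(alert: Alert) -> str:
    """Return who handles the disposition of this alert."""
    # The agent may close only what policy already defines as noise.
    if alert.alert_type in CLEARABLE_TYPES and alert.risk_score < NOISE_SCORE_CEILING:
        return "agent_close_under_policy"
    # Everything else, including a "not worth pursuing" call, goes to a human.
    return "supervisor_review"
```

Note that the default branch is the supervisor. The agent has to qualify for the narrow path; it never has to qualify for escalation.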

The supervisor is not a formality

US securities regulators — the SEC and FINRA — build their expectations around named supervision. Someone qualified is responsible for the surveillance program, for the dispositions it produces, and for the decision to act when a pattern looks like manipulation or insider dealing. The market-abuse obligation is a human one, exercised against a documented procedure and a clear line of review.

That structure predates AI, and that is the point. When the analyst worked the queue and a supervisor reviewed the escalations, the separation was enforced by org charts, sign-off forms, and reporting lines. When the analyst becomes an agent, the org chart no longer helps. An agent does not sit in a reporting line. It cannot be questioned, disciplined, or held personally responsible for a missed case. The control has to move somewhere the agent cannot edit — and stay provable.

Why the prompt won't hold this line

The obvious fix is to instruct the agent: triage everything, but never close a high-risk alert and never escalate to a regulator without a human. This reads like a control. It is a request.

A prompt instruction has no record of who set the boundary, no version history when it changes, and no proof — months later, in an examination or a discovery request — that the agent actually stopped where it was told. An agent that is re-prompted, upgraded, or jailbroken can quietly start closing the cases it was meant to escalate, and nothing in the file would show the boundary ever existed. "It was in the system prompt" is not evidence anyone accepts.

A control plane moves the boundary out of the agent and into enforced, versioned policy. The agent's authority to act on an alert is a grant attached to its role, and that grant simply does not include "close a high-risk alert" or "open a regulatory case." Permissions are deny-by-default and versioned, so you can reconstruct exactly what the agent was allowed to do on any past date, and who approved each grant. The agent can propose the escalation. It structurally cannot complete it.
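As a rough sketch of what deny-by-default, versioned grants could look like, assume a policy history held as an ordered list of versions. The role names, action names, dates, and approver names below are invented for illustration and do not describe any particular product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PolicyVersion:
    effective_from: date
    approved_by: str      # who signed off on this version of the grants
    grants: frozenset     # set of (role, action) pairs; anything absent is denied


# Hypothetical history: each change is a new version, never an in-place edit.
POLICY_HISTORY = [
    PolicyVersion(date(2024, 1, 1), "supervisor.a", frozenset({
        ("surveillance_agent", "gather_evidence"),
        ("surveillance_agent", "score_alert"),
    })),
    PolicyVersion(date(2024, 6, 1), "supervisor.b", frozenset({
        ("surveillance_agent", "gather_evidence"),
        ("surveillance_agent", "score_alert"),
        ("surveillance_agent", "close_low_risk_alert"),
        ("surveillance_agent", "propose_escalation"),
    })),
]


def policy_on(day: date):
    """Return the policy version in force on `day`, or None if none existed."""
    in_force = [p for p in POLICY_HISTORY if p.effective_from <= day]
    return max(in_force, key=lambda p: p.effective_from) if in_force else None


def is_allowed(role: str, action: str, day: date) -> bool:
    """Deny by default: allowed only if an explicit grant was in force that day."""
    policy = policy_on(day)
    return policy is not None and (role, action) in policy.grants
```

Because `close_high_risk_alert` and `open_regulatory_case` appear in no version, `is_allowed` returns `False` for them on every date, and `policy_on(day).approved_by` names who approved the grants that were in force.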

The analyst cannot also be the supervisor

Here is the rule that does the real work, and it is the one prompts can never deliver. When the surveillance agent escalates an alert, it becomes the requester. The control plane parks the run at an approval gate and demands a signature from a named supervisor — and the supervisor who signs cannot be the agent that raised it. The same actor provably cannot be both maker and checker on one alert.

This is the four-eyes principle, the maker-checker split, applied to the machine. It matters more with agents than it ever did with people, because an agent is fast enough to escalate and self-approve thousands of times before anyone notices the pattern. Structural segregation removes the possibility rather than auditing for it after the fact. The agent that built the case is barred from clearing it, full stop, and every attempt — including the refusals — lands in the record.
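The maker-checker rule can be illustrated with a small approval gate. This is a sketch under assumptions: `ApprovalGate`, `SegregationError`, and the actor identifiers are hypothetical names, and a real control plane would authenticate the actor rather than trust a string.

```python
from dataclasses import dataclass, field
from typing import Optional


class SegregationError(Exception):
    """Raised when an actor is not permitted to approve this escalation."""


@dataclass
class ApprovalGate:
    alert_id: str
    requester: str                      # the agent that raised the escalation
    supervisors: frozenset              # named humans allowed to sign
    log: list = field(default_factory=list)
    approved_by: Optional[str] = None

    def approve(self, actor: str, reason: str) -> None:
        # The requester check comes first, so it holds even if the requester
        # were mistakenly added to the supervisor list.
        if actor == self.requester:
            self.log.append(("refused", actor, "requester cannot approve own escalation"))
            raise SegregationError("maker and checker must differ")
        if actor not in self.supervisors:
            self.log.append(("refused", actor, "not a named supervisor"))
            raise SegregationError("actor is not a named supervisor")
        self.approved_by = actor
        self.log.append(("approved", actor, reason))
```

The refusal is written to the log before the exception is raised, so a blocked self-approval attempt leaves a record rather than disappearing.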

The gate can also demand more than one signer. For the alerts that could become a regulatory filing, you can require a quorum of named approvers — and the requester is barred from being one of them. What each supervisor signs is captured with its meaning intact: the decision, the signer's identity, and the reason in their own words, linked to the exact version of the dossier they reviewed. That is a disposition becoming defensible evidence rather than a checkbox.
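A quorum check with signatures bound to the reviewed dossier version might look like the following. It is a simplified sketch: the `Signature` fields, the default quorum of two, and the use of a SHA-256 digest to identify the dossier version are assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    signer: str
    decision: str         # e.g. "escalate" or "close"
    reason: str           # the supervisor's reason, in their own words
    dossier_sha256: str   # digest of the exact dossier version they reviewed


def dossier_digest(dossier_text: str) -> str:
    """Identify a dossier version by the SHA-256 digest of its content."""
    return hashlib.sha256(dossier_text.encode("utf-8")).hexdigest()


def quorum_met(signatures, requester, named_approvers, current_digest, required=2):
    """True when enough distinct named approvers signed the current dossier."""
    valid_signers = {
        s.signer
        for s in signatures
        if s.signer != requester                 # requester can never count
        and s.signer in named_approvers          # must be a named approver
        and s.dossier_sha256 == current_digest   # must have seen this version
        and s.decision == "escalate"
        and s.reason.strip()                     # a reason is mandatory
    }
    return len(valid_signers) >= required
```

Collecting signers into a set means one supervisor signing twice still counts once, and a signature made against an earlier dossier version does not carry over when the dossier changes.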

What the examiner gets

A surveillance program gets read line by line after something goes wrong: a manipulation case that should have been caught, a self-dealing pattern that was closed as noise. The supervisor who owns the program needs to meet one demand without flinching: prove that the agent could not close a reportable alert on its own, and prove that the record of who decided what is intact.

A control plane answers it directly. Every action the agent took, every alert it cleared under policy, every escalation it raised, and every human signature lands in an append-only, hash-chained, cryptographically signed ledger. Change one record and the chain visibly breaks. The output is an evidence bundle a third party can verify offline, against a published spec, without touching your systems — which is exactly the posture you want when the third party is a regulator who distrusts the vendor by default.
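The tamper-evidence property of a hash chain can be shown in a few lines. This sketch covers only the chaining; the cryptographic signing and the published evidence-bundle spec mentioned above are out of scope, and the record fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the record together with the previous entry's hash."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(ledger: list, record: dict) -> None:
    """Add a record to the end of the ledger, linked to the entry before it."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({
        "record": record,
        "prev": prev_hash,
        "hash": entry_hash(prev_hash, record),
    })


def verify(ledger: list):
    """Return the index of the first broken entry, or None if the chain holds."""
    prev_hash = GENESIS
    for i, entry in enumerate(ledger):
        if entry["prev"] != prev_hash or entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return i
        prev_hash = entry["hash"]
    return None
```

`verify` needs nothing but the ledger itself, which is the sense in which a third party could check it offline. Editing any stored record changes its recomputed hash, so verification fails at that entry.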

This is also where the line with content guardrails matters. Tools that screen an agent's output for prompt injection or unsafe text answer "is this content dangerous?" They are useful and they are not this. A control plane answers a different question entirely: "is this actor authorized to take this action, and who signed it?" You want both. Only one of them keeps a market-abuse disposition in human hands.

Put the split in the control plane and it survives the volume: the agent clears the noise, the institution keeps the judgment, and the decision a supervisor must own stays with a supervisor — not as policy hope, but as a structural fact the audit trail can demonstrate to anyone who asks. The same pattern runs across the alert queues next door; see how surveillance sits alongside AML alert triage and the wider bank middle office.

