INTENT INTEGRITY · RED FLAG

The action doesn’t match what was asked

INTENT_MISMATCH

The action diverges from the stated user intent — a sign the agent drifted off-task or was steered elsewhere.

Why it matters

An intent mismatch is cheap to prevent and expensive to undo: rollback, insurance, and observability all kick in after the damage is done. The only place to stop it is a check that runs before the action does.

Example

Asked to “remove one test row,” the agent deletes the whole table.

How Black_Wall catches it

Black_Wall raises INTENT_MISMATCH and gates the action.
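
To make the idea concrete, here is a minimal sketch of a scope check for the example above. It is not Black_Wall's actual detection logic, which this page does not describe. The function `check_intent`, the `IntentCheck` type, and the keyword heuristic are all hypothetical and chosen only for illustration. The sketch flags a mismatch when the stated intent names a single row but the SQL statement would affect the whole table.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class IntentCheck:
    flag: Optional[str]  # "INTENT_MISMATCH" or None (hypothetical shape)
    reason: str


def check_intent(intent: str, sql: str) -> IntentCheck:
    """Toy heuristic: compare the scope of the request with the scope of the SQL."""
    statement = sql.strip().rstrip(";").lower()

    # Does the request talk about exactly one row?
    wants_single_row = bool(
        re.search(r"\b(one|a single|1)\b.*\brow\b", intent.lower())
    )

    # Does the statement touch the whole table?
    drops_table = statement.startswith(("drop table", "truncate"))
    unbounded_delete = (
        statement.startswith("delete from") and " where " not in f" {statement} "
    )

    if wants_single_row and (drops_table or unbounded_delete):
        return IntentCheck(
            "INTENT_MISMATCH",
            "intent names one row; action affects the whole table",
        )
    return IntentCheck(None, "action scope is consistent with the stated intent")


result = check_intent("remove one test row", "DELETE FROM test_rows;")
print(result.flag, "-", result.reason)
```

A production check would need far more than keyword matching. The point of the sketch is only that the comparison happens on the proposed action, before anything is executed.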

FLAG: INTENT_MISMATCH

Black_Wall returns a risk score (0–100), a reversibility class, this named red flag, and a gate — proceed / confirm / human-required — in a few seconds, before the action runs.
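
The four parts of that result can be modelled as a small record plus a wrapper that refuses to run the action unless the gate allows it. This is a sketch under assumptions: the names `Verdict`, `Gate`, `run_gated`, and `ActionBlocked` are hypothetical, the field names are guesses, and the reversibility value `"irreversible"` is an illustrative placeholder, since this page does not list the actual classes. Only the three gate values and the 0–100 score range come from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


class Gate(Enum):
    PROCEED = "proceed"
    CONFIRM = "confirm"
    HUMAN_REQUIRED = "human-required"


@dataclass(frozen=True)
class Verdict:
    risk_score: int          # 0-100
    reversibility: str       # class names are assumed, not documented here
    flags: Tuple[str, ...]   # e.g. ("INTENT_MISMATCH",)
    gate: Gate

    def __post_init__(self) -> None:
        if not 0 <= self.risk_score <= 100:
            raise ValueError("risk_score must be in 0..100")


class ActionBlocked(Exception):
    """Raised when the gate does not permit the action to run."""


def run_gated(verdict: Verdict, action: Callable[[], T], confirmed: bool = False) -> T:
    """Run `action` only if the verdict's gate permits it."""
    if verdict.gate is Gate.PROCEED:
        return action()
    if verdict.gate is Gate.CONFIRM and confirmed:
        return action()
    # A confirm gate without confirmation, or human-required, never reaches the action.
    raise ActionBlocked(f"gate={verdict.gate.value}, flags={list(verdict.flags)}")


verdict = Verdict(92, "irreversible", ("INTENT_MISMATCH",), Gate.HUMAN_REQUIRED)
try:
    run_gated(verdict, lambda: print("deleting table"))
except ActionBlocked as blocked:
    print("blocked:", blocked)
```

Raising an exception rather than returning a status means a caller that forgets to inspect the gate still cannot run the action by accident.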

