Anwer Gertani

The Cyber Desk

Policy Autonomy: The Right End State (And Why You're Framing It Wrong)

April 12, 2026 · Anwer Gertani

AI / ML · Security Strategy · Leadership

"AI as decision maker" creates maximum board resistance. "AI executing human-defined policy at machine speed" gets CISO sign-off. The difference in framing is everything. The difference in practice is almost nothing.

This is Post 6 of 7 in the series “Building Security Operations That AI Can Run.”

Framing AI as the decision maker generates a predictable set of objections. Those objections have good answers once the frame is AI executing policy that humans wrote, reviewed, and approved. Three ways to visualise this:

Option 1 — Policy hierarchy

01 · HUMAN AUTHORITY
Policy Layer
Written · Reviewed · Approved by humans
Intent · Boundaries · Escalation triggers · Review cadence

02 · AI EXECUTES
Execution Layer
AI acts within approved policy at machine speed
ML automation · LLM agents · Agentic workflows · Audit trail

03 · HUMAN DECIDES
Escalation Layer
Cases that exceed policy authority surface to humans
Novel cases · Boundary conditions · Policy gaps · LLM ambiguity

Option 2 — ML vs LLM policy execution comparison

| | Classical ML | LLM Agent |
|---|---|---|
| How it executes | Deterministic rules: if X → Y | Probabilistic reasoning from intent |
| What policy says | Explicit conditions + exceptions | Intent + reasoning constraints |
| Same input? | Same output every time | May vary by context |
| Failure mode | Wrong label (visible in output) | Confident wrong reasoning (silent) |
| Injection risk | Low | High: adversary content can manipulate reasoning |
| Oversight method | Audit rule conditions | Red-team + reasoning constraint review |

Option 3 — Policy governance cycle

POLICY GOVERNS ALL AI

Write Policy: humans define intent & boundaries
AI Executes: runs within approved policy
Review Outcomes: audit trail & performance
Red-team & Test: adversarial testing for LLM agents

Policy autonomy means that humans retain decision-making authority by writing the policy that defines which actions AI may take, under what conditions, within what boundaries, and with what escalation triggers. AI executes it consistently, at machine speed, every time the conditions are met. No human is in the decision loop for that specific action — but the decision was made when the policy was written, by the people with authority to make it.
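One way to make this concrete is to treat the policy itself as reviewable data. The sketch below is illustrative only: the `Policy` class, the `decide` function, the event fields (`detection`, `confidence`, `asset_tier`), and the 0.9 threshold are all hypothetical names and values chosen for the example, not a reference to any particular product.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Policy:
    """A human-approved rule: one action, its conditions, and its boundary."""
    action: str
    condition: Callable[[dict], bool]      # when the action applies
    within_bounds: Callable[[dict], bool]  # where the AI may act without asking
    approved_by: str                       # who held the authority to decide

def decide(policy: Policy, event: dict) -> str:
    """Apply the policy to one event; no judgement happens at this point."""
    if not policy.condition(event):
        return "no_action"
    if not policy.within_bounds(event):
        return "escalate"  # condition met, but outside approved authority
    return policy.action

# Hypothetical policy: isolate hosts on high-confidence ransomware detections,
# except on critical assets, which always go to a human.
isolate_host = Policy(
    action="isolate_host",
    condition=lambda e: e["detection"] == "ransomware" and e["confidence"] >= 0.9,
    within_bounds=lambda e: e["asset_tier"] != "critical",
    approved_by="CISO",
)
```

The decision lives in the `Policy` object that a human wrote and approved. `decide` only checks whether an event falls inside it, and escalates when the condition is met but the boundary is not.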

This model describes ML-based automation accurately. It requires significant extension to describe large language model and agentic AI deployments. When an LLM executes policy, it reasons about the policy — it interprets intent, weighs contextual factors, and may produce different outputs for inputs that are functionally equivalent but textually different. Writing policy for LLM agents requires a different discipline: you are writing the intent you want the system to reason from, not the rules you want it to follow — and the governance framework must account for the difference.

Agentic AI (an LLM with tool access) that is given the objective of investigating an alert can query threat intelligence APIs, examine endpoint telemetry, correlate with historical incidents, and propose a containment action, all without human intervention at each step. The policy for an agentic AI therefore needs to define not just which actions are permitted but also which reasoning approaches are sanctioned, which external systems the agent may query, which outputs require human review before execution, and what the agent must do when it encounters adversary-controlled content that attempts to redirect its objectives.
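Part of such a policy can be enforced outside the model, as a gate that every proposed tool call must pass. The following is a minimal sketch under stated assumptions: `AgentPolicy`, `ToolCall`, `gate`, and the tool names are hypothetical, and the `tainted` flag assumes the surrounding system can track whether the agent proposed a call after reading adversary-controllable content, which is itself a hard problem this example does not solve.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentPolicy:
    query_tools: frozenset      # read-only systems the agent may query
    action_tools: frozenset     # state-changing actions the agent may propose
    review_required: frozenset  # actions that always need human sign-off

@dataclass
class ToolCall:
    tool: str
    args: dict
    tainted: bool = False  # assumption: set when untrusted content influenced the call

def gate(policy: AgentPolicy, call: ToolCall) -> str:
    """Return 'allow', 'human_review', or 'deny' for one proposed tool call."""
    if call.tool not in policy.query_tools | policy.action_tools:
        return "deny"  # not a sanctioned system or action
    if call.tool in policy.query_tools:
        return "allow"  # read-only lookups run at machine speed
    if call.tool in policy.review_required or call.tainted:
        return "human_review"  # state change that policy routes to a person
    return "allow"

# Hypothetical policy for an alert-investigation agent.
policy = AgentPolicy(
    query_tools=frozenset({"query_threat_intel", "get_endpoint_telemetry"}),
    action_tools=frozenset({"block_ip", "isolate_host"}),
    review_required=frozenset({"isolate_host"}),
)
```

The gate does not depend on the model's reasoning being correct. Even if injected content persuades the agent to propose an action, an unsanctioned tool is denied and a tainted state change is routed to a human.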

The policy review cycle is the mechanism through which humans retain meaningful control. For LLM and agentic AI, this requires: reviewing cases where agent reasoning diverged from expected paths, running scheduled red-team exercises to test prompt injection resistance, and updating reasoning constraints when adversary techniques evolve.
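The first of those review tasks can be partly automated from the audit trail. As a rough sketch, assuming each audit record carries a hypothetical `case_id`, `alert_type`, and ordered list of `steps`, and that reviewers maintain an `EXPECTED_STEPS` map per alert type (both invented for this example), divergent cases could be surfaced like this:

```python
# Hypothetical map of the steps reviewers expect for each alert type.
EXPECTED_STEPS = {
    "phishing": {"query_threat_intel", "check_mail_logs", "propose_containment"},
}

def divergent_cases(audit_log, expected=EXPECTED_STEPS):
    """Return (case_id, unexpected_steps) for cases that left the expected path."""
    flagged = []
    for case in audit_log:
        # An alert type with no expected path is a policy gap: flag every step.
        allowed = expected.get(case["alert_type"], set())
        extra = [step for step in case["steps"] if step not in allowed]
        if extra:
            flagged.append((case["case_id"], extra))
    return flagged

audit_log = [
    {"case_id": "C-1", "alert_type": "phishing",
     "steps": ["query_threat_intel", "check_mail_logs", "propose_containment"]},
    {"case_id": "C-2", "alert_type": "phishing",
     "steps": ["query_threat_intel", "fetch_external_url", "propose_containment"]},
]
flagged = divergent_cases(audit_log)
```

A check like this only narrows down which cases a human reads first. Whether a divergence was sound reasoning or a successful manipulation is still a judgement for the reviewers.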