Security & Governance

Security and governance define what the AI system is allowed to know, say, change, store, and trigger. This layer cuts across workflow frameworks, harnesses, app frameworks, tools, data, and models.

Governance should not slow every change equally. Good governance increases ceremony according to risk tier.

What this layer owns

| Concern | Example control |
| --- | --- |
| Data boundary | classification, redaction, retention |
| Tool boundary | scoped credentials, allowlists, approval gates |
| Model boundary | provider policy, model routing, sensitive-data controls |
| Human boundary | named approvers, review evidence, escalation |
| Delivery boundary | risk-tiered AI-DLC gates, PR checks, release readiness |
| Audit boundary | traces, decision records, tool-call logs |
| Memory boundary | what is stored, where, for how long, and how deletion works |

AI agent threat model

```mermaid
flowchart TB
    A[AI system] --> B[Model boundary]
    A --> C[Tool boundary]
    A --> D[Data boundary]
    A --> E[Human approval boundary]
    B --> F[Prompt injection and unsafe output]
    C --> G[Tool abuse and privilege escalation]
    D --> H[Data leakage and retention risk]
    E --> I[Rubber-stamp approval risk]
    J[Governance controls] -. mitigate .-> F
    J -. mitigate .-> G
    J -. mitigate .-> H
    J -. mitigate .-> I
```

Control map

| Risk | Control |
| --- | --- |
| Prompt injection | instruction hierarchy, retrieval filtering, tool confirmation |
| Data exfiltration | data classification, redaction, allowlisted tools |
| Tool abuse | scoped credentials, human approval for destructive actions |
| Memory leakage | retention policy, deletion policy, workspace isolation |
| Unreviewed AI code | PR review, tests, CI, AI-DLC audit for high-risk changes |
| Rubber-stamp governance | named approvers and evidence-based approval |
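
As one illustration of the redaction control listed against data exfiltration, the sketch below replaces sensitive substrings with labeled placeholders before text reaches a prompt or a log. The two patterns and the `redact` helper are assumptions made for this example, not part of any particular library; a real deployment would more likely rely on a maintained secret scanner and a data-classification service.

```python
import re

# Hypothetical patterns for illustration only; they are far from exhaustive.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "API_KEY": re.compile(r"\b(?:sk|key|tok)_[A-Za-z0-9]{16,}\b"),
}


def redact(text: str) -> str:
    """Replace every pattern match with a labeled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label}]", text)
    return text


if __name__ == "__main__":
    print(redact("Contact alice@example.com with key sk_abcdefghijklmnop1234"))
```

Labeled placeholders keep the redacted text readable for debugging while still removing the sensitive value itself.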

Governance by risk tier

| Risk tier | Example | Required control |
| --- | --- | --- |
| Low | UI copy, docs, small refactor | normal review and tests |
| Medium | product behavior, billing UX, data processing | spec/change artifact, tests, reviewer evidence |
| High | auth, payments, customer data, infrastructure | AI-DLC-style gates, security review, audit record |
| Critical | regulated data, destructive automation, prod deploys | named approvers, rollback plan, incident readiness, tool-call audit |
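
The tier table can be encoded as a lookup that a merge or release check consults. The sketch below is a minimal version under stated assumptions: the evidence labels (`spec`, `audit_record`, and so on) are invented shorthand for the table's controls, and each tier mirrors only its own row, because the table does not say whether higher tiers also inherit lower-tier controls.

```python
from enum import Enum


class RiskTier(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


# Hypothetical evidence labels, one set per row of the risk-tier table.
REQUIRED_EVIDENCE = {
    RiskTier.LOW: {"review", "tests"},
    RiskTier.MEDIUM: {"spec", "tests", "reviewer_evidence"},
    RiskTier.HIGH: {"lifecycle_gates", "security_review", "audit_record"},
    RiskTier.CRITICAL: {
        "named_approvers",
        "rollback_plan",
        "incident_readiness",
        "tool_call_audit",
    },
}


def missing_evidence(tier: RiskTier, provided: list[str]) -> list[str]:
    """Return the required evidence items that have not been provided."""
    return sorted(REQUIRED_EVIDENCE[tier] - set(provided))


def may_proceed(tier: RiskTier, provided: list[str]) -> bool:
    """A change may proceed only when nothing required is missing."""
    return not missing_evidence(tier, provided)


if __name__ == "__main__":
    print(missing_evidence(RiskTier.HIGH, ["security_review"]))
```

Returning the list of missing items, rather than a bare pass or fail, gives reviewers something concrete to act on.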

This is where AWS AI-DLC is strongest: it makes lifecycle state, approvals, and audit explicit. Spec Kit and OpenSpec can define changes, but governance must be added when risk is high.

Security checklist for coding agents

  • Use isolated workspaces or worktrees for risky changes.
  • Never grant unrestricted production credentials to a coding agent.
  • Require tests before merge.
  • Review generated specs and generated code separately.
  • Log shell commands and tool calls when possible.
  • Require human approval for destructive filesystem or cloud actions.
  • Store prompts and generated docs only in approved repo locations.
  • Treat generated code like junior-developer code until reviewed.
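
Two of the checklist items above, logging shell commands and requiring human approval for destructive actions, can be combined into a single gate in front of the agent's shell tool. This is a rough sketch: the `DESTRUCTIVE` set, the `approve` callback, and the logger name are hypothetical, and matching on the first token is a weak heuristic that misses cases such as `sudo rm` or piped commands.

```python
import logging
import shlex
from typing import Callable

logger = logging.getLogger("agent.commands")

# Hypothetical set of command names treated as destructive.
DESTRUCTIVE = {"rm", "dd", "mkfs", "shred"}


def is_destructive(command: str) -> bool:
    """Classify a command by its first token; unparseable input fails closed."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # e.g. unbalanced quotes: treat as destructive
    return bool(tokens) and tokens[0] in DESTRUCTIVE


def gate_command(command: str, approve: Callable[[str], bool]) -> bool:
    """Log every requested command; return True only if it may run."""
    logger.info("agent requested: %s", command)
    if is_destructive(command) and not approve(command):
        logger.warning("blocked pending approval: %s", command)
        return False
    return True


if __name__ == "__main__":
    deny_all = lambda cmd: False  # stand-in for a real human approval step
    print(gate_command("ls -la", deny_all))
    print(gate_command("rm -rf build", deny_all))
```

In practice `approve` would block on a named approver's decision rather than return immediately.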

Security checklist for AI apps

  • Classify data before it enters prompts, retrieval, memory, or logs.
  • Filter retrieval results before model context construction.
  • Scope tool credentials to least privilege.
  • Gate write/destructive actions.
  • Add prompt-injection tests for tool and RAG workflows.
  • Add output validation for structured actions.
  • Track model, prompt version, tool version, and trace ID.
  • Define memory retention and deletion policy.
  • Keep incident runbooks for unsafe outputs and tool misuse.
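
Output validation for structured actions could look like the sketch below, which parses a model's proposed action as JSON and rejects anything that is not an allowlisted action with exactly the expected arguments. The action names, argument schema, and `InvalidAction` exception are assumptions made for the example.

```python
import json

# Hypothetical schema: action name -> required argument names and types.
ALLOWED_ACTIONS = {
    "create_ticket": {"title": str, "priority": int},
    "send_reply": {"ticket_id": int, "body": str},
}


class InvalidAction(ValueError):
    """Raised when model output is not a well-formed, allowlisted action."""


def parse_action(raw: str) -> dict:
    """Validate raw model output and return a normalized action dict."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise InvalidAction(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise InvalidAction("action must be a JSON object")
    name = data.get("action")
    if not isinstance(name, str) or name not in ALLOWED_ACTIONS:
        raise InvalidAction(f"action not allowlisted: {name!r}")
    args = data.get("args")
    if not isinstance(args, dict):
        raise InvalidAction("args must be a JSON object")
    schema = ALLOWED_ACTIONS[name]
    if set(args) != set(schema):
        raise InvalidAction(f"expected arguments {sorted(schema)}")
    for key, expected in schema.items():
        # Exact type match, so that True/False is not accepted as an int.
        if type(args[key]) is not expected:
            raise InvalidAction(f"argument {key!r} must be {expected.__name__}")
    return {"action": name, "args": args}
```

Only the dict returned by `parse_action` would be handed to the tool layer; the raw model text is never executed directly.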

Step-by-step adoption guide

  1. Define risk tiers for your organization.
  2. Map each AI workflow or app feature to a risk tier.
  3. Define required evidence per tier: spec, tests, evals, approval, audit, rollback.
  4. Identify all data classes touched by the agent/app.
  5. Identify all tools and mark them read, write, destructive, or sensitive.
  6. Add approval gates for high-risk actions.
  7. Add logs and traces before production rollout.
  8. Review controls after incidents and major model/tool changes.
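
Steps 5 and 6 can be sketched as a small tool inventory in which each tool carries one classification, and the set of tools needing an approval gate is derived from it. The tool names are invented, and which classes count as high-risk (here, destructive and sensitive) is an assumption each organization would set for itself.

```python
from dataclasses import dataclass
from enum import Enum


class ToolClass(Enum):
    READ = "read"
    WRITE = "write"
    DESTRUCTIVE = "destructive"
    SENSITIVE = "sensitive"


@dataclass(frozen=True)
class Tool:
    name: str
    tool_class: ToolClass


# Assumption: these classes are the ones that require an approval gate.
GATED_CLASSES = {ToolClass.DESTRUCTIVE, ToolClass.SENSITIVE}


def tools_needing_approval(inventory: list[Tool]) -> list[str]:
    """Return the names of tools that must sit behind an approval gate."""
    return [tool.name for tool in inventory if tool.tool_class in GATED_CLASSES]


# Hypothetical inventory for illustration.
INVENTORY = [
    Tool("search_docs", ToolClass.READ),
    Tool("update_ticket", ToolClass.WRITE),
    Tool("delete_branch", ToolClass.DESTRUCTIVE),
    Tool("read_customer_record", ToolClass.SENSITIVE),
]

if __name__ == "__main__":
    print(tools_needing_approval(INVENTORY))
```

Keeping the classification in one inventory means a newly added tool has to be labeled before it can be used, which makes an unclassified tool visible during review.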

Failure modes

| Failure mode | Consequence | Prevention |
| --- | --- | --- |
| Governance applied equally to everything | team bypasses process | risk-tiered ceremony |
| No named approvers | approvals become vague | explicit ownership |
| Prompt injection ignored | retrieved content can steer tools | safety evals and tool confirmation |
| Secrets in prompts/logs | data breach | redaction and secret scanning |
| Memory retained forever | privacy and compliance risk | retention and deletion policy |
| Agent given broad credentials | privilege escalation | scoped credentials and gateway |
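
To illustrate the retention and deletion policy that guards against memory being retained forever, the sketch below purges stored memory items once they outlive a per-data-class window. The data classes, the retention windows, and the `MemoryItem` shape are hypothetical; actual windows would come from the organization's privacy and compliance requirements.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class MemoryItem:
    key: str
    data_class: str
    stored_at: datetime


# Hypothetical retention windows per data class.
RETENTION = {
    "public": timedelta(days=365),
    "internal": timedelta(days=90),
    "personal": timedelta(days=30),
}


def purge_expired(items: list[MemoryItem], now: datetime) -> list[MemoryItem]:
    """Return only the items still inside their retention window."""
    kept = []
    for item in items:
        # Unknown data classes get zero retention, so they are always dropped.
        limit = RETENTION.get(item.data_class, timedelta(0))
        if now - item.stored_at < limit:
            kept.append(item)
    return kept


if __name__ == "__main__":
    now = datetime(2025, 1, 31, tzinfo=timezone.utc)
    items = [
        MemoryItem("a", "personal", datetime(2025, 1, 15, tzinfo=timezone.utc)),
        MemoryItem("b", "personal", datetime(2024, 12, 1, tzinfo=timezone.utc)),
    ]
    print([item.key for item in purge_expired(items, now)])
```

Dropping items with an unrecognized data class is a deliberate fail-closed choice: data that was never classified should not persist by default.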

Built as a static bilingual AI engineering stack guide.