Security · Layer 1

The Gatekeeper permission engine.

Every MCP-AQL operation resolves to exactly one of four permission levels before it runs. The enforcement code below is quoted verbatim from the public mcp-server repository, so the implementation is presented directly rather than described.


The threat

An LLM decides which tool call to make. If the only control between that decision and the filesystem is the model's own judgement or the MCP client's "always allow" toggle, a prompt-injected, socially-engineered, or mistaken model is one tool call away from damage. The Gatekeeper ensures the model's intent is never the final authority.

Operation pipeline

Every MCP-AQL call traverses this pipeline before it executes, and any stage can terminate it. The type model and precedence rules described in the later sections govern the individual stages.

flowchart TD
  A[MCP-AQL operation requested] --> B{Route valid?}
  B -- no --> XN[DENY: unknown route]
  B -- yes --> C[Resolve active element policy]
  C --> D{In an element deny list?}
  D -- yes --> XD[DENY: non-elevatable, cannot be confirmed]
  D -- no --> E{In an element confirm list?}
  E -- yes --> F[Mark: confirmation required]
  E -- no --> G{In an element allow list?}
  G -- "yes and elevatable" --> H[AUTO_APPROVE]
  G -- "no, or not elevatable" --> I[Fall back to route default level]
  F --> J{Already confirmed this session?}
  I --> J
  J -- yes --> H
  J -- no --> K[Out-of-band human confirmation or challenge]
  K -- approved --> H
  K -- denied --> XR[Operation rejected]
  H --> L[Operation executes]
  L --> M[Autonomy evaluator and danger zone gate the next step]
  classDef deny fill:#b91c1c,stroke:#7f1d1d,color:#fff;
  classDef allow fill:#15803d,stroke:#14532d,color:#fff;
  class XN,XD,XR deny;
  class H,L allow;

The diagram defines the contract; the code below implements each stage.

A four-value permission model

Access control here is not a boolean allowed: true/false flag. Every operation resolves to one of four discrete levels, and the policy type carries a canBeElevated flag so a destructive operation can be marked non-negotiable at the type level.

From src/handlers/mcp-aql/GatekeeperTypes.ts

/**
 * Permission levels for operation access control.
 * These define how operations are approved or denied.
 */
export enum PermissionLevel {
  /** Always allowed, no confirmation needed */
  AUTO_APPROVE = 'AUTO_APPROVE',
  /** Confirm THIS instance only, ask again next time */
  CONFIRM_SINGLE_USE = 'CONFIRM_SINGLE_USE',
  /** Confirm once, auto-approve for rest of session */
  CONFIRM_SESSION = 'CONFIRM_SESSION',
  /** Never allowed (blocked by policy) */
  DENY = 'DENY',
}

/**
 * Operation policy mapping entry.
 * Defines the default permission level for an operation.
 */
export interface OperationPolicy {
  /** Default permission level for this operation */
  defaultLevel: PermissionLevel;
  /** Human-readable description of why this level is assigned */
  rationale: string;
  /** Whether this operation can be elevated by active elements */
  canBeElevated?: boolean;
}

Constraining every operation to one of four levels makes failing open structurally hard: there is no truthy default to slip through. canBeElevated is the mechanism by which the most destructive operations opt out of promotion.
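To make the "no truthy default" point concrete, here is a minimal sketch of how a caller might turn a resolved level into a gate outcome. It is an illustration under stated assumptions, not code from the repository: the GateOutcome type, the outcomeFor function, and the confirmedThisSession parameter are hypothetical, and a string-literal union stands in for the PermissionLevel enum so the block is self-contained.

```typescript
// Stand-in for the PermissionLevel enum quoted above, using the same four string values.
type PermissionLevel = 'AUTO_APPROVE' | 'CONFIRM_SINGLE_USE' | 'CONFIRM_SESSION' | 'DENY';

// Hypothetical outcome type for this sketch; it does not appear in the quoted source.
type GateOutcome = 'run' | 'ask' | 'block';

// Map a resolved level to what the gate does next.
// confirmedThisSession is an assumed input: whether a human already approved
// this operation earlier in the same session.
function outcomeFor(level: PermissionLevel, confirmedThisSession: boolean): GateOutcome {
  switch (level) {
    case 'AUTO_APPROVE':
      return 'run';
    case 'CONFIRM_SESSION':
      // One approval covers the rest of the session.
      return confirmedThisSession ? 'run' : 'ask';
    case 'CONFIRM_SINGLE_USE':
      // Earlier approvals never carry over.
      return 'ask';
    case 'DENY':
      return 'block';
    default: {
      // If a fifth level were added without a case above, this assignment
      // would stop compiling, so a new level cannot silently fall through.
      const unhandled: never = level;
      throw new Error(`Unhandled permission level: ${String(unhandled)}`);
    }
  }
}
```

The design point is that every branch names its outcome explicitly. There is no boolean to coerce and no implicit "otherwise allow" path.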

Precedence: deny > confirm > allow > route default

Active elements (personas, skills, ensembles) contribute policy. When they conflict, the resolver applies a strict hierarchy. A deny returns immediately and unconditionally. An allow can override the route default, but it can never override another element's confirm, and it can never elevate an operation that is not elevatable.

From src/handlers/mcp-aql/policies/ElementPolicies.ts

// Check each active element's policy in order
for (const element of activeElements) {
  const policy = element.metadata.gatekeeper;
  if (!policy) {
    continue;
  }

  // 1. Check deny list first (highest priority)
  if (policy.deny?.includes(operation)) {
    return {
      permissionLevel: PermissionLevel.DENY,
      sourceElement: element.name,
      matchedPolicy: 'deny',
      conflictingElements: conflictingElements.length > 0 ? conflictingElements : undefined,
    };
  }

  // ...

  // 3. Check confirm list (requires confirmation)
  if (policy.confirm?.includes(operation)) {
    // Don't downgrade from DENY or CONFIRM_SINGLE_USE
    if (effectiveLevel !== PermissionLevel.DENY) {
      effectiveLevel = PermissionLevel.CONFIRM_SESSION;
      confirmedByElement = true;
      sourceElement = element.name;
      matchedPolicy = 'confirm';
    }
  }

  // 4. Check allow list (auto-approves)
  // Issue #674: allow CAN override the route default, but NOT another element's confirm policy.
  // Priority hierarchy: element deny > element confirm > element allow > route default
  if (policy.allow?.includes(operation)) {
    // Only elevate if the operation allows elevation
    if (canOperationBeElevated(operation)) {
      if (!confirmedByElement) {
        // No element has confirmed this — safe to elevate (overrides route default)
        effectiveLevel = PermissionLevel.AUTO_APPROVE;
        sourceElement = element.name;
        matchedPolicy = 'allow';
      } else {
        // Another element's confirm policy takes priority over this allow
        conflictingElements.push({ name: element.name, wantedLevel: PermissionLevel.AUTO_APPROVE });
      }
    }
  }
}

This defends against privilege escalation by a malicious or careless active element: activating a permissive persona cannot silently unlock an operation another active element gated, and cannot affect a non-elevatable operation at all.
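The precedence rules can be exercised in isolation with a reduced resolver. The sketch below mirrors only the deny, confirm, and allow branches quoted above; the elided step is left out, conflict tracking is dropped, and the route default and elevatability are passed in as parameters instead of being looked up. The names resolveLevel, ActiveElement, and Resolution, the element names, and the edit_element operation are all hypothetical.

```typescript
// Stand-in for the PermissionLevel enum, using the same four string values.
type PermissionLevel = 'AUTO_APPROVE' | 'CONFIRM_SINGLE_USE' | 'CONFIRM_SESSION' | 'DENY';

interface ElementPolicy {
  allow?: string[];
  confirm?: string[];
  deny?: string[];
}

// Simplified element shape for this sketch (the quoted code reads element.metadata.gatekeeper).
interface ActiveElement {
  name: string;
  gatekeeper?: ElementPolicy;
}

interface Resolution {
  level: PermissionLevel;
  source: string | undefined;
}

function resolveLevel(
  operation: string,
  activeElements: ActiveElement[],
  routeDefault: PermissionLevel,
  elevatable: boolean,
): Resolution {
  let level: PermissionLevel = routeDefault;
  let source: string | undefined = undefined;
  let confirmedByElement = false;

  for (const element of activeElements) {
    const policy = element.gatekeeper;
    if (!policy) continue;

    // Deny wins immediately, whatever earlier elements said.
    if (policy.deny?.includes(operation)) {
      return { level: 'DENY', source: element.name };
    }
    // Confirm replaces any earlier allow and blocks any later one.
    if (policy.confirm?.includes(operation) && level !== 'DENY') {
      level = 'CONFIRM_SESSION';
      confirmedByElement = true;
      source = element.name;
    }
    // Allow only elevates an elevatable operation that no element has gated.
    if (policy.allow?.includes(operation) && elevatable && !confirmedByElement) {
      level = 'AUTO_APPROVE';
      source = element.name;
    }
  }
  return { level, source };
}

// Hypothetical elements: one permissive, one cautious.
const permissive: ActiveElement = {
  name: 'permissive-persona',
  gatekeeper: { allow: ['edit_element', 'delete_element'] },
};
const careful: ActiveElement = {
  name: 'careful-skill',
  gatekeeper: { confirm: ['edit_element'] },
};

// Activation order does not matter: the confirm survives either way.
const allowFirst = resolveLevel('edit_element', [permissive, careful], 'CONFIRM_SINGLE_USE', true);
const confirmFirst = resolveLevel('edit_element', [careful, permissive], 'CONFIRM_SINGLE_USE', true);

// A non-elevatable operation keeps its route default despite the allow entry.
const pinned = resolveLevel('delete_element', [permissive], 'CONFIRM_SINGLE_USE', false);
```

In this sketch allowFirst and confirmFirst both resolve to CONFIRM_SESSION sourced from careful-skill, and pinned stays at CONFIRM_SINGLE_USE with no source element.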

Safe defaults and pinned destructive operations

Permission levels are derived from CRUD semantics: reads are auto-approved, creates are confirmed once per session, and updates, deletes, and execution are confirmed on every use. An unknown operation name does not fall through to "allowed"; it falls back to per-use confirmation. The most destructive operations are pinned with canBeElevated: false, so no element policy can ever auto-approve them.

From src/handlers/mcp-aql/policies/OperationPolicies.ts

const ENDPOINT_DEFAULT_LEVELS: Record<CRUDEndpoint, PermissionLevel> = {
  READ: PermissionLevel.AUTO_APPROVE,
  CREATE: PermissionLevel.CONFIRM_SESSION,
  UPDATE: PermissionLevel.CONFIRM_SINGLE_USE,
  DELETE: PermissionLevel.CONFIRM_SINGLE_USE,
  EXECUTE: PermissionLevel.CONFIRM_SINGLE_USE,
};

// ...

// ===== DELETE endpoint overrides =====
// These match the endpoint default (CONFIRM_SINGLE_USE) but need canBeElevated: false
delete_element: {
  defaultLevel: PermissionLevel.CONFIRM_SINGLE_USE,
  rationale: 'Destructive operation, permanently removes data',
  canBeElevated: false,
},
clear: {
  defaultLevel: PermissionLevel.CONFIRM_SINGLE_USE,
  rationale: 'Destructive operation, clears all memory entries',
  canBeElevated: false,
},
clear_github_auth: {
  defaultLevel: PermissionLevel.CONFIRM_SINGLE_USE,
  rationale: 'Destructive operation, removes authentication credentials',
  canBeElevated: false,
},

The resolver itself fails safe: an unrecognised operation receives CONFIRM_SINGLE_USE, never auto-approval.

export function getDefaultPermissionLevel(operation: string): PermissionLevel {
  // 1. Check for explicit override
  const override = OPERATION_POLICY_OVERRIDES[operation];
  if (override) {
    return override.defaultLevel;
  }

  // 2. Derive from endpoint routing
  const route = getRoute(operation);
  if (route) {
    return ENDPOINT_DEFAULT_LEVELS[route.endpoint];
  }

  // 3. Secure fallback for unknown operations
  return PermissionLevel.CONFIRM_SINGLE_USE;
}

export function canOperationBeElevated(operation: string): boolean {
  const override = OPERATION_POLICY_OVERRIDES[operation];
  // Default to allowing elevation for operations without explicit overrides
  return override?.canBeElevated ?? true;
}
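The three-step lookup can be tried end to end with a small stand-in for the route table. This is a sketch under assumptions, not the repository's code: ROUTES and the operation names list_elements and create_element are hypothetical, the function names defaultLevelFor and isElevatable are made up for the example, and Map is used in place of plain objects so that a key such as "constructor" cannot accidentally match an inherited property.

```typescript
// Stand-ins for the quoted types, using string-literal unions.
type PermissionLevel = 'AUTO_APPROVE' | 'CONFIRM_SINGLE_USE' | 'CONFIRM_SESSION' | 'DENY';
type CRUDEndpoint = 'READ' | 'CREATE' | 'UPDATE' | 'DELETE' | 'EXECUTE';

interface OperationPolicy {
  defaultLevel: PermissionLevel;
  rationale: string;
  canBeElevated?: boolean;
}

// Same mapping as the quoted ENDPOINT_DEFAULT_LEVELS.
const ENDPOINT_DEFAULT_LEVELS: Record<CRUDEndpoint, PermissionLevel> = {
  READ: 'AUTO_APPROVE',
  CREATE: 'CONFIRM_SESSION',
  UPDATE: 'CONFIRM_SINGLE_USE',
  DELETE: 'CONFIRM_SINGLE_USE',
  EXECUTE: 'CONFIRM_SINGLE_USE',
};

// Hypothetical route table; the real one sits behind getRoute().
const ROUTES = new Map<string, CRUDEndpoint>([
  ['list_elements', 'READ'],
  ['create_element', 'CREATE'],
  ['delete_element', 'DELETE'],
]);

// One override copied from the quoted source.
const OVERRIDES = new Map<string, OperationPolicy>([
  ['delete_element', {
    defaultLevel: 'CONFIRM_SINGLE_USE',
    rationale: 'Destructive operation, permanently removes data',
    canBeElevated: false,
  }],
]);

function defaultLevelFor(operation: string): PermissionLevel {
  const override = OVERRIDES.get(operation);
  if (override) return override.defaultLevel;        // 1. explicit override

  const endpoint = ROUTES.get(operation);
  if (endpoint) return ENDPOINT_DEFAULT_LEVELS[endpoint]; // 2. endpoint default

  return 'CONFIRM_SINGLE_USE';                       // 3. secure fallback
}

function isElevatable(operation: string): boolean {
  // Operations without an override default to elevatable, as in the quoted code.
  return OVERRIDES.get(operation)?.canBeElevated ?? true;
}
```

With this table, list_elements resolves to AUTO_APPROVE, create_element to CONFIRM_SESSION, and any name missing from both maps to CONFIRM_SINGLE_USE. Only delete_element reports as non-elevatable.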

The nuclear sandbox

An element can ship gatekeeper: { deny: ['confirm_operation'] }. Because nothing can then satisfy a CONFIRM_* requirement, every operation that still requires confirmation freezes and the session degrades to read-only. A small set of operations (verify_challenge, release_deadlock, approve_cli_permission, and permission_prompt) is intentionally un-gatable, so the human-in-the-loop path can never be locked out.

From src/handlers/mcp-aql/policies/ElementPolicies.ts

const UNGATABLE_OPERATIONS = new Set([
  'verify_challenge',
  'release_deadlock',
  'approve_cli_permission',
  'permission_prompt',
]);

/** Derived from UNGATABLE_OPERATIONS + confirm_operation. See block comment above. */
const GATEKEEPER_INFRA_OPERATIONS = new Set([
  ...UNGATABLE_OPERATIONS,
  'confirm_operation',
]);

/**
 * Check if any active elements deny confirm_operation (nuclear sandbox).
 * Returns the denying element name if found, undefined otherwise.
 */
export function findConfirmDenyingElement(
  activeElements: Array<{ name: string; type: string; metadata: Record<string, unknown> }>
): { name: string; type: string } | undefined {
  for (const element of activeElements) {
    const gatekeeper = element.metadata?.gatekeeper as Record<string, unknown> | undefined;
    const denyList = gatekeeper?.deny;
    if (Array.isArray(denyList) && denyList.includes('confirm_operation')) {
      return { name: element.name, type: element.type };
    }
  }
  return undefined;
}

This defends against a compromised agent confirming its own destructive operations: the escape hatch is removable, while the recovery path is not.
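A short sketch shows the sandbox effect from the caller's side. The findConfirmDenyingElement body and the UNGATABLE_OPERATIONS set follow the quoted source; the isReachable helper, the element named read-only-mode, and the operation name list_elements are hypothetical, and treating un-gatable operations as always reachable is an assumption about how the server applies that set.

```typescript
// Stand-in for the PermissionLevel enum, using the same four string values.
type PermissionLevel = 'AUTO_APPROVE' | 'CONFIRM_SINGLE_USE' | 'CONFIRM_SESSION' | 'DENY';

interface ActiveElement {
  name: string;
  type: string;
  metadata: Record<string, unknown>;
}

const UNGATABLE_OPERATIONS = new Set([
  'verify_challenge',
  'release_deadlock',
  'approve_cli_permission',
  'permission_prompt',
]);

// Same logic as the quoted function: find an element that denies confirm_operation.
function findConfirmDenyingElement(
  activeElements: ActiveElement[],
): { name: string; type: string } | undefined {
  for (const element of activeElements) {
    const gatekeeper = element.metadata?.gatekeeper as Record<string, unknown> | undefined;
    const denyList = gatekeeper?.deny;
    if (Array.isArray(denyList) && denyList.includes('confirm_operation')) {
      return { name: element.name, type: element.type };
    }
  }
  return undefined;
}

// Hypothetical helper: can this operation still run at all in the current session?
function isReachable(
  operation: string,
  level: PermissionLevel,
  activeElements: ActiveElement[],
): boolean {
  if (UNGATABLE_OPERATIONS.has(operation)) return true; // recovery path stays open (assumed)
  if (level === 'DENY') return false;
  if (level === 'AUTO_APPROVE') return true;
  // CONFIRM_* levels need confirm_operation, which a sandboxing element may have denied.
  return findConfirmDenyingElement(activeElements) === undefined;
}

// Hypothetical sandboxing element.
const sandbox: ActiveElement[] = [
  { name: 'read-only-mode', type: 'persona', metadata: { gatekeeper: { deny: ['confirm_operation'] } } },
];

const readStillWorks = isReachable('list_elements', 'AUTO_APPROVE', sandbox);        // true
const deleteFrozen = isReachable('delete_element', 'CONFIRM_SINGLE_USE', sandbox);   // false
const recoveryOpen = isReachable('verify_challenge', 'CONFIRM_SINGLE_USE', sandbox); // true
```

Reads continue, anything that needs confirmation is frozen, and the recovery operations remain available for as long as the sandboxing element stays active.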

The Gatekeeper is server-side and runs after the MCP client approves a tool call, so an "always allow" toggle in the client cannot bypass it. The model can request any operation, but what it receives is decided here, in code, by policy it cannot edit.

Position in the security stack

The Gatekeeper is one layer and depends on the others. Content validation runs before a request is formed, so the prompt that produced the operation is already sanitized. The Gatekeeper then decides whether the operation runs. If it does, and the caller is an autonomous agent, the autonomy evaluator and danger zone enforcer gate the next step. CLI tool classification feeds a risk score into the confirmation decision for shell-touching operations.

flowchart LR
  CV[Content validation sanitizes element input] --> GK[Gatekeeper permission resolution]
  CLI[CLI tool classification: risk and irreversibility] --> GK
  GK -- "operation allowed" --> EX[Operation executes]
  GK -- "needs confirmation" --> HC[Out-of-band human challenge]
  EX --> AE[Autonomy evaluator gates the next agent step]
  AE -- "danger threshold crossed" --> DZ[Danger zone enforcer: process-level block]
  AE -- "continue" --> GK
  classDef deny fill:#b91c1c,stroke:#7f1d1d,color:#fff;
  classDef allow fill:#15803d,stroke:#14532d,color:#fff;
  class DZ deny;
  class EX allow;

Related

Security overview
The full eight-layer defense-in-depth model and how the Gatekeeper fits into it.
Agent safety
The autonomy evaluator and danger zone enforcer that gate every autonomous step after the Gatekeeper allows an operation.
Dynamic permissioning
How active elements reshape the permission surface at runtime, with the live console view.
