Motebit

Policy & Governance

How motebit controls what crosses the surface

The droplet metaphor is literal in the codebase. The body is passive. The interior is active. Governance is the surface tension — it controls what crosses the boundary between the agent's interior (memory, tools, state) and the outside world. Every outbound action passes through the policy gate. Every inbound data payload is sanitized. Every decision is logged.

For the conceptual foundation, see Governance. For where policy sits in the package architecture, see Architecture.

The PolicyGate

PolicyGate (in @motebit/policy) is the central decision engine. It sits between the agentic loop and the tool registry. Every tool call passes through it.

The gate evaluates each tool call and returns one of three outcomes:

| Outcome | Meaning |
| --- | --- |
| allowed | Tool executes immediately |
| requires_approval | Execution pauses until the user approves or denies |
| denied | Tool is blocked unconditionally |

The gate also filters which tools the AI model can see (filterTools), sanitizes tool results before they enter the conversation, redacts detected secrets, and enforces budgets. Every decision is logged to the audit trail.

Risk classification

Tools are classified by the risk they carry. If a tool provides its own riskHint, that value is used; otherwise the risk level is inferred from the tool name and description via pattern matching.

| Level | Name | Side effect | Examples |
| --- | --- | --- | --- |
| R0 | R0_READ | None | Search, read file, recall memories |
| R1 | R1_DRAFT | None | Draft, compose, suggest, format |
| R2 | R2_WRITE | Reversible | Write file, send message, create issue |
| R3 | R3_EXECUTE | Irreversible | Shell execution, deploy, restart |
| R4 | R4_MONEY | Irreversible | Payment, transfer, checkout |

Risk classification also considers data class (PUBLIC, PRIVATE, SECRET) inferred from tool context, and per-tool risk overrides configured in the policy.
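The classification order described above can be sketched as follows. This is a minimal illustration, not the @motebit/policy implementation: the ToolDef shape, the pattern list, the precedence of overrides over riskHint, and the high-risk fallback for unmatched tools are all assumptions made for the example.

```typescript
// Hypothetical sketch. Only riskHint and the R0..R4 level names come from the docs.
type RiskLevel = "R0_READ" | "R1_DRAFT" | "R2_WRITE" | "R3_EXECUTE" | "R4_MONEY";

interface ToolDef {
  name: string;
  description: string;
  riskHint?: RiskLevel; // explicit hint supplied by the tool itself
}

// Checked from highest risk to lowest so the most dangerous match wins.
const RISK_PATTERNS: Array<[RiskLevel, RegExp]> = [
  ["R4_MONEY", /\b(pay|payment|transfer|checkout)\b/i],
  ["R3_EXECUTE", /\b(shell|exec|execute|deploy|restart)\b/i],
  ["R2_WRITE", /\b(write|send|create)\b/i],
  ["R1_DRAFT", /\b(draft|compose|suggest|format)\b/i],
  ["R0_READ", /\b(search|read|recall)\b/i],
];

function classifyRisk(
  tool: ToolDef,
  overrides: ReadonlyMap<string, RiskLevel> = new Map(),
): RiskLevel {
  // Assumption: a per-tool override in the policy beats the tool's own hint.
  const override = overrides.get(tool.name);
  if (override !== undefined) return override;
  if (tool.riskHint !== undefined) return tool.riskHint;

  // Split snake_case and kebab-case names so word boundaries match.
  const text = `${tool.name} ${tool.description}`.replace(/[_\-.]+/g, " ");
  for (const [level, pattern] of RISK_PATTERNS) {
    if (pattern.test(text)) return level;
  }
  // Assumption: a tool that matches nothing is treated as high risk.
  return "R3_EXECUTE";
}
```

Treating unmatched tools as high risk keeps the sketch in line with the fail-closed principle; the real default may differ.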

Three-band governance

When a motebit.md identity file declares governance thresholds, the PolicyGate operates in three-band mode. Three fields partition all tool calls into three bands:

| Threshold | Default | Purpose |
| --- | --- | --- |
| max_risk_auto | R1_DRAFT | Maximum risk the agent executes without any approval |
| require_approval_above | R1_DRAFT | Above this, the user must approve each call |
| deny_above | R4_MONEY | Above this, the tool is blocked unconditionally |

The constraint max_risk_auto <= require_approval_above <= deny_above must hold. If any threshold is missing or the constraint is violated, the daemon refuses to start (fail-closed).

In practice with the defaults: read and draft tools flow automatically, write and execute tools require approval, and financial tools are hard-denied.
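A rough sketch of threshold validation and band selection follows. The function names are hypothetical, and two behaviors are assumptions: "above" is read as a strict comparison, and any call that is neither auto-allowed nor denied falls into the approval band. The thresholds in the usage note are illustrative values, not the defaults.

```typescript
// Hypothetical sketch. Threshold field names and risk levels come from the docs.
const RISK_ORDER = ["R0_READ", "R1_DRAFT", "R2_WRITE", "R3_EXECUTE", "R4_MONEY"] as const;
type RiskLevel = (typeof RISK_ORDER)[number];
type Decision = "allowed" | "requires_approval" | "denied";

interface Thresholds {
  max_risk_auto?: RiskLevel;
  require_approval_above?: RiskLevel;
  deny_above?: RiskLevel;
}

const rank = (level: RiskLevel): number => RISK_ORDER.indexOf(level);

// Fail closed: throw if a threshold is missing or the ordering constraint is violated.
function validateThresholds(t: Thresholds): Required<Thresholds> {
  const { max_risk_auto, require_approval_above, deny_above } = t;
  if (!max_risk_auto || !require_approval_above || !deny_above) {
    throw new Error("governance thresholds missing: refusing to start");
  }
  if (
    rank(max_risk_auto) > rank(require_approval_above) ||
    rank(require_approval_above) > rank(deny_above)
  ) {
    throw new Error("max_risk_auto <= require_approval_above <= deny_above violated");
  }
  return { max_risk_auto, require_approval_above, deny_above };
}

function decide(risk: RiskLevel, t: Required<Thresholds>): Decision {
  if (rank(risk) > rank(t.deny_above)) return "denied"; // strictly above (assumption)
  if (rank(risk) <= rank(t.max_risk_auto)) return "allowed";
  return "requires_approval"; // everything in between
}
```

With illustrative thresholds of max_risk_auto R1_DRAFT, require_approval_above R1_DRAFT, and deny_above R3_EXECUTE, this sketch allows R0 and R1 calls, asks for approval on R2 and R3, and denies R4.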

Operator mode

Operator mode is a PIN-protected session elevation — sudo for the agent. When disabled, only R0/R1 tools are available. When enabled, higher-risk tools become accessible (still subject to the three-band governance thresholds and per-tool approval rules).

| Aspect | Detail |
| --- | --- |
| PIN | 4-6 digits. SHA-256 hash stored in the OS keyring; never plaintext, never sent over the network. |
| Scope | Session-level. Disabling operator mode re-locks everything immediately. |
| Separation from identity | Operator mode is not part of the cryptographic identity. It is a runtime gate on the device. |

The motebit.md file can declare operator_mode: true or false as a governance default, but the actual PIN and its hash live only in the device's secure storage.
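The effect of the operator-mode switch on tool availability can be sketched as below. The Tool shape and both function names are hypothetical; the sketch covers only the R0/R1 restriction and the 4-6 digit PIN format, and leaves out hashing and keyring storage.

```typescript
// Hypothetical sketch. Not the actual operator-mode implementation.
type RiskLevel = "R0_READ" | "R1_DRAFT" | "R2_WRITE" | "R3_EXECUTE" | "R4_MONEY";

interface Tool {
  name: string;
  risk: RiskLevel;
}

const LOW_RISK: ReadonlySet<RiskLevel> = new Set<RiskLevel>(["R0_READ", "R1_DRAFT"]);

// With operator mode off, only R0/R1 tools are exposed. With it on, every tool
// is a candidate, still subject to the governance thresholds and approvals.
function availableTools(tools: Tool[], operatorMode: boolean): Tool[] {
  return operatorMode ? tools : tools.filter((t) => LOW_RISK.has(t.risk));
}

// The docs specify a 4-6 digit PIN; anything else is rejected up front.
function isValidPin(pin: string): boolean {
  return /^\d{4,6}$/.test(pin);
}
```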

Tool budgets

The BudgetEnforcer rate-limits tool invocations to prevent runaway loops during goal execution:

  • budgetMaxCalls — maximum tool calls per turn (configurable per-device)
  • Turn elapsed time and cost accumulation are tracked in the TurnContext

When the budget is exhausted, additional tool calls are denied with a reason explaining the limit. Budget state resets on each new turn.
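A per-turn call budget of this kind might look like the following sketch. The class and method names are hypothetical stand-ins, not the BudgetEnforcer API; only budgetMaxCalls, the denial reason, and the per-turn reset are taken from the description above.

```typescript
// Hypothetical sketch of a per-turn tool-call budget.
interface BudgetResult {
  allowed: boolean;
  reason?: string;
}

class TurnBudget {
  private calls = 0;
  private readonly budgetMaxCalls: number;

  constructor(budgetMaxCalls: number) {
    this.budgetMaxCalls = budgetMaxCalls;
  }

  // Count the call if budget remains; otherwise deny with an explanatory reason.
  tryConsume(): BudgetResult {
    if (this.calls >= this.budgetMaxCalls) {
      return {
        allowed: false,
        reason: `tool budget exhausted: limit is ${this.budgetMaxCalls} calls per turn`,
      };
    }
    this.calls += 1;
    return { allowed: true };
  }

  // Budget state resets at the start of each new turn.
  resetForNewTurn(): void {
    this.calls = 0;
  }
}
```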

Memory governance

The MemoryGovernor controls what the agent is allowed to remember. Every memory candidate is evaluated before entering the graph:

| Check | Rule |
| --- | --- |
| Secret detection | If the content contains tokens, keys, passwords, or credentials, the memory is rejected. Never stored. |
| Per-turn limit | Maximum memories per turn (default: 5). Excess candidates become ephemeral (session-only). |
| Confidence threshold | Below the persistence threshold (default: 0.5), memories are kept as ephemeral. |
| Sensitivity classification | SECRET-level memories are rejected unconditionally. |

Memories that pass all checks are classified as PERSISTENT and stored to the graph with a human-readable explanation ("why did you remember this?").
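The four checks can be combined into a single pass over a turn's candidates, as in this sketch. The Candidate and Verdict shapes, the order in which the checks run, and the deliberately simplistic secret pattern are assumptions; the defaults of 5 and 0.5 come from the table above.

```typescript
// Hypothetical sketch. Not the MemoryGovernor implementation.
type MemoryClass = "PERSISTENT" | "EPHEMERAL" | "REJECTED";

interface Candidate {
  content: string;
  confidence: number;
  sensitivity?: string;
}

interface Verdict {
  classification: MemoryClass;
  reason: string;
}

// Toy secret detector; a real one would cover far more formats.
const SECRET_PATTERN = /(password|api[_-]?key|secret|token)\s*[:=]/i;

function governTurn(
  candidates: Candidate[],
  maxPerTurn = 5,
  persistenceThreshold = 0.5,
): Verdict[] {
  let persisted = 0;
  return candidates.map((c): Verdict => {
    if (c.sensitivity === "SECRET") {
      return { classification: "REJECTED", reason: "SECRET-level sensitivity" };
    }
    if (SECRET_PATTERN.test(c.content)) {
      return { classification: "REJECTED", reason: "content looks like a credential" };
    }
    if (c.confidence < persistenceThreshold) {
      return { classification: "EPHEMERAL", reason: "confidence below persistence threshold" };
    }
    if (persisted >= maxPerTurn) {
      return { classification: "EPHEMERAL", reason: "per-turn memory limit reached" };
    }
    persisted += 1;
    return { classification: "PERSISTENT", reason: "passed all governance checks" };
  });
}
```

Each verdict carries a reason string, which is one way to back the human-readable explanation attached to stored memories.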

Sensitivity and retention

The SensitivityManager (in @motebit/privacy-layer) enforces retention rules per sensitivity level:

| Level | Max retention | Display allowed |
| --- | --- | --- |
| none | Unlimited | Yes |
| personal | 365 days | Yes |
| medical | 90 days | No |
| financial | 90 days | No |
| secret | 30 days | No |

These defaults can be overridden in the motebit.md privacy section. The fail_closed flag (default true) means that if sensitivity cannot be determined, access is denied.
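The default rules translate directly into a lookup table. In the sketch below, the function names and the RetentionRule shape are hypothetical, and the behavior when fail_closed is false (permitting access for an undetermined level) is an assumption, since the docs only describe the default.

```typescript
// Hypothetical sketch of the default retention rules.
type Sensitivity = "none" | "personal" | "medical" | "financial" | "secret";

interface RetentionRule {
  maxRetentionDays: number;
  displayAllowed: boolean;
}

const RULES: Record<Sensitivity, RetentionRule> = {
  none: { maxRetentionDays: Infinity, displayAllowed: true },
  personal: { maxRetentionDays: 365, displayAllowed: true },
  medical: { maxRetentionDays: 90, displayAllowed: false },
  financial: { maxRetentionDays: 90, displayAllowed: false },
  secret: { maxRetentionDays: 30, displayAllowed: false },
};

const DAY_MS = 24 * 60 * 60 * 1000;

function isExpired(level: Sensitivity, createdAt: Date, now: Date): boolean {
  const ageDays = (now.getTime() - createdAt.getTime()) / DAY_MS;
  return ageDays > RULES[level].maxRetentionDays;
}

function canDisplay(level: string | undefined, failClosed = true): boolean {
  const known = level !== undefined && Object.prototype.hasOwnProperty.call(RULES, level);
  // Undetermined sensitivity: deny when failing closed (the default).
  if (!known) return !failClosed;
  return RULES[level as Sensitivity].displayAllowed;
}
```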

Deletion certificates

When a memory is deleted, the DeleteManager produces a DeletionCertificate:

  • target_id, target_type, deleted_at, deleted_by
  • tombstone_hash — SHA-256 hash of the deletion metadata, proving the memory existed and was deleted at the recorded time

The certificate is an audit artifact. The memory content is gone; the proof of deletion remains.
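As an illustration, a certificate with a SHA-256 tombstone hash could be produced as follows. The field names match the list above, but the function names and the exact serialization that gets hashed (a JSON object with a fixed field order) are assumptions; the DeleteManager may canonicalize differently.

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch. Field names come from the docs; the hashing input is assumed.
interface DeletionCertificate {
  target_id: string;
  target_type: string;
  deleted_at: string; // ISO 8601 timestamp
  deleted_by: string;
  tombstone_hash: string; // hex-encoded SHA-256 of the deletion metadata
}

function hashMetadata(
  target_id: string,
  target_type: string,
  deleted_at: string,
  deleted_by: string,
): string {
  // Only metadata is hashed, in a fixed field order, so the digest is reproducible.
  const canonical = JSON.stringify({ target_id, target_type, deleted_at, deleted_by });
  return createHash("sha256").update(canonical).digest("hex");
}

function issueDeletionCertificate(
  target_id: string,
  target_type: string,
  deleted_by: string,
  deleted_at: string = new Date().toISOString(),
): DeletionCertificate {
  const tombstone_hash = hashMetadata(target_id, target_type, deleted_at, deleted_by);
  return { target_id, target_type, deleted_at, deleted_by, tombstone_hash };
}

// Recompute the hash from the certificate's own fields to detect tampering.
function verifyCertificate(cert: DeletionCertificate): boolean {
  const expected = hashMetadata(cert.target_id, cert.target_type, cert.deleted_at, cert.deleted_by);
  return expected === cert.tombstone_hash;
}
```

Because the memory content never enters the hash input, the certificate can be retained and verified after the content itself is gone.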

Injection defense

The ContentSanitizer protects the agent from prompt injection in tool results. Three detection layers run independently:

  1. Regex pattern matching — known attack signatures: "ignore previous instructions", chat template markers (<|im_start|>system), identity rewrites, jailbreak keywords. Patterns are tested against text normalized to defeat homoglyph and zero-width character evasion.
  2. Directive density — if more than 5% of words in a tool result are instruction-like phrases ("you must", "override", "execute"), the content is flagged as suspicious.
  3. Structural anomaly detection — JSON role markers ("role": "system"), prompt section headers, XML prompt framing tags in tool output data.

Regardless of detection, all external content is wrapped in [EXTERNAL_DATA] boundary markers. The system prompt instructs the model to treat everything inside these boundaries as data, never as directives. Detection triggers an injection_warning event in the streaming output.
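The three layers and the unconditional wrapping can be approximated in a few lines. This sketch is far smaller than a real sanitizer: the pattern lists are tiny, NFKC plus zero-width stripping is only a partial stand-in for homoglyph normalization, the function names are hypothetical, and the closing marker [/EXTERNAL_DATA] is an assumed format.

```typescript
// Hypothetical sketch. Not the ContentSanitizer implementation.
const DIRECTIVE_PHRASES = ["you must", "override", "execute"];

// Fold compatibility characters and strip zero-width characters before matching.
function normalize(text: string): string {
  return text
    .normalize("NFKC")
    .replace(/[\u200B-\u200D\u2060\uFEFF]/g, "")
    .toLowerCase();
}

// Ratio of instruction-like phrase occurrences to total words.
function directiveDensity(text: string): number {
  const normalized = normalize(text);
  const words = normalized.split(/\s+/).filter(Boolean);
  if (words.length === 0) return 0;
  let hits = 0;
  for (const phrase of DIRECTIVE_PHRASES) {
    hits += normalized.split(phrase).length - 1;
  }
  return hits / words.length;
}

function sanitize(text: string): { wrapped: string; injectionWarning: boolean } {
  const normalized = normalize(text);
  const signatureHit = /ignore (all )?previous instructions|<\|im_start\|>/.test(normalized);
  const structuralHit = /"role"\s*:\s*"system"/.test(normalized);
  const densityHit = directiveDensity(text) > 0.05;
  return {
    // Wrapping happens whether or not anything was detected.
    wrapped: `[EXTERNAL_DATA]\n${text}\n[/EXTERNAL_DATA]`,
    injectionWarning: signatureHit || structuralHit || densityHit,
  };
}
```

A 5% density threshold is sensitive on short inputs: a single "execute" in a 13-word result already exceeds it, which is why a flag here means suspicious rather than blocked.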

The audit trail

Every policy decision is recorded to an append-only audit log:

| Field | Description |
| --- | --- |
| tool_name | Which tool was requested |
| args | The arguments passed (first 500 chars for approvals) |
| decision | allowed, denied, or requires_approval |
| reason | Why the decision was made |
| turn_id | Which conversation turn |
| call_id | Unique identifier for this specific tool call |
| timestamp | When the decision occurred |

The audit log is queryable via the admin dashboard, the CLI (motebit approvals list), and programmatically through the AuditLogger API. Mode changes (operator mode enable/disable) are also logged as audit entries.
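An append-only log with the fields above might be modeled like this. InMemoryAuditLog is a hypothetical stand-in and does not reflect the AuditLogger API; treating args as a pre-serialized string and truncating it only for requires_approval entries are assumptions drawn from the table.

```typescript
// Hypothetical in-memory sketch of an append-only audit log.
type Decision = "allowed" | "denied" | "requires_approval";

interface AuditEntry {
  tool_name: string;
  args: string; // serialized arguments
  decision: Decision;
  reason: string;
  turn_id: string;
  call_id: string;
  timestamp: string; // ISO 8601
}

type AuditFilter = Partial<Pick<AuditEntry, "tool_name" | "decision" | "turn_id">>;

class InMemoryAuditLog {
  private readonly entries: AuditEntry[] = [];

  // Entries are only ever appended, and each one is frozen against later mutation.
  append(entry: AuditEntry): void {
    const args =
      entry.decision === "requires_approval" ? entry.args.slice(0, 500) : entry.args;
    this.entries.push(Object.freeze({ ...entry, args }));
  }

  query(filter: AuditFilter = {}): readonly AuditEntry[] {
    return this.entries.filter(
      (e) =>
        (filter.tool_name === undefined || e.tool_name === filter.tool_name) &&
        (filter.decision === undefined || e.decision === filter.decision) &&
        (filter.turn_id === undefined || e.turn_id === filter.turn_id),
    );
  }
}
```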

Fail-closed everywhere

The design principle is consistent across every layer: if something goes wrong, deny rather than allow.

  • PolicyGate errors default to deny
  • Memory classification failures default to reject
  • Privacy layer wraps every operation in try/catch and re-throws as access denied
  • Keyring unavailability prevents operator mode from enabling
  • Missing governance thresholds prevent the daemon from starting
  • URLs that cannot be validated against the domain allowlist are denied (not silently passed)
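The common shape behind these rules is a wrapper that converts any failure into a denial. The sketch below shows that pattern for a policy evaluation and for an allowlist check; the function names and the Outcome shape are hypothetical, and matching on exact hostname is an assumption about how the allowlist works.

```typescript
// Hypothetical sketch of the fail-closed pattern.
type Decision = "allowed" | "denied" | "requires_approval";

interface Outcome {
  decision: Decision;
  reason: string;
}

// Any exception during evaluation becomes a denial instead of propagating
// or silently allowing the call.
function failClosed(evaluate: () => Outcome): Outcome {
  try {
    return evaluate();
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    return { decision: "denied", reason: `policy error, denying by default: ${detail}` };
  }
}

// A URL that cannot be parsed is denied, the same as one that is not listed.
function isAllowedUrl(raw: string, allowlist: readonly string[]): boolean {
  try {
    return allowlist.includes(new URL(raw).hostname);
  } catch {
    return false;
  }
}
```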