Mentu

Execution Model

Mentu does not run tasks. It runs epistemic operations — units of computation that consume intent, produce evidence, and record both into a tamper-evident substrate.

Every execution primitive in mentu shares three properties:

  1. It accepts an intent. Not just a command. An intent carries context — what is known, what was tried, what constraints apply.
  2. It produces evidence. Not just output. Evidence is trust-scored, semantically tagged, and causally chained.
  3. It records both. The ledger captures intent, execution, and evidence as a single cryptographically verifiable chain.

The ten primitives form a closed composition algebra. Any primitive can embed any other. A step runs identically whether it executes alone or inside a 200-step compound.

The Ten Primitives

Step — The Atom

One agent, one intent, one execution boundary.

S: (Intent, Context) → (Evidence, Trust)

The engine dispatches intent to a backend (Claude, Gemini, Codex, or any OpenAI-compatible provider), iterates through the execution loop, and monitors for termination via four mechanical guards:

  • Completion keyword — the agent signals when work is done
  • Max iterations — default 12 attempts
  • Max runtime — default 1 hour wall clock
  • Circuit breaker — 3 consecutive failures halt execution

Each step produces a StepResult with output, cost, duration, trust metadata (seven mechanical weights), and a ledger entry chained to the previous one.
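The four termination guards can be sketched as a small loop. This is a minimal illustration, not Mentu's engine: the `Guards` dataclass, the `run_step` function, the `"DONE"` keyword, and the injectable `clock` are all hypothetical names chosen for the example. Only the defaults (12 iterations, 1 hour, 3 consecutive failures) come from the text above.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Guards:
    completion_keyword: str = "DONE"     # hypothetical keyword, not Mentu's
    max_iterations: int = 12             # default from the text
    max_runtime_s: float = 3600.0        # 1 hour wall clock
    max_consecutive_failures: int = 3    # circuit breaker threshold


def run_step(agent: Callable[[int], Optional[str]],
             guards: Guards = Guards(),
             clock: Callable[[], float] = time.monotonic) -> str:
    """Call agent(iteration) until one guard fires; return that guard's name."""
    start = clock()
    failures = 0
    for iteration in range(guards.max_iterations):
        if clock() - start > guards.max_runtime_s:
            return "max_runtime"
        try:
            output = agent(iteration)
            failures = 0                 # any success resets the breaker
        except Exception:
            failures += 1
            if failures >= guards.max_consecutive_failures:
                return "circuit_breaker"
            continue
        if output and guards.completion_keyword in output:
            return "completed"
    return "max_iterations"


def always_fails(iteration: int) -> str:
    raise RuntimeError("backend unavailable")


outcome_done = run_step(lambda i: "DONE" if i == 2 else "working")
outcome_exhausted = run_step(lambda i: "still working")
outcome_broken = run_step(always_fails)
```

In this sketch a successful iteration resets the failure counter, so only consecutive failures trip the breaker.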

Formula — The Sequence

An ordered composition of steps. The unit of epistemic work.

F = fold(S₁, S₂, …, Sₙ)

Each step's output context flows into the next step's input. The fold is accumulation, not concatenation — step 5 starts with the evidence from steps 1 through 4. This is where epistemic acceleration begins: a formula's value exceeds the sum of its steps run independently.
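The difference between accumulation and concatenation can be illustrated with a left fold. This sketch assumes a step is simply a function from the accumulated context to a piece of evidence; the `Context` and `Step` aliases and `run_formula` are hypothetical names, not Mentu's API.

```python
from functools import reduce
from typing import Callable

Context = list[str]                  # accumulated evidence, oldest first
Step = Callable[[Context], str]      # hypothetical step signature


def run_formula(steps: list[Step], initial: Context) -> Context:
    """Fold the steps left to right, appending each step's evidence to the context."""
    def apply(context: Context, step: Step) -> Context:
        evidence = step(context)     # the step sees everything produced so far
        return context + [evidence]
    return reduce(apply, steps, initial)


# Each toy step reports how many prior pieces of evidence it could see.
steps = [lambda ctx: f"saw {len(ctx)} prior results" for _ in range(3)]
result = run_formula(steps, [])
```

The third step runs with two prior results in its context, which is the property the fold is meant to guarantee.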

Two execution modes:

  • Sequential — steps execute in order
  • DAG-parallel — steps declare depends_on labels; the scheduler builds execution layers via topological sort with cycle detection
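One way to build execution layers from depends_on labels is Python's standard `graphlib`. The sketch below is an assumption about how such a scheduler could work, not Mentu's implementation; `execution_layers` and the step labels are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def execution_layers(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group step labels into layers; every step in a layer can run concurrently."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()                      # raises CycleError if the graph has a cycle
    layers = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        layers.append(ready)
        sorter.done(*ready)               # unblocks steps that depended on this layer
    return layers


layers = execution_layers({
    "lint": set(),
    "build": set(),
    "test": {"build"},
    "report": {"lint", "test"},
})
```

Here `lint` and `build` share the first layer, `test` waits for `build`, and `report` waits for both of its dependencies.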

Pipeline — The Chain

Sequential composition of formulas. Each formula's output feeds the next.

P = F₁ ;c F₂ ;c … ;c Fₙ

The condition c on each ;c operator decides whether the next formula runs: on success of the previous formula (the default), on failure (run a different formula for the failure case), or always. Each formula can target a different workspace. Resume is built in: --from N resolves to the correct formula and local step offset.
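The two mechanics in this section, conditional sequencing and resume resolution, can be sketched as follows. The function names are hypothetical, and the sketch assumes that N in --from N is a 1-based step number counted across the whole pipeline; the source does not specify the indexing.

```python
def should_run(condition: str, previous_succeeded: bool) -> bool:
    """Decide whether the next formula runs, given the previous formula's outcome."""
    if condition == "success":
        return previous_succeeded
    if condition == "failure":
        return not previous_succeeded
    if condition == "always":
        return True
    raise ValueError(f"unknown condition: {condition}")


def resolve_resume(step_counts: list[int], from_step: int) -> tuple[int, int]:
    """Map a 1-based global step number to (formula index, local step offset), both 0-based."""
    if from_step < 1:
        raise ValueError("from_step must be >= 1")
    remaining = from_step - 1
    for formula_index, count in enumerate(step_counts):
        if remaining < count:
            return formula_index, remaining
        remaining -= count            # skip past this formula entirely
    raise ValueError("from_step is beyond the last step of the pipeline")


# A pipeline of three formulas with 3, 2, and 4 steps.
resume_point = resolve_resume([3, 2, 4], 4)
```

With step counts of 3, 2, and 4, global step 4 is the first step of the second formula.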

Parallel — The Fleet

Multiple formulas running concurrently, each in an isolated git worktree.

‖{F₁, F₂, …, Fₙ}

Isolation is structural. Each formula gets its own worktree on a temporary branch. The ledger is shared and flock-protected for concurrent writes. Fleet-wide rate limit coordination pauses all formulas when any hits an API limit. Pre-flight cost estimation projects total spend before execution begins.
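A flock-protected append to a shared ledger might look like the sketch below. It assumes a POSIX system (Python's `fcntl` module is not available on Windows) and a hypothetical JSON-lines ledger file; Mentu's actual ledger format is not described here.

```python
import fcntl
import json
import os
import tempfile


def append_entry(path: str, entry: dict) -> None:
    """Append one JSON line while holding an exclusive advisory lock on the file."""
    with open(path, "a", encoding="utf-8") as ledger:
        fcntl.flock(ledger.fileno(), fcntl.LOCK_EX)   # blocks until other writers release
        try:
            ledger.write(json.dumps(entry) + "\n")
            ledger.flush()                            # write out before releasing the lock
        finally:
            fcntl.flock(ledger.fileno(), fcntl.LOCK_UN)


# Two formulas writing to the same (hypothetical) ledger file.
ledger_path = os.path.join(tempfile.mkdtemp(), "ledger.jsonl")
append_entry(ledger_path, {"formula": "F1", "status": "ok"})
append_entry(ledger_path, {"formula": "F2", "status": "ok"})

with open(ledger_path, encoding="utf-8") as ledger:
    entries = [json.loads(line) for line in ledger]
```

The lock is advisory: it only protects against writers that also call flock on the same file.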

Compound — The Graph

A dependency graph of layers. Each layer can be a formula, parallel, pipeline, adversarial, convergent, temporal, sentinel, or another compound.

G = (V, E)   where V = {layers}, E = {depends_on edges}

Independent layers run concurrently. The scheduler uses topological sort with cycle detection. Compounds support recursive nesting (depth-limited, default 10) and reactive gates — CIR queries evaluated between execution waves that can inject, skip, or abort layers based on live evidence.

The compound is the most general primitive. It subsumes all others: a compound with one formula layer is a formula; a compound with one parallel layer is a parallel; sequential layers form a pipeline. Use the simplest primitive that fits.

Adversarial — The Self-Correcting Pair

Red/Blue pairing. Blue defends. Red attacks.

A(F_blue, F_red) → (Verdict, ΔTrust)

Blue runs first. Its outputs are assembled into an evidence document. Red runs second with Blue's evidence as context, looking for contradictions, vulnerabilities, and logical flaws. The engine then queries CIR for contradictions between their signals.

Three verdicts:

  • SURVIVED — Red found nothing. Blue's trust increases.
  • COMPROMISED — Red found contradictions. Blue's trust decreases. Degradation propagates through every signal that cites Blue's evidence.
  • INCONCLUSIVE — Red failed to complete. No trust adjustment.
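The verdict logic and the propagation of degraded trust can be sketched as below. The names `judge` and `affected_signals`, and the `cited_by` mapping from a signal to the signals that cite it, are assumptions made for illustration; how much trust changes per verdict is not specified in the text and is left out.

```python
from collections import deque
from enum import Enum


class Verdict(Enum):
    SURVIVED = "survived"
    COMPROMISED = "compromised"
    INCONCLUSIVE = "inconclusive"


def judge(red_completed: bool, contradictions: int) -> Verdict:
    """Derive the verdict from Red's completion status and the contradiction count."""
    if not red_completed:
        return Verdict.INCONCLUSIVE
    return Verdict.COMPROMISED if contradictions > 0 else Verdict.SURVIVED


def affected_signals(blue_signal: str, cited_by: dict[str, list[str]]) -> set[str]:
    """Signals that cite Blue's evidence, directly or transitively (breadth-first)."""
    seen: set[str] = set()
    queue = deque([blue_signal])
    while queue:
        current = queue.popleft()
        for citer in cited_by.get(current, []):
            if citer not in seen:
                seen.add(citer)
                queue.append(citer)
    return seen


# "a" and "b" cite Blue's evidence; "c" cites "a".
downstream = affected_signals("blue", {"blue": ["a", "b"], "a": ["c"]})
```

On a COMPROMISED verdict, every signal in `downstream` would be a candidate for degradation.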

Convergent — Selection Under Uncertainty

N formulas run concurrently — same goal, different strategies — and the best result is selected via mechanical evaluation.

C({F₁, F₂, …, Fₙ}, σ) → F_winner

Three selector methods:

  • trust_score — highest mean effective confidence wins
  • contradiction_min — fewest unresolved contradictions wins
  • formula — an evaluation formula ranks all attempts

Only the winner's result is carried forward. The losing attempts are preserved in CIR as negative evidence: proof of what does not work. Models can differ per attempt: Opus on approach A, Sonnet on B, Gemini on C. The selector evaluates evidence, not identity.
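The two purely mechanical selectors can be sketched with `max` and `min`. The `Attempt` fields and the `select` function are hypothetical; the formula selector is omitted because it depends on running an evaluation formula.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Attempt:
    name: str
    confidences: list[float]          # effective confidence of each signal produced
    unresolved_contradictions: int


def select(attempts: list[Attempt], method: str) -> Attempt:
    """Pick the winning attempt using one of the mechanical selector methods."""
    if method == "trust_score":
        return max(attempts, key=lambda a: mean(a.confidences))
    if method == "contradiction_min":
        return min(attempts, key=lambda a: a.unresolved_contradictions)
    raise ValueError(f"unsupported selector: {method}")


attempts = [
    Attempt("approach-a", [0.9, 0.7], unresolved_contradictions=2),
    Attempt("approach-b", [0.6, 0.8], unresolved_contradictions=0),
    Attempt("approach-c", [0.5, 0.5], unresolved_contradictions=1),
]
by_trust = select(attempts, "trust_score")
by_contradictions = select(attempts, "contradiction_min")
```

The two selectors can disagree, as they do here, which is why the choice of selector is part of the convergent definition.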

Temporal — Scheduled Execution

A formula bound to a time dimension.

{
  "name": "weekly-audit",
  "schedule": "0 9 * * 1",
  "cooldown": 86400,
  "ttl": 604800,
  "formula": "security-audit",
  "on_ttl_expire": "rerun"
}

Cron scheduling with cooldown enforcement, TTL expiry (rerun, notify, or decay), catch-up policies for missed runs, jitter for load spreading, budget caps, and circuit breaker protection. Results are CIR signals with trust metadata — evidence that knows when it expires.
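The cooldown and TTL checks reduce to simple timestamp arithmetic. The sketch below assumes both values are in seconds (86400 is one day, 604800 is seven days) and uses hypothetical function names.

```python
from typing import Optional

COOLDOWN_S = 86400    # value from the example config; seconds is an assumption
TTL_S = 604800        # value from the example config; seconds is an assumption


def cooldown_elapsed(now: float, last_run: Optional[float],
                     cooldown: float = COOLDOWN_S) -> bool:
    """A run is allowed if there was no previous run or the cooldown has passed."""
    return last_run is None or now - last_run >= cooldown


def ttl_expired(now: float, produced_at: float, ttl: float = TTL_S) -> bool:
    """Evidence expires once it is at least ttl seconds old."""
    return now - produced_at >= ttl
```

With `on_ttl_expire` set to `rerun`, an expired result would trigger the formula again, subject to the cooldown.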

For the full temporal mechanics, see Temporals.

Sentinel — Continuous Monitoring

Long-running monitoring with progressive escalation and attention budgets.

{
  "name": "drift-sentinel",
  "watch": { "source": "cir", "condition": "contradiction_rate > 0.15" },
  "heartbeat": 60,
  "escalation": [
    { "threshold": 1, "action": "log" },
    { "threshold": 3, "action": "formula:investigate-drift" },
    { "threshold": 5, "action": "notify:human" }
  ],
  "attention_budget": 0.50
}

Four watch sources: CIR conditions, shell commands, file changes, formula results. Escalation is progressive — first trigger logs, third triggers a formula, fifth notifies a human. Attention budget (USD/day) caps spending on triggered formulas. When exhausted, the sentinel watches but stops acting until the budget resets.
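Progressive escalation and the attention budget can be sketched as follows. Two assumptions are made that the text does not settle: between thresholds the sentinel repeats the highest action reached so far, and only formula actions draw on the budget. `action_for` and `AttentionBudget` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

# (threshold, action) pairs from the example config, in ascending order.
ESCALATION = [(1, "log"), (3, "formula:investigate-drift"), (5, "notify:human")]


def action_for(trigger_count: int) -> Optional[str]:
    """Return the action of the highest threshold reached, or None below the first."""
    chosen = None
    for threshold, action in ESCALATION:
        if trigger_count >= threshold:
            chosen = action
    return chosen


@dataclass
class AttentionBudget:
    daily_usd: float = 0.50
    spent_usd: float = 0.0

    def try_spend(self, cost_usd: float) -> bool:
        """Reserve cost_usd if it fits in today's budget; otherwise refuse."""
        if self.spent_usd + cost_usd > self.daily_usd:
            return False
        self.spent_usd += cost_usd
        return True

    def reset(self) -> None:
        """Called at the start of each budget period."""
        self.spent_usd = 0.0


budget = AttentionBudget()
first = budget.try_spend(0.30)     # fits in the 0.50 budget
second = budget.try_spend(0.30)    # would exceed it, so the sentinel only watches
```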

Substrate — Meta-Operations

Formulas that modify execution infrastructure itself — trust weights, CIR configuration, routing, prompt templates. Changes are staged, diffed, and applied with snapshot-based rollback. Approval gates prevent unauthorized modifications.

The Composition Algebra

The primitives form a closed algebra under two operators:

  • Sequential composition (;): A ; B means run A, then B. Associative.
  • Parallel composition (‖): A ‖ B means run A and B concurrently. Commutative.

Any primitive can embed any other:

Outer        Inner          What happens
Formula      Step           A formula is an ordered sequence of steps
Pipeline     Formula        Pipeline chains formulas
Parallel     Formula        Fleet runs formulas concurrently
Compound     Any primitive  Compound dispatches to the appropriate runner
Adversarial  Formula        Blue and Red are both formulas
Convergent   Formula        Each attempt is a formula

The Embedding Principle

Every higher primitive embeds lower ones without modification. A step does not know whether it runs alone, inside a formula, or deep inside a compound. The interface is invariant. Trust computation is per-step, always. Higher primitives aggregate but never override individual step trust.

Mechanical Trust

Trust is computed, never self-reported. Seven weighted signals produce a confidence score between 0 and 1:

Signal                What it measures
Exit code             Did the process complete without error?
Completion            Did the agent signal completion?
Cost efficiency       Was token spend within expected bounds?
Context utilization   How much available context was used?
Build gate            Did the build pass?
Test coverage         What fraction of tests passed?
Cross-step coherence  Does this step's output align with prior steps?

Every trust change is recorded in the trust event log with its cause — the specific signal that triggered the adjustment. The entire chain is explainable.
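A weighted score over the seven signals could be computed as a normalized weighted sum. The actual weights and the way each signal is normalized to the range 0 to 1 are not given in the text, so the equal weights and the `trust_score` function below are placeholders for illustration.

```python
SIGNALS = (
    "exit_code", "completion", "cost_efficiency", "context_utilization",
    "build_gate", "test_coverage", "cross_step_coherence",
)


def trust_score(signals: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of the seven signals, each expected in the range [0, 1]."""
    for name in SIGNALS:
        if not 0.0 <= signals[name] <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    total_weight = sum(weights[name] for name in SIGNALS)
    weighted = sum(weights[name] * signals[name] for name in SIGNALS)
    return weighted / total_weight


equal_weights = {name: 1.0 for name in SIGNALS}    # placeholder weights
perfect = {name: 1.0 for name in SIGNALS}
failed_build = dict(perfect, build_gate=0.0)       # everything passes except the build
```

Dividing by the total weight keeps the score between 0 and 1 regardless of how the weights are scaled.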

The Substrate Interface

All primitives operate on the CIR substrate through two invariants:

Read-before-act. Before every step — not every formula, every step — the engine queries CIR for prior evidence. The agent starts from what is known, not from zero.

Write-after-step. After every step, the engine writes a signal to CIR with trust metadata, semantic tags, and a cryptographic chain to the prior signal. Evidence persists, decays at a configured half-life, and can be queried by future steps.
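The write side of this interface combines a hash chain with time-based decay. The sketch below assumes SHA-256 over the previous hash plus a canonical JSON payload, and exponential decay for the half-life; the signal layout and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64    # placeholder "previous hash" for the first signal


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the payload."""
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_signal(chain: list[dict], payload: dict) -> None:
    """Append a signal that is cryptographically linked to the prior one."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify(chain: list[dict]) -> bool:
    """Recompute every hash; any edited payload or broken link fails verification."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


def decayed_confidence(confidence: float, age_s: float, half_life_s: float) -> float:
    """Exponential decay: confidence halves every half_life_s seconds."""
    return confidence * 0.5 ** (age_s / half_life_s)


chain: list[dict] = []
append_signal(chain, {"step": "build", "trust": 0.8, "tags": ["ci"]})
append_signal(chain, {"step": "test", "trust": 0.9, "tags": ["ci"]})
intact = verify(chain)
```

Changing any earlier payload invalidates its hash and every link after it, which is what makes the chain tamper-evident.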

┌─ Step ─┐  ┌─ Formula ─┐  ┌─ Pipeline ─┐  ┌─ Parallel ─┐  ┌─ Compound ─┐
└───┬────┘  └─────┬─────┘  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘
    │             │               │                │                │
    ▼             ▼               ▼                ▼                ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                    CIR Substrate (epistemic memory)                        │
│   Signals · Relations · Trust · Decay · Embeddings · Contradictions       │
└─────────────────────────────────────────────────────────────────────────────┘

Epistemic Acceleration

A formula's epistemic value exceeds the sum of its steps run independently:

V(F) > V(S₁) + V(S₂) + … + V(Sₙ)   where F = fold(S₁, S₂, …, Sₙ)

Three mechanisms drive this:

  1. Preamble injection — before each step, the engine injects summaries of all prior steps (labels, durations, costs, trust scores, output tails)
  2. CIR read-before-act — each step starts with accumulated workspace knowledge, not just the current formula's context
  3. Output hash chaining — cache keys include dependency output hashes; changed outputs trigger re-execution downstream
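Output hash chaining can be illustrated with a cache key that mixes in the hash of every dependency's output. This is a sketch under assumptions: the key layout, the use of SHA-256, and the function names are not taken from Mentu.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def cache_key(step_label: str, prompt: str, dependency_outputs: dict[str, str]) -> str:
    """Key a step's cached result on its own inputs and on its dependencies' output hashes."""
    digest = hashlib.sha256()
    digest.update(step_label.encode("utf-8"))
    digest.update(b"\0")
    digest.update(prompt.encode("utf-8"))
    for label in sorted(dependency_outputs):     # sorted so dict order cannot change the key
        output_hash = sha256_hex(dependency_outputs[label])
        digest.update(b"\0" + label.encode("utf-8") + b"=" + output_hash.encode("utf-8"))
    return digest.hexdigest()


key_before = cache_key("test", "run the suite", {"build": "artifact v1"})
key_same = cache_key("test", "run the suite", {"build": "artifact v1"})
key_after = cache_key("test", "run the suite", {"build": "artifact v2"})
```

Because the upstream output changed, `key_after` misses the cache and the downstream step re-executes.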

The acceleration has limits. Context windows saturate, errors compound through chains, and diminishing returns emerge in long formulas. Circuit breakers and budget caps are the mechanical safeguards.

© 2026 Mentu.