CIR Substrate
CIR — Cognitive Infrastructure Retrieval — is the epistemic substrate at the heart of mentu. It is a structured knowledge base that tracks not just what you know, but when you learned it, how confident you are, and whether it contradicts something else.
Why CIR exists
Traditional tools treat knowledge as static. A config file is true until someone changes it. A document is current until someone updates it. But real knowledge has a time dimension:
- An observation from last week may be contradicted by new evidence today
- A high-confidence classification may decay as the underlying system evolves
- Two independent analyses may reach conflicting conclusions
CIR makes this explicit. Every piece of knowledge is a signal with metadata: type, confidence, source, timestamp, tags. Signals connect through relations. Patterns emerge from recurring structures. And when two signals conflict, CIR surfaces the contradiction rather than silently picking one.
For the conceptual introduction to CIR and why it exists, see "What is CIR?". This page covers the technical mechanics.
Three layers of record
CIR maintains a constitutional separation between evidence, understanding, and judgment. These three layers must never collapse into a single record:
Layer 1: Raw signal — Immutable, append-only. The evidence itself. Once a signal is captured, it is never modified. Raw signals carry an asserted_confidence that is frozen at creation time — the confidence the source claimed when it produced the signal.
Layer 2: Semantic interpretation — Versionable, re-computable. The system's understanding of what the evidence means. Interpretations can be revised as new evidence arrives. Embeddings, entity extraction, and domain classification live in this layer.
Layer 3: Mechanics state — Derived, dynamic. Trust scores, contradiction status, salience rankings. These are substrate judgments that evolve as the evidence graph changes. They are not intrinsic facts of the original signal — they are computed properties.
If these layers collapse into one record, the system cannot distinguish what it observed from what it infers. When a signal's effective confidence drops because its supporting evidence was contradicted, the raw signal remains intact. The evidence is preserved even when the system's judgment about it changes.
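One way to picture the separation is as three distinct record types keyed by the same signal id. The sketch below is illustrative only: the type names (`RawSignal`, `Interpretation`, `MechanicsState`), the `version`, `entities`, and `contradicted` fields, and the `penalty` parameter are assumptions made for this example, not mentu's actual schema.

```typescript
// Hypothetical record shapes; names and most fields are illustrative assumptions.
interface RawSignal {
  readonly id: string;
  readonly body: string;
  readonly asserted_confidence: number; // Layer 1: frozen at creation
  readonly created_at: string;
}

interface Interpretation {
  signal_id: string;
  version: number; // Layer 2: can be recomputed as new evidence arrives
  domain: string;
  entities: string[];
}

interface MechanicsState {
  signal_id: string;
  effective_confidence: number; // Layer 3: a derived judgment
  contradicted: boolean;
}

// Layer 1 is append-only: freezing rejects mutation at runtime as well.
function captureRaw(id: string, body: string, asserted: number, createdAt: string): RawSignal {
  return Object.freeze({ id, body, asserted_confidence: asserted, created_at: createdAt });
}

// A Layer 3 change yields a new state object; the raw signal is never touched.
function markContradicted(state: MechanicsState, penalty: number): MechanicsState {
  return {
    ...state,
    contradicted: true,
    effective_confidence: Math.max(0, state.effective_confidence - penalty),
  };
}

const raw = captureRaw("sig_1", "API is healthy", 0.9, "2026-04-04T14:32:00Z");
const interpretation: Interpretation = { signal_id: raw.id, version: 1, domain: "infrastructure", entities: ["API"] };
const initialState: MechanicsState = { signal_id: raw.id, effective_confidence: 0.9, contradicted: false };
const revisedState = markContradicted(initialState, 0.4);
```

After the revision, `raw.asserted_confidence` is still 0.9 while `revisedState.effective_confidence` has dropped, which is the distinction between observation and inference that the layering is meant to protect.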
Three confidence values
Every signal carries three distinct confidence measures:
| Value | Stored where | Mutability | Computation |
|---|---|---|---|
| asserted_confidence | Raw signal (Layer 1) | Frozen at creation | Set by source |
| effective_confidence | Mechanics state (Layer 3) | Updated on graph changes | Trust propagation across citation edges |
| current_confidence | Never stored | Computed at read time | effective_confidence × decay(age) |
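The read-time computation in the last row can be sketched as follows. The source states only that current confidence is effective confidence multiplied by decay(age); the exponential half-life shape, the 30-day default, and the function names here are assumptions chosen for illustration.

```typescript
// Assumption: exponential decay with a configurable half-life.
// The actual decay function used by CIR is not specified in this document.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function decayFactor(ageMs: number, halfLifeDays: number): number {
  const ageDays = Math.max(0, ageMs) / MS_PER_DAY;
  return Math.pow(0.5, ageDays / halfLifeDays);
}

// Computed at read time from stored inputs; the result itself is never stored.
function currentConfidence(
  effective: number,
  createdAt: string,
  now: Date,
  halfLifeDays = 30,
): number {
  const ageMs = now.getTime() - new Date(createdAt).getTime();
  return effective * decayFactor(ageMs, halfLifeDays);
}

const readTime = new Date("2026-05-04T14:32:00Z");
const current = currentConfidence(0.8, "2026-04-04T14:32:00Z", readTime);
```

Under these assumptions, a signal with effective confidence 0.8 read exactly 30 days after capture yields a current confidence of 0.4.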
Trust propagation: When a signal's supporting evidence is weakened (e.g., a cited signal is contradicted), the effective confidence of the dependent signal drops automatically. Every confidence change is recorded in the trust_events log with a cause_signal_id — the specific signal that triggered the change. This makes the entire trust chain explainable.
Synchronous minimum pass on write: When a new signal is captured, citation edges, initial effective confidence, and local contradiction checks happen synchronously. Full graph propagation across the wider evidence network runs asynchronously.
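A minimal sketch of how a weakening might propagate and be logged, assuming an in-memory graph. Only `trust_events` and `cause_signal_id` come from the text above; the `TrustEvent` shape, the `damping` parameter, and the breadth-first traversal are hypothetical, and the sketch does not model the synchronous/asynchronous split.

```typescript
interface TrustEvent {
  signal_id: string;
  cause_signal_id: string; // the signal that triggered the change
  before: number;
  after: number;
}

// dependents.get(x) lists the ids of signals that cite x as supporting evidence.
function propagateWeakening(
  weakenedId: string,
  causeId: string,
  dependents: Map<string, string[]>,
  effective: Map<string, number>,
  trustEvents: TrustEvent[],
  damping = 0.5, // assumption: the reduction halves with each hop
): void {
  const queue: Array<{ id: string; factor: number }> = [{ id: weakenedId, factor: damping }];
  const seen = new Set<string>([weakenedId]);
  while (queue.length > 0) {
    const { id, factor } = queue.shift()!;
    for (const dep of dependents.get(id) ?? []) {
      if (seen.has(dep)) continue; // guard against cycles in the citation graph
      seen.add(dep);
      const before = effective.get(dep) ?? 0;
      const after = before * (1 - factor);
      effective.set(dep, after);
      trustEvents.push({ signal_id: dep, cause_signal_id: causeId, before, after });
      queue.push({ id: dep, factor: factor * damping });
    }
  }
}

// sig_b cites sig_a, sig_c cites sig_b; sig_x has just contradicted sig_a.
const dependentsOf = new Map<string, string[]>([["sig_a", ["sig_b"]], ["sig_b", ["sig_c"]]]);
const effectiveById = new Map<string, number>([["sig_a", 0.9], ["sig_b", 0.8], ["sig_c", 0.8]]);
const trustLog: TrustEvent[] = [];
propagateWeakening("sig_a", "sig_x", dependentsOf, effectiveById, trustLog);
```

In this toy run, sig_b drops from 0.8 to 0.4 and sig_c from 0.8 to about 0.6, and both log entries name sig_x as the cause, so each change can be traced back to the signal that triggered it.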
Read-before-act invariant
Before mentu executes any formula step or automated action, it MUST perform a bounded CIR query. This is a hard engineering constraint:
- The query happens even if the result set is empty
- The query is bounded (limited scope, limited time)
- The results shape the execution context for the action that follows
This invariant is what transforms CIR from a database into an active substrate. Intelligence in mentu is not stateless function execution — it is action informed by accumulated evidence.
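The invariant can be expressed as a wrapper that refuses to run an action without first querying. This is a sketch under assumptions: `readBeforeAct`, `QueryFn`, and the `BoundedQuery` fields are hypothetical names, and the injected query function stands in for a real CIR lookup.

```typescript
interface ContextSignal {
  id: string;
  body: string;
}

interface BoundedQuery {
  domain: string;     // limited scope
  limit: number;      // limited result count
  deadlineMs: number; // limited time; enforcement is left to the query implementation
}

// Hypothetical query function type; a real one would be backed by CIR.
type QueryFn = (q: BoundedQuery) => ContextSignal[];

function readBeforeAct<T>(
  queryCir: QueryFn,
  bounds: BoundedQuery,
  action: (context: ContextSignal[]) => T,
): T {
  // The query always runs, even when it returns nothing.
  const context = queryCir(bounds).slice(0, bounds.limit);
  // The results shape the execution context of the action.
  return action(context);
}

let queryCalls = 0;
const emptyStore: QueryFn = () => {
  queryCalls += 1;
  return [];
};

const outcome = readBeforeAct(
  emptyStore,
  { domain: "deployment", limit: 5, deadlineMs: 200 },
  (context) => (context.length === 0 ? "proceed-without-history" : "proceed-with-history"),
);
```

Because the action only receives its context through the wrapper, there is no code path that acts without having queried first, even when the store is empty.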
Five layers
1. Signals
The atomic unit of knowledge. A signal is a single observation, classification, inference, or fact captured into CIR.
interface CIRSignal {
  id: string; // Unique identifier
  type: string; // 'observation', 'classification', 'generation', 'embedding', 'model_load'
  body: string; // The actual content
  domain?: string; // Knowledge domain (e.g., 'infrastructure', 'security')
  confidence?: number; // 0.0 to 1.0
  source?: string; // What produced this signal
  created_at?: string; // ISO 8601 timestamp
  tags?: string[]; // Categorization tags
}
Signals are immutable once captured. You do not update a signal; you capture a new one with a higher confidence or a different observation, and CIR tracks both.
2. Relations
Connections between signals. Relations form a directed graph over the signal corpus — currently ~3.9 million edges. Each relation has a type, direction, and strength.
Relations fall into three categories:
Logical relations — Epistemic connections between claims:
- cites: Direct reference without endorsement
- supports: Provides evidence for
- contradicts: Directly opposes
- qualifies: Adds conditions or limitations
Evolutionary relations — How knowledge develops over time:
- extends: Builds upon while preserving validity
- refines: Improves precision or accuracy
- supersedes: Replaces with better understanding
- forks: Creates a divergent interpretation
Epistemic relations — Metacognitive connections:
- questions: Challenges assumptions
- speculates: Proposes without certainty
- verifies: Confirms through independent means
- synthesizes: Integrates multiple signals into a new understanding
The relation type determines how trust propagation flows. A supports edge strengthens the target's effective confidence. A contradicts edge triggers contradiction detection. A supersedes edge marks the target for decay.
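The twelve relation types and the three propagation rules described above could be modelled as a union type and a dispatch function. The `Relation` shape, the `PropagationEffect` labels, and the "none" fallback are assumptions for this sketch; the text does not say how the remaining nine types affect propagation.

```typescript
type RelationType =
  | "cites" | "supports" | "contradicts" | "qualifies"     // logical
  | "extends" | "refines" | "supersedes" | "forks"         // evolutionary
  | "questions" | "speculates" | "verifies" | "synthesizes"; // epistemic

interface Relation {
  from: string;
  to: string;
  type: RelationType;
  strength: number;
}

type PropagationEffect =
  | "strengthen_target"
  | "check_contradiction"
  | "mark_target_for_decay"
  | "none";

function propagationEffect(rel: Relation): PropagationEffect {
  switch (rel.type) {
    case "supports":
      return "strengthen_target";
    case "contradicts":
      return "check_contradiction";
    case "supersedes":
      return "mark_target_for_decay";
    default:
      // Assumption: effects of the other relation types are not described in the source.
      return "none";
  }
}

const supersedeEffect = propagationEffect({ from: "sig_new", to: "sig_old", type: "supersedes", strength: 1 });
```

Using a closed union means a misspelled relation type fails at compile time rather than silently skipping propagation.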
3. Embeddings
Vector representations of signals for semantic search. When you call mentu.cir.search('authentication failure'), the search runs against embeddings to find signals whose meaning matches, not just signals whose text contains the words.
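To illustrate why embedding search matches meaning rather than exact words, here is a toy ranking by cosine similarity. The similarity metric, the three-dimensional vectors, and the names `cosineSimilarity` and `rankBySimilarity` are assumptions; the document does not state which metric or embedding model CIR uses.

```typescript
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0; // treat missing dimensions as zero
    dot += x * y;
    normA += x * x;
  });
  b.forEach((y) => {
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

interface EmbeddedSignal {
  id: string;
  vector: number[];
}

// Returns the ids of the `limit` signals closest to the query vector.
function rankBySimilarity(query: number[], corpus: EmbeddedSignal[], limit: number): string[] {
  return corpus
    .map((s) => ({ id: s.id, score: cosineSimilarity(query, s.vector) }))
    .sort((l, r) => r.score - l.score)
    .slice(0, limit)
    .map((s) => s.id);
}

// Toy vectors: the first axis loosely stands for "authentication", the third for "storage".
const toyCorpus: EmbeddedSignal[] = [
  { id: "sig_login_denied", vector: [0.9, 0.1, 0.0] },
  { id: "sig_disk_full", vector: [0.0, 0.2, 0.9] },
];
const ranked = rankBySimilarity([1, 0, 0], toyCorpus, 1);
```

Here the query vector ranks sig_login_denied first even though no text is compared at all, only direction in the vector space.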
4. Patterns
Recurring structures detected across signals. A pattern emerges when CIR observes the same type of signal from the same domain with similar content appearing repeatedly. Patterns are the substrate's way of saying "this keeps happening."
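A deliberately simplified sketch of pattern detection: it groups signals by type and domain and reports groups that recur at least a given number of times. The content-similarity condition mentioned above is omitted, and `detectPatterns`, `minOccurrences`, and the key format are hypothetical.

```typescript
interface PatternInput {
  type: string;
  domain: string;
  body: string;
}

interface DetectedPattern {
  key: string; // "<type>:<domain>"
  count: number;
}

// Simplification: groups on type and domain only; a fuller version would also
// compare content, for example via embedding similarity.
function detectPatterns(signals: PatternInput[], minOccurrences: number): DetectedPattern[] {
  const counts = new Map<string, number>();
  for (const s of signals) {
    const key = `${s.type}:${s.domain}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  const patterns: DetectedPattern[] = [];
  counts.forEach((count, key) => {
    if (count >= minOccurrences) patterns.push({ key, count });
  });
  return patterns;
}

const recent: PatternInput[] = [
  { type: "observation", domain: "infrastructure", body: "API latency exceeded 500ms" },
  { type: "observation", domain: "infrastructure", body: "API latency exceeded 500ms again" },
  { type: "observation", domain: "infrastructure", body: "API latency high" },
  { type: "classification", domain: "security", body: "Login attempt flagged" },
];
const recurring = detectPatterns(recent, 3);
```

With a threshold of three, only the repeated infrastructure observations qualify as a pattern; the single security classification does not.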
Signal lifecycle
Signals are not static records. They move through a lifecycle driven by verification, access, and time:
- Birth — Captured with maximum uncertainty. Asserted confidence is set; effective confidence initialized.
- Maturation — Confidence grows as other signals verify or support it. Citation edges accumulate.
- Peak relevance — Maximum epistemic utility. Frequently accessed, well-connected in the graph.
- Decay — Time erodes current confidence. Access frequency drops. Newer signals supersede.
- Archival — Low current confidence but preserved in the record. Still citable, still part of the evidence graph.
Access-based reinforcement counteracts decay: signals that are frequently queried or cited maintain higher effective confidence. Context-aware decay adjusts the rate — a signal with many strong supporting relations decays slower than an isolated one.
5. Contradictions
When two signals conflict — one says the API is healthy, another says it is down — CIR surfaces the contradiction. Contradictions are not errors. They are evidence that the world is more complex than a single observation suggests.
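Surfacing rather than resolving can be sketched as a function that reports both sides of every `contradicts` edge with their confidences and makes no choice between them. The `GraphEdge`, `ConflictView`, and `SurfacedContradiction` shapes are assumptions for this example and may differ from what `mentu.cir.contradictions()` returns.

```typescript
interface GraphEdge {
  from: string;
  to: string;
  type: string;
}

interface ConflictView {
  id: string;
  confidence: number;
}

interface SurfacedContradiction {
  left: ConflictView;
  right: ConflictView;
}

function surfaceContradictions(
  edges: GraphEdge[],
  confidence: Map<string, number>,
): SurfacedContradiction[] {
  const seenPairs = new Set<string>();
  const result: SurfacedContradiction[] = [];
  for (const edge of edges) {
    if (edge.type !== "contradicts") continue;
    // Treat A-contradicts-B and B-contradicts-A as the same contradiction.
    const pairKey = [edge.from, edge.to].sort().join("|");
    if (seenPairs.has(pairKey)) continue;
    seenPairs.add(pairKey);
    // Both sides are reported; neither is discarded or declared the winner.
    result.push({
      left: { id: edge.from, confidence: confidence.get(edge.from) ?? 0 },
      right: { id: edge.to, confidence: confidence.get(edge.to) ?? 0 },
    });
  }
  return result;
}

const healthEdges: GraphEdge[] = [
  { from: "sig_api_down", to: "sig_api_healthy", type: "contradicts" },
  { from: "sig_api_healthy", to: "sig_api_down", type: "contradicts" },
  { from: "sig_latency", to: "sig_api_down", type: "supports" },
];
const confidences = new Map<string, number>([["sig_api_down", 0.7], ["sig_api_healthy", 0.95]]);
const surfaced = surfaceContradictions(healthEdges, confidences);
```

The two mirrored edges collapse into a single reported contradiction that still carries both signals, leaving the judgment to whoever reads it.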
Signal anatomy
A concrete example:
{
  "id": "sig_a7f3b2c1",
  "type": "observation",
  "body": "API latency exceeded 500ms threshold at 14:32 UTC",
  "domain": "infrastructure",
  "confidence": 0.95,
  "source": "health-script",
  "created_at": "2026-04-04T14:32:00Z",
  "tags": ["latency", "alert", "api"]
}
- type describes what kind of knowledge this is. observation is something directly measured. classification is a categorization. generation is something produced by an LLM. embedding and model_load are system-level signals from the ANE pipeline.
- confidence is a score between 0 and 1. A direct measurement might be 0.95. An inference from indirect evidence might be 0.6. Confidence is metadata, not a filter: low-confidence signals are still valuable as context.
- source identifies what produced the signal. This can be a script name, a sequence, or a manual capture.
Querying CIR
Via CLI
# Recent signals
mentu cir query --type observation --since 24h --limit 10
# Search by content
mentu cir search "authentication failure"
# Aggregate stats
mentu cir stats
# Check for contradictions
mentu cir contradictions
Via SDK (in scripts)
// Query with filters
const signals = mentu.cir.query({ type: 'observation', since: '24h', limit: 50 });
// Semantic search (matches meaning via embeddings, not just literal text)
const matches = mentu.cir.search('authentication failure', { limit: 5 });
// Aggregate stats
const stats = mentu.cir.stats();
// Contradiction check
const contradictions = mentu.cir.contradictions();
Via MCP tools
AI models with mentu-mcp configured can access CIR through the mcp_do tool with CIR-related intents.
Capturing signals
Signals enter CIR through three paths:
- SDK capture: scripts call mentu.cir.capture() to record observations
- CLI capture: mentu cir capture "some observation" from the terminal
- System signals: mentu itself emits signals for model loads, embeddings, and other system events
// Capture from a script
mentu.cir.capture('Deployment completed successfully', {
  type: 'observation',
  domain: 'deployment',
  confidence: 1.0,
  source: 'deploy-script',
  tags: ['deploy', 'success'],
});
Substrate statistics
CIR tracks its own scale:
interface CIRStats {
  signals: number; // Total signal count
  relations: number; // Total relation count
  embeddings: number; // Vector embedding count
  patterns: number; // Detected pattern count
  contradictions: number; // Active contradiction count
  database_size_bytes: number; // SQLite database size
}
Use mentu.cir.stats() in scripts or mentu cir stats from the CLI to inspect the current state of the substrate.
Why CIR matters
CIR is what makes mentu more than a task runner. Without CIR, a health check script runs, prints output, and the output disappears. With CIR, the script captures its observations as signals. The next run can query what happened last time. A temporal can trend observations over days. A contradiction surfaces when two health checks disagree.
Knowledge compounds. CIR is the compound interest mechanism.