REFERENCE // CONTEXT_MGMT

Context & State Management

Keep useful project signal in context without making the provider cache, or any one fixed compression ratio, the continuity boundary. Supported routes can combine output shaping, stable prefixes, digest envelopes, inline diagnostics, and local context pages.

CONTEXT_AND_STATE_REFERENCE

1. Context Lifecycle Loop
2. Prompt Caching & Economics
3. Output Compression (Squasher)
4. Digest Envelopes & Read Prevention
5. Enrichment Hooks & Diagnostics
6. Capabilities & the SKILL Directive
7. Task Eligibility & the WORK Directive
8. Context Pressure & Handoff

01 // CONTEXT_LIFECYCLE

The Context Lifecycle Loop

A supported ostk-managed turn can combine the following stages. Provider caching, output shaping, enrichment, and drain behavior remain route-specific rather than universal.

PRELOAD_RENDER

The kernel compiles the base prompt, identity (.language), registers, and Agentfile context. These blocks are byte-stable across turns to maximize prompt cache hits. Volatile elements are appended after the cache boundaries.

TOOL_CALL

The model invokes a kernel-mediated tool. The selected route determines whether output is preserved, narrated, structured, treated as dangerous, or sent through the Condense path.

OUTPUT_COMPRESSION

On the Condense route, command grammars and output shapes help preserve hazards and outcomes before repetitive structure is collapsed. Actual savings are measured per result.

DIGEST_INJECTION

The dispatcher can append a compact delta envelope for changed process, presence, file, load, and memory signals. Unchanged sections need not be repeated.

304_ELISION

If the agent re-reads a file that the digest indicates is unchanged, the kernel checks the generation table and returns a short "[304] path:gen=N (current)" message instead of the full content.

DRAIN_SNAPSHOT

At the turn boundary, session messages, token counts, and configuration are persisted to `.ostk/drain/<lineage_id>.json`. The snapshot supports crash recovery and hot rehydration.

02 // PROMPT_CACHE

Prompt Caching & Fleet Economics

Measure the route, not a universal percentage.

Prompt caching and output compression can reduce repeated input on routes that support them. The result depends on provider pricing, cache eligibility and lifetime, message stability, model behavior, and harness integration. Needle Bench reports model-and-harness outcomes separately; use a measured combination instead of a blanket savings claim.

STABLE_PREFIX

Supported provider routes can reuse an identical stable prompt prefix. Volatile project and turn data belongs after the cacheable boundary.

ROUTE_SPECIFIC

Cache writes, reads, token accounting, and prices vary across providers, models, API versions, and configured gateways.

PROVIDER_LIFETIME

Cache lifetime and refresh behavior are provider-controlled. A warm prefix is an optimization, not part of ostk's continuity contract.

On compatible routes, ostk positions cache-control boundaries around stable system, preload, tool-definition, and message-history regions. Unsupported routes continue without this provider-side optimization.
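The stable-prefix idea can be sketched as follows. This is a minimal illustration, not ostk's renderer: `PromptParts`, `stable_prefix`, `render`, and `common_prefix_len` are hypothetical names, and the newline-joined layout is an assumption. It shows only that keeping volatile data after the stable blocks leaves the leading bytes identical from turn to turn.

```rust
/// Hypothetical prompt layout: stable blocks first, volatile data last.
pub struct PromptParts<'a> {
    pub base_prompt: &'a str,
    pub identity: &'a str,
    pub agentfile_context: &'a str,
    /// Per-turn data such as a digest envelope (assumed example).
    pub volatile: &'a str,
}

/// Join the stable blocks in a fixed order so the prefix is byte-identical
/// whenever the stable inputs are unchanged.
pub fn stable_prefix(p: &PromptParts<'_>) -> String {
    [p.base_prompt, p.identity, p.agentfile_context].join("\n")
}

/// Full prompt: the stable prefix, then volatile data after the boundary.
pub fn render(p: &PromptParts<'_>) -> String {
    format!("{}\n{}", stable_prefix(p), p.volatile)
}

/// Number of leading bytes two rendered prompts have in common.
pub fn common_prefix_len(a: &str, b: &str) -> usize {
    a.bytes().zip(b.bytes()).take_while(|(x, y)| x == y).count()
}
```

Two turns that differ only in `volatile` share at least `stable_prefix(..).len()` leading bytes, which is the region a provider-side cache could reuse on a compatible route.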

03 // SQUASHER_PIPELINE

Output Compression (The Squasher)

Kernel-mediated command output is routed by category, matched to a command grammar or output shape, and reduced only after hazards and outcomes are identified. Elisions remain visible and telemetry records the actual result.

[Figure: Branching ostk output pipeline with a raw bypass, category-specific handlers, and a condense branch that preserves hazards and outcomes before visible deduplication. Keep tool output useful without claiming one fixed reduction ratio. Scope: shell output routed through ostk; category, grammar, output shape, input size, and raw: true determine the path.]

Signal Before Reduction

Command grammars and output-shape detectors identify hazards, outcomes, and structured diagnostic blocks before deduplication. Unknown shapes fall back conservatively, tiny inputs avoid the full grammar path, and raw: true bypasses compression when exact output matters.

Condense Signal Classification

Lines are evaluated against rules and grouped into:

HAZARD (High Priority): Deprecations, lock timeouts, and warnings. Always preserved.
OUTCOME (Medium Priority): Build metrics, test counts, and exit states. Preserved verbatim.
NOISE (Low Priority): Iterative logs, progress bars, and dividers. Collapsed by consecutive structural deduplication.
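The three-way grouping above can be sketched with toy keyword rules. The keyword lists and the `classify` function are hypothetical stand-ins; the actual rules come from per-command grammars and output-shape detectors, which are not reproduced here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Signal {
    Hazard,
    Outcome,
    Noise,
}

/// Classify one output line. Hazards are checked first so that a line
/// matching both lists is never downgraded.
pub fn classify(line: &str) -> Signal {
    // Assumed keyword lists, for illustration only.
    const HAZARD: [&str; 3] = ["warning", "deprecat", "timeout"];
    const OUTCOME: [&str; 3] = ["finished", "test result", "exit"];

    let lower = line.to_lowercase();
    if HAZARD.iter().any(|k| lower.contains(*k)) {
        Signal::Hazard
    } else if OUTCOME.iter().any(|k| lower.contains(*k)) {
        Signal::Outcome
    } else {
        Signal::Noise
    }
}
```

Only lines classified as `Signal::Noise` would be candidates for collapsing; hazards and outcomes pass through unchanged.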

Consecutive Structural Deduplication

Consecutive lines that begin with the same first token are compared using normalized edit distance. The current implicit path collapses a run when distance is below 0.4 (roughly similarity above 0.6), generalizing dynamic tokens with tags such as {hash}, {path}, and {ver}. Every collapsed run stays visible as [⋯ N similar lines].
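A minimal sketch of this collapsing step, under stated assumptions: distance is Levenshtein distance divided by the longer line's length, each line is compared against the first line of its run, and the `{hash}`, `{path}`, and `{ver}` tag generalization is left out. The function names are illustrative, not ostk's API.

```rust
/// Levenshtein distance over Unicode scalar values (two-row dynamic program).
pub fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            cur.push((prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Edit distance divided by the longer length, so the result lies in [0, 1].
pub fn normalized_distance(a: &str, b: &str) -> f64 {
    let longest = a.chars().count().max(b.chars().count());
    if longest == 0 {
        0.0
    } else {
        levenshtein(a, b) as f64 / longest as f64
    }
}

fn first_token(line: &str) -> Option<&str> {
    line.split_whitespace().next()
}

/// Collapse each run of consecutive similar lines into its first line plus a
/// visible marker that counts the elided lines.
pub fn collapse_runs(lines: &[&str], max_distance: f64) -> Vec<String> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < lines.len() {
        let head = lines[i];
        let mut j = i + 1;
        while j < lines.len()
            && first_token(head).is_some()
            && first_token(lines[j]) == first_token(head)
            && normalized_distance(head, lines[j]) < max_distance
        {
            j += 1;
        }
        out.push(head.to_string());
        let collapsed = j - i - 1;
        if collapsed > 0 {
            out.push(format!("[⋯ {} similar lines]", collapsed));
        }
        i = j;
    }
    out
}
```

Calling `collapse_runs(&lines, 0.4)` on a run of `Compiling …` lines keeps the first one and replaces the rest with a single marker, while a following line with a different first token starts a new run.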

ROUTE_CATEGORIES

Route	Purpose	Example commands	Handling
CONDENSE	Verbose output → summary	cargo build, npm install	Implicit dedup + hazard/outcome filtering
NARRATE	Silent commands → execution narration	cp, mv, mkdir	"→ cp: a.txt → b.txt (ok)"
PASSTHROUGH	Verbatim-oriented preservation	cat, grep, jq	Original output + route metadata
STRUCTURED	Known structure → formatted summary	ls, git status, docker ps	Parse the current command shape
DANGEROUS	Destructive ops	rm -rf, git push --force	CRITICAL/WARNING severity with context
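Route selection for the example commands listed above can be sketched as a match on the command's leading tokens. The `Route` enum and `route_for` function are hypothetical; the real dispatcher also considers grammar, output shape, input size, and raw: true, none of which is modeled here. Commands outside the listed examples return `None` rather than guessing a route.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Route {
    Condense,
    Narrate,
    Passthrough,
    Structured,
    Dangerous,
}

/// Map a command line to a route using only the example commands above.
/// Destructive forms are checked first so they are never treated as benign.
pub fn route_for(command: &str) -> Option<Route> {
    let tokens: Vec<&str> = command.split_whitespace().collect();
    match tokens.as_slice() {
        ["rm", rest @ ..] if rest.contains(&"-rf") => Some(Route::Dangerous),
        ["git", "push", rest @ ..] if rest.contains(&"--force") => Some(Route::Dangerous),
        ["cp" | "mv" | "mkdir", ..] => Some(Route::Narrate),
        ["cat" | "grep" | "jq", ..] => Some(Route::Passthrough),
        ["ls", ..] | ["git", "status", ..] | ["docker", "ps", ..] => Some(Route::Structured),
        ["cargo", "build", ..] | ["npm", "install", ..] => Some(Route::Condense),
        _ => None, // not covered by this sketch
    }
}
```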

Semantic Deduplication via Potion-Base

Optional semantic clustering can identify lines that mean the same thing even when their structure differs. Its current similarity threshold is 0.85; this is separate from the lower-threshold consecutive structural deduplication above. Install the optional local model with `ostk embeddings download`.

Semantic deduplication runs on Metal (Apple Silicon), with CPU fallbacks via Wgpu. Enable it with `--features embeddings`.
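The 0.85 threshold check can be illustrated with plain cosine similarity over two embedding vectors. This sketch assumes the vectors have already been produced by the embedding model and that the comparison is inclusive (`>=`); neither detail is confirmed by this page, and the function names are illustrative.

```rust
/// Cosine similarity of two equal-length vectors.
/// Returns None for mismatched lengths, empty input, or a zero vector.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// True when two line embeddings are similar enough to cluster together.
pub fn is_semantic_duplicate(a: &[f32], b: &[f32], threshold: f32) -> bool {
    cosine_similarity(a, b).map_or(false, |s| s >= threshold)
}
```

With `threshold = 0.85`, nearly parallel vectors cluster together while vectors at a 45-degree angle (similarity about 0.71) do not.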

DEDUP_PATTERN

RAW TERMINAL STREAM

Compiling ostk v3.0.0
Compiling tokio v1.38.0
Compiling serde v1.0.203
... 47 more crate compilations
warning: unused import `std::io`
warning: 2 warnings generated
Finished release in 42.3s

COMPRESSED CONTEXT

Compiling ostk v3.0.0
[⋯ 49 similar lines]
warning: unused import `std::io`
warning: 2 warnings generated
Finished release in 42.3s

04 // READ_DEFENSE

Digest Envelopes & Read Prevention

To keep agents from repeatedly re-reading files to check for external updates, the kernel appends a status envelope of up to five lines to tool responses on supported routes.

EXAMPLE_DIGEST

[procs] builder:active:2m:45% reviewer:stale:5m:78%
[presence] :arrived(builder)
[files] src/main.rs:gen=12:reviewer:3m
[loadavg] needles=3 p0=1 fleet=2/2 nudges=0
[meminfo] ctx=45% used=360k/800k buffers=2 calls=14

Layer 1: Digest Suppression

Files that have not changed are omitted from the [files] block. With no entry for a file, the agent has no reason to re-read it, so the lookup never happens.

Layer 2: 304 Elision

If the agent attempts to read a file anyway, the kernel queries the generation table. If no writes have occurred since the agent's last read, the kernel overrides the read and returns [304] path:gen=N (current).
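Both layers can be sketched around a single generation table. This is an illustration under assumptions: `GenerationTable`, its in-memory maps, and the per-agent "last generation read" bookkeeping are hypothetical, and the [files] entries are simplified to `path:gen=N` without the writer and age fields shown in the example digest.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum ReadResult {
    Full { generation: u64, content: String },
    NotModified(String),
}

#[derive(Default)]
pub struct GenerationTable {
    /// path -> (generation, content)
    files: HashMap<String, (u64, String)>,
    /// (agent, path) -> generation that agent last read
    last_read: HashMap<(String, String), u64>,
}

impl GenerationTable {
    /// Record a write and bump the file's generation.
    pub fn write(&mut self, path: &str, content: &str) -> u64 {
        let entry = self.files.entry(path.to_string()).or_insert((0, String::new()));
        entry.0 += 1;
        entry.1 = content.to_string();
        entry.0
    }

    /// Layer 1: list only files whose generation the agent has not seen.
    pub fn changed_files(&self, agent: &str) -> Vec<String> {
        let mut out = Vec::new();
        for (path, (generation, _)) in &self.files {
            let key = (agent.to_string(), path.clone());
            if self.last_read.get(&key) != Some(generation) {
                out.push(format!("{}:gen={}", path, generation));
            }
        }
        out.sort();
        out
    }

    /// Layer 2: answer a redundant read with a short 304 message.
    pub fn read(&mut self, agent: &str, path: &str) -> Option<ReadResult> {
        let (generation, content) = self.files.get(path)?;
        let key = (agent.to_string(), path.to_string());
        if self.last_read.get(&key) == Some(generation) {
            let msg = format!("[304] {}:gen={} (current)", path, generation);
            return Some(ReadResult::NotModified(msg));
        }
        self.last_read.insert(key, *generation);
        Some(ReadResult::Full { generation: *generation, content: content.clone() })
    }
}
```

After an agent's first read, `changed_files` omits the file and a repeated `read` returns the 304 message; a later `write` bumps the generation, and the file reappears in both paths.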

05 // ENRICHMENT_HOOKS

Driver Enrichment Hooks & Diagnostics

Registered FCP drivers (such as fcp-rust, which wraps LSP) intercept file operations to inject compilation diagnostics and symbol outlines, and to manage type-safe multi-file refactoring.

Inline Diagnostic Injection

Diagnostics are injected as virtual code comments inside the file read response. The agent receives compiler errors inline with the source code, so it does not need a separate compile step to discover them.

enriched read response

fn main() {
    let x: i32 = "5";  // [error] mismatched types (E0308)
    println!("hello");
}
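The injection step can be sketched as appending a comment to each source line that has a diagnostic. The `Diagnostic` struct and `annotate` function are hypothetical, and the `// [severity] message (code)` layout is copied from the example above rather than from a documented format.

```rust
pub struct Diagnostic {
    /// 1-based line number the diagnostic applies to.
    pub line: usize,
    pub severity: &'static str,
    pub message: String,
    pub code: String,
}

/// Return `source` with each diagnostic appended as a trailing comment on
/// the line it refers to. Lines without diagnostics are left untouched.
pub fn annotate(source: &str, diagnostics: &[Diagnostic]) -> String {
    source
        .lines()
        .enumerate()
        .map(|(idx, text)| {
            let mut line = text.to_string();
            for d in diagnostics.iter().filter(|d| d.line == idx + 1) {
                line.push_str(&format!("  // [{}] {} ({})", d.severity, d.message, d.code));
            }
            line
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Because the annotations are virtual, a real implementation would need to keep them out of the bytes written back to disk; this sketch covers only the read path.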

PROGRESSIVE_DISCLOSURE

304 (unchanged)

None. The model already possesses the file state.

First read

Errors only (severity >= error). Hints and warnings are excluded to conserve tokens.

Explicit enrich=full

Full diagnostic set, symbol outlines, structural annotations, and reference trees.

Post-edit checks

Always enriched. The kernel calls drivers immediately after an edit to check for breakages.

Type-Safe Refactoring via LSP

Drivers expose symbol graphs to support multi-file refactoring verbs. For each refactoring, the driver computes the full set of edits and the kernel applies them atomically under optimistic concurrency control (OCC) compare-and-swap (CAS) rules.

RENAME

Updates a symbol and all of its references across the codebase, avoiding the false matches of regex-based search and replace.

EXTRACT_FUNCTION

Selects code, extracts it, and computes parameters and return structures.

INLINE

Inlines functions or variables, validating that visibility and scopes are preserved.
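The atomic, compare-and-swap application of a multi-file edit set can be sketched in two phases: validate every expected generation, then commit. `Edit`, `Conflict`, `Store`, and `apply_atomically` are hypothetical names, and an in-memory map stands in for the kernel's file state.

```rust
use std::collections::HashMap;

/// One file edit computed by a driver, pinned to the generation it was based on.
pub struct Edit {
    pub path: String,
    pub expected_gen: u64,
    pub new_content: String,
}

#[derive(Debug, PartialEq, Eq)]
pub struct Conflict {
    pub path: String,
    pub expected: u64,
    pub actual: u64,
}

/// path -> (generation, content); a stand-in for real file state.
pub type Store = HashMap<String, (u64, String)>;

/// Apply all edits or none of them.
pub fn apply_atomically(store: &mut Store, edits: &[Edit]) -> Result<(), Conflict> {
    // Phase 1: compare. Any stale generation aborts before a single write.
    for e in edits {
        let actual = store.get(&e.path).map_or(0, |(generation, _)| *generation);
        if actual != e.expected_gen {
            return Err(Conflict { path: e.path.clone(), expected: e.expected_gen, actual });
        }
    }
    // Phase 2: swap. Every check passed, so commit and bump generations.
    for e in edits {
        store.insert(e.path.clone(), (e.expected_gen + 1, e.new_content.clone()));
    }
    Ok(())
}
```

If another writer touched any file after the driver computed its edits, the whole rename or extraction fails with a `Conflict` and no file is left half-refactored.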

EMBEDDINGS_VS_DRIVERS

Signal	Embeddings (Breadth)	Drivers (Precision)
Related function	~0.75 cosine similarity	Exact call graph mapping
Relevant file	Shared vocabulary/topics	Direct import dependency
Dead code detection	Cannot determine	Zero reference symbols
Call chain path	Co-occurrence heuristics	Exact static call-stack traversal
Test coverage scope	Cannot determine	Test target reference mapping

SAFETY_AND_TIMEOUTS

Diagnostic Limit: Diagnostic messages are truncated to 256 characters; binary codes are stripped.

Path Sandbox: Drivers are restricted to authorized workspace paths; leaks outside root are blocked.

Circuit Breaker: Three consecutive driver timeouts trigger a 5-minute cooldown period.

Timeout (Warm / Cold): 100ms for warm calls (falls back to raw read); 2000ms for cold start.

Refactor Timeout: 10s. Fails explicitly instead of falling back to raw edits.
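The circuit breaker and the diagnostic limit can be sketched with the numbers listed above. `CircuitBreaker` and `truncate_diagnostic` are hypothetical names, and two details are assumptions: a success resets the timeout streak, and "binary codes" is read as Unicode control characters.

```rust
use std::time::{Duration, Instant};

pub const MAX_CONSECUTIVE_TIMEOUTS: u32 = 3;
pub const COOLDOWN: Duration = Duration::from_secs(5 * 60);
pub const MAX_DIAGNOSTIC_CHARS: usize = 256;

#[derive(Default)]
pub struct CircuitBreaker {
    consecutive_timeouts: u32,
    open_until: Option<Instant>,
}

impl CircuitBreaker {
    /// Whether a driver call may be attempted at `now`.
    pub fn allows(&self, now: Instant) -> bool {
        match self.open_until {
            Some(deadline) => now >= deadline,
            None => true,
        }
    }

    /// Assumption: any successful call resets the streak.
    pub fn record_success(&mut self) {
        self.consecutive_timeouts = 0;
    }

    /// The third consecutive timeout opens the breaker for the cooldown.
    pub fn record_timeout(&mut self, now: Instant) {
        self.consecutive_timeouts += 1;
        if self.consecutive_timeouts >= MAX_CONSECUTIVE_TIMEOUTS {
            self.open_until = Some(now + COOLDOWN);
            self.consecutive_timeouts = 0;
        }
    }
}

/// Drop control characters, then keep at most 256 characters.
pub fn truncate_diagnostic(message: &str) -> String {
    message
        .chars()
        .filter(|c| !c.is_control())
        .take(MAX_DIAGNOSTIC_CHARS)
        .collect()
}
```

Passing `now` explicitly keeps the breaker deterministic under test; a caller would pass `Instant::now()`.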

06 // SKILLS_BUNDLE

Capabilities & the SKILL Directive

The SKILL directive declares named capability bundles within an Agentfile. The parser extracts these into a simple vector, which is resolved at spawn time.

Format: SKILL <bundle_name>. Multiple declarations compile into Agentfile.skills: Vec<String>. Missing skill arguments trigger a ParseError::MissingArgument error.

At spawn time, the harness maps these identifiers to skill packages (e.g. resolving skills/<name>/SKILL.md) to append system prompt instructions, configure required tools, and establish style conventions.

agents/fixer.af

FROM claude-sonnet-4-6
PROMPT You fix bugs and write tests.
SKILL tdd
SKILL commit
TOOL shell
TOOL file:edit
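Extraction of SKILL directives can be sketched as a line scan. The `MissingArgument` variant name comes from the text above, but its fields, the `parse_skills` function, and the `skill_path` helper are hypothetical; the path layout follows the `skills/<name>/SKILL.md` example.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    /// Fields are illustrative; only the variant name appears in the docs.
    MissingArgument { line: usize, directive: String },
}

/// Collect every `SKILL <bundle_name>` declaration in order of appearance.
pub fn parse_skills(agentfile: &str) -> Result<Vec<String>, ParseError> {
    let mut skills = Vec::new();
    for (idx, raw) in agentfile.lines().enumerate() {
        let mut parts = raw.split_whitespace();
        if parts.next() != Some("SKILL") {
            continue; // other directives are ignored by this sketch
        }
        match parts.next() {
            Some(name) => skills.push(name.to_string()),
            None => {
                return Err(ParseError::MissingArgument {
                    line: idx + 1,
                    directive: "SKILL".to_string(),
                })
            }
        }
    }
    Ok(skills)
}

/// Spawn-time resolution of a skill name to its package file.
pub fn skill_path(name: &str) -> String {
    format!("skills/{}/SKILL.md", name)
}
```

For the `agents/fixer.af` example, this yields `["tdd", "commit"]`, which would resolve to `skills/tdd/SKILL.md` and `skills/commit/SKILL.md`.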

07 // PULL_MODEL

Task Eligibility & the WORK Directive

Rather than a push-based routing engine, ostk implements a pull-based task architecture. Agents declare task eligibility using affinity masks defined via the WORK directive.

Format: WORK <expr> [<expr>...]. Multiple expressions on a single line are space-separated (evaluated as logical AND). An Agentfile can contain at most one WORK directive.

Parsed into WorkFilter containing a list of match expressions.
If omitted, the agent defaults to work: None, indicating it is eligible to pull any task.
Declaring multiple WORK directives triggers a ParseError::MultipleWork error.

EXPRESSION_OPERATORS

Operator	Example	Meaning
=	priority=P0	Exact match constraint.
>=	priority>=P1	Lower bound mapping (P0 < P1 < P2 < P3).
<=	priority<=P2	Upper bound mapping.
=a,b	tags=rust,bugfix	Comma-separated list (matches if the task has ANY listed tag).

agents/rust-worker.af

FROM claude-sonnet-4-6
WORK tags=rust,bugfix priority>=P1
TOOL shell
TOOL file:edit
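The WORK rules above can be sketched as a small parser and matcher. The `MultipleWork` variant name comes from the text; `InvalidExpr`, `Expr`, `Task`, `parse_work`, and `eligible` are hypothetical. Priorities compare with the stated ordering P0 < P1 < P2 < P3, expressions on the line are ANDed, and a missing directive means any task is eligible.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Priority { P0, P1, P2, P3 } // derived Ord: P0 < P1 < P2 < P3

#[derive(Debug, PartialEq, Eq)]
pub enum Expr {
    PriorityEq(Priority),
    PriorityGe(Priority),
    PriorityLe(Priority),
    TagsAny(Vec<String>),
}

#[derive(Debug, PartialEq, Eq)]
pub enum ParseError { MultipleWork, InvalidExpr(String) }

pub struct Task { pub priority: Priority, pub tags: Vec<String> }

fn parse_priority(s: &str) -> Option<Priority> {
    match s {
        "P0" => Some(Priority::P0),
        "P1" => Some(Priority::P1),
        "P2" => Some(Priority::P2),
        "P3" => Some(Priority::P3),
        _ => None,
    }
}

fn parse_expr(s: &str) -> Result<Expr, ParseError> {
    let bad = || ParseError::InvalidExpr(s.to_string());
    // The two-character operators are tested before plain `=`.
    if let Some(v) = s.strip_prefix("priority>=") {
        parse_priority(v).map(Expr::PriorityGe).ok_or_else(bad)
    } else if let Some(v) = s.strip_prefix("priority<=") {
        parse_priority(v).map(Expr::PriorityLe).ok_or_else(bad)
    } else if let Some(v) = s.strip_prefix("priority=") {
        parse_priority(v).map(Expr::PriorityEq).ok_or_else(bad)
    } else if let Some(v) = s.strip_prefix("tags=") {
        Ok(Expr::TagsAny(v.split(',').map(str::to_string).collect()))
    } else {
        Err(bad())
    }
}

/// Ok(None) means no WORK directive: the agent may pull any task.
pub fn parse_work(agentfile: &str) -> Result<Option<Vec<Expr>>, ParseError> {
    let mut filter = None;
    for line in agentfile.lines() {
        let mut tokens = line.split_whitespace();
        if tokens.next() != Some("WORK") {
            continue;
        }
        if filter.is_some() {
            return Err(ParseError::MultipleWork);
        }
        filter = Some(tokens.map(parse_expr).collect::<Result<Vec<_>, _>>()?);
    }
    Ok(filter)
}

/// Every expression must match (logical AND); tags match on ANY listed tag.
pub fn eligible(filter: Option<&[Expr]>, task: &Task) -> bool {
    filter.map_or(true, |exprs| {
        exprs.iter().all(|e| match e {
            Expr::PriorityEq(p) => task.priority == *p,
            Expr::PriorityGe(p) => task.priority >= *p,
            Expr::PriorityLe(p) => task.priority <= *p,
            Expr::TagsAny(tags) => tags.iter().any(|t| task.tags.contains(t)),
        })
    })
}
```

Under this reading, `WORK tags=rust,bugfix priority>=P1` accepts a P2 task tagged `rust` and rejects both a P0 task tagged `rust` and a P1 task tagged only `docs`.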

08 // CONTEXT_PRESSURE

Context Pressure & Successor Handoff

To preserve prompt caching, the kernel avoids in-turn context compaction. Instead, when context thresholds are crossed, the agent initiates a clean, structured handoff to a fresh successor process.

70% Threshold (AGING)

The kernel signals the AGING state and begins precomputing the handoff registry while the current task loop continues uninterrupted.

90% Threshold (DYING)

State advances to DYING. Further tool calls are blocked and handoff payloads are finalized. A sudden token jump that crosses 90% skips AGING and enters DYING directly.

Handoff (DRAINING/DEAD)

A single finalization turn (DRAINING) commits the handoff to disk, transitioning the session to DEAD. A fresh successor rehydrates the handoff state.
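The pressure states can be sketched as a small state machine. The AGING, DYING, DRAINING, and DEAD names and the 70% and 90% thresholds come from the text above; the initial `Active` state, the inclusive comparisons, and the function names are assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lifecycle {
    Active, // hypothetical name for the state below the 70% threshold
    Aging,
    Dying,
    Draining,
    Dead,
}

pub const AGING_PCT: u64 = 70;
pub const DYING_PCT: u64 = 90;

/// Advance the state after a token-usage update. States never move backward.
pub fn on_usage(state: Lifecycle, used_tokens: u64, limit_tokens: u64) -> Lifecycle {
    let pct = used_tokens.saturating_mul(100) / limit_tokens.max(1);
    match state {
        // A jump straight past 90% skips AGING.
        Lifecycle::Active | Lifecycle::Aging if pct >= DYING_PCT => Lifecycle::Dying,
        Lifecycle::Active if pct >= AGING_PCT => Lifecycle::Aging,
        other => other,
    }
}

/// Tool calls are blocked from DYING onward.
pub fn tool_calls_allowed(state: Lifecycle) -> bool {
    matches!(state, Lifecycle::Active | Lifecycle::Aging)
}

/// DYING -> DRAINING (the finalization turn) -> DEAD (handoff on disk).
pub fn advance_handoff(state: Lifecycle) -> Lifecycle {
    match state {
        Lifecycle::Dying => Lifecycle::Draining,
        Lifecycle::Draining => Lifecycle::Dead,
        other => other,
    }
}
```

With the example digest's `used=360k/800k` (45%), the state stays `Active`; 560k of 800k reaches AGING, and 720k of 800k reaches DYING whether or not AGING was observed first.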