ostk-cache

An optimizing wire proxy for OpenAI and Anthropic client payloads. Intercepts outgoing requests to enforce size boundaries, inject cache markers, and dynamically compress context.

License: AGPL-3.0 | Port: 8080 (default) | Scheme: HTTP

ostk-cache is a middleware layer that runs between your AI agent (such as Claude Code or Cursor) and the model provider's API endpoint. It runs on your local machine, where it intercepts outgoing API requests, compresses redundant codebase references, and maintains a local cache history to maximize token savings.

Integration Example

Redirect your client requests to the local proxy. For Claude Code (whose source exposure date is March 31, 2026), point the ANTHROPIC_BASE_URL environment variable at the proxy address:

export ANTHROPIC_BASE_URL=http://127.0.0.1:8080

The proxy can run in one of four processing modes, configured via .ostk/config:

PASSTHROUGH

Forwards payloads byte-identically to the upstream provider. Acts as a silent recorder logging metadata to ledger.jsonl.

MUTATE

Injects prompt cache boundaries, inserts HUD diagnostic overlays into the assistant prompt, and strips redundant cache_control markers that conflict with local limits.

REBUILD

L1 Cache Rebuild. Resolves file paths inside user prompt scripts and swaps them with local filesystem caches if the content hasn't changed.

REBUILD_KERNEL

L2 Federated Rebuild. Communicates via IPC socket with the main ostk daemon. Queries the shared workspace graph to inject highly compressed structural summaries of changed files.
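
As a rough illustration of the cache boundary injection that MUTATE mode performs, the sketch below places a single cache_control breakpoint on the last system block of an Anthropic Messages request and removes any earlier markers on system blocks. The function name inject_cache_boundary and the choice of the last system block are assumptions made for illustration; the placement rules ostk-cache actually applies are not described here. The {"type": "ephemeral"} value follows Anthropic's prompt caching format.

```python
import copy


def inject_cache_boundary(payload: dict) -> dict:
    """Return a copy of the payload with one cache breakpoint on the last system block.

    Hypothetical helper: the real proxy may choose a different boundary.
    """
    out = copy.deepcopy(payload)  # never mutate the caller's request
    system = out.get("system")
    # The Messages API accepts `system` as a string or a list of blocks;
    # wrap a non-empty string in a text block so it can carry a marker.
    if isinstance(system, str):
        system = [{"type": "text", "text": system}] if system else []
    if not system:
        return out  # no system prompt, nothing to mark
    # Strip existing markers so exactly one boundary remains on the system prompt.
    for block in system:
        block.pop("cache_control", None)
    system[-1]["cache_control"] = {"type": "ephemeral"}
    out["system"] = system
    return out
```

Working on a deep copy keeps the original request available, which matters if the proxy also needs to record what the client sent.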

To prevent runaway API costs, ostk-cache enforces a default 30 MB soft cap on payload size. When a payload exceeds this cap, the proxy runs a sequential degradation pipeline, moving to the next tier only if the payload is still too large:

Tier A: Tool Result Ejection. Truncates individual tool outputs that exceed the threshold, replacing the removed bytes with truncation notices.
Tier B: Message Pruning. Prunes the oldest assistant tool-use messages together with their corresponding user tool-result messages to free up context window.
Tier C: Tool Definition Dropping. Drops unused tool schema definitions from the system instructions.
Tier D: Hard Rejection. Rejects the payload entirely and returns HTTP 413 (Payload Too Large).
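
The tiers above can be sketched as a small pipeline that applies each step in order and stops as soon as the payload fits. This is a minimal sketch, not the proxy's implementation: the function names, the per-tool-result threshold of 64 KiB, measuring size as serialized JSON bytes, and treating "unused" tools as those never invoked in the conversation are all assumptions. Only string tool results are handled.

```python
import copy
import json

SOFT_CAP_BYTES = 30 * 1000 * 1000  # 30 MB default; decimal megabytes is an assumption
TOOL_RESULT_LIMIT = 64 * 1024      # hypothetical per-tool-result threshold


def size_of(payload):
    """Serialized size in bytes, the measure compared against the cap."""
    return len(json.dumps(payload).encode("utf-8"))


def _blocks(message):
    content = message.get("content")
    return content if isinstance(content, list) else []


def tier_a(payload, limit):
    """Truncate oversized string tool results and append a notice."""
    for msg in payload.get("messages", []):
        for block in _blocks(msg):
            text = block.get("content")
            if block.get("type") == "tool_result" and isinstance(text, str) and len(text) > limit:
                dropped = len(text) - limit
                block["content"] = text[:limit] + f"\n[truncated {dropped} chars]"


def tier_b(payload, cap):
    """Remove the oldest tool-use / tool-result pairs until the payload fits."""
    msgs = payload.get("messages", [])
    i = 0
    while size_of(payload) > cap and i + 1 < len(msgs):
        is_use = any(b.get("type") == "tool_use" for b in _blocks(msgs[i]))
        is_result = any(b.get("type") == "tool_result" for b in _blocks(msgs[i + 1]))
        if msgs[i].get("role") == "assistant" and is_use and is_result:
            del msgs[i:i + 2]  # drop both so no tool_result is left orphaned
        else:
            i += 1


def tier_c(payload):
    """Drop tool definitions that no remaining tool_use block refers to."""
    used = {b.get("name") for m in payload.get("messages", [])
            for b in _blocks(m) if b.get("type") == "tool_use"}
    if "tools" in payload:
        payload["tools"] = [t for t in payload["tools"] if t.get("name") in used]


def degrade(payload, cap=SOFT_CAP_BYTES, tool_limit=TOOL_RESULT_LIMIT):
    """Return (payload, tier). tier is None if untouched, "D" if rejected (HTTP 413)."""
    out = copy.deepcopy(payload)
    if size_of(out) <= cap:
        return out, None
    steps = [("A", lambda: tier_a(out, tool_limit)),
             ("B", lambda: tier_b(out, cap)),
             ("C", lambda: tier_c(out))]
    for tier, step in steps:
        step()
        if size_of(out) <= cap:
            return out, tier
    return None, "D"  # caller should respond with 413 Payload Too Large
```

The returned tier letter corresponds to the deepest tier that had to run, which is one plausible source for a field such as reduction_tier in the ledger.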

Every transaction processed by the proxy is appended to .ostk/memory/ledger.jsonl. This log serves as the single source of truth for billing, cost auditing, and efficiency tracking.

LEDGER ENTRY SCHEMA
{
  "timestamp": "2026-05-22T20:50:50.105Z",
  "mode": "rebuild_kernel",
  "upstream": "api.anthropic.com",
  "tokens_in": 12894,
  "tokens_cached": 88042,
  "tokens_out": 451,
  "cost_usd": 0.3129,
  "reduction_tier": "B"
}
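
Because the ledger is line-delimited JSON, it can be audited with a few lines of code. The sketch below totals the token and cost fields shown in the schema above and derives a cache ratio. The function name summarize_ledger is hypothetical, and the ratio assumes that tokens_in does not already include the cached tokens, which the schema does not state.

```python
import json
from pathlib import Path


def summarize_ledger(lines):
    """Aggregate token counts and cost over an iterable of JSONL lines."""
    totals = {"entries": 0, "tokens_in": 0, "tokens_cached": 0,
              "tokens_out": 0, "cost_usd": 0.0}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        entry = json.loads(line)
        totals["entries"] += 1
        for key in ("tokens_in", "tokens_cached", "tokens_out"):
            totals[key] += entry.get(key, 0)
        totals["cost_usd"] += entry.get("cost_usd", 0.0)
    # Assumption: tokens_in counts only uncached input tokens.
    prompt_tokens = totals["tokens_in"] + totals["tokens_cached"]
    totals["cache_ratio"] = totals["tokens_cached"] / prompt_tokens if prompt_tokens else 0.0
    return totals


if __name__ == "__main__":
    path = Path(".ostk/memory/ledger.jsonl")
    if path.exists():
        with path.open(encoding="utf-8") as fh:
            print(summarize_ledger(fh))
```

Passing any iterable of lines, rather than a path, keeps the function easy to run against a filtered subset of the ledger, for example a single day's entries.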

Claude Code can also be integrated with ostk-cache through hooks. Once you run the installation command, the proxy receives Claude Code lifecycle events and keeps a log of them:

INSTALL HOOKS
ostk-cache-hooks install

This command registers global hooks that post lifecycle events to /hook/event on the proxy server. Lifecycle logs are recorded locally to .l1.5/hooks.jsonl.