Architecture

This document provides a mental model of Sage Protocol's architecture — what the components are, why they exist, and how they relate to each other. It's written for contributors and coding agents who need to navigate the codebase and understand where changes should land.

For the full protocol narrative, see Home. For user-facing surface rationale, see Tooling and Surfaces.

User-facing shorthand: Sage is a trust and distribution layer for agent capability. The architecture exists to make that sentence real.


The Core Idea (One Sentence)

On-chain governance decides what is canonical. Content-addressed artifacts carry the capability itself. Everything else exists to publish, index, materialize, and consume that canon.

This is the organizing principle. If you understand it, you can predict where most features should live: changes to what is canonical happen in contracts; changes to how canonicity is expressed happen in the worker, subgraph, and CLI.
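A minimal sketch of that split, with hypothetical names (`LibraryPointer` and `resolveCanonical` are illustrative, not actual Sage types):

```typescript
// Hypothetical shapes, illustrative only -- not the actual Sage contract ABI.
// Canonical state reduces to a stable pointer: who governs the library, and
// which content-addressed manifest is current.
interface LibraryPointer {
  dao: string;         // governing DAO address
  libraryId: string;   // stable library identifier
  manifestCid: string; // content address of the current manifest
}

// Everything downstream (subgraph, worker, CLI) only re-expresses this record.
function resolveCanonical(
  chainState: Map<string, LibraryPointer>,
  libraryId: string
): LibraryPointer | undefined {
  return chainState.get(libraryId);
}
```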


Why This Architecture

We chose content-addressed storage (IPFS CIDs) over on-chain content storage because prompt libraries are too large and change too frequently for on-chain storage to be practical. But we need governance to be on-chain because that's where trust lives — you can verify who approved what, when, and under what authority.

The result is a hybrid: on-chain for pointers and provenance, off-chain for content and indexing. This is a deliberate tradeoff: the worker and subgraph become UX dependencies (not trust anchors), and availability becomes partly an ops problem. But the alternative (fully on-chain) would make the protocol too expensive and slow to be usable.


Components and What They Do

Contracts (Canonical State)

Location: contracts/, with deployment tooling in packages/contracts/

Contracts are the source of truth for ownership, governance, and stable pointers (e.g., "this DAO's library → manifest CID"). They emit events that the rest of the system indexes.

We use contracts sparingly — only for things that need on-chain trust guarantees. Everything else is pushed off-chain to keep gas costs low and iteration speed high.

Subgraph (Indexing Layer)

Location: subgraph/

The subgraph indexes on-chain events into a queryable dataset so the rest of the system can do fast reads without expensive chain scans. It powers governance visibility, library pointer resolution, discovery indexes, and reputation tracking.

The subgraph is eventually consistent with the chain. User-facing systems should assume minutes of propagation delay between "transaction mined" and "visible in discovery." See Subgraph for details.
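The practical consequence is a polling pattern on user-facing paths. A hedged sketch (`waitForIndexed` and its parameters are illustrative, not part of any Sage package):

```typescript
// Poll an index read until the expected row appears, rather than assuming a
// mined transaction is immediately visible. `queryIndexed` stands in for any
// subgraph query.
async function waitForIndexed<T>(
  queryIndexed: () => Promise<T | null>,
  opts = { attempts: 10, delayMs: 1000 }
): Promise<T> {
  for (let i = 0; i < opts.attempts; i++) {
    const result = await queryIndexed();
    if (result !== null) return result;
    await new Promise((resolve) => setTimeout(resolve, opts.delayMs));
  }
  throw new Error("not indexed within polling window");
}
```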

IPFS Worker (Materialization Layer)

Location: packages/ipfs-worker/

The worker is the edge "view + cache" layer. It reads canonical pointers from the protocol, fetches content from IPFS/R2, and serves developer-friendly interfaces (HTTP discovery, git smart HTTP for installs, marketplace feeds).

The worker is not the source of truth — it materializes and caches state from the chain and IPFS. But it's the primary surface for consumers, which makes it operationally critical. See Worker for details.
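One way to picture "materializes state": the worker derives its routes from canonical data rather than owning any data itself. A sketch under assumed names (`toDiscoveryRoutes` and the route shape are illustrative, not the worker's actual URL scheme):

```typescript
interface ManifestEntry { name: string; cid: string }

// Derive developer-friendly discovery routes from a manifest. Each route is a
// materialized view: if the manifest changes, the routes are rebuilt; the CID
// remains the only authoritative value.
function toDiscoveryRoutes(
  libraryId: string,
  entries: ManifestEntry[]
): Record<string, string> {
  const routes: Record<string, string> = {};
  for (const entry of entries) {
    routes[`/libraries/${libraryId}/prompts/${entry.name}`] = entry.cid;
  }
  return routes;
}
```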

CLI (Execution Surface)

Location: packages/sage/ (Rust)

The CLI is the most complete workflow surface — publishing, governance, installs, and local project setup. It's designed for both humans and agents, and it scripts well.

Web App (User UX)

Location: packages/sage-web-app/

The web app provides discovery and governance visibility for non-CLI users. It proxies worker and subgraph endpoints under packages/sage-web-app/src/app/api/*.

SDK (Integration Surface)

Location: packages/sdk/

Shared clients and utilities used by multiple Sage surfaces (web app, scripts, some CLI helpers). The SDK exists to reduce drift across surfaces — one place for shared protocol logic.


End-to-End Flows

Publish (author → CID → on-chain pointer → indexed → served)

  1. Author prompt/skill content locally
  2. Upload content → CID(s)
  3. Build/update library manifest → manifest CID
  4. Update on-chain pointer (operator execution or governance proposal)
  5. Subgraph indexes the event; worker updates KV/indexes
  6. Consumers discover/install via worker (HTTP + git routes)
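Steps 2-3 can be sketched as follows. Real IPFS CIDs are multihash-encoded (CIDv1); a bare sha-256 digest stands in here so the example stays dependency-free, and all names are illustrative:

```typescript
import { createHash } from "node:crypto";

// Stand-in for real CID computation (which uses multihash/CIDv1 encoding).
function fakeCid(content: string): string {
  return "sha256-" + createHash("sha256").update(content).digest("hex");
}

interface ManifestEntry { name: string; cid: string }

// Step 3: the manifest lists every prompt's CID and is itself content-addressed,
// so the single on-chain pointer update (step 4) commits the whole library at once.
function buildManifest(
  prompts: Record<string, string>
): { entries: ManifestEntry[]; manifestCid: string } {
  const entries = Object.entries(prompts)
    .map(([name, content]) => ({ name, cid: fakeCid(content) }))
    .sort((a, b) => a.name.localeCompare(b.name)); // deterministic ordering
  const manifestCid = fakeCid(JSON.stringify(entries));
  return { entries, manifestCid };
}
```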

Consume (discover → install → use)

  1. Discover via worker/web app (search/trending/library pages)
  2. Install via worker git endpoints (Claude/Codex ecosystem pattern) or CID fetch
  3. Use locally via Claude Code conventions / Sage skill workflows


Design Constraints (Non-Obvious Invariants)

Three design constraints shape how features should be built:

Canonical pointer is on-chain + CID. The worker is a cache/materializer, not a source of truth. If the worker disagrees with the chain, the chain wins. This constraint ensures that consumers can always verify what they received.
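That verification can be sketched in a few lines. A bare sha-256 digest stands in for real CID verification (which uses multihash encoding), and the function names are illustrative:

```typescript
import { createHash } from "node:crypto";

function digest(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

// A consumer checks the bytes the worker served against the digest implied by
// the on-chain pointer. If this returns false, the chain wins: discard the
// worker's response.
function verifyAgainstPointer(servedContent: string, expectedDigest: string): boolean {
  return digest(servedContent) === expectedDigest;
}
```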

Hot paths should avoid chain RPC. If the worker needs a fact at request time, prefer subgraph → KV over direct RPC scans. Chain RPC is expensive and rate-limited; the subgraph + KV pattern keeps read latency low.
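A sketch of that read tier, with stand-in names (`hotPathRead`, and a `Map` in place of the worker's KV binding):

```typescript
type AsyncLookup = (key: string) => Promise<string | null>;

// Request-time facts come from KV first, then the subgraph; direct chain RPC
// is deliberately absent from this path and reserved for indexing jobs.
async function hotPathRead(
  kv: Map<string, string>, // stand-in for the worker's KV namespace
  subgraph: AsyncLookup,   // stand-in for a subgraph query
  key: string
): Promise<string | null> {
  const hit = kv.get(key);
  if (hit !== undefined) return hit;          // fastest tier
  const indexed = await subgraph(key);        // still cheap, eventually consistent
  if (indexed !== null) kv.set(key, indexed); // backfill for the next request
  return indexed;
}
```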

Manifests are the unit of library state. A manifest describes a versioned library — all the prompts, their CIDs, and their metadata. Avoid "overlay state" that can drift from the manifest unless it's explicitly versioned and reconciled. This keeps library state atomic and auditable.
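"Explicitly versioned and reconciled" can be made concrete: an update produces a complete new manifest (and therefore a new CID) instead of patching shared state in place. Names here are illustrative:

```typescript
interface ManifestEntry { name: string; cid: string }
interface Manifest { version: number; entries: ManifestEntry[] }

// Merge changed entries into a *new* manifest. The old manifest is never
// mutated, so every prior library state remains addressable and auditable.
function updateLibrary(current: Manifest, changed: ManifestEntry[]): Manifest {
  const merged = new Map<string, ManifestEntry>(
    current.entries.map((e): [string, ManifestEntry] => [e.name, e])
  );
  for (const entry of changed) merged.set(entry.name, entry);
  return { version: current.version + 1, entries: [...merged.values()] };
}
```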


Where to Change X (Fast Triage)

If you're changing… → Start here
Worker routes/auth/caching → packages/ipfs-worker/src/router.ts
Worker materialization (git/manifest/skill) → packages/ipfs-worker/src/services/*
Subgraph entities/handlers → subgraph/schema.graphql, subgraph/src/*
CLI UX/commands → packages/sage/crates/cli/src/commands/*
Contract logic → contracts/*
Deploy scripts + address books → packages/contracts/deployment/runbooks/*, packages/contracts/deployment/addresses/*
Web app APIs → packages/sage-web-app/src/app/api/*

How This Connects