Overview
Version: v1.0 (Stable)
Status: Stable, 2026-06-06
License: BUSL-1.1
Maintained by: Fasad Salatov (Unyly)
Sael is a streaming-first protocol for connecting AI agents (LLM clients) to tools (tool servers). It replaces Model Context Protocol (MCP) where MCP cracks under production load: composition, streaming, subscriptions, backpressure, capability security, and multi-agent — all built into the protocol, not bolted on.
▸ v1.0 — stable (2026-06-06)
- ✓ Stability guarantee — no breaking changes through v1.x
- ✓ Federation — server-to-server routing via mesh discovery
- ✓ MessagePack binary frames — opt-in via subprotocol
- ✓ Extended filter — matches/in/notIn, parens, !, arithmetic, nested.field
- ✓ Schema registry — $ref to https://schemas.sael.dev/
- ✓ Python SDK — [email protected]
- ✓ Conformance test suite — 78 tests (Go 35 + TS 21 + Python 22)
- ✓ IETF draft alignment (sael-protocol-00)
▸ v0.2 — production-grade (2026-06-05)
- ✓ Cryptographically signed capability tokens (SCT, HMAC-SHA256)
- ✓ Bearer authentication in handshake
- ✓ Session resume after disconnect (RSM frame)
- ✓ Heartbeat (HBT/HBA) — detect dead connections
- ✓ Tool input validation via JSON Schema (server-side)
- ✓ Cost tracking in RES/END (cost_used per call)
- ✓ Distributed tracing (W3C trace_id/span_id)
- ✓ Tool versioning (name@version syntax)
- ✓ Protocol version field bumped 1 → 2
▸ v0.1 (2026-04-15)
Initial draft: streaming, composition, subscriptions, backpressure, capabilities.
Anthropic shipped MCP in 2024 and it became the de-facto standard for AI tooling. Technically it is JSON-RPC over stdio/SSE, designed for a desktop-app world. Production workloads expose architectural cracks:
✗ No native streaming
Long-running tool calls (LLM streams, file uploads, log tails) are SSE workarounds.
✗ Stateless per call
Every tool invocation rebuilds context. Burns tokens, multiplies latency.
✗ No composition
"Get repos → filter → summarize → post to Slack" — N round-trips, each with full payload.
✗ No subscriptions
Reactive flows (notify on new PR) need external poll loops.
✗ No backpressure
A flood of requests can DoS a tool server with no graceful degradation.
✗ No multi-agent
Agent-to-agent comms (Claude → Gemini) require bespoke bridges.
✗ No capability model
Tools can do anything. Clients can't restrict.
These aren't bugs — they're consequences of choosing JSON-RPC as the substrate. Fixing them requires a new protocol. Sael is that protocol.
✓ Streaming-native
Bidirectional streams are first-class. Every call can stream.
✓ Composable
Pipe syntax (tool₁ | filter | tool₂) — server-side execution.
✓ Stateful sessions
Open a channel, keep context, save tokens and latency.
✓ Subscriptions
Subscribe to events. Server pushes. No polling.
✓ Backpressure
Built-in flow control. Servers throttle gracefully.
✓ Typed
JSON Schema everywhere. Composition is type-checked.
✓ Multi-agent
Agent IDs in the protocol. Direct agent-to-agent calls.
✓ Capability-based
Calls require capability tokens. Agents can't do what they're not allowed.
✓ MCP-compatible
Adapter layer wraps existing MCP servers. Zero migration cost.
Sael uses WebSocket as the primary transport. WebSocket gives bidirectional streaming, runs in every browser and server, and survives mobile network handovers.
ws://server:port/sael
wss://server:port/sael (TLS, recommended)
For server-to-server federation, plain TCP with the same framing is allowed for lower overhead. Optional QUIC support is reserved for a future version (the roadmap schedules it for v1.1).
Every message on the wire is a frame: a 4-byte header followed by a JSON payload.
┌─────────┬─────────┬─────────────────────────────┐
│ version │ kind │ payload (JSON) │
│ 1 byte │ 3 bytes │ variable length │
└─────────┴─────────┴─────────────────────────────┘
- version: protocol version, currently 0x01
- kind: 3-letter ASCII opcode (see below)
- payload: UTF-8 JSON, may be empty
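The header layout above can be sketched in a few lines. This is a minimal illustration, not the reference implementation: the function names `encode_frame` and `decode_frame` are hypothetical, and how the header interacts with WebSocket text frames is not specified in this section.

```python
import json
from typing import Optional, Tuple

PROTOCOL_VERSION = 0x01  # version byte as stated in the field list above


def encode_frame(kind: str, payload: Optional[dict] = None) -> bytes:
    """Build one frame: 1 version byte + 3 ASCII opcode bytes + UTF-8 JSON."""
    opcode = kind.encode("ascii")
    if len(opcode) != 3:
        raise ValueError("kind must be exactly 3 ASCII characters")
    # The payload may be empty, in which case only the 4-byte header is sent.
    body = b"" if payload is None else json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return bytes([PROTOCOL_VERSION]) + opcode + body


def decode_frame(frame: bytes) -> Tuple[int, str, Optional[dict]]:
    """Split a frame into (version, kind, payload); an empty payload yields None."""
    if len(frame) < 4:
        raise ValueError("frame is shorter than the 4-byte header")
    version = frame[0]
    kind = frame[1:4].decode("ascii")
    payload = json.loads(frame[4:].decode("utf-8")) if len(frame) > 4 else None
    return version, kind, payload
```

A receiver would typically check the version byte before dispatching on `kind`.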
Default is WebSocket text frames with JSON. MessagePack is opt-in via the WS subprotocol x-quark-msgpack: frames travel as binary, more compact (−12% on a typical payload, more on numeric-heavy ones). Implemented in the reference server.
Channels
A channel is a persistent stateful connection between a client (AI agent) and a Sael server. It is opened on connect and closed on disconnect.
Within a channel, state is preserved:
- Capability grants (valid for channel lifetime or until revoked)
- Subscriptions (active until explicitly unsubscribed or channel closed)
- Open tool streams (alive until completed or cancelled)
A single channel can carry multiple simultaneous calls, streams, and subscriptions, distinguished by seq.
Sael v0.2 introduces cryptographically signed capability tokens — Sael Capability Tokens (SCT). JWT-like, but spec-defined for Sael.
▸ Token format
qct.v1.<base64url(payload)>.<base64url(signature)>
signature = HMAC-SHA256(secret, "v1." + base64url(payload))
▸ Payload
{
"iss": "https://issuer.example.com",
"sub": "[email protected]",
"iat": 1690000000,
"nbf": 1690000000,
"exp": 1700000000,
"scope": ["github:read:*", "slack:notify:#dev"],
"client_id": "claude-desktop",
"max_cost_usd": 5.00
}
- iss: issuer URL (required)
- sub: subject (user/principal) (required)
- iat: issued at, Unix seconds
- nbf: not before, optional
- exp: expires, Unix seconds (required)
- scope: array of capability strings (required)
- client_id: restricts which client (optional)
- max_cost_usd: ceiling on total cost (optional)
▸ Capability strings
github:read:* # read anything on GitHub
github:write:repo:owner/name # write to specific repo
slack:notify:#general # notify a specific Slack channel
slack:notify:* # notify any Slack channel
*:read # read anything anywhere
A capability "a:b:c" grants an exact match only. Granted as "a:b:c:*", it covers "a:b:c" itself and all of its descendants.
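The matching rule can be sketched as a segment-by-segment comparison. This is an illustration under stated assumptions: the names `capability_allows` and `is_allowed` are hypothetical, and treating a non-trailing `*` (as in `*:read`) as "any single segment" is a guess at the intended semantics, which the text does not spell out.

```python
from typing import Iterable


def capability_allows(granted: str, required: str) -> bool:
    """Return True if one granted capability string covers the required one."""
    g = granted.split(":")
    r = required.split(":")
    if g[-1] == "*":
        # Trailing wildcard: the prefix must match, anything below it is covered.
        prefix = g[:-1]
        if len(r) < len(prefix):
            return False
        return all(gs == "*" or gs == rs for gs, rs in zip(prefix, r))
    # No trailing wildcard: exact match, segment by segment.
    # Assumption: a "*" in a non-final position matches any single segment.
    if len(g) != len(r):
        return False
    return all(gs == "*" or gs == rs for gs, rs in zip(g, r))


def is_allowed(grants: Iterable[str], required: str) -> bool:
    """A call is allowed if any grant on the channel covers the requirement."""
    return any(capability_allows(grant, required) for grant in grants)
```

For example, a channel holding `github:read:*` passes a tool that declares `github:read`, but not one that declares `github:write:repo:owner/name`.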
▸ Usage in handshake
{
"kind": "HEY",
"v": 2,
"auth": { "type": "bearer", "token": "qct.v1.eyJ..." },
"agent": { "id": "claude-desktop", "kind": "llm", "name": "Claude" }
}
▸ Server verification
- Parse SCT (split by ".")
- Verify HMAC signature
- Check iat <= now, nbf <= now, exp > now
- If client_id set, verify matches agent.id
- Store granted scope as channel capabilities
On failure — ERR { code: "AUTH_INVALID" } + close.
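The token format and the verification steps above can be sketched with the Python standard library. This is a hedged illustration, not the reference server: `mint_sct`, `verify_sct`, and `AuthInvalid` are hypothetical names, the JSON serialization of the payload is an assumption, and a real server would map the exception to `ERR { code: "AUTH_INVALID" }` and close the channel.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import List, Optional


class AuthInvalid(Exception):
    """Hypothetical exception standing in for ERR AUTH_INVALID."""


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: bytes, body: str) -> bytes:
    # signature = HMAC-SHA256(secret, "v1." + base64url(payload))
    return hmac.new(secret, ("v1." + body).encode("ascii"), hashlib.sha256).digest()


def mint_sct(secret: bytes, payload: dict) -> str:
    body = _b64url(json.dumps(payload, separators=(",", ":")).encode("utf-8"))
    return "qct.v1." + body + "." + _b64url(_sign(secret, body))


def verify_sct(secret: bytes, token: str, agent_id: str, now: Optional[float] = None) -> List[str]:
    """Run the verification steps and return the granted scope."""
    now = time.time() if now is None else now
    parts = token.split(".")  # step 1: parse
    if len(parts) != 4 or parts[0] != "qct" or parts[1] != "v1":
        raise AuthInvalid("malformed token")
    body, signature = parts[2], parts[3]
    try:
        given = _b64url_decode(signature)
        if not hmac.compare_digest(given, _sign(secret, body)):  # step 2: HMAC
            raise AuthInvalid("bad signature")
        claims = json.loads(_b64url_decode(body))
    except ValueError as exc:  # covers base64, UTF-8 and JSON decoding errors
        raise AuthInvalid("undecodable token") from exc
    # step 3: iat <= now, nbf <= now, exp > now (exp is required)
    if claims.get("iat", 0) > now or claims.get("nbf", 0) > now:
        raise AuthInvalid("token not yet valid")
    if "exp" not in claims or claims["exp"] <= now:
        raise AuthInvalid("token expired")
    # step 4: client_id, when present, must match agent.id from HEY
    if "client_id" in claims and claims["client_id"] != agent_id:
        raise AuthInvalid("client_id mismatch")
    scope = claims.get("scope")
    if not isinstance(scope, list):
        raise AuthInvalid("missing scope")
    return list(scope)  # step 5: becomes the channel's capabilities
```

The signature is checked before the payload is parsed, so untrusted JSON is never interpreted unless it came from a holder of the secret.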
▸ why it matters
Without signed tokens, an AI agent could claim any capabilities and the server would trust them. SCT means only legitimate issuers can mint tokens. This is the foundation for enterprise/compliance use cases (audit trails, SOC2, GDPR).
▸ client → server
▸ server → client
Every channel starts with a HEY.
{
"kind": "HEY",
"v": 1,
"agent": {
"id": "claude-desktop-3.7-mac",
"kind": "llm",
"name": "Claude Desktop"
},
"supports": ["streaming", "subscribe", "compose", "capabilities"]
}
{
"kind": "HEY",
"v": 1,
"server": {
"id": "github-tools-v2",
"name": "GitHub Tools",
"version": "2.1.0"
},
"supports": ["streaming", "subscribe", "compose", "capabilities"],
"tools": 12,
"topics": 4
}
If versions don't match, the server replies with ERR and closes.
Both sides exchange heartbeats to detect dead connections (firewall timeouts, mobile NAT drops).
// Client (every 30s)
{ "kind": "HBT", "ts": 1700000000 }
// Server
{ "kind": "HBA", "ts": 1700000000 }
▸ client
No HBA in 60s → reconnect
▸ server
No HBT in 90s → close (state held for TTL)
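The two timeouts can be sketched as a small liveness tracker. This is an illustrative sketch: `HeartbeatMonitor` and the helper names are hypothetical, the clock is passed in explicitly so the logic is testable, and echoing `ts` back in HBA is an assumption based on the example frames.

```python
HEARTBEAT_INTERVAL_S = 30   # client sends HBT every 30s
CLIENT_HBA_TIMEOUT_S = 60   # client reconnects after 60s without HBA
SERVER_HBT_TIMEOUT_S = 90   # server closes after 90s without HBT


class HeartbeatMonitor:
    """Tracks when the peer was last heard from."""

    def __init__(self, timeout_s: float, now: float):
        self.timeout_s = timeout_s
        self.last_seen = now

    def on_frame(self, now: float) -> None:
        # Call when the expected heartbeat frame (HBT or HBA) arrives.
        self.last_seen = now

    def is_dead(self, now: float) -> bool:
        return now - self.last_seen > self.timeout_s


def heartbeat_frame(now: float) -> dict:
    return {"kind": "HBT", "ts": int(now)}


def heartbeat_ack(hbt: dict) -> dict:
    # Assumption: the server echoes the client's timestamp.
    return {"kind": "HBA", "ts": hbt["ts"]}
```

A client would build its monitor with `CLIENT_HBA_TIMEOUT_S` and reconnect when `is_dead` turns true; a server would use `SERVER_HBT_TIMEOUT_S` and close the socket while keeping session state for the TTL.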
After disconnect (drop, mobile sleep), client reconnects and sends RSM:
{
"kind": "RSM",
"v": 2,
"session_id": "ses_a7b3c9d1",
"last_seq_received": 42
}
Server:
- If session valid — replays buffered frames with seq > 42, then resumes
- If expired — ERR { code: "SESSION_EXPIRED" }, client falls back to fresh HEY
Subscriptions and capability grants survive resume. Open tool streams are cancelled (clients should re-INV if needed).
Servers MUST buffer the last 64 outgoing frames per session.
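The replay rule can be sketched with a bounded buffer. This is a minimal illustration: `SessionBuffer` is a hypothetical name, and it assumes every buffered frame carries a `seq` that increases with send order, which is how the RSM description reads.

```python
from collections import deque
from typing import List


class SessionBuffer:
    """Keeps the most recent outgoing frames of one session for replay."""

    def __init__(self, capacity: int = 64):
        # deque(maxlen=...) silently drops the oldest frame once full.
        self._frames = deque(maxlen=capacity)

    def record(self, frame: dict) -> None:
        self._frames.append(frame)

    def replay_after(self, last_seq_received: int) -> List[dict]:
        """Frames the client has not seen yet, in original order."""
        return [f for f in self._frames if f["seq"] > last_seq_received]
```

If the client was away for more than 64 frames, the oldest ones are no longer in the buffer, so the replay starts later than `last_seq_received + 1`.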
▸ mobile-friendly
iPhone in pocket → 4G handover → WebSocket drops → user opens app → client auto-RSMs → last 30s of missed push events arrive at once. No data loss, provided the session has not expired and the missed frames fit in the 64-frame buffer.
Tools are registered server-side at startup. Clients discover via LST:
{
"kind": "LST",
"seq": 1,
"tools": [
{
"name": "github.list_repos",
"description": "List repos for a user/org",
"input": { "type": "object", "properties": { "owner": { "type": "string" } } },
"output": { "type": "array", "items": { "$ref": "#/types/Repo" } },
"effects": ["network", "read"],
"cost": { "estimate": 0.0001, "currency": "USD" },
"streaming": true,
"requires_capability": "github:read"
}
]
}
Tool schemas use JSON Schema Draft 2020-12 with three extensions:
- effects: array of pure | read | write | network | money | irreversible | cost
- cost: estimated cost of a call (helps the AI budget)
- requires_capability: the capability required to invoke the tool
▸ One-shot
{
"kind": "INV",
"seq": 2,
"tool": "github.list_repos",
"input": { "owner": "anthropic" }
}
{
"kind": "RES",
"seq": 2,
"output": [{ "name": "claude-code", "stars": 12000, "owner": "anthropic" }]
}
▸ Streaming
When a tool spec advertises streaming: true, results arrive as STR frames and end with END.
// → INV { "seq": 3, "tool": "logs.tail", "input": { "file": "app.log" } }
// ← STR { "seq": 3, "data": { "line": "GET /api 200" } }
// ← STR { "seq": 3, "data": { "line": "GET /api 200" } }
// ← STR { "seq": 3, "data": { "line": "GET /api 500" } }
// ← END { "seq": 3 }
When a tool advertises delta: true, chunks arrive as merge-patches in the delta field (RFC 7386): a snapshot first, then incremental patches. The client keeps local state and applies them, so bandwidth tracks the size of the change, not the size of the state. Measured over 60 updates to a 30-metric dashboard: 28.1 KB full-resend vs 3.6 KB deltas (−87%).
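On the client side, applying the delta field amounts to the JSON Merge Patch algorithm from RFC 7386. The sketch below follows that algorithm; the function name `merge_patch` and the loop that folds deltas into local state are illustrative, not part of any Sael SDK.

```python
from typing import Any


def merge_patch(target: Any, patch: Any) -> Any:
    """Apply a JSON Merge Patch (RFC 7386) and return the new value."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target wholesale.
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # null in a patch means "remove this member".
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


def apply_stream(deltas: list) -> dict:
    """Fold a sequence of STR delta payloads into one state object."""
    state: dict = {}
    for delta in deltas:
        state = merge_patch(state, delta)
    return state
```

Because the first delta is a full snapshot and later ones are patches, the same function handles both without a special case.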
// → INV { "seq": 4, "tool": "demo.dashboard" }
// ← STR { "seq": 4, "delta": { "metric_0": 0, "metric_1": 10, ... } } // snapshot
// ← STR { "seq": 4, "delta": { "metric_3": 42 } } // patch
// ← STR { "seq": 4, "delta": { "metric_7": 99 } } // patch
// ← END { "seq": 4 }
This is Sael's killer feature vs MCP.
A single INV can describe a pipeline. The server executes the whole pipeline, only sending the final result back. No round-trips between steps.
{
"kind": "INV",
"seq": 4,
"pipeline": [
{ "tool": "github.list_repos", "input": { "owner": "anthropic" } },
{ "filter": "stars > 100" },
{ "map": ["name"] },
{ "tool": "slack.notify", "input_bind": { "items": "$prev", "channel": "#dev" } }
]
}
Stages:
- tool: invoke a tool, output flows to next stage
- filter: CEL/SQL-like expression filtering items
- map: project fields
- reduce: aggregate
- input_bind: bind previous stage output into the next tool's input
- parallel: fan-out; N independent sub-pipelines run concurrently, result is an ordered array
{
"kind": "INV",
"seq": 5,
"pipeline": [
{ "tool": "data.fetch", "input": { "n": 20 } },
{ "parallel": [
[ { "tool": "enrich.summary", "input_bind": { "data": "$prev" } } ],
[ { "tool": "enrich.sentiment", "input_bind": { "data": "$prev" } } ],
[ { "tool": "enrich.translate", "input_bind": { "data": "$prev" } } ]
] }
]
}
Each branch starts from the same $prev and runs in its own goroutine. Wall-clock = the slowest branch, not the sum. Measured (6 branches × 50 ms): 7 round-trips / 312 ms on MCP vs 1 round-trip / 53 ms on Sael.
▸ effect
Collapses N round-trips into one. In the measurement above that is roughly 6× lower latency (312 ms vs 53 ms); real workloads are reported at about 10×.
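Server-side pipeline execution can be sketched as a loop that threads `$prev` through the stages. This is a toy model under loud assumptions: `run_pipeline` and the `tools` registry of plain Python callables are hypothetical, the filter stage only understands `field op number` (the real filter language is richer), `reduce` is omitted, and threads stand in for the goroutines of the reference server.

```python
import operator
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

COMPARATORS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
               "<=": operator.le, "==": operator.eq, "!=": operator.ne}


def run_pipeline(stages: List[dict], tools: Dict[str, Callable[..., Any]], prev: Any = None) -> Any:
    """Run stages in order; each stage sees the previous output as $prev."""
    for stage in stages:
        if "tool" in stage:
            args = dict(stage.get("input", {}))
            for name, source in stage.get("input_bind", {}).items():
                # "$prev" binds the previous output; other values pass through.
                args[name] = prev if source == "$prev" else source
            prev = tools[stage["tool"]](**args)
        elif "filter" in stage:
            # Assumption: only "<field> <op> <number>" is supported here.
            field, op, literal = stage["filter"].split()
            keep = COMPARATORS[op]
            prev = [item for item in prev if keep(item[field], float(literal))]
        elif "map" in stage:
            fields = stage["map"]
            prev = [{f: item[f] for f in fields} for item in prev]
        elif "parallel" in stage:
            branches = stage["parallel"]
            start = prev  # every branch starts from the same $prev
            with ThreadPoolExecutor(max_workers=max(1, len(branches))) as pool:
                futures = [pool.submit(run_pipeline, b, tools, start) for b in branches]
                # Collect in submission order so the result array is ordered.
                prev = [f.result() for f in futures]
        else:
            raise ValueError("unknown stage: %r" % sorted(stage))
    return prev
```

Only the value returned from the last stage would be sent back to the client, which is where the round-trip saving comes from.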
{
"kind": "SUB",
"seq": 5,
"topic": "github.pr_opened",
"filter": { "repo": "anthropic/claude-code" }
}
Server replies with RES (subscription id), then streams EVT:
{ "kind": "EVT", "seq": 5, "data": { "pr": 123, "title": "Fix typo" } }
Events continue until the client sends UNS with the same seq, or the channel closes.
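On the client, subscriptions reduce to a table of handlers keyed by seq. The sketch below is illustrative only: `SubscriptionTable` and its methods are hypothetical names, and it returns frames as dicts rather than writing them to a socket.

```python
from typing import Callable, Dict, Optional


class SubscriptionTable:
    """Routes EVT frames to the handler registered under the same seq."""

    def __init__(self) -> None:
        self._handlers: Dict[int, Callable[[dict], None]] = {}

    def subscribe(self, seq: int, topic: str, handler: Callable[[dict], None],
                  event_filter: Optional[dict] = None) -> dict:
        """Register the handler and return the SUB frame to send."""
        self._handlers[seq] = handler
        frame = {"kind": "SUB", "seq": seq, "topic": topic}
        if event_filter:
            frame["filter"] = event_filter
        return frame

    def unsubscribe(self, seq: int) -> dict:
        """Drop the handler and return the UNS frame to send."""
        self._handlers.pop(seq, None)
        return {"kind": "UNS", "seq": seq}

    def dispatch(self, frame: dict) -> bool:
        """Deliver an EVT frame; returns False if nobody is subscribed."""
        if frame.get("kind") != "EVT":
            return False
        handler = self._handlers.get(frame.get("seq"))
        if handler is None:
            return False
        handler(frame["data"])
        return True
```

Events that arrive after UNS was sent but before the server processed it are simply dropped by `dispatch`.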
When the server is overloaded, it sends WIN with a smaller window. The client MUST NOT have more than window requests outstanding.
{ "kind": "WIN", "window": 3 }
Default window: 64. The server can shrink it at any time, and the client must respect the new value.
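Client-side enforcement of the window can be sketched as a counter of outstanding requests. This is a minimal sketch: `SendWindow` is a hypothetical name, and treating a request as complete when its RES, END, or ERR arrives is an assumption.

```python
from typing import Set


class SendWindow:
    """Tracks outstanding requests against the server-advertised window."""

    def __init__(self, window: int = 64):  # default window from the text
        self.window = window
        self.outstanding: Set[int] = set()

    def on_win(self, frame: dict) -> None:
        # The server may shrink (or grow) the window at any time.
        self.window = frame["window"]

    def try_send(self, seq: int) -> bool:
        """Reserve a slot for seq; False means the caller must wait."""
        if len(self.outstanding) >= self.window:
            return False
        self.outstanding.add(seq)
        return True

    def on_complete(self, seq: int) -> None:
        # Assumption: called when RES, END or ERR arrives for this seq.
        self.outstanding.discard(seq)
```

When the window shrinks below the number of requests already in flight, nothing is cancelled; the client just stops sending until enough of them complete.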
Capabilities
Sael v0.1 shipped a minimal capability model: capabilities are strings declared by tools and granted by users.
Tool declares: requires_capability: "github:write:repo:foo/bar"
Client (after user consent) in HEY:
"capabilities": [
"github:read:*",
"github:write:repo:foo/bar",
"slack:notify:#dev"
]
The server validates the required capability against the granted set on each INV. If it is missing, the server returns ERR with code MISSING_CAPABILITY. Since v0.2, grants are carried in cryptographically signed capability tokens (SCT) for audit and compliance.
Errors
{
"kind": "ERR",
"seq": 4,
"code": "MISSING_CAPABILITY",
"message": "Tool github.write_issue requires github:write",
"stage": 1
}
Standard error codes used in this document include AUTH_INVALID, SESSION_EXPIRED, and MISSING_CAPABILITY.
Tracing
Every frame may include W3C Trace Context metadata:
{
"kind": "INV",
"seq": 5,
"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
"span_id": "00f067aa0ba902b7",
"parent_span_id": "00f067aa0ba902b1",
"tool": "github.list_repos"
}
- trace_id: 32 hex chars, identifies the distributed trace
- span_id: 16 hex chars, identifies a span within the trace
- parent_span_id: links the span to its parent
Server propagates trace_id to child operations (pipeline stages, federation). OpenTelemetry collectors can ingest Sael traces via a sidecar that reads frames and emits spans.
▸ use cases
- • Debugging: see exactly what calls an AI agent made in a session
- • Latency analysis: spot bottlenecks in a pipeline
- • Compliance: full audit trail of who-what-when
- • Cost attribution: which feature spent what
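Generating and propagating the trace fields can be sketched as follows. The helper names `new_trace` and `child_span` are hypothetical; the only facts taken from the text are the field names and their lengths (32 and 16 hex characters).

```python
import secrets


def new_trace() -> dict:
    """Start a trace: fresh 32-hex trace_id and 16-hex span_id."""
    return {"trace_id": secrets.token_hex(16), "span_id": secrets.token_hex(8)}


def child_span(parent: dict) -> dict:
    """Span for a child operation, e.g. one pipeline stage.

    The trace_id is propagated unchanged and the parent's span_id
    becomes parent_span_id.
    """
    return {
        "trace_id": parent["trace_id"],
        "span_id": secrets.token_hex(8),
        "parent_span_id": parent["span_id"],
    }


def traced(frame: dict, span: dict) -> dict:
    """Return a copy of the frame with the trace fields attached."""
    return {**frame, **span}
```

A client would call `new_trace()` once per user request and attach `child_span(...)` output to each INV it sends on behalf of that request.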
Sael ships with an MCP-Sael adapter. Any existing MCP server can be wrapped:
[AI agent] ──Sael──> [Sael adapter] ──MCP──> [legacy MCP server]
The adapter:
- Converts Sael INV → MCP tools/call
- Converts MCP responses → Sael RES
- Loses streaming/composition/subscriptions (MCP doesn't support them)
- Logs a warning when an advanced feature is requested
This means zero migration cost. Clients start using Sael, existing MCPs continue to work via the adapter, authors migrate to native Sael when they want the new features.
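The two conversions the adapter performs can be sketched as pure functions over JSON objects. This is an illustration, not the shipped adapter: the function names are hypothetical, reusing the Sael `seq` as the JSON-RPC `id` is an assumption, and `MCP_ERROR` is a placeholder error code that does not come from the Sael spec.

```python
import logging

log = logging.getLogger("sael.mcp_adapter")  # hypothetical logger name


def inv_to_mcp_request(inv: dict) -> dict:
    """Translate a one-shot Sael INV into an MCP tools/call request."""
    if "pipeline" in inv:
        # MCP has no composition; the text says the adapter logs a warning.
        log.warning("pipeline requested through the MCP adapter (seq=%s)", inv.get("seq"))
        raise ValueError("pipelines cannot be forwarded to an MCP server")
    return {
        "jsonrpc": "2.0",
        "id": inv["seq"],  # assumption: seq doubles as the JSON-RPC id
        "method": "tools/call",
        "params": {"name": inv["tool"], "arguments": inv.get("input", {})},
    }


def mcp_response_to_sael(response: dict) -> dict:
    """Translate an MCP JSON-RPC response into a Sael RES or ERR frame."""
    seq = response["id"]
    if "error" in response:
        message = response["error"].get("message", "")
        return {"kind": "ERR", "seq": seq, "code": "MCP_ERROR", "message": message}
    result = response.get("result", {})
    if result.get("isError"):
        return {"kind": "ERR", "seq": seq, "code": "MCP_ERROR",
                "message": "tool reported an error"}
    # MCP returns a list of content items; pass it through as the output.
    return {"kind": "RES", "seq": seq, "output": result.get("content", [])}
```

Keeping the translation free of I/O makes it easy to test against recorded MCP traffic before wiring it to a real transport.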
SDKs
Provided in this repository:
▸ Go server
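// Imports needed: context, errors, net/http, strings, and the sael package.
srv := sael.NewServer()
srv.RegisterTool(sael.Tool{
	Name:        "echo.upper",
	Description: "Echo text in uppercase",
	Handler: func(ctx context.Context, in map[string]any) (any, error) {
		// Checked assertion: a missing or non-string "text" returns an error instead of panicking.
		text, ok := in["text"].(string)
		if !ok {
			return nil, errors.New(`input "text" must be a string`)
		}
		return strings.ToUpper(text), nil
	},
})
// Mount the server on the WebSocket endpoint.
http.Handle("/sael/ws", srv)
▸ TypeScript client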
import { Sael } from '@fasad_salatov/sael-client'
const ch = await Sael.connect('wss://server/sael/ws', {
agent: { id: 'my-bot', kind: 'llm', name: 'My Bot' },
})
const repos = await ch.invoke('github.list_repos', { owner: 'anthropic' })
for await (const log of ch.stream('logs.tail', { file: 'app.log' })) {
console.log(log.line)
}
const filtered = await ch.pipeline([
{ tool: 'github.list_repos', input: { owner: 'anthropic' } },
{ filter: 'stars > 100' },
{ map: ['name'] },
])
Roadmap
- v0.1 (Apr 2026): initial draft, reference impls, MCP adapter
- v0.2 (Jun 5, 2026): SCT auth, session resume, heartbeat, validation, cost tracking, tracing
- v1.0 (Jun 6, 2026): stable. Federation, MessagePack, extended filter, schema registry, Python SDK. (now)
- v1.1 (Q3 2026): QUIC transport, mesh routing improvements
- v1.2 (Q4 2026): WebRTC P2P for browser-to-browser AI agents
- v1.3 (Q1 2027): WASM pipeline stages (sandboxed user code)
- v2.0 (Q3 2027): Asymmetric SCT signing (RSA/ECDSA), full CEL adoption, capability delegation chains
▸ discussion
Spec is source-available (CC BY-NC-ND), code is BUSL-1.1. Contribute via GitHub issues/PRs (rights assignment, see CONTRIBUTING). Commercial licensing, questions and integrations — email or Telegram.