How it works, Fuze Docs

A technical walkthrough for engineers new to compliance-grade agent systems. Assumes familiarity with LLMs and TypeScript; introduces the parts most ML engineers haven't met yet: hash chains, transparency logs, KMS, EU AI Act mechanics, policy engines, sandbox tiers, and replay-protected HITL.

If you read code, packages/agent/src/loop/loop.ts is the authoritative answer for every claim here.

1. The big picture

The 2024 EU AI Act and GDPR put concrete obligations on anyone running AI agents in the EU:

Art. 12, automatic logging for the lifetime of a high-risk system.
Art. 14, humans must be able to detect anomalies, interpret outputs, intervene, and halt.
GDPR Art. 6 / 9, declare a lawful basis; special-category data has nine narrow gates.
GDPR Art. 13–22, answer "show me", "delete me", and "explain" requests within one month.

Most frameworks treat this as someone-else's-problem. Fuze's wedge: make compliance evidence a type-system invariant. A tool that handles personal data cannot be defined without declaring its lawful basis, and a run cannot proceed if its declared basis is incompatible with the tools it uses.

+------------------------------------------------------------+
|  Customer process (Node.js)                                |
|                                                            |
|  +----------------------+                                  |
|  | @fuze-ai/agent loop  |                                  |
|  +----------+-----------+                                  |
|             |                                              |
|             v                                              |
|  +----------------------+    deny / engine_error           |
|  | Policy gate (Cerbos) +------------------> halt          |
|  +----------+-----------+                                  |
|             | allow                                        |
|             v                                              |
|  +----------------------+   +--------------------------+   |
|  | Tool execute         |   | Sandbox tier            |   |
|  | (per dispatch)       +-->|  +--------------------+ |   |
|  +----------+-----------+   |  | vm-self-hosted (EU)| |   |
|             |               |  +--------------------+ |   |
|             v               |  | vm-managed (E2B)   | |   |
|  +----------------------+   |  +--------------------+ |   |
|  | Evidence emitter     |   |  | in-process (bash)  | |   |
|  +----------+-----------+   |  +--------------------+ |   |
|             |               +--------------------------+   |
|             v                                              |
|  +----------------------+                                  |
|  | ChainedRecord stream |--> sink / sign / anchor          |
|  +----------------------+                                  |
+------------------------------------------------------------+

2. Two products

Fuze ships two coupled products in one repo family:

Fuze Compliance (fuze-ai + fuze-cloud-dashboard), the safety SDK that wraps any agent framework: loop detection, budgets, side-effect tracking, hash-chained traces.
Fuze Agent (@fuze-ai/agent + 23 sibling packages), the opinionated framework with compliance baked in.

They share a wire format. Fuze Agent emits trace events that Fuze Compliance ingests via the same hash-chain protocol the safety SDK already uses. This page is about Fuze Agent; the SDK has its own docs section.

3. Runtime tiers

Where the agent code actually runs:

Tier	Customer process	Fuze API	Where data lives
Dev	Anywhere	Local in-process	Local SQLite
Cloud	Anywhere	Fuze-hosted	Fuze EU region (default non-Annex-III)
EU Sovereign	Customer's EU infra	Self-hosted in customer's EU infra	Customer-owned; nothing leaves the perimeter

The public surface is identical across tiers. Switching tiers swaps the policy engine, sink, and signer; nothing inside defineAgent or defineTool changes.

4. The agent loop

When you call runAgent(deps, input):

1. Validate definition compatibility (compile-time + runtime)
   - lawfulBasis ∈ ⋂ tools.allowedLawfulBases
   - subjectRef present if any tool is non-public
   - annexIIIDomain != 'none' ⇒ art14OversightPlan required
   - model.residency compatible with tool residency
   [if any check fails, halt with status='error' before any spans]

2. Emit span: agent.invoke (genesis of hash chain)

3. Run input guardrails (PII / injection / residency)
   [tripwire ⇒ halt]

4. While stepsUsed < maxSteps:
   a. model.generate({messages, tools})  → model.generate span
   b. for each tool_call:
      - Cerbos.evaluate({tool, args, ctx})    → policy.evaluate span
        [deny ⇒ halt; requires-approval ⇒ suspend; engine error ⇒ fail-stop]
      - Tool.run(parsedInput, ctx)            → tool.execute span
        [Result<T, Retryable>, loop owns retry, not the tool]
      - guardrail.toolResult                   → guardrail span
   c. Append assistant + tool messages; persist DurableRunSnapshot

5. Validate final output against zod schema
6. Run output guardrails
7. Return AgentRunResult { status, output, runId, evidenceHashChainHead }

A clean run produces ~8 spans. Every path emits evidence. No tool call can bypass the policy gate, and no model call escapes token counting. Tools never receive sibling tools; they get ctx.invoke(name, input), which re-enters the pipeline.
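
The step 1 checks can be sketched as a plain function. This is a minimal illustration, not the code in loop.ts: the type names, field shapes, and error strings below are assumptions, and the residency check is left out for brevity.

```typescript
// Hypothetical shapes for illustration; the real definitions live in the Fuze source.
type LawfulBasis =
  | "consent" | "contract" | "legal-obligation"
  | "vital-interests" | "public-task" | "legitimate-interests";

interface ToolDef {
  name: string;
  classification: "public" | "personal" | "business" | "special-category";
  allowedLawfulBases?: readonly LawfulBasis[]; // absent on public tools
}

interface AgentDef {
  lawfulBasis: LawfulBasis;
  subjectRef?: string;
  annexIIIDomain: string; // 'none' when the agent is not high-risk
  art14OversightPlan?: string;
  tools: readonly ToolDef[];
}

// Returns the failed checks; an empty list means the run may start.
function validateDefinition(def: AgentDef): string[] {
  const errors: string[] = [];
  const nonPublic = def.tools.filter((t) => t.classification !== "public");

  // The run-level basis must be in every non-public tool's allowlist,
  // i.e. in the intersection of the allowlists.
  for (const tool of nonPublic) {
    if (!(tool.allowedLawfulBases ?? []).includes(def.lawfulBasis)) {
      errors.push(`lawfulBasis '${def.lawfulBasis}' is not allowed by tool '${tool.name}'`);
    }
  }
  if (nonPublic.length > 0 && def.subjectRef === undefined) {
    errors.push("subjectRef is required when any tool is non-public");
  }
  if (def.annexIIIDomain !== "none" && def.art14OversightPlan === undefined) {
    errors.push("art14OversightPlan is required for Annex III domains");
  }
  return errors;
}
```

A caller would halt with status 'error' before emitting any span whenever the returned list is non-empty.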

5. The evidence pipeline

We are not just logging. We produce records a third party can verify without trusting us.

5.1 Spans

Every event is a span (OpenTelemetry GenAI conventions): span name, role, runId/stepId, startedAt/endedAt, common attributes (tenant, principal, lawful basis, Annex III domain, retention), attrs, plus contentHash and contentRef. The full payload is captured only when captureFullContent: true.

5.2 Hash chain

Spans are linked into an append-only chain:

Span 0 (genesis)
  prevHash: 0x000...000
  hash:     H(canonical({sequence: 0, prevHash: 0x000..., payload: span0}))

Span n
  prevHash: hash of span n-1
  hash:     H(canonical({sequence: n, prevHash: <prev>, payload: spanN}))

Structurally a blockchain without consensus: any byte change invalidates the chain from that point. verifyChain([records]) recomputes every hash; if any linkage breaks, returns false. Tamper-evidence is a math property, not a permission.
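
A minimal sketch of the linkage, under stated assumptions: H is taken to be SHA-256 (the text above does not name the hash function), the payload is treated as already canonical text, and a JSON array stands in for the canonical object because its element order is fixed. The names appendRecord and verifyChainSketch are illustrative, not the exports of hash-chain.ts.

```typescript
import { createHash } from "node:crypto";

interface ChainedRecord {
  sequence: number;
  prevHash: string;
  payload: string; // assumed to be canonical, redacted JSON text
  hash: string;
}

const GENESIS_PREV_HASH = "0".repeat(64);

// Assumption: SHA-256 over a JSON array, whose element order is deterministic.
function hashRecord(sequence: number, prevHash: string, payload: string): string {
  return createHash("sha256")
    .update(JSON.stringify([sequence, prevHash, payload]))
    .digest("hex");
}

// Returns a new chain with one more record linked to the current head.
function appendRecord(chain: readonly ChainedRecord[], payload: string): ChainedRecord[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS_PREV_HASH;
  const sequence = chain.length;
  return [...chain, { sequence, prevHash, payload, hash: hashRecord(sequence, prevHash, payload) }];
}

// Recomputes every hash and checks every link; false on the first mismatch.
function verifyChainSketch(chain: readonly ChainedRecord[]): boolean {
  let expectedPrev = GENESIS_PREV_HASH;
  for (let i = 0; i < chain.length; i++) {
    const record = chain[i];
    if (record === undefined) return false;
    if (record.sequence !== i || record.prevHash !== expectedPrev) return false;
    if (record.hash !== hashRecord(record.sequence, record.prevHash, record.payload)) return false;
    expectedPrev = record.hash;
  }
  return true;
}
```

Editing the payload of any record changes its recomputed hash, so verification fails at that record without needing to trust whoever stored the chain.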

5.3 Canonicalization (RFC 8785)

{"a":1,"b":2} and {"b":2,"a":1} are the same logical object, but serialized as written they are different byte strings and so hash differently. RFC 8785 (JCS) is the byte-exact JSON serialization standard: keys sorted, no whitespace, integers without .0, control characters escaped, undefined dropped, no trailing newline. About 50 lines in packages/agent/src/evidence/canonical.ts, property-tested with fast-check over 200 random JSON values × shuffled keys.
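
The core idea fits in a few lines. The sketch below is a simplified illustration, not the contents of canonical.ts and not a complete RFC 8785 implementation: it leans on JSON.stringify for primitives and on the default string sort (UTF-16 code units) for key order, and it does not handle edge cases such as lone surrogates.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json | undefined };

// Simplified canonical form: sorted keys, no whitespace, undefined properties dropped.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") {
    if (typeof value === "number" && !Number.isFinite(value)) {
      throw new Error("NaN and Infinity cannot be represented in JSON");
    }
    return JSON.stringify(value); // primitives: rely on ECMAScript serialization
  }
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonicalize(item)).join(",") + "]";
  }
  const obj: { [key: string]: Json | undefined } = value;
  const keys = Object.keys(obj).filter((k) => obj[k] !== undefined).sort();
  const members = keys.map((k) => JSON.stringify(k) + ":" + canonicalize(obj[k] as Json));
  return "{" + members.join(",") + "}";
}
```

With this, canonicalize({ b: 2, a: 1 }) and canonicalize({ a: 1, b: 2 }) yield the same string, which is the property the hash chain depends on.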

5.4 Redaction

Before any payload reaches the chain it goes through redaction:

Pattern-based: emails, phones, IBANs (mod-97), credit cards (Luhn), SSNs, IPs, JWTs, OAuth, API keys (sk-…, AWS, GitHub, Slack, etc.).
Structural: walks nested objects; SecretRef becomes <<fuze:secret:redacted>>.
Optional ML: Microsoft Presidio sidecar via JSON-RPC.

The hash is over the canonical, redacted form. The original payload exists only in memory; what is stored, transmitted, and chained is already redacted.
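
As an illustration of the pattern-based and structural passes, here is a minimal sketch covering only emails and Luhn-checked card numbers. The regular expressions and the placeholder strings are assumptions chosen for the example; the actual pattern set and placeholders are defined in the Fuze source.

```typescript
// Luhn checksum: from the right, double every second digit, subtract 9 if above 9, sum all.
function passesLuhn(digits: string): boolean {
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    let d = digits.charCodeAt(digits.length - 1 - i) - 48;
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return digits.length > 0 && sum % 10 === 0;
}

// Illustrative patterns only; production patterns are stricter and more numerous.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const CARD_CANDIDATE = /\b\d(?:[ -]?\d){12,18}\b/g; // 13 to 19 digits, optional separators

function redactString(text: string): string {
  return text
    .replace(EMAIL, "<<redacted:email>>")
    .replace(CARD_CANDIDATE, (match) =>
      passesLuhn(match.replace(/[ -]/g, "")) ? "<<redacted:card>>" : match,
    );
}

// Structural pass: walk nested arrays and objects, redacting every string leaf.
function redactDeep(value: unknown): unknown {
  if (typeof value === "string") return redactString(value);
  if (Array.isArray(value)) return value.map((item) => redactDeep(item));
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) out[key] = redactDeep(source[key]);
    return out;
  }
  return value;
}
```

The checksum step is what keeps arbitrary 16-digit identifiers from being redacted as cards: only candidates that pass Luhn are replaced.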

5.5 Run-root signing (Ed25519, customer-managed)

At end of run (or at suspend, for HITL):

runRoot = Ed25519.sign(privKey, chainHead || runId || nonce)

Here || denotes byte concatenation. The signing key is the customer's, held in their KMS (AWS/GCP/Azure/Vault). Fuze never sees the private key; we call kms.sign(key, payload) and get a signature back. A compromised Fuze deployment cannot forge audit records.
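
The separation between "can request a signature" and "holds the key" can be sketched with Node's built-in Ed25519 support. The RunRootSigner interface and the local key pair below are stand-ins for illustration; a real deployment would back sign() with a KMS call, and the exact payload encoding is an assumption.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical signer interface: callers can request signatures but never see the key.
interface RunRootSigner {
  readonly publicKey: KeyObject;
  sign(payload: Buffer): Buffer;
}

// Local stand-in for a KMS-backed signer; the private key stays inside this closure.
function createLocalSigner(): RunRootSigner {
  const { publicKey, privateKey } = generateKeyPairSync("ed25519");
  return {
    publicKey,
    sign: (payload: Buffer): Buffer => sign(null, payload, privateKey),
  };
}

// chainHead || runId || nonce, as plain byte concatenation (assumed encoding).
function runRootPayload(chainHead: Buffer, runId: string, nonce: Buffer): Buffer {
  return Buffer.concat([chainHead, Buffer.from(runId, "utf8"), nonce]);
}

// Anyone holding the public key can check a run-root without contacting the signer.
function verifyRunRoot(publicKey: KeyObject, payload: Buffer, signature: Buffer): boolean {
  return verify(null, payload, publicKey, signature);
}
```

Verification needs only the public key, which is what lets an auditor check a run-root independently of both Fuze and the KMS.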

5.6 Transparency log

Run-roots are anchored to an append-only public log. Two adapters: SqliteTransparencyLog (self-hosted Merkle, default for sovereign) and RekorTransparencyLog (Sigstore Rekor, opt-in). The log returns a Merkle inclusion proof so anyone can verify "this run-root was in the log at this position" without the full log.

                Root
              /      \
           N01        N23
          /   \      /   \
        L0    L1   L2    L3   ← leaves (run-roots)
                                inclusion proof for L0 = [L1, N23]

This is what lets the customer prove a run happened before time T. Without a transparency log, the auditor must trust your timestamps.
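
Checking an inclusion proof such as [L1, N23] for L0 takes only the leaf, the proof, and the published root. The sketch below assumes SHA-256 and the RFC 6962 style of domain separation (0x00 prefix for leaves, 0x01 for interior nodes); the ProofStep shape is an illustrative assumption, not the adapters' wire format.

```typescript
import { createHash } from "node:crypto";

const sha256 = (data: Buffer): Buffer => createHash("sha256").update(data).digest();

// Domain separation in the style of RFC 6962: distinct prefixes for leaves and nodes.
const leafHash = (leaf: Buffer): Buffer => sha256(Buffer.concat([Buffer.from([0x00]), leaf]));
const nodeHash = (left: Buffer, right: Buffer): Buffer =>
  sha256(Buffer.concat([Buffer.from([0x01]), left, right]));

// One step of the proof: the sibling hash and which side of the path it sits on.
interface ProofStep {
  sibling: Buffer;
  side: "left" | "right";
}

// Rebuilds the root from the leaf upward and compares it with the expected root.
function verifyInclusion(leaf: Buffer, proof: readonly ProofStep[], root: Buffer): boolean {
  let acc = leafHash(leaf);
  for (const step of proof) {
    acc = step.side === "left" ? nodeHash(step.sibling, acc) : nodeHash(acc, step.sibling);
  }
  return acc.equals(root);
}

// The four-leaf tree from the diagram above.
const l0 = Buffer.from("run-root-0");
const l1 = Buffer.from("run-root-1");
const l2 = Buffer.from("run-root-2");
const l3 = Buffer.from("run-root-3");
const n01 = nodeHash(leafHash(l0), leafHash(l1));
const n23 = nodeHash(leafHash(l2), leafHash(l3));
const root = nodeHash(n01, n23);

// Inclusion proof for L0 = [L1, N23]; both siblings sit to the right of the path.
const proofForL0: ProofStep[] = [
  { sibling: leafHash(l1), side: "right" },
  { sibling: n23, side: "right" },
];
```

The proof grows with the logarithm of the log size, so a verifier handles two hashes here and about twenty for a log of a million run-roots.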

6. HITL, the human-oversight primitive

Art. 14 needs more than an approve button: the human can see state up to the suspend point, decide with rationale (which becomes evidence), and the decision is non-replayable.

6.1 Suspend

When a tool hits effect: requires-approval, the loop:

Records the suspended state (tool, args, current chain head).

Mints a resume token:

token = {
  runId, suspendedAtSequence, chainHeadAtSuspend,
  nonce: random(16 bytes),
  signature: Ed25519.sign(customerKey, runId || sequence || chainHead || nonce),
  publicKeyId
}

Persists the SuspendedRun (durable snapshot survives a restart).
Returns to caller with status: 'suspended'.

6.2 Resume

overseer reviews evidence panel → submits decision → resumeRun()
  1. Verify resume token signature   (with customer's public key)
  2. Check definitionFingerprint     (refuse if agent definition drifted)
  3. Consume the nonce               (replay attempt → ResumeTokenReplayError)
  4. Emit oversight.decision span    (action, rationale, overseerId, trainingId)
  5. Continue or halt

Nonces matter: without one, an approved token is a permanent ticket. The fingerprint check closes the "ship an innocuous tool, get approved, redefine it before the approval lands" attack.
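
The first three resume checks can be sketched as follows. This is an illustration under assumptions: signature verification is passed in as a callback, the nonce store is an in-memory Set (a real store must be durable and shared across processes), and apart from ResumeTokenReplayError the error class names are hypothetical.

```typescript
class InvalidTokenSignatureError extends Error { // hypothetical name
  constructor(message: string) { super(message); this.name = "InvalidTokenSignatureError"; }
}
class DefinitionDriftError extends Error { // hypothetical name
  constructor(message: string) { super(message); this.name = "DefinitionDriftError"; }
}
class ResumeTokenReplayError extends Error {
  constructor(message: string) { super(message); this.name = "ResumeTokenReplayError"; }
}

interface ResumeToken {
  runId: string;
  suspendedAtSequence: number;
  chainHeadAtSuspend: string;
  nonce: string;
  signature: string;
  publicKeyId: string;
}

interface ResumeDeps {
  verifySignature(token: ResumeToken): boolean; // e.g. Ed25519 verify with the customer's public key
  storedFingerprint: string;  // recorded when the run was suspended
  currentFingerprint: string; // recomputed from the live agent definition
  consumedNonces: Set<string>; // in-memory here; must be durable in practice
}

// Throws on the first failed check; returns normally if the resume may proceed.
function checkResume(token: ResumeToken, deps: ResumeDeps): void {
  if (!deps.verifySignature(token)) {
    throw new InvalidTokenSignatureError("resume token signature is invalid");
  }
  if (deps.storedFingerprint !== deps.currentFingerprint) {
    throw new DefinitionDriftError("agent definition changed since suspend");
  }
  if (deps.consumedNonces.has(token.nonce)) {
    throw new ResumeTokenReplayError("resume token was already used");
  }
  deps.consumedNonces.add(token.nonce); // single use: mark as consumed
}
```

Consuming the nonce last means a token rejected for a bad signature or a drifted definition is not burned by the failed attempt.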

7. Compliance type system

FuzeTool is a discriminated union:

type FuzeTool<TIn, TOut, TDeps> =
  | PublicTool<TIn, TOut, TDeps>           // 'public'
  | PersonalTool<TIn, TOut, TDeps>         // 'personal' | 'business'
  | SpecialCategoryTool<TIn, TOut, TDeps>  // 'special-category'

PublicTool, no extra requirements.
PersonalTool, allowedLawfulBases and residencyRequired are required by the type.
SpecialCategoryTool, also requires art9Basis (one of nine Art. 9(2) gates) and forces residencyRequired: 'eu'.

defineTool.specialCategory({
  name: 'lookupHealthRecord',
  // ❌ TS error: Property 'art9Basis' is missing
  ...
})

The compiler refuses the bad shape. This is not a lint warning; the code doesn't compile.

The framework is six primitives: Tool, Model, Agent, Memory, Guardrail, Tracer. Anything else is composition.

The Ctx<TDeps> passed to tools exposes only tenant, principal, runId, stepId, subjectRef, deps (frozen), secrets (opaque refs), attribute(k, v), and invoke(name, input). No tracer access, no raw secrets, no sibling tools. Bypass tests with // @ts-expect-error prove the bad shapes don't typecheck.
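
A self-contained miniature of the same technique shows how the compiler enforces the extra fields. The types below are simplified stand-ins with hypothetical names, not the definitions in types/tool.ts, and the lawful-basis and Art. 9 lists are illustrative subsets.

```typescript
type LawfulBasis = "consent" | "contract" | "legitimate-interests"; // illustrative subset
type Art9Basis = "explicit-consent" | "vital-interests" | "public-health"; // illustrative subset

interface PublicToolSketch {
  classification: "public";
  name: string;
}
interface PersonalToolSketch {
  classification: "personal" | "business";
  name: string;
  allowedLawfulBases: readonly LawfulBasis[];
  residencyRequired: "eu" | "any";
}
interface SpecialCategoryToolSketch {
  classification: "special-category";
  name: string;
  allowedLawfulBases: readonly LawfulBasis[];
  residencyRequired: "eu"; // the type admits no other value
  art9Basis: Art9Basis;
}
type ToolSketch = PublicToolSketch | PersonalToolSketch | SpecialCategoryToolSketch;

const good: ToolSketch = {
  classification: "special-category",
  name: "lookupHealthRecord",
  allowedLawfulBases: ["consent"],
  residencyRequired: "eu",
  art9Basis: "explicit-consent",
};

// @ts-expect-error 'art9Basis' is missing, so the type checker rejects this shape.
const bad: ToolSketch = { classification: "special-category", name: "lookupHealthRecord", allowedLawfulBases: ["consent"], residencyRequired: "eu" };

// Narrowing on the discriminant gives each branch access to exactly its own fields.
function describeTool(tool: ToolSketch): string {
  switch (tool.classification) {
    case "public":
      return `${tool.name}: no lawful basis needed`;
    case "special-category":
      return `${tool.name}: Art. 9 basis ${tool.art9Basis}, residency ${tool.residencyRequired}`;
    default:
      return `${tool.name}: bases ${tool.allowedLawfulBases.join(",")}`;
  }
}
```

The @ts-expect-error line is itself a test: if the type ever stopped requiring art9Basis, the directive would become an error and the build would fail.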

8. Policy gating with Cerbos

Cerbos is open-source. Embedded WASM mode: YAML+CEL policies compile to a bundle that evaluates in-process in ~100µs.

apiVersion: api.cerbos.dev/v1
resourcePolicy:
  resource: transfer_funds
  rules:
    - actions: ["invoke"]
      effect: EFFECT_REQUIRES_APPROVAL
      condition:
        match: { expr: R.attr.amount > 1000 }
    - actions: ["invoke"]
      effect: EFFECT_ALLOW
      condition:
        match: { expr: R.attr.amount <= 1000 && P.attr.role == "operator" }

CEL is deliberately not Turing-complete: comparisons, arithmetic, list membership; no unbounded loops, no recursion, no user-defined functions. Policies always terminate, in time bounded by the size of the expression and its inputs.

Two reasons over if-statements: compliance officers can review YAML (not TS control flow), and policies survive code refactors.

Fail-stop: a policy engine error halts the run with engine_error=true. There is no --allow-on-engine-error runtime flag (a build-time dev flag exists; production builds disable it). A security review flagged this Critical-1.
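
The fail-stop rule can be sketched as a small gate function. The PolicyEngine interface and LoopAction shape are hypothetical stand-ins for illustration; a real adapter would wrap the Cerbos client rather than a plain callback.

```typescript
type PolicyEffect = "allow" | "deny" | "requires-approval";

type LoopAction =
  | { kind: "execute" }
  | { kind: "suspend" }
  | { kind: "halt"; engineError: boolean };

// Hypothetical engine interface for the sketch.
interface PolicyEngine {
  evaluate(tool: string, args: unknown): PolicyEffect;
}

// Maps a policy decision to what the loop does next.
function gate(engine: PolicyEngine, tool: string, args: unknown): LoopAction {
  let effect: PolicyEffect;
  try {
    effect = engine.evaluate(tool, args);
  } catch {
    // Fail-stop: an engine failure is never treated as an allow.
    return { kind: "halt", engineError: true };
  }
  switch (effect) {
    case "allow":
      return { kind: "execute" };
    case "requires-approval":
      return { kind: "suspend" };
    default:
      // "deny", plus anything unexpected, fails closed.
      return { kind: "halt", engineError: false };
  }
}
```

The only path to tool execution is an explicit allow; every other outcome, including an exception, ends in a halt or a suspend.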

9. Sandbox tiers

Threat	Defense
Tool args contain a payload that, if eval'd, owns the host	Run in a sandbox
Tool fetches a URL that returns a billion bytes	Sandbox enforces output cap
Tool reads `/etc/passwd`	Sandbox has its own FS; host FS not mounted
Tool exfiltrates via DNS	Sandbox egress allowlist
Multi-tenant: tool A reads tool B's secrets	Per-tenant sandbox process

Three tiers:

In-process (just-bash), TypeScript bash interpreter with virtual FS. No child_process. Usable only with the TrustedInputOnly brand and a single-tenant deployment (a watchdog refuses if a second tenant ID appears within an hour).
vm-managed (E2B Cloud), each sandbox in a Firecracker microVM. ~150ms cold, <30ms from a paused snapshot. Default for Cloud tier. Caveat: managed cloud is US-region by default.
vm-self-hosted (Sovereign), E2B is Apache-2.0; the Sovereign tier runs it on customer's Hetzner / Scaleway / OVHcloud / AWS-Frankfurt. CIS-benchmark Packer image, pinned kernel, WireGuard mesh, mTLS-only control plane, deny-all-inbound default firewall. We ship the Terraform.

Tier is recorded in every tool.execute span: fuze.sandbox.tier: 'in-process' | 'vm-managed' | 'vm-self-hosted'.
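
The single-tenant watchdog mentioned for the in-process tier could look roughly like this. It is one possible reading of "a second tenant ID appears within an hour", written as a sketch: the class name, error name, and the exact window semantics are assumptions, not the behavior of the shipped package.

```typescript
class MultiTenantRefusedError extends Error { // hypothetical name
  constructor(message: string) { super(message); this.name = "MultiTenantRefusedError"; }
}

const ONE_HOUR_MS = 60 * 60 * 1000;

// Assumed semantics: refuse when a different tenant ID shows up less than
// one hour after the last accepted call.
class SingleTenantWatchdog {
  private lastTenant: string | undefined = undefined;
  private lastSeenAt = 0;

  check(tenantId: string, nowMs: number): void {
    const differentTenant = this.lastTenant !== undefined && this.lastTenant !== tenantId;
    if (differentTenant && nowMs - this.lastSeenAt < ONE_HOUR_MS) {
      throw new MultiTenantRefusedError(
        `in-process sandbox refused: second tenant '${tenantId}' seen within an hour`,
      );
    }
    this.lastTenant = tenantId;
    this.lastSeenAt = nowMs;
  }
}
```

Passing the clock in as nowMs keeps the check deterministic in tests; a caller would normally supply Date.now().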

10. MCP, sharing tools across the ecosystem

MCP is Anthropic's open standard for "agents talk to tool servers over JSON-RPC". Fuze Agent is both host and server.

As a host: @fuze-ai/agent-mcp wraps @modelcontextprotocol/sdk Client. Every tools/call is intercepted by RecordingTransport and emitted as evidence. Server fingerprints are pinned at admission; rotation without re-approval throws FingerprintMismatchError. Tools discovered from MCP servers go through unverifiedTool(), which requires the operator to supply Fuze metadata (classification, lawful basis allowlist, retention) before the tool can be called, otherwise it defaults to special-category and Cerbos default-denies.

As a server: serveFuzeAgent({tools, policy, transport}) exposes Fuze tools as MCP. Inbound tools/call gets the same evidence pipeline. Special-category tools are refused unless allowSpecialCategory: true. Fuze tools become usable from Claude Desktop, Cursor, Cline with audit trail intact.

11. EU AI Act mapping

Article	Requirement	Fuze mechanism
Art. 9	Risk management for high-risk	`annexIIIDomain` field forces declaration; non-`'none'` requires `art14OversightPlan`
Art. 12	Automatic logging	Every span hash-chained, signed, anchored. Logs include responsible person (`fuze.principal.id`), retention (`fuze.retention.policy_id`)
Art. 13	Transparency to deployers	`definitionFingerprint` lets deployers verify the agent didn't drift
Art. 14	Human oversight	HITL with replay-protected tokens; decision rationale, overseer ID, training reference all captured
Art. 22 (GDPR)	Solely-automated decisions	`producesArt22Decision: boolean` flag forces approval gate
Art. 26	Deployer obligations	DPA + sub-processor manifest + TIA in `@fuze-ai/agent-legal-templates`
Art. 33/34 (GDPR)	Breach notification 72h	Incident-event generator produces Art. 33 + Art. 34 packets
Art. 73	Serious incident reporting	Same machinery; `IncidentEvent` schema flags severity

12. GDPR mapping

Article	Fuze mechanism
Art. 5(1)(e) retention	`RetentionPolicy` is a required type field; `@fuze-ai/agent-compliance` ships a partition function that drops expired records
Art. 6 lawful basis	`lawfulBasis` is a run-level field; compatible bases per tool checked at run start
Art. 9 special category	`SpecialCategoryTool` requires `art9Basis`
Art. 13/14 info to subject	Every span carries `fuze.subject.ref` (HMAC of stable identifier with tenant secret)
Art. 15 access	`GET /v1/subjects/:hmac/spans`
Art. 17 erasure	`eraseBySubjectRef(hmac)` cascades across spans, suspend store, durable store, memory
Art. 22 automated decisions	`producesArt22Decision: boolean` flag forces approval gate
Art. 28 processor obligations	DPA template generator
Art. 33/34 breach	Incident packets generator
Art. 35 DPIA	Auto-fills DPIA from agent definition
Art. 44–49 transfers	TIA generator per non-EU sub-processor; SCC selector
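
The subject reference behind the Art. 15 and Art. 17 rows can be sketched with Node's HMAC support. The sketch assumes HMAC-SHA-256 with hex output (the table only says "HMAC of stable identifier with tenant secret"), and the span shape and function names are hypothetical.

```typescript
import { createHmac } from "node:crypto";

// Assumption: HMAC-SHA-256, hex-encoded. The tenant secret keys the pseudonym,
// so the same person gets unrelated references under different tenants.
function subjectRef(tenantSecret: Buffer, stableId: string): string {
  return createHmac("sha256", tenantSecret).update(stableId, "utf8").digest("hex");
}

// Hypothetical minimal span shape for the example.
interface SpanSketch {
  name: string;
  subjectRef: string;
}

// Access request: select every span that carries the subject's reference.
function spansForSubject(spans: readonly SpanSketch[], ref: string): SpanSketch[] {
  return spans.filter((span) => span.subjectRef === ref);
}

// Erasure request: keep everything except the subject's spans.
function eraseSubjectSketch(spans: readonly SpanSketch[], ref: string): SpanSketch[] {
  return spans.filter((span) => span.subjectRef !== ref);
}
```

Because the reference is derived rather than stored alongside the raw identifier, spans can be looked up or erased per subject without the evidence store ever holding the identifier itself.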

13. Where to read the actual code

Concept	Source
Loop	`packages/agent/src/loop/loop.ts`
Hash chain	`packages/agent/src/evidence/hash-chain.ts`
Canonicalization	`packages/agent/src/evidence/canonical.ts`
Discriminated `FuzeTool`	`packages/agent/src/types/tool.ts`
`Ctx` and `ctx.invoke`	`packages/agent/src/types/ctx.ts`
Resume tokens + nonces	`packages/agent/src/loop/suspend.ts`
Definition fingerprint	`packages/agent/src/loop/fingerprint.ts`
Cerbos engine	`packages/agent-policy-cerbos/`
just-bash sandbox	`packages/agent-sandbox-justbash/`
E2B sandbox	`packages/agent-sandbox-e2b/`
Transparency log	`packages/agent-transparency/`
KMS signers	`packages/agent-signing-kms/`
MCP host / server	`packages/agent-mcp/`, `packages/agent-mcp-server/`
API contract / server	`packages/agent-api/`, `packages/agent-api-server/`
Annex IV mapping	`packages/agent-annex-iv/`
Eval framework	`packages/agent-eval/`
Sovereign Terraform	`packages/agent-sovereign-terraform/modules/`

Reference agents: agent-employment-screening (Annex III), agent-customer-support (PII), agent-code-gen (sandbox), agent-hitl-demo (full HITL roundtrip).

14. Beyond this page

OpenTelemetry GenAI conventions, the spec, and our architecture reference.
Cerbos policy authoring, Cerbos docs.
Sigstore Rekor, Sigstore docs.
TypeScript discriminated unions, TS handbook.
Firecracker microVMs, the AWS paper.
Operating Fuze in production, operations guide.
Building your first agent, first agent tutorial.

If something here is unclear, the source is the authoritative answer. Every claim about Fuze's behavior maps to a test under packages/*/test/.
