Full-interaction spans

guard() writes a tamper-evident step record for each individual tool call. The span API records the whole interaction (user input, retrieval, LLM messages, tool args, assistant output) into the same hash chain, with semantic roles, parent linkage, and optional inline content capture.

It's the right primitive when you want a conversation timeline and aggregate optimization views (slow paths, stuck tools, retrieval quality) without adopting an agent framework.

Three primitives

  • run(opts, fn): establishes an implicit run scope. Anything inside the callback (or, in Python, inside the async with block) inherits runId via AsyncLocalStorage (JS) or contextvars (Python), so you never pass runId through your call stack by hand.
  • span(opts): records a leaf span at the current scope with a role, optional captured content, and an optional attrs bag.
  • traced(fn, opts): wraps a function so that each invocation becomes a span. Nested traced/span calls inherit parentStepId automatically.
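
To make the scope inheritance concrete, here is a minimal sketch of how an implicit run scope and parent linkage can be built on Node's AsyncLocalStorage. It is not the fuze-ai implementation: runSketch, spanSketch, tracedSketch, and the in-memory recorded array are hypothetical names, and the sketch skips hashing and persistence entirely.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks'

// Hypothetical shapes for illustration only; not the fuze-ai internals.
interface Scope { runId: string; parentStepId?: string }
interface Recorded { stepId: string; runId: string; parentStepId?: string; role: string }

const scopeStore = new AsyncLocalStorage<Scope>()
const recorded: Recorded[] = []
let nextStep = 0

// Establish a run scope; everything called inside fn sees the same runId.
function runSketch<T>(runId: string, fn: () => T): T {
  return scopeStore.run({ runId }, fn)
}

// Record a leaf span at whatever scope is current.
function spanSketch(role: string): Recorded {
  const scope = scopeStore.getStore()
  if (!scope) throw new Error('span called outside run()')
  const rec: Recorded = {
    stepId: `s${++nextStep}`,
    runId: scope.runId,
    parentStepId: scope.parentStepId,
    role,
  }
  recorded.push(rec)
  return rec
}

// Wrap a function so the call itself is a span and nested spans point at it.
// Simplification: the record is created before fn runs so children can
// reference its stepId.
function tracedSketch<A extends unknown[], R>(fn: (...args: A) => R, role: string) {
  return (...args: A): R => {
    const self = spanSketch(role)
    const scope = scopeStore.getStore()!
    return scopeStore.run({ ...scope, parentStepId: self.stepId }, () => fn(...args))
  }
}

// Usage: the inner span never receives runId or a parent id explicitly.
const search = tracedSketch((q: string) => {
  spanSketch('retrieval')
  return q.length
}, 'tool')

runSketch('run-1', () => {
  spanSketch('user')
  search('hello')
})
```

After this runs, recorded holds three entries that share the same runId, and the retrieval entry's parentStepId is the stepId of the wrapping tool entry.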

Quickstart

TypeScript

typescript
import { run, span, traced, configure } from 'fuze-ai'

configure({ redactor })   // required only when capture === 'full+redact'

await run({ sessionId, userId, tenant }, async () => {
  await span({
    role: 'user',
    capture: 'full',
    content: { kind: 'text', text: userInput },
  })

  const hits = await traced(searchKnowledge, {
    role: 'retrieval',
    capture: 'full',
  })(query)

  const reply = await traced(callLLM, {
    role: 'llm',
    capture: 'full+redact',
  })(messages)

  await span({
    role: 'assistant',
    capture: 'full',
    content: { kind: 'text', text: reply },
  })
})

Python

python
from fuze_ai import run, span, traced, configure

configure({"redactor": redactor})   # required only for capture='full+redact'

async with run(session_id=..., user_id=..., tenant=...):
    await span(role='user', capture='full',
               content={'kind': 'text', 'text': user_input})

    hits = traced(search_knowledge, role='retrieval', capture='full')(query)

    reply = traced(call_llm, role='llm', capture='full+redact')(messages)

    await span(role='assistant', capture='full',
               content={'kind': 'text', 'text': reply})

Roles

A span's role drives both dashboard rendering and the cross-run optimization queries.

| Role | When to use |
| --- | --- |
| user | The user's message that started this turn. |
| assistant | The model's final text reply to the user. |
| system | A system/setup span (e.g., context window injection). Optional; most apps don't emit these. |
| llm | A call to a language model. attrs.model is conventional. Content is typically kind: 'messages'. |
| tool | A tool invocation that isn't retrieval (writes, planners, browsers, etc.). |
| retrieval | Vector / hybrid / FTS / graph search. Content is kind: 'retrieval' with query and results[]; this is what powers the retrieval-quality dashboard view. |

Capture modes

Capture is a deliberate per-span decision. It defaults to hash, so existing code behaves exactly as before unless a span opts in to storing content.

| Mode | Behaviour |
| --- | --- |
| hash (default) | Tamper-evident only. No content stored. Same as legacy guard() records. |
| full | Inline raw content. Replayable but stored. Use for non-PII data (policy doc IDs, public planning text). |
| full+redact | Inline content after redaction. The unredacted form never enters the hash chain. Fail-closed: throws FuzeError if no redactor is configured. |
| sampled | Reserved for future sampling policies. |
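
The table can be read as a small decision function. The sketch below is an assumption about how the three active modes could resolve what is stored inline; storedContent, CaptureMode, and Redactor are hypothetical names, and it throws a plain Error where the SDK is described as throwing FuzeError.

```typescript
// Hypothetical types; 'sampled' is reserved, so it is left out of the sketch.
type CaptureMode = 'hash' | 'full' | 'full+redact'
interface Redactor { redactContent(content: unknown): unknown }

// Decide what content, if any, is stored inline for a span.
function storedContent(mode: CaptureMode, content: unknown, redactor?: Redactor): unknown {
  switch (mode) {
    case 'hash':
      // Tamper-evident only: nothing is stored inline.
      return undefined
    case 'full':
      // Raw content is stored as given.
      return content
    case 'full+redact':
      // Fail closed: refuse to store rather than risk storing raw content.
      if (!redactor) throw new Error("capture 'full+redact' requires a configured redactor")
      return redactor.redactContent(content)
  }
}
```

The point of failing closed is that a missing redactor surfaces as an error at the call site instead of silently degrading to full.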

The redactor is a single-method interface you supply at configure(...) time:

typescript
configure({
  redactor: {
    redactContent(content) {
      // strip emails, phone numbers, etc. Return the modified shape.
      return content
    },
  },
})

There is no built-in redactor — the SDK stays neutral about what counts as PII for your domain.
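
As a starting point, a redactor might mask emails and phone-like numbers in text and messages content. Everything in this sketch is illustrative: the regular expressions are deliberately simple and will miss real-world formats, the [email] and [phone] placeholders are arbitrary, and the Content type only covers the two shapes handled here.

```typescript
// Only the two content shapes this sketch handles.
type Content =
  | { kind: 'text'; text: string }
  | { kind: 'messages'; messages: { role: string; text: string }[] }

// Deliberately simple patterns; tune these for your own definition of PII.
const EMAIL = /[\w.+-]+@[\w-]+(\.[\w-]+)+/g
const PHONE = /\+?\d[\d\s().-]{7,}\d/g

function scrub(s: string): string {
  return s.replace(EMAIL, '[email]').replace(PHONE, '[phone]')
}

// Returns a new object of the same shape; the input is never mutated.
const exampleRedactor = {
  redactContent(content: Content): Content {
    if (content.kind === 'text') {
      return { ...content, text: scrub(content.text) }
    }
    return {
      ...content,
      messages: content.messages.map((m) => ({ ...m, text: scrub(m.text) })),
    }
  },
}
```

An object like this is what you would pass as redactor to configure(...); returning a fresh object rather than mutating the input keeps the caller's own copy of the content intact.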

Content shapes

content is a discriminated union keyed by kind. The authoritative schema lives at data/trace-schema.json.

  • { kind: 'text', text } — for user, assistant, system spans.
  • { kind: 'messages', messages: [{ role, text }] } — for llm spans.
  • { kind: 'tool_call', args, result? } — auto-generated by traced(fn); you rarely emit this manually.
  • { kind: 'retrieval', query, results: [{ docId, chunkId, score, cited?, snippet? }] } — for retrieval spans. The cited flag drives the retrieval-quality scorecard.
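
Written out as TypeScript, the union might look like the sketch below. The field names follow the list above, but the field types (string, number, unknown) are assumptions, and describeContent is a hypothetical helper that only shows how narrowing on kind works; data/trace-schema.json remains the source of truth.

```typescript
// Field names from the list above; the field types are assumptions.
type SpanContent =
  | { kind: 'text'; text: string }
  | { kind: 'messages'; messages: { role: string; text: string }[] }
  | { kind: 'tool_call'; args: unknown; result?: unknown }
  | {
      kind: 'retrieval'
      query: string
      results: { docId: string; chunkId: string; score: number; cited?: boolean; snippet?: string }[]
    }

// Switching on `kind` narrows the type, so each branch sees only its own fields.
function describeContent(c: SpanContent): string {
  switch (c.kind) {
    case 'text':
      return `text (${c.text.length} chars)`
    case 'messages':
      return `${c.messages.length} message(s)`
    case 'tool_call':
      return c.result === undefined ? 'tool call (no result yet)' : 'tool call (with result)'
    case 'retrieval': {
      const cited = c.results.filter((r) => r.cited).length
      return `retrieval: ${cited}/${c.results.length} cited`
    }
  }
}
```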

attrs is an open record for span-type-specific fields (attrs.model on LLM spans, attrs.jurisdiction on retrieval spans, etc.). If a key becomes something your dashboards or queries depend on, promote it to a typed field in a later schema revision rather than letting attrs grow load-bearing.

What you get in the dashboard

Once spans are flowing, the cloud dashboard exposes two views you couldn't build before:

  • /runs/:runId/timeline — a per-run conversation view. Spans render by role (chat bubbles for user/assistant, collapsible message arrays for LLM, score-tagged hits for retrieval). Indented by parentStepId. Errors get a red border.
  • /optimization — five panels backed by SQL over the span table: stuck tool calls, runs above P95 step count, slow steps by (role, tool_name), retrieval quality (cited vs uncited scores per jurisdiction), token hotspots. Each row drills into the timeline.
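
To illustrate the retrieval-quality panel, the sketch below computes cited versus uncited scores per jurisdiction in memory. The dashboard itself is described as SQL over the span table, so this is only an equivalent aggregation under assumptions: RetrievalSpan and retrievalQuality are hypothetical names, and using the mean as the summary statistic is a guess.

```typescript
// Hypothetical, trimmed-down view of a retrieval span.
interface RetrievalSpan {
  attrs: { jurisdiction: string }
  results: { score: number; cited?: boolean }[]
}
interface Quality { citedMean: number | null; uncitedMean: number | null }

// Mean of a list, or null when there is nothing to average.
const mean = (xs: number[]): number | null =>
  xs.length === 0 ? null : xs.reduce((a, b) => a + b, 0) / xs.length

// Group result scores by jurisdiction, split by the cited flag, then average.
function retrievalQuality(spans: RetrievalSpan[]): Map<string, Quality> {
  const buckets = new Map<string, { cited: number[]; uncited: number[] }>()
  for (const s of spans) {
    const b = buckets.get(s.attrs.jurisdiction) ?? { cited: [], uncited: [] }
    for (const r of s.results) (r.cited ? b.cited : b.uncited).push(r.score)
    buckets.set(s.attrs.jurisdiction, b)
  }
  const out = new Map<string, Quality>()
  buckets.forEach((b, jurisdiction) => {
    out.set(jurisdiction, { citedMean: mean(b.cited), uncitedMean: mean(b.uncited) })
  })
  return out
}
```

A jurisdiction where uncited results score about as high as cited ones suggests the ranking is not separating useful chunks from noise.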

Compliance gating

For tenants in Cloud or Daemon mode, the ingest endpoint rejects any span with capture !== 'hash' unless organisations.allow_content_capture = true. The default is false, so content cannot enter storage by accident. Operators flip the gate explicitly once data-residency and DPA terms are in place.
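
The gate reduces to a single check at ingest time. In this sketch, Organisation mirrors the allow_content_capture column named above, while IncomingSpan and ingestAllowed are hypothetical; how the real endpoint reports a rejection is not covered here.

```typescript
// Mirrors the organisations.allow_content_capture column; other fields omitted.
interface Organisation { allow_content_capture: boolean }
interface IncomingSpan { capture: 'hash' | 'full' | 'full+redact' | 'sampled' }

function ingestAllowed(org: Organisation, span: IncomingSpan): boolean {
  // Hash-only spans carry no content, so they always pass.
  if (span.capture === 'hash') return true
  // Anything that could carry content needs the explicit opt-in.
  return org.allow_content_capture === true
}
```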

Common patterns

Bracketing a conversation turn

typescript
async function handleUserMessage(req) {
  await run({ sessionId: req.body.conversation_id, userId: req.user.id, tenant: req.org.id }, async () => {
    await span({ role: 'user', capture: 'full', content: { kind: 'text', text: req.body.message } })
    const reply = await runAgent(req.body.message)   // tools + LLM inside use traced()/span()
    await span({ role: 'assistant', capture: 'full', content: { kind: 'text', text: reply } })
  })
}

Recording a retrieval call with cited results

typescript
const hits = await retrieve(query)
await span({
  role: 'retrieval',
  capture: 'full',
  attrs: { jurisdiction: 'dcc' },
  content: {
    kind: 'retrieval',
    query,
    results: hits.map((h) => ({
      docId: h.doc_id,
      chunkId: h.chunk_id,
      score: h.score,
      cited: citedChunkIds.has(h.chunk_id),
    })),
  },
})

Wrapping a streaming LLM call

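traced() records the span when the wrapped function settles, which for a streaming call can be before any of the output has been consumed. For streaming calls, consume the stream first and record the span yourself at stream end: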

typescript
async function callLLM(messages) {
  const stream = await openai.chat.completions.create({ ..., stream: true })
  let text = ''
  let usage = null
  for await (const chunk of stream) {
    text += chunk.choices[0]?.delta?.content ?? ''
    if (chunk.usage) usage = chunk.usage
  }
  await span({
    role: 'llm',
    capture: 'full+redact',
    attrs: { model, finish_reason: 'stop' },
    content: { kind: 'messages', messages: [...messages, { role: 'assistant', text }] },
  })
  return text
}

Compatibility

  • guard(), createRun(), guardMethod, guarded continue to work unchanged. The span API is additive.
  • Pre-v2 records (no role, no capture) validate against the new schema after defaults apply (role='tool', capture='hash').
  • verifyChain() succeeds with mixed pre-v2 and v2 records in the same chain.
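
One way to see why mixed chains can still verify: if defaults are applied only when a record is read, the stored payloads and their hashes never change. The sketch below is a generic hash chain under that assumption, not fuze-ai's actual scheme; the SHA-256 over prevHash plus JSON layout and the names appendRecord, withDefaults, and verifySketch are all hypothetical.

```typescript
import { createHash } from 'node:crypto'

// Hypothetical record layout for a generic hash chain.
interface StoredRecord { payload: Record<string, unknown>; prevHash: string; hash: string }

const GENESIS = '0'.repeat(64)

function hashOf(prevHash: string, payload: Record<string, unknown>): string {
  return createHash('sha256').update(prevHash + JSON.stringify(payload)).digest('hex')
}

// Append a record whose hash covers the previous hash and its own payload.
function appendRecord(chain: StoredRecord[], payload: Record<string, unknown>): void {
  const prevHash = chain.length ? chain[chain.length - 1].hash : GENESIS
  chain.push({ payload, prevHash, hash: hashOf(prevHash, payload) })
}

// Defaults are applied on read, so stored payloads (and hashes) stay untouched.
function withDefaults(payload: Record<string, unknown>): Record<string, unknown> {
  return { role: 'tool', capture: 'hash', ...payload }
}

// Recompute every hash from the stored payloads and check the links.
function verifySketch(chain: StoredRecord[]): boolean {
  let prev = GENESIS
  for (const rec of chain) {
    if (rec.prevHash !== prev || rec.hash !== hashOf(prev, rec.payload)) return false
    prev = rec.hash
  }
  return true
}

const chain: StoredRecord[] = []
appendRecord(chain, { tool: 'search' })                       // pre-v2: no role, no capture
appendRecord(chain, { role: 'llm', capture: 'full+redact' })  // v2
```

Here verifySketch(chain) passes for the mixed chain, withDefaults fills in role and capture for the first record at read time, and changing any stored payload afterwards makes verification fail.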

Reference