This page is the reference for the spans AgentMark collects and the attributes it reads off them. Use it to look up a span type, an attribute name, or a SpanOptions field. To wire any of this into your application, see Set up tracing.

Collected spans

AgentMark classifies every ingested OpenTelemetry span by type:
| Span type | Description | Key attributes |
| --- | --- | --- |
| GENERATION | A model call. Detected from gen_ai.* attributes or AI SDK span names (ai.generateText, ai.streamText, and their doGenerate / doStream children) | ai.model.id, ai.prompt, ai.response.text, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens; streaming calls add ai.response.msToFirstChunk, ai.response.avgCompletionTokensPerSecond |
| SPAN | Any other operation: tool calls, retrieval steps, custom span() / observe() wrappers | ai.toolCall.name, ai.toolCall.args, ai.toolCall.result, agentmark.* |
The spans API accepts a third type, EVENT, for point-in-time records.
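
The detection rules for GENERATION spans can be sketched as a small classifier. This is an illustration of the rules as described on this page, not AgentMark's actual implementation; the function name `classify_span` and the exact set of child span names are assumptions.

```python
from typing import Mapping

# AI SDK span names treated as model calls (assumed full names of the
# doGenerate / doStream children mentioned in the table).
AI_SDK_GENERATION_NAMES = frozenset({
    "ai.generateText",
    "ai.generateText.doGenerate",
    "ai.streamText",
    "ai.streamText.doStream",
})


def classify_span(name: str, attributes: Mapping[str, object]) -> str:
    """Return "GENERATION" for model calls and "SPAN" for everything else."""
    # Rule 1: any gen_ai.* attribute marks the span as a model call.
    if any(key.startswith("gen_ai.") for key in attributes):
        return "GENERATION"
    # Rule 2: the span name matches a known AI SDK model-call span.
    if name in AI_SDK_GENERATION_NAMES:
        return "GENERATION"
    # Everything else (tool calls, retrieval steps, custom wrappers).
    return "SPAN"


print(classify_span("ai.streamText.doStream", {}))
print(classify_span("ai.toolCall", {"ai.toolCall.name": "search"}))
```

EVENT is left out of the sketch because this page does not describe how point-in-time records are detected.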

Span attributes

Each span carries detailed attributes:
  • Model information: ai.model.id (e.g., "gpt-5-mini"), ai.model.provider (e.g., "openai")
  • Token usage: gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.usage.total_tokens (the ai.usage.promptTokens / ai.usage.completionTokens aliases are also accepted)
  • Telemetry metadata: ai.telemetry.functionId, ai.telemetry.metadata.*
  • Response details: ai.response.text, ai.response.toolCalls, ai.response.finishReason
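
If you read token usage off exported spans yourself, the alias handling might look like the sketch below. The helper names are hypothetical, and two behaviors are assumptions not stated on this page: the gen_ai.* key is preferred when both forms are present, and a missing total is computed from input plus output.

```python
from typing import Mapping, Optional


def _first_int(attributes: Mapping[str, object], *keys: str) -> Optional[int]:
    """Return the first attribute found among keys as an int, else None."""
    for key in keys:
        value = attributes.get(key)
        if value is not None:
            return int(value)
    return None


def read_token_usage(attributes: Mapping[str, object]) -> dict:
    """Read token counts, accepting the ai.usage.* aliases as a fallback."""
    # Assumption: the gen_ai.* key wins when both forms are present.
    input_tokens = _first_int(
        attributes, "gen_ai.usage.input_tokens", "ai.usage.promptTokens"
    )
    output_tokens = _first_int(
        attributes, "gen_ai.usage.output_tokens", "ai.usage.completionTokens"
    )
    total_tokens = _first_int(attributes, "gen_ai.usage.total_tokens")
    # Assumption: derive the total when the span did not report one.
    if total_tokens is None and input_tokens is not None and output_tokens is not None:
        total_tokens = input_tokens + output_tokens
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": total_tokens,
    }


print(read_token_usage({"ai.usage.promptTokens": 7, "ai.usage.completionTokens": 3}))
```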

Prompt identity and version linking

Prompt runs are linked to the exact prompt version, not just the prompt name:
  • agentmark.prompt_name: the prompt’s frontmatter name. Both the TypeScript and Python SDKs emit this attribute on the prompt-run span; it powers the prompt label column on the trace surfaces.
  • agentmark.metadata.commit_sha: the git commit the prompt content was served at. When a prompt is loaded from AgentMark Cloud, the gateway stamps the served-at commit into the prompt’s agentmark_meta.commit_sha (the pinned environment commit for keys bound to a pinned environment, the latest synced commit otherwise). The local CLI dev server stamps your repo’s HEAD the same way. The SDK’s run path echoes that commit onto the trace automatically, with no code change needed.
On ingest, the gateway verifies the SDK-supplied commit against the environment’s server-side deployment pointer (trust-but-verify): for API keys bound to an environment, the stamped CommitSha always reflects the server’s recorded deployment, so a client can never claim an arbitrary commit.

Trace-level input and output

The trace list and trace detail views show a single input/output per trace. These are derived from spans at read time, identically in AgentMark Cloud and the local dev server’s GET /v1/traces/:id:
  1. Root span first. The prompt-run (root) span’s agentmark.input / agentmark.output attributes win when present. The WebhookRunner records these automatically on every run: the formatted messages as input right after the prompt renders, and the final text/object as output when the event stream drains, in both streaming and non-streaming modes.
  2. GENERATION fallback. When the root span carries no I/O (third-party OTEL instrumentation pointed straight at the collector), the first GENERATION span’s input and the last GENERATION span’s output are used instead, in timestamp order.
The fields resolve independently: a trace with only a root-span output still gets its input from the GENERATION fallback. Two practical implications:
  • Executors never set trace I/O. If agentmark doctor --smoke (the CLI’s instrumentation smoke test) reports a trace missing input/output, the fix is in instrumentation, not in your executor.
  • GENERATION spans come from your model SDK’s instrumentation, e.g. experimental_telemetry with the Vercel AI SDK, or Agent.instrument_all(InstrumentationSettings(version=3)) with Pydantic AI.
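
The resolution order above can be sketched as follows. This is a simplified model rather than the actual read-path code: spans are plain dictionaries, and the `is_root`, `type`, `timestamp`, `input`, and `output` keys are hypothetical names chosen for the example. Only agentmark.input and agentmark.output come from this page.

```python
from typing import Optional, Sequence, Tuple


def resolve_trace_io(spans: Sequence[dict]) -> Tuple[Optional[str], Optional[str]]:
    """Resolve a trace's input and output independently of each other."""
    root = next((s for s in spans if s.get("is_root")), None)
    root_attrs = root.get("attributes", {}) if root else {}

    # GENERATION spans in timestamp order, used only as a fallback.
    generations = sorted(
        (s for s in spans if s.get("type") == "GENERATION"),
        key=lambda s: s["timestamp"],
    )

    # Input: root span first, then the first GENERATION span.
    trace_input = root_attrs.get("agentmark.input")
    if trace_input is None and generations:
        trace_input = generations[0].get("input")

    # Output: root span first, then the last GENERATION span.
    trace_output = root_attrs.get("agentmark.output")
    if trace_output is None and generations:
        trace_output = generations[-1].get("output")

    return trace_input, trace_output


spans = [
    {"is_root": True, "type": "SPAN", "timestamp": 0,
     "attributes": {"agentmark.output": "final answer"}},
    {"type": "GENERATION", "timestamp": 2, "input": "second prompt", "output": "draft 2"},
    {"type": "GENERATION", "timestamp": 1, "input": "first prompt", "output": "draft 1"},
]
print(resolve_trace_io(spans))
```

In the example the root span carries only an output, so the input falls back to the earliest GENERATION span while the output still comes from the root.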

SpanOptions

SpanOptions configure a span created with span() / span_context(). To set them up in code, see Grouping operations into a span.
| Option | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Name for the span |
| metadata | Record<string, string> | No | Custom key-value metadata (strings only) |
| promptName | string | No | Prompt name, emitted as the agentmark.prompt_name attribute (same key in Python via prompt_name) |
| sessionId | string | No | Group traces into a session |
| sessionName | string | No | Human-readable session name |
| userId | string | No | Associate trace with a user |
| datasetRunId | string | No | Link to a dataset run |
| datasetRunName | string | No | Human-readable dataset run name |
| datasetItemName | string | No | Specific dataset item name |
| datasetExpectedOutput | string | No | Expected output for evaluation |
| datasetPath | string | No | Path to the dataset file |
| experimentKey | string | No | Stable identity of the evaluation for experiment runs. Set it whenever you set datasetRunId |
| sourceTreeHash | string | No | Git tree hash of the code state the run executed against |
| datasetInput | string | No | JSON-serialized dataset row input, used to match rows across experiment runs |
The option names above are TypeScript. Python’s SpanOptions uses the snake_case equivalents: user_id, session_id, session_name, prompt_name, dataset_run_id, and so on.
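
The naming rule can be illustrated with a small converter. This sketch does not import the AgentMark SDK; it assumes every option follows the same mechanical camelCase to snake_case mapping as the names listed above, and the option values are made up for the example.

```python
import json
import re


def to_snake_case(name: str) -> str:
    """Convert a camelCase option name to snake_case."""
    # Insert an underscore before each capital letter except at the start.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


# Hypothetical option values, written with the TypeScript names.
ts_options = {
    "name": "nightly-eval",
    "datasetRunId": "run-123",
    "experimentKey": "summarizer-eval",
    # datasetInput is a JSON-serialized string, not an object.
    "datasetInput": json.dumps({"question": "What is tracing?"}, sort_keys=True),
}

py_options = {to_snake_case(key): value for key, value in ts_options.items()}
print(sorted(py_options))
```

Serializing with `sort_keys=True` is a choice made here so the same row always yields the same string; this page does not say how rows are compared.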

observe() options

observe() wraps an async function with automatic input/output capture and lets you set a SpanKind. For the call patterns, see Wrapping functions with observe().
| Option | Type | Description |
| --- | --- | --- |
| name | string | Display name for the span (defaults to function name) |
| kind | SpanKind | Type of operation (defaults to SpanKind.FUNCTION) |
| captureInput / capture_input | boolean | Record function arguments (default: true) |
| captureOutput / capture_output | boolean | Record return value (default: true) |
| processInputs / process_inputs | function | Transform arguments before recording (useful for redacting sensitive data) |
| processOutputs / process_outputs | function | Transform return value before recording |
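
To show how the capture and process options interact, here is a self-contained stand-in decorator. It is not AgentMark's observe(): `observe_sketch`, the `RECORDED` list, and the shape of the value passed to `process_inputs` (a dictionary of argument names to values) are all assumptions made for illustration.

```python
import asyncio
import functools
import inspect

RECORDED = []  # Stand-in for a span exporter; holds one dict per call.


def observe_sketch(name=None, capture_input=True, capture_output=True,
                   process_inputs=None, process_outputs=None):
    """Hypothetical stand-in that mimics the options in the table above."""
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(*args, **kwargs):
            record = {"name": name or fn.__name__}
            if capture_input:
                # Map positional and keyword arguments to parameter names.
                bound = inspect.signature(fn).bind(*args, **kwargs)
                inputs = dict(bound.arguments)
                record["input"] = process_inputs(inputs) if process_inputs else inputs
            result = await fn(*args, **kwargs)
            if capture_output:
                record["output"] = process_outputs(result) if process_outputs else result
            RECORDED.append(record)
            return result  # The caller always gets the untransformed value.
        return wrapper
    return decorator


def redact_api_key(inputs: dict) -> dict:
    """Replace the api_key argument before it is recorded."""
    return {k: ("[REDACTED]" if k == "api_key" else v) for k, v in inputs.items()}


@observe_sketch(process_inputs=redact_api_key)
async def fetch_profile(user_id: str, api_key: str) -> dict:
    return {"user_id": user_id, "plan": "pro"}


asyncio.run(fetch_profile("u-1", api_key="secret"))
print(RECORDED[0]["input"])
```

The transform only affects what is recorded; the wrapped function still receives the real arguments and returns its real result.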

SpanKind values

SpanKind sets the semantic kind of a span, which drives how spans are filtered and grouped on dashboards.
| Kind | Description |
| --- | --- |
| SpanKind.FUNCTION | Generic computation step (default) |
| SpanKind.LLM | A call to a language model |
| SpanKind.TOOL | An external tool or API call |
| SpanKind.AGENT | An orchestration loop |
| SpanKind.RETRIEVAL | A vector database query or document search |
| SpanKind.EMBEDDING | A call to an embedding model |
| SpanKind.GUARDRAIL | A content safety or validation check |

SpanResult

span() returns a SpanResult in TypeScript:
interface SpanResult<T> {
  result: Promise<T>;  // The result of your callback (as a Promise)
  traceId: string;     // The trace ID for correlation
}
result is Promise<T>, not T. You need to await it to get the resolved value.
In Python, span_context() is an async context manager that exposes trace_id on the context object:
async with span_context(SpanOptions(name="my-operation")) as ctx:
    print(ctx.trace_id)  # Available immediately
    result = await my_async_function()
