Traces and logs - AgentMark Docs

See exactly how your prompts perform in production: every step, tool call, token count, and error for each request. AgentMark captures this by tracing each execution with OpenTelemetry.

Set up tracing in your application before traces appear here. See Tracing setup for the integration steps.

Traces panel showing prompt execution timeline with spans, token usage, and response times

The Traces panel lists each execution with columns for Name, Status, Latency, Cost, Tokens, Spans, Tags, and Timestamp. Click a row to open the trace detail view with the full span tree and attribute drill-down.

Understanding traces

A trace represents the complete execution of a prompt, including all its steps, tool calls, and metadata. Each trace contains: Execution timeline: see exactly when each step occurred and how long it took. Token usage: track input tokens, output tokens, and total tokens consumed. Costs: monitor spending on a per-request basis. Tool calls: view all tool executions, their parameters, and results. Custom metadata: add context like user IDs, session IDs, and custom attributes. Error information: detailed error messages and stack traces when issues occur.

Collected spans

AgentMark classifies every ingested span as GENERATION (a model call, detected from gen_ai.* attributes or AI SDK span names such as ai.generateText and ai.streamText) or SPAN (any other operation, including tool calls and custom wrappers); the spans API accepts a third type, EVENT, for point-in-time records. See Collected spans in the Span reference for the canonical span-type and attribute reference.

Span kinds

Each span carries a semantic kind (function (default), llm, tool, agent, retrieval, embedding, or guardrail) that categorizes the type of operation it represents. Span kinds affect how you filter spans and how dashboards group analytics. Set span kinds in code by wrapping functions with observe(). See SpanKind values for the full list.

LLM span attributes

Each LLM span contains attributes that vary slightly depending on the SDK integration you use. The table below shows common attributes across integrations:

AI SDK (Vercel)
Claude Agent SDK

Attribute	Description
`ai.model.id`	Model identifier
`ai.model.provider`	Model provider name
`gen_ai.usage.input_tokens`	Number of input (prompt) tokens
`gen_ai.usage.output_tokens`	Number of output (completion) tokens
`gen_ai.usage.prompt_tokens`	Number of input (prompt) tokens
`gen_ai.usage.completion_tokens`	Number of output (completion) tokens
`ai.settings.maxRetries`	Maximum retry attempts
`ai.telemetry.functionId`	Function identifier
`ai.telemetry.metadata.*`	Custom metadata
`ai.response.text`	Response text
`ai.response.toolCalls`	Tool calls array
`ai.response.finishReason`	Finish reason

Attribute	Description
`gen_ai.request.model`	Requested model name
`gen_ai.system`	AI system identifier (for example, `anthropic`)
`gen_ai.usage.input_tokens`	Number of input tokens
`gen_ai.usage.output_tokens`	Number of output tokens
`gen_ai.response.output`	Agent response output
`gen_ai.response.finish_reasons`	Completion finish reasons
`gen_ai.tool.name`	Tool name
`gen_ai.operation.name`	Operation type (`chat`, `execute_tool`, `invoke_agent`)

All integrations also support custom metadata via agentmark.metadata.* attributes.

Standard OTel GenAI attributes accepted on ingest

Trace ingestion also accepts the standard OTel GenAI semantic convention attributes as fallbacks when the AgentMark keys are absent, so spec-conformant instrumentation works without modification:

Standard attribute	Mapped to
`gen_ai.input.messages`	Span input (parsed from the `{role, parts[]}` message shape)
`gen_ai.output.messages`	Span output
`gen_ai.system_instructions`	Folded into the span input as a leading `system` message
`gen_ai.provider.name`	Provider identification (accepted wherever AgentMark reads `gen_ai.system`)
`gen_ai.conversation.id`	Session id (when `agentmark.session_id` is absent)
`gen_ai.prompt` / `gen_ai.completion`	Span input / output (legacy OTel GenAI keys)
`gen_ai.usage.prompt_tokens` / `gen_ai.usage.completion_tokens`	Input / output token counts (legacy names)

AgentMark-emitted keys (agentmark.* and the existing gen_ai.request.input / gen_ai.response.output) always take precedence over these fallbacks. The SDKs’ observe() / @observe wrappers and setInput() / setOutput() helpers write their IO to the canonical vendor-namespaced agentmark.request.input / agentmark.response.output attributes. They also emit the matching gen_ai.request.input / gen_ai.response.output keys for compatibility.

Grouping traces

Organize related traces together using custom grouping. This is useful for understanding complex workflows that span multiple prompt executions.

Grouped traces view showing a parent trace with nested child traces in the timeline

Grouped traces show a parent-child hierarchy in the trace list, with child spans indented under their parent. Use this to model multi-step agent workflows, nested component execution, and parallel processing pipelines.

Viewing traces

View traces in your local dev server at http://localhost:3000 or in the AgentMark Dashboard under the Traces tab. Both render the same trace explorer (execution timeline, span tree, graph view, and per-span attribute drill-down). Each trace shows:

Complete prompt execution timeline
Tool calls and their durations
Token usage and costs
Custom metadata and attributes
Error information (if any)
Graph visualization (when graph metadata is present)
Manual annotations for quality assessment

Generated media

When a span’s input or output contains generated media (images or audio), the Input / Output tab renders it inline. Images render as an <img>, audio with a player, in place of the raw payload. Media renders when the field value is, or contains, objects shaped { mimeType, base64 }. The Vercel AI SDK’s mediaType key also works. An image-generation output of [{ "mimeType": "image/png", "base64": "..." }], for example, renders as the image. A value of any other shape renders as text. AgentMark stores large fields (base64 media included) in object storage at ingest, then fetches them on demand when you open the span, so the trace list and timeline stay fast. This happens automatically, with no configuration.

AgentMark’s prompt runner captures generated images and audio automatically. The AI SDK’s experimental_generateImage emits no telemetry of its own, so to capture media from your own code, instrument the call and set the media as the span output. See Tracing image generation.

Filtering and search

AgentMark provides filtering across all trace dimensions: model, status, latency, cost, tokens, metadata, scores, and more. You can combine filters, save them as views, and share them via URL. Learn more about filtering and search

Integration

AgentMark works with any application that uses OpenTelemetry. For detailed setup instructions, see Tracing setup.

Gateway MCP

For debugging traces directly from your IDE, the gateway MCP server exposes the full AgentMark API as editor tools; for trace work you’ll mostly use list_traces, get_trace, and list_spans. This lets you query and inspect traces without leaving your development environment.

Traces and spans API

You can query traces and spans programmatically using the REST API or the CLI. Both the local dev server and the AgentMark Cloud gateway expose /v1/traces, /v1/traces/{traceId}, and /v1/spans, so you can develop against local data and switch to Cloud without changing your integration. Bulk export (/v1/traces/export) is Cloud-only.

Local REST
REST API (Cloud)
REST API (local)

# List traces from the local dev server
curl "http://localhost:9418/v1/traces?limit=20"

# Get a specific trace with its spans
curl "http://localhost:9418/v1/traces/<traceId>"

# Query spans across all traces
curl "http://localhost:9418/v1/spans?limit=50"

# List traces with filters. The traces `status` filter accepts OK | ERROR.
curl "https://api.agentmark.co/v1/traces?status=ERROR&limit=20" \
  -H "Authorization: Bearer am_live_abc123" \
  -H "X-Agentmark-App-Id: app_abc123"

# Get a single trace with all its spans
curl "https://api.agentmark.co/v1/traces/abc123-trace-id" \
  -H "Authorization: Bearer am_live_abc123" \
  -H "X-Agentmark-App-Id: app_abc123"

# Query spans across all traces
curl "https://api.agentmark.co/v1/spans?type=GENERATION&model=gpt-5&limit=50" \
  -H "Authorization: Bearer am_live_abc123" \
  -H "X-Agentmark-App-Id: app_abc123"

# Same endpoints, no auth required locally
curl "http://localhost:9418/v1/traces?status=ERROR&limit=20"

curl "http://localhost:9418/v1/traces/abc123-trace-id"

curl "http://localhost:9418/v1/spans?type=GENERATION&limit=50"

See the API reference for all available endpoints, filters, and response schemas. You can also create scores for spans and traces programmatically.

Cross-trace span search

The GET /v1/spans endpoint lets you search spans across all traces in your project. Unlike the traces API, which returns traces and their nested spans, the spans endpoint queries individual spans directly, regardless of which trace they belong to. This is useful when you need to:

Find all LLM calls using a specific model across your entire project
Identify slow operations by filtering on duration thresholds
Audit error spans across traces without browsing each trace individually
Analyze usage patterns for a particular span type (for example, all GENERATION spans)

Available filters:

Parameter	Description
`trace_id`	Restrict results to spans of one trace
`type`	Span type: `GENERATION`, `SPAN`, or `EVENT`
`status`	Span status: `UNSET`, `OK`, or `ERROR`
`name`	Partial match on span name
`model`	Partial match on model name
`min_duration`	Minimum duration in milliseconds
`max_duration`	Maximum duration in milliseconds
`start_date`	Earliest span start time (ISO 8601)
`end_date`	Latest span start time (ISO 8601)
`user_id`	Exact match on the user attributed to the span
`session_id`	Exact match on the session attributed to the span
`filter`	Advanced filter expression (string DSL); see Advanced filtering
`limit`	Results per page (1-500, default 100)
`offset`	Pagination offset

Local REST
REST API

# Find all error spans
curl "http://localhost:9418/v1/spans?status=ERROR"

# Find slow generations (over 5 seconds)
curl "http://localhost:9418/v1/spans?type=GENERATION&min_duration=5000"

# Search spans by model
curl "http://localhost:9418/v1/spans?model=claude&limit=20"

# Find all error spans
curl "https://api.agentmark.co/v1/spans?status=ERROR" \
  -H "Authorization: Bearer am_live_abc123" \
  -H "X-Agentmark-App-Id: app_abc123"

# Find slow generations (over 5 seconds)
curl "https://api.agentmark.co/v1/spans?type=GENERATION&min_duration=5000" \
  -H "Authorization: Bearer am_live_abc123" \
  -H "X-Agentmark-App-Id: app_abc123"

# Combine filters: slow Claude generations with errors
curl "https://api.agentmark.co/v1/spans?type=GENERATION&model=claude&status=ERROR&min_duration=3000" \
  -H "Authorization: Bearer am_live_abc123" \
  -H "X-Agentmark-App-Id: app_abc123"

# Paginate through results
curl "https://api.agentmark.co/v1/spans?type=GENERATION&limit=100&offset=100" \
  -H "Authorization: Bearer am_live_abc123" \
  -H "X-Agentmark-App-Id: app_abc123"

Each span in the response includes its traceId, so you can drill into the full trace for any span that matches your search.

Advanced filtering

Both GET /v1/traces and GET /v1/spans accept a filter parameter: a string DSL that combines predicates with and and supports parenthesized OR-groups, for example (model = "gpt-5" or model = "o3") and status = ERROR. For structured JSON filters with the additional in, notIn, and between operators, use POST /v1/traces/search and POST /v1/spans/search. GET /v1/filter-schema returns a machine-readable list of the filterable fields and operators per resource. See the API reference for the full grammar and request schemas.

Payload size limits

To guarantee reliable ingestion, AgentMark truncates oversized span payload fields at ingest time:

Per-field limit: 48 KiB. AgentMark cuts any string attribute value (inputs, outputs, tool calls, event attributes such as stack traces) larger than 48 KiB at a UTF-8-safe boundary.
Per-span limit: 100 KB. If a span’s total serialized size still exceeds 100 KB (several large fields), AgentMark shrinks the largest fields further until the span fits.

Truncated values always end with an explicit marker so you can tell at a glance that AgentMark cut the content:

…[truncated by AgentMark: original 384211 bytes]

Truncated spans also carry a truncated_fields entry in their metadata (visible in the span’s Metadata, for example ["ai.prompt.messages"]) listing exactly which fields AgentMark cut. AgentMark stores spans below the limits byte-for-byte as sent. If you routinely need full multi-hundred-KB payloads, store the full content in your own object storage and put a reference (URL or key) in span metadata instead.

Best practices

Use meaningful IDs: choose descriptive function IDs for easy filtering and debugging.
Add context: include relevant metadata like user IDs, session IDs, and business context.
Monitor regularly: check traces frequently to catch issues early.
Set up alerts: configure alerts for cost, latency, or error thresholds.
Analyze patterns: use the Dashboard’s filtering to identify trends and patterns.

Next steps

Sessions

Group related traces together

Alerts

Get notified of critical issues

Annotations

Manually label and score traces

Tracing setup

Integrate observability in your app

Have questions?

Reach out any time:

Email the team at hello@agentmark.co for support
Schedule an Enterprise Demo to learn about AgentMark’s business solutions

​Understanding traces

​Collected spans

​Span kinds

​LLM span attributes

​Standard OTel GenAI attributes accepted on ingest

​Grouping traces

​Viewing traces

​Generated media

​Filtering and search

​Integration

​Gateway MCP

​Traces and spans API

​Cross-trace span search

​Advanced filtering

​Payload size limits

​Best practices

​Next steps

Sessions

Alerts

Annotations

Tracing setup

​Have questions?

Understanding traces

Collected spans

Span kinds

LLM span attributes

Standard OTel GenAI attributes accepted on ingest

Grouping traces

Viewing traces

Generated media

Filtering and search

Integration

Gateway MCP

Traces and spans API

Cross-trace span search

Advanced filtering

Payload size limits

Best practices

Next steps

Have questions?