Documentation Index

Fetch the complete documentation index at: https://docs.agentmark.co/llms.txt

Use this file to discover all available pages before exploring further.

Instrument with the SDK to capture traces automatically. Explore them in your terminal (local) or in the Dashboard with filtering, search, graph view, dashboards, and alerts. Built on OpenTelemetry, AgentMark automatically collects telemetry data from your prompts and provides actionable insights.

Core concepts

Traces

A trace represents a complete request or workflow in your application. Each trace is identified by a unique trace ID and contains one or more spans. Traces carry top-level attributes such as metadata, tags, user ID, and session ID.

Spans

Spans are individual operations within a trace, forming a tree structure. AgentMark records three span types:
  • ai.inference — the full lifecycle of an LLM call, including model, tokens, cost, and response
  • ai.toolCall — a single tool execution, including name, arguments, and result
  • ai.stream — streaming response metrics such as time to first token and tokens per second
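The trace/span model above can be sketched as plain data. These type names are illustrative only, not types exported by the AgentMark SDK:

```typescript
// Illustrative sketch of the trace/span data model described above.
// These type and field names are our own, not the SDK's.
type SpanType = 'ai.inference' | 'ai.toolCall' | 'ai.stream';

interface Span {
  spanId: string;
  type: SpanType;
  name: string;
  children: Span[]; // spans form a tree within a trace
}

interface Trace {
  traceId: string; // unique per request or workflow
  userId?: string; // top-level attributes
  sessionId?: string;
  tags?: string[];
  metadata?: Record<string, string>;
  spans: Span[];
}

// One trace: an LLM call whose handler executed a tool call.
const trace: Trace = {
  traceId: 'trace-001',
  userId: 'user-123',
  sessionId: 'session-abc',
  spans: [
    {
      spanId: 'span-1',
      type: 'ai.inference',
      name: 'greeting-handler',
      children: [
        { spanId: 'span-2', type: 'ai.toolCall', name: 'lookupUser', children: [] },
      ],
    },
  ],
};
```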

Span kinds

Every span has a semantic kind that categorizes the operation. Span kinds determine how spans can be filtered and how analytics are grouped on dashboards.
  • function — Generic computation step (default)
  • llm — A call to a language model
  • tool — An external tool or API call
  • agent — An orchestration loop that decides what to do next
  • retrieval — A vector database query or document search
  • embedding — A call to an embedding model
  • guardrail — A content safety or validation check
Set span kinds using observe() or ctx.span().
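As a rough illustration, a span-kind annotation might look like the following. The real observe() signature lives in the AgentMark SDK and is not spelled out here, so this sketch uses a stand-in with an assumed shape:

```typescript
// Hedged sketch: the SDK's observe() signature is an assumption here.
// This stand-in only demonstrates the shape of a span-kind annotation.
type SpanKind =
  | 'function' | 'llm' | 'tool' | 'agent'
  | 'retrieval' | 'embedding' | 'guardrail';

const recorded: { name: string; kind: SpanKind }[] = [];

// Stand-in for the SDK's observe(): runs fn and records its span kind.
// A real implementation would emit an OpenTelemetry span instead.
function observe<T>(opts: { name: string; kind: SpanKind }, fn: () => T): T {
  recorded.push(opts);
  return fn();
}

// Mark a document search as a 'retrieval' span so dashboards group it correctly.
const docs = observe({ name: 'search-docs', kind: 'retrieval' }, () => {
  return ['doc-1', 'doc-2']; // e.g. results of a vector-database query
});
```

See Tracing Setup for the SDK's actual observe() and ctx.span() usage.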

Sessions

Sessions group related traces together by session ID. Track multi-turn conversations, agent workflows, and batch processing runs. Each session aggregates cost, tokens, and latency across its traces. Learn more about Sessions →
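Grouping works by reusing one sessionId across turns. The telemetry shape below is taken from the quick start on this page; the functionId and turn field are hypothetical:

```typescript
// Each turn of a conversation reuses the same sessionId so its traces
// are grouped into one session. Shape follows the quick start example;
// 'support-chat-turn' and the turn field are hypothetical.
const sessionId = 'session-abc';

function telemetryForTurn(turn: number) {
  return {
    isEnabled: true,
    functionId: 'support-chat-turn',
    metadata: {
      userId: 'user-123',
      sessionId, // constant across turns
      sessionName: 'Customer Support Chat',
      turn: String(turn),
    },
  };
}

const turn1 = telemetryForTurn(1);
const turn2 = telemetryForTurn(2);
// Both turns share a sessionId, so their traces aggregate into one session.
```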

Scores

Scores are numeric evaluations attached to spans or traces. Set scores programmatically via the SDK using sdk.score(), or manually through annotations in the Dashboard.
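A programmatic score call might look like the following. The docs name sdk.score() but do not spell out its signature, so the field names here are assumptions, and the SDK client is replaced with a minimal stand-in to keep the sketch self-contained:

```typescript
// Hedged sketch: sdk.score()'s field names are assumptions, not the
// SDK's documented signature.
interface ScoreInput {
  traceId: string;   // attach to a trace (or a span, via a span ID)
  name: string;      // e.g. the evaluation metric's name
  value: number;     // the numeric evaluation
  comment?: string;
}

// Minimal stand-in for the SDK client so the example runs on its own.
const scores: ScoreInput[] = [];
const sdk = { score: (s: ScoreInput) => { scores.push(s); } };

sdk.score({
  traceId: 'trace-001',
  name: 'helpfulness',
  value: 0.9,
  comment: 'reviewed manually',
});
```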

Metadata and tags

  • Metadata — Custom key-value pairs attached to traces for context (environment, feature flags, customer tier). Automatically discovered as filter fields. Metadata →
  • Tags — String labels for categorization (environment, team, feature, release). Tags →
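Metadata rides along in the telemetry object, following the shape shown in the quick start on this page. The tags placement below is an assumption about how string labels might be passed, not a documented field:

```typescript
// Metadata shape follows the quick start's telemetry object; the
// `tags` field placement is a hypothetical assumption.
const telemetry = {
  isEnabled: true,
  metadata: {
    environment: 'production',   // discovered automatically as a filter field
    customerTier: 'enterprise',
    featureFlag: 'new-router',
  },
  tags: ['team-support', 'release-2024-06'],
};
```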

What gets tracked

  • Inference spans — Full prompt execution lifecycle: token usage, costs, response times, model information, and completion status.
  • Tool calls — Tool name, parameters, execution duration, success/failure status, and return values.
  • Streaming metrics — Time to first token, tokens per second, and total streaming duration.
  • Sessions — Group related traces by user interaction, multi-step workflow, or batch run.
  • Alerts — Monitor cost thresholds, latency spikes, error rates, and evaluation scores.

Quick start

Enable telemetry when formatting your prompts:
import { client } from './agentmark.client';
import { generateText } from 'ai';

const prompt = await client.loadTextPrompt('greeting.prompt.mdx');
const input = await prompt.format({
  props: { name: 'Alice' },
  telemetry: {
    isEnabled: true,
    functionId: 'greeting-handler',
    metadata: {
      userId: 'user-123',
      sessionId: 'session-abc',
      sessionName: 'Customer Support Chat'
    }
  }
});

const result = await generateText(input);
For full tracing setup including AgentMarkSDK, child spans, observe(), and span kinds, see Tracing Setup.

How data flows

Your application sends telemetry via the AgentMark SDK, which exports OpenTelemetry spans to the AgentMark gateway. The gateway processes and stores the data, powering the traces, metrics, and analytics views.
Spans are exported to the AgentMark Cloud gateway and stored in ClickHouse. View traces, dashboards, alerts, and analytics in the Dashboard.

Programmatic access

You can query traces, spans, sessions, scores, metrics, datasets, experiments, prompts, and runs programmatically using the REST API or the agentmark api CLI command. Use this to build custom integrations, pull data into external tools, or automate monitoring workflows. Most endpoints are available on both the local dev server and the AgentMark Cloud gateway. The local server returns 501 Not Available Locally for features that require ClickHouse aggregations (/v1/metrics, score analytics, /v1/traces/export). Use the capabilities endpoint to check which features a server supports.
# Query traces from the local dev server
npx agentmark api traces list --limit 10

# Get a specific trace with all its spans
npx agentmark api traces get <traceId>

# List sessions, scores, spans, and metrics locally
npx agentmark api sessions list --limit 5
npx agentmark api scores list
npx agentmark api spans list --limit 20

# Target AgentMark Cloud with --remote
npx agentmark api traces list --remote --limit 10

# List experiment results
npx agentmark api experiments list --limit 5

# Check which features are available
npx agentmark api capabilities get

Next steps

  • Tracing setup — Instrument your app with the SDK
  • Traces and logs — View execution timelines in the Dashboard
  • Sessions — Group related traces together
  • Alerts — Get notified of critical issues
  • Dashboards — Analyze usage, performance, and scores
  • API reference — Query traces, scores, and metrics via REST API
