AgentMark uses OpenTelemetry to collect distributed traces and metrics for your prompt executions. This page covers everything from basic setup to advanced tracing patterns.
Install the SDK
npm install @agentmark-ai/sdk
pip install agentmark-sdk
Initialize Tracing
```typescript
import { AgentMarkSDK } from "@agentmark-ai/sdk";
import { createAgentMarkClient, VercelAIModelRegistry } from "@agentmark-ai/ai-sdk-v5-adapter";
import { openai } from "@ai-sdk/openai";
import { generateText } from "ai";

const sdk = new AgentMarkSDK({
  apiKey: process.env.AGENTMARK_API_KEY,
  appId: process.env.AGENTMARK_APP_ID,
  baseUrl: process.env.AGENTMARK_BASE_URL // defaults to https://api.agentmark.co
});

// Initialize tracing
const tracer = sdk.initTracing();

// Configure client
const modelRegistry = new VercelAIModelRegistry()
  .registerModels(["gpt-4o-mini"], (name) => openai(name));

const client = createAgentMarkClient({
  loader: sdk.getApiLoader(),
  modelRegistry
});

// Load and run prompt with telemetry
const prompt = await client.loadTextPrompt("greeting.prompt.mdx");
const input = await prompt.format({
  props: { name: "Alice" },
  telemetry: {
    isEnabled: true,
    functionId: "greeting-function",
    metadata: {
      userId: "123",
      environment: "production"
    }
  }
});

const result = await generateText(input);

// Shutdown tracer (only for short-running scripts)
await tracer.shutdown();
```
```python
import os
from agentmark_sdk import AgentMarkSDK
from agentmark_pydantic_ai_v0 import run_text_prompt
from agentmark_client import client

sdk = AgentMarkSDK(
    api_key=os.environ["AGENTMARK_API_KEY"],
    app_id=os.environ["AGENTMARK_APP_ID"],
)

# Initialize tracing
tracer = sdk.init_tracing()

# Load and run prompt with telemetry
prompt = await client.load_text_prompt("greeting.prompt.mdx")
params = await prompt.format(
    props={"name": "Alice"},
    telemetry={
        "isEnabled": True,
        "functionId": "greeting-function",
        "metadata": {
            "userId": "123",
            "environment": "production",
        },
    },
)
result = await run_text_prompt(params)

# Shutdown tracer (only for short-running scripts)
await tracer.shutdown()
```
For local development with `agentmark dev`, traces are sent to http://localhost:9418 automatically. Pass `disableBatch: true` for short-running scripts: `const tracer = sdk.initTracing({ disableBatch: true });`
For local development with `agentmark dev`, traces are sent to http://localhost:9418 automatically. Pass `disable_batch=True` for short-running scripts: `tracer = sdk.init_tracing(disable_batch=True)`
Collected Spans
AgentMark records these OpenTelemetry spans:
| Span Type | Description | Key Attributes |
| --- | --- | --- |
| ai.inference | Full inference call lifecycle | ai.model.id, ai.prompt, ai.response.text, ai.usage.promptTokens, ai.usage.completionTokens |
| ai.toolCall | Individual tool executions | ai.toolCall.name, ai.toolCall.args, ai.toolCall.result |
| ai.stream | Streaming response metrics | ai.response.msToFirstChunk, ai.response.msToFinish, ai.response.avgCompletionTokensPerSecond |
Span Attributes
Each span contains detailed attributes:
Model Information: ai.model.id (e.g., "gpt-4o-mini"), ai.model.provider (e.g., "openai")
Token Usage: ai.usage.promptTokens, ai.usage.completionTokens
Telemetry Metadata: ai.telemetry.functionId, ai.telemetry.metadata.*
Response Details: ai.response.text, ai.response.toolCalls, ai.response.finishReason
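As a concrete illustration, these attribute names land on the span as a flat key-value map. The values below are hypothetical, but the keys follow the list above; note how custom telemetry metadata is flattened under the ai.telemetry.metadata.* prefix:

```python
# Hypothetical attribute map for an ai.inference span. Keys follow the list
# above; the values are illustrative only.
span_attributes = {
    "ai.model.id": "gpt-4o-mini",
    "ai.model.provider": "openai",
    "ai.usage.promptTokens": 42,
    "ai.usage.completionTokens": 128,
    "ai.telemetry.functionId": "greeting-function",
    # Each metadata entry becomes its own flattened attribute
    "ai.telemetry.metadata.userId": "123",
    "ai.telemetry.metadata.environment": "production",
    "ai.response.finishReason": "stop",
}
```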
Grouping Traces
Group related operations using the trace function. In TypeScript, trace accepts a TraceOptions object and a callback. In Python, trace is an async context manager.
```typescript
import { trace } from "@agentmark-ai/sdk";

const { result, traceId } = await trace(
  { name: "user-request-handler" },
  async (ctx) => {
    const prompt = await client.loadTextPrompt("handler.prompt.mdx");
    const input = await prompt.format({
      props: { query: "What is AgentMark?" },
      telemetry: { isEnabled: true }
    });
    return await generateText(input);
  }
);

console.log("Trace ID:", traceId);
const output = await result;
```
```python
from agentmark_sdk import trace
from agentmark_pydantic_ai_v0 import run_text_prompt
from agentmark_client import client

async with trace(name="user-request-handler") as ctx:
    prompt = await client.load_text_prompt("handler.prompt.mdx")
    params = await prompt.format(
        props={"query": "What is AgentMark?"},
        telemetry={"isEnabled": True},
    )
    result = await run_text_prompt(params)
    print(f"Trace ID: {ctx.trace_id}")
```
TraceOptions
| Option | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Name for the trace |
| metadata | Record<string, string> | No | Custom key-value metadata |
| sessionId | string | No | Group traces into a session |
| sessionName | string | No | Human-readable session name |
| userId | string | No | Associate trace with a user |
| datasetRunId | string | No | Link to a dataset run |
| datasetRunName | string | No | Human-readable dataset run name |
| datasetItemName | string | No | Specific dataset item name |
| datasetExpectedOutput | string | No | Expected output for evaluation |
| datasetPath | string | No | Path to the dataset file |
TraceResult
```typescript
interface TraceResult<T> {
  result: Promise<T>;  // The result of your callback (as a Promise)
  traceId: string;     // The trace ID for correlation
}
```
result is Promise<T>, not T. You need to await it to get the resolved value.
```python
async with trace(name="my-operation") as ctx:
    print(ctx.trace_id)  # Available immediately
    result = await my_async_function()
```
Creating Child Spans
Use ctx.span() to create child spans within a trace:
```typescript
import { trace } from "@agentmark-ai/sdk";

const { result, traceId } = await trace(
  { name: "multi-step-workflow" },
  async (ctx) => {
    await ctx.span({ name: "validate-input" }, async (spanCtx) => {
      spanCtx.setAttribute("input.length", 42);
    });

    const output = await ctx.span({ name: "process-request" }, async (spanCtx) => {
      const prompt = await client.loadTextPrompt("process.prompt.mdx");
      const input = await prompt.format({
        props: { query: "process this" },
        telemetry: { isEnabled: true }
      });
      return await generateText(input);
    });

    await ctx.span({ name: "format-response" }, async (spanCtx) => {
      spanCtx.addEvent("formatting-complete");
    });

    return output;
  }
);
```
```python
from agentmark_sdk import trace
from agentmark_pydantic_ai_v0 import run_text_prompt
from agentmark_client import client

async with trace(name="multi-step-workflow") as ctx:
    async with ctx.span("validate-input") as span_ctx:
        span_ctx.set_attribute("input.length", 42)

    async with ctx.span("process-request") as span_ctx:
        prompt = await client.load_text_prompt("process.prompt.mdx")
        params = await prompt.format(
            props={"query": "process this"},
            telemetry={"isEnabled": True},
        )
        output = await run_text_prompt(params)

    async with ctx.span("format-response") as span_ctx:
        span_ctx.add_event("formatting-complete")
```
Setting Span Kind on Child Spans
Pass the kind option to categorize spans. This controls how they appear in graph view, filtering, and dashboards.
```typescript
import { trace, SpanKind } from "@agentmark-ai/sdk";

const { result } = await trace(
  { name: "rag-pipeline" },
  async (ctx) => {
    const docs = await ctx.span(
      { name: "search-knowledge-base", kind: SpanKind.RETRIEVAL },
      async (spanCtx) => {
        return await vectorDb.query({ query: userQuestion, topK: 5 });
      }
    );

    await ctx.span(
      { name: "check-content-policy", kind: SpanKind.GUARDRAIL },
      async (spanCtx) => {
        return await moderationService.check(userQuestion);
      }
    );

    const answer = await ctx.span(
      { name: "generate-answer", kind: SpanKind.LLM },
      async (spanCtx) => {
        const prompt = await client.loadTextPrompt("answer.prompt.mdx");
        const input = await prompt.format({
          props: { question: userQuestion, context: docs },
          telemetry: { isEnabled: true }
        });
        return await generateText(input);
      }
    );

    return answer;
  }
);
```
```python
from agentmark_sdk import trace, SpanKind
from agentmark_pydantic_ai_v0 import run_text_prompt
from agentmark_client import client

async with trace(name="rag-pipeline") as ctx:
    async with ctx.span("search-knowledge-base", kind=SpanKind.RETRIEVAL) as span_ctx:
        docs = await vector_db.query(query=user_question, top_k=5)

    async with ctx.span("check-content-policy", kind=SpanKind.GUARDRAIL) as span_ctx:
        await moderation_service.check(user_question)

    async with ctx.span("generate-answer", kind=SpanKind.LLM) as span_ctx:
        prompt = await client.load_text_prompt("answer.prompt.mdx")
        params = await prompt.format(
            props={"question": user_question, "context": docs},
            telemetry={"isEnabled": True},
        )
        answer = await run_text_prompt(params)
```
Wrapping Functions with observe()
observe() wraps an async function with automatic input/output capture. Unlike trace() and ctx.span() which create inline spans, observe() wraps a reusable function so every call is automatically traced.
```typescript
import { observe, SpanKind } from "@agentmark-ai/sdk";

const searchWeb = observe(
  async (query: string) => {
    const response = await fetch(`https://api.search.com?q=${query}`);
    return response.json();
  },
  { name: "search-web", kind: SpanKind.TOOL }
);

// Every call is now automatically traced
const results = await searchWeb("AgentMark tracing");
```
```python
import httpx
from agentmark_sdk import observe, SpanKind

@observe(name="search-web", kind=SpanKind.TOOL)
async def search_web(query: str) -> dict:
    async with httpx.AsyncClient() as client:
        response = await client.get(f"https://api.search.com?q={query}")
        return response.json()

# Every call is now automatically traced
results = await search_web("AgentMark tracing")
```
observe() Options
| Option | Type | Description |
| --- | --- | --- |
| name | string | Display name for the span (defaults to function name) |
| kind | SpanKind | Type of operation (defaults to SpanKind.FUNCTION) |
| captureInput / capture_input | boolean | Record function arguments (default: true) |
| captureOutput / capture_output | boolean | Record return value (default: true) |
| processInputs / process_inputs | function | Transform arguments before recording (useful for redacting sensitive data) |
| processOutputs / process_outputs | function | Transform return value before recording |
Observed functions automatically attach to the active trace context — they nest correctly inside trace() without extra wiring.
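To make the capture and redaction options concrete, the decorator below sketches observe()-style behavior. It is not the SDK's implementation: it records to a plain dict instead of a span, and the observe_sketch / lookup_user names are invented for illustration. process_inputs works as the table describes, transforming arguments before they are recorded:

```python
import asyncio
import functools

def observe_sketch(name=None, capture_input=True, capture_output=True,
                   process_inputs=None, process_outputs=None):
    """Illustrative stand-in for observe(): records to a dict instead of a span."""
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(*args, **kwargs):
            record = {"name": name or fn.__name__}
            if capture_input:
                inputs = {"args": args, "kwargs": kwargs}
                record["input"] = process_inputs(inputs) if process_inputs else inputs
            result = await fn(*args, **kwargs)
            if capture_output:
                record["output"] = process_outputs(result) if process_outputs else result
            # A real implementation would attach the record to the active span
            wrapper.last_record = record
            return result
        return wrapper
    return decorator

# Redact the API key before it is recorded, while still passing the real value through
@observe_sketch(
    name="lookup-user",
    process_inputs=lambda i: {**i, "kwargs": {**i["kwargs"], "api_key": "***"}},
)
async def lookup_user(user_id, api_key):
    return {"id": user_id}

result = asyncio.run(lookup_user("u-1", api_key="secret"))
```

The function still receives the real credential; only the recorded copy is masked, which is the point of process_inputs over simply disabling captureInput.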
SpanKind Values
| Kind | Description |
| --- | --- |
| SpanKind.FUNCTION | Generic computation step (default) |
| SpanKind.LLM | A call to a language model |
| SpanKind.TOOL | An external tool or API call |
| SpanKind.AGENT | An orchestration loop |
| SpanKind.RETRIEVAL | A vector database query or document search |
| SpanKind.EMBEDDING | A call to an embedding model |
| SpanKind.GUARDRAIL | A content safety or validation check |
Scoring Traces
Use sdk.score() to attach quality scores to traces or spans:
```typescript
const { result, traceId } = await trace(
  { name: "scored-workflow" },
  async (ctx) => {
    // ... run your workflow
    return output;
  }
);

await sdk.score({
  resourceId: traceId,
  name: "correctness",
  score: 0.95,
  label: "correct",
  reason: "Output matches expected result"
});
```
```python
async with trace(name="scored-workflow") as ctx:
    output = await my_async_function()

await sdk.score(
    resource_id=ctx.trace_id,
    name="correctness",
    score=0.95,
    label="correct",
    reason="Output matches expected result",
)
```
Best Practices
Use meaningful function IDs — "customer-support-greeting" not "func1"
Add relevant metadata — userId, environment, query parameters
Always enable telemetry in production — monitor performance and set up alerts
Shutdown tracer for short scripts — call tracer.shutdown() before the process exits
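The shutdown advice can be hardened with try/finally so buffered spans are flushed even when the workflow raises. The StubTracer below is a stand-in to keep the sketch self-contained; in real code the tracer comes from sdk.initTracing() (TypeScript) or sdk.init_tracing() (Python):

```python
import asyncio

class StubTracer:
    """Stand-in for the object returned by init_tracing(); illustrates the pattern only."""
    def __init__(self):
        self.flushed = False

    async def shutdown(self):
        # A real tracer flushes buffered spans to the collector here
        self.flushed = True

async def main():
    tracer = StubTracer()
    try:
        pass  # load prompts, run generateText / run_text_prompt, etc.
    finally:
        # Runs on success and on error, so short scripts never drop buffered spans
        await tracer.shutdown()
    return tracer

tracer = asyncio.run(main())
```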
Next Steps
Sessions: Group related traces together
Metadata: Add custom context to traces
Tags: Categorize traces with labels
PII Masking: Redact sensitive data from traces