Claude Agent SDK (Python) - AgentMark Docs

The Claude Agent SDK adapter runs AgentMark prompts as agentic tasks using Anthropic’s Claude Agent SDK — multi-turn tool use, budget controls, permission management, and tracing. This page covers the Python adapter; for TypeScript, see Claude Agent SDK (TypeScript).

The Python adapter (agentmark-claude-agent-sdk-v0) is in alpha. The API surface documented here is stable, but expect occasional changes ahead of the v1 release. For type-safe structured outputs across all major providers, Pydantic AI is the recommended general-purpose Python adapter — reach for Claude Agent SDK when you specifically need Claude’s agentic capabilities.

Installation

pip install agentmark-claude-agent-sdk-v0 agentmark-prompt-core claude-agent-sdk

Set your Anthropic API key:

echo "ANTHROPIC_API_KEY=sk-ant-..." >> .env

Setup

Create your client with a ClaudeAgentModelRegistry. There is no create_default() — you must register models or providers explicitly so the model names in your prompt frontmatter resolve. The simplest setup registers a provider by prefix (anthropic/...):

agentmark_client.py

from pathlib import Path
from dotenv import load_dotenv
from agentmark.prompt_core import FileLoader
from agentmark_claude_agent_sdk_v0 import (
    create_claude_agent_client,
    ClaudeAgentModelRegistry,
)

load_dotenv()

model_registry = ClaudeAgentModelRegistry()
model_registry.register_providers({"anthropic": "anthropic"})

loader = FileLoader(str(Path(__file__).parent / "dist" / "agentmark"))

client = create_claude_agent_client(
    model_registry=model_registry,
    loader=loader,
)

For per-model options like max_thinking_tokens, register models explicitly with a ModelConfig creator. The creator receives the model name and the run options:

agentmark_client.py

from agentmark_claude_agent_sdk_v0 import (
    create_claude_agent_client,
    ClaudeAgentModelRegistry,
    ModelConfig,
)

model_registry = ClaudeAgentModelRegistry()
model_registry.register_models(
    ["claude-sonnet-4-20250514"],
    lambda name, _: ModelConfig(model=name),
)
model_registry.register_models(
    ["claude-opus-4-20250514"],
    lambda name, _: ModelConfig(model=name, max_thinking_tokens=10000),
)

client = create_claude_agent_client(
    model_registry=model_registry,
    loader=loader,
)

Running prompts

Use traced_query — it accepts the output of prompt.format() directly, runs the Claude Agent SDK query internally, and yields messages as the agent works. (run_text_prompt / run_object_prompt belong to the Pydantic AI adapter and are not exported here.)

import asyncio
from agentmark_claude_agent_sdk_v0 import traced_query
from agentmark_client import client

async def main():
    prompt = await client.load_text_prompt("code-reviewer.prompt.mdx")
    adapted = await prompt.format(props={
        "task": "Analyze the auth module and suggest improvements",
    })

    async for message in traced_query(adapted):
        print(message)

asyncio.run(main())

Messages stream as the agent runs; the final aggregated result is only available once all turns complete.

Adapter options

Adapter options are set at client construction time, not inside prompt.format():

agentmark_client.py

from agentmark_claude_agent_sdk_v0 import (
    create_claude_agent_client,
    ClaudeAgentAdapterOptions,
)

client = create_claude_agent_client(
    model_registry=model_registry,
    loader=loader,
    adapter_options=ClaudeAgentAdapterOptions(
        permission_mode="bypassPermissions",
        max_turns=10,
        cwd="/path/to/project",
        max_budget_usd=5.00,
        allowed_tools=["Read", "Write", "Bash"],
        disallowed_tools=["WebFetch"],
        system_prompt_preset=False,
        on_warning=lambda w: print(f"Warning: {w}"),
    ),
)

Option	Type / values	Purpose
`permission_mode`	`'default' \| 'acceptEdits' \| 'bypassPermissions' \| 'plan'`	How the agent handles tool-permission prompts
`max_turns`	`int`	Cap on agent turns
`cwd`	`str`	Working directory for file/Bash tools
`max_budget_usd`	`float`	Hard spend cap in USD
`allowed_tools`	`list[str]`	Tool whitelist
`disallowed_tools`	`list[str]`	Tool blacklist
`system_prompt_preset`	`bool` (default `False`)	Use Claude Code’s built-in system prompt
`on_warning`	`Callable[[str], None]`	Warning handler

Object generation

For structured output, load an object prompt. The structured result arrives in the final result message:

from agentmark_claude_agent_sdk_v0 import traced_query
from agentmark_client import client

prompt = await client.load_object_prompt("sentiment.prompt.mdx")
adapted = await prompt.format(props={"text": "This product is amazing!"})

async for message in traced_query(adapted):
    if message.type == "result":
        print(message.result)

Tools

The Claude Agent SDK adapter handles tools by name, not by registering executors. List tool names in your prompt frontmatter and the adapter passes them through as allowed_tools to the SDK. Tools can be the SDK’s built-ins (Read, Write, Bash, …) or tools served by MCP servers. Configure MCP servers on the client (the Python field is mcp_servers, snake_case):

agentmark_client.py

client = create_claude_agent_client(
    model_registry=model_registry,
    loader=loader,
    mcp_servers={"weather": {"url": "https://weather-mcp.example.com/sse"}},
)

Then reference tools by name in your prompt:

task.prompt.mdx

---
name: task
text_config:
  model_name: claude-sonnet-4-20250514
  tools:
    - weather
---

<System>You are a helpful assistant with access to weather data.</System>
<User>{props.task}</User>

Evals

Register evaluation functions to score prompt outputs during experiments. Score schemas live in agentmark.json; eval functions connect to them by name.

agentmark_client.py

from agentmark.prompt_core import EvalParams, EvalResult

def exact_match(params: EvalParams) -> EvalResult:
    match = str(params["output"]).strip() == str(params.get("expectedOutput", "")).strip()
    return {"passed": match, "score": 1.0 if match else 0.0}

evals = {
    "exact_match": exact_match,
}

client = create_claude_agent_client(
    model_registry=model_registry,
    loader=loader,
    evals=evals,
)

Reference evals in your prompt frontmatter:

---
test_settings:
  dataset: ./datasets/test.jsonl
  evals:
    - exact_match
---

Learn more about evaluations.

Tracing

traced_query emits OpenTelemetry spans automatically when telemetry is enabled on prompt.format(). All tracing context — prompt name, model, system prompt, and props — is extracted from the adapted output:

from agentmark_claude_agent_sdk_v0 import traced_query

adapted = await prompt.format(
    props={"task": "..."},
    telemetry={"isEnabled": True},
)

async for message in traced_query(adapted):
    print(message)

See Tracing setup for wiring spans to the local dev server or AgentMark Cloud.

Limitations

No image generation — use the AI SDK adapter via a Node.js service.
No speech generation — same as above.
The aggregated final result is only available after all turns complete; intermediate state arrives as streamed messages.

Next steps

Pydantic AI

The general-purpose Python adapter

Tools and agents

Configure tools for your agents

Observability

Monitor your agents in production

Other integrations

Explore other AI frameworks

Have Questions?

We’re here to help! Choose the best way to reach us:

Email us at hello@agentmark.co for support
Schedule an Enterprise Demo to learn about our business solutions

Documentation Index

​Installation

​Setup

​Running prompts

​Adapter options

​Object generation

​Tools

​Evals

​Tracing

​Limitations

​Next steps

Pydantic AI

Tools and agents

Observability

Other integrations

​Have Questions?

Installation

Setup

Running prompts

Adapter options

Object generation

Tools

Evals

Tracing

Limitations

Next steps

Have Questions?