Set up your AgentMark client
The AgentMark client is the piece of your code that actually executes prompts: it connects your .prompt.mdx files to real models through an adapter. One client powers three surfaces:
| Surface | Entry point (TS / Python) | What runs it |
|---|---|---|
| Your application code | agentmark.client.ts / agentmark_client.py | You |
| Local development | dev-entry.ts / .agentmark/dev_server.py | npx agentmark dev |
| AgentMark Cloud | handler.ts / handler.py | The deployment pipeline |
Run buttons in the dashboard (playground runs, experiments) dispatch to your deployed client. Until a deployment exists, those buttons are disabled with a pointer to this page.
Every stack’s client looks different — your adapter, your models, your tools — which is why there’s no one-size scaffolder: your AI tool detects your stack and writes the right one. See Let your agent set it up. The steps below are the manual path, and what the agent follows under the hood.
Prerequisites
- An AgentMark project: agentmark.json + an agentmark/ directory with at least one prompt (Quickstart)
- Node.js 18+ (the agentmark CLI runs on Node for both languages); Python projects also need Python 3.10+
- Your model provider’s API key (e.g. OPENAI_API_KEY)
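The project-level prerequisites above can be checked before going further. The following is a minimal sketch, not part of the AgentMark CLI: the helper name `missing_prerequisites` is hypothetical, and it assumes prompts are `*.prompt.mdx` files somewhere under agentmark/ and that your provider key is `OPENAI_API_KEY` unless you pass another name. It does not check Node.js or Python versions.

```python
import os
from pathlib import Path


def missing_prerequisites(root: Path, provider_key: str = "OPENAI_API_KEY") -> list[str]:
    """Return a human-readable list of prerequisites that are not met (hypothetical helper)."""
    problems = []
    if not (root / "agentmark.json").is_file():
        problems.append("agentmark.json not found")
    prompts_dir = root / "agentmark"
    # Any *.prompt.mdx file under agentmark/ counts as "at least one prompt".
    if not prompts_dir.is_dir() or not any(prompts_dir.rglob("*.prompt.mdx")):
        problems.append("no .prompt.mdx file under agentmark/")
    if not os.environ.get(provider_key):
        problems.append(f"{provider_key} is not set")
    return problems


if __name__ == "__main__":
    # Run from the project root; prints nothing when everything is in place.
    for problem in missing_prerequisites(Path.cwd()):
        print(f"missing: {problem}")
```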
Step 1: Install the adapter and CLI (TypeScript)
Pick the adapter matching your stack (all adapters). For the Vercel AI SDK:
npm install @agentmark-ai/ai-sdk-v5-adapter @agentmark-ai/loader-api @ai-sdk/openai ai
npm install -D @agentmark-ai/cli tsx typescript
Calling an SDK with no AgentMark adapter (the raw OpenAI client, AWS Bedrock Converse, a bespoke HTTP wrapper)? You don’t need one: Bring your own SDK builds the same client from createExecutor + createWebhookRunner, and the runner plugs into the exact dev-entry.ts / handler.ts shapes below in place of the adapter’s webhook handler.
Step 2: Create agentmark.client.ts
The client file wires together three things: a loader (where prompts come from), a model registry (how model names resolve to real models), and the adapter:
import {
  createAgentMarkClient,
  VercelAIModelRegistry,
} from "@agentmark-ai/ai-sdk-v5-adapter";
import { ApiLoader } from "@agentmark-ai/loader-api";
import { openai } from "@ai-sdk/openai";

// Local dev: prompts come from `agentmark dev`'s API server (--api-port,
// default 9418). Deployed: prompts and datasets come from AgentMark Cloud
// using the env vars the deployment pipeline injects automatically.
const loader =
  process.env.NODE_ENV === "development"
    ? ApiLoader.local({ baseUrl: "http://localhost:9418" })
    : ApiLoader.cloud({
        apiKey: process.env.AGENTMARK_API_KEY!,
        appId: process.env.AGENTMARK_APP_ID!,
        baseUrl: process.env.AGENTMARK_BASE_URL,
      });

const modelRegistry = new VercelAIModelRegistry();
modelRegistry.registerProviders({ openai });

export const client = createAgentMarkClient({
  loader,
  modelRegistry,
});
Don’t point ApiLoader.local at process.env.AGENTMARK_BASE_URL — that variable overrides the cloud endpoint (managed deployments inject it), and reusing it for the local loader silently breaks agentmark dev whenever it’s set.
See Client config for everything else the client can register: tools, evals, MCP servers, custom models, type safety.
Step 3: Run locally with agentmark dev
agentmark dev starts a local API server (serves your prompt files) and a webhook server (executes prompts through your client). The webhook server boots from a dev-entry.ts file at your project root:
// Local dev webhook server: `agentmark dev` runs this with tsx.
// Mark this process as development BEFORE the client loads, so the
// client's loader switch picks the local dev server over the cloud.
process.env.NODE_ENV ||= "development";

import { createWebhookServer } from "@agentmark-ai/cli/runner-server";
import { VercelAdapterWebhookHandler } from "@agentmark-ai/ai-sdk-v5-adapter/runner";

async function main() {
  const { client } = await import("./agentmark.client");
  const args = process.argv.slice(2);
  const portArg = args.find((arg) => arg.startsWith("--webhook-port="));
  const port = portArg ? parseInt(portArg.split("=")[1], 10) : 9417;
  const handler = new VercelAdapterWebhookHandler(client);
  await createWebhookServer({ port, handler });
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
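The NODE_ENV line matters: dev-entry.ts only ever runs under agentmark dev, so it defaults the loader to local mode whenever your shell leaves NODE_ENV unset. Because ||= assigns only when the variable is unset or empty, an explicit value such as NODE_ENV=production in your shell still takes precedence. Static imports are hoisted, but the client import is dynamic, so it resolves after the assignment runs.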
Start the dev stack and run a prompt:
npx agentmark dev
# in another terminal:
npx agentmark run-prompt ./agentmark/my-prompt.prompt.mdx
=== Text Prompt Results ===
The capital of France is Paris.
────────────────────────────────────────────────────────────
🪙 12 in, 8 out, 20 total
Experiments work the same way: datasets resolve through the local API server. Your prompt needs a dataset first (test_settings.dataset in its frontmatter; see Datasets):
npx agentmark run-experiment ./agentmark/my-prompt.prompt.mdx
If agentmark dev exits with No dev server entry point found, the dev-entry.ts file above is what it’s looking for.
Step 4: Add a deployment entry point (handler.ts)
AgentMark Cloud executes your client through a single handler function. Each dashboard run (playground or experiment) arrives as one { type, data } event; handleWebhookRequest does the dispatch:
// AgentMark Cloud deployment entry point. The deployment pipeline bundles
// this file and wraps it in a managed HTTP server.
import {
  handleWebhookRequest,
  type WebhookRequest,
} from "@agentmark-ai/cli/runner-server";
import { VercelAdapterWebhookHandler } from "@agentmark-ai/ai-sdk-v5-adapter/runner";
import { client } from "./agentmark.client";

const webhookHandler = new VercelAdapterWebhookHandler(client);

export default function handler(event: WebhookRequest) {
  return handleWebhookRequest(event, webhookHandler);
}
The pipeline resolves your handler in this order: the handler key in agentmark.json if set, then handler.py, then handler.ts at the repository root (see handler detection).
Step 5: Deploy
- Connect your repository in the dashboard (the app’s setup card, or Deployments). Every push then triggers the deployment pipeline: file sync, then a code deploy of your handler to a managed machine.
- Set your provider keys under Settings → Environment variables (e.g. OPENAI_API_KEY). The platform injects AGENTMARK_API_KEY, AGENTMARK_APP_ID, AGENTMARK_BASE_URL, and AGENTMARK_DISPATCH_SECRET automatically, so your client’s cloud loader is already wired for them.
- Push. Watch the build under Deployments; when it goes green, Run buttons in the playground and experiments go live against your deployed client.
Deployed experiments stream their datasets from your linked repository (at the environment’s branch or pinned commit), so the repo connection isn’t just for syncing — it’s how your datasets reach the deployed client at run time.
Step 1: Install the adapter (Python)
Pick the adapter matching your stack (all adapters). For Pydantic AI:
pip install agentmark-pydantic-ai-v0
ApiLoader ships with agentmark-prompt-core (installed as a dependency), so there’s no separate loader package.
Step 2: Create agentmark_client.py
The client file wires together three things: a loader (where prompts come from), a model registry (how model names resolve to real models), and the adapter:
import os

from agentmark.prompt_core import ApiLoader
from agentmark_pydantic_ai_v0 import (
    create_pydantic_ai_client,
    PydanticAIModelRegistry,
)

model_registry = PydanticAIModelRegistry()
model_registry.register_models(
    ["gpt-5", "gpt-5-mini"],
    lambda name, opts=None: f"openai:{name}",
)

# Local dev: prompts come from `agentmark dev`'s API server (--api-port,
# default 9418). Deployed: prompts and datasets come from AgentMark Cloud
# using the env vars the deployment pipeline injects automatically.
if os.getenv("NODE_ENV") == "development":
    loader = ApiLoader.local(base_url="http://localhost:9418")
else:
    loader = ApiLoader.cloud(
        api_key=os.environ["AGENTMARK_API_KEY"],
        app_id=os.environ["AGENTMARK_APP_ID"],
    )

client = create_pydantic_ai_client(
    model_registry=model_registry,
    loader=loader,
)
ApiLoader.cloud falls back to the AGENTMARK_BASE_URL environment variable automatically, so managed deployments reach the right gateway without extra configuration. Don’t reuse that variable for the local branch — it would re-point local dev at the cloud endpoint whenever it’s set.
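To make the loader switch easier to reason about, here is a minimal sketch that models the same decision as a pure function over an environment mapping. It does not import AgentMark; `loader_settings` and the returned dictionary shape are hypothetical, and a `base_url` of `None` only stands in for "let ApiLoader.cloud apply its own fallback".

```python
from typing import Mapping

LOCAL_API_URL = "http://localhost:9418"  # default --api-port of `agentmark dev`


def loader_settings(env: Mapping[str, str]) -> dict:
    """Describe which loader the client file would build for a given environment."""
    if env.get("NODE_ENV") == "development":
        # The local branch deliberately never reads AGENTMARK_BASE_URL.
        return {"mode": "local", "base_url": LOCAL_API_URL}
    return {
        "mode": "cloud",
        "api_key": env["AGENTMARK_API_KEY"],
        "app_id": env["AGENTMARK_APP_ID"],
        # None is a placeholder for "use the loader's own default endpoint".
        "base_url": env.get("AGENTMARK_BASE_URL"),
    }


# A cloud endpoint in the environment does not leak into local mode.
print(loader_settings({"NODE_ENV": "development", "AGENTMARK_BASE_URL": "https://gateway.example"}))
```

Keeping the local URL as a constant, rather than reading it from the shared variable, is what prevents a managed deployment's settings from redirecting local development.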
See Client config for everything else the client can register: tools, evals, MCP servers, custom models.
Step 3: Run locally with agentmark dev
agentmark dev starts a local API server (serves your prompt files) and a webhook server (executes prompts through your client). For Python projects it boots .agentmark/dev_server.py:
"""AgentMark dev webhook server entry point."""
import argparse
import os
import sys
from pathlib import Path

# Mark this process as development BEFORE the client loads, so the
# client's loader switch picks the local dev server over the cloud.
os.environ.setdefault("NODE_ENV", "development")

# Make the project root importable so agentmark_client resolves.
sys.path.insert(0, str(Path(__file__).parent.parent))

from agentmark_pydantic_ai_v0 import create_webhook_server
from agentmark_client import client

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--webhook-port", type=int, default=9417)
    parser.add_argument("--api-server-port", type=int, default=9418)
    args = parser.parse_args()
    create_webhook_server(client, args.webhook_port, args.api_server_port)
The NODE_ENV line matters: dev_server.py only ever runs under agentmark dev, and it sets the variable before importing agentmark_client, so the loader switch picks local mode unless your shell already exports a different NODE_ENV (setdefault never overwrites an existing value).
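The behavior of that first line depends on how `setdefault` treats a value that is already present. As a small illustration, the sketch below applies the same call to a plain dictionary standing in for `os.environ`; the helper name `effective_node_env` is hypothetical.

```python
def effective_node_env(shell_env: dict) -> str:
    """Return NODE_ENV as the dev entry would see it after its first line runs."""
    env = dict(shell_env)  # copy so the caller's mapping is left untouched
    env.setdefault("NODE_ENV", "development")  # same call the dev entry makes on os.environ
    return env["NODE_ENV"]


print(effective_node_env({}))                          # development
print(effective_node_env({"NODE_ENV": "production"}))  # production
```

If a local run unexpectedly reaches the cloud loader, checking whether your shell exports NODE_ENV is a reasonable first step.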
Start the dev stack (the CLI detects your virtualenv) and run a prompt:
npx agentmark dev
# in another terminal:
npx agentmark run-prompt ./agentmark/my-prompt.prompt.mdx
=== Text Prompt Results ===
The capital of France is Paris.
────────────────────────────────────────────────────────────
🪙 12 in, 8 out, 20 total
Experiments work the same way: datasets resolve through the local API server. Your prompt needs a dataset first (test_settings.dataset in its frontmatter; see Datasets):
npx agentmark run-experiment ./agentmark/my-prompt.prompt.mdx
More on the Python dev server (custom entry points, ports, environment): Python dev server.
Step 4: Add a deployment entry point (handler.py)
AgentMark Cloud executes your client through a single async handler function. Each dashboard run (playground or experiment) arrives as one {type, data} event:
"""AgentMark Cloud deployment entry point.

The deployment pipeline wraps this file in a managed HTTP server. Each
dashboard run (playground or experiment) arrives as one {type, data} event.
"""
from agentmark_pydantic_ai_v0.webhook import PydanticAIWebhookHandler
from agentmark_client import client

_webhook = PydanticAIWebhookHandler(client)


async def handler(event):
    event_type = event.get("type")
    data = event.get("data", {})
    if event_type == "prompt-run":
        return await _webhook.run_prompt(data["ast"], data.get("options") or {})
    if event_type == "dataset-run":
        return await _webhook.run_experiment(
            data["ast"],
            data.get("experimentId", "experiment"),
            data.get("datasetPath"),
            data.get("sampling"),
            data.get("commitSha"),
            data.get("concurrency"),
        )
    raise ValueError(f"Unknown event type: {event_type}")
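One way to exercise this dispatch without a model or network access is to swap in a stub webhook. The sketch below copies the dispatch logic into a factory so the stub can be injected; `StubWebhook`, `make_handler`, the empty `ast`, the dataset path, and the `{"ok": True}` return values are all placeholders for illustration, not shapes the platform is known to send or expect.

```python
import asyncio


class StubWebhook:
    """Hypothetical stand-in for PydanticAIWebhookHandler: records calls, runs no model."""

    def __init__(self):
        self.calls = []

    async def run_prompt(self, ast, options):
        self.calls.append(("run_prompt", ast, options))
        return {"ok": True}  # placeholder result

    async def run_experiment(self, ast, experiment_id, dataset_path, sampling, commit_sha, concurrency):
        self.calls.append(("run_experiment", experiment_id, dataset_path))
        return {"ok": True}  # placeholder result


def make_handler(webhook):
    """Build a handler with the same dispatch logic as handler.py, using the given webhook."""

    async def handler(event):
        event_type = event.get("type")
        data = event.get("data", {})
        if event_type == "prompt-run":
            return await webhook.run_prompt(data["ast"], data.get("options") or {})
        if event_type == "dataset-run":
            return await webhook.run_experiment(
                data["ast"],
                data.get("experimentId", "experiment"),
                data.get("datasetPath"),
                data.get("sampling"),
                data.get("commitSha"),
                data.get("concurrency"),
            )
        raise ValueError(f"Unknown event type: {event_type}")

    return handler


stub = StubWebhook()
handler = make_handler(stub)
asyncio.run(handler({"type": "prompt-run", "data": {"ast": {}}}))
asyncio.run(handler({"type": "dataset-run", "data": {"ast": {}, "datasetPath": "agentmark/data.jsonl"}}))
print(stub.calls)
```

The recorded calls show the defaults the dispatch applies: missing options become an empty dict, and a missing experimentId becomes "experiment".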
The pipeline resolves your handler in this order: the handler key in agentmark.json if set, then handler.py, then handler.ts at the repository root (see handler detection).
Step 5: Deploy
- Connect your repository in the dashboard (the app’s setup card, or Deployments). Every push then triggers the deployment pipeline: file sync, then a code deploy of your handler to a managed machine.
- Set your provider keys under Settings → Environment variables (e.g. OPENAI_API_KEY). The platform injects AGENTMARK_API_KEY, AGENTMARK_APP_ID, AGENTMARK_BASE_URL, and AGENTMARK_DISPATCH_SECRET automatically, so your client’s cloud loader is already wired for them.
- Push. Watch the build under Deployments; when it goes green, Run buttons in the playground and experiments go live against your deployed client.
Deployed experiments stream their datasets from your linked repository (at the environment’s branch or pinned commit), so the repo connection isn’t just for syncing — it’s how your datasets reach the deployed client at run time.
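Before pushing, it can help to predict which file the pipeline will treat as your handler. The sketch below follows the resolution order described in Step 4; it is an illustration rather than the pipeline's actual code, `resolve_handler` is a hypothetical name, and it assumes the handler key is a top-level string in agentmark.json.

```python
import json
from pathlib import Path
from typing import Optional


def resolve_handler(repo_root: Path) -> Optional[str]:
    """Return the handler path chosen by the documented order, or None if nothing matches."""
    config_path = repo_root / "agentmark.json"
    if config_path.is_file():
        # Assumption: "handler" is a top-level string key in agentmark.json.
        configured = json.loads(config_path.read_text()).get("handler")
        if configured:
            return configured
    # Fall back to the conventional file names, Python first.
    for candidate in ("handler.py", "handler.ts"):
        if (repo_root / candidate).is_file():
            return candidate
    return None


if __name__ == "__main__":
    print(resolve_handler(Path.cwd()))
```

A repository that contains both handler.py and handler.ts resolves to handler.py under this order, so a mixed-language repo may want the explicit handler key.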
Let your agent set it up
The AgentMark skill gives your AI tool (Claude Code, Cursor, etc.) a setup workflow that scaffolds everything on this page — the client file, the dev entry, and the handler — matched to your stack’s language and adapter. Prompt it with:
Set up AgentMark in this project, including the client and a deployable handler.
The agent verifies its work the same way you would: npx agentmark run-prompt against the dev server.
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| No dev server entry point found | Missing dev entry | Create dev-entry.ts at the project root (TS) or .agentmark/dev_server.py (Python); see Step 3 |
| Local experiment fails with Not authorized | Client fell into the cloud loader locally | Keep the NODE_ENV line first in your dev entry (Step 3) |
| Run buttons disabled in the dashboard | No deployment for the selected environment | Deploy (Step 5) |
| Deployed run fails with Authentication failed | Stale dispatch secret | Trigger a Rebuild under Deployments |
| Deployed experiment finds no dataset | Dataset isn’t in the linked repo branch | Commit the .jsonl under agentmark/ and push |