AgentMark

AgentMark is a fully local-first agent engineering platform that helps you build, test, and monitor AI applications with confidence.

What You Get

Prompt Management - Write prompts in readable Markdown + JSX

Declarative syntax that shows exactly what your LLM sees
Reusable components and templating
Version control-friendly format
Hot-reload during development

Testing & Evaluation - Ensure quality before production

Datasets for systematic testing
Custom evaluation functions
CLI and SDK for running experiments
CI/CD integration

Observability - Monitor and debug in production

Distributed tracing with OpenTelemetry
Session tracking for multi-step workflows
Token usage and cost monitoring
Integration with the AgentMark platform

Local-First - Everything runs on your machine

No vendor lock-in
Full control over your data
Works offline
Optional cloud sync with the AgentMark platform

Core Features

Readable Prompts:

<System>You are a helpful math tutor.</System>
<User>What's {props.num1} + {props.num2}?</User>
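
To make the templating idea concrete, here is a minimal sketch of how `{props.num1}` and `{props.num2}` could turn into the final messages an LLM receives. The names `fillProps` and `renderMathTutor` are hypothetical helpers written for this illustration; they are not part of the AgentMark SDK, and the real renderer may work differently.

```typescript
// Hypothetical helpers for illustration only; not the AgentMark implementation.
type Props = Record<string, string | number>;

interface Message {
  role: "system" | "user";
  content: string;
}

// Replace each {props.key} placeholder with the matching value.
// Unknown keys throw, so a typo in a prompt surfaces immediately.
function fillProps(template: string, props: Props): string {
  return template.replace(/\{props\.(\w+)\}/g, (_match, key: string) => {
    if (!Object.prototype.hasOwnProperty.call(props, key)) {
      throw new Error(`Missing prop: ${key}`);
    }
    return String(props[key]);
  });
}

// Mirrors the <System> and <User> tags of the prompt above.
function renderMathTutor(props: Props): Message[] {
  return [
    { role: "system", content: "You are a helpful math tutor." },
    { role: "user", content: fillProps("What's {props.num1} + {props.num2}?", props) },
  ];
}

for (const message of renderMathTutor({ num1: 2, num2: 3 })) {
  console.log(`${message.role}: ${message.content}`);
}
// system: You are a helpful math tutor.
// user: What's 2 + 3?
```

Throwing on a missing prop is a design choice in this sketch: a prompt that silently renders an empty value is harder to debug than one that fails early.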

Systematic Testing:

npm run experiment math-tutor.prompt.mdx
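
Conceptually, an experiment pairs a dataset with an evaluation function and reports how many rows pass. The sketch below shows that loop in plain TypeScript. `DatasetRow`, `EvalFn`, `fakeTutor`, and `runExperiment` are assumptions made for this example; AgentMark's actual dataset format and eval signature may differ.

```typescript
// Hypothetical shapes for illustration; not AgentMark's real dataset or eval API.
interface DatasetRow {
  input: { num1: number; num2: number };
  expected: string;
}

type EvalFn = (output: string, expected: string) => { passed: boolean; score: number };

// A custom evaluation function: exact match after trimming whitespace.
const exactMatch: EvalFn = (output, expected) => {
  const passed = output.trim() === expected.trim();
  return { passed, score: passed ? 1 : 0 };
};

// Synchronous stand-in for a model call so the sketch runs offline.
// A real model call would be asynchronous.
function fakeTutor(input: DatasetRow["input"]): string {
  return String(input.num1 + input.num2);
}

function runExperiment(
  rows: DatasetRow[],
  run: (input: DatasetRow["input"]) => string,
  evaluate: EvalFn,
): { total: number; passed: number; passRate: number } {
  let passed = 0;
  for (const row of rows) {
    if (evaluate(run(row.input), row.expected).passed) passed += 1;
  }
  const passRate = rows.length === 0 ? 0 : passed / rows.length;
  return { total: rows.length, passed, passRate };
}

const dataset: DatasetRow[] = [
  { input: { num1: 2, num2: 3 }, expected: "5" },
  { input: { num1: 10, num2: 7 }, expected: "17" },
  { input: { num1: 1, num2: 1 }, expected: "3" }, // deliberately wrong expectation
];

const summary = runExperiment(dataset, fakeTutor, exactMatch);
console.log(`${summary.passed}/${summary.total} passed`); // 2/3 passed
```

In CI, a pass rate below a chosen threshold could fail the build; the threshold itself is a project decision.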

Production Monitoring:

telemetry: {
  isEnabled: true,
  metadata: { userId, sessionId }
}
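
Attaching `userId` and `sessionId` as metadata is what makes per-session analysis possible later. As a rough sketch of that idea, the code below groups token usage and cost by session. The `UsageRecord` shape and the prices are assumptions invented for this example; they do not come from AgentMark or any provider.

```typescript
// Hypothetical record shape for illustration; not an AgentMark type.
interface UsageRecord {
  sessionId: string;
  userId: string;
  inputTokens: number;
  outputTokens: number;
}

// Assumed prices in USD per one million tokens. Substitute your provider's rates.
const PRICE_PER_MILLION = { input: 1.0, output: 2.0 };

function costUsd(record: UsageRecord): number {
  const inputCost = record.inputTokens * PRICE_PER_MILLION.input;
  const outputCost = record.outputTokens * PRICE_PER_MILLION.output;
  return (inputCost + outputCost) / 1e6;
}

// Sum tokens and cost for every session that appears in the records.
function totalsBySession(
  records: UsageRecord[],
): Map<string, { tokens: number; costUsd: number }> {
  const totals = new Map<string, { tokens: number; costUsd: number }>();
  for (const record of records) {
    const current = totals.get(record.sessionId) ?? { tokens: 0, costUsd: 0 };
    totals.set(record.sessionId, {
      tokens: current.tokens + record.inputTokens + record.outputTokens,
      costUsd: current.costUsd + costUsd(record),
    });
  }
  return totals;
}

const records: UsageRecord[] = [
  { sessionId: "s1", userId: "u1", inputTokens: 1000, outputTokens: 500 },
  { sessionId: "s1", userId: "u1", inputTokens: 2000, outputTokens: 0 },
  { sessionId: "s2", userId: "u2", inputTokens: 300, outputTokens: 200 },
];

for (const [sessionId, total] of totalsBySession(records)) {
  console.log(sessionId, total.tokens, total.costUsd.toFixed(6));
}
```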

Why AgentMark?

For Development:

Decouple prompts from application code
Test prompts systematically with datasets
Iterate quickly with hot-reload
Works with multiple AI SDKs (Vercel AI SDK, Claude Agent SDK, Mastra, Pydantic AI)

For Production:

Monitor performance and costs
Track token usage
Debug user issues
Analyze usage patterns

Supported Languages

Language	Adapters
TypeScript	AI SDK (Vercel), Claude Agent SDK, Mastra
Python	Pydantic AI, Claude Agent SDK

Have Questions?

We’re here to help! Choose the best way to reach us:

Join our Discord community for quick answers and discussions
Email us at hello@agentmark.co for support
Schedule an Enterprise Demo to learn about our business solutions
