AgentMark is a fully local-first agent engineering platform that helps you build, test, and monitor AI applications with confidence.

What You Get

Prompt Management - Write prompts in readable Markdown + JSX
  • Declarative syntax that shows exactly what your LLM sees
  • Reusable components and templating
  • Version control-friendly format
  • Hot-reload during development
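To illustrate the templating idea, here is a minimal sketch of prop substitution — a toy renderer, not AgentMark's actual MDX compiler; the `renderPrompt` helper and its props shape are hypothetical:

```typescript
// Toy renderer: replaces {props.<name>} placeholders in a plain string.
// AgentMark compiles full Markdown + JSX; this only mimics the
// substitution step for illustration.
type Props = Record<string, string | number>;

function renderPrompt(template: string, props: Props): string {
  return template.replace(/\{props\.(\w+)\}/g, (match, key) =>
    key in props ? String(props[key]) : match
  );
}

const user = renderPrompt("What's {props.num1} + {props.num2}?", {
  num1: 2,
  num2: 3,
});
// user === "What's 2 + 3?"
```

Because the template is plain text until props are supplied, the same prompt file can be reused, diffed, and versioned like any other source file.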
Testing & Evaluation - Ensure quality before production
  • Datasets for systematic testing
  • Custom evaluation functions
  • CLI and SDK for running experiments
  • CI/CD integration
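As a rough sketch of what a custom evaluation over a dataset looks like (the `DatasetItem` and `Evaluator` shapes here are illustrative, not AgentMark's actual SDK types, and a deterministic function stands in for the LLM call):

```typescript
// Sketch: run an evaluator over a small dataset and report a pass rate.
interface DatasetItem {
  input: { num1: number; num2: number };
  expected: string;
}

type Evaluator = (output: string, item: DatasetItem) => boolean;

const exactMatch: Evaluator = (output, item) => output === item.expected;

function runExperiment(
  dataset: DatasetItem[],
  model: (item: DatasetItem) => string,
  evaluate: Evaluator
): number {
  const passed = dataset.filter((item) => evaluate(model(item), item)).length;
  return passed / dataset.length; // pass rate in [0, 1]
}

const dataset: DatasetItem[] = [
  { input: { num1: 2, num2: 3 }, expected: "5" },
  { input: { num1: 1, num2: 1 }, expected: "2" },
];

// Stand-in "model": deterministic arithmetic instead of a real LLM call.
const score = runExperiment(
  dataset,
  (item) => String(item.input.num1 + item.input.num2),
  exactMatch
);
// score === 1
```

A pass-rate number like this is what makes CI/CD gating possible: fail the build when the score drops below a threshold.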
Observability - Monitor and debug in production
  • Distributed tracing with OpenTelemetry
  • Session tracking for multi-step workflows
  • Token usage and cost monitoring
  • Integration with AgentMark platform
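Cost monitoring boils down to simple accounting over token counts. A minimal sketch, with placeholder prices rather than any real model's pricing:

```typescript
// Sketch: compute request cost in USD from token usage counts.
// Prices are placeholders, not real model pricing.
interface Usage {
  promptTokens: number;
  completionTokens: number;
}

interface Pricing {
  inputPerMTok: number;  // USD per 1M input tokens
  outputPerMTok: number; // USD per 1M output tokens
}

function costUSD(usage: Usage, pricing: Pricing): number {
  return (
    (usage.promptTokens / 1_000_000) * pricing.inputPerMTok +
    (usage.completionTokens / 1_000_000) * pricing.outputPerMTok
  );
}

const cost = costUSD(
  { promptTokens: 1200, completionTokens: 300 },
  { inputPerMTok: 3, outputPerMTok: 15 }
);
// 1200/1e6 * 3 + 300/1e6 * 15 = 0.0036 + 0.0045 = 0.0081
```

Aggregating this per user or per session (via the tracing metadata) is what turns raw token counts into a cost dashboard.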
Local-First - Everything runs on your machine
  • No vendor lock-in
  • Full control over your data
  • Works offline
  • Optional cloud sync with AgentMark platform

Core Features

Readable Prompts:
<System>You are a helpful math tutor.</System>
<User>What's {props.num1} + {props.num2}?</User>
Systematic Testing:
npm run experiment math-tutor.prompt.mdx
Production Monitoring:
telemetry: {
  isEnabled: true,
  metadata: { userId, sessionId }
}

Why AgentMark?

For Development:
  • Decouple prompts from application code
  • Test prompts systematically with datasets
  • Iterate quickly with hot-reload
  • Works with popular AI SDKs (Vercel AI SDK, Claude Agent SDK, Mastra, Pydantic AI)
For Production:
  • Monitor performance and costs
  • Track token usage
  • Debug user issues
  • Analyze usage patterns

Supported Languages

Language     Adapters
TypeScript   AI SDK (Vercel), Claude Agent SDK, Mastra
Python       Pydantic AI, Claude Agent SDK

Have Questions?

We’re here to help! Choose the best way to reach us: