Changelog

2025-07-07

AgentMark init, CLI, Auth, Webhook, Streaming, Rollbacks, and more

Features & Improvements

AgentMark init
Updated examples in CLI init
Rebrand Puzzlet -> AgentMark
CLI: “run-prompt” for dataset + single props
Webhook Helpers
Alerts enhancements
Google/GitHub auth
Dataset runs directly via prompts
Vercel v4 webhook helper
Streaming to the platform
Commit History + Rollbacks

Bug Fixes

Ollama fix on init

2025-06-08

Attachments, Dataset references, MCP Server, Editor UI, and more

Features & Improvements

File/Image Attachments
Dataset references in prompts
MCP Server on init
Popout editor to prompt input
Improved datasets UI in editor

Bug Fixes

Disable publish button for read users

2025-05-09

AgentMark v3, Image/Speech Models, Vercel Adapter, Type-Safety, and more

Features & Improvements

AgentMark v3
Support Image + Speech Models
Vercel AI Adapter
Type-Safety for tools
Internal: Feature flag support
Commit messages

Bug Fixes

General bug fixes

2025-04-09

JSONL Datasets, Evals/Scoring, and more

JSONL Datasets

Datasets are now supported as JSONL files. This lets you test your prompts in bulk against large datasets, and supports streaming. Read Docs
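
As a rough illustration of why JSONL suits bulk and streamed runs, here is a minimal sketch that reads a dataset one record at a time. The field names `input` and `expected_output` and the helper `iter_jsonl` are assumptions made for this example, not AgentMark's documented schema or API.

```python
import io
import json
from typing import Iterator, TextIO


def iter_jsonl(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed record per non-empty line, without loading the whole file."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as err:
            raise ValueError(f"invalid JSON on line {line_number}: {err}") from err


# Hypothetical dataset contents; a real file would be opened with open(path).
sample = io.StringIO(
    '{"input": {"question": "2 + 2?"}, "expected_output": "4"}\n'
    '\n'
    '{"input": {"question": "Capital of France?"}, "expected_output": "Paris"}\n'
)

records = list(iter_jsonl(sample))
```

Because each line is an independent JSON document, a reader like this can start processing rows before the rest of the file has arrived.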

Evals & Scoring

We rolled out our initial evals support. Evals allow you to evaluate your prompts against a set of data and get a score. More to come here soon. Read Docs
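
The idea of scoring a prompt against a dataset can be sketched generically as follows. This is not AgentMark's eval API: `exact_match`, `evaluate`, `fake_prompt`, and the row fields are hypothetical names chosen for illustration, and the "model" is a lookup table standing in for a real call.

```python
from statistics import mean
from typing import Callable


def exact_match(output: str, expected: str) -> float:
    """Score 1.0 when the normalized strings match, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def evaluate(
    run_prompt: Callable[[str], str],
    dataset: list[dict],
    scorer: Callable[[str, str], float] = exact_match,
) -> float:
    """Run every dataset row through the prompt and return the mean score."""
    scores = [scorer(run_prompt(row["input"]), row["expected_output"]) for row in dataset]
    return mean(scores) if scores else 0.0


# Stand-in for a real model call (hypothetical): answers from a lookup table.
def fake_prompt(question: str) -> str:
    return {"2 + 2?": "4", "Capital of France?": "Lyon"}.get(question, "")


dataset = [
    {"input": "2 + 2?", "expected_output": "4"},
    {"input": "Capital of France?", "expected_output": "Paris"},
]

score = evaluate(fake_prompt, dataset)  # one of two rows matches
```

Keeping the scorer as a plain function makes it easy to swap exact matching for a fuzzier metric without touching the loop.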

Other

Consolidating prompts, evals, and datasets into single “files”
Officially rolled out alerts
Some CLI improvements
Minor bug fixes

2025-03-12

Sessions, Alerts, Trace UI Improvements, Onboarding Improvements

Sessions

Sessions provide a way to group related traces together, making it easier to monitor and debug complex workflows in your LLM applications. By organizing traces into sessions, you can track the entire lifecycle of a user interaction or a multi-step process. Read Docs
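
Conceptually, a session is a shared identifier carried by several traces. The sketch below shows that grouping in isolation; the `Trace` fields (including `session_id`) and `group_by_session` are assumptions for this example and do not reflect how AgentMark stores or exposes traces.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Trace:
    trace_id: str
    session_id: str  # hypothetical field linking a trace to its session
    name: str
    latency_ms: int


def group_by_session(traces: list[Trace]) -> dict[str, list[Trace]]:
    """Bucket traces by session so one interaction can be inspected end to end."""
    sessions: dict[str, list[Trace]] = defaultdict(list)
    for trace in traces:
        sessions[trace.session_id].append(trace)
    return dict(sessions)


traces = [
    Trace("t1", "session-a", "classify-intent", 120),
    Trace("t2", "session-b", "summarize", 300),
    Trace("t3", "session-a", "draft-reply", 450),
]

sessions = group_by_session(traces)
# Per-session totals make slow multi-step interactions easy to spot.
total_latency = {sid: sum(t.latency_ms for t in ts) for sid, ts in sessions.items()}
```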

Alerts

You can now get notified when your application is experiencing increased errors, latency, or costs. Configure alerts to notify you via Slack or a webhook. Read Docs

Traces UI Improvements

Traces now have a more user-friendly UI, with a focus on providing important information at a glance.

Onboarding Improvements

We’ve improved our onboarding. Now, you can see your dashboard without having to sync your repo first. We also support modular onboarding, so you can skip steps you don’t need.

2025-02-18

Add Trace Examples to Datasets, Load Trace in Prompt, Re-indexing, App UI Improvements, Bug Fixes

Adding Examples to Datasets

You can now add production trace data to your datasets with a single click. Read Docs

Adding Examples to Prompts

You can now add production trace examples to your prompts. This allows you to iterate on and test your prompts with real data. Read Docs

Re-indexing

You can now re-index your prompts and datasets. This performs a fresh pull of the content from your synced repository.

App UI Improvements

You can now easily view your app’s repo configuration, including repo name, branch, and more.

2025-01-27

Type Safety, Datasets, and more

Type Safety

AgentMark aims to provide developers with the best developer experience possible. As part of this, we’ve just added type safety to our platform.

Types can now be generated via our CLI
Fetching prompts from our CDN or AgentMark is now type-safe
Prompts now support run/compile/deserialize functions

Datasets

Datasets now allow you to test your prompts in bulk against a large set of data.

Run your datasets in bulk against your prompts
View previous runs, with inputs/outputs
View traces associated with each run
View high-level metrics for each run

Trace Grouping

Traces can now be grouped using the trace function and the component function. The trace function groups traces together at the root level, while the component function allows for sub-groups.

New function added: trace
New function added: component

CLI Improvements

Our CLI has been improved to provide a better developer experience.

AgentMark init can optionally create an example app
Added pull-models to walk through adding new models to your platform

Bug Fixes

Fixed a bug which could cause an app’s templates to be deleted when a new app was created
Fixed a bug which could cause some branches not to show up in the UI
Fixed a bug which could prevent newly created local prompts from being synced to the platform

Other

Improved UI for prompts input/output
Paginate traces
Improved UI theme for prompts

2025-01-16

Initial AgentMark Release

Overview

AgentMark is a git-based Prompt Engineering Platform that empowers both application developers and prompt engineers to collaborate seamlessly on GenAI products. AgentMark enables application developers to manage their configuration, prompts, datasets, and evals in a git-based workflow while also providing a hosted platform for seamless collaboration with non-technical team members.

Features

Prompt Management
Observability
Datasets
CLI
Platform Management
Evals

Prompt Management

AgentMark takes a developer-first approach to prompt management, treating prompts as files that live in your repository while still providing a platform for non-technical team members. All prompts are saved in AgentMark, a markdown-based format that is easy to write and read. Read Docs

Observability

We build on top of OpenTelemetry for collecting telemetry data from your prompts. This helps you monitor, debug, and optimize your LLM applications in production. We provide traces, logs, metrics, and more. Read Docs

Datasets

Create datasets to easily test your prompts in bulk against a large set of data. Read Docs

CLI

We provide a CLI for initializing your AgentMark app, customizing it, and deploying it to the cloud. Add new models to your platform with a single command. You can also develop with AgentMark locally using our serve command.

```bash
npx @agentmark/cli@latest init
```

Read Docs

Platform Management

AgentMark offers an intuitive platform for creating new git-synced apps, adding team members with roles, and setting up API keys for users.

AgentMark SDK

AgentMark’s SDK is simple and easy to use. It offers features like one-line observability, secure prompt fetching from our CDN, and more. Read Docs

2025-01-03

Initial AgentMark Release

Features

Initial release of AgentMark
Support for OpenAI, Anthropic, and other LLM providers
MDX-based prompt templating
Type-safe prompt development
Tools and agents support

Documentation

Added comprehensive documentation
Included examples and guides
API reference documentation
