CLI Reference
The AgentMark CLI (@agentmark-ai/cli) provides tools for developing, testing, building, and deploying your AI prompts.
Installation
npm install -g @agentmark-ai/cli
# or use with npx
npx @agentmark-ai/cli <command>
Environment Variables
The CLI automatically loads environment variables from a .env file in the current working directory. This happens before any command execution, so you can store API keys and configuration there.
# .env
OPENAI_API_KEY=sk-...
AGENTMARK_API_KEY=...
AGENTMARK_APP_ID=...
Update Notifications
The CLI checks for updates asynchronously when you run commands. If a newer version is available, you’ll see a notification after your command completes. This check is non-blocking and won’t slow down your workflow.
To disable update checks, set the environment variable:
export AGENTMARK_DISABLE_UPDATE_CHECK=1
Commands
agentmark dev
Start the local development environment with API server, webhook server, and UI app. With --remote, the CLI also connects to the AgentMark platform via WebSocket Connect, enabling you to run prompts and experiments from the platform dashboard.
Options:
| Option | Description | Default |
|---|---|---|
| --api-port <number> | API server port | 9418 |
| --webhook-port <number> | Webhook server port | 9417 |
| --app-port <number> | AgentMark UI app port | 3000 |
| -r, --remote | Connect to the platform via WebSocket | false |
| --no-forward | Disable trace forwarding (only relevant with --remote) | forwarding on when --remote is set |
Project Detection:
The dev server automatically detects your project type:
- TypeScript projects: looks for agentmark.client.ts in the project root
- Python projects: looks for pyproject.toml, agentmark_client.py, or .agentmark/dev_server.py
Dev Server Entry Points (TypeScript):
The CLI looks for dev server files in this order:
1. dev-server.ts - Custom override (project root)
2. dev-entry.ts - Default location (project root)
3. .agentmark/dev-entry.ts - Legacy location
Python Virtual Environment:
For Python projects, the CLI automatically detects and uses virtual environments in .venv/ or venv/ directories.
Remote Connection:
When using --remote, the CLI establishes a WebSocket connection to the AgentMark platform gateway. This enables running prompts and experiments directly from the platform dashboard. The CLI handles authentication and app linking automatically:
- If not logged in, opens a browser for OAuth login.
- If the project is not linked, prompts you to select an app.
- Opens a WebSocket connection with automatic heartbeat and reconnection.
See Connect for details on how WebSocket Connect works.
Example:
# Local-only development
agentmark dev
# Connect to the platform (recommended)
agentmark dev --remote
# Custom ports with platform connection
agentmark dev --api-port 9500 --webhook-port 9501 --remote
agentmark deploy
Upload prompt files, datasets, score configurations, and supporting documents directly to the AgentMark platform. This command deploys .prompt.mdx, .mdx, .md, and .jsonl files, as well as score schemas defined in agentmark.json, without requiring a git repository connection.
agentmark deploy [options]
Options:
| Option | Description | Default |
|---|---|---|
| --api-key <key> | API key for authentication | env or stored credentials |
| --app-id <id> | Target app ID | env or linked app |
| --dry-run | List files that would be deployed without uploading | false |
| --base-url <url> | Override platform API URL | https://app.agentmark.co |
Authentication resolution order:
1. --api-key flag
2. AGENTMARK_API_KEY environment variable
3. Stored credentials from agentmark login (OAuth)
App resolution order:
1. --app-id flag
2. AGENTMARK_APP_ID environment variable
3. Linked app from agentmark link (forwarding config)
File collection:
The command reads agentmark.json to determine the source directory, then collects all files from <agentmarkPath>/agentmark/ matching these extensions: .prompt.mdx, .mdx, .md, .jsonl.
Score config deployment:
If agentmark.json contains a scores field, the deploy command persists those score schemas to the platform database. Once deployed, score configs are available in the dashboard for annotation queues and experiment results — even when no worker is connected. See Project configuration for the schema format.
Exit codes:
- 0 — Success
- 1 — Authentication failure
- 2 — Validation failure
- 3 — Permission denied
- 4 — Deployment conflict (app is connected to a git repository)
- 5 — Server error
If your app is connected to a git repository, agentmark deploy returns exit code 4. Disconnect the repository in Settings first, or use git-based deployment instead.
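In CI, the exit codes above can be turned into actionable messages. A minimal shell sketch; the codes come from the table above, while the message text is illustrative:

```shell
# Map `agentmark deploy` exit codes to messages.
# Codes are documented above; the message wording is illustrative.
deploy_status() {
  case "$1" in
    0) echo "deployed" ;;
    1) echo "auth failure: run 'agentmark login' or set AGENTMARK_API_KEY" ;;
    2) echo "validation failure: check prompt files and agentmark.json" ;;
    3) echo "permission denied: check the API key's app access" ;;
    4) echo "deployment conflict: disconnect the git repo in Settings" ;;
    5) echo "server error: retry later" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# In CI you would run:  agentmark deploy; deploy_status "$?"
deploy_status 4   # → deployment conflict: disconnect the git repo in Settings
```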
Example:
# Preview what would be deployed
agentmark deploy --dry-run
# Deploy to the platform
agentmark deploy
# Deploy with explicit credentials
agentmark deploy --api-key am_live_xxxxx --app-id app_xxxxx
agentmark login
Authenticate with the AgentMark platform via browser OAuth. The CLI opens your default browser to complete the login flow, then stores credentials locally for subsequent commands.
Stored credentials are used automatically by agentmark dev --remote, agentmark deploy, and agentmark link. You can override stored credentials with the --api-key flag or AGENTMARK_API_KEY environment variable on any command.
agentmark logout
Clear stored CLI authentication credentials and revoke any dev API keys created during agentmark link.
agentmark logout [options]
Options:
| Option | Description | Default |
|---|---|---|
| --base-url <url> | Platform URL | https://app.agentmark.co |
agentmark link
Link your local project to an AgentMark platform app. The CLI prompts you to select an app from your account, then stores the app ID and a dev API key in your local project configuration.
After linking, commands like agentmark dev --remote and agentmark deploy automatically use the linked app without requiring --app-id or AGENTMARK_APP_ID.
agentmark dev --remote runs agentmark link automatically if your project is not yet linked. You only need to run agentmark link manually if you want to change which app your project is linked to.
agentmark run-prompt
Run a single prompt file with test props.
agentmark run-prompt <filepath> [options]
Arguments:
| Argument | Description |
|---|---|
| filepath | Path to the .prompt.mdx file |
Options:
| Option | Description | Default |
|---|---|---|
| --server <url> | Webhook server URL | http://localhost:9417 |
| --props <json> | Props as JSON string | - |
| --props-file <path> | Path to JSON or YAML file containing props | - |
Example:
# Run with inline props
agentmark run-prompt ./agentmark/greeting.prompt.mdx --props '{"name": "Alice"}'
# Run with props from file
agentmark run-prompt ./agentmark/greeting.prompt.mdx --props-file ./test-props.yaml
# Run against a remote server
agentmark run-prompt ./agentmark/greeting.prompt.mdx --server https://my-webhook.example.com
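A file passed to --props-file is plain JSON or YAML whose top-level keys become the prompt's props. A hypothetical test-props.yaml for the greeting prompt above (field names must match your prompt's input_schema; these are illustrative):

```yaml
# test-props.yaml — illustrative props; field names are hypothetical
name: Alice
tone: friendly
```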
agentmark run-experiment
Run an experiment against its dataset, with evaluations by default.
agentmark run-experiment <filepath> [options]
Arguments:
| Argument | Description |
|---|---|
| filepath | Path to the .prompt.mdx file with test configuration |
Options:
| Option | Description | Default |
|---|---|---|
| --server <url> | Webhook server URL | http://localhost:9417 |
| --skip-eval | Skip running evals even if they exist | false |
| --format <format> | Output format: table, csv, json, jsonl | table |
| --threshold <percent> | Fail if pass rate is below threshold (0-100) | - |
| --sample <percent> | Sample N% of dataset rows randomly (1-100) | - |
| --rows <spec> | Select specific rows by index/range (e.g. 0,3-5,9) | - |
| --split <spec> | Train/test split (e.g. train:80, test:80) | - |
| --seed <number> | Seed for reproducible sampling/splitting | - |
| --truncate <chars> | Truncate table cell content to N chars (0 = no limit) | 1000 |
Example:
# Run experiment with table output
agentmark run-experiment ./agentmark/qa-bot.prompt.mdx
# Run experiment with JSON output, skip evals
agentmark run-experiment ./agentmark/qa-bot.prompt.mdx --format json --skip-eval
# Run with CI threshold (fails if <80% pass rate)
agentmark run-experiment ./agentmark/qa-bot.prompt.mdx --threshold 80
agentmark generate-types
Generate TypeScript type definitions from your prompt schemas.
agentmark generate-types [options]
Options:
| Option | Description | Default |
|---|---|---|
| -l, --language <language> | Target language | typescript |
| --local <port> | Local server port to fetch prompts from | - |
| --root-dir <path> | Root directory containing agentmark files | - |
Output:
The command outputs TypeScript definitions to stdout. Redirect to a file:
agentmark generate-types --root-dir ./agentmark > agentmark.types.ts
Generated Types Include:
- Input types based on input_schema
- Output types based on the model’s schema
- A mapping of prompt paths to their respective types
- Tool argument types
Example:
# Generate from local files
agentmark generate-types --root-dir ./prompts > agentmark.types.ts
# Generate from local dev server
agentmark generate-types --local 9418 > agentmark.types.ts
See Type Safety for usage examples.
agentmark generate-schema
Generate a JSON Schema file for .prompt.mdx frontmatter. This enables IDE validation (squiggles) for fields like model_name in your prompt files.
agentmark generate-schema [options]
Options:
| Option | Description | Default |
|---|---|---|
| -o, --out <directory> | Output directory | .agentmark |
Example:
agentmark generate-schema
agentmark generate-schema --out ./schemas
agentmark build
Build prompts into pre-compiled JSON files for static loading with FileLoader.
agentmark build [options]
Options:
| Option | Description | Default |
|---|---|---|
| -o, --out <directory> | Output directory | dist/agentmark |
Requirements:
- An agentmark.json config file must exist in the current directory
- Prompts are read from the directory specified by agentmarkPath in the config
Output Structure:
dist/agentmark/
  manifest.json           # Build manifest with all prompts
  greeting.prompt.json    # Compiled prompt (mirrors source structure)
  nested/
    helper.prompt.json
Example:
# Build with default output directory
agentmark build
# Build to custom directory
agentmark build --out ./build/prompts
See Loaders for using built prompts with FileLoader.
agentmark pull-models
Interactive command to pull and configure models from a provider.
This command opens an interactive prompt to:
- Select a model provider
- Choose models to enable
- Update your local configuration
agentmark export traces
Export trace data from the AgentMark platform as JSONL, OpenAI chat-completion format, or CSV. Useful for fine-tuning, quality analysis, and custom data pipelines.
agentmark export traces [options]
Options:
| Option | Description | Default |
|---|---|---|
| --format <format> | Export format: jsonl, openai, csv | jsonl |
| --app <id> | App ID to export from | linked app |
| --score <filter> | Score filter, e.g. "correctness>=0.8" (repeatable) | - |
| --since <date> | Start date (ISO 8601) | 30 days ago |
| --until <date> | End date (ISO 8601) | now |
| --limit <number> | Max rows to export (1-2,000) | 500 |
| --dry-run | Preview matching count and sample without exporting | false |
| -o, --output <path> | Output file path | stdout |
| --api-key <key> | API key (overrides stored credentials) | - |
| --base-url <url> | Gateway URL override | - |
| --type <type> | Span type: GENERATION, SPAN, EVENT | all |
| --model <name> | Filter by model name (exact match) | - |
| --status <code> | Filter by status: STATUS_CODE_OK, STATUS_CODE_ERROR | - |
| --name <pattern> | Filter by span name (partial match) | - |
| --user-id <id> | Filter by user ID | - |
| --tag <value> | Filter by tag | - |
| --lightweight | Exclude large I/O fields (Input, Output, ToolCalls) | false |
Score filtering:
Filter by evaluation scores using operators >=, >, <=, <, =, !=. Pass --score multiple times to combine filters:
agentmark export traces \
--format openai \
--score "correctness>=0.9" \
--score "hallucination<=0.1" \
--output training-data.jsonl
Dry run:
agentmark export traces --score "correctness>=0.9" --dry-run
Pipe to tools:
agentmark export traces --format jsonl | jq '.scores'
See Data Export for full documentation including API reference and format details.
agentmark api
Access the AgentMark gateway API from the command line. Commands are auto-generated from the gateway’s OpenAPI spec at runtime, so new endpoints appear in the CLI automatically.
agentmark api [options] <resource> <action> [flags]
Options:
| Option | Description | Default |
|---|---|---|
| --remote | Target the cloud gateway instead of local dev server | local (localhost:9418) |
| --refresh | Force re-fetch of the OpenAPI spec (cached for 24 hours) | false |
By default, agentmark api targets your local dev server. Use --remote to target the cloud gateway (requires agentmark login and agentmark link).
Available resources:
| Resource | Actions | Description |
|---|---|---|
| traces | list, get | List and retrieve traces |
| sessions | list, get | List and retrieve sessions |
| spans | list | List spans across traces |
| scores | list | List scores for spans and traces |
| metrics | get | Retrieve aggregated metrics |
| datasets | list, get | List and retrieve datasets |
| experiments | list, get | List and retrieve experiment results |
| prompts | list, get | List and retrieve prompt execution logs |
| runs | list, get | List and retrieve individual runs within experiments |
| capabilities | get | Check which features the server supports |
| health | get | Check gateway health |
Run agentmark api __schema to see all available resources and actions directly from the live OpenAPI spec.
Self-documenting help:
Every resource and action has built-in help. Use --help to see available actions and their parameters:
# List all actions for a resource
agentmark api traces --help
# See parameters for a specific action
agentmark api traces list --help
Example:
# List recent traces from the local dev server
agentmark api traces list --limit 10
# List traces from the cloud gateway
agentmark api traces list --remote --limit 10
# Get a specific trace by ID
agentmark api traces get <traceId>
# List sessions from the cloud
agentmark api sessions list --remote --limit 5
# List scores for review
agentmark api scores list --remote
# List datasets
agentmark api datasets list
# List experiment results
agentmark api experiments list --limit 5
# List prompt execution logs
agentmark api prompts list --limit 10
# List runs for a specific experiment
agentmark api runs list
# Check which features are available
agentmark api capabilities get
# Check gateway health
agentmark api health get
The OpenAPI spec is cached locally for 24 hours. If endpoints were recently added, pass --refresh to pick up changes: agentmark api --refresh traces list.
Configuration Files
agentmark.json
Project configuration file in your project root:
{
"agentmarkPath": ".",
"version": "1.0",
"mdxVersion": "1.0"
}
| Field | Description |
|---|---|
| agentmarkPath | Base path for agentmark files (contains the agentmark/ directory) |
| version | Configuration version |
| mdxVersion | MDX syntax version |
.agentmark/dev-config.json
Auto-generated local development configuration (gitignored):
{
"webhookSecret": "...",
"createdAt": "2024-01-15T10:30:00.000Z",
"appPort": 3000
}
This file stores:
- Webhook secrets for signature verification
- Current app port (updated when dev server starts)
The configuration expires after 30 days and is automatically regenerated.
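The stored webhook secret lets you check that incoming requests really came from the webhook server. The exact signing scheme is not documented here; as a generic illustration only, a common pattern is an HMAC-SHA256 of the raw request body, which you can reproduce with openssl (the header name and algorithm are assumptions, not AgentMark specifics):

```shell
# Illustration only: compute an HMAC-SHA256 of a request body with a
# webhook secret. The actual scheme AgentMark uses is not specified here.
secret='example-webhook-secret'
body='{"event":"run.completed"}'
signature=$(printf '%s' "$body" | openssl dgst -sha256 -hmac "$secret" | awk '{print $NF}')
echo "$signature"   # 64-character hex digest to compare against the received signature
```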
Have Questions?
We’re here to help! Choose the best way to reach us: