AgentMark lets you export production trace data in multiple formats for fine-tuning, quality analysis, and integration with external tools. You can export from the dashboard UI, the CLI, or the REST API.
This page assumes observability is already set up in your application. See the Development documentation for setup instructions.

Export formats

AgentMark supports three export formats. Choose the one that fits your workflow:
| Format | Use case | Content |
| --- | --- | --- |
| Generic JSONL (`jsonl`) | Custom pipelines, data analysis, backups | One JSON object per line with all trace fields, metadata, and scores |
| OpenAI Chat (`openai`) | Fine-tuning with OpenAI, Together, Fireworks, and other providers that accept the OpenAI format | OpenAI chat-completion messages format, one object per line |
| CSV (`csv`) | Spreadsheets, BI tools, quick inspection | Standard CSV with header row |
The OpenAI chat-completion format is the industry standard for fine-tuning. Most providers — including Anthropic (via Bedrock), Together AI, and Fireworks — accept it directly or with minor modifications.

Export from the dashboard

On the Traces page, click the Export button in the toolbar. Select a format, set a row limit (1-2,000), and click Download. Filters applied to the trace list carry over to the export — filter first, then export.

Export from the CLI

The agentmark export traces command exports trace data directly to a file or stdout. Link your project first with agentmark link, then run:
agentmark export traces --format openai --output training-data.jsonl

Score-based quality filtering

Export only high-quality traces by filtering on evaluation scores. This is the core of the fine-tuning data flywheel: run evals on production traces, then export the best examples.
agentmark export traces \
  --format openai \
  --score "correctness>=0.9" \
  --output fine-tuning-data.jsonl
Score filters support >=, >, <=, <, =, and !=. You can pass --score multiple times:
agentmark export traces \
  --format jsonl \
  --score "correctness>=0.8" \
  --score "hallucination<=0.1" \
  --output curated-traces.jsonl
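Each filter expression is a score name, a comparator, and a threshold, and a trace must satisfy every filter to be exported. A minimal Python sketch of how such expressions could be parsed and applied client-side (hypothetical helper, not the CLI's actual implementation):

```python
import operator
import re

# Comparators supported by --score; ">=", "<=", "!=" are listed before
# ">", "<", "=" so the two-character forms match first.
_OPS = {
    ">=": operator.ge, "<=": operator.le, "!=": operator.ne,
    ">": operator.gt, "<": operator.lt, "=": operator.eq,
}
_PATTERN = re.compile(r"^(\w+)\s*(>=|<=|!=|>|<|=)\s*([\d.]+)$")

def parse_score_filter(expr):
    """Turn 'correctness>=0.8' into (score_name, predicate)."""
    m = _PATTERN.match(expr)
    if not m:
        raise ValueError(f"bad score filter: {expr!r}")
    name, op, value = m.groups()
    cmp_fn, threshold = _OPS[op], float(value)
    return name, lambda score: cmp_fn(score, threshold)

def trace_passes(trace, filters):
    """A trace must satisfy every filter; a missing score fails the filter."""
    scores = trace.get("scores", {})
    return all(name in scores and pred(scores[name]) for name, pred in filters)
```

With the two filters from the example above, a trace scoring `correctness: 0.9, hallucination: 0.05` passes, while one missing a `hallucination` score is excluded.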

Filter by model, time range, and more

# Export GPT-4o traces from the last week
agentmark export traces \
  --format jsonl \
  --model gpt-4o \
  --since 2026-04-01 \
  --limit 500

# Export only error traces
agentmark export traces --format csv --status STATUS_CODE_ERROR -o errors.csv

# Export traces for a specific user
agentmark export traces --format jsonl --user-id user-123

# Lightweight export (exclude large input/output fields)
agentmark export traces --format jsonl --lightweight -o metadata-only.jsonl

Preview before exporting

Use --dry-run to see how many traces match your filters and preview a sample before committing to a full export:
agentmark export traces \
  --format openai \
  --score "correctness>=0.9" \
  --dry-run

Pipe to other tools

When --output is omitted, data streams to stdout. Status messages go to stderr, so piping works cleanly:
# Count exported traces
agentmark export traces --format jsonl | wc -l

# Extract scores with jq
agentmark export traces --format jsonl | jq '.scores'

# Feed directly into a fine-tuning script
agentmark export traces --format openai | python train.py --data -
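The consumer on the other end of the pipe only needs to read NDJSON line by line. A hypothetical sketch of what a `train.py`-style reader might look like (the script name and behavior are assumptions, not part of AgentMark):

```python
import json

def load_examples(stream):
    """Read OpenAI-format NDJSON from a file-like stream, skipping
    blank lines and any record without a "messages" array (such as
    the export metadata trailer)."""
    examples = []
    for line in stream:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if "messages" in record:
            examples.append(record["messages"])
    return examples

# In a real script you would call load_examples(sys.stdin) when
# --data is "-", matching the pipe example above.
```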

All CLI options

| Flag | Description | Default |
| --- | --- | --- |
| `--format` | Export format: `jsonl`, `openai`, `csv` | `jsonl` |
| `--app` | App ID (uses linked app if omitted) | From `agentmark link` |
| `--score` | Score filter (repeatable), e.g. `"correctness>=0.8"` | |
| `--since` | Start date (ISO 8601) | 30 days ago |
| `--until` | End date (ISO 8601) | Now |
| `--limit` | Max rows (1-2,000) | 500 |
| `--model` | Filter by model name | |
| `--type` | Filter by span type: `GENERATION`, `SPAN`, `EVENT` | All |
| `--status` | Filter by status: `STATUS_CODE_OK`, `STATUS_CODE_ERROR` | |
| `--name` | Filter by span name (partial match) | |
| `--user-id` | Filter by user ID | |
| `--tag` | Filter by tag | |
| `--lightweight` | Exclude Input, Output, and ToolCalls fields | `false` |
| `--dry-run` | Preview matching count and sample without exporting | |
| `-o, --output` | Write to file (prompts before overwriting) | stdout |
| `--api-key` | API key (overrides stored credentials) | |

Export from the API

The export endpoint is available on the gateway at GET /v1/traces/export. It returns streamed NDJSON (application/x-ndjson).

Authentication

Two authentication methods are supported. The simplest is to pass your API key and app ID as headers:
curl "https://api.agentmark.co/v1/traces/export?format=jsonl&limit=100" \
  -H "Authorization: YOUR_API_KEY" \
  -H "X-Agentmark-App-Id: YOUR_APP_ID"

Query parameters

All CLI filter flags map directly to query parameters:
GET /v1/traces/export?format=openai&limit=500&minScore=0.9&model=gpt-4o&startDate=2026-04-01
| Parameter | Type | Description |
| --- | --- | --- |
| `format` | string | `jsonl`, `openai`, or `csv` |
| `limit` | number | 1-2,000 (default: 500) |
| `startDate` | string | ISO 8601 start date |
| `endDate` | string | ISO 8601 end date |
| `minScore` | number | Include traces with any score at or above this value |
| `maxScore` | number | Include traces with any score at or below this value |
| `type` | string | `GENERATION`, `SPAN`, `EVENT`, or `all` |
| `model` | string | Exact model name match |
| `status` | string | `STATUS_CODE_OK` or `STATUS_CODE_ERROR` |
| `name` | string | Partial span name match |
| `userId` | string | Exact user ID match |
| `tag` | string | Tag array contains value |
| `metadata_key` | string | Metadata key (requires `metadata_value`) |
| `metadata_value` | string | Metadata value (requires `metadata_key`) |
| `lightweight` | boolean | Exclude large I/O fields |
| `cursor` | string | Pagination cursor from previous response |
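Because every filter is a flat query parameter, request URLs can be assembled with a standard URL encoder. A small sketch (the helper name is an assumption; only the endpoint and parameter names come from this page):

```python
from urllib.parse import urlencode

# Gateway export endpoint documented above.
BASE = "https://api.agentmark.co/v1/traces/export"

def export_url(**filters):
    """Build an export URL; parameters set to None are dropped."""
    params = {k: v for k, v in filters.items() if v is not None}
    return f"{BASE}?{urlencode(params)}"
```

For example, `export_url(format="openai", limit=500, minScore=0.9, model="gpt-4o")` reproduces the request shown above.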

Response format

Each line is a valid JSON object. For the JSONL and OpenAI formats (not CSV), the final line is an export metadata trailer containing total, exported, and skipped counts plus a next_cursor for pagination. Generic JSONL example:
{"trace_id":"abc","model":"gpt-4o","input":"Hello","output":"Hi!","scores":{"correctness":0.95}}
{"_export_meta":{"total":1,"exported":1,"skipped":0,"skipped_reasons":{},"next_cursor":null}}
When using the OpenAI format, traces that cannot be converted to chat messages (non-GENERATION spans, missing output) are skipped. The _export_meta trailer reports skip counts and reasons.
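When consuming the stream, the trailer can be separated from the data records by checking for the `_export_meta` key. A minimal sketch:

```python
import json

def split_export(lines):
    """Split an NDJSON export into (data_records, export_meta).

    The trailer is the one object keyed by "_export_meta"; every
    other non-blank line is a trace record.
    """
    records, meta = [], None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        obj = json.loads(line)
        if "_export_meta" in obj:
            meta = obj["_export_meta"]
        else:
            records.append(obj)
    return records, meta
```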

Pagination

For exports larger than a single page, use cursor-based pagination. When the result count equals your limit, next_cursor in the metadata trailer contains a timestamp to pass as the cursor parameter for the next page.
# Page 1
curl "https://api.agentmark.co/v1/traces/export?limit=500" -H "..."
# Response metadata includes next_cursor with a timestamp value

# Page 2 — pass the cursor from page 1
curl "https://api.agentmark.co/v1/traces/export?limit=500&cursor=2026-04-07T12:00:00" -H "..."
When next_cursor is null, there are no more pages.
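The cursor loop is the same regardless of how each page is fetched. A hypothetical client-side sketch, written against any `fetch_page(cursor, limit)` function that returns the page's records and its metadata trailer:

```python
def export_all(fetch_page, limit=500):
    """Follow next_cursor until it is null, collecting every record.

    fetch_page(cursor, limit) -> (records, meta), where meta is the
    _export_meta trailer from that page's response.
    """
    records, cursor = [], None
    while True:
        page, meta = fetch_page(cursor, limit)
        records.extend(page)
        cursor = meta.get("next_cursor")
        if not cursor:
            break
    return records
```

The first call passes no cursor (page 1); each later call passes the timestamp from the previous trailer, mirroring the curl sequence above.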

Rate limits

The export endpoint is rate-limited to 10 requests per hour per tenant. If you exceed this limit, the API returns 429 Too Many Requests. The CLI surfaces this as a readable error message.
The scores table has a 90-day retention policy. Export scored traces before the data expires. Traces without scores are unaffected.

Output field reference

Generic JSONL fields

| Field | Type | Description |
| --- | --- | --- |
| `trace_id` | string | Unique trace identifier |
| `span_name` | string | Name of the span |
| `type` | string | `GENERATION`, `SPAN`, or `EVENT` |
| `model` | string | LLM model used |
| `latency_ms` | number | Request duration in milliseconds |
| `timestamp` | string | When the trace was recorded |
| `input` | object or string | Parsed input (JSON if possible, raw string otherwise) |
| `output` | object or string | Parsed output |
| `metadata` | object | Key-value metadata attached to the trace |
| `scores` | object | Evaluation scores keyed by name (e.g. `correctness: 0.95`) |

OpenAI chat-completion fields

Each line contains a messages array with role-based chat messages:
  • system — system prompt (when present in the trace input)
  • user — the user message
  • assistant — the model response
Tool calls are preserved in the assistant message when present, including the function name and arguments.
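If you export generic JSONL and later need the OpenAI shape, the core of the conversion is small. A simplified sketch (it omits system-prompt extraction and tool calls, which the real exporter handles; the function name is an assumption):

```python
def to_openai_messages(trace):
    """Convert a generic JSONL trace record to a chat-completion example.

    Returns None when the trace cannot be converted, mirroring the
    exporter's skip behavior for non-GENERATION spans or missing output.
    """
    if trace.get("type") != "GENERATION" or not trace.get("output"):
        return None
    return {"messages": [
        {"role": "user", "content": str(trace["input"])},
        {"role": "assistant", "content": str(trace["output"])},
    ]}
```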
