AgentMark lets you export production trace data in multiple formats for fine-tuning, quality analysis, and integration with external tools. You can export from the dashboard UI, the CLI, or the REST API.
This page assumes observability is already set up in your application. See the Development documentation for setup instructions.

Export formats

AgentMark supports three export formats. Choose the one that fits your workflow:
| Format | Use case | Content |
| --- | --- | --- |
| Generic JSONL (`jsonl`) | Custom pipelines, data analysis, backups | One JSON object per line with all trace fields, metadata, and scores |
| OpenAI Chat (`openai`) | Fine-tuning with OpenAI, Together, Fireworks, and other providers that accept the OpenAI format | OpenAI chat-completion messages format, one object per line |
| CSV (`csv`) | Spreadsheets, BI tools, quick inspection | Standard CSV with header row |
The OpenAI chat-completion format is the industry standard for fine-tuning. Most providers — including Anthropic (via Bedrock), Together AI, and Fireworks — accept it directly or with minor modifications.

Export from the dashboard

On the Traces page, click the Export button in the toolbar. Select a format, set a row limit (1-2,000), and click Download. Filters applied to the trace list carry over to the export — filter first, then export.

Export from the CLI

The agentmark export traces command exports trace data directly to a file or stdout. Link your project first with agentmark link, then run:
agentmark export traces --format openai --output training-data.jsonl

Score-based quality filtering

Export only high-quality traces by filtering on evaluation scores. This is the core of the fine-tuning data flywheel: run evals on production traces, then export the best examples.
agentmark export traces \
  --format openai \
  --score "correctness>=0.9" \
  --output fine-tuning-data.jsonl
Score filters support >=, >, <=, <, =, and !=. You can pass --score multiple times:
agentmark export traces \
  --format jsonl \
  --score "correctness>=0.8" \
  --score "hallucination<=0.1" \
  --output curated-traces.jsonl
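Each filter expression is a score name, a comparator, and a threshold, and a trace must satisfy every filter to be exported. A minimal Python sketch of how such expressions could be parsed and applied client-side (hypothetical helper, not the CLI's actual implementation):

```python
import operator
import re

# Comparators supported by --score; ">=", "<=", "!=" are listed before
# ">", "<", "=" so the two-character forms match first.
_OPS = {
    ">=": operator.ge, "<=": operator.le, "!=": operator.ne,
    ">": operator.gt, "<": operator.lt, "=": operator.eq,
}
_PATTERN = re.compile(r"^(\w+)\s*(>=|<=|!=|>|<|=)\s*([\d.]+)$")

def parse_score_filter(expr):
    """Turn 'correctness>=0.8' into (score_name, predicate)."""
    m = _PATTERN.match(expr)
    if not m:
        raise ValueError(f"bad score filter: {expr!r}")
    name, op, value = m.groups()
    cmp_fn, threshold = _OPS[op], float(value)
    return name, lambda score: cmp_fn(score, threshold)

def trace_passes(trace, filters):
    """A trace must satisfy every filter; a missing score fails the filter."""
    scores = trace.get("scores", {})
    return all(name in scores and pred(scores[name]) for name, pred in filters)
```

With the two filters from the example above, a trace scoring `correctness: 0.9, hallucination: 0.05` passes, while one missing a `hallucination` score is excluded.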

Filter by model, time range, and more

# Export GPT-4o traces from the last week
agentmark export traces \
  --format jsonl \
  --model gpt-4o \
  --since 2026-04-01 \
  --limit 500

# Export only error traces
agentmark export traces --format csv --status STATUS_CODE_ERROR -o errors.csv

# Export traces for a specific user
agentmark export traces --format jsonl --user-id user-123

# Lightweight export (exclude large input/output fields)
agentmark export traces --format jsonl --lightweight -o metadata-only.jsonl

Preview before exporting

Use --dry-run to see how many traces match your filters and preview a sample before committing to a full export:
agentmark export traces \
  --format openai \
  --score "correctness>=0.9" \
  --dry-run

Pipe to other tools

When --output is omitted, data streams to stdout. Status messages go to stderr, so piping works cleanly:
# Count exported traces
agentmark export traces --format jsonl | wc -l

# Extract scores with jq
agentmark export traces --format jsonl | jq '.scores'

# Feed directly into a fine-tuning script
agentmark export traces --format openai | python train.py --data -
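The consumer on the other end of the pipe only needs to read NDJSON line by line. A hypothetical sketch of what a `train.py`-style reader might look like (the script name and behavior are assumptions, not part of AgentMark):

```python
import json

def load_examples(stream):
    """Read OpenAI-format NDJSON from a file-like stream, skipping
    blank lines and any record without a "messages" array (such as
    the export metadata trailer)."""
    examples = []
    for line in stream:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if "messages" in record:
            examples.append(record["messages"])
    return examples

# In a real script you would call load_examples(sys.stdin) when
# --data is "-", matching the pipe example above.
```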

All CLI options

| Flag | Description | Default |
| --- | --- | --- |
| `--format` | Export format: `jsonl`, `openai`, `csv` | `jsonl` |
| `--app` | App ID (uses linked app if omitted) | From `agentmark link` |
| `--score` | Score filter (repeatable), e.g. `"correctness>=0.8"` | |
| `--since` | Start date (ISO 8601) | 30 days ago |
| `--until` | End date (ISO 8601) | Now |
| `--limit` | Max rows (1-2,000) | 500 |
| `--model` | Filter by model name | |
| `--type` | Filter by span type: `GENERATION`, `SPAN`, `EVENT` | All |
| `--status` | Filter by status: `STATUS_CODE_OK`, `STATUS_CODE_ERROR` | |
| `--name` | Filter by span name (partial match) | |
| `--user-id` | Filter by user ID | |
| `--tag` | Filter by tag | |
| `--lightweight` | Exclude Input, Output, and ToolCalls fields | `false` |
| `--dry-run` | Preview matching count and sample without exporting | |
| `-o, --output` | Write to file (prompts before overwriting) | stdout |
| `--api-key` | API key (overrides stored credentials) | |

Export from the API

The export endpoint is available on the gateway at GET /v1/traces/export. It returns streamed NDJSON (application/x-ndjson).

Authentication

Two authentication methods are supported. The simplest is to pass your API key and app ID as headers:
curl "https://api.agentmark.co/v1/traces/export?format=jsonl&limit=100" \
  -H "Authorization: YOUR_API_KEY" \
  -H "X-Agentmark-App-Id: YOUR_APP_ID"

Query parameters

All CLI filter flags map directly to query parameters:
GET /v1/traces/export?format=openai&limit=500&minScore=0.9&model=gpt-4o&startDate=2026-04-01
| Parameter | Type | Description |
| --- | --- | --- |
| `format` | string | `jsonl`, `openai`, or `csv` |
| `limit` | number | 1-2,000 (default: 500) |
| `startDate` | string | ISO 8601 start date |
| `endDate` | string | ISO 8601 end date |
| `minScore` | number | Include traces with any score at or above this value |
| `maxScore` | number | Include traces with any score at or below this value |
| `type` | string | `GENERATION`, `SPAN`, `EVENT`, or `all` |
| `model` | string | Exact model name match |
| `status` | string | `STATUS_CODE_OK` or `STATUS_CODE_ERROR` |
| `name` | string | Partial span name match |
| `userId` | string | Exact user ID match |
| `tag` | string | Tag array contains value |
| `metadata_key` | string | Metadata key (requires `metadata_value`) |
| `metadata_value` | string | Metadata value (requires `metadata_key`) |
| `lightweight` | boolean | Exclude large I/O fields |
| `cursor` | string | Pagination cursor from previous response |
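Because every filter is a flat query parameter, request URLs can be assembled with a standard URL encoder. A small sketch (the helper name is an assumption; only the endpoint and parameter names come from this page):

```python
from urllib.parse import urlencode

# Gateway export endpoint documented above.
BASE = "https://api.agentmark.co/v1/traces/export"

def export_url(**filters):
    """Build an export URL; parameters set to None are dropped."""
    params = {k: v for k, v in filters.items() if v is not None}
    return f"{BASE}?{urlencode(params)}"
```

For example, `export_url(format="openai", limit=500, minScore=0.9, model="gpt-4o")` reproduces the request shown above.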

Response format

Each line is a valid JSON object. For the JSONL and OpenAI formats (not CSV), the final line is an export metadata trailer containing total, exported, and skipped counts plus a next_cursor for pagination. Generic JSONL example:
{"trace_id":"abc","model":"gpt-4o","input":"Hello","output":"Hi!","scores":{"correctness":0.95}}
{"_export_meta":{"total":1,"exported":1,"skipped":0,"skipped_reasons":{},"next_cursor":null}}
When using the OpenAI format, traces that cannot be converted to chat messages (non-GENERATION spans, missing output) are skipped. The _export_meta trailer reports skip counts and reasons.
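When consuming the stream, the trailer can be separated from the data records by checking for the `_export_meta` key. A minimal sketch:

```python
import json

def split_export(lines):
    """Split an NDJSON export into (data_records, export_meta).

    The trailer is the one object keyed by "_export_meta"; every
    other non-blank line is a trace record.
    """
    records, meta = [], None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        obj = json.loads(line)
        if "_export_meta" in obj:
            meta = obj["_export_meta"]
        else:
            records.append(obj)
    return records, meta
```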

Pagination

For exports larger than a single page, use cursor-based pagination. When the result count equals your limit, next_cursor in the metadata trailer contains a timestamp to pass as the cursor parameter for the next page.
# Page 1
curl "https://api.agentmark.co/v1/traces/export?limit=500" -H "..."
# Response metadata includes next_cursor with a timestamp value

# Page 2 — pass the cursor from page 1
curl "https://api.agentmark.co/v1/traces/export?limit=500&cursor=2026-04-07T12:00:00" -H "..."
When next_cursor is null, there are no more pages.
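The cursor loop is the same regardless of how each page is fetched. A hypothetical client-side sketch, written against any `fetch_page(cursor, limit)` function that returns the page's records and its metadata trailer:

```python
def export_all(fetch_page, limit=500):
    """Follow next_cursor until it is null, collecting every record.

    fetch_page(cursor, limit) -> (records, meta), where meta is the
    _export_meta trailer from that page's response.
    """
    records, cursor = [], None
    while True:
        page, meta = fetch_page(cursor, limit)
        records.extend(page)
        cursor = meta.get("next_cursor")
        if not cursor:
            break
    return records
```

The first call passes no cursor (page 1); each later call passes the timestamp from the previous trailer, mirroring the curl sequence above.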

Rate limits

The export endpoint is rate-limited to 10 requests per hour per tenant. If you exceed this limit, the API returns 429 Too Many Requests. The CLI surfaces this as a readable error message.
The scores table has a 90-day retention policy. Export scored traces before the data expires. Traces without scores are unaffected.

Output field reference

Generic JSONL fields

| Field | Type | Description |
| --- | --- | --- |
| `trace_id` | string | Unique trace identifier |
| `span_name` | string | Name of the span |
| `type` | string | `GENERATION`, `SPAN`, or `EVENT` |
| `model` | string | LLM model used |
| `latency_ms` | number | Request duration in milliseconds |
| `timestamp` | string | When the trace was recorded |
| `input` | object or string | Parsed input (JSON if possible, raw string otherwise) |
| `output` | object or string | Parsed output |
| `metadata` | object | Key-value metadata attached to the trace |
| `scores` | object | Evaluation scores keyed by name (e.g. `correctness: 0.95`) |

OpenAI chat-completion fields

Each line contains a messages array with role-based chat messages:
  • system — system prompt (when present in the trace input)
  • user — the user message
  • assistant — the model response
Tool calls are preserved in the assistant message when present, including the function name and arguments.
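If you export generic JSONL and later need the OpenAI shape, the core of the conversion is small. A simplified sketch (it omits system-prompt extraction and tool calls, which the real exporter handles; the function name is an assumption):

```python
def to_openai_messages(trace):
    """Convert a generic JSONL trace record to a chat-completion example.

    Returns None when the trace cannot be converted, mirroring the
    exporter's skip behavior for non-GENERATION spans or missing output.
    """
    if trace.get("type") != "GENERATION" or not trace.get("output"):
        return None
    return {"messages": [
        {"role": "user", "content": str(trace["input"])},
        {"role": "assistant", "content": str(trace["output"])},
    ]}
```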
