This guide assumes observability is already set up in your application. See the Development documentation for setup instructions.
## Export formats

AgentMark supports three export formats. Choose the one that fits your workflow:

| Format | Use case | Content |
|---|---|---|
| Generic JSONL (`jsonl`) | Custom pipelines, data analysis, backups | One JSON object per line with all trace fields, metadata, and scores |
| OpenAI Chat (`openai`) | Fine-tuning with OpenAI, Together, Fireworks, and other providers that accept the OpenAI format | OpenAI chat-completion messages format, one object per line |
| CSV (`csv`) | Spreadsheets, BI tools, quick inspection | Standard CSV with header row |
## Export from the dashboard

On the Traces page, click the Export button in the toolbar. Select a format, set a row limit (1-2,000), and click Download. Filters applied to the trace list carry over to the export, so filter first, then export.

## Export from the CLI

The `agentmark export traces` command exports trace data directly to a file or stdout. Link your project first with `agentmark link`, then run:
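A minimal first export might look like this, using flags from the options table below (the output file name is illustrative):

```shell
# Link the current project to an AgentMark app, then export traces to a file
agentmark link
agentmark export traces --format jsonl --limit 500 -o traces.jsonl
```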
## Score-based quality filtering

Export only high-quality traces by filtering on evaluation scores. This is the core of the fine-tuning data flywheel: run evals on production traces, then export the best examples. Score filters support the comparison operators `>=`, `>`, `<=`, `<`, `=`, and `!=`, and you can pass `--score` multiple times:
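For example, to keep only traces that pass two score thresholds (the score names here are illustrative; use whatever names your evals emit):

```shell
# Export only traces scoring well on both metrics, in OpenAI fine-tuning format
agentmark export traces \
  --format openai \
  --score "correctness>=0.8" \
  --score "hallucination<0.1" \
  -o finetune.jsonl
```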
## Filter by model, time range, and more
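Combining several of the documented flags, a sketch might look like this (the model name, dates, and tag are illustrative):

```shell
# Export one week of production GENERATION spans from a single model
agentmark export traces \
  --model gpt-4o \
  --type GENERATION \
  --since 2025-01-01 \
  --until 2025-01-08 \
  --tag production
```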
## Preview before exporting

Use `--dry-run` to see how many traces match your filters and preview a sample before committing to a full export:
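For instance:

```shell
# Show the match count and a sample without writing any data
agentmark export traces --score "correctness>=0.8" --dry-run
```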
## Pipe to other tools

When `--output` is omitted, data streams to stdout. Status messages go to stderr, so piping works cleanly:
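A couple of sketches, assuming `jq` is installed (the `model` field comes from the JSONL field reference below):

```shell
# Count exported rows
agentmark export traces --limit 100 | wc -l

# Tally exported traces by model
agentmark export traces --limit 100 | jq -r '.model' | sort | uniq -c
```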
## All CLI options

| Flag | Description | Default |
|---|---|---|
| `--format` | Export format: `jsonl`, `openai`, `csv` | `jsonl` |
| `--app` | App ID (uses linked app if omitted) | From `agentmark link` |
| `--score` | Score filter (repeatable), e.g. `"correctness>=0.8"` | — |
| `--since` | Start date (ISO 8601) | 30 days ago |
| `--until` | End date (ISO 8601) | Now |
| `--limit` | Max rows (1-2,000) | 500 |
| `--model` | Filter by model name | — |
| `--type` | Filter by span type: `GENERATION`, `SPAN`, `EVENT` | All |
| `--status` | Filter by status: `STATUS_CODE_OK`, `STATUS_CODE_ERROR` | — |
| `--name` | Filter by span name (partial match) | — |
| `--user-id` | Filter by user ID | — |
| `--tag` | Filter by tag | — |
| `--lightweight` | Exclude `Input`, `Output`, and `ToolCalls` fields | `false` |
| `--dry-run` | Preview matching count and sample without exporting | — |
| `-o, --output` | Write to file (prompts before overwriting) | stdout |
| `--api-key` | API key (overrides stored credentials) | — |
## Export from the API

The export endpoint is available on the gateway at `GET /v1/traces/export`. It returns streamed NDJSON (`application/x-ndjson`).

### Authentication

Two authentication methods are supported:

- API Key
- JWT

Pass your API key and app ID as headers:
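A sketch with `curl`; the host and the `X-API-Key` / `X-App-Id` header names are assumptions, so check your gateway's actual hostname and header names:

```shell
# Stream NDJSON to stdout (-N disables curl's output buffering)
curl -N "https://api.agentmark.co/v1/traces/export?format=jsonl&limit=100" \
  -H "X-API-Key: $AGENTMARK_API_KEY" \
  -H "X-App-Id: $AGENTMARK_APP_ID"
```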
### Query parameters

All CLI filter flags map directly to query parameters:

| Parameter | Type | Description |
|---|---|---|
| `format` | string | `jsonl`, `openai`, or `csv` |
| `limit` | number | 1-2,000 (default: 500) |
| `startDate` | string | ISO 8601 start date |
| `endDate` | string | ISO 8601 end date |
| `minScore` | number | Include traces with any score at or above this value |
| `maxScore` | number | Include traces with any score at or below this value |
| `type` | string | `GENERATION`, `SPAN`, `EVENT`, or `all` |
| `model` | string | Exact model name match |
| `status` | string | `STATUS_CODE_OK` or `STATUS_CODE_ERROR` |
| `name` | string | Partial span name match |
| `userId` | string | Exact user ID match |
| `tag` | string | Tag array contains value |
| `metadata_key` | string | Metadata key (requires `metadata_value`) |
| `metadata_value` | string | Metadata value (requires `metadata_key`) |
| `lightweight` | boolean | Exclude large I/O fields |
| `cursor` | string | Pagination cursor from previous response |
### Response format

Each line is a valid JSON object. The last line (except for CSV) is an `_export_meta` trailer containing `total`, `exported`, and `skipped` counts, skip reasons, and a `next_cursor` for pagination.

Generic JSONL example:
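An illustrative data line followed by the metadata trailer, using the fields from the field reference below (all values are made up, and the exact trailer shape is an assumption):

```json
{"trace_id": "tr_abc123", "span_name": "generate_answer", "type": "GENERATION", "model": "gpt-4o", "latency_ms": 812, "timestamp": "2025-01-15T10:32:00Z", "input": {"question": "..."}, "output": {"answer": "..."}, "metadata": {"env": "prod"}, "scores": {"correctness": 0.95}}
{"_export_meta": {"total": 240, "exported": 238, "skipped": 2, "next_cursor": null}}
```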
### Pagination

For exports larger than a single page, use cursor-based pagination. When the result count equals your limit, `next_cursor` in the metadata trailer contains a timestamp to pass as the `cursor` parameter for the next page. When `next_cursor` is `null`, there are no more pages.
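A paging loop might be sketched like this. It assumes `jq`, GNU `head`, the illustrative host and header names from earlier, and that the trailer line is shaped `{"_export_meta": {..., "next_cursor": ...}}`; adapt to the actual response:

```shell
# Fetch pages until next_cursor is null, stripping the trailer line from each page
cursor=""
while :; do
  page=$(curl -sN "https://api.agentmark.co/v1/traces/export?limit=500&cursor=$cursor" \
    -H "X-API-Key: $AGENTMARK_API_KEY" -H "X-App-Id: $AGENTMARK_APP_ID")
  printf '%s\n' "$page" | head -n -1 >> all_traces.jsonl   # keep data lines only
  cursor=$(printf '%s\n' "$page" | tail -n 1 | jq -r '._export_meta.next_cursor // empty')
  [ -z "$cursor" ] && break
done
```

Note that the endpoint's rate limit (see below) caps how many pages you can pull per hour.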
### Rate limits

The export endpoint is rate-limited to 10 requests per hour per tenant. If you exceed this limit, the API returns `429 Too Many Requests`. The CLI surfaces this as a readable error message.
Note: the scores table has a 90-day retention policy. Export scored traces before the data expires. Traces without scores are unaffected.

## Output field reference
### Generic JSONL fields

| Field | Type | Description |
|---|---|---|
| `trace_id` | string | Unique trace identifier |
| `span_name` | string | Name of the span |
| `type` | string | `GENERATION`, `SPAN`, or `EVENT` |
| `model` | string | LLM model used |
| `latency_ms` | number | Request duration in milliseconds |
| `timestamp` | string | When the trace was recorded |
| `input` | object or string | Parsed input (JSON if possible, raw string otherwise) |
| `output` | object or string | Parsed output |
| `metadata` | object | Key-value metadata attached to the trace |
| `scores` | object | Evaluation scores keyed by name (e.g. `correctness: 0.95`) |
### OpenAI chat-completion fields

Each line contains a `messages` array with role-based chat messages:

- `system` — system prompt (when present in the trace input)
- `user` — the user message
- `assistant` — the model response
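One exported line in this format follows the standard OpenAI chat fine-tuning shape (the content values here are illustrative):

```json
{"messages": [{"role": "system", "content": "You are a helpful support agent."}, {"role": "user", "content": "How do I reset my password?"}, {"role": "assistant", "content": "Go to Settings, then Security, and click Reset password."}]}
```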
## Related

- Filtering & Search: build filters in the dashboard before exporting
- Cost & Tokens: track costs alongside your export workflows
- API Reference: full REST API documentation for traces, scoring, and templates