The AgentMark Gateway API provides direct HTTP access to trace ingestion, scoring, and template retrieval.
Most developers should use the AgentMark SDK instead of calling the REST API directly.
The SDK handles authentication, retries, and serialization automatically.
Base URL
- Cloud
- Local
Local (npx agentmark dev) and Cloud share the same /v1/* wire contract: the same Zod schemas are the source of truth on both surfaces. What differs is which handlers are implemented where:
- Both surfaces: /v1/config, /v1/traces (ingest + read), /v1/sessions, /v1/spans, /v1/scores (full CRUD + batch), /v1/datasets, /v1/experiments, /v1/templates, /v1/capabilities, /v1/pricing.
- Cloud-only (local returns 404 or a 501 not_available_locally stub): /v1/metrics, /v1/scores/aggregations, /v1/traces/export, annotation queues (/v1/annotation-queues/*), and the health endpoints (/health, /v1/health/*).
- Local-only (Cloud returns 501 not_available_on_cloud): /v1/prompts lists the prompt files on disk; the Cloud handler is a documented stub pending an implementation decision.
- Deprecated: /v1/runs/{runId}/traces still works on local for backwards compatibility with older SDK versions, but new code should use /v1/traces?dataset_run_id={runId}; both paths hit the same ClickHouse predicate. The Cloud endpoint has always returned 501.
Use GET /v1/capabilities to probe which features a server supports at runtime.
All endpoints are prefixed with /v1/ except the root health check.
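A client that targets both surfaces may want to tell "this route is not implemented here" apart from a genuine failure. The sketch below is one way to do that, assuming the error code arrives at error.code in the JSON body (the envelope described under Response format). The function name is hypothetical, and treating 404 as "not implemented" is only safe on collection routes such as /v1/metrics, where a 404 cannot mean a missing record.

```python
from typing import Optional

# Error codes the docs associate with a route that exists on only one surface.
UNAVAILABLE_CODES = {"not_available_locally", "not_available_on_cloud"}


def is_unavailable_here(status: int, body: Optional[dict]) -> bool:
    """Return True when a response means 'this surface does not implement the route'."""
    if status == 404:
        return True
    if status == 501:
        # Walk the envelope defensively: the body may be missing or malformed.
        error = body.get("error") if isinstance(body, dict) else None
        code = error.get("code") if isinstance(error, dict) else None
        return code in UNAVAILABLE_CODES
    return False
```

Probing GET /v1/capabilities up front avoids the failed request entirely; a check like this is a fallback for when a call has already been made.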
Available endpoints
The Where column shows which environments implement each route. “Cloud + Local” means the same handler semantics on both; “Cloud only” / “Local only” mean the other side returns 501 (with a not_available_on_cloud / not_available_locally error code) or 404.
| Endpoint | Method | Where | Description |
|---|---|---|---|
| /v1/traces | POST | Cloud + Local | Ingest trace data in OTLP format (supports gzip) |
| /v1/traces | GET | Cloud + Local | List traces with filtering; supports dataset_run_id for run-scoped listings |
| /v1/traces/{traceId} | GET | Cloud + Local | Get a single trace with all its spans |
| /v1/traces/{traceId}/spans | GET | Cloud + Local | List every span belonging to a trace |
| /v1/traces/{traceId}/spans/{spanId} | GET | Cloud + Local | Get full input/output payload for a single span |
| /v1/traces/{traceId}/graph | GET | Cloud + Local | Return nodes + edges for visualizing a trace’s agent-execution flow |
| /v1/traces/export | GET | Cloud only | Export traces as JSONL, CSV, or OpenAI fine-tuning format |
| /v1/sessions | GET | Cloud + Local | List sessions with filtering by name and user |
| /v1/sessions/{sessionId}/traces | GET | Cloud + Local | List traces for a specific session |
| /v1/spans | GET | Cloud + Local | Query spans across traces with filtering by type, status, model, and duration |
| /v1/scores | POST | Cloud + Local | Create a score record for a span or trace |
| /v1/scores/batch | POST | Cloud + Local | Create up to 1000 scores in one request (per-item results, 207-style) |
| /v1/scores | GET | Cloud + Local | List scores for a specific span or trace |
| /v1/scores/{scoreId} | GET | Cloud + Local | Get a single score by ID |
| /v1/scores/{scoreId} | DELETE | Cloud + Local | Delete a score record |
| /v1/scores/names | GET | Cloud + Local | List distinct score names (for UI filters) |
| /v1/scores/aggregations | GET | Cloud only | Aggregated score statistics grouped by name |
| /v1/score-configs | GET | Cloud + Local | List score configurations (reusable score schemas) |
| /v1/score-configs/{name} | GET | Cloud + Local | Get a single score configuration by name |
| /v1/metrics | GET | Cloud only | Aggregated analytics (trace volume, latency, cost, tokens, error rates) |
| /v1/config | GET | Cloud + Local | Retrieve the synced agentmark.json project configuration plus the current commit SHA |
| /v1/datasets | GET | Cloud + Local | List datasets with per-dataset metadata (row_count, created_at), case-insensitive ?name= substring filter, and canonical { data, pagination } envelope |
| /v1/datasets/{datasetName}/rows | POST | Cloud + Local | Append a canonical dataset row with input, expected_output, and metadata |
| /v1/datasets/{datasetName}/rows/from-traces | POST | Cloud + Local | Import one or more traces into canonical dataset rows using optional field mapping |
| /v1/datasets/{datasetName}/rows/from-spans | POST | Cloud + Local | Import one or more spans into canonical dataset rows using optional field mapping |
| /v1/experiments | GET | Cloud + Local | List experiments |
| /v1/experiments/{experimentId} | GET | Cloud + Local | Get an experiment by ID |
| /v1/prompts | GET | Local only | List prompt file paths in the project. Cloud returns 501 not_available_on_cloud. |
| /v1/runs/{runId}/traces | GET | Local only · deprecated | Use /v1/traces?dataset_run_id={runId} instead; both paths hit the same predicate. Kept on Local for older SDK versions; Cloud returns 501. |
| /v1/capabilities | GET | Cloud + Local | Check which features the server supports (no auth required) |
| /v1/templates/{templatePath} | GET | Cloud + Local | Retrieve a prompt template by file path |
| /v1/pricing | GET | Cloud + Local | Per-model LLM pricing data (no auth required) |
| /v1/annotation-queues | GET · POST | Cloud only | List / create annotation queues for human review |
| /v1/annotation-queues/{queueId} | GET · PATCH · DELETE | Cloud only | Read / update / delete a queue |
| /v1/annotation-queues/{queueId}/items | GET · POST | Cloud only | List items or add traces/spans/sessions to a queue |
| /v1/annotation-queues/{queueId}/items/{itemId} | GET · PATCH · DELETE | Cloud only | Read / update / remove a single queue item |
| /v1/annotation-queues/{queueId}/items/{itemId}/reviews | POST | Cloud only | Submit a review; LLM-as-judge pipelines can land annotations the same way human reviewers do |
| /v1/api-keys | GET · POST | Cloud only | List API keys (metadata only, no plaintext) or mint a new key. The plaintext value of a newly created key is returned exactly once in the POST response and is unrecoverable afterward. |
| /v1/api-keys/{apiKeyId} | DELETE | Cloud only | Revoke an API key. Revoked keys are rejected immediately. |
| /v1/connect | GET (WebSocket upgrade) | Cloud only | Persistent connection for deployed workers to receive dispatched jobs. |
| /health | GET | Cloud only | Root health check (no auth required) |
| /v1/health/ingestion | GET | Cloud only | Ingestion pipeline health with dependency statuses |
| /v1/health/files | GET | Cloud only | Files service health with dependency statuses |
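Code that still builds the deprecated /v1/runs/{runId}/traces path can be moved to the replacement mechanically. The following is a minimal sketch of that rewrite, operating on request paths only; the helper name is hypothetical and nothing here is part of the AgentMark SDK.

```python
import re
from urllib.parse import unquote, urlencode

# Matches the deprecated run-scoped route and captures the run ID segment.
_RUN_TRACES = re.compile(r"^/v1/runs/([^/?#]+)/traces$")


def modernize_path(path: str) -> str:
    """Rewrite /v1/runs/{runId}/traces to /v1/traces?dataset_run_id={runId}."""
    match = _RUN_TRACES.match(path)
    if match is None:
        return path  # not the deprecated route; leave it untouched
    # Decode the path segment first so urlencode does not double-encode it.
    run_id = unquote(match.group(1))
    return "/v1/traces?" + urlencode({"dataset_run_id": run_id})
```

For example, modernize_path("/v1/runs/run_123/traces") yields "/v1/traces?dataset_run_id=run_123", which is served on both Cloud and Local.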
Response format
All responses are JSON unless otherwise noted (e.g., CSV exports). Error responses follow a consistent canonical envelope built around a top-level error object. The error.code field is the programmatic discriminator; use it to branch on specific error cases. The error.message field is the human-readable description to show to users. Additional context (e.g. retry_after_seconds, jobId) appears in a details object inside the error, alongside code and message.
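As an illustration of branching on error.code, here is a minimal sketch that turns an error body into a typed exception. It assumes the envelope has the shape {"error": {"code", "message", "details"}} as described above; the ApiError class and the "rate_limited" code in the usage note are hypothetical, not names taken from the API.

```python
import json
from typing import Any, Dict, Optional


class ApiError(Exception):
    """Carries the three envelope fields so callers can branch on .code."""

    def __init__(self, code: str, message: str,
                 details: Optional[Dict[str, Any]] = None) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.details = details or {}


def parse_error(raw_body: str) -> ApiError:
    """Parse an error response body into an ApiError (does not raise it)."""
    error = json.loads(raw_body)["error"]
    # details is optional context, so fall back to an empty dict when absent.
    return ApiError(error["code"], error["message"], error.get("details"))
```

A caller would typically write something like `if err.code == "rate_limited": ...` and fall back to showing err.message for codes it does not recognize.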
Rate limiting
Requests are rate-limited per tenant. When you exceed your rate limit, the API returns a 429 status code.
Trace ingestion has additional monthly span and storage quotas depending on your plan.
See Authentication for details.
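The SDK retries automatically, so the following only matters when calling the REST API directly. It is a rough sketch of choosing a wait time after a 429, under the assumption that the response may carry retry_after_seconds in error.details; the docs list that field as an example of details context but do not say it accompanies every 429. The function name and the default base and cap values are hypothetical.

```python
from typing import Optional


def retry_delay_seconds(attempt: int, body: Optional[dict] = None,
                        base: float = 1.0, cap: float = 60.0) -> float:
    """Pick how long to wait before retrying a 429 (attempt counts from 0)."""
    error = body.get("error") if isinstance(body, dict) else None
    details = error.get("details") if isinstance(error, dict) else None
    hint = details.get("retry_after_seconds") if isinstance(details, dict) else None
    # Prefer the server's hint when it is a usable non-negative number.
    if isinstance(hint, (int, float)) and not isinstance(hint, bool) and hint >= 0:
        return min(float(hint), cap)
    # Otherwise fall back to capped exponential backoff: base, 2*base, 4*base, ...
    return min(base * (2 ** attempt), cap)
```

Adding random jitter to the fallback delay is a common refinement when many workers share one tenant's limit.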
Versioning
Every endpoint except the root health check is prefixed with /v1/. Breaking changes ship under new version prefixes (/v2/, etc.) with a deprecation window of at least 90 days, and /v1/ keeps working while you migrate.
See API versioning & stability for the full policy on what’s breaking, what’s additive, and how deprecations are announced.
Why there is no PATCH /v1/traces
Traces are immutable in AgentMark. Once a span lands in ClickHouse, the row representing what happened during that execution is frozen — there is deliberately no endpoint that mutates it.
Other observability platforms expose a “patch trace” endpoint that lets clients backfill metadata, attach a label, or correct a field after ingestion. AgentMark covers those workflows through three separate, append-only resources instead:
- Scores (POST /v1/scores, POST /v1/scores/batch): attach a graded value (numeric, categorical, or boolean) to a trace or span after the fact. Scores are versioned by created_at and never overwrite the underlying span.
- Comments: free-form human notes on a trace or span, stored alongside the trace as a separate resource.
- Annotation queues (/v1/annotation-queues/*): structured human-in-the-loop review that produces new score and comment records, again without modifying the trace itself.
PATCH /v1/traces will not ship in /v1/, /v2/, or any future version.
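Backfilling labels therefore means writing scores, and large backfills go through POST /v1/scores/batch, which accepts up to 1000 scores per request. Below is a minimal sketch of splitting a larger list into request-sized batches. Score payloads are treated as opaque dicts because their fields are not specified on this page, and the helper name is hypothetical.

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 1000  # documented per-request limit for POST /v1/scores/batch


def chunk_scores(scores: Sequence[dict],
                 size: int = MAX_BATCH_SIZE) -> Iterator[List[dict]]:
    """Yield consecutive slices of `scores`, each small enough for one request."""
    if not 1 <= size <= MAX_BATCH_SIZE:
        raise ValueError(f"size must be between 1 and {MAX_BATCH_SIZE}")
    for start in range(0, len(scores), size):
        yield list(scores[start:start + size])
```

Because the batch endpoint reports per-item results, each batch's response should be inspected item by item rather than treated as all-or-nothing.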
Filtering on /v1/spans and /v1/scores
/v1/spans and /v1/scores accept the same filter vocabulary as /v1/traces. The point is that one filter expression composes across surfaces — write it once, reuse it for trace listings, span listings, score listings, and saved-filter exports.
/v1/spans accepts:
- start_date, end_date: ISO 8601 timestamps. Inclusive on both ends.
- user_id, session_id: scope the result to a specific user or session.
- filter: a JSON-encoded filter DSL, identical to the one /v1/traces accepts. See Filtering & search for the full operator list.
/v1/scores accepts session_id (newly added — scope scores to a session), alongside start_date, end_date, and source which were already supported.
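The parameters above can be assembled into a request path as in the sketch below. It assumes the filter value is sent as a JSON string in the filter query parameter; the builder function is hypothetical, and the filter object in the usage note is a placeholder rather than real DSL syntax (see Filtering & search for the actual operators).

```python
import json
from typing import Any, Optional
from urllib.parse import urlencode


def spans_query(start_date: Optional[str] = None,
                end_date: Optional[str] = None,
                user_id: Optional[str] = None,
                session_id: Optional[str] = None,
                filter_expr: Optional[Any] = None) -> str:
    """Build a /v1/spans path with only the parameters that were supplied."""
    params = {
        "start_date": start_date,
        "end_date": end_date,
        "user_id": user_id,
        "session_id": session_id,
    }
    if filter_expr is not None:
        # JSON-encode the filter compactly; urlencode percent-escapes it below.
        params["filter"] = json.dumps(filter_expr, separators=(",", ":"))
    present = {key: value for key, value in params.items() if value is not None}
    if not present:
        return "/v1/spans"
    return "/v1/spans?" + urlencode(present)
```

For example, spans_query(session_id="s1", filter_expr={"placeholder": True}) produces a path whose filter parameter decodes back to the same JSON object. Since the filter vocabulary is shared, the same encoded value can be reused on /v1/traces and /v1/scores.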