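When you run a dataset in the AgentMark platform, it sends a dataset-run event to your webhook endpoint. The event carries the dataset run name and the prompt AST; the handler reads the dataset location from the prompt's frontmatter (test_settings.dataset) and loads the items from there. The event has the following shape: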
{
  "event": {
    "type": "dataset-run",
    "data": {
      "datasetRunName": "string",
      "prompt": "// Prompt AST object"
    }
  }
}
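To make the payload shape concrete, here is a minimal sketch of a type and runtime guard for the inner event object. The names DatasetRunEvent and isDatasetRunEvent are hypothetical and inferred from the JSON above; they are not AgentMark exports, and the prompt AST is deliberately left opaque.

```typescript
// Hypothetical types inferred from the payload shown above; not part of the AgentMark SDK.
interface DatasetRunEvent {
  type: "dataset-run";
  data: {
    datasetRunName: string;
    prompt: unknown; // Prompt AST object; its structure is left opaque in this sketch
  };
}

// Narrow an unknown value (for example, body.event) to DatasetRunEvent.
function isDatasetRunEvent(event: unknown): event is DatasetRunEvent {
  if (typeof event !== "object" || event === null) return false;
  const e = event as { type?: unknown; data?: unknown };
  if (e.type !== "dataset-run") return false;
  if (typeof e.data !== "object" || e.data === null) return false;
  const d = e.data as { datasetRunName?: unknown; prompt?: unknown };
  return typeof d.datasetRunName === "string" && d.prompt !== undefined;
}
```

A handler could call isDatasetRunEvent(body.event) before touching event.data, so malformed requests are rejected before any prompt is loaded.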
Processing Dataset Runs
The webhook handler processes dataset runs by executing the prompt for each item in the dataset:
if (event.type === "dataset-run") {
  const data = event.data;
  const frontmatter = getFrontMatter(data.prompt) as any;
  // One run ID shared by every item in this dataset run
  const runId = crypto.randomUUID();

  if (frontmatter.text_config) {
    const prompt = await agentmarkClient.loadTextPrompt(data.prompt);
    // Format the prompt against each item of the dataset named in the frontmatter
    const dataset = await prompt.formatWithDataset({
      datasetPath: frontmatter?.test_settings?.dataset,
      telemetry: { isEnabled: true },
    });

    const stream = new ReadableStream({
      async start(controller) {
        let index = 0;
        for await (const item of dataset) {
          // Fresh trace ID for each item
          const traceId = crypto.randomUUID();
          const result = await generateText({
            ...item.formatted,
            experimental_telemetry: {
              ...item.formatted.experimental_telemetry,
              metadata: {
                ...item.formatted.experimental_telemetry?.metadata,
                dataset_run_id: runId,
                dataset_path: frontmatter?.test_settings?.dataset,
                dataset_run_name: data.datasetRunName,
                dataset_item_name: index,
                traceName: `ds-run-${data.datasetRunName}-${index}`,
                traceId,
                dataset_expected_output: item.dataset.expected_output,
              },
            },
          });

          // Emit one newline-terminated JSON chunk per item
          const chunk =
            JSON.stringify({
              type: "dataset",
              result: {
                input: item.dataset.input,
                expectedOutput: item.dataset.expected_output,
                actualOutput: result.text,
                tokens: result.usage?.totalTokens,
              },
              runId,
              runName: data.datasetRunName,
            }) + "\n";
          controller.enqueue(chunk);
          index++;
        }
        controller.close();
      },
    });

    return new Response(stream, {
      headers: {
        "AgentMark-Streaming": "true",
      },
    });
  }
  // Handle object_config similarly...
}
Streaming Response
Dataset runs return a streaming response so results arrive as each item finishes. The stream is newline-delimited JSON: each chunk is one JSON object followed by a newline, with this shape:
{
  type: "dataset",
  result: {
    input: any,           // Original dataset item input
    expectedOutput: any,  // Expected output from dataset
    actualOutput: any,    // Generated output from model
    tokens: number,       // Token usage for this item
  },
  runId: string,          // Unique run identifier
  runName: string,        // Dataset run name
}
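Because each chunk is a JSON object terminated by a newline, a consumer can split the incoming text on newlines and parse each complete line. The sketch below is one way to do that; DatasetChunk and parseDatasetChunks are hypothetical names, and the function returns any trailing partial line so it can be prepended to the next read, since network reads are not guaranteed to end on a line boundary.

```typescript
// Hypothetical consumer-side type mirroring the chunk shape documented above.
interface DatasetChunk {
  type: "dataset";
  result: {
    input: unknown;
    expectedOutput: unknown;
    actualOutput: unknown;
    tokens?: number;
  };
  runId: string;
  runName: string;
}

// Parse every complete line in `buffer`; return the unfinished tail as `rest`.
function parseDatasetChunks(buffer: string): { chunks: DatasetChunk[]; rest: string } {
  const lines = buffer.split("\n");
  // The last element is either "" (buffer ended on a newline) or a partial line.
  const rest = lines.pop() ?? "";
  const chunks = lines
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as DatasetChunk);
  return { chunks, rest };
}
```

A reader loop would keep a running string, append each decoded read to it, call parseDatasetChunks, and carry rest forward to the next iteration.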
Telemetry
The model call for each dataset item carries the following metadata, passed through experimental_telemetry.metadata:
const telemetry = {
  dataset_run_id: runId, // required
  dataset_path: frontmatter?.test_settings?.dataset, // required
  dataset_run_name: data.datasetRunName, // required
  dataset_item_name: index, // required
  traceName: `ds-run-${data.datasetRunName}-${index}`, // required
  traceId: traceId, // required
  dataset_expected_output: item.dataset.expected_output, // required
};
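Since every field is required, it can help to build the metadata in one place so no field is forgotten when the same logic is reused for other prompt types. The following is a small sketch; buildDatasetTelemetryMetadata and DatasetTelemetryInput are hypothetical names, while the output keys and the traceName format are taken from the object above.

```typescript
// Hypothetical helper; output keys mirror the required metadata fields listed above.
interface DatasetTelemetryInput {
  runId: string;
  datasetPath: string;
  runName: string;
  index: number;
  traceId: string;
  expectedOutput: unknown;
}

function buildDatasetTelemetryMetadata(input: DatasetTelemetryInput) {
  return {
    dataset_run_id: input.runId,
    dataset_path: input.datasetPath,
    dataset_run_name: input.runName,
    dataset_item_name: input.index,
    // Same naming scheme as the handler: ds-run-<run name>-<item index>
    traceName: `ds-run-${input.runName}-${input.index}`,
    traceId: input.traceId,
    dataset_expected_output: input.expectedOutput,
  };
}
```

The result can be spread into experimental_telemetry.metadata after any metadata already present on item.formatted, as the handler does.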
Error Handling
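Wrap dataset run processing in a try/catch so that a failure is logged and reported with a 500 status instead of surfacing as an unhandled exception: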
try {
  // Process dataset run
} catch (error) {
  console.error("Dataset run error:", error);
  return NextResponse.json(
    { message: "Error processing dataset run" },
    { status: 500 }
  );
}
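Once the streaming response has been returned, the status code has already been sent, so a failure on an individual item can no longer be reported as a 500. One option is to report it inside the stream and move on to the next item. The sketch below shows that idea; note that the "error" chunk type and the toErrorChunk helper are assumptions for illustration and are not part of the documented chunk format, so check what the platform accepts before relying on them.

```typescript
// Hypothetical chunk shape: the "error" type is NOT part of the documented stream format.
interface DatasetErrorChunk {
  type: "error";
  index: number;
  message: string;
  runId: string;
  runName: string;
}

// Serialize a caught error as one newline-terminated JSON line.
function toErrorChunk(error: unknown, index: number, runId: string, runName: string): string {
  const message = error instanceof Error ? error.message : String(error);
  const chunk: DatasetErrorChunk = { type: "error", index, message, runId, runName };
  return JSON.stringify(chunk) + "\n";
}

// Intended use inside the handler's for-await loop (sketch only):
//   try {
//     ...call generateText and enqueue the normal result chunk...
//   } catch (error) {
//     controller.enqueue(toErrorChunk(error, index, runId, data.datasetRunName));
//   }
//   index++;
```

Keeping the error line in the same newline-delimited format means a consumer can distinguish it by its type field without a second parsing path.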
Best Practices
- Streaming
  - Always return streaming responses for dataset runs
  - Use proper headers: "AgentMark-Streaming": "true"
  - Handle stream errors appropriately
- Telemetry
  - Include all required metadata in experimental_telemetry
  - Use unique traceId and runId for each execution
  - Track dataset progress and results
- Error Handling
  - Validate prompt configuration before processing
  - Handle individual item failures gracefully
  - Return appropriate HTTP status codes
- Performance
  - Process dataset items sequentially to avoid overwhelming the model
  - Use appropriate timeouts for long-running datasets
  - Monitor memory usage for large datasets