
## CLI Usage

### Quick Start

Prerequisites:

- Dataset configured in prompt frontmatter
- Development server running (`npm run dev`)
- Optional: Evaluation functions defined
### Command Options

**Skip evaluations (output-only mode):** pass `--skip-eval` to run the prompt against the dataset without running evaluations; result rows omit the `passed` field.
### Output Example

| # | Input | AI Result | Expected Output | sentiment_check |
|---|---|---|---|---|
| 1 | {"text":"I love it"} | positive | positive | PASS (1.00) |
| 2 | {"text":"Terrible"} | negative | negative | PASS (1.00) |
| 3 | {"text":"It's okay"} | neutral | neutral | PASS (1.00) |
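
For reference, a JSONL dataset that would produce the table above could look like the following (illustrative; the `input` and `expected_output` field names match the dataset format described under Troubleshooting):

```jsonl
{"input": {"text": "I love it"}, "expected_output": "positive"}
{"input": {"text": "Terrible"}, "expected_output": "negative"}
{"input": {"text": "It's okay"}, "expected_output": "neutral"}
```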
### How It Works

The `run-experiment` command:

1. Loads your prompt file and parses the frontmatter
2. Reads the dataset specified in `test_settings.dataset`
3. Sends the prompt and dataset to the dev server (`http://localhost:9417`)
4. The server runs the prompt against each dataset row
5. Evaluates results using the evals specified in `test_settings.evals`
6. Streams results back to the CLI as they complete
7. Displays formatted output (table, CSV, JSON, or JSONL)
## Configuration

Link dataset and evals in prompt frontmatter:
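
A minimal illustrative frontmatter, assuming YAML frontmatter with the `test_settings.dataset` and `test_settings.evals` keys referenced above (the file path and eval name are hypothetical):

```yaml
---
test_settings:
  dataset: datasets/sentiment.jsonl   # path to a JSONL dataset (hypothetical path)
  evals:
    - sentiment_check                 # eval name, as in the output example
---
```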
### Workflow

1. **Develop prompts** - Iterate on your prompt design
2. **Create datasets** - Add test cases covering your scenarios
3. **Write evaluations** - Define success criteria
4. **Run experiments** - Test against dataset

## SDK Usage
Run experiments programmatically using `formatWithDataset()`. Each result item includes:

- `dataset` - The test case (`input` and `expected_output`)
- `formatted` - The formatted prompt ready for your AI SDK
- `evals` - List of evaluation names to run
- `type` - Always `"dataset"`

Options (`FormatWithDatasetOptions`):

- `datasetPath?: string` - Override the dataset from frontmatter
- `format?: 'ndjson' | 'json'` - Buffer all rows (`'json'`) or stream as available (`'ndjson'`, default)
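
As a sketch of how these pieces fit together, the consumption pattern might look like the following. The item shape mirrors the fields listed above, but the generator here is a hypothetical stand-in defined inline so the example is self-contained — the real import path and signature are not shown in this document:

```typescript
// Shape of each result item, mirroring the documented fields (types assumed).
interface DatasetRow {
  input: unknown;
  expected_output?: unknown;
}

interface FormattedItem {
  type: 'dataset';          // always "dataset"
  dataset: DatasetRow;      // the test case
  formatted: string;        // the formatted prompt, ready for your AI SDK
  evals: string[];          // evaluation names to run
}

// Hypothetical stand-in for the real formatWithDataset() generator,
// included only to make the example runnable.
async function* formatWithDataset(): AsyncGenerator<FormattedItem> {
  yield {
    type: 'dataset',
    dataset: { input: { text: 'I love it' }, expected_output: 'positive' },
    formatted: 'Classify the sentiment: I love it',
    evals: ['sentiment_check'],
  };
}

async function runAll(): Promise<FormattedItem[]> {
  const items: FormattedItem[] = [];
  for await (const item of formatWithDataset()) {
    // item.formatted would be sent to your AI SDK here;
    // item.evals names the checks to score the response with.
    items.push(item);
  }
  return items;
}
```

Because `'ndjson'` is the default format, rows arrive as they become available, which is why a `for await` loop is the natural consumption pattern.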
Use the SDK for:

- Custom test logic in your test framework
- Fine-grained control over test execution
- Integration with existing test infrastructure
- Running experiments in application code
## Troubleshooting

### CLI Issues

**Dataset not found:**

- Check the dataset path in frontmatter
- Verify the file exists and is valid JSONL

**Server connection errors:**

- Ensure `npm run dev` is running
- Check that ports are available

**Invalid dataset format:**

- Each line must be valid JSON
- Required: `input` field
- Optional: `expected_output` field

**Evaluations not running:**

- Add `evals` to `test_settings` in frontmatter
- Or use the `--skip-eval` flag for output-only mode
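
The dataset-format rules above (one JSON object per line, required `input`, optional `expected_output`) can be checked with a short script. This is an illustrative sketch, not part of the tool:

```python
import json

def validate_jsonl(text: str) -> list[str]:
    """Return a list of problems found in a JSONL dataset string."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            row = json.loads(line)  # each line must be valid JSON
        except json.JSONDecodeError as exc:
            problems.append(f"line {lineno}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(row, dict) or "input" not in row:
            problems.append(f"line {lineno}: missing required 'input' field")
        # 'expected_output' is optional, so its absence is not an error
    return problems

sample = "\n".join([
    '{"input": {"text": "I love it"}, "expected_output": "positive"}',
    '{"input": {"text": "Terrible"}}',
    '{"text": "no input key"}',
])
print(validate_jsonl(sample))
# → ["line 3: missing required 'input' field"]
```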