.mint-mb-3 {
    margin-bottom: 0.75rem
}
.mint-mb-4 {
    margin-bottom: 1rem
}
.mint-mt-8 {
    margin-top: 2rem
}
.mint-rounded-lg {
    border-radius: 0.5rem
}
.mint-bg-blue-50 {
    --tw-bg-opacity: 1;
    background-color: rgb(239 246 255 / var(--tw-bg-opacity))
}
.mint-p-6 {
    padding: 1.5rem
}
.mint-font-semibold {
    font-weight: 600
}
.mint-text-blue-600 {
    --tw-text-opacity: 1;
    color: rgb(37 99 235 / var(--tw-text-opacity))
}
.hover\:mint-text-blue-800:hover {
    --tw-text-opacity: 1;
    color: rgb(30 64 175 / var(--tw-text-opacity))
}
.dark\:mint-bg-blue-900\/30:is(.dark *) {
    background-color: rgb(30 58 138 / 0.3)
}
.dark\:mint-text-blue-400:is(.dark *) {
    --tw-text-opacity: 1;
    color: rgb(96 165 250 / var(--tw-text-opacity))
}
.dark\:hover\:mint-text-blue-200:hover:is(.dark *) {
    --tw-text-opacity: 1;
    color: rgb(191 219 254 / var(--tw-text-opacity))
}

Datasets

LLM as Judge Evaluations

Testing in Puzzlet

Learn how to evaluate your LLM application using datasets and LLM as a judge evaluations

Overview

Puzzlet Docs

Puzzlet is the git-based Prompt Engineering Platform that empowers both application developers and domain experts to collaborate seamlessly on GenAI products. Puzzlet enables companies to manage, evaluate, and improve their full-stack LLM application - with version control, type-safety, and local development built-in.

Start building awesome GenAI products in under 5 minutes.

Quickstart

The main concepts to help you get started with Puzzlet

Core Concepts

Learn how to configure Puzzlet for your application needs

Configure models and their user interfaces in Puzzlet

Model Schemas

Set up a webhook endpoint to test your Puzzlet inference in your application

Test Webhook

Learn how Puzzlet manages prompts in your application

Learn the core syntax and features of AgentMark

AgentMark Syntax

Learn how to migrate your existing prompts to Puzzlet

Migrating Prompts to Puzzlet

Learn how to develop and test Puzzlet prompts

Development

Learn how to create and use reusable components in Puzzlet

Components

Learn how to get structured JSON responses from your prompts

JSON Output

Learn how to extend prompts with tools and create multi-step agents

Tools and Agents

Monitor and debug your prompts with Puzzlet

Monitor and debug your prompts using OpenTelemetry

Traces and Logs

Monitor your application with customizable alerts

Alerts

Group related traces together for better observability

Sessions

Analytics provide a high-level overview of how your application is using Generative AI. This includes metrics like costs, tokens, requests, latency, top models, etc.

Metrics

Datasets enable bulk testing of prompts against diverse inputs and expected outputs

Guide to evaluating AI responses with Puzzlet

Evaluations

Puzzlet allows you to collaborate on prompts while maintaining type safety in a production environment.

Type Safety

CI/CD Integration

Using Puzzlet with AI-powered code editors

AI Editors Integration

Our VSCode extension allows you to run prompts and evaluate results directly in your editor

VS Code Extension

Follow the instructions below to install AgentMark in your app.

Getting Started

Learn how to add model providers to your AgentMark project.

Model Providers

Learn how to migrate your existing LLM application to AgentMark

Migration Guide

Learn how AgentMark processes and transforms prompts

Architecture

AgentMark provides a powerful and flexible way to create prompts using Markdown and JSX. This section will cover the core concepts and features of prompting with AgentMark.

Configure model parameters using standard settings.

Model Settings

Access and use variables in your prompts using props.

Props

AgentMark supports importing and reusing components across your prompts.

Reusable Components

Conditionals allow you to create dynamic prompts that adapt based on props or other conditions.

Conditionals

AgentMark supports iterating over arrays using the `<ForEach>` tag.

Loops

Transform values in your prompts using filter functions

Filter Functions

Define structured output using JSON Schema

Object Schema

Tools and agents allow you to extend your prompts with external capabilities.

Execute prompts and get responses from language models

Core API

Observability

AgentMark provides robust type safety through JSON Schema definitions in your prompt files. This ensures type checking for both inputs and outputs, making your prompts more reliable and maintainable.

Learn how to use Puzzlet Studio for local development

Getting Started

Configuring Puzzlet

Prompt Management

Observability

Testing

Further Reference

Overview

Testing in Puzzlet

Datasets

LLM as Judge Evaluations

Have Questions?

Getting Started

Configuring Puzzlet

Prompt Management

Observability

Testing

Further Reference

​Testing in Puzzlet

​Datasets

​LLM as Judge Evaluations

Have Questions?

Testing in Puzzlet

Datasets

LLM as Judge Evaluations