Evaluation templates define how responses are assessed:
---
name: correctness
object_config:
  model_name: gpt-4
  max_tokens: 4096
  temperature: 0.7
  schema:
    type: object
    properties:
      correctness:
        type: object
        properties:
          label:
            type: string
            description: label of the answer
          score:
            type: number
            description: score of the answer (0-1)
          reason:
            type: string
            description: reason for the score
        required:
          - label
          - score
          - reason
    required:
      - correctness
---
<System>
You are an evaluation assistant focused on assessing the correctness of responses based on:
1. Accuracy - factually correct information
2. Completeness - all required elements present
3. Relevance - addresses the question

Assign a label ("correct", "partially correct", or "incorrect"), score (0-1), and provide reasoning.
</System>
<User>
Please evaluate the following response:

**Input:**
{props.input}
--------------------------
**Expected Output:**
{props.expected_output}
--------------------------
**Output:**
{props.output}
--------------------------
Analyze based on accuracy, completeness, and relevance.
</User>
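
The schema in the template above requires a `correctness` object with `label`, `score`, and `reason` fields. As a rough illustration of what a conforming result looks like, the following minimal Python sketch checks a parsed result against those rules. The helper name `validate_correctness` and the `ALLOWED_LABELS` set are hypothetical and not part of any framework API. The label set and the 0-1 range come from the prompt text and the field description, so this check is stricter than the schema, which only declares `string` and `number` types.

```python
from typing import Any

# Labels named in the template's <System> prompt (an assumption of this sketch:
# the schema itself only declares the label as a string).
ALLOWED_LABELS = {"correct", "partially correct", "incorrect"}


def validate_correctness(result: dict[str, Any]) -> list[str]:
    """Hypothetical helper: return a list of problems, empty if the result conforms."""
    verdict = result.get("correctness")
    if not isinstance(verdict, dict):
        return ["missing required object 'correctness'"]

    errors: list[str] = []
    for field in ("label", "score", "reason"):
        if field not in verdict:
            errors.append(f"missing required field '{field}'")

    label = verdict.get("label")
    if "label" in verdict and (not isinstance(label, str) or label not in ALLOWED_LABELS):
        errors.append(f"unexpected label: {label!r}")

    if "score" in verdict:
        score = verdict["score"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
        if not is_number or not 0 <= score <= 1:
            errors.append(f"score must be a number in [0, 1], got {score!r}")

    if "reason" in verdict and not isinstance(verdict["reason"], str):
        errors.append("reason must be a string")

    return errors


sample = {
    "correctness": {
        "label": "partially correct",
        "score": 0.5,
        "reason": "Misses one required element.",
    }
}
print(validate_correctness(sample))  # prints []
```

A check like this could run on each evaluation result before scores are aggregated, so that malformed model output is caught early rather than silently skewing the results.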