Boolean LLM-as-Judge (Modelmetry)
This evaluator uses a large language model (LLM) to perform a boolean (true/false) evaluation of the input according to the specified instructions, subject to a maximum token count.
Configuration
Option | Description | Type | Default | Required | Constraints |
---|---|---|---|---|---|
Model | The namespace model used for evaluation | | | | |
Instructions | Directions given to the LLM to ensure responses adhere to the task's requirements | | "You are an LLM evaluator. We need the guarantee that the output answers what is being asked on the input, please evaluate as False if it doesn't." | | MinLength: 1 |
MaxTokens | The maximum number of tokens allowed in the entire prompt, to prevent excessive input processing | | | | |
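To make the three options concrete, here is a minimal Python sketch of how a boolean judge of this kind could be wired together. It is not Modelmetry's implementation or API: `BooleanJudgeConfig`, `judge`, `count_tokens`, and the `call_llm` callback are hypothetical names introduced for illustration, the whitespace token count is a rough stand-in for a real tokenizer, and raising an error when the prompt exceeds `max_tokens` is an assumption about how the limit might be enforced.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Default instructions, copied from the configuration table.
DEFAULT_INSTRUCTIONS = (
    "You are an LLM evaluator. We need the guarantee that the output answers "
    "what is being asked on the input, please evaluate as False if it doesn't."
)


@dataclass
class BooleanJudgeConfig:
    """Hypothetical container mirroring the Model, Instructions and MaxTokens options."""

    model: str
    instructions: str = DEFAULT_INSTRUCTIONS
    max_tokens: Optional[int] = None  # None means "no limit" in this sketch only

    def __post_init__(self) -> None:
        # The table lists the constraint MinLength: 1 for Instructions.
        if len(self.instructions) < 1:
            raise ValueError("instructions must contain at least 1 character")


def count_tokens(text: str) -> int:
    # Assumption: whitespace splitting as a crude token estimate.
    return len(text.split())


def judge(
    config: BooleanJudgeConfig,
    user_input: str,
    output: str,
    call_llm: Callable[[str, str], str],
) -> bool:
    """Build the prompt, enforce the token limit, and parse a True/False reply."""
    prompt = (
        f"{config.instructions}\n\n"
        f"Input:\n{user_input}\n\n"
        f"Output:\n{output}\n\n"
        "Answer with True or False."
    )
    # Assumption: an over-long prompt is rejected rather than truncated.
    if config.max_tokens is not None and count_tokens(prompt) > config.max_tokens:
        raise ValueError("prompt exceeds max_tokens")

    reply = call_llm(config.model, prompt).strip().lower()
    if reply.startswith("true"):
        return True
    if reply.startswith("false"):
        return False
    raise ValueError(f"could not parse a boolean verdict from: {reply!r}")


# Usage with a stub in place of a real LLM client.
def always_true(model: str, prompt: str) -> str:
    return "True"


config = BooleanJudgeConfig(model="hypothetical/model-name", max_tokens=200)
verdict = judge(config, "What is 2 + 2?", "2 + 2 equals 4.", always_true)
print(verdict)
```

Passing the model call in as a callback keeps the sketch independent of any particular LLM client; a real integration would replace `always_true` with a function that sends the prompt to the configured model.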
Metrics
This evaluator does not report any specific metrics; it only evaluates whether the processed content meets the boolean condition specified in the instructions.