Boolean LLM-as-Judge (Modelmetry)

This evaluator uses a large language model (LLM) as a judge: it evaluates the input against the instructions you specify and produces a boolean (true/false) result. A maximum token count caps the size of the prompt sent to the model.
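
To make the idea concrete, here is a minimal sketch of a boolean LLM-as-judge check in Python. It is not Modelmetry's implementation: `call_model`, the prompt layout, and the reply-parsing rule are all assumptions made for illustration, and any real LLM client could be wrapped to fit the `call_model` signature.

```python
from typing import Callable


def parse_boolean_verdict(raw: str) -> bool:
    """Map a judge model's raw text reply to a strict boolean."""
    verdict = raw.strip().lower()
    if verdict.startswith("true"):
        return True
    if verdict.startswith("false"):
        return False
    # Anything else is treated as an invalid reply rather than guessed at.
    raise ValueError(f"Judge reply is not a boolean verdict: {raw!r}")


def evaluate_boolean(
    call_model: Callable[[str, str], str],
    instructions: str,
    input_text: str,
    output_text: str,
) -> bool:
    """Ask the judge model for a True/False verdict on one input/output pair.

    call_model(system_prompt, user_prompt) is a hypothetical stand-in for an
    LLM client; it is not part of any real SDK.
    """
    user_prompt = (
        f"Input:\n{input_text}\n\n"
        f"Output:\n{output_text}\n\n"
        "Answer with exactly one word: True or False."
    )
    return parse_boolean_verdict(call_model(instructions, user_prompt))
```

In a test, `call_model` can be replaced by a stub such as `lambda system, user: "False"`, which keeps the parsing logic checkable without a network call.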

Configuration

Option	Description	Type	Default	Required	Constraints
Model	The namespaced model identifier used for evaluation	`string`	`openai/gpt-4o-mini`	`true`
Instructions	Directions given to the LLM to ensure responses adhere to the task's requirements	`string`	"You are an LLM evaluator. We need the guarantee that the output answers what is being asked on the input, please evaluate as False if it doesn't."	`true`	`MinLength: 1`
MaxTokens	The maximum number of tokens allowed in the entire prompt, to prevent excessive input processing	`int`	`8192`	`true`	`Min: 1`
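
The options and constraints in the table can be mirrored in code for local validation. The following is only a sketch: the class name `BooleanJudgeConfig` and the snake_case field names are hypothetical and do not come from a Modelmetry SDK; the defaults and constraints are taken from the table above.

```python
from dataclasses import dataclass

# Default instructions, copied verbatim from the configuration table.
DEFAULT_INSTRUCTIONS = (
    "You are an LLM evaluator. We need the guarantee that the output answers "
    "what is being asked on the input, please evaluate as False if it doesn't."
)


@dataclass(frozen=True)
class BooleanJudgeConfig:
    """Hypothetical local mirror of the evaluator's three options."""

    model: str = "openai/gpt-4o-mini"
    instructions: str = DEFAULT_INSTRUCTIONS
    max_tokens: int = 8192

    def __post_init__(self) -> None:
        # Model is required, so an empty string is rejected.
        if not self.model:
            raise ValueError("Model is required")
        # Instructions: MinLength: 1
        if len(self.instructions) < 1:
            raise ValueError("Instructions must contain at least 1 character")
        # MaxTokens: Min: 1
        if self.max_tokens < 1:
            raise ValueError("MaxTokens must be at least 1")
```

Constructing `BooleanJudgeConfig()` with no arguments yields the documented defaults, while a value such as `max_tokens=0` raises `ValueError` before any request would be made.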


Metrics

This evaluator does not report any specific metrics. Its only result is whether the evaluated content satisfies the boolean condition defined in the instructions.
