How few-shot examples work
- Few-shot examples are added to your evaluator prompt using the {{Few-shot examples}} variable
- Creating an evaluator with few-shot examples automatically creates a dataset for you, which is populated with few-shot examples once you start making corrections
- At runtime, these examples are inserted into the evaluator prompt to guide its outputs, which helps the evaluator align better with human preferences
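
The runtime step can be pictured with a minimal sketch. This is not the product's implementation: the helper names (render, insert_few_shot_examples), the per-example template, and the sample data are all hypothetical, and the substitution only handles simple {{name}} placeholders rather than the full mustache syntax.

```python
import random
import re

def render(template: str, values: dict) -> str:
    """Naive mustache-style substitution (hypothetical helper).

    Replaces each {{name}} with values[name]; unknown names are left untouched.
    """
    def lookup(match):
        name = match.group(1).strip()
        return str(values.get(name, match.group(0)))
    return re.sub(r"\{\{([^{}]+)\}\}", lookup, template)

def insert_few_shot_examples(evaluator_prompt, example_template, examples,
                             max_examples=5, seed=None):
    """Format up to max_examples examples and place them in the prompt.

    Assumption: when there are more examples than max_examples, a random
    subset is used.
    """
    rng = random.Random(seed)
    pool = list(examples)
    chosen = pool if len(pool) <= max_examples else rng.sample(pool, max_examples)
    block = "\n\n".join(render(example_template, ex) for ex in chosen)
    return render(evaluator_prompt, {"Few-shot examples": block})

# Hypothetical prompt, example template, and corrections data.
prompt = "Grade the response.\n{{Few-shot examples}}\nQuestion: {{question}}"
example_template = (
    "Q: {{question}}\nA: {{response}}\n"
    "Why: {{few_shot_explanation}}\nScore: {{correctness}}"
)
examples = [
    {"question": "2+2?", "response": "5",
     "few_shot_explanation": "Arithmetic error.", "correctness": 0},
]
rendered = insert_few_shot_examples(prompt, example_template, examples)
print(rendered)
```

Only the {{Few-shot examples}} slot is filled in this sketch; the main prompt's own variables, such as {{question}}, are left for the later evaluation step.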
Configure your evaluator
Few-shot examples are not currently supported in LLM-as-a-judge evaluators that use the prompt hub and are only compatible with prompts that use mustache formatting.
1. Configure variable mapping
Each few-shot example is formatted according to the variable mapping specified in the configuration. The variable mapping for few-shot examples should contain the same variables as your main prompt, plus a few_shot_explanation variable and a score variable, which should have the same name as your feedback key.
For example, if your main prompt has the variables question and response, and your evaluator outputs a correctness score, then your few-shot prompt should have the variables question, response, few_shot_explanation, and correctness.
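
The rule above can be checked mechanically. The sketch below derives the expected few-shot variables from a main prompt and a feedback key; the function names are hypothetical and the regex only recognizes simple {{name}} placeholders, so treat it as an illustration rather than the product's validation logic.

```python
import re

def prompt_variables(template: str) -> list[str]:
    """Return mustache variable names in order of first appearance."""
    names = []
    for name in re.findall(r"\{\{\s*([^{}]+?)\s*\}\}", template):
        if name not in names:
            names.append(name)
    return names

def required_few_shot_variables(main_prompt: str, feedback_key: str) -> list[str]:
    """Main prompt variables, plus few_shot_explanation and the feedback key."""
    # The few-shot slot itself is not part of the per-example mapping.
    main_vars = [v for v in prompt_variables(main_prompt)
                 if v != "Few-shot examples"]
    return main_vars + ["few_shot_explanation", feedback_key]

# Hypothetical main prompt for an evaluator with a "correctness" feedback key.
main_prompt = (
    "Question: {{question}}\nResponse: {{response}}\n{{Few-shot examples}}"
)
required = required_few_shot_variables(main_prompt, "correctness")
print(required)
```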
2. Specify the number of few-shot examples to use
You may also specify the number of few-shot examples to use. The default is 5. If your examples are very long, you may want to set this number lower to save tokens. If your examples tend to be short, you can set a higher number to give your evaluator more examples to learn from. If your dataset contains more examples than this number, we will randomly choose which ones to use.
Make corrections
As you start logging traces or running experiments, you will likely disagree with some of the scores that your evaluator has given. When you correct these scores, examples will begin to appear in your corrections dataset. As you make corrections, make sure to attach explanations. These are inserted into your evaluator prompt in place of the few_shot_explanation variable.
The inputs to the few-shot examples will be the relevant fields from the inputs, outputs, and reference (if this is an offline evaluator) of your chain/dataset. The outputs will be the corrected evaluator score and the explanations that you wrote when you left the corrections. Feel free to edit these to your liking. Here is an example of a few-shot example in a corrections dataset:
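
As a rough sketch of the mapping just described, the snippet below builds one example from a single correction. The function name, the "inputs"/"outputs" layout, and the sample values are assumptions made for illustration; the field names in an actual corrections dataset may differ.

```python
def correction_to_example(run_inputs, run_outputs, corrected_score,
                          explanation, feedback_key, reference=None):
    """Build a few-shot example from one correction (hypothetical layout)."""
    # Inputs: relevant fields from the run's inputs and outputs.
    inputs = {**run_inputs, **run_outputs}
    # A reference is only available for offline evaluators.
    if reference is not None:
        inputs["reference"] = reference
    # Outputs: the corrected score under the feedback key, plus the explanation.
    outputs = {
        feedback_key: corrected_score,
        "few_shot_explanation": explanation,
    }
    return {"inputs": inputs, "outputs": outputs}

# Hypothetical correction on a question-answering run.
example = correction_to_example(
    run_inputs={"question": "What is the capital of France?"},
    run_outputs={"response": "Lyon"},
    corrected_score=0,
    explanation="The capital of France is Paris, not Lyon.",
    feedback_key="correctness",
)
print(example)
```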

View your corrections dataset
To view your corrections dataset:
- Online evaluators: Select your run rule and click Edit Rule
- Offline evaluators: Select your evaluator and click Edit Evaluator

