Agent Tool Selection

Evaluating how well your agent selects a tool to use

Agents are often heavy users of tool calling, which can also serve as a proxy for workflow selection or dialog tree management. Given a set of tools and an input, which one should be chosen?

This is a narrower evaluation template than Agent Tool Calling, which evaluates the entire tool call rather than just whether the correct tool was selected.

You can use this in tandem with Agent Parameter Extraction.

Agent Tool Selection Eval Prompt Template

You are an evaluation assistant assessing whether a tool call correctly matches a user's question. 
Your task is to decide if the tool selected is the best choice to answer the question,
using only the list of available tools provided below. You are not responsible for checking the 
parameters or arguments passed to the tool. You are evaluating **only** whether the correct tool 
was selected based on the content of the question. Think like a grading rubric. Be strict. If the
selected tool is not clearly correct based on the question alone, label it "incorrect". Do not 
make assumptions or infer information that is not explicitly stated in the question. 
Only use the information provided.
Your response must be a **single word**: either `"correct"` or `"incorrect"`. 
Do not include any explanation, punctuation, or other characters. The output will be parsed
programmatically.
---
Label the tool call as `"correct"` if **all** of the following are true:
- The selected tool is clearly the best fit to answer the user's question
- The tool is among those available in the tool list
- The question contains enough explicit information to justify selecting this tool
Label the tool call as `"incorrect"` if **any** of the following are true:
- A more appropriate tool exists to answer the question
- The tool is not clearly justified by the question content
- The tool would not produce a relevant or meaningful answer to the question
---
[BEGIN DATA]
************
[Question]: {question}
************
[Tool Called]: {tool_call}
[END DATA]
[Tool Definitions]: {tool_definitions}
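
To make the template variables concrete, here is a hypothetical set of inputs; the question, tool call, and tool definitions below are illustrative placeholders, not part of the template itself.

question = "What will the weather be in Paris tomorrow?"
tool_call = "get_weather_forecast"
tool_definitions = """
[
  {"name": "get_weather_forecast", "description": "Returns the weather forecast for a given city and date."},
  {"name": "get_current_time", "description": "Returns the current time in a given timezone."},
  {"name": "search_web", "description": "Performs a general-purpose web search."}
]
"""

With these inputs, the expected label is "correct", since get_weather_forecast is the only tool that directly answers the question.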

How to Run the Tool Selection Eval

import pandas as pd

from phoenix.evals import (
    TOOL_SELECTION_PROMPT_RAILS_MAP,
    TOOL_SELECTION_PROMPT_TEMPLATE,
    OpenAIModel,
    llm_classify,
)

df_in = pd.DataFrame(
    {"question": ["<INSERT QUESTION>"], "tool_call": ["<INSERT TOOL CALL>"]}
)

# JSON string describing the tools available to the agent; it is
# substituted into the prompt template below
json_tools = "<INSERT TOOL DEFINITIONS AS JSON>"

# the rails object will be used to snap responses to "correct"
# or "incorrect"
rails = list(TOOL_SELECTION_PROMPT_RAILS_MAP.values())
model = OpenAIModel(
    model_name="gpt-4",
    temperature=0.0,
)

# Loop through the specified dataframe and run each row
# through the specified model and prompt. llm_classify
# will run requests concurrently to improve performance.
tool_call_evaluations = llm_classify(
    dataframe=df_in,
    template=TOOL_SELECTION_PROMPT_TEMPLATE.template.replace("{tool_definitions}", json_tools),
    model=model,
    rails=rails,
    provide_explanation=True,
)
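
Once the eval has run, the output can be joined back to the input rows for inspection. A minimal sketch, assuming the dataframe returned by llm_classify carries its standard label and explanation columns:

# Attach eval output to the original rows
results = df_in.join(tool_call_evaluations[["label", "explanation"]])

# Fraction of rows the eval model judged "correct"
selection_accuracy = (results["label"] == "correct").mean()
print(f"Tool selection accuracy: {selection_accuracy:.2%}")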

Benchmark Results

This benchmark was obtained using the notebook below. It was run using the Berkeley Function Calling Leaderboard (BFCL) dataset as the ground truth dataset. Each example in the dataset was evaluated using the TOOL_SELECTION_PROMPT_TEMPLATE above, and the resulting labels were compared against the ground truth labels in the BFCL dataset.

Note: Some incorrect examples were added to the dataset to enhance scoring. Details on this methodology are included in the notebook.
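
As a rough sketch of the label comparison described above (assuming a hypothetical bfcl_df with one row per example and a ground_truth column holding the BFCL label), agreement can be computed as:

# Hypothetical: align eval labels with BFCL ground-truth labels by position
bfcl_df["predicted"] = tool_call_evaluations["label"].values
agreement = (bfcl_df["predicted"] == bfcl_df["ground_truth"]).mean()
print(f"Agreement with BFCL ground truth: {agreement:.2%}")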

Results from OpenAI and Anthropic Models
