Initializer Constructor.
Properties of the selection benchmark
Execute the benchmark.
Executes the LLM function selection benchmark and returns its result.
To watch the progress of the benchmark, pass a callback function as the listener argument. The callback is called whenever a benchmark event occurs.
You can also publish a report in markdown format by calling the report function after the benchmark has executed.
Optional listener: (event: IAgenticaSelectBenchmarkEvent<Model>) => void
Callback function listening to the benchmark events
Results of the function selection benchmark
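As an illustration of the listener argument described above, the following is a minimal sketch of a progress-tracking callback. The SelectEventLike interface is a hypothetical, simplified stand-in for IAgenticaSelectBenchmarkEvent<Model>, and its `type` field with the values success, failure, and error is an assumption based on the report file names, not a confirmed part of the library's API. The createProgressListener helper is likewise hypothetical.

```typescript
// Hypothetical, simplified stand-in for IAgenticaSelectBenchmarkEvent<Model>.
// The real event type is not shown on this page; only a `type` field is assumed.
interface SelectEventLike {
  type: "success" | "failure" | "error";
}

// Builds a listener that tallies events by type and logs one progress line per event.
function createProgressListener(
  total: number,
  log: (line: string) => void = console.log,
) {
  const counts = { success: 0, failure: 0, error: 0 };
  let done = 0;
  const listener = (event: SelectEventLike): void => {
    counts[event.type] += 1;
    done += 1;
    log(`[${done}/${total}] ${event.type}`);
  };
  return { counts, listener };
}

// Possible usage, assuming `benchmark` is an AgenticaSelectBenchmark instance
// created elsewhere and `total` is the number of events you expect:
//   const progress = createProgressListener(total);
//   await benchmark.execute(progress.listener);
```

Keeping the tally in a separate object lets the caller inspect the counts after execution without depending on the shape of the returned result.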
Report the benchmark result as markdown files.
Reports the benchmark result executed by AgenticaSelectBenchmark as markdown files, and returns a dictionary object of the markdown report files. Each key of the dictionary is a file name, and each value is the markdown content of that file.
For reference, the markdown files are organized as follows:
./README.md
./scenario-1/README.md
./scenario-1/1.success.md
./scenario-1/2.failure.md
./scenario-1/3.error.md
Dictionary of markdown files.
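Because the report function returns a dictionary rather than writing anything itself, the caller decides where the files go. A minimal sketch of persisting such a dictionary with Node.js follows; the writeReportFiles helper and the output directory name are hypothetical and not part of the library.

```typescript
import * as fs from "fs";
import * as path from "path";

// Writes a { fileName: markdownContent } dictionary under `root`,
// creating nested directories such as scenario-1/ as needed.
// Returns the list of file paths that were written.
function writeReportFiles(
  root: string,
  files: Record<string, string>,
): string[] {
  const written: string[] = [];
  for (const [name, content] of Object.entries(files)) {
    const location = path.join(root, name);
    fs.mkdirSync(path.dirname(location), { recursive: true });
    fs.writeFileSync(location, content, "utf8");
    written.push(location);
  }
  return written;
}

// Possible usage, assuming `benchmark` has already been executed:
//   writeReportFiles("./docs/benchmarks/select", benchmark.report());
```

The directory is created per file with the recursive option, so keys such as ./scenario-1/1.success.md work without a separate pass over the scenario names.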
LLM function calling selection benchmark.
AgenticaSelectBenchmark is a class for benchmarking the selection part of LLM (Large Language Model) function calling. It uses the selector agent and tests whether the expected IAgenticaOperation operations are properly selected from the given IAgenticaSelectBenchmarkScenario scenarios.
Note that the AgenticaSelectBenchmark class measures only selection: it tests whether the selector agent selects the candidate functions to call as expected. It therefore does not test the actual function calling, which is done by the executor agent. If you want to test that, use the AgenticaCallBenchmark class instead.
Author
Samchon