tau_eval.utils module
- tau_eval.utils.evaluate_system_output(inputs: list[str], outputs: list[str], metrics: list[str | Callable[[str | list[str], str | list[str]], dict[str, float]]] = ['rouge', 'meteor', 'luar']) → dict
Evaluate a system output with automatic metrics.
- tau_eval.utils.run_models_on_custom_task(models, task: CustomTask, metrics)
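
The `metrics` parameter of `evaluate_system_output` accepts callables as well as metric names. The sketch below shows one way to write a callable that matches the documented type `Callable[[str | list[str], str | list[str]], dict[str, float]]`. The function `length_ratio` and its result key are hypothetical, introduced only for illustration. The signature does not say whether the library passes single strings or lists, so the sketch handles both.

```python
from typing import Union

TextOrList = Union[str, list[str]]


def length_ratio(inputs: TextOrList, outputs: TextOrList) -> dict[str, float]:
    """Hypothetical metric: mean output/input length ratio in whitespace tokens."""
    # Normalise both arguments to lists, since the documented type allows either form.
    if isinstance(inputs, str):
        inputs = [inputs]
    if isinstance(outputs, str):
        outputs = [outputs]
    if len(inputs) != len(outputs):
        raise ValueError("inputs and outputs must have the same length")
    # Guard against empty inputs with max(..., 1) to avoid division by zero.
    ratios = [
        len(out.split()) / max(len(inp.split()), 1)
        for inp, out in zip(inputs, outputs)
    ]
    return {"length_ratio": sum(ratios) / len(ratios) if ratios else 0.0}


inputs = ["the quick brown fox", "hello world"]
outputs = ["a fast fox", "hello there world"]
scores = length_ratio(inputs, outputs)

# Assumed usage, inferred only from the signature above and not verified
# against the library:
# from tau_eval.utils import evaluate_system_output
# results = evaluate_system_output(inputs, outputs, metrics=["rouge", length_ratio])
```

Returning a `dict[str, float]` rather than a bare float lets one callable report several named scores at once, which matches the documented return type for metric callables.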