tau_eval.metrics package
Submodules
- tau_eval.metrics.bertscore.compute_bertscore(input_texts: str | list[str], output_texts: str | list[str], model_id: str = 'distilbert-base-uncased') → dict[str, list[float]]
Computes BERTScore for a list of input and output text pairs.
- Parameters:
input_texts – A string or a list of input text strings.
output_texts – A string or a list of output text strings.
model_id – Identifier of the HuggingFace BERT-style model to use.
- Returns:
A dictionary containing BERTScore scores for each input-output pair. The dictionary will contain keys “precision”, “recall”, and “f1”.
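The compute functions in this package are all annotated as returning dict[str, list[float]], that is, one float per text pair under each key. A minimal sketch of collapsing such a result into per-key means follows. `summarize_scores` is a hypothetical helper, not part of tau_eval, and the sample values are made up for illustration.

```python
from statistics import fmean


def summarize_scores(scores: dict[str, list[float]]) -> dict[str, float]:
    """Average each per-pair score list (hypothetical helper, not part of tau_eval)."""
    # Skip empty lists so fmean never sees an empty sequence.
    return {name: fmean(values) for name, values in scores.items() if values}


# Made-up values in the documented shape of a compute_bertscore result.
example = {"precision": [0.5, 1.0], "recall": [0.25, 0.75], "f1": [0.5, 0.5]}
summary = summarize_scores(example)
print(summary)
```

With the library installed, the same helper could presumably be applied to the output of compute_bertscore(input_texts, output_texts).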
- tau_eval.metrics.cola.cola_score(text: str, cola_tokenizer: PreTrainedTokenizer, cola_model: PreTrainedModel, device: str = 'cuda') → float
Calculates the CoLA score for a single piece of text.
- Parameters:
text – The text to score.
cola_tokenizer – The tokenizer for the CoLA model.
cola_model – The CoLA model.
device – The device to run the model on (“cuda” or “cpu”).
- Returns:
The CoLA score (a float between 0 and 1).
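The documentation states only that the score lies between 0 and 1. One plausible reading, sketched below, is that it is the softmax probability a binary classifier assigns to the "acceptable" class. This is an assumption, as is the class index; the function names are hypothetical and the logits are made up.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def acceptability(logits: list[float], acceptable_index: int = 1) -> float:
    """Probability of the 'acceptable' class (the index is an assumption)."""
    return softmax(logits)[acceptable_index]


# Made-up logits: equal logits give an undecided score of 0.5,
# a strongly positive second logit pushes the score toward 1.
undecided = acceptability([0.0, 0.0])
confident = acceptability([-2.0, 3.0])
```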
- tau_eval.metrics.cola.compute_cola(output_texts: str | list[str], cola_tokenizer: PreTrainedTokenizer, cola_model: PreTrainedModel, device: str = 'cuda') → dict[str, list[float]]
Computes CoLA scores for one or more output texts.
- Parameters:
output_texts – A string or a list of text strings.
cola_tokenizer – The tokenizer for the CoLA model.
cola_model – The CoLA model.
device – The device to run the model on (“cuda” or “cpu”).
- Returns:
A dictionary containing CoLA scores for each output text.
- tau_eval.metrics.cola.load_cola(model_name: str = 'textattack/roberta-base-CoLA', device: str = 'cuda') → tuple[PreTrainedModel, PreTrainedTokenizer]
Loads the CoLA (Corpus of Linguistic Acceptability) model and tokenizer.
- Parameters:
model_name – HuggingFace model to load.
device – The device to load the model onto (“cuda” or “cpu”).
- Returns:
A tuple containing the loaded model and tokenizer.
Sentence transformers version of LUAR
- tau_eval.metrics.luar.compute_luar(input_texts: str | list[str], output_texts: str | list[str], sim_model: SentenceTransformer) → dict[str, list[float]]
Computes LUAR scores based on cosine similarity between the embeddings of original and rewritten texts.
- Parameters:
input_texts – A string or a list of original texts.
output_texts – A string or a list of rewritten texts.
sim_model – The loaded SentenceTransformer model.
- Returns:
A dictionary containing the LUAR score for each input-output pair. The dictionary has the key “luar” with a list of float values.
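The similarity step can be sketched in plain Python on precomputed vectors. This is an illustration of pairwise cosine similarity under the documented return shape, not the library's implementation: `cosine_similarity` and `pairwise_similarity` are hypothetical names, and the vectors stand in for embeddings that sim_model would produce.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def pairwise_similarity(
    originals: list[list[float]], rewrites: list[list[float]]
) -> dict[str, list[float]]:
    """Score the i-th original against the i-th rewrite (hypothetical helper)."""
    if len(originals) != len(rewrites):
        raise ValueError("originals and rewrites must have the same length")
    return {"luar": [cosine_similarity(a, b) for a, b in zip(originals, rewrites)]}


# Toy 2-D "embeddings": the first pair is orthogonal, the second identical.
result = pairwise_similarity([[1.0, 0.0], [3.0, 4.0]], [[0.0, 1.0], [3.0, 4.0]])
```

For an authorship embedding such as LUAR, a lower similarity between original and rewrite suggests that the rewrite moved further from the original author's style.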
- tau_eval.metrics.luar.load_luar(model_name: str = 'gabrielloiseau/LUAR-MUD-sentence-transformers', device: str = 'cuda') → SentenceTransformer
Loads the LUAR (Learning Universal Authorship Representations) sentence transformer model.
- Parameters:
model_name – SentenceTransformers model to load.
device – The device to load the model onto (“cuda” or “cpu”).
- Returns:
The loaded SentenceTransformer model.
- tau_eval.metrics.meteor.compute_meteor(input_texts: str | list[str], output_texts: str | list[str], alpha: float = 0.9, beta: float = 3, gamma: float = 0.5) → dict[str, list[float]]
Computes METEOR scores for a list of input and output text pairs.
- Parameters:
input_texts – A string or a list of input text strings.
output_texts – A string or a list of output text strings.
alpha – Parameter for controlling relative weights of precision and recall.
beta – Parameter for controlling shape of penalty function.
gamma – Relative weight of fragmentation penalty.
- Returns:
A dictionary containing METEOR scores for each input-output pair.
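To make the roles of alpha, beta, and gamma concrete, here is a sketch of the standard METEOR scoring formula computed from alignment counts. It assumes the library follows the common formulation (the defaults 0.9, 3, and 0.5 match those of NLTK's meteor_score, but that correspondence is an assumption). `meteor_from_counts` is a hypothetical function, and the word alignment that produces the counts is not shown.

```python
def meteor_from_counts(
    matches: int,
    hypothesis_len: int,
    reference_len: int,
    chunks: int,
    alpha: float = 0.9,
    beta: float = 3.0,
    gamma: float = 0.5,
) -> float:
    """METEOR score from unigram-alignment counts (hypothetical helper)."""
    if matches == 0:
        return 0.0
    precision = matches / hypothesis_len
    recall = matches / reference_len
    # alpha weights recall against precision in the harmonic mean.
    f_mean = (precision * recall) / (alpha * precision + (1 - alpha) * recall)
    # Fewer, longer contiguous chunks mean less fragmentation; beta shapes
    # the penalty curve and gamma caps its weight.
    penalty = gamma * (chunks / matches) ** beta
    return (1 - penalty) * f_mean


# A 4-word hypothesis identical to the reference: one chunk, small penalty.
perfect = meteor_from_counts(matches=4, hypothesis_len=4, reference_len=4, chunks=1)
```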
- tau_eval.metrics.nli.compute_nli(input_texts: str | list[str], output_texts: str | list[str], nli_tokenizer: PreTrainedTokenizer, nli_model: PreTrainedModel, batch_size: int = 16, device: str = 'cuda', max_length: int = 128) → dict[str, list[float]]
Computes the probability of entailment for each input-output text pair using the NLI model.
- Parameters:
input_texts – A string or a list of premise texts.
output_texts – A string or a list of hypothesis texts.
nli_tokenizer – The tokenizer for the NLI model.
nli_model – The NLI model.
batch_size – The number of text pairs to process per batch.
device – The device to run the model on (“cuda” or “cpu”).
max_length – The maximum number of tokens per encoded pair.
- Returns:
A dictionary containing the probability of entailment for each pair. The dictionary has the key “entailment” with a list of float values.
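The batch_size parameter suggests that pairs are scored in fixed-size chunks. A minimal sketch of that batching pattern is shown below; `iter_batches` is a hypothetical helper and the tokenization and model call are deliberately omitted, since the library's internals are not documented here.

```python
from collections.abc import Iterator


def iter_batches(
    premises: list[str], hypotheses: list[str], batch_size: int = 16
) -> Iterator[list[tuple[str, str]]]:
    """Yield (premise, hypothesis) pairs in chunks of at most batch_size."""
    if len(premises) != len(hypotheses):
        raise ValueError("premises and hypotheses must have the same length")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    pairs = list(zip(premises, hypotheses))
    for start in range(0, len(pairs), batch_size):
        yield pairs[start:start + batch_size]


# Five pairs with batch_size=2 give batches of 2, 2, and 1.
batches = list(iter_batches(list("abcde"), list("vwxyz"), batch_size=2))
```

Each batch would then be tokenized with truncation to max_length and passed through the model on the chosen device.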
- tau_eval.metrics.nli.load_nli(model_name: str = 'alisawuffles/roberta-large-wanli', device: str = 'cuda') → tuple[PreTrainedTokenizer, PreTrainedModel]
Loads the NLI (Natural Language Inference) model and tokenizer.
- Parameters:
model_name – HuggingFace model to load.
device – The device to load the model onto (“cuda” or “cpu”).
- Returns:
A tuple containing the loaded tokenizer and model.
- tau_eval.metrics.perplexity.compute_perplexity(output_texts: str | list[str], model_id: str = 'gpt2') → dict[str, list[float]]
Computes perplexity scores for a list of output texts.
- Parameters:
output_texts – A string or list of output text strings.
model_id – HuggingFace model to use.
- Returns:
A dictionary containing perplexity scores for each output text.
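Perplexity is conventionally the exponential of the mean negative log-likelihood per token. The sketch below computes it from a list of token log-probabilities; `perplexity` is a hypothetical helper, and obtaining the log-probabilities from a language model such as gpt2 is outside its scope.

```python
import math


def perplexity(token_log_probs: list[float]) -> float:
    """exp of the mean negative natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# If every token has probability 1/4, the model is as uncertain as a
# uniform choice among four options, so the perplexity is about 4.
uniform = perplexity([math.log(0.25)] * 4)
certain = perplexity([0.0, 0.0])
```

Lower values indicate text the language model finds more predictable, which is often used as a rough proxy for fluency.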
- tau_eval.metrics.rouge.compute_rouge(input_texts: str | list[str], output_texts: str | list[str]) → dict[str, list[float]]
Computes ROUGE scores for a list of input and output text pairs.
- Parameters:
input_texts – A string or a list of input text strings.
output_texts – A string or a list of output text strings.
- Returns:
A dictionary containing ROUGE scores for each input-output pair.
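The documentation does not say which ROUGE variants or dictionary keys are returned. As an illustration of the underlying idea only, here is a sketch of ROUGE-1, the unigram-overlap variant, using whitespace tokenization. `rouge1` is a hypothetical function and its keys are not claimed to match those of compute_rouge.

```python
from collections import Counter


def rouge1(reference: str, candidate: str) -> dict[str, float]:
    """Unigram-overlap precision, recall, and F1 (simplified ROUGE-1 sketch)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Multiset intersection: each word counts at most as often as in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# "the" and "cat" overlap: 2 of 4 candidate words, 2 of 3 reference words.
scores = rouge1("the cat sat", "the cat ran fast")
```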
- tau_eval.metrics.sbert.compute_sbert(input_texts: str | list[str], output_texts: str | list[str], sim_model: SentenceTransformer) → dict[str, list[float]]
Computes the cosine similarity between the embeddings of original and rewritten texts.
- Parameters:
input_texts – A string or a list of original texts.
output_texts – A string or a list of rewritten texts.
sim_model – The loaded SentenceTransformer model.
- Returns:
A dictionary containing the similarity scores for each input text pair. The dictionary has the key “similarity” with a list of float values.
- tau_eval.metrics.sbert.load_sbert(model_name: str = 'sentence-transformers/all-MiniLM-L6-v2', device: str = 'cuda') → SentenceTransformer
Loads the sentence similarity model.
- Parameters:
model_name – SentenceTransformers model to load.
device – The device to load the model onto (“cuda” or “cpu”).
- Returns:
The loaded SentenceTransformer model.