Log Probabilities of the T5 Model #4
The HuggingFace interface provides the logits when you pass the following options:

```python
with torch.no_grad():
    outputs = model.generate(inputs, return_dict_in_generate=True, output_scores=True)
```

This returns the logits for all tokens in the output. However, you only want the scores for the answer choices. Example code (file paths and other details should be changed to match the format above):

```python
from datasets import load_dataset
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, AutoConfig
import json
import pandas as pd


def get_dataset_names(filename):
    with open(filename) as fp:
        lines = fp.readlines()
    dataset_names = [line.rstrip() for line in lines]
    return dataset_names


PROMPT_OUTPUT_TYPE = 'MC ID'  # TODO use only rows of dataframe where dataset_df['Output type'] == PROMPT_OUTPUT_TYPE
model_name = "T0"
config = AutoConfig.from_pretrained('bigscience/' + model_name)
model = AutoModelForSeq2SeqLM.from_pretrained('bigscience/' + model_name)
tokenizer = AutoTokenizer.from_pretrained('bigscience/' + model_name)

dataset_df = pd.read_csv('../data/p3_dataset_info.csv')
# dataset_names = get_dataset_names('../data/p3_datasets_mc_id.txt')

for index, row in dataset_df.iterrows():
    predictions_list = []
    # Only getting validation set predictions
    dataset_name = row['Dataset']
    # TODO fix--currently skipping a dataset with no answer choice field
    if dataset_name == 'wiqa_effect_with_label_answer':
        continue
    split = row['Validation split']
    answer_choice_field = row['Answer choice field']
    # other_split = row['Other split']  # TODO
    output_file = '../data/predictions/' + model_name + '_' + PROMPT_OUTPUT_TYPE + '_' + dataset_name + '_' + split + '.json'
    print(f"Getting predictions for {dataset_name} and saving to {output_file}")
    dataset = load_dataset("bigscience/P3", dataset_name)
    counter = 0
    for example in dataset[split]:
        # print(tokenizer.decode(example['targets']))
        with torch.no_grad():
            inputs = torch.tensor(example['inputs'], dtype=torch.long).view(1, -1)
            outputs = model.generate(inputs, return_dict_in_generate=True, output_scores=True)
        text_out = tokenizer.decode(outputs.sequences[0], skip_special_tokens=False)
        scores = outputs['scores']
        # The output has a sequence length of 1, so we only need the scores of the first step
        scores = scores[0]
        output_token = torch.argmax(scores).item()
        options = example[answer_choice_field]
        indices = tokenizer.batch_encode_plus(options, add_special_tokens=False, return_attention_mask=False, return_tensors="pt")['input_ids']
        correct_token = example['targets'][0]
        output_is_correct = output_token == correct_token
        unnormalized_scores = scores[0][indices]  # unnormalized meaning no softmax was applied
        normalized_scores = torch.nn.functional.softmax(scores, dim=-1)[0][indices]  # normalized meaning softmax was applied
        # Get dictionary of key=token index, value=option
        indices = indices.tolist()
        tokens_to_options_dict = {}
        for option, index in zip(options, indices):
            tokens_to_options_dict[index[0]] = option
        # Get dictionary of key=option, value=token index
        options_to_tokens_dict = {}
        for option, index in zip(options, indices):
            options_to_tokens_dict[option] = index[0]
        # Get dictionary of key=option, value=unnormalized score
        unnormalized_scores = unnormalized_scores.tolist()
        unnormalized_scores_dict = {}
        for option, score in zip(options, unnormalized_scores):
            unnormalized_scores_dict[option] = score[0]
        # Get dictionary of key=option, value=normalized score
        normalized_scores = normalized_scores.tolist()
        normalized_scores_dict = {}
        for option, score in zip(options, normalized_scores):
            normalized_scores_dict[option] = score[0]
        correct_option = tokens_to_options_dict[correct_token]
        output_option = tokens_to_options_dict[output_token]
        dictObj = {
            'dataset': dataset_name,
            'options': options,
            'tokens_to_options_dict': tokens_to_options_dict,
            'options_to_tokens_dict': options_to_tokens_dict,
            'unnormalized_scores': unnormalized_scores_dict,
            'normalized_scores': normalized_scores_dict,
            'correct_option': correct_option,
            'output_option': output_option,
            'output_is_correct': output_is_correct,
        }
        # print(dictObj)
        predictions_list.append(dictObj)
        counter += 1
        if counter % 100 == 0:
            print(f"{dataset_name}: Finished with {counter} examples")  # TODO /{total_count}
    print(f"Saving predictions to {output_file}...")
    with open(output_file, 'w') as f:
        json.dump(predictions_list, f)
    print(f"Done with {dataset_name}")
    print()
```
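If you specifically want log probabilities (as in the title of this issue), a minimal sketch, not part of the script above: apply `log_softmax` to the same `scores` tensor and gather at the answer-choice token indices. Place it right after the `normalized_scores` line, while `indices` is still a tensor; `torch.nn.functional.log_softmax` gives the same values as taking the log of the softmax output but is more numerically stable.

```python
import torch.nn.functional as F

# Log probabilities over the vocabulary at the first (and only) decoder step,
# gathered at the answer-choice token indices. Reuses `scores` (shape 1 x vocab)
# and `indices` (shape n_options x 1) from the loop above.
log_scores = F.log_softmax(scores, dim=-1)[0][indices]
log_scores_dict = {option: lp[0] for option, lp in zip(options, log_scores.tolist())}
```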
@AADeLucia Thank you so much for the quick and detailed response. I am dealing with a text generation task, so I am still adapting the solution you provided to my task. I will keep you posted in case of any further queries. Once again, many thanks for your active response.
Dear Team,
This is Divya. First of all, congratulations on the great work. I am referring to your repository to estimate the confidence of a language model. I am actually working with the Flan-T5 model, and I would like to know how to get the log probs of the T5 model.
From your repository, under the path `data/predictions/T0_prompts/flan/cos_e`, I could see the probabilities and the log probabilities, as follows:
{"dataset_name": "cos_e", "dataset_config_name": "v1.11", "template_name": "description_question_option_id", "context_id": "080ef6941410139d6869e78122bc741e", "target": 2, "prediction": 2, "probabilities": [0.00020205100008752197, 0.005021515768021345, 0.9925065636634827, 0.00039718663902021945, 1.838841853896156e-05], "log_probabilities": [-8.506990432739258, -5.294023513793945, -0.007521673105657101, -7.831104278564453, -10.903789520263672]}
However, as far as I understand, the model does not provide this information in the response directly. Could you give me an idea of how to derive it?
I would be very thankful to you if you could help me out.
Thanks and Regards,
Divya