Chapter 4 (04_multilingual-ner.ipynb), Trainer() reported KeyError: 'email' #142

Open
10 tasks
shandong1970 opened this issue Jul 11, 2024 · 4 comments

@shandong1970

Information

The problem arises in chapter:

  • Introduction
  • Text Classification
  • Transformer Anatomy
  • [*] Multilingual Named Entity Recognition
  • Text Generation
  • Summarization
  • Question Answering
  • Making Transformers Efficient in Production
  • Dealing with Few to No Labels
  • Training Transformers from Scratch
  • Future Directions

Describe the bug

To Reproduce

Steps to reproduce the behavior:

  1. Run the cells of 04_multilingual-ner.ipynb one by one.
  2. The error is raised when Trainer is instantiated.

The code snippets are below:

from transformers import Trainer

trainer = Trainer(model_init=model_init, args=training_args, 
                  data_collator=data_collator, compute_metrics=compute_metrics,
                  train_dataset=panx_de_encoded["train"],
                  eval_dataset=panx_de_encoded["validation"], 
                  tokenizer=xlmr_tokenizer)

The error messages are below:

/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_deprecation.py:131: FutureWarning: 'Repository' (from 'huggingface_hub.repository') is deprecated and will be removed from version '1.0'. Please prefer the http-based alternatives instead. Given its large adoption in legacy code, the complete removal is only planned on next major release.
For more details, please read https://huggingface.co/docs/huggingface_hub/concepts/git_vs_http.
  warnings.warn(warning_message, FutureWarning)
/content/notebooks/xlm-roberta-base-finetuned-panx-de is already a clone of https://huggingface.co/shandong1970/xlm-roberta-base-finetuned-panx-de. Make sure you pull the latest changes with `repo.git_pull()`.
WARNING:huggingface_hub.repository:/content/notebooks/xlm-roberta-base-finetuned-panx-de is already a clone of https://huggingface.co/shandong1970/xlm-roberta-base-finetuned-panx-de. Make sure you pull the latest changes with `repo.git_pull()`.
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
[<ipython-input-54-ef227f75390c>](https://localhost:8080/#) in <cell line: 4>()
      2 from transformers import Trainer
      3 
----> 4 trainer = Trainer(model_init=model_init, args=training_args, 
      5                   data_collator=data_collator, compute_metrics=compute_metrics,
      6                   train_dataset=panx_de_encoded["train"],

4 frames
[/usr/local/lib/python3.10/dist-packages/huggingface_hub/repository.py](https://localhost:8080/#) in __init__(self, local_dir, clone_from, repo_type, token, git_user, git_email, revision, skip_lfs_files, client)
    543 
    544             if git_email is None:
--> 545                 git_email = user["email"]
    546 
    547             if git_user is None:

KeyError: 'email'
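
The last frame of the traceback fails on the lookup `user["email"]`, which suggests that the account information fetched for the current token did not contain an `email` field. The snippet below is a minimal sketch of that failure mode, not the library's actual code: both dictionaries and both helper functions are hypothetical stand-ins, and the real fields returned by the Hub may differ.

```python
# Hypothetical account payloads, for illustration only.
user_with_email = {"name": "example-user", "email": "user@example.com"}
user_without_email = {"name": "example-user"}


def git_email_strict(user):
    # Same kind of lookup as the failing line in the traceback:
    # raises KeyError when the "email" key is absent.
    return user["email"]


def git_email_or_none(user):
    # Defensive variant: returns None instead of raising.
    return user.get("email")


try:
    git_email_strict(user_without_email)
except KeyError as err:
    print(f"KeyError: {err}")

print(git_email_or_none(user_without_email))
```

If this reading is right, the error says more about what the token is allowed to expose than about the notebook code itself.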

Expected behavior

@shandong1970 shandong1970 changed the title Chapter 4, Trainer() reported KeyError: 'email' Chapter 4, 04_multilingual-ner.ipynb, Trainer() reported KeyError: 'email' Jul 11, 2024
@shandong1970 shandong1970 changed the title Chapter 4, 04_multilingual-ner.ipynb, Trainer() reported KeyError: 'email' Chapter 4 (04_multilingual-ner.ipynb), Trainer() reported KeyError: 'email' Jul 11, 2024
@Ice-Citron

I think I ran into this error before. You probably need to pass in your HF_TOKEN, or something along those lines.

@dongshan7005

I think you are right. The issue is caused by the Hugging Face token. Could you please show me how to set up HF_TOKEN in Colab so that the Chapter 4 code runs normally? Thanks! (I've tried setting up the token my own way, but I still get the error.)

@dongshan7005

Thank you. I've created a new token with WRITE permission. The program runs successfully.
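
For anyone landing here with the same error, the following is a minimal sketch of one way to supply the token in a notebook. It assumes the token is stored either in an `HF_TOKEN` environment variable or as a Colab secret of the same name. The helper names `read_colab_secret`, `resolve_hf_token` and `login_to_hub` are made up for this example. `userdata.get` from `google.colab` and `login` from `huggingface_hub` are imported lazily, so the lookup logic runs even where those packages are not installed.

```python
import os


def read_colab_secret(name="HF_TOKEN"):
    """Return a Colab secret, or None when not in Colab or the secret is unavailable."""
    try:
        from google.colab import userdata  # only importable inside Colab
    except ImportError:
        return None
    try:
        return userdata.get(name)
    except Exception:
        # Secret missing, or notebook access to it has not been granted.
        return None


def resolve_hf_token(env=None):
    """Prefer the HF_TOKEN environment variable, then fall back to a Colab secret."""
    env = os.environ if env is None else env
    token = (env.get("HF_TOKEN") or "").strip()
    if token:
        return token
    secret = read_colab_secret()
    return secret.strip() if secret else None


def login_to_hub(token):
    """Register the token with huggingface_hub so later Hub calls can use it."""
    from huggingface_hub import login  # imported lazily
    login(token=token)
```

In a notebook, this would be used by calling `resolve_hf_token()`, checking that the result is not `None`, and passing it to `login_to_hub(...)` before the cell that constructs `Trainer`. As noted in the comment above, the token itself needed WRITE permission for the error to go away.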

@Ice-Citron

@dongshan7005 Sorry for the late response. Feel free to use the .ipynb files in my repo for reference too. Good luck.

https://github.com/Ice-Citron/NLP-Transformer/blob/main/Chapter%204/CS_EE_Multilingual_NER.ipynb


3 participants