
Reproduce the PPL accuracy anomaly of GPT2 W8A8 (PPL=17590.9778) #46

Open
FlyingPotatoZ opened this issue Oct 19, 2023 · 0 comments
FlyingPotatoZ commented Oct 19, 2023

Using the gpt2 model, I tested the quantization accuracy.
Model download: https://github.com/quic/aimet-model-zoo/releases/download/torch_gpt2/gpt2_wikitext_finetune.tar.gz
Test data: wikitext-2-raw-v1

Item             Description
AIMET            1.28.0
OS               Ubuntu 20.04
CUDA             11.6
torch            1.13.1+cu116
Python           3.8.10
aimet-zoo-torch  1.5.0

The FP32 accuracy is correct, but the W8A8 perplexity is abnormally large. The results are as follows:
aimet_zoo_torch/gpt2/evaluators# python gpt2_quanteval.py --model_config gpt2_w8a8 --per_device_eval_batch_size 8
2023-10-19 02:52:23,612 - root - INFO - AIMET
2023-10-19 02:52:39,262 - datasets.builder - WARNING - Reusing dataset wikitext (/root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0)
100%|██████████| 3/3 [00:00<00:00, 712.27it/s]
2023-10-19 02:52:39,374 - datasets.arrow_dataset - WARNING - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/cache-957e58d88e4ab49c.arrow
2023-10-19 02:52:39,407 - datasets.arrow_dataset - WARNING - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/cache-10932a0976197214.arrow
2023-10-19 02:52:39,440 - datasets.arrow_dataset - WARNING - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/cache-ba370f2b62ba6d71.arrow
2023-10-19 02:52:39,452 - datasets.arrow_dataset - WARNING - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/cache-1252412874756be5.arrow
2023-10-19 02:52:39,464 - datasets.arrow_dataset - WARNING - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/cache-cfab500129fdf76e.arrow
2023-10-19 02:52:39,476 - datasets.arrow_dataset - WARNING - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/cache-af35feebb8a10af8.arrow
orig model fp32 inference
loss: 3.320616739840547 , ppl: 27.67741506034785
/usr/local/lib/python3.8/dist-packages/aimet_zoo_torch/gpt2/model/huggingface/baseline_models/gpt2/modeling_gpt2.py:188: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
w = w / (float(v.size(-1)) ** 0.5)
2023-10-19 02:52:51,367 - Quant - INFO - Unsupported op type Squeeze
2023-10-19 02:52:51,368 - Quant - INFO - Unsupported op type Mean
2023-10-19 02:52:51,542 - Quant - INFO - Selecting DefaultOpInstanceConfigGenerator to compute the specialized config. hw_version:default
loss: 3.1809085607528687 , ppl: 24.06861141667116
sim_orig model int8 inference
loss: 9.775141424384 , ppl: 17590.977796391602
2023-10-19 02:53:10,600 - main - INFO - Original model performances
2023-10-19 02:53:10,601 - main - INFO - ===========================
2023-10-19 02:53:10,601 - main - INFO - Original Model | 32-bit Environment | perplexity : 27.6774
2023-10-19 02:53:10,601 - main - INFO - Original Model | 8-bit Environment | perplexity: 17590.9778
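As a sanity check, the reported perplexities follow directly from the reported losses via PPL = exp(loss), so the anomaly lies in the quantized model's loss itself, not in the PPL computation. A minimal sketch reproducing the figures from the log above:

```python
import math

# Losses copied from the log above
fp32_loss = 3.320616739840547
w8a8_loss = 9.775141424384

# Perplexity is the exponential of the mean cross-entropy loss
print(f"fp32 ppl: {math.exp(fp32_loss):.4f}")  # 27.6774
print(f"w8a8 ppl: {math.exp(w8a8_loss):.4f}")  # 17590.9778
```

Since exp() amplifies loss differences, the jump from loss 3.32 to 9.78 is what produces the four-orders-of-magnitude PPL blow-up.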

Are there any issues with my usage?
