My attempts at Quantization (any advice appreciated) #2266
@jackfaubshner hello! 😊 It's great to see your enthusiasm and detailed approach towards implementing quantization with YOLOv3-tiny. Quantization can indeed be a bit tricky, but you're on the right track. Here are some insights and suggestions that might help you improve your results:

Observations and Suggestions
Example Code for QAT

Here's a simplified example to get you started with QAT:

```python
import torch
import torch.quantization
from ultralytics import YOLO

# Load your model
model = YOLO("yolov3-tiny.yaml")

# Fuse Conv, BN, and ReLU layers (assumes a fuse_model() method, as in the
# PyTorch quantization tutorials; this repo's Model class exposes fuse())
model.fuse_model()

# Prepare the model for QAT (prepare_qat expects the model in train mode)
model.qconfig = torch.quantization.get_default_qat_qconfig('fbgemm')
torch.quantization.prepare_qat(model.train(), inplace=True)

# Train the model with QAT
# Ensure to use a representative dataset for training
# Example:
# train_model(model, train_loader, epochs=10)

# Convert the model to a quantized version
torch.quantization.convert(model.eval(), inplace=True)

# Save the quantized model
torch.save(model.state_dict(), 'yolov3-tiny-qat.pth')

# Load and run inference
model.load_state_dict(torch.load('yolov3-tiny-qat.pth'))
model.eval()
results = model("https://ultralytics.com/images/bus.jpg")
```

Additional Resources

Keep experimenting and iterating on your approach. Quantization is a powerful tool, and with the right setup, you can achieve significant improvements in model efficiency with minimal loss in accuracy. Best of luck with your quantization efforts! If you have any more questions, feel free to ask. 😊
Hello everyone again! First off, big thanks to @glenn-jocher for your awesome work at Ultralytics! I really appreciate that you personally reply to every single issue that shows up on Ultralytics repositories; I feel like you are a very down-to-earth person :)

I got sidetracked the last few days because I wanted to try the LeakyReLU activation function in YOLOv3-tiny (416 × 416) instead of SiLU. My mAP dropped from 0.31 to 0.305, so I guess it is better to stick with SiLU.

Anyway, back to quantization. I have not yet tried Post Training Quantization (PTQ); I am first trying Quantization Aware Training (QAT), and so far I have only modified the model to add the quantization and dequantization layers, with no other changes to the code (mAP dropped to 0.25 from 0.31). I believe that is just adding layers, not a proper implementation of QAT. Yesterday, I directly modified "train.py" from this repository by adding the following lines to try Quantization Aware Training:
Unfortunately, I ran into a bunch of errors. I fixed as much as I could to get training working, but my mAP was 0.0000002 after 10 epochs, so clearly it was not going to work out. Some of the things implemented in "train.py" are not compatible with a model prepared for quantization. It looks like I am going to have to start from scratch and write my own "train.py". I have never written a training script from scratch before, and I am completely new to this. For reference, I will be using the "train.py" from this repository as well as the simplified example @glenn-jocher provided above. I believe I will also have to make my own YAML file and separate the Convolution, BatchNormalization, and SiLU (ReLU) layers to take advantage of layer fusion (see the sketch below).

That will be my task for the next few days. Thank you again @glenn-jocher and everyone at Ultralytics. I will post an update in two or three days. Any input anyone has is greatly appreciated.
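For the fusion step, here is a minimal sketch of what that could look like with PyTorch's eager-mode API, assuming this repo's `Conv` blocks keep their `conv`/`bn`/`act` submodule names and the activation has already been swapped from SiLU to ReLU (`fuse_modules` only recognizes fixed patterns such as Conv2d+BatchNorm2d+ReLU):

```python
import torch.nn as nn
import torch.quantization as tq
from models.yolo import Model  # this repo's model class

model = Model('models/yolov3-tiny.yaml')
model.eval()  # eager-mode fuse_modules requires eval mode

# Fuse each Conv block's conv+bn+act triple in place. SiLU is not a
# supported fusion pattern, so only fuse blocks whose activation is ReLU.
for m in model.modules():
    if type(m).__name__ == 'Conv' and isinstance(m.act, nn.ReLU):
        tq.fuse_modules(m, ['conv', 'bn', 'act'], inplace=True)
```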
Hello @jackfaubshner,

Thank you for your kind words and for sharing your detailed progress! It's fantastic to see your dedication and thorough approach to experimenting with quantization and activation functions. 😊

Addressing Your Current Approach

You're correct that simply adding quantization and dequantization layers is not a full implementation of Quantization Aware Training (QAT). QAT requires the model to be trained with quantization noise simulated during the training process, which helps the model adapt better to the quantized environment.

Modifying train.py
Hello,

Just letting anyone following this know that I will be working on this again soon. I got sidetracked by some other work. A few more weeks and I will be back to quantization.

Regards,
Looking forward to your updates, Jack! If you have any questions when you return to quantization, feel free to reach out.
Search before asking
Question
Hello everyone! :)
My goal for the next few days is to use this repository to implement Post Training Static Quantization and Quantization Aware Training, and to compare the mAP and inference speed with those of the non-quantized model.
To get started, I am using YOLOv3-tiny (416 × 416), as it takes far less time to train (about a day for me). Once the results with YOLOv3-tiny are acceptable, I can move on to the full YOLOv3 (416 × 416).
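For reference, a minimal sketch of the eager-mode PTQ flow being compared against, assuming a fused FP32 model from this repo and a hypothetical `calib_loader` yielding representative images:

```python
import torch
import torch.quantization as tq
from models.yolo import Model  # this repo's model class

model = Model('models/yolov3-tiny.yaml').eval()

# Post-training static quantization: attach a qconfig, insert observers,
# calibrate on representative data, then convert to int8 modules.
model.qconfig = tq.get_default_qconfig('fbgemm')  # x86 backend
tq.prepare(model, inplace=True)

with torch.no_grad():
    for imgs, _ in calib_loader:  # calib_loader is hypothetical
        model(imgs)

tq.convert(model, inplace=True)
```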
What have I done so far:
Train YOLOv3-tiny (416 × 416) from scratch using this repository (with no modifications to the code). The below command was used:
python3.8 -m torch.distributed.run --nproc_per_node 4 train.py --data coco.yaml --epochs 300 --weights '' --cfg yolov3-tiny.yaml --img 416 --batch-size 128
This model has an mAP of 0.31; the original Darknet model has an mAP of 0.331. I believe a 0.02 mAP loss is acceptable.
Use the modifications from another issue (#1734) to add quantization and dequantization layers at the beginning and end of the model:
yolo.py:
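The exact lines from #1734 are not shown here, but the general idea is to wrap the model's forward pass with quantization and dequantization stubs; a rough sketch (names are illustrative, not taken from that issue):

```python
import torch.nn as nn
from torch.quantization import QuantStub, DeQuantStub
from models.yolo import Model  # this repo's model class

class QuantModel(nn.Module):
    """Sketch: bracket an existing model with (de)quantization stubs."""

    def __init__(self, model):
        super().__init__()
        self.quant = QuantStub()      # float32 -> quantized entry point
        self.model = model            # the original YOLOv3-tiny model
        self.dequant = DeQuantStub()  # quantized -> float32 exit point

    def forward(self, x):
        y = self.model(self.quant(x))
        # The detection head returns one tensor per scale during training,
        # so dequantize each output rather than the container itself
        return [self.dequant(t) for t in y] if isinstance(y, (list, tuple)) else self.dequant(y)

model = QuantModel(Model('models/yolov3-tiny.yaml'))
```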
Why use this method? The person who used it said it worked for him, so I might as well start with something that works.
Anyway, doing a print(model) gives me the following output:
Once again, I trained the model with the below command:
python3.8 -m torch.distributed.run --nproc_per_node 4 train.py --data coco.yaml --epochs 300 --weights '' --cfg yolov3-tiny.yaml --img 416 --batch-size 128
This model got an mAP of 0.25. It did not, however, change my model size; maybe the author of that issue made some other modifications.
Instead of adding the dequantization layer the way it was done in the above-mentioned issue, I added the dequantization layer inside the YOLOv3-tiny YAML file; the idea is sketched below.
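A sketch of what such a head could look like, assuming a hypothetical `DeQuant` module (e.g. wrapping `torch.quantization.DeQuantStub`) that would have to be registered in `models/common.py` and handled by `parse_model()` in `models/yolo.py`:

```yaml
# Sketch only, not the actual file. Layer indices are illustrative;
# inserting layers shifts the 'from' indices that feed Detect.
head:
  [ # ... earlier head layers unchanged ...
   [19, 1, DeQuant, []],   # dequantize the P4 branch back to float32
   [22, 1, DeQuant, []],   # dequantize the P5 branch back to float32
   [[23, 24], 1, Detect, [nc, anchors]],  # Detect consumes float tensors
  ]
```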
The dequantization layer had to be added before the Detect layer to avoid running into errors.
The mAP was 0.25 (same as the method used in the previous issue)
Current questions I have:
I am unsure whether this change in mAP from adding the quantization and dequantization layers was supposed to happen. I added those layers but made no other modifications to the model, and inference was still performed in float32 mode; yet the mAP dropped.
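For what it's worth, unprepared stubs are pass-through identity ops, so on their own they should not change the float32 forward math; a quick sketch to sanity-check that:

```python
import torch
from torch.quantization import QuantStub, DeQuantStub

quant, dequant = QuantStub(), DeQuantStub()
x = torch.randn(1, 3, 416, 416)

# Before prepare()/convert(), both stubs simply return their input,
# so adding them should leave a float32 forward pass bit-identical.
assert torch.equal(dequant(quant(x)), x)
```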
I am going to keep up my attempts to use quantization and get a better mAP. Any input is greatly appreciated; I am new to all of this.
Additional
No response