use maxcut for total ru #1324
Conversation
Compare: e718734 to 274f77d
@@ -67,13 +67,19 @@ def __init__(self,
        self.compute_metric_fn = self.get_sensitivity_metric()
        self._cuts = None

        self.ru_metrics = target_resource_utilization.get_restricted_metrics()
        # To define RU Total constraints we need to compute weights and activations even if they have no constraints
I couldn't just replace the Total target computation without ugly hacks, so I changed the way it's handled altogether. I'm not thrilled about this implementation, but I think it's the lesser evil: we can now reuse the weights and activation utilization for Total without recomputing it, it simplifies the code around the utilization matrix, and the old hack is no longer needed.
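A minimal sketch of the idea, with hypothetical names and shapes (this is not the actual MCT API, and it glosses over maxcut's per-cut activation accounting):

```python
import numpy as np

def metrics_to_compute(restricted: set) -> set:
    """If Total is restricted, weights and activation utilization must be
    computed too, even when neither carries a constraint of its own,
    because Total is derived from them rather than computed separately."""
    if 'total' in restricted:
        return restricted | {'weights', 'activation'}
    return restricted

def build_utilization_matrices(weights_u: np.ndarray,
                               activation_u: np.ndarray) -> dict:
    """Reuse the already-computed 2D matrices (rows: configurable nodes,
    columns: bit-width candidates) instead of recomputing Total."""
    return {'weights': weights_u,
            'activation': activation_u,
            'total': weights_u + activation_u}

w = np.array([[4.0, 8.0], [2.0, 4.0]])   # example weights utilization
a = np.array([[1.0, 2.0], [3.0, 6.0]])   # example activation utilization
print(metrics_to_compute({'total'}))
print(build_utilization_matrices(w, a)['total'])
```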
@@ -510,21 +508,22 @@ def compare(self, quantized_model, float_model, input_x=None, quantization_info:
        activation_bits = [layer.activation_holder_quantizer.get_config()['num_bits'] for layer in holder_layers]
        # TODO maxcut: restore activation_bits == [4, 4] and unique_tensor_values=16 when maxcut calculates tensor sizes
        # of fused nodes correctly.
        self.unit_test.assertTrue((activation_bits == [4, 8]))
        # TODO: maxcut Test updated but lowered activation ru (how can 4000 enforce 4,4??). Not sure what the fused nodes
@elad please comment on my comment to your comment :)
remove comments
@@ -174,13 +174,12 @@ def compare(self, quantized_model, float_model, input_x=None, quantization_info=
        # test with its current setup (therefore, we don't check the input layer's bitwidth)
        self.unit_test.assertTrue((activation_bits == [4, 8]))

        # TODO maxcut: restore this test after total_memory is fixed to be the sum of weight & activation metrics.
I restored some of the maxcut TODOs, but there are more remaining that I wasn't sure about.
Pull Request Description:
Use the maxcut activation method for the Total resource utilization target.
Compute the Total target from the weights and activation utilization instead of as a separate metric.
Simplify the mixed precision search manager and linear programming formulation to support only 2D utilization matrices (instead of 2D or higher); see the sketch below.
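As a rough, assumption-laden sketch of what the 2D-only formulation buys (illustrative names and shapes, not the library's actual LP): each target keeps a single (nodes × candidates) matrix, and a configuration's utilization reduces to a sum over selected entries.

```python
import numpy as np

num_nodes, num_candidates = 4, 3
rng = np.random.default_rng(0)

# One 2D utilization matrix per target: rows are configurable nodes,
# columns are bit-width candidates (no tensors of higher rank needed).
weights_u = rng.uniform(0.0, 10.0, (num_nodes, num_candidates))
activation_u = rng.uniform(0.0, 10.0, (num_nodes, num_candidates))
total_u = weights_u + activation_u  # Total reuses the other two targets

# A mixed-precision configuration picks one candidate per node; the LP's
# Total constraint reduces to a sum over the selected matrix entries.
config = np.array([0, 2, 1, 0])
selected_total = total_u[np.arange(num_nodes), config].sum()
print(selected_total <= 30.0)  # feasibility against an example Total target
```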
Checklist before requesting a review: