[Torch FX] Post Quantize Weights Compression #2984

Merged
Changes from 73 commits
Commits
74 commits
f9e5d7c
Update torch_fx_backend.py
anzr299 Aug 20, 2024
5b11455
Merge branch 'openvinotoolkit:develop' into develop
anzr299 Aug 26, 2024
0eff5cb
Merge branch 'openvinotoolkit:develop' into develop
anzr299 Aug 30, 2024
c7b9093
Merge branch 'openvinotoolkit:develop' into develop
anzr299 Aug 30, 2024
e7097bd
Merge branch 'openvinotoolkit:develop' into develop
anzr299 Aug 30, 2024
2665666
Merge branch 'openvinotoolkit:develop' into develop
anzr299 Sep 2, 2024
1b4a926
Merge branch 'openvinotoolkit:develop' into develop
anzr299 Sep 10, 2024
74d8f4c
Merge branch 'openvinotoolkit:develop' into develop
anzr299 Sep 12, 2024
415a222
Merge branch 'openvinotoolkit:develop' into develop
anzr299 Sep 18, 2024
75978ac
post quantize compression transformation init
anzr299 Sep 19, 2024
f231f17
Merge branch 'openvinotoolkit:develop' into fx_post_quantize_compress…
anzr299 Sep 19, 2024
b49d9f7
fix per tensor transformation
anzr299 Sep 20, 2024
094802d
add test for post quantization compression transformation
anzr299 Sep 20, 2024
9c49e77
Merge branch 'openvinotoolkit:develop' into fx_post_quantize_compress…
anzr299 Sep 23, 2024
b4719a8
remove buffer test
anzr299 Sep 23, 2024
939a560
Merge branch 'openvinotoolkit:develop' into develop
anzr299 Sep 24, 2024
7b05343
Merge branch 'openvinotoolkit:develop' into fx_post_quantize_compress…
anzr299 Sep 24, 2024
a71e892
update reference graphs
anzr299 Sep 24, 2024
21826ef
Merge branch 'fx_post_quantize_compression_transformation' of https:/…
anzr299 Sep 24, 2024
5990008
fix tests
anzr299 Sep 24, 2024
eba3ea8
Merge branch 'openvinotoolkit:develop' into fx_post_quantize_compress…
anzr299 Sep 24, 2024
1e2c8b5
remove redundant code
anzr299 Sep 24, 2024
28ca749
Merge branch 'fx_post_quantize_compression_transformation' of https:/…
anzr299 Sep 24, 2024
cc544ff
Merge branch 'openvinotoolkit:develop' into develop
anzr299 Sep 24, 2024
445466f
disable transformation test for sanity
anzr299 Sep 25, 2024
fff4b1f
test ci value
anzr299 Sep 25, 2024
9a359ab
Merge branch 'openvinotoolkit:develop' into develop
anzr299 Sep 26, 2024
efb490c
post quantize compression transformation init
anzr299 Sep 19, 2024
3c8d5fc
fix per tensor transformation
anzr299 Sep 20, 2024
82ad7c5
add test for post quantization compression transformation
anzr299 Sep 20, 2024
1618119
remove buffer test
anzr299 Sep 23, 2024
0b29461
update reference graphs
anzr299 Sep 24, 2024
e4a3386
fix tests
anzr299 Sep 24, 2024
9fd0746
remove redundant code
anzr299 Sep 24, 2024
b7b9388
disable transformation test for sanity
anzr299 Sep 25, 2024
0852843
test ci value
anzr299 Sep 25, 2024
811b2a8
Merge branch 'fx_post_quantize_compression_transformation' of https:/…
anzr299 Sep 27, 2024
b059c4c
Merge branch 'develop' into fx_post_quantize_compression_transformation
anzr299 Sep 30, 2024
e5917fa
add transformation for FQ of weights
anzr299 Oct 7, 2024
7a80e5c
update graph tests
anzr299 Oct 7, 2024
f0bb1ec
Add backend parameters,
anzr299 Oct 10, 2024
f6bcdf8
pre-commit fix
anzr299 Oct 10, 2024
2b0ef25
Merge branch 'develop' into fx_post_quantize_compression_transformation
anzr299 Oct 10, 2024
e1046e9
update reference graphs
anzr299 Oct 10, 2024
c032e1c
update meta of new node
anzr299 Oct 10, 2024
e605c97
1. update meta copy transformation
anzr299 Oct 13, 2024
609c3f5
update gold for sanity test
anzr299 Oct 13, 2024
5f52ff0
refactor imports
anzr299 Oct 13, 2024
7d2d901
1. Add checks for qdq nodes pattern
anzr299 Oct 14, 2024
6a42d5f
Test for QDQ nodes with add node in between to simulate non NNCF QDQ …
anzr299 Oct 14, 2024
90353a3
extrend the check for ignoring ndoes with qdq already to linear node
anzr299 Oct 14, 2024
464d280
Include zero point for asymmetric quantization
anzr299 Oct 14, 2024
4206768
update reference graphs and make changes as suggested
anzr299 Oct 14, 2024
853fc2a
move test to test_models.py
anzr299 Oct 15, 2024
bd1636d
pre-commit fix
anzr299 Oct 15, 2024
c5f1bff
Merge branch 'openvinotoolkit:develop' into fx_post_quantize_compress…
anzr299 Oct 15, 2024
d35392a
fix test
anzr299 Oct 15, 2024
237fd00
1. update constant function docstring
anzr299 Oct 15, 2024
3857668
add comment about model buffer update line in quantize_model
anzr299 Oct 15, 2024
d3ba573
Merge branch 'openvinotoolkit:develop' into fx_post_quantize_compress…
anzr299 Oct 15, 2024
caa4213
adding space between descriptions and params
anzr299 Oct 16, 2024
00884f8
Merge branch 'fx_post_quantize_compression_transformation' of https:/…
anzr299 Oct 16, 2024
90342f4
Add spaces, Remove extra code
anzr299 Oct 16, 2024
bc568ba
pre-commit fix, comment refactoring
anzr299 Oct 16, 2024
30b7773
Merge branch 'develop' into fx_post_quantize_compression_transformation
anzr299 Oct 16, 2024
31fc7b3
Fix single node being passed to _set_new_node_meta
anzr299 Oct 16, 2024
f6a7a34
Update tests/torch/fx/test_models.py
anzr299 Oct 16, 2024
55a5716
Merge branch 'develop' into fx_post_quantize_compression_transformation
anzr299 Oct 17, 2024
9caa478
pre-commit fix
anzr299 Oct 17, 2024
59512ce
Merge branch 'openvinotoolkit:develop' into fx_post_quantize_compress…
anzr299 Oct 18, 2024
e2e0f85
change transformation to update weight, zp, scale when replacing the …
anzr299 Oct 18, 2024
91bc576
update return of qdq constant transformation function
anzr299 Oct 18, 2024
feac3b8
Minor Fixes
anzr299 Oct 18, 2024
9f13704
Minor Fix #2
anzr299 Oct 18, 2024
31 changes: 31 additions & 0 deletions nncf/experimental/torch/fx/quantization/backend_parameters.py
@@ -0,0 +1,31 @@
# Copyright (c) 2024 Intel Corporation
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from typing import Optional

from nncf.quantization.advanced_parameters import AdvancedQuantizationParameters


class FXBackendParameters:
    COMPRESS_WEIGHTS = "compress_weights"


def is_weight_compression_needed(advanced_parameters: Optional[AdvancedQuantizationParameters]) -> bool:
    """
    Determines whether weight compression is needed based on the provided
    advanced quantization parameters.

    :param advanced_parameters: Advanced quantization parameters.
    :return: True if weight compression is needed, False otherwise.
    """
    if advanced_parameters is not None and advanced_parameters.backend_params is not None:
        return advanced_parameters.backend_params.get(FXBackendParameters.COMPRESS_WEIGHTS, True)
    return True
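
A minimal sketch of how this helper resolves the `compress_weights` backend parameter. To stay runnable without NNCF installed, it uses `AdvancedParamsStub`, a hypothetical stand-in that models only the `backend_params` attribute the helper reads; it is not the real `AdvancedQuantizationParameters` class.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# Mirrors FXBackendParameters.COMPRESS_WEIGHTS from the diff above.
COMPRESS_WEIGHTS = "compress_weights"


@dataclass
class AdvancedParamsStub:
    """Hypothetical stand-in: only `backend_params` is modeled."""

    backend_params: Optional[Dict[str, Any]] = field(default_factory=dict)


def is_weight_compression_needed(advanced_parameters: Optional[AdvancedParamsStub]) -> bool:
    # Same logic as the helper in the diff: compression is on unless
    # the backend parameters explicitly set the key to a falsy value.
    if advanced_parameters is not None and advanced_parameters.backend_params is not None:
        return advanced_parameters.backend_params.get(COMPRESS_WEIGHTS, True)
    return True


# No advanced parameters at all: compression stays enabled.
default_choice = is_weight_compression_needed(None)
# Parameters present but key missing: still enabled.
empty_choice = is_weight_compression_needed(AdvancedParamsStub())
# Explicit opt-out through the backend parameters.
opt_out_choice = is_weight_compression_needed(
    AdvancedParamsStub(backend_params={COMPRESS_WEIGHTS: False})
)
print(default_choice, empty_choice, opt_out_choice)
```

The default of `True` in every fallback path means callers only need to touch `backend_params` when they want the non-compressed path.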
11 changes: 11 additions & 0 deletions nncf/experimental/torch/fx/quantization/quantize_model.py
@@ -26,7 +26,10 @@
from nncf.common.logging import nncf_logger
from nncf.common.quantization.structs import QuantizationPreset
from nncf.data import Dataset
from nncf.experimental.torch.fx.quantization.backend_parameters import is_weight_compression_needed
from nncf.experimental.torch.fx.transformations import apply_quantization_transformations
from nncf.experimental.torch.fx.transformations import compress_post_quantize_transformation
from nncf.experimental.torch.fx.transformations import fq_weights_transformation
from nncf.experimental.torch.fx.transformations import revert_quantization_transformations
from nncf.experimental.torch.fx.transformations import shared_constants_unification_transformation
from nncf.parameters import BackupMode
@@ -94,6 +97,11 @@ def quantize_impl(
    # bias configuration.
    revert_quantization_transformations(quantized_model)

    if is_weight_compression_needed(advanced_parameters):
        compress_post_quantize_transformation(quantized_model)
    else:
        fq_weights_transformation(quantized_model)

    # Magic. Without this call the compiled model
    # is not performant
    quantized_model = GraphModule(quantized_model, quantized_model.graph)
@@ -107,6 +115,9 @@ def quantize_impl(

    quantized_model.meta.update(original_graph_meta)
    quantized_model = _disallow_eval_train(quantized_model)
    # Each transformation adds a duplicate tensor value to the model buffer.
    # This step removes the duplicate tensor values from the buffer.
    quantized_model = GraphModule(quantized_model, quantized_model.graph)

    return quantized_model
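
The bodies of `compress_post_quantize_transformation` and `fq_weights_transformation` are not shown in this diff, so the following is only an illustrative sketch of the general idea the two branches suggest. It assumes the compressed path stores integer weights and dequantizes them at run time, while the fake-quantize path keeps float weights and applies quantize then dequantize. All function names here are hypothetical, and only symmetric 8-bit per-tensor quantization is shown (no zero point).

```python
from typing import List, Tuple


def quantize_symmetric(weights: List[float], num_bits: int = 8) -> Tuple[List[int], float]:
    """Map float weights to signed integers with a single per-tensor scale."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to nearest integer and clamp to the signed range.
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q_weights: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from stored integers."""
    return [v * scale for v in q_weights]


def fake_quantize(weights: List[float], num_bits: int = 8) -> List[float]:
    """Quantize then dequantize in one step; the stored weights stay float."""
    q, scale = quantize_symmetric(weights, num_bits)
    return dequantize(q, scale)


weights = [0.5, -1.0, 0.25, 0.0]

# Assumed "compressed" path: keep integers and the scale, dequantize when used.
q_weights, scale = quantize_symmetric(weights)
compressed_path = dequantize(q_weights, scale)

# Assumed "fake-quantize" path: keep floats, apply quantize-dequantize when used.
fq_path = fake_quantize(weights)

print(q_weights, compressed_path == fq_path)
```

Under these assumptions both paths yield the same effective weights; the difference is only what is stored in the model (integers plus a scale versus the original floats).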
