[GPU] fix property overwritten issue #28209

riverlijunjie · 2024-12-26T06:16:49Z

Details:

Prevent ov::hint::dynamic_quantization_group_size and ov::hint::kv_cache_precision from being overwritten with their default values when ExecutionConfig::apply_user_properties is called twice.
For example:
If the user sets ov::hint::dynamic_quantization_group_size to 128, the second call to ExecutionConfig::apply_user_properties rewrites it to 32, which causes a performance drop on MTL 125H.
This issue was introduced by PR: [GPU] Integrate dynamic quantization for onednn #26940
Performance before and after this PR:

Test result on master branch:

Tickets:

CVS-159322

byungilm · 2024-12-26T08:03:11Z

Hi! I have a question about the fix.
I see that set_property() writes the config to internal_properties via "internal_properties[name]=val;".
Is there a particular reason that is_set_by_user() looks up its input name in user_properties rather than in internal_properties?

riverlijunjie · 2024-12-26T08:56:06Z

Yes, is_set_by_user() only searches user_properties, and user_properties is cleared after each call. So the next call to is_set_by_user() returns false, which causes ov::hint::dynamic_quantization_group_size to be reset to its default value.
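
The sequence described above can be sketched with a toy model. This is a simplified illustration, not the plugin's code: `ToyConfig`, the `"group_size"` key, the helper `group_size_after`, and the registered default of 0 are all hypothetical; only the two map names, the clearing of `user_properties`, and the values 128 and 32 come from this thread.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for ExecutionConfig. Only the two maps
// and the call pattern discussed in this thread are modeled.
struct ToyConfig {
    std::map<std::string, uint64_t> internal_properties;
    std::map<std::string, uint64_t> user_properties;

    // Assumed placeholder default registered at construction.
    ToyConfig() { internal_properties["group_size"] = 0; }

    void set_user_property(const std::string& name, uint64_t val) { user_properties[name] = val; }
    void set_property(const std::string& name, uint64_t val) { internal_properties[name] = val; }
    bool is_set_by_user(const std::string& name) const { return user_properties.count(name) != 0; }

    void apply_user_properties(bool supports_immad) {
        // The latest user values take priority over what is stored internally.
        for (const auto& kv : user_properties)
            internal_properties[kv.first] = kv.second;
        // Platform default, applied only when the user did not set the property.
        if (!is_set_by_user("group_size") && !supports_immad)
            set_property("group_size", 32);
        // After this, a later call no longer sees the user's choice.
        user_properties.clear();
    }
};

// Returns the stored group size after `calls` invocations of
// apply_user_properties on a non-systolic platform, with the user value 128.
uint64_t group_size_after(int calls) {
    ToyConfig cfg;
    cfg.set_user_property("group_size", 128);
    for (int i = 0; i < calls; ++i)
        cfg.apply_user_properties(/*supports_immad=*/false);
    return cfg.internal_properties.at("group_size");
}
```

In this model, `group_size_after(1)` keeps the user's 128, while `group_size_after(2)` falls back to 32 because the second call finds `user_properties` empty.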

byungilm · 2024-12-27T01:19:08Z

I understand that user_properties is cleared.
When is_set_by_user() is true for a config, set_property() stores that config in internal_properties. Is that fine?

riverlijunjie · 2024-12-27T09:22:33Z

The policy seems reasonable: the latest user properties should have the highest priority, so they can update internal_properties.

sshlyapn · 2024-12-28T09:36:43Z

src/plugins/intel_gpu/src/runtime/execution_config.cpp

        set_property(ov::hint::kv_cache_precision(ov::element::i8));
    }

    // Enable dynamic quantization by default for non-systolic platforms
-    if (!is_set_by_user(ov::hint::dynamic_quantization_group_size) && !info.supports_immad) {
+    if (!is_set_by_user(ov::hint::dynamic_quantization_group_size) &&
+        internal_properties.find(ov::hint::dynamic_quantization_group_size.name()) == internal_properties.end() &&


This change will disable the kv_cache_compression and dynamic_group_size default configurations for non-systolic platforms entirely, because this check will never return true: during property registration, all properties are added to internal_properties with their default values here:

openvino/src/plugins/intel_gpu/src/runtime/execution_config.cpp

Line 91 in 82d553e

internal_properties[property.first] = property.second;
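
A minimal sketch of why the added condition cannot hold once registration has run. The names `RegisteredConfig`, `register_property`, `absent_from_internal`, and `default_branch_reachable`, the `"group_size"` key, and the default value 0 are hypothetical; only the map assignment mirrors the line quoted above.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical model of a config whose properties are all registered up front.
struct RegisteredConfig {
    std::map<std::string, uint64_t> internal_properties;

    // Mirrors the quoted registration step: every property gets a default entry.
    void register_property(const std::string& name, uint64_t default_val) {
        internal_properties[name] = default_val;
    }

    // The extra condition from the patch: "not present in internal_properties".
    bool absent_from_internal(const std::string& name) const {
        return internal_properties.find(name) == internal_properties.end();
    }
};

// Evaluates the patched condition for a property the user never touched,
// on a non-systolic platform.
bool default_branch_reachable() {
    RegisteredConfig cfg;
    cfg.register_property("group_size", 0);  // assumed placeholder default
    const bool set_by_user = false;
    const bool supports_immad = false;
    return !set_by_user && cfg.absent_from_internal("group_size") && !supports_immad;
}
```

Because registration always inserts an entry, `absent_from_internal` is false for every registered property, so the platform-default branch is skipped even when the user has set nothing.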

As a short-term solution, we can avoid calling get_property(ov::hint::dynamic_quantization_group_size) at runtime for FC configuration by saving the value at the model compilation stage in the primitive/implementation or somewhere else.
However, the proper solution would be to move these default configurations out of this function entirely and apply them only once at the very beginning, either in the constructor or right after config creation.
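
One possible shape of the "apply defaults once" idea, as a sketch under stated assumptions rather than the plugin's actual design. `OnceConfig`, the `"group_size"` key, the helper `final_group_size`, and passing `supports_immad` to the constructor are all hypothetical; the real config may not have device information available at construction time.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical config that applies the platform default once, at construction,
// so repeated apply_user_properties calls cannot overwrite a user value.
struct OnceConfig {
    std::map<std::string, uint64_t> internal_properties;
    std::map<std::string, uint64_t> user_properties;

    explicit OnceConfig(bool supports_immad) {
        internal_properties["group_size"] = 0;       // assumed registered default
        if (!supports_immad)
            internal_properties["group_size"] = 32;  // platform default, applied once
    }

    void set_user_property(const std::string& name, uint64_t val) { user_properties[name] = val; }

    void apply_user_properties() {
        // Only merges user values; no default logic lives here anymore.
        for (const auto& kv : user_properties)
            internal_properties[kv.first] = kv.second;
        user_properties.clear();
    }
};

// Returns the stored group size after two apply_user_properties calls on a
// non-systolic platform, optionally with a user-provided value of 128.
uint64_t final_group_size(bool user_sets_value) {
    OnceConfig cfg(/*supports_immad=*/false);
    if (user_sets_value)
        cfg.set_user_property("group_size", 128);
    cfg.apply_user_properties();
    cfg.apply_user_properties();
    return cfg.internal_properties.at("group_size");
}
```

With this arrangement the user's 128 survives both calls, and a config the user never touched still ends up with the platform default of 32.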

riverlijunjie requested review from a team as code owners December 26, 2024 06:16

github-actions bot added the category: GPU OpenVINO GPU plugin label Dec 26, 2024

[GPU] fix property overwritten issue

e637a9f

byungilm approved these changes Dec 26, 2024

sshlyapn reviewed Dec 28, 2024
