[ACL] Stateless feature impacts winograd convolution performance #2324
Labels
platform:cpu-aarch64
Codeowner: @oneapi-src/onednn-cpu-aarch64
sighting
Suspicious library behavior. Should be promoted to a bug when confirmed
The ACL stateless feature, integrated into oneDNN in recent releases, degrades winograd convolution performance.
The performance issue has been reproduced on Ampere and Apple M2 Pro.
Several benchdnn reproducers
ACL without the stateless feature gives 0.39 ms / 0.24 ms / 0.23 ms, respectively, on Apple M2 Pro.
ACL with the stateless feature (vanilla ACL 24.11.1) gives 17.79 ms / 4.06 ms / 1.14 ms, respectively, on Apple M2 Pro.
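For a quick sense of scale, the slowdown per reproducer can be computed from the timings reported above. This is only a small arithmetic sketch: the variable and function names are made up for illustration, and the numbers are the Apple M2 Pro figures quoted in this issue, not new measurements.

```python
# Timings in ms as reported in this issue for the three reproducers (Apple M2 Pro).
without_stateless_ms = [0.39, 0.24, 0.23]
with_stateless_ms = [17.79, 4.06, 1.14]


def slowdown_factors(baseline, regressed):
    """Return the regressed/baseline ratio for each pair of timings."""
    return [r / b for b, r in zip(baseline, regressed)]


factors = slowdown_factors(without_stateless_ms, with_stateless_ms)
for i, factor in enumerate(factors, start=1):
    print(f"reproducer {i}: {factor:.1f}x slower with the stateless feature")
```

That works out to roughly 46x, 17x, and 5x for the three cases, so the regression is not uniform across the reproducers.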
To get ACL without the stateless feature, the following commits were reverted: