ONNXRuntime Optimization Causes Output Discrepancy with Certain opt_level Settings #23210
Labels
model:transformer
issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc.
Describe the issue
When running inference on an ONNX model with ONNX Runtime, discrepancies are observed between the outputs of the original and the optimized model. The issue occurs when using optimization levels 0, 2, or 99, but not with opt_level=1. The discrepancy is observed only in the `output` tensor; all other outputs of the model match.
To reproduce
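The per-output comparison used to detect the discrepancy can be sketched as follows. This is a minimal sketch, not the reporter's actual script: the helper name `compare_outputs` and the tolerances are assumptions, and the two output lists would come from two `onnxruntime.InferenceSession.run(None, feed)` calls made at different graph optimization levels.

```python
import numpy as np

def compare_outputs(reference, candidate, names=None, rtol=1e-3, atol=1e-5):
    """Compare two lists of model outputs and report which ones diverge.

    reference/candidate: lists of np.ndarray, e.g. the results of running
    the same model at two different graph optimization levels.
    Returns a list of (output_name, max_abs_diff) for mismatched outputs.
    """
    # Fall back to positional names when the caller does not supply any.
    names = names or [f"output_{i}" for i in range(len(reference))]
    mismatched = []
    for name, ref, cand in zip(names, reference, candidate):
        if not np.allclose(ref, cand, rtol=rtol, atol=atol):
            max_diff = float(np.max(np.abs(ref - cand)))
            mismatched.append((name, max_diff))
    return mismatched
```

With outputs collected at opt_level=1 as `reference` and at opt_level=0/2/99 as `candidate`, a report matching this issue would list only the `output` tensor as mismatched.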
Urgency
No response
Platform
Linux
OS Version
Ubuntu 20.04
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
5c1b7cc
ONNX Runtime API
Python
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
No response