Releases: huggingface/optimum
v1.20.0: VITS, Phi-3 ONNX export
Extended ONNX export
- VITS ONNX export by @echarlaix in #1607
- Phi-3 ONNX export by @JingyaHuang in #1870
- Add Phi-3 normalized config by @kunal-vaishnavi in #1841
- Add Phi-3 small normalized config by @JingyaHuang in #1864
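For example, a VITS checkpoint can be exported through the CLI (the model id below is an illustrative VITS model, not taken from the release notes):

```
optimum-cli export onnx --model facebook/mms-tts-eng vits_onnx
```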
Other changes and bugfixes
- Bump transformers version by @echarlaix in #1824
- Remove call to `apt update` before `apt purge` in the main doc build workflow by @regisss in #1830
- Update github workflows by @echarlaix in #1829
- Remove bad PPA in main doc build workflow by @regisss in #1831
- Fix sentence transformers models infer library by @echarlaix in #1832
- Fix random initialization of bias when using GPTQ quantization with models without bias by @B-201 in #1827
- Update the Transformers dependency in the Habana extra by @regisss in #1851
- Make stable diffusion unet and vae number of channels static by @eaidova in #1840
- Fix compatibility with transformers v4.41.0 for ONNX by @echarlaix in #1860
- Fix FX CI by @IlyasMoutawwakil in #1866
- Fix Utils CI by @IlyasMoutawwakil in #1867
- Fix BT CI by @IlyasMoutawwakil in #1872
- Fix ORTConfig loading by @mr-sarthakgupta in #1879
- Update ORT doc for ROCM 6.0 by @mht-sharma in #1862
- Fix ORT config instantiation (`from_pretrained`) and saving (`save_pretrained`) by @IlyasMoutawwakil in #1865
- Fix ORT CI by @IlyasMoutawwakil in #1875
- Update optimum intel extra by @echarlaix in #1882
- Bump transformers version for neuron extras by @JingyaHuang in #1881
New Contributors
- @B-201 made their first contribution in #1827
- @mr-sarthakgupta made their first contribution in #1879
Full Changelog: v1.19.0...v1.20.0
v1.19.2: Patch release
Full Changelog: v1.19.1...v1.19.2
v1.19.1: Patch release
- Bump transformers version by @echarlaix in #1824
- Remove call to `apt update` before `apt purge` in the main doc build workflow by @regisss in #1830
Full Changelog: v1.19.0...v1.19.1
v1.19.0: Musicgen, MarkupLM ONNX export
Extended ONNX export
Musicgen and MarkupLM models from Transformers can now be exported to ONNX through `optimum-cli export onnx` (see the example below). Musicgen ONNX export is used to run the model locally in a browser through transformers.js.
- Musicgen ONNX export (text-conditional only) by @fxmarty in #1779
- Add support for markuplm ONNX export by @pogzyb in #1784
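For instance, a Musicgen checkpoint can be exported with (the model id is illustrative, not taken from the release notes):

```
optimum-cli export onnx --model facebook/musicgen-small musicgen_onnx
```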
Other changes and bugfixes
- Fix IR version for merged ONNX decoders by @fxmarty in #1780
- Update test model id by @echarlaix in #1785
- Add Nvidia and Neuron to README by @JingyaHuang in #1791
- adds debug options to dump onnx graphs by @prathikr in #1789
- Improve PR template by @fxmarty in #1799
- Add Google TPU to the mix by @mfuntowicz in #1797
- Add redirection for Optimum TPU by @regisss in #1801
- Add Nvidia and Neuron to the installation doc by @JingyaHuang in #1803
- Update installation instructions by @echarlaix in #1806
- Fix offline compatibility by @fxmarty in #1805
- Remove unnecessary constants for > 2GB ONNX models by @fxmarty in #1808
- Add onnx export function for pix2struct model by @naormatania in #1815
New Contributors
- @pogzyb made their first contribution in #1784
- @naormatania made their first contribution in #1815
Full Changelog: v1.18.0...v1.19.0
v1.18.1: Patch release
Fix the installation for the Optimum Neuron v0.0.21 release
- Improve the installation of optimum-neuron through optimum extras #1778
Fix the task inference of stable diffusion
- Fix infer task for stable diffusion #1793
Full Changelog: v1.18.0...v1.18.1
v1.18.0: Gemma, OWLv2, MPNet, Qwen2 ONNX support
New architectures ONNX export:
- OWLv2 by @xenova in #1689
- Gemma by @fxmarty in #1714
- MPNet by @nathan-az in #1471
- Qwen2 by @uniartisan in #1746
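These architectures can be exported through the CLI like any other supported model, e.g. (the model id below is an illustrative qwen2 checkpoint, not taken from the release notes):

```
optimum-cli export onnx --model Qwen/Qwen1.5-0.5B qwen2_onnx
```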
Other changes and bugfixes
v1.17.1: Patch release
Update Transformers dependency for the release of Optimum Habana v1.10.2
- Update Transformers dependency in Habana extra #1700
Full Changelog: v1.17.0...v1.17.1
v1.17.0: Improved ONNX support & many bugfixes
ONNX export from nn.Module
A function is exposed to programmatically export any `nn.Module` (e.g. models coming from Transformers, but modified). This is useful in case you need to make some modifications to a model loaded from the Hub before exporting it. Example:
```python
from transformers import AutoModelForImageClassification
from optimum.exporters.onnx import onnx_export_from_model

model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")

# Here one could do any modification on the model before the export.
onnx_export_from_model(model, output="vit_onnx")
```
- Enable model ONNX export by @echarlaix in #1649
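The exported model can then be reloaded for inference, for example with ONNX Runtime (a minimal sketch, assuming the export above; `vit_onnx` is the output directory used in the example):

```python
from optimum.onnxruntime import ORTModelForImageClassification

# Load the ONNX model written by onnx_export_from_model above.
ort_model = ORTModelForImageClassification.from_pretrained("vit_onnx")
```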
ONNX export with static shapes
The Optimum ONNX export CLI now allows disabling dynamic shapes for inputs/outputs:

```
optimum-cli export onnx --model timm/ese_vovnet39b.ra_in1k out_vov --no-dynamic-axes
```

This is useful if the exported model is to be consumed by a runtime that does not support dynamic shapes. The static shapes can be specified e.g. with `--batch_size 1`. See all the shape options in `optimum-cli export onnx --help`.
- Enable export of model with fixed shape by @mht-sharma in #1643
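Combining the flags above, a fully static export could look like the following (a sketch; the shape flags and values shown are illustrative, check `optimum-cli export onnx --help` for the options available for your model):

```
optimum-cli export onnx --model bert-base-uncased --no-dynamic-axes --batch_size 1 --sequence_length 128 bert_static_onnx
```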
BF16 ONNX export
The Optimum ONNX export now supports BF16 export on CPU and GPU. Beware though that ONNX Runtime is most often not able to consume the models, as some operations are not implemented in this data type, although the exported models comply with the ONNX standard. This is useful if you are developing a runtime that consumes BF16 ONNX models.
Example:
```
optimum-cli export onnx --model bert-base-uncased --dtype bf16 bert_onnx
```
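To double-check the exported precision, the tensor types can be inspected with the `onnx` package (a minimal sketch, assuming the output path from the command above):

```python
import onnx

# For a bf16 export, floating-point outputs such as the logits should
# report BFLOAT16 as their element type.
model = onnx.load("bert_onnx/model.onnx")
for out in model.graph.output:
    elem_type = out.type.tensor_type.elem_type
    print(out.name, onnx.TensorProto.DataType.Name(elem_type))
```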
ONNX export for new models
You can now export table-transformer, as well as BART for text-classification, to ONNX.
- Add ONNX export for table-transformer by @xenova in #1616
- Reactivate BART Onnx Export by @claeyzre in #1666
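For example (the model id below is an illustrative table-transformer checkpoint, not taken from the release notes):

```
optimum-cli export onnx --model microsoft/table-transformer-detection table_transformer_onnx
```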
Sentence Transformers ONNX export
- Fix sentence transformers ONNX export by @fxmarty in #1632
- Bump sentence-transformers ONNX opset by @fxmarty in #1634
- Pass `trust_remote_code` to sentence transformers export by @xenova in #1677
- Fix library detection by @fxmarty in #1690
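With these fixes, a sentence-transformers checkpoint can be exported as usual (a sketch; the model id is illustrative):

```
optimum-cli export onnx --model sentence-transformers/all-MiniLM-L6-v2 st_onnx
```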
Timm models support with ONNX Runtime
Timm models can now be run through ONNX Runtime with the class `ORTModelForImageClassification`:
```python
from urllib.request import urlopen

import timm
import torch
from PIL import Image

from optimum.onnxruntime import ORTModelForImageClassification

# Export the model to ONNX under the hood with export=True.
model = ORTModelForImageClassification.from_pretrained("timm/resnext101_64x4d.c1_in1k", export=True)

# Get model specific transforms (normalization, resize).
data_config = timm.data.resolve_data_config(pretrained_cfg=model.config.pretrained_cfg)
transforms = timm.data.create_transform(**data_config, is_training=False)

img = Image.open(
    urlopen("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png")
)
output = model(transforms(img).unsqueeze(0)).logits

top5_probabilities, top5_class_indices = torch.topk(torch.softmax(output, dim=1) * 100, k=5)
```
- Add Timm support in ORTModelForImageClassification by @mht-sharma in #1578
Other changes and bugfixes
- Modify SEW-D model for tests by @echarlaix in #1601
- Add phi and mixtral model type to normalizedconfig by @changwangss in #1625
- Remove "to ONNX" from info message when exporting model by @helena-intel in #1627
- Modify model id for test by @echarlaix in #1628
- Fix cupy detection by @fxmarty in #1635
- Fix ORT detection by @fxmarty in #1636
- Enable sdpa export for SD unet component by @echarlaix in #1637
- [ORT] Improve dummy mask & add tips for attention fusion in the doc by @JingyaHuang in #1640
- Improve error message by @Almonok in #1623
- Add `input_labels` input to SAM model export by @xenova in #1638
- Fix c4 dataset loading by @SunMarc in #1646
- Avoid loading onnx file in weight deduplication if not necessary by @fxmarty in #1648
- Allow lower ONNX opsets by @fxmarty in #1650
- Remove abstract decorator from `_export` by @JingyaHuang in #1652
- Add rjieba install by @mht-sharma in #1661
- Fix wikitext2 processing by @SunMarc in #1663
- Fix: local variable 'dataset' referenced before assignment by @hiyouga in #1600
- Support float16 images in StableDiffusionXLWatermarker by @jambayk in #1603
- Extend autocast check to cover more platforms like XPU by @hoshibara in #1639
- Support IO Binding for ORTModelForCTC by @vidalmaxime in #1629
- Add fp16 support for split cache by @PatriceVignola in #1602
- ORTModelForFeatureExtraction always exports as transformers models by @fxmarty in #1684
- Avoid overriding model_type in TasksManager by @fxmarty in #1647
- Fix gptq device_map = "cpu" by @SunMarc in #1662
- CI: Avoid iterating over a mutated iterable by @fxmarty in #1683
- Add option to disable ONNX constant folding by @fxmarty in #1682
- re-enable decoder sequence classification by @dwyatte in #1679
- Move & rename `onnx_export` by @fxmarty in #1685
- Update standardize_model_attributes by @mht-sharma in #1686
- Fix: AttributeError: module 'packaging' has no attribute 'version' by @soulteary in #1660
- Disable failing test & free space when building documentation by @fxmarty in #1693
- Fix no space left on device in actions by @fxmarty in #1694
- Add end-to-end Marlin benchmark by @fxmarty in #1695
- Fix main doc build by @fxmarty in #1697
- Update optimum-intel requirements by @echarlaix in #1699
New Contributors
- @tomaarsen made their first contribution in #1597
- @helena-intel made their first contribution in #1627
- @Almonok made their first contribution in #1623
- @hiyouga made their first contribution in #1600
- @jambayk made their first contribution in #1603
- @hoshibara made their first contribution in #1639
- @vidalmaxime made their first contribution in #1629
- @PatriceVignola made their first contribution in #1602
- @claeyzre made their first contribution in #1666
- @dwyatte made their first contribution in #1679
- @soulteary made their first contribution in #1660
Full Changelog: v1.16.0...v1.17.0
v1.16.2: Patch release
- Fix ORT training compatibility for transformers v4.36.0 by @AdamLouly #1586
- Fix ONNX export compatibility for transformers v4.37.0 by @echarlaix #1641
v1.16.1: Patch release
Breaking change: BetterTransformer for Llama, Falcon, Whisper and Bart is deprecated
The features from BetterTransformer for Llama, Falcon, Whisper and Bart have been upstreamed in Transformers. Please use `transformers>=4.36` and `torch>=2.1.1` to use PyTorch's `scaled_dot_product_attention` by default.
More details: https://github.com/huggingface/transformers/releases/tag/v4.36.0
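For instance, SDPA can be requested explicitly when loading a model (a minimal sketch; the model id and dtype are illustrative):

```python
import torch
from transformers import AutoModelForCausalLM

# With transformers>=4.36 and torch>=2.1.1, supported architectures pick
# PyTorch's scaled_dot_product_attention by default; it can also be
# requested explicitly through attn_implementation.
model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b",
    attn_implementation="sdpa",
    torch_dtype=torch.float16,
)
```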
What's Changed
- Update dev version by @fxmarty in #1596
- Typo: tansformers -> transformers by @tomaarsen in #1597
- [GPTQ] fix tests by @SunMarc in #1598
- Show correct error message on using BT for SDPA models by @fxmarty in #1599
New Contributors
- @tomaarsen made their first contribution in #1597
Full Changelog: v1.16.0...v1.16.1