Release v4.46.0
New model additions
Moshi
The Moshi model was proposed in Moshi: a speech-text foundation model for real-time dialogue by Alexandre Défossez,
Laurent Mazaré, Manu Orsini, Amélie Royer, Patrick Pérez, Hervé Jégou, Edouard Grave and Neil Zeghidour.
Moshi is a speech-text foundation model that casts spoken dialogue as speech-to-speech generation. Starting from a
text language model backbone, Moshi generates speech as tokens from the residual quantizer of a neural audio codec,
while modeling separately its own speech and that of the user into parallel streams. This allows for the removal of
explicit speaker turns, and the modeling of arbitrary conversational dynamics. Moshi also predicts time-aligned text
tokens as a prefix to audio tokens. This “Inner Monologue” method significantly improves the linguistic quality of
generated speech and provides streaming speech recognition and text-to-speech. As a result, Moshi is the first
real-time full-duplex spoken large language model, with a theoretical latency of 160ms, 200ms in practice.
Zamba
Zamba-7B-v1 is a hybrid between state-space models (specifically Mamba) and transformers, and was trained using
next-token prediction. Zamba uses a shared transformer layer after every 6 Mamba blocks. It uses the Mistral
v0.1 tokenizer. This architecture was chosen after a series of ablations at small scales. Zamba-7B-v1 was
pre-trained on 1T tokens of text and code data.
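As a rough illustration of the "shared transformer layer after every 6 Mamba blocks" layout, the sketch below only builds a list of block names. The function name `zamba_like_layout` and the block labels are hypothetical, and the real model's wiring (for example, how the shared layer's inputs are formed) is not modeled here.

```python
def zamba_like_layout(num_mamba_blocks: int, shared_every: int = 6) -> list:
    """Return block names, inserting one shared layer after every `shared_every` Mamba blocks.

    Hypothetical helper for illustration only; it is not part of transformers.
    """
    layout = []
    for i in range(1, num_mamba_blocks + 1):
        layout.append(f"mamba_{i}")
        if i % shared_every == 0:
            # The same label is reused on purpose: the layer is shared,
            # so every occurrence refers to one set of weights.
            layout.append("shared_transformer")
    return layout


layout = zamba_like_layout(12)
print(layout)
```

Because the transformer layer is shared, adding more Mamba blocks increases depth without adding new attention parameters for each insertion point.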
GLM
The GLM Model was proposed in ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools by GLM Team,
THUDM & ZhipuAI.
The abstract from the paper starts with the following:
We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This
report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B.
- add Glm by @Cyrilvallez in #33823
Idefics 3
The Idefics3 model was proposed in Building and better understanding vision-language models: insights and future directions by Hugo Laurençon, Andrés Marafioti, Victor Sanh, and Léo Tronchon.
Idefics3 is an adaptation of the Idefics2 model with three main differences:
- It uses Llama3 for the text model.
- It uses an updated processing logic for the images.
- It removes the perceiver.
- Add Idefics 3! by @andimarafioti in #32473
PhiMoE
The PhiMoE model was proposed in Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone by Microsoft.
This model is very similar to Mixtral. The main difference is Phi3LongRoPEScaledRotaryEmbedding, which is
used to extend the context of the rotary embeddings. The query, key and value projections are fused, and the MLP's up and gate
projection layers are also fused.
- PhiMoE by @garg-amit in #33363
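To make "fused query, key and value" concrete, here is a minimal sketch of how the output of a single fused projection can be sliced back into its three parts. The helper `split_fused_qkv`, the query-key-value ordering and the head counts are assumptions chosen for illustration, not the model's actual code.

```python
def split_fused_qkv(fused, num_heads, num_kv_heads, head_dim):
    """Slice one token's fused projection output into query, key and value.

    Assumes the fused vector is laid out as [query | key | value];
    this ordering is an illustrative assumption.
    """
    q_size = num_heads * head_dim
    kv_size = num_kv_heads * head_dim
    if len(fused) != q_size + 2 * kv_size:
        raise ValueError("fused vector has an unexpected length")
    query = fused[:q_size]
    key = fused[q_size:q_size + kv_size]
    value = fused[q_size + kv_size:]
    return query, key, value


# Stand-in for the output of one fused linear layer for a single token.
fused = list(range(16))
q, k, v = split_fused_qkv(fused, num_heads=4, num_kv_heads=2, head_dim=2)
print(len(q), len(k), len(v))
```

Fusing the projections replaces three matrix multiplications with one larger one, which is the usual motivation for this design.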
Watermarking
This release adds SynthID, a novel state-of-the-art watermarking technique by Google DeepMind. SynthID has a low generation-time computational cost and can be configured to be nearly imperceptible (at the cost of making the watermark harder to detect). The release also comes with the code to train and run the corresponding detector, which is itself a machine learning model.
from transformers import AutoModelForCausalLM, AutoTokenizer, SynthIDTextWatermarkingConfig

tokenizer = AutoTokenizer.from_pretrained('google/gemma-2-2b', padding_side="left")
model = AutoModelForCausalLM.from_pretrained('google/gemma-2-2b')

# SynthID Text configuration
watermarking_config = SynthIDTextWatermarkingConfig(
    keys=[654, 400, 836, 123, 340, 443, 597, 160, 57],
    ngram_len=5,
)

# Generation with watermarking
tokenized_prompts = tokenizer(["Once upon a time, "], return_tensors="pt", padding=True)
output_sequences = model.generate(
    **tokenized_prompts, watermarking_config=watermarking_config, do_sample=True, max_new_tokens=10
)
watermarked_text = tokenizer.batch_decode(output_sequences, skip_special_tokens=True)
print(watermarked_text)
Docs for applying SynthID watermarking: https://huggingface.co/docs/transformers/internal/generation_utils#transformers.SynthIDTextWatermarkLogitsProcessor
Docs for detecting SynthID watermarking: https://huggingface.co/docs/transformers/internal/generation_utils#transformers.SynthIDTextWatermarkDetector
Quantization
BitNet
BitNet is an architecture introduced by Microsoft Research that uses extreme quantization, representing each parameter with only three values: -1, 0, and 1. This results in a model that uses just 1.58 bits per parameter, significantly reducing computational and memory requirements. It replaces the traditional Linear layers in Multi-Head Attention and Feed-Forward Networks with specialized layers called BitLinear that use ternary precision (or even binary precision, in the initial version).
- FEAT : Adding BitNet quantization method to HFQuantizer by @MekkCyber in #33410
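The 1.58 figure is log2(3), the information content of one three-valued parameter. The sketch below shows one way to map weights to {-1, 0, 1}, using the absolute-mean scaling described for BitNet b1.58; the helper name `ternary_quantize` is hypothetical, and the quantizer integrated in this release may differ in its details.

```python
import math


def ternary_quantize(weights, eps=1e-8):
    """Map each weight to -1, 0 or 1 after scaling by the mean absolute value.

    Illustrative sketch only; not the library's implementation.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    quantized = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return quantized, scale


weights = [0.9, -0.05, 0.4, -1.2, 0.02, -0.6]
quantized, scale = ternary_quantize(weights)
bits_per_param = math.log2(3)  # three possible values per parameter

print(quantized)
print(round(bits_per_param, 2))
```

Keeping `scale` alongside the ternary values allows an approximate reconstruction of the original weights as `scale * q`.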
GGUF loading in transformers
More architectures are now supported in our GGUF loader; GGUF files saved with these architectures can now
be loaded directly in transformers to be fine-tuned. We recommend using tooling from llama.cpp to requantize
the models after further training has been done.
- Add gguf support for bloom by @VladOS95-cyber in #33473
- Add falcon gguf by @g-prz in #33437
- Add gguf support for StableLM by @VladOS95-cyber in #33793
- Add gguf support for gpt2 by @VladOS95-cyber in #34044
- Add GGUF for starcoder2 by @VladOS95-cyber in #34094
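A minimal sketch of loading a GGUF checkpoint, assuming the `gguf_file` argument of `from_pretrained` from the documented GGUF integration and an installed `gguf` package. The wrapper `load_gguf_checkpoint` is a hypothetical helper, and the repository and file names in the comment are placeholders rather than real identifiers.

```python
def load_gguf_checkpoint(repo_id: str, gguf_file: str):
    """Load a tokenizer and a dequantized model from a GGUF file.

    Hypothetical wrapper: the import is done lazily so that defining
    this helper does not require transformers to be installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)
    model = AutoModelForCausalLM.from_pretrained(repo_id, gguf_file=gguf_file)
    return tokenizer, model


# Example call with placeholder names (replace with a real repo and file):
# tokenizer, model = load_gguf_checkpoint("<user>/<repo>-GGUF", "<model>.Q4_K_M.gguf")
```

The weights are dequantized on load so the model can be fine-tuned like any other PyTorch model, which is why requantizing with llama.cpp tooling afterwards is recommended above.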
Notable improvements and additions
Pipeline API synchronisation
We are pushing for a unified inference API across multiple libraries. As part of this, we are cleaning up the input and output signatures for our pipeline classes and deprecating some rarely-used arguments. This is still a work-in-progress, but when it's finished, transformers pipelines should exactly match workflows in deployment libraries like transformers.js or TGI, allowing you to seamlessly move from development to production.
- Sync video classification pipeline with huggingface_hub spec by @Rocketknight1 in #34288
- Image pipelines spec compliance by @Rocketknight1 in #33899
- Make ASR pipeline compliant with Hub spec + add tests by @Rocketknight1 in #33769
- Cleanup return_text and return_full_text options in TextGenerationPipeline by @Rocketknight1 in #33542
- Make audio classification pipeline spec-compliant and add test by @Rocketknight1 in #33730
- Sync QuestionAnsweringPipeline by @Rocketknight1 in #34039
Also, pipelines now fully support the Processor class, used by vision-language models. Expect full pipeline support for chatting with VLMs in the very near future!
ExecuTorch compatibility
ExecuTorch is an end-to-end solution for enabling on-device inference capabilities across mobile and edge devices including wearables, embedded devices and microcontrollers. It is part of the PyTorch ecosystem and supports the deployment of PyTorch models with a focus on portability, productivity, and performance.
We are collaborating with the ExecuTorch team so that 🤗 Transformers models can be exported using torch.export. The goal of this integration is not only to enable export but also to ensure that the exported artifact can be further lowered and optimized to run efficiently in ExecuTorch, particularly for mobile and edge use cases.
- Generate using exported model and enable gemma2-2b in ExecuTorch by @guangy10 in #33707
- Qwen2.5 is ExecuTorch Compatible by @guangy10 in #34102
- Olmo is ExecuTorch Compatible by @guangy10 in #34181
- Llama3 and Llama2 are ExecuTorch compatible by @guangy10 in #34101
Gradient accumulation bugfix
- Fix Gradient Accumulation issue by @ArthurZucker in #34191
- Enable users to use their own loss functions + deal with prefetching for grad accum by @muellerzr in #34198
- Enable Gradient Accumulation fix across all models + trainer fully in forward() by @muellerzr in #34283
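The PR titles above do not describe the issue itself. As an illustration only, and not taken from the linked PRs, a common way gradient accumulation goes wrong with token-level losses is averaging the per-micro-batch mean losses when micro-batches contain different numbers of tokens. The numbers below are made up to show the discrepancy.

```python
def mean(values):
    return sum(values) / len(values)


# Per-token losses for two micro-batches with different token counts
# (made-up numbers for illustration).
micro_batches = [[2.0, 2.0, 2.0, 2.0], [6.0]]

# Averaging each micro-batch first, then averaging those means,
# gives the single token in the second micro-batch too much weight.
mean_of_means = mean([mean(batch) for batch in micro_batches])

# Normalizing once by the total token count matches what one
# large batch containing all five tokens would produce.
total_tokens = sum(len(batch) for batch in micro_batches)
token_weighted = sum(sum(batch) for batch in micro_batches) / total_tokens

print(mean_of_means, token_weighted)
```

The two values agree only when every micro-batch holds the same number of tokens, which is why padding and variable-length sequences make the difference visible.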
Bugfixes and improvements
- adding positional encoder changes and tests by @manuelsh in #32600
- Uniformize kwargs for chameleon processor by @leloykun in #32181
- [MllamaProcessor] Update errors and API with multiple image by @ArthurZucker in #33715
- fix: use correct var names for check_tokenizers script by @niqodea in #33702
- Fix docs and docstrings Omdet-Turbo by @yonigozlan in #33726
- Fix position embeddings singular/plural by @molbap in #33678
- Generate: can_generate() recursive check by @gante in #33718
- clean_up_tokenization_spaces=False if unset by @itazap in #31938
- fix: add docstring for image_size in Convnextv2 config by @lucianosrp in #33734
- Fix modular model converter unable to generate Processor classes by @tonywu71 in #33737
- fix trainer tr_loss add error by @Wang-Xiaodong1899 in #33651
- Update Albumentations Versions by @vasqu in #33704
- Doc and config mismatch for DeBERTa by @fkrasnov2 in #33713
- [clean_up_tokenization_spaces] Pl bart was failing, updating by @ArthurZucker in #33735
- [MllamaImageProcessing] Update doc by @ArthurZucker in #33747
- Make siglip examples clearer and error free by @jbn in #33667
- Paligemma support for multi-image by @zucchini-nlp in #33447
- remove warning v2 by @itazap in #33761
- Model addition timeline by @LysandreJik in #33762
- Fix typing in load_balancing_loss_func function of modeling_mixtral.py. by @PhilipMay in #33641
- Enable non-safetensor ser/deser for TorchAoConfig quantized model 🔴 by @jerryzh168 in #33456
- Fix typo in documentation by @qgallouedec in #33805
- Hqq serialization by @mobicham in #33141
- Add Slow CI reminder bot by @ydshieh in #33506
- [modular] fixes! by @ArthurZucker in #33820
- Fix ViT-MAE decoder interpolate by @xenova in #33330
- Fixes for issue #33763 in idefics2 model by @aroun-coumar in #33766
- Fix link in gguf.md by @pogpog in #33768
- minor typo fix by @a-r-r-o-w in #33784
- Fix Mamba slow path bug with dtype mismatch. by @Adibvafa in #32691
- Fix passing str dtype to static cache by @guangy10 in #33741
- fix check for hidden size in text model for deepspeed zero3 auto entries by @winglian in #33829
- post reminder comment only once by @ydshieh in #33848
- Generate: move llama prepare_inputs_for_generation to GenerationMixin by @gante in #33677
- Refactor image features selection in LlaVa by @kenza-bouzid in #33696
- fix: skip dropout in eval for flash_attn in various models by @fdschmidt93 in #33844
- add attention weight up-cast to float32 in chameleon by @francescortu in #33822
- Workaround for bark issue in pipelines by @Rocketknight1 in #33824
- Fix device mismatch errors by @zucchini-nlp in #33851
- This PR contains additional changes for #33143 by @aroun-coumar in #33581
- Raise accelerate dependency error in case of defaulting low_cpu_mem_usage=True by @kylesayrs in #33830
- Validate the eval dataset in advance. by @jackyjinjing in #33743
- Add include_loss_for_metrics by @Manalelaidouni in #33088
- Avoid using context that is not accessable from external contributors by @ydshieh in #33866
- fix: repair depth estimation multiprocessing by @niqodea in #33759
- Move weight initilization deformabledetr by @g-prz in #33339
- [Fix] ViViT interpolate_pos_encoding by @RUFFY-369 in #33815
- Repo consistency fix after #33339 by @amyeroberts in #33873
- Add support for custom inputs and batched inputs in ProcessorTesterMixin by @yonigozlan in #33711
- Fix: typo by @TrickEye in #33880
- Uniformize model processors by @molbap in #31368
- Don't run reminder bot for now by @ydshieh in #33883
- populate quantization_config for kv-cache-scheme only configs by @horheynm in #33874
- Allow for nightly packages of compressed_tensors by @kylesayrs in #33828
- Fix kwargs passed by AutoQuantizationConfig.from_pretrained by @kylesayrs in #33798
- Add sdpa for DistilBert by @OmarManzoor in #33724
- Trainer - deprecate tokenizer for processing_class by @amyeroberts in #32385
- [Quantization] Switch to optimum-quanto by @SunMarc in #31732
- Optim deformable detr by @yonigozlan in #33600
- Handle Trainer tokenizer kwarg deprecation with decorator by @qubvel in #33887
- rename all test_processing_.py to test_processor_.py by @yonigozlan in #33878
- uniformize processor Mllama by @yonigozlan in #33876
- Fix dt proj bias reassigned by @HofitBata in #33314
- Update an keyerror on _save_check_point prevent confusion of missing … by @fadingNA in #33832
- VLM Generate: tag test_static_cache_matches_dynamic as flaky by @gante in #33630
- Migrate the CI runners to the new clusters by @glegendre01 in #33849
- Fix module initialization for root module under Zero3 by @Ben-Schneider-code in #33632
- Add SplinterTokenizer unit test by @ariepratama in #32652
- Generate tests: modality-agnostic input preparation by @gante in #33685
- Fix: use unidic-lite instead of ipadic as the tokenizer dictionary for Japanese by @KanTakahiro in #33372
- [Tests] Diverse Whisper fixes by @ylacombe in #33665
- [PEFT] Support low_cpu_mem_usage option for PEFT loading adapters by @BenjaminBossan in #33725
- add setter for trainer processor by @ArthurZucker in #33911
- Add support for weights_only flag when loading state_dict by @jerryzh168 in #32481
- Config: lower save_pretrained exception to warning by @gante in #33906
- Uniformize kwargs for Idefics/2 processors by @yonigozlan in #32568
- Remove logits.float() by @ringohoffman in #33902
- Minor error condition bug fix by @htahboub in #33781
- Fix distil whisper segment computation by @ylacombe in #33920
- [Doc]: Broken link in Kubernetes doc by @saldanhad in #33879
- [i18n-ru] Fixes typo in the README_ru.md by @Artanias in #33882
- Ignore keys on validate_rope by @zucchini-nlp in #33753
- [PR run-slow] by @ArthurZucker in #33939
- Add a section on writing tool templates to the chat template docs by @Rocketknight1 in #33924
- Enables CPU AWQ model with IPEX version. by @jiqing-feng in #33460
- 🔴 🚨 Resizing tokens embeddings: initialize from old embeddings' normal distribution. by @abuelnasr0 in #33325
- Removed unnecessary transpose in Switch Transformer Routing by @karan-uppal3 in #33582
- Fix attn mask ignore logic in training-time trace by @zhenglongjiepheonix in #32613
- hot fix self.position_embeddings->self.position_embedding by @ArthurZucker in #33958
- fix red check-copies by @ArthurZucker in #33964
- Cache: revert DynamicCache init for BC by @gante in #33861
- Paligemma: fix static cache test by @zucchini-nlp in #33941
- Updating char_to_token documentation to note behaviour when trim_offsets is True by @Craigacp in #33919
- add test for Jamba with new model jamba-tiny-dev by @yecohn in #33863
- Bug fix gguf qwen2moe by @VladOS95-cyber in #33940
- [TF] Fix Tensorflow XLA Generation on limited seq_len models by @vasqu in #33903
- [WIP] Add Tokenizer for MyT5 Model by @tomlimi in #31286
- Add position ids in forward pass to opt model by @avishaiElmakies in #33121
- Flash-attn performance: remove cuda sync during inference by @Cyrilvallez in #33570
- [Docs] Improve VLM docs by @NielsRogge in #33393
- [Docs] Add Developer Guide: How to Hack Any Transformers Model by @MagnusS0 in #33979
- [Red CIs] Fix hub failures by @ArthurZucker in #34001
- Fix Tensor + Embedding error in some cases when using SiglipVisionModel by @kaitolucifer in #33994
- properly fix and RUN_SLOW by @ArthurZucker in #33965
- Enable customized optimizer for DeepSpeed by @dataKim1201 in #32049
- [pytes collection] Fix flax test collection by @ArthurZucker in #34004
- Fix undefined default_config in configuration_utils.py by @mgoin in #33934
- 🌐 [i18n-KO] Translated gguf.md to Korean by @yijun-lee in #33764
- 🌐 [i18n-KO] Translated swinv2.md to Korean by @mreraser in #33566
- 🌐 [i18n-KO] Translated audio_utils.md to Korean by @yijun-lee in #33802
- 🌐 [i18n-KO] Translated esm.md to Korean by @yijun-lee in #33796
- 🌐 [i18n-KO] Translated time_series_utils.md to Korean by @yijun-lee in #33806
- 🌐 [i18n-KO] Translated pipelines_utils.md to Korean by @yijun-lee in #33809
- 🌐 [i18n-KO] Translated trainer.md to Korean by @yijun-lee in #33797
- 🌐 [i18n-KO] Translated chameleon.md to Korean by @yijun-lee in #33799
- 🌐 [i18n-KO] Translated logging.md to Korean by @chhaewxn in #33543
- 🌐 [i18n-KO] Translated auto.md to Korean by @boyunJang in #33590
- 🌐 [i18n-KO] Translated swin2sr.md to Korean by @mreraser in #33795
- 🌐 [i18n-KO] Translated vit.md to Korean by @mreraser in #33884
- 🌐 [i18n-KO] Translated gemma.md to Korean by @yijun-lee in #33936
- Cache: slight change in naming by @zucchini-nlp in #32421
- Add support for all and potentilly deleting functions by @ArthurZucker in #33859
- Processors: don't default padding side by @zucchini-nlp in #33942
- Add auto model for image-text-to-text by @yonigozlan in #32472
- BatchFeature.to() supports non-tensor keys by @Rocketknight1 in #33918
- Improve modular converter by @Cyrilvallez in #33991
- Fixup DeepSpeed things by @muellerzr in #34007
- Fix typing issue by @SunMarc in #34012
- fix awq tests due to ipex backend by @SunMarc in #34011
- Remove decoder_config=None by @SunMarc in #34014
- Fix trainer_seq2seq.py's __init__ type annotations by @benglewis in #34021
- 🌐 [i18n-KO] Translated feature_extractor.md to Korean by @yijun-lee in #33775
- 🌐 [i18n-KO] Translated bertweet.md to Korean by @ahnjj in #33891
- 🌐 [i18n-KO] Translated gpt_neox_japanese.md to Korean by @ahnjj in #33894
- 🌐 [i18n-KO] Translated rag.md to Korean by @chhaewxn in #33989
- 🌐 [i18n-KO] Translated main_classes/quantization.md to Korean by @fabxoe in #33959
- 🌐 [i18n-KO] Translated main_classes/configuration.md to Korean by @fabxoe in #33952
- 🌐 [i18n-KO] Translated model_doc/mamba.md to Korean by @fabxoe in #33626
- 🌐 [i18n-KO] Translated model_doc/autoformer.md to Korean by @fabxoe in #33574
- 🌐 [i18n-KO] Translated model_doc/patchtsmixer.md to Korean by @fabxoe in #33587
- 🌐 [i18n-KO] Translated model_doc/clip.md to Korean by @fabxoe in #33610
- 🌐 [i18n-KO] Translated model_doc/paligemma.md to Korean by @fabxoe in #33612
- 🌐 [i18n-KO] Translated model_doc/llama3.md to Korean by @fabxoe in #33635
- 🌐 [i18n-KO] Translated model_doc/mistral.md to Korean by @fabxoe in #33648
- 🌐 [i18n-KO] Translated model_doc/cohere.md to Korean by @fabxoe in #33885
- 🌐 [i18n-KO] Translated model_doc/dbrx.md to Korean by @fabxoe in #33951
- 🌐 [i18n-KO] Translated model_doc/deberta-v2.md to Korean by @fabxoe in #33968
- 🌐 [i18n-KO] Translated main_classes/onnx.md to Korean by @fabxoe in #33601
- 🌐 [i18n-KO] Translated tokenization_utils.md to Korean by @yijun-lee in #33813
- 🌐 [i18n-KO] Translated swin.md to Korean by @mreraser in #33510
- 🌐 [i18n-KO] Translated file_utils.md to Korean by @yijun-lee in #33803
- 🌐 [i18n-KO] Translated openai-gpt.md to Korean by @yijun-lee in #33801
- 🌐 [i18n-KO] Translated biogpt.md to Korean by @yijun-lee in #33773
- 🌐 [i18n-KO] Translated blip.md to Korean by @cjfghk5697 in #33515
- 🌐 [i18n-KO] Translated output.md to Korean by @4N3MONE in #33607
- 🌐 [i18n-KO] Translated image_processing_utils.md to Korean by @yijun-lee in #33804
- 🌐 [i18n-KO] Translated modular_transformers.md to Korean by @yijun-lee in #33772
- [Patch helper] update to not have to checkout main by @ArthurZucker in #34006
- Fix Failed tests with mobile bert resize tokens embedding by @abuelnasr0 in #33950
- Generate: remove most decoder-only LLMs prepare_inputs_for_generation by @gante in #33870
- Mllama: fix tests by @zucchini-nlp in #34000
- Fix PIL dep for tests by @muellerzr in #34028
- 🌐 [i18n-KO] Translated model_doc/bart.md to Korean by @fabxoe in #33893
- 🌐 [i18n-KO] Translated model_doc/deberta.md to Korean by @fabxoe in #33967
- 🌐 [i18n-KO] Translated main_classes/keras_callbacks.md to Korean by @fabxoe in #33955
- 🌐 [i18n-KO] Translated model_doc/mamba2.md to Korean by @fabxoe in #33629
- 🌐 [i18n-KO] Translated main_classes/model.md to Korean by @fabxoe in #33606
- 🌐 [i18n-KO] Translated model_doc/trajectory_transformer.md to Korean by @fabxoe in #33597
- 🌐 [i18n-KO] Translated model_doc/time_series_transformer.md to Korean by @fabxoe in #33596
- 🌐 [i18n-KO] Translated model_doc/informer.md to Korean by @fabxoe in #33585
- 🌐 [i18n-KO] Translated model_doc/graphormer.md to Korean by @fabxoe in #33569
- 🌐 [i18n-KO] Translated modeling_utils.md to Korean by @yijun-lee in #33808
- 🌐 [i18n-KO] Translated main_classes/data_collator.md to Korean by @fabxoe in #33954
- 🌐 [i18n-KO] Translated model_doc/patchtst.md to Korean by @fabxoe in #33589
- 🌐 [i18n-KO] Translated text_generation.md to Korean by @yijun-lee in #33777
- 🌐 [i18n-KO] Translated main_classes/callback.md to Korean by @Jwaminju in #33572
- 🌐 [i18n-KO] Translated generation_utils.md to Korean by @yijun-lee in #33818
- Add Translate docs into Arabic - section files CONCEPTUAL GUIDES by @AhmedAlmaghz in #33982
- add sdpa to OPT by @avishaiElmakies in #33298
- Phi3: fix attn for sliding window by @zucchini-nlp in #33586
- HfArgumentParser: allow for hyhenated field names in long-options by @djmarti in #33990
- Fix pipelines tests by @qubvel in #34049
- Specifying torch dtype in Qwen2VLForConditionalGeneration by @htahboub in #33953
- Universal Assisted Generation: Assisted generation with any assistant model (by Intel Labs) by @danielkorat in #33383
- check if eigenvalues of covariance matrix are complex. by @abuelnasr0 in #34037
- [Docs] Update compressed_tensors.md by @mgoin in #33961
- Fix data_seed unused by @MekkCyber in #33731
- [TESTS] ASR pipeline by @ylacombe in #33925
- Update Blip2 is_pipeline_test_to_skip method signature by @qubvel in #34067
- provide trust_remote_code for search feat extractor in model config by @eaidova in #34036
- Small Fix to modular converter by @MekkCyber in #34051
- Default synced_gpus to True when using FullyShardedDataParallel by @ringohoffman in #33483
- Idefics: fix position ids by @zucchini-nlp in #33907
- Update SSH workflow file by @ydshieh in #34084
- Tests: upcast logits to float() by @gante in #34042
- Fix flax failures by @LysandreJik in #33912
- Fix DAC slow tests by @ylacombe in #34088
- Fix failing conversion by @LysandreJik in #34010
- Fix PushToHubMixin when pusing to a PR revision by @Wauplin in #34090
- avoid many failures for ImageGPT by @ydshieh in #34071
- Fix NaNs in cost_matrix for mask2former by @ducha-aiki in #34074
- Fix flaky tests by @zucchini-nlp in #34069
- Generate: move prepare_inputs_for_generation in encoder-decoder llms by @gante in #34048
- Avoid many test failures for LlavaNextVideoForConditionalGeneration by @ydshieh in #34070
- refactor: benchmarks by @McPatate in #33896
- fix(ci): benchmarks dashboard was failing due to missing quotations by @McPatate in #34100
- Generate: Fix modern llm generate calls with synced_gpus by @gante in #34095
- Mistral-related models for QnA by @vasqu in #34045
- Fix a typo by @PengWeixuan in #34148
- Fixed error message in mllama by @dmgcsilva in #34106
- Specify that users should be careful with their own files by @LysandreJik in #34153
- Add documentation for docker by @ArthurZucker in #33156
- Update README.md with Enterprise Hub by @gary149 in #34150
- Idefics: enable generation tests by @zucchini-nlp in #34062
- Add sdpa for Vivit by @RUFFY-369 in #33757
- Fix FSDP resume Initialization issue by @Itssshikhar in #34032
- Fix default behaviour in TextClassificationPipeline for regression problem type by @subhalingamd in #34066
- Generate: move logits to same device as input_ids by @gante in #34076
- Add support for inheritance from class with different suffix in modular by @yonigozlan in #34077
- Fix optuna ddp hp search by @SunMarc in #34073
- [feat] LlavaNext add feature size check to avoid CUDA Runtime Error by @laurentd-lunit in #33608
- 🌐 [i18n-KO] Translated vivit.md to Korean by @mreraser in #33935
- 🌐 [i18n-KO] Translated gemma2.md to Korean by @yijun-lee in #33937
- 🌐 [i18n-KO] Translated trainer_utils.md to Korean by @yijun-lee in #33817
- 🌐 [i18n-KO] Translated blip-2.md to Korean by @cjfghk5697 in #33516
- IDEFICS: support inputs embeds by @zucchini-nlp in #34043
- [fix] fix token healing tests and usage errors by @alpertunga-bile in #33931
- Revert accelerate error caused by 46d09af by @steveepreston in #34197
- Fix wrong name for llava onevision and qwen2_vl in tokenization auto by @yonigozlan in #34177
- Avoid using torch's Tensor or PIL's Image in chat template utils if not available by @RezaRahemtola in #34165
- Revert "Fix FSDP resume Initialization issue" by @SunMarc in #34193
- Update trainer._get_eval_sampler() to support group_by_length arg by @larin92 in #33514
- Fix warning message for fp32_cpu_offloading in bitsandbytes configs by @amosyou in #34079
- Ping team members for new failed tests in daily CI by @ydshieh in #34171
- fix(Wav2Vec2ForCTC): torch export by @chrsmcgrr in #34023
- Fix for tokenizer.apply_chat_template with continue_final_message=True by @schoennenbeck in #34214
- removes decord by @vrnvu in #33987
- Fix bus error when using GPT2 on M1 macs by @chanind in #34031
- Generate: visit non-llm prepare_inputs_for_generation by @gante in #34199
- Support Llama 3.2 conversion (text models) by @pcuenca in #33778
- Fix-red-ci by @ArthurZucker in #34230
- BLIP: fix input expansion logic by @zucchini-nlp in #34225
- Fix broken test decorator require_torch_up_to_2_accelerators by @byi8220 in #34201
- Informative 2 by @LysandreJik in #34154
- Fix UDOP dtype issue by @Rocketknight1 in #34180
- Only cast logits to float when computing loss by @ringohoffman in #34147
- Generation tests: don't rely on main input name by @zucchini-nlp in #34228
- Change Paligemma import logging to work with modular by @yonigozlan in #34211
- Add DetrImageProcessorFast by @yonigozlan in #34063
- Add a doc section on writing generation prompts by @Rocketknight1 in #34248
- Fix method name which changes in tutorial by @andimarafioti in #34252
- Attn implementation for composite models by @zucchini-nlp in #32238
- VLM: add more modularity by @zucchini-nlp in #34175
- T5 compile compatibilty by @zucchini-nlp in #34089
- [docs] Fix GenerationConfig params by @stevhliu in #34299
- Fix Korean doc _toctree.yml by @regisss in #34293
- Update PR templates by @SunMarc in #34065
- [RT-DETR] Fix onnx inference bug for Optype (Where) by @YHallouard in #33877
- Fix FA2 attention for models supporting sliding window by @Cyrilvallez in #34093
- Fix: tensor of examples of the same length triggers invalid stacking by @pbelcak in #34166
- Add post_process_depth_estimation to image processors and support ZoeDepth's inference intricacies by @alex-bene in #32550
- Add option for running ffmpeg_microphone_live as a background process by @mikamerath in #32838
- Feature: Add MLFLOW_MAX_LOG_PARAMS to MLflowCallback by @cecheta in #34279
- Fix continue_final_message for image-text-to-text chat templates by @yonigozlan in #34236
- fix error in _get_eval_sampler when group_by_length enabled by @akakakakakaa in #34237
- [docs] fix typo by @faaany in #34235
- 🌐 [i18n-KO] Translated executorch.md to Korean by @ahnjj in #33888
- 🌐 [i18n-KO] Translated bert japanese.md to Korean by @ahnjj in #33890
- 🌐 [i18n-KO] Translated model_doc/bartpho.md to Korean by @Jwaminju in #33981
- Example doc for token classification of Llama and Dependent/Copied Models by @h3110Fr13nd in #34139
- [docs] Fix Korean toctree by @stevhliu in #34324
- Added Deberta model type support by @FilipposVentirozos in #34308
Significant community contributions
The following contributors have made significant changes to the library over the last release:
- @manuelsh
- adding positional encoder changes and tests (#32600)
- @ArthurZucker
- [MllamaProcessor] Update errors and API with multiple image (#33715)
- [clean_up_tokenization_spaces] Pl bart was failing, updating (#33735)
- [MllamaImageProcessing] Update doc (#33747)
- [modular] fixes! (#33820)
- add setter for trainer processor (#33911)
- [PR run-slow] (#33939)
- hot fix self.position_embeddings->self.position_embedding (#33958)
- fix red check-copies (#33964)
- [Red CIs] Fix hub failures (#34001)
- properly fix and RUN_SLOW (#33965)
- [pytes collection] Fix flax test collection (#34004)
- Add support for all and potentilly deleting functions (#33859)
- [Patch helper] update to not have to checkout main (#34006)
- Add documentation for docker (#33156)
- Fix Gradient Accumulation issue (#34191)
- Fix-red-ci (#34230)
- @molbap
- @vasqu
- @VladOS95-cyber
- @ydshieh
- Add Slow CI reminder bot (#33506)
- post reminder comment only once (#33848)
- Avoid using context that is not accessable from external contributors (#33866)
- Don't run reminder bot for now (#33883)
- Update SSH workflow file (#34084)
- avoid many failures for ImageGPT (#34071)
- Avoid many test failures for LlavaNextVideoForConditionalGeneration (#34070)
- Ping team members for new failed tests in daily CI (#34171)
- @amyeroberts
- @ylacombe
- @ringohoffman
- @garg-amit
- PhiMoE (#33363)
- @pglorio
- Add Zamba (#30950)
- @tomlimi
- [WIP] Add Tokenizer for MyT5 Model (#31286)
- @yijun-lee
- 🌐 [i18n-KO] Translated gguf.md to Korean (#33764)
- 🌐 [i18n-KO] Translated audio_utils.md to Korean (#33802)
- 🌐 [i18n-KO] Translated esm.md to Korean (#33796)
- 🌐 [i18n-KO] Translated time_series_utils.md to Korean (#33806)
- 🌐 [i18n-KO] Translated pipelines_utils.md to Korean (#33809)
- 🌐 [i18n-KO] Translated trainer.md to Korean (#33797)
- 🌐 [i18n-KO] Translated chameleon.md to Korean (#33799)
- 🌐 [i18n-KO] Translated gemma.md to Korean (#33936)
- 🌐 [i18n-KO] Translated feature_extractor.md to Korean (#33775)
- 🌐 [i18n-KO] Translated tokenization_utils.md to Korean (#33813)
- 🌐 [i18n-KO] Translated file_utils.md to Korean (#33803)
- 🌐 [i18n-KO] Translated openai-gpt.md to Korean (#33801)
- 🌐 [i18n-KO] Translated biogpt.md to Korean (#33773)
- 🌐 [i18n-KO] Translated image_processing_utils.md to Korean (#33804)
- 🌐 [i18n-KO] Translated modular_transformers.md to Korean (#33772)
- 🌐 [i18n-KO] Translated modeling_utils.md to Korean (#33808)
- 🌐 [i18n-KO] Translated text_generation.md to Korean (#33777)
- 🌐 [i18n-KO] Translated generation_utils.md to Korean (#33818)
- 🌐 [i18n-KO] Translated gemma2.md to Korean (#33937)
- 🌐 [i18n-KO] Translated trainer_utils.md to Korean (#33817)
- @fabxoe
- 🌐 [i18n-KO] Translated main_classes/quantization.md to Korean (#33959)
- 🌐 [i18n-KO] Translated main_classes/configuration.md to Korean (#33952)
- 🌐 [i18n-KO] Translated model_doc/mamba.md to Korean (#33626)
- 🌐 [i18n-KO] Translated model_doc/autoformer.md to Korean (#33574)
- 🌐 [i18n-KO] Translated model_doc/patchtsmixer.md to Korean (#33587)
- 🌐 [i18n-KO] Translated model_doc/clip.md to Korean (#33610)
- 🌐 [i18n-KO] Translated model_doc/paligemma.md to Korean (#33612)
- 🌐 [i18n-KO] Translated model_doc/llama3.md to Korean (#33635)
- 🌐 [i18n-KO] Translated model_doc/mistral.md to Korean (#33648)
- 🌐 [i18n-KO] Translated model_doc/cohere.md to Korean (#33885)
- 🌐 [i18n-KO] Translated model_doc/dbrx.md to Korean (#33951)
- 🌐 [i18n-KO] Translated model_doc/deberta-v2.md to Korean (#33968)
- 🌐 [i18n-KO] Translated main_classes/onnx.md to Korean (#33601)
- 🌐 [i18n-KO] Translated model_doc/bart.md to Korean (#33893)
- 🌐 [i18n-KO] Translated model_doc/deberta.md to Korean (#33967)
- 🌐 [i18n-KO] Translated main_classes/keras_callbacks.md to Korean (#33955)
- 🌐 [i18n-KO] Translated model_doc/mamba2.md to Korean (#33629)
- 🌐 [i18n-KO] Translated main_classes/model.md to Korean (#33606)
- 🌐 [i18n-KO] Translated model_doc/trajectory_transformer.md to Korean (#33597)
- 🌐 [i18n-KO] Translated model_doc/time_series_transformer.md to Korean (#33596)
- 🌐 [i18n-KO] Translated model_doc/informer.md to Korean (#33585)
- 🌐 [i18n-KO] Translated model_doc/graphormer.md to Korean (#33569)
- 🌐 [i18n-KO] Translated main_classes/data_collator.md to Korean (#33954)
- 🌐 [i18n-KO] Translated model_doc/patchtst.md to Korean (#33589)
- @MekkCyber
- @AhmedAlmaghz
- Add Translate docs into Arabic - section files CONCEPTUAL GUIDES (#33982)
- @alex-bene
- Add post_process_depth_estimation to image processors and support ZoeDepth's inference intricacies (#32550)