v4.43.3 Patch DeepSpeed
Patch release v4.43.3:
We were still seeing some bugs, so @zucchini-nlp added:
- Resize embeds with DeepSpeed #32214
- don't log base model architecture in wandb if log model is false #32143
Other fixes:
- [whisper] fix short-form output type #32178 by @sanchit-gandhi, which fixes the short-audio temperature fallback!
- [BigBird Pegasus] set _supports_param_buffer_assignment to False #32222 by @kashif. This is mostly related to the new super fast init: some models need this flag set to False. If you see weird behavior, check this flag first 😉
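
To make the last item concrete, here is a minimal sketch of how a class-level opt-out flag like this can be inspected. Only the attribute name `_supports_param_buffer_assignment` comes from the PR title. The classes `BaseModel`, `FastInitModel`, and `LegacyInitModel`, the helper functions, and the default value of `True` are assumptions made for illustration; they are not the library's actual classes or loading code.

```python
# Stand-in classes for illustration only; these are NOT transformers classes.
class BaseModel:
    # Assumed default: models support the fast init path unless they opt out.
    _supports_param_buffer_assignment = True


class FastInitModel(BaseModel):
    # Inherits the default, so the fast path stays enabled.
    pass


class LegacyInitModel(BaseModel):
    # Opts out, mirroring what the PR title describes for BigBird Pegasus.
    _supports_param_buffer_assignment = False


def supports_param_buffer_assignment(model_cls, default=True):
    """Return the flag for a class, falling back to `default` if it is absent."""
    return bool(getattr(model_cls, "_supports_param_buffer_assignment", default))


def opted_out(model_classes):
    """List the names of classes that have set the flag to False."""
    return [
        cls.__name__
        for cls in model_classes
        if not supports_param_buffer_assignment(cls)
    ]


if __name__ == "__main__":
    for cls in (FastInitModel, LegacyInitModel):
        print(cls.__name__, supports_param_buffer_assignment(cls))
    print("opted out:", opted_out([FastInitModel, LegacyInitModel]))
```

If a model of yours behaves oddly after loading, the same `getattr` check on its class is a quick way to see which value it currently has.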