Release v0.7.0: Logging API, FSDP, batch size finder and examples revamp · huggingface/accelerate

v0.7.0: Logging API, FSDP, batch size finder and examples revamp

Logging API

Use any of your favorite logging libraries (TensorBoard, Wandb, CometML...) with just a few lines of code inside your training scripts with Accelerate. All details are in the documentation.

Add logging capabilities by @muellerzr in #293
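
The idea behind the logging API can be sketched without any dependencies: one log call in the training script is forwarded to every configured tracker. The sketch below is not Accelerate's implementation. ListTracker and TrackerHub are hypothetical stand-ins that keep records in memory instead of calling TensorBoard, Wandb or CometML; the real entry points are described in the documentation.

```python
class ListTracker:
    """Hypothetical tracker: stores records in memory instead of calling a logging library."""

    def __init__(self, name):
        self.name = name
        self.config = None
        self.records = []

    def store_init_configuration(self, values):
        # Remember the hyperparameters of the run.
        self.config = dict(values)

    def log(self, values, step=None):
        # Keep each logged dictionary together with its step.
        self.records.append((step, dict(values)))


class TrackerHub:
    """Hypothetical dispatcher: forwards every call to all registered trackers."""

    def __init__(self, trackers):
        self.trackers = list(trackers)

    def init_trackers(self, config):
        for tracker in self.trackers:
            tracker.store_init_configuration(config)

    def log(self, values, step=None):
        for tracker in self.trackers:
            tracker.log(values, step=step)


# Two in-memory trackers stand in for two real logging backends.
trackers = [ListTracker("tensorboard_like"), ListTracker("wandb_like")]
hub = TrackerHub(trackers)
hub.init_trackers({"learning_rate": 3e-4, "batch_size": 32})

# The training loop only talks to the hub, never to a specific backend.
for step, loss in enumerate([0.9, 0.7, 0.5]):
    hub.log({"train_loss": loss}, step=step)
```

With this shape, switching or adding a logging backend does not touch the training loop, which is the point of the "few lines of code" claim above.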

Support for FSDP (fully sharded DataParallel)

PyTorch recently released a new model wrapper for sharded DDP training called FSDP. This release adds support for it (note that it doesn't work with mixed precision yet). See all caveats in the documentation.

PyTorch FSDP Feature Incorporation by @pacman100 in #321

Batch size finder

Say goodbye to CUDA OOM errors with the new find_executable_batch_size decorator. Just decorate your training function and pick a starting batch size, then let Accelerate do the rest.

Add a memory-aware decorator for CUDA OOM avoidance by @muellerzr in #324
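
To make the retry idea concrete, here is a minimal dependency-free sketch of a decorator that reruns a function with a smaller batch size whenever it runs out of memory. It is not the code of find_executable_batch_size. The names retry_with_smaller_batch and SimulatedOOMError are hypothetical, the halving strategy is an assumption, and the out-of-memory condition is simulated so the example runs on any machine.

```python
import functools


class SimulatedOOMError(RuntimeError):
    """Hypothetical stand-in for a CUDA out-of-memory error."""


def retry_with_smaller_batch(starting_batch_size=128):
    """Hypothetical decorator: call the function with a batch size, halving it after each OOM."""

    def decorator(function):
        @functools.wraps(function)
        def wrapper(*args, **kwargs):
            batch_size = starting_batch_size
            while batch_size > 0:
                try:
                    # The batch size is injected as the first argument.
                    return function(batch_size, *args, **kwargs)
                except SimulatedOOMError:
                    # Assumed strategy: halve and try again.
                    batch_size //= 2
            raise RuntimeError("No executable batch size found, reached zero.")

        return wrapper

    return decorator


attempts = []


@retry_with_smaller_batch(starting_batch_size=128)
def train(batch_size):
    attempts.append(batch_size)
    # Pretend that anything above 16 samples does not fit in memory.
    if batch_size > 16:
        raise SimulatedOOMError("simulated out of memory")
    return batch_size


result = train()
```

Note that the decorated function is called without a batch size: the decorator supplies it, so everything that depends on the batch size (dataloaders, for instance) should be built inside the decorated function.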

Examples revamp

The Accelerate examples are now split in two. The base folder contains very simple NLP and computer vision examples, as well as complete versions incorporating all features. The by_feature subfolder shows exactly what code to add for each given feature (checkpointing, tracking, cross-validation, etc.).

Refactor Examples by Feature by @muellerzr in #312

What's Changed

Document save/load state by @muellerzr in #290
Refactor precisions to its own enum by @muellerzr in #292
Load model and optimizer states on CPU to avoid OOMs by @sgugger in #299
Fix example for datasets v2 by @sgugger in #298
Leave default as None in mixed_precision for launch command by @sgugger in #300
Pass lr_scheduler to Accelerator.prepare by @sgugger in #301
Create new TestCase classes and clean up W&B tests by @muellerzr in #304
Have custom trackers work with the API by @muellerzr in #305
Write tests for comet_ml by @muellerzr in #306
Fix training in DeepSpeed by @sgugger in #308
Update example scripts by @muellerzr in #307
Use --no_local_rank for DeepSpeed launch by @sgugger in #309
Fix Accelerate CLI CPU option + small fix for W&B tests by @muellerzr in #311
Fix DataLoader sharding for deepspeed in accelerate by @m3rlin45 in #315
Create a testing framework for example scripts and fix current ones by @muellerzr in #313
Refactor Tracker logic and write guards for logging_dir by @muellerzr in #316
Create Cross-Validation example by @muellerzr in #317
Create alias for Accelerator.free_memory by @muellerzr in #318
fix typo in docs of accelerate tracking by @loubnabnl in #320
Update examples to show how to deal with extra validation copies by @muellerzr in #319
Fixup all checkpointing examples by @muellerzr in #323
Introduce reduce operator by @muellerzr in #326

New Contributors

@m3rlin45 made their first contribution in #315
@loubnabnl made their first contribution in #320
@pacman100 made their first contribution in #321

Full Changelog: v0.6.0...v0.7.0
