Follow-up: I found that when loading from a ckpt, self._restarting is False if check_val_every_n_epoch=1 but True if check_val_every_n_epoch>1, inside self.fit_loop.run().
### Bug description
I run with LightningCLI. I set check_val_every_n_epoch > 1 (e.g. 2) and run an experiment with max_epochs=20; the model ckpt is saved by lightning.pytorch.callbacks.ModelCheckpoint. The learning rate scheduler is torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=self.trainer.max_epochs, T_mult=1, eta_min=self.eta_min), stepped once per epoch. When I load a ckpt (e.g. one saved at epoch 3) to continue training, the learning rate advances one epoch earlier than expected.
(Plot: the red curve is the original learning rate schedule and the yellow curve is the resumed run.) The lr is logged by lightning.pytorch.callbacks.LearningRateMonitor.
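To make the off-by-one concrete, here is a small stdlib-only sketch (not Lightning code) of the closed-form LR that CosineAnnealingWarmRestarts produces with T_mult=1. The values base_lr=0.1, eta_min=0.0 and the resume epoch are illustrative assumptions, not taken from the report:

```python
import math

def cosine_restart_lr(base_lr, eta_min, T_0, epoch):
    # Closed-form LR of CosineAnnealingWarmRestarts (T_mult=1) at a given epoch
    t_cur = epoch % T_0
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * t_cur / T_0)) / 2

base_lr, eta_min, T_0 = 0.1, 0.0, 20  # illustrative; T_0 = max_epochs

# Uninterrupted run: LR at epochs 0..19
original = [cosine_restart_lr(base_lr, eta_min, T_0, t) for t in range(T_0)]

# A resumed run whose scheduler has stepped once too often behaves,
# at epoch t, like the original run at epoch t + 1 (resuming here from epoch 4):
resumed = [cosine_restart_lr(base_lr, eta_min, T_0, t + 1) for t in range(4, T_0)]

# The yellow (resumed) curve leads the red (original) curve by exactly one epoch
print(resumed[0] == original[5])  # True
```

This reproduces the shape of the two curves in the plot: the resumed schedule decays one epoch ahead of the original.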
### What version are you seeing the problem on?
v2.4
### How to reproduce the bug
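A minimal LightningCLI trainer config matching the setup described above (the file name and checkpoint path below are illustrative, not from the report):

```yaml
# config.yaml -- illustrative values matching the report
trainer:
  max_epochs: 20
  check_val_every_n_epoch: 2
```

Run the first experiment with `python main.py fit --config config.yaml`, then resume with `python main.py fit --config config.yaml --ckpt_path <path to the epoch-3 checkpoint>` and compare the LR curves logged by LearningRateMonitor.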
### Error messages and logs
#- PyTorch Lightning Version (e.g., 2.4.0):
#- PyTorch Version (e.g., 2.4):
#- Python version (e.g., 3.12):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed Lightning (conda, pip, source):