SDXL Training, prediction_type = 'epsilon' + snr_gamma + rescale_betas_zero_snr will cause NaN #10372

Jannchie · 2024-12-24T14:47:17Z


I'm using the SDXL training script.

I read the documentation at https://huggingface.co/docs/diffusers/api/schedulers/ddim.

It mentions that using rescale_betas_zero_snr with v-prediction might be more scientifically sound. However, rescale_betas_zero_snr with epsilon-prediction should also work. Yet when rescale_betas_zero_snr is True, the computed SNR is exactly 0 at timestep 999.
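To make the zero SNR concrete, here is a minimal pure-Python sketch of a zero-terminal-SNR rescale. The schedule values (a "scaled_linear" schedule, 1000 timesteps, beta_start=0.00085, beta_end=0.012) are my assumptions and are not stated above, and the helper names are made up for illustration. For simplicity it rescales the cumulative alphas directly instead of the betas, so it is not the library's implementation, only the same idea.

```python
import math

# Assumed schedule values (not stated in the post): a "scaled_linear" schedule
# with 1000 training timesteps, beta_start=0.00085 and beta_end=0.012.
NUM_TIMESTEPS = 1000
BETA_START, BETA_END = 0.00085, 0.012


def scaled_linear_betas(n, start, end):
    """Betas that are linear in sqrt(beta) between start and end."""
    lo, hi = math.sqrt(start), math.sqrt(end)
    return [(lo + (hi - lo) * i / (n - 1)) ** 2 for i in range(n)]


def alphas_cumprod(betas):
    """Cumulative product of (1 - beta_t)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def rescale_zero_terminal_snr(alpha_bar):
    """Shift and scale sqrt(alpha_bar) so the last entry becomes exactly 0
    while the first entry is (up to rounding) preserved."""
    sqrt_ab = [math.sqrt(a) for a in alpha_bar]
    first, last = sqrt_ab[0], sqrt_ab[-1]
    scale = first / (first - last)
    return [((s - last) * scale) ** 2 for s in sqrt_ab]


def snr(alpha_bar_t):
    """Signal-to-noise ratio: alpha_bar / (1 - alpha_bar)."""
    return alpha_bar_t / (1.0 - alpha_bar_t)


original = alphas_cumprod(scaled_linear_betas(NUM_TIMESTEPS, BETA_START, BETA_END))
rescaled = rescale_zero_terminal_snr(original)

print("SNR at t=999, original:", snr(original[-1]))  # small but positive
print("SNR at t=999, rescaled:", snr(rescaled[-1]))  # 0.0
```

Without the rescale the last SNR is small but nonzero, so dividing by it is harmless. After the rescale it is exactly 0.0, not merely tiny.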

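And when snr_gamma is enabled, computing the loss weights requires dividing by the SNR. At timestep 999 this becomes a division by zero (0/0 for epsilon-prediction, because min(snr, snr_gamma) is also 0). No exception is raised; the weight is NaN, and so is the loss. See huggingface/diffusers@6dfaec3/examples/text_to_image/train_text_to_image_sdxl.py#L1147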

I think e-prediction, snr_gamma, and rescale_betas_zero_snr should be compatible options that can work together. I don’t have much knowledge about how these work, but maybe adding a small epsilon when calculating snr could solve the problem, though I’m not sure if it’s reasonable. Could someone offer some advice?
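For anyone who wants to try the epsilon idea, below is a small PyTorch sketch of the weight computation as I understand it, min(snr, snr_gamma) / snr for epsilon-prediction. `min_snr_weights` and its `eps` argument are hypothetical names of mine, not something the training script provides, and the SNR values are made up.

```python
import torch


def min_snr_weights(snr, gamma, eps=0.0):
    """Min-SNR loss weights for epsilon-prediction: min(snr, gamma) / snr.

    `eps` is a hypothetical floor on the denominator (an assumption for this
    sketch, not an option of the training script). eps=0.0 leaves a
    non-negative SNR unchanged, which reproduces the 0/0 case.
    """
    clipped = torch.minimum(snr, torch.full_like(snr, gamma))
    return clipped / torch.clamp(snr, min=eps)


# Made-up SNR values for a batch of four timesteps; the first one stands in
# for timestep 999 under rescale_betas_zero_snr, where the SNR is exactly 0.
snr = torch.tensor([0.0, 0.5, 5.0, 50.0])
per_sample_loss = torch.ones(4)

raw = min_snr_weights(snr, gamma=5.0)             # first weight is 0/0 = NaN
safe = min_snr_weights(snr, gamma=5.0, eps=1e-8)  # first weight is 0/1e-8 = 0

print(raw, (per_sample_loss * raw).mean())    # one NaN weight makes the mean NaN
print(safe, (per_sample_loss * safe).mean())  # finite
```

One thing to note about this choice: for any SNR in (0, gamma] the weight is exactly 1, but with the floor the weight at SNR = 0 becomes 0, so samples drawn at the terminal timestep stop contributing to the loss. Replacing NaN weights with 1 would be the other obvious option. I can't say which of the two is more appropriate.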
