
Incompatible dimensions #13

Open
DataJuggler opened this issue Nov 1, 2024 · 4 comments


DataJuggler commented Nov 1, 2024

I ran into numerous problems getting this installed.

First, I think your documentation left out creating the `models` folder. I found this in sd3_infer.py:

# NOTE: Must have folder `models` with the following files:
# - `clip_g.safetensors` (openclip bigG, same as SDXL)
# - `clip_l.safetensors` (OpenAI CLIP-L, same as SDXL)
# - `t5xxl.safetensors` (google T5-v1.1-XXL)
# - `sd3_medium.safetensors` (or whichever main MMDiT model file)
# Also can have
# - `sd3_vae.safetensors` (holds the VAE separately if needed)
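For what it's worth, here is a quick way to check that those files are in place before running anything (a minimal sketch based only on the note above; adjust the paths if your layout differs):

# Minimal sketch: verify the files listed above exist under ./models
# (sd3_vae.safetensors is optional per the note, so it is not checked here).
from pathlib import Path

required = ["clip_g.safetensors", "clip_l.safetensors", "t5xxl.safetensors", "sd3_medium.safetensors"]
missing = [name for name in required if not (Path("models") / name).exists()]
print("missing:", missing or "none")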

Also, to get this to work, I had to run these two installs:

pip install fire safetensors tqdm einops transformers sentencepiece protobuf pillow

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
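A quick sanity check that the CUDA 11.8 build of torch actually got picked up (just a sketch, assuming the installs above ran inside the active virtual environment):

# Should print a version ending in "+cu118" and then True if the GPU is visible.
import torch
print(torch.__version__)
print(torch.cuda.is_available())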

I also think your link to the t5xxl.safetensors file is wrong, or your Python is wrong. I downloaded the file from Hugging Face, and it was named t5xxl_fp16.safetensors, while the script was looking for t5xxl.safetensors. I renamed the file to drop the fp16 suffix, and that got me to the "Models loaded" point.
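In case it helps anyone else, the rename can be scripted (a minimal sketch; it assumes the downloaded file is named t5xxl_fp16.safetensors and sits in the models folder):

# Rename the downloaded T5 weights to the filename sd3_infer.py looks for.
from pathlib import Path

src = Path("models/t5xxl_fp16.safetensors")  # name as downloaded from Hugging Face
dst = Path("models/t5xxl.safetensors")       # name expected by the script
if src.exists() and not dst.exists():
    src.rename(dst)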

Then it started generating the images; it took a long time and then posted this:

(.sd3.5) E:\SD35Turbo.sd3.5\sd3.5>python sd3_infer.py --prompt "cute picture of a dog" --model E:\SD35Turbo\sd3.5_large_turbo.safetensors --width 1920 --height 1080
Loading tokenizers...
You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in huggingface/transformers#24565
Loading OpenAI CLIP L...
Loading OpenCLIP bigG...
Loading Google T5-v1-XXL...
Skipping key 'shared.weight' in safetensors file as 'shared' does not exist in python model
Loading SD3 model sd3.5_large_turbo.safetensors...
Loading VAE model...
Models loaded.
Saving images to outputs\sd3.5_large_turbo\cute picture of a dog_2024-11-01T08-58-20
0%| | 0/4 [00:04<?, ?it/s]
0%| | 0/1 [01:40<?, ?it/s]
Traceback (most recent call last):
File "E:\SD35Turbo.sd3.5\sd3.5\sd3_infer.py", line 481, in
fire.Fire(main)
File "E:\SD35Turbo.sd3.5\sd3.5.sd3.5\Lib\site-packages\fire\core.py", line 135, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5.sd3.5\Lib\site-packages\fire\core.py", line 468, in _Fire
component, remaining_args = _CallAndUpdateTrace(
^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5.sd3.5\Lib\site-packages\fire\core.py", line 684, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5.sd3.5\Lib\site-packages\torch\utils_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5\sd3_infer.py", line 465, in main
inferencer.gen_image(
File "E:\SD35Turbo.sd3.5\sd3.5\sd3_infer.py", line 358, in gen_image
sampled_latent = self.do_sampling(
^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5\sd3_infer.py", line 286, in do_sampling
latent = sample_fn(
^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5.sd3.5\Lib\site-packages\torch\utils_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5.sd3.5\Lib\site-packages\torch\amp\autocast_mode.py", line 44, in decorate_autocast
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5\sd3_impls.py", line 285, in sample_euler
denoised = model(x, sigma_hat * s_in, **extra_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5.sd3.5\Lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5.sd3.5\Lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5\sd3_impls.py", line 151, in forward
batched = self.model.apply_model(
^^^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5\sd3_impls.py", line 126, in apply_model
return self.model_sampling.calculate_denoised(sigma, model_output, x)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\SD35Turbo.sd3.5\sd3.5\sd3_impls.py", line 47, in calculate_denoised
return model_input - model_output * sigma
~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
RuntimeError: The size of tensor a (135) must match the size of tensor b (134) at non-singleton dimension 2

--

Any suggestions on how to fix this, or did I do something wrong?

Thanks
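One possible reading of the error, not confirmed by the maintainers here: 1080 is not divisible by 16, so the latent grid ends up with an odd height and the model's patching drops a row. A small sketch of the suspected arithmetic, assuming the usual SD3 setup of an 8x VAE downsample and a patch size of 2:

# Suspected arithmetic behind the 135-vs-134 mismatch (assumptions noted above).
height = 1080
latent_h = height // 8            # 135 -- odd, so it cannot be tiled evenly by 2x2 patches
patched_h = (latent_h // 2) * 2   # 134 -- the height that comes back out of the model
print(latent_h, patched_h)        # 135 134, matching the RuntimeError

If that is indeed the cause, requesting a width and height divisible by 16 (for example 1920x1088 or 1024x1024) should avoid the mismatch.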


ysxk commented Nov 22, 2024

I encountered the same problem. Have you solved it?

@DataJuggler (Author)

No. I think I gave up.


dch0319 commented Nov 23, 2024

Same problem.

@nguyenthekhoig7

Same issue. I downloaded the models using this code:

from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="stabilityai/stable-diffusion-3.5-medium", filename="sd3.5_medium.safetensors", local_dir='models')
hf_hub_download(repo_id="stabilityai/stable-diffusion-3.5-large", filename="text_encoders/clip_g.safetensors", local_dir='models')
hf_hub_download(repo_id="stabilityai/stable-diffusion-3.5-large", filename="text_encoders/clip_l.safetensors", local_dir='models')
hf_hub_download(repo_id="stabilityai/stable-diffusion-3.5-large", filename="text_encoders/t5xxl_fp16.safetensors", local_dir='models')
hf_hub_download(repo_id="stabilityai/stable-diffusion-3.5-large", filename="text_encoders/t5xxl_fp8_e4m3fn.safetensors", local_dir='models')

But I still encounter the issue as above:

RuntimeError: The size of tensor a (135) must match the size of tensor b (134) at non-singleton dimension 2
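One thing worth double-checking with that download snippet (an observation about huggingface_hub behaviour, not a confirmed fix for the dimension error): local_dir keeps the repo's subfolder, so the text encoders land under models/text_encoders/ with their original names, while the note in sd3_infer.py expects them directly in models/ as clip_g.safetensors, clip_l.safetensors and t5xxl.safetensors. A possible relocation sketch (paths and target names are assumptions based on that note):

# Move/rename the downloaded text encoders to the layout sd3_infer.py describes.
from pathlib import Path

moves = {
    "models/text_encoders/clip_g.safetensors": "models/clip_g.safetensors",
    "models/text_encoders/clip_l.safetensors": "models/clip_l.safetensors",
    "models/text_encoders/t5xxl_fp16.safetensors": "models/t5xxl.safetensors",
}
for src, dst in moves.items():
    src_path, dst_path = Path(src), Path(dst)
    if src_path.exists() and not dst_path.exists():
        src_path.rename(dst_path)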
