
[FEAT] DDUF format #10037

Open · wants to merge 62 commits into base: main
Conversation

@SunMarc SunMarc commented Nov 27, 2024

What does this PR do?

This PR adds support for the DDUF format. More docs about this format here.

Requires transformers PR: https://github.com/huggingface/transformers/pull/35217/files (merged)
Requires huggingface_hub PR: huggingface/huggingface_hub#2692 (released)

Example

Load from the hub

from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "DDUF/stable-diffusion-3-medium-diffusers-DDUF",
    dduf_file="stable-diffusion-3-medium-diffusers.dduf",
)
pipe = pipe.to("cuda")

image = pipe(
    "A cat holding a sign that says hello world",
    negative_prompt="",
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]
image.save("cat.png")

Save to DDUF format

import torch
from diffusers import StableDiffusion3Pipeline
from huggingface_hub import export_folder_as_dduf

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.float16,
)

pipe.save_pretrained(save_directory="stable-diffusion-3-medium-diffusers")

export_folder_as_dduf("stable-diffusion-3-medium-diffusers.dduf", folder_path="stable-diffusion-3-medium-diffusers")
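As an aside, since a .dduf archive is a plain (uncompressed) zip under the hood, its contents can be listed with the standard library alone. A minimal sketch; the demo.dduf file and its entry names are made up for illustration, and the official save/load tooling lives in huggingface_hub:

```python
import zipfile

def list_dduf_entries(dduf_path: str) -> list[str]:
    """Return the entry names stored in a DDUF archive (which is a plain zip)."""
    with zipfile.ZipFile(dduf_path) as zf:
        return zf.namelist()

# Build a tiny stand-in archive to demonstrate; DDUF stores entries uncompressed.
with zipfile.ZipFile("demo.dduf", "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("model_index.json", "{}")
    zf.writestr("transformer/config.json", "{}")

entries = list_dduf_entries("demo.dduf")
print(entries)
```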

TODO list:

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@SunMarc SunMarc changed the title # [FEAT] DDUF format [FEAT] DDUF format Nov 27, 2024
@sayakpaul sayakpaul changed the title [FEAT] DDUF format [FEAT] [WIP] DDUF format Nov 28, 2024
@sayakpaul sayakpaul marked this pull request as draft November 28, 2024 08:56
@sayakpaul sayakpaul left a comment

Thanks for starting this! Apart from the initial comments I left, I have a few higher level questions:

We're assuming that the archive will contain all the pipeline-related files in the diffusers format. I am fine with that. This means we'll be shipping DDUF only at the pipeline-level for now, correct?

Comment on lines 196 to 197
dduf_format: bool = False,
dduf_filename: Optional[Union[str, os.PathLike]] = None,
Member

As mentioned over Slack, why do we need two arguments here? How about just dduf_filename?

Member Author

I will just use that then!

@pcuenca pcuenca Nov 28, 2024

I think that most, if not all, of the components of the file name should be inferred. One of the reasons to use DDUF is to provide confidence about the contents based on filename inspection, so we should try to make it work with something like an optional prefix and build the rest automatically. So something like dduf_name="my-lora" would result in the library saving to "my-lora-stable-diffusion-3-medium-diffusers-fp16.dduf", or something like that.

Member

OK to use some kind of convention, but I don't think it should be an absolute requirement, i.e. people can rename their files if they want to (but the metadata should be inside the file anyway).

Member

Yeah that is the plan.

src/diffusers/pipelines/pipeline_utils.py: 5 outdated review threads (resolved)

sayakpaul commented Nov 28, 2024

I just tried out bitsandbytes and it seems to work as well. Checkpoint: https://huggingface.co/DDUF/Flux.1-Dev-Tnf4-TE2NF4 (private for now)

Serialization
from diffusers import FluxTransformer2DModel, DiffusionPipeline 
from diffusers import BitsAndBytesConfig
from transformers import BitsAndBytesConfig as BnbConfig
from transformers import T5EncoderModel 
from huggingface_hub import create_repo, upload_folder
import torch 

repo_id = "DDUF/Flux.1-Dev-Tnf4-TE2NF4"
repo_id = create_repo(repo_id=repo_id, exist_ok=True).repo_id
dtype = torch.bfloat16

nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=dtype,
)
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev", 
    subfolder="transformer", 
    quantization_config=nf4_config, 
    torch_dtype=dtype
)
nf4_config = BnbConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=dtype,
)
text_encoder_2 = T5EncoderModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev", 
    subfolder="text_encoder_2", 
    quantization_config=nf4_config,
    torch_dtype=dtype
)

pipeline = DiffusionPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", 
    transformer=transformer, 
    text_encoder_2=text_encoder_2,
    torch_dtype=dtype
)
pipeline.enable_model_cpu_offload()

image = pipeline(
    "a cat standing by the sea waiting for its companion to come and console it.",
    guidance_scale=4.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.manual_seed(0)
).images[0]
image.save("dduf_out.png")

folder = "flux.1-dduf"
pipeline.save_pretrained(folder, dduf_filename="Flux.1-Dev-Tnf4-TE2NF4.dduf")
upload_folder(repo_id=repo_id, folder_path=folder, create_pr=True)
Loading
from diffusers import DiffusionPipeline 
import torch

repo_id = "DDUF/Flux.1-Dev-Tnf4-TE2NF4"
pipeline = DiffusionPipeline.from_pretrained(
    repo_id, dduf="Flux.1-Dev-Tnf4-TE2NF4.dduf", torch_dtype=torch.bfloat16
)
pipeline.enable_model_cpu_offload()

image = pipeline(
    "a cat standing by the sea waiting for its companion to come and console it.",
    guidance_scale=4.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.manual_seed(0)
).images[0]
image.save("dduf_out.png")

# remove tar archive to free memory
os.remove(tar_file_path)
# rename folder to match the name of the dduf archive
os.rename(extract_to, tar_file_path)
@pcuenca pcuenca Nov 28, 2024

I think that expanding the archive and then removing it could be problematic.

  • If I understand correctly, we would be removing the symlink to the blob, but not the blob itself, so we won't actually save disk space.
  • If we did remove the blob, I think it'd trigger another download the next time we call .from_pretrained.
  • Removing the reference could maybe have consequences when checking for newer versions on the Hub (not sure).

When we change to using zips, I wonder if we could extract files progressively in memory during the load process, keeping the big .dduf file as is. Having a disk snapshot is convenient, but I see no easy way to avoid duplicating disk space.
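Because zip members can be read individually, a loader can pull a single entry straight out of the archive without unpacking anything to disk. A minimal stdlib sketch of that idea, using an in-memory archive as a stand-in for a real .dduf file (the entry name and contents are illustrative):

```python
import io
import json
import zipfile

# An in-memory stand-in for an uncompressed .dduf archive
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("model_index.json", json.dumps({"_class_name": "StableDiffusion3Pipeline"}))

# Read a single entry straight from the archive, leaving the archive itself intact
with zipfile.ZipFile(buf) as zf:
    with zf.open("model_index.json") as f:
        config = json.load(f)

print(config["_class_name"])
```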

Member Author

I agree. I will check if it is possible to stream directly from the zip, or at least extract progressively in memory as you said.

Contributor

Member Author

Thanks for the link! I'll have a look!

Member

BTW, on the Hub side we want to constrain the accepted types of zip quite a lot:

  • no compression (store only), so that XetHub can deduplicate the contents well
  • single disk (internal simplification of the zip handling)
  • no encryption, etc.
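The constraints above can be checked mechanically. A hedged sketch of such a validator (check_dduf_zip is a hypothetical helper, not Hub code; encryption is signalled by bit 0 of a zip entry's flag_bits):

```python
import io
import zipfile

def check_dduf_zip(data: bytes) -> None:
    """Reject archives violating the constraints: every entry must be stored
    uncompressed and unencrypted (encryption sets bit 0 of flag_bits)."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            if info.compress_type != zipfile.ZIP_STORED:
                raise ValueError(f"{info.filename}: compressed entries are not allowed")
            if info.flag_bits & 0x1:
                raise ValueError(f"{info.filename}: encrypted entries are not allowed")

# A stored (uncompressed) archive passes...
good = io.BytesIO()
with zipfile.ZipFile(good, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("model_index.json", "{}")
check_dduf_zip(good.getvalue())

# ...while a deflated one is rejected.
bad = io.BytesIO()
with zipfile.ZipFile(bad, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("model_index.json", "{}" * 100)
try:
    check_dduf_zip(bad.getvalue())
    raised = False
except ValueError:
    raised = True
print(raised)
```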

Collaborator

To ensure the restrictions are met, we'll provide tooling in Python to save/load DDUF files in a consistent way: huggingface/huggingface_hub#2692. This will be part of huggingface_hub so that it can be easily reused by any library without necessarily having diffusers as a dependency.

@julien-c julien-c left a comment

BTW a bit orthogonal but do you have examples of quantized diffusers repos right now?

@sayakpaul

BTW a bit orthogonal but do you have examples of quantized diffusers repos right now?

This is one example: https://huggingface.co/sayakpaul/FLUX.1-Fill-dev-nf4. Note that it doesn't include all the models needed to run Flux as it's not typical to quantize the VAE and the other CLIP-based text encoder.

src/diffusers/pipelines/pipeline_utils.py: 4 outdated review threads (resolved)

sayakpaul commented Nov 29, 2024

@SunMarc I don't think loading is working:

from diffusers import DiffusionPipeline

repo_id = "DDUF/stable-diffusion-3-medium-diffusers-DDUF"
filename = "stable-diffusion-3-medium-diffusers.dduf"
pipeline = DiffusionPipeline.from_pretrained(repo_id, dduf=filename)

Leads to:

Error
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/fsx/sayak/diffusers/load_dduf.py", line 23, in <module>
    pipeline = DiffusionPipeline.from_pretrained(repo_id, dduf=filename)
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/fsx/sayak/diffusers/src/diffusers/pipelines/pipeline_utils.py", line 776, in from_pretrained
    cached_folder = cls.download(
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/fsx/sayak/diffusers/src/diffusers/pipelines/pipeline_utils.py", line 1379, in download
    config_file = hf_hub_download(
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 862, in hf_hub_download
    return _hf_hub_download_to_cache_dir(
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 925, in _hf_hub_download_to_cache_dir
    (url_to_download, etag, commit_hash, expected_size, head_call_error) = _get_metadata_or_catch_error(
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1376, in _get_metadata_or_catch_error
    metadata = get_hf_file_metadata(
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1296, in get_hf_file_metadata
    r = _request_wrapper(
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 277, in _request_wrapper
    response = _request_wrapper(
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 301, in _request_wrapper
    hf_raise_for_status(response)
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 417, in hf_raise_for_status
    raise _format(EntryNotFoundError, message, response) from e
huggingface_hub.errors.EntryNotFoundError: 404 Client Error. (Request ID: Root=1-674931ef-1fdc6f274e6f862c31db623e;2056ffe1-9bd0-4667-806f-273827e95101)

Entry Not Found for url: https://huggingface.co/DDUF/stable-diffusion-3-medium-diffusers-DDUF/resolve/main/model_index.json.

which is expected.

Update: I have pushed my updates in sayak-dduf which fixes it (it's incomplete but gives a heads-up). Feel free to cherry-pick commits from it. It has progressive extraction with a progressbar as well.

@yiyixuxu yiyixuxu added the roadmap Add to current release roadmap label Dec 3, 2024

SunMarc commented Dec 4, 2024

Update: I have pushed my updates in sayak-dduf which fixes it (it's incomplete but gives a heads-up). Feel free to cherry-pick commits from it. It has progressive extraction with a progressbar as well.

Merged!

@sayakpaul sayakpaul left a comment

I think we're close to merge for this one. Thank you!

Resolved review threads: src/diffusers/configuration_utils.py (2, one outdated), src/diffusers/pipelines/pipeline_utils.py, src/diffusers/utils/hub_utils.py
@sayakpaul

@stevhliu @SunMarc I have added basic docs. LMK.

@stevhliu stevhliu left a comment

Nice, thanks for adding!

docs/source/en/using-diffusers/other-formats.md: 5 outdated review threads (resolved)
@sayakpaul

@DN6 @yiyixuxu I think the PR is ready to be reviewed. @SunMarc is away this week, so expect delays in addressing the more involved comments. I think I can address / comment on some.

@SunMarc SunMarc requested review from yiyixuxu and DN6 December 23, 2024 11:40
@yiyixuxu yiyixuxu left a comment

Thanks for working on this, @SunMarc!
I left a question about the shared dduf checkpoint; looking great otherwise, and I hope to see this merged soon :)

@@ -1453,10 +1488,14 @@ def download(cls, pretrained_model_name, **kwargs) -> Union[str, os.PathLike]:
return snapshot_folder

user_agent = {"pipeline_class": cls.__name__}
if custom_pipeline is not None and not custom_pipeline.endswith(".py"):
if not dduf_file and custom_pipeline is not None and not custom_pipeline.endswith(".py"):
Collaborator

Can we use dduf_file with custom_pipeline? If not yet, let's just throw an error instead, no?

Member Author

cc @sayakpaul as you added this part

Member

Yeah we should throw an error when custom_pipeline and dduf_file both are passed.
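The agreed-on guard could look something like the sketch below; validate_load_kwargs is a hypothetical stand-in for the check that would live at the top of from_pretrained, not the PR's actual code:

```python
def validate_load_kwargs(custom_pipeline=None, dduf_file=None):
    """Fail fast when mutually exclusive loading options are combined."""
    if custom_pipeline is not None and dduf_file is not None:
        raise ValueError(
            "`custom_pipeline` is not supported together with `dduf_file`; "
            "pass only one of the two."
        )

# Passing both should fail loudly instead of silently ignoring one option.
try:
    validate_load_kwargs(custom_pipeline="my_pipeline.py", dduf_file="model.dduf")
    raised = False
except ValueError:
    raised = True
print(raised)
```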

@@ -1313,7 +1342,14 @@ def download(cls, pretrained_model_name, **kwargs) -> Union[str, os.PathLike]:
local_files_only = True
model_info_call_error = e # save error to reraise it if model is not cached locally

if not local_files_only:
if (
Collaborator

I think it is probably easier if we make separate download_dduf / load_dduf_config / _get_dduf_model_file functions, etc., because the logic is so different, and it seems it would only be a few lines of code.

But I don't feel strongly about it; adding to the existing functions also works for me.

Member Author

Member

Yeah decoupling these into different methods like suggested is a nice idea.

@@ -808,7 +811,8 @@ def from_pretrained(cls, pretrained_model_name_or_path: Optional[Union[str, os.P

model = load_flax_checkpoint_in_pytorch_model(model, model_file)
else:
if is_sharded:
# in the case it is sharded, we have already the index
if is_sharded and not dduf_entries:
Collaborator

Do we support dduf_entries for sharded checkpoints?
The is_sharded logic (on line 786) depends on whether the index_file exists in the filesystem, which is just never going to be True for DDUF because all the files are contained in the dduf_entries; it is unclear to me whether this is intended or not.

@SunMarc SunMarc Dec 27, 2024

Indeed, dduf_entries does not support sharded checkpoints. Thanks for noticing this. Maybe it is better to refactor the loading as we discussed a while ago, Sayak, and do the same as transformers, where we just pass the state_dict and the config? WDYT @sayakpaul @yiyixuxu?

Member

Hmm, the loading refactor (#10013) is turning out to be more and more urgent, and @DN6 wants to tackle it. If we can make dduf_entries distinct from is_sharded in this PR, I think that might be the better option to explore for the minimal DDUF PoC.

Labels
roadmap Add to current release roadmap
Projects
Status: In Progress
Development


9 participants