Configuration "preload_from_hub" does not support "repo_type" (preload datasets)? #1536

Open
cboettig opened this issue Dec 18, 2024 · 1 comment

Comments

@cboettig

Bug description.

According to the documentation, the Spaces configuration supports the field preload_from_hub. The contents of this field get passed to huggingface-cli download, which defaults to --repo-type model. There appears to be no way to configure it to pre-load a "dataset" repo type instead, despite the documentation's description:

This is particularly useful for Spaces that rely on large models or datasets that would otherwise need to be downloaded at runtime.

Describe the expected behaviour

Provide (or document) a method to pre-load datasets. The documentation examples all use the model repo type.

Additional information

I know we can work around this by committing the data to the Space, downloading it at runtime, or storing the data in a model repo instead of a dataset repo, but these are clearly not ideal solutions.
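For reference, the runtime-download workaround could look roughly like the sketch below. It only assembles and runs a huggingface-cli download call with an explicit --repo-type, which is the option preload_from_hub does not seem to expose. The repo id "user/my-data" and the helper names are hypothetical, and this is not how Spaces implements preloading.

```python
import subprocess

# Repo types accepted by `huggingface-cli download --repo-type`.
VALID_REPO_TYPES = {"model", "dataset", "space"}


def build_download_command(repo_id, filenames=(), repo_type="dataset"):
    """Return the argv list for a `huggingface-cli download` call.

    Hypothetical helper: it builds the command but does not run it.
    """
    if repo_type not in VALID_REPO_TYPES:
        raise ValueError(f"unsupported repo_type: {repo_type!r}")
    return [
        "huggingface-cli",
        "download",
        repo_id,
        *filenames,
        "--repo-type",
        repo_type,
    ]


def download_at_startup(repo_id, filenames=(), repo_type="dataset"):
    """Run the download when the Space starts (needs network access)."""
    subprocess.run(
        build_download_command(repo_id, filenames, repo_type),
        check=True,  # raise if the CLI exits with a non-zero status
    )
```

Calling download_at_startup("user/my-data") at the top of the app would fetch the dataset repo into the local cache on every cold start, which is exactly the cost that preloading at build time is meant to avoid.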

@julien-c
Member

maybe cc @apolinario given #1156
