One or several metadata.jsonl were found, but not in the same directory or in a parent directory of #7337

Open
mst272 opened this issue Dec 17, 2024 · 0 comments


Describe the bug

Loading a dataset with ImageFolder and a metadata.jsonl file fails. I downloaded liuhaotian/LLaVA-CC3M-Pretrain-595K from Hugging Face to a local directory. Following the tutorial at https://huggingface.co/docs/datasets/image_dataset#image-captioning, I placed only images.zip and the metadata.jsonl describing the images in the same folder. Loading then fails with the error: One or several metadata.jsonl were found, but not in the same directory or in a parent directory of.

The data in my jsonl file is as follows:

{"id": "GCC_train_002448550", "file_name": "GCC_train_002448550.jpg", "conversations": [{"from": "human", "value": "\nProvide a brief description of the given image."}, {"from": "gpt", "value": "a view of a city , where the flyover was proposed to reduce the increasing traffic on thursday ."}]}
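As a quick sanity check on the metadata itself, a short script can confirm that every line parses as JSON and carries the `file_name` field that ImageFolder uses to link a row to its image. This is only a minimal sketch: the helper name `check_metadata_lines` is hypothetical, and the sample row is a shortened copy of the one above with the conversations left empty.

```python
import json


def check_metadata_lines(lines):
    """Return (line_number, problem) tuples for lines that are not usable metadata rows."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            row = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        # Each row must be an object with a "file_name" key.
        if not isinstance(row, dict) or "file_name" not in row:
            problems.append((number, "missing 'file_name'"))
    return problems


# Shortened copy of the row shown above (conversations omitted for brevity).
sample = '{"id": "GCC_train_002448550", "file_name": "GCC_train_002448550.jpg", "conversations": []}'
print(check_metadata_lines([sample]))
```

An empty result for the real file would suggest the row format is not the problem and that the error comes from where the files sit relative to each other.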

Steps to reproduce the bug

from datasets import load_dataset
image = load_dataset("imagefolder", data_dir="data/opensource_data")
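One possible cause, offered as an assumption rather than a confirmed diagnosis, is that the images stay inside images.zip while metadata.jsonl sits outside the archive, so the loader does not see them in the same directory tree. The sketch below extracts the archive into a fresh folder, copies the metadata file next to the images, and lists any `file_name` entries that still have no matching file. The function name `prepare_imagefolder` and the output directory are hypothetical, and only standard-library calls are used.

```python
import json
import shutil
import zipfile
from pathlib import Path


def prepare_imagefolder(archive_path, metadata_path, out_dir):
    """Extract the images and place metadata.jsonl beside them.

    Returns the file_name values from the metadata that have no file on disk.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Unpack the images so they live in a plain directory (trusted archive assumed).
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(out_dir)

    # Put the metadata file in the same directory as the extracted images.
    target = out_dir / "metadata.jsonl"
    shutil.copy(metadata_path, target)

    # Report rows whose image cannot be found relative to the metadata file.
    missing = []
    with open(target, encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue
            file_name = json.loads(line)["file_name"]
            if not (out_dir / file_name).is_file():
                missing.append(file_name)
    return missing
```

If the returned list is empty, retrying `load_dataset("imagefolder", data_dir=...)` with the new folder would be the next step; whether that resolves the error on datasets 3.2.0 has not been verified here. A non-empty list usually means the archive has an internal subfolder, in which case the `file_name` values would need that prefix.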

Expected behavior

The dataset loads successfully, with each image paired with its entry from metadata.jsonl.

Environment info

datasets==3.2.0
