Eval bug: Out of Memory Error with Qwen2-VL on Windows #10973
Comments
What are the dimensions of the image you're using?
I encountered the same error.
It seems like the issue was indeed related to the image resolution.
Actually, it's an image-resolution problem: the qwen2-vl model's vision part can handle arbitrary-resolution images, which means more tokens are projected to the LLM part. According to the paper, an image of 4132x5858 pixels means (4132x5858)/(14x14)/4 = 30874 tokens, which demands very large RAM and causes very slow inference.
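The arithmetic in the comment above can be sketched as a small helper. This is a rough approximation, not the model's exact preprocessing: the patch size (14) and the 2x2 token merge come from the comment and the Qwen2-VL paper, while the function name and the floor division are this sketch's own choices (the real pipeline first resizes the image to dimensions divisible by the patch/merge factors).

```python
def approx_vision_tokens(width: int, height: int,
                         patch: int = 14, merge: int = 2) -> int:
    """Approximate the number of vision tokens Qwen2-VL feeds to the LLM.

    Each (patch x patch) pixel block becomes one patch embedding, and
    (merge x merge) patch embeddings are merged into one LLM token.
    """
    patches = (width * height) // (patch * patch)
    return patches // (merge * merge)

# The 4132x5858 image from this issue: ~30874 tokens,
# versus ~2304 for a 1344x1344 downscale of the same image.
print(approx_vision_tokens(4132, 5858))  # 30874
print(approx_vision_tokens(1344, 1344))  # 2304
```

Downscaling the input image before inference reduces the token count quadratically, which is why resizing resolves the out-of-memory error here.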
Name and Version
version: 4391 (9ba399d)
built with MSVC 19.29.30157.0 for
Operating systems
Windows
GGML backends
CPU, CUDA
Hardware
CPU: Intel Core i7-13850HX
GPU: NVIDIA RTX 3500 Ada
RAM: 32 GB
Models
Qwen2-VL-7B-instruct-Q4_K_M.gguf
bartowski/Qwen2-VL-7B-Instruct-GGUF
Problem description & steps to reproduce
I tried running the following command on Windows using both the AVX2 and CUDA binaries downloaded from the releases.
This is for CUDA :
The result was the same with the CPU version.
I tried different options, such as lowering the batch size (-b) or the context size (-c), but it still crashes with the same error.
First Bad Commit
No response
Relevant log output