Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: tokenize each request individually and increase warmup image size #2802

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

drbh
Copy link
Collaborator

@drbh drbh commented Dec 5, 2024

This PR resolves some small issues with qwen2-vl.

  1. doubles the size of WARMUP_IMAGE_BASE64 from 20x20px to 40x40px (meets qwens minimal requirement without hacky fix)
  2. removes hacky fix to double the warmup image
  3. prefer tokenizing each request instead of the whole batch at once. this change allows r.truncate to be passed for each request - as previouslly it was not respected when one of the request was smaller than others in the batch.
  4. sets max_s to the max of max_s or the input size. This is required so the rotary and create self._cos_cached of the correct size in relation to the position ids.

these changes resolve a startup issue reproducible with:

text-generation-launcher \
--model-id Qwen/Qwen2-VL-2B-Instruct \
--max-input-tokens 40 \
--max-batch-prefill-tokens 50 \
--max-total-tokens 51

*(note the underlying issue triggers when max-input-tokens is less than max-batch-prefill-tokens)

@Narsil
Copy link
Collaborator

Narsil commented Dec 9, 2024

  1. from 20x20px to 40x40px (meets qwens minimal requirement without hacky fix)

I do not understand, why should we impose anything on the user for the images. If 20px x 20x is not supported we should:

  • Rescale the image seemlessly and correctly infer on it.
  • Reject the image with a proper error message.
    User's shouldn't have to know anything about the model's internals, 20x20px should be ok imho.

@drbh drbh force-pushed the improve-qwen2-vl-warmup branch from 60b9c18 to 3cc8297 Compare December 9, 2024 20:31
@drbh drbh force-pushed the improve-qwen2-vl-warmup branch from 32a9564 to a3049f1 Compare December 9, 2024 21:32
@drbh drbh requested a review from Narsil December 16, 2024 15:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants