Dear authors,
Thanks for your great work. The maximum text context length of the CLIP text encoder is 77 tokens, yet several captions in Quilt-1M tokenize to more than 77 tokens. How can we use the CLIP text encoder to extract features for these captions?
For your needs, you can try the PMB version of QuiltNet here: https://huggingface.co/wisdomik/QuiltNet-B-16-PMB. "PMB" refers to PubMedBERT, a BERT model with a 256-token context length that was pre-trained on PMC-15M and fine-tuned alongside the image tower on Quilt-1M.
Thank you very much for your quick reply. Regarding the ViT-B-32|GPT-77 version of QuiltNet, how do you handle captions that exceed 77 tokens? Did you apply truncation?
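For context, truncation is the standard way to handle this in CLIP pipelines: OpenAI's `clip.tokenize` accepts a `truncate=True` flag, and open_clip's tokenizer cuts captions to the context length while keeping the end-of-text token in the final slot. Below is a minimal dependency-free sketch of that behavior (the function name is hypothetical; the SOT/EOT ids are from the standard CLIP BPE vocabulary), not the authors' exact code:

```python
# Sketch of CLIP-style caption truncation to a fixed context length.
# CLIP tokenizes text as [SOT] <bpe ids...> [EOT]; over-long sequences are
# clipped and the final slot is forced back to the end-of-text token.
SOT_ID, EOT_ID = 49406, 49407  # ids in the standard CLIP BPE vocabulary

def clip_truncate(token_ids, context_length=77):
    """Truncate or zero-pad a tokenized caption to `context_length` ids."""
    if len(token_ids) > context_length:
        # Keep the first context_length - 1 ids and re-append EOT.
        token_ids = token_ids[:context_length - 1] + [EOT_ID]
    # Zero-pad short sequences up to the fixed length.
    return token_ids + [0] * (context_length - len(token_ids))

# Example: a caption that tokenized to 100 ids is clipped to 77,
# with EOT preserved as the final token.
long_caption = [SOT_ID] + list(range(1, 99)) + [EOT_ID]  # 100 ids
fixed = clip_truncate(long_caption)
```

A consequence worth noting for retrieval: whatever the caption says after the cut-off is simply invisible to the text encoder, which is why the 256-token PubMedBERT variant above can be a better fit for long Quilt-1M captions.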