Slower image encode the lower the quantization #85

kfuseini · 2023-11-13T22:08:04Z

I'm running this model clip-vit-base-patch32_ggml on my intel mac and it looks like the lower the quantization the slower image encoding is. I tried the main clip-vit-base-patch32_ggml-model-f32.gguf model and the q8_0 and q4_0 variants.

These are the encode times I get for a batch of 4 images:

clip-vit-base-patch32_ggml-model-f32.gguf
Avg Batch Img Encode Time: 272.21ms

clip-vit-base-patch32_ggml-model-f16.gguf
Avg Batch Img Encode Time: 665.07ms

clip-vit-base-patch32_ggml-model-q8_0.gguf
Avg Batch Img Encode Time: 333.96ms

clip-vit-base-patch32_ggml-model-q5_1.gguf
Avg Batch Img Encode Time: 322.71ms

clip-vit-base-patch32_ggml-model-q5_0.gguf
Avg Batch Img Encode Time: 354.86ms

clip-vit-base-patch32_ggml-model-q4_1.gguf
Avg Batch Img Encode Time: 330.20ms

clip-vit-base-patch32_ggml-model-q4_0.gguf
Avg Batch Img Encode Time: 539.32ms

f16 looks like an outlier, taking the most time.
But looking at f32(272.21ms) -> q8_0(333.96ms) -> q5_0(354.86ms) -> q4_0(539.32ms), time is getting worse. Its better with the _1 variants though.

Anyone know if this expected or is there something wrong?

The text was updated successfully, but these errors were encountered:

ellonde · 2024-01-05T14:26:36Z

Im having the same issue it takes a very long time to encode images. Im getting an average of 830 ms for my q5_0 model on a rather old(2019, i7) mac.

Any information regarding this would be much appreciated :)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Slower image encode the lower the quantization #85

Slower image encode the lower the quantization #85

kfuseini commented Nov 13, 2023 •

edited

Loading

ellonde commented Jan 5, 2024

Slower image encode the lower the quantization #85

Slower image encode the lower the quantization #85

Comments

kfuseini commented Nov 13, 2023 • edited Loading

ellonde commented Jan 5, 2024

kfuseini commented Nov 13, 2023 •

edited

Loading