I'm running the clip-vit-base-patch32_ggml model on my Intel Mac, and it looks like the lower the quantization, the slower image encoding gets. I tried the main clip-vit-base-patch32_ggml-model-f32.gguf model as well as the q8_0 and q4_0 variants.
These are the encode times I get for a batch of 4 images:
f16 looks like an outlier, taking the most time. But going from f32 (272.21 ms) -> q8_0 (333.96 ms) -> q5_0 (354.86 ms) -> q4_0 (539.32 ms), times get steadily worse. The _1 variants do better, though.
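For reference, here is my understanding of what a q4_0 weight actually stores, as a toy Python sketch (not the real ggml kernels, which are SIMD-optimized C): each block of 32 values keeps one float scale plus 4-bit quants, so every matmul has to unpack and rescale nibbles, and how fast that is depends entirely on whether optimized kernels exist for the CPU.

```python
# Toy sketch of ggml's q4_0 block format: blocks of 32 floats become
# one scale `d` plus 32 unsigned 4-bit quants (stored biased by +8).
# This is illustrative Python, not the actual ggml implementation.
QK4_0 = 32  # block size used by q4_0 in ggml

def quantize_q4_0(values):
    """Quantize one block of QK4_0 floats to (scale, 4-bit quants)."""
    assert len(values) == QK4_0
    amax = max(values, key=abs)            # value with largest magnitude
    d = amax / -8 if amax else 0.0         # scale so quants land in [-8, 7]
    inv = 1.0 / d if d else 0.0
    # round to nearest, clamp to the 4-bit range [0, 15]
    quants = [max(0, min(15, int(v * inv + 8.5))) for v in values]
    return d, quants

def dequantize_q4_0(d, quants):
    """Undo the bias and rescale -- the work every quantized dot product pays."""
    return [(q - 8) * d for q in quants]
```

The per-block unpacking is cheap per element but happens for every weight on every forward pass, which is why q4_0 can end up slower than f32 on hardware where the 4-bit path isn't well optimized.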
Does anyone know if this is expected, or is something wrong?