Performance of llama.cpp with Vulkan #10879
netrunnereve started this conversation in General
Replies: 3 comments
-
AMD FirePro W8100
-
AMD RX 470
-
Ubuntu 24.04, with Vulkan and CUDA installed from the official APT packages.
build: 4da69d1 (4351) vs CUDA on the same build/setup
build: 4da69d1 (4351)
-
This is similar to the Apple Silicon benchmark thread, but for Vulkan! Many improvements have been made to the Vulkan backend in the past month and I think it's good to consolidate and discuss our results here.
We'll be testing the Llama 2 7B model, like the other thread, to keep things consistent, and using Q4_0 as it's simple to compute and small enough to fit on a 4 GB GPU. You can download it here.
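As a rough sanity check on the "fits on a 4 GB GPU" point, here is a back-of-the-envelope size estimate. It is a sketch under two assumptions: every tensor is stored as Q4_0 (blocks of 32 weights, each block holding a 2-byte fp16 scale plus 16 bytes of 4-bit values), and the model has roughly 6.74 billion parameters. The names `q4_0_size_gib` and `N_PARAMS` are made up for illustration.

```python
# Rough Q4_0 size estimate; see the assumptions in the text above.
QK4_0 = 32                      # weights per Q4_0 block
BLOCK_BYTES = 2 + QK4_0 // 2    # 2-byte fp16 scale + 16 bytes of 4-bit quants


def q4_0_size_gib(n_params: float) -> float:
    """Approximate weight storage in GiB if every tensor were Q4_0."""
    bits_per_weight = BLOCK_BYTES * 8 / QK4_0   # 18 bytes per 32 weights
    return n_params * bits_per_weight / 8 / 2**30


N_PARAMS = 6.74e9  # assumption: approximate Llama 2 7B parameter count

size = q4_0_size_gib(N_PARAMS)
print(f"~{size:.2f} GiB of weights at {BLOCK_BYTES * 8 / QK4_0} bits per weight")
```

This comes out to about 3.5 GiB. The real file is usually a little larger because some tensors are kept at higher precision, and the KV cache and compute buffers need memory on top of the weights, so a 4 GB card has little headroom.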
Instructions
Share your llama-bench results along with the git hash and Vulkan info string in the comments. Feel free to try other models, compare backends, and so forth, but only valid runs will be placed on the scoreboard.
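If you want to compare several runs before posting, a small script can pull the throughput numbers out of the Markdown table that llama-bench prints. This is only a sketch: it assumes the last column holds `mean ± stddev` in tokens per second and the second-to-last column holds the test name. The `SAMPLE` text uses placeholder values, not real measurements, and `parse_bench_rows` is a hypothetical helper, not part of llama.cpp.

```python
import re


def parse_bench_rows(markdown: str) -> dict[str, float]:
    """Map each test name (e.g. 'pp512') to its mean tokens per second."""
    results = {}
    for line in markdown.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) < 2:
            continue
        # Assumed layout: the last cell looks like '<mean> ± <stddev>'.
        match = re.fullmatch(r"([0-9.]+)\s*±\s*([0-9.]+)", cells[-1])
        if match:
            results[cells[-2]] = float(match.group(1))
    return results


# Placeholder table: every value here is illustrative, not a benchmark result.
SAMPLE = """
| model | size | params | backend | ngl | test | t/s |
| ----- | ---: | -----: | ------- | --: | ---: | --: |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | Vulkan | 100 | pp512 | 100.00 ± 1.00 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | Vulkan | 100 | tg128 | 10.00 ± 0.10 |
"""

parsed = parse_bench_rows(SAMPLE)
print(parsed)
```

The header and separator rows are skipped automatically because their last cell does not match the `mean ± stddev` pattern.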
Vulkan Scoreboard for Llama 2 7B, Q4_0