Where to learn about threading logic in llama.cpp #10770
Unanswered
Nick-infinity asked this question in Q&A
Replies: 0 comments
Hello, I am trying to understand how multithreading works in llama.cpp/ggml and where the control point is. Looking at the operator code, I can only see the microkernel. I assume this microkernel is called by N threads, which split the work into chunks and run it in parallel. Can someone please help me better understand how multiple threads are controlled within an operator, and who owns and makes these N calls?
Thanks,
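For context, ggml CPU kernels follow a convention where every worker thread calls the same operator function and receives a thread index (`ith`) and a thread count (`nth`); the kernel then computes which slice of rows it owns. Launching the `nth` workers is done by the graph-compute code of the CPU backend, not inside the operator itself. Below is a minimal, self-contained sketch of that pattern; the `compute_params` struct mirrors ggml's `ith`/`nth` convention, but all other names are illustrative and not the actual ggml source.

```c
// Illustrative sketch of the ith/nth work-splitting pattern used by ggml
// CPU kernels. Only the ith/nth convention is taken from ggml; everything
// else is simplified for clarity and is not the actual llama.cpp/ggml code.
#include <pthread.h>
#include <stdio.h>

#define N_ROWS    8
#define N_COLS    4
#define N_THREADS 3

struct compute_params {
    int ith;   // index of this thread (0 .. nth-1)
    int nth;   // total number of threads working on the op
};

static float src[N_ROWS][N_COLS];
static float dst[N_ROWS];

// The "microkernel": every thread calls the same function; ith/nth decide
// which slice of rows this particular thread is responsible for.
static void compute_forward_row_sum(const struct compute_params *params) {
    // contiguous chunking: rows are split as evenly as possible across nth threads
    const int rows_per_thread = (N_ROWS + params->nth - 1) / params->nth;
    const int ir0 = params->ith * rows_per_thread;
    int ir1 = ir0 + rows_per_thread;
    if (ir1 > N_ROWS) ir1 = N_ROWS;

    for (int ir = ir0; ir < ir1; ++ir) {
        float sum = 0.0f;
        for (int ic = 0; ic < N_COLS; ++ic) {
            sum += src[ir][ic];
        }
        dst[ir] = sum;
    }
}

// Worker entry point: in ggml this role is played by the graph-compute
// thread loop, which walks the graph nodes and calls the op kernels.
static void *worker(void *arg) {
    compute_forward_row_sum((const struct compute_params *)arg);
    return NULL;
}

int main(void) {
    for (int r = 0; r < N_ROWS; ++r)
        for (int c = 0; c < N_COLS; ++c)
            src[r][c] = (float)(r + 1);

    pthread_t threads[N_THREADS];
    struct compute_params params[N_THREADS];

    // The "owner" of the N calls: a dispatcher that launches nth workers,
    // each with its own ith. In ggml this lives in the CPU backend's
    // graph-compute code, not in the operator itself.
    for (int i = 0; i < N_THREADS; ++i) {
        params[i] = (struct compute_params){ .ith = i, .nth = N_THREADS };
        pthread_create(&threads[i], NULL, worker, &params[i]);
    }
    for (int i = 0; i < N_THREADS; ++i) {
        pthread_join(threads[i], NULL);
    }

    for (int r = 0; r < N_ROWS; ++r) {
        printf("row %d sum = %.1f\n", r, dst[r]);
    }
    return 0;
}
```

In the real code the dispatcher additionally has to synchronize the workers between graph nodes, so that a downstream operator only starts once its inputs have been fully written by all threads.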