rwkv_clone_context thread-safety when using cuBLAS #182

Open
eduardsui opened this issue Sep 14, 2024 · 0 comments

Hello,

I'm trying to use rwkv.cpp from two different threads. To do this, I load the model and then use two context clones (created via rwkv_clone_context). Everything works fine when the threads call rwkv_eval one at a time, but when both threads call it simultaneously, I get this error:

GGML_ASSERT: /root/rwkv.cpp/ggml/src/ggml-cuda.cu:409: ptr == (void *) (pool_addr + pool_used)
GGML_ASSERT: /root/rwkv.cpp/ggml/src/ggml-cuda.cu:409: ptr == (void *) (pool_addr + pool_used)

It seems that alloc/free are called "out of order" for the two contexts. Any idea how to solve this?

Thanks!
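
For illustration only, here is a minimal sketch of the suspected failure mode and one possible workaround. It assumes the CUDA pool behaves like a stack (LIFO) allocator shared by both contexts, which is what the assertion message suggests but which I have not confirmed in the ggml source. `ToyPool`, `guarded_eval`, and `run_two_threads` are hypothetical names, not part of rwkv.cpp or ggml; in a real program the marked line would be the call to rwkv_eval.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>

// Toy stand-in for a stack-style (LIFO) memory pool. This is NOT ggml's code;
// it only mimics the invariant named in the assertion message.
struct ToyPool {
    std::size_t used = 0;
    bool lifo_ok = true;

    std::size_t alloc(std::size_t size) {
        std::size_t offset = used;
        used += size;
        return offset;
    }

    void release(std::size_t offset, std::size_t size) {
        assert(used >= size);
        used -= size;
        // Analogue of: ptr == (void *) (pool_addr + pool_used)
        if (offset != used) lifo_ok = false;
    }
};

static ToyPool g_pool;           // assumed to be shared by all contexts
static std::mutex g_eval_mutex;  // hypothetical application-level lock

// Hypothetical wrapper: only one thread at a time may allocate, evaluate, and free.
void guarded_eval(std::size_t scratch_size) {
    std::lock_guard<std::mutex> lock(g_eval_mutex);
    std::size_t offset = g_pool.alloc(scratch_size);
    std::this_thread::yield();  // stands in for the real rwkv_eval call
    g_pool.release(offset, scratch_size);
}

// Runs two workers concurrently and reports whether the LIFO invariant held.
bool run_two_threads(int iterations) {
    auto worker = [iterations](std::size_t size) {
        for (int i = 0; i < iterations; ++i) guarded_eval(size);
    };
    std::thread a(worker, 64);
    std::thread b(worker, 128);
    a.join();
    b.join();
    return g_pool.lifo_ok && g_pool.used == 0;
}
```

In this toy model, allocating for context A, then for context B, and then freeing A's block first breaks the invariant, which matches the "out of order" behaviour described above. Holding the lock around each evaluation keeps every alloc/free pair together, at the cost of the two threads no longer evaluating in parallel.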

eduardsui changed the title from "rwkv_clone_context when using cuBLAS" to "rwkv_clone_context thread-safety when using cuBLAS" on Sep 14, 2024