
CPU Stats for when it's possible #15

Open
eren23 opened this issue May 22, 2023 · 2 comments
@eren23

eren23 commented May 22, 2023

Running sentence-transformers on a CPU is also possible for various tasks, especially in consumer-grade applications. People are running these models without any GPU acceleration, which might be good to mention in the section.

We have been using a sentence-transformer from the beginning, and even though it's a small open-source project, all the users I know run it on their CPUs.

@waleedkadous
Collaborator

Weirdly, I tried it myself and it was considerably slower: roughly 20x. But I think that would be a really good section to add, especially since we are also adding more information on llama.cpp (which we are starting to benchmark now). Give us two weeks and we'll see if we can do it.

@lcrmorin

I would have appreciated finding this number too. From personal experience (see: https://www.kaggle.com/code/lucasmorin/mistral-7-b-instruct-electricity-co2-consumption), the run time for the same query is about 10x, which generally makes CPU usage impractical (or impossible).
