feature: support bm42 embeddings #338
base: main
Conversation
BM42 is super controversial.
To me, Qdrant came across as a pretty shady actor here, so I would take everything they publish with a big grain of salt.
Fair. We really just want support for it so we can test on our own datasets and verify how good or bad it actually is. I agree a lot of that launch was shady, especially the Quickwit bench portion. Any feedback on the actual implementation?
Because the router function [...] Which means that Infer has to create and send a very large dense vector across a channel. This will impact adding support for other sparse models as well as this model. It would be a lot better if [...]
I don't have enough context on what the best approach for this is, but it would open the opportunity for supporting other sparse embedding models that have larger keyspaces.
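To make the keyspace concern concrete, here is a minimal Rust sketch of a sparse representation that stores only the non-zero entries as index/value pairs. The `SparseValue` type and the `to_sparse` helper are hypothetical names chosen for illustration, not the types used in this PR or in the router.

```rust
/// Hypothetical sparse entry: a key (token) id and its weight.
#[derive(Debug, Clone, PartialEq)]
struct SparseValue {
    index: u64,
    value: f32,
}

/// Keep only the non-zero entries of a dense vector.
fn to_sparse(dense: &[f32]) -> Vec<SparseValue> {
    dense
        .iter()
        .enumerate()
        .filter(|(_, v)| **v != 0.0)
        .map(|(i, v)| SparseValue { index: i as u64, value: *v })
        .collect()
}

fn main() {
    // Converting a small dense vector keeps just the non-zero positions.
    let sparse = to_sparse(&[0.0, 2.0, 0.0, 0.5]);
    println!("{:?}", sparse);

    // A sparse embedding over a huge keyspace needs one entry per non-zero key,
    // regardless of how large the largest index is.
    let embedding = vec![
        SparseValue { index: 7, value: 0.5 },
        SparseValue { index: u64::MAX - 1, value: 1.25 },
    ];
    println!("{} entries", embedding.len());
}
```

With this shape, memory use grows with the number of non-zero keys rather than with the size of the keyspace, which is what would make models with larger keyspaces practical.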
Why is that a problem? Moving a Vec shouldn't be a concern; it's not expensive as far as I know.
Creating a vector with a maximum of 9223372036854776000 f32s is pretty expensive in this case, even if they are all zero-filled.
Ok, so when you say "Which means that Infer has to create and send a very large dense vector across a channel." you are only referring to the creation part? As I said, the moving part ("send across a channel") is not an issue.
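The distinction between creating and moving can be checked with a small sketch. It assumes a plain `std::sync::mpsc` channel rather than whatever channel type the router actually uses, and `buffer_survives_send` is a hypothetical helper written for this illustration. Sending a Vec moves only its pointer, length, and capacity; the heap buffer itself is not copied.

```rust
use std::sync::mpsc;
use std::thread;

/// Sends a Vec through a channel and reports whether the receiver sees
/// the same heap buffer address that the sender allocated.
fn buffer_survives_send(len: usize) -> bool {
    // The allocation (and zero-fill) cost is paid here, once.
    let data = vec![0.0f32; len];
    let before = data.as_ptr() as usize;

    let (tx, rx) = mpsc::channel::<Vec<f32>>();
    let handle = thread::spawn(move || {
        let received = rx.recv().expect("sender dropped");
        received.as_ptr() as usize
    });

    // Moving the Vec into the channel transfers ownership of the buffer.
    tx.send(data).expect("receiver dropped");
    let after = handle.join().expect("receiver thread panicked");

    before == after
}

fn main() {
    println!("same buffer after send: {}", buffer_survives_send(1_000_000));
}
```

So the cost that matters in this discussion is the allocation of the dense vector, not the transfer across the channel.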
What does this PR do?
This PR adds support for Qdrant/all_miniLM_L6_v2_with_attentions.
I followed https://github.com/Anush008/bm42-rs as a reference and added a new pooling method, bm42.
Tested by running the following command:
HF_TOKEN=hf_********************************** cargo run --no-default-features -F http -F ort -- --model-id Qdrant/all_miniLM_L6_v2_with_attentions --pooling bm42 --port 5888
Fixes #337
Before submitting
Related issue: Support for Qdrant bm42 document encoder #337
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.