add Replicate API option #4
Good question! My short-list right now is to focus on the LLMs: I want to add support for Llama 2 and Bard. My next focus (we'll see how it goes) is to add local Llama 2 support through Nx/Bumblebee. I'm not opposed to Replicate support; I'm just not familiar with their API. I'd also like to upgrade to the latest Req, which changes the internal API for streaming responses back.
I'm trying to get into doing some agent stuff, so I'm digging into the text processing side of things for the moment.
Hey @brainlid
Yep, sounds great. I guess the challenge from a lib design perspective is that there's a difference between models and (hosting) platforms, although it's a challenge that's currently masked by the fact that OpenAI is sort of both. I think I favour the way you've got it currently: organising the code around platforms (because that's what determines the API interface code). Then, between-models-on-the-same-platform differences can be handled within each platform's module (e.g.
Yep, agreed, Bumblebee support is a great way to go.
It's pretty standard; from a user (of this lib) perspective you could set up
Also 👍🏻 Anyway, I know you don't need me internet quarterbacking this whole thing, and I'm sure you're aware of all the above challenges. Just wanted to see what the plans were, in case there was an overlap between the way you wanted to take things and my ability to contribute 😄
Yes, that's the idea. That one module, plus the protocols, are used to adapt a specific service like Replicate to the rest of the library. That way nothing else in the library needs to know about how different services work or what they support. Thanks for the Replicate API docs link.
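To make that "one module plus the protocols" idea concrete, here's a rough sketch of how a platform module could adapt Replicate behind a shared protocol. All module, protocol, and function names here are hypothetical illustrations, not the library's actual API:

```elixir
# Hypothetical sketch: a per-platform module adapting Replicate to a
# shared chat protocol. Names are illustrative, not the library's API.
defprotocol ChatModel do
  @doc "Run a chat completion for the given list of messages."
  def call(model, messages)
end

defmodule ChatReplicate do
  defstruct model: "meta/llama-2-70b-chat", temperature: 0.7

  # Replicate models generally take a single prompt string, so the
  # platform module is responsible for flattening structured messages.
  def to_prompt(messages) do
    Enum.map_join(messages, "\n", fn %{role: role, content: content} ->
      "#{role}: #{content}"
    end)
  end
end

defimpl ChatModel, for: ChatReplicate do
  def call(model, messages) do
    prompt = ChatReplicate.to_prompt(messages)
    # A real implementation would POST to Replicate's "create prediction"
    # endpoint here (e.g. via Req) and return the model's output.
    {:ok, %{model: model.model, prompt: prompt}}
  end
end
```

The rest of the library would only ever dispatch through `ChatModel.call/2`, so nothing outside the platform module needs to know how Replicate formats prompts or which features it supports.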
So is there an overlap? 🙂
Yep, there is 😉 I'm on parental leave atm, so finding time is a bit tricky (it might be a week or two), but it's a nice bite-sized chunk of work that I'd be happy to contribute.
…plicate` Ok, as promised in brainlid#4, this commit adds initial (alpha) support for calling Replicate-hosted models (currently this includes llama2 and several variants, a chat-tuned Mistral, and several others, with more added all the time). Tests included.
The main caveats (as mentioned in the module doc) are the lack of support for functions and streaming. It's not clear when they might be added, due to Replicate vs OpenAI platform differences (e.g. Replicate does support streaming, but to a separate user-provided (and client-hosted) endpoint, rather than as SSEs on the "create completion" server endpoint). So it raises some deeper questions about how to paper over platform differences in this lib (as I'm sure adding local Bumblebee support will as well). Do we offer the union of all features and just raise an error if the user tries to do something unsupported on the given platform backend? Or do we smooth over the differences in this lib and hide them from the user?
I've got the basics working here, but wanted to see how you wanted to proceed.
Hey @brainlid, have a look at the Replicate stuff I pushed up here. tl;dr is that it works for a limited subset of features (no streaming, no functions) for now.
Even though it's just OpenAI for now the code is nice and modular and obviously extensible to other hosted LLM providers (🙌🏻).
I'm not sure if there's a roadmap somewhere that I've missed, but Replicate might be a good option for the next "platform" to be added. It's one place that Meta are putting up their various Llama models. However, I think it'd only support the LangChain.Message stuff - there's no function call support in the models as yet. I'd be open to putting together a PR to add Replicate support (their official Elixir client lib uses httpoison, so I guess it'd be better to just call the Replicate API directly using Req).
Would you be interested in accepting it? Happy to discuss implementation strategies, because I know the move from single -> multiple platform options introduces some decisions & tradeoffs.
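For reference, calling Replicate's predictions API directly with Req would look roughly like the sketch below. This is a hedged illustration, not the PR's implementation: the endpoint paths and `status`/`output` fields follow Replicate's public HTTP API, while the module name, polling interval, and helper functions are assumptions.

```elixir
# Sketch of Replicate's create-then-poll flow using Req (requires
# {:req, "~> 0.4"} and a Replicate API token). Module/function names
# are hypothetical; only the pure helpers below are exercised in tests.
defmodule ReplicateClient do
  @base "https://api.replicate.com/v1"

  # Pure helper: the JSON body for the "create prediction" request.
  def prediction_payload(version, input), do: %{version: version, input: input}

  # Pure helper: interpret a polled prediction body.
  def interpret(%{"status" => "succeeded", "output" => out}), do: {:ok, out}
  def interpret(%{"status" => s}) when s in ["failed", "canceled"], do: {:error, s}
  def interpret(_), do: :pending

  # Network flow: create the prediction, then poll until it finishes.
  # Note: Replicate can stream, but only to a caller-hosted webhook,
  # not as SSEs on this endpoint - hence the lack of streaming support.
  def run(version, input, token) do
    headers = [{"authorization", "Token #{token}"}]

    %{body: %{"id" => id}} =
      Req.post!("#{@base}/predictions",
        json: prediction_payload(version, input),
        headers: headers
      )

    poll(id, headers)
  end

  defp poll(id, headers) do
    body = Req.get!("#{@base}/predictions/#{id}", headers: headers).body

    case interpret(body) do
      :pending ->
        Process.sleep(1_000)
        poll(id, headers)

      result ->
        result
    end
  end
end
```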