-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support model providers other than OpenAI and Azure #657
Comments
@Mxk-1 has found chunking settings that help resolve issues with create_base_entity_graph with Ollama:
|
I collected several comments scattered across different issues and created a monkey patch script along with a working setting for Ollama. It has been tested on version 0.3.2 and works properly. I’m sharing it for those who might need it: https://gist.github.com/MortalHappiness/7030bbe96c4bece8a07ea9057ba18b86. I’m not sure if it’s appropriate to comment here, so if the reviewers think it’s not, I'll delete this comment and post it in a more suitable place. Thank you in advance! |
Please consider just using allowing openai compatible endpoint (vllm/llama-server) for llm and embedding model. I can get it to work for the normal llm, but not to make embeddings (nomic-embed-text) via llama-server. Please don't shoehorn this into only ollama, yet creating another niche constriction for usability. @MortalHappiness does your monkey-patch work with other openai locally hosted endpoints that are not ollama? |
Right now GraphRAG only natively supports models hosted by OpenAI and Azure. Many users would like to run additional models, including alternate APIs, SLMs, or models running locally. As a research team with limited bandwidth it is unlikely we will add native support for more model providers in the near future. Our focus is on memory structures and algorithms to improve LLM information retrieval, and we've got a lot of experiments in the queue!
There are alternative options to achieve extensibility, and many GraphRAG users have had luck extending the library. So far we've seen this most commonly with Ollama, which runs on localhost and supports a very wide variety of models. This approach depends on Ollama supporting the standard OpenAI API for chat completion and embeddings so it can proxy our API calls, and it looks like this is working for a lot of folks (though may require some hacking).
Please note: while we are excited to see GraphRAG used with more models, our team will not have time to help diagnose issues. We'll do our best to route bug reports to existing conversations that might be helpful. For the most part you should expect that if you file a bug related to running an alternate solution, we'll link to this issue, a relevant conversation if we're aware of one, and then we'll close the bug.
Here is a general discussion regarding OSS LLMs: #321.
And a couple of popular Ollama-related issues: #339 and #345. We'll link to others in the comments when relevant.
Have a look at issues tagged with the community_support label as well.
The text was updated successfully, but these errors were encountered: