Make Elixir Function optional for LangChain.Function? #143
Comments
@avergin That's a valid use-case. I plan to create a demo for this, which may surface any remaining issues if they are there. But here's the idea:
Also, data extraction can be done without functions. I'm publishing an article on Monday morning showing this. Does any of that help?
@avergin I published the article about using AI to create image ALT tag text and image captions. It's doing data extraction using JSON without functions. https://fly.io/phoenix-files/using-ai-to-boost-accessibility-and-seo/
Thanks for the article! The idea of validating the function results sounds nice. In this case, will it be similar to the instructor library? Sometimes, models are not able to generate valid JSON even in function calls. So does it make sense to have the option for enabling
@avergin Yes, it's similar to instructor in this way. The idea is that a custom processor (it's really just a function) can validate against a changeset. OR, the tool call can validate against a changeset. I believe the ToolCall will fail and return an error to the LLM if the returned data is not valid JSON. If it's not doing that, then it should.
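As a rough sketch of the "validate and return errors to the LLM" idea described above, here is a stdlib-only Python analogy. The `RULES` table plays the role of an Ecto changeset; the rule contents and function names are illustrative, not part of the library's API:

```python
import json

# Stand-in for an Ecto changeset: required keys plus simple per-field checks.
RULES = {"customer_number": lambda v: isinstance(v, str) and 10 <= len(v) <= 16}


def validate_tool_call(raw: str):
    """Return (True, data) on success, or (False, error_text) to send back to the LLM."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Invalid JSON fails the call; the error goes back to the LLM to retry.
        return False, f"invalid JSON: {exc}"
    errors = [key for key, check in RULES.items()
              if key not in data or not check(data[key])]
    if errors:
        return False, f"invalid fields: {', '.join(errors)}"
    return True, data
```

The key property is that a failed validation doesn't crash anything; it produces an error message that can be handed straight back to the model for another attempt.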
A possible option here would be to provide a default Elixir function that does nothing. Setting the function would override the default. 🤔
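The "no-op default that a caller can override" suggestion might look roughly like this Python sketch (the real attribute is an Elixir function; these names are hypothetical):

```python
def default_function(arguments, _context):
    """Hypothetical no-op default: echo the validated arguments back unchanged."""
    return arguments


def build_function(spec):
    """Use the caller's function if one was given, otherwise fall back to the no-op."""
    return {**spec, "function": spec.get("function", default_function)}


# With no function supplied, the default simply passes the arguments through.
fn = build_function({"name": "extract_customer"})
result = fn["function"]({"customer_number": "123"}, None)
```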
My favourite Python library for interfacing with LLMs is @jackmpcollins's excellent magentic. You can do something like this:

```python
from magentic import AssistantMessage, SystemMessage, UserMessage, OpenaiChatModel
from magentic.chat import Chat
from pydantic import BaseModel, Field


class CustomerNumber(BaseModel):
    """The user's Acme Corporation customer number."""

    # Ideally, we'd provide other validation here.
    customer_number: str = Field(description="The user's customer number")


chat = Chat(
    messages=[
        SystemMessage(
            """You are an assistant for Acme Corporation. You will provide customer support,
            but only once the user provides their Acme customer number.
            Solicit the customer number from the user (if one has not already been provided)."""
        )
    ],
    output_types=[str, CustomerNumber],
    model=OpenaiChatModel("gpt-4o"),
)
```

The part in particular that's relevant to this discussion is the `output_types` parameter. For example:

```python
chat = chat.add_user_message("What is the capital of Ontario?").submit()
chat.messages[-1]

chat = chat.add_user_message("My customer number is 1234567891111111").submit()
chat.messages[-1]
```
I've found this pattern to be extremely helpful for building complex (many-step) chat-driven agent workflows, because you can directly dump the string response back to the user if you don't get the structured output you were expecting back from the LLM. The reason I mention this is because I took a quick look at the new
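The "fall back to the string response" pattern described above can be sketched in plain Python, independent of any particular LLM library. The `CustomerNumber` type and handler are illustrative stand-ins:

```python
from dataclasses import dataclass


@dataclass
class CustomerNumber:
    customer_number: str


def handle_response(response):
    """Route structured output to app logic; otherwise relay the model's text as-is."""
    if isinstance(response, CustomerNumber):
        return f"Looking up account {response.customer_number}..."
    # Not the structure we hoped for: just show the string to the user.
    return response


msg1 = handle_response("Toronto is the capital of Ontario.")
msg2 = handle_response(CustomerNumber("1234567891111111"))
```

Because every response is either usable structured data or user-presentable text, each agent step degrades gracefully instead of erroring out.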
Hi @mjrusso! Welcome! Yes, that is a nice feature. There are two main approaches to doing that in this library and, of course, there are other options as well. We call this "extracting structured data" in general.
Yes, that's the idea. A JSON Processor converts an assistant's text response to JSON and returns errors to the LLM if the text is not valid JSON. The next processor could be a custom one for processing the data you expect.

It is my intention to also create an EctoProcessor where you provide an Ecto changeset (Elixir's database structure interface, though it can be used without going to a database) to process the now-valid JSON. If a required value is not present, or it violates some other rule, the error is returned to the LLM for it to try again.

Another option is to use a Tool/Function where a schema defines the structure you require. This can also be processed through an Ecto changeset to ensure your requirements are met.

Finally, if extracting structured data is your primary need/goal, then you should be aware of https://github.com/thmsmlr/instructor_ex as another option.
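A minimal Python sketch of the processor-chain idea, loosely modeled on the description above: each processor either continues with a transformed value or halts with an error destined for the LLM. The `("cont"/"halt", value)` tuple shape and the `caption` rule are my own stand-ins, not the library's actual contract:

```python
import json


def json_processor(_chain, text):
    """First processor: text -> parsed JSON, or an error for the LLM."""
    try:
        return "cont", json.loads(text)
    except json.JSONDecodeError as exc:
        return "halt", f"ERROR: response is not valid JSON: {exc}"


def require_caption(_chain, data):
    """Second, custom processor: enforce an application-level rule on the data."""
    if "caption" not in data:
        return "halt", "ERROR: required field 'caption' is missing"
    return "cont", data


def run_processors(chain, value, processors):
    """Run each processor in order; the first 'halt' short-circuits back to the LLM."""
    for processor in processors:
        status, value = processor(chain, value)
        if status == "halt":
            return status, value
    return "cont", value
```

An EctoProcessor would slot into this pipeline the same way: it would receive the already-parsed JSON and halt with changeset errors when validation fails.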
Thanks Mark for the detailed answer (and the wonderful library :) I had a suspicion that an EctoProcessor was coming, and am very much looking forward to the addition.

I've played around with hybrid LangChain plus Instructor usage (example Livebook: https://gist.github.com/mjrusso/c74803ed7ed49d42f9aefe77b6a62c52), and it totally works, but there are a ton of advantages to baking this into LangChain directly.

What I don't think I've generally seen are examples of structured data extraction where multiple return types are considered acceptable. (The example I shared in my previous comment is not great, especially because the LLM natively returns a string. But imagine something like this:

Of course, under the hood this is all just sugar over tool calls, so (exactly as you mentioned) expressing each return type as separate tools is a totally viable implementation approach. But: does it make sense to support multiple chains of message processors (instead of limiting to a single processor chain)? This would cleanly support the multiple-acceptable-return-types use case without the need to manually define tools. Perhaps it's not worth the additional complexity if this isn't a common need (as one data point, though, as I've started digging in to building multi-step chat-driven agents, I've found it pretty essential).
The idea with the message processors is that some simple models are only good at giving a single response; you can't reliably have a follow-up chat with them. Most importantly, they are terrible at function/tool calls. That's the case I was trying to support here: I want the LLM to do this one task, and it can't do functions.

Personally, I'd approach your example using a more capable model like ChatGPT, Anthropic, etc., where it has a single function it can call, like "account_identifier", that takes an object with any one of those things as separate keys. Instruct the model to prompt the user for the data and call the function. Conceptually, it might look like this

The function would execute the lookup and determine whether it received a valid CaseNumber, a valid CustomerNumber, or neither. If not, return an error and let the LLM press on asking for the right data.

Depending on the chat interface, I'd probably stop there with that chain, having obtained the account identification information I needed, and start a new chain and system message that loads up the most relevant account information and defines new functions applicable to its task. My point is, I don't think it's important to express datatypes specifically that way.
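The single-tool approach described above might be sketched like this in Python. The schema, key names, and the lookup table are all hypothetical, chosen to mirror the comment's "account_identifier" example:

```python
# Hypothetical tool definition: one object whose properties are the
# alternative identifiers, any one of which may be supplied.
ACCOUNT_IDENTIFIER_SCHEMA = {
    "name": "account_identifier",
    "parameters": {
        "type": "object",
        "properties": {
            "customer_number": {"type": "string"},
            "case_number": {"type": "string"},
        },
    },
}

KNOWN_CUSTOMERS = {"1234567891111111"}  # stand-in for a real account lookup


def account_identifier(args):
    """Determine which identifier was supplied and whether it is valid.

    On failure, return an error so the LLM keeps asking the user for the right data.
    """
    if args.get("customer_number") in KNOWN_CUSTOMERS:
        return {"ok": True, "kind": "customer_number"}
    if args.get("case_number"):  # stand-in check; a real lookup would validate it
        return {"ok": True, "kind": "case_number"}
    return {"ok": False, "error": "No valid identifier provided; please ask again."}
```

A failed lookup feeds its error message back to the model, which can continue the conversation rather than terminating the chain.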
Arghhh, I assumed that the message processor chain was implemented under the covers with a tool call 🤦 I see the value, and I think the approach you've taken here is the right one; it definitely makes sense to have a facility that works without tools. (Separately, I'd argue that there is value in adding some entirely unrelated API sugar on top of tools for coercing return types. I'll plan to put together an actual non-handwavy proposal once I start using this for anything serious. Thanks, and sorry for the digression!)
Thanks @mjrusso! I look forward to seeing what you come up with!
For workflows where we just need structured JSON outputs from the models (e.g. data extraction) using tools, we may not need to execute any code in the client or send any messages back to the models. For such cases, does it make sense to make the `function` (Elixir function) attribute optional for `LangChain.Function`? (From Anthropic API Docs)