Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: @field_validator for url in ImageBlock does not handle None correctly #17382

Open
malsaraa opened this issue Dec 27, 2024 · 2 comments
Open
Labels
bug Something isn't working triage Issue needs to be triaged/prioritized

Comments

@malsaraa
Copy link

Bug Description

My chat messages are managed outside the basic llama-index framework (i.e., not using chat stores, etc.) because the bot is a Teams app. These messages are stored as JSON and retrieved as needed, then converted into a list of ChatMessage. However, I'm currently trying to integrate ImageBlock and encountering a ValidationError.

The ImageBlock model’s url field is declared as AnyUrl | str | None, which suggests that None should be an accepted value. However, the @field_validator for the url field does not account for None. As a result, attempting to pass None for the url field results in a TypeError when the validator tries to convert it to AnyUrl.

I assume that the validator needs a minor update to handle the None

def urlstr_to_anyurl(cls, url: str | AnyUrl | None) -> AnyUrl | None:
   if url is None:
       return None
   if isinstance(url, AnyUrl):
       return url
   return AnyUrl(url=url)

Version

0.12.7

Steps to Reproduce

Example code:

def serialize_message(obj):
    if isinstance(obj, Path):
        return str(obj)
    if isinstance(obj, ChatMessage):
        return obj.model_dump()
    raise TypeError(f"Type {type(obj)} not serializable")


def deserialize_message(obj):
    if 'path' in obj:
        obj['path'] = Path(obj['path'])
    return obj


test_message = ChatMessage(
                role="user",
                content=[
                    TextBlock(text="test-message-text"),
                    ImageBlock(path=local_image_path, image_mimetype='image/jpeg'),
                ],
            )

jsonStr = json.dumps(test_message.model_dump(), default=serialize_message)

converted = json.loads(jsonStr, object_hook=deserialize_message)

new_message = ChatMessage(**converted)

Relevant Logs/Tracbacks

ValidationError                           Traceback (most recent call last)
Cell In[2], line 31
     27 jsonStr = json.dumps(test_message.model_dump(), default=serialize_message)
     29 converted = json.loads(jsonStr, object_hook=deserialize_message)
---> 31 new_message = ChatMessage(**converted)

File c:\Users\user\AppData\Local\miniconda3\envs\llama\Lib\site-packages\llama_index\core\base\llms\types.py:142, in ChatMessage.__init__(self, content, **data)
    139     elif isinstance(content, list):
    140         data["blocks"] = content
--> 142 super().__init__(**data)

File c:\Users\user\AppData\Local\miniconda3\envs\llama\Lib\site-packages\pydantic\main.py:214, in BaseModel.__init__(self, **data)
    212 # `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks
    213 __tracebackhide__ = True
--> 214 validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
    215 if self is not validated_self:
    216     warnings.warn(
    217         'A custom validator is returning a value other than `self`.\n'
    218         "Returning anything other than `self` from a top level model validator isn't supported when validating via `__init__`.\n"
    219         'See the `model_validator` docs (https://docs.pydantic.dev/latest/concepts/validators/#model-validators) for more details.',
    220         stacklevel=2,
    221     )

ValidationError: 1 validation error for ChatMessage
blocks.1.image.url
  URL input should be a string or URL [type=url_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.10/v/url_type
@malsaraa malsaraa added bug Something isn't working triage Issue needs to be triaged/prioritized labels Dec 27, 2024
Copy link

dosubot bot commented Dec 27, 2024

Hey @malsaraa! I'm here to help you with any bugs, questions, or contributions. Let's dive into this issue and get it sorted!

To resolve the TypeError caused by the @field_validator for the url field in the ImageBlock model not handling None correctly, you can update the urlstr_to_anyurl method to check for None before processing the value. Here's the updated code:

@field_validator("url", mode="after")
@classmethod
def urlstr_to_anyurl(cls, url: str | AnyUrl | None) -> AnyUrl | None:
    """Store the url as Anyurl."""
    if url is None:
        return None
    if isinstance(url, AnyUrl):
        return url
    return AnyUrl(url=url)

This modification ensures that if the url is None, the method will return None without attempting to process it further, thus avoiding a TypeError [1].

To continue talking to Dosu, mention @dosu.


Help Dosu learn! Give it feedback: Great Response | Irrelevant Answer | Incorrect Sources | Too Verbose | Hallucination | Other

@logan-markewich
Copy link
Collaborator

Feel free to make a PR @malsaraa -- I think you are correct.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working triage Issue needs to be triaged/prioritized
Projects
None yet
Development

No branches or pull requests

2 participants