Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(template): read jinja templates from gguf files #4332

Merged
merged 8 commits into from
Dec 8, 2024
Merged

Conversation

mudler
Copy link
Owner

@mudler mudler commented Dec 6, 2024

Description

This PR adds automatic detection and parsing of jinja templates in gguf files. If we fail to identify a variant and we do not have already a specific template, it injects the jinja templates which is part of the model metadata if one is found.

Alternatively, it is possible to enable jinja templates manually in the model config file, in the template config section with jinja_template: true.

Notes for Reviewers

  • This is extracted from feat: Realtime API support #3722 as it refactors message templating in a way that is more re-usable by other endpoints.
  • Refactors mainly how the template is processed, removing the business logic from the handler and separate it so can be re-used from other parts (e.g. realtime needs same templating)
  • Refactors some of the dependency passed down to the routes as it adds another component (Template Evaluator) aside the model loader
  • It might cross with feat: Centralized Request Processing middleware #3847 ( didn't checked closely ) but with other PRs too as it refactors some core parts, however, if we don't move this forward there will be too many changes that requires this adaptation or introduces dups

Signed commits

  • Yes, I signed my commits.

Signed-off-by: Ettore Di Giacinto <[email protected]>
pkg/templates/cache.go Dismissed Show dismissed Hide dismissed
@mudler mudler self-assigned this Dec 6, 2024
@mudler mudler marked this pull request as draft December 6, 2024 21:07
Copy link

netlify bot commented Dec 6, 2024

Deploy Preview for localai ready!

Name Link
🔨 Latest commit 56f6ab4
🔍 Latest deploy log https://app.netlify.com/sites/localai/deploys/67557dade0632500089601d4
😎 Deploy Preview https://deploy-preview-4332--localai.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@mudler
Copy link
Owner Author

mudler commented Dec 6, 2024

WIP as need to still add mapping between the transformer tokenizer and the templates (see TODO note in code comments)

Signed-off-by: Ettore Di Giacinto <[email protected]>
@mudler mudler added the enhancement New feature or request label Dec 7, 2024
Signed-off-by: Ettore Di Giacinto <[email protected]>
Signed-off-by: Ettore Di Giacinto <[email protected]>
Signed-off-by: Ettore Di Giacinto <[email protected]>
Signed-off-by: Ettore Di Giacinto <[email protected]>
@mudler mudler marked this pull request as ready for review December 8, 2024 11:02
Signed-off-by: Ettore Di Giacinto <[email protected]>
@mudler
Copy link
Owner Author

mudler commented Dec 8, 2024

basic support should work (tested with llama3 prompt), probably is not going to cover all cases as gonja has limitations, but, since this kicks-in when no other template was defined it is safe to merge without drawbacks.

@mudler mudler merged commit cea5a0e into master Dec 8, 2024
33 checks passed
@mudler mudler deleted the read_template branch December 8, 2024 12:50
sozercan pushed a commit to sozercan/LocalAI that referenced this pull request Dec 8, 2024
* Read jinja templates as fallback

Signed-off-by: Ettore Di Giacinto <[email protected]>

* Move templating out of model loader

Signed-off-by: Ettore Di Giacinto <[email protected]>

* Test TemplateMessages

Signed-off-by: Ettore Di Giacinto <[email protected]>

* Set role and content from transformers

Signed-off-by: Ettore Di Giacinto <[email protected]>

* Tests: be more flexible

Signed-off-by: Ettore Di Giacinto <[email protected]>

* More jinja

Signed-off-by: Ettore Di Giacinto <[email protected]>

* Small refactoring and adaptations

Signed-off-by: Ettore Di Giacinto <[email protected]>

---------

Signed-off-by: Ettore Di Giacinto <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant