v2.6.1

@mudler mudler released this 23 Jan 17:22
d5d82ba

This is a patch release containing bug fixes for parallel request support with llama.cpp models.

What's Changed

Bug fixes 🐛

  • fix(llama.cpp): Enable parallel requests by @tauven in #1616
  • fix(llama.cpp): enable cont batching when parallel is set by @mudler in #1622

Exciting New Features 🎉

  • feat(grpc): backend SPI pluggable in embedding mode by @coyzeng in #1621

Full Changelog: v2.6.0...v2.6.1