v2.6.1

@mudler mudler released this 23 Jan 17:22
d5d82ba

This is a patch release containing bug fixes for parallel request support with llama.cpp models.

What's Changed

Bug fixes 🐛

  • fix(llama.cpp): Enable parallel requests by @tauven in #1616
  • fix(llama.cpp): enable cont batching when parallel is set by @mudler in #1622

Exciting New Features 🎉

  • feat(grpc): backend SPI pluggable in embedding mode by @coyzeng in #1621

Full Changelog: v2.6.0...v2.6.1