v2.24.0
LocalAI release v2.24.0!
🚀 Highlights
- Backend deprecation: We’ve removed `rwkv.cpp` and `bert.cpp`, replacing them with enhanced functionality in `llama.cpp` for simpler installation and better performance.
- New Backends Added: Introducing `bark.cpp` for text-to-audio and `stablediffusion.cpp` for image generation, both powered by the ggml framework.
- Voice Activity Detection (VAD): Added support for `silero-vad` to detect speech in audio streams.
- WebUI Improvements: Now supports API key authentication for enhanced security.
- Real-Time Token Usage: Monitor token consumption during streamed outputs.
- Expanded P2P Settings: Greater flexibility with new configuration options like `listen_maddrs`, `dht_announce_maddrs`, and `bootstrap_peers`.
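The streamed token usage arrives inside the server's OpenAI-style SSE chunks. As a minimal sketch (assuming the chunks follow the usual OpenAI streaming format, with a `usage` object attached to one of the final chunks; `usage_from_stream` is an illustrative helper, not part of LocalAI), a client could pull the counts out of a stream like this:

```python
import json

def usage_from_stream(sse_lines):
    """Extract the token-usage object from OpenAI-style SSE chunks.

    Each line looks like 'data: {...}' and the stream ends with
    'data: [DONE]'; the usage counts, when the server reports them,
    ride along on one of the JSON chunks.
    """
    usage = None
    for line in sse_lines:
        payload = line.removeprefix("data: ").strip()
        if not payload or payload == "[DONE]":
            continue
        chunk = json.loads(payload)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage

# Canned chunks for illustration; a real client would read these from a
# /v1/chat/completions response requested with "stream": true.
stream = [
    'data: {"choices": [{"delta": {"content": "Hi"}}]}',
    'data: {"choices": [], "usage": {"prompt_tokens": 5, "completion_tokens": 1, "total_tokens": 6}}',
    'data: [DONE]',
]
print(usage_from_stream(stream))
# → {'prompt_tokens': 5, 'completion_tokens': 1, 'total_tokens': 6}
```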
📤 Backends Deprecation
As part of our cleanup efforts, the `rwkv.cpp` and `bert.cpp` backends have been deprecated. Their functionality is now integrated into `llama.cpp`, offering a more streamlined and efficient experience.
🆕 New Backends Introduced
- `bark.cpp` Backend: Transform text into realistic audio using Bark, a transformer-based text-to-audio model. Install it easily with:

```
local-ai models install bark-cpp-small
```

Or start it directly:

```
local-ai run bark-cpp-small
```

- `stablediffusion.cpp` Backend: Create high-quality images from textual descriptions using the Stable Diffusion backend, now leveraging the ggml framework.
- Voice Activity Detection with `silero-vad`: Introducing support for accurate speech segment detection in audio streams. Install via:

```
local-ai models install silero-vad
```

Or configure it through the WebUI.
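Once bark-cpp-small is running, audio generation goes through the server's TTS endpoint. The request below is only a sketch: the endpoint path (`/tts`), the field names, and the `localhost:8080` address are assumptions based on LocalAI's usual HTTP API, so check the documentation for your version. The request is built but not sent here:

```python
import json
import urllib.request

# Hypothetical request shape for LocalAI's TTS endpoint; the path and
# field names are assumptions, not taken from these release notes.
payload = {
    "model": "bark-cpp-small",      # the backend installed above
    "input": "Hello from LocalAI!",  # text to synthesize
}
req = urllib.request.Request(
    "http://localhost:8080/tts",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# Sending it would be: audio_bytes = urllib.request.urlopen(req).read()
print(req.get_method(), req.full_url)
# → POST http://localhost:8080/tts
```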
🔒 WebUI Access with API Keys
The WebUI now supports API key authentication. If one or more API keys are configured, the WebUI automatically presents an authentication page.
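The same keys protect plain API calls, which authenticate with a standard Bearer header. A minimal sketch follows; the key value and model name are placeholders, and the header format follows the usual OpenAI convention (the request is constructed but not sent):

```python
import json
import urllib.request

API_KEY = "sk-your-local-ai-key"  # placeholder: one of the configured keys

payload = {
    "model": "gpt-4",  # placeholder model name
    "messages": [{"role": "user", "content": "ping"}],
}
req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",  # same key the WebUI asks for
    },
    method="POST",
)
# urllib.request.urlopen(req) would send it; without a valid key the
# server is expected to reject the call.
print(req.get_header("Authorization"))
# → Bearer sk-your-local-ai-key
```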
🏆 Enhancements and Features
- Real-Time Token Usage: Monitor token consumption dynamically during streamed outputs. This feature helps optimize performance and manage costs effectively.
- P2P Configuration: New settings for advanced peer-to-peer mode:
  - `listen_maddrs`: Define specific multiaddresses for your node.
  - `dht_announce_maddrs`: Specify addresses to announce to the DHT network.
  - `bootstrap_peers`: Set custom bootstrap peers for initial connectivity.
These options offer more control, especially in constrained networks or custom P2P environments.
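As an illustrative sketch only: the three option names above come from this release, but the exact key placement, and the addresses shown (libp2p multiaddress notation with documentation IPs and a placeholder peer ID), are assumptions — consult the P2P documentation for the authoritative form.

```yaml
# Hypothetical configuration fragment — key nesting and values are
# examples, not the official schema.
p2p:
  listen_maddrs:
    - /ip4/0.0.0.0/tcp/9090
  dht_announce_maddrs:
    - /ip4/203.0.113.10/tcp/9090
  bootstrap_peers:
    - /ip4/198.51.100.5/tcp/9090/p2p/QmExamplePeerID
```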
🖼️ New Models in the Gallery
We've significantly expanded our model gallery with a variety of new models to cater to diverse AI applications. Among these:
- Calme-3 Qwen2.5 Series: Enhanced language models offering improved understanding and generation capabilities.
- Mistral-Nemo-Prism-12b: A powerful model designed for complex language tasks.
- Llama 3.1 and 3.2 Series: Upgraded versions of the Llama models with better performance and accuracy.
- Qwen2.5-Coder Series: Specialized models optimized for code generation and programming language understanding.
- Rombos-Coder Series: Advanced coder models for sophisticated code-related tasks.
- Silero-VAD: High-quality voice activity detection model for audio processing applications.
- Bark-Cpp-Small: Lightweight audio generation model suitable for quick and efficient audio synthesis.
Explore these models and more in our updated model gallery to find the perfect fit for your project needs.
🐞 Bug Fixes and Improvements
- Performance Enhancements: Resolved issues with AVX flags and optimized binaries for accelerated performance, especially on macOS systems.
- Dependency Updates: Upgraded various dependencies to ensure compatibility, security, and performance improvements across the board.
- Parsing Corrections: Fixed parsing issues related to `maddr` and `ExtraLLamaCPPArgs` in P2P configurations.
📚 Documentation and Examples
- Updated Guides: Refreshed documentation with new configuration examples, making it easier to get started and integrate the latest features.
📥 How to Upgrade
To upgrade to LocalAI v2.24.0:
- Download the Latest Release: Get the binaries from our GitHub Releases page.
- Update Docker Image: Pull the latest Docker image using:

```
docker pull localai/localai:latest
```
See also the Documentation at: https://localai.io/basics/container/#standard-container-images
Happy hacking!
What's Changed
Breaking Changes 🛠
- feat(models): use rwkv from llama.cpp by @mudler in #4264
- feat(backends): Drop bert.cpp by @mudler in #4272
Bug fixes 🐛
- fix(hipblas): disable avx flags when accellerated bins are used by @mudler in #4167
- chore(deps): bump sycl intel image by @mudler in #4201
- fix(go.mod): add urfave/cli v2 by @mudler in #4206
- chore(go.mod): add valyala/fasttemplate by @mudler in #4207
- fix(p2p): parse maddr correctly by @mudler in #4219
- fix(p2p): parse correctly ExtraLLamaCPPArgs by @mudler in #4220
- fix(llama.cpp): embed metal file into result binary for darwin by @mudler in #4279
Exciting New Features 🎉
- feat: add WebUI API token authorization by @mintyleaf in #4197
- feat(p2p): add support for configuration of edgevpn listen_maddrs, dht_announce_maddrs and bootstrap_peers by @mintyleaf in #4200
- feat(silero): add Silero-vad backend by @mudler in #4204
- feat: include tokens usage for streamed output by @mintyleaf in #4282
- feat(bark-cpp): add new bark.cpp backend by @mudler in #4287
- feat(backend): add stablediffusion-ggml by @mudler in #4289
🧠 Models
- models(gallery): add calme-3 qwen2.5 series by @mudler in #4107
- models(gallery): add calme-3 qwenloi series by @mudler in #4108
- models(gallery): add calme-3 llamaloi series by @mudler in #4109
- models(gallery): add mn-tiramisu-12b by @mudler in #4110
- models(gallery): add qwen2.5-coder-14b by @mudler in #4125
- models(gallery): add qwen2.5-coder-3b-instruct by @mudler in #4126
- models(gallery): add qwen2.5-coder-32b-instruct by @mudler in #4127
- models(gallery): add qwen2.5-coder-14b-instruct by @mudler in #4128
- models(gallery): add qwen2.5-coder-1.5b-instruct by @mudler in #4129
- models(gallery): add qwen2.5-coder-7b-instruct by @mudler in #4130
- models(gallery): add qwen2.5-coder-7b-3x-instruct-ties-v1.2-i1 by @mudler in #4131
- models(gallery): add qwen2.5-coder-7b-instruct-abliterated-i1 by @mudler in #4132
- models(gallery): add rombos-coder-v2.5-qwen-7b by @mudler in #4133
- models(gallery): add rombos-coder-v2.5-qwen-32b by @mudler in #4134
- models(gallery): add rombos-coder-v2.5-qwen-14b by @mudler in #4135
- models(gallery): add eva-qwen2.5-72b-v0.1-i1 by @mudler in #4136
- models(gallery): add mistral-nemo-prism-12b by @mudler in #4141
- models(gallery): add tess-3-llama-3.1-70b by @mudler in #4143
- models(gallery): add celestial-harmony-14b-v1.0-experimental-1016-i1 by @mudler in #4145
- models(gallery): add llama3.1-8b-enigma by @mudler in #4146
- chore(model): add llama3.1-8b-cobalt to the gallery by @mudler in #4147
- chore(model): add qwen2.5-32b-arliai-rpmax-v1.3 to the gallery by @mudler in #4148
- chore(model): add llama3.2-3b-enigma to the gallery by @mudler in #4149
- chore(model): add llama-3.1-8b-arliai-rpmax-v1.3 to the gallery by @mudler in #4150
- chore(model): add magnum-12b-v2.5-kto-i1 to the gallery by @mudler in #4151
- chore(model): add l3.1-8b-slush-i1 to the gallery by @mudler in #4152
- models(gallery): add q2.5-ms-mistoria-72b-i1 by @mudler in #4158
- chore(model): add l3.1-ms-astoria-70b-v2 to the gallery by @mudler in #4159
- chore(model): add magnum-v2-4b-i1 to the gallery by @mudler in #4160
- chore(model): add athene-v2-agent to the gallery by @mudler in #4161
- chore(model): add athene-v2-chat to the gallery by @mudler in #4162
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #4165
- chore(model): add qwen2.5-7b-nerd-uncensored-v1.7 to the gallery by @mudler in #4171
- chore(model): add llama3.2-3b-shiningvaliant2-i1 to the gallery by @mudler in #4174
- chore(model): add l3.1-nemotron-sunfall-v0.7.0-i1 to the gallery by @mudler in #4175
- chore(model): add evathene-v1.0 to the gallery by @mudler in #4176
- chore(model): add miniclaus-qw1.5b-unamgs to the gallery by @mudler in #4177
- chore(model): add silero-vad to the gallery by @mudler in #4210
- models(gallery): add llama-mesh by @mudler in #4222
- chore(model): add llama-doctor-3.2-3b-instruct to the gallery by @mudler in #4223
- chore(model): add copus-2x8b-i1 to the gallery by @mudler in #4225
- chore(model): add llama-3.1-8b-instruct-ortho-v3 to the gallery by @mudler in #4226
- chore(model): add llama-3.1-tulu-3-8b-dpo to the gallery by @mudler in #4228
- chore(model): add marco-o1 to the gallery by @mudler in #4229
- chore(model): add onellm-doey-v1-llama-3.2-3b to the gallery by @mudler in #4230
- chore(model): add llama-sentient-3.2-3b-instruct to the gallery by @mudler in #4235
- chore(model): add qwen2.5-3b-smart-i1 to the gallery by @mudler in #4236
- chore(model): add l3.1-aspire-heart-matrix-8b to the gallery by @mudler in #4237
- chore(model): add dark-chivalry_v1.0-i1 to the gallery by @mudler in #4242
- chore(model): add qwen2.5-coder-32b-instruct-uncensored-i1 to the gallery by @mudler in #4241
- chore(model): add tulu-3.1-8b-supernova-i1 to the gallery by @mudler in #4243
- chore(model): add steyrcannon-0.2-qwen2.5-72b to the gallery by @mudler in #4244
- chore(model-gallery): ⬆️ update checksum by @localai-bot in #4261
- chore(model): add llama-3.1_openscholar-8b to the gallery by @mudler in #4262
- chore(model): add rwkv-6-world-7b to the gallery by @mudler in #4270
- chore(model): add q2.5-ms-mistoria-72b-v2 model config by @mudler in #4275
- chore(model): add llama-3.1-tulu-3-70b-dpo model config by @mudler in #4276
- chore(model): add llama-3.1-tulu-3-8b-sft to the gallery by @mudler in #4277
- chore(model): add eva-qwen2.5-72b-v0.2 to the gallery by @mudler in #4278
- fix(rwkv model): add stoptoken by @mudler in #4283
- chore(model gallery): add qwq-32b-preview by @mudler in #4284
- chore(model gallery): add llama-smoltalk-3.2-1b-instruct by @mudler in #4285
- chore(model gallery): add q2.5-32b-slush-i1 by @mudler in #4292
- chore(model gallery): add freyja-v4.95-maldv-7b-non-fiction-i1 by @mudler in #4293
- chore(model gallery): add qwestion-24b by @mudler in #4294
- chore(model gallery): add volare-i1 by @mudler in #4296
- chore(model gallery): add skywork-o1-open-llama-3.1-8b by @mudler in #4297
- chore(model gallery): add teleut-7b by @mudler in #4298
- chore(model gallery): add sparse-llama-3.1-8b-2of4 by @mudler in #4309
- chore(model gallery): add qwen2.5-7b-homercreative-mix by @mudler in #4310
- chore(model gallery): add bggpt-gemma-2-2.6b-it-v1.0 by @mudler in #4311
- chore(model gallery): add cybercore-qwen-2.1-7b by @mudler in #4314
- chore(model gallery): add chatty-harry_v3.0 by @mudler in #4315
- chore(model gallery): add homercreativeanvita-mix-qw7b by @mudler in #4316
- chore(model gallery): add flux.1-dev-ggml by @mudler in #4317
- chore(model gallery): add bark-cpp-small by @mudler in #4318
📖 Documentation and examples
- docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #4105
- fix: #4215 404 in documentation due to migrated configuration examples by @rmmonster in #4216
- chore(docs): integrating LocalAI with Microsoft Word by @GPTLocalhost in #4218
- add new community integrations by @JackBekket in #4224
👒 Dependencies
- chore: ⬆️ Update ggerganov/llama.cpp to `4b3a9212b602be3d4e2e3ca26efd796cef13c55e` by @localai-bot in #4106
- chore(deps): Bump setuptools from 69.5.1 to 75.4.0 in /backend/python/transformers by @dependabot in #4117
- Revert "chore(deps): Bump setuptools from 69.5.1 to 75.4.0 in /backend/python/transformers" by @mudler in #4123
- chore(deps): Bump dcarbone/install-yq-action from 1.1.1 to 1.2.0 by @dependabot in #4114
- chore(deps): Bump sentence-transformers from 3.2.0 to 3.3.0 in /backend/python/sentencetransformers by @dependabot in #4120
- chore: ⬆️ Update ggerganov/llama.cpp to `54ef9cfc726a799e6f454ac22c4815d037716eda` by @localai-bot in #4122
- chore: ⬆️ Update ggerganov/whisper.cpp to `f19463ece2d43fd0b605dc513d8800eeb4e2315e` by @localai-bot in #4139
- chore: ⬆️ Update ggerganov/llama.cpp to `fb4a0ec0833c71cff5a1a367ba375447ce6106eb` by @localai-bot in #4140
- chore(deps): bump llama-cpp to ae8de6d50a09d49545e0afab2e50cc4acfb280e2 by @mudler in #4157
- chore: ⬆️ Update ggerganov/llama.cpp to `883d206fbd2c5b2b9b589a9328503b9005e146c9` by @localai-bot in #4164
- chore(deps): bump grpcio to 1.68.0 by @mudler in #4166
- chore: ⬆️ Update ggerganov/whisper.cpp to `01d3bd7d5ccd1956a7ddf1b57ee92d69f35aad93` by @localai-bot in #4163
- chore: ⬆️ Update ggerganov/llama.cpp to `db4cfd5dbc31c90f0d5c413a2e182d068b8ee308` by @localai-bot in #4169
- chore(deps): Bump sentence-transformers from 3.3.0 to 3.3.1 in /backend/python/sentencetransformers by @dependabot in #4178
- chore: ⬆️ Update ggerganov/llama.cpp to `d3481e631661b5e9517f78908cdd58cee63c4903` by @localai-bot in #4196
- chore: ⬆️ Update ggerganov/whisper.cpp to `d24f981fb2fbf73ec7d72888c3129d1ed3f91916` by @localai-bot in #4195
- chore(deps): Bump dcarbone/install-yq-action from 1.2.0 to 1.3.0 by @dependabot in #4182
- chore(deps): Bump appleboy/ssh-action from 1.1.0 to 1.2.0 by @dependabot in #4183
- chore: ⬆️ Update ggerganov/whisper.cpp to `6266a9f9e56a5b925e9892acf650f3eb1245814d` by @localai-bot in #4202
- chore: ⬆️ Update ggerganov/llama.cpp to `9fe0fb062630728e3c21b5839e3bce87bff2440a` by @localai-bot in #4203
- chore: ⬆️ Update ggerganov/llama.cpp to `9abe9eeae98b11fa93b82632b264126a010225ff` by @localai-bot in #4212
- chore: ⬆️ Update ggerganov/llama.cpp to `a5e47592b6171ae21f3eaa1aba6fb2b707875063` by @localai-bot in #4221
- chore: ⬆️ Update ggerganov/llama.cpp to `6dfcfef0787e9902df29f510b63621f60a09a50b` by @localai-bot in #4227
- chore: ⬆️ Update ggerganov/llama.cpp to `55ed008b2de01592659b9eba068ea01bb2f72160` by @localai-bot in #4232
- chore: ⬆️ Update ggerganov/llama.cpp to `cce5a9007572c6e9fa522296b77571d2e5071357` by @localai-bot in #4238
- chore(deps): Bump whisper-timestamped from 1.14.2 to 1.15.8 in /backend/python/openvoice by @dependabot in #4248
- chore(deps): Bump faster-whisper from 0.9.0 to 1.1.0 in /backend/python/openvoice by @dependabot in #4249
- chore(deps): Bump dcarbone/install-yq-action from 1.3.0 to 1.3.1 by @dependabot in #4253
- chore(deps): bump llama.cpp to `47f931c8f9a26c072d71224bc8013cc66ea9e445` by @mudler in #4263
- chore: ⬆️ Update ggerganov/llama.cpp to `30ec39832165627dd6ed98938df63adfc6e6a21a` by @localai-bot in #4273
- chore: ⬆️ Update ggerganov/llama.cpp to `3ad5451f3b75809e3033e4e577b9f60bcaf6676a` by @localai-bot in #4280
- chore: ⬆️ Update ggerganov/llama.cpp to `dc22344088a7ee81a1e4f096459b03a72f24ccdc` by @localai-bot in #4288
- chore: ⬆️ Update ggerganov/llama.cpp to `3a8e9af402f7893423bdab444aa16c5d9a2d429a` by @localai-bot in #4290
- chore: ⬆️ Update ggerganov/llama.cpp to `0c39f44d70d058940fe2afe50cfc789e3e44d756` by @localai-bot in #4295
- chore: ⬆️ Update ggerganov/llama.cpp to `5e1ed95583ca552a98d8528b73e1ff81249c2bf9` by @localai-bot in #4299
- chore(deps): bump grpcio to 1.68.1 by @mudler in #4301
- chore(deps): Bump docs/themes/hugo-theme-relearn from `28fce6b` to `be85052` by @dependabot in #4305
- chore: ⬆️ Update ggerganov/llama.cpp to `8648c521010620c2daccfa1d26015c668ba2c717` by @localai-bot in #4307
- chore: ⬆️ Update ggerganov/llama.cpp to `cc98896db858df7aa40d0e16a505883ef196a482` by @localai-bot in #4312
Other Changes
- chore: update jobresult_test.go by @eltociear in #4124
- chore(linguist): add *.hpp files to linguist-vendored by @mudler in #4154
- chore(api): return values from schema by @mudler in #4153
- feat(swagger): update swagger by @localai-bot in #4155
- chore(Makefile): default to non-native builds for llama.cpp by @mudler in #4173
- chore(refactor): imply modelpath by @mudler in #4208
- chore(go.mod): tidy by @mudler in #4209
- feat(swagger): update swagger by @localai-bot in #4211
- integrations: add Nextcloud by @meonkeys in #4233
- Revert "chore(deps): Bump whisper-timestamped from 1.14.2 to 1.15.8 in /backend/python/openvoice" by @mudler in #4267
- Revert "chore(deps): Bump faster-whisper from 0.9.0 to 1.1.0 in /back… by @mudler in #4268
- chore(scripts): handle summarization errors by @mudler in #4271
New Contributors
- @mintyleaf made their first contribution in #4197
- @rmmonster made their first contribution in #4216
- @GPTLocalhost made their first contribution in #4218
- @JackBekket made their first contribution in #4224
- @meonkeys made their first contribution in #4233
Full Changelog: v2.23.0...v2.24.0