Releases: guinmoon/LLMFarm
v1.4.1
Changes:
- llama.cpp updated to b4122
- Ask questions about PDF document content from a shortcut (see the shortcut example in the documentation)
- Added a max RAG answer count option
- Fixed duplication of documents in the index
- Fixed the keyboard overlapping settings items
- Some Metal fixes and improvements
- Some other fixes
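LLMFarm's actual RAG pipeline is not shown in these notes; the sketch below only illustrates the idea behind a "max RAG answer count" setting, i.e. retrieving the top-k document chunks most similar to the question. The bag-of-words cosine scoring and all names here are illustrative stand-ins for real embeddings, not LLMFarm's implementation.

```python
import math
import re
from collections import Counter

def similarity(a: str, b: str) -> float:
    """Cosine similarity over bag-of-words counts (stand-in for real embeddings)."""
    va = Counter(re.findall(r"\w+", a.lower()))
    vb = Counter(re.findall(r"\w+", b.lower()))
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, chunks: list[str], max_answers: int) -> list[str]:
    """Return the max_answers chunks most similar to the question."""
    ranked = sorted(chunks, key=lambda c: similarity(question, c), reverse=True)
    return ranked[:max_answers]

chunks = [
    "The warranty covers parts and labor for two years.",
    "Battery life is rated at ten hours of mixed use.",
    "The device charges over USB-C at up to 45 W.",
]
print(retrieve("How long is the warranty?", chunks, max_answers=1))
```

Capping `max_answers` bounds how much retrieved context is stuffed into the prompt, which matters on memory-constrained mobile devices.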
v1.4.0
Changes:
- llama.cpp updated to b3982
- Added RAG support for PDF documents
- Chat settings UI improvements
- Added text summarization shortcut
- Some Metal improvements
- Added support for Chameleon
- Fixed some errors
- Currently, adding documents when creating a chat is not supported. To add documents to a chat, create it first, then open the chat settings and add them there.
v1.3.9
Changes:
- llama.cpp updated to b3837
- Added support for Llama 3.2/3.1, RWKV, MiniCPM (2.5, 2.6, 3), and Chameleon models
- Added Metal support for Mamba
- Added Llama 3.2 download links (works with the Llama 3 Instruct template)
- Added Phi 3.5 download links (works with the Phi 3 template)
- Added Bunny template and download link
- Some Metal improvements
- Fixed some errors for Gemma 2, DeepSeek-V2, ChatGLM4, Llama 3.1, T5, TriLMs, and BitNet models
- Fixed some other errors
v1.3.4
Changes:
- Added support for Gemma 2, T5, JAIS, BitNet, GLM (3, 4), and Mistral Nemo
- Added OpenELM support
- Added the ability to change the styling of chat messages.
- Added a built-in demo chat for new users (uses the ChatML template)
- Some detokenizer fixes
- LoRA and FineTune are temporarily disabled: the code needs refactoring due to a large number of errors.
v1.3.0
Changes:
- llama.cpp updated to b3190
- Added support for DeepSeek-V2 and GPT-NeoX (Pythia and others)
- Added support for Markdown formatting
- Added support for using history in Shortcuts
- Added Flash Attention support
- Added an NPredict option (limits the number of tokens generated)
- Metal and CPU inference improvements
- Sampling and eval improvements
- Some fixes for Phi-3 and MiniCPM
- Fixed some errors
- Added Qwen template
v1.2.5
Changes:
- Save/load context state: it is now possible to continue a dialog with the model even after the app is reopened (see the documentation for details).
- Chat settings such as sampling are now applied without reloading the chat.
- Added a skip tokens option: lets you specify tokens that will not be displayed in the output. Useful for Phi-3 and Qwen.
- Fixed some errors.
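The skip tokens idea can be sketched as a simple post-filter on generated text. The token strings below (e.g. `<|end|>`, which Phi-3-style templates use as a turn marker) are illustrative examples, not LLMFarm's actual defaults.

```python
def strip_skip_tokens(text: str, skip_tokens: list[str]) -> str:
    """Remove every occurrence of the listed token strings from model output."""
    for tok in skip_tokens:
        text = text.replace(tok, "")
    return text

# Chat markers that some model builds emit verbatim (illustrative values).
out = strip_skip_tokens("Hello!<|end|><|assistant|>", ["<|end|>", "<|assistant|>"])
print(out)  # Hello!
```

Filtering at display time, rather than at sampling time, leaves the tokens in the model context so generation behavior is unchanged.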
v1.2.0
Changes:
- Added shortcuts support
- llama.cpp updated to b2864
- Some Llama 3 and Command-R fixes
- Added a Llama 3 Instruct template
- Fixed a bug that could crash the application when the system prompt format was incorrect
- Fixed a memory bug in the grammar parser
- Fixed some other bugs
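For reference, the Llama 3 Instruct format that the new template follows wraps each turn in special header tokens. A minimal single-turn sketch (the function name is illustrative; the token layout follows Meta's published format):

```python
def llama3_prompt(system: str, user: str) -> str:
    """Build a single-turn prompt in the Llama 3 Instruct format."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # The trailing assistant header cues the model to generate its reply.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

print(llama3_prompt("You are a helpful assistant.", "Hi!"))
```

Getting these markers exactly right is what a chat template provides; a malformed system prompt is the kind of input the crash fix above guards against.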
v1.1.1
v1.1.0
Changes:
- llama.cpp updated to b2717
- Added support for Phi-3, Mamba (CPU only), Gemma, StarCoder2, GritLM, Command-R, MobileVLM_V2, and Qwen2-MoE models
- Added IQ1_S, IQ2_S, IQ2_M, IQ3_S, IQ4_NL, and IQ4_XS quantization support
- Performance improvements
- Fixed a crash when the EOS option is enabled
- Fixed image orientation
v1.0.1
Changes:
- Fixed some bugs that could cause the application to crash
- Clearing the message history now also clears the model context
- Added the ability to hide the keyboard: tap anywhere in the chat window
- Added the ability to temporarily disable chat autoscrolling by tapping anywhere in the chat window; autoscrolling is re-enabled automatically when a new message is sent
*If you get "strange" prediction results in version 1.0.1 but everything was fine in 1.0.0, try disabling the BOS option in the template.