Releases: matatonic/openedai-vision

0.41.0

07 Dec 21:43

Version 0.41.0

  • new model support: OpenGVLab's InternVL 2.5 family of models (1B-78B)
  • I tried many ways to get split_model() working with InternVL but failed repeatedly, sorry!

0.40.0

07 Dec 04:06

Version 0.40.0

  • new model support: AIDC-AI/Ovis1.6-Llama3.2-3B, AIDC-AI/Ovis1.6-Gemma2-27B
  • new model support: BAAI/Aquila-VL-2B-llava-qwen
  • new model support: HuggingFaceTB/SmolVLM-Instruct
  • new model support: google/paligemma2 family of models (very limited instruct/chat training so far)
  • Qwen2-VL: unpin Qwen2-VL-7B & remove Qwen hacks, GPTQ-Int4/8 working again (still slow - why?)
  • pin bitsandbytes==0.44.1
  • ⚠️ DEPRECATED MODELS (use the 0.39.2 docker image for support of these models): internlm-xcomposer2-7b, internlm-xcomposer2-7b-4bit, internlm-xcomposer2-vl-1_8b, internlm-xcomposer2-vl-7b, internlm-xcomposer2-vl-7b-4bit, nvidia/NVLM-D-72B, Llama-3-8B-Dragonfly-Med-v1, Llama-3-8B-Dragonfly-v1

0.39.2

13 Oct 13:28

Version 0.39.2

  • performance: use float16 with Qwen2 AWQ, small performance improvement
  • fix: handle Ubuntu 24 / Python 3.12 a little better, thanks @Lissanro
  • old code in the last docker image, github worker problems again?

0.39.1

10 Oct 21:26

Version 0.39.1

  • fix: the github docker package build seems to have been broken for a while

0.39.0

10 Oct 20:48

Version 0.39.0

  • new model support: rhymes-ai/Aria
  • improved support for multi-image in various models.
  • docker package: The latest release will now be tagged with :latest, rather than latest commit.
  • ⚠️ docker: docker will now run as a user instead of root. Your hf_home volume may need its ownership fixed; you can use this command: sudo chown $(id -u):$(id -g) -R hf_home

0.38.0

08 Oct 23:49

Recent updates

Version 0.38.0

  • new model support: AIDC-AI/Ovis1.6-Gemma2-9B

Version 0.37.0 (missing release build)

  • new model support: nvidia/NVLM-D-72B

0.36.0

01 Oct 03:29

Version 0.36.0

  • new model support: BAAI/Emu3-Chat
  • Experimental support for fancyfeast/joy-caption-alpha-two with multiple images

0.35.0

29 Sep 20:52

Version 0.35.0

  • Update Molmo (tensorflow-cpu no longer required), and add autocast to use faster, smaller types than float32.
  • New option: --use-double-quant to enable double quantization with --load-in-4bit, a little slower for a little less VRAM.
  • Molmo 72B will now run in under 48GB of VRAM using --load-in-4bit --use-double-quant.
  • Add completion_tokens counts to API responses and log tokens/s for most results; other compatibility improvements
  • Include sample tokens/s data (A100) in vision.sample.env
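
As a rough illustration of why --use-double-quant trades a little speed for a little less VRAM, here is a back-of-the-envelope estimate of 4-bit weight storage for a 72B-parameter model. This is a sketch under stated assumptions, not a measurement: the block sizes (64 weights per block, 256 blocks per second-level group) follow the scheme described in the QLoRA paper, the function name is hypothetical, and the estimate assumes every parameter is quantized while ignoring activations, the KV cache and any layers kept in higher precision.

```python
def quantized_weight_gb(n_params: float, double_quant: bool,
                        block_size: int = 64, group_size: int = 256) -> float:
    """Approximate storage in GB (1e9 bytes) for 4-bit block-quantized weights.

    Hypothetical helper for illustration only. Assumes every parameter is
    quantized, which is not true for real models (some layers stay in
    higher precision).
    """
    weight_bits = 4.0
    if double_quant:
        # One 8-bit scale per block, plus one 32-bit scale per group of blocks.
        scale_bits = 8.0 / block_size + 32.0 / (block_size * group_size)
    else:
        # One 32-bit scale per block.
        scale_bits = 32.0 / block_size
    return n_params * (weight_bits + scale_bits) / 8 / 1e9


n_params = 72e9  # assumed parameter count for a "72B" model
plain = quantized_weight_gb(n_params, double_quant=False)
dq = quantized_weight_gb(n_params, double_quant=True)
print(f"4-bit: {plain:.1f} GB, with double quant: {dq:.1f} GB, "
      f"saved: {plain - dq:.1f} GB")
# prints "4-bit: 40.5 GB, with double quant: 37.1 GB, saved: 3.4 GB"
```

Under these assumptions, double quantization saves roughly 3 GB on the weights alone, which is the kind of margin that can matter when trying to fit everything else under a 48GB limit.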

0.34.0

26 Sep 01:14

Version 0.34.0

  • new model support: Meta-llama: Llama-3.2-11B-Vision-Instruct, Llama-3.2-90B-Vision-Instruct
  • new model support: Ai2/allenai Molmo family of models (requires additional pip install tensorflow-cpu for now, see note)
  • new model support: stepfun-ai/GOT-OCR2_0, an OCR-only model; all chat is ignored.
  • Support moved to alt image: Bunny-Llama-3-8B-V, Bunny-v1_1-Llama-3-8B-V, Mantis-8B-clip-llama3, Mantis-8B-siglip-llama3, omchat-v2.0-13B-single-beta_hf, qihoo360/360VL-8B

0.33.0

22 Sep 21:47

Version 0.33.0

  • new model support: mx262/MiniMonkey, thanks @white2018
  • Fix Qwen2-VL when used with Qwen-Agent and multiple system prompts (tools), thanks @cedonley
  • idefics2-8b support moved to alt image
  • pin Qwen2-VL-7B-Instruct-AWQ revision, see note for info