Home
AlpinDale edited this page Jan 8, 2024 · 5 revisions
Aphrodite Engine is designed for serving LLMs at scale. It supports most Hugging Face models, including Llama, Mistral, and Mixtral.
Aphrodite also supports several weight quantization methods for smaller-scale use cases. The currently supported methods are GPTQ, AWQ, and SqueezeLLM.
Please refer to the Installation page for instructions on how to use the engine.