Ruisi Cai1, Yeonju Ro1, Geon-Woo Kim1, Peihao Wang1, Babak Ehteshami Bejnordi2, Aditya Akella1, Zhangyang Wang1
1University of Texas at Austin, 2Qualcomm AI Research
The code is based on the Hugging Face Transformers repository; we modified src/transformers/model/modeling_llama.py to integrate the MoE-fication process.
The main scripts are located in the moefication directory. Start by running the preprocessing scripts, moefication/scripts/preprocess_1.sh and moefication/scripts/preprocess_2.sh, to generate the experts. After preprocessing, train the model using moefication/scripts/train.sh.
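A minimal sketch of that workflow is shown below. The script paths come from the repository layout described above; any model checkpoints, datasets, or hyperparameters are assumed to be configured inside the .sh files themselves rather than passed on the command line.

```bash
# Sketch of the Read-ME workflow described above (arguments assumed to be set
# inside the shell scripts; adjust paths/configs there before running).
bash moefication/scripts/preprocess_1.sh   # preprocessing, step 1: expert generation
bash moefication/scripts/preprocess_2.sh   # preprocessing, step 2: expert generation
bash moefication/scripts/train.sh          # train the model on the generated experts
```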
If you find this useful, please cite the following paper:
@inproceedings{cai2024textitreadme,
  title={$\textit{Read-{ME}}$: Refactorizing {LLM}s as Router-Decoupled Mixture of Experts with System Co-Design},
  author={Ruisi Cai and Yeonju Ro and Geon-Woo Kim and Peihao Wang and Babak Ehteshami Bejnordi and Aditya Akella and Zhangyang Wang},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024},
  url={https://openreview.net/forum?id=i8JaxY7tDI}
}