fix: Metal backend #150

Draft: wants to merge 2 commits into main
11 changes: 10 additions & 1 deletion README.md
@@ -120,7 +120,7 @@ python3 encodec.cpp/convert.py --dir-model ./models/ --out-dir ./ggml_weights/ -
mv ggml_weights/ggml-model.bin ggml_weights/encodec_weights.bin

# run the inference
- ./build/examples/main/main -m ./ggml_weights/ -em ./ggml_weights/encodec_weights.bin -p "this is an audio"
+ ./build/examples/main/main -m ./ggml_weights/ -em ./ggml_weights/encodec_weights.bin -t 4 -p "this is an audio"
```

### (Optional) Quantize weights
@@ -133,6 +133,15 @@ Note that to preserve audio quality, we do not quantize the codec model. The bul
./build/examples/quantize/quantize ./ggml_weights.bin ./ggml_weights_q4.bin q4_0
```

+ ### Using Metal
+
+ To build Bark with Metal backend support, run
+
+ ```bash
+ cmake -DGGML_METAL=ON ..
+ ./build/examples/main/main -m ./ggml_weights/ -ngl 100 -t 8 -p "this is an audio"
+ ```
+
### Seminal papers

- Bark
7 changes: 7 additions & 0 deletions bark.cpp
@@ -73,6 +73,13 @@ static void write_safe(std::ofstream& fout, T& dest) {
fout.write((char*)&dest, sizeof(T));
}

+ static void ggml_log_callback_default(ggml_log_level level, const char* text, void* user_data) {
+ (void)level;
+ (void)user_data;
+ fputs(text, stderr);
+ fflush(stderr);
+ }
+
static void bark_print_statistics(gpt_model* model) {
printf("\n\n");
printf("%s: sample time = %8.2f ms / %lld tokens\n", __func__, model->t_sample_us / 1000.0f, model->n_sample);
17 changes: 10 additions & 7 deletions examples/common.cpp
@@ -28,17 +28,18 @@ void bark_print_usage(char** argv, const bark_params& params) {
std::cout << "usage: " << argv[0] << " [options]\n"
<< "\n"
<< "options:\n"
- << " -h, --help show this help message and exit\n"
- << " -t N, --threads N number of threads to use during computation (default: " << params.n_threads << ")\n"
- << " -s N, --seed N seed for random number generator (default: " << params.seed << ")\n"
+ << " -h, --help show this help message and exit\n"
+ << " -t N, --threads N number of threads to use during computation (default: " << params.n_threads << ")\n"
+ << " -s N, --seed N seed for random number generator (default: " << params.seed << ")\n"
+ << " -ngl N, --n_gpu_layers N number of GPU layers (default: " << params.n_gpu_layers << ")\n"
<< " -p PROMPT, --prompt PROMPT\n"
- << " prompt to start generation with (default: random)\n"
+ << " prompt to start generation with (default: random)\n"
<< " -m FNAME, --model FNAME\n"
- << " model path (default: " << params.model_path << ")\n"
+ << " model path (default: " << params.model_path << ")\n"
<< " -em FNAME, --encodec_model_path FNAME\n"
- << " Encodec model path (default: " << params.encodec_model_path << ")\n"
+ << " Encodec model path (default: " << params.encodec_model_path << ")\n"
<< " -o FNAME, --outwav FNAME\n"
- << " output generated wav (default: " << params.dest_wav_path << ")\n"
+ << " output generated wav (default: " << params.dest_wav_path << ")\n"
<< "\n";
}

@@ -54,6 +55,8 @@ int bark_params_parse(int argc, char** argv, bark_params& params) {
params.model_path = argv[++i];
} else if (arg == "-em" || arg == "--encodec_model_path") {
params.encodec_model_path = argv[++i];
+ } else if (arg == "-ngl" || arg == "--n_gpu_layers") {
+ params.n_gpu_layers = std::stoi(argv[++i]);
} else if (arg == "-s" || arg == "--seed") {
params.seed = std::stoi(argv[++i]);
} else if (arg == "-o" || arg == "--outwav") {
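
Review note on argument parsing: when `-ngl` is the last argument, `argv[++i]` yields the terminating null pointer, and building a `std::string` from it for `std::stoi` is undefined behaviour; `std::stoi` also throws on non-numeric input. One possible guard is sketched below, under the assumption that a negative layer count is never meaningful. `parse_non_negative_int` is a hypothetical helper, not something in this PR.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads the integer that follows argv[i] and, on
// success, stores it in `out` and advances `i` past the value.
// Returns false (leaving `i` and `out` untouched) when the value is
// missing, is not a plain integer, or is negative.
static bool parse_non_negative_int(int argc, char** argv, int& i, int& out) {
    if (i + 1 >= argc) {
        return false;  // the flag was the last argument
    }
    const std::string value = argv[i + 1];
    try {
        std::size_t pos = 0;
        const int parsed = std::stoi(value, &pos);
        if (pos != value.size() || parsed < 0) {
            return false;  // trailing characters or a negative count
        }
        out = parsed;
    } catch (const std::exception&) {
        return false;  // std::invalid_argument or std::out_of_range
    }
    ++i;  // consume the value only after it parsed cleanly
    return true;
}
```

In the parser, the `-ngl` branch could then call `parse_non_negative_int(argc, argv, i, params.n_gpu_layers)` and print the usage text when it returns false. The same pattern would apply to `-s` if negative seeds are not expected, which is an assumption the diff does not confirm.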
3 changes: 3 additions & 0 deletions examples/common.h
@@ -6,6 +6,9 @@ struct bark_params {
// Number of threads used for audio generation.
int32_t n_threads = std::min(4, (int32_t)std::thread::hardware_concurrency());

+ // Number of GPU layers. Used for cuBLAS and Metal backends.
+ int32_t n_gpu_layers = 0;
+
// User prompt.
std::string prompt = "This is an audio generated by bark.cpp";
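
Review note on the `n_threads` default in the context lines of this hunk: `std::thread::hardware_concurrency()` is allowed to return 0 when the core count cannot be determined, in which case `std::min(4, ...)` evaluates to 0 threads. A possible guard is sketched below; `clamp_thread_count` and `default_thread_count` are hypothetical names, and the cap of 4 is simply copied from the existing default.

```cpp
#include <algorithm>
#include <cstdint>
#include <thread>

// Hypothetical helper: clamps a reported core count into the range [1, cap].
static int32_t clamp_thread_count(unsigned int reported, int32_t cap) {
    const int32_t n = std::min(cap, static_cast<int32_t>(reported));
    return std::max<int32_t>(1, n);  // never fall below one thread
}

// Default used when the user does not pass -t: at most 4 threads, at least 1.
static int32_t default_thread_count() {
    return clamp_thread_count(std::thread::hardware_concurrency(), 4);
}
```

With this in place, the struct member could be initialised as `int32_t n_threads = default_thread_count();`, which keeps the current behaviour on machines that report their core count.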

1 change: 1 addition & 0 deletions examples/main/main.cpp
@@ -37,6 +37,7 @@ int main(int argc, char **argv) {
exit(1);
}

+ bctx->n_gpu_layers = params.n_gpu_layers;
bctx->encodec_model_path = params.encodec_model_path;

// generate audio