Commit 386c5fb

Add Metal backend documentation to Whisper README (#15740)

1 parent 2aeee9b

File tree: 1 file changed (+51 −3 lines)

examples/models/whisper/README.md: 51 additions, 3 deletions
@@ -20,11 +20,25 @@ module to generate the spectrogram tensor.
 
 ## Build
 
-Currently we have CUDA build support only. CPU and Metal backend builds are WIP.
+Currently we have CUDA and Metal build support. CPU is WIP.
+
+For CUDA:
+```
+BUILD_BACKEND="EXECUTORCH_BUILD_CUDA"
+```
+
+For Metal:
+```
+BUILD_BACKEND="EXECUTORCH_BUILD_METAL"
+```
 
 ```bash
 # Install ExecuTorch libraries:
-cmake --preset llm -DEXECUTORCH_BUILD_CUDA=ON -DCMAKE_INSTALL_PREFIX=cmake-out -DCMAKE_BUILD_TYPE=Release . -Bcmake-out
+cmake --preset llm \
+  -D${BUILD_BACKEND}=ON \
+  -DCMAKE_INSTALL_PREFIX=cmake-out \
+  -DCMAKE_BUILD_TYPE=Release \
+  -Bcmake-out -S.
 cmake --build cmake-out -j$(nproc) --target install --config Release
 
 # Build the runner:
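Since the build now hinges on a single `BUILD_BACKEND` variable, a reader could pick it from the host platform instead of setting it by hand. A minimal sketch, assuming Metal is wanted on macOS and CUDA on Linux (this auto-detection helper is not part of the README):

```bash
# Hypothetical helper: choose the ExecuTorch backend flag from the host OS.
# The README sets BUILD_BACKEND manually; this is just a convenience sketch.
case "$(uname -s)" in
  Darwin) BUILD_BACKEND="EXECUTORCH_BUILD_METAL" ;;  # Metal on macOS
  Linux)  BUILD_BACKEND="EXECUTORCH_BUILD_CUDA"  ;;  # CUDA on Linux
  *)      echo "unsupported platform for this example" >&2; exit 1 ;;
esac
echo "selected backend flag: $BUILD_BACKEND"
```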
@@ -44,6 +58,8 @@ tokenizer target (`tokenizers::tokenizers`).
 
 Use [Optimum-ExecuTorch](https://github.com/huggingface/optimum-executorch) to export a Whisper model from Hugging Face:
 
+#### CUDA backend:
+
 ```bash
 optimum-cli export executorch \
   --model openai/whisper-small \
@@ -58,6 +74,23 @@ This command generates:
 - `model.pte` — Compiled Whisper model
 - `aoti_cuda_blob.ptd` — Weight data file for CUDA backend
 
+#### Metal backend:
+
+```bash
+optimum-cli export executorch \
+  --model openai/whisper-small \
+  --task automatic-speech-recognition \
+  --recipe metal \
+  --dtype bfloat16 \
+  --output_dir ./
+```
+
+This command generates:
+- `model.pte` — Compiled Whisper model
+- `aoti_metal_blob.ptd` — Weight data file for Metal backend
+
+### Preprocessor
+
 Export a preprocessor to convert raw audio to mel-spectrograms:
 
 ```bash
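The export hunks above list exactly which files each recipe should leave behind (`model.pte` plus a backend-specific `.ptd` blob), so a quick post-export check is easy to script. A hedged sketch, where the helper `missing_artifacts` is hypothetical and not part of Optimum-ExecuTorch:

```python
# Hypothetical sanity check for the exported artifacts (not part of the README
# or of Optimum-ExecuTorch): report which expected outputs are absent.
from pathlib import Path


def missing_artifacts(out_dir=".", backend="metal"):
    """Return the expected export outputs that are missing from out_dir.

    backend is "cuda" or "metal", matching the aoti_<backend>_blob.ptd
    naming shown in the README diff.
    """
    expected = ["model.pte", f"aoti_{backend}_blob.ptd"]
    return [name for name in expected if not (Path(out_dir) / name).exists()]
```

An empty return value means the export produced everything the runner will later ask for.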
@@ -71,7 +104,7 @@ python -m executorch.extension.audio.mel_spectrogram \
 
 ### Quantization
 
-Export quantized models to reduce size and improve performance:
+Export quantized models to reduce size and improve performance (not enabled for Metal yet):
 
 ```bash
 # 4-bit tile packed quantization for encoder
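The quantization hunk mentions 4-bit tile-packed weights for the encoder. As background, the storage idea can be sketched as nibble packing, i.e. two 4-bit values per byte; this is illustrative only, and ExecuTorch's actual tile-packed layout is more involved:

```python
# Illustrative nibble packing: store two 4-bit values per byte.
# This only demonstrates the 4-bit storage idea; it is NOT the tile-packed
# layout ExecuTorch uses for quantized encoder weights.
def pack_nibbles(vals):
    """Pack an even-length list of 4-bit ints (0..15) into bytes."""
    assert len(vals) % 2 == 0 and all(0 <= v < 16 for v in vals)
    return bytes((vals[i] << 4) | vals[i + 1] for i in range(0, len(vals), 2))


def unpack_nibbles(data):
    """Recover the list of 4-bit ints from packed bytes."""
    out = []
    for b in data:
        out.extend([b >> 4, b & 0xF])
    return out
```

Halving the bytes per weight is where the size reduction comes from; dequantization (scales and zero points) is a separate step not shown here.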
@@ -120,6 +153,8 @@ python -c "from datasets import load_dataset; import soundfile as sf; sample = l
 
 After building the runner (see [Build](#build) section), execute it with the exported model and audio:
 
+#### CUDA backend:
+
 ```bash
 # Set library path for CUDA dependencies
 export LD_LIBRARY_PATH=/opt/conda/lib:$LD_LIBRARY_PATH
@@ -133,3 +168,16 @@ cmake-out/examples/models/whisper/whisper_runner \
   --processor_path whisper_preprocessor.pte \
   --temperature 0
 ```
+
+#### Metal backend:
+
+```bash
+# Run the Whisper runner
+cmake-out/examples/models/whisper/whisper_runner \
+  --model_path model.pte \
+  --data_path aoti_metal_blob.ptd \
+  --tokenizer_path ./ \
+  --audio_path output.wav \
+  --processor_path whisper_preprocessor.pte \
+  --temperature 0
+```
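Both runner invocations expect an `output.wav`; the README obtains one from a Hugging Face dataset via the `soundfile` snippet visible in the hunk header. When that dataset is not handy, a synthetic tone is enough for a smoke test. A stdlib-only sketch (the helper name and tone parameters are assumptions, not from the README):

```python
# Hypothetical helper: write a short 16 kHz mono 16-bit PCM WAV so the runner
# has an input to chew on. Uses only the Python standard library.
import math
import struct
import wave


def write_test_wav(path="output.wav", seconds=1.0, rate=16000, freq=440.0):
    """Write a sine-tone WAV and return the number of frames written."""
    n = int(seconds * rate)
    frames = b"".join(
        struct.pack("<h", int(20000 * math.sin(2 * math.pi * freq * i / rate)))
        for i in range(n)
    )
    with wave.open(path, "wb") as w:
        w.setnchannels(1)      # mono, matching the dataset samples
        w.setsampwidth(2)      # 16-bit PCM
        w.setframerate(rate)   # 16 kHz, the rate Whisper preprocessing expects
        w.writeframes(frames)
    return n
```

The resulting file will not transcribe to meaningful text, but it exercises the full preprocessor-plus-runner path end to end.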
