Commit 1aa196f
[MoE Calibration] Simplify MoE calibration interface (#1851)
## Introduce standardized MoE calibration interface and deprecate legacy `replace_modules_for_calibration`
### Summary
Implements a simplified, decorator-based registration system for MoE
model calibration built on a single `MoECalibrationModule` base class.
This makes MoE model integration easier and deprecates the legacy
`replace_modules_for_calibration` function.
### Problem
MoE model calibration currently requires module replacement logic
scattered across `replace_modules_for_calibration` and manual context
management. This makes contributing new MoE model support difficult and
error-prone. Additionally, each model requires a custom replacement
function with duplicated boilerplate code.
### Relevant Issues
Fixes #1829
### Solution
**`MoECalibrationModule`**: abstract base class
- Only two methods to implement: a required `from_original()`
classmethod and an optional `restore()`
- `is_permanent` flag specifies whether the replacement stays in place
or is undone with `restore()`
- Clear contract: permanent modules stay in calibration form;
non-permanent modules are restored on context exit
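
The contract above can be sketched with Python's `abc` module. This is a toy illustration, not the class shipped in this commit: `SketchCalibrationModule` and `ToyCalibrationMoE` are hypothetical names, and the `restore(original)` signature and its default behaviour are assumptions.

```python
from abc import ABC, abstractmethod


class SketchCalibrationModule(ABC):
    """Hypothetical stand-in for the base class described above."""

    # If True, the replacement stays in the model after calibration.
    is_permanent = False

    @classmethod
    @abstractmethod
    def from_original(cls, original, config, calibrate_all_experts=True):
        """Build the calibration module from the original MoE block."""

    def restore(self, original):
        """Return the module to put back once calibration is done.

        Assumption in this sketch: the default hands back the untouched original.
        """
        return original


class ToyCalibrationMoE(SketchCalibrationModule):
    is_permanent = False

    def __init__(self, config, original, calibrate_all_experts):
        self.config = config
        self.original = original
        self.calibrate_all_experts = calibrate_all_experts

    @classmethod
    def from_original(cls, original, config, calibrate_all_experts=True):
        return cls(config, original, calibrate_all_experts)
```

Because `from_original()` is abstract, a subclass that forgets to define it cannot be instantiated, while `restore()` only needs overriding when the default is not enough.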
**Decorator-Based Registration**:
`@register_moe_calibration("ModuleName")` decorator
- Automatic registration in `MOE_CALIBRATION_MODULES` registry
- Models self-register when their module is imported
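
A decorator-based registry of this kind can be sketched in a few lines. The names `SKETCH_REGISTRY`, `register_sketch`, and `lookup` are hypothetical, and looking entries up by the module's class name is an assumption about how the string key is used.

```python
# Hypothetical registry; the commit message names the real one MOE_CALIBRATION_MODULES.
SKETCH_REGISTRY = {}


def register_sketch(module_class_name):
    """Return a class decorator that records the class under the given name."""
    def decorator(cls):
        SKETCH_REGISTRY[module_class_name] = cls
        return cls  # the class itself is returned unchanged
    return decorator


@register_sketch("ToyMoEBlock")
class CalibrationToyMoE:
    is_permanent = True


def lookup(module):
    """Find the calibration class registered for a module's class name, if any.

    Assumption: the registry key is matched against type(module).__name__.
    """
    return SKETCH_REGISTRY.get(type(module).__name__)
```

Registration happens as a side effect of the class definition running, which is why importing the module that defines the class is enough to make it discoverable.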
**New Model Integration**: Adding MoE support requires only:
```python
@register_moe_calibration("YourMoEModule")
class CalibrationYourMoE(MoECalibrationModule):
    is_permanent = True  # or False

    @classmethod
    def from_original(cls, original, config, calibrate_all_experts=True):
        return cls(config, original, calibrate_all_experts)
```
**Dataset Arguments**: New `moe_calibrate_all_experts: bool = True`
argument controls whether all experts see all tokens during calibration
- `True` (default): All experts receive all tokens for proper
quantization statistics
- `False`: Normal routing behavior (only routed experts are used)
- Used by both `oneshot()` and `DatasetArguments`
- Automatically passed to `moe_calibration_context` by pipelines
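
The difference between the two settings can be illustrated with a toy top-1 MoE layer in plain Python. This is a sketch, not the library's forward pass: `moe_forward` is a hypothetical function, and keeping the layer output tied to the routed expert in both modes is an assumption of the sketch.

```python
def moe_forward(tokens, experts, router, calibrate_all_experts=True):
    """Toy top-1 MoE layer over a list of scalar tokens.

    Returns (outputs, tokens_seen), where tokens_seen[i] counts the tokens
    expert i was actually run on.
    """
    routed = [router(t) for t in tokens]
    tokens_seen = [0] * len(experts)
    outputs = [None] * len(tokens)
    for idx, expert in enumerate(experts):
        if calibrate_all_experts:
            # Every expert processes every token, so each one collects statistics.
            results = [expert(t) for t in tokens]
            tokens_seen[idx] = len(tokens)
            selected = {pos: results[pos] for pos, r in enumerate(routed) if r == idx}
        else:
            # Normal routing: the expert only runs on the tokens routed to it.
            selected = {pos: expert(tokens[pos]) for pos, r in enumerate(routed) if r == idx}
            tokens_seen[idx] = len(selected)
        # Only the routed expert's result contributes to the layer output.
        for pos, value in selected.items():
            outputs[pos] = value
    return outputs, tokens_seen


experts = [lambda x: x + 1, lambda x: x * 10, lambda x: -x]
router = lambda t: 0 if t < 2 else 1  # expert 2 is never chosen by the router

out_all, seen_all = moe_forward([1, 2, 3], experts, router, calibrate_all_experts=True)
out_routed, seen_routed = moe_forward([1, 2, 3], experts, router, calibrate_all_experts=False)
```

In this toy setup the outputs are identical in both modes, but with normal routing the third expert never sees a token and would have no activation statistics to quantize against.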
**Automatic Context Management**: `moe_calibration_context` integrated
into pipelines
- Wraps calibration automatically in `oneshot.py`
- Handles module replacement and restoration transparently
- No manual context management required by users
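
The replace-then-restore behaviour can be sketched with `contextlib.contextmanager`. Everything here is a toy: `ToyModel`, `ToyCalibrationWrapper`, and `sketch_calibration_context` are hypothetical names, and the real `moe_calibration_context` operates on actual model modules rather than a dict.

```python
from contextlib import contextmanager


class ToyModel:
    """Hypothetical model whose submodules live in a plain dict."""

    def __init__(self, modules):
        self.modules = dict(modules)


class ToyCalibrationWrapper:
    """Hypothetical calibration replacement that remembers the original."""

    def __init__(self, original, is_permanent):
        self.original = original
        self.is_permanent = is_permanent

    def restore(self):
        return self.original


@contextmanager
def sketch_calibration_context(model, should_wrap, is_permanent=False):
    """Swap matching modules for wrappers, then undo non-permanent swaps on exit."""
    replaced = {}
    for name, module in list(model.modules.items()):
        if should_wrap(module):
            wrapper = ToyCalibrationWrapper(module, is_permanent)
            model.modules[name] = wrapper
            replaced[name] = wrapper
    try:
        yield model
    finally:
        # Runs even if calibration raises, so the model is not left half-swapped.
        for name, wrapper in replaced.items():
            if not wrapper.is_permanent:
                model.modules[name] = wrapper.restore()
```

Putting the restoration in `finally` is the design choice that lets callers skip manual cleanup: the swap is undone on normal exit and on error alike.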
**Backward Compatibility**: Deprecation of
`replace_modules_for_calibration` with warnings
- Legacy function preserved for compatibility
- Clear migration path documented in deprecation message
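
A deprecation shim of this general shape can be written with the standard `warnings` module. The function names below are hypothetical, and whether the real legacy function uses `DeprecationWarning` or delegates to the new code path is an assumption.

```python
import warnings


def new_entry_point(model, calibrate_all_experts=True):
    """Hypothetical replacement API; here it only records its arguments."""
    return {"model": model, "calibrate_all_experts": calibrate_all_experts}


def legacy_entry_point(model, calibrate_all_experts=True):
    """Hypothetical legacy function kept as a thin shim that warns."""
    warnings.warn(
        "legacy_entry_point is deprecated; use new_entry_point instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to the shim
    )
    return new_entry_point(model, calibrate_all_experts)
```

Existing callers keep working, and the message text tells them which call to migrate to.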
### Test Plan
- ✅ Unit tests for contextual MoE calibration with automatic module
restoration
- ✅ Unit tests for permanent MoE calibration persistence
- ✅ Integration tests with Qwen3, Llama4, and DeepSeek V3 models
- ✅ Verification that all experts receive data during calibration
- ✅ Deprecation warning verification for legacy functions
### Testing
- ✅ All unit tests pass
- ✅ Calibration types working correctly
- ✅ Model structure correctly modified and restored inside/outside
contexts
- ✅ Linting and type checking pass
- ✅ Backward compatibility verified with deprecation warnings
### Migration Guide
**Before**:
```python
# Required defining MoEModelConfig entries, handling context manually
from llmcompressor.modeling.prepare import replace_modules_for_calibration
model = replace_modules_for_calibration(model, calibrate_all_experts=True)
```
**After**:
```python
# Automatic - just use moe_calibration_context
from llmcompressor.modeling import moe_calibration_context

with moe_calibration_context(model, calibrate_all_experts=True):
    # Run calibration - modules replaced automatically
    for batch in dataloader:
        model(**batch)
# Modules restored automatically (if not permanent)
```
---------
Signed-off-by: Sairam Pillai <sairam.pillai61@gmail.com>

1 parent 0f346cf commit 1aa196f
File tree

19 files changed, +460 -225 lines changed

- examples
  - multimodal_vision
  - quantization_w4a4_fp4
  - quantization_w8a8_fp8
  - quantizing_moe
- src/llmcompressor
  - args
  - entrypoints
  - modeling
  - pipelines
    - basic
    - layer_sequential
    - sequential
- tests/llmcompressor/modeling