
Commit ef2af91

Update AutoRound layer_config usage (#2331)
Signed-off-by: Kaihui-intel <kaihui.tang@intel.com>
1 parent b6336d4 commit ef2af91

File tree

1 file changed: +19 -0 lines changed


docs/source/3x/PT_WeightOnlyQuant.md

Lines changed: 19 additions & 0 deletions
@@ -178,6 +178,8 @@ model = convert(model, config) # after this step, the model is ready for W4A8 i
 | not_use_best_mse (bool) | Whether to use mean squared error | False |
 | dynamic_max_gap (int) | The dynamic maximum gap | -1 |
 | scale_dtype (str) | The data type of quantization scale to be used, different kernels have different choices | "float16" |
+| scheme (str) | A preset scheme that defines the quantization configurations | "W4A16" |
+| layer_config (dict) | Layer-wise quantization config | None |
 
 ``` python
 # Quantization code
@@ -283,6 +285,23 @@ quant_config = RTNConfig()
 lm_head_config = RTNConfig(dtype="fp32")
 quant_config.set_local("lm_head", lm_head_config)
 ```
+3. Example of using `layer_config` for AutoRound
+```python
+# layer_config = {
+#     "layer1": {
+#         "data_type": "int",
+#         "bits": 3,
+#         "group_size": 128,
+#         "sym": True,
+#     },
+#     "layer2": {
+#         "W8A16"
+#     }
+# }
+# Use the AutoRound specific 'layer_config' instead of the 'set_local' API.
+layer_config = {"lm_head": {"data_type": "int"}}
+quant_config = AutoRoundConfig(layer_config=layer_config)
+```
 
 ### Saving and Loading
 
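For reference, a minimal sketch of how the new `scheme` and `layer_config` arguments might be used together, assuming the `prepare`/`convert` flow that `PT_WeightOnlyQuant.md` already documents for AutoRound. Here `user_model` and `calib_fn` are hypothetical placeholders (the float model and a calibration loop), and combining `scheme` with `layer_config` in one config is an assumption based on the table rows added above.

```python
from neural_compressor.torch.quantization import AutoRoundConfig, convert, prepare

# Most layers follow the "W4A16" preset scheme; lm_head gets a per-layer override
# through the new layer_config argument, mirroring the example in the diff above.
layer_config = {"lm_head": {"data_type": "int"}}
quant_config = AutoRoundConfig(scheme="W4A16", layer_config=layer_config)

model = prepare(user_model, quant_config)  # user_model: placeholder FP32 model to quantize
calib_fn(model)                            # calib_fn: hypothetical calibration pass over sample data
model = convert(model)                     # produces the weight-only quantized model
```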