Commit 0e4d3a0
[Cache] Fix environment variable handling for offline mode (#1902)
SUMMARY:
Previously, llm-compressor ignored HF_HUB_CACHE and other environment
variables when loading models and datasets, making offline mode
difficult to use with unified cache directories.
This change:
- Removes the hard-coded TRANSFORMERS_CACHE in model_load/helpers.py so
  that the HF_HOME and HF_HUB_CACHE environment variables are respected
- Propagates cache_dir from model_args to dataset_args to enable a unified
  cache directory for both models and datasets
- Updates dataset loading to use the cache_dir parameter instead of a
  hard-coded None
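The propagation described in the second bullet can be pictured with a minimal sketch. The `ModelArgs` and `DatasetArgs` dataclasses and the `propagate_cache_dir` helper below are hypothetical stand-ins, not the real llm-compressor classes, and the "only fill in when unset" rule is an assumption about the intended behavior:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelArgs:  # hypothetical stand-in, not the real llm-compressor class
    model: str
    cache_dir: Optional[str] = None


@dataclass
class DatasetArgs:  # hypothetical stand-in, not the real llm-compressor class
    dataset: str
    cache_dir: Optional[str] = None


def propagate_cache_dir(model_args: ModelArgs, dataset_args: DatasetArgs) -> None:
    """Copy the model cache_dir to the dataset args unless one is already set.

    Assumption: an explicit dataset cache_dir should win over the model one.
    """
    if dataset_args.cache_dir is None:
        dataset_args.cache_dir = model_args.cache_dir


model_args = ModelArgs(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0", cache_dir="./hf-hub")
dataset_args = DatasetArgs(dataset="open_platypus")
propagate_cache_dir(model_args, dataset_args)
print(dataset_args.cache_dir)  # prints ./hf-hub
```

With one value shared this way, a single `cache_dir` argument is enough to point both the model and the dataset loaders at the same directory.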
Now users can specify the cache_dir parameter or use the HF_HOME/HF_HUB_CACHE
environment variables for true offline operation.
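As a rough illustration of the lookup order this relies on, the sketch below resolves a hub cache directory from an explicit `cache_dir`, then `HF_HUB_CACHE`, then `HF_HOME`. The function name `resolve_hub_cache` is hypothetical, and the sketch simplifies the real Hugging Face behavior (for example, it ignores `XDG_CACHE_HOME` and does not expand `~`):

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_hub_cache(
    cache_dir: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    """Hypothetical helper: pick the hub cache directory in priority order."""
    env = os.environ if env is None else env
    # 1. An explicit cache_dir argument wins over any environment variable.
    if cache_dir:
        return Path(cache_dir)
    # 2. HF_HUB_CACHE points directly at the hub cache.
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"])
    # 3. Otherwise the hub cache lives in the "hub" folder under HF_HOME,
    #    which itself defaults to ~/.cache/huggingface.
    hf_home = env.get("HF_HOME") or str(Path.home() / ".cache" / "huggingface")
    return Path(hf_home) / "hub"


print(resolve_hub_cache(env={"HF_HOME": "./hf-hub"}))
```

A hard-coded cache path inside the library would short-circuit steps 2 and 3, which is why removing it lets the environment variables take effect.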
Offline mode is especially helpful for supply-chain security use cases: it
helps us generate trustworthy SBOMs for AI artifacts. 🔐 🧠
TEST PLAN:
I started with the oneshot example from the README and saved it as
`example.py`:
```python
""" This is the example from the README """
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor import oneshot

# Apply SmoothQuant first, then quantize Linear layers to W8A8 with GPTQ
recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(scheme="W8A8", targets="Linear", ignore=["lm_head"]),
]

# Downloads (or loads from cache) the model and calibration dataset
oneshot(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    dataset="open_platypus",
    recipe=recipe,
    output_dir="TinyLlama-1.1B-Chat-v1.0-INT8",
    max_seq_length=2048,
    num_calibration_samples=512,
)
```
Next, remove your local Hugging Face cache so that nothing is available
to your system yet:
```bash
❯ rm -rf ~/.cache/huggingface
```
Then, run `example.py` with the HF_HUB_OFFLINE=1 env var. This should
fail, proving that you have nothing cached.
```bash
❯ HF_HUB_OFFLINE=1 python example.py
Traceback (most recent call last):
File "/home/rbean/code/llm-compressor/testtest/lib64/python3.13/site-packages/transformers/utils/hub.py", line 479, in cached_files
...
<snip>
...
OSError: We couldn't connect to 'https://huggingface.co' to load the files, and couldn't find them in the cached files.
Check your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
```
Good. Now, run it with `HF_HOME=./hf-hub`, which runs it in online
mode and populates the cache in a new, non-standard location (just to be
sure things don't get mixed up during our test):
```bash
❯ HF_HOME=./hf-hub python example.py
<lots of downloading happens, but you can ctrl-C when it gets into the real compression work>
```
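To confirm that the download step actually populated the new location, one option is to list the repo folders in the hub cache. This is a minimal sketch: `list_cached_repos` is a hypothetical helper, and it assumes the standard hub cache layout in which each repo is stored in a folder named like `models--<org>--<name>` or `datasets--<org>--<name>`:

```python
from pathlib import Path


def list_cached_repos(hub_cache: Path) -> list[str]:
    """Hypothetical helper: return the model and dataset folders in a hub cache."""
    repos: list[str] = []
    if not hub_cache.is_dir():
        return repos
    for entry in sorted(hub_cache.iterdir()):
        # Skip lock folders and anything that is not a model or dataset repo.
        if entry.is_dir() and entry.name.startswith(("models--", "datasets--")):
            repos.append(entry.name)
    return repos


# With HF_HOME=./hf-hub, the hub cache is expected under ./hf-hub/hub.
print(list_cached_repos(Path("./hf-hub/hub")))
```

After the online run, one would expect the list to include a folder for the TinyLlama model. Datasets loaded through the `datasets` library may also keep processed files in a separate folder under `HF_HOME`, which this sketch does not inspect.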
Now, finally, you can run with both HF_HOME and HF_HUB_OFFLINE=1 and
prove to yourself that llm-compressor uses that freshly-populated cache
for both the model and the dataset.
```bash
❯ HF_HOME=./hf-hub HF_HUB_OFFLINE=1 python example.py
<it works!>
```
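The final step can also be scripted, for instance in a CI job. The sketch below builds the same environment as the command above and runs a script with it. The helper names `offline_env` and `run_offline` are hypothetical and not part of llm-compressor:

```python
import os
import subprocess
import sys


def offline_env(hf_home: str) -> dict:
    """Return a copy of the current environment set up for offline use."""
    env = dict(os.environ)  # copy, so the parent environment is untouched
    env["HF_HOME"] = hf_home
    env["HF_HUB_OFFLINE"] = "1"
    return env


def run_offline(script: str, hf_home: str) -> int:
    """Run a Python script offline against the given cache; return its exit code."""
    result = subprocess.run([sys.executable, script], env=offline_env(hf_home))
    return result.returncode
```

Calling `run_offline("example.py", "./hf-hub")` would then mirror the manual command, with a return code of 0 indicating that everything was served from the cache.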
---------
Signed-off-by: Ralph Bean <rbean@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Brian Dellabetta <brian-dellabetta@users.noreply.github.com>

1 parent 51ff37d commit 0e4d3a0
File tree
6 files changed, +9 -16 lines changed
- src/llmcompressor
  - args
  - entrypoints
  - pytorch/model_load
  - transformers/finetune/data
- tests/llmcompressor/transformers/finetune/data