Gadget via Lambda that references an allowed function (not serialized Python bytecode).

Important limitation:

- Lambda.call() prepends the input tensor as the first positional argument when invoking the target callable. Chosen gadgets must tolerate an extra positional arg (or accept *args/**kwargs). This constrains which functions are viable; see the sketch below.
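
A minimal sketch of the calling-convention constraint in plain Python (the `invoke` helper and `tolerant` callable are hypothetical stand-ins; `np.zeros` stands in for the input tensor):

```python
import os

import numpy as np

x = np.zeros((1, 4))  # stands in for the input tensor Keras will prepend

# Rough model of Lambda.call(): target(inputs, **arguments)
def invoke(target, inputs, arguments):
    return target(inputs, **arguments)

def tolerant(*args, **kwargs):
    return args[0]  # absorbs the prepended tensor, so it survives the call

invoke(tolerant, x, {"extra": "value"})  # works: *args/**kwargs soak it up

# invoke(os.system, x, {})  # fails: os.system() expects a string command,
#                           # but the prepended tensor occupies that slot
```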
Effects achievable through such gadgets include:

- Network callbacks/SSRF-like effects, depending on the environment
- Chaining to code execution if written paths are later imported/executed or added to PYTHONPATH, or if a writable execution-on-write location exists
## ML pickle import allowlisting for AI/ML models (Fickling)
Many AI/ML model formats (PyTorch .pt/.pth/.ckpt, joblib/scikit-learn, older TensorFlow artifacts, etc.) embed Python pickle data. Attackers routinely abuse pickle GLOBAL imports and object constructors to achieve RCE or model swapping during load. Blacklist-based scanners often miss novel or unlisted dangerous imports.
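
As a refresher, the classic primitive looks like this (a deliberately harmless sketch; do not load pickles you did not produce):

```python
import os
import pickle

class Evil:
    def __reduce__(self):
        # Pickled as a GLOBAL reference to os.system plus REDUCE,
        # so the command runs at load time, not at dump time
        return (os.system, ("id",))

blob = pickle.dumps(Evil())
# pickle.loads(blob)  # uncommenting executes `id` during deserialization
```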
A practical fail-closed defense is to hook Python’s pickle deserializer and only allow a reviewed set of harmless ML-related imports during unpickling. Trail of Bits’ Fickling implements this policy and ships a curated ML import allowlist built from thousands of public Hugging Face pickles.

Security model for “safe” imports (intuitions distilled from research and practice): every symbol a pickle imports must satisfy all of the following at once:
- Not execute code or cause execution (no compiled/source code objects, shelling out, hooks, etc.)
- Not get/set arbitrary attributes or items
- Not import or obtain references to other Python objects from the pickle VM
- Not trigger any secondary deserializers (e.g., marshal, nested pickle), even indirectly
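
For intuition, some Python imports that violate each rule (illustrative examples only, not Fickling's actual policy lists):

```python
# Illustrative mapping; Fickling's real policy is an allowlist, this just
# shows the kinds of imports each rule is designed to keep out.
RULE_VIOLATIONS = {
    "executes code":             ["builtins.eval", "builtins.exec", "os.system"],
    "gets/sets attrs or items":  ["builtins.getattr", "builtins.setattr"],
    "obtains object references": ["builtins.globals", "importlib.import_module"],
    "secondary deserializers":   ["marshal.loads", "pickle.loads"],
}
```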
Enable Fickling’s protections as early as possible in process startup so that any pickle loads performed by frameworks (torch.load, joblib.load, etc.) are checked:
```python
import fickling
# Sets global hooks on the stdlib pickle module
fickling.hook.activate_safe_ml_environment()
```
Operational tips:
- You can temporarily disable/re-enable the hooks where needed:
```python
fickling.hook.deactivate_safe_ml_environment()
# ... load fully trusted files only ...
fickling.hook.activate_safe_ml_environment()
```
- If a known-good model is blocked, extend the allowlist for your environment after reviewing the symbols:
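
A sketch of one way to do this, assuming `activate_safe_ml_environment` accepts an `also_allow` list of fully qualified symbol names (the symbol below is a hypothetical example):

```python
import fickling

# Re-activate the hooks with extra, manually reviewed symbols permitted
# (assumes the also_allow parameter; check your Fickling version)
fickling.hook.activate_safe_ml_environment(
    also_allow=["some_pkg.layers.CustomLayer"]  # hypothetical symbol
)
```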
- Fickling also exposes generic runtime guards if you prefer more granular control (see the sketch after this list):
- fickling.always_check_safety() to enforce checks for all pickle.load()
- with fickling.check_safety(): for scoped enforcement
- fickling.load(path) / fickling.is_likely_safe(path) for one-off checks
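
A short usage sketch combining the guards listed above (`model.pkl` is a placeholder path):

```python
import pickle

import fickling

# Process-wide: every pickle.load() from now on is checked
fickling.always_check_safety()

# Scoped: enforce only inside this block
with fickling.check_safety():
    with open("model.pkl", "rb") as f:  # placeholder path
        obj = pickle.load(f)

# One-off: vet or load a single file
if fickling.is_likely_safe("model.pkl"):
    obj = fickling.load("model.pkl")
```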
- Prefer non-pickle model formats when possible (e.g., SafeTensors). If you must accept pickle, run loaders under least privilege without network egress and enforce the allowlist.
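
For example, with the `safetensors` package (a minimal sketch; file and tensor names are illustrative):

```python
import torch
from safetensors.torch import load_file, save_file

# SafeTensors stores raw tensor data plus JSON metadata: there is no
# code object to execute at load time, unlike pickle
save_file({"weight": torch.zeros(2, 2)}, "model.safetensors")
tensors = load_file("model.safetensors")  # dict of name -> tensor
```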
This allowlist-first strategy demonstrably blocks common ML pickle exploit paths while keeping compatibility high. In ToB’s benchmark, Fickling flagged 100% of synthetic malicious files and allowed ~99% of clean files from top Hugging Face repos.
## Researcher toolkit
Repeat tests across codebases and formats (.keras vs legacy HDF5) to uncover regressions.