The vulnerability exists in the IndexLookup layer, the parent class of the StringLookup and IntegerLookup layers in Keras. The root cause is improper handling of the vocabulary parameter when a model is loaded with keras.saving.load_model, even when safe_mode=True.
An attacker can create a malicious .keras model file where a StringLookup layer's configuration contains a vocabulary key pointing to a local file path (e.g., /etc/passwd) or a URL. When a victim loads this model, the deserialization process reconstructs the layers. This triggers the IndexLookup.__init__ constructor, which receives the malicious path. The constructor then calls the IndexLookup.set_vocabulary method to load the vocabulary.
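As a defensive illustration of the configuration described above, the following minimal sketch inspects a .keras archive before loading it and reports lookup layers whose vocabulary is a string, that is, a path or URL rather than an inline list. It assumes the archive is a zip file containing a config.json whose layers are serialized as dictionaries with class_name and config keys; the helper names find_path_vocabularies and scan_keras_archive are hypothetical and not part of Keras.

```python
import json
import zipfile

# Layer class names derived from IndexLookup that accept a vocabulary argument.
LOOKUP_LAYERS = {"StringLookup", "IntegerLookup"}


def find_path_vocabularies(node):
    """Recursively collect (class_name, vocabulary) pairs where the
    vocabulary is a string (a path or URL) instead of an inline list."""
    hits = []
    if isinstance(node, dict):
        config = node.get("config")
        if node.get("class_name") in LOOKUP_LAYERS and isinstance(config, dict):
            vocabulary = config.get("vocabulary")
            if isinstance(vocabulary, str):
                hits.append((node["class_name"], vocabulary))
        # Keep walking: layers can be nested inside other models.
        for value in node.values():
            hits.extend(find_path_vocabularies(value))
    elif isinstance(node, list):
        for item in node:
            hits.extend(find_path_vocabularies(item))
    return hits


def scan_keras_archive(path_or_file):
    """Read config.json from a .keras archive without deserializing any
    layers, then scan it. Assumes the zip-based .keras layout."""
    with zipfile.ZipFile(path_or_file) as archive:
        config = json.loads(archive.read("config.json"))
    return find_path_vocabularies(config)
```

A non-empty result from scan_keras_archive("model.keras") would be a reason to refuse to load the file on an unpatched version. This check only covers the pattern described here and is not a general model-file scanner.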
Prior to the patch, the set_vocabulary method did not enforce the safe_mode restriction. It checked whether the vocabulary argument was a string (that is, a path) and, if so, used tf.io.gfile to read from that path. Since tf.io.gfile supports local (file://) and remote (http://, https://, gs://) protocols, this behavior could be exploited for either arbitrary local file reads or Server-Side Request Forgery (SSRF).
The patch addresses this by adding a serialization_lib.in_safe_mode() check inside set_vocabulary. When safe_mode is enabled, the method now raises a ValueError if it is asked to load a vocabulary from an external file path, which closes the vulnerability.
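The shape of that guard can be sketched in a self-contained way. This is a simplified stand-in, not the Keras source: the _STATE dictionary, safe_mode_scope, and this set_vocabulary function are hypothetical, and a plain open() replaces tf.io.gfile so the example runs without TensorFlow.

```python
import contextlib

# Hypothetical stand-in for the safe-mode state Keras tracks during loading.
_STATE = {"safe_mode": False}


def in_safe_mode():
    return _STATE["safe_mode"]


@contextlib.contextmanager
def safe_mode_scope(enabled):
    """Temporarily set the safe-mode flag, restoring the old value on exit."""
    previous = _STATE["safe_mode"]
    _STATE["safe_mode"] = enabled
    try:
        yield
    finally:
        _STATE["safe_mode"] = previous


def set_vocabulary(vocabulary):
    """Return the vocabulary as a list; a string is treated as a file path."""
    if isinstance(vocabulary, str):
        # The guard: refuse to touch the filesystem while in safe mode.
        if in_safe_mode():
            raise ValueError(
                "Loading a vocabulary from a file path is not allowed "
                "in safe mode."
            )
        with open(vocabulary, encoding="utf-8") as handle:
            return [line.rstrip("\n") for line in handle]
    return list(vocabulary)
```

Under safe_mode_scope(True), set_vocabulary("/etc/passwd") raises ValueError before any file is opened, while an inline list such as ["a", "b"] is still accepted.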
IndexLookup.set_vocabulary (keras/src/layers/preprocessing/index_lookup.py)
IndexLookup.__init__ (keras/src/layers/preprocessing/index_lookup.py)
| Package Name | Ecosystem | Vulnerable Versions | First Patched Version |
|---|---|---|---|
| keras | pip | < 3.12.0 | 3.12.0 |
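To check an environment against the version range in the table, a minimal sketch like the following compares release numbers numerically. The helpers parse_release, is_affected, and installed_keras_is_affected are hypothetical, and the parser deliberately ignores pre-release suffixes (for example, 3.12.0rc1 is treated as 3.12.0), so treat it as a rough check rather than a full version comparison.

```python
from importlib import metadata

FIRST_PATCHED = (3, 12, 0)  # first patched release, taken from the table above


def parse_release(version):
    """Turn '3.11.3' into (3, 11, 3), keeping only the leading digits of
    each of the first three dot-separated components."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_affected(version):
    """True if the given version string falls in the vulnerable range."""
    return parse_release(version) < FIRST_PATCHED


def installed_keras_is_affected():
    """Check the installed keras distribution; None if it is not installed."""
    try:
        return is_affected(metadata.version("keras"))
    except metadata.PackageNotFoundError:
        return None
```

Comparing tuples of integers avoids the string-comparison trap where "3.9.2" would sort after "3.12.0".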