The vulnerability is a serialization injection flaw in LangChain. It stems from two problems that, combined, allow secret exfiltration and, potentially, remote code execution.
First, the serialization functions langchain_core.load.dump.dumps and langchain_core.load.dump.dumpd failed to sanitize user-provided data. Specifically, they did not escape dictionaries that contained the reserved lc key, which LangChain uses to mark its own serialized objects. By supplying input that carries this key, an attacker could make arbitrary data indistinguishable from a legitimate LangChain object in the serialized output.
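The missing escaping step can be illustrated with a toy serializer. This is a minimal sketch using only the standard library; naive_dumps is a hypothetical stand-in and is not LangChain's dumps implementation. It only shows that a user-supplied dictionary passes through byte-for-byte, reserved key included.

```python
import json


def naive_dumps(obj):
    """Hypothetical serializer that, like the unpatched code path,
    writes user dictionaries out without escaping the reserved key."""
    return json.dumps(obj)


# Attacker-controlled metadata shaped like a serialized object marker.
user_metadata = {"lc": 1, "type": "secret", "id": ["ENV_VAR"]}

payload = naive_dumps({"content": "hello", "metadata": user_metadata})
decoded = json.loads(payload)

# Nothing in the output distinguishes this user data from a real marker.
looks_like_lc_object = "lc" in decoded["metadata"]
```

After the round trip, decoded["metadata"] is identical to what the attacker supplied, so any consumer that keys on lc has no way to tell it came from user input.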
Second, the deserialization functions langchain_core.load.load.load and langchain_core.load.load.loads had insecure default settings. Most critically, the secrets_from_env parameter was set to True by default. This meant that if a serialized object contained a special "secret" structure, the deserializer would attempt to read the value of a specified environment variable and include it in the resulting object.
An attacker could exploit this by injecting a malicious dictionary (e.g., {'lc': 1, 'type': 'secret', 'id': ['ENV_VAR']}) into data that would be processed by dumps or dumpd. The resulting serialized string, when passed to load or loads, would cause the deserializer to read the ENV_VAR environment variable and place its value in the loaded object, exposing it to the attacker wherever that object is later returned or displayed.
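The end-to-end flow can be modeled in a few lines. The sketch below is a toy model under stated assumptions: toy_loads and the DEMO_SECRET variable are hypothetical, and the logic imitates the described behavior rather than reproducing LangChain's loader.

```python
import json
import os


def toy_loads(text, secrets_from_env=True):
    """Toy deserializer: resolves 'secret' nodes from the environment
    when secrets_from_env is True (the insecure default described above)."""

    def revive(node):
        if isinstance(node, dict):
            if node.get("lc") == 1 and node.get("type") == "secret":
                key = node["id"][0]
                if secrets_from_env and key in os.environ:
                    return os.environ[key]
                return None  # secret left unresolved
            return {k: revive(v) for k, v in node.items()}
        if isinstance(node, list):
            return [revive(v) for v in node]
        return node

    return revive(json.loads(text))


os.environ["DEMO_SECRET"] = "s3cr3t"  # hypothetical variable for the demo

# Attacker-supplied value that ends up inside otherwise ordinary data.
injected = {"lc": 1, "type": "secret", "id": ["DEMO_SECRET"]}
serialized = json.dumps({"metadata": {"note": injected}})

leaked = toy_loads(serialized)                        # insecure default
safe = toy_loads(serialized, secrets_from_env=False)  # hardened setting
```

With the insecure default, leaked["metadata"]["note"] holds the environment value; with secrets_from_env=False the node resolves to None in this model.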
The patch addresses both issues. It introduces an escaping mechanism in dumps and dumpd to prevent user data from being misinterpreted as LangChain objects. It also changes the defaults in load and loads to be secure, setting secrets_from_env to False and adding an allowlist (allowed_objects) for classes that can be deserialized.
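The two mitigations can be sketched together as follows. This is an illustration of the general technique, not the patch itself: the wrapper key __escaped__, the helper names, and the representation of allowed_objects as a collection of id tuples are all assumptions made for the example.

```python
import json

ESCAPE_KEY = "__escaped__"  # hypothetical wrapper key, chosen for this sketch


def escape_user_data(node):
    """Wrap any user dict that carries the reserved key (or the wrapper key
    itself) so a loader will treat it as plain data."""
    if isinstance(node, dict):
        inner = {k: escape_user_data(v) for k, v in node.items()}
        if "lc" in node or ESCAPE_KEY in node:
            return {ESCAPE_KEY: inner}
        return inner
    if isinstance(node, list):
        return [escape_user_data(v) for v in node]
    return node


def toy_load(node, allowed_objects=()):
    """Toy loader: unwraps escaped dicts as data and refuses to revive
    any marked object whose id is not on the allowlist."""
    if isinstance(node, dict):
        if set(node) == {ESCAPE_KEY} and isinstance(node[ESCAPE_KEY], dict):
            wrapped = node[ESCAPE_KEY]
            return {k: toy_load(v, allowed_objects) for k, v in wrapped.items()}
        if "lc" in node:
            type_id = tuple(node.get("id", ()))
            if type_id not in allowed_objects:
                raise ValueError(f"refusing to revive {type_id!r}")
            return ("revived", type_id)  # placeholder for real construction
        return {k: toy_load(v, allowed_objects) for k, v in node.items()}
    if isinstance(node, list):
        return [toy_load(v, allowed_objects) for v in node]
    return node


injected = {"lc": 1, "type": "secret", "id": ["DEMO_SECRET"]}
document = {"content": "hi", "metadata": injected}

# Escaped path: the injected dict survives as inert data.
wire = json.dumps(escape_user_data(document))
round_tripped = toy_load(json.loads(wire))

# Unescaped path: the allowlist is the second line of defense.
try:
    toy_load(json.loads(json.dumps(document)))
    rejected = False
except ValueError:
    rejected = True
```

Escaping at serialization time handles data that passes through dumps or dumpd, while the allowlist covers serialized input that reaches the loader from elsewhere; the sketch keeps both so each can be seen failing closed on its own.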