The vulnerability existed in the pypickle library prior to version 2.0.0, specifically within its deserialization mechanism. The core issue was that the main pypickle.pypickle.load function provided an option (safe=False) to bypass its default safer unpickling process. When safe=False was used, load would delegate the deserialization to an internal helper function, pypickle.pypickle.load_unsafe.
The load_unsafe function then used Python's built-in pickle.load directly, without any checks on the content being deserialized. This is a well-known security risk, as specially crafted pickle files can execute arbitrary code upon deserialization.
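The risk described above can be illustrated with a minimal, harmless sketch using only the standard library. The Payload class and the expression it evaluates are hypothetical and are not taken from pypickle or from any real exploit; the point is only that pickle.loads calls whatever callable the pickle stream names.

```python
import pickle


class Payload:
    """Hypothetical object whose pickle form names a callable to run on load."""

    def __reduce__(self):
        # pickle stores "call eval('1 + 1')" instead of the object's state.
        # A real attacker would name something like os.system here.
        return (eval, ("1 + 1",))


payload_bytes = pickle.dumps(Payload())

# Unpickling does not rebuild a Payload instance; it calls eval with the
# stored argument and returns whatever eval returns.
result = pickle.loads(payload_bytes)
print(result)
```

Because the callable runs during deserialization itself, inspecting the returned object afterwards is too late to prevent the side effect.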
Thus, an attacker with local access (to provide a malicious pickle file) could exploit this by calling pypickle.load('malicious.pkl', safe=False), leading to potential code execution. Both load (for allowing the unsafe path) and load_unsafe (for performing the unsafe deserialization) were critical to the vulnerability.
The patch (commit 14b4cae704a0bb4eb6723e238f25382d847a1917) remediated this by:
- Removing the load_unsafe function entirely.
- Replacing the boolean safe parameter in the load function with a more robust validate parameter and an allowlist mechanism.
- Introducing a ValidateUnpickler class that checks for allowed modules before actual deserialization occurs, even if full validation is requested.
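The allowlist idea in the list above can be sketched with the standard library's pickle.Unpickler.find_class hook. This is an illustration under stated assumptions, not pypickle's actual ValidateUnpickler: the names AllowlistUnpickler, ALLOWED_MODULES, and restricted_loads are hypothetical, as is the choice of allowed modules.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical allowlist: only globals from these top-level modules may load.
ALLOWED_MODULES = {"collections"}


class AllowlistUnpickler(pickle.Unpickler):
    """Sketch of an unpickler that rejects globals from unlisted modules."""

    def find_class(self, module, name):
        # find_class is called whenever the pickle stream references a
        # global (a class or function), before that global is used.
        top_level = module.split(".")[0]
        if top_level not in ALLOWED_MODULES:
            raise pickle.UnpicklingError(f"blocked global: {module}.{name}")
        return super().find_class(module, name)


def restricted_loads(data: bytes):
    """Deserialize bytes through the allowlist-checking unpickler."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


# Plain containers and allowlisted classes still round-trip.
ok = restricted_loads(pickle.dumps(OrderedDict(a=1)))
print(ok)
```

A stream that references builtins.eval or os.system would raise pickle.UnpicklingError in this sketch, because neither module is on the allowlist. Allowing whole modules is coarse; a stricter variant could allowlist individual (module, name) pairs.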
If validate=False is used in the patched version, load_pickle (a new function similar to the old load_unsafe) is called, but this is now an explicit choice to disable security measures rather than an easily overlooked bypass of a safe default. In summary, the vulnerable path was load called with safe=False, which handed deserialization to the now-removed load_unsafe function.