The vulnerability lies in the file-type detection logic within the picklescan.scanner.scan_bytes function. This function is the primary entry point for scanning files and byte streams for malicious pickle data.
The flaw occurs when the scanner encounters a file whose extension is typically associated with PyTorch models (e.g., .bin). The scanner gives precedence to the file extension and first attempts to parse the file as a PyTorch model using the scan_pytorch function. If the file is actually a standard pickle file rather than a PyTorch model, this parsing fails and raises an InvalidMagicError.
In the vulnerable versions (<= 0.0.30), the scan_bytes function catches this exception but then immediately returns an error result, halting any further analysis of the file. It fails to implement a crucial fallback mechanism to check if the file might be a standard pickle file despite its misleading extension.
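The control flow described above can be modeled in a few lines. This is a simplified sketch, not picklescan's actual source: the names scan_pytorch_model, scan_pickle_model, and scan_bytes_vulnerable, the dictionary result shape, and the ZIP-magic check are all hypothetical stand-ins chosen for illustration.

```python
import pickle
import pickletools
from collections import OrderedDict


class InvalidMagicError(Exception):
    """Stand-in for the exception named in the text."""


PYTORCH_EXTENSIONS = {".bin"}  # only .bin is confirmed by the text


def scan_pytorch_model(data: bytes) -> list:
    # Hypothetical check: treat anything that is not a ZIP archive as invalid.
    if not data.startswith(b"PK\x03\x04"):
        raise InvalidMagicError("not a PyTorch archive")
    return []


def scan_pickle_model(data: bytes) -> list:
    # List imports made through the GLOBAL opcode (enough for protocol 2).
    return [arg for op, arg, _ in pickletools.genops(data) if op.name == "GLOBAL"]


def scan_bytes_vulnerable(data: bytes, file_ext: str) -> dict:
    if file_ext in PYTORCH_EXTENSIONS:
        try:
            return {"globals": scan_pytorch_model(data), "scan_err": False}
        except InvalidMagicError:
            # Premature return: the pickle scanner below is never reached.
            return {"globals": [], "scan_err": True}
    return {"globals": scan_pickle_model(data), "scan_err": False}


# Harmless pickle that still imports a global (collections.OrderedDict).
sample = pickle.dumps(OrderedDict(a=1), protocol=2, fix_imports=False)
as_bin = scan_bytes_vulnerable(sample, ".bin")  # imports go unreported
as_pkl = scan_bytes_vulnerable(sample, ".pkl")  # imports are reported
```

In this model, the same bytes produce an empty error result under a .bin extension and a populated list of imports under .pkl, which is the asymmetry the vulnerability relies on.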
An attacker can exploit this by taking a malicious pickle file and renaming it with a .bin extension. When a user scans this file with a vulnerable version of picklescan, the tool fails to detect the malicious payload because the scan_pickle_bytes function is never called for that file. The result is a bypass of the scanner's security checks.
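The key point is that a file extension says nothing about the file's contents. The sketch below uses only the Python standard library and a harmless dictionary payload (an assumption made for illustration, along with the file name model.bin) to show that a pickle written under a .bin name is still an ordinary pickle and not a ZIP archive.

```python
import io
import os
import pickle
import tempfile
import zipfile

# A harmless payload; a real attack would use a malicious pickle instead.
payload = pickle.dumps({"weights": [1, 2, 3]}, protocol=2)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.bin")  # misleading extension
    with open(path, "wb") as f:
        f.write(payload)
    with open(path, "rb") as f:
        data = f.read()

ext = os.path.splitext(path)[1]
looks_like_zip = zipfile.is_zipfile(io.BytesIO(data))  # content-based check
starts_with_proto = data[:1] == b"\x80"  # PROTO opcode, protocol 2 and later
loaded = pickle.loads(data)  # safe here only because we built the payload
```

A scanner that inspects content rather than trusting the extension would see the PROTO opcode and route these bytes to its pickle analysis.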
The patch rectifies this by removing the premature return. When scan_pytorch fails, the code now logs a warning and allows execution to continue. This enables the scanner to subsequently attempt other scanning methods, including scan_pickle_bytes, which will correctly identify the malicious content. Consequently, a runtime profile of an exploit against a vulnerable version would show a call to scanner.scan_bytes that returns without ever calling scanner.scan_pickle_bytes for the malicious file.
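A minimal sketch of the patched behavior follows, with a call log standing in for the runtime profile. It is a model under stated assumptions rather than the actual patch: scan_pytorch_model, scan_pickle_model, scan_bytes_patched, the ZIP-magic check, and the returned call list are hypothetical names and simplifications.

```python
import logging
import pickle
import pickletools
from collections import OrderedDict

log = logging.getLogger("scan_model")


class InvalidMagicError(Exception):
    """Stand-in for the exception named in the text."""


PYTORCH_EXTENSIONS = {".bin"}  # only .bin is confirmed by the text


def scan_pytorch_model(data: bytes, calls: list) -> list:
    calls.append("scan_pytorch")
    # Hypothetical check: treat anything that is not a ZIP archive as invalid.
    if not data.startswith(b"PK\x03\x04"):
        raise InvalidMagicError("not a PyTorch archive")
    return []


def scan_pickle_model(data: bytes, calls: list) -> list:
    calls.append("scan_pickle_bytes")
    # List imports made through the GLOBAL opcode (enough for protocol 2).
    return [arg for op, arg, _ in pickletools.genops(data) if op.name == "GLOBAL"]


def scan_bytes_patched(data: bytes, file_ext: str):
    calls = []
    if file_ext in PYTORCH_EXTENSIONS:
        try:
            return scan_pytorch_model(data, calls), calls
        except InvalidMagicError as exc:
            # Log and fall through instead of returning an error result.
            log.warning("PyTorch parse failed (%s); trying other scanners", exc)
    return scan_pickle_model(data, calls), calls


# Harmless pickle that still imports a global (collections.OrderedDict).
sample = pickle.dumps(OrderedDict(a=1), protocol=2, fix_imports=False)
found, calls = scan_bytes_patched(sample, ".bin")
```

Here the call log for the .bin input contains both scanners in order, whereas the vulnerable flow would stop after the first.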