The vulnerability stems from the unsafe use of pickle.loads for deserialization within the monai.data.utils.pickle_operations function. It is triggered when this function is called with the is_encode parameter set to False.
The exploit chain typically begins when a user processes a dataset using a PyTorch DataLoader configured with collate_fn=monai.data.utils.list_data_collate. If the dataset contains a dictionary with a key ending in _transforms (the default suffix) and a value that is a maliciously crafted pickled byte string, this payload gets included in a batch.
While iterating over the DataLoader, the application will likely need to de-collate the batch to process individual items. This de-collation process, as implemented in monai.data.utils.decollate_batch, calls pickle_operations with is_encode=False. This triggers the pickle.loads call on the malicious payload, resulting in arbitrary code execution on the machine running the code.
Therefore, a runtime profile of an exploit would show list_data_collate as the entry point for handling the data, followed by decollate_batch to unpack the batch, and finally pickle_operations where the malicious code is actually executed via pickle.loads.
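The chain above can be illustrated without MONAI or PyTorch. The following is a minimal sketch, assuming only the behavior described here: a decode step that calls pickle.loads on bytes values stored under keys ending in _transforms. The function decode_transform_entries and the class BenignPayload are hypothetical stand-ins, not MONAI's implementation, and the payload is deliberately harmless (it only prints a message).

```python
import pickle


class BenignPayload:
    """Harmless stand-in for a malicious object (hypothetical).

    When an instance is unpickled, pickle calls the callable returned by
    __reduce__ with the given arguments. An attacker would pick a harmful
    callable; this sketch only uses print.
    """

    def __reduce__(self):
        return (print, ("code ran during pickle.loads",))


def decode_transform_entries(item, suffix="_transforms"):
    """Hypothetical stand-in for the decode step (is_encode=False).

    Unpickles every bytes value whose key ends with `suffix`.
    """
    decoded = dict(item)
    for key, value in decoded.items():
        if key.endswith(suffix) and isinstance(value, bytes):
            decoded[key] = pickle.loads(value)  # unsafe on untrusted bytes
    return decoded


item = {
    "image": [0.0, 1.0],
    "preprocessing_transforms": pickle.dumps(BenignPayload()),
}

# The message is printed here, as a side effect of deserialization.
result = decode_transform_entries(item)
```

After the call, the value under preprocessing_transforms is whatever the chosen callable returned (None for print), not a BenignPayload instance. The caller never invoked the payload explicitly; deserialization alone was enough.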
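import pickle

# normal_image_tensor, normal_label_tensor, and MaliciousPayload are assumed to be defined elsewhere;
# MaliciousPayload stands for any object whose unpickling runs attacker-chosen code.
malicious_data = {
    'image': normal_image_tensor,
    'label': normal_label_tensor,
    'preprocessing_transforms': pickle.dumps(MaliciousPayload()),  # Malicious payload
    'augmentation_transforms': pickle.dumps(MaliciousPayload()),  # Multiple attack points
}
dataset = [malicious_data, ...]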
When a user batch-processes data using MONAI's list_data_collate function, MONAI automatically calls pickle_operations to handle serialization of the transform entries.
from monai.data import list_data_collate
from torch.utils.data import DataLoader

dataloader = DataLoader(
    dataset,
    batch_size=4,
    collate_fn=list_data_collate,  # Trigger the vulnerability
)
# Malicious code executes automatically while iterating over the data
for batch in dataloader:
    # Malicious code is executed in pickle_operations
    pass
Remote code execution (RCE) is triggered when a user loads a serialized file from an external, untrusted source.
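Because the trigger is data from an untrusted source, one way to spot the pattern before handing items to a DataLoader is to scan them for bytes values under keys ending in _transforms. This is a minimal sketch under that assumption; find_pickled_transform_entries is a hypothetical helper, not part of MONAI, and whether a flagged entry is legitimate depends on the pipeline.

```python
from collections.abc import Mapping, Sequence


def find_pickled_transform_entries(item, suffix="_transforms", path=""):
    """Return the paths of bytes values stored under keys ending with `suffix`."""
    hits = []
    if isinstance(item, Mapping):
        for key, value in item.items():
            here = f"{path}/{key}"
            is_suffix_key = isinstance(key, str) and key.endswith(suffix)
            if is_suffix_key and isinstance(value, (bytes, bytearray)):
                hits.append(here)
            else:
                # Recurse into nested containers.
                hits.extend(find_pickled_transform_entries(value, suffix, here))
    elif isinstance(item, Sequence) and not isinstance(item, (str, bytes, bytearray)):
        for index, value in enumerate(item):
            hits.extend(
                find_pickled_transform_entries(value, suffix, f"{path}[{index}]")
            )
    return hits


# Toy dataset: the second item carries raw bytes under a *_transforms key.
dataset = [
    {"image": [1, 2], "label": 0},
    {"image": [3, 4], "preprocessing_transforms": b"\x80\x04N."},
]
suspicious = find_pickled_transform_entries(dataset)
```

The scan never calls pickle.loads, so it is safe to run on untrusted items. It only reports where pre-pickled bytes appear; deciding to reject or quarantine those items is left to the caller.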
Arbitrary code execution
Verify the data source and content before deserializing, or use a safe deserialization method. Hugging Face's transformers library should have a similar fix that can serve as a reference.
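One possible shape for a safer deserialization method is an allow-listing unpickler, following the "restricting globals" pattern from the Python pickle documentation. This is a sketch, not the fix adopted by MONAI or any other library; SAFE_BUILTINS, RestrictedUnpickler, restricted_loads, and BenignPayload are names chosen for illustration.

```python
import builtins
import io
import pickle

# Hypothetical allow-list: only these builtins may be resolved while loading.
SAFE_BUILTINS = {
    "dict", "list", "tuple", "set", "frozenset",
    "str", "int", "float", "bool", "complex", "bytes",
}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global not on the allow-list."""

    def find_class(self, module, name):
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Drop-in analogue of pickle.loads that uses the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class BenignPayload:
    """Harmless stand-in for a malicious object: tries to call print on load."""

    def __reduce__(self):
        return (print, ("this should never run",))


# Plain containers and scalars still round-trip.
safe = restricted_loads(pickle.dumps({"name": "Flip", "axis": 0}))

# The payload is rejected before its callable is ever invoked.
try:
    restricted_loads(pickle.dumps(BenignPayload()))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

The trade-off is that real transform records may hold types beyond plain builtins, so the allow-list would need to be widened carefully for each one. Storing such records in a data-only format such as JSON avoids the problem where the contents allow it.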
| Package Name | Ecosystem | Vulnerable Versions | First Patched Version |
|---|---|---|---|
| monai | pip | <= 1.5.0 | |