The vulnerability is a denial-of-service (DoS) condition caused by insufficient validation of sparse tensors when processing prompt embeddings. The root cause is that PyTorch, for performance reasons, does not validate sparse tensor invariants by default. An attacker can supply a maliciously crafted sparse tensor whose indices fall outside the tensor's declared shape. When the application converts this sparse tensor to a dense tensor with .to_dense(), the out-of-bounds indices can cause an out-of-bounds memory write, which crashes the process.
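The missing validation can be illustrated with a minimal sketch. It assumes PyTorch 2.x, where torch.sparse_coo_tensor accepts a check_invariants argument; the tensor shape and values are made up for illustration. The unchecked tensor is deliberately never densified here, since that is the operation that can corrupt memory.

```python
import torch

# A 1-D sparse tensor of length 3 whose second index (10) is out of bounds.
# These values are illustrative only.
indices = torch.tensor([[0, 10]])
values = torch.tensor([1.0, 2.0])


def build(check: bool) -> torch.Tensor:
    """Construct the sparse tensor with invariant checking on or off."""
    return torch.sparse_coo_tensor(
        indices, values, size=(3,), check_invariants=check
    )


# With checking off, the malformed tensor is accepted without complaint.
# Do NOT call .to_dense() on it: that is the unsafe operation.
unchecked = build(check=False)

# With checking on, PyTorch rejects the out-of-bounds index up front.
try:
    build(check=True)
    rejected = False
except RuntimeError:
    rejected = True

print("accepted without checks:", unchecked.is_sparse)
print("rejected with checks:", rejected)
```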
The patch addresses this by wrapping the tensor loading and conversion logic within a torch.sparse.check_sparse_tensor_invariants() context manager. This forces PyTorch to validate the sparse tensor's indices, preventing the out-of-bounds write from occurring.
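A rough sketch of the patched pattern follows. The helper name load_embedding and its error handling are hypothetical and not taken from the actual patch; only torch.load, .to_dense(), and the torch.sparse.check_sparse_tensor_invariants() context manager are real PyTorch APIs.

```python
import io

import torch


def load_embedding(raw: bytes) -> torch.Tensor:
    """Hypothetical helper: deserialize a tensor and densify it safely.

    Both the load and the dense conversion run with sparse invariant
    checking enabled, so malformed indices are expected to raise an
    error instead of reaching the unsafe write.
    """
    with torch.sparse.check_sparse_tensor_invariants():
        tensor = torch.load(
            io.BytesIO(raw), weights_only=True, map_location="cpu"
        )
        if not isinstance(tensor, torch.Tensor):
            raise ValueError("expected a serialized torch.Tensor")
        # For an already-dense tensor this is effectively a no-op.
        return tensor.to_dense()
```

A caller would pass the untrusted payload straight to this helper and treat any exception as a rejected request rather than letting it propagate as a crash.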
The vulnerable functions are all entry points where external data is deserialized into PyTorch tensors to be used as embeddings. This includes functions for handling prompt embeddings in the completions API, as well as image and audio embeddings in the multimodal APIs, whether loaded from a file, raw bytes, or a base64-encoded string.
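One way to keep every entry point covered is to route all three input forms through a single validated loader. The sketch below is an assumption about how that could be organized, not the project's actual code: all function names are hypothetical, and it uses the decorator form of torch.sparse.check_sparse_tensor_invariants() together with the standard-library base64 module.

```python
import base64
import io

import torch


@torch.sparse.check_sparse_tensor_invariants()
def _load_validated(buffer) -> torch.Tensor:
    """Hypothetical shared loader; runs with sparse invariant checks on."""
    tensor = torch.load(buffer, weights_only=True, map_location="cpu")
    return tensor.to_dense()


def embedding_from_bytes(raw: bytes) -> torch.Tensor:
    """Entry point for raw serialized bytes."""
    return _load_validated(io.BytesIO(raw))


def embedding_from_base64(encoded: str) -> torch.Tensor:
    """Entry point for a base64-encoded payload."""
    return embedding_from_bytes(base64.b64decode(encoded, validate=True))


def embedding_from_file(path: str) -> torch.Tensor:
    """Entry point for a file on disk."""
    with open(path, "rb") as handle:
        return _load_validated(handle)
```

Centralizing the check this way means a newly added input format only needs to produce a byte stream, rather than remembering to enable validation itself.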