The vulnerability consists of two separate issues that can be chained together to achieve Remote Code Execution (RCE). The root cause of the RCE is a heap-based buffer overflow in the JPEG2000 decoder of FFmpeg, which is bundled with the version of OpenCV used by vLLM prior to the patch. An attacker can craft a malicious video file and provide its URL to a vLLM instance serving a video model. The MediaConnector.load_from_url function fetches the malicious file, and OpenCVVideoBackend.load_bytes passes the data to the vulnerable cv2.VideoCapture, triggering the overflow and allowing arbitrary code execution.
To facilitate this exploit, a separate information leak vulnerability allows an attacker to bypass Address Space Layout Randomization (ASLR). When an attacker sends a request containing an invalid image or video, the underlying PIL library raises an exception. Prior to patching, various API endpoint handlers, such as create_chat_completion, and generic exception handlers like http_exception_handler, returned the raw, unsanitized exception message to the client. These messages included memory addresses of Python objects, leaking information about the process's memory layout.
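The leak mechanism can be illustrated with the standard library alone. The sketch below is not vLLM or PIL code: simulate_decode_error is a hypothetical stand-in for a failed image decode, and the message format is an assumption modeled on an error that embeds the default repr of an in-memory buffer. It relies on CPython behavior, where the default repr of an object includes its memory address.

```python
import io
import re


def simulate_decode_error(data: bytes) -> tuple[str, int]:
    """Hypothetical stand-in for a decoder failing on invalid input.

    Returns the error text and the buffer's id() for comparison.
    """
    buf = io.BytesIO(data)
    # The default repr looks like "<_io.BytesIO object at 0x7f...>",
    # so formatting the buffer into the message embeds its address.
    message = f"cannot identify image file {buf!r}"
    return message, id(buf)


message, buffer_id = simulate_decode_error(b"not an image")

# A client that receives the raw message can extract the address.
leaked = re.search(r"0x[0-9a-fA-F]+", message)
leaked_address = int(leaked.group(0), 16) if leaked else None
```

In CPython, id() returns the object's address, so leaked_address matches buffer_id here. A single heap address of this kind is what gives an attacker a starting point for defeating ASLR.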
The patches address both issues. The RCE is fixed by upgrading the opencv-python-headless package to a version that includes a patched FFmpeg library (commit d45c96aa3caff51ac6bba556829c461f5df4449c). The information leak is fixed by implementing and applying a sanitization function (sanitize_message) to all error responses to strip memory addresses before they are sent to the client (commits 54e21708e8aa3f2e9978adc023782110b78ce163 and aedff6c26233bcf969cc04606c412592f2eb9a93).
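The sanitization step can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the commits above: the function name sanitize_message_sketch and the regular expression are hypothetical, and the pattern only targets the " at 0x..." fragment that appears in default Python object reprs.

```python
import re

# Hypothetical pattern: matches the " at 0x<hex>" fragment found in
# default object reprs such as "<_io.BytesIO object at 0x7f8b1c0a5d60>".
_ADDRESS_RE = re.compile(r" at 0x[0-9a-fA-F]+")


def sanitize_message_sketch(message: str) -> str:
    """Remove memory-address fragments from an error message."""
    return _ADDRESS_RE.sub("", message)


raw = "cannot identify image file <_io.BytesIO object at 0x7f8b1c0a5d60>"
clean = sanitize_message_sketch(raw)
```

With this pattern, clean still names the object type, which keeps the error useful for debugging, while the address is gone. Applying such a function in every handler that builds an error response, rather than in one endpoint, matters because any unsanitized path is enough to leak an address.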