The vulnerability is caused by an incomplete fix for a previous vulnerability (CVE-2026-22778). The original fix introduced a sanitize_message function to prevent leaking memory addresses in error messages but failed to apply it in all necessary locations. Specifically, several new API endpoints and streaming handlers added after the original fix did not use the sanitizer.
The vulnerability exists in multiple locations where an exception is caught and its message (str(e)) is returned in an API response or sent over a WebSocket without being sanitized. An attacker can trigger it by sending a malformed request, such as an image in an invalid format, that causes an exception in the server. The resulting error message, which can contain memory addresses from object representations (e.g., <_io.BytesIO object at 0x7a95e299e750>), is then returned to the attacker. This information leak can be used to bypass security mechanisms like ASLR.
The vulnerable functions are:
vllm.entrypoints.anthropic.api_router.create_messages: Handles POST /v1/messages and returns unsanitized exception messages.
vllm.entrypoints.anthropic.api_router.count_tokens: Handles POST /v1/messages/count_tokens and returns unsanitized exception messages.
vllm.entrypoints.anthropic.serving.AnthropicServing._stream_response_handler: A generator for streaming responses that yields unsanitized exception messages.
vllm.entrypoints.speech_to_text.realtime.connection.Connection.handle_connection: Handles WebSocket connections and sends unsanitized exception messages.
vllm.entrypoints.speech_to_text.realtime.connection.Connection._run_generation: Part of the WebSocket logic that sends unsanitized exception messages.
The fix involves applying the sanitize_message function to the exception message in all these locations before sending it to the client.
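A minimal sketch of the fix pattern follows. The body of sanitize_message here is an assumption reconstructed from the regex quoted in the Remediation section, not a copy of the vLLM source, and build_error_message is a hypothetical helper used only for illustration.

```python
import re

# Assumed pattern, taken from the regex quoted in the Remediation section.
_ADDR_SUFFIX = re.compile(r" at 0x[0-9a-f]+>")


def sanitize_message(message: str) -> str:
    """Drop the ' at 0x...' suffix from object reprs embedded in a message."""
    return _ADDR_SUFFIX.sub(">", message)


def build_error_message(exc: Exception) -> str:
    """Hypothetical helper: the single place where str(exc) becomes client-visible text."""
    return sanitize_message(str(exc))


leaky = ValueError(
    "cannot identify image file <_io.BytesIO object at 0x7a95e299e750>"
)
print(build_error_message(leaky))
# cannot identify image file <_io.BytesIO object>
```

Routing every client-visible message through one helper like this makes it harder for a newly added endpoint to return str(e) directly.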
Why the global exception handler does not save these paths
api_server.py registers a catch-all app.exception_handler(Exception)(exception_handler) at line 262, and that handler calls create_error_response(exc) which DOES apply sanitize_message. However, FastAPI exception handlers fire only on unhandled exceptions that propagate out of a route function.
All affected HTTP paths catch Exception inside the route coroutine and construct the response themselves, using message=str(e) for the error body.
Because the exception is caught and a JSONResponse is returned in-route, every registered FastAPI exception handler — including the sanitizing global one — is bypassed. The WebSocket path bypasses it for a different reason: WebSocket frames don't traverse FastAPI's HTTP exception handler chain at all.
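The control flow can be modelled without FastAPI. In the sketch below, dispatch and global_handler are hypothetical stand-ins for the framework and the catch-all handler, and sanitize_message is assumed to behave like the regex quoted in the Remediation section. It illustrates the bypass only and is not the vLLM code.

```python
import re


def sanitize_message(message: str) -> str:
    # Assumed behaviour, based on the regex quoted in the Remediation section.
    return re.sub(r" at 0x[0-9a-f]+>", ">", message)


def global_handler(exc: Exception) -> dict:
    # Stand-in for the catch-all handler: it sanitizes, but only sees
    # exceptions that escape the route.
    return {"error": {"message": sanitize_message(str(exc))}}


def dispatch(route) -> dict:
    # Stand-in for the framework: the handler runs only on unhandled exceptions.
    try:
        return route()
    except Exception as exc:
        return global_handler(exc)


def failing_loader() -> None:
    raise ValueError(
        "cannot identify image file <_io.BytesIO object at 0x7a95e299e750>"
    )


def route_propagates() -> dict:
    # The exception escapes the route, so the global handler sanitizes it.
    failing_loader()
    return {"ok": True}


def route_catches() -> dict:
    # The exception is swallowed in-route, so the global handler never runs.
    try:
        failing_loader()
        return {"ok": True}
    except Exception as e:
        return {"error": {"message": str(e)}}


print(dispatch(route_propagates)["error"]["message"])
print(dispatch(route_catches)["error"]["message"])
```

Only the second route returns the address, because its own except block builds the response before the framework-level handler has a chance to see the exception.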
Reachability — the same primitive as the parent CVE
The Anthropic Messages API accepts image content parts in the request body (type: "image" with base64 source.data or type: "image_url"). Image bytes are passed to the same multimodal loader used by the OpenAI router. Malformed bytes cause PIL.Image.open to raise:
UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7a95e299e750>
The exception propagates up through handler.create_messages into the except Exception as e: at api_router.py:75. str(e) returns the exception message verbatim, including the address. The address ends up in the error.message field of the JSON response body returned to the attacker. ASLR entropy on the affected process drops from ~4 billion to ~8 candidates, identically to CVE-2026-22778 Stage 1.
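The shape of the leak can be reproduced with the standard library alone. In this sketch, load_image is a hypothetical stand-in for the PIL failure (it formats the buffer's default repr into the message, as the error above does), and the JSON layout is an assumed approximation of an Anthropic-style error body rather than the exact vLLM response.

```python
import io
import json
import re


def load_image(data: bytes) -> None:
    # Hypothetical stand-in for PIL.Image.open rejecting malformed bytes:
    # the message embeds the default repr of the buffer object.
    buf = io.BytesIO(data)
    raise ValueError(f"cannot identify image file {buf!r}")


def handle_request(data: bytes) -> str:
    # Mirrors the vulnerable pattern: catch in-route and return str(e) verbatim.
    try:
        load_image(data)
    except Exception as e:
        return json.dumps(
            {"type": "error", "error": {"type": "invalid_request_error", "message": str(e)}}
        )
    return json.dumps({"type": "message"})


body = handle_request(b"not an image")
message = json.loads(body)["error"]["message"]
leaked = re.search(r"0x[0-9a-fA-F]+", message)
print(message)
print("leaked address:", leaked.group(0) if leaked else None)
```

The printed address differs from run to run, which is exactly what makes it useful to an attacker: it reflects the live heap layout of the serving process.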
The same primitive is reachable on POST /v1/messages/count_tokens (route #2), inside the SSE streaming converter when an exception is raised mid-stream (route #3), and over the realtime speech-to-text WebSocket when audio decoder or generation paths raise an exception containing any object repr (routes #4, #5).
Chronology — these are scope misses, not legacy code
2026-01-09: PR #31987 (aa125ecf0) introduces sanitize_message and applies it to OpenAI router HTTP exception handlers.
2026-01-15 (six days later): PR #32369 (4c1c501a7) adds vllm/entrypoints/anthropic/api_router.py, which contains message=str(e) at line 78. sanitize_message was not applied to the new router.
2026-03-02 (~two months later): PR #35588 (9a87b0578) adds the Anthropic count_tokens endpoint, replicating the same message=str(e) pattern at line 124.
2026-05-12 (~four months later): PR #42370 (d37e25ffb) consolidates the speech-to-text entrypoints; the realtime WebSocket handler uses send_error(str(e), ...) on both error paths.
2026-05-26: current main HEAD, all five lines still present.
Remediation
1. Apply sanitize_message symmetrically to the five sites
# vllm/entrypoints/anthropic/api_router.py — add at top:
from vllm.entrypoints.utils import sanitize_message
# Line 78 (POST /v1/messages) and Line 124 (count_tokens):
message=sanitize_message(str(e)),
# vllm/entrypoints/anthropic/serving.py — add at top:
from vllm.entrypoints.utils import sanitize_message
# Line 808:
error=AnthropicError(type="internal_error", message=sanitize_message(str(e))),
# vllm/entrypoints/speech_to_text/realtime/connection.py — add at top:
from vllm.entrypoints.utils import sanitize_message
# Lines 75 and 265:
await self.send_error(sanitize_message(str(e)), "processing_error")
2. Tighten the regex (defense in depth)
The current regex r" at 0x[0-9a-f]+>" is narrow: it matches only the exact CPython builtin object-repr suffix, in lowercase hex with a trailing >. Future Python versions, C extensions, or custom __repr__ methods could produce formats that do not match and so re-enable the leak.
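One possible broader pattern is sketched below. The name sanitize_message_strict, the 6 to 16 digit bound, and the redaction marker are all assumptions chosen for illustration; this is not the pattern adopted by the project.

```python
import re

# Hypothetical broader pattern: any 0x-prefixed run of 6 to 16 hex digits,
# in either case, regardless of what surrounds it.
_ADDRESS = re.compile(r"\b0x[0-9a-f]{6,16}\b", re.IGNORECASE)


def sanitize_message_strict(message: str) -> str:
    """Redact anything that looks like a pointer-sized hex literal."""
    return _ADDRESS.sub("0x[redacted]", message)


samples = [
    "<_io.BytesIO object at 0x7a95e299e750>",  # CPython builtin repr
    "buffer@0X7A95E299E750 (closed)",          # upper case, no trailing '>'
    "flags=0x1f",                              # short literal, left untouched
]
for sample in samples:
    print(sanitize_message_strict(sample))
```

The trade-off is over-redaction: a legitimate long hex constant in an error message would also be masked, which is usually acceptable for client-facing error text.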
3. Future-proofing: consider a response middleware
Both the route-local exception handling pattern (Anthropic router) and the WebSocket path bypass FastAPI's exception handler chain. A response-level middleware that always invokes sanitize_message on outgoing error bodies would prevent this class of regression entirely.
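As a framework-agnostic sketch of what such a layer might call, the helper below walks an outgoing error payload and sanitizes every string in it. sanitize_payload is a hypothetical name, sanitize_message is assumed to match the regex quoted above, and the wiring into an actual HTTP middleware or WebSocket send path is deliberately left out.

```python
import re
from typing import Any

_ADDR_SUFFIX = re.compile(r" at 0x[0-9a-f]+>")


def sanitize_message(message: str) -> str:
    # Assumed behaviour, based on the regex quoted in this section.
    return _ADDR_SUFFIX.sub(">", message)


def sanitize_payload(payload: Any) -> Any:
    """Return a copy of a JSON-like payload with every string value sanitized."""
    if isinstance(payload, str):
        return sanitize_message(payload)
    if isinstance(payload, dict):
        return {key: sanitize_payload(value) for key, value in payload.items()}
    if isinstance(payload, (list, tuple)):
        return [sanitize_payload(value) for value in payload]
    return payload  # numbers, booleans, None pass through unchanged


error_body = {
    "type": "error",
    "error": {
        "type": "internal_error",
        "message": "cannot identify image file <_io.BytesIO object at 0x7a95e299e750>",
    },
}
print(sanitize_payload(error_body))
```

Sanitizing at the serialization boundary means a new endpoint that forgets the per-site call still cannot emit an unsanitized message, at the cost of scanning every outgoing error body.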
Affected versions
All vLLM versions containing vllm/entrypoints/anthropic/api_router.py (introduced 2026-01-15 in PR #32369).
All vLLM versions containing vllm/entrypoints/speech_to_text/realtime/connection.py (introduced 2026-05-12 in PR #42370).
Confirmed present in main HEAD 771e1e48b (2026-05-26).
Steps to reproduce
Clone the target: git clone --depth 1 https://github.com/vllm-project/vllm
Run the proof of concept (PoC.py) against the cloned source.
Observe the result shown under Verified result below.
Credit
Kai Aizen — SnailSploit (@SnailSploit). Adversarial & Offensive Security Research.
Fix
A fix for this vulnerability was added here: https://github.com/vllm-project/vllm/pull/45119
create_messages
vllm/entrypoints/anthropic/api_router.py
The function `create_messages` catches exceptions and returns the exception message directly to the user without sanitization. This can leak sensitive information, such as memory addresses, from the exception message.
count_tokens
vllm/entrypoints/anthropic/api_router.py
The function `count_tokens` catches exceptions and returns the exception message directly to the user without sanitization. This can leak sensitive information, such as memory addresses, from the exception message.
Connection.handle_connection
vllm/entrypoints/speech_to_text/realtime/connection.py
The function `handle_connection` catches exceptions and sends the exception message over a WebSocket connection without sanitization. This can leak sensitive information, such as memory addresses, from the exception message.
Connection._run_generation
vllm/entrypoints/speech_to_text/realtime/connection.py
The function `_run_generation` catches exceptions and sends the exception message over a WebSocket connection without sanitization. This can leak sensitive information, such as memory addresses, from the exception message.
AnthropicServing._stream_response_handler
vllm/entrypoints/anthropic/serving.py
The function `_stream_response_handler` in the `AnthropicServing` class has a generator that catches exceptions and yields an error event containing the unsanitized exception message. This can leak sensitive information, such as memory addresses, from the exception message in a streaming response.