llm-d Environments

The risk is significantly amplified in orchestrated environments such as llm-d, where multiple pods communicate over an internal network.
Denial of Service (DoS): An attacker could target internal management endpoints of other services within the llm-d cluster. For instance, if a monitoring or metrics service is exposed internally, an attacker could send malformed requests to it. A specific example is an attacker causing the vLLM pod to call an internal API that reports a false KV cache utilization, potentially triggering incorrect scaling decisions or even a system shutdown.
Internal Network Reconnaissance: Attackers can use the vulnerability to scan the internal network for open ports and services by providing URLs like http://10.0.0.X:PORT and observing the server's response time or error messages.
Interaction with Internal Services: Any unsecured internal service becomes a potential target. This could include databases, internal APIs, or other model pods that might not have robust authentication, as they are not expected to be directly exposed.
Delegating this security responsibility to an upper-level orchestrator like llm-d is problematic. The orchestrator cannot easily distinguish between legitimate requests initiated by the vLLM engine for its own purposes and malicious requests originating from user input, thus complicating traffic filtering rules and increasing management overhead.
To address this vulnerability, it is essential to restrict the URLs that the MediaConnector can access. The principle of least privilege should be applied.
It is recommended to implement a configurable allowlist or denylist for domains and IP addresses.
Allowlist: The most secure approach is to allow connections only to a predefined list of trusted domains. This could be configured via a command-line argument, such as --allowed-media-domains. By default, this list could be empty, forcing administrators to explicitly enable external media fetching.
Denylist: Alternatively, a denylist could block access to loopback and private IP address ranges (127.0.0.0/8, 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) and other sensitive domains.
A check should be added at the beginning of the load_from_url methods to validate the parsed hostname against this list before any connection is made.
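The denylist option above can be sketched with the standard `ipaddress` module. This is a minimal illustration, not vLLM code: `DENIED_NETWORKS` and `is_denied` are hypothetical names, and the link-local and IPv6 loopback entries are assumptions added here (cloud metadata endpoints commonly sit at 169.254.169.254) rather than ranges named in the advisory.

```python
from ipaddress import ip_address, ip_network

# Ranges listed in the text, plus two assumed additions (link-local, IPv6 loopback).
DENIED_NETWORKS = [
    ip_network("127.0.0.0/8"),     # IPv4 loopback
    ip_network("10.0.0.0/8"),      # RFC 1918 private
    ip_network("172.16.0.0/12"),   # RFC 1918 private
    ip_network("192.168.0.0/16"),  # RFC 1918 private
    ip_network("169.254.0.0/16"),  # link-local (assumption, not in the text)
    ip_network("::1/128"),         # IPv6 loopback (assumption, not in the text)
]


def is_denied(address: str) -> bool:
    """Return True if the literal IP address falls inside a denied network."""
    ip = ip_address(address)
    # Membership tests between IPv4 and IPv6 objects simply evaluate to False.
    return any(ip in network for network in DENIED_NETWORKS)
```

Listing whole networks rather than single addresses matters here: 127.0.0.53 and 127.0.0.1 are both loopback, and a check against only 127.0.0.1 would miss the former.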
Example Implementation Idea:
    # Imports needed at module level
    import socket
    from ipaddress import ip_address, ip_network
    from urllib.parse import urlparse

    # In MediaConnector.__init__
    self.allowed_domains = set(config.get("allowed_media_domains", []))
    self.denied_ip_ranges = [ip_network(r) for r in PRIVATE_IP_RANGES]

    # In MediaConnector.load_from_url
    url_spec = urlparse(url)
    hostname = url_spec.hostname
    # An empty allowlist disables the domain check in this sketch.
    if self.allowed_domains and hostname not in self.allowed_domains:
        raise ValueError(f"Domain {hostname} is not in the allowed list.")
    # Use a distinct variable name so the ip_address() function is not shadowed.
    resolved_ip = ip_address(socket.gethostbyname(hostname))
    if any(resolved_ip in network for network in self.denied_ip_ranges):
        raise ValueError(f"Access to private IP address {resolved_ip} is forbidden.")
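For readers who want something they can run outside vLLM, the idea above might look like the following standalone function. It is a sketch under stated assumptions: `validate_media_url`, `resolve_all`, and the injectable `resolver` parameter are hypothetical and not part of vLLM's API. It differs from the snippet above in that it checks every address a hostname resolves to and relies on the `ipaddress` module's built-in classification instead of a hand-written range list.

```python
import socket
from ipaddress import ip_address
from urllib.parse import urlparse


def resolve_all(hostname):
    """Return every address (IPv4 and IPv6) the hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None)
    # Drop any IPv6 scope suffix such as "%eth0" before parsing.
    return {info[4][0].split("%")[0] for info in infos}


def validate_media_url(url, allowed_domains=frozenset(), resolver=resolve_all):
    """Raise ValueError if the URL targets a disallowed or internal host.

    Hypothetical helper: only URLs that carry a hostname are handled here.
    """
    hostname = urlparse(url).hostname
    if not hostname:
        raise ValueError("URL has no hostname.")
    # An empty allowlist disables the domain check, as in the sketch above.
    if allowed_domains and hostname not in allowed_domains:
        raise ValueError(f"Domain {hostname} is not in the allowed list.")
    for address in resolver(hostname):
        ip = ip_address(address)
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            raise ValueError(f"Access to internal address {ip} is forbidden.")
    return hostname
```

A check of this kind validates the address at lookup time only. The HTTP client performs its own resolution afterwards, so on its own it does not rule out DNS rebinding or redirects to internal hosts.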
Integrating this control directly into vLLM empowers administrators to enforce security policies at the source, creating a deployment that is more secure by default and reducing the burden on higher-level infrastructure management.
| Package Name | Ecosystem | Vulnerable Versions | First Patched Version |
|---|---|---|---|
| vllm | pip | >= 0.5.0, < 0.11.0 | 0.11.0 |
The vulnerability is a Server-Side Request Forgery (SSRF) within the vLLM project's multimodal features. The root cause lies in the MediaConnector class, specifically in the load_from_url and load_from_url_async methods located in vllm/multimodal/utils.py. These methods are designed to fetch media (like images) from a URL provided by the user.
The vulnerability description and the provided patch (9d9a2b77f19f68262d5e469c4e82c0f6365ad72d) make it clear that the core issue is the lack of sufficient restrictions on the URLs being fetched. An attacker can supply a URL pointing to an internal service, and the vLLM server will make a request to it.
The patch specifically addresses a vector of this attack: bypassing domain-based allowlists via HTTP redirects. It does this by introducing an allow_redirects parameter, controlled by the VLLM_MEDIA_URL_ALLOW_REDIRECTS environment variable. This parameter is threaded through the call chain from the MediaConnector methods down to the underlying HTTP request functions in vllm/connections.py (get_response and get_async_response).
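The redirect vector can be illustrated with the standard library alone. The sketch below is not the patch itself: `redirects_allowed`, `NoRedirectHandler`, and `build_media_opener` are hypothetical names, and both the parsing of `VLLM_MEDIA_URL_ALLOW_REDIRECTS` and the default used here are assumptions that may not match vLLM's behavior.

```python
import os
import urllib.request


def redirects_allowed(env=os.environ, default="0"):
    """Read the redirect switch; parsing and default are assumptions."""
    value = env.get("VLLM_MEDIA_URL_ALLOW_REDIRECTS", default)
    return value.strip().lower() in ("1", "true")


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Refuse to follow any 3xx response."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None tells urllib not to follow the redirect; the 3xx
        # response then surfaces to the caller as an HTTPError.
        return None


def build_media_opener(allow_redirects):
    """Build an opener that follows redirects only when explicitly allowed."""
    if allow_redirects:
        return urllib.request.build_opener()
    return urllib.request.build_opener(NoRedirectHandler)
```

With redirects disabled, an allowlisted host that answers with `302 Location: http://10.0.0.5/` produces an error instead of a second request to the internal address, which is the bypass the paragraph above describes.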
The primary vulnerable functions are MediaConnector.load_from_url and MediaConnector.load_from_url_async as they are the entry points for the user-provided malicious URL. The functions HTTP.get_bytes and HTTP.async_get_bytes are also included as they are the direct downstream consumers of the URL that perform the unsafe network request without the redirect mitigation that the patch introduces.
| Function | File |
|---|---|
| MediaConnector.load_from_url | vllm/multimodal/utils.py |
| MediaConnector.load_from_url_async | vllm/multimodal/utils.py |
| HTTP.get_bytes | vllm/connections.py |
| HTTP.async_get_bytes | vllm/connections.py |