| Package Name | Ecosystem | Vulnerable Versions | First Patched Version |
|---|---|---|---|
| vllm | pip | >= 0.5.5, < 0.11.1 | 0.11.1 |
The vulnerability is a Denial of Service (DoS) in the vLLM project, exploitable via the /v1/chat/completions and /tokenize endpoints. The root cause is the improper handling of the chat_template_kwargs request parameter.
The vulnerability originates in the OpenAIServing._preprocess_chat method within vllm/entrypoints/openai/serving_engine.py. This method accepts chat_template_kwargs from the user and unpacks them directly into a call to apply_hf_chat_template without any validation. This allows an attacker to inject arbitrary keyword arguments.
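The dangerous pattern can be illustrated with a minimal sketch (this is not vLLM's actual code; apply_hf_chat_template and preprocess_chat are simplified stand-ins): user-controlled kwargs are unpacked, unvalidated, into the template helper, so any keyword the helper accepts can be injected.

```python
def apply_hf_chat_template(tokenizer, messages, chat_template,
                           tokenize=False):
    # Stand-in for the real helper; 'tokenize' is the dangerous knob
    # the attacker can reach.
    prompt = chat_template.format(text=messages[0]["content"])
    return tokenizer(prompt) if tokenize else prompt

def preprocess_chat(messages, chat_template, tokenizer,
                    chat_template_kwargs=None):
    # Request-supplied kwargs flow straight into the call with no
    # allow-list: sending {"tokenize": true} flips the helper into
    # synchronous tokenization.
    return apply_hf_chat_template(
        tokenizer, messages, chat_template,
        **(chat_template_kwargs or {}),
    )
```

Because `**` unpacking imposes no restriction beyond the callee's signature, every parameter of the helper becomes attacker-reachable API surface.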
The core of the DoS vulnerability lies in the apply_hf_chat_template function in vllm/entrypoints/chat_utils.py. Before the patch, this function accepted a tokenize boolean parameter. By sending {"tokenize": true} in the chat_template_kwargs, an attacker could force this function to perform a synchronous, blocking tokenization operation on the input. With a sufficiently large input, this operation would block the server's event loop for an extended period, preventing it from handling any other requests and thus causing a DoS.
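A request exercising the flaw might look like the following sketch (the endpoint and the chat_template_kwargs field come from the advisory; the model name and payload size are illustrative):

```python
import json

# Body an attacker could POST to /v1/chat/completions. The oversized
# message, combined with the injected "tokenize": true, forces a long
# synchronous tokenization that blocks the server's event loop.
payload = {
    "model": "some-model",  # placeholder model name
    "messages": [{"role": "user", "content": "A" * 10_000_000}],
    "chat_template_kwargs": {"tokenize": True},  # the injected kwarg
}
body = json.dumps(payload)
```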
The fix involves two main changes in vllm/entrypoints/chat_utils.py:
1. The apply_hf_chat_template function was modified to remove the tokenize parameter and to hardcode tokenize=False in its internal call to the tokenizer. This directly remediates the vulnerability.
2. The resolve_chat_template_kwargs function was updated to explicitly disallow the tokenize and chat_template keys in chat_template_kwargs, providing an additional layer of defense.

Affected functions:
- OpenAIServing._preprocess_chat in vllm/entrypoints/openai/serving_engine.py
- apply_hf_chat_template in vllm/entrypoints/chat_utils.py
- resolve_chat_template_kwargs in vllm/entrypoints/chat_utils.py
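The defense-in-depth half of the fix can be sketched as a key filter (a minimal illustration, not vLLM's exact implementation; the disallowed key names follow the advisory):

```python
# Reserved keys that request-supplied chat_template_kwargs must never
# override, per the patched behavior described above.
DISALLOWED_KEYS = frozenset({"tokenize", "chat_template"})

def resolve_chat_template_kwargs(chat_template_kwargs: dict) -> dict:
    # Reject any attempt to smuggle reserved parameters through the
    # request body; benign template variables pass through unchanged.
    bad = DISALLOWED_KEYS & chat_template_kwargs.keys()
    if bad:
        raise ValueError(
            f"chat_template_kwargs may not set: {sorted(bad)}")
    return chat_template_kwargs
```

Rejecting reserved keys at the boundary means that even if a future refactor reintroduces a tokenize-like parameter on the helper, request input still cannot reach it.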