| Package Name | Ecosystem | Vulnerable Versions | First Patched Version |
|---|---|---|---|
| vllm | pip | >= 0.5.1, < 0.11.0 | 0.11.0 |
The vulnerability is a resource-exhaustion (denial-of-service) issue in the vLLM OpenAI-Compatible Server, tracked as GHSA-6fvq-23cw-5628. The root cause is improper handling of user-supplied Jinja2 templates passed through the `chat_template` and `chat_template_kwargs` parameters of chat completion requests.
An attacker can craft a malicious Jinja2 template containing constructs such as nested loops that, when rendered, consume excessive CPU and memory, resulting in a denial of service. The issue is exacerbated because `chat_template` can be overridden by a value nested inside `chat_template_kwargs`, bypassing naive checks that only forbid the top-level `chat_template` parameter.
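As an illustration, a request exercising the bypass described above might look like the following. This is a hypothetical proof-of-concept payload: the field names follow the OpenAI-compatible chat completions schema, but the exact shape accepted by a given vLLM version may differ.

```python
import json

# A template whose rendering cost grows multiplicatively with the
# nested loop ranges can exhaust CPU during rendering.
malicious_template = (
    "{% for i in range(10000) %}"
    "{% for j in range(10000) %}x{% endfor %}"
    "{% endfor %}"
)

payload = {
    "model": "some-model",  # placeholder model name
    "messages": [{"role": "user", "content": "hi"}],
    # Even if a filter rejects a top-level chat_template parameter,
    # the same value smuggled inside chat_template_kwargs could
    # override the server's configured template.
    "chat_template_kwargs": {"chat_template": malicious_template},
}

body = json.dumps(payload)
```

The key point is the nesting: the override travels inside `chat_template_kwargs`, so a check that only inspects the top-level `chat_template` field never sees it.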
Analysis of the patch commit `7977e5027c2250a4abc1f474c5619c40b4e5682f` reveals two key functions involved in the vulnerability:
`vllm.entrypoints.openai.serving_chat.ServingChat.create_chat_completion`: This is the public-facing API endpoint that receives the user's request. Before the patch, it passed the user-provided template parameters down to the processing engine without sufficient validation. The patch introduces a crucial security control: a `trust_request_chat_template` flag. Unless this flag is enabled, the server rejects any request that attempts to supply a custom chat template, blocking the attack at the entry point.
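A minimal sketch of such a gate follows. The flag name `trust_request_chat_template` comes from the patch; everything else (the function name, error type, and request shape) is hypothetical and simplified from vLLM's actual serving code.

```python
class ChatTemplateNotTrustedError(ValueError):
    """Raised when a request supplies a template the server is not configured to trust."""


def validate_request_template(request: dict, trust_request_chat_template: bool) -> None:
    # Reject both the top-level parameter and an override smuggled
    # inside chat_template_kwargs, mirroring the bypass described above.
    supplied = (
        request.get("chat_template") is not None
        or "chat_template" in (request.get("chat_template_kwargs") or {})
    )
    if supplied and not trust_request_chat_template:
        raise ChatTemplateNotTrustedError(
            "custom chat templates are not trusted by this server"
        )
```

Checking both locations is what makes the gate effective: a check on the top-level field alone is exactly the naive filter the nested override defeats.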
`vllm.entrypoints.chat_utils.apply_hf_chat_template`: This function is responsible for applying the chat template to the conversation. The vulnerability was present here because it used `**kwargs` to pass arguments to the underlying `tokenizer.apply_chat_template` function. This allowed the `chat_template` from `chat_template_kwargs` to be passed through, enabling the exploit. The patch fixes this by introducing a new function, `resolve_chat_template_kwargs`, which sanitizes the keyword arguments, specifically filtering out `chat_template` and other unexpected variables before they are passed to the tokenizer. This ensures that even if a malicious template gets past the initial checks, it is not used during rendering.
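The sanitization step can be sketched like this. This is a simplified stand-in for the new `resolve_chat_template_kwargs`, not vLLM's actual implementation; the idea of deriving the allow-list from the target function's signature and explicitly excluding `chat_template` is an illustrative assumption.

```python
import inspect


def resolve_chat_template_kwargs(apply_fn, chat_template_kwargs: dict) -> dict:
    # Drop chat_template outright, and keep only keyword arguments the
    # target apply_chat_template function actually declares, so that
    # unexpected variables can never reach the Jinja2 renderer.
    allowed = set(inspect.signature(apply_fn).parameters) - {"chat_template"}
    return {k: v for k, v in chat_template_kwargs.items() if k in allowed}
```

Filtering at this layer gives defense in depth: even if the entry-point check is bypassed, the rendering path refuses to accept a caller-supplied template.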
In summary, an exploit would involve sending a malicious Jinja2 template to the `create_chat_completion` endpoint. The template would then be processed by `apply_hf_chat_template`, causing the server to hang or crash. The identified functions would appear in any runtime profile or stack trace captured during such an attack.