GHSA-hf3c-wxg2-49q9: vLLM vulnerable to Denial of Service by abusing xgrammar cache
Basic Information

CVSS Score: 6.5

Technical Details
| Package Name | Ecosystem | Vulnerable Versions | First Patched Version |
|---|---|---|---|
| vllm | pip | >= 0.6.5, < 0.8.4 | 0.8.4 |
Vulnerability Intelligence
Miggo AI Root Cause Analysis
The vulnerability stems from an unbounded cache in the xgrammar library, which vLLM uses for structured output. The patch (commit 549f429bfb26b58fe7ed33f2af8cf90b2bd71ae1) addresses this by upgrading xgrammar to a version with cache control (0.1.18) and modifying vLLM to use these new cache-control features. Specifically, the patch introduces an environment variable, `VLLM_XGRAMMAR_CACHE_MB`, to set a cache size limit. The key changes are in how `xgr.GrammarCompiler` is instantiated within vLLM:
- In `vllm.model_executor.guided_decoding.xgrammar_decoding.XGrammarDecoder.get_compiler`, the instantiation of `xgr.GrammarCompiler` was updated to include `cache_enabled=True` and `cache_limit_bytes`. Before this change, the compiler used xgrammar's default (unbounded, and therefore vulnerable) caching mechanism.
- Similarly, in `vllm.v1.structured_output.backend_xgrammar.XGrammarBackend.__init__`, the instantiation of `xgr.GrammarCompiler` was updated to pass the same cache-control parameters, as sketched below.
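To make the shape of those call sites concrete, here is a minimal sketch of the post-patch instantiation pattern. It is not the actual vLLM code: the helper name `build_grammar_compiler`, the environment-variable parsing, and the 512 MB fallback are assumptions; only the `VLLM_XGRAMMAR_CACHE_MB` variable and the `cache_enabled` / `cache_limit_bytes` parameters come from the patch description above.

```python
import os

import xgrammar as xgr


def build_grammar_compiler(tokenizer_info: xgr.TokenizerInfo) -> xgr.GrammarCompiler:
    """Hypothetical helper mirroring the post-patch instantiation pattern."""
    # VLLM_XGRAMMAR_CACHE_MB is the knob the patch introduces; the 512 MB
    # fallback used here is an assumption, not necessarily vLLM's default.
    cache_mb = int(os.environ.get("VLLM_XGRAMMAR_CACHE_MB", "512"))
    cache_limit_bytes = cache_mb * 1024 * 1024

    # Post-patch behaviour: the compile cache is explicitly enabled *and*
    # bounded, so repeated requests with unique grammars can no longer grow
    # memory without limit.
    return xgr.GrammarCompiler(
        tokenizer_info,
        cache_enabled=True,
        cache_limit_bytes=cache_limit_bytes,
    )
```

In this sketch, setting `VLLM_XGRAMMAR_CACHE_MB=64` would cap the compiler cache at roughly 64 MiB.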
These two functions are directly responsible for creating and configuring the `GrammarCompiler` instances. In their pre-patch state, they were the points where vLLM invoked the xgrammar library in a way that exposed the unbounded cache vulnerability. Therefore, these functions would be active during the processing of requests that trigger the vulnerability, as they set up the component (the grammar compiler with its cache) that is being abused.
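To illustrate the failure mode itself, the toy simulation below (not xgrammar's real implementation; the class and its bookkeeping are invented for illustration) models a compile cache keyed by schema text with no eviction. Every structured-output request that supplies a previously unseen schema adds an entry that is retained forever, which is exactly the unbounded growth that `cache_limit_bytes` now cuts off.

```python
import json


class UnboundedCompileCache:
    """Toy stand-in for a pre-patch, size-unbounded grammar-compile cache."""

    def __init__(self) -> None:
        self._cache: dict[str, str] = {}

    def compile(self, schema: str) -> str:
        # The schema text is the cache key and nothing is ever evicted, so
        # memory use grows with the number of *distinct* schemas ever seen.
        if schema not in self._cache:
            self._cache[schema] = f"<compiled grammar, {len(schema)}-byte schema>"
        return self._cache[schema]

    def __len__(self) -> int:
        return len(self._cache)

    def approx_size_bytes(self) -> int:
        return sum(len(k) + len(v) for k, v in self._cache.items())


if __name__ == "__main__":
    cache = UnboundedCompileCache()
    # Simulate many structured-output requests, each with a unique schema
    # (e.g. one changed property name). Every request misses the cache.
    for i in range(10_000):
        schema = json.dumps(
            {"type": "object", "properties": {f"field_{i}": {"type": "string"}}}
        )
        cache.compile(schema)
    print(f"{len(cache)} entries retained, ~{cache.approx_size_bytes()} bytes")
```

Running this reports 10,000 permanently retained entries; with real compiled grammars each entry is far larger than these placeholder strings, which is how memory exhaustion and denial of service become reachable.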