GHSA-hf3c-wxg2-49q9: vLLM vulnerable to Denial of Service by abusing xgrammar cache

CVSS Score (v3.1): 6.5

Basic Information

CVE ID: -
EPSS Score: -
Published: 4/15/2025
Updated: 4/15/2025
KEV Status: No
Technology: Python

Technical Details

CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H

Package Name: vllm
Ecosystem: pip
Vulnerable Versions: >= 0.6.5, < 0.8.4
First Patched Version: 0.8.4
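
The affected range (>= 0.6.5, < 0.8.4) can be checked mechanically. The following is a minimal sketch using only the standard library; `parse_release` and `is_vulnerable` are hypothetical helper names, and the parser is a simplification that ignores pre-release and local suffixes (for example, it treats `0.8.4rc1` as `0.8.4`). For production use, `packaging.version.Version` handles such cases properly.

```python
import re

VULNERABLE_MIN = (0, 6, 5)  # inclusive lower bound from the advisory
FIRST_PATCHED = (0, 8, 4)   # first patched release (exclusive upper bound)


def parse_release(version: str) -> tuple:
    """Return the leading numeric release segment, e.g. '0.8.3.post1' -> (0, 8, 3).

    Simplification: suffixes such as 'rc1' or '+cu121' are dropped.
    """
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = tuple(int(p) for p in match.group(0).split("."))
    # Pad short versions such as '0.7' to major.minor.patch.
    return parts + (0,) * (3 - len(parts))


def is_vulnerable(version: str) -> bool:
    """True if the version falls inside the advisory's affected range."""
    return VULNERABLE_MIN <= parse_release(version) < FIRST_PATCHED
```

To check a live environment, pass the result of `importlib.metadata.version("vllm")` to `is_vulnerable`.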

Vulnerability Intelligence

Root Cause Analysis

The vulnerability stems from an unbounded cache in the xgrammar library, which is used by vLLM for structured output. The patch (commit 549f429bfb26b58fe7ed33f2af8cf90b2bd71ae1) addresses this by upgrading xgrammar to a version with cache control (0.1.18) and modifying vLLM to utilize these new cache control features. Specifically, the patch introduces an environment variable VLLM_XGRAMMAR_CACHE_MB to set a cache size limit. The key changes are in how xgr.GrammarCompiler is instantiated within vLLM.

  1. In vllm.model_executor.guided_decoding.xgrammar_decoding.XGrammarDecoder.get_compiler, the instantiation of xgr.GrammarCompiler was updated to include cache_enabled=True and cache_limit_bytes. Before this change, the compiler would use xgrammar's default (vulnerable) caching mechanism.
  2. Similarly, in vllm.v1.structured_output.backend_xgrammar.XGrammarBackend.__init__, the instantiation of xgr.GrammarCompiler was also updated to pass these cache control parameters.
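
As a rough illustration of the two changes listed above, the sketch below derives the cache arguments from `VLLM_XGRAMMAR_CACHE_MB`. The helper name `xgrammar_cache_kwargs`, the 512 MB default, the MB-to-bytes factor of 1024 * 1024, and the rejection of non-positive values are assumptions made for this example, not details confirmed by the analysis above. Only `cache_enabled`, `cache_limit_bytes`, and the environment variable name come from the patch description.

```python
import os


def xgrammar_cache_kwargs(default_mb: int = 512) -> dict:
    """Build cache-control keyword arguments for xgr.GrammarCompiler.

    Hypothetical helper: the default and the validation are assumptions.
    """
    cache_mb = int(os.environ.get("VLLM_XGRAMMAR_CACHE_MB", str(default_mb)))
    if cache_mb <= 0:
        # Assumption for this sketch: insist on an explicit positive bound.
        raise ValueError("VLLM_XGRAMMAR_CACHE_MB must be a positive integer")
    return {
        "cache_enabled": True,
        "cache_limit_bytes": cache_mb * 1024 * 1024,
    }


# Usage sketch (requires xgrammar >= 0.1.18; `tokenizer_info` is built elsewhere):
#   import xgrammar as xgr
#   compiler = xgr.GrammarCompiler(tokenizer_info, **xgrammar_cache_kwargs())
```

Keeping the limit in one helper means both call sites (the V0 decoder and the V1 backend) would apply the same bound.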

These two functions create and configure the GrammarCompiler instances. Before the patch, they were the points at which vLLM instantiated the compiler without any cache limit, which exposed the unbounded cache. Both run while vLLM processes structured-output requests, because they set up the component being abused: the grammar compiler and its cache.
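
To see why a size limit closes the denial-of-service vector, consider a toy model of a compile cache. This is not xgrammar's implementation: `BoundedCompileCache` is a hypothetical class, the "compilation" is a stand-in, and least-recently-used eviction is an assumption chosen for illustration. Without the eviction loop, every unique schema a client submits would stay in memory indefinitely.

```python
from collections import OrderedDict


class BoundedCompileCache:
    """Toy LRU cache whose total payload size is capped at `limit_bytes`."""

    def __init__(self, limit_bytes: int) -> None:
        self.limit_bytes = limit_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # schema text -> compiled payload

    def get_or_compile(self, schema: str) -> bytes:
        if schema in self._entries:
            # Cache hit: mark the entry as most recently used.
            self._entries.move_to_end(schema)
            return self._entries[schema]
        compiled = schema.encode("utf-8")  # stand-in for an expensive compile
        self._entries[schema] = compiled
        self.used_bytes += len(compiled)
        # Evict the oldest entries until the cache fits the limit again.
        while self.used_bytes > self.limit_bytes and len(self._entries) > 1:
            _, evicted = self._entries.popitem(last=False)
            self.used_bytes -= len(evicted)
        return compiled


# Simulate a client sending many requests, each with a unique schema.
cache = BoundedCompileCache(limit_bytes=1024)
for i in range(10_000):
    cache.get_or_compile(f'{{"type": "object", "title": "schema-{i}"}}')
print(cache.used_bytes, len(cache._entries))
```

With the limit in place, memory use stays flat no matter how many distinct schemas arrive; the cost is that evicted grammars must be recompiled on their next use.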

WAF Protection Rules

WAF Rule

### Impact
This report is to highlight a vulnerability in XGrammar, a library used by the structured output feature in vLLM. The XGrammar advisory is here: https://github.com/mlc-ai/xgrammar/security/advisories/GHSA-***x-**px-mj** The [xgrammar](ht
