GHSA-hf3c-wxg2-49q9: vLLM vulnerable to Denial of Service by abusing xgrammar cache
Basic Information

CVSS Score: 6.5

Technical Details
| Package Name | Ecosystem | Vulnerable Versions | First Patched Version |
|---|---|---|---|
| vllm | pip | >= 0.6.5, < 0.8.4 | 0.8.4 |
Vulnerability Intelligence
Miggo AI Root Cause Analysis
The vulnerability stems from an unbounded cache in the xgrammar library, which vLLM uses for structured output. The patch (commit 549f429bfb26b58fe7ed33f2af8cf90b2bd71ae1) addresses this by upgrading xgrammar to a version with cache control (0.1.18) and modifying vLLM to use these new cache-control features. Specifically, the patch introduces an environment variable, `VLLM_XGRAMMAR_CACHE_MB`, to set a cache size limit. The key changes are in how `xgr.GrammarCompiler` is instantiated within vLLM:
- In `vllm.model_executor.guided_decoding.xgrammar_decoding.XGrammarDecoder.get_compiler`, the instantiation of `xgr.GrammarCompiler` was updated to include `cache_enabled=True` and `cache_limit_bytes`. Before this change, the compiler used xgrammar's default (unbounded, and therefore vulnerable) caching mechanism.
- Similarly, in `vllm.v1.structured_output.backend_xgrammar.XGrammarBackend.__init__`, the instantiation of `xgr.GrammarCompiler` was updated to pass the same cache-control parameters, as sketched below.
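To make the shape of those call sites concrete, here is a minimal sketch of the post-patch instantiation pattern. It is not the actual vLLM code: the helper name `build_grammar_compiler`, the environment-variable parsing, and the 512 MB fallback are assumptions; only the `VLLM_XGRAMMAR_CACHE_MB` variable and the `cache_enabled` / `cache_limit_bytes` parameters come from the patch description above.

```python
import os

import xgrammar as xgr


def build_grammar_compiler(tokenizer_info: xgr.TokenizerInfo) -> xgr.GrammarCompiler:
    """Hypothetical helper mirroring the post-patch instantiation pattern."""
    # VLLM_XGRAMMAR_CACHE_MB is the knob the patch introduces; the 512 MB
    # fallback used here is an assumption, not necessarily vLLM's default.
    cache_mb = int(os.environ.get("VLLM_XGRAMMAR_CACHE_MB", "512"))
    cache_limit_bytes = cache_mb * 1024 * 1024

    # Post-patch behaviour: the compile cache is explicitly enabled *and*
    # bounded, so repeated requests with unique grammars can no longer grow
    # memory without limit.
    return xgr.GrammarCompiler(
        tokenizer_info,
        cache_enabled=True,
        cache_limit_bytes=cache_limit_bytes,
    )
```

In this sketch, setting `VLLM_XGRAMMAR_CACHE_MB=64` would cap the compiler cache at roughly 64 MiB.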
These two functions are directly responsible for creating and configuring the `GrammarCompiler` instances. In their pre-patch state, they were the points where vLLM invoked the xgrammar library in a way that exposed the unbounded cache vulnerability. Therefore, these functions would be active during the processing of requests that trigger the vulnerability, as they set up the component (the grammar compiler with its cache) that is being abused.
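To illustrate the failure mode itself, the toy simulation below (not xgrammar's real implementation; the class and its bookkeeping are invented for illustration) models a compile cache keyed by schema text with no eviction. Every structured-output request that supplies a previously unseen schema adds an entry that is retained forever, which is exactly the unbounded growth that `cache_limit_bytes` now cuts off.

```python
import json


class UnboundedCompileCache:
    """Toy stand-in for a pre-patch, size-unbounded grammar-compile cache."""

    def __init__(self) -> None:
        self._cache: dict[str, str] = {}

    def compile(self, schema: str) -> str:
        # The schema text is the cache key and nothing is ever evicted, so
        # memory use grows with the number of *distinct* schemas ever seen.
        if schema not in self._cache:
            self._cache[schema] = f"<compiled grammar, {len(schema)}-byte schema>"
        return self._cache[schema]

    def __len__(self) -> int:
        return len(self._cache)

    def approx_size_bytes(self) -> int:
        return sum(len(k) + len(v) for k, v in self._cache.items())


if __name__ == "__main__":
    cache = UnboundedCompileCache()
    # Simulate many structured-output requests, each with a unique schema
    # (e.g. one changed property name). Every request misses the cache.
    for i in range(10_000):
        schema = json.dumps(
            {"type": "object", "properties": {f"field_{i}": {"type": "string"}}}
        )
        cache.compile(schema)
    print(f"{len(cache)} entries retained, ~{cache.approx_size_bytes()} bytes")
```

Running this reports 10,000 permanently retained entries; with real compiled grammars each entry is far larger than these placeholder strings, which is how memory exhaustion and denial of service become reachable.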