CVE-2025-59425: vLLM is vulnerable to timing attack at bearer auth

CVSS Score: 7.5 (CVSS 3.1)

Basic Information

EPSS Score: -
Published: 10/7/2025
Updated: 10/7/2025
KEV Status: No
Technology: Python

Technical Details

CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N

Package Name: vllm
Ecosystem: pip
Vulnerable Versions: < 0.11.0
First Patched Version: 0.11.0

Vulnerability Intelligence

Root Cause Analysis

The security advisory and the associated commit ee10d7e6ff5875386c7f136ce8b5f525c8fcef48 point to a timing attack vulnerability in the API key authentication mechanism of vLLM's OpenAI-compatible API server. The root cause is that bearer tokens were validated with a standard, non-constant-time string comparison.

The vulnerable code is located in the __call__ method of the AuthenticationMiddleware class in vllm/entrypoints/openai/api_server.py. Before the patch, this method checked whether the Authorization header value was present in the collection of known tokens (self.api_tokens). That membership test relies on ordinary string equality, which can return as soon as it finds a difference, so the time it takes can leak how many leading characters of the supplied token are correct.
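
The leak described above can be illustrated with a minimal sketch. It is not vLLM code: the function early_exit_compare and the sample token are hypothetical, and the count of character comparisons stands in for elapsed time. Real string equality in CPython is implemented in C and behaves differently in detail, but the principle of stopping at the first mismatch is the same.

```python
from typing import Tuple


def early_exit_compare(candidate: str, secret: str) -> Tuple[bool, int]:
    """Compare two strings left to right, stopping at the first mismatch.

    Returns (equal, steps), where steps is the number of character
    comparisons performed. The step count is a stand-in for elapsed time.
    """
    steps = 0
    if len(candidate) != len(secret):
        # A length mismatch is rejected before any character is examined.
        return False, steps
    for a, b in zip(candidate, secret):
        steps += 1
        if a != b:
            # Early exit: the work done depends on the matching prefix length.
            return False, steps
    return True, steps


SECRET = "Bearer s3cr3t-token"  # hypothetical value, for illustration only

for guess in ("Xearer s3cr3t-token", "Bearer X3cr3t-token", "Bearer s3cr3t-tokeX"):
    equal, steps = early_exit_compare(guess, SECRET)
    print(f"{guess!r}: equal={equal}, comparisons={steps}")
```

The more leading characters a guess gets right, the more comparisons are performed before rejection. A constant-time comparison removes that dependence.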

The patch replaces the vulnerable check with a call to a new verify_token method. This method hashes the provided token and then uses secrets.compare_digest to compare the result against a list of pre-hashed valid tokens. The time taken by the comparison therefore no longer depends on how much of the input matches a valid token, which mitigates the timing attack vector. AuthenticationMiddleware.__call__ is thus the primary vulnerable function that would appear in a runtime profile during an exploitation attempt.
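
The hash-then-compare approach described above can be sketched as follows. This is an illustration under stated assumptions, not the code from the vLLM commit: the class name TokenVerifier, the attribute api_token_hashes, the choice of SHA-256, and the header parsing are all hypothetical. Only the method name verify_token and the use of secrets.compare_digest come from the description above.

```python
import hashlib
import secrets
from typing import List, Optional


class TokenVerifier:
    """Hypothetical helper illustrating hashed, constant-time token checks."""

    def __init__(self, tokens: List[str]) -> None:
        # Hash every valid token once; all digests have the same length.
        self.api_token_hashes = [
            hashlib.sha256(t.encode("utf-8")).digest() for t in tokens
        ]

    def verify_token(self, authorization: Optional[str]) -> bool:
        if authorization is None:
            return False
        # Split "Bearer <token>" into scheme and credential.
        scheme, _, param = authorization.partition(" ")
        if scheme.lower() != "bearer":
            return False
        param_hash = hashlib.sha256(param.encode("utf-8")).digest()
        token_match = False
        for token_hash in self.api_token_hashes:
            # Evaluate every comparison rather than stopping at the first match.
            token_match |= secrets.compare_digest(param_hash, token_hash)
        return token_match


verifier = TokenVerifier(["abc123"])  # hypothetical token
print(verifier.verify_token("Bearer abc123"))
print(verifier.verify_token("Bearer abc124"))
```

Hashing first means both inputs to secrets.compare_digest always have the same length, so neither the length of a valid token nor its matching prefix influences the comparison time.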

Vulnerable functions

AuthenticationMiddleware.__call__
vllm/entrypoints/openai/api_server.py
The vulnerability is in the `__call__` method of the `AuthenticationMiddleware` class. The line `headers.get("Authorization") not in self.api_tokens` performs a string comparison that is not constant time. An attacker can exploit this by sending candidate API keys and measuring the response time. A longer response time indicates that more leading characters of the candidate are correct, which allows the attacker to reconstruct the full API key character by character. The patch mitigates this by introducing the `verify_token` method, which uses `secrets.compare_digest` for a constant-time comparison of hashed tokens.

Summary

The API key support in vLLM performed validation using a method that was vulnerable to a timing attack. This could potentially allow an attacker to discover a valid API key using an approach more efficient than brute force.
