The analysis of the security advisory and the associated commit ee10d7e6ff5875386c7f136ce8b5f525c8fcef48 clearly indicates a timing attack vulnerability in the API key authentication mechanism of vLLM's OpenAI-compatible API server. The root cause is the use of a standard, non-constant-time string comparison to validate bearer tokens.
The vulnerable code is located in the __call__ method of the AuthenticationMiddleware class in vllm/entrypoints/openai/api_server.py. Before the patch, this method directly checked for the presence of the Authorization header value within a set of known tokens (self.api_tokens). This operation is not secure against timing attacks, as the time taken for the string comparison leaks information about the correctness of the provided token's prefix.
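The vulnerable pattern can be illustrated with a minimal sketch (class and method names here are assumptions for illustration, not the literal pre-patch vLLM code). An ordinary string equality check compares byte by byte and returns at the first mismatch, so response time leaks how long a prefix of a valid token the attacker has guessed:

```python
# Hypothetical sketch of the vulnerable pattern; not the exact vLLM code.
class AuthenticationMiddleware:
    def __init__(self, api_tokens):
        self.api_tokens = list(api_tokens)  # plaintext tokens held in memory

    def is_authorized(self, header_value: str) -> bool:
        # str.__eq__ short-circuits on the first differing character,
        # so comparison time correlates with the length of the matching
        # prefix -- the root of the timing side channel.
        return any(header_value == token for token in self.api_tokens)
```

By repeatedly measuring response latency, an attacker could in principle recover a valid token one character at a time.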
The patch rectifies this by replacing the vulnerable check with a call to a new verify_token method. This method implements a secure comparison by first hashing the provided token and then using secrets.compare_digest to compare the digest against a list of pre-hashed valid tokens. Because the digests are fixed-length and compare_digest does not short-circuit, the comparison time no longer depends on how many leading characters of the input match a valid token, which mitigates the timing attack vector. Consequently, the AuthenticationMiddleware.__call__ function is the primary vulnerable function that would appear in a runtime profile during an exploitation attempt.
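The patched flow can be sketched as follows (a minimal illustration of the hash-then-compare approach, assuming SHA-256 digests; method and attribute names are assumptions, not the exact diff):

```python
import hashlib
import secrets

# Sketch of the patched comparison; not the literal vLLM implementation.
class AuthenticationMiddleware:
    def __init__(self, api_tokens):
        # Pre-hash the valid tokens once at startup.
        self.api_token_hashes = [
            hashlib.sha256(t.encode("utf-8")).digest() for t in api_tokens
        ]

    def verify_token(self, candidate: str) -> bool:
        candidate_hash = hashlib.sha256(candidate.encode("utf-8")).digest()
        # secrets.compare_digest runs in time dependent only on digest
        # length, never on where the first mismatching byte occurs.
        authorized = False
        for token_hash in self.api_token_hashes:
            if secrets.compare_digest(candidate_hash, token_hash):
                authorized = True  # no early return: check every entry
        return authorized
```

Hashing before comparison also normalizes input length, so even tokens of different lengths are compared over fixed-size digests.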
| Package Name | Ecosystem | Vulnerable Versions | First Patched Version |
|---|---|---|---|
| vllm | pip | < 0.11.0 | 0.11.0 |