CVE-2026-34756
CWE-770Published: April 6, 2026· Updated: Apr 7, 2026
Official Description
vLLM is an inference and serving engine for large language models (LLMs). From 0.1.0 to before 0.19.0, a Denial of Service vulnerability exists in the vLLM OpenAI-compatible API server. Due to the lack of an upper bound validation on the n parameter in the ChatCompletionRequest and CompletionRequest Pydantic models, an unauthenticated attacker can send a single HTTP request with an astronomically large n value. This completely blocks the Python asyncio event loop and causes immediate Out-Of-Memory crashes by allocating millions of request object copies in the heap before the request even reaches the scheduling queue. This vulnerability is fixed in 0.19.0.
Technical Analysis
CVE-2026-34756 can be exploited remotely over the network without requiring physical or adjacent access, significantly expanding the attack surface for threat actors.
Exploitation requires low privileges, which limits the exposure to scenarios where an attacker has already gained initial access.
A successful exploit results in availability disruption (denial of service), with a CVSS base score of 6.5.
CVSS v3.1 Vector Breakdown
Affected Vendors & Products
Exploit & PoC Resources
All References (3)
Quick Facts
Related CVEs (CWE-770)
Recommended Actions
- →Apply vendor patches immediately
- →Monitor CVE-2026-34756 in threat intel feeds
- →Review IDS/IPS signatures for exploitation attempts