| Package Name | Ecosystem | Vulnerable Versions | First Patched Version |
|---|---|---|---|
| pypdf | pip | < 6.4.0 | 6.4.0 |
The vulnerability lies in the handling of LZW-encoded streams within PDF files. The pypdf library's `LzwCodec` class had a default maximum output length of 1 GB, so a crafted PDF could force excessive memory allocation and cause a denial of service. Analysis of the patch commit `96186725e5e6f237129a58a97cd19204a9ce40b2` shows that the vulnerability was addressed by lowering this default limit.
The primary vulnerable function is `LzwCodec.decode`, which performs the actual decompression. Although it does not appear directly in the patch, its behavior is governed by the `max_output_length` parameter set in the `LzwCodec.__init__` constructor. The change to `__init__` in the patch is direct evidence of the mitigation strategy. Both `LzwCodec.__init__` (where the fix is applied) and `LzwCodec.decode` (where the vulnerability manifests at runtime) are therefore identified as the key functions related to this vulnerability.
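
To illustrate why an output cap matters, the following is a minimal sketch of a textbook LZW codec with a `max_output_length` check. It is not pypdf's implementation: it works on a plain list of integer codes rather than the variable-width bit stream used by the PDF LZWDecode filter, and the names `lzw_compress`, `lzw_decompress`, and `OutputLimitExceeded` are hypothetical. How the real library reports an exceeded limit is not covered by the analysis above, so raising an exception here is an assumption.

```python
class OutputLimitExceeded(Exception):
    """Hypothetical error raised when decoded output would exceed the cap."""


def lzw_compress(data: bytes) -> list[int]:
    """Textbook LZW compression into a list of integer codes."""
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = next_code
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int], max_output_length: int) -> bytes:
    """Decode LZW codes, refusing to produce more than max_output_length bytes."""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    out = bytearray()
    previous = b""
    for code in codes:
        if code in table:
            entry = table[code]
        elif code == next_code and previous:
            # Code refers to the entry that is about to be defined.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        # Check the cap before growing the buffer, so memory use stays bounded.
        if len(out) + len(entry) > max_output_length:
            raise OutputLimitExceeded(
                f"decoded output would exceed {max_output_length} bytes"
            )
        out += entry
        if previous:
            table[next_code] = previous + entry[:1]
            next_code += 1
        previous = entry
    return bytes(out)
```

In this sketch, a highly repetitive input such as `b"A" * 100_000` compresses to a few hundred codes, and decoding it with `max_output_length=1_000` stops early instead of allocating the full output. The lower the cap, the less memory a crafted stream can force the decoder to allocate.
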
| Function | File |
|---|---|
| `LzwCodec.__init__` | `pypdf/_codecs/_codecs.py` |
| `LzwCodec.decode` | `pypdf/_codecs/_codecs.py` |
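
As a rough way to check an environment against the version range in the table above, the sketch below compares a pypdf version string with the first patched release, 6.4.0. The helper names are hypothetical, and the parser only reads the leading numeric release segment, so pre-release suffixes such as `rc1` are ignored; a full version parser would be more robust.

```python
from importlib import metadata

FIRST_PATCHED = (6, 4, 0)  # first patched pypdf release per the advisory table


def parse_release(version: str) -> tuple[int, ...]:
    """Parse leading numeric components, e.g. '6.3.0' -> (6, 3, 0)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_vulnerable(version: str) -> bool:
    """Return True if the version is below the first patched release."""
    release = parse_release(version)
    # Pad short versions such as '6.4' so they compare as (6, 4, 0).
    padded = release + (0,) * (3 - len(release))
    return padded[:3] < FIRST_PATCHED


def installed_pypdf_is_vulnerable():
    """Check the installed pypdf; returns None if pypdf is not installed."""
    try:
        return is_vulnerable(metadata.version("pypdf"))
    except metadata.PackageNotFoundError:
        return None
```

Calling `installed_pypdf_is_vulnerable()` returns `True` for an affected install, `False` for 6.4.0 or later, and `None` when the package is absent.
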