The vulnerability is a resource exhaustion issue in pypdf's LZW decoding. An attacker can craft a PDF file containing a highly compressed LZW stream. When pypdf decompresses this stream, it may allocate up to 1 GB of memory per stream, the limit set by the default max_output_length in the LzwCodec class. This can be exploited to exhaust system memory and cause a denial of service.
The provided patch addresses this by modifying the LzwCodec.__init__ method in pypdf/_codecs/_codecs.py: the default value of max_output_length is reduced from 1_000_000_000 (1 GB) to 75_000_000 (75 MB). This caps the memory that can be allocated while decompressing a single LZW stream, which mitigates the vulnerability. The accompanying change to the LZW_MAX_OUTPUT_LENGTH constant in pypdf/filters.py is consistent with this fix.
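
To illustrate why an output cap limits the damage, here is a minimal sketch of a textbook LZW codec whose decoder refuses to grow past a configurable limit. It is not pypdf's implementation: the functions lzw_compress and lzw_decompress, the OutputLimitExceeded exception, and the use of a plain list of integer codes (rather than the bit-packed PDF LZWDecode format) are all assumptions made for illustration. Only the parameter name max_output_length and the 75_000_000 default are taken from the patch description above.

```python
class OutputLimitExceeded(Exception):
    """Hypothetical error: decoded output would exceed the configured cap."""


def lzw_compress(data: bytes) -> list[int]:
    """Textbook LZW encoder that returns a list of integer codes."""
    table = {bytes([i]): i for i in range(256)}
    codes = []
    current = b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            # Keep extending the current phrase while it is known.
            current = candidate
        else:
            # Emit the longest known phrase and register the new one.
            codes.append(table[current])
            table[candidate] = len(table)
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int], max_output_length: int = 75_000_000) -> bytes:
    """Textbook LZW decoder that stops once the output would exceed the cap."""
    table = {i: bytes([i]) for i in range(256)}
    output = bytearray()
    previous = b""
    for code in codes:
        if code in table:
            entry = table[code]
        elif code == len(table) and previous:
            # Code refers to the phrase that is being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        # Check the cap before appending, so the buffer never exceeds it.
        if len(output) + len(entry) > max_output_length:
            raise OutputLimitExceeded(
                f"decoded output would exceed {max_output_length} bytes"
            )
        output += entry
        if previous:
            table[len(table)] = previous + entry[:1]
        previous = entry
    return bytes(output)


if __name__ == "__main__":
    payload = b"A" * 100_000
    codes = lzw_compress(payload)
    print(f"{len(payload)} input bytes compress to {len(codes)} codes")
    try:
        lzw_decompress(codes, max_output_length=10_000)
    except OutputLimitExceeded as exc:
        print(f"decoding aborted: {exc}")
```

In this sketch a highly repetitive input shrinks to a comparatively short code sequence, which is the amplification an attacker relies on. Because the decoder checks the limit before each append, the output buffer is bounded by max_output_length regardless of how far the stream would otherwise expand. Lowering the default therefore lowers the worst-case allocation per stream.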