The vulnerability is an email header injection caused by improper quoting of newlines during serialization. I analyzed the provided commit URLs, particularly those from the main pull request fixing the issue.
- email.generator.Generator._write_headers: This function writes each folded header directly to the output. The patch (commit 921cbfd2165abdfe387bc283996ed9cde11f717d) adds a check, controlled by the policy setting verify_generated_headers, that looks for improperly folded newlines before writing. This indicates that, prior to the patch, the function would write whatever string policy.fold() returned, including a potentially malicious one, which makes it a key vulnerable function.
- email._header_value_parser._refold_parse_tree: This function refolds parsed header values. The patch (commit bd7f922f6214364923aa5959ee5d9733f600ce85 and its refinements) ensures that newlines within header content are marked for RFC 2047 encoding. Before this, unencoded newlines could pass through this stage to the folding mechanism and contribute to the vulnerability if nothing downstream caught them.
- email.policy.Policy.fold (and its implementations, such as email.policy.EmailPolicy.fold, email.policy.Compat32.fold, or custom fold methods): The core of the vulnerability is the improper quoting of newlines, which is the responsibility of the fold method. While the primary patches modify the generator and parser, the tests (e.g., test_verify_generated_headers in commit 59c06c3dd1a8435e76aa53d43d89ea9866181a4b) and commit messages explicitly point out that fold implementations that were not careful about newlines were the source of the malformed headers. The generator now verifies the output of fold.

These three functions form a chain: _refold_parse_tree prepares header content, fold formats it (potentially mishandling newlines), and _write_headers serializes it (previously without sufficient validation). All would appear in a runtime profile when a vulnerable header is constructed and then serialized.
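
The chain described above can be illustrated with a minimal sketch. It is not CPython's code: NaiveFoldPolicy, BARE_NEWLINE, and folded_header_is_safe are hypothetical names introduced here, and the regular expression is only an assumption about what "a newline not followed by folding whitespace" could look like. The sketch calls fold() directly rather than going through the generator, so it behaves the same on patched and unpatched interpreters.

```python
import re
from email.policy import EmailPolicy

# Hypothetical pattern (an assumption, not taken from the patch): a CR or LF
# that is NOT followed by a space or tab, i.e. a line break that starts a new
# header line instead of a folded continuation line.
BARE_NEWLINE = re.compile(r"\r\n[^ \t]|\r[^ \n\t]|\n[^ \t]")


class NaiveFoldPolicy(EmailPolicy):
    """Hypothetical policy whose fold() is careless about newlines."""

    def fold(self, name, value):
        # Embedded newlines in `value` are copied through unquoted.
        return f"{name}: {value}{self.linesep}"


def folded_header_is_safe(folded, linesep="\n"):
    """Return True if `folded` is one header, optionally with continuation lines."""
    if not folded.endswith(linesep):
        return False
    body = folded[: -len(linesep)]  # drop the terminating line separator
    return BARE_NEWLINE.search(body) is None


policy = NaiveFoldPolicy()
# Attacker-controlled value that smuggles a second header after a newline.
injected = policy.fold("Subject", "Hello\nBcc: attacker@example.com")
# A newline followed by whitespace is an ordinary folded continuation.
benign = policy.fold("Subject", "Hello\n world")

print(repr(injected))
print(folded_header_is_safe(injected, policy.linesep))  # False
print(folded_header_is_safe(benign, policy.linesep))    # True
```

A generator that ran a check of this kind on every folded header before writing it would reject the first value, while a generator without it would emit the smuggled Bcc line as a separate header.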