The vulnerability occurs when a comma separating email addresses in an address list is incorrectly encoded as part of an encoded-word (unicode-encoded) during header line folding. The provided patches modify Lib/email/_header_value_parser.py.
- Commit 09fab93c3d857496c0bd162797fab816c311ee48 (identical changes appear in 70754d21c288535e86070ca7a6e90dcb670b8593 and 9148b77e0af91cdacaa7fe3dfac09635c3fe9a74):
  - In Lib/email/_header_value_parser.py:
    - A global ListSeparator object is defined: ListSeparator = ValueTerminal(',', 'list-separator').
    - A new attribute is set on this object: ListSeparator.as_ew_allowed = False. This is the core of the fix: it prevents the list separator (comma) from being included in 'encoded-word' (EW) encoding.
    - The function get_address_list(value) is modified. The line address_list.append(ValueTerminal(',', 'list-separator')) is changed to address_list.append(ListSeparator). This ensures that the specifically configured ListSeparator (which may not be EW-encoded) is used instead of a generic ValueTerminal for the comma.
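The shape of this change can be illustrated with a small self-contained model. This is a sketch only, not the CPython source: the Terminal class, LIST_SEPARATOR, and build_address_list below are simplified, hypothetical stand-ins that reproduce only the idea of a shared separator token whose as_ew_allowed flag is turned off.

```python
class Terminal(str):
    """Hypothetical, simplified stand-in for the parser's terminal token."""

    # Class-level default: a token may be folded into an encoded-word.
    as_ew_allowed = True

    def __new__(cls, value, token_type):
        self = super().__new__(cls, value)
        self.token_type = token_type
        return self


# One shared separator token, opted out of encoded-word encoding.
# The flag is set on this instance only, so other terminals keep the default.
LIST_SEPARATOR = Terminal(',', 'list-separator')
LIST_SEPARATOR.as_ew_allowed = False


def build_address_list(addresses):
    """Interleave address tokens with the shared, non-encodable separator."""
    tokens = []
    for index, address in enumerate(addresses):
        if index:
            # Patched behaviour in this model: reuse the configured separator
            # instead of creating a fresh Terminal(',', 'list-separator').
            tokens.append(LIST_SEPARATOR)
        tokens.append(Terminal(address, 'address'))
    return tokens


tokens = build_address_list(['a@example.com', 'b@example.com'])
for token in tokens:
    print(token.token_type, repr(str(token)), token.as_ew_allowed)
```

In this model a freshly created Terminal(',', 'list-separator') still reports as_ew_allowed as True, which corresponds to the pre-patch situation where the comma was an ordinary, encodable token.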
The function get_address_list parses the address string and appends the comma separator token. In the vulnerable version, the comma it appended was a generic ValueTerminal, so the folding and encoding logic elsewhere in the module could later include it in an encoded-word. The patch changes get_address_list to append a comma token that is explicitly marked as not eligible for such encoding. Therefore, within the scope of the provided patch, email._header_value_parser.get_address_list is the function where the vulnerable behavior (adding a potentially mis-encodable separator) originates.
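One way to see which behavior a given interpreter has is to inspect the separator tokens that get_address_list produces. The snippet below is a hedged diagnostic sketch: email._header_value_parser is a private module whose internals may differ between Python versions, and separator_flags is a hypothetical helper name introduced here for illustration.

```python
from email import _header_value_parser as parser  # private module, may change


def separator_flags(value):
    """Return the as_ew_allowed flag of each list-separator token in value."""
    address_list, _remainder = parser.get_address_list(value)
    return [
        # Fall back to True if the attribute is missing on this version
        # (an assumption made for this sketch).
        bool(getattr(token, 'as_ew_allowed', True))
        for token in address_list
        if getattr(token, 'token_type', None) == 'list-separator'
    ]


flags = separator_flags('a@example.com, b@example.com')
print(flags)
```

Based on the patch described above, an interpreter that includes the fix should report False for each separator, while an unpatched one would be expected to report True. The actual output depends on the Python build in use.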