CVE-2022-2309 is a NULL pointer dereference in lxml that is triggered by forged XML input when lxml runs against libxml2 versions 2.9.10 through 2.9.14, specifically when the application uses iterwalk or canonicalize. The fixing commit (86368e9cf70a0ad23cccd5ee32de847149af0c6f) patches three internal helper functions that handle namespace data: _build_nsmap, _countNsDefs, and _appendStartNsEvents. Each patch adds a check for NULL or inconsistent namespace fields (such as c_ns.href or c_ns.prefix) before those fields are accessed.

In the vulnerable versions these checks were missing, so malformed namespace data (produced by libxml2's handling of incorrect parser input) could cause the helpers to dereference a NULL pointer and crash the process. Because higher-level APIs such as iterwalk call these helpers, they are the direct locations of the exploitable defect: they contained the unsafe code before the patch, and they are the functions that would appear in a stack trace during exploitation.

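
The guard pattern described above can be illustrated with a small pure-Python model. This is only a sketch under stated assumptions, not lxml's actual code (the real helpers are internal to lxml and operate on libxml2 structures): `NsDecl`, `count_ns_defs`, and `build_nsmap` are hypothetical names introduced for illustration, and `None` stands in for a NULL field.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class NsDecl:
    """Hypothetical stand-in for a namespace record (not an lxml type)."""
    prefix: Optional[str]  # None models a NULL prefix (default namespace)
    href: Optional[str]    # None models a NULL href (malformed declaration)


def count_ns_defs(decls: List[NsDecl]) -> int:
    """Count only the declarations that carry a usable href."""
    return sum(1 for ns in decls if ns.href is not None)


def build_nsmap(decls: List[NsDecl]) -> Dict[Optional[str], str]:
    """Build a prefix -> href map, skipping records with a missing href."""
    nsmap: Dict[Optional[str], str] = {}
    for ns in decls:
        if ns.href is None:
            # Guard: skip the record instead of using missing data.
            continue
        nsmap[ns.prefix] = ns.href
    return nsmap
```

In this model, a record with a missing href is ignored by both helpers, so a consumer that sizes a buffer from the count and then fills it from the map sees consistent results. In compiled code the unguarded access would be a NULL dereference rather than a Python exception.
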
| Package Name | Ecosystem | Vulnerable Versions | First Patched Version |
|---|---|---|---|
| lxml | pip | < 4.9.1 | 4.9.1 |
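
The version constraints above can be turned into a quick exposure check. The following is a minimal sketch, assuming that exposure requires both an lxml release older than 4.9.1 and a libxml2 runtime in the 2.9.10 through 2.9.14 range; `is_exposed` and the constant names are hypothetical helpers, while `etree.LXML_VERSION` and `etree.LIBXML_VERSION` are the version tuples that lxml exposes.

```python
from typing import Sequence

# Bounds taken from the advisory text and table.
FIRST_PATCHED_LXML = (4, 9, 1)
AFFECTED_LIBXML2_MIN = (2, 9, 10)
AFFECTED_LIBXML2_MAX = (2, 9, 14)


def is_exposed(lxml_version: Sequence[int], libxml2_version: Sequence[int]) -> bool:
    """Return True if the version pair falls in the affected range.

    Hypothetical helper: compares only the first three version components.
    """
    lxml_v = tuple(lxml_version)[:3]
    libxml2_v = tuple(libxml2_version)[:3]
    unpatched_lxml = lxml_v < FIRST_PATCHED_LXML
    affected_libxml2 = AFFECTED_LIBXML2_MIN <= libxml2_v <= AFFECTED_LIBXML2_MAX
    return unpatched_lxml and affected_libxml2


if __name__ == "__main__":
    try:
        from lxml import etree
    except ImportError:
        print("lxml is not installed")
    else:
        # LXML_VERSION is a 4-tuple, LIBXML_VERSION is the runtime libxml2 version.
        print(is_exposed(etree.LXML_VERSION, etree.LIBXML_VERSION))
```

A result of `False` here only reflects the version ranges listed in this advisory; upgrading lxml to 4.9.1 or later remains the fix.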