CVE-2018-19787: Improper Neutralization of Input During Web Page Generation in LXML
CVSS Score
6.1 (CVSS 3.0)
Basic Information
CVE ID
GHSA ID
EPSS Score
0.53129%
CWE
Published
5/13/2022
Updated
9/30/2024
KEV Status
No
Technology
Python
Technical Details
CVSS Vector
CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
Package Name | Ecosystem | Vulnerable Versions | First Patched Version |
---|---|---|---|
lxml | pip | < 4.2.5 | 4.2.5 |
Vulnerability Intelligence
Miggo AI
Root Cause Analysis
The vulnerability stems from the HTML sanitizer failing to handle percent-encoded JavaScript URLs. Commit 6be1d08 fixes _remove_javascript_link by passing the link through unquote_plus() before whitespace substitution. The pre-patch function only stripped whitespace and never URL-decoded its input, so a 'javascript:' scheme obfuscated with percent-encoded characters did not match the scheme check and was left in the output. The accompanying test case changes in test_clean.txt exercise this attack vector. Because this function is directly responsible for sanitizing link URLs and is the one the patch modifies, it is identified as the vulnerable function.
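The difference between the two behaviours can be illustrated with a minimal sketch. This is not the lxml source: the helper names, the regular expressions, and the sample payload below are simplified assumptions chosen for illustration, and only the use of unquote_plus() before whitespace stripping is taken from the analysis above.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical, simplified stand-ins for the sanitizer's helpers.
# The real patterns in lxml are broader; these are assumptions.
_strip_whitespace = re.compile(r"\s+").sub
_is_script_scheme = re.compile(r"^(?:javascript|vbscript):", re.I).search


def remove_javascript_link_prepatch(link):
    """Pre-patch behaviour (sketch): strip whitespace only, no URL decoding."""
    candidate = _strip_whitespace("", link)
    return "" if _is_script_scheme(candidate) else link


def remove_javascript_link_patched(link):
    """Patched behaviour (sketch): URL-decode first, then strip whitespace."""
    candidate = _strip_whitespace("", unquote_plus(link))
    return "" if _is_script_scheme(candidate) else link


# Illustrative payload: the scheme is split by a percent-encoded newline.
encoded = "j%0Aavascript:alert(1)"

prepatch_result = remove_javascript_link_prepatch(encoded)  # link is kept
patched_result = remove_javascript_link_patched(encoded)    # link is blanked
```

In this sketch the pre-patch variant returns the encoded link unchanged because the literal string never matches the scheme check, while the patched variant decodes `%0A` to a newline, strips it, recognises `javascript:`, and returns an empty string. Upgrading to lxml 4.2.5 or later is the actual remediation.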