CVSS Score: N/A

GHSA-qvc2-mg72-jjhx: JustHTML Affected by Mutation XSS via Literal Text Serialization in Raw Text Elements (style/script)

Summary

Sanitized DOM trees can be unsafe to serialize when a custom policy allows raw-text elements such as <style> or <script>.

The issue affects DOM trees that are constructed or modified programmatically and then passed through sanitize_dom() with a policy that keeps these elements. Text nodes inside <style> and <script> are serialized literally, so attacker-controlled text containing the matching closing tag sequence can break out of the raw-text context and inject HTML into the serialized output.

The default sanitization policy is not affected because it drops the contents of style and script.

Details

The root cause is in HTML serialization of raw-text elements. In serialize.py, text children of script and style are emitted verbatim:

_LITERAL_TEXT_SERIALIZATION_ELEMENTS = frozenset({"script", "style"})

def _serialize_text_for_parent(text: str | None, parent_name: str | None) -> str:
    if not text:
        return ""
    if parent_name in _LITERAL_TEXT_SERIALIZATION_ELEMENTS:
        return text
    return _escape_text(text)

(GitHub Advisory)
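
The effect of the literal-text branch can be reproduced in isolation. The following is a minimal sketch, not the library's actual module: `_escape_text` is a stand-in built on the standard library's `html.escape` (an assumption about what the real helper does), and `serialize_element` is a hypothetical helper that wraps a single text child in its parent tag.

```python
from __future__ import annotations

from html import escape

_LITERAL_TEXT_SERIALIZATION_ELEMENTS = frozenset({"script", "style"})


def _escape_text(text: str) -> str:
    # Stand-in for the library's helper (assumption): escape &, < and >.
    return escape(text, quote=False)


def _serialize_text_for_parent(text: str | None, parent_name: str | None) -> str:
    if not text:
        return ""
    if parent_name in _LITERAL_TEXT_SERIALIZATION_ELEMENTS:
        # Raw-text parents: the text is emitted verbatim.
        return text
    return _escape_text(text)


def serialize_element(name: str, text: str) -> str:
    # Hypothetical helper: serialize an element that has one text child.
    return f"<{name}>{_serialize_text_for_parent(text, name)}</{name}>"


# Attacker-controlled text placed in a text node programmatically.
payload = "a{}</style><img src=x onerror=alert(1)>"

in_style = serialize_element("style", payload)  # closing tag survives verbatim
in_div = serialize_element("div", payload)      # angle brackets are escaped
```

In this sketch the same payload is harmless under a `div` parent, because `<` is escaped, but under a `style` parent the embedded closing tag is written out unchanged and the `<img>` that follows lands outside the element.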

Technical Details

Package Name	Ecosystem	Vulnerable Versions	First Patched Version
justhtml	pip	<= 1.11.0	1.12.0
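
One way to check an environment against the version range above is a small script. This is a rough sketch using only the standard library; `parse_release`, `is_vulnerable`, and `installed_justhtml_is_vulnerable` are hypothetical helper names, and the version parsing is deliberately simplified (it reads only the leading numeric components and ignores pre-release or local version suffixes).

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version

FIRST_PATCHED = (1, 12, 0)  # from the table above


def parse_release(v: str) -> tuple[int, ...]:
    # Simplified parser: keep leading digits of each dotted component.
    parts: list[int] = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that "1.12" compares equal to (1, 12, 0).
    while len(parts) < len(FIRST_PATCHED):
        parts.append(0)
    return tuple(parts)


def is_vulnerable(v: str) -> bool:
    return parse_release(v) < FIRST_PATCHED


def installed_justhtml_is_vulnerable() -> bool | None:
    # Returns None when justhtml is not installed in this environment.
    try:
        return is_vulnerable(version("justhtml"))
    except PackageNotFoundError:
        return None
```

A project that pins dependencies may prefer a constraint such as `justhtml>=1.12.0` in its requirements instead of a runtime check.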

Vulnerability Intelligence
Miggo AI

Root Cause Analysis

The vulnerability is a mutation XSS in the justhtml library that occurs during the serialization of DOM trees. The issue arises when a custom sanitization policy permits raw-text elements like <style> or <script>. The core problem, as detailed in the advisory and confirmed by code analysis, is that the text content within these elements was serialized literally, without escaping potentially dangerous character sequences.

An attacker could exploit this by crafting input text that includes a closing tag sequence (e.g., </style>). When this text is processed and serialized, the closing tag would prematurely terminate the raw-text element, allowing the subsequent attacker-controlled content to be interpreted as arbitrary HTML by the browser, leading to XSS.
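
To illustrate how a consumer of the serialized output would interpret such a string, the sketch below feeds an example payload to Python's standard `html.parser.HTMLParser`, which treats `style` and `script` content as raw text much as browsers do. The payload string and the `TagLogger` class are illustrative assumptions, and Python's parser is only an approximation of browser behavior.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Records start and end tags in the order the parser reports them."""

    def __init__(self) -> None:
        super().__init__()
        self.events: list[tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


# Example of serialized output where a text node inside <style> contained
# "</style>" followed by attacker-chosen markup.
serialized = "<style>a{}</style><img src=x onerror=alert(1)></style>"

logger = TagLogger()
logger.feed(serialized)
logger.close()

# True when the injected <img> was parsed as a real element.
img_escaped_rawtext = ("start", "img") in logger.events
```

The parser closes the `style` element at the first closing tag it meets, so the `img` element is reported as markup rather than as stylesheet text.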

The security patch addresses this flaw by sanitizing the content of these raw-text elements before serialization. The key changes are in commit bd2ddd9ef92991d8b1d7a871f1c9d27e72cabd5b, which adds the _sanitize_rawtext_element_contents function. This function neutralizes closing tag sequences (e.g., converting </style> to an escaped form such as &lt;/style>) and removes any non-text child nodes from within <style> and <script> elements.
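
The idea of neutralizing a closing tag sequence can be sketched as follows. This is not the code from the commit: `neutralize_rawtext` is a hypothetical function, the `&lt;` replacement is an assumption about the escaped form, and the removal of non-text child nodes is not modeled here.

```python
import re

RAWTEXT_ELEMENTS = ("script", "style")


def neutralize_rawtext(text: str, parent_name: str) -> str:
    """Hypothetical sketch: defuse '</parent_name' inside raw-text content."""
    if parent_name not in RAWTEXT_ELEMENTS:
        return text
    # Match a '<' only when it starts the parent's closing tag, in any case.
    pattern = re.compile(
        r"<(?=/" + re.escape(parent_name) + r")", re.IGNORECASE
    )
    return pattern.sub("&lt;", text)


cleaned = neutralize_rawtext("a{}</STYLE><img src=x onerror=alert(1)>", "style")
```

After this rewrite the text no longer contains a sequence that terminates the `style` element, so the trailing `<img ...>` stays inside the raw-text context and is treated as stylesheet text.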

The investigation of the patch identified the primary vulnerable functions as the public API entry points that failed to perform this sanitization prior to the fix:

sanitize_dom: This function in src/justhtml/sanitize.py is a primary method for sanitizing DOM fragments. The patch retrofits it with a call to the new _sanitize_rawtext_element_contents function.
JustHTML.__init__: The constructor of the main JustHTML class in src/justhtml/parser.py was also found to be vulnerable when initialized directly with a crafted DOM node. The patch ensures that such nodes are sanitized before further processing.
_serialize_text_for_parent: While not modified in the patch, this function from src/justhtml/serialize.py was explicitly named in the advisory as the root cause. It performs the unsafe, literal serialization, and the vulnerability exists because unsanitized data was allowed to reach it.

A secondary, but related, vulnerability was fixed in commit 23c188284afe261eadd5705fc5408420634ec00f. The _markdown_escape_text function was not escaping HTML-significant characters, creating a similar XSS risk in the to_markdown() output. This function has also been included in the analysis.

Vulnerable functions

sanitize_dom

src/justhtml/sanitize.py

This function is responsible for sanitizing a DOM tree. Before the patch, it did not sanitize the content of raw text elements like `<style>` and `<script>`, allowing malicious content containing a closing tag to be serialized literally, leading to XSS. The patch adds a call to `_sanitize_rawtext_element_contents` to fix this.

JustHTML.__init__

src/justhtml/parser.py

The constructor for the main `JustHTML` class can take a DOM node as input. Before the patch, it would serialize this node without properly sanitizing the contents of raw text elements, leading to the same XSS vulnerability as `sanitize_dom`. The patch adds the necessary sanitization step.

_serialize_text_for_parent

src/justhtml/serialize.py

This function, named in the advisory as the root cause, serializes the text content of raw-text elements (`<script>`, `<style>`) without any escaping. When unsanitized input reaches it, an attacker can break out of the raw-text element and inject HTML, leading to XSS.

_markdown_escape_text

src/justhtml/node.py

This function did not escape '<' and '&' characters in text nodes when converting to Markdown. This could allow a crafted text node to be rendered as raw HTML in the Markdown output, leading to a cross-site scripting (XSS) vulnerability.
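
The kind of escaping described for the Markdown path can be sketched in a few lines. `escape_html_significant` is a hypothetical name and this is not the library's `_markdown_escape_text`; it shows only the handling of `<` and `&`, not any Markdown-specific characters the real function may also escape.

```python
def escape_html_significant(text: str) -> str:
    """Hypothetical sketch: escape '&' and '<' in text destined for Markdown."""
    # Replace '&' first so the entities produced for '<' are not re-escaped.
    return text.replace("&", "&amp;").replace("<", "&lt;")


escaped = escape_html_significant("<img src=x onerror=alert(1)> & more")
```

Because most Markdown renderers pass inline HTML through, escaping `<` keeps a crafted text node from being rendered as a tag.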
