CVSS Score: 7.1 (CVSS 3.1)

CVE-2026-47214: Docling: Unsafe URI and Path Handling in HTML Backend

Impact

The HTML backend did not sufficiently validate the resources it loaded:

It accepted file:// URIs, enabling local file system access when enable_local_fetch=True.
Path resolution allowed traversal outside the intended directories via ../ sequences and absolute paths.
It did not block internal network resources when enable_remote_fetch=True.
HTTP redirects were not validated, so a request could be redirected to an unintended scheme.
There were no resource limits on remote image downloads or data: URIs.

Patches

Fixed in versions 2.91.0 (initial fixes) and 2.94.0 (additional improvements). The fixes implement:

Updated local path handling: absolute paths are always blocked, and relative paths require enable_local_fetch=True (default: False) and must stay within the configured base_path, which protects against path traversal
The file:// scheme is stripped and the remainder is treated as a local path (see above)
IP address validation to prevent SSRF
HTTP redirect validation, plus connection and read timeouts
A size limit for both remote images (with streaming download) and base64-decoded data URIs
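
The local path rules listed above can be illustrated with a minimal sketch. This is not the docling implementation: the function name `resolve_local_resource` and its signature are hypothetical, and only the parameter names `enable_local_fetch` and `base_path` come from the advisory.

```python
from pathlib import Path
from typing import Optional


def resolve_local_resource(
    src: str, base_path: Path, enable_local_fetch: bool = False
) -> Optional[Path]:
    """Return a resolved path inside base_path, or None if src is rejected.

    Hypothetical helper, written only to illustrate the rules in the advisory.
    """
    # Local access is opt-in; the default rejects everything.
    if not enable_local_fetch:
        return None

    # Strip the file:// scheme and treat the remainder as a plain local path.
    if src.startswith("file://"):
        src = src[len("file://"):]

    candidate = Path(src)

    # Absolute paths are always blocked.
    if candidate.is_absolute():
        return None

    # Resolve ../ segments and symlinks, then require containment in base_path.
    base = base_path.resolve()
    resolved = (base / candidate).resolve()
    try:
        resolved.relative_to(base)
    except ValueError:
        return None
    return resolved
```

Resolving the path before the containment check is the important step: comparing unresolved strings would let `img/../../secret.txt` pass a naive prefix test.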

Workarounds

Keep both enable_local_fetch=False and enable_remote_fetch=False (defaults) when processing untrusted HTML documents.

References

Initial fixes: v2.91.0
Additional improvements: v2.94.0

(GitHub Advisory)

Technical Details

Package Name	Ecosystem	Vulnerable Versions	First Patched Version
docling	pip	< 2.94.0	2.94.0
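
To check an installed copy against the table above, a small version comparison is enough. The sketch below is an illustration only: the helper names are hypothetical, and the parser assumes a plain `major.minor.patch` prefix rather than the full range of Python version strings.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# First patched version according to the table above.
FIRST_PATCHED = (2, 94, 0)


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse the leading major.minor.patch part of a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def is_patched(version: str) -> bool:
    """True if the version is at or above the first patched release."""
    return parse_version(version) >= FIRST_PATCHED


def installed_docling_is_patched() -> Optional[bool]:
    """Check the installed docling distribution; None if it is not installed."""
    try:
        return is_patched(metadata.version("docling"))
    except metadata.PackageNotFoundError:
        return None
```

Comparing integer tuples avoids the string-comparison trap in which "2.100.0" would sort before "2.94.0".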

Root Cause Analysis

The analysis focused on the patches introduced in docling versions 2.91.0 and 2.94.0, which the security advisory references. Comparing the code before and after the patches points to two functions in docling/backend/html_backend.py as the source of the vulnerabilities.

HTMLDocumentBackend._render_with_browser: The commit 9813190ab4126c1ff2fde1e3e72322821530390b shows that this function originally lacked controls over the browser context it created. The patch added crucial security measures: disabling JavaScript and implementing a request routing system to block unauthorized resource loading. The absence of these controls in the vulnerable versions made this function a primary entry point for exploitation.
HTMLDocumentBackend._load_image_data: The commit cd0cb695303d8ce1b3c9fe620b182b0e22d8c53f reveals multiple vulnerabilities within this single function. The original code performed HTTP requests (requests.get), base64 decoding (base64.b64decode), and file access (os.path.isfile) without proper validation. This allowed for Server-Side Request Forgery (SSRF), Uncontrolled Resource Consumption (DoS), and Path Traversal. The patch systematically adds validation and limits: IP address validation (_validate_url_safety), size checks for remote and data URI images, and stricter timeout handling.

The identified functions are directly responsible for processing external HTML and its resources (like images), which is where the vulnerabilities lie. The patch evidence clearly demonstrates the introduction of security controls that were previously missing, confirming these functions as the vulnerable ones.
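
The kind of check that IP address validation implies can be sketched with the standard library. This is an assumption-laden illustration, not the body of `_validate_url_safety`: the function names below are hypothetical, and the sketch does not address DNS rebinding, where a hostname resolves differently between the check and the actual request.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_address(host_ip: str) -> bool:
    """True only for addresses outside private, loopback and similar ranges."""
    try:
        ip = ipaddress.ip_address(host_ip)
    except ValueError:
        # Unparseable addresses are treated as unsafe.
        return False
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def url_targets_public_host(url: str) -> bool:
    """Resolve the URL's host and require every resolved address to be public."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        infos = socket.getaddrinfo(parts.hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return False
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return all(is_public_address(info[4][0]) for info in infos)
```

Requiring every resolved address to be public, rather than any one of them, closes the case where a hostname returns both a public and an internal record.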

Vulnerable functions

HTMLDocumentBackend._render_with_browser

docling/backend/html_backend.py

The function was vulnerable because it rendered HTML using a headless browser (Playwright) without adequate sandboxing. The original implementation allowed JavaScript execution and unrestricted network requests from the rendered page. This could be exploited by a malicious HTML document to fetch remote resources, access local files (`file://`), or execute arbitrary scripts. The patch mitigates this by disabling JavaScript (`java_script_enabled=False`) and implementing a routing mechanism (`_route_request`) to block unauthorized resource requests based on the URL scheme and remote fetch policy.
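
A request routing policy of the kind described here reduces to a decision per URL. The sketch below shows only that decision as a pure function. The name `should_allow_request` and the choice to permit inline `data:` and `about:` URLs are assumptions made for illustration; they are not taken from the patch.

```python
from urllib.parse import urlsplit

# Schemes that never leave the process (assumed allow-list for this sketch).
INLINE_SCHEMES = frozenset({"data", "about"})
# Schemes that reach the network and are gated by the remote fetch policy.
REMOTE_SCHEMES = frozenset({"http", "https"})


def should_allow_request(url: str, enable_remote_fetch: bool) -> bool:
    """Decide whether a browser sub-request may proceed.

    Hypothetical policy: inline schemes pass, http(s) follows the remote
    fetch setting, and everything else (including file://) is blocked.
    """
    scheme = urlsplit(url).scheme.lower()
    if scheme in INLINE_SCHEMES:
        return True
    if scheme in REMOTE_SCHEMES:
        return enable_remote_fetch
    return False
```

With Playwright, a function like this would typically be called from a handler registered through the browser context's `route` method, aborting the route when it returns False. Using an allow-list of schemes means that unknown or newly introduced schemes are blocked by default.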

HTMLDocumentBackend._load_image_data

docling/backend/html_backend.py

This function was vulnerable to multiple issues related to insecure resource handling:

1. **Server-Side Request Forgery (SSRF):** It fetched remote images via HTTP/HTTPS without validating the destination IP address, allowing an attacker to make requests to internal or restricted network resources. The patch added the `_validate_url_safety` function to prevent this.
2. **Uncontrolled Resource Consumption:** It downloaded remote images and decoded base64 data URIs without enforcing any size limits, making it susceptible to denial-of-service attacks using overly large files. The patch introduced size checks for both remote downloads and decoded data URIs.
3. **Path Traversal:** It handled `file://` URIs and local file paths without proper validation, allowing an attacker to read arbitrary files on the local filesystem by using `../` sequences or absolute paths when `enable_local_fetch` was true. The vulnerability lies in the insufficient validation before the `os.path.isfile(src_loc)` check.
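
The size checks mentioned for remote downloads and data URIs can be sketched as follows. The limit value, the exception type, and both function names are hypothetical; the advisory does not state the limit the patch actually uses.

```python
import base64
from typing import Iterable

# Hypothetical cap chosen for illustration only.
MAX_IMAGE_BYTES = 20 * 1024 * 1024


class ResourceTooLarge(ValueError):
    """Raised when a resource exceeds the configured size limit."""


def read_capped(chunks: Iterable[bytes], max_bytes: int = MAX_IMAGE_BYTES) -> bytes:
    """Accumulate streamed chunks, aborting as soon as the cap is exceeded."""
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
        if len(buffer) > max_bytes:
            raise ResourceTooLarge(f"download exceeds {max_bytes} bytes")
    return bytes(buffer)


def decode_data_uri(uri: str, max_bytes: int = MAX_IMAGE_BYTES) -> bytes:
    """Decode a base64 data: URI, rejecting payloads above the cap."""
    header, sep, payload = uri.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URI")
    # Cheap pre-check: 4 base64 characters decode to at most 3 bytes,
    # and padding can remove at most 2 bytes from that estimate.
    if (len(payload) * 3) // 4 - 2 > max_bytes:
        raise ResourceTooLarge(f"data URI exceeds {max_bytes} bytes")
    data = base64.b64decode(payload)
    # Exact check on the decoded size.
    if len(data) > max_bytes:
        raise ResourceTooLarge(f"data URI exceeds {max_bytes} bytes")
    return data
```

With the `requests` library, the chunks could come from `response.iter_content(chunk_size=8192)` on a request made with `stream=True`, so an oversized body is abandoned partway instead of being read fully into memory.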
