The vulnerability, GHSA-9mv7-3c64-mmqw, allows arbitrary file reads in xml2rfc through a path traversal weakness in the handling of link elements with rel="attachment" during PDF generation. The root cause is that the input XML was not sanitized early enough in the processing pipeline.
The provided patch at commit 73fb1c91fc62ac540bb6bd24f982f2becf84c1b0 reveals the fix and, by extension, the flaw in the previous design.
- Ineffective Late Sanitization: The patch removes the call to strip_link_attachments(self.tree) from the validate method within the BaseWriter class (xml2rfc/writers/base.py). This indicates that sanitization was previously intended to happen within the writer itself. The existence of the vulnerability suggests that this was either too late or insufficient, as the malicious payload was likely processed before validate was called.
- Introduction of Early Sanitization: The patch introduces a new sanitize() method to the XmlRfcParser class (xml2rfc/parser.py), which calls xml2rfc.utils.strip_link_attachments. Crucially, the main function in xml2rfc/run.py is modified to call this new sanitize() method immediately after the XML file is parsed and before any writer is invoked.
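
The ordering described above (parse, sanitize, then hand the tree to a writer) can be sketched as follows. This is a minimal illustration built on the standard library's ElementTree, not the xml2rfc implementation: the names strip_link_attachments_sketch and parse_and_sanitize are hypothetical, and the actual body of xml2rfc.utils.strip_link_attachments is not shown in the text above.

```python
import xml.etree.ElementTree as ET


def strip_link_attachments_sketch(root):
    """Remove every <link rel="attachment"> element under root.

    Hypothetical stand-in for xml2rfc.utils.strip_link_attachments;
    returns the number of elements removed.
    """
    removed = 0
    # Snapshot the element list first so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "link" and child.get("rel") == "attachment":
                # Note: ElementTree drops the child's tail text as well.
                parent.remove(child)
                removed += 1
    return removed


def parse_and_sanitize(xml_text):
    """Parse the input and sanitize it before any writer can see the tree."""
    root = ET.fromstring(xml_text)
    strip_link_attachments_sketch(root)
    return root


if __name__ == "__main__":
    # The href attribute as the carrier of the file path is an assumption.
    doc = (
        '<rfc><link rel="attachment" href="../../etc/passwd"/>'
        '<front><link rel="alternate" href="https://example.org/"/></front></rfc>'
    )
    tree = parse_and_sanitize(doc)
    print(ET.tostring(tree, encoding="unicode"))
```

Because the sanitizing step runs inside the parsing helper, every consumer of the returned tree receives the stripped document, regardless of which writer is chosen afterwards.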
Based on this, two key functions are identified as central to the vulnerability:
- xml2rfc.run.main: In its vulnerable state, this function orchestrates the entire process without ensuring the input is safe, directly leading to the invocation of vulnerable code in the writers.
- xml2rfc.writers.base.BaseWriter.validate: This function represents the flawed attempt at mitigation. It would appear in a runtime profile during exploitation as part of the PDF generation process, but it fails to prevent the vulnerability because, by the time it runs, the malicious link has likely already been processed.
Therefore, an engineer looking for this vulnerability in their environment should monitor for executions of the xml2rfc tool that process untrusted XML, paying close attention to the call stack involving the main function and the validate method of the writers.
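
As a complement to runtime monitoring, untrusted inputs can be screened before they reach xml2rfc. The sketch below lists the targets of link elements with rel="attachment" in a document. The helper name find_attachment_links is hypothetical, and the assumption that the path is carried in an href attribute is not confirmed by the text above.

```python
import xml.etree.ElementTree as ET


def find_attachment_links(xml_text):
    """Return the href values of all <link rel="attachment"> elements.

    Hypothetical pre-flight check, not part of xml2rfc. For hostile
    input, a hardened XML parser may be preferable to ElementTree.
    """
    root = ET.fromstring(xml_text)
    return [
        el.get("href", "")
        for el in root.iter("link")
        if el.get("rel") == "attachment"
    ]


if __name__ == "__main__":
    sample = '<rfc><link rel="attachment" href="../../etc/passwd"/></rfc>'
    for target in find_attachment_links(sample):
        print("attachment link found:", target)
```

Any non-empty result on an untrusted document is a reasonable signal to inspect the file before running it through a version of xml2rfc that predates the patch.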