CVSS Score: 5.3 (CVSS 3.0)

Basic Information

CVE-2025-3044: LlamaIndex vulnerability in ArxivReader class can cause MD5 hash collisions

A vulnerability in the ArxivReader class of the run-llama/llama_index repository allows for MD5 hash collisions when generating filenames for downloaded papers. This can lead to data loss as papers with identical titles but different contents may overwrite each other, preventing some papers from being processed for AI model training. The issue is resolved in llama-index-readers-papers version 0.3.1 (in llama-index 0.12.28).

(GitHub Advisory)

Technical Details

Package Name	Ecosystem	Vulnerable Versions	First Patched Version
llama-index-readers-papers	pip	< 0.3.1	0.3.1
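
The version range in the table above can be turned into a quick check. The following is a minimal sketch, not an official tool: the names `parse_release`, `is_vulnerable`, and `FIRST_PATCHED` are hypothetical, and the parser only reads the leading numeric part of a version string, so pre-release suffixes such as `rc1` are ignored.

```python
import re

# First patched release of llama-index-readers-papers, per the table above.
FIRST_PATCHED = (0, 3, 1)


def parse_release(version: str) -> tuple[int, ...]:
    """Return the leading dotted-numeric part of a version as a tuple of ints.

    Assumption: versions look like "0.3.0"; anything after the numeric
    prefix (for example "rc1") is ignored by this simplified parser.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_vulnerable(version: str) -> bool:
    """True if the given version is older than the first patched release."""
    return parse_release(version) < FIRST_PATCHED


if __name__ == "__main__":
    for candidate in ("0.2.9", "0.3.0", "0.3.1", "0.4.0"):
        print(candidate, is_vulnerable(candidate))
```

To check a real environment, the installed version string could be obtained with `importlib.metadata.version("llama-index-readers-papers")` and passed to `is_vulnerable`.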

Root Cause Analysis

The vulnerability is in the ArxivReader class of the llama-index-readers-papers package; the affected methods are load_data and load_papers_and_abstracts. The root cause is the way filenames for downloaded arXiv papers were generated: each filename was derived by hashing only the paper's title with the _hacky_hash method. Paper titles are not guaranteed to be unique, so two different papers with the same title produce the same filename. When both are downloaded, the second overwrites the first, which causes data loss and prevents the overwritten paper from being processed.

The fix, implemented in commit f69e1c0e7579228fec4cfaf716e4f951e131de77, makes the hash input unique per paper. It combines the paper's title with its unique entry_id before hashing, so papers that share a title receive distinct filenames.
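
The filename collision and its fix can be illustrated with a short, self-contained sketch. This is not the package's actual code: `hacky_hash` is a stand-in that assumes the real `_hacky_hash` returns an MD5 hex digest (the advisory mentions MD5), the way title and entry_id are concatenated is an assumption, and the paper titles and entry IDs are made-up placeholders.

```python
import hashlib


def hacky_hash(some_string: str) -> str:
    """Stand-in for the reader's _hacky_hash helper (assumed: MD5 hex digest)."""
    return hashlib.md5(some_string.encode("utf-8")).hexdigest()


def filename_title_only(title: str) -> str:
    """Pre-fix behavior as described: the filename depends only on the title."""
    return hacky_hash(title) + ".pdf"


def filename_title_and_id(title: str, entry_id: str) -> str:
    """Post-fix behavior as described: title and entry_id are hashed together.

    The exact concatenation used by the real commit is an assumption here.
    """
    return hacky_hash(title + entry_id) + ".pdf"


# Two hypothetical papers that share a title but have different entry IDs.
papers = [
    {"title": "A Survey of Graph Neural Networks",
     "entry_id": "http://arxiv.org/abs/0000.00001v1"},
    {"title": "A Survey of Graph Neural Networks",
     "entry_id": "http://arxiv.org/abs/0000.00002v1"},
]

# Sets collapse duplicate filenames, so their sizes show how many files survive.
before = {filename_title_only(p["title"]) for p in papers}
after = {filename_title_and_id(p["title"], p["entry_id"]) for p in papers}

print("distinct filenames, title only:", len(before))
print("distinct filenames, title + entry_id:", len(after))
```

With title-only hashing both papers map to one filename, so the second download replaces the first; including the entry ID in the hash input gives each paper its own file.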
