| Package Name | Ecosystem | Vulnerable Versions | First Patched Version |
|---|---|---|---|
| transformers | pip | < 4.50.0 | 4.50.0 |

The vulnerability description points to a ReDoS in `tokenization_gpt_neox_japanese.py`, in the `SubWordJapaneseTokenizer` class, caused by a regex prone to catastrophic backtracking. Commit 92c5ca9dd70de3ade2af2eb835c96215cc50e815 confirms this: it modifies the `self.content_repatter6` regex in that class's `__init__` method. The same vulnerable pattern, with the same fix, appears in the `__init__` method of the `SubWordJapaneseTokenizer` class in the deprecated `gptsan_japanese` module. The commit also addresses a second ReDoS in the `normalize_list_like_lines` function in `tokenization_nougat_fast.py` by removing a complex regex and refactoring the function. These functions are flagged as vulnerable because they either define the regex (the `__init__` methods, where it is compiled) or apply one directly (`normalize_list_like_lines`). On crafted input, such patterns backtrack catastrophically, causing excessive CPU usage and a potential DoS.
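
To make "catastrophic backtracking" concrete, here is a minimal sketch in Python. The two patterns are textbook examples chosen for illustration and are assumptions, not the regexes from `transformers`. The helper names `time_match` and `demo` are likewise hypothetical. Both patterns accept the same strings, but the one with nested quantifiers does roughly exponential work on an input that almost matches.

```python
import re
import time

# Illustrative patterns only (assumption): NOT the regexes patched in transformers.
VULNERABLE = re.compile(r"^(a+)+$")  # nested quantifiers invite exponential backtracking
SAFE = re.compile(r"^a+$")           # accepts the same strings with a single quantifier


def time_match(pattern, text):
    """Return (match_or_None, elapsed_seconds) for one match attempt."""
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start


def demo(max_n=18):
    """Time both patterns on 'a' * n + '!', an input that fails only at the last character."""
    rows = []
    for n in range(10, max_n + 1, 4):
        text = "a" * n + "!"
        _, slow = time_match(VULNERABLE, text)
        _, fast = time_match(SAFE, text)
        rows.append((n, slow, fast))
    return rows


if __name__ == "__main__":
    for n, slow, fast in demo():
        print(f"n={n:2d}  vulnerable={slow:.6f}s  safe={fast:.6f}s")
```

With Python's backtracking `re` engine, the time for the vulnerable pattern should grow sharply as `n` increases, while the safe pattern stays nearly flat. Keep `max_n` small when experimenting, since each extra character roughly doubles the work.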