Regular expression denial of service (ReDoS) occurs when a poorly written regex pattern causes excessive CPU consumption when processing certain inputs. An attacker can craft a malicious string that forces the regex engine into catastrophic backtracking, consuming CPU resources and potentially freezing or crashing the application. This is a denial-of-service vulnerability that requires no special privileges or authentication.
02 How It Happens
Most regex engines use backtracking to match patterns. When a pattern contains nested quantifiers (like (a+)+) or alternation with overlapping branches (like (a|a)*), the engine may try exponentially many combinations before concluding that a string does not match. A carefully constructed input string can force the engine to explore all these combinations, causing the regex to take seconds or minutes to process a short string. The vulnerability is especially dangerous in web applications where user input is validated against regexes, since an attacker can submit a malicious string via a form field or API parameter and tie up server resources.
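The growth described above can be observed directly. The following is a minimal sketch, assuming CPython's built-in re module (a backtracking engine). The helper names time_match and growth_table are illustrative and not from any library, and the input lengths are deliberately small so the script finishes quickly.

```python
import re
import time

NESTED = re.compile(r'^(a+)+$')  # nested quantifiers
FLAT = re.compile(r'^a+$')       # accepts the same strings with one quantifier


def time_match(pattern, text):
    """Return (match_or_None, elapsed_seconds) for a single match attempt."""
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start


def growth_table(lengths=(8, 12, 16, 20)):
    """Time both patterns on inputs of 'a' * n followed by one bad character."""
    rows = []
    for n in lengths:
        text = 'a' * n + '!'  # the trailing '!' guarantees the match fails
        nested_result, nested_t = time_match(NESTED, text)
        flat_result, flat_t = time_match(FLAT, text)
        rows.append((n, nested_result is None, flat_result is None, nested_t, flat_t))
    return rows


if __name__ == '__main__':
    for n, _, _, nested_t, flat_t in growth_table():
        print(f'n={n:2d}  nested={nested_t:.6f}s  flat={flat_t:.6f}s')
```

On a backtracking engine, the nested timing should roughly double with each additional character while the flat timing stays close to constant. Exact numbers depend on the machine and interpreter version, so none are quoted here.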
03 Real-World Impact
A ReDoS vulnerability can be exploited to disable a web application without requiring any authentication or special access. An attacker sends a single HTTP request containing a crafted string that triggers the vulnerable regex, and the server's CPU spikes to 100% while the regex engine struggles. If the application is single-threaded or has limited worker processes, this can make the entire site unresponsive to legitimate users. In multi-user environments, repeated ReDoS attacks can exhaust all available worker threads, resulting in a complete denial of service.
04 Vulnerable & Fixed Patterns
Vulnerable pattern
import re

# Vulnerable: nested quantifiers cause catastrophic backtracking
email_pattern = r'^([a-zA-Z0-9]+)*@example\.com$'
user_input = 'a' * 50 + 'X'  # Does not match; engine backtracks exponentially
if re.match(email_pattern, user_input):
    print("Valid email")
else:
    print("Invalid email")
Why it's vulnerable: The pattern ([a-zA-Z0-9]+)* contains nested quantifiers. When the input fails to match, the regex engine tries every possible way to partition the string across the repeated group, leading to exponential backtracking.
Fixed pattern
import re

# Fixed: a single quantifier leaves only one way to split the input,
# so a failed match is rejected in linear time
email_pattern = r'^[a-zA-Z0-9]+@example\.com$'
user_input = 'a' * 50 + 'X'
if re.match(email_pattern, user_input):
    print("Valid email")
else:
    print("Invalid email")
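When replacing a pattern, it helps to confirm that the rewrite still accepts and rejects the inputs you care about. The sketch below compares the two Python patterns on a handful of short strings. The sample list and the compare helper are illustrative assumptions, and the inputs are kept short so the vulnerable pattern stays cheap to run.

```python
import re

VULNERABLE = re.compile(r'^([a-zA-Z0-9]+)*@example\.com$')
FIXED = re.compile(r'^[a-zA-Z0-9]+@example\.com$')

# Hypothetical sample inputs; short on purpose so the vulnerable
# pattern cannot backtrack for long.
SAMPLES = [
    'alice@example.com',
    'bob42@example.com',
    '@example.com',
    'alice@example.org',
    'al ice@example.com',
    'alice',
]


def compare(samples=SAMPLES):
    """Map each sample to (vulnerable_matches, fixed_matches)."""
    return {
        s: (VULNERABLE.match(s) is not None, FIXED.match(s) is not None)
        for s in samples
    }


if __name__ == '__main__':
    for sample, (old, new) in compare().items():
        flag = '' if old == new else '  <-- differs'
        print(f'{sample!r:24} old={old} new={new}{flag}')
```

The two patterns agree on every sample except '@example.com': the outer * in the vulnerable pattern permits an empty local part, while the fixed pattern requires at least one character. That difference is usually desirable for an email check, but it is worth knowing that the rewrite is not a pure performance change.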
Vulnerable pattern
<?php
// Vulnerable: alternation with overlapping branches
$pattern = '/^(a|a)*b$/';
$user_input = str_repeat('a', 50) . 'X'; // Does not match; catastrophic backtracking
if (preg_match($pattern, $user_input)) {
    echo "Match found";
} else {
    echo "No match";
}
?>
Why it's vulnerable: The pattern (a|a)* allows the regex engine to match the same character via either branch of the alternation, creating exponential combinations when the string does not end with 'b'.
Fixed pattern
<?php
// Fixed: use a single, unambiguous quantifier
$pattern = '/^a*b$/';
$user_input = str_repeat('a', 50) . 'X';
if (preg_match($pattern, $user_input)) {
    echo "Match found";
} else {
    echo "No match";
}
?>
05 Prevention Checklist
Avoid nested quantifiers: do not use patterns like (a+)+, (a*)*, or (a+)*. Use a single quantifier instead.
Avoid overlapping alternation: patterns like (a|a)* or (a|aa)* allow the engine to match the same text in multiple ways. Rewrite them so that each branch matches distinct text.
Use atomic grouping or possessive quantifiers: if your regex engine supports them (e.g., Java, PCRE, Python 3.11+), use (?>...) or ++ / *+ so the engine cannot backtrack into text the group or quantifier has already consumed.
Set regex timeout limits: where the engine supports it, configure matching to abort once it exceeds a reasonable threshold (e.g., 1 second). .NET accepts a match timeout and PHP's PCRE enforces pcre.backtrack_limit; Python's built-in re module has no timeout, so any limit has to be enforced from outside the engine.
Test regexes with long, non-matching input: before deploying a regex that validates user input, test it with strings that are long and do not match the pattern to ensure it completes quickly.
Use regex linters or analyzers: tools like regex101.com or static analysis plugins can flag potentially catastrophic patterns before they reach production.
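As an illustration of the possessive-quantifier item, here is a minimal sketch in Python. It assumes Python 3.11 or newer, where the re module accepts ++ and (?>...); on older versions the pattern does not compile, so the sketch checks the interpreter version first. The function names are illustrative.

```python
import re
import sys

# Possessive quantifiers and atomic groups were added to re in Python 3.11.
SUPPORTS_POSSESSIVE = sys.version_info >= (3, 11)


def compile_possessive():
    """Return a possessive version of the email pattern, or None if unsupported."""
    if not SUPPORTS_POSSESSIVE:
        return None
    # Same shape as the vulnerable pattern, but the inner "++" never gives
    # characters back, so the outer "*" has only one way to split the input.
    return re.compile(r'^(?:[a-zA-Z0-9]++)*@example\.com$')


def is_example_email(text):
    """Check text against the possessive pattern (requires Python 3.11+)."""
    pattern = compile_possessive()
    if pattern is None:
        raise RuntimeError('possessive quantifiers require Python 3.11+')
    return pattern.match(text) is not None


if __name__ == '__main__' and SUPPORTS_POSSESSIVE:
    print(is_example_email('alice@example.com'))
    print(is_example_email('a' * 5000 + '!'))  # long non-matching input
```

Keeping the nested structure here is only for demonstration. In practice the simpler single-quantifier rewrite is easier to read and works on every engine; possessive quantifiers are most useful when the nesting cannot be removed.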
06 Signs You May Already Be Affected
Monitor your application logs and server metrics for sudden CPU spikes or request timeouts that correlate with specific form submissions or API calls. If you see requests that take an unusually long time to process despite small input sizes, or if your regex validation endpoints become unresponsive after receiving certain inputs, investigate the regex patterns used in those code paths. Check your regex patterns for nested quantifiers or overlapping alternation branches.
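A first pass over existing patterns can be partly automated. The sketch below is a crude textual heuristic, not a real analyzer: it only inspects innermost groups, does not understand escaped parentheses or character classes that contain parentheses, and will both miss some risky patterns (for example overlapping but non-identical branches) and flag some safe ones (for example atomic groups). The function name suspicious_constructs and the finding labels are illustrative assumptions.

```python
import re

# An innermost group (no nested parentheses) followed by an unbounded quantifier.
_QUANTIFIED_GROUP = re.compile(r'\((?:\?:)?([^()]*)\)(?:[+*]|\{\d+,\})')
# An unbounded quantifier appearing inside the group body.
_INNER_QUANTIFIER = re.compile(r'[+*]|\{\d+,\}')


def suspicious_constructs(pattern):
    """Return a list of (reason, fragment) pairs worth reviewing by hand."""
    findings = []
    for m in _QUANTIFIED_GROUP.finditer(pattern):
        body = m.group(1)
        if _INNER_QUANTIFIER.search(body):
            findings.append(('nested quantifier', m.group(0)))
        branches = body.split('|')
        if len(branches) != len(set(branches)):
            findings.append(('duplicate alternation branch', m.group(0)))
    return findings


if __name__ == '__main__':
    for candidate in (
        r'^([a-zA-Z0-9]+)*@example\.com$',
        r'^(a|a)*b$',
        r'^[a-zA-Z0-9]+@example\.com$',
        r'^a*b$',
    ):
        print(candidate, suspicious_constructs(candidate))
```

Treat any finding as a prompt to test that pattern with long non-matching input, not as proof of a vulnerability, and treat an empty result as inconclusive.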