Regular expression engines rely on backtracking to evaluate patterns against input. Certain regex patterns lead to catastrophic backtracking, where the evaluation time grows exponentially with the input size.
Regular expression engines use backtracking to try all possible execution paths when evaluating a pattern against an input. In some cases, this leads to catastrophic backtracking situations where the worst-case evaluation time grows exponentially with the input size. A small, carefully crafted input (around 20 characters) can trigger exponential backtracking and cause a denial of service of the application.
This rule reports regular expressions that create exponential backtracking.
An attacker who can control the input matched against a vulnerable regular expression can craft a payload that causes the regex engine to explore an exponential number of execution paths. This causes the matching thread to consume excessive CPU time, preventing the application from serving legitimate users.
To fix a vulnerable regular expression, consider the following strategies:
. with negated character classes to exclude separators where applicable (e.g., [^_]* instead of
.* before _).{1,5} to limit repetitions.The following regular expression is vulnerable to exponential backtracking and will never terminate for the given input.
/(a+)+$/.test( "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"+ "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"+ "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"+ "aaaaaaaaaaaaaaa!" ); // Noncompliant
Atomic grouping can be emulated using lookahead assertions and backreferences:
/((?=(a+))\2)+$/.test( "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"+ "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"+ "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"+ "aaaaaaaaaaaaaaa!" );