Regular expression engines rely on backtracking to evaluate patterns against input. Certain regex patterns lead to catastrophic backtracking, where the evaluation time grows exponentially with the input size.

Why is this an issue?

Regular expression engines use backtracking to try all possible execution paths when evaluating a pattern against an input. In some cases, this leads to catastrophic backtracking situations where the worst-case evaluation time grows exponentially with the input size. A small, carefully crafted input (around 20 characters) can trigger exponential backtracking and cause a denial of service of the application.

This rule reports regular expressions that create exponential backtracking.

What is the potential impact?

Denial of service

An attacker who can control the input matched against a vulnerable regular expression can craft a payload that causes the regex engine to explore an exponential number of execution paths. This causes the matching thread to consume excessive CPU time, preventing the application from serving legitimate users.

How to fix it

To fix a vulnerable regular expression, consider the following strategies:

Code examples

The following regular expression is vulnerable to exponential backtracking and will never terminate for the given input.

Noncompliant code example

/(a+)+$/.test(
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"+
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"+
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"+
"aaaaaaaaaaaaaaa!"
); // Noncompliant

Compliant solution

Atomic grouping can be emulated using lookahead assertions and backreferences:

/((?=(a+))\2)+$/.test(
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"+
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"+
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"+
"aaaaaaaaaaaaaaa!"
);

Resources

Articles & blog posts

Standards

Related rules