# LeetCode: Regular Expression Matching



## Question

Given an input string (s) and a pattern (p), implement regular expression matching with support for ‘.’ and ‘*’.

'.' Matches any single character.
'*' Matches zero or more of the preceding element.


The matching should cover the entire input string (not partial).

Note:

s could be empty and contains only lowercase letters a-z.
p could be empty and contains only lowercase letters a-z, and characters like . or *.


Example 1:

Input:

s = "aa"
p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".


## Explanation

This problem is not easy, some corner test cases need to handle carefully.

Please try to write the correct code first and then simplify it.

### Depth-first search (DFS)

Let’s compare the current character of input and pattern.

There are two normal cases without consideration of *:

1. cur_s == cur_p
2. cur_s != cur_p, but cur_p == '.' and cur_s is any character


Now let’s consider how to handle the pattern *:

If we have a X* in the pattern(X stands for any character), we can skip zero or more than one character(same as the previous one) from the input.

According to these analyses, we can construct a depth-first search algorithm, it’s a recursive algorithm.

#include <string>
#include <iostream>
using namespace std;

class Solution {
public:
bool isMatch(string s, string p) {
if (p.empty()) return s.empty();

bool first_match = !s.empty() && (s[0] == p[0] || '.' == p[0]);
if (p.size() >= 1 && '*' == p[1])
return isMatch(s, p.substr(2)) || (first_match && isMatch(s.substr(1), p));
else
return  first_match && isMatch(s.substr(1), p.substr(1));
}
};


This solution is slow, from the result we can see it only beat 11% submissions. The time complexity is hard to analyze. You may refer to more details here.

### DFS with memorization

In the recursive version, there are some overlap subproblems we can optimize.

Naive DFS algorithm could be optimized with a memorization data structure. We could add a memo map to store the computed result, it’s a typical caching technology.

In this solution, I use text + "_" + pattern as a key for memorization, which is not the best.

The result is much better:

#include <string>
#include <map>
#include <iostream>
using namespace std;

class Solution {
public:
bool isMatch(string s, string p) {
if (p.empty()) return s.empty();
string key = s + "_" + p;

map<string, bool>::iterator iter = memo.find(key);
if(iter != memo.end()) { return iter->second; }

bool res;
bool first_match = !s.empty() && (s[0] == p[0] || '.' == p[0]);

if (p.size() >= 1 && '*' == p[1])
res = isMatch(s, p.substr(2)) || (first_match && isMatch(s.substr(1), p));
else
res = first_match && isMatch(s.substr(1), p.substr(1));
memo.insert(std::pair<string, bool>(key, res));
return res;
}

private:
map<string, bool> memo;
};


### Dynamic programming

Since we could implement the algorithm in a top-down pattern, we could also implement it from bottom to up.

Here is the scenario where the dynamic programming techniques could be applied to.

The time complexity is $$O(N*M)$$

#include <string>
#include <vector>
#include <iostream>
using namespace std;

class Solution {
public:
bool isMatch(string s, string p) {
vector<vector<bool>> dp(s.size()+1, vector<bool>(p.size()+1, false));
dp[0][0] = true;
for(int i = 1; i <= p.size(); i++) {
dp[0][i] = (i>1 && p[i-1]=='*' && dp[0][i-2]);
}
for (int i = 1; i <= s.size(); i++) {
for (int j = 1; j <= p.size(); j++) {
if (s[i-1] == p[j-1] || p[j-1] == '.')
dp[i][j] = dp[i-1][j-1];
else if (p[j-1] == '*') {
if (s[i-1] == p[j-2] || p[j-2] == '.')
dp[i][j] = dp[i][j-1] || dp[i][j-2] || dp[i-1][j];
else
dp[i][j] = dp[i][j-2];
};
}
}
return dp[s.size()][p.size()];
}
};

Last Updated on