Question Given an input string (s) and a pattern (p), implement regular expression matching with support for ‘.’ and ‘*’.
'.' Matches any single character. '*' Matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).
Note: s could be empty and contains only lowercase letters a-z. p could be empty and contains only lowercase letters a-z, and characters like . or *.
Example 1:
Input:
s = "aa" p = "a" Output: false Explanation: "a" does not match the entire string "aa".
Explanation This problem is not easy, some corner test cases need to handle carefully.
Please try to write the correct code first and then simplify it.
Depth-first search (DFS) Let’s compare the current character of input and pattern.
There are two normal cases without consideration of *
:
1. cur_s == cur_p 2. cur_s != cur_p, but cur_p == '.' and cur_s is any character
Now let’s consider how to handle the pattern *
:
If we have a X*
in the pattern(X stands for any character), we can skip zero or more than one character(same as the previous one) from the input.
According to these analyses, we can construct a depth-first search algorithm, it’s a recursive algorithm.
#include <string> #include <iostream> using namespace std ;class Solution {public : bool isMatch (string s, string p) { if (p.empty()) return s.empty(); bool first_match = !s.empty() && (s[0 ] == p[0 ] || '.' == p[0 ]); if (p.size() >= 1 && '*' == p[1 ]) return isMatch(s, p.substr(2 )) || (first_match && isMatch(s.substr(1 ), p)); else return first_match && isMatch(s.substr(1 ), p.substr(1 )); } };
This solution is slow, from the result we can see it only beat 11% submissions. The time complexity is hard to analyze. You may refer to more details here .
DFS with memorization In the recursive version, there are some overlap subproblems we can optimize.
Naive DFS algorithm could be optimized with a memorization data structure. We could add a memo
map to store the computed result, it’s a typical caching technology .
In this solution, I use text + "_" + pattern
as a key for memorization, which is not the best.
The result is much better:
#include <string> #include <map> #include <iostream> using namespace std ;class Solution {public : bool isMatch (string s, string p) { if (p.empty()) return s.empty(); string key = s + "_" + p; map <string , bool >::iterator iter = memo.find(key); if (iter != memo.end()) { return iter->second; } bool res; bool first_match = !s.empty() && (s[0 ] == p[0 ] || '.' == p[0 ]); if (p.size() >= 1 && '*' == p[1 ]) res = isMatch(s, p.substr(2 )) || (first_match && isMatch(s.substr(1 ), p)); else res = first_match && isMatch(s.substr(1 ), p.substr(1 )); memo.insert(std ::pair <string , bool >(key, res)); return res; }private : map <string , bool > memo; };
Dynamic programming Since we could implement the algorithm in a top-down pattern, we could also implement it from bottom to up.
Here is the scenario where the dynamic programming techniques could be applied to.
The time complexity is $O(N*M)$
#include <string> #include <vector> #include <iostream> using namespace std ;class Solution {public : bool isMatch (string s, string p) { vector <vector <bool >> dp (s.size()+1 , vector <bool >(p.size()+1 , false )) ; dp[0 ][0 ] = true ; for (int i = 1 ; i <= p.size(); i++) { dp[0 ][i] = (i>1 && p[i-1 ]=='*' && dp[0 ][i-2 ]); } for (int i = 1 ; i <= s.size(); i++) { for (int j = 1 ; j <= p.size(); j++) { if (s[i-1 ] == p[j-1 ] || p[j-1 ] == '.' ) dp[i][j] = dp[i-1 ][j-1 ]; else if (p[j-1 ] == '*' ) { if (s[i-1 ] == p[j-2 ] || p[j-2 ] == '.' ) dp[i][j] = dp[i][j-1 ] || dp[i][j-2 ] || dp[i-1 ][j]; else dp[i][j] = dp[i][j-2 ]; }; } } return dp[s.size()][p.size()]; } };