Coder's Cat

LeetCode: String to Integer (atoi)

2020-12-29

Implement atoi which converts a string to an integer.

The function first discards as many whitespace characters as necessary until the first non-whitespace character is found. Then, starting from this character, it takes an optional initial plus or minus sign followed by as many numerical digits as possible, and interprets them as a numerical value.

The string can contain additional characters after those that form the integral number, which are ignored and have no effect on the behavior of this function.

If the first sequence of non-whitespace characters in str is not a valid integral number, or if no such sequence exists because either str is empty or it contains only whitespace characters, no conversion is performed.

If no valid conversion could be performed, a zero value is returned.

Note:

Only the space character ' ' is considered a whitespace character.
Assume we are dealing with an environment that could only store integers within the 32-bit signed integer range: [−2^31, 2^31 − 1]. If the numerical value is out of the range of representable values, 2^31 − 1 or −2^31 is returned.
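
The range rule in the note can be sketched as a small clamping helper. This is only an illustration: the name clampToInt32 is hypothetical and not part of the problem statement, and it assumes the parsed value is first held in a 64-bit long long.

```cpp
#include <climits>

// Hypothetical helper: squeeze a 64-bit value into [-2^31, 2^31 - 1].
int clampToInt32(long long value) {
    if (value > INT_MAX) return INT_MAX;  // too large: return 2^31 - 1
    if (value < INT_MIN) return INT_MIN;  // too small: return -2^31
    return static_cast<int>(value);       // already representable
}
```

Values inside the range pass through unchanged; anything outside is pinned to the nearest bound.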

Example 1:

Input: str = "42"
Output: 42

Explanation

This challenge is not easy: the test cases contain many corner cases. We need to handle two categories of tests:

  • Overflow or underflow: return INT_MAX or INT_MIN, depending on the sign of the number
  • Sign and validity: whether a sign is present, and whether what follows is a valid number

Solution #1: Brute force

My first C++ version is complicated; it uses long long to handle overflow:

#include <climits>
#include <string>
using namespace std;

class Solution {
public:
    int myAtoi(string s) {
        long long ans = 0;   // 64-bit accumulator so overflow can be detected
        int cur = 0;
        int sign_flag = 0;   // 0: no sign seen, 1: '+', -1: '-'
        string number;       // digits consumed so far
        while (cur < (int)s.size()) {
            char c = s[cur];
            // A space after a sign or a digit ends the number.
            if (c == ' ' &&
                (number.size() != 0 || sign_flag != 0))
                break;
            if (c == '-' || c == '+') {
                // A sign right after a digit ends the number.
                if (cur > 0 && s[cur - 1] >= '0' && s[cur - 1] <= '9') break;
                // A second sign makes the input invalid.
                if (sign_flag != 0 || number.size() != 0) return 0;
                sign_flag = c == '+' ? 1 : -1;
            } else if (c >= '0' && c <= '9') {
                ans = (ans * 10) + (c - '0');
                number.push_back(c);
                // Clamp as soon as the value leaves the 32-bit range.
                if (sign_flag >= 0 && ans > (long long)INT_MAX) return INT_MAX;
                if (sign_flag < 0 && (-ans < (long long)INT_MIN))
                    return INT_MIN;
            } else if (c != ' ')
                // Any other character ends parsing; ans is still 0 if no digit was read.
                break;
            cur++;
        }
        if (sign_flag < 0) ans = -ans;
        return (int)ans;
    }
};

Solution #2: A simpler implementation

#include <climits>
#include <string>
using namespace std;

class Solution {
public:
    int myAtoi(string str) {
        int i = 0;
        int sign = 1;
        long long result = 0;
        int n = (int)str.length();
        if (n == 0) return 0;

        // Skip leading spaces
        while (i < n && str[i] == ' ')
            i++;

        // Read an optional sign
        if (i < n && (str[i] == '+' || str[i] == '-'))
            sign = (str[i++] == '-') ? -1 : 1;

        // Build the result and check for overflow/underflow condition
        while (i < n && str[i] >= '0' && str[i] <= '9') {
            result = result * 10 + (str[i++] - '0');
            if (sign > 0 && result >= (long long)INT_MAX) return INT_MAX;
            if (sign < 0 && -result <= (long long)INT_MIN) return INT_MIN;
        }
        return (int)(result * sign);
    }
};
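
Solution #2 relies on long long, while the note describes an environment that can only store 32-bit integers. A minimal sketch of the same loop that stays in int, checking for overflow before multiplying instead of after, could look like the following. The function name myAtoi32 is hypothetical, and this variant is an illustration under that assumption rather than part of the solutions above.

```cpp
#include <climits>
#include <cstddef>
#include <string>

// Hypothetical variant of Solution #2 that never leaves 32-bit arithmetic.
int myAtoi32(const std::string& str) {
    std::size_t i = 0;
    const std::size_t n = str.size();

    // Skip leading spaces (only ' ' counts as whitespace).
    while (i < n && str[i] == ' ') ++i;

    // Read an optional sign.
    bool negative = false;
    if (i < n && (str[i] == '+' || str[i] == '-')) {
        negative = (str[i] == '-');
        ++i;
    }

    int result = 0;
    while (i < n && str[i] >= '0' && str[i] <= '9') {
        const int digit = str[i] - '0';
        // result * 10 + digit > INT_MAX  <=>  result > (INT_MAX - digit) / 10,
        // so this test detects the overflow before it can happen.
        if (result > (INT_MAX - digit) / 10)
            return negative ? INT_MIN : INT_MAX;
        result = result * 10 + digit;
        ++i;
    }
    return negative ? -result : result;
}
```

The magnitude 2147483648 cannot be stored in an int, but the pre-check catches it and returns INT_MIN directly, which is the correct answer for "-2147483648".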

Solution #3: Finite-state machine

This challenge involves complex string processing for parsing. It’s error-prone if we write it by hand.

Therefore, in order to analyze the processing of each input character in an organized way, we can use the concept of finite-state machine.

Our program is in some state s at each moment. Each time we read a character c from the input, we move to the next state s', which is determined by s and c. In this way, we only need to build a table covering every mapping from (s, c) to s'.

[Figure: atoi-fsm (state diagram of the finite-state machine)]

We can also represent the finite-state machine as a table:

state \ char   space   +/-      digit       other
start          start   signed   in_number   wrong
signed         wrong   wrong    in_number   wrong
in_number      wrong   wrong    in_number   wrong
wrong          wrong   wrong    wrong       wrong
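
As a rough sketch, the table above can be encoded directly as a two-dimensional array indexed by state and character class. The names State, classify, kTable and nextState are hypothetical and chosen only for this illustration.

```cpp
// Hypothetical encoding of the transition table; rows and columns
// follow the same order as the table in the text.
enum State { Start = 0, Signed = 1, InNumber = 2, Wrong = 3 };

// Column index of a character: 0 = space, 1 = sign, 2 = digit, 3 = other.
int classify(char c) {
    if (c == ' ') return 0;
    if (c == '+' || c == '-') return 1;
    if (c >= '0' && c <= '9') return 2;
    return 3;
}

// kTable[state][column] is the next state.
const State kTable[4][4] = {
    /* Start    */ {Start, Signed, InNumber, Wrong},
    /* Signed   */ {Wrong, Wrong,  InNumber, Wrong},
    /* InNumber */ {Wrong, Wrong,  InNumber, Wrong},
    /* Wrong    */ {Wrong, Wrong,  Wrong,    Wrong},
};

// One step of the machine: look up where (s, c) leads.
State nextState(State s, char c) {
    return kTable[s][classify(c)];
}
```

For example, reading '-' in the Start state leads to Signed, and reading a space in InNumber leads to Wrong, after which no input can leave that state.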

With this table we can easily implement the whole program. The initial state is start, and we change the state according to the current character:

#include <algorithm>
#include <climits>
#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Automaton {
    string state = "start";
    // table[state][column] gives the next state
    unordered_map<string, vector<string>> table = {
        {"start", {"start", "signed", "in_number", "wrong"}},
        {"signed", {"wrong", "wrong", "in_number", "wrong"}},
        {"in_number", {"wrong", "wrong", "in_number", "wrong"}},
        {"wrong", {"wrong", "wrong", "wrong", "wrong"}}
    };

    // Column of the table for character c: space, sign, digit, other
    int get_col(char c) {
        if (c == ' ') return 0;  // only the space character counts as whitespace
        if (c == '+' || c == '-') return 1;
        if (c >= '0' && c <= '9') return 2;
        return 3;
    }
public:
    int sign = 1;
    long long ans = 0;  // magnitude of the number, clamped on every step

    void get(char c) {
        state = table[state][get_col(c)];
        if (state == "in_number") {
            ans = ans * 10 + c - '0';
            // Clamp to 2^31 - 1 for positive numbers and to 2^31 for negative ones
            ans = sign == 1 ? min(ans, (long long)INT_MAX) : min(ans, -(long long)INT_MIN);
        }
        else if (state == "signed")
            sign = c == '+' ? 1 : -1;
    }
};

class Solution {
public:
    int myAtoi(string str) {
        Automaton automaton;
        for (char c : str)
            automaton.get(c);
        return (int)(automaton.sign * automaton.ans);
    }
};
