Coder's Cat

Binary Search

2020-12-03

When teaching programming to beginners, I always like to use binary search as an introduction example to show the importance of algorithms and details in programming.

Binary Search is one of the most famous algorithms in computer science. It’s adopted as a core algorithm in software foundation.

Divide and conquer

We use binary search to find a target value in a sorted array. The worst and average time complexity are both O(logN), it’s a classic logN algorithm.

The core idea in binary search is we always reduce the search space into half compared previous iteration.

Think about how we play the guess-number game?

This general algorithm paradigm is called ‘divide and conquer’

Even if the core idea is simple, many programmers do not know how to implement an error-free binary search.

Although the basic idea of binary search is comparatively straightforward, the details can be surprisingly tricky

— Donald Knuth

Binary Search

Binary search is a typical recursive algorithm, so we can implement it as a recursive function with l as left range and r as the right range.

We need to calculate the mid index, then update left range and right range according to the compare results between nums[mid] and target.

int BinarySearch(vector<int>& nums, int target, int l, int r) {
if(l > r) return -1;

int m = l + ((r - l) >> 1);
if(nums[m] > target)
return BinarySearch(nums, target, l, m-1);
else if(nums[m] < target)
return BinarySearch(nums, target, m+1, r);
else return m;
}

The terminate condition is l > r, which means the search space is zero. Based on recursive version, we can safely convert it into an imperative function:

int BinarySearch(vector<int>& nums, int target) {
if(l > r) return -1;

int m = l + ((r - l) >> 1);
if(nums[m] > target)
return BinarySearch(nums, target, l, m-1);
else if(nums[m] < target)
return BinarySearch(nums, target, m+1, r);
else return m;
}

Why binary search is not easy to implement bug-free?

There are two tricky points here:

The first is terminate condition, if someone wrote the testing expression as l < r, target value will be missed in some scenarios.

Another issue that is easy to overlook is programming language depended, in any language with integer in a fixed range, inappropriate method to compute mid-index may introduce overflow error.

int m = l + ((r - l) >> 1); is same semantic meaning int m = (l + r) / 2, but l+r will overflow for large numbers.

Duplicated values

When there are duplicated values in an array, what the result will be returned? You can write some sample arrays as input.

For the above naive implementation, we will get anyone from the continuous index of the target.

But how we could get the first or last positions of a target?

We could still use divide and conquer method but with trivial detail.

Suppose we want to get the first position, whenever we find a index with target value, we drop the right part and continue to search with left part. But we need an extra index to store current position:

int binary_first(vector<int>& nums, int target) {
int l = 0;
int index = -1;
int r = nums.size() - 1;
while(l <= r) {
int mid = l + ((r - l)>>1);
if(nums[mid] < target) l = mid+1;
else if(nums[mid] > target) r = mid-1;
else {
index = mid;
r = mid-1; // set right range
}
}
return index;
}

Similarly, to get the last position we only need to update l to mid+1.

Conclusion

Try to write a binary-search in your favorite programming language and think about how to write unit testing for it.

And here are my more tips on learning data structures and algorithms.

Join my Email List for more insights, It's Free!😋