Binary Search, and Why Sorted Data Is So Powerful

The 'halve it every time' idea. How binary search finds something in a million items in about twenty steps, where it quietly powers databases and even git, and the one condition it needs.

DSA · 9 June 2026 · 5 min read

In the last post I said the hash map is my favourite structure because it turns “give me this exact thing” into a single step. But I also said its weakness: it has no sense of order. It cannot answer “everything between these two dates” or “the closest match below this value.” For those, you need the data in sorted order, and once it is sorted, you get to use one of the most satisfying ideas in computing: binary search.

The idea: a guessing game

You have played this without knowing its name. I think of a number between 1 and 100, and you guess. Each time, I only tell you “higher” or “lower.” What do you do? You do not start at 1 and count up. You guess 50. I say lower. You guess 25. I say higher. You guess 37. And so on. Every guess throws away half of what is left.

That is binary search. Look at the middle, decide which half the answer is in, throw the other half away, repeat. It is exactly how you use a physical dictionary. You do not read from page one. You open near the middle, see whether your word is before or after, and flip into the correct half.
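
If it helps to see the game played out, here is a minimal sketch of the guesser's strategy. The function name guesses_for and the secret number 30 are my own picks for illustration; the only rule taken from the game is "guess the middle, then keep the half that can still contain the answer."

```python
def guesses_for(secret, low=1, high=100):
    """Play the higher/lower game and return every guess made, in order."""
    guesses = []
    while low <= high:
        guess = (low + high) // 2      # always guess the middle of what is left
        guesses.append(guess)
        if guess == secret:
            return guesses             # got it
        elif guess > secret:
            high = guess - 1           # "lower": discard the upper half
        else:
            low = guess + 1            # "higher": discard the lower half
    return guesses                     # only reached if secret is out of range

print(guesses_for(30))   # [50, 25, 37, 31, 28, 29, 30]
```

The first three guesses are the same 50, 25, 37 from the story, and the whole game is over in seven.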

Why it is so fast

Because you halve the problem every step, the number of steps grows incredibly slowly.

100 items: about 7 steps
1,000 items: about 10 steps
1,000,000 items: about 20 steps
1,000,000,000 items: about 30 steps

A plain scan of a billion items can take up to a billion checks. Binary search takes about thirty. And here is the magic of it: doubling the size of the data adds only one extra step. We call this “order log n,” written O(log n), and it is the next best thing to instant.
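
You can check those step counts yourself. This is a small sketch, and max_steps is a name I made up for it. It assumes the worst case for a search like the one in this post, which is floor(log2(n)) + 1 middle checks, and for a positive whole number that is exactly its bit length.

```python
def max_steps(n):
    """Worst-case number of middle checks binary search needs for n items."""
    # Each check halves the candidates, so the worst case is
    # floor(log2(n)) + 1, which equals the bit length of n for n >= 1.
    return n.bit_length()

for n in (100, 1_000, 1_000_000, 1_000_000_000):
    print(f"{n:>13,} items: at most {max_steps(n)} steps")

# Doubling the data adds exactly one step.
print(max_steps(2_000_000) - max_steps(1_000_000))   # 1
```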

A tiny example

def binary_search(sorted_items, target):
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2     # the middle
        if sorted_items[mid] == target:
            return mid              # found it
        elif sorted_items[mid] < target:
            low = mid + 1           # answer is in the right half
        else:
            high = mid - 1          # answer is in the left half
    return -1                       # not present

nums = [3, 11, 18, 27, 34, 42, 55, 61]   # must be sorted
print(binary_search(nums, 42))           # 5

Notice the one assumption sitting quietly in there: the list must be sorted. That is the whole price of admission.

Where it shows up in real work

You rarely write binary search by hand, because it is built into the tools you already use. But it is underneath a lot of them.

Database indexes. When I put an index on a column so lookups are fast, the database is keeping that column in sorted order and doing a binary-search-style descent through it (a B-tree, which is the same idea fanned out for disk). This is a big part of why an indexed query is fast and an unindexed one crawls. It connects straight to the point I made in “Your data is the whole game”: structure is what makes data usable.
Nearest match and ranges. Most languages have a “bisect” or “lower bound” function that finds where a value would slot into a sorted list. That instantly answers questions like which age band or which tax slab a number falls into, or the nearest timestamp before a given moment.
git bisect. This is my favourite everyday example. When a bug appears and you do not know which commit introduced it, you give git bisect one commit you know is good and one you know is bad. It checks out a commit roughly halfway between them, you say “good” or “bad,” and it halves the range each time. A thousand commits, found in about ten checks. That is binary search applied to your own project history.
Debugging in general. Disable half the code and ask: does the bug still happen? The answer narrows it to one half. Bisection is a way of thinking, not just an algorithm.
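
Two of these are easy to sketch in Python. The first part uses the standard library's bisect module for band and nearest-match lookups. The second imitates the idea behind git bisect on a plain list; it is not how git itself is implemented. The age bands, the function names, and the is_bad check are all hypothetical, chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical age bands: each number is the first age of the next band.
BAND_STARTS = [13, 18, 65]
BAND_NAMES = ["child", "teen", "adult", "senior"]

def age_band(age):
    # bisect_right counts how many band starts are <= age,
    # which is exactly the index of the band the age falls into.
    return BAND_NAMES[bisect_right(BAND_STARTS, age)]

def latest_at_or_before(sorted_times, moment):
    """Return the last timestamp <= moment, or None if there is none."""
    i = bisect_right(sorted_times, moment)
    return sorted_times[i - 1] if i > 0 else None

def first_bad(commits, is_bad):
    """Index of the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and that once the
    bug appears every later commit is bad too.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid      # the bug was introduced at mid or earlier
        else:
            good = mid     # the bug was introduced after mid
    return bad

print(age_band(17))                               # teen
print(latest_at_or_before([10, 20, 30], 25))      # 20
print(first_bad(list(range(1000)), lambda c: c >= 700))   # 700
```

All three are the same move: the data is in order, so each check rules out half of what remains.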

The catches, honestly

It must be sorted. This is the real cost. Sorting takes effort, so binary search pays off when you sort once and search many times, not when the data changes constantly.
For exact keys, a hash map still wins. If all you ever do is “find this exact ID,” a hash map takes roughly one step, versus about twenty for binary search over a million items. Use sorted-and-binary-search when you also need order, ranges, or nearest matches.
It is famously easy to get slightly wrong. The off-by-one details, whether a boundary is inclusive, where to put the plus one and minus one, have tripped up even very good programmers for decades. This is exactly why you lean on the tested version in your standard library rather than rolling your own in production.
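
In Python, leaning on the tested version might look like this. It is a sketch that wraps the standard library's bisect_left; the wrapper name index_of is my own, and the list is the same one from the tiny example. The only boundary logic left for me to get wrong is the single check after the call.

```python
from bisect import bisect_left

def index_of(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent."""
    i = bisect_left(sorted_items, target)   # leftmost slot where target would go
    if i < len(sorted_items) and sorted_items[i] == target:
        return i
    return -1

nums = [3, 11, 18, 27, 34, 42, 55, 61]   # must be sorted
print(index_of(nums, 42))                # 5
print(index_of(nums, 40))                # -1
```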

The takeaway

Hash map and binary search are the two answers to two different questions. The hash map says “give me this exact thing, now.” Binary search says “the data is in order, so let me jump to roughly the right place and home in.” One trades memory for instant exact lookups. The other trades the cost of sorting for fast order-aware search.

Between them, they cover a huge share of what programs spend their time doing, which is finding things. Most of the speed you feel in good software is one of these two ideas working quietly underneath.

Next in this series: stacks and queues, the two ways to wait in line.