Back

A binary search algorithm - a more detailed explanation

A binary search
If you have read the simple explanation of what a binary search is in our previous article and are hungry for more, here is a more detailed explanation!

This is a different approach to finding a particular record. The idea is that you divide all the records into two, a top half and a bottom half. You then test to see which half the record you want is in. Which ever half it isn't in, you discard! So you are now left with only half the original records. You then split that half into two and repeat the process, until you eventually find the record you want. Binary searching will not work unless the records are all in order! Of course, you could take a serial file (one whose records are not in any particular order) and sort them (which we call a 'sequential file') and then perform a binary search on that file!

Advantages and disadvantages of serial searching compared to a binary search
There are some advantages and disadvantages to both approaches.

    • Compared to a binary search, serial searching is a very easy system to write a program for (start at the beginning and compare each record in turn until you find it).
    • Serial searching isn’t very efficient. That's because it takes lots of comparisons to find a particular record in big files, compared to binary searching. It takes longer on average to find a record in a big file compared to binary searching.
    • If you only have a very few records in a file then it makes little difference which method you use!
    • Serial searching is excellent if your records are unordered. In binary searching, you will have to write code that puts records in order, before the binary search program can be used, or you won't be able to use binary searching at all. This will slow down processing and increase the complexity of the code, compared to serial searching.

Binary searching
Binary searching is a programming method used to search through data to find a particular item or record.

    • The data must be in order for binary searching to work.
    • It takes far less comparisons than serial searching to find an item (for files that have more than a few items in them).
    • Binary search algorithms are more complicated to write and understand than serial search algorithms.

Searching for a record using binary searching
To illustrate how binary searching works, we will look at an example. How would you search for record number 3 within these records using the binary search method? Remember, the records must be in order. If they are not, then they must be put in order first using some code. In the example below, the records are already in a sequential order (using the records’ key values).

STEP 1: We set a flag to indicate the first and last records. We’ll call these Low and High.

STEP 2: We then work out the Middle position. We can do this by calculating:

Trunc (Low + High) / 2

In our example, (1 + 10) / 2 = 5.5. Truncated, 5.5 equals 5, so our middle record is 5.

STEP 3: We first of all test to see if Middle equals the value we are looking for. If it does, we got lucky! We can stop searching and report that the record has been found.

STEP 4: If we didn’t get lucky, we then ask ourselves, is the record we are looking for (in our case, record 3), higher or lower than the record pointed to by Middle?

    • If the record we want is above Middle, then we can discard all the records from Middle and below. We would then have about a half of the original number of records to search through.
    • If the record we are looking for is below Middle, then we can discard all the records from Middle and above. We would then have about a half of the original number of records to search through.

In our case, record 3 is less than record 5. We can therefore discard all the records from Middle up to High. We are now left with records 1 to 4 to search through.  We then set the new Low value to point to the first record. We then set the new High value to point to the last record as before, and work out the new Middle (Trunc (1 + 4) / 2 = 2).

At this stage, we are now in exactly the same position as we were at the start of the problem. The only difference is that we have about half the records to search through after just one comparison! We can now call our binary search function again (recursion) and perform the exact same steps we have just done. The remaining logic for finding record 3 goes like this:

    • Record 3 does not equal Middle (record 2).
    • Record 3 is greater than Middle (record 2).
    • Discard all records from Middle and below.

We are now left with the following records:

    • Middle now equals Trunc ( 3 + 4 ) / 2 = 3.
    • Record 3 equals Middle. Stop and report found.

Searching for a record using binary searching – another example
Suppose we want to search for record 10 in our previous example.

A file of records we wish to search through

The logic goes like this:

    • Record 10 is not equal to middle.
    • Record 10 is greater than middle. Therefore, discard all records from Middle and below.
    • Reset Low, High and Middle.

We are now left with the following situation:

    • Record 10 is not equal to Middle.
    • Record 10 is greater than Middle. Therefore, discard all records from Middle and below.
    • Reset Low, High and Middle.

We are now left with this situation:

    • Middle now equals Trunc ( 9 + 10 ) / 2 = 9.
    • Record 10 is not equal to Middle.
    • Record 10 is greater than Middle. Therefore, discard all records from Middle and below.

    • Reset Low, High and Middle.
    • Record 10 is equal to Middle. Stop and report found.

Searching for a record using binary searching – another example
Suppose we want to search for record 20 in our previous example.

A file of records that we wish to search through.

The logic goes like this:

    • Record 20 is not equal to middle.
    • Record 20 is greater than middle. Therefore, discard all records from Middle and below.
    • Reset Low, High and Middle.

    • Record 20 is not equal to Middle.
    • Record 20 is greater than Middle. Therefore, discard all records from Middle and below.
    • Reset Low, High and Middle.

    • Middle now equals Trunc ( 9 + 10 ) / 2 = 9.
    • Record 20 is not equal to Middle.
    • Record 20 is greater than Middle. Therefore, discard all records from Middle and below.

    • Reset Low, High and Middle.
    • Record 20 is not equal to Middle.
    • Stop and report record not found.

Back