An introduction to binary searching - the simple explanation!
Binary searching is a very efficient way of searching through lots of data items to find the found you want. In serial searching, the data doesn't have to be organised in any particular way. We just start at the beginning and start looking through the data one-by-one. If you are going to use binary searching, the data must be sorted into an order first or it just won't work. We can use one of the sorting algorithms we have studied to do this or another one, but the data must be sorted first.
We can represent a lot of data, which has been sorted into an order, by a red line.
We find the middle point of the data (easily found - it's given by (lowest data item position + highest data item position) / 2
We then ask ourselves 'is the data pointed to my middle the one I am searching for?'
If it is, we stop and report that we've found the data.
If it isn't, we ask 'is the data item we won't lower than the middle or higher than the middle?
If it's lower than the middle, we can discard the data higher than the middle.
If it's higher than the middle, we can discard the data lower than the middle.
What we have in fact done is to throw away half of the data!
Now, using our formula to find the new middle point of our new set of data, we repeat the above steps, but this time, we only have half the data to check!
We ask ourselves again 'is the data pointed to my middle the one I am searching for?'
If it is, we stop and report that we've found the data.
If it isn't, we ask 'is the data item we won't lower than the middle or higher than the middle?
If it's lower than the middle, we can discard the data higher than the middle.
If it's higher than the middle, we can discard the data lower than the middle.
And then we throw away another half of the data again!
We keep doing the above until either we find the data or we get down to the last data item and can't find it, so we stop and report that it's not there.
We can represent the above in a diagram.
A first attempt at a binary search algorithm might go something like this:
found = false
While (there is data to search) and (you haven't found the data:)
find the midpoint of the data.
if midpoint = the data you are looking for
found = true
report 'data found'
else
discard the half of the data you don't need
So binary searching does one comparison, then, assuming you didn't get lucky and find the data you want, gets rid of half the data, and just keeps doing this, chopping the amount of data in half each time you do a compare. If you had a huge data file with 100,000 data items in it, you would only need to do a maximum of just 18 comparisons before you found the data or had to report it wasn't there!
100 000
50 000
25 000
12 500
6 250
3 125
1 562
781
390
195
98
49
24
12
6
3
2
1