Sorting Algorithms
Sorting data (i.e., placing the data into some particular order, such as ascending or descending) is one of the most important computing applications. A bank sorts all checks by account number so that it can prepare individual bank statements at the end of each month. Telephone companies sort their lists of accounts by last name and, further, by first name to make it easy to find phone numbers. Virtually every organization must sort some data, and often, massive amounts of it. Sorting data is an intriguing, computer-intensive problem that has attracted intense research efforts.
An important item to understand about sorting is that the end resultthe sorted arraywill be the same no matter which algorithm you use to sort the array. The choice of algorithm affects only the run time and memory use of the program. The rest of the chapter introduces three common sorting algorithms. The first twoselection sort and insertion sortare simple algorithms to program, but are inefficient. The last algorithmmerge sortis a much faster algorithm than selection sort and insertion sort, but is harder to program. We focus on sorting arrays of primitive type data, namely ints. It is possible to sort arrays of objects of classes as well. We discuss this in Section 19.6.1.
16.3.1. Selection Sort
Selection sort is a simple, but inefficient, sorting algorithm. The first iteration of the algorithm selects the smallest element in the array and swaps it with the first element. The second iteration selects the second-smallest item (which is the smallest item of the remaining elements) and swaps it with the second element. The algorithm continues until the last iteration selects the second-largest element and swaps it with the second-to-last index, leaving the largest element in the last index. After the ith iteration, the smallest i items of the array will be sorted into increasing order in the first i elements of the array.
As an example, consider the array
34 56 4 10 77 51 93 30 5 52
A program that implements selection sort first determines the smallest element (4) of this array which is contained in index 2. The program swaps 4 with 34, resulting in
4 56 34 10 77 51 93 30 5 52
The program then determines the smallest value of the remaining elements (all elements except 4), which is 5, contained in index 8. The program swaps 5 with 56, resulting in
4 5 34 10 77 51 93 30 56 52
On the third iteration, the program determines the next smallest value (10) and swaps it with 34.
4 5 10 34 77 51 93 30 56 52
The process continues until the array is fully sorted.
4 5 10 30 34 51 52 56 77 93
Note that after the first iteration, the smallest element is in the first position. After the second iteration, the two smallest elements are in order in the first two positions. After the third iteration, the three smallest elements are in order in the first three positions.
Figure 16.6 declares the SelectionSort class. This class has two private instance variablesan array of int s named data, and a static Random object to generate random integers to fill the array. When an object of class SelectionSort is instantiated, the constructor (lines 1219) creates and initializes the array data with random ints in the range 1099.
Figure 16.6. SelectionSort class.
(This item is displayed on pages 798 - 799 in the print version)
1 // Fig 16.6: SelectionSort.java 2 // Class that creates an array filled with random integers. 3 // Provides a method to sort the array with selection sort. 4 import java.util.Random; 5 6 public class SelectionSort 7 { 8 private int [] data; // array of values 9 private static Random generator = new Random(); 10 11 // create array of given size and fill with random integers 12 public SelectionSort( int size ) 13 { 14 data = new int [ size ]; // create space for array 15 16 // fill array with random ints in range 10-99 17 for ( int i = 0; i < size; i++ ) 18 data[ i ] = 10 + generator.nextInt( 90 ); 19 } // end SelectionSort constructor 20 21 // sort array using selection sort 22 public void sort() 23 { 24 int smallest; // index of smallest element 25 26 // loop over data.length - 1 elements 27 for ( int i = 0 ; i < data.length - 1 ; i++ ) 28 { 29 smallest = i; // first index of remaining array 30 31 // loop to find index of smallest element 32 for ( int index = i + 1 ; index < data.length; index++ ) 33 if ( data[ index ] < data[ smallest ] ) 34 smallest = index; 35 36 swap( i, smallest ); // swap smallest element into position 37 printPass( i + 1, smallest ); // output pass of algorithm 38 } // end outer for 39 } // end method sort 40 41 // helper method to swap values in two elements 42 public void swap( int first, int second ) 43 { 44 int temporary = data[ first ]; // store first in temporary 45 data[ first ] = data[ second ]; // replace first with second 46 data[ second ] = temporary; // put temporary in second 47 } // end method swap 48 49 // print a pass of the algorithm 50 public void printPass( int pass, int index ) 51 { 52 System.out.print( String.format( "after pass %2d: ", pass ) ); 53 54 // output elements till selected item 55 for ( int i = 0 ; i < index; i++ ) 56 System.out.print( data[ i ] + " " ); 57 58 System.out.print( data[ index ] + "* " ); // indicate swap 59 60 // finish outputting array 61 for ( int i = index + 1; i < data.length; i++ ) 62 System.out.print( data[ i ] + " " ); 63 64 System.out.print( " " ); // for alignment 65 66 // indicate amount of array that is sorted 67 for ( int j = 0 ; j < pass; j++ ) 68 System.out.print( "-- " ); 69 System.out.println( " " ); // add endline 70 } // end method indicateSelection 71 72 // method to output values in array 73 public String toString() 74 { 75 StringBuffer temporary = new StringBuffer(); 76 77 // iterate through array 78 for ( int element : data ) 79 temporary.append( element + " " ); 80 81 temporary.append( " " ); // add endline character 82 return temporary.toString(); 83 } // end method toString 84 } // end class SelectionSort |
Lines 2239 declare the sort method. Line 24 declares the variable smallest, which will store the index of the smallest element in the remaining array. Lines 2738 loop data.length - 1 times. Line 29 initializes the index of the smallest element to the current item. Lines 3234 loop over the remaining elements in the array. For each of these elements, line 33 compares its value to the value of the smallest element. If the current element is smaller than the smallest element, line 34 assigns the current element's index to smallest. When this loop finishes, smallest will contain the index of the smallest element in the remaining array. Line 36 calls method swap (lines 4247) to place the smallest remaining element in the next spot in the array.
Line 9 of Fig. 16.7 creates a SelectionSort object with 10 elements. Line 12 implicitly call method toString to output the unsorted object. Line 14 calls method sort (lines 2239 of Fig. 16.6), which sorts the elements using selection sort. Then lines 1617 output the sorted object. The output of this program uses dashes to indicate the portion of the array that is sorted after each pass. An asterisk is placed next to the position of the element that was swapped with the smallest element on that pass. On each pass, the element next to the asterisk and the element above the rightmost set of dashes were the two values that were swapped.
Figure 16.7. SelectionSortTest class.
(This item is displayed on pages 799 - 800 in the print version)
1 // Fig 16.7: SelectionSortTest.java 2 // Test the selection sort class. 3 4 public class SelectionSortTest 5 { 6 public static void main( String[] args ) 7 { 8 // create object to perform selection sort 9 SelectionSort sortArray = new SelectionSort( 10 ); 10 11 System.out.println( "Unsorted array:" ); 12 System.out.println( sortArray ); // print unsorted array 13 14 sortArray.sort(); // sort array 15 16 System.out.println( "Sorted array:" ); 17 System.out.println( sortArray ); // print sorted array 18 } // end main 19 } // end class SelectionSortTest
|
Efficiency of Selection Sort
The selection sort algorithm runs in O(n2 ) time. The sort method in lines 2239 of Fig. 16.6, which implements the selection sort algorithm, contains two for loops. The outer for loop (lines 2738) iterates over the first n 1 elements in the array, swapping the smallest remaining item into its sorted position. The inner for loop (lines 3234) iterates over each item in the remaining array, searching for the smallest element. This loop executes n 1 times during the first iteration of the outer loop, n 2 times during the second iteration, then n 3, ..., 3, 2, 1. This inner loop will iterate a total of n(n 1)/2 or (n2 n)/2. In Big O notation, smaller terms drop out and constants are ignored, leaving a final Big O of O(n2).
16.3.2. Insertion Sort
Insertion sort is another simple, but inefficient, sorting algorithm. The first iteration of this algorithm takes the second element in the array and, if it is less than the first element, swaps it with the first element. The second iteration looks at the third element and inserts it into the correct position with respect to the first two elements, so all three elements are in order. At the ith iteration of this algorithm, the first i elements in the original array will be sorted.
Consider as an example the following array [Note: This array is identical to the array used in the discussions of selection sort and merge sort.]
34 56 4 10 77 51 93 30 5 52
A program that implements the insertion sort algorithm will first look at the first two elements of the array, 34 and 56. These two elements are already in order, so the program continues (if they were out of order, the program would swap them).
In the next iteration, the program looks at the third value, 4. This value is less than 56, so the program stores 4 in a temporary variable and moves 56 one element to the right. The program then checks and determines that 4 is less than 34, so it moves 34 one element to the right. The program has now reached the beginning of the array, so it places 4 in the zeroth element. The array now is
4 34 56 10 77 51 93 30 5 52
In the next iteration, the program stores the value 10 in a temporary variable. Then the program compares 10 to 56 and moves 56 one element to the right because it is larger than 10. The program then compares 10 to 34, moving 34 right one element. When the program compares 10 to 4, it observes that 10 is larger than 4 and places 10 in element 1. The array now is
4 10 34 56 77 51 93 30 5 52
Using this algorithm, at the ith iteration, the first i elements of the original array are sorted. They may not be in their final locations, however, because smaller values may be located later in the array.
Figure 16.8 declares the InsertionSort class. Lines 2246 declare the sort method. Line 24 declares the variable insert, which holds the element you are going to insert while you move the other elements. Lines 2745 loop over data.length - 1 items in the array. In each iteration, line 30 stores in insert the value of the element that will be inserted into the sorted portion of the array. Line 33 declares and initializes the variable moveItem, which keeps track of where to insert the element. Lines 3641 loop to locate the correct position where the element should be inserted. The loop will terminate either when the program reaches the front of the array or when it reaches an element that is less than the value to be inserted. Line 39 moves an element to the right, and line 40 decrements the position at which to insert the next element. After the loop ends, line 43 inserts the element into place. Figure 16.9 is the same as Fig. 16.7 except that it creates and uses an InsertionSort object. The output of this program uses dashes to indicate the portion of the array that is sorted after each pass. An asterisk is placed next to the element that was inserted into place on that pass.
Figure 16.8. InsertionSort class.
(This item is displayed on pages 801 - 803 in the print version)
1 // Fig 16.8: InsertionSort.java 2 // Class that creates an array filled with random integers. 3 // Provides a method to sort the array with insertion sort. 4 import java.util.Random; 5 6 public class InsertionSort 7 { 8 private int[] data; // array of values 9 private static Random generator = new Random(); 10 11 // create array of given size and fill with random integers 12 public InsertionSort( int size ) 13 { 14 data = new int [ size ]; // create space for array 15 16 // fill array with random ints in range 10-99 17 for ( int i = 0; i < size; i++ ) 18 data[ i ] = 10 + generator.nextInt( 90 ); 19 } // end InsertionSort constructor 20 21 // sort array using insertion sort 22 public void sort() 23 { 24 int insert; // temporary variable to hold element to insert 25 26 // loop over data.length - 1 elements 27 for ( int next = 1; next < data.length; next++ ) 28 { 29 // store value in current element 30 insert = data[ next ]; 31 32 // initialize location to place element 33 int moveItem = next; 34 35 // search for place to put current element 36 while ( moveItem > 0 && data[ moveItem - 1 ] > insert ) 37 { 38 // shift element right one slot 39 data[ moveItem ] = data[ moveItem - 1 ]; 40 moveItem--; 41 } // end while 42 43 data[ moveItem ] = insert; // place inserted element 44 printPass( next, moveItem ); // output pass of algorithm 45 } // end for 46 } // end method sort 47 48 // print a pass of the algorithm 49 public void printPass( int pass, int index ) 50 { 51 System.out.print( String.format( "after pass %2d: ", pass ) ); 52 53 // output elements till swapped item 54 for ( int i = 0 ; i < index; i++ ) 55 System.out.print( data[ i ] + " " ); 56 57 System.out.print( data[ index ] + "* " ); // indicate swap 58 59 // finish outputting array 60 for ( int i = index + 1; i < data.length; i++ ) 61 System.out.print( data[ i ] + " " ); 62 63 System.out.print( " " ); // for alignment 64 65 // indicate amount of array that is sorted 66 for( int i = 0; i <= pass; i++ ) 67 System.out.print( "-- " ); 68 System.out.println( " " ); // add endline 69 } // end method printPass 70 71 // method to output values in array 72 public String toString() 73 { 74 StringBuffer temporary = new StringBuffer(); 75 76 // iterate through array 77 for ( int element : data ) 78 temporary.append( element + " " ); 79 80 temporary.append( " " ); // add endline character 81 return temporary.toString(); 82 } // end method toString 83 } // end class InsertionSort |
Figure 16.9. InsertionSortTest class.
(This item is displayed on pages 803 - 804 in the print version)
1 // Fig 16.9: InsertionSortTest.java 2 // Test the insertion sort class. 3 4 public class InsertionSortTest 5 { 6 public static void main( String[] args ) 7 { 8 // create object to perform selection sort 9 InsertionSort sortArray = new InsertionSort( 10 ); 10 11 System.out.println( "Unsorted array:" ); 12 System.out.println( sortArray ); // print unsorted array 13 14 sortArray.sort(); // sort array 15 16 System.out.println( "Sorted array:" ); 17 System.out.println( sortArray ); // print sorted array 18 } // end main 19 } // end class InsertionSortTest
|
Efficiency of Insertion Sort
The insertion sort algorithm also runs in O(n2) time. Like selection sort, the implementation of insertion sort (lines 2246 of Fig. 16.8) contains two loops. The for loop (lines 2745) iterates data.length - 1 times, inserting an element into the appropriate position in the elements sorted so far. For the purposes of this application, data.length - 1 is equivalent to n 1 (as data.length is the size of the array). The while loop (lines 3641) iterates over the preceding elements in the array. In the worst case, this while loop will require n 1 comparisons. Each individual loop runs in O(n) time. In Big O notation, nested loops mean that you must multiply the number of comparisons. For each iteration of an outer loop, there will be a certain number of iterations of the inner loop. In this algorithm, for each O(n) iterations of the outer loop, there will be O(n) iterations of the inner loop. Multiplying these values results in a Big O of O(n2).
16.3.3. Merge Sort
Merge sort is an efficient sorting algorithm, but is conceptually more complex than selection sort and insertion sort. The merge sort algorithm sorts an array by splitting it into two equal-sized subarrays, sorting each subarray and then merging them into one larger array. With an odd number of elements, the algorithm creates the two subarrays such that one has one more element than the other.
The implementation of merge sort in this example is recursive. The base case is an array with one element. A one-element array is, of course, sorted, so merge sort immediately returns when it is called with a one-element array. The recursion step splits an array into two approximately equal pieces, recursively sorts them and then merges the two sorted arrays into one larger, sorted array.
Suppose the algorithm has already merged smaller arrays to create sorted arrays A:
4 10 34 56 77
and B:
5 30 51 52 93
Merge sort combines these two arrays into one larger, sorted array. The smallest element in A is 4 (located in the zeroth index of A). The smallest element in B is 5 (located in the zeroth index of B). In order to determine the smallest element in the larger array, the algorithm compares 4 and 5. The value from A is smaller, so 4 becomes the first element in the merged array. The algorithm continues by comparing 10 (the second element in A) to 5 (the first element in B). The value from B is smaller, so 5 becomes the second element in the larger array. The algorithm continues by comparing 10 to 30, with 10 becoming the third element in the array, and so on.
Lines 2225 of Fig. 16.10 declare the sort method. Line 24 calls method sortArray with 0 and data.length - 1 as the arguments. The arguments correspond to the beginning and ending indices of the array to be sorted. These values tell method sortArray to operate on the entire array.
Figure 16.10. MergeSort class.
(This item is displayed on pages 805 - 807 in the print version)
1 // Figure 16.10: MergeSort.java 2 // Class that creates an array filled with random integers. 3 // Provides a method to sort the array with merge sort. 4 import java.util.Random; 5 6 public class MergeSort 7 { 8 private int [] data; // array of values 9 private static Random generator = new Random(); 10 11 // create array of given size and fill with random integers 12 public MergeSort( int size ) 13 { 14 data = new int [ size ]; // create space for array 15 16 // fill array with random ints in range 10-99 17 for ( int i = 0 ; i < size; i++ ) 18 data[ i ] = 10 + generator.nextInt( 90 ); 19 } // end MergeSort constructor 20 21 // calls recursive split method to begin merge sorting 22 public void sort() 23 { 24 sortArray( 0, data.length - 1 ); // split entire array 25 } // end method sort 26 27 // splits array, sorts subarrays and merges subarrays into sorted array 28 private void sortArray( int low, int high ) 29 { 30 // test base case; size of array equals 1 31 if ( ( high - low ) >= 1 ) // if not base case 32 { 33 int middle1 = ( low + high ) / 2; // calculate middle of array 34 int middle2 = middle1 + 1; // calculate next element over 35 36 // output split step 37 System.out.println( "split: " + subarray( low, high ) ); 38 System.out.println( " " + subarray( low, middle1 ); 39 System.out.println( " " + subarray( middle2, high ) ); 40 System.out.println(); 41 42 // split array in half; sort each half (recursive calls) 43 sortArray( low, middle1 ); // first half of array 44 sortArray( middle2, high ); // second half of array 45 46 // merge two sorted arrays after split calls return 47 merge ( low, middle1, middle2, high ); 48 } // end if 49 } // end method split 50 51 // merge two sorted subarrays into one sorted subarray 52 private void merge( int left, int middle1, int middle2, int right ) 53 { 54 int leftIndex = left; // index into left subarray 55 int rightIndex = middle2; // index into right subarray 56 int combinedIndex = left; // index into temporary working array 57 int [] combined = new int [ data.length ]; // working array 58 59 // output two subarrays before merging 60 System.out.println( "merge: " + subarray( left, middle1 ) ); 61 System.out.println( " " + subarray( middle2, right ) ); 62 63 // merge arrays until reaching end of either 64 while ( leftIndex <= middle1 && rightIndex <= right ) 65 { 66 // place smaller of two current elements into result 67 // and move to next space in arrays 68 if ( data[ leftIndex ] <= data[ rightIndex ] ) 69 combined[ combinedIndex++ ] = data[ leftIndex++ ]; 70 else 71 combined[ combinedIndex++ ] = data[ rightIndex++ ]; 72 } // end while 73 74 // if left array is empty 75 if ( leftIndex == middle2 ) 76 // copy in rest of right array 77 while ( rightIndex <= right ) 78 combined[ combinedIndex++ ] = data[ rightIndex++ ]; 79 else // right array is empty 80 // copy in rest of left array 81 while ( leftIndex <= middle1 ) 82 combined[ combinedIndex++ ] = data[ leftIndex++ ]; 83 84 // copy values back into original array 85 for ( int i = left; i <= right; i++ ) 86 data[ i ] = combined[ i ]; 87 88 // output merged array 89 System.out.println( " " + subarray( left, right ) ); 90 System.out.println(); 91 } // end method merge 92 93 // method to output certain values in array 94 public String subarray( int low, int high ) 95 { 96 StringBuffer temporary = new StringBuffer(); 97 98 // output spaces for alignment 99 for ( int i = 0; i < low; i++ ) 100 temporary.append( " " ); 101 102 // output elements left in array 103 for ( int i = low; i <= high; i++ ) 104 temporary.append( " " + data[ i ] ); 105 106 return temporary.toString(); 107 } // end method subarray 108 109 // method to output values in array 110 public String toString() 111 { 112 return subarray( 0, data.length - 1 ); 113 } // end method toString 114 } // end class MergeSort |
Method sortArray is declared in lines 2849. Line 31 tests the base case. If the size of the array is 1, the array is already sorted, so the method simply returns immediately. If the size of the array is greater than 1, the method splits the array in two, recursively calls method sortArray to sort the two subarrays, then merges them. Line 43 recursively calls method sortArray on the first half of the array, and line 44 recursively calls method sortArray on the second half of the array. When these two method calls return, each half of the array has been sorted. Line 47 calls method merge (lines 5291) on the two halves of the array to combine the two sorted arrays into one larger sorted array.
Lines 6472 in method merge loop until the program reaches the end of either subarray. Line 68 tests which element at the beginning of the arrays is smaller. If the element in the left array is smaller, line 69 places it in position in the combined array. If the element in the right array is smaller, line 71 places it in position in the combined array. When the while loop has completed (line 72), one entire subarray is placed in the combined array, but the other subarray still contains data. Line 75 tests whether the left array has reached the end. If so, lines 7778 fill the combined array with the elements of the right array. If the left array has not reached the end, then the right array must have reached the end, and lines 8182 fill the combined array with the elements of the left array. Finally, lines 8586 copy the combined array into the original array. Figure 16.11 creates and uses a Merge-Sort object. The output from this program displays the splits and merges performed by merge sort, showing the progress of the sort at each step of the algorithm.
Figure 16.11. MergeSortTest class.
(This item is displayed on pages 808 - 809 in the print version)
1 // Figure 16.11: MergeSortTest.java 2 // Test the merge sort class. 3 4 public class MergeSortTest 5 { 6 public static void main( String[] args ) 7 { 8 // create object to perform merge sort 9 MergeSort sortArray = new MergeSort( 10 ); 10 11 // print unsorted array 12 System.out.println( "Unsorted:" + sortArray + " " ); 13 14 sortArray.sort(); // sort array 15 16 // print sorted array 17 System.out.println( "Sorted: " + sortArray ); 18 } // end main 19 } // end class MergeSortTest
|
Efficiency of Merge Sort
Merge sort is a far more efficient algorithm than either insertion sort or selection sort. Consider the first (nonrecursive) call to method sortArray. This results in two recursive calls to method sortArray with subarrays each approximately half the size of the original array, and a single call to method merge. This call to method merge requires, at worst, n 1 comparisons to fill the original array, which is O(n). (Recall that each element in the array can be chosen by comparing one element from each of the subarrays.) The two calls to method sortArray result in four more recursive calls to method sortArray, each with a subarray approximately a quarter the size of the original array along with two calls to method merge. These two calls to method merge each require, at worst, n/2 1 comparisons for a total number of comparisons of O(n). This process continues, each call to sortArray generating two additional calls to sortArray and a call to merge, until the algorithm has split the array into one-element subarrays. At each level, O(n) comparisons are required to merge the subarrays. Each level splits the size of the arrays in half, so doubling the size of the array requires one more level. Quadrupling the size of the array requires two more levels. This pattern is logarithmic and results in log2n levels. This results in a total efficiency of O(n log n). Figure 16.12 summarizes many of the searching and sorting algorithms covered in this book and lists the Big O for each of them. Figure 16.13 lists the Big O values we have covered in this chapter along with a number of values for n to highlight the differences in the growth rates.
Algorithm |
Location |
Big O |
---|---|---|
Searching Algorithms: |
||
Linear Search |
Section 16.2.1 |
O(n) |
Binary Search |
Section 16.2.2 |
O(log n) |
Recursive Linear Search |
Exercise 16.8 |
O(n) |
Recursive Binary Search |
Exercise 16.9 |
O(log n) |
Sorting Algorithms: |
||
Selection Sort |
Section 16.3.1 |
O(n2) |
Insertion Sort |
Section 16.3.2 |
O(n2) |
Merge Sort |
Section 16.3.3 |
O(n log n) |
Bubble Sort |
Exercises 16.3 and 16.4 |
O(n2) |
n = |
O(log n) |
O(n) |
O(n log n) |
O(n2) |
---|---|---|---|---|
1 |
0 |
1 |
0 |
1 |
2 |
1 |
2 |
2 |
4 |
3 |
1 |
3 |
3 |
9 |
4 |
1 |
4 |
4 |
16 |
5 |
1 |
5 |
5 |
25 |
10 |
1 |
10 |
10 |
100 |
100 |
2 |
100 |
200 |
10000 |
1,000 |
3 |
1000 |
3000 |
106 |
1,000,000 |
6 |
1000000 |
6000000 |
1012 |
1,000,000,000 |
9 |
1000000000 |
9000000000 |
1018 |