Monday, 17 November 2014

Insertion sort

Insertion sort is an elementary sorting algorithm. It has a time complexity of Θ(n2), thus being slower than heapsortmerge sort and also shellsort. Insertion sort is well suited for sorting small data sets or for the insertion of new elements into a sorted sequence.

Idea

Let a0, ..., an-1 be the sequence to be sorted. At the beginning and after each iteration of the algorithm the sequence consists of two parts: the first part a0, ..., ai-1 is already sorted, the second part ai, ..., an-1 is still unsorted (i element 0, ..., n).
In order to insert element ai into the sorted part, it is compared with ai-1ai-2 etc. When an element aj with aj<=ai is found, ai is inserted behind it. If no such element is found, then ai is inserted at the beginning of the sequence.
After inserting element ai the length of the sorted part has increased by one. In the next iteration, ai+1 is inserted into the sorted part etc. While at the beginning the sorted part consists of element a0 only, at the end it consists of all elements a0, ..., an-1.
Example:  The following table shows the steps for sorting the sequence 5 7 0 3 4 2 6 1. On the left side the sorted part of the sequence is shown in red. For each iteration, the number of positions the inserted element has moved is shown in brackets. Altogether this amounts to 17 steps.

57034261(0)
57034261(0)
05734261(2)
03574261(2)
03457261(2)
02345761(4)
02345671(1)
01234567(6)

Implementation

The following simulation illustrates the realization of the algorithm. The corresponding program is given subsequently.
Simulation of insertion sort

The following function insertionsort sorts an integer array a[0], ..., a[n-1].
The sorting function is encapsulated in a class InsertionSorter. The method sort passes the array to be sorted to an array a, sets n to its length and calls insertionsort.
The statement to sort an array b with insertion sort is
InsertionSorter.sort(b);
The program follows.

public class InsertionSorter
{
    private static int[] a;
    private static int n;

    public static void sort(int[] a0)
    {
        a=a0;
        n=a.length;
        insertionsort();
    }

    private static void insertionsort()
    {
        int i, j, t;
        for (i=1; i<n; i++)
        {
            j=i;
            t=a[j];
            while (j>0 && a[j-1]>t)
            {
                a[j]=a[j-1];
                j--;
            }
            a[j]=t;
        }
    }

}    // end class InsertionSorter

Analysis

The worst case occurs when in every step the proper position for the element that is inserted is found at the beginning of the sorted part of the sequence. I.e. in the while-loop sequences of length 1, 2, 3, ..., n-1 are scanned. Altogether, these are (n-1)·n / 2  element  Θ(n2) operations. This case occurs when the original sequence is sorted in decreasing order.
It is possible to find the inserting position of element ai faster, namely by binary search. However, moving the elements to the right in order to make room for the element to be inserted takes linear time anyway.

The exact number of steps required by insertion sort is given by the number of inversions of the sequence.
Definition:  Let a  =  a0, ..., an-1 be a finite sequence. An inversion is a pair of index positions, where the elements of the sequence are out of order. Formally, an inversion is a pair (ij), where i < j and ai > aj.1)
Example:  Let a = 5, 7, 4, 9, 7. Then, (0, 2) is an inversion, since a0 > a2, namely 5 > 4. Also, (1, 2) is an inversion, since 7 > 4, and (3, 4) is an inversion, since 9 > 7. There are no other inversions in this sequence.
We now count the inversions (ij) of a sequence a separately for each position j. The resulting value of vj gives the number of elements ai to the left of aj which are greater than aj.
For instance, for the sequence a = 5, 7, 4, 9, 7 we have v2 = 2, since the two inversions (0, 2) and (1, 2) have 2 as their second component. Here, 5 and 7 are greater than 4.
Definition:  The inversion sequence v = v0, ..., vn-1 of a sequence a = a0, ..., an-1 is defined by
vj  =  |{ (ij)  |  i < j   and   ai > aj }|
for j = 0, ..., n-1.
Example:  The above sequence a = 5, 7, 4, 9, 7 has the inversion sequence v = 0, 0, 2, 0, 1.

Obviously, we have vi<=i for all i = 0, ..., n-1. If all vi are 0, then and only then the sequence a is sorted. If the sequence a is a permutation, then it is uniquely determined by its inversion sequence v. The permutation n-1, ..., 0 has inversion sequence 0, ..., n-1.
Theorem:  Let a  =  a0, ..., an-1 be a sequence and v  =  v0, ..., vn-1 be its inversion sequence. Then the number of steps T(a) required by insertion sort to sort sequence a is
T(a)  =   sum over i = 0, ..., n-1  vi
Proof:  Obviously, insertion sort takes vi steps in each iteration i. Thus, the overall number of steps is equal to the sum of all vi.
Example:  The following table shows the sequence a of the first example and its corresponding inversion sequence. For instance,v5 = 4 since 4 elements to the left of a5 = 2 are greater than 2 (namely 5, 7, 3 and 4). The total number of inversions is 17.
i01234567
ai57034261
vi00222416
Thus, insertion sort requires 17 steps to sort this sequence (see first example).


http://www.iti.fh-flensburg.de/lang/algorithmen/sortieren/insert/insertionen.htm



http://www.programiz.com/article/insertion-sort-algorithm-programming


No comments:

Post a Comment