Wednesday, May 27, 2015

Implementing Merge Sort Algorithm on an Array - Sorting Algorithms - C Program (Procedural | Recursion)


Problem Question



Write a program to implement Merge Sort.

Explanation of Problem



Merge Sort is considered an intermediate level sorting algorithm, from implementation complexity perspective. Usually, it is taught to students once they are familiar with basic programming constructs and recursion. The merge sort algorithm is better than selection or bubble sort algorithms because of the space-time complexity, which reduces from quadratic to logarithmic, due to continue division of the data set. This algorithm is used to demonstrate recursion and divide and conquer problem solving techniques.
A question may arise here if it is possible to implement merge sort without recursion, and the answer is yes. However, in this post today, we are going to implement it via recursion.
In merge sort the program the program divides the list into smaller sub-lists, and continues to do so until it is left with unit sized lists. It then takes two sublists, sorts their elements, and continues to work up by combining smaller sublist and sorting them.

To understand merge sort better, consider a list of 5 numbers, [5, 3, 2, 1, 4]. Following is how merge sort would sort the list. The blue-colored boxes represent the left sub-list in the iteration and pink boxes represent the right sub-array.



Unlike insertion/selection/bubble sort, the problem is solved by dividing and conquering the data set, hence sorting the list incrementally.




Code



#include <stdio.h>
/**@Title: SortingAlgorithms v1.3.c*
*@Language: ANSI C*
*@Compiler: GNU GCC*
*@IDE: Code::Blocks 13.12*
*@Author: Toxifier*
*@URL: http://letsplaycoding.blogspot.com/*
*@Date: 27-05-2015*
*/

void merge(int inputArray[], int startIndex, int medianIndex, int inputArrayLength)
{
    int leftLoop, rightLoop, remainderLoop;
    int leftArrayLength = medianIndex - startIndex + 1;
    int rightArrayLength = inputArrayLength - medianIndex;
    int leftArray[leftArrayLength], rightArray[rightArrayLength];

    for (leftLoop = 0; leftLoop < leftArrayLength; leftLoop++)
    {
        leftArray[leftLoop] = inputArray[startIndex + leftLoop];
    }
    for (rightLoop = 0; rightLoop < rightArrayLength; rightLoop++)
    {
        rightArray[rightLoop] = inputArray[medianIndex + 1 + rightLoop];
    }

    leftLoop = 0, rightLoop = 0, remainderLoop = startIndex;

    while (leftLoop < leftArrayLength && rightLoop < rightArrayLength)
    {
        if (leftArray[leftLoop] <= rightArray[rightLoop])
        {
            inputArray[remainderLoop] = leftArray[leftLoop];
            leftLoop++;
        }
        else
        {
            inputArray[remainderLoop] = rightArray[rightLoop];
            rightLoop++;
        }
        remainderLoop++;
    }

    while (leftLoop < leftArrayLength)
    {
        inputArray[remainderLoop] = leftArray[leftLoop];
        leftLoop++;
        remainderLoop++;
    }
    while (rightLoop < rightArrayLength)
    {
        inputArray[remainderLoop] = rightArray[rightLoop];
        rightLoop++;
        remainderLoop++;
    }
}

void mergeSort(int inputArray[], int startIndex, int inputArrayLength)
{
    if (startIndex < inputArrayLength)
    {
        int medianIndex = startIndex + (inputArrayLength - startIndex) / 2;
        mergeSort (inputArray, startIndex, medianIndex);
        mergeSort (inputArray, medianIndex + 1, inputArrayLength);
        merge (inputArray, startIndex, medianIndex, inputArrayLength);
    }
}

int main()
{
    int userChoice, outerLoop, lengthNumberList = 0, originalNumberList[25], sortedNumberList[25];
    printf ("Welcome to SortingAlgorithms v1.3\n");
    printf ("Made by Toxifier\n");
    do
    {
        printf ("\n1: Add an element to the list\n");
        printf ("2: Display Original List\n");
        printf ("3: Perform Merge Sort\n");
        printf ("4: Empty List\n");
        printf ("5: Exit\n");
        printf ("Enter your choice: ");

        scanf ("%d", &userChoice);

        switch (userChoice)
        {
        case 1:
            if (lengthNumberList >= 25)
            {
                printf ("\nArray Full\n");
            }
            else
            {
                printf ("\nEnter the value: ");
                scanf ("%d", &originalNumberList[lengthNumberList]);
                sortedNumberList[lengthNumberList] = originalNumberList[lengthNumberList];
                lengthNumberList++;
            }
            break;
        case 2:
            if (lengthNumberList == 0)
            {
                printf ("\nList Empty!\n");
            }
            else
            {
                printf ("\nOriginal List: ");
                for (outerLoop = 0; outerLoop < lengthNumberList; outerLoop++)
                {
                    printf ("%d ", originalNumberList[outerLoop]);
                }
            }
            break;
        case 3:
            if (lengthNumberList == 0)
            {
                printf ("\nList Empty!\n");
            }
            else
            {
                mergeSort (sortedNumberList, 0, lengthNumberList - 1);
                printf ("\nSorted List: ");
                for (outerLoop = 0; outerLoop < lengthNumberList; outerLoop++)
                {
                    printf ("%d ", sortedNumberList[outerLoop]);
                }
            }
            break;
        case 4:
            lengthNumberList = 0;
            break;
        case 5:
            printf ("Thank You for using SortingAlgorithms v1.3\n");
            printf ("Made by Toxifier\n");
            break;
        default:
            printf ("\nWrong Choice!\n");
            break;
        }
    }
    while (userChoice != 5);
    system("pause");
    return 0;
}





Explanation of Code



#include <stdio.h> -> This is the step which occurs before compilation starts. The compiler calls the C Preprocessor to include the STDIO(Standard Input Output) header file into the program, thus letting the use of the standard input/output functions like printf() and scanf() which come from STDIO.H

int main() -> The entry point of the program where the execution starts. This function has to be named main. As per the ANSI specification, the return type has to be int. If you use the traditional C, you may use void as the return type. Since the return type is specified as int in my program, I have to use a return statement at the end of my code. So, I use return 0 since zero returned from a function, by convention, implies a correct execution of the program. The return values are used to debug the program.

printf() -> This is a standard output function used to print something on the screen. We have to pass a string to this function which will be displayed on user's terminal.

scanf() -> This is the scanf() function which waits for the user to enter certain value using his/her keyboard. We store the user input at the location in memory which is pointed to by the variable whose address is passed to this function.

int userChoice, outerLoop, lengthNumberList = 0, originalNumberList[25], sortedNumberList[25]; -> Here we define the variables we are going to use in our program.
userChoice is used for the switch and do-while loop. The program is menu driven to allow user to change the array and move in any order they wish to.
outerLoop is the loop variable.
lengthNumberList is used to track the final position of the arrays. originalNumberList[25], sortedNumberList[25] are the arrays which will hold the user input and the sorted list, respectively.

            if (lengthNumberList >= 25)
            {
                printf ("\nArray Full\n");
            }
            else
            {
                printf ("\nEnter the value: ");
                scanf ("%d", &originalNumberList[lengthNumberList]);
                sortedNumberList[lengthNumberList] = originalNumberList[lengthNumberList];
                lengthNumberList++;
            }


This is the first case of the program where the user inputs the array values. Since the array is defined with maximum length of 25, an error is thrown to the user and they are not allowed to add more values to the list once the limit is reached.
If the array has enough space though, the user can proceed further and input a value to the array, which will get stored at the end of the list. Every user input is copied to the sortedNumberList as well, so that original list is intact for future reference.

            if (lengthNumberList == 0)
            {
                printf ("\nList Empty!\n");
            }
            else
            {
                printf ("\nOriginal List: ");
                for (outerLoop = 0; outerLoop < lengthNumberList; outerLoop++)
                {
                    printf ("%d ", originalNumberList[outerLoop]);
                }
            }


This is the second case of our program which is used to display the list created by the user so far. The code iterates through the array, originalNumberList and prints each element.

            if (lengthNumberList == 0)
            {
                printf ("\nList Empty!\n");
            }
            else
            {
                mergeSort (sortedNumberList, 0, lengthNumberList - 1);
                printf ("\nSorted List: ");
                for (outerLoop = 0; outerLoop < lengthNumberList; outerLoop++)
                {
                    printf ("%d ", sortedNumberList[outerLoop]);
                }
            }


This is the third case of our program where we have the problem statement addressed - implementing the merge sort algorithm. We first check the length of the array, and proceed with sorting only if there are any elements. The program then calls the function mergeSort to sort the list. Once the list is sorted, it is displayed.
It is important to note that we are working on the sortedNumberList array instead of the originalNumberList array. The idea is to preserve the original list for any reference.
When implementing merge sort, this code and the two functions merge and mergeSort are good enough to solve the problem. Rest of the code in our C program is only having a few useful additions which helps in easy execution of the program, making it more user friendly.




void mergeSort(int inputArray[], int startIndex, int inputArrayLength)
{
    if (startIndex < inputArrayLength)
    {
        int medianIndex = startIndex + (inputArrayLength - startIndex) / 2;
        mergeSort (inputArray, startIndex, medianIndex);
        mergeSort (inputArray, medianIndex + 1, inputArrayLength);
        merge (inputArray, startIndex, medianIndex, inputArrayLength);
    }
}


The function mergeSort expects 3 parameters. The array to be sorted (int inputArray[]); the starting point of the array for the function call (int startIndex); and the length of the array (int inputArayLength). The length of the array represents the index of the last element (counting begins from 0).
This function implements Merge Sort algorithm in a recursive manner (note the function calls itself). The function first gets a suitable middle element to divide the list it is working with into two halves (equal, or almost equal; in the latter one list has one element more than the other).
The function then calls itself for the left half and the right half and then calls another function merge, to merge the two lists it worked on.
When calling itself recursively, mergeSort keeps dividing the list it works upon further, until the list reduces to unit size. Hence, the entire logic is wrapped under an if condition, which says that the function should do something only if the starting position of the array is less than the number of elements in the list, basically giving the control back to the previous recursive call upon encountering a single element sub-array.

void merge(int inputArray[], int startIndex, int medianIndex, int inputArrayLength)
{
    int leftLoop, rightLoop, remainderLoop;
    int leftArrayLength = medianIndex - startIndex + 1;
    int rightArrayLength = inputArrayLength - medianIndex;
    int leftArray[leftArrayLength], rightArray[rightArrayLength];

    for (leftLoop = 0; leftLoop < leftArrayLength; leftLoop++)
    {
        leftArray[leftLoop] = inputArray[startIndex + leftLoop];
    }
    for (rightLoop = 0; rightLoop < rightArrayLength; rightLoop++)
    {
        rightArray[rightLoop] = inputArray[medianIndex + 1 + rightLoop];
    }

    leftLoop = 0, rightLoop = 0, remainderLoop = startIndex;

    while (leftLoop < leftArrayLength && rightLoop < rightArrayLength)
    {
        if (leftArray[leftLoop] <= rightArray[rightLoop])
        {
            inputArray[remainderLoop] = leftArray[leftLoop];
            leftLoop++;
        }
        else
        {
            inputArray[remainderLoop] = rightArray[rightLoop];
            rightLoop++;
        }
        remainderLoop++;
    }

    while (leftLoop < leftArrayLength)
    {
        inputArray[remainderLoop] = leftArray[leftLoop];
        leftLoop++;
        remainderLoop++;
    }
    while (rightLoop < rightArrayLength)
    {
        inputArray[remainderLoop] = rightArray[rightLoop];
        rightLoop++;
        remainderLoop++;
    }
}


The function merge is the place where actual sorting of elements happen. The function sorts the elements by merging the two sub-arrays, comparing each element and putting them in their correct place when doing so. It is important to note, that the program is passing the address of main array, sortedNumberList since the beginning, hence the sub-lists are actually changing data elements in the main array itself. Let's look at it further.
The function merge accepts 4 arguments: int inputArray[] (the address of the array being worked upon), int startIndex (the index of the first element of the sub-list), int medianIndex (the index of the middle element of the sub-list), int inputArrayLength (the length of the sublist).
Additionally, we have a bunch of variables defined to aid in the process. int leftLoop, rightLoop, remainderLoop are the various loop variables. int leftArrayLength = medianIndex - startIndex + 1 stores the length of left sub-array and int rightArrayLength = inputArrayLength - medianIndex; stores the length of right sub-array. int leftArray[leftArrayLength], rightArray[rightArrayLength]; are the arrays that will hold the sublist data.
Left and right sub-arrays are then populated with the data, left of the median and right of the median.
Next, the while loop compares all the elements of both sub arrays and put them at appropriate positions. Whichever element is successfully put in place, should not be repeated in comparisons, hence the loop counter for that list is incremented. However, it is important to note that once either of the lists is empty, the remaining elements of the other list are left as-is. Hence, we have the other two while loops, that push in the remaining elements. Since the mergeSort function calls merge with lists as small as 2 elements, the remaining sub-list is always sorted. The remainderLoop variable is pointing to the element of main array that should be replaced. Hence, in the end when the control comes out of the merge function, the part of the list is sorted which it was working with.

The combination of mergeSort and merge calls, eventually sorts the entire array.




lengthNumberList = 0; -> This is the fourth case where we reset the lengthNumberList variable, which tracks the last element of the array. This helps in giving the user ability to redo any operation, without the need of closing and starting the program again.

printf ("Thank You for using SortingAlgorithms v1.3\n");
            printf ("Made by Toxifier\n");

This is the fifth case statement, which allows the user to safely exit the program.

default:
            printf ("\nWrong Choice!\n");

When using the case statement, it is important to use the break keyword at the end of each case to avoid the program from falling through the next case. It is also important to note the default case, which is important to handle any wrong values in the switch-case logic.

system("pause") -> This statement is used to pause the program, until user presses a key. This function is not necessary in your program, I use it to see my outputs paused. If you use cmd to run your programs, you might not need this. If you use linux/unix you might not need this. Depending on your compiler, this function may or may not work. Moreover, removing this line of code from this program, doesn't affect the functionality of the program.




Output(s)









Download Source Code





Saturday, May 09, 2015

Implementing Selection Sort Algorithm on an Array - Sorting Algorithms - C Program (Procedural)


Problem Question



Write a program to implement Selection Sort.

Explanation of Problem



Selection Sort is a simple sorting algorithm. However, due to the high order of space-time complexity it incurs to sort a list, makes it impractical for large applications.
In selection sort the program iterates through the list of values as many times as the number of items in the list. Through each pass, the algorithm finds the minimum value in the list and assigns it to the first element. In the next pass, the algorithm works with the remaining list, except for the first element. Hence, the new minimum will be assigned to the first element of this sublist, which is actually second element of the overall list. Going forward this way, the list will be sorted after the algorithm has iterated the entire list as many times as the numer of elements. When the first element of the working sublist is assigend with new value, the value at the first element is exchanged with the value where the new minimum is found. Hence the algorithm is a type of Exchange Sort.

To understand selection sort better, consider a list of 5 numbers, [5, 3, 2, 1, 4]. Following is how selection sort would sort the list.



Unlike bubble sort, the working size of the list shrinks with every pass, hence it appears to be slightly better. However, in generalised Big-O notation, it would still have the quadratic complexity.




Code



#include <stdio.h>
/**@Title: SortingAlgorithms v1.2.c*
*@Language: ANSI C*
*@Compiler: GNU GCC*
*@IDE: Code::Blocks 13.12*
*@Author: Toxifier*
*@URL: http://letsplaycoding.blogspot.com/*
*@Date: 09-05-2015*
*/

#include <stdio.h>
int main()
{
    int userChoice, outerLoop, innerLoop, minIndex, exchangeSpace, lengthNumberList = 0, originalNumberList[25], sortedNumberList[25];
    printf ("Welcome to SortingAlgorithms v1.2\n");
    printf ("Made by Toxifier\n");
    do
    {
        printf ("\n1: Add an element to the list\n");
        printf ("2: Display Original List\n");
        printf ("3: Perform Selection Sort\n");
        printf ("4: Empty List\n");
        printf ("5: Exit\n");
        printf ("Enter your choice: ");

        scanf ("%d", &userChoice);

        switch (userChoice)
        {
        case 1:
            if (lengthNumberList >= 25)
            {
                printf ("\nArray Full\n");
            }
            else
            {
                printf ("\nEnter the value: ");
                scanf ("%d", &originalNumberList[lengthNumberList]);
                sortedNumberList[lengthNumberList] = originalNumberList[lengthNumberList];
                lengthNumberList++;
            }
            break;
        case 2:
            if (lengthNumberList == 0)
            {
                printf ("\nList Empty!\n");
            }
            else
            {
                printf ("\nOriginal List: ");
                for (outerLoop = 0; outerLoop < lengthNumberList; outerLoop++)
                {
                    printf ("%d ", originalNumberList[outerLoop]);
                }
            }
            break;
        case 3:
            if (lengthNumberList == 0)
            {
                printf ("\nList Empty!\n");
            }
            else
            {
                for ( outerLoop = 0; outerLoop < lengthNumberList - 1; outerLoop++ )
                {
                    minIndex = outerLoop;
                    for ( innerLoop = outerLoop + 1; innerLoop < lengthNumberList; innerLoop++ )
                    {
                        if (sortedNumberList[minIndex] > sortedNumberList[innerLoop])
                        {
                            minIndex = innerLoop;
                        }
                    }
                    if (minIndex != outerLoop)
                    {
                        exchangeSpace = sortedNumberList[outerLoop];
                        sortedNumberList[outerLoop] = sortedNumberList[minIndex];
                        sortedNumberList[minIndex] = exchangeSpace;
                    }
                }
                printf ("\nSorted List: ");
                for (outerLoop = 0; outerLoop < lengthNumberList; outerLoop++)
                {
                    printf ("%d ", sortedNumberList[outerLoop]);
                }
            }
            break;
        case 4:
            lengthNumberList = 0;
            break;
        case 5:
            printf ("Thank You for using SortingAlgorithms v1.2\n");
            printf ("Made by Toxifier\n");
            break;
        default:
            printf ("\nWrong Choice!\n");
            break;
        }
    }
    while (userChoice != 5);
    system("pause");
    return 0;
}





Explanation of Code



#include <stdio.h> -> This is the step which occurs before compilation starts. The compiler calls the C Preprocessor to include the STDIO(Standard Input Output) header file into the program, thus letting the use of the standard input/output functions like printf() and scanf() which come from STDIO.H

int main() -> The entry point of the program where the execution starts. This function has to be named main. As per the ANSI specification, the return type has to be int. If you use the traditional C, you may use void as the return type. Since the return type is specified as int in my program, I have to use a return statement at the end of my code. So I use return 0 since zero returned from a function, by convention, implies a correct execution of the program. The return values are used to debug the program.

printf() -> This is a standard output function used to print something on the screen. We have to pass a string to this function which will be displayed on user's terminal.

scanf() -> This is the scanf() function which waits for the user to enter certain value using his/her keyboard. We store the user input at the location in memory which is pointed to by the variable whose address is passed to this function.

int userChoice, outerLoop, innerLoop, minIndex, exchangeSpace, lengthNumberList = 0, originalNumberList[25], sortedNumberList[25]; -> Here we define the variables we are going to use in our program.
userChoice is used for the switch and do-while loop. The program is menu driven to allow user to change the array and move in any order they wish to.
outerLoop, innerLoop are the loop variables and exchangeSpace is the temporary space which we would use during value exchange operation.
lengthNumberList is used to track the final position of the arrays. originalNumberList[25], sortedNumberList[25] are the arrays which will hold the user input and the sorted list, respectively.
minIndex is used to track the index of minimum element in the list.

            if (lengthNumberList >= 25)
            {
                printf ("\nArray Full\n");
            }
            else
            {
                printf ("\nEnter the value: ");
                scanf ("%d", &originalNumberList[lengthNumberList]);
                sortedNumberList[lengthNumberList] = originalNumberList[lengthNumberList];
                lengthNumberList++;
            }


This is the first case of the program where the user inputs the array values. Since the array is defined with maximum length of 25, an error is thrown to the user and they are not allowed to add more values to the list once the limit is reached.
If the array has enough space though, the user can proceed further and input a value to the array, which will get stored at the end of the list. Every user input is copied to the sortedNumberList as well, so that original list is inatct for future reference.

            if (lengthNumberList == 0)
            {
                printf ("\nList Empty!\n");
            }
            else
            {
                printf ("\nOriginal List: ");
                for (outerLoop = 0; outerLoop < lengthNumberList; outerLoop++)
                {
                    printf ("%d ", originalNumberList[outerLoop]);
                }
            }


This is the second case of our program which is used to display the list created by the user so far. The code iterates through the array, originalNumberList and prints each element.

if (lengthNumberList == 0)
            {
                printf ("\nList Empty!\n");
            }
            else
            {
                for ( outerLoop = 0; outerLoop < lengthNumberList - 1; outerLoop++ )
                {
                    minIndex = outerLoop;
                    for ( innerLoop = outerLoop + 1; innerLoop < lengthNumberList; innerLoop++ )
                    {
                        if (sortedNumberList[minIndex] > sortedNumberList[innerLoop])
                        {
                            minIndex = innerLoop;
                        }
                    }
                    if (minIndex != outerLoop)
                    {
                        exchangeSpace = sortedNumberList[outerLoop];
                        sortedNumberList[outerLoop] = sortedNumberList[minIndex];
                        sortedNumberList[minIndex] = exchangeSpace;
                    }
                }
                printf ("\nSorted List: ");
                for (outerLoop = 0; outerLoop < lengthNumberList; outerLoop++)
                {
                    printf ("%d ", sortedNumberList[outerLoop]);
                }
            }


This is the third case of our program where we have the probem statement addressed - implementing the selection sort algorithm. We first check the lenght of the array, and proceed with sorting only if there are any elements. The outer loop is used to iterate through the list as many times as the number of elements in the list, actually one less because the last element will be sorted automatically. The inner loop finds the index of the element with least value in the list between the current index pointed by outer loop till the end of the array. After the inner loop completes, before going through next iteration of outer loop, the minimum value's indx is compared with the outer loop counter. If the index is not same as outer loop, the values are exhcnaged. If the values are same, that means the first element of the list (or sublist in current iteration) is alredy placed at correct position and does not require any change in place. Once the list is sorted, it is displayed.
It is important to note that we are working on the sortedNumberList array insted of the originalNumberList array. The idea is to preserve the original list for any reference.
When implementing selection sort, this code is good enough to solve the problem. Rest of the code in our C program is only having a few useful additions which helps in easy execution of the program, making it more user friendly.




lengthNumberList = 0; -> This is the fourth case where we reset the lengthNumberList variable, which tracks the last element of the array. This helps in giving the user ability to redo any operation, without the need of closing and starting the program again.

printf ("Thank You for using SortingAlgorithms v1.2\n");
            printf ("Made by Toxifier\n");

This is the fifth case statement, which allows the user to safely exit the program.

default:
            printf ("\nWrong Choice!\n");

When using the case statement, it is important to use the break keyword at the end of each case to avoid the program from falling through the next case. It is also important to note the default case, which is important to handle any wrong values in the swithc-case logic.

system("pause") -> This statement is used to pause the program, until user presses a key. This function is not necessary in your program, I use it to see my outputs paused. If you use cmd to run your programs, you might not need this. If you use linux/unix you might not need this. Depending on your compiler, this function may or may not work. Moreover, removing this line of code from this program, doesn't affect the functionality of the program.




Output(s)









Download Source Code