Jump to content
xisto Community
Sign in to follow this  
beatgammit

Sorting How to sort data in arrays- examples given in Java

Recommended Posts

Sorting- Using Java Language

 

First of all, why the heck should I care about sorting?

Easier to present data- it is very frustrating to be looking for some information and it is not in order

Faster searching algorithms- most searching algorithms require looking at every element in a list until the requested one is found, but if it is sorted, you can use binary search (I'll explain in another tutorial) to drastically cut search time

Less memory for searching- there ARE ways to increase search time in unsorted lists (with indexing and such), but these take much more memory than just keeping a list sorted, and memory is precious in web-user interfaces

Can't I just use the ones given to me? I see some structures already inherently support sorting.

 

Simply put, no. If you are considering using a TreeSet or something, it would need to sort every time you added something. This is not a problem if you only add one every once in a while, but sorting a random array takes quite a while. Even if these algorithms were fast, creating your own code is always much faster, and you can tweak the code yourself to make it even faster if you need it to. The Arrays.sort method is quite fast (faster than any algorithm that you could devise), but not every programming language has the equivalent.

 

Why do I care if my sorting algorithms are fast?

 

My Computer Science teacher told me that if your website takes more than four seconds to load, you have lost business; customers will move to another website with the same content and you lose advertising dollars (or, if your service has no advertising, bragging rights). A sorted list allows for faster presentation of data to the user.

 

Alright, you convinced me. How do I do it?

 

There are many different methods for sorting- insertion sort, bubble sort, merge sort, shell sort, quick sort, etc. Insertion sort creates a new array/list and stores the original values in order. This can be quite time consuming; if an element must go to the beginning of the new list, the rest of the elements must be moved down, requiring a lot of processing power. Bubble sort starts at the beginning of the array and moves a value until all of the values before it are smaller than it, bouncing the value until it finds its correct location. This is also very time consuming. There are a few other sorting algorithms, but I am only going to talk about shell, merge, and quick sorts.

 

Shell Sort

 

Shell sort was created by a very smart man by the name of Shell before computers were created to take advantage of his algorithm (1958 or 1959, I can't remember). Anyway, his algorithm was very mathematically based and is a very efficient way of sorting, although it seems to be quite arduous. This sorting algorithm is, in my own words, âjust plain sweetâ. Enough introductions, let's meet the guest of honor:

Here's an example list (I just typed random numbers):

79451383

To begin, we divide the list into two:

7945 1383

We then swap every corresponding value from each side, putting the smaller values on the left and bigger values on the right:

1343 7985

We then split each group in two:

13 43 79 85

And then swap values from each group, again:

13 43 75 89

It is pretty close to being sorted, so we now do the whole list:

13345789

 

This may sound like it is very time, and resource, consuming, but put to tests, it nearly always beats out the âlesserâ sorts (it only looses to some of the sorts in already sorted lists and ones where maybe the last two are switched, or something extremely minor like that).

 

In code it looks like this:

void sort(int data[]) {	for(int gap = data.length / 2; gap > 0; gap = gap == 2 ? 1 : (int)(gap / 2.2)){		for(int i = gap; i < data.length; i++){			short tmp = data[i];			int j = i;			for(; j >= gap && tmp < data[j - gap]; j -= gap){				data[j] = data[j - gap];			}			data[j] = tmp;		}	}	return data;}

Merge Sort

 

Shell sort is pretty sexy, but it is not the fastest. Merge sort often beats it in trials. So what is it?? Merge sort takes two sorted arrays and fills a third array by inserting the lowest of both arrays into this new array. But, what if the array isn't sorted?? We have a problem, or so it seems. What we'd do is recursively build two sorted lists and then merge both of them. Here's the code:

 

Call the recursive mergeSort routine

void mergeSort(Comparable[] a ){	mergeSort(a, 0, a.length - 1);}

Recursively sort the first half, then the second half, then merge them.

void mergeSort(Comparable[] a, int left, int right){	if(left < right) {		int center = (left + right) / 2;		mergeSort(a, left, center);		mergeSort(a, center + 1, right);		merge(a, left, center, right);	}}

This is one way to implement a merge routine, you could use two arrays if you like.

void merge(Comparable [] a, int left, int center, int right){	int size = right - left + 1;	Comparable[] tmp = new Comparable[size];	int i = 0;	int leftPos = left;	int rightPos = center + 1;	while(leftPos <= center && rightPos <= right)		if(a[leftPos].compareTo( a[rightPos]) <= 0)			tmp[i++] = a[leftPos++];		else			tmp[i++] = a[rightPos++];	while(leftPos <= center)		tmp[i++] = a[leftPos++];	while(rightPos <= right)		tmp[i++] = a[rightPos++];	for(i = 0; i < size; i++)		a[left + i] = tmp[i];}

Here's a breakdown of how it works with a random list:

79451383

7945 1383

79 45 13 83

7 9 4 5 1 3 8 3

79 46 13 38

4679 1338

13346789

 

Merge sort uses a lot of memory (3 arrays) and for this reason is generally impractical for general use.

 

Quick Sort

 

This is the fastest known way to sort an array and does not take up all that much memory (it uses only one array). What it does is pick a âpivotâ, an approximation to the median (the middle if the array was sorted) and move all of the numbers bigger than it to one side and all of the ones smaller than it to the other side. It keeps doing this recursively until the list is completely sorted. Here's the code:

 

Calls the recursive quicksort method.

void quicksort( Comparable [ ] a ) {	quicksort( a, 0, a.length - 1 );}

Sorts the first half and then the second half, according to the pivot.

void quicksort( Comparable [ ] a, int low, int high ) {	if( low < high ) {		int pivot = partition(a, low, high);		quicksort( a, low, pivot - 1 );		quicksort( a, pivot + 1, high );	}}

Picks a pivot (by getting the median of three [much safer than guessing low or high]) and moves low closer to high and high closer to low and switches them when low is > pivot and high is < pivot. The pivot is moved where low and high cross

int partition (Comparable[] a, int low, int high){	Comparable pivot = new Comparable();	int center = (low â high) / 2;	if(a[low] < a[high]){		if(a[center] >= a[low]){			if(a[center] < a[high]{				pivot = a[center];			}else{				pivot = a[high];			}		}else{			pivot = a[low];		}	}else if(a[center] >= a[high]{		if(a[center] < a[low]){			pivot = a[center];		}else{			pivot = a[low];		}	}else{		pivot = a[high];	}	int i = low;	int j = high+1;	while (i < j) {		do i++; while(i < j && a[ i ].compareTo(pivot) < 0 );		do j--; while(i <= j && a[ j ].compareTo(pivot) > 0 );		if (i < j) {			Comparable temp = a[i];			a[i] = a[j];			a[j] = temp;		}	}	a[low] = a[j];	a[j] = pivot;	return j;}

Here's the rundown of how it works with a random list of numbers (pivot is bold):

79451383 low = 7; High = 3; middle = 1

31 3 45978 lows = 3,4; highs = 1,8; middle = 3,9

1 3 3 457 8 9 low = 4, high = 7, middle = 5

13345789

 

The example I gave kinda sucked, but that comes with the territory. When it runs most efficiently, or at least decently efficiently, it is the fastest algorithm because it breaks up the sorting into independent chunks. You are guaranteed that your pivot is in the right place because everything left of it is smaller and everything right of it is bigger, thus it is in the right place. This allows for a really efficient search of a random listâquickselect. Quick select is just quicksort except you don't partition the half that your requested value is not in. This is really helpful when searching unordered, complex data.

 

Conclusions:

These sorts are really not all that important, especially if you are using Java because Java's built-in sort runs with the fastest algorithms; if you implement Comparable, you can take advantage of this functionality. Other programming languages may not have such algorithms, so applying these sorts can drastically improve performance.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
Sign in to follow this  

×
×
  • Create New...

Important Information

Terms of Use | Privacy Policy | Guidelines | We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.