Quicksort In Python
We’ve all been guilty of it. Whenever we come across a problem that requires us to sort an array, we default to implementing bubble sort. I know we can do better!
As a quick refresher, here’s a table of the different sorting algorithms and their big O notation.
As we can see, on average the time complexity of quick sort is n log of n, and in a worst case scenario, the complexity is n squared.
If we visit every element only once, then we’d have O(N).
We have O(N²) when, for every element, we visit every other element. For instance, in the case of bubble sort, every time we add a new element to the list, not only do we have to compare it to every element, but every element is compared against it as well.
Whenever you see Log₂N, you should automatically think a subset of the N elements in the list. In the quick sort algorithm, we’re visiting every element once. It’s for this reason that we have the N in front of the log. Given that, on average, we roughly have the same number of elements that are greater than or less than the pivot, we’re required to traverse less and less elements in order to sort the list as we continue to split based on the current pivot. It’s for this reason, quick sort has a time complexity of O(NLog₂N).
Bubble Sort
Suppose we had the following list.
We start by comparing the first element with the next element.
Since the second element is less than the first, we swap them.
Next, we move the pointer to the second element and compare it with the third one.
Since, the third one is greater than the current element, we simply move on.
9 is greater than 3. Therefore, we swap them.
We move the pointer.
9 is greater than 8. Thus, we swap them.
We repeat the process until we’ve visited every index in the list.
As we can see, using bubble sort, after the first pass, we can only guarantee that the first element is in the correct position. Therefore, we repeat the process starting at the next index.
Quick Sort
Suppose we had the same list as the previous example.
We start off by selecting one number as the pivot. The pivot could be anything, but, in this example, we select the last element in the list.
The comparison of the numbers in the list will be relative to the pivot. In other words, all the numbers higher than our pivot, will be placed in one list, and all the numbers below our pivot will be placed in another list.
4 is greater than 2. Therefore, it’s placed in the list on the right.
We repeat the process for the remaining elements.
After the first pass, we can guarantee that the pivot is in the correct position of the sorted list. The process is then repeated for each of the two lists we just created.
Given that the list on the left only has one element, it’s already sorted.
We use the last element in the list as the pivot.
We put all the number greater than 6 in one list and all the numbers lower than 6 in another.
We iterate over the remaining lists.
We finish with a sorted list.
Python Example
To begin, we create a list of 1000 elements ranging from 0 to 100.
import random
number_of_elements = 1000
max_element_value = 100
min_element_value = 0
elements = []
for i in range(number_of_elements):
number = random.randint(min_element_value, max_element_value)
elements.append(number)
We define a function to sort the list using the bubble sort algorithm as a basis for comparison.
def bubble_sort(elements):
for i in range(len(elements)):
for j in range(len(elements) - 1):
if elements[j] > elements[j+1]:
temp = elements[j]
elements[j] = elements[j+1]
elements[j+1] = temp
return elements
We sort the list, and obtain a rough estimate of how long it takes.
import time
start_time = time.time()
sorted_elements = bubble_sort(elements)
print("--- %s seconds ---" % (time.time() - start_time))
--- 0.09066176414489746 seconds ---
We can confirm that we indeed sorted the array by viewing the first and last elements.
print(sorted_elements[:10])
print(sorted_elements[-10:])
[0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
[100, 100, 100, 100, 100, 100, 100, 100, 100, 100]
We implement the quick sort algorithm using recursion.
def quick_sort(elements):
if len(elements) <= 1:
return elements
pivot = elements.pop()
greater_than_elements = []
lesser_or_equal_than_elements = []
for element in elements:
if (element > pivot):
greater_than_elements.append(element)
else:
lesser_or_equal_than_elements.append(element)
return quick_sort(lesser_or_equal_than_elements) + [pivot] + quick_sort(greater_than_elements)
Finally, we sort the list.
import time
start_time = time.time()
sorted_elements = quick_sort(elements)
print("--- %s seconds ---" % (time.time() - start_time))
--- 0.03184390068054199 seconds ---
As we can see, when sorting an array of 1000 elements, we already have a noticeable difference in the time taken by bubble sort and quick sort. The former takes roughly a third of the time of the former!