Skip to content

Algorithms

There are some common algorithms that you should understand. These include sorting algorithms, searching algorithms, dynamic programming, and graph algorithms. Each of these algorithms has its own strengths and weaknesses, so it's important to understand when to use each one.

Note

More general information about algorithms can be found in the NSW Software Engineering reference material. The content here is more about the implementation in Python.

Performance Analysis

Performance analysis is the process of measuring and analysing the efficiency of an algorithm or program. This includes identifying bottlenecks, optimising code, and choosing the right data structures. Performance analysis is important because it can help you improve the speed and scalability of your applications.

A commonly used unit for measuring performance is Big O notation, which describes the upper bound of an algorithm's time or space complexity as the input size grows. For example, an algorithm with a time complexity of O(n^2) will take twice as long to run when the input size doubles.

There are many ways to profile your python code. The quickest way is to read the code and rationalise the data structures used, but this can be time-consuming for large codebases.

  • You can use cProfile module which is built into Python. It provides a high-level interface for profiling Python programs. You can use it to measure the time taken by different functions or code blocks, and identify which parts of your program are taking the most time.

  • You can use timeit module which is also built into Python. It provides a simple way to measure the execution time of small code snippets. You can use it to compare the performance of different algorithms or implementations.

  • Pytest is a popular testing framework for Python, but it also provides profiling capabilities. You can use it to measure the performance of your tests and identify which parts of your code are taking the most time. There is a handy addon called pytest-profiling which will generate a graphical report showing the performance of your tested code. Internally it uses cProfile to collect profiling data.

Sorting Algorithms

Sorting algorithms are used to arrange a list of items in a specific order. There are many different types of sorting algorithms, including:

  • bubble sort
  • insertion sort
  • selection sort
  • merge sort
  • quicksort
  • heapsort

Each one has its own time complexity, which is the amount of time it takes to run as a function of the size of the input. For example, bubble sort has a time complexity of O(n^2), while merge sort has a time complexity of O(n log n). The choice of sorting algorithm depends on the specific requirements of your application.

Tip

Using a pack of cards can help you understand the concept of sorting algorithms. Mix up the cards and then try to sort them using different sorting algorithms. This can help you understand how each algorithm works and which one is most efficient.

Bubble Sort

  • Bubble sort is a simple and intuitive sorting algorithm that works by repeatedly swapping adjacent elements if they are in the wrong order. It gets its name from the way smaller elements "bubble" to the top of the list.
    import timeit
    
    def bubble_sort(arr):
        ''' Note that is modifies the array in-place. '''
        n = len(arr)
        for i in range(n - 1):
            for j in range(n-i-1):
                if arr[j] > arr[j+1]:
                    arr[j], arr[j+1] = arr[j+1], arr[j]
        return arr
    
    if __name__ == "__main__":
        input = [64, 34, 25, 12, 22, 11, 90]
        print("Unsorted array is:", list(input))
        bubble_sort(input)
        print ("Sorted array is:", input)
    
        result = timeit.timeit(lambda: bubble_sort(list(input)))
        print(f"Time taken to sort the array using Bubble Sort: {result} seconds for 100000 runs.")
    
  • Bubble sort has a time complexity of O(n^2), which makes it inefficient for large lists. However, it is easy to understand and implement.

In the code there are a few things you should be aware of: - As the bubble sort is done in-place you need to provide a copy of the input array when calling timeit.timeit so that the original array is not modified. - The default number of runs for timeit.timeit is 1,000,000. If you want to change this you can do so by passing the number parameter. - The prints are a separate step and not part of timing analysis as printing to the screen is time consuming and would influence the result

Try it yourself!

  • Modify the code above so that it doesn't make a copy of the list, does the time taken to sort the array change? Why or why not?
  • Change the number of runs for timeit.timeit and observe how it affects the result.
  • Try creating a bigger lists using random.sample(), and see how bubble sort performs on them.
  • Include print statements in the bubble_sort method to see how it impacts the time measurement.

Insertion Sort

  • Insertion sort has a time complexity of O(n^2), but it is more efficient than bubble sort for small lists or nearly sorted lists. It works by building the final sorted array one item at a time, inserting each new element into its correct position in the already-sorted portion of the array.
def insertion_sort(arr):
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        while j >=0 and key < arr[j] :
                arr[j + 1] = arr[j]
                j -= 1
        arr[j + 1] = key
    return arr

Selection Sort

Merge Sort

Quicksort

Heapsort

Search Algorithms

The following search algorithms are covered in this section: - Linear Search - Binary Search - Depth-First Search (DFS) - Breadth-First Search (BFS)