Tired of not knowing how linked lists work?
Frustrated because you can’t speak Big O notation?

Tech companies in the valley don’t interview engineers the same way we do in other parts of the world. They look for very specific knowledge, concepts, and experience that are heavily rooted in traditional computer science. And while you may already be a great engineer, chances are you would fail one of these interviews, not because of a lack of skill, but because of a lack of understanding.

By learning the basics behind how data structures work, and what the most commonly used computer algorithms are

You will be able to:

Answer the most commonly asked interview questions

Ace your technical interview, and

Land that dream job

Now I have to warn you. This course goes pretty fast. I am assuming you already know how to program.

But if you don’t have a lot of time, and you need to get up to speed fast, this course is for you.

And who am I? My name is Jonathan Rasmusson. I am a former Spotify engineer who lived and worked in the Valley, and who, just like you, wasn’t trained in traditional computer science. I had to learn all this stuff from scratch.

But the good news is it can be learned. I am proof you can land your dream job. And in this course, I am going to show you how.

Dynamic programming is a method for solving a complex problem by breaking it down into several simpler, smaller ones. Take this path-counting problem for example.

The goal here is to count how many paths there are from the upper left-hand corner of the grid down to the bottom right. The rule is you can only move down and to the right.

A dynamic programming approach to this problem would be to note that the number of paths from the start to the end can be broken down into the number of paths from A to the end plus the number of paths from B to the end. So now we have broken our big problem down into smaller subproblems.

These subproblems can then be further broken down as we walk through the grid.

This recursion stops when we reach the end, where the number of paths left is 1.

One way of translating this into a recursive algorithm would be to write something like this.

DynamicProgrammingRecursive.java

public class DynamicProgrammingRecursive {

    private int[][] grid;

    public DynamicProgrammingRecursive(int[][] grid) {
        this.grid = grid;
    }

    public int countPaths(int row, int col) {
        if (!isValidSquare(row, col)) return 0;
        if (isAtEnd(row, col)) return 1;
        return countPaths(row + 1, col) + countPaths(row, col + 1);
    }

    public boolean isValidSquare(int row, int col) {
        return isInBounds(row, col) && !isBlocked(row, col);
    }

    public boolean isBlocked(int row, int col) {
        return this.grid[row][col] == 1;
    }

    public boolean isInBounds(int row, int col) {
        return row < grid.length && col < grid[0].length;
    }

    public boolean isAtEnd(int row, int col) {
        return grid.length - 1 == row && grid[row].length - 1 == col;
    }
}

DynamicProgrammingRecursiveTest.java

import junit.framework.Assert;
import org.junit.Before;
import org.junit.Test;

public class DynamicProgrammingRecursiveTest {

    private DynamicProgrammingRecursive empty2x2;
    private DynamicProgrammingRecursive empty3x3;

    @Before
    public void setUp() throws Exception {
        int[][] paths2x2 = new int[][] {
                {0,0},
                {0,0}
        };
        empty2x2 = new DynamicProgrammingRecursive(paths2x2);

        int[][] paths3x3 = new int[][] {
                {0,0,0},
                {0,0,0},
                {0,0,0}
        };
        empty3x3 = new DynamicProgrammingRecursive(paths3x3);
    }

    /*
     2 1
     1 x
     */
    @Test
    public void TwoByTwoEmptyPathCount() throws Exception {
        Assert.assertEquals(2, empty2x2.countPaths(0, 0));
    }

    /*
     6 3 1
     3 2 1
     1 1 x
     */
    @Test
    public void ThreeByThreeEmptyPathCount() throws Exception {
        Assert.assertEquals(6, empty3x3.countPaths(0, 0));
    }

    @Test
    public void IsAtEnd() throws Exception {
        Assert.assertTrue(empty2x2.isAtEnd(1, 1));
        Assert.assertFalse(empty2x2.isAtEnd(0, 0));
    }

    @Test
    public void IsInBounds() throws Exception {
        Assert.assertTrue(empty2x2.isInBounds(0, 0));
        Assert.assertTrue(empty2x2.isInBounds(0, 1));
        Assert.assertTrue(empty2x2.isInBounds(1, 0));
        Assert.assertTrue(empty2x2.isInBounds(1, 1));
        Assert.assertFalse(empty2x2.isInBounds(0, 2));
        Assert.assertFalse(empty2x2.isInBounds(2, 0));
    }

    @Test
    public void IsBlocked() throws Exception {
        int[][] paths = new int[][] {
                {0,0},
                {0,1}
        };
        DynamicProgrammingRecursive brute = new DynamicProgrammingRecursive(paths);
        Assert.assertFalse(brute.isBlocked(0, 0));
        Assert.assertFalse(brute.isBlocked(0, 1));
        Assert.assertFalse(brute.isBlocked(1, 0));
        Assert.assertTrue(brute.isBlocked(1, 1));
    }

    @Test
    public void CountBlockedPaths() throws Exception {
        int[][] paths = new int[][] {
                {0,0,0,0,0,0,0,0},
                {0,0,1,0,0,0,1,0},
                {0,0,0,0,1,0,0,0},
                {1,0,1,0,0,1,0,0},
                {0,0,1,0,0,0,0,0},
                {0,0,0,1,1,0,1,0},
                {0,1,0,0,0,1,0,0},
                {0,0,0,0,0,0,0,0}
        };
        DynamicProgrammingRecursive brute = new DynamicProgrammingRecursive(paths);
        Assert.assertEquals(27, brute.countPaths(0, 0));
    }
}

We just traverse down and to the right, making sure that we are not out of bounds or in a blocked-off square. And when we get to the end, we return 1, allowing the recursion to unwind back up to the top where we started.

The isValidSquare method does the boundary checking as well as making sure we are not in a blocked-off square. If the square is blocked we return 0. If we are at the end we return 1. Otherwise we recurse, summing the counts for the squares below and to the right of us in the grid.

One Tip for when coding grids

Whenever you write matrix algorithms, use row and column for your variable names rather than x and y.

The reason for this is that grid[row][col] actually corresponds to [y][x], and flipping the two is a common cause of mistakes. So when dealing with matrices, use row and column instead. It’s just easier.
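To make that concrete, here is a tiny sketch (the 2-row by 3-column grid is just an arbitrary example):

```java
public class RowColDemo {
    public static void main(String[] args) {
        // 2 rows, 3 columns - the first index picks the row (y), the second the column (x)
        int[][] grid = new int[2][3];

        grid[1][2] = 7; // bottom row, rightmost column

        System.out.println(grid.length);    // number of rows: 2
        System.out.println(grid[0].length); // number of columns: 3
    }
}
```

Writing `grid[row][col]` keeps that [y][x] ordering obvious; `grid[x][y]` would silently swap the dimensions.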

Memoization

One thing you may have noticed in our previous example is that we calculate the number of paths from C to the end twice. That is something we can store for future lookups. That’s the memoization approach.

DynamicProgrammingMemoized.java

public class DynamicProgrammingMemoized {

    private int[][] grid;
    private int[][] paths;

    public DynamicProgrammingMemoized(int[][] grid) {
        this.grid = grid;
        this.paths = new int[grid.length][grid[0].length]; // same dimensions as the grid
    }

    public int countPaths(int row, int col) {
        if (!isValidSquare(row, col)) return 0;
        if (isAtEnd(row, col)) return 1;

        // only do the recursive calculation if we haven't stored the result yet
        if (this.paths[row][col] == 0) {
            this.paths[row][col] = countPaths(row + 1, col) + countPaths(row, col + 1);
        }
        return this.paths[row][col];
    }

    public boolean isValidSquare(int row, int col) {
        return isInBounds(row, col) && !isBlocked(row, col);
    }

    public boolean isBlocked(int row, int col) {
        return this.grid[row][col] == 1;
    }

    public boolean isInBounds(int row, int col) {
        return row < grid.length && col < grid[0].length;
    }

    public boolean isAtEnd(int row, int col) {
        return grid.length - 1 == row && grid[row].length - 1 == col;
    }
}

DynamicProgrammingMemoizedTest.java

import junit.framework.Assert;
import org.junit.Before;
import org.junit.Test;

public class DynamicProgrammingMemoizedTest {

    private DynamicProgrammingMemoized empty2x2;
    private DynamicProgrammingMemoized empty3x3;

    @Before
    public void setUp() throws Exception {
        int[][] paths2x2 = new int[][] {
                {0,0},
                {0,0}
        };
        empty2x2 = new DynamicProgrammingMemoized(paths2x2);

        int[][] paths3x3 = new int[][] {
                {0,0,0},
                {0,0,0},
                {0,0,0}
        };
        empty3x3 = new DynamicProgrammingMemoized(paths3x3);
    }

    /*
     2 1
     1 x
     */
    @Test
    public void TwoByTwoEmptyPathCount() throws Exception {
        Assert.assertEquals(2, empty2x2.countPaths(0, 0));
    }

    /*
     6 3 1
     3 2 1
     1 1 x
     */
    @Test
    public void ThreeByThreeEmptyPathCount() throws Exception {
        Assert.assertEquals(6, empty3x3.countPaths(0, 0));
    }

    @Test
    public void CountBlockedPaths() throws Exception {
        int[][] paths = new int[][] {
                {0,0,0,0,0,0,0,0},
                {0,0,1,0,0,0,1,0},
                {0,0,0,0,1,0,0,0},
                {1,0,1,0,0,1,0,0},
                {0,0,1,0,0,0,0,0},
                {0,0,0,1,1,0,1,0},
                {0,1,0,0,0,1,0,0},
                {0,0,0,0,0,0,0,0}
        };
        DynamicProgrammingMemoized memoized = new DynamicProgrammingMemoized(paths);
        Assert.assertEquals(27, memoized.countPaths(0, 0));
    }
}

Same idea as the recursive implementation, only here we store the results as we calculate them, which makes the algorithm a lot faster.

Traditional Dynamic Programming Approach

A more traditional dynamic programming approach to this problem would be to start at the end and work our way backwards up the grid.

If we think about what the recursive approach does, the first concrete values it gets, other than blocked paths, are these two in the bottom right: 1 and 1.

Now we can go from there. If we take that recursion one step up, it will be filling in one of the next three cells.

If we start on the left, each of these cells has just one path to the bottom right. So all of these values are one.

And if we continue doing this for the other cells, we can keep walking up and filling in values for the entire grid.
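The source for the iterative class itself isn’t shown in this post, so here is a minimal sketch of what a DynamicProgrammingIterative might look like, consistent with the tests below. Note that countPaths here takes the bottom-right coordinates, since that is where the table filling starts:

```java
public class DynamicProgrammingIterative {

    private int[][] grid;

    public DynamicProgrammingIterative(int[][] grid) {
        this.grid = grid;
    }

    // (row, col) is the bottom-right corner of the grid - the end of the path
    public int countPaths(int row, int col) {
        int[][] paths = new int[row + 1][col + 1];

        // seed the end square with a single path (unless it is blocked)
        paths[row][col] = (grid[row][col] == 1) ? 0 : 1;

        // fill in the table from the bottom right up to the top left
        for (int r = row; r >= 0; r--) {
            for (int c = col; c >= 0; c--) {
                if (r == row && c == col) continue; // already seeded
                if (grid[r][c] == 1) continue;      // blocked squares contribute 0 paths

                int down = (r < row) ? paths[r + 1][c] : 0;
                int right = (c < col) ? paths[r][c + 1] : 0;
                paths[r][c] = down + right;
            }
        }
        return paths[0][0]; // number of paths from the start
    }
}
```

Because the two nested loops fill each cell from values already computed below and to the right, no recursion (and no call stack) is needed.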

DynamicProgrammingIterativeTest.java

import junit.framework.Assert;
import org.junit.Test;

public class DynamicProgrammingIterativeTest {

    /*
     2 1
     1 x
     */
    @Test
    public void TwoByTwoEmptyPathCount() throws Exception {
        int[][] paths2x2 = new int[][] {
                {0,0},
                {0,0}
        };
        DynamicProgrammingIterative empty2x2 = new DynamicProgrammingIterative(paths2x2);
        Assert.assertEquals(2, empty2x2.countPaths(1, 1)); // start lower right
    }

    /*
     1 x
     1 1
     */
    @Test
    public void TwoByTwoOneCellBlocked() throws Exception {
        int[][] paths2x2 = new int[][] {
                {0,1},
                {0,0}
        };
        DynamicProgrammingIterative iterative = new DynamicProgrammingIterative(paths2x2);
        Assert.assertEquals(1, iterative.countPaths(1, 1));
    }

    /*
     6 3 1
     3 2 1
     1 1 1
     */
    @Test
    public void ThreeByThreeEmpty() throws Exception {
        int[][] paths3x3 = new int[][] {
                {0,0,0},
                {0,0,0},
                {0,0,0}
        };
        DynamicProgrammingIterative empty3x3 = new DynamicProgrammingIterative(paths3x3);
        Assert.assertEquals(6, empty3x3.countPaths(2, 2));
    }

    /*
     3 1 0
     2 1 x
     1 1 1
     */
    @Test
    public void ThreeByThreeBlocked() throws Exception {
        int[][] paths3x3 = new int[][] {
                {0,0,0},
                {0,0,1},
                {0,0,0}
        };
        DynamicProgrammingIterative iterative = new DynamicProgrammingIterative(paths3x3);
        Assert.assertEquals(3, iterative.countPaths(2, 2));
    }

    /*
     3 3 1
     x 2 1
     1 1 1
     */
    @Test
    public void ThreeByThreeBlocked2() throws Exception {
        int[][] paths3x3 = new int[][] {
                {0,0,0},
                {1,0,0},
                {0,0,0}
        };
        DynamicProgrammingIterative iterative = new DynamicProgrammingIterative(paths3x3);
        Assert.assertEquals(3, iterative.countPaths(2, 2));
    }

    /*
     1 1 1
     x x 1
     1 1 1
     */
    @Test
    public void ThreeByThreeBlocked3() throws Exception {
        int[][] paths3x3 = new int[][] {
                {0,0,0},
                {1,1,0},
                {0,0,0}
        };
        DynamicProgrammingIterative iterative = new DynamicProgrammingIterative(paths3x3);
        Assert.assertEquals(1, iterative.countPaths(2, 2));
    }

    @Test
    public void CountBlockedPaths() throws Exception {
        int[][] paths = new int[][] {
                {0,0,0,0,0,0,0,0},
                {0,0,1,0,0,0,1,0},
                {0,0,0,0,1,0,0,0},
                {1,0,1,0,0,1,0,0},
                {0,0,1,0,0,0,0,0},
                {0,0,0,1,1,0,1,0},
                {0,1,0,0,0,1,0,0},
                {0,0,0,0,0,0,0,0}
        };
        DynamicProgrammingIterative iterative = new DynamicProgrammingIterative(paths);
        Assert.assertEquals(27, iterative.countPaths(7, 7));
    }
}

Now the advantage of doing things this way is that we use slightly less memory. The runtime is still O(n^2), but the actual amount of memory used is reduced because we no longer need the recursive call stack.

So where does this leave us?

The main takeaway from dynamic programming problems is that when you have a big problem that consists of many similar smaller ones, you can often solve it with a combination of recursion and memoization for performance.

Note also, if you look at the tests I wrote, that a good place to start with any of these problems is with something really small and simple (like a 2×2 or 3×3 matrix), write tests, and then work your way up from there.

In computing, memoization is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached results when required.

The most common example of memoization is the implementation of the Fibonacci series.

Because a Fibonacci calculation recursively calls upon itself, when calculating the series for any larger number you end up making the same calculations over and over again.

With memoization you store the results of certain calculations along the way, and then simply look them up and return them when they are needed again in the future.

For example, here is a naive implementation of the Fibonacci series. One that does the full Fibonacci calculation for every element, beginning to end.

FibonacciNaive.java

public class FibonacciNaive {

    public int fib(int n) {
        System.out.println("n = " + n);
        if (n <= 0) {
            return 0;
        } else if (n == 1) {
            return 1;
        } else {
            return fib(n - 1) + fib(n - 2);
        }
    }
}


This is an extremely slow implementation. fib(30) here takes about 19 seconds.

By contrast, here is the same series, only memoized. Here we store the result of fib(n) as we calculate it, and then look it up in subsequent calculations.

FibonacciMemoized.java

public class FibonacciMemoized {

    private int[] memo = new int[1001]; // cache of previously computed values

    public int fib(int n) {
        System.out.println("n = " + n);
        if (n <= 0) {
            return 0;
        } else if (n == 1) {
            return 1;
        } else if (memo[n] == 0) {
            // only compute the value if we haven't seen it before
            // (note: int overflows for n > 46, so treat large results as illustrative only)
            memo[n] = fib(n - 1) + fib(n - 2);
        }
        return memo[n];
    }
}

This one does fib(30) in less than a second. You have to bump it up to fib(1000) to get to approximately 20 seconds. So a massive increase in computational performance.

That’s the power and beauty of memoization. Also our runtime is now linear with memoization instead of exponential.

A hash table is a key-value lookup. It gives you a way of storing a value against any given key, for very quick lookups.

The key and value can be basically any kind of data structure. Strings are popular. But so long as the key has a hash function, it could be anything.

How does hashing work

At a high level, we want to store our objects in an array. But how do we jump from a string to an actual index in the array? This is what a hash function does.

A hash function takes some string, converts it into an integer, and then that integer gets converted into an index for that array, and hence the object we are looking for.

So the hashcode isn’t the index in the array. We go from the key, to the hashcode, and then to the index.

And the reason for this is that the array storing the values might be much, much smaller than the number of possible hash codes out there.
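As a rough sketch of that key → hashcode → index pipeline (the bucket count of 16 here is just an arbitrary choice):

```java
public class HashIndexDemo {
    public static void main(String[] args) {
        int buckets = 16; // the backing array is much smaller than the hashcode space

        String key = "Peter";
        int hashCode = key.hashCode(); // string -> integer

        // integer -> array index (mask the sign bit so the index is never negative)
        int index = (hashCode & 0x7fffffff) % buckets;

        System.out.println(key + " -> " + hashCode + " -> bucket " + index);
    }
}
```

Whatever the hashcode turns out to be, the modulo squeezes it into one of the 16 buckets.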

Collisions

One interesting thing about hash codes is that it is possible for multiple different keys to have the same hashcode. There are an infinite number of strings, but only a finite number of hashcodes. And because we are further reducing those hashcodes into a smaller array, the potential for collisions when mapping is even greater.

That’s why a lot of time and effort goes into finding hash algorithms that spread the distribution of keys out, in an effort to avoid collisions. But collisions do periodically occur, so we need to deal with them.

The most common way to deal with these collisions is chaining, which stores all the entries that map to a given index in a linked list. When a collision occurs, we walk each element of the linked list until we find the entry with the exact key match. That means the original keys are stored in these chains too.

And for those elements where there is no collision, we have a one-element linked list.

Runtime

Usually we assume that we have a good hash function that gives a good distribution, so for the purposes of an interview you can assume lookups are constant time. But with a bad hash function, the worst case is O(n).

Code

HashTable.java

public class HashTable {

    private int INITIAL_SIZE = 16;
    private HashEntry[] data; // each slot is the head of a linked list

    class HashEntry {
        String key;
        String value;
        HashEntry next;

        HashEntry(String key, String value) {
            this.key = key;
            this.value = value;
            this.next = null;
        }

        @Override
        public String toString() {
            return key + "=" + value;
        }
    }

    HashTable() {
        data = new HashEntry[INITIAL_SIZE];
    }

    void put(String key, String value) {
        // Get the index
        int index = getIndex(key);

        // Create the linked list entry
        HashEntry entry = new HashEntry(key, value);

        // If no entry there - add it
        if (data[index] == null) {
            data[index] = entry;
        }
        // Else handle the collision by walking down the chain
        // and appending the new entry to the end
        else {
            HashEntry temp = data[index];
            while (temp.next != null) {
                temp = temp.next;
            }
            temp.next = entry;
        }
    }

    String get(String key) {
        // Get the index
        int index = getIndex(key);

        // Walk the chain until we find a matching key
        HashEntry temp = data[index];
        while (temp != null) {
            if (temp.key.equals(key)) {
                return temp.value;
            }
            temp = temp.next;
        }

        // No match - return null
        return null;
    }

    private int getIndex(String key) {
        // Get the hash code
        int hashCode = key.hashCode();

        // Convert to an index (mask the sign bit so the index is never negative)
        int index = (hashCode & 0x7fffffff) % INITIAL_SIZE;

        // Hack to force collision for testing
        if (key.equals("PeterCollision") || key.equals("PaulCollision")) {
            index = 4;
        }
        return index;
    }

    @Override
    public String toString() {
        StringBuilder hashTableStr = new StringBuilder();
        for (int bucket = 0; bucket < data.length; bucket++) {
            HashEntry entry = data[bucket];
            if (entry == null) {
                continue;
            }
            hashTableStr.append("\n bucket[")
                    .append(bucket)
                    .append("] = ")
                    .append(entry.toString());

            HashEntry temp = entry.next;
            while (temp != null) {
                hashTableStr.append(" -> ");
                hashTableStr.append(temp.toString());
                temp = temp.next;
            }
        }
        return hashTableStr.toString();
    }
}

HashTableTest.java

import junit.framework.Assert;
import org.junit.Before;
import org.junit.Test;

public class HashTableTest {

    private HashTable hashTable;

    @Before
    public void setUp() throws Exception {
        hashTable = new HashTable();
    }

    @Test
    public void PutAndGet() {
        hashTable.put("Peter", "PeterValue");
        Assert.assertEquals("PeterValue", hashTable.get("Peter"));
    }

    @Test
    public void Collision() {
        // these keys will collide (hacked in solution for testing)
        hashTable.put("PeterCollision", "PeterValue");
        hashTable.put("PaulCollision", "PaulValue");
        Assert.assertEquals("PeterValue", hashTable.get("PeterCollision"));
        Assert.assertEquals("PaulValue", hashTable.get("PaulCollision"));
        System.out.println(hashTable.toString());
    }
}

Stacks and Queues are pretty similar data structures. The difference is in how they remove elements.

A Stack is LIFO – Last In First Out. Much like a stack of books: the last one you put on the pile is the first one you take off.
A Queue is FIFO – First In First Out. Like a line at the supermarket: if you are the first to line up, you are the first to come out.

They are both linear data structures.
They can be implemented as arrays or linked lists.
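To see the two removal orders side by side before rolling our own, here is a quick sketch using the standard library’s java.util.ArrayDeque, which can act as either:

```java
import java.util.ArrayDeque;

public class LifoFifoDemo {
    public static void main(String[] args) {
        // Stack: last in, first out
        ArrayDeque<Integer> stack = new ArrayDeque<>();
        stack.push(1);
        stack.push(2);
        stack.push(3);
        System.out.println(stack.pop()); // 3 - the last book placed on the pile

        // Queue: first in, first out
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        queue.add(1);
        queue.add(2);
        queue.add(3);
        System.out.println(queue.remove()); // 1 - first in line, first served
    }
}
```

Same three elements in, but the stack hands back 3 first while the queue hands back 1 first.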

Here is a Stack and Queue implemented as a Linked List.

Stack.java

public class Stack {

    private class Node {
        private int data;
        private Node next;

        private Node(int data) {
            this.data = data;
        }
    }

    private Node top; // add and remove things here

    public boolean isEmpty() {
        return top == null;
    }

    public int peek() {
        return top.data;
    }

    public void push(int data) {
        // Create a new node,
        // set its next to be the current top,
        // then make it the new top
        Node node = new Node(data);
        node.next = top;
        top = node;
    }

    public int pop() {
        // Store the value you want to return,
        // point top at the next node down,
        // then return the value
        // (a production version would guard against popping an empty stack)
        int data = top.data;
        top = top.next;
        return data;
    }
}

Queue.java

public class Queue {

    private class Node {
        private int data;
        private Node next;

        private Node(int data) {
            this.data = data;
        }
    }

    private Node head; // remove things here
    private Node tail; // add things here

    public boolean isEmpty() {
        return head == null;
    }

    public int peek() {
        return head.data;
    }

    public void add(int data) {
        // Create a new node,
        // set the current tail.next to point to it,
        // then make it the new tail
        Node newTail = new Node(data);
        if (tail != null) {
            tail.next = newTail;
        }
        tail = newTail;

        // handle the case of the first element, where head is null
        if (head == null) {
            head = tail;
        }
    }

    public int remove() {
        // Point the head at the current head's next
        int data = head.data;
        head = head.next;

        // Handle the queue now being empty
        if (head == null) {
            tail = null;
        }
        return data;
    }
}

Some notes from this excellent video by Gayle McDowell on trees.

What is a binary tree?

A binary tree is a data structure whereby each node has no more than two child nodes.

What is a binary search tree?

A binary search tree is a binary tree sorted with a specific ordering property: on any subtree, the left nodes are less than the root node, which is less than all of the right nodes.

This ordering makes finding a node really fast, because we have a pretty good idea of which way to go as we search the tree, going left or right and essentially halving our search area at every step.

Inserts

Insert works much like a find. We start at the top, do a comparison, and then head down to the child subtree, repeating the process until we get down to an empty spot or null element.

Note the one problem with this approach: if we get elements in a particular order (say, already sorted), the tree can become really unbalanced.

And this will make our finds and inserts slow.

There are algorithms that you can use to keep a binary tree balanced, but they are pretty complicated, and in an interview you can generally assume that you are going to have a balanced tree.

Traversing

There are three common ways we walk through a tree.

Preorder traversal: you visit the root of the tree first, and then its left nodes and its right nodes. Top to bottom, left to right.

Inorder traversal: left nodes first, then the parent, then the right nodes.

Postorder traversal: left, then right, then root.

Typically with binary search trees we do an inorder traversal, because that allows the nodes to be printed out in sorted order.
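The NodeBST class below only implements the inorder walk, so here is a sketch of all three traversals side by side on a bare-bones illustrative node class (not part of the original code):

```java
public class TraversalDemo {

    static class Node {
        int data;
        Node left, right;
        Node(int data) { this.data = data; }
    }

    // root, then left subtree, then right subtree
    static void preOrder(Node n, StringBuilder out) {
        if (n == null) return;
        out.append(n.data).append(" ");
        preOrder(n.left, out);
        preOrder(n.right, out);
    }

    // left subtree, then root, then right subtree - prints a BST in sorted order
    static void inOrder(Node n, StringBuilder out) {
        if (n == null) return;
        inOrder(n.left, out);
        out.append(n.data).append(" ");
        inOrder(n.right, out);
    }

    // left subtree, then right subtree, then root
    static void postOrder(Node n, StringBuilder out) {
        if (n == null) return;
        postOrder(n.left, out);
        postOrder(n.right, out);
        out.append(n.data).append(" ");
    }

    public static void main(String[] args) {
        //      10
        //     /  \
        //    5    15
        //     \
        //      8
        Node root = new Node(10);
        root.left = new Node(5);
        root.right = new Node(15);
        root.left.right = new Node(8);

        StringBuilder pre = new StringBuilder();
        preOrder(root, pre);
        System.out.println("preorder:  " + pre);  // 10 5 8 15

        StringBuilder in = new StringBuilder();
        inOrder(root, in);
        System.out.println("inorder:   " + in);   // 5 8 10 15

        StringBuilder post = new StringBuilder();
        postOrder(root, post);
        System.out.println("postorder: " + post); // 8 5 15 10
    }
}
```

Notice only the inorder walk produces the values in sorted order.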

Code for Binary Search Tree

NodeBST.java

public class NodeBST {

    NodeBST left, right;
    int data;

    public NodeBST(int data) {
        this.data = data;
    }

    public void insert(int value) {
        // look to the left and the right to see where we want to insert
        if (value <= data) {
            if (left == null) {
                left = new NodeBST(value);
            } else {
                // push down to the child and ask it to handle it: recursion
                left.insert(value);
            }
        } else {
            if (right == null) {
                right = new NodeBST(value);
            } else {
                right.insert(value);
            }
        }
    }

    public boolean contains(int value) {
        // if we are there, return true
        if (value == data) {
            return true;
        } else if (value < data) {
            // then it should be on the left
            if (left == null) {
                return false;
            } else {
                return left.contains(value);
            }
        } else {
            if (right == null) {
                return false;
            } else {
                return right.contains(value);
            }
        }
    }

    // left child -> parent -> right child
    public void printInOrder() {
        if (left != null) {
            left.printInOrder();
        }
        System.out.println("data = " + data);
        if (right != null) {
            right.printInOrder();
        }
    }
}

NodeBSTTest.java

import junit.framework.Assert;
import org.junit.Before;
import org.junit.Test;

public class NodeBSTTest {

    private NodeBST node;

    @Before
    public void setUp() throws Exception {
        node = new NodeBST(10);
        node.insert(5);
        node.insert(15);
        node.insert(8);
    }

    @Test
    public void Contains() throws Exception {
        Assert.assertTrue(node.contains(5));
        Assert.assertTrue(node.contains(15));
        Assert.assertTrue(node.contains(8));
    }

    @Test
    public void PrintOrder() throws Exception {
        node.printInOrder();
    }
}

A binary heap is a binary-tree-based data structure used in sorting and priority queue algorithms.

What is a binary tree?

A binary tree is a data structure that stores its data in the shape of a tree. At the top you have a root node, and underneath that you have at most two other nodes: one on the left and one on the right.

Binary trees are useful for all sorts of things. But one area where they really shine is in searching. When you sort a binary tree in certain ways, you can find elements much faster than if you were walking an array or linked list.

What are heaps?

A heap is a binary tree that is sorted in a particular way. We call it a max heap if the largest number is on top, and a min heap if the smallest number is on top.

What’s cool about max heaps is that their structure lends itself very nicely to quickly finding the maximum element in a queue, which is why priority queues are often implemented using binary trees sorted as heaps.

What is a binary heap?

As you’ve probably guessed, binary heaps combine binary trees with heap ordering.

In a max binary heap, every node has a value at least as large as its children’s values. There is no left/right orientation or distinction in values.

Also, children further down the tree can have greater values than nodes higher up the tree on the other side. That doesn’t matter. The only thing that matters is that children are equal to or less than their parents. That’s it.

Which means every node is the root of its own sub-heap.

Heap representation

Because the shape of a heap is so regular, it is easy to store heap nodes in an array, top to bottom, left to right.

And because we know where every element of the array lies, we can find any node’s children and parent with simple index arithmetic: for the node at index i, the left child is at 2i + 1, the right child is at 2i + 2, and the parent is at (i - 1) / 2.
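A quick sketch of that index arithmetic (the array values here are just an example max heap):

```java
public class HeapIndexDemo {
    public static void main(String[] args) {
        // An example max heap, stored top to bottom, left to right:
        //        10
        //       /  \
        //      8    9
        //     / \
        //    4   7
        int[] heap = {10, 8, 9, 4, 7};

        int i = 1; // the node holding 8
        System.out.println(heap[2 * i + 1]);   // left child  -> 4
        System.out.println(heap[2 * i + 2]);   // right child -> 7
        System.out.println(heap[(i - 1) / 2]); // parent      -> 10
    }
}
```

No pointers needed; the position in the array alone encodes the tree structure.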

Finding the maximum

Finding the maximum is easy. It’s simply the element at the top of our heap, so it is always the first element in our array.

Inserting

When inserting, we stick the new element at the end of our tree (always reading top to bottom, left to right), which means it goes in the last element of our array. If it is smaller than its parent, we are done.

But what happens if our element is larger than the parent?

For insertions like this we walk up the tree, swapping nodes until our heap constraint (the child being less than or equal to its parent) is satisfied.

And if we insert a number equal to its parent, we just leave it there. It’s done.

Deletion

Say we want to delete the root of our heap (i.e. we pulled off its max value and now we need the heap to re-sort itself, or heapify).

To delete the max element, we take the root node, swap it with the last leaf, and then delete the last leaf, which we know is easy.

Now we need to re-sort the heap (because the element now at the top doesn’t represent the max).

We do this by comparing the top with each of its left/right children, and continuously swapping it with the larger of the two.

In this case the left child is larger, so we are going to swap the root to the left.

Now we repeat the process with the left-hand subtree. We keep swapping the node in question with the largest of its children until we can’t swap anymore.

At this point we say the heap is heapified and sorted again. It is good to go for another extraction.

Cool things

One other cool thing about binary heaps is that this data structure can sort itself very efficiently, in place. No need to copy all the contents somewhere else and then copy them back. When we heapify, everything happens within the existing array by simply swapping nodes. Very cool. Very efficient.

Code

Here is an example of a Max Heap implementation for ints in Java.

MaxIntHeap.java

import java.util.Arrays;

public class MaxIntHeap {

    private int capacity = 10;
    private int size = 0;

    int[] items = new int[capacity];

    private int leftChildIndex(int parentIndex) { return 2 * parentIndex + 1; }
    private int rightChildIndex(int parentIndex) { return 2 * parentIndex + 2; }
    private int parentIndex(int childIndex) { return (childIndex - 1) / 2; }

    private boolean hasLeftChild(int index) { return leftChildIndex(index) < size; }
    private boolean hasRightChild(int index) { return rightChildIndex(index) < size; }
    private boolean hasParent(int index) { return index > 0; }

    private int leftChild(int index) { return items[leftChildIndex(index)]; }
    private int rightChild(int index) { return items[rightChildIndex(index)]; }
    private int parent(int index) { return items[parentIndex(index)]; }

    private void swap(int indexOne, int indexTwo) {
        int temp = items[indexOne];
        items[indexOne] = items[indexTwo];
        items[indexTwo] = temp;
    }

    private void ensureCapacity() {
        if (size == capacity) {
            items = Arrays.copyOf(items, capacity * 2);
            capacity *= 2;
        }
    }

    public int extractMax() {
        if (size == 0) throw new IllegalStateException();
        int item = items[0];        // grab the max
        items[0] = items[size - 1]; // copy the last leaf to the top
        size--;
        heapifyDown();              // and because the top isn't right, heapify down
        return item;
    }

    public void add(int item) {
        ensureCapacity();
        items[size] = item; // put in last spot
        size++;
        heapifyUp();
    }

    public void heapifyUp() {
        int index = size - 1; // start at the last element
        // walk up as long as there is a parent and it is smaller than you
        while (hasParent(index) && parent(index) < items[index]) {
            swap(parentIndex(index), index);
            index = parentIndex(index); // walk upwards to the next node
        }
    }

    public void heapifyDown() {
        int index = 0; // starting at the top
        // as long as I have children (only need to check left - if no left, there is no right)
        while (hasLeftChild(index)) {
            // get the index of the larger of the two children
            int largerChildIndex = leftChildIndex(index);
            if (hasRightChild(index) && rightChild(index) > leftChild(index)) {
                largerChildIndex = rightChildIndex(index);
            }
            // if I am at least as large as both my children, everything is in order
            if (items[index] >= items[largerChildIndex]) {
                break;
            }
            swap(index, largerChildIndex); // otherwise swap with the larger child
            index = largerChildIndex;      // and move down to it
        }
    }

    public void print() {
        for (int i = 0; i < size; i++) {
            System.out.println(i + "[" + items[i] + "]");
        }
    }
}

MaxIntHeapTest.java

import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;

public class MaxIntHeapTest {

    private MaxIntHeap maxHeap;

    @Before
    public void setUp() throws Exception {
        maxHeap = new MaxIntHeap();
        maxHeap.add(6);
        maxHeap.add(5);
        maxHeap.add(4);
        maxHeap.add(3);
        maxHeap.add(2);
        maxHeap.add(1);
    }

    @Test
    public void Insert() throws Exception {
        // Remember: The array reads top down / left to right
        Assert.assertEquals(6, maxHeap.items[0]);
        Assert.assertEquals(5, maxHeap.items[1]);
        Assert.assertEquals(4, maxHeap.items[2]);
        Assert.assertEquals(3, maxHeap.items[3]);
        Assert.assertEquals(2, maxHeap.items[4]);
        Assert.assertEquals(1, maxHeap.items[5]);
    }

    @Test
    public void ExtractMax() throws Exception {
        Assert.assertEquals(6, maxHeap.extractMax());
        Assert.assertEquals(5, maxHeap.extractMax());
        Assert.assertEquals(4, maxHeap.extractMax());
        Assert.assertEquals(3, maxHeap.extractMax());
        Assert.assertEquals(2, maxHeap.extractMax());
        Assert.assertEquals(1, maxHeap.extractMax());
    }
}
