A tuple is an immutable, heterogeneous data structure, while a list is a homogeneous, mutable sequence. Tuples have structure. Lists have order.

For example, say we wanted to save notes about a book at a specific page and line number. We could define that location as a tuple.

my_location = (42, 11) # page number, line number

We could then use this tuple as a key in a dictionary to store notes against locations. A list, on the other hand, could be used to store multiple locations. Naturally, one might want to add and remove locations from the list. That's why lists are mutable and tuples are immutable.

Tired of not knowing how linked lists work?
Frustrated because you can't speak Big O notation?

Tech companies in the Valley don't interview engineers the same way we do in other parts of the world. They look for very specific knowledge, concepts, and experience that are heavily rooted in traditional computer science. And while you may already be a great engineer, chances are you would fail one of these interviews – not because of a lack of skill, but because of a lack of understanding.

By learning the basics of how data structures work, and what the most commonly used computer algorithms are, you will be able to:

Answer the most commonly asked interview questions

Ace your technical interview, and

Land that dream job

Now, I have to warn you. This course goes pretty fast. I am assuming you already know how to program.

But if you don’t have a lot of time, and you need to get up to speed fast, this course is for you.

And who am I? My name is Jonathan Rasmusson. I am a former Spotify engineer who lived and worked in the Valley and who, just like you, wasn't trained in traditional computer science. I had to learn all this stuff from scratch.

But the good news is it can be learned. I am proof you can land your dream job. And in this course, I am going to show you how.

Dynamic programming is a method for solving a complex problem by breaking it down into several simpler, smaller ones. Take this grid path-counting problem, for example.

The goal here is to count how many paths there are from the upper left-hand corner of the grid down to the bottom right. The rule is you can only move down and to the right.

A dynamic programming approach to this problem would be to note that the number of paths from the start to the end can be broken down by summing the number of paths from A to the end plus the number of paths from B to the end. So now we have broken our big problem down into smaller subproblems.

These subproblems can then be broken down further as we walk through the grid.

This recursion stops when we reach the end, where the number of paths left is 1.

One way of translating this into a recursive algorithm would be to write something like this:

DynamicProgrammingRecursive.java

public class DynamicProgrammingRecursive {
    private int grid[][];

    public DynamicProgrammingRecursive(int[][] grid) {
        this.grid = grid;
    }

    public int countPaths(int row, int col) {
        if (!isValidSquare(row, col)) return 0;
        if (isAtEnd(row, col)) return 1;
        return countPaths(row + 1, col) + countPaths(row, col + 1);
    }

    public boolean isValidSquare(int row, int col) {
        return isInBounds(row, col) && !isBlocked(row, col);
    }

    public boolean isBlocked(int row, int col) {
        return this.grid[row][col] == 1;
    }

    public boolean isInBounds(int row, int col) {
        return (row < grid.length && col < grid[0].length);
    }

    public boolean isAtEnd(int row, int col) {
        return grid.length - 1 == row && grid[row].length - 1 == col;
    }
}

DynamicProgrammingRecursiveTest.java

import junit.framework.Assert;
import org.junit.Before;
import org.junit.Test;

public class DynamicProgrammingRecursiveTest {
    private DynamicProgrammingRecursive empty2x2;
    private DynamicProgrammingRecursive empty3x3;

    @Before
    public void setUp() throws Exception {
        int[][] paths2x2 = new int[][] {
            {0,0},
            {0,0}
        };
        empty2x2 = new DynamicProgrammingRecursive(paths2x2);

        int[][] paths3x3 = new int[][] {
            {0,0,0},
            {0,0,0},
            {0,0,0}
        };
        empty3x3 = new DynamicProgrammingRecursive(paths3x3);
    }

    /*
     2 1
     1 x
    */
    @Test
    public void TwoByTwoEmptyPathCount() throws Exception {
        Assert.assertEquals(2, empty2x2.countPaths(0, 0)); // start at the upper left
    }

    /*
     6 3 1
     3 2 1
     1 1 x
    */
    @Test
    public void ThreeByThreeEmptyPathCount() throws Exception {
        Assert.assertEquals(6, empty3x3.countPaths(0, 0));
    }

    @Test
    public void IsAtEnd() throws Exception {
        int row = 1;
        int column = 1; // (1,1) is the lower-right square of a 2x2 grid
        Assert.assertTrue(empty2x2.isAtEnd(row, column));
    }

    @Test
    public void IsInBounds() throws Exception {
        Assert.assertTrue(empty2x2.isInBounds(0, 0));
        Assert.assertTrue(empty2x2.isInBounds(0, 1));
        Assert.assertTrue(empty2x2.isInBounds(1, 0));
        Assert.assertTrue(empty2x2.isInBounds(1, 1));
        Assert.assertFalse(empty2x2.isInBounds(0, 2));
        Assert.assertFalse(empty2x2.isInBounds(2, 0));
    }

    @Test
    public void IsBlocked() throws Exception {
        int[][] paths = new int[][] {
            {0,0},
            {0,1}
        };
        DynamicProgrammingRecursive brute = new DynamicProgrammingRecursive(paths);
        Assert.assertFalse(brute.isBlocked(0, 0));
        Assert.assertFalse(brute.isBlocked(0, 1));
        Assert.assertFalse(brute.isBlocked(1, 0));
        Assert.assertTrue(brute.isBlocked(1, 1));
    }

    @Test
    public void CountBlockedPaths() throws Exception {
        int[][] paths = new int[][] {
            {0,0,0,0,0,0,0,0},
            {0,0,1,0,0,0,1,0},
            {0,0,0,0,1,0,0,0},
            {1,0,1,0,0,1,0,0},
            {0,0,1,0,0,0,0,0},
            {0,0,0,1,1,0,1,0},
            {0,1,0,0,0,1,0,0},
            {0,0,0,0,0,0,0,0}
        };
        DynamicProgrammingRecursive brute = new DynamicProgrammingRecursive(paths);
        Assert.assertEquals(27, brute.countPaths(0, 0));
    }
}

We just traverse down and to the right, making sure that we are not out of bounds or in a blocked-off square. And when we get to the end, we return 1, allowing the recursion to unwind back up to where we started.

The isValidSquare method does the boundary checking and makes sure we are not in a blocked-off square. If we are blocked, return 0. If we are at the end, return 1. Otherwise, recurse and sum the counts for the squares below and to the right.

One tip when coding grids

Whenever you do matrix algorithms, use row and column for your variable names rather than x and y.

The reason for this is that (row, column) translates into [y][x], and that is a common cause of mistakes. So when dealing with matrices, use row and column instead. It's just easier.
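To make the point concrete, here is a tiny sketch (the GridIndexingTip class is hypothetical, just for illustration):

```java
// In Java, a 2-D array is an array of rows: the first index selects
// the row (vertical, what you'd call y), the second selects the
// column (horizontal, x). Naming the indices row/col avoids the
// classic grid[x][y] mix-up.
class GridIndexingTip {
    static int valueAt(int[][] grid, int row, int col) {
        return grid[row][col]; // row first, then column
    }
}
```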

Memoization

One thing you may have noticed in our previous examples is that we calculate the number of paths from C to the end twice. That result is something we can store for future lookups. That's the memoization approach.

DynamicProgrammingMemoized.java

public class DynamicProgrammingMemoized {
    private int grid[][];
    private int paths[][];

    public DynamicProgrammingMemoized(int[][] grid) {
        this.grid = grid;
        this.paths = new int[grid.length][grid[0].length]; // assume square
    }

    public int countPaths(int row, int col) {
        if (!isValidSquare(row, col)) return 0;
        if (isAtEnd(row, col)) return 1;
        if (this.paths[row][col] == 0) {
            this.paths[row][col] = countPaths(row + 1, col) + countPaths(row, col + 1);
        }
        return this.paths[row][col];
    }

    public boolean isValidSquare(int row, int col) {
        return isInBounds(row, col) && !isBlocked(row, col);
    }

    public boolean isBlocked(int row, int col) {
        return this.grid[row][col] == 1;
    }

    public boolean isInBounds(int row, int col) {
        return (row < grid.length && col < grid[0].length);
    }

    public boolean isAtEnd(int row, int col) {
        return grid.length - 1 == row && grid[row].length - 1 == col;
    }
}

DynamicProgrammingMemoizedTest.java

import junit.framework.Assert;
import org.junit.Before;
import org.junit.Test;

public class DynamicProgrammingMemoizedTest {
    private DynamicProgrammingMemoized empty2x2;
    private DynamicProgrammingMemoized empty3x3;

    @Before
    public void setUp() throws Exception {
        int[][] paths2x2 = new int[][] {
            {0,0},
            {0,0}
        };
        empty2x2 = new DynamicProgrammingMemoized(paths2x2);

        int[][] paths3x3 = new int[][] {
            {0,0,0},
            {0,0,0},
            {0,0,0}
        };
        empty3x3 = new DynamicProgrammingMemoized(paths3x3);
    }

    /*
     2 1
     1 x
    */
    @Test
    public void TwoByTwoEmptyPathCount() throws Exception {
        Assert.assertEquals(2, empty2x2.countPaths(0, 0));
    }

    /*
     6 3 1
     3 2 1
     1 1 x
    */
    @Test
    public void ThreeByThreeEmptyPathCount() throws Exception {
        Assert.assertEquals(6, empty3x3.countPaths(0, 0));
    }

    @Test
    public void CountBlockedPaths() throws Exception {
        int[][] paths = new int[][] {
            {0,0,0,0,0,0,0,0},
            {0,0,1,0,0,0,1,0},
            {0,0,0,0,1,0,0,0},
            {1,0,1,0,0,1,0,0},
            {0,0,1,0,0,0,0,0},
            {0,0,0,1,1,0,1,0},
            {0,1,0,0,0,1,0,0},
            {0,0,0,0,0,0,0,0}
        };
        DynamicProgrammingMemoized memoized = new DynamicProgrammingMemoized(paths);
        Assert.assertEquals(27, memoized.countPaths(0, 0));
    }
}

Same idea as the recursive implementation, only here we store the results as we calculate them, which speeds up the algorithm considerably.

Traditional Dynamic Programming Approach

A more traditional dynamic programming approach to this problem would be to start at the end and work our way backwards up the grid.

If we think about what the recursive approach does, the first concrete values it gets, other than blocked paths, are these two in the bottom right: 1 and 1.

Now we can go from there. If we take that recursion one step up, it may be filling in one of the next three cells.

If we start on the left, all of these cells have just one path to the bottom right, so all of their values are 1.

And if we continue doing this for the other cells, we can keep walking up the grid, filling in values until the entire grid is done.

DynamicProgrammingIterativeTest.java

import junit.framework.Assert;
import org.junit.Test;

public class DynamicProgrammingIterativeTest {
    /*
     2 1
     1 x
    */
    @Test
    public void TwoByTwoEmptyPathCount() throws Exception {
        int[][] paths2x2 = new int[][] {
            {0,0},
            {0,0}
        };
        DynamicProgrammingIterative empty2x2 = new DynamicProgrammingIterative(paths2x2);
        Assert.assertEquals(2, empty2x2.countPaths(1, 1)); // start lower right
    }

    /*
     1 x
     1 1
    */
    @Test
    public void TwoByTwoOneCellBlocked() throws Exception {
        int[][] paths2x2 = new int[][] {
            {0,1},
            {0,0}
        };
        DynamicProgrammingIterative iterative = new DynamicProgrammingIterative(paths2x2);
        Assert.assertEquals(1, iterative.countPaths(1, 1));
    }

    /*
     6 3 1
     3 2 1
     1 1 1
    */
    @Test
    public void ThreeByThreeEmpty() throws Exception {
        int[][] paths3x3 = new int[][] {
            {0,0,0},
            {0,0,0},
            {0,0,0}
        };
        DynamicProgrammingIterative empty3x3 = new DynamicProgrammingIterative(paths3x3);
        Assert.assertEquals(6, empty3x3.countPaths(2, 2));
    }

    /*
     3 1 0
     2 1 x
     1 1 1
    */
    @Test
    public void ThreeByThreeBlocked() throws Exception {
        int[][] paths3x3 = new int[][] {
            {0,0,0},
            {0,0,1},
            {0,0,0}
        };
        DynamicProgrammingIterative iterative = new DynamicProgrammingIterative(paths3x3);
        Assert.assertEquals(3, iterative.countPaths(2, 2));
    }

    /*
     3 3 1
     x 2 1
     1 1 1
    */
    @Test
    public void ThreeByThreeBlocked2() throws Exception {
        int[][] paths3x3 = new int[][] {
            {0,0,0},
            {1,0,0},
            {0,0,0}
        };
        DynamicProgrammingIterative iterative = new DynamicProgrammingIterative(paths3x3);
        Assert.assertEquals(3, iterative.countPaths(2, 2));
    }

    /*
     1 1 1
     x x 1
     1 1 1
    */
    @Test
    public void ThreeByThreeBlocked3() throws Exception {
        int[][] paths3x3 = new int[][] {
            {0,0,0},
            {1,1,0},
            {0,0,0}
        };
        DynamicProgrammingIterative iterative = new DynamicProgrammingIterative(paths3x3);
        Assert.assertEquals(1, iterative.countPaths(2, 2));
    }

    @Test
    public void CountBlockedPaths() throws Exception {
        int[][] paths = new int[][] {
            {0,0,0,0,0,0,0,0},
            {0,0,1,0,0,0,1,0},
            {0,0,0,0,1,0,0,0},
            {1,0,1,0,0,1,0,0},
            {0,0,1,0,0,0,0,0},
            {0,0,0,1,1,0,1,0},
            {0,1,0,0,0,1,0,0},
            {0,0,0,0,0,0,0,0}
        };
        DynamicProgrammingIterative iterative = new DynamicProgrammingIterative(paths);
        Assert.assertEquals(27, iterative.countPaths(7, 7));
    }
}
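The tests above reference a DynamicProgrammingIterative class whose source never appears in the post. Here is a minimal sketch of what it might look like, assuming (as the tests do) that countPaths takes the row and column of the bottom-right end square and fills a table from that corner back up to the start:

```java
public class DynamicProgrammingIterative {
    private int grid[][];

    public DynamicProgrammingIterative(int[][] grid) {
        this.grid = grid;
    }

    // Fill the path counts from the bottom-right end square back up
    // to the top-left start, then read off the answer at (0, 0).
    public int countPaths(int endRow, int endCol) {
        int[][] paths = new int[endRow + 1][endCol + 1];
        for (int row = endRow; row >= 0; row--) {
            for (int col = endCol; col >= 0; col--) {
                if (grid[row][col] == 1) {
                    paths[row][col] = 0;            // blocked square
                } else if (row == endRow && col == endCol) {
                    paths[row][col] = 1;            // the end itself
                } else {
                    int down  = (row < endRow) ? paths[row + 1][col] : 0;
                    int right = (col < endCol) ? paths[row][col + 1] : 0;
                    paths[row][col] = down + right; // sum of the two moves
                }
            }
        }
        return paths[0][0];
    }
}
```

Each cell's count is just the sum of the cell below it and the cell to its right, the same recurrence the recursive version computes, only evaluated in reverse order with no call stack.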

Now the advantage of doing things this way is that we use slightly less memory. Runtime is still O(n^2), but we've reduced the actual amount of memory used slightly because we no longer have to use the call stack.

So where does this leave us?

The main takeaway of dynamic programming problems is that when you have a big problem that consists of many similar smaller ones, you can often solve it using a combination of recursion and memoization for performance.

Note also that if you look at the tests I wrote, a good place to start with any of these problems is with something really small and simple (like a 2×2 or 3×3 matrix), write tests, and then work your way up from there.

In computing, memoization is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached results when required.

The most common example of memoization is the implementation of the Fibonacci series.

Because a Fibonacci implementation recursively calls itself, when calculating the series for any larger numbers you end up making the same calculations over and over again.

With memoization you store the results of certain calculations along the way, and then simply lookup and return them when called again in the future.

For example, here is a naive implementation of the Fibonacci series, one that does the full calculation for every element, beginning to end.

FibonacciNaive.java

public class FibonacciNaive {
    public int fib(int n) {
        System.out.println("n = " + n);
        if (n <= 0) {
            return 0;
        } else if (n == 1) {
            return 1;
        } else {
            return fib(n - 1) + fib(n - 2);
        }
    }
}


This is an extremely slow implementation. fib(30) here takes about 19 seconds.

By contrast, here is the same series, only memoized. Here we store the result of fib(n) as we calculate it, and then simply look it up in subsequent calculations.

FibonacciMemoized.java

public class FibonacciMemoized {
    private int[] memo = new int[1001];

    public int fib(int n) {
        System.out.println("n = " + n);
        if (n <= 0) {
            return 0;
        } else if (n == 1) {
            return 1;
        } else if (memo[n] == 0) {
            memo[n] = fib(n - 1) + fib(n - 2);
        }
        return memo[n];
    }
}

This one does fib(30) in less than a second. You have to bump n up to around 1000 before it takes approximately 20 seconds (ignoring the integer overflow you would hit at values that large). So a massive increase in computational performance.

That’s the power and beauty of memoization. Also our runtime is now linear with memoization instead of exponential.

A hash table is a key-value lookup. It gives you a way of storing a value against a given key for very quick lookups.

The key and value can be basically any kind of data structure. Strings are popular, but so long as the key has a hash function, it could be anything.

How does hashing work?

At a high level, we want to store our objects in an array. But how do we jump from a string to an actual index in that array? That is what a hash function does.

A hash function takes a string and converts it into an integer, and that integer then gets converted into an index into the array, and hence to the object we are looking for.

So the hash code isn't the index into the array. We go from the key, to the hash code, and then to the index.

And the reason for this is that the array storing the values may be much, much smaller than the number of possible hash codes.

Collisions

One interesting thing about hash codes is that multiple different keys can have the same hash code. There are an infinite number of strings, but only a finite number of hash codes. And because we further reduce hash codes into a smaller array, collisions can also occur when mapping to an index.

That's why a lot of time and effort goes into finding hash algorithms that spread distributions of keys out, in an effort to avoid collisions. But collisions do periodically occur, so we need to deal with them.

The most common way to deal with collisions is chaining, which stores all the entries that collide at a given index in a linked list. When a collision occurs, we walk each element of the linked list until we find the entry whose key matches exactly. That means the original keys are stored in these chains too.

And for those elements where there is no collision, we have a one-element linked list.

Runtime

Usually we assume we have a good hash function that distributes keys evenly, so for the purposes of an interview you can assume lookups are constant time. But with a bad hash function, the worst case is O(n).

Code

HashTable.java

public class HashTable {
    private static final int INITIAL_SIZE = 16;
    private HashEntry[] data; // each slot holds the head of a linked list (chain)

    class HashEntry {
        String key;
        String value;
        HashEntry next;

        HashEntry(String key, String value) {
            this.key = key;
            this.value = value;
            this.next = null;
        }
    }

    HashTable() {
        data = new HashEntry[INITIAL_SIZE];
    }

    void put(String key, String value) {
        // Get the index
        int index = getIndex(key);
        // Create the linked list entry
        HashEntry entry = new HashEntry(key, value);
        // If no entry there - add it
        if (data[index] == null) {
            data[index] = entry;
        }
        // Else handle collision by walking to the end of the chain
        // and adding the new entry as the last element
        else {
            HashEntry temp = data[index];
            while (temp.next != null) {
                temp = temp.next;
            }
            temp.next = entry;
        }
    }

    String get(String key) {
        // Get the index
        int index = getIndex(key);
        // Walk the chain until we find an exact key match
        HashEntry temp = data[index];
        while (temp != null) {
            if (temp.key.equals(key)) {
                return temp.value;
            }
            temp = temp.next;
        }
        // No entry for this key
        return null;
    }

    private int getIndex(String key) {
        // Get the hash code
        int hashCode = key.hashCode();
        // Convert to an index (floorMod keeps the result non-negative
        // even when hashCode() is negative)
        int index = Math.floorMod(hashCode, INITIAL_SIZE);
        // Hack to force collision for testing
        if (key.equals("PeterCollision") || key.equals("PaulCollision")) {
            index = 4;
        }
        return index;
    }

    @Override
    public String toString() {
        StringBuilder hashTableStr = new StringBuilder();
        for (int bucket = 0; bucket < data.length; bucket++) {
            HashEntry entry = data[bucket];
            if (entry == null) {
                continue;
            }
            hashTableStr.append("\n bucket[")
                    .append(bucket)
                    .append("] = ")
                    .append(entry.toString());
            HashEntry temp = entry.next;
            while (temp != null) {
                hashTableStr.append(" -> ");
                hashTableStr.append(temp.toString());
                temp = temp.next;
            }
        }
        return hashTableStr.toString();
    }
}

HashTableTest.java

import junit.framework.Assert;
import org.junit.Before;
import org.junit.Test;

public class HashTableTest {
    private HashTable hashTable;

    @Before
    public void setUp() throws Exception {
        hashTable = new HashTable();
    }

    @Test
    public void PutAndGet() {
        hashTable.put("Peter", "PeterValue");
        Assert.assertEquals("PeterValue", hashTable.get("Peter"));
    }

    @Test
    public void Collision() {
        // these keys will collide (hacked in solution for testing)
        hashTable.put("PeterCollision", "PeterValue");
        hashTable.put("PaulCollision", "PaulValue");
        Assert.assertEquals("PeterValue", hashTable.get("PeterCollision"));
        Assert.assertEquals("PaulValue", hashTable.get("PaulCollision"));
        System.out.println(hashTable.toString());
    }
}

Stacks and Queues are pretty similar data structures. The difference is in how they remove elements.

A Stack is LIFO – Last In, First Out. Much like a stack of books: the last book you place on top is the first one you take off.

A Queue is FIFO – First In, First Out. Much like waiting in line at a supermarket: if you are the first to line up, you are the first to come out.

They are both linear data structures, and both can be implemented with arrays or linked lists.

Here is a Stack and a Queue implemented as linked lists.

Stack.java

public class Stack {
    private class Node {
        private int data;
        private Node next;
        private Node(int data) {
            this.data = data;
        }
    }

    private Node top; // add and remove things here

    public boolean isEmpty() {
        return top == null;
    }

    public int peek() {
        return top.data;
    }

    public void push(int data) {
        // Create a new node
        // Set its next to be the current top
        // Set top to be the new node
        Node node = new Node(data);
        node.next = top;
        top = node;
    }

    public int pop() {
        // Store the value you want to return
        // Set the current top.next to be the new top
        // Return the value (no null check - popping an empty stack will throw)
        int data = top.data;
        top = top.next;
        return data;
    }
}

Queue.java

public class Queue {
    private class Node {
        private int data;
        private Node next;
        private Node(int data) {
            this.data = data;
        }
    }

    private Node head; // remove things here
    private Node tail; // add things here

    public boolean isEmpty() {
        return head == null;
    }

    public int peek() {
        return head.data;
    }

    public void add(int data) {
        // Create a new node
        // Set the current tail.next to point to this new node
        // Then set the new node to be the new tail
        Node newTail = new Node(data);
        if (tail != null) {
            tail.next = newTail;
        }
        tail = newTail;
        // Handle case of first element where head is null
        if (head == null) {
            head = tail;
        }
    }

    public int remove() {
        // Point the head at the current head's next
        int data = head.data;
        head = head.next;
        // Handle queue now being empty
        if (head == null) {
            tail = null;
        }
        return data;
    }
}

Some notes from this excellent video by Gayle McDowell on trees.

What is a binary tree?

A binary tree is a data structure whereby each node has no more than two child nodes.

What is a binary search tree?

A binary search tree is a binary tree with a specific ordering property: in any subtree, all the nodes to the left are less than the root node, which is less than all the nodes to the right.

This ordering makes finding a node really fast, because we have a pretty good idea of which way to go as we search the tree, going left or right and essentially halving our search area at every step.

Inserts

Insert works much like a find. We start at the top, do a comparison, then head down into the appropriate child subtree and repeat the process until we reach an empty spot (a null element).

Note the one problem with this approach: if we insert elements in a particular order (say, already sorted), the tree can become really unbalanced.

And this will make our finds and inserts slow.

There are algorithms that you can use to keep a binary tree balanced, but they are pretty complicated, and in an interview you can generally assume that you are going to have a balanced tree.

Traversing

There are three common ways we walk through a tree.

Preorder traversal: Visit the root of the tree first, then its left subtree, then its right subtree. Top to bottom, left to right.

Inorder traversal: Left subtree first, then the parent, then the right subtree. So left, parent, right.

Postorder traversal: Left, then right, then root.

Typically in binary search trees we do an inorder traversal, because that prints the nodes out in sorted order.
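As a sketch, the three traversals differ only in where the "visit" happens relative to the two recursive calls (the TraversalDemo class here is hypothetical, just for illustration):

```java
import java.util.ArrayList;
import java.util.List;

// Minimal node class for illustrating the three traversal orders.
class TraversalDemo {
    static class Node {
        int data;
        Node left, right;
        Node(int data) { this.data = data; }
    }

    // Root, then left subtree, then right subtree
    static void preorder(Node n, List<Integer> out) {
        if (n == null) return;
        out.add(n.data);
        preorder(n.left, out);
        preorder(n.right, out);
    }

    // Left subtree, then root, then right subtree (sorted order for a BST)
    static void inorder(Node n, List<Integer> out) {
        if (n == null) return;
        inorder(n.left, out);
        out.add(n.data);
        inorder(n.right, out);
    }

    // Left subtree, then right subtree, then root
    static void postorder(Node n, List<Integer> out) {
        if (n == null) return;
        postorder(n.left, out);
        postorder(n.right, out);
        out.add(n.data);
    }
}
```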

Code for Binary Search Tree

NodeBST.java

public class NodeBST {
    NodeBST left, right;
    int data;

    public NodeBST(int data) {
        this.data = data;
    }

    public void insert(int value) {
        // look to the left and the right to see where we want to insert
        if (value <= data) {
            if (left == null) {
                left = new NodeBST(value);
            } else {
                // push down to child and ask it to handle : recursion
                left.insert(value);
            }
        } else {
            if (right == null) {
                right = new NodeBST(value);
            } else {
                right.insert(value);
            }
        }
    }

    public boolean contains(int value) {
        // if we are there, return true
        if (value == data) {
            return true;
        } else if (value < data) {
            // then it should be on the left
            // if there is no left node
            if (left == null) {
                return false;
            } else {
                return left.contains(value);
            }
        } else {
            if (right == null) {
                return false;
            } else {
                return right.contains(value);
            }
        }
    }

    // left child -> parent -> right child
    public void printInOrder() {
        if (left != null) {
            left.printInOrder();
        }
        System.out.println("data = " + data);
        if (right != null) {
            right.printInOrder();
        }
    }
}

NodeBSTTest.java

import junit.framework.Assert;
import org.junit.Before;
import org.junit.Test;

public class NodeBSTTest {
    private NodeBST node;

    @Before
    public void setUp() throws Exception {
        node = new NodeBST(10);
        node.insert(5);
        node.insert(15);
        node.insert(8);
    }

    @Test
    public void Contains() throws Exception {
        Assert.assertTrue(node.contains(5));
        Assert.assertTrue(node.contains(15));
        Assert.assertTrue(node.contains(8));
    }

    @Test
    public void PrintOrder() throws Exception {
        node.printInOrder();
    }
}
