Binary Heaps

Leave a comment

A binary heap is a binary tree heap data structure used in sorting and priority queue algorithms.

What is a binary tree?

Screen Shot 2017-11-10 at 9.05.53 AM.png

A binary tree is a data structure that stores it’s data in the shape of a tree. At the top you have a root node, and underneath that you have at maximum two other nodes. One on the left and one on the right.

Binary trees are useful for all sorts of things. But one area where they really shine is in searching. When you sort a binary tree in certain ways, you can find elements much faster than if you were walking an array or linked list.

What are heaps?

A heap data structure is binary tree that is sorted. We call it a max heap (if the largest number is on the top). And a min heap if the largest numbers are on the bottom.

Screen Shot 2017-11-10 at 9.07.25 AM.png

What’s cool about max heaps is that their structure lends itself very nicely to quickly finding maximum level elements in a queue. Which is why priority queues are often implemented using binary tree structures sorted as heaps.

What is a binary heap?

As you’ve probably guess, binary heaps combine binary trees with heap sorted structures.

Screen Shot 2017-11-10 at 9.07.53 AM.png

In binary heaps any node has at least the value of the nodes children. There is no left/right orientation or distinction in values.

Also children further down the tree can be of greater value than nodes higher up the tree on the other side. That doesn’t matter. The only thing that matters is that the children at equal or less than their parents. That’s it.

Which means every node, it the root of it’s own sub-heap.

Heap representation

Because the shape of a heap is represented so nicely it is easy to store heap nodes in an array.

Screen Shot 2017-11-10 at 9.08.11 AM.png

Top to bottom, left to right. Also, because we know where every element of the array lies, we can access its elements very easily using the above formula.

Finding the maximum

Finding the maximum is easy. It’s simply the first, or highest element at the top of our heap. So it is always the first element in our array.

Inserting

When inserting we stick the new element at the end of our tree (always reading top to bottom, left to right) and stick it in the last element of our array. And if is smaller than the parent we are done.

But what happens if our element is larger that the parent?

Screen Shot 2017-11-10 at 9.08.26 AM.png

For insertions like this we walk up the tree, swapping nodes until our heap constraint (of the child being of less than or equal value to our parent) is satisfied.

Screen Shot 2017-11-10 at 9.08.44 AM.png

And if we insert a number each to it’s parent, we just leave it there. It’s done.

Deletion

Say we want to delete the root of our heap (i.e. we pulled off it’s max value and now we need the heap to resort itself – or heapify).

Screen Shot 2017-11-10 at 9.08.56 AM.png

To delete the max element, we take the max node, swap it with the last leaf, and then delete the last leaf which we know is so easy.

Screen Shot 2017-11-10 at 9.09.16 AM.png

Now we need to resort the heap (because that top element isn’t quite right as it doesn’t represent the max).

We do this by comparing the top with with each left/right subnode, and continuously swapping with the larger of the two.

Screen Shot 2017-11-10 at 9.09.46 AM.png

In this case the left child is larger, so we are going to swap the root to the left.

Screen Shot 2017-11-10 at 9.09.57 AM.png

No we repeat the process with the left hand subtree. We keep swapping the node in question with the largest of it’s children until we can’t swap anymore.

Screen Shot 2017-11-10 at 9.10.13 AM.png

At this point we say the heap is heapified, and finally sorted. It is good to go for another extraction.

Cool things

Some other cool things about binary trees is that this data structure can sort itself very efficiently. i.e. It can sort itself in place. No need to copy all the contents somewhere else and then copy back. When we heapify it does it all with the existing array and simply swapping nodes. Very cool. Very efficient.

Code

Here is an example of a Max Heap implementation for ints in Java.

MaxIntHeap.java

import java.util.Arrays;

public class MaxIntHeap {
    private int capactity = 10;
    private int size = 0;

    int[] items = new int[capactity];

    private int leftChildIndex(int parentIndex) { return 2 * parentIndex + 1; }
    private int rightChildIndex(int parentIndex) { return 2 * parentIndex + 2; }
    private int parentIndex(int childIndex ) { return (childIndex - 1) / 2; }

    private boolean hasLeftChild(int index) { return leftChildIndex(index) < size; }
    private boolean hasRightChild(int index) { return rightChildIndex(index) < size; }
    private boolean hasParent(int index) { return parentIndex(index) >= 0; }

    private int leftChild(int index) { return items[leftChildIndex(index)]; }
    private int rightChild(int index) { return items[rightChildIndex(index)]; }
    private int parent(int index) { return items[parentIndex(index)]; }

    private void swap(int indexOne, int indexTwo) {
        int temp = items[indexOne];
        items[indexOne] = items[indexTwo];
        items[indexTwo] = temp;
    }

    private void ensureCapactity() {
        if (size == capactity) {
            items = Arrays.copyOf(items, capactity * 2);
            capactity *= 2;
        }
    }


    public int extractMax() {
        if (size == 0) throw new IllegalStateException();
        int item = items[0];        // grab the max
        items[0] = items[size - 1]; // copy the bottom to the top
        size--;
        heapifyDown();             // and no because top isn't right, we heapify down
        return item;
    }

    public void add(int item) {
        ensureCapactity();
        items[size] = item;          // put in last spot
        size++;
        heapifyUp();
    }

    public void heapifyUp() {
        int index = size - 1;       // start at last element
        while (hasParent(index) && parent(index) > items[index]) {  // walk up as long as there is a parent and it is bigger than you
            swap(parentIndex(index), index);
            index = parentIndex(index); // walk upwards to next node
        }
    }

    public void heapifyDown() {
        int index = 0;              // starting at the top
        while (hasLeftChild(index)) {  // as long as I have children Note: Only need to check left because if no left, there is no right

            // pick a direction, and get the smaller of the two indexes
            int smallerChildIndex = leftChildIndex(index);
            if (hasRightChild(index) && rightChild(index) < leftChild(index)) {
                // swap right (because we are min heap
                smallerChildIndex = rightChildIndex(index);
            }

            // Now compare

            // if I am smaller than the items of my two children...then everything is good. I am sorted.
            if(items[index] < items[smallerChildIndex]) {
                break;
            } else { //  we are still not in order
                swap(index, smallerChildIndex);         // so swap with the smaller child
            }

            index = smallerChildIndex;              // then move down to smaller child
        }
    }

    public void print() {
        for (int i=0; i < size; i++) {
            System.out.println(i + "[" + items[i] + "]" );
        }
    }
}

MaxIntHeapTest.java

import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;

public class MaxIntHeapTest {

    private MaxIntHeap minHeap;

    @Before
    public void setUp() throws Exception {
        minHeap = new MaxIntHeap();
        minHeap.add(6);
        minHeap.add(5);
        minHeap.add(4);
        minHeap.add(3);
        minHeap.add(2);
        minHeap.add(1);
    }

    @Test
    public void Insert() throws Exception {
        // Remember: The array walks top down / left to right
        Assert.assertEquals(1, minHeap.items[0]);
        Assert.assertEquals(3, minHeap.items[1]);
        Assert.assertEquals(2, minHeap.items[2]);
        Assert.assertEquals(6, minHeap.items[3]);
        Assert.assertEquals(4, minHeap.items[4]);
        Assert.assertEquals(5, minHeap.items[5]);
    }

    @Test
    public void ExtractMin() throws Exception {
        Assert.assertEquals(1, minHeap.extractMax());
        Assert.assertEquals(2, minHeap.extractMax());
        Assert.assertEquals(3, minHeap.extractMax());
        Assert.assertEquals(4, minHeap.extractMax());
        Assert.assertEquals(5, minHeap.extractMax());
        Assert.assertEquals(6, minHeap.extractMax());
    }
}

Links that help

https://cs.stackexchange.com/questions/27860/whats-the-difference-between-a-binary-search-tree-and-a-binary-heap
http://www.geeksforgeeks.org/binary-heap/

Advertisements

Binary trees

1 Comment

Below are some notes I took from the excellent Data Structure course at coursera.

A binary tree is a data structure that is extremely handy for representing certain constructs in computer science.

For example, you can use a binary tree to represent a sentence.

sentence.PNG

A syntax tree for an expression.

syntax-tree.PNG

Or abstract tree for code.

code-tree.PNG

And when you order the nodes on your tree, you can have the number of elements when doing search. Which is why binary search trees are so popular for searching.

binary-search-tree.PNG

What is a binary tree?

A binary tree consists of a node, with with one or two other nodes hanging off it. It’s called binary, because each node can only have at most two other nodes hanging off it. And usually when you sort your tree, you put the lower values on the left branch of the node, the the higher value ones on the right.

That’s what makes search with binary trees so quick. You effectively start in the middle, and at most only really need to search half the tree.

defn-binary-tree.PNG

Common operations

The amount of time it takes to search a tree, is proportional to it’s height. That’s why many binary trees have a method that can tell you how high they are.

height.PNG

Another common operation is size (or how many nodes).

size.PNG

One thing you’ll notice when you look at binary tree algorithms is their heavy use of recursion. Recursion is one of those things that can make your head hurt a bit at first, because from within a method, you continuously call yourself again and again. A dream within a dream.

But it is also what makes binary trees so elegant. With very little code you get a lot of amazing functionality. When is why they are so beautiful to study and understand.

Walking trees

There are different ways you can walk a tree which fall into two general categories:

  • Depth first
    • InOrderTraversal
    • PreOrderTraversal
    • PostOrderTraversal
  • Breadth first
    • LevelTraversal

But I am not really sure yet when one would make more sense over the other. I suspect InOrderTraversal would be common.

So anyways, there are some notes on binary trees.

%d bloggers like this: