Heap Data Structure – Simple Introduction to a Complex Topic – Part 1

The heap data structure is a very useful data structure.

In this post, we will be going through a brief introduction on the heap data structure.

In the following tutorials, we will be looking at the different types of heaps, how it is implemented and zoom into its key features.

Like most data structures, the heap data structure is often labelled as an advanced topic. One that only the computer science majors, and the brave dare to learn and understand.

As we unravel the contents, you will hopefully begin to realize that it is not as difficult as you once thought.

This tutorial will not be tied down to a specific language. In another words, regardless of what programming language you use, you will be able to follow, understand and benefit from this tutorial.

Understanding the heap data structure will make you more complete as a programmer.

What is the Heap data structure?

The heap is a binary tree, meaning at the most, each parent has two children. There are two types of heaps: the max and min heap. The root node of a max heap is the highest value in the heap, whereas a min heap has the minimum value allocated to the root node.

The heap data structure is a very useful data structure that every programmer should know well. The heap data structure is used behind the scenes to perform the heap sort. Understanding how the heap works will allow programmers to make wiser decisions when programming in an environment where memory management is crucial.

How is the Heap implemented?

The heap is an abstract data type.

By abstract data type, we are referring to a data type, who has pre-defined behaviors from a user perspective.

For example, lets take a look at the stack data structure.

It is an abstract data type that has the following methods.

  • push(T data) – Add data to the end of the stack.
  • pop() – remove and return the last element from the stack.
  • peek() – return the last element in the stack without removing it.

As for for the implementation, we as consumers, have no idea.

The heap or the stack could be implemented using an array, or a linked list.

It is up to the producer of the CDT or the concrete data type what underlying data structure is used.

Heap Data Structure Primary Use Cases

The heap is a very useful data structure. Some popular algorithms such as the heap sort use the heap data structure behind the scenes. Other use cases include

  • Finding the min, max, median value in a collection of data.
  • Graph algorithms
  • priority queues. The heap is a concrete implementation of the priority queue abstract data type.
  • Schedulers (for finding the first/earliest action item on the schedule).
  • Dijkstra’s shortest path algorithm (no, it is not a typo or something I made up).

API

Like most data structures, the heap has its own set of operations that allows the consumers of the API to interact with the heap.

  • findMax() find the largest data set in the heap
  • findMin() find the smallest data set inside of the heap
  • extractMin() extract the node containing the minimum data in the heap.
  • extractMax() extract the node containing the maximum data in the heap.
  • insert(T data) insert data into the heap
  • remove(T data) remove data from the heap containing that value.

The algorithm will vary depending on whether the current heap is a min or max heap. For the sake of simplicity, we are going to keep the number of  operations fairly small.

You may already be well aware of this, but how you construct the API is entirely up to you, as a developer.

As long as your heap adheres to the fundamental properties of the heap data structure (to be discussed in the next upcoming section), you are good to go.

Heap Data Structure Properties

Below are some of the features of the heap data structures

  • Unlike binary search trees, the left child of a node does not need to be less than the key of the parent. So naturally, the right child of a node does not have to be greater than that of the parent.
  • For each level of the tree, the data is inserted from left to right.
  • A new level of the tree is NOT started until the current level is filled from left to right.

Let me use the example below to clarify.

As you can see, unlike in the binary search tree, the left Child is not necessarily lesser than the right child. As long as the parent is greater than its children, it a valid heap.

If we were to insert another node into the heap below, it will be inserted as a left child of node with key of 5.

The tree currently has a depth of four. Level four will continue to be filled and a new level will not be started until the node with key value 1 (index 6) has both a left and a right child.

heap data structure max heap

Maximum Heap Properties

Lets take a look at the properties that define the max heap.

  • The key of parent nodes are greater than or equal to ( >= ) that of the children.
  • Therefore, the highest key is at the root node.

If this is not sticking, allow me to demonstrate with a picture. Hopefully this will make everything clearer.

Minimum Heap Properties

To put things into perspective, the min heap is the inverted result of the max heap.

Just to be thorough, let us go through each of the key properties of the min heap.

  • The key of the parent nodes are less than or equal to ( <= ) that of the children.
  • Therefore, lowest key is the root node.

Heap Implementation Using Arrays

As stated before, the heap can be implemented via two distinct methods. The first is to implement the heap using an array. The other method is the one that you should be familiar with if you have been following my blog: using nodes.

Implementing the Heap using Arrays

Lets take a look at implementing the heap using an array.

When inserting data to the heap, after inserting, we need to check whether the heap properties are met.

  • If we are working with a minimum heap, we need to ensure that the key of the parent node is greater than the current node.
  • Conversely, if we are implementing the maximum heap, we need to make sure that the key of the parent node is lesser than the current node.

If any of the conditions are violated, we need to perform swaps and check recursively to ensure that the heap properties are being upheld throughout the entire tree.

Below is a sample image of the maximum heap.

heap data structure max heap

If we look at the underlying array housing the heap displayed above, it will look like this

[80, 18, 40, 13, 5, 25, 1, 2, 9]

When we are building the heap using arrays as the underlying data structure, we need to be aware of the following formula for calculating the index of items in the tree.

heap array implementation index formula

In the heap example above, we have the following array.

[80, 18, 40, 13, 5, 25, 1, 2, 9]

Lets take a look at

[80, 18, 40]

The index of 8 is 0. Therefore, i in this case equals zero. No problem understanding this right?

If i is zero, the index of 18 is

(2 * 0) + 1 
0 + 1 
1

Working with 40, its index will be

(2 * 0) + 2 
0 + 2
2

The calculations above are simple examples. Before moving on, I would like to issue a challenge to you.

This will help solidify your understanding of how to implement the heap using arrays as the underlying data structure.

Exercise

Try working with all the items inside of the array and see if this property holds. Afterwards, try constructing your own heap using arrays with different values, and see if the property holds.

Inserting Data into the Maximum Heap

I am going to break down the insertion process into steps.

For demonstration purposes, I will also provide an image of the data structure after each insertion operation.

Step 1: Insert 35.

Array: [35]

max heap array insert 35

Step 2: Insert 10.

Array: [35, 10]

max heap array insert 10

Step 3: Insert 100. Max heap rule is violated: Parent (35) is less than right child (100). Swap 35 and 100.

Array: [35, 10, 100] --> [100, 10, 35]

max heap array insert 100

Step 4: Insert 83. Max heap rule is violated. Parent (10) is less than left child (83) Swap 10 and 83.

Array: [100, 10, 35, 80] --> [100, 80, 35, 10]

max heap array insert 80. Swap 80 and 10.

 

The Heap Data Structure has NOTHING to do with the Heap Memory

I just wanted to address this common misconceptions to those that read all the way up to this section.

Please, erase this misconception from your mind. The heap data structure and the heap memory in your computer are NOT related.

The heap data structure is not used in the heap memory implementation details.

Parting words

In the next post in this series, we will be discussing the implementation details. If this post helped you out, please share the goodies.

By now, you should have enough knowledge to implement a basic heap data structure in your favorite language.

In the future, I will upload the source code of the heap data structure implementation JavaScript, Java and C++.

Hope that this read was informative. In the near future (probably next week), I will be coming back to this post to add more information and details surrounding the heap data structure, namely more information on deleting data from the heap.

Peace out and happy coding!

About the Author Jay

I am a programmer currently living in Seoul, South Korea. I created this blog as an outlet to express what I know / have been learning in text form for retaining knowledge and also to hopefully help the wider community. I am passionate about data structures and algorithms. The back-end and databases is where my heart is at.

follow me on:
3 Shares