Trees: Hierarchies and Heaps

This chapter covers binary trees, BSTs, traversals, balanced tree variants, and the heap data structure behind priority queues.

What a Tree Is

A tree is a connected graph with no cycles. Each node has a parent (except the root) and zero or more children. No node has two parents.

Trees organize hierarchical data: file systems, DOM, organizational charts, parse trees, decision trees. They're also the backing structure for many fast data structures: BSTs, heaps, tries.

Binary Trees

Every node has at most two children, conventionally named left and right.

class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

Build one manually:

#       1
#      / \
#     2   3
#    / \
#   4   5

root = TreeNode(1,
    TreeNode(2, TreeNode(4), TreeNode(5)),
    TreeNode(3),
)

Traversals

Four standard ways to walk a binary tree.

Preorder (Root, Left, Right)

def preorder(node):
    if not node:
        return
    print(node.val)
    preorder(node.left)
    preorder(node.right)

Output for the tree above: 1 2 4 5 3.

Preorder is useful for serializing a tree (the root comes first, which makes reconstruction easier).

Inorder (Left, Root, Right)

def inorder(node):
    if not node:
        return
    inorder(node.left)
    print(node.val)
    inorder(node.right)

Output: 4 2 5 1 3.

Inorder traversal of a BST produces sorted output. Key property.

Postorder (Left, Right, Root)

def postorder(node):
    if not node:
        return
    postorder(node.left)
    postorder(node.right)
    print(node.val)

Output: 4 5 2 3 1.

Useful when you need to process children before parent: deleting a tree, computing depths.

Level-Order (BFS)

from collections import deque

def level_order(root):
    if not root:
        return
    queue = deque([root])
    while queue:
        node = queue.popleft()
        print(node.val)
        if node.left:  queue.append(node.left)
        if node.right: queue.append(node.right)

Output: 1 2 3 4 5.

Level-order uses a queue, not recursion. Prints level by level. Useful for "how many levels", "rightmost in each level", and shortest-path-like problems.

Iterative Traversals

Recursive traversals blow the stack on very deep trees (Python's default is ~1000 levels). Iterative versions use an explicit stack.

Iterative Inorder

def inorder_iter(root):
    stack = []
    node = root
    while node or stack:
        while node:
            stack.append(node)
            node = node.left
        node = stack.pop()
        print(node.val)
        node = node.right

Useful to understand; rarely needed in interview practice (recursion is fine for most inputs).

Binary Search Trees

A BST is a binary tree with an ordering invariant: for every node, all values in the left subtree are less, all values in the right subtree are greater.

      5
     / \
    3   7
   / \   \
  1   4   9

Operations maintain the invariant.

def bst_search(root, target):
    node = root
    while node:
        if target == node.val:
            return node
        node = node.left if target < node.val else node.right
    return None

O(h) where h is the tree height. Balanced: O(log n). Worst case (sorted inserts): O(n).

Insert

def bst_insert(root, val):
    if not root:
        return TreeNode(val)
    if val < root.val:
        root.left = bst_insert(root.left, val)
    elif val > root.val:
        root.right = bst_insert(root.right, val)
    return root

O(h). Same balanced/unbalanced trade-off.

Delete

The hardest BST operation. Three cases:

  1. Node has no children: just remove it.
  2. Node has one child: replace with the child.
  3. Node has two children: replace with inorder successor (smallest in right subtree), recursively delete the successor.
def bst_delete(root, val):
    if not root:
        return None
    if val < root.val:
        root.left = bst_delete(root.left, val)
    elif val > root.val:
        root.right = bst_delete(root.right, val)
    else:
        if not root.left:  return root.right
        if not root.right: return root.left
        succ = root.right
        while succ.left:
            succ = succ.left
        root.val = succ.val
        root.right = bst_delete(root.right, succ.val)
    return root

Balanced Trees (AVL, Red-Black)

A BST degrades to O(n) when inputs are sorted or adversarial. Self-balancing variants keep height at O(log n).

  • AVL trees: strict balance (heights differ by at most 1). More rotations, faster lookups.
  • Red-black trees: looser balance. Fewer rotations, slightly slower lookups but faster inserts.

In Python, the sortedcontainers library gives you balanced-tree-like performance via a sorted-list skip-list hybrid:

from sortedcontainers import SortedList

s = SortedList([3, 1, 4, 1, 5])
print(s)                   # SortedList([1, 1, 3, 4, 5])
s.add(2)
print(s[0])                # 1
print(s.bisect_left(3))    # 2

SortedList supports O(log n) insert, delete, and search, plus O(1) index access. In practice, reach for it when you need a sorted structure with fast updates.

Heaps

A heap is a complete binary tree where each node's value is at most its children's (min-heap) or at least its children's (max-heap). The important property: the root is always the min (or max).

Heaps back priority queues. Every time you hear "process the smallest N first", think heap.

Python's heapq is a min-heap, backed by a list. No separate class.

import heapq

heap = []
heapq.heappush(heap, 3)
heapq.heappush(heap, 1)
heapq.heappush(heap, 4)

print(heapq.heappop(heap))    # 1 (smallest)
print(heap[0])                 # 3 (next smallest; peek)

Operations:

heappush    O(log n)
heappop     O(log n)
heap[0]     O(1)          peek at min
heapify     O(n)          turn a list into a heap in-place
nlargest    O(n log k)    top k largest
nsmallest   O(n log k)    top k smallest

Max-Heap in Python

heapq is min-only. For a max-heap, negate the priorities:

heapq.heappush(heap, -x)
x = -heapq.heappop(heap)

Or use a wrapper class with inverted __lt__.

Priority Queue with Data

Store tuples: (priority, data). Python compares tuples element by element.

tasks = []
heapq.heappush(tasks, (2, "write tests"))
heapq.heappush(tasks, (1, "fix bug"))
heapq.heappush(tasks, (3, "refactor"))

while tasks:
    priority, task = heapq.heappop(tasks)
    print(priority, task)
# 1 fix bug
# 2 write tests
# 3 refactor

If two priorities tie, the second element breaks the tie. That can throw errors if your data isn't comparable (dicts, for example). Add a tiebreaker (a monotonic counter) to avoid it.

When to Reach for a Heap

  • Top K problems: find the K largest/smallest items.
  • Priority queues: task scheduling, Dijkstra's algorithm.
  • Streaming medians (two heaps).
  • K-way merge.

Applications

Top K Largest

import heapq

def top_k(arr, k):
    return heapq.nlargest(k, arr)

print(top_k([3, 1, 4, 1, 5, 9, 2, 6], 3))    # [9, 6, 5]

Faster than sorting when k is small: O(n log k) vs O(n log n).

Kth Smallest in a BST

def kth_smallest(root, k):
    stack = []
    node = root
    while node or stack:
        while node:
            stack.append(node)
            node = node.left
        node = stack.pop()
        k -= 1
        if k == 0:
            return node.val
        node = node.right

Inorder traversal, stop at the kth pop. O(h + k).

Validate a BST

def is_bst(node, lo=float("-inf"), hi=float("inf")):
    if not node:
        return True
    if not (lo < node.val < hi):
        return False
    return is_bst(node.left, lo, node.val) and is_bst(node.right, node.val, hi)

Pass the valid range down. Each node must fit.

Common Pitfalls

Recursing on a very deep tree. Python crashes at ~1000 depth. For skewed trees or pathological inputs, iterate.

Treating BST as balanced. It isn't, unless it's SortedList or a self-balancing variant. Worst-case O(n) lurks.

Forgetting heap is a min-heap. Negate values for a max-heap, or use a wrapper.

Modifying a heap without heapq. heap[0] = x breaks the invariant. Always use heappush/heappop.

Confusing traversal orders. Write one or two out by hand to cement which does what. Misremembering inorder as preorder is a common bug.

Next Steps

Continue to 08-graphs.md to generalize from trees to arbitrary networks.