# Trees: Hierarchies and Heaps
This chapter covers binary trees, BSTs, traversals, balanced tree variants, and the heap data structure behind priority queues.
## What a Tree Is
A tree is a connected graph with no cycles. Each node has a parent (except the root) and zero or more children. No node has two parents.
Trees organize hierarchical data: file systems, DOM, organizational charts, parse trees, decision trees. They're also the backing structure for many fast data structures: BSTs, heaps, tries.
## Binary Trees

In a binary tree, every node has at most two children, conventionally named left and right.
```python
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right
```
Build one manually:
```python
#      1
#     / \
#    2   3
#   / \
#  4   5
root = TreeNode(1,
    TreeNode(2, TreeNode(4), TreeNode(5)),
    TreeNode(3),
)
```
## Traversals
Four standard ways to walk a binary tree.
### Preorder (Root, Left, Right)
```python
def preorder(node):
    if not node:
        return
    print(node.val)
    preorder(node.left)
    preorder(node.right)
```
Output for the tree above: 1 2 4 5 3.
Preorder is useful for serializing a tree (the root comes first, which makes reconstruction easier).
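As a sketch of that serialization idea: preorder with explicit null markers round-trips a tree exactly. `serialize`/`deserialize` are illustrative names, and `TreeNode` is repeated so the snippet runs standalone.

```python
class TreeNode:  # same definition as earlier in the chapter
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def serialize(node):
    # Preorder: root first; "#" marks an absent child.
    if not node:
        return ["#"]
    return [str(node.val)] + serialize(node.left) + serialize(node.right)

def deserialize(tokens):
    it = iter(tokens)
    def build():
        tok = next(it)
        if tok == "#":
            return None
        node = TreeNode(int(tok))
        node.left = build()   # preorder: the left subtree's tokens come next
        node.right = build()
        return node
    return build()

root = TreeNode(1, TreeNode(2, TreeNode(4), TreeNode(5)), TreeNode(3))
print(serialize(root))  # ['1', '2', '4', '#', '#', '5', '#', '#', '3', '#', '#']
```

Because the root always comes first, `build` can consume tokens left to right without any lookahead.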
### Inorder (Left, Root, Right)
```python
def inorder(node):
    if not node:
        return
    inorder(node.left)
    print(node.val)
    inorder(node.right)
```
Output: 4 2 5 1 3.
Inorder traversal of a BST produces its values in sorted order. This is a key property, and several of the applications later in the chapter rely on it.
### Postorder (Left, Right, Root)
```python
def postorder(node):
    if not node:
        return
    postorder(node.left)
    postorder(node.right)
    print(node.val)
```
Output: 4 5 2 3 1.
Useful when you need to process children before their parent: deleting a tree, computing subtree heights or sizes.
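Computing height is a natural postorder example: each node needs both children's answers before it can produce its own. A small sketch, with `TreeNode` repeated so it runs on its own:

```python
class TreeNode:  # same definition as earlier in the chapter
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def height(node):
    # Postorder: resolve both subtrees first, then combine at the parent.
    if not node:
        return -1  # convention: empty tree is -1, so a single leaf is 0
    return 1 + max(height(node.left), height(node.right))

root = TreeNode(1, TreeNode(2, TreeNode(4), TreeNode(5)), TreeNode(3))
print(height(root))  # 2
```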
### Level-Order (BFS)
```python
from collections import deque

def level_order(root):
    if not root:
        return
    queue = deque([root])
    while queue:
        node = queue.popleft()
        print(node.val)
        if node.left: queue.append(node.left)
        if node.right: queue.append(node.right)
```
Output: 1 2 3 4 5.
Level-order uses a queue instead of recursion and visits nodes level by level. Useful for questions like "how many levels?", "what is the rightmost node in each level?", and shortest-path-style problems.
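For the "per level" questions, a common variant drains the queue one full level at a time. A sketch (`levels` is an illustrative name; `TreeNode` repeated for self-containment):

```python
from collections import deque

class TreeNode:  # same definition as earlier in the chapter
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def levels(root):
    # Standard BFS, but the inner loop consumes exactly one level per pass:
    # at the top of the while loop, the queue holds one complete level.
    if not root:
        return []
    out, queue = [], deque([root])
    while queue:
        level = []
        for _ in range(len(queue)):
            node = queue.popleft()
            level.append(node.val)
            if node.left: queue.append(node.left)
            if node.right: queue.append(node.right)
        out.append(level)
    return out

root = TreeNode(1, TreeNode(2, TreeNode(4), TreeNode(5)), TreeNode(3))
print(levels(root))  # [[1], [2, 3], [4, 5]]
```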
## Iterative Traversals
Recursive traversals overflow the call stack on very deep trees (Python's default recursion limit is about 1000). Iterative versions use an explicit stack instead.
### Iterative Inorder
```python
def inorder_iter(root):
    stack = []
    node = root
    while node or stack:
        while node:            # walk as far left as possible
            stack.append(node)
            node = node.left
        node = stack.pop()     # visit the node, then move right
        print(node.val)
        node = node.right
```
Useful to understand; rarely needed in interview practice (recursion is fine for most inputs).
## Binary Search Trees
A BST is a binary tree with an ordering invariant: for every node, all values in the left subtree are less, all values in the right subtree are greater.
```
      5
     / \
    3   7
   / \   \
  1   4   9
```
Operations maintain the invariant.
### Search
```python
def bst_search(root, target):
    node = root
    while node:
        if target == node.val:
            return node
        node = node.left if target < node.val else node.right
    return None
```
O(h), where h is the tree height: O(log n) when balanced, O(n) in the worst case (e.g. inserting sorted values produces a linked-list-shaped tree).
### Insert
```python
def bst_insert(root, val):
    if not root:
        return TreeNode(val)
    if val < root.val:
        root.left = bst_insert(root.left, val)
    elif val > root.val:
        root.right = bst_insert(root.right, val)
    return root  # duplicates fall through both branches and are ignored
```
O(h). Same balanced/unbalanced trade-off.
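A quick sanity check ties insert back to the inorder property: inserting in arbitrary order still yields sorted inorder output. The definitions are repeated (plus an `inorder_vals` helper, an illustrative name) so the snippet stands alone:

```python
class TreeNode:  # same definition as earlier in the chapter
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def bst_insert(root, val):  # same as above
    if not root:
        return TreeNode(val)
    if val < root.val:
        root.left = bst_insert(root.left, val)
    elif val > root.val:
        root.right = bst_insert(root.right, val)
    return root

def inorder_vals(node):
    # Inorder traversal collected into a list instead of printed.
    if not node:
        return []
    return inorder_vals(node.left) + [node.val] + inorder_vals(node.right)

root = None
for v in [5, 3, 7, 1, 4, 9]:
    root = bst_insert(root, v)
print(inorder_vals(root))  # [1, 3, 4, 5, 7, 9]
```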
### Delete
The hardest BST operation. Three cases:
- Node has no children: just remove it.
- Node has one child: replace with the child.
- Node has two children: replace with inorder successor (smallest in right subtree), recursively delete the successor.
```python
def bst_delete(root, val):
    if not root:
        return None
    if val < root.val:
        root.left = bst_delete(root.left, val)
    elif val > root.val:
        root.right = bst_delete(root.right, val)
    else:
        if not root.left: return root.right
        if not root.right: return root.left
        # Two children: find the inorder successor (leftmost in right subtree)
        succ = root.right
        while succ.left:
            succ = succ.left
        root.val = succ.val
        root.right = bst_delete(root.right, succ.val)
    return root
```
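A worked example of the hardest case, a node with two children (definitions repeated so it runs standalone; `inorder_vals` is an illustrative helper):

```python
class TreeNode:  # same definition as earlier in the chapter
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def bst_delete(root, val):  # same as above
    if not root:
        return None
    if val < root.val:
        root.left = bst_delete(root.left, val)
    elif val > root.val:
        root.right = bst_delete(root.right, val)
    else:
        if not root.left: return root.right
        if not root.right: return root.left
        succ = root.right
        while succ.left:
            succ = succ.left
        root.val = succ.val
        root.right = bst_delete(root.right, succ.val)
    return root

def inorder_vals(node):
    if not node:
        return []
    return inorder_vals(node.left) + [node.val] + inorder_vals(node.right)

#        5
#       / \
#      3   7
#     / \   \
#    1   4   9
root = TreeNode(5, TreeNode(3, TreeNode(1), TreeNode(4)),
                   TreeNode(7, None, TreeNode(9)))
root = bst_delete(root, 5)   # 5 has two children; successor 7 takes its place
print(root.val)              # 7
print(inorder_vals(root))    # [1, 3, 4, 7, 9]
```

Note the order stays sorted after the delete, which is exactly what replacing by the inorder successor guarantees.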
## Balanced Trees (AVL, Red-Black)
A BST degrades to O(n) when inputs are sorted or adversarial. Self-balancing variants keep height at O(log n).
- AVL trees: strict balance (heights differ by at most 1). More rotations, faster lookups.
- Red-black trees: looser balance. Fewer rotations, slightly slower lookups but faster inserts.
In Python, the sortedcontainers library gives you balanced-tree-like performance without a tree: its SortedList is backed by a list of sorted sublists.
```python
from sortedcontainers import SortedList

s = SortedList([3, 1, 4, 1, 5])
print(s)                 # SortedList([1, 1, 3, 4, 5])
s.add(2)
print(s[0])              # 1
print(s.bisect_left(3))  # 3 (index of 3 within [1, 1, 2, 3, 4, 5])
```
SortedList supports roughly O(log n) add, remove, and lookup, plus fast access by index. In practice, reach for it when you need a sorted structure with fast updates.
## Heaps
A heap is a complete binary tree where each node's value is at most its children's (min-heap) or at least its children's (max-heap). The important property: the root is always the min (or max).
Heaps back priority queues. Every time you hear "process the smallest N first", think heap.
Python's `heapq` module implements a min-heap on top of a plain list; there is no separate heap class.
```python
import heapq

heap = []
heapq.heappush(heap, 3)
heapq.heappush(heap, 1)
heapq.heappush(heap, 4)
print(heapq.heappop(heap))  # 1 (smallest)
print(heap[0])              # 3 (next smallest; peek)
```
Operations:

| Operation   | Cost       | Notes                            |
|-------------|------------|----------------------------------|
| `heappush`  | O(log n)   | push one item                    |
| `heappop`   | O(log n)   | pop the min                      |
| `heap[0]`   | O(1)       | peek at the min                  |
| `heapify`   | O(n)       | turn a list into a heap in place |
| `nlargest`  | O(n log k) | top k largest                    |
| `nsmallest` | O(n log k) | top k smallest                   |
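`heapify` is worth a quick demo, since O(n) construction beats pushing n items one at a time at O(n log n):

```python
import heapq

data = [5, 3, 8, 1, 9, 2]
heapq.heapify(data)              # O(n), rearranges the list in place
print(data[0])                   # 1, the minimum
print(heapq.nsmallest(3, data))  # [1, 2, 3]
```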
### Max-Heap in Python
heapq is min-only. For a max-heap, negate the priorities:
```python
heapq.heappush(heap, -x)   # store negated
x = -heapq.heappop(heap)   # negate back on the way out
```
Or use a wrapper class with an inverted `__lt__`.
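Both approaches, sketched below. `MaxItem` is an illustrative name, not a standard class; the wrapper matters when values (like strings) can't be negated.

```python
import heapq

# 1) Negation trick, for numeric priorities:
heap = []
for x in [3, 1, 4]:
    heapq.heappush(heap, -x)
largest = -heapq.heappop(heap)
print(largest)  # 4

# 2) Wrapper with inverted __lt__, for values you can't negate:
class MaxItem:
    def __init__(self, val):
        self.val = val
    def __lt__(self, other):
        return self.val > other.val  # inverted: "smaller" means larger value

words = [MaxItem(w) for w in ["apple", "pear", "fig"]]
heapq.heapify(words)
top = heapq.heappop(words).val
print(top)  # pear (the lexicographic max)
```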
### Priority Queue with Data
Store tuples of `(priority, data)`. Python compares tuples element by element.
```python
tasks = []
heapq.heappush(tasks, (2, "write tests"))
heapq.heappush(tasks, (1, "fix bug"))
heapq.heappush(tasks, (3, "refactor"))

while tasks:
    priority, task = heapq.heappop(tasks)
    print(priority, task)
# 1 fix bug
# 2 write tests
# 3 refactor
```
If two priorities tie, the second element breaks the tie. That can throw errors if your data isn't comparable (dicts, for example). Add a tiebreaker (a monotonic counter) to avoid it.
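One common tiebreaker pattern uses `itertools.count`, which is strictly increasing, so ties resolve by insertion order and the payloads are never compared:

```python
import heapq
from itertools import count

tiebreak = count()
tasks = []
heapq.heappush(tasks, (1, next(tiebreak), {"name": "fix bug"}))
heapq.heappush(tasks, (1, next(tiebreak), {"name": "write tests"}))  # same priority

# Tuples compare element by element: priorities tie, counters differ,
# and the (non-comparable) dicts are never reached.
priority, _, task = heapq.heappop(tasks)
print(priority, task["name"])  # 1 fix bug
```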
### When to Reach for a Heap
- Top K problems: find the K largest/smallest items.
- Priority queues: task scheduling, Dijkstra's algorithm.
- Streaming medians (two heaps).
- K-way merge.
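K-way merge is built in: `heapq.merge` lazily merges already-sorted iterables through one small heap, without materializing the inputs.

```python
import heapq

a = [1, 4, 7]
b = [2, 5, 8]
c = [3, 6, 9]
merged = list(heapq.merge(a, b, c))  # streams through a heap of size 3
print(merged)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```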
## Applications
### Top K Largest
```python
import heapq

def top_k(arr, k):
    return heapq.nlargest(k, arr)

print(top_k([3, 1, 4, 1, 5, 9, 2, 6], 3))  # [9, 6, 5]
```
Faster than sorting when k is small: O(n log k) vs O(n log n).
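The O(n log k) bound comes from keeping a min-heap of size k. Here's a sketch of doing it by hand (`top_k_stream` is an illustrative name); the same loop works on streams too large to hold in memory:

```python
import heapq

def top_k_stream(items, k):
    # Min-heap of the k largest seen so far: the root is the smallest
    # of the current top k, so anything larger displaces it.
    heap = []
    for x in items:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            heapq.heapreplace(heap, x)  # pop min + push x in one O(log k) step
    return sorted(heap, reverse=True)

print(top_k_stream([3, 1, 4, 1, 5, 9, 2, 6], 3))  # [9, 6, 5]
```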
### Kth Smallest in a BST
```python
def kth_smallest(root, k):
    stack = []
    node = root
    while node or stack:
        while node:
            stack.append(node)
            node = node.left
        node = stack.pop()
        k -= 1
        if k == 0:
            return node.val
        node = node.right
```
Inorder traversal, stop at the kth pop. O(h + k).
### Validate a BST
```python
def is_bst(node, lo=float("-inf"), hi=float("inf")):
    if not node:
        return True
    if not (lo < node.val < hi):
        return False
    return is_bst(node.left, lo, node.val) and is_bst(node.right, node.val, hi)
```
Pass the valid range down. Each node must fit.
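To see why passing the range down matters: in the tree below every parent-child pair looks fine in isolation, yet the whole tree is not a BST (definitions repeated so the snippet runs standalone):

```python
class TreeNode:  # same definition as earlier in the chapter
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def is_bst(node, lo=float("-inf"), hi=float("inf")):  # same as above
    if not node:
        return True
    if not (lo < node.val < hi):
        return False
    return is_bst(node.left, lo, node.val) and is_bst(node.right, node.val, hi)

#     5
#    / \
#   3   7
#      /
#     4    <- 4 < 7, so a local parent-child check passes,
#              but 4 sits in 5's right subtree and must be > 5
bad = TreeNode(5, TreeNode(3), TreeNode(7, TreeNode(4)))
print(is_bst(bad))  # False
```

A validator that only compares each node to its direct children would wrongly accept this tree.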
## Common Pitfalls
- **Recursing on a very deep tree.** Python raises `RecursionError` at a depth of about 1000. For skewed trees or pathological inputs, iterate.
- **Treating a BST as balanced.** It isn't, unless it's a self-balancing variant (or something like `SortedList`). Worst-case O(n) lurks.
- **Forgetting `heapq` is a min-heap.** Negate values for a max-heap, or use a wrapper.
- **Modifying a heap without `heapq`.** `heap[0] = x` breaks the invariant. Always use `heappush`/`heappop`.
- **Confusing traversal orders.** Write one or two out by hand to cement which does what. Misremembering inorder as preorder is a common bug.
## Next Steps
Continue to `08-graphs.md` to generalize from trees to arbitrary networks.