Trees and Binary Search Trees
Introduction
Trees are hierarchical data structures that model relationships like file systems, organization charts, and HTML documents. Binary Search Trees (BSTs) add ordering constraints that enable efficient searching, insertion, and deletion.
Learning Objectives
By the end of this reading, you will be able to:
- Understand tree terminology and properties
- Implement binary trees and BSTs
- Perform tree traversals
- Analyze BST operations
- Recognize when trees are appropriate
1. Tree Basics
A tree is a hierarchical data structure consisting of nodes connected by edges.
Terminology
A ← Root
/ | \
B C D ← Children of A
/| |
E F G ← Leaves (no children)
- Root: Top node (no parent)
- Parent: Node directly above another
- Child: Node directly below another
- Siblings: Nodes with the same parent
- Leaf: Node with no children
- Internal Node: Node with at least one child
- Edge: Connection between nodes
- Path: Sequence of nodes connected by edges
- Depth: Distance from root (root has depth 0)
- Height: Distance to deepest leaf (leaves have height 0)
- Subtree: Tree formed by a node and its descendants
Tree Properties
- A tree with n nodes has n-1 edges
- There's exactly one path between any two nodes
- A tree is connected and acyclic
2. Binary Trees
A binary tree is a tree where each node has at most two children (left and right).
Types of Binary Trees
Full Binary Tree: Every node has 0 or 2 children
A
/ \
B C
/ \
D E
Complete Binary Tree: All levels filled except possibly the last, which is filled left-to-right
A
/ \
B C
/ \ /
D E F
Perfect Binary Tree: All internal nodes have 2 children; all leaves at same level
A
/ \
B C
/ \ / \
D E F G
Balanced Binary Tree: Height of left and right subtrees differ by at most 1
Binary Tree Implementation
class TreeNode:
def __init__(self, val):
self.val = val
self.left = None
self.right = None
class BinaryTree:
def __init__(self):
self.root = None
def height(self, node):
if node is None:
return -1
return 1 + max(self.height(node.left), self.height(node.right))
def size(self, node):
if node is None:
return 0
return 1 + self.size(node.left) + self.size(node.right)
def is_leaf(self, node):
return node is not None and node.left is None and node.right is None
3. Tree Traversals
Depth-First Traversals
Pre-order (Root, Left, Right): Visit root before children
def preorder(node):
if node:
print(node.val, end=' ')
preorder(node.left)
preorder(node.right)
# Use: Copy tree, prefix expression
In-order (Left, Root, Right): Visit root between children
def inorder(node):
if node:
inorder(node.left)
print(node.val, end=' ')
inorder(node.right)
# Use: BST gives sorted order
Post-order (Left, Right, Root): Visit root after children
def postorder(node):
if node:
postorder(node.left)
postorder(node.right)
print(node.val, end=' ')
# Use: Delete tree, postfix expression
Example
1
/ \
2 3
/ \
4 5
Pre-order: 1 2 4 5 3
In-order: 4 2 5 1 3
Post-order: 4 5 2 3 1
Iterative Traversals
def inorder_iterative(root):
result = []
stack = []
current = root
while current or stack:
while current:
stack.append(current)
current = current.left
current = stack.pop()
result.append(current.val)
current = current.right
return result
Breadth-First (Level-Order) Traversal
from collections import deque
def level_order(root):
if not root:
return []
result = []
queue = deque([root])
while queue:
node = queue.popleft()
result.append(node.val)
if node.left:
queue.append(node.left)
if node.right:
queue.append(node.right)
return result
# Example: 1 2 3 4 5
4. Binary Search Trees (BST)
A Binary Search Tree is a binary tree where for each node:
- All values in left subtree < node's value
- All values in right subtree > node's value
8
/ \
3 10
/ \ \
1 6 14
/ \ /
4 7 13
BST Implementation
class BST:
def __init__(self):
self.root = None
def insert(self, val):
"""Insert value into BST: O(h)"""
if not self.root:
self.root = TreeNode(val)
return
current = self.root
while True:
if val < current.val:
if current.left is None:
current.left = TreeNode(val)
return
current = current.left
else:
if current.right is None:
current.right = TreeNode(val)
return
current = current.right
def search(self, val):
"""Search for value: O(h)"""
current = self.root
while current:
if val == current.val:
return current
elif val < current.val:
current = current.left
else:
current = current.right
return None
def contains(self, val):
return self.search(val) is not None
def find_min(self, node=None):
"""Find minimum value: O(h)"""
if node is None:
node = self.root
while node and node.left:
node = node.left
return node
def find_max(self, node=None):
"""Find maximum value: O(h)"""
if node is None:
node = self.root
while node and node.right:
node = node.right
return node
BST Deletion
Three cases:
- Leaf node: Simply remove
- One child: Replace with child
- Two children: Replace with in-order successor (or predecessor)
def delete(self, val):
"""Delete value from BST: O(h)"""
self.root = self._delete_recursive(self.root, val)
def _delete_recursive(self, node, val):
if node is None:
return None
if val < node.val:
node.left = self._delete_recursive(node.left, val)
elif val > node.val:
node.right = self._delete_recursive(node.right, val)
else:
# Node to delete found
if node.left is None:
return node.right
if node.right is None:
return node.left
# Two children: replace with in-order successor
successor = self.find_min(node.right)
node.val = successor.val
node.right = self._delete_recursive(node.right, successor.val)
return node
BST Complexity
| Operation | Average | Worst (unbalanced) |
|---|---|---|
| Search | O(log n) | O(n) |
| Insert | O(log n) | O(n) |
| Delete | O(log n) | O(n) |
| Find Min/Max | O(log n) | O(n) |
Worst case occurs when tree degenerates to a linked list (e.g., inserting sorted data).
5. BST Operations
Find Successor/Predecessor
def inorder_successor(self, node):
"""Find next node in in-order traversal"""
if node.right:
return self.find_min(node.right)
# Go up until we come from a left child
parent = node.parent
while parent and node == parent.right:
node = parent
parent = parent.parent
return parent
def inorder_predecessor(self, node):
"""Find previous node in in-order traversal"""
if node.left:
return self.find_max(node.left)
parent = node.parent
while parent and node == parent.left:
node = parent
parent = parent.parent
return parent
Validate BST
def is_valid_bst(root, min_val=float('-inf'), max_val=float('inf')):
"""Check if tree satisfies BST property"""
if root is None:
return True
if root.val <= min_val or root.val >= max_val:
return False
return (is_valid_bst(root.left, min_val, root.val) and
is_valid_bst(root.right, root.val, max_val))
Lowest Common Ancestor
def lowest_common_ancestor(root, p, q):
"""Find LCA in BST"""
while root:
if p.val < root.val and q.val < root.val:
root = root.left
elif p.val > root.val and q.val > root.val:
root = root.right
else:
return root
return None
6. Self-Balancing Trees (Preview)
To guarantee O(log n) operations, use self-balancing trees:
AVL Trees
- Strictly balanced: heights differ by at most 1
- Rotations on insert/delete to maintain balance
- Faster lookups than Red-Black trees
Red-Black Trees
- Less strictly balanced
- Nodes are colored red or black
- Used in Java TreeMap, C++ std::map
B-Trees
- Nodes can have many children
- Used in databases and file systems
- Optimized for disk access
Detailed coverage in advanced data structures.
7. Common Tree Problems
Maximum Depth
def max_depth(root):
if root is None:
return 0
return 1 + max(max_depth(root.left), max_depth(root.right))
Check if Balanced
def is_balanced(root):
def check(node):
if node is None:
return 0
left = check(node.left)
right = check(node.right)
if left == -1 or right == -1 or abs(left - right) > 1:
return -1
return 1 + max(left, right)
return check(root) != -1
Path Sum
def has_path_sum(root, target):
"""Check if root-to-leaf path sums to target"""
if root is None:
return False
if root.left is None and root.right is None:
return root.val == target
return (has_path_sum(root.left, target - root.val) or
has_path_sum(root.right, target - root.val))
Serialize/Deserialize
def serialize(root):
"""Convert tree to string"""
if root is None:
return "null"
return f"{root.val},{serialize(root.left)},{serialize(root.right)}"
def deserialize(data):
"""Reconstruct tree from string"""
def helper(values):
val = next(values)
if val == "null":
return None
node = TreeNode(int(val))
node.left = helper(values)
node.right = helper(values)
return node
return helper(iter(data.split(',')))
Invert Binary Tree
def invert_tree(root):
if root is None:
return None
root.left, root.right = root.right, root.left
invert_tree(root.left)
invert_tree(root.right)
return root
8. Trees in Practice
File System
class FileNode:
def __init__(self, name, is_file=False):
self.name = name
self.is_file = is_file
self.children = {} # name -> FileNode
def add_child(self, name, is_file=False):
self.children[name] = FileNode(name, is_file)
return self.children[name]
def find(self, path):
parts = path.split('/')
current = self
for part in parts:
if part and part in current.children:
current = current.children[part]
elif part:
return None
return current
Expression Trees
+
/ \
* 3
/ \
2 5
Represents: (2 * 5) + 3 = 13
def evaluate(node):
if node.is_number():
return node.val
left = evaluate(node.left)
right = evaluate(node.right)
if node.val == '+':
return left + right
elif node.val == '*':
return left * right
# ... etc
Trie (Prefix Tree)
class TrieNode:
def __init__(self):
self.children = {}
self.is_word = False
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word):
node = self.root
for char in word:
if char not in node.children:
node.children[char] = TrieNode()
node = node.children[char]
node.is_word = True
def search(self, word):
node = self._find_node(word)
return node is not None and node.is_word
def starts_with(self, prefix):
return self._find_node(prefix) is not None
def _find_node(self, prefix):
node = self.root
for char in prefix:
if char not in node.children:
return None
node = node.children[char]
return node
9. When to Use Trees
Use BST When:
- Need sorted data with O(log n) operations
- Need range queries
- Need in-order traversal (sorted sequence)
- Data changes frequently
Use Hash Table Instead When:
- Only need exact match lookups
- Don't need ordering
- Average case O(1) is sufficient
Use Trees When:
- Data is hierarchical (file system, XML)
- Need to represent relationships (family tree)
- Need to evaluate expressions
Exercises
Basic
Implement pre-order, in-order, and post-order traversals iteratively.
Count the number of leaf nodes in a binary tree.
Find the minimum value in a BST.
Intermediate
Given a sorted array, construct a balanced BST.
Print all paths from root to leaves.
Find the kth smallest element in a BST.
Advanced
Convert a BST to a sorted doubly linked list in-place.
Find the diameter of a binary tree (longest path between any two nodes).
Implement a function to check if two trees are identical.
Summary
- Trees are hierarchical structures with root, parent, child relationships
- Binary trees have at most two children per node
- BST property: left < root < right
- Traversals: pre-order, in-order, post-order, level-order
- BST operations are O(log n) average, O(n) worst case
- Self-balancing trees guarantee O(log n)
- Use for sorted data, hierarchical data, range queries