Trees and Binary Search Trees

Introduction

Trees are hierarchical data structures that model relationships like file systems, organization charts, and HTML documents. Binary Search Trees (BSTs) add ordering constraints that enable efficient searching, insertion, and deletion.

Learning Objectives

By the end of this reading, you will be able to:

  • Understand tree terminology and properties
  • Implement binary trees and BSTs
  • Perform tree traversals
  • Analyze BST operations
  • Recognize when trees are appropriate

1. Tree Basics

A tree is a hierarchical data structure consisting of nodes connected by edges.

Terminology

            A          ← Root
          / | \
         B  C  D       ← Children of A
        /|     |
       E F     G       ← Leaves (no children)
  • Root: Top node (no parent)
  • Parent: Node directly above another
  • Child: Node directly below another
  • Siblings: Nodes with the same parent
  • Leaf: Node with no children
  • Internal Node: Node with at least one child
  • Edge: Connection between nodes
  • Path: Sequence of nodes connected by edges
  • Depth: Distance from root (root has depth 0)
  • Height: Distance to deepest leaf (leaves have height 0)
  • Subtree: Tree formed by a node and its descendants

Tree Properties

  • A tree with n nodes has n-1 edges
  • There's exactly one path between any two nodes
  • A tree is connected and acyclic

2. Binary Trees

A binary tree is a tree where each node has at most two children (left and right).

Types of Binary Trees

Full Binary Tree: Every node has 0 or 2 children

        A
       / \
      B   C
     / \
    D   E

Complete Binary Tree: All levels filled except possibly the last, which is filled left-to-right

        A
       / \
      B   C
     / \  /
    D  E F

Perfect Binary Tree: All internal nodes have 2 children; all leaves at same level

        A
       / \
      B   C
     / \ / \
    D  E F  G

Balanced Binary Tree: Height of left and right subtrees differ by at most 1

Binary Tree Implementation

class TreeNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None


class BinaryTree:
    def __init__(self):
        self.root = None

    def height(self, node):
        if node is None:
            return -1
        return 1 + max(self.height(node.left), self.height(node.right))

    def size(self, node):
        if node is None:
            return 0
        return 1 + self.size(node.left) + self.size(node.right)

    def is_leaf(self, node):
        return node is not None and node.left is None and node.right is None

3. Tree Traversals

Depth-First Traversals

Pre-order (Root, Left, Right): Visit root before children

def preorder(node):
    if node:
        print(node.val, end=' ')
        preorder(node.left)
        preorder(node.right)

# Use: Copy tree, prefix expression

In-order (Left, Root, Right): Visit root between children

def inorder(node):
    if node:
        inorder(node.left)
        print(node.val, end=' ')
        inorder(node.right)

# Use: BST gives sorted order

Post-order (Left, Right, Root): Visit root after children

def postorder(node):
    if node:
        postorder(node.left)
        postorder(node.right)
        print(node.val, end=' ')

# Use: Delete tree, postfix expression

Example

        1
       / \
      2   3
     / \
    4   5

Pre-order:  1 2 4 5 3
In-order:   4 2 5 1 3
Post-order: 4 5 2 3 1

Iterative Traversals

def inorder_iterative(root):
    result = []
    stack = []
    current = root

    while current or stack:
        while current:
            stack.append(current)
            current = current.left

        current = stack.pop()
        result.append(current.val)
        current = current.right

    return result

Breadth-First (Level-Order) Traversal

from collections import deque

def level_order(root):
    if not root:
        return []

    result = []
    queue = deque([root])

    while queue:
        node = queue.popleft()
        result.append(node.val)

        if node.left:
            queue.append(node.left)
        if node.right:
            queue.append(node.right)

    return result

# Example: 1 2 3 4 5

4. Binary Search Trees (BST)

A Binary Search Tree is a binary tree where for each node:

  • All values in left subtree < node's value
  • All values in right subtree > node's value
        8
       / \
      3   10
     / \    \
    1   6    14
       / \   /
      4   7 13

BST Implementation

class BST:
    def __init__(self):
        self.root = None

    def insert(self, val):
        """Insert value into BST: O(h)"""
        if not self.root:
            self.root = TreeNode(val)
            return

        current = self.root
        while True:
            if val < current.val:
                if current.left is None:
                    current.left = TreeNode(val)
                    return
                current = current.left
            else:
                if current.right is None:
                    current.right = TreeNode(val)
                    return
                current = current.right

    def search(self, val):
        """Search for value: O(h)"""
        current = self.root

        while current:
            if val == current.val:
                return current
            elif val < current.val:
                current = current.left
            else:
                current = current.right

        return None

    def contains(self, val):
        return self.search(val) is not None

    def find_min(self, node=None):
        """Find minimum value: O(h)"""
        if node is None:
            node = self.root
        while node and node.left:
            node = node.left
        return node

    def find_max(self, node=None):
        """Find maximum value: O(h)"""
        if node is None:
            node = self.root
        while node and node.right:
            node = node.right
        return node

BST Deletion

Three cases:

  1. Leaf node: Simply remove
  2. One child: Replace with child
  3. Two children: Replace with in-order successor (or predecessor)
def delete(self, val):
    """Delete value from BST: O(h)"""
    self.root = self._delete_recursive(self.root, val)

def _delete_recursive(self, node, val):
    if node is None:
        return None

    if val < node.val:
        node.left = self._delete_recursive(node.left, val)
    elif val > node.val:
        node.right = self._delete_recursive(node.right, val)
    else:
        # Node to delete found
        if node.left is None:
            return node.right
        if node.right is None:
            return node.left

        # Two children: replace with in-order successor
        successor = self.find_min(node.right)
        node.val = successor.val
        node.right = self._delete_recursive(node.right, successor.val)

    return node

BST Complexity

OperationAverageWorst (unbalanced)
SearchO(log n)O(n)
InsertO(log n)O(n)
DeleteO(log n)O(n)
Find Min/MaxO(log n)O(n)

Worst case occurs when tree degenerates to a linked list (e.g., inserting sorted data).


5. BST Operations

Find Successor/Predecessor

def inorder_successor(self, node):
    """Find next node in in-order traversal"""
    if node.right:
        return self.find_min(node.right)

    # Go up until we come from a left child
    parent = node.parent
    while parent and node == parent.right:
        node = parent
        parent = parent.parent
    return parent

def inorder_predecessor(self, node):
    """Find previous node in in-order traversal"""
    if node.left:
        return self.find_max(node.left)

    parent = node.parent
    while parent and node == parent.left:
        node = parent
        parent = parent.parent
    return parent

Validate BST

def is_valid_bst(root, min_val=float('-inf'), max_val=float('inf')):
    """Check if tree satisfies BST property"""
    if root is None:
        return True

    if root.val <= min_val or root.val >= max_val:
        return False

    return (is_valid_bst(root.left, min_val, root.val) and
            is_valid_bst(root.right, root.val, max_val))

Lowest Common Ancestor

def lowest_common_ancestor(root, p, q):
    """Find LCA in BST"""
    while root:
        if p.val < root.val and q.val < root.val:
            root = root.left
        elif p.val > root.val and q.val > root.val:
            root = root.right
        else:
            return root
    return None

6. Self-Balancing Trees (Preview)

To guarantee O(log n) operations, use self-balancing trees:

AVL Trees

  • Strictly balanced: heights differ by at most 1
  • Rotations on insert/delete to maintain balance
  • Faster lookups than Red-Black trees

Red-Black Trees

  • Less strictly balanced
  • Nodes are colored red or black
  • Used in Java TreeMap, C++ std::map

B-Trees

  • Nodes can have many children
  • Used in databases and file systems
  • Optimized for disk access

Detailed coverage in advanced data structures.


7. Common Tree Problems

Maximum Depth

def max_depth(root):
    if root is None:
        return 0
    return 1 + max(max_depth(root.left), max_depth(root.right))

Check if Balanced

def is_balanced(root):
    def check(node):
        if node is None:
            return 0

        left = check(node.left)
        right = check(node.right)

        if left == -1 or right == -1 or abs(left - right) > 1:
            return -1

        return 1 + max(left, right)

    return check(root) != -1

Path Sum

def has_path_sum(root, target):
    """Check if root-to-leaf path sums to target"""
    if root is None:
        return False

    if root.left is None and root.right is None:
        return root.val == target

    return (has_path_sum(root.left, target - root.val) or
            has_path_sum(root.right, target - root.val))

Serialize/Deserialize

def serialize(root):
    """Convert tree to string"""
    if root is None:
        return "null"
    return f"{root.val},{serialize(root.left)},{serialize(root.right)}"

def deserialize(data):
    """Reconstruct tree from string"""
    def helper(values):
        val = next(values)
        if val == "null":
            return None
        node = TreeNode(int(val))
        node.left = helper(values)
        node.right = helper(values)
        return node

    return helper(iter(data.split(',')))

Invert Binary Tree

def invert_tree(root):
    if root is None:
        return None

    root.left, root.right = root.right, root.left
    invert_tree(root.left)
    invert_tree(root.right)

    return root

8. Trees in Practice

File System

class FileNode:
    def __init__(self, name, is_file=False):
        self.name = name
        self.is_file = is_file
        self.children = {}  # name -> FileNode

    def add_child(self, name, is_file=False):
        self.children[name] = FileNode(name, is_file)
        return self.children[name]

    def find(self, path):
        parts = path.split('/')
        current = self
        for part in parts:
            if part and part in current.children:
                current = current.children[part]
            elif part:
                return None
        return current

Expression Trees

        +
       / \
      *   3
     / \
    2   5

Represents: (2 * 5) + 3 = 13
def evaluate(node):
    if node.is_number():
        return node.val

    left = evaluate(node.left)
    right = evaluate(node.right)

    if node.val == '+':
        return left + right
    elif node.val == '*':
        return left * right
    # ... etc

Trie (Prefix Tree)

class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_word = False


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
        node.is_word = True

    def search(self, word):
        node = self._find_node(word)
        return node is not None and node.is_word

    def starts_with(self, prefix):
        return self._find_node(prefix) is not None

    def _find_node(self, prefix):
        node = self.root
        for char in prefix:
            if char not in node.children:
                return None
            node = node.children[char]
        return node

9. When to Use Trees

Use BST When:

  • Need sorted data with O(log n) operations
  • Need range queries
  • Need in-order traversal (sorted sequence)
  • Data changes frequently

Use Hash Table Instead When:

  • Only need exact match lookups
  • Don't need ordering
  • Average case O(1) is sufficient

Use Trees When:

  • Data is hierarchical (file system, XML)
  • Need to represent relationships (family tree)
  • Need to evaluate expressions

Exercises

Basic

  1. Implement pre-order, in-order, and post-order traversals iteratively.

  2. Count the number of leaf nodes in a binary tree.

  3. Find the minimum value in a BST.

Intermediate

  1. Given a sorted array, construct a balanced BST.

  2. Print all paths from root to leaves.

  3. Find the kth smallest element in a BST.

Advanced

  1. Convert a BST to a sorted doubly linked list in-place.

  2. Find the diameter of a binary tree (longest path between any two nodes).

  3. Implement a function to check if two trees are identical.


Summary

  • Trees are hierarchical structures with root, parent, child relationships
  • Binary trees have at most two children per node
  • BST property: left < root < right
  • Traversals: pre-order, in-order, post-order, level-order
  • BST operations are O(log n) average, O(n) worst case
  • Self-balancing trees guarantee O(log n)
  • Use for sorted data, hierarchical data, range queries

Next Reading

Heaps and Priority Queues →