Arrays and Dynamic Arrays

Introduction

Arrays are the most fundamental data structure in computer science. They provide direct, indexed access to elements stored in contiguous memory locations. Understanding arrays deeply - how they work, their performance characteristics, and when to use them - is essential for every programmer.

Learning Objectives

By the end of this reading, you will be able to:

  • Understand how arrays are stored in memory
  • Analyze time complexity of array operations
  • Implement dynamic arrays
  • Choose between arrays and other data structures appropriately

1. What is an Array?

An array is a collection of elements stored in contiguous (adjacent) memory locations, accessible by index.

Key Properties

  1. Fixed Size (in traditional arrays): Size determined at creation
  2. Contiguous Memory: Elements stored next to each other
  3. Random Access: Any element accessible in O(1) time via index
  4. Homogeneous: Elements typically of the same type

Memory Layout

Index:     0       1       2       3       4
         ┌───────┬───────┬───────┬───────┬───────┐
Values:  │  10   │  20   │  30   │  40   │  50   │
         └───────┴───────┴───────┴───────┴───────┘
Address: 1000    1004    1008    1012    1016

Address of element i = base_address + (i × element_size)

This formula enables O(1) random access - the key superpower of arrays.
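The formula can be checked in a few lines. The base address and element size below are illustrative values chosen to match the diagram above (4-byte integers starting at address 1000):

```python
# Illustrative values matching the diagram above (4-byte integers)
base_address = 1000
element_size = 4

def address_of(index):
    # Address of element i = base_address + (i * element_size)
    return base_address + index * element_size

print(address_of(0))  # 1000
print(address_of(3))  # 1012
```

Because this is a single multiplication and addition, the cost is the same for index 3 and index 3,000,000.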


2. Array Operations and Complexity

Access by Index: O(1)

# Direct calculation: address = base + index * size
value = arr[3]  # O(1) - always same speed regardless of array size

Search (unsorted): O(n)

def linear_search(arr, target):
    for i, value in enumerate(arr):
        if value == target:
            return i
    return -1  # Not found

# Must check each element in worst case

Search (sorted): O(log n)

def binary_search(arr, target):
    left, right = 0, len(arr) - 1

    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1

    return -1

Insert at End: O(1) amortized

# If space available, just add
arr.append(value)  # O(1) amortized

Insert at Index: O(n)

# Must shift all elements after index
def insert_at(arr, index, value):
    # Shift elements to the right
    arr.append(None)  # Make room
    for i in range(len(arr) - 1, index, -1):
        arr[i] = arr[i - 1]
    arr[index] = value

# Example: Insert 25 at index 2 in [10, 20, 30, 40]
# Step 1: [10, 20, 30, 40, None]
# Step 2: [10, 20, 30, 40, 40]
# Step 3: [10, 20, 30, 30, 40]
# Step 4: [10, 20, 25, 30, 40]

Delete at End: O(1)

arr.pop()  # O(1) - just decrement size

Delete at Index: O(n)

# Must shift all elements after index
def delete_at(arr, index):
    for i in range(index, len(arr) - 1):
        arr[i] = arr[i + 1]
    arr.pop()  # Remove last (now duplicate) element
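A quick trace of the deletion, repeating the function so the snippet runs on its own:

```python
def delete_at(arr, index):
    # Shift everything after index one slot left, then drop the duplicate tail
    for i in range(index, len(arr) - 1):
        arr[i] = arr[i + 1]
    arr.pop()

arr = [10, 20, 30, 40]
delete_at(arr, 1)
print(arr)  # [10, 30, 40]
```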

Complexity Summary

Operation           Average Case   Worst Case
Access by index     O(1)           O(1)
Search (unsorted)   O(n)           O(n)
Search (sorted)     O(log n)       O(log n)
Insert at end       O(1)*          O(n)
Insert at index     O(n)           O(n)
Delete at end       O(1)           O(1)
Delete at index     O(n)           O(n)

*Amortized for dynamic arrays


3. Static vs Dynamic Arrays

Static Arrays (Fixed Size)

// C - fixed size, must specify at declaration
int arr[5];  // Array of 5 integers

// Cannot grow or shrink after creation

Dynamic Arrays (Resizable)

Python lists, Java ArrayList, C++ vector - these grow automatically.

# Python list - dynamic array
arr = []
arr.append(1)  # [1]
arr.append(2)  # [1, 2]
arr.append(3)  # [1, 2, 3]
# Grows as needed

4. Implementing a Dynamic Array

The Resizing Strategy

When full, create a larger array and copy elements.

class DynamicArray:
    def __init__(self):
        self._capacity = 1        # Capacity of underlying storage
        self._size = 0            # Number of elements actually stored
        self._data = [None] * self._capacity

    def __len__(self):
        return self._size

    def __getitem__(self, index):
        if not 0 <= index < self._size:
            raise IndexError("Index out of bounds")
        return self._data[index]

    def __setitem__(self, index, value):
        if not 0 <= index < self._size:
            raise IndexError("Index out of bounds")
        self._data[index] = value

    def append(self, value):
        if self._size == self._capacity:
            self._resize(2 * self._capacity)  # Double capacity

        self._data[self._size] = value
        self._size += 1

    def _resize(self, new_capacity):
        new_data = [None] * new_capacity
        for i in range(self._size):
            new_data[i] = self._data[i]
        self._data = new_data
        self._capacity = new_capacity

    def pop(self):
        if self._size == 0:
            raise IndexError("Pop from empty array")

        value = self._data[self._size - 1]
        self._data[self._size - 1] = None  # Help garbage collection
        self._size -= 1

        # Shrink if too empty (optional)
        if self._size < self._capacity // 4:
            self._resize(self._capacity // 2)

        return value

Why Double the Size?

Amortized Analysis:

If we double capacity each time, the total cost of n appends is O(n), giving O(1) amortized per append.

Append #1:  Resize from 1 to 2, copy 1 element
Append #2:  No resize (capacity=2, size=2)
Append #3:  Resize from 2 to 4, copy 2 elements
Append #4:  No resize
Append #5:  Resize from 4 to 8, copy 4 elements
...

Total copies: 1 + 2 + 4 + 8 + ... + n/2 ≈ n

Cost of n appends: O(n) resizing + O(n) simple appends = O(n)
Amortized per append: O(n) / n = O(1)

If we increased by 1 each time instead:

Total copies: 1 + 2 + 3 + ... + n = O(n²)
Per append: O(n) - much worse!
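The two growth strategies can be compared directly by counting element copies. This is a small sketch; the helper names are illustrative, and capacity is tracked as a plain counter rather than an actual array:

```python
def copies_doubling(n):
    # Count element copies made by n appends with capacity doubling
    capacity, size, copies = 1, 0, 0
    for _ in range(n):
        if size == capacity:
            copies += size      # resize: copy every existing element
            capacity *= 2
        size += 1
    return copies

def copies_grow_by_one(n):
    # Same count when capacity grows by 1 each time the array fills
    capacity, size, copies = 1, 0, 0
    for _ in range(n):
        if size == capacity:
            copies += size
            capacity += 1
        size += 1
    return copies

print(copies_doubling(1024))     # 1023   (about n)
print(copies_grow_by_one(1024))  # 523776 (about n^2 / 2)
```

With doubling, 1024 appends trigger copies totaling 1 + 2 + 4 + ... + 512 = 1023; growing by one triggers 1 + 2 + ... + 1023 instead.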

5. Multidimensional Arrays

2D Arrays (Matrices)

# Create 3x4 matrix
rows, cols = 3, 4
matrix = [[0] * cols for _ in range(rows)]

# Access element at row 1, column 2
matrix[1][2] = 5

# Iterate
for row in range(rows):
    for col in range(cols):
        print(matrix[row][col], end=" ")
    print()

Memory Layout

Row-major order (C, NumPy): Row elements are contiguous

Logical:        Memory:
[a b c]         [a b c d e f g h i]
[d e f]
[g h i]

Column-major order (Fortran, MATLAB): Column elements are contiguous

Logical:        Memory:
[a b c]         [a d g b e h c f i]
[d e f]
[g h i]

This affects cache performance - access patterns should match memory layout.
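The row-major rule can be verified with a flat Python list standing in for the contiguous memory block (an illustrative sketch using the 3x3 layout above):

```python
# Flat list simulating the row-major memory of a 3x3 matrix
rows, cols = 3, 3
memory = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']

def row_major(row, col):
    # Element (row, col) lives at flat index row * cols + col
    return memory[row * cols + col]

print(row_major(0, 0))  # 'a'
print(row_major(1, 2))  # 'f'
print(row_major(2, 1))  # 'h'
```

Iterating the inner loop over columns therefore walks memory sequentially, which is the cache-friendly order for row-major data.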


6. Common Array Patterns

Two Pointers

# Find pair that sums to target (sorted array)
def two_sum_sorted(arr, target):
    left, right = 0, len(arr) - 1

    while left < right:
        current_sum = arr[left] + arr[right]
        if current_sum == target:
            return (left, right)
        elif current_sum < target:
            left += 1
        else:
            right -= 1

    return None

# O(n) time, O(1) space

Sliding Window

# Maximum sum of k consecutive elements
def max_sum_subarray(arr, k):
    if len(arr) < k:
        return None

    # Initial window
    window_sum = sum(arr[:k])
    max_sum = window_sum

    # Slide window
    for i in range(k, len(arr)):
        window_sum += arr[i] - arr[i - k]  # Add new, remove old
        max_sum = max(max_sum, window_sum)

    return max_sum

# O(n) time instead of O(n*k)

Prefix Sum

# Precompute sums for fast range queries
def build_prefix_sum(arr):
    prefix = [0] * (len(arr) + 1)
    for i in range(len(arr)):
        prefix[i + 1] = prefix[i] + arr[i]
    return prefix

def range_sum(prefix, left, right):
    # Sum of arr[left:right+1]
    return prefix[right + 1] - prefix[left]

arr = [1, 2, 3, 4, 5]
prefix = build_prefix_sum(arr)  # [0, 1, 3, 6, 10, 15]
print(range_sum(prefix, 1, 3))  # 2 + 3 + 4 = 9

In-place Reversal

def reverse_array(arr):
    left, right = 0, len(arr) - 1
    while left < right:
        arr[left], arr[right] = arr[right], arr[left]
        left += 1
        right -= 1

# O(n) time, O(1) space

7. Arrays in Different Languages

Python (list)

arr = [1, 2, 3, 4, 5]
arr.append(6)          # Add to end
arr.insert(0, 0)       # Insert at index
arr.pop()              # Remove from end
arr.pop(0)             # Remove at index
len(arr)               # Size
arr[2]                 # Access
arr[1:4]               # Slice [2, 3, 4]

Java (ArrayList)

ArrayList<Integer> arr = new ArrayList<>();
arr.add(1);            // Add to end
arr.add(0, 0);         // Insert at index
arr.remove(arr.size()-1);  // Remove from end
arr.remove(0);         // Remove at index
arr.size();            // Size
arr.get(2);            // Access
arr.set(2, 10);        // Update

C++ (vector)

vector<int> arr = {1, 2, 3, 4, 5};
arr.push_back(6);      // Add to end
arr.insert(arr.begin(), 0);  // Insert at beginning
arr.pop_back();        // Remove from end
arr.erase(arr.begin()); // Remove at index
arr.size();            // Size
arr[2];                // Access

JavaScript

let arr = [1, 2, 3, 4, 5];
arr.push(6);           // Add to end
arr.unshift(0);        // Add to beginning
arr.pop();             // Remove from end
arr.shift();           // Remove from beginning
arr.splice(2, 0, 10);  // Insert at index
arr.length;            // Size
arr[2];                // Access

8. When to Use Arrays

Use Arrays When:

  • You need fast random access by index
  • You mostly access elements at the end
  • Memory locality matters (cache efficiency)
  • You know the size in advance (or it rarely changes)

Consider Alternatives When:

  • Frequent insertions/deletions in the middle → Linked List
  • Need fast lookup by key → Hash Table
  • Need sorted data with fast insert → Balanced BST
  • Need FIFO/LIFO behavior → Queue/Stack

Exercises

Basic

  1. Write a function to find the maximum element in an array.

  2. Write a function to reverse an array in place.

  3. Given an array, return a new array with all elements doubled.

Intermediate

  1. Implement remove_duplicates(arr) that removes duplicates from a sorted array in-place and returns the new length.

  2. Given an array, rotate it k positions to the right.

     rotate([1,2,3,4,5], 2)  # [4,5,1,2,3]

  3. Find the "equilibrium index" where sum of elements on left equals sum on right.

Advanced

  1. Complete the DynamicArray implementation with:

    • insert(index, value)
    • remove(index)
    • __str__ for pretty printing
  2. Implement a 2D dynamic array (matrix) class.

  3. Given an array of integers, find the contiguous subarray with the largest sum (Kadane's algorithm).


Summary

  • Arrays store elements in contiguous memory with O(1) random access
  • Static arrays have fixed size; dynamic arrays resize automatically
  • Insert/delete at arbitrary positions costs O(n) due to shifting
  • Dynamic arrays double in size for O(1) amortized append
  • Common patterns: two pointers, sliding window, prefix sum
  • Arrays excel at random access but struggle with mid-array modifications

Next Reading

Linked Lists →