Arrays and Dynamic Arrays
Introduction
Arrays are the most fundamental data structure in computer science. They provide direct, indexed access to elements stored in contiguous memory locations. Understanding arrays deeply - how they work, their performance characteristics, and when to use them - is essential for every programmer.
Learning Objectives
By the end of this reading, you will be able to:
- Understand how arrays are stored in memory
- Analyze time complexity of array operations
- Implement dynamic arrays
- Choose between arrays and other data structures appropriately
1. What is an Array?
An array is a collection of elements stored in contiguous (adjacent) memory locations, accessible by index.
Key Properties
- Fixed Size (in traditional arrays): Size determined at creation
- Contiguous Memory: Elements stored next to each other
- Random Access: Any element accessible in O(1) time via index
- Homogeneous: Elements typically of the same type
Memory Layout
```
Index:      0       1       2       3       4
        ┌───────┬───────┬───────┬───────┬───────┐
Values: │  10   │  20   │  30   │  40   │  50   │
        └───────┴───────┴───────┴───────┴───────┘
Address:  1000    1004    1008    1012    1016
```
Address of element i = base_address + (i × element_size)
This formula enables O(1) random access - the key superpower of arrays.
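The address arithmetic can be sketched directly in code. The base address and element size below are the illustrative values from the diagram, not real machine addresses:

```python
def element_address(base_address, index, element_size):
    # Address of element i = base_address + (i * element_size)
    return base_address + index * element_size

# Using the diagram's layout: 4-byte integers starting at address 1000
print(element_address(1000, 3, 4))  # -> 1012, the address of arr[3]
```

Because this is a single multiplication and addition, the cost is the same whether the array holds ten elements or ten million.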
2. Array Operations and Complexity
Access by Index: O(1)
```python
# Direct calculation: address = base + index * size
value = arr[3]  # O(1) - always the same speed, regardless of array size
```
Search (unsorted): O(n)
```python
def linear_search(arr, target):
    # Must check every element in the worst case
    for i, value in enumerate(arr):
        if value == target:
            return i
    return -1  # Not found
```
Search (sorted): O(log n)
```python
def binary_search(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1
```
Insert at End: O(1) amortized
```python
# If space is available, just add
arr.append(value)  # O(1) amortized
```
Insert at Index: O(n)
```python
# Must shift all elements after index
def insert_at(arr, index, value):
    # Shift elements to the right
    arr.append(None)  # Make room
    for i in range(len(arr) - 1, index, -1):
        arr[i] = arr[i - 1]
    arr[index] = value

# Example: Insert 25 at index 2 in [10, 20, 30, 40]
# Step 1: [10, 20, 30, 40, None]
# Step 2: [10, 20, 30, 40, 40]
# Step 3: [10, 20, 30, 30, 40]
# Step 4: [10, 20, 25, 30, 40]
```
Delete at End: O(1)
```python
arr.pop()  # O(1) - just decrement size
```
Delete at Index: O(n)
```python
# Must shift all elements after index
def delete_at(arr, index):
    for i in range(index, len(arr) - 1):
        arr[i] = arr[i + 1]
    arr.pop()  # Remove last (now duplicate) element
```
Complexity Summary
| Operation | Average Case | Worst Case |
|---|---|---|
| Access by index | O(1) | O(1) |
| Search (unsorted) | O(n) | O(n) |
| Search (sorted) | O(log n) | O(log n) |
| Insert at end | O(1)* | O(n)* |
| Insert at index | O(n) | O(n) |
| Delete at end | O(1) | O(1) |
| Delete at index | O(n) | O(n) |
*Amortized for dynamic arrays
3. Static vs Dynamic Arrays
Static Arrays (Fixed Size)
```c
// C - fixed size, must specify at declaration
int arr[5];  // Array of 5 integers
// Cannot grow or shrink after creation
```
Dynamic Arrays (Resizable)
Python lists, Java ArrayList, C++ vector - these grow automatically.
```python
# Python list - dynamic array
arr = []
arr.append(1)  # [1]
arr.append(2)  # [1, 2]
arr.append(3)  # [1, 2, 3]
# Grows as needed
```
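You can observe this growth in CPython itself: `sys.getsizeof` reports a list's allocated bytes, which jump only occasionally because most appends reuse spare capacity. (The exact growth pattern is a CPython implementation detail and may vary by version; CPython actually over-allocates by a factor closer to 1.125x than the textbook doubling.)

```python
import sys

# Track the list's allocated size across 100 appends
sizes = []
arr = []
for _ in range(100):
    arr.append(0)
    sizes.append(sys.getsizeof(arr))

# Only a handful of distinct sizes appear: resizes are rare
resizes = len(set(sizes))
print(f"100 appends triggered only {resizes} distinct allocation sizes")
```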
4. Implementing a Dynamic Array
The Resizing Strategy
When full, create a larger array and copy elements.
```python
class DynamicArray:
    def __init__(self):
        self._capacity = 1                    # Allocated slots
        self._size = 0                        # Number of elements in use
        self._data = [None] * self._capacity

    def __len__(self):
        return self._size

    def __getitem__(self, index):
        if not 0 <= index < self._size:
            raise IndexError("Index out of bounds")
        return self._data[index]

    def __setitem__(self, index, value):
        if not 0 <= index < self._size:
            raise IndexError("Index out of bounds")
        self._data[index] = value

    def append(self, value):
        if self._size == self._capacity:
            self._resize(2 * self._capacity)  # Double capacity
        self._data[self._size] = value
        self._size += 1

    def _resize(self, new_capacity):
        new_data = [None] * new_capacity
        for i in range(self._size):
            new_data[i] = self._data[i]
        self._data = new_data
        self._capacity = new_capacity

    def pop(self):
        if self._size == 0:
            raise IndexError("Pop from empty array")
        value = self._data[self._size - 1]
        self._data[self._size - 1] = None     # Help garbage collection
        self._size -= 1
        # Shrink if mostly empty (optional)
        if self._size < self._capacity // 4:
            self._resize(self._capacity // 2)
        return value
```
Why Double the Size?
Amortized Analysis:
If we double capacity each time, the total cost of n appends is O(n), giving O(1) amortized per append.
```
Append #1: No resize (capacity=1, size becomes 1)
Append #2: Resize from 1 to 2, copy 1 element
Append #3: Resize from 2 to 4, copy 2 elements
Append #4: No resize
Append #5: Resize from 4 to 8, copy 4 elements
...
Total copies: 1 + 2 + 4 + 8 + ... + n/2 ≈ n
Cost of n appends: O(n) resizing + O(n) simple appends = O(n)
Amortized per append: O(n) / n = O(1)
```
If we grew the capacity by 1 each time instead:
```
Total copies: 1 + 2 + 3 + ... + (n-1) = O(n²)
Amortized per append: O(n) - much worse!
```
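The two growth policies can be compared with a small sketch that just counts element copies. This is a simplified model of resizing (only the copy count, not the class above), and `count_copies` is an illustrative helper, not a standard function:

```python
def count_copies(n, grow):
    """Count element copies caused by resizes over n appends.

    grow(capacity) returns the new capacity when the array is full.
    """
    capacity, size, copies = 1, 0, 0
    for _ in range(n):
        if size == capacity:
            copies += size            # copy every existing element over
            capacity = grow(capacity)
        size += 1
    return copies

doubling = count_copies(1000, lambda c: 2 * c)        # geometric growth
one_at_a_time = count_copies(1000, lambda c: c + 1)   # grow by 1
print(doubling, one_at_a_time)  # -> 1023 499500
```

Doubling costs about n total copies (1023 for 1000 appends), while growing by one costs about n²/2 (499,500 copies), matching the analysis above.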
5. Multidimensional Arrays
2D Arrays (Matrices)
```python
# Create a 3x4 matrix
rows, cols = 3, 4
matrix = [[0] * cols for _ in range(rows)]

# Access element at row 1, column 2
matrix[1][2] = 5

# Iterate
for row in range(rows):
    for col in range(cols):
        print(matrix[row][col], end=" ")
    print()
```
Memory Layout
Row-major order (C, NumPy): the elements of each row are contiguous
```
Logical:     Memory:
[a b c]      [a b c d e f g h i]
[d e f]
[g h i]
```
Column-major order (Fortran, MATLAB): the elements of each column are contiguous
```
Logical:     Memory:
[a b c]      [a d g b e h c f i]
[d e f]
[g h i]
```
This affects cache performance - access patterns should match memory layout.
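The two layouts correspond to two different index-flattening formulas. A small sketch using the 3×3 grid above (the function names are illustrative):

```python
def row_major_index(row, col, ncols):
    # Consecutive columns of a row sit next to each other
    return row * ncols + col

def col_major_index(row, col, nrows):
    # Consecutive rows of a column sit next to each other
    return col * nrows + row

logical = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
n = 3
row_major = [None] * (n * n)
col_major = [None] * (n * n)
for r in range(n):
    for c in range(n):
        row_major[row_major_index(r, c, n)] = logical[r][c]
        col_major[col_major_index(r, c, n)] = logical[r][c]

print("".join(row_major))  # -> abcdefghi
print("".join(col_major))  # -> adgbehcfi
```

Both produce exactly the memory orders shown in the diagrams, which is why iterating in the "wrong" order strides through memory instead of reading it sequentially.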
6. Common Array Patterns
Two Pointers
```python
# Find a pair that sums to target (sorted array)
def two_sum_sorted(arr, target):
    left, right = 0, len(arr) - 1
    while left < right:
        current_sum = arr[left] + arr[right]
        if current_sum == target:
            return (left, right)
        elif current_sum < target:
            left += 1
        else:
            right -= 1
    return None

# O(n) time, O(1) space
```
Sliding Window
```python
# Maximum sum of k consecutive elements
def max_sum_subarray(arr, k):
    if len(arr) < k:
        return None

    # Initial window
    window_sum = sum(arr[:k])
    max_sum = window_sum

    # Slide the window
    for i in range(k, len(arr)):
        window_sum += arr[i] - arr[i - k]  # Add new, remove old
        max_sum = max(max_sum, window_sum)
    return max_sum

# O(n) time instead of O(n*k)
```
Prefix Sum
```python
# Precompute sums for fast range queries
def build_prefix_sum(arr):
    prefix = [0] * (len(arr) + 1)
    for i in range(len(arr)):
        prefix[i + 1] = prefix[i] + arr[i]
    return prefix

def range_sum(prefix, left, right):
    # Sum of arr[left:right+1]
    return prefix[right + 1] - prefix[left]

arr = [1, 2, 3, 4, 5]
prefix = build_prefix_sum(arr)   # [0, 1, 3, 6, 10, 15]
print(range_sum(prefix, 1, 3))   # 2 + 3 + 4 = 9
```
In-place Reversal
```python
def reverse_array(arr):
    left, right = 0, len(arr) - 1
    while left < right:
        arr[left], arr[right] = arr[right], arr[left]
        left += 1
        right -= 1

# O(n) time, O(1) space
```
7. Arrays in Different Languages
Python (list)
```python
arr = [1, 2, 3, 4, 5]
arr.append(6)     # Add to end
arr.insert(0, 0)  # Insert at index
arr.pop()         # Remove from end
arr.pop(0)        # Remove at index
len(arr)          # Size
arr[2]            # Access
arr[1:4]          # Slice [2, 3, 4]
```
Java (ArrayList)
```java
ArrayList<Integer> arr = new ArrayList<>();
arr.add(1);                 // Add to end
arr.add(0, 0);              // Insert at index
arr.remove(arr.size() - 1); // Remove from end
arr.remove(0);              // Remove at index
arr.size();                 // Size
arr.get(2);                 // Access
arr.set(2, 10);             // Update
```
C++ (vector)
```cpp
vector<int> arr = {1, 2, 3, 4, 5};
arr.push_back(6);            // Add to end
arr.insert(arr.begin(), 0);  // Insert at beginning
arr.pop_back();              // Remove from end
arr.erase(arr.begin());      // Remove first element (begin() + i for index i)
arr.size();                  // Size
arr[2];                      // Access
```
JavaScript
```javascript
let arr = [1, 2, 3, 4, 5];
arr.push(6);          // Add to end
arr.unshift(0);       // Add to beginning
arr.pop();            // Remove from end
arr.shift();          // Remove from beginning
arr.splice(2, 0, 10); // Insert at index
arr.length;           // Size
arr[2];               // Access
```
8. When to Use Arrays
Use Arrays When:
- You need fast random access by index
- You mostly access elements at the end
- Memory locality matters (cache efficiency)
- You know the size in advance (or it rarely changes)
Consider Alternatives When:
- Frequent insertions/deletions in the middle → Linked List
- Need fast lookup by key → Hash Table
- Need sorted data with fast insert → Balanced BST
- Need FIFO/LIFO behavior → Queue/Stack
Exercises
Basic
- Write a function to find the maximum element in an array.
- Write a function to reverse an array in place.
- Given an array, return a new array with all elements doubled.

Intermediate
- Implement `remove_duplicates(arr)` that removes duplicates from a sorted array in-place and returns the new length.
- Given an array, rotate it k positions to the right: `rotate([1,2,3,4,5], 2)` returns `[4,5,1,2,3]`.
- Find the "equilibrium index" where the sum of elements on the left equals the sum on the right.

Advanced
- Complete the DynamicArray implementation with `insert(index, value)`, `remove(index)`, and `__str__` for pretty printing.
- Implement a 2D dynamic array (matrix) class.
- Given an array of integers, find the contiguous subarray with the largest sum (Kadane's algorithm).
Summary
- Arrays store elements in contiguous memory with O(1) random access
- Static arrays have fixed size; dynamic arrays resize automatically
- Insert/delete at arbitrary positions costs O(n) due to shifting
- Dynamic arrays double in size for O(1) amortized append
- Common patterns: two pointers, sliding window, prefix sum
- Arrays excel at random access but struggle with mid-array modifications