Context-Free Grammars and Pushdown Automata

Introduction

Context-free grammars (CFGs) are more powerful than regular expressions and finite automata. They can describe the syntax of most programming languages, including nested structures like balanced parentheses, matching if-then-else statements, and recursive data structures. CFGs form the theoretical foundation of compiler design and parsing algorithms.

Learning Objectives

By the end of this reading, you will be able to:

  • Define and construct context-free grammars
  • Understand derivations, parse trees, and the language generated by a CFG
  • Identify and resolve ambiguous grammars
  • Convert CFGs to Chomsky Normal Form
  • Understand pushdown automata (PDAs) as recognizers for context-free languages
  • Prove the equivalence of CFGs and PDAs
  • Apply the pumping lemma for context-free languages
  • Recognize the limitations of context-free languages

Context-Free Grammars: Formal Definition

Basic Definitions

A context-free grammar (CFG) is a 4-tuple G = (V, Σ, R, S) where:

  1. V is a finite set of variables (also called nonterminals)
  2. Σ is a finite set of terminals (the alphabet), disjoint from V
  3. R is a finite set of production rules of the form A → w where A ∈ V and w ∈ (V ∪ Σ)*
  4. S ∈ V is the start variable

Notation:

  • Variables: Usually uppercase letters A, B, C, S
  • Terminals: Usually lowercase letters a, b, c or digits
  • Strings of terminals: w, x, y, z
  • Strings of variables and terminals: α, β, γ

Derivations

A derivation is a sequence of substitutions using production rules.

We write α ⇒ β if we can derive β from α in one step by applying a single production rule.

We write α ⇒* β if we can derive β from α in zero or more steps.

Notation:

  • α ⇒ β: derives in one step
  • α ⇒* β: derives in zero or more steps
  • α ⇒+ β: derives in one or more steps

Types of derivations:

  • Leftmost derivation: Always replace the leftmost variable
  • Rightmost derivation: Always replace the rightmost variable

Language Generated by a Grammar

The language generated by grammar G is:

L(G) = {w ∈ Σ* : S ⇒* w}

A language L is context-free if there exists a CFG G such that L = L(G).

Example 1: Balanced Parentheses

Design a CFG for the language of balanced parentheses.

Grammar:

  • V = {S}
  • Σ = {(, )}
  • S = S (start variable)
  • R:
    • S → (S)
    • S → SS
    • S → ε

Derivations:

Example: Generate "(())()"

S ⇒ SS          (use S → SS)
  ⇒ (S)S        (use S → (S) on first S)
  ⇒ ((S))S      (use S → (S))
  ⇒ (())S       (use S → ε)
  ⇒ (())(S)     (use S → (S))
  ⇒ (())()      (use S → ε)

The same leftmost derivation, written on one line:

S ⇒ SS ⇒ (S)S ⇒ ((S))S ⇒ (())S ⇒ (())(S) ⇒ (())()
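This derivation can be replayed mechanically. The sketch below (Python; the helper name `leftmost_step` and the hard-coded list of right-hand sides are illustrative choices, not part of the formalism) rewrites the leftmost S at each step:

```python
def leftmost_step(form, replacement, variable="S"):
    """Replace the leftmost occurrence of `variable` in the sentential form."""
    i = form.index(variable)
    return form[:i] + replacement + form[i + 1:]

form = "S"
# The right-hand sides used at each step of the derivation above:
for rhs in ["SS", "(S)", "(S)", "", "(S)", ""]:
    form = leftmost_step(form, rhs)
    print(form)
# The final sentential form is "(())()"
```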

Proof that L(G) = {balanced parentheses}:

  • (⊆) By induction on derivation length, every string derived is balanced
  • (⊇) By induction on string structure, every balanced string can be derived

Example 2: L = {0ⁿ1ⁿ : n ≥ 0}

Grammar:

  • V = {S}
  • Σ = {0, 1}
  • R:
    • S → 0S1
    • S → ε

Derivation of "000111":

S ⇒ 0S1 ⇒ 00S11 ⇒ 000S111 ⇒ 000111

This language is context-free but not regular (proved via pumping lemma in previous reading).

Example 3: Arithmetic Expressions

Design a CFG for simple arithmetic expressions with +, *, and parentheses.

Grammar:

  • V = {E, T, F} (Expression, Term, Factor)
  • Σ = {+, *, (, ), a} (a represents variables/numbers)
  • Start: E
  • R:
    • E → E + T | T
    • T → T * F | F
    • F → (E) | a

Derivation of "a + a * a":

E ⇒ E + T
  ⇒ T + T
  ⇒ F + T
  ⇒ a + T
  ⇒ a + T * F
  ⇒ a + F * F
  ⇒ a + a * F
  ⇒ a + a * a

This grammar captures operator precedence: * binds tighter than +.
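One way to see how the E/T/F grammar encodes precedence is to run it. Below is a minimal recursive-descent parser sketch for this grammar, with the left recursion replaced by iteration (the standard transformation, which also yields left associativity); the token 'a' stands for any operand, as in the grammar:

```python
def parse(s):
    """Parse E → E+T | T;  T → T*F | F;  F → (E) | a  into a nested-tuple AST."""
    pos = 0
    def peek():
        return s[pos] if pos < len(s) else None
    def eat(c):
        nonlocal pos
        assert peek() == c, f"expected {c!r} at position {pos}"
        pos += 1
    def expr():                       # E → T (+ T)*
        node = term()
        while peek() == "+":
            eat("+")
            node = ("+", node, term())
        return node
    def term():                       # T → F (* F)*
        node = factor()
        while peek() == "*":
            eat("*")
            node = ("*", node, factor())
        return node
    def factor():                     # F → (E) | a
        if peek() == "(":
            eat("(")
            node = expr()
            eat(")")
            return node
        eat("a")
        return "a"
    tree = expr()
    assert pos == len(s), "trailing input"
    return tree

print(parse("a+a*a"))   # ('+', 'a', ('*', 'a', 'a')) — the * groups tighter
```

Because T (the * level) sits below E (the + level), the product is built first and becomes a subtree of the sum, exactly mirroring the precedence argument in the text.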

Example 4: Programming Language Constructs

If-then-else statements:

S → if E then S
S → if E then S else S
S → other
E → condition

While loops:

S → while E do S
S → begin S end
S → S; S
S → a := b

Parse Trees

A parse tree is a graphical representation of a derivation.

Structure:

  • Root: Start variable S
  • Internal nodes: Variables
  • Leaves: Terminals (reading left-to-right gives the derived string)
  • Parent-child relationship: If A → α is used, A is parent and symbols in α are children

Example 5: Parse Tree for "a + a * a"

Using the arithmetic grammar from Example 3:

        E
    ┌───┼───┐
    E   +   T
    |   ┌───┼───┐
    T   T   *   F
    |   |       |
    F   F       a
    |   |
    a   a

Reading leaves left-to-right: a + a * a

Relationship Between Derivations and Parse Trees

  • Each parse tree corresponds to exactly one leftmost derivation and one rightmost derivation (and possibly many others)
  • All of these derivations represent the same syntactic structure
  • Parse trees capture the structural meaning, while derivations show the process

Theorem: A string w has more than one parse tree if and only if it has more than one leftmost derivation.

Ambiguity

Definition

A CFG is ambiguous if there exists a string with two or more distinct parse trees (equivalently, two or more distinct leftmost derivations).

A CFG is unambiguous if every string has at most one parse tree.

Example 6: Ambiguous Grammar

Consider this grammar for arithmetic expressions:

E → E + E
E → E * E
E → (E)
E → a

The string "a + a * a" has (at least) two parse trees:

Parse Tree 1 (+ at root):

        E
    ┌───┼───┐
    E   +   E
    |   ┌───┼───┐
    a   E   *   E
        |       |
        a       a

Parse Tree 2 (* at root):

        E
    ┌───┼───┐
    E   *   E
┌───┼───┐   |
E   +   E   a
|       |
a       a

These represent different interpretations: (a+a)*a vs. a+(a*a).

Resolving Ambiguity

Strategy 1: Rewrite the grammar (as in Example 3)

  • Use separate variables for different precedence levels
  • Encode associativity in recursive structure

Strategy 2: Add disambiguating rules (used in parser generators)

  • Specify operator precedence: * before +
  • Specify associativity: left-to-right
  • Example (yacc/bison): %left '+' %left '*'

Strategy 3: Modify the language

  • Sometimes ambiguity is inherent in the language
  • Example: "dangling else" problem

The Dangling Else Problem

Consider:

S → if E then S
S → if E then S else S
S → other

The string "if E₁ then if E₂ then S₁ else S₂" has two parse trees:

Parse 1: else matches second if

           S
           |
    ┌──────┼──────┬──────┬─────┐
   if     E₁    then     S
                         |
                ┌────────┼──────┬──────┬──────┬─────┐
               if       E₂    then    S₁    else    S₂

Parse 2: else matches first if

                S
                |
    ┌───────────┼───────┬──────┬──────┬──────┬─────┐
   if          E₁     then     S     else    S₂
                               |
                        ┌──────┼──────┬──────┬─────┐
                       if     E₂    then    S₁

Standard resolution: "else matches the nearest unmatched if" (Parse 1)

This can be encoded in the grammar:

S → Matched | Unmatched
Matched → if E then Matched else Matched | other
Unmatched → if E then S | if E then Matched else Unmatched

Inherently Ambiguous Languages

Some context-free languages have no unambiguous grammar.

Example: L = {aⁿbⁿcᵐdᵐ : n,m ≥ 1} ∪ {aⁿbᵐcᵐdⁿ : n,m ≥ 1}

Every CFG for this language is ambiguous. The string a²b²c²d² belongs to both parts, and any grammar must derive it in fundamentally different ways.

Chomsky Normal Form

Definition

A CFG is in Chomsky Normal Form (CNF) if every production rule is of the form:

  • A → BC (where B, C are variables, not the start variable)
  • A → a (where a is a terminal)
  • S → ε (only if ε ∈ L(G), and S doesn't appear on the right side of any rule)

Importance

  • Simplifies parsing algorithms: CYK algorithm runs in O(n³) for CNF grammars
  • Theoretical tool: Many proofs are easier with CNF
  • Normal form theorem: Every CFG can be converted to CNF

Conversion to CNF

Algorithm:

  1. Add a new start variable: Create S₀ → S so the start variable never appears on the right side of any rule
  2. Eliminate ε-rules: Remove rules of the form A → ε (for A ≠ S₀)
    • First find the nullable variables: those A with A ⇒* ε
    • For each rule B → αAβ with A nullable, add B → αβ, covering every combination of nullable occurrences
  3. Eliminate unit rules: Remove rules of the form A → B
    • For each pair of rules A → B and B → α, add A → α
    • Repeat until no unit rules remain
  4. Convert long rules: Replace each A → u₁u₂...uₖ (k ≥ 3) with:
    • A → u₁A₁
    • A₁ → u₂A₂
    • ...
    • Aₖ₋₂ → uₖ₋₁uₖ
  5. Convert terminals in long rules: In every rule whose right side has length ≥ 2, replace each terminal uᵢ with a fresh variable Uᵢ and add:
    • Uᵢ → uᵢ
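Step 2 depends on knowing which variables are nullable, and that set can be computed by a simple fixpoint iteration. The sketch below uses a dict-of-tuples grammar encoding of my own choosing; the rules shown are the ones from Example 7:

```python
def nullable_variables(rules):
    """Fixpoint computation of the nullable set: all A with A ⇒* ε.
    `rules` maps each variable to a list of right-hand sides (tuples of symbols)."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, bodies in rules.items():
            if var in nullable:
                continue
            for body in bodies:
                # A body is nullable if every symbol in it is a nullable variable;
                # the empty body () passes trivially.
                if all(sym in nullable for sym in body):
                    nullable.add(var)
                    changed = True
                    break
    return nullable

rules = {"S": [("A", "S", "A"), ("a", "B")],
         "A": [("B",), ("S",)],
         "B": [("b",), ()]}
print(nullable_variables(rules))   # B (directly) and A (via A → B); S is not nullable
```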

Example 7: Converting to CNF

Original grammar:

S → ASA | aB
A → B | S
B → b | ε

Step 1: Add new start variable

S₀ → S
S → ASA | aB
A → B | S
B → b | ε

Step 2: Eliminate ε-rules (B is nullable)

S₀ → S
S → ASA | aB | a    (added S → a since B is nullable)
A → B | S | ε       (A becomes nullable)
B → b

Now A is nullable too:

S₀ → S
S → ASA | AS | SA | S | aB | a
A → B | S
B → b

Step 3: Eliminate unit rules (S₀ → S, S → S, A → B, A → S)

S₀ → ASA | AS | SA | aB | a
S → ASA | AS | SA | aB | a
A → b | ASA | AS | SA | aB | a
B → b

(Note that b reaches A via the unit rule A → B, but not S or S₀, since S has no rule deriving b directly.)

Step 4: Convert long rules (ASA has length 3)

S₀ → AX₁ | AS | SA | aB | a  where X₁ → SA
S → AX₁ | AS | SA | aB | a
A → b | AX₁ | AS | SA | aB | a
X₁ → SA
B → b

Step 5: Convert terminal rules (like aB)

S₀ → AX₁ | AS | SA | UB | a  where U → a
S → AX₁ | AS | SA | UB | a
A → b | AX₁ | AS | SA | UB | a
X₁ → SA
B → b
U → a

Final CNF:

S₀ → AX₁ | AS | SA | UB | a
S → AX₁ | AS | SA | UB | a
A → b | AX₁ | AS | SA | UB | a
X₁ → SA
B → b
U → a

Pushdown Automata

Motivation

Regular languages are recognized by finite automata, which have only a fixed, finite amount of memory. Context-free languages need more memory, but not arbitrary memory: a stack suffices.

A pushdown automaton (PDA) is an NFA augmented with a stack.

Formal Definition

A pushdown automaton is a 6-tuple P = (Q, Σ, Γ, δ, q₀, F) where:

  1. Q is a finite set of states
  2. Σ is the input alphabet
  3. Γ is the stack alphabet
  4. δ: Q × Σ_ε × Γ_ε → P(Q × Γ_ε) is the transition function
  5. q₀ ∈ Q is the start state
  6. F ⊆ Q is the set of accept states

Notation: Σ_ε = Σ ∪ {ε} and Γ_ε = Γ ∪ {ε}

Transition function: δ(q, a, X) = {(p₁, γ₁), (p₂, γ₂), ...} means:

  • In state q
  • Reading input symbol a (or ε)
  • With X on top of stack (or ε to ignore stack)
  • Can transition to state pᵢ and push γᵢ onto stack

Stack operations:

  • Push X: δ(q, a, ε) = {(p, X)} or δ(q, a, Y) = {(p, XY)}
  • Pop X: δ(q, a, X) = {(p, ε)}
  • No-op: δ(q, a, X) = {(p, X)}

Acceptance

There are two standard conventions for when a PDA accepts a string w:

  1. By final state: After reading all of w, the PDA is in an accept state (the stack contents don't matter)
  2. By empty stack: After reading all of w, the stack is empty (the state doesn't matter)

These two notions are equivalent in power: a PDA accepting by one convention can be converted into one accepting by the other.

Example 8: PDA for {0ⁿ1ⁿ : n ≥ 0}

Informal description:

  1. Push each 0 onto the stack
  2. Pop a 0 for each 1 read
  3. Accept if stack becomes empty

Formal specification:

  • Q = {q₀, q₁, q₂, q₃}
  • Σ = {0, 1}
  • Γ = {0, $} ($ is stack bottom marker)
  • Start: q₀
  • Accept: {q₃}

Transitions:

δ(q₀, ε, ε) = {(q₁, $)}       // Push $ onto empty stack
δ(q₁, 0, ε) = {(q₁, 0)}       // Push 0 for each 0 read
δ(q₁, 1, 0) = {(q₂, ε)}       // First 1: switch to popping mode
δ(q₂, 1, 0) = {(q₂, ε)}       // Pop 0 for each further 1 read
δ(q₁, ε, $) = {(q₃, ε)}       // Accept the empty string
δ(q₂, ε, $) = {(q₃, ε)}       // Accept once every 0 is matched

The separate popping state q₂ is essential: if a single state both pushed and popped, the PDA would also accept interleaved strings like 0101.

State diagram:

        ε, ε → $          1, 0 → ε          ε, $ → ε
   ┌────┐         ┌────┐            ┌────┐            ┌────┐
→  │ q₀ │────────►│ q₁ │───────────►│ q₂ │───────────►│(q₃)│
   └────┘         └────┘            └────┘            └────┘
                 ↺ 0, ε → 0        ↺ 1, 0 → ε

(q₁ also has an edge labeled ε, $ → ε directly to q₃, so the empty string is accepted.)

Trace for "0011":

Configuration: (state, unread input, stack)
(q₀, 0011, ε)   ⊢  (q₁, 0011, $)
                ⊢  (q₁, 011, 0$)
                ⊢  (q₁, 11, 00$)
                ⊢  (q₂, 1, 0$)
                ⊢  (q₂, ε, $)
                ⊢  (q₃, ε, ε)  ✓ Accept
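A PDA's nondeterministic computation can be simulated by searching over configurations. The sketch below is one way to do this (breadth-first search with a rough stack-depth bound; the dict encoding of δ and the state names are my own conventions). The transition table realizes the push-then-pop strategy for {0ⁿ1ⁿ} with a separate popping state, so interleaved strings like 0101 are rejected:

```python
from collections import deque

def pda_accepts(delta, accept, w, start="q0"):
    """BFS over PDA configurations (state, input_position, stack).
    `delta` maps (state, input_symbol_or_'', stack_top_or_'') to a set of
    (new_state, push_string); the push string's first character is the new top."""
    seen = set()
    queue = deque([(start, 0, "")])
    while queue:
        state, i, stack = queue.popleft()
        if (state, i, stack) in seen or len(stack) > len(w) + 2:
            continue                      # prune revisits and runaway stacks
        seen.add((state, i, stack))
        if i == len(w) and state in accept:
            return True                   # acceptance by final state
        for (q, a, x), moves in delta.items():
            if q != state:
                continue
            if a and (i >= len(w) or w[i] != a):
                continue                  # input symbol must match (ε always does)
            if x and not stack.startswith(x):
                continue                  # required stack top must match
            for (p, push) in moves:
                queue.append((p, i + (1 if a else 0), push + stack[len(x):]))
    return False

# PDA for {0^n 1^n}: push 0s in q1, pop 1s in q2, accept in q3.
delta = {("q0", "", ""): {("q1", "$")},
         ("q1", "0", ""): {("q1", "0")},
         ("q1", "1", "0"): {("q2", "")},
         ("q2", "1", "0"): {("q2", "")},
         ("q1", "", "$"): {("q3", "")},
         ("q2", "", "$"): {("q3", "")}}
print(pda_accepts(delta, {"q3"}, "0011"), pda_accepts(delta, {"q3"}, "0101"))
# True False
```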

Example 9: PDA for Balanced Parentheses

Strategy:

  • Push '(' onto stack when seen
  • Pop from stack when ')' is seen
  • Accept if stack is empty at end

Transitions:

δ(q₀, ε, ε) = {(q₁, $)}         // Initialize stack
δ(q₁, '(', ε) = {(q₁, '(')}     // Push (
δ(q₁, ')', '(') = {(q₁, ε)}     // Pop ( when seeing )
δ(q₁, ε, $) = {(q₂, ε)}         // Accept when done

Example 10: PDA for {wcwᴿ : w ∈ {a,b}*}

This language consists of strings with a 'c' in the middle, where the right half is the reverse of the left half.

Strategy:

  • Push symbols onto stack until 'c' is seen
  • Then pop and match with remaining input
  • Accept if stack becomes empty

Transitions:

δ(q₀, ε, ε) = {(q₁, $)}           // Initialize
δ(q₁, a, ε) = {(q₁, a)}           // Push a's and b's
δ(q₁, b, ε) = {(q₁, b)}
δ(q₁, c, ε) = {(q₂, ε)}           // Switch to matching mode
δ(q₂, a, a) = {(q₂, ε)}           // Pop and match
δ(q₂, b, b) = {(q₂, ε)}
δ(q₂, ε, $) = {(q₃, ε)}           // Accept

Trace for "abcba":

(q₀, abcba, ε)  ⊢  (q₁, abcba, $)
                ⊢  (q₁, bcba, a$)
                ⊢  (q₁, cba, ba$)
                ⊢  (q₂, ba, ba$)
                ⊢  (q₂, a, a$)
                ⊢  (q₂, ε, $)
                ⊢  (q₃, ε, ε)  ✓

Equivalence of CFGs and PDAs

Theorem: CFG ↔ PDA Equivalence

Theorem: A language is context-free if and only if some PDA recognizes it.

Proof outline:

  1. CFG → PDA: Construct PDA that simulates leftmost derivations
  2. PDA → CFG: Construct CFG that simulates PDA computation

CFG to PDA Construction

Given CFG G = (V, Σ, R, S), construct PDA P:

Idea: Use stack to store sentential form, nondeterministically apply productions.

Algorithm:

  1. Push $ then S onto stack
  2. Repeat:
    • If top of stack is variable A, nondeterministically choose A → w and replace A with w
    • If top of stack is terminal a, read a from input and pop stack
    • If top of stack is $, accept

Formal construction:

  • States: {q_start, q_loop, q_accept}
  • δ(q_start, ε, ε) = {(q_loop, S$)}
  • For each rule A → w: δ(q_loop, ε, A) = {(q_loop, w)}
  • For each terminal a: δ(q_loop, a, a) = {(q_loop, ε)}
  • δ(q_loop, ε, $) = {(q_accept, ε)}
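The formal construction above is mechanical enough to code directly. The sketch below builds the transition relation as a Python dict; the (state, push_string) encoding and the state names are illustrative choices, not part of the construction:

```python
def cfg_to_pda(rules, start="S"):
    """Build the top-down PDA's transition relation from a CFG.
    `rules` maps each variable to a list of right-hand-side strings;
    `delta` maps (state, input, stack_top) to a set of (state, push_string)."""
    delta = {}
    def add(key, value):
        delta.setdefault(key, set()).add(value)
    # Push the start variable above the bottom marker $.
    add(("q_start", "", ""), ("q_loop", start + "$"))
    terminals = set()
    for var, bodies in rules.items():
        for body in bodies:
            # Rule A → w becomes an ε-move replacing A on the stack with w.
            add(("q_loop", "", var), ("q_loop", body))
            terminals |= {s for s in body if s not in rules}
    for a in terminals:
        # Matching a terminal consumes one input symbol and pops it.
        add(("q_loop", a, a), ("q_loop", ""))
    add(("q_loop", "", "$"), ("q_accept", ""))
    return delta

delta = cfg_to_pda({"S": ["0S1", ""]})      # grammar for {0^n 1^n}
print(delta[("q_loop", "", "S")])           # both productions become ε-moves
```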

PDA to CFG Construction

Given PDA P, construct CFG G such that L(G) = L(P).

Idea: Variable A_pq generates exactly the strings that can take the PDA from state p with an empty stack to state q with an empty stack.

Construction (sketch):

  • First modify P so that it has a single accept state, empties its stack before accepting, and either pushes or pops exactly one symbol on each move
  • For each pair p, q ∈ Q, create a variable A_pq
  • Start variable: A_{q₀, q_accept}
  • Productions: A_pq → A_pr A_rq for every r ∈ Q; A_pq → a A_rs b whenever a move from p reading a pushes some X going to r, and a move from s reading b pops that X going to q; and A_pp → ε for every p

This construction is complex; see the textbook for full details.

Pumping Lemma for Context-Free Languages

The Pumping Lemma

Not all languages are context-free. The pumping lemma for CFLs helps prove non-context-freeness.

Theorem (Pumping Lemma for CFLs):

If L is a context-free language, then there exists p ≥ 1 (pumping length) such that every string s ∈ L with |s| ≥ p can be divided into five parts, s = uvxyz, satisfying:

  1. |vxy| ≤ p
  2. |vy| > 0 (at least one of v or y is non-empty)
  3. For all i ≥ 0, uvⁱxyⁱz ∈ L

Intuition: Parse trees for long strings must have a repeated variable on some path, creating a "pumpable" region.

Proof of Pumping Lemma

Proof sketch:

  • Let G be a CFG in CNF for L with |V| variables
  • Let p = 2^|V|
  • Any string s with |s| ≥ p has a parse tree of height > |V|
  • By pigeonhole principle, some variable R repeats on a path from root to leaf
  • This creates two subtrees rooted at R, one inside the other
  • The inner subtree generates x, the region between generates v and y
  • These can be pumped independently

Example 11: L = {aⁿbⁿcⁿ : n ≥ 0} is Not Context-Free

Proof: Assume L is context-free with pumping length p.

Choose: s = aᵖbᵖcᵖ ∈ L (clearly |s| = 3p ≥ p)

Consider any division s = uvxyz where |vxy| ≤ p and |vy| > 0:

Case 1: vxy contains only one type of symbol (all a's, all b's, or all c's)

  • Then pumping (i=2) increases count of only one symbol
  • Result: unequal numbers → uv²xy²z ∉ L

Case 2: vxy contains two types of symbols

  • vxy spans at most p symbols, so it can't contain all three types
  • Subcase 2a: vxy contains a's and b's but no c's
    • Pumping changes ratio of a's and b's to c's
    • Result: uv²xy²z ∉ L
  • Subcase 2b: vxy contains b's and c's but no a's
    • Similar reasoning
    • Result: uv²xy²z ∉ L
  • Subcase 2c: vxy contains a's and c's
    • This is impossible since a's and c's are not adjacent in aᵖbᵖcᵖ

All cases lead to contradiction, so L is not context-free. ∎
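The case analysis can be checked by brute force for a small instance. The script below tries every decomposition s = uvxyz of a²b²c² with |vy| > 0 and verifies that pumping with i = 2 always leaves the language (for this particular string, even the |vxy| ≤ p bound is unnecessary):

```python
def in_L(s):
    """Membership in {a^n b^n c^n : n >= 0}."""
    n = len(s) // 3
    return len(s) % 3 == 0 and s == "a" * n + "b" * n + "c" * n

s = "aabbcc"
n = len(s)
witnesses = []                      # decompositions that survive pumping, if any
for i in range(n + 1):              # cut points i <= j <= k <= m give s = uvxyz
    for j in range(i, n + 1):
        for k in range(j, n + 1):
            for m in range(k, n + 1):
                u, v, x, y, z = s[:i], s[i:j], s[j:k], s[k:m], s[m:]
                if not v + y:
                    continue        # the lemma requires |vy| > 0
                if in_L(u + v * 2 + x + y * 2 + z):
                    witnesses.append((u, v, x, y, z))
print(witnesses)   # [] — no decomposition of aabbcc survives pumping with i = 2
```

Every decomposition fails for one of the reasons in the proof: a multi-symbol v or y breaks the a…b…c ordering, and single-symbol pieces unbalance the counts.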

Example 12: L = {ww : w ∈ {a,b}*} is Not Context-Free

Proof: Assume L is context-free with pumping length p.

Choose: s = aᵖbᵖaᵖbᵖ ∈ L

Consider any division s = uvxyz where |vxy| ≤ p and |vy| > 0:

Since |vxy| ≤ p, the substring vxy cannot span both halves of s.

Case 1: vxy is entirely in the first half (first aᵖbᵖ)

  • Pumping up (i=2) makes first half longer than second half
  • Result: uv²xy²z ∉ L

Case 2: vxy is entirely in the second half

  • Similar reasoning
  • Result: uv²xy²z ∉ L

Case 3: vxy spans the boundary between the two halves

  • vxy can span at most p symbols around the boundary
  • The middle of s is ...bᵖaᵖ...
  • If vxy is in this region, pumping changes the balance
  • Result: uv²xy²z ∉ L

Therefore L is not context-free. ∎

Closure Properties of Context-Free Languages

Positive Closure Properties

Context-free languages are closed under:

  1. Union: L₁ ∪ L₂

    • Proof: Rename variables so the two grammars share none, then add S → S₁ | S₂ where S₁ generates L₁ and S₂ generates L₂
  2. Concatenation: L₁L₂

    • Proof: S → S₁S₂
  3. Kleene Star: L₁*

    • Proof: S → S₁S | ε
  4. Reversal: L^R

    • Proof: Reverse the right-hand side of every production
  5. Homomorphism: h(L)

    • Replace each terminal a with h(a) in the grammar
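The union construction can be sketched in code: rename the two grammars' variables apart, then add the new start rule. The dict-of-tuples grammar encoding and the _1/_2 renaming suffixes are illustrative choices:

```python
def union_grammar(g1, g2):
    """Closure under union: rename variables apart, then add S → S1 | S2.
    A grammar is (rules_dict, start); the dict's keys are its variables."""
    def rename(rules, start, suffix):
        table = {v: v + suffix for v in rules}
        new = {table[v]: [tuple(table.get(s, s) for s in body) for body in bodies]
               for v, bodies in rules.items()}
        return new, table[start]
    r1, s1 = rename(*g1, "_1")
    r2, s2 = rename(*g2, "_2")
    # Fresh start variable derives either renamed start symbol.
    rules = {"S": [(s1,), (s2,)], **r1, **r2}
    return rules, "S"

g_anbn = ({"S": [("0", "S", "1"), ()]}, "S")               # {0^n 1^n}
g_par  = ({"S": [("(", "S", ")"), ("S", "S"), ()]}, "S")   # balanced parentheses
rules, start = union_grammar(g_anbn, g_par)
print(rules["S"])   # [('S_1',), ('S_2',)]
```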

Negative Closure Properties

Context-free languages are NOT closed under:

  1. Intersection: L₁ ∩ L₂

    • Counterexample: L₁ = {aⁿbⁿcᵐ : n,m ≥ 0} and L₂ = {aᵐbⁿcⁿ : n,m ≥ 0} are context-free
    • But L₁ ∩ L₂ = {aⁿbⁿcⁿ : n ≥ 0} is not context-free
  2. Complement: L̄

    • Follows from non-closure under intersection (De Morgan's laws)

However: If L is context-free and R is regular, then L ∩ R is context-free.

  • Proof: Simulate PDA for L and DFA for R simultaneously (product construction)

Applications of Context-Free Grammars

1. Programming Language Syntax

Most programming language syntax is context-free:

program → statement*
statement → assignment | if-stmt | while-stmt | block
assignment → identifier = expression ;
if-stmt → if ( expression ) statement
       | if ( expression ) statement else statement
while-stmt → while ( expression ) statement
block → { statement* }
expression → term | expression + term | expression - term
term → factor | term * factor | term / factor
factor → identifier | number | ( expression )

2. Parsing Algorithms

Top-down parsing (LL):

  • Recursive descent
  • Predictive parsing
  • Works for LL(k) grammars

Bottom-up parsing (LR):

  • Shift-reduce
  • LR(k), SLR, LALR parsers
  • More powerful, handles more grammars

CYK Algorithm: O(n³) parsing for CNF grammars
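A sketch of CYK, using the same dict-of-tuples grammar encoding as earlier examples: the table maps each span (i, j) to the set of variables deriving w[i:j]. The CNF grammar below for {0ⁿ1ⁿ : n ≥ 1} is one hand-built example, not canonical:

```python
def cyk(rules, start, w):
    """CYK membership test for a CNF grammar. `rules` maps each variable to its
    bodies: 1-tuples (terminal,) or 2-tuples (variable, variable)."""
    n = len(w)
    if n == 0:
        return () in rules.get(start, [])   # ε only via an S → ε rule
    T = {}
    for i, a in enumerate(w):               # length-1 spans from terminal rules
        T[(i, i + 1)] = {A for A, bodies in rules.items() if (a,) in bodies}
    for length in range(2, n + 1):          # longer spans from binary rules
        for i in range(n - length + 1):
            j = i + length
            cell = set()
            for k in range(i + 1, j):       # split point: w[i:k] and w[k:j]
                for A, bodies in rules.items():
                    for body in bodies:
                        if (len(body) == 2 and body[0] in T[(i, k)]
                                and body[1] in T[(k, j)]):
                            cell.add(A)
            T[(i, j)] = cell
    return start in T[(0, n)]

# CNF grammar for {0^n 1^n : n >= 1}: S → ZT | ZO, T → SO, Z → 0, O → 1
rules = {"S": [("Z", "T"), ("Z", "O")], "T": [("S", "O")],
         "Z": [("0",)], "O": [("1",)]}
print(cyk(rules, "S", "000111"), cyk(rules, "S", "0011"), cyk(rules, "S", "0101"))
# True True False
```

The three nested loops over span length, start position, and split point give the O(n³) bound quoted above (times a grammar-size factor).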

3. XML and HTML Structure

element → <tag> content </tag> | <tag/>
content → element* | text

4. Natural Language Processing

Context-free grammars model phrase structure:

S → NP VP
NP → Det N | Det Adj N
VP → V NP | V NP PP
PP → P NP

Decidability for Context-Free Languages

Decidable Problems

  1. Membership: Given CFG G and string w, is w ∈ L(G)?

    • Yes, using CYK algorithm: O(n³)
  2. Emptiness: Is L(G) = ∅?

    • Yes: Check if start variable can derive any terminal string
  3. Finiteness: Is |L(G)| < ∞?

    • Yes: Check for cycles in derivation graph
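The emptiness check can be sketched as a fixpoint over "generating" variables (those that derive some terminal string): L(G) is empty iff the start variable is not generating. Each pass adds at least one variable, so this naive version runs in roughly O(|V| · |G|) time. The dict-of-tuples encoding is an assumption of this sketch:

```python
def is_empty(rules, start):
    """Decide whether L(G) = ∅ by computing the generating variables."""
    generating = set()
    changed = True
    while changed:
        changed = False
        for var, bodies in rules.items():
            if var in generating:
                continue
            for body in bodies:
                # A body derives a terminal string if every symbol in it is
                # either a terminal or an already-generating variable.
                if all(s in generating or s not in rules for s in body):
                    generating.add(var)
                    changed = True
                    break
    return start not in generating

print(is_empty({"S": [("a", "S")]}, "S"))             # True: S never terminates
print(is_empty({"S": [("a", "S"), ("b",)]}, "S"))     # False: S derives b
```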

Undecidable Problems

  1. Equivalence: Given CFGs G₁ and G₂, is L(G₁) = L(G₂)?

    • Undecidable
  2. Inclusion: Is L(G₁) ⊆ L(G₂)?

    • Undecidable
  3. Ambiguity: Is G ambiguous?

    • Undecidable
  4. Intersection emptiness: Is L(G₁) ∩ L(G₂) = ∅?

    • Undecidable

These are in stark contrast to regular languages, where all analogous problems are decidable.

Exercises

Basic Exercises

  1. CFG Construction: Design context-free grammars for:

    • a) L = {aⁿb²ⁿ : n ≥ 0}
    • b) Palindromes over {a, b}
    • c) L = {aⁱbʲcᵏ : i = j or j = k}
  2. Derivations: For the grammar S → aSb | SS | ε, give:

    • a) A leftmost derivation of "aabbab"
    • b) A rightmost derivation of "aabbab"
    • c) A parse tree for "aabbab"
  3. Ambiguity Detection: Determine if this grammar is ambiguous:

    S → aS | Sa | a
    

    If ambiguous, show two parse trees for some string.

Intermediate Exercises

  1. CNF Conversion: Convert this grammar to Chomsky Normal Form:

    S → AB | C
    A → aA | ε
    B → bB | b
    C → cC | c
    
  2. PDA Construction: Design a PDA for:

    • a) L = {aⁱbʲcᵏ : i = k}
    • b) L = {w ∈ {a,b}* : #a(w) = #b(w)}
    • c) Palindromes over {a, b}
  3. Pumping Lemma: Use the pumping lemma to prove these languages are not context-free:

    • a) L = {aⁿbⁿcⁿdⁿ : n ≥ 0}
    • b) L = {aⁱbʲcᵏ : i < j < k}
    • c) L = {ww : w ∈ {a,b,c}*}

Advanced Exercises

  1. Inherent Ambiguity: Prove that every CFG for L = {aⁿbⁿcᵐdᵐ : n,m ≥ 1} ∪ {aⁿbᵐcᵐdⁿ : n,m ≥ 1} must be ambiguous. (Hint: Consider strings in both languages.)

  2. Closure Proof: Prove that if L₁ is context-free and L₂ is regular, then L₁ ∩ L₂ is context-free. (Hint: Product construction for PDA and DFA.)

  3. PDA to CFG: Convert this PDA to a CFG:

    Q = {q₀, q₁}, Σ = {a, b}, Γ = {X, Z₀}, Start: q₀, Accept: {q₁}
    δ(q₀, a, Z₀) = {(q₀, XZ₀)}
    δ(q₀, a, X) = {(q₀, XX)}
    δ(q₀, b, X) = {(q₁, ε)}
    δ(q₁, b, X) = {(q₁, ε)}
    δ(q₁, ε, Z₀) = {(q₁, ε)}
    
  4. Ogden's Lemma: Research Ogden's Lemma (a stronger version of the pumping lemma in which positions of the string can be "marked") and use it to prove that L = {aⁱbʲcᵏ : i = j or j = k} is inherently ambiguous.

  5. Grammar Design: Design an unambiguous CFG for arithmetic expressions with operators +, -, *, / and parentheses, where:

    • * and / have higher precedence than + and -
    • All operators are left-associative
    • Prove your grammar is unambiguous
  6. Decidability: Prove that the emptiness problem for CFGs is decidable. Design an algorithm and analyze its time complexity.

Summary

In this reading, we explored context-free grammars and pushdown automata:

  1. Context-Free Grammars provide a generative formalism for languages with recursive structure, using production rules

  2. Parse Trees represent the syntactic structure of derivations and reveal ambiguity

  3. Ambiguity occurs when a string has multiple parse trees; some languages are inherently ambiguous

  4. Chomsky Normal Form is a restricted form where every production is A → BC or A → a; every CFG can be converted to CNF

  5. Pushdown Automata are NFAs augmented with a stack, providing an operational view of context-free languages

  6. Equivalence: CFGs and PDAs recognize exactly the context-free languages

  7. Pumping Lemma for CFLs provides a tool to prove languages are not context-free

  8. Closure Properties: CFLs are closed under union, concatenation, star, but not intersection or complement

  9. Applications: Programming language syntax, parsing, XML structure

  10. Decidability: Membership is decidable (O(n³)), but equivalence and ambiguity are undecidable

Key Insights:

  • Context-free languages extend regular languages by adding stack memory
  • Parse trees make structure explicit and reveal ambiguity
  • Not all naturally arising languages (like {aⁿbⁿcⁿ}) are context-free
  • Some problems decidable for regular languages become undecidable for CFLs

Limitations:

  • Cannot express "equal numbers of a's, b's, and c's"
  • Cannot enforce non-local dependencies (context-sensitive features)
  • Cannot solve the general ambiguity problem

Next Steps

In the next reading, we'll explore Turing Machines and Computability, which remove all memory limitations and reach the theoretical limits of computation:

  • Turing machine model and variants
  • Church-Turing thesis
  • Decidability and the Halting Problem
  • Complexity classes P and NP
  • Reductions and NP-completeness


References and Further Reading

  1. Sipser, Michael. "Introduction to the Theory of Computation" (Chapter 2)
  2. Hopcroft, John E., et al. "Introduction to Automata Theory" (Chapters 5-7)
  3. Aho, Alfred V., et al. "Compilers: Principles, Techniques, and Tools" (Dragon Book)
  4. Chomsky, Noam. "Three models for the description of language" (1956)
  5. Cocke, Kasami, and Younger. Independent discoveries of the CYK algorithm (1960s)