Context-Free Grammars and Pushdown Automata
Introduction
Context-free grammars (CFGs) are more powerful than regular expressions and finite automata. They can describe the syntax of most programming languages, including nested structures like balanced parentheses, matching if-then-else statements, and recursive data structures. CFGs form the theoretical foundation of compiler design and parsing algorithms.
Learning Objectives
By the end of this reading, you will be able to:
- Define and construct context-free grammars
- Understand derivations, parse trees, and the language generated by a CFG
- Identify and resolve ambiguous grammars
- Convert CFGs to Chomsky Normal Form
- Understand pushdown automata (PDAs) as recognizers for context-free languages
- Prove the equivalence of CFGs and PDAs
- Apply the pumping lemma for context-free languages
- Recognize the limitations of context-free languages
Context-Free Grammars: Formal Definition
Basic Definitions
A context-free grammar (CFG) is a 4-tuple G = (V, Σ, R, S) where:
- V is a finite set of variables (also called nonterminals)
- Σ is a finite set of terminals (the alphabet), disjoint from V
- R is a finite set of production rules of the form A → w where A ∈ V and w ∈ (V ∪ Σ)*
- S ∈ V is the start variable
Notation:
- Variables: Usually uppercase letters A, B, C, S
- Terminals: Usually lowercase letters a, b, c or digits
- Strings of terminals: w, x, y, z
- Strings of variables and terminals: α, β, γ
Derivations
A derivation is a sequence of substitutions using production rules.
We write α ⇒ β if we can derive β from α in one step by applying a single production rule.
We write α ⇒* β if we can derive β from α in zero or more steps.
Notation:
- α ⇒ β: derives in one step
- α ⇒* β: derives in zero or more steps
- α ⇒+ β: derives in one or more steps
Types of derivations:
- Leftmost derivation: Always replace the leftmost variable
- Rightmost derivation: Always replace the rightmost variable
Language Generated by a Grammar
The language generated by grammar G is:
L(G) = {w ∈ Σ* : S ⇒* w}
A language L is context-free if there exists a CFG G such that L = L(G).
Example 1: Balanced Parentheses
Design a CFG for the language of balanced parentheses.
Grammar:
- V = {S}
- Σ = {(, )}
- S = S (start variable)
- R:
- S → (S)
- S → SS
- S → ε
Derivations:
Example: Generate "(())()"
S ⇒ SS (use S → SS)
⇒ (S)S (use S → (S) on first S)
⇒ ((S))S (use S → (S))
⇒ (())S (use S → ε)
⇒ (())(S) (use S → (S))
⇒ (())() (use S → ε)
The derivation above is already leftmost (the leftmost variable is replaced at each step). An alternative rightmost derivation:
S ⇒ SS ⇒ S(S) ⇒ S() ⇒ (S)() ⇒ ((S))() ⇒ (())()
Proof that L(G) = {balanced parentheses}:
- (⊆) By induction on derivation length, every string derived is balanced
- (⊇) By induction on string structure, every balanced string can be derived
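Both inclusions can be sanity-checked by brute force. The sketch below (illustrative only, not part of the formal proof) tests grammar membership recursively against a direct depth-counter check for every parenthesis string up to length 8:

```python
from functools import lru_cache
from itertools import product

@lru_cache(maxsize=None)
def derives(w):
    """Does S =>* w under S -> (S) | SS | eps? Each recursive call is on a
    strictly shorter string, so this terminates."""
    if w == "":
        return True                                   # S -> eps
    if w[0] == "(" and w[-1] == ")" and derives(w[1:-1]):
        return True                                   # S -> (S)
    return any(derives(w[:i]) and derives(w[i:])      # S -> SS
               for i in range(1, len(w)))

def balanced(w):
    """Direct check: depth never dips below zero and ends at zero."""
    depth = 0
    for c in w:
        depth += 1 if c == "(" else -1
        if depth < 0:
            return False
    return depth == 0

# The grammar and the direct check agree on every string up to length 8.
assert all(derives("".join(t)) == balanced("".join(t))
           for n in range(9) for t in product("()", repeat=n))
```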
Example 2: L = {0ⁿ1ⁿ : n ≥ 0}
Grammar:
- V = {S}
- Σ = {0, 1}
- R:
- S → 0S1
- S → ε
Derivation of "000111":
S ⇒ 0S1 ⇒ 00S11 ⇒ 000S111 ⇒ 000111
This language is context-free but not regular (proved via pumping lemma in previous reading).
Example 3: Arithmetic Expressions
Design a CFG for simple arithmetic expressions with +, *, and parentheses.
Grammar:
- V = {E, T, F} (Expression, Term, Factor)
- Σ = {+, *, (, ), a} (a represents variables/numbers)
- Start: E
- R:
- E → E + T | T
- T → T * F | F
- F → (E) | a
Derivation of "a + a * a":
E ⇒ E + T
⇒ T + T
⇒ F + T
⇒ a + T
⇒ a + T * F
⇒ a + F * F
⇒ a + a * F
⇒ a + a * a
This grammar captures operator precedence: * binds tighter than +.
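The precedence encoded by E, T, F can be made visible by parsing. Below is a sketch of a recursive-descent parser for this grammar; since recursive descent cannot handle left recursion directly, the left-recursive rules are implemented as loops (a standard equivalent formulation), and the result is a nested tuple showing the tree shape:

```python
def parse(s):
    """Recursive-descent parser for E -> E + T | T, T -> T * F | F,
    F -> (E) | a, with left recursion turned into iteration."""
    tokens, pos = list(s.replace(" ", "")), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(c):
        nonlocal pos
        assert peek() == c, f"expected {c!r} at position {pos}"
        pos += 1

    def expr():                    # E -> T (+ T)*
        node = term()
        while peek() == "+":
            eat("+")
            node = ("+", node, term())
        return node

    def term():                    # T -> F (* F)*
        node = factor()
        while peek() == "*":
            eat("*")
            node = ("*", node, factor())
        return node

    def factor():                  # F -> ( E ) | a
        if peek() == "(":
            eat("(")
            node = expr()
            eat(")")
            return node
        eat("a")
        return "a"

    node = expr()
    assert pos == len(tokens), "trailing input"
    return node
```

Here parse("a + a * a") returns ("+", "a", ("*", "a", "a")): the * groups tighter, exactly as the grammar dictates, while parse("(a + a) * a") returns ("*", ("+", "a", "a"), "a").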
Example 4: Programming Language Constructs
If-then-else statements:
S → if E then S
S → if E then S else S
S → other
E → condition
While loops:
S → while E do S
S → begin S end
S → S; S
S → a := b
Parse Trees
A parse tree is a graphical representation of a derivation.
Structure:
- Root: Start variable S
- Internal nodes: Variables
- Leaves: Terminals (reading left-to-right gives the derived string)
- Parent-child relationship: If A → α is used, A is parent and symbols in α are children
Example 5: Parse Tree for "a + a * a"
Using the arithmetic grammar from Example 3:
          E
    ┌─────┼─────┐
    E     +     T
    │       ┌───┼───┐
    T       T   *   F
    │       │       │
    F       F       a
    │       │
    a       a
Reading leaves left-to-right: a + a * a
Relationship Between Derivations and Parse Trees
- Each parse tree corresponds to multiple derivations (leftmost, rightmost, etc.)
- But all these derivations represent the same syntactic structure
- Parse trees capture the structural meaning, while derivations show the process
Theorem: A string w has more than one parse tree if and only if it has more than one leftmost derivation.
Ambiguity
Definition
A CFG is ambiguous if there exists a string with two or more distinct parse trees (equivalently, two or more distinct leftmost derivations).
A CFG is unambiguous if every string has at most one parse tree.
Example 6: Ambiguous Grammar
Consider this grammar for arithmetic expressions:
E → E + E
E → E * E
E → (E)
E → a
The string "a + a * a" has (at least) two parse trees:
Parse Tree 1 (+ at root):
      E
  ┌───┼───┐
  E   +   E
  │   ┌───┼───┐
  a   E   *   E
      │       │
      a       a
Parse Tree 2 (* at root):
          E
      ┌───┼───┐
      E   *   E
  ┌───┼───┐   │
  E   +   E   a
  │       │
  a       a
These represent different interpretations: (a + a) * a vs. a + (a * a).
Resolving Ambiguity
Strategy 1: Rewrite the grammar (as in Example 3)
- Use separate variables for different precedence levels
- Encode associativity in recursive structure
Strategy 2: Add disambiguating rules (used in parser generators)
- Specify operator precedence: * before +
- Specify associativity: left-to-right
- Example (yacc/bison, where later declarations bind tighter):
%left '+'
%left '*'
Strategy 3: Modify the language
- Sometimes ambiguity is inherent in the language
- Example: "dangling else" problem
The Dangling Else Problem
Consider:
S → if E then S
S → if E then S else S
S → other
The string "if E₁ then if E₂ then S₁ else S₂" has two parse trees:
Parse 1: else matches the second (inner) if
if E₁ then (if E₂ then S₁ else S₂)
Parse 2: else matches the first (outer) if
if E₁ then (if E₂ then S₁) else S₂
Standard resolution: "else matches the nearest unmatched if" (Parse 1)
This can be encoded in the grammar:
S → Matched | Unmatched
Matched → if E then Matched else Matched | other
Unmatched → if E then S | if E then Matched else Unmatched
Inherently Ambiguous Languages
Some context-free languages have no unambiguous grammar.
Example: L = {aⁿbⁿcᵐdᵐ : n,m ≥ 1} ∪ {aⁿbᵐcᵐdⁿ : n,m ≥ 1}
Every CFG for this language is ambiguous. The string a²b²c²d² belongs to both parts, and any grammar must derive it in fundamentally different ways.
Chomsky Normal Form
Definition
A CFG is in Chomsky Normal Form (CNF) if every production rule is of the form:
- A → BC (where B, C are variables, not the start variable)
- A → a (where a is a terminal)
- S → ε (only if ε ∈ L(G), and S doesn't appear on the right side of any rule)
Importance
- Simplifies parsing algorithms: CYK algorithm runs in O(n³) for CNF grammars
- Theoretical tool: Many proofs are easier with CNF
- Normal form theorem: Every CFG can be converted to CNF
Conversion to CNF
Algorithm:
Step 1 (add a new start variable): Create S₀ → S so that the start variable never appears on the right-hand side of any rule
Step 2 (eliminate ε-rules): Remove A → ε rules (for A other than the start variable)
- For each occurrence of A in a rule B → αAβ, add B → αβ
- Repeat for all nullable variables
Step 3 (eliminate unit rules): Remove A → B rules
- For each pair of rules A → B and B → α, add A → α
- Repeat until no unit rules remain
Step 4 (convert long rules): Replace A → u₁u₂...uₖ (k ≥ 3) with:
- A → u₁A₁
- A₁ → u₂A₂
- ...
- Aₖ₋₂ → uₖ₋₁uₖ
Step 5 (convert terminal rules): In every rule whose right-hand side has length ≥ 2, replace each terminal uᵢ with a new variable Uᵢ and add the rule Uᵢ → uᵢ
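The ε-elimination step depends on knowing which variables are nullable. A small fixed-point computation finds them (a sketch; rules are encoded as (head, body) pairs, uppercase letters are variables, and "" encodes an ε body):

```python
# The grammar of Example 7, one (head, body) pair per alternative.
rules = [("S", "ASA"), ("S", "aB"),
         ("A", "B"), ("A", "S"),
         ("B", "b"), ("B", "")]

def nullable(rules):
    """Fixed point: a variable is nullable if some body of its consists
    entirely of already-known nullable variables (the empty body counts)."""
    null, changed = set(), True
    while changed:
        changed = False
        for head, body in rules:
            if head not in null and all(s in null for s in body):
                null.add(head)
                changed = True
    return null
```

On this grammar nullable(rules) returns {"A", "B"}: B is nullable directly via B → ε, and A becomes nullable through the unit rule A → B, while S is not.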
Example 7: Converting to CNF
Original grammar:
S → ASA | aB
A → B | S
B → b | ε
Step 1: Add new start variable
S₀ → S
S → ASA | aB
A → B | S
B → b | ε
Step 2: Eliminate ε-rules (B is nullable)
S₀ → S
S → ASA | aB | a (added S → a since B is nullable)
A → B | S | ε (A becomes nullable)
B → b
Now A is nullable too:
S₀ → S
S → ASA | AS | SA | S | aB | a
A → B | S
B → b
Step 3: Eliminate unit rules
S₀ → ASA | AS | SA | aB | a
S → ASA | AS | SA | aB | a
A → b | ASA | AS | SA | aB | a
B → b
(The unit rule A → S gives A every non-unit alternative of S, and A → B gives A → b.)
Step 4: Convert long rules (ASA has length 3)
S₀ → AX₁ | AS | SA | aB | a where X₁ → SA
S → AX₁ | AS | SA | aB | a
A → b | AX₁ | AS | SA | aB | a
X₁ → SA
B → b
Step 5: Convert terminal rules (like aB)
S₀ → AX₁ | AS | SA | UB | a where U → a
S → AX₁ | AS | SA | UB | a
A → b | AX₁ | AS | SA | UB | a
X₁ → SA
B → b
U → a
Final CNF:
S₀ → AX₁ | AS | SA | UB | a
S → AX₁ | AS | SA | UB | a
A → b | AX₁ | AS | SA | UB | a
X₁ → SA
B → b
U → a
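CNF is what makes cubic-time membership testing possible. The sketch below runs the CYK algorithm against a CNF grammar for the language of Example 7 (the rule set used here is an assumed equivalent grammar in which A carries the alternatives it inherits from S through the unit rule A → S; names like S0 and X1 are ASCII encodings of S₀ and X₁):

```python
# CNF grammar as variable -> set of bodies; each body is a tuple of
# two variables or a single terminal.
rules = {
    "S0": {("A", "X1"), ("A", "S"), ("S", "A"), ("U", "B"), ("a",)},
    "S":  {("A", "X1"), ("A", "S"), ("S", "A"), ("U", "B"), ("a",)},
    "A":  {("b",), ("A", "X1"), ("A", "S"), ("S", "A"), ("U", "B"), ("a",)},
    "X1": {("S", "A")},
    "B":  {("b",)},
    "U":  {("a",)},
}

def cyk(w, rules, start="S0"):
    """CYK membership test: T[i][l] holds every variable that derives
    the substring w[i:i+l]. Runs in O(n^3 * |R|) time."""
    n = len(w)
    if n == 0:
        return False                     # this grammar has no S0 -> eps rule
    T = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, c in enumerate(w):            # length-1 substrings: A -> a rules
        for head, bodies in rules.items():
            if (c,) in bodies:
                T[i][1].add(head)
    for length in range(2, n + 1):       # longer substrings: A -> BC rules
        for i in range(n - length + 1):
            for k in range(1, length):
                for head, bodies in rules.items():
                    if any(len(b) == 2 and b[0] in T[i][k]
                           and b[1] in T[i + k][length - k] for b in bodies):
                        T[i][length].add(head)
    return start in T[0][n]
```

For instance cyk("ab", rules) and cyk("ba", rules) hold while cyk("b", rules) does not, matching the original grammar: S derives "ab" and "ba" but never the single letter b.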
Pushdown Automata
Motivation
Regular languages are recognized by finite automata, which have only a fixed, finite memory. Context-free languages need more than that, but not arbitrary memory: a stack suffices.
A pushdown automaton (PDA) is an NFA augmented with a stack.
Formal Definition
A pushdown automaton is a 6-tuple P = (Q, Σ, Γ, δ, q₀, F) where:
- Q is a finite set of states
- Σ is the input alphabet
- Γ is the stack alphabet
- δ: Q × Σ_ε × Γ_ε → P(Q × Γ_ε) is the transition function
- q₀ ∈ Q is the start state
- F ⊆ Q is the set of accept states
Notation: Σ_ε = Σ ∪ {ε} and Γ_ε = Γ ∪ {ε}
Transition function: δ(q, a, X) = {(p₁, γ₁), (p₂, γ₂), ...} means:
- In state q
- Reading input symbol a (or ε)
- With X on top of stack (or ε to ignore stack)
- Can move to state pᵢ, replacing X on top of the stack with γᵢ (pop X, push γᵢ)
Stack operations:
- Push X: δ(q, a, ε) = {(p, X)} or δ(q, a, Y) = {(p, XY)}
- Pop X: δ(q, a, X) = {(p, ε)}
- No-op: δ(q, a, X) = {(p, X)}
Acceptance
A PDA accepts a string w if:
- By final state: After reading all of w, PDA is in an accept state (stack can be anything)
- By empty stack: After reading all of w, the stack is empty (state can be anything)
These two notions are equivalent in power.
Example 8: PDA for {0ⁿ1ⁿ : n ≥ 0}
Informal description:
- Push each 0 onto the stack
- Pop a 0 for each 1 read
- Accept if stack becomes empty
Formal specification:
- Q = {q₀, q₁, q₂}
- Σ = {0, 1}
- Γ = {0, $} ($ is stack bottom marker)
- Start: q₀
- Accept: {q₂}
Transitions:
δ(q₀, ε, ε) = {(q₁, $)} // Push $ onto empty stack
δ(q₁, 0, ε) = {(q₁, 0)} // Push 0 for each 0 read
δ(q₁, 1, 0) = {(q₁, ε)} // Pop 0 for each 1 read
δ(q₁, ε, $) = {(q₂, ε)} // Accept when $ is on top
State diagram:
          ε, ε → $            ε, $ → ε
→ ┌────┐ ─────────► ┌────┐ ─────────► ┌─────┐
  │ q₀ │            │ q₁ │            │ (q₂)│
  └────┘            └────┘            └─────┘
            self-loops on q₁:  0, ε → 0   and   1, 0 → ε
Trace for "0011":
Configuration: (state, unread input, stack)
(q₀, 0011, ε) ⊢ (q₁, 0011, $)
⊢ (q₁, 011, 0$)
⊢ (q₁, 11, 00$)
⊢ (q₁, 1, 0$)
⊢ (q₁, ε, $)
⊢ (q₂, ε, ε) ✓ Accept
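This trace can be automated. The sketch below does a breadth-first search over the configurations (state, input position, stack) of the {0ⁿ1ⁿ} machine; as an encoding assumption, the stack is a string whose last character is the top, and "" stands for ε:

```python
from collections import deque

# (state, input symbol or "", stack top or "") -> set of (state, pushed string)
delta = {
    ("q0", "", ""):   {("q1", "$")},   # push bottom marker $
    ("q1", "0", ""):  {("q1", "0")},   # push a 0 for each 0 read
    ("q1", "1", "0"): {("q1", "")},    # pop a 0 for each 1 read
    ("q1", "", "$"):  {("q2", "")},    # $ on top: all 0s matched
}
accept = {"q2"}

def accepts(w):
    """BFS over PDA configurations; accept by final state at end of input."""
    start = ("q0", 0, "")
    seen, frontier = {start}, deque([start])
    while frontier:
        state, i, stack = frontier.popleft()
        if i == len(w) and state in accept:
            return True
        for (q, a, top), moves in delta.items():
            if q != state:
                continue
            if a and (i >= len(w) or w[i] != a):   # input must match (if any)
                continue
            if top and not stack.endswith(top):    # top must match (if any)
                continue
            rest = stack[: len(stack) - len(top)] if top else stack
            for (p, push) in moves:
                cfg = (p, i + (1 if a else 0), rest + push)
                if cfg not in seen:
                    seen.add(cfg)
                    frontier.append(cfg)
    return False
```

Here accepts("0011") and accepts("") return True (n ≥ 0 includes the empty string), while accepts("011") and accepts("10") return False.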
Example 9: PDA for Balanced Parentheses
Strategy:
- Push '(' onto stack when seen
- Pop from stack when ')' is seen
- Accept if stack is empty at end
Transitions:
δ(q₀, ε, ε) = {(q₁, $)} // Initialize stack
δ(q₁, '(', ε) = {(q₁, '(')} // Push (
δ(q₁, ')', '(') = {(q₁, ε)} // Pop ( when seeing )
δ(q₁, ε, $) = {(q₂, ε)} // Accept when done
Example 10: PDA for {wcwᴿ : w ∈ {a,b}*}
This language consists of strings with a 'c' in the middle, where the right half is the reverse of the left half.
Strategy:
- Push symbols onto stack until 'c' is seen
- Then pop and match with remaining input
- Accept if stack becomes empty
Transitions:
δ(q₀, ε, ε) = {(q₁, $)} // Initialize
δ(q₁, a, ε) = {(q₁, a)} // Push a's and b's
δ(q₁, b, ε) = {(q₁, b)}
δ(q₁, c, ε) = {(q₂, ε)} // Switch to matching mode
δ(q₂, a, a) = {(q₂, ε)} // Pop and match
δ(q₂, b, b) = {(q₂, ε)}
δ(q₂, ε, $) = {(q₃, ε)} // Accept
Trace for "abcba":
(q₀, abcba, ε) ⊢ (q₁, abcba, $)
⊢ (q₁, bcba, a$)
⊢ (q₁, cba, ba$)
⊢ (q₂, ba, ba$)
⊢ (q₂, a, a$)
⊢ (q₂, ε, $)
⊢ (q₃, ε, ε) ✓
Equivalence of CFGs and PDAs
Theorem: CFG ↔ PDA Equivalence
Theorem: A language is context-free if and only if some PDA recognizes it.
Proof outline:
- CFG → PDA: Construct PDA that simulates leftmost derivations
- PDA → CFG: Construct CFG that simulates PDA computation
CFG to PDA Construction
Given CFG G = (V, Σ, R, S), construct PDA P:
Idea: Use stack to store sentential form, nondeterministically apply productions.
Algorithm:
- Push $ then S onto stack
- Repeat:
- If top of stack is variable A, nondeterministically choose A → w and replace A with w
- If top of stack is terminal a, read a from input and pop stack
- If top of stack is $, accept
Formal construction:
- States: {q_start, q_loop, q_accept}
- δ(q_start, ε, ε) = {(q_loop, S$)}
- For each rule A → w: δ(q_loop, ε, A) = {(q_loop, w)}
- For each terminal a: δ(q_loop, a, a) = {(q_loop, ε)}
- δ(q_loop, ε, $) = {(q_accept, ε)}
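The construction can be simulated directly in a few lines: the stack holds the unmatched part of the current sentential form, variables are expanded nondeterministically, and terminals are matched against input. The sketch below searches that configuration space for the {0ⁿ1ⁿ} grammar S → 0S1 | ε; a simple pruning rule (never carry more terminals than input remains) keeps this particular search finite, though a general-purpose recognizer would use CYK instead.

```python
from collections import deque

rules = {"S": ["0S1", ""]}          # grammar for {0^n 1^n}; "" is the eps body

def accepts(w, rules, start="S"):
    """Simulate the CFG->PDA construction: the stack is a string whose
    first character is the top; expand variables, match terminals."""
    seen, frontier = set(), deque([(0, start)])
    while frontier:
        i, stack = frontier.popleft()
        if (i, stack) in seen:
            continue
        seen.add((i, stack))
        if not stack:                    # empty stack: accept iff input consumed
            if i == len(w):
                return True
            continue
        top, rest = stack[0], stack[1:]
        if top in rules:                 # variable on top: expand each body
            for body in rules[top]:
                new = body + rest
                # prune: a form with more terminals than remaining input
                # can never derive the rest of w
                if sum(c not in rules for c in new) <= len(w) - i:
                    frontier.append((i, new))
        elif i < len(w) and w[i] == top: # terminal on top: match and pop
            frontier.append((i + 1, rest))
    return False
```

This mirrors the three-state PDA above with the bookkeeping states elided: expansion steps correspond to the ε-moves δ(q_loop, ε, A), and matching steps to δ(q_loop, a, a).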
PDA to CFG Construction
Given PDA P, construct CFG G such that L(G) = L(P).
Idea: Variable A_pq generates all strings that take PDA from state p with empty stack to state q with empty stack.
Construction (sketch):
- For each p, q ∈ Q, create variable A_pq
- Start variable: A_q₀,f for some f ∈ F
- For each transition, create corresponding production
This construction is involved; see Sipser, Chapter 2, for the full details.
Pumping Lemma for Context-Free Languages
The Pumping Lemma
Not all languages are context-free. The pumping lemma for CFLs helps prove non-context-freeness.
Theorem (Pumping Lemma for CFLs):
If L is a context-free language, then there exists p ≥ 1 (pumping length) such that every string s ∈ L with |s| ≥ p can be divided into five parts, s = uvxyz, satisfying:
- |vxy| ≤ p
- |vy| > 0 (at least one of v or y is non-empty)
- For all i ≥ 0, uvⁱxyⁱz ∈ L
Intuition: Parse trees for long strings must have a repeated variable on some path, creating a "pumpable" region.
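For a language that is context-free, a valid decomposition really does pump. A quick sketch for L = {0ⁿ1ⁿ}, taking the split v = 0, y = 1 around the midpoint (p = 4 is purely an illustrative choice, not a pumping length derived from any particular grammar):

```python
def in_L(w):
    """Membership in {0^n 1^n : n >= 0}."""
    n, r = divmod(len(w), 2)
    return r == 0 and w == "0" * n + "1" * n

p = 4                                    # illustrative pumping length
s = "0" * p + "1" * p                    # a string in L with |s| >= p
u, v, x, y, z = "0" * (p - 1), "0", "", "1", "1" * (p - 1)

assert u + v + x + y + z == s
assert len(v + x + y) <= p and len(v + y) > 0    # conditions 1 and 2
for i in range(6):                               # condition 3: pump down and up
    assert in_L(u + v * i + x + y * i + z)
```

Pumping v and y in lockstep adds (or, with i = 0, removes) one 0 and one 1 at a time, so every pumped string stays in L.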
Proof of Pumping Lemma
Proof sketch:
- Let G be a CFG in CNF for L with |V| variables
- Let p = 2^(|V|+1)
- Any string s with |s| ≥ p forces a parse tree of height at least |V| + 2, since a CNF parse tree of height h yields at most 2^(h−1) symbols
- By pigeonhole principle, some variable R repeats on a path from root to leaf
- This creates two subtrees rooted at R, one inside the other
- The inner subtree generates x, the region between generates v and y
- These can be pumped independently
Example 11: L = {aⁿbⁿcⁿ : n ≥ 0} is Not Context-Free
Proof: Assume L is context-free with pumping length p.
Choose: s = aᵖbᵖcᵖ ∈ L (clearly |s| = 3p ≥ p)
Consider any division s = uvxyz where |vxy| ≤ p and |vy| > 0:
Case 1: vxy contains only one type of symbol (all a's, all b's, or all c's)
- Then pumping (i=2) increases count of only one symbol
- Result: unequal numbers → uv²xy²z ∉ L
Case 2: vxy contains two types of symbols
- vxy spans at most p symbols, so it can't contain all three types
- Subcase 2a: vxy contains a's and b's but no c's
- Pumping changes ratio of a's and b's to c's
- Result: uv²xy²z ∉ L
- Subcase 2b: vxy contains b's and c's but no a's
- Similar reasoning
- Result: uv²xy²z ∉ L
- Subcase 2c: vxy contains a's and c's
- This is impossible since a's and c's are not adjacent in aᵖbᵖcᵖ
All cases lead to contradiction, so L is not context-free. ∎
Example 12: L = {ww : w ∈ {a,b}*} is Not Context-Free
Proof: Assume L is context-free with pumping length p.
Choose: s = aᵖbᵖaᵖbᵖ ∈ L
Consider any division s = uvxyz where |vxy| ≤ p and |vy| > 0:
Since |vxy| ≤ p, the substring vxy sits inside a window of p consecutive symbols of s.
Case 1: vxy is entirely in the first half (first aᵖbᵖ)
- Pumping up (i=2) makes first half longer than second half
- Result: uv²xy²z ∉ L
Case 2: vxy is entirely in the second half
- Similar reasoning
- Result: uv²xy²z ∉ L
Case 3: vxy spans the boundary between the two halves
- Then v and y lie within the middle bᵖaᵖ region of s
- Pumping down (i = 0) yields uxz = aᵖbⁱaʲbᵖ with i < p or j < p
- No such string can be written as ww
- Result: uv⁰xy⁰z ∉ L
Therefore L is not context-free. ∎
Closure Properties of Context-Free Languages
Positive Closure Properties
Context-free languages are closed under:
Union: L₁ ∪ L₂
- Proof: S → S₁ | S₂ where S₁ generates L₁ and S₂ generates L₂
Concatenation: L₁L₂
- Proof: S → S₁S₂
Kleene Star: L*
- Proof: S → S₁S | ε, where S₁ generates L
Reversal: Lᴿ
- Proof: Reverse the right-hand side of every production
Homomorphism: h(L)
- Replace each terminal a with h(a) in the grammar
Negative Closure Properties
Context-free languages are NOT closed under:
Intersection: L₁ ∩ L₂
- Counterexample: L₁ = {aⁿbⁿcᵐ : n, m ≥ 0} and L₂ = {aᵐbⁿcⁿ : n, m ≥ 0} are CF
- But L₁ ∩ L₂ = {aⁿbⁿcⁿ} is not CF
Complement: L̄
- Follows from non-closure under intersection (De Morgan's laws)
However: If L is context-free and R is regular, then L ∩ R is context-free.
- Proof: Simulate PDA for L and DFA for R simultaneously (product construction)
Applications of Context-Free Grammars
1. Programming Language Syntax
Most programming language syntax is context-free:
program → statement*
statement → assignment | if-stmt | while-stmt | block
assignment → identifier = expression ;
if-stmt → if ( expression ) statement
| if ( expression ) statement else statement
while-stmt → while ( expression ) statement
block → { statement* }
expression → term | expression + term | expression - term
term → factor | term * factor | term / factor
factor → identifier | number | ( expression )
2. Parsing Algorithms
Top-down parsing (LL):
- Recursive descent
- Predictive parsing
- Works for LL(k) grammars
Bottom-up parsing (LR):
- Shift-reduce
- LR(k), SLR, LALR parsers
- More powerful, handles more grammars
CYK Algorithm: O(n³) parsing for CNF grammars
3. XML and HTML Structure
element → <tag> content </tag> | <tag/>
content → element* | text
4. Natural Language Processing
Context-free grammars model phrase structure:
S → NP VP
NP → Det N | Det Adj N
VP → V NP | V NP PP
PP → P NP
Decidability for Context-Free Languages
Decidable Problems
Membership: Given CFG G and string w, is w ∈ L(G)?
- Yes, using CYK algorithm: O(n³)
Emptiness: Is L(G) = ∅?
- Yes: Check if start variable can derive any terminal string
Finiteness: Is |L(G)| < ∞?
- Yes: Check for cycles in derivation graph
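The emptiness check above is a fixed-point computation over "generating" variables, those that derive at least one terminal string. A sketch, with rules encoded as (head, body) pairs and uppercase letters as variables:

```python
# Example grammar: S is generating via A and B; C -> cC never bottoms out,
# so a grammar whose start variable could only reach C would be empty.
rules = [("S", "AB"), ("A", "a"), ("B", "b"), ("C", "cC")]

def is_empty(rules, start="S"):
    """L(G) is empty iff the start variable is not generating. Each pass
    is O(|R|) and at most |V| passes add a variable, so this is polynomial."""
    variables = {head for head, _ in rules}
    generating, changed = set(), True
    while changed:
        changed = False
        for head, body in rules:
            if head not in generating and all(
                    s in generating or s not in variables for s in body):
                generating.add(head)
                changed = True
    return start not in generating
```

Here is_empty(rules) returns False, while is_empty([("S", "aS")]) returns True: the lone rule S → aS can never terminate a derivation.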
Undecidable Problems
Equivalence: Given CFGs G₁ and G₂, is L(G₁) = L(G₂)?
- Undecidable
Inclusion: Is L(G₁) ⊆ L(G₂)?
- Undecidable
Ambiguity: Is G ambiguous?
- Undecidable
Intersection emptiness: Is L(G₁) ∩ L(G₂) = ∅?
- Undecidable
These are in stark contrast to regular languages, where all analogous problems are decidable.
Exercises
Basic Exercises
CFG Construction: Design context-free grammars for:
- a) L = {aⁿb²ⁿ : n ≥ 0}
- b) Palindromes over {a, b}
- c) L = {aⁱbʲcᵏ : i = j or j = k}
Derivations: For the grammar S → aSb | SS | ε, give:
- a) A leftmost derivation of "aabbab"
- b) A rightmost derivation of "aabbab"
- c) A parse tree for "aabbab"
Ambiguity Detection: Determine if this grammar is ambiguous:
S → aS | Sa | a
If ambiguous, show two parse trees for some string.
Intermediate Exercises
CNF Conversion: Convert this grammar to Chomsky Normal Form:
S → AB | C
A → aA | ε
B → bB | b
C → cC | c
PDA Construction: Design a PDA for:
- a) L = {aⁱbʲcᵏ : i = k}
- b) L = {w ∈ {a,b}* : #a(w) = #b(w)}
- c) Palindromes over {a, b}
Pumping Lemma: Use the pumping lemma to prove these languages are not context-free:
- a) L = {aⁿbⁿcⁿdⁿ : n ≥ 0}
- b) L = {aⁱbʲcᵏ : i < j < k}
- c) L = {ww : w ∈ {a,b,c}*}
Advanced Exercises
Inherent Ambiguity: Prove that every CFG for L = {aⁿbⁿcᵐdᵐ : n,m ≥ 1} ∪ {aⁿbᵐcᵐdⁿ : n,m ≥ 1} must be ambiguous. (Hint: Consider strings in both languages.)
Closure Proof: Prove that if L₁ is context-free and L₂ is regular, then L₁ ∩ L₂ is context-free. (Hint: Product construction for PDA and DFA.)
PDA to CFG: Convert this PDA to a CFG:
Q = {q₀, q₁}, Σ = {a, b}, Γ = {X, Z₀}, Start: q₀, Accept: {q₁}
δ(q₀, a, Z₀) = {(q₀, XZ₀)}
δ(q₀, a, X) = {(q₀, XX)}
δ(q₀, b, X) = {(q₁, ε)}
δ(q₁, b, X) = {(q₁, ε)}
δ(q₁, ε, Z₀) = {(q₁, ε)}
Ogden's Lemma: Research Ogden's Lemma (a stronger version of the pumping lemma) and use it to prove that L = {aⁱbʲcᵏdˡ : i = 0 or j = k = l} is not context-free.
Grammar Design: Design an unambiguous CFG for arithmetic expressions with operators +, -, *, / and parentheses, where:
- * and / have higher precedence than + and -
- All operators are left-associative
- Prove your grammar is unambiguous
Decidability: Prove that the emptiness problem for CFGs is decidable. Design an algorithm and analyze its time complexity.
Summary
In this reading, we explored context-free grammars and pushdown automata:
Context-Free Grammars provide a generative formalism for languages with recursive structure, using production rules
Parse Trees represent the syntactic structure of derivations and reveal ambiguity
Ambiguity occurs when a string has multiple parse trees; some languages are inherently ambiguous
Chomsky Normal Form is a restricted form where every production is A → BC or A → a; every CFG can be converted to CNF
Pushdown Automata are NFAs augmented with a stack, providing operational view of context-free languages
Equivalence: CFGs and PDAs recognize exactly the context-free languages
Pumping Lemma for CFLs provides a tool to prove languages are not context-free
Closure Properties: CFLs are closed under union, concatenation, star, but not intersection or complement
Applications: Programming language syntax, parsing, XML structure
Decidability: Membership is decidable (O(n³)), but equivalence and ambiguity are undecidable
Key Insights:
- Context-free languages extend regular languages by adding stack memory
- Parse trees make structure explicit and reveal ambiguity
- Not all natural languages (like {aⁿbⁿcⁿ}) are context-free
- Some problems decidable for regular languages become undecidable for CFLs
Limitations:
- Cannot express "equal numbers of a's, b's, and c's"
- Cannot enforce non-local dependencies (context-sensitive features)
- Cannot solve the general ambiguity problem
Next Steps
In the next reading, we'll explore Turing Machines and Computability, which remove all memory limitations and reach the theoretical limits of computation:
- Turing machine model and variants
- Church-Turing thesis
- Decidability and the Halting Problem
- Complexity classes P and NP
- Reductions and NP-completeness
References and Further Reading
- Sipser, Michael. "Introduction to the Theory of Computation" (Chapter 2)
- Hopcroft, John E., et al. "Introduction to Automata Theory" (Chapters 5-7)
- Aho, Alfred V., et al. "Compilers: Principles, Techniques, and Tools" (Dragon Book)
- Chomsky, Noam. "Three models for the description of language" (1956)
- CYK Algorithm: Cocke, Kasami, Younger (1960s)