Language Design & Architecture
Before writing any code, we need to design our language. What will it look like? What features will it have? How will the interpreter work?
What Makes a Programming Language?
A programming language has three main components:
- Syntax: The rules for what code looks like (grammar)
- Semantics: The meaning of code (what it does)
- Runtime: The environment where code executes
The Interpreter Pipeline
Our Lux interpreter follows this flow:
Source Code (text file or REPL input)
↓
LEXER (src/lexer.rs)
- Breaks text into tokens
- "let x = 5" → [LET, IDENTIFIER("x"), EQUAL, NUMBER(5)]
↓
PARSER (src/parser.rs)
- Builds Abstract Syntax Tree (AST)
- Tokens → Tree structure representing program structure
↓
INTERPRETER (src/interpreter.rs)
- Walks the AST
- Executes statements, evaluates expressions
- Uses Environment for variable storage
↓
Output / Side Effects
Interpreter vs Compiler vs JIT
| Aspect | Interpreter | Compiler | JIT Compiler |
|---|---|---|---|
| Execution | Executes source directly | Produces machine code first | Compiles during runtime |
| Startup | Instant | Compile time required | Slight delay first run |
| Performance | Slower (10-100x) | Fastest | Fast (near-native) |
| Memory | Moderate | Low (no runtime) | Higher (code + data) |
| Errors | Runtime only | Compile-time + runtime | Runtime only |
| Examples | Python, Ruby, PHP | C, C++, Rust, Go | JavaScript (V8), Java (JVM) |
| Best For | Scripting, rapid dev | System software | Web browsers, VMs |
We're building a tree-walking interpreter - the simplest and most educational approach.
Lux Language Design Decisions
1. Dynamically Typed
let x = 5; // x is a number
x = "hello"; // now x is a string - this is OK
Why? Simpler to implement, no type checker needed. Good for scripting.
Alternative: Static typing (like Rust, requires type inference or annotations)
2. Expression-Based
Almost everything is an expression that returns a value:
let x = if true { 5 } else { 10 }; // if returns a value
let y = { let a = 1; let b = 2; a + b }; // block returns last expression
3. First-Class Functions
Functions are values that can be passed around:
fn add(a, b) { return a + b; }
let operation = add; // assign function to variable
let result = operation(2, 3); // call through variable
4. Closures
Functions capture variables from their environment:
fn outer() {
let x = 10;
fn inner() {
return x; // captures x from outer
}
return inner;
}
5. C-Like Syntax with Modern Touches
Familiar to developers coming from JavaScript, C, Java, etc:
// C-style blocks and keywords
if condition {
// ...
} else {
// ...
}
// Modern: no semicolons required (optional)
// Modern: fn keyword (like Rust)
// Modern: let for variables (like JavaScript/Rust)
The Complete Lux Language Specification
Keywords
let - variable declaration
fn - function declaration
if - conditional
else - alternative branch
while - loop
for - iteration loop
return - return from function
true - boolean literal
false - boolean literal
nil - null/none value
print - built-in print statement
and - logical AND
or - logical OR
not - logical NOT
Operators
Arithmetic: + - * / %
Comparison: == != < > <= >=
Logical: and or not
Assignment: =
Grouping: ( )
Data Types
42 // Number (f64 internally)
3.14159
"hello" // String
true, false // Boolean
nil // Nil (null/none)
[1, 2, 3] // List
fn() { } // Function
Grammar Overview
program → statement* EOF ;
statement → exprStmt
| letStmt
| printStmt
| blockStmt
| ifStmt
| whileStmt
| forStmt
| returnStmt
| fnDecl ;
letStmt → "let" IDENTIFIER "=" expression ";" ;
printStmt → "print" expression ";" ;
blockStmt → "{" statement* "}" ;
ifStmt → "if" expression blockStmt ( "else" blockStmt )? ;
whileStmt → "while" expression blockStmt ;
forStmt → "for" IDENTIFIER "in" expression blockStmt ;
returnStmt → "return" expression? ";" ;
fnDecl → "fn" IDENTIFIER "(" parameters? ")" blockStmt ;
exprStmt → expression ";" ;
expression → assignment ;
assignment → IDENTIFIER "=" assignment | logic_or ;
logic_or → logic_and ( "or" logic_and )* ;
logic_and → equality ( "and" equality )* ;
equality → comparison ( ( "==" | "!=" ) comparison )* ;
comparison → term ( ( ">" | ">=" | "<" | "<=" ) term )* ;
term → factor ( ( "+" | "-" ) factor )* ;
factor → unary ( ( "*" | "/" | "%" ) unary )* ;
unary → ( "not" | "-" ) unary | call ;
call → primary ( "(" arguments? ")" | "[" expression "]" )* ;
primary → NUMBER | STRING | "true" | "false" | "nil"
| IDENTIFIER | "(" expression ")"
| "[" arguments? "]"
| "fn" "(" parameters? ")" blockStmt ;
Sample Lux Programs
Hello World
print "Hello, World!";
Variables and Arithmetic
let a = 10;
let b = 20;
let sum = a + b;
print "Sum: " + str(sum);
Functions
fn greet(name) {
return "Hello, " + name + "!";
}
print greet("Alice");
print greet("Bob");
Conditionals
fn max(a, b) {
if a > b {
return a;
} else {
return b;
}
}
print max(10, 5);
Loops
// While loop
let i = 0;
while i < 5 {
print i;
i = i + 1;
}
// For loop
let numbers = [1, 2, 3, 4, 5];
for n in numbers {
print n * n;
}
Recursion
fn factorial(n) {
if n <= 1 {
return 1;
}
return n * factorial(n - 1);
}
print factorial(5); // 120
Closures
fn make_adder(x) {
fn adder(y) {
return x + y;
}
return adder;
}
let add5 = make_adder(5);
print add5(10); // 15
print add5(20); // 25
Higher-Order Functions
fn apply_twice(f, x) {
return f(f(x));
}
fn double(x) {
return x * 2;
}
print apply_twice(double, 5); // 20
Lists
let items = [1, 2, 3];
push(items, 4);
push(items, 5);
for item in items {
print item;
}
print "Length: " + str(len(items));
Project Setup
Let's create our Rust project:
# Create the project
cargo new lux
cd lux
# Our initial project structure
lux/
├── Cargo.toml
└── src/
├── main.rs # Entry point (REPL and file runner)
├── lexer.rs # Tokenization
├── parser.rs # AST construction
├── ast.rs # AST node definitions
├── interpreter.rs # Evaluation engine
└── environment.rs # Variable storage and scoping
Initial Cargo.toml
[package]
name = "lux"
version = "0.1.0"
edition = "2021"
[dependencies]
# We'll add dependencies as needed
Initial src/main.rs
fn main() {
println!("Lux Programming Language");
println!("Coming soon...");
}
Test that it works:
cargo run
You should see:
Lux Programming Language
Coming soon...
Architecture Overview
Module Responsibilities
main.rs
- CLI argument parsing
- REPL loop
- File reading and execution
- Error display
lexer.rs
- Character-by-character scanning
- Token generation
- Position tracking (line/column)
- Comment and whitespace handling
ast.rs
Tokenenum (keyword, operator, literal types)Exprenum (all expression types)Stmtenum (all statement types)- Helper types
parser.rs
- Recursive descent parsing
- Operator precedence handling
- AST construction
- Syntax error detection
interpreter.rs
Valueenum (runtime values)- Expression evaluation
- Statement execution
- Built-in function implementation
environment.rs
- Variable storage
- Scope management
- Variable lookup and assignment
Why These Choices?
Why Rust?
- Memory safety: No crashes from bad pointers
- Performance: Fast even for a tree-walker
- Enums and pattern matching: Perfect for AST representation
- Error handling: Result/Option types for clean error propagation
- Learning: Rust teaches you to think carefully about data ownership
Why Tree-Walking Interpreter?
- Simplest to understand: Direct mapping from AST to execution
- Easy to debug: Can print the tree and trace execution
- Quick to implement: Complete interpreter in ~1000 lines
- Foundation for more: Can optimize later (bytecode, JIT)
Why Dynamic Typing?
- Simpler implementation: No type checker needed
- Faster prototyping: Get interpreter working quickly
- Familiar to many: JavaScript, Python, Ruby all dynamic
- Good for scripting: Flexibility in small programs
Trade-off: Runtime errors vs compile-time safety. We handle this with good error messages.
Development Workflow
As we build Lux, follow this pattern:
- Design: Plan the feature (syntax, semantics)
- Test: Write test cases first
- Implement: Add code incrementally
- Test: Verify it works
- Refine: Clean up and optimize
Common Pitfalls to Avoid
- Forgetting position tracking: Always track line/column for error messages
- Inconsistent precedence: Use a precedence table, test thoroughly
- Scope bugs: Be careful with environment parent chains
- Memory leaks: Use Rc/RefCell appropriately for cycles
- Poor error messages: Always include source location and context
What's Next?
Now that we understand the architecture, let's build the first component: the lexer!
→ Continue to Chapter 2: Lexer
Quick Reference
Project Commands
# Run REPL
cargo run
# Run a file
cargo run -- script.lux
# Run tests
cargo test
# Run with debugging
RUST_LOG=debug cargo run
# Build release version
cargo build --release
./target/release/lux
Key Concepts
- Lexer: Text → Tokens
- Parser: Tokens → AST
- Interpreter: AST → Execution
- Environment: Variable storage with scoping
- Value: Runtime representation of data