Language Design & Architecture

Before writing any code, we need to design our language. What will it look like? What features will it have? How will the interpreter work?

What Makes a Programming Language?

A programming language has three main components:

  1. Syntax: The rules for what code looks like (grammar)
  2. Semantics: The meaning of code (what it does)
  3. Runtime: The environment where code executes

The Interpreter Pipeline

Our Lux interpreter follows this flow:

Source Code (text file or REPL input)
         ↓
    LEXER (src/lexer.rs)
    - Breaks text into tokens
    - "let x = 5" → [LET, IDENTIFIER("x"), EQUAL, NUMBER(5)]
         ↓
   PARSER (src/parser.rs)
    - Builds Abstract Syntax Tree (AST)
    - Tokens → Tree structure representing program structure
         ↓
  INTERPRETER (src/interpreter.rs)
    - Walks the AST
    - Executes statements, evaluates expressions
    - Uses Environment for variable storage
         ↓
    Output / Side Effects

Interpreter vs Compiler vs JIT

AspectInterpreterCompilerJIT Compiler
ExecutionExecutes source directlyProduces machine code firstCompiles during runtime
StartupInstantCompile time requiredSlight delay first run
PerformanceSlower (10-100x)FastestFast (near-native)
MemoryModerateLow (no runtime)Higher (code + data)
ErrorsRuntime onlyCompile-time + runtimeRuntime only
ExamplesPython, Ruby, PHPC, C++, Rust, GoJavaScript (V8), Java (JVM)
Best ForScripting, rapid devSystem softwareWeb browsers, VMs

We're building a tree-walking interpreter - the simplest and most educational approach.

Lux Language Design Decisions

1. Dynamically Typed

let x = 5;        // x is a number
x = "hello";      // now x is a string - this is OK

Why? Simpler to implement, no type checker needed. Good for scripting.

Alternative: Static typing (like Rust, requires type inference or annotations)

2. Expression-Based

Almost everything is an expression that returns a value:

let x = if true { 5 } else { 10 };  // if returns a value
let y = { let a = 1; let b = 2; a + b };  // block returns last expression

3. First-Class Functions

Functions are values that can be passed around:

fn add(a, b) { return a + b; }
let operation = add;  // assign function to variable
let result = operation(2, 3);  // call through variable

4. Closures

Functions capture variables from their environment:

fn outer() {
    let x = 10;
    fn inner() {
        return x;  // captures x from outer
    }
    return inner;
}

5. C-Like Syntax with Modern Touches

Familiar to developers coming from JavaScript, C, Java, etc:

// C-style blocks and keywords
if condition {
    // ...
} else {
    // ...
}

// Modern: no semicolons required (optional)
// Modern: fn keyword (like Rust)
// Modern: let for variables (like JavaScript/Rust)

The Complete Lux Language Specification

Keywords

let      - variable declaration
fn       - function declaration
if       - conditional
else     - alternative branch
while    - loop
for      - iteration loop
return   - return from function
true     - boolean literal
false    - boolean literal
nil      - null/none value
print    - built-in print statement
and      - logical AND
or       - logical OR
not      - logical NOT

Operators

Arithmetic: +  -  *  /  %
Comparison: ==  !=  <  >  <=  >=
Logical:    and  or  not
Assignment: =
Grouping:   ( )

Data Types

42              // Number (f64 internally)
3.14159
"hello"         // String
true, false     // Boolean
nil             // Nil (null/none)
[1, 2, 3]       // List
fn() { }        // Function

Grammar Overview

program        → statement* EOF ;

statement      → exprStmt
               | letStmt
               | printStmt
               | blockStmt
               | ifStmt
               | whileStmt
               | forStmt
               | returnStmt
               | fnDecl ;

letStmt        → "let" IDENTIFIER "=" expression ";" ;
printStmt      → "print" expression ";" ;
blockStmt      → "{" statement* "}" ;
ifStmt         → "if" expression blockStmt ( "else" blockStmt )? ;
whileStmt      → "while" expression blockStmt ;
forStmt        → "for" IDENTIFIER "in" expression blockStmt ;
returnStmt     → "return" expression? ";" ;
fnDecl         → "fn" IDENTIFIER "(" parameters? ")" blockStmt ;
exprStmt       → expression ";" ;

expression     → assignment ;
assignment     → IDENTIFIER "=" assignment | logic_or ;
logic_or       → logic_and ( "or" logic_and )* ;
logic_and      → equality ( "and" equality )* ;
equality       → comparison ( ( "==" | "!=" ) comparison )* ;
comparison     → term ( ( ">" | ">=" | "<" | "<=" ) term )* ;
term           → factor ( ( "+" | "-" ) factor )* ;
factor         → unary ( ( "*" | "/" | "%" ) unary )* ;
unary          → ( "not" | "-" ) unary | call ;
call           → primary ( "(" arguments? ")" | "[" expression "]" )* ;
primary        → NUMBER | STRING | "true" | "false" | "nil"
               | IDENTIFIER | "(" expression ")"
               | "[" arguments? "]"
               | "fn" "(" parameters? ")" blockStmt ;

Sample Lux Programs

Hello World

print "Hello, World!";

Variables and Arithmetic

let a = 10;
let b = 20;
let sum = a + b;
print "Sum: " + str(sum);

Functions

fn greet(name) {
    return "Hello, " + name + "!";
}

print greet("Alice");
print greet("Bob");

Conditionals

fn max(a, b) {
    if a > b {
        return a;
    } else {
        return b;
    }
}

print max(10, 5);

Loops

// While loop
let i = 0;
while i < 5 {
    print i;
    i = i + 1;
}

// For loop
let numbers = [1, 2, 3, 4, 5];
for n in numbers {
    print n * n;
}

Recursion

fn factorial(n) {
    if n <= 1 {
        return 1;
    }
    return n * factorial(n - 1);
}

print factorial(5);  // 120

Closures

fn make_adder(x) {
    fn adder(y) {
        return x + y;
    }
    return adder;
}

let add5 = make_adder(5);
print add5(10);  // 15
print add5(20);  // 25

Higher-Order Functions

fn apply_twice(f, x) {
    return f(f(x));
}

fn double(x) {
    return x * 2;
}

print apply_twice(double, 5);  // 20

Lists

let items = [1, 2, 3];
push(items, 4);
push(items, 5);

for item in items {
    print item;
}

print "Length: " + str(len(items));

Project Setup

Let's create our Rust project:

# Create the project
cargo new lux
cd lux

# Our initial project structure
lux/
├── Cargo.toml
└── src/
    ├── main.rs          # Entry point (REPL and file runner)
    ├── lexer.rs         # Tokenization
    ├── parser.rs        # AST construction
    ├── ast.rs           # AST node definitions
    ├── interpreter.rs   # Evaluation engine
    └── environment.rs   # Variable storage and scoping

Initial Cargo.toml

[package]
name = "lux"
version = "0.1.0"
edition = "2021"

[dependencies]
# We'll add dependencies as needed

Initial src/main.rs

fn main() {
    println!("Lux Programming Language");
    println!("Coming soon...");
}

Test that it works:

cargo run

You should see:

Lux Programming Language
Coming soon...

Architecture Overview

Module Responsibilities

  1. main.rs

    • CLI argument parsing
    • REPL loop
    • File reading and execution
    • Error display
  2. lexer.rs

    • Character-by-character scanning
    • Token generation
    • Position tracking (line/column)
    • Comment and whitespace handling
  3. ast.rs

    • Token enum (keyword, operator, literal types)
    • Expr enum (all expression types)
    • Stmt enum (all statement types)
    • Helper types
  4. parser.rs

    • Recursive descent parsing
    • Operator precedence handling
    • AST construction
    • Syntax error detection
  5. interpreter.rs

    • Value enum (runtime values)
    • Expression evaluation
    • Statement execution
    • Built-in function implementation
  6. environment.rs

    • Variable storage
    • Scope management
    • Variable lookup and assignment

Why These Choices?

Why Rust?

  • Memory safety: No crashes from bad pointers
  • Performance: Fast even for a tree-walker
  • Enums and pattern matching: Perfect for AST representation
  • Error handling: Result/Option types for clean error propagation
  • Learning: Rust teaches you to think carefully about data ownership

Why Tree-Walking Interpreter?

  • Simplest to understand: Direct mapping from AST to execution
  • Easy to debug: Can print the tree and trace execution
  • Quick to implement: Complete interpreter in ~1000 lines
  • Foundation for more: Can optimize later (bytecode, JIT)

Why Dynamic Typing?

  • Simpler implementation: No type checker needed
  • Faster prototyping: Get interpreter working quickly
  • Familiar to many: JavaScript, Python, Ruby all dynamic
  • Good for scripting: Flexibility in small programs

Trade-off: Runtime errors vs compile-time safety. We handle this with good error messages.

Development Workflow

As we build Lux, follow this pattern:

  1. Design: Plan the feature (syntax, semantics)
  2. Test: Write test cases first
  3. Implement: Add code incrementally
  4. Test: Verify it works
  5. Refine: Clean up and optimize

Common Pitfalls to Avoid

  1. Forgetting position tracking: Always track line/column for error messages
  2. Inconsistent precedence: Use a precedence table, test thoroughly
  3. Scope bugs: Be careful with environment parent chains
  4. Memory leaks: Use Rc/RefCell appropriately for cycles
  5. Poor error messages: Always include source location and context

What's Next?

Now that we understand the architecture, let's build the first component: the lexer!

→ Continue to Chapter 2: Lexer

Quick Reference

Project Commands

# Run REPL
cargo run

# Run a file
cargo run -- script.lux

# Run tests
cargo test

# Run with debugging
RUST_LOG=debug cargo run

# Build release version
cargo build --release
./target/release/lux

Key Concepts

  • Lexer: Text → Tokens
  • Parser: Tokens → AST
  • Interpreter: AST → Execution
  • Environment: Variable storage with scoping
  • Value: Runtime representation of data