WAT and Binary: Inside a Wasm Module

This chapter covers Wasm's text format (WAT) and binary structure: the sections, types, imports, exports, and functions that make up every module.

Why Learn WAT

Most of your day-to-day Wasm work is in Rust (or C, or AssemblyScript). You might never write WAT by hand. But reading it pays off:

  • Debugging. When wasm-bindgen does something surprising, the WAT shows you what got generated.
  • Size tuning. Every extra type and import adds bytes. Readable output makes it obvious where to trim.
  • Understanding. Wasm is a specification, not just a file format. Reading WAT is how you learn the spec.

You don't need to write WAT to use Wasm. Twenty minutes reading it makes you a better user.

The Text Format

WAT (WebAssembly Text) uses S-expressions: parenthesized terms, nested freely. Looks Lisp-ish. Think of it as a pretty-printed binary.

The simplest valid module:

(module)

Empty. No functions, no memory, nothing. Compiles to 8 bytes: the magic number and the version.
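To see how little an empty module contains, here is a minimal sketch that builds it by hand and asks the engine whether it is valid. It assumes Node.js (or any host with the standard WebAssembly JS API); the variable names are illustrative.

```javascript
// The binary form of (module): magic "\0asm" followed by version 1 (little-endian).
const emptyModule = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic: \0 a s m
  0x01, 0x00, 0x00, 0x00, // version: 1
]);

// validate() runs the same checks the engine runs before compiling.
const emptyIsValid = WebAssembly.validate(emptyModule);

// An empty module needs no imports and exposes no exports.
const emptyInstance = new WebAssembly.Instance(new WebAssembly.Module(emptyModule));
const emptyExportNames = Object.keys(emptyInstance.exports);

console.log(emptyModule.length, emptyIsValid, emptyExportNames);
```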

A module with one function:

(module
  (func $add (param i32 i32) (result i32)
    local.get 0
    local.get 1
    i32.add)
  (export "add" (func $add)))

Assemble it to binary:

wat2wasm hello.wat -o hello.wasm

Run wasm2wat hello.wasm and you get back something equivalent.
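If you don't have the toolchain installed yet, you can still run the module. The sketch below spells out the bytes by hand, section by section. This is a hand transcription of the add module, not captured assembler output, so treat the exact byte layout as an assumption; a real hello.wasm may also carry a name section. It assumes Node.js.

```javascript
// Hand-assembled binary for the add module (no debug names).
const addModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function: one function, using type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, // code: one body, 7 bytes long, no extra locals
  0x20, 0x00, // local.get 0
  0x20, 0x01, // local.get 1
  0x6a, // i32.add
  0x0b, // end
]);

// Synchronous compile + instantiate is fine for tiny modules in Node.js.
const addInstance = new WebAssembly.Instance(new WebAssembly.Module(addModuleBytes));
const sum = addInstance.exports.add(40, 2);

console.log(addModuleBytes.length, sum); // 41 42
```

In normal use you would read hello.wasm from disk and pass those bytes to the same API.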

Module Sections

A Wasm module has a fixed set of optional sections, in a fixed order:

type        Function type signatures.
import      Functions, memories, globals, tables imported from the host.
function    Type index of each locally-defined function.
table       Tables (arrays of function references, mainly).
memory      Memory declaration (size in pages).
global      Module-level global variables.
export      Names for functions, memories, globals, tables the host can access.
start       Index of a function to run at instantiation.
element     Initializers for tables.
code        Actual function bodies.
data        Initializers for memory.

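Every .wasm file starts with a 4-byte magic header (\0asm) and a 4-byte version (\1\0\0\0), followed by some subset of those sections in that order. Each section is an ID byte, a LEB128-encoded size, and a payload. Custom sections (ID 0, for example the name section that holds debug names) can appear anywhere between them, and the bulk-memory proposal adds a data count section just before code.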
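The layout is simple enough to walk by hand. The following sketch checks the header and then hops from section to section, reading each ID and size. listSections and readU32Leb are hypothetical helper names, and the sample bytes are a hand-assembled add module, not tool output.

```javascript
// Section IDs as assigned by the binary format (12 comes from bulk memory).
const SECTION_NAMES = [
  "custom", "type", "import", "function", "table", "memory", "global",
  "export", "start", "element", "code", "data", "data count",
];

// Decode an unsigned LEB128 integer starting at `pos`.
function readU32Leb(bytes, pos) {
  let value = 0;
  let shift = 0;
  let byte;
  do {
    byte = bytes[pos++];
    value |= (byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  return { value: value >>> 0, next: pos };
}

// Return the name and payload size of every section, in file order.
function listSections(bytes) {
  const magic = String.fromCharCode(...bytes.slice(0, 4));
  if (magic !== "\0asm") throw new Error("not a Wasm binary");
  const sections = [];
  let pos = 8; // skip magic + version
  while (pos < bytes.length) {
    const id = bytes[pos];
    const { value: size, next } = readU32Leb(bytes, pos + 1);
    sections.push({ name: SECTION_NAMES[id] || `unknown(${id})`, size });
    pos = next + size; // jump over the payload to the next section
  }
  return sections;
}

// Hand-assembled add module: type, function, export, code.
const sampleModule = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,
  0x03, 0x02, 0x01, 0x00,
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,
]);

const sectionList = listSections(sampleModule);
console.log(sectionList.map((s) => `${s.name}(${s.size})`).join(" "));
// type(7) function(2) export(7) code(9)
```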

The Value Types

Wasm is strictly typed. Four primitive value types from the start:

i32    32-bit integer, signed or unsigned depending on operation
i64    64-bit integer
f32    32-bit float
f64    64-bit float

Later proposals added:

v128         128-bit SIMD vector (Wasm SIMD)
funcref      reference to a function
externref    reference to a host object

Most day-to-day Wasm deals with i32 and i64. Pointers in linear memory are i32 (or i64 under the memory64 proposal).
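"Signed or unsigned depending on operation" is easiest to see with division. This sketch hand-assembles a module exporting i32.div_s and i32.div_u over the same bits. The export names div_s and div_u, and the str/section helpers, are made up for this illustration; the helpers assume payloads shorter than 128 bytes.

```javascript
// Helpers: length-prefixed ASCII name, and [id, size, payload] section.
const str = (s) => [s.length, ...Array.from(s, (c) => c.charCodeAt(0))];
const section = (id, payload) => [id, payload.length, ...payload];
const I32 = 0x7f;

// Body: local.get 0, local.get 1, <opcode>, end (no extra locals).
const divBody = (opcode) => {
  const code = [0x00, 0x20, 0x00, 0x20, 0x01, opcode, 0x0b];
  return [code.length, ...code];
};

const divModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  ...section(1, [0x01, 0x60, 0x02, I32, I32, 0x01, I32]), // type 0: (i32, i32) -> i32
  ...section(3, [0x02, 0x00, 0x00]), // two functions, both type 0
  ...section(7, [0x02, ...str("div_s"), 0x00, 0x00, ...str("div_u"), 0x00, 0x01]),
  ...section(10, [0x02, ...divBody(0x6d), ...divBody(0x6e)]), // i32.div_s, i32.div_u
]);

const divExports = new WebAssembly.Instance(new WebAssembly.Module(divModuleBytes)).exports;

// Same input bits (0xFFFFFFF8), two interpretations.
const signedQuotient = divExports.div_s(-8, 2); // -8 / 2
const unsignedQuotient = divExports.div_u(-8, 2); // 4294967288 / 2

console.log(signedQuotient, unsignedQuotient); // -4 2147483644
```

The value itself carries no sign; the instruction decides how to read it.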

Function Signatures

A function signature is (param types...) (result types...):

(type $sig (func (param i32 i32) (result i32)))

A function uses that signature:

(func $add (type $sig)
  local.get 0
  local.get 1
  i32.add)

Or inline:

(func $add (param i32 i32) (result i32)
  local.get 0
  local.get 1
  i32.add)

Both forms are equivalent. In the binary, every function refers to its signature by type index, so identical signatures share one entry in the type section.

The Stack Machine

Wasm is stack-based. Instructions push and pop operands, and the number and types of values on the stack are known statically at every instruction.

i32.const 40
i32.const 2
i32.add        ;; stack: [42]

At every point, the types on the stack (though not the values) are known before the code runs, which makes Wasm validation fast and simple.

Compare to a register machine like x86, where you'd write:

mov eax, 40
add eax, 2

Stack-based code is denser in the binary and easier to validate. Runtimes translate it to register instructions during JIT compilation.
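To make "easy to validate" concrete, here is a toy type checker for straight-line code. It is a simplified sketch, not the specification's algorithm: it covers four instructions, no locals, and no control flow. SIGNATURES and checkTypes are hypothetical names.

```javascript
// For each instruction: the operand types it pops and the result types it pushes.
const SIGNATURES = {
  "i32.const": { pops: [], pushes: ["i32"] },
  "i64.const": { pops: [], pushes: ["i64"] },
  "i32.add": { pops: ["i32", "i32"], pushes: ["i32"] },
  "i64.add": { pops: ["i64", "i64"], pushes: ["i64"] },
};

// Walk the instructions once, tracking only types, never values.
function checkTypes(instructions) {
  const stack = [];
  for (const name of instructions) {
    const sig = SIGNATURES[name];
    if (!sig) throw new Error(`unknown instruction: ${name}`);
    // Operands are popped last-pushed first.
    for (const expected of [...sig.pops].reverse()) {
      const actual = stack.pop();
      if (actual !== expected) {
        throw new Error(`${name}: expected ${expected}, found ${actual || "empty stack"}`);
      }
    }
    stack.push(...sig.pushes);
  }
  return stack; // the types left on the stack
}

const okStack = checkTypes(["i32.const", "i32.const", "i32.add"]);

let mismatch = null;
try {
  checkTypes(["i32.const", "i64.const", "i32.add"]);
} catch (e) {
  mismatch = e.message;
}

console.log(okStack, mismatch);
// [ 'i32' ] i32.add: expected i32, found i64
```

One pass, no execution: that is why validation runs in linear time.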

Locals

Function parameters become locals indexed from 0. Additional locals are declared with local; they are numbered right after the parameters and start out as zero:

(func $swap_sum (param i32 i32) (result i32)
  (local $tmp i32)
  local.get 0
  local.set $tmp
  local.get 1
  local.get $tmp
  i32.add)
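In the binary, $tmp has no name, only an index: the two parameters are locals 0 and 1, so $tmp is local 2. The sketch below hand-assembles swap_sum that way and calls it. The bytes are a hand transcription, and the str/section helpers are illustrative and assume payloads under 128 bytes.

```javascript
const str = (s) => [s.length, ...Array.from(s, (c) => c.charCodeAt(0))];
const section = (id, payload) => [id, payload.length, ...payload];

const swapSumBody = [
  0x01, 0x01, 0x7f, // one group of extra locals: 1 x i32 (this is $tmp, index 2)
  0x20, 0x00, // local.get 0
  0x21, 0x02, // local.set 2   ($tmp)
  0x20, 0x01, // local.get 1
  0x20, 0x02, // local.get 2   ($tmp)
  0x6a, // i32.add
  0x0b, // end
];

const swapSumBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  ...section(1, [0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f]), // (i32, i32) -> i32
  ...section(3, [0x01, 0x00]), // one function, type 0
  ...section(7, [0x01, ...str("swap_sum"), 0x00, 0x00]), // export function 0
  ...section(10, [0x01, swapSumBody.length, ...swapSumBody]),
]);

const swapSum = new WebAssembly.Instance(new WebAssembly.Module(swapSumBytes)).exports.swap_sum;
const swapSumResult = swapSum(3, 4);

console.log(swapSumResult); // 7
```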

Memory

A module can declare one linear memory (up to 4 GB, more with memory64):

(memory (export "memory") 1 10)

1 10 means "start with 1 page, grow to at most 10 pages". A page is 64 KiB (65,536 bytes).
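The same limits can be tried from the host. This sketch creates a standalone memory with the JS API using the same 1 and 10 page limits and grows it; it assumes Node.js or a browser, and the variable names are illustrative.

```javascript
const PAGE_SIZE = 64 * 1024; // 65,536 bytes

// Equivalent limits to (memory 1 10): 1 page now, 10 pages at most.
const pagedMemory = new WebAssembly.Memory({ initial: 1, maximum: 10 });
const initialBytes = pagedMemory.buffer.byteLength;

// grow() takes a page count and returns the previous size in pages.
const previousPages = pagedMemory.grow(2);
const grownBytes = pagedMemory.buffer.byteLength;

// Growing past the maximum fails: 3 + 8 pages would exceed 10.
let growError = null;
try {
  pagedMemory.grow(8);
} catch (e) {
  growError = e;
}

console.log(initialBytes / PAGE_SIZE, previousPages, grownBytes / PAGE_SIZE, growError instanceof RangeError);
// 1 1 3 true
```

Note that growing replaces memory.buffer, so any typed-array view created before the grow must be recreated.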

Access is via i32.load, i32.store, and their typed variants:

i32.const 0          ;; address
i32.const 42         ;; value
i32.store            ;; store 42 at address 0

i32.const 0
i32.load             ;; load i32 from address 0, stack: [42]
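From the host, the same bytes are visible through memory.buffer. The sketch below mimics the store and load above with a DataView, which is a host-side stand-in, not the Wasm instructions themselves. Wasm memory is little-endian, hence the true flag.

```javascript
const scratchMemory = new WebAssembly.Memory({ initial: 1 });
const dataView = new DataView(scratchMemory.buffer);

// Host-side equivalent of: i32.const 0, i32.const 42, i32.store
dataView.setInt32(0, 42, true); // true = little-endian, matching Wasm

// Host-side equivalent of: i32.const 0, i32.load
const loadedValue = dataView.getInt32(0, true);

// The four bytes as they sit in linear memory, lowest address first.
const rawBytes = Array.from(new Uint8Array(scratchMemory.buffer, 0, 4));

console.log(loadedValue, rawBytes); // 42 [ 42, 0, 0, 0 ]
```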

Chapter 3 covers memory properly.

Imports

A module declares what it needs from the host:

(module
  (import "env" "console_log" (func $log (param i32 i32)))
  (memory 1)
  (export "memory" (memory 0))
  (func (export "greet")
    i32.const 0    ;; ptr
    i32.const 5    ;; len
    call $log)
  (data (i32.const 0) "hello"))

The JS side provides env.console_log at instantiation:

const { instance } = await WebAssembly.instantiate(bytes, {
  env: {
    // Runs only when greet() is called, by which point `instance` is set.
    console_log: (ptr, len) => {
      const view = new Uint8Array(instance.exports.memory.buffer, ptr, len);
      console.log(new TextDecoder().decode(view));
    },
  },
});
instance.exports.greet();

Chapter 5 covers this interop fully.

Exports

Symmetrically, exports expose things to the host:

(export "add" (func $add))
(export "memory" (memory 0))

Anything not exported is internal to the module.

Tables and Function References

Tables hold references, typically function references used for indirect calls (how Wasm implements virtual dispatch and function pointers).

(table 2 funcref)
(elem (i32.const 0) $add $sub)

(func (export "call_indirect") (param i32) (result i32)
  i32.const 5
  i32.const 3
  local.get 0         ;; index into the table
  call_indirect (type $sig))

Useful for dynamic dispatch; more relevant once you're generating Wasm yourself. For Rust, the compiler handles it.
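Here is the table snippet as a runnable sketch: a hand-assembled module with $add and $sub in a two-entry table and the exported dispatcher. The text never defines $sub, so this assumes it is i32.sub with the same signature. The helper names are illustrative and the bytes are a hand transcription.

```javascript
const str = (s) => [s.length, ...Array.from(s, (c) => c.charCodeAt(0))];
const section = (id, payload) => [id, payload.length, ...payload];
const body = (code) => [code.length, ...code];

const tableModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  // types: 0 = (i32, i32) -> i32 ($sig), 1 = (i32) -> i32
  ...section(1, [0x02, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, 0x60, 0x01, 0x7f, 0x01, 0x7f]),
  // functions: 0 = $add, 1 = $sub (both type 0), 2 = dispatcher (type 1)
  ...section(3, [0x03, 0x00, 0x00, 0x01]),
  // table: funcref, no maximum, 2 entries
  ...section(4, [0x01, 0x70, 0x00, 0x02]),
  // export function 2 as "call_indirect"
  ...section(7, [0x01, ...str("call_indirect"), 0x00, 0x02]),
  // element: active segment at offset 0 of table 0 -> [function 0, function 1]
  ...section(9, [0x01, 0x00, 0x41, 0x00, 0x0b, 0x02, 0x00, 0x01]),
  ...section(10, [
    0x03,
    ...body([0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b]), // $add: i32.add
    ...body([0x00, 0x20, 0x00, 0x20, 0x01, 0x6b, 0x0b]), // $sub: i32.sub
    // i32.const 5, i32.const 3, local.get 0, call_indirect (type 0) on table 0
    ...body([0x00, 0x41, 0x05, 0x41, 0x03, 0x20, 0x00, 0x11, 0x00, 0x00, 0x0b]),
  ]),
]);

const dispatch = new WebAssembly.Instance(new WebAssembly.Module(tableModuleBytes)).exports.call_indirect;

const viaAdd = dispatch(0); // table[0] = $add -> 5 + 3
const viaSub = dispatch(1); // table[1] = $sub -> 5 - 3

// Index 2 is past the end of the table, so the call traps.
let dispatchTrap = null;
try {
  dispatch(2);
} catch (e) {
  dispatchTrap = e;
}

console.log(viaAdd, viaSub, dispatchTrap instanceof WebAssembly.RuntimeError); // 8 2 true
```

The signature check in call_indirect also happens at run time: a table entry whose type differs from the one named in the instruction traps instead of being called.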

A Complete Example

A module with imports, exports, memory, a start function, and data:

(module
  ;; Import a log function from the host
  (import "env" "log" (func $log (param i32 i32)))

  ;; One page of memory, exported for the host
  (memory (export "memory") 1)

  ;; Initial data: "hi!\n" at offset 0
  (data (i32.const 0) "hi!\n")

  ;; Exported function: log "hi!\n"
  (func $greet (export "greet")
    i32.const 0        ;; pointer
    i32.const 4        ;; length
    call $log)

  ;; Auto-run at startup
  (start $greet))

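Assemble, inspect, run (via a host that provides env.log). It's a full Wasm program in under 20 lines. One caveat: the start function runs during instantiation, so in a JS host env.log is first called before instantiation has returned the instance, and the host can't reach instance.exports.memory from inside that first call.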
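A possible JS host for this module is sketched below. Because the start function fires while the instance is still being created, this host only records the (pointer, length) pairs during the call and decodes them afterwards. That deferral is a design choice of this sketch, not the only option. The module bytes are a hand transcription of the WAT above, and the helper names are illustrative.

```javascript
const str = (s) => [s.length, ...Array.from(s, (c) => c.charCodeAt(0))];
const section = (id, payload) => [id, payload.length, ...payload];
const body = (code) => [code.length, ...code];

const greetModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  // types: 0 = (i32, i32) -> (), 1 = () -> ()
  ...section(1, [0x02, 0x60, 0x02, 0x7f, 0x7f, 0x00, 0x60, 0x00, 0x00]),
  // import env.log as function 0, type 0
  ...section(2, [0x01, ...str("env"), ...str("log"), 0x00, 0x00]),
  // one local function ($greet, function index 1), type 1
  ...section(3, [0x01, 0x01]),
  // memory: no maximum, 1 page
  ...section(5, [0x01, 0x00, 0x01]),
  // exports: "memory" = memory 0, "greet" = function 1
  ...section(7, [0x02, ...str("memory"), 0x02, 0x00, ...str("greet"), 0x00, 0x01]),
  // start: function 1
  ...section(8, [0x01]),
  // $greet: i32.const 0, i32.const 4, call 0
  ...section(10, [0x01, ...body([0x00, 0x41, 0x00, 0x41, 0x04, 0x10, 0x00, 0x0b])]),
  // data: active segment in memory 0 at offset 0, "hi!\n"
  ...section(11, [0x01, 0x00, 0x41, 0x00, 0x0b, ...str("hi!\n")]),
]);

// Record calls now, decode later: exports are not reachable during start.
const logCalls = [];
const greetInstance = new WebAssembly.Instance(new WebAssembly.Module(greetModuleBytes), {
  env: { log: (ptr, len) => { logCalls.push([ptr, len]); } },
});

const callsDuringStart = logCalls.length; // start already ran $greet once
greetInstance.exports.greet(); // second call, this time from the host

const greetMemory = new Uint8Array(greetInstance.exports.memory.buffer);
const messages = logCalls.map(([ptr, len]) =>
  new TextDecoder().decode(greetMemory.subarray(ptr, ptr + len)));

console.log(callsDuringStart, messages); // 1 [ 'hi!\n', 'hi!\n' ]
```

If the host must read memory inside the very first call, one alternative is to create the memory in JS and have the module import it rather than export it.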

Binary vs Text

The text is for humans. The binary is what runtimes consume.

wat2wasm hello.wat -o hello.wasm     # text to binary
wasm2wat hello.wasm -o hello.wat     # binary to text

Binary is compact: without debug names, hello.wat from earlier assembles to about 40 bytes, and the complete example to under 100. Text is readable. Runtimes work with binary; you read text.

Validation

Before a module runs, the runtime validates it. Validation is fast (linear time, type-directed) and catches:

  • Type mismatches on the stack.
  • Memory instructions in a module that declares no memory, or with an alignment hint larger than the access size.
  • Calls whose operands don't match the callee's signature.
  • References to non-existent functions/types.

Validation does not check addresses. Out-of-bounds memory accesses are caught at run time, where they trap.

This is the "Wasm is safe to run" story. Validation plus those run-time bounds checks mean a module can't touch host memory outside its own linear memory. The runtime can trust the module's types.
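Two tiny hand-assembled modules show what validation does and does not reject. The first calls i32.add with one operand and fails validation. The second loads from an address one page past the end of a one-page memory; in this sketch it validates and then traps when it runs. The export name peek and the helper names are made up for the illustration.

```javascript
const str = (s) => [s.length, ...Array.from(s, (c) => c.charCodeAt(0))];
const section = (id, payload) => [id, payload.length, ...payload];
const body = (code) => [code.length, ...code];
const header = [0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00];

// (func (param i32) (result i32) local.get 0 i32.add): one operand short.
const badTypesBytes = new Uint8Array([
  ...header,
  ...section(1, [0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f]), // (i32) -> i32
  ...section(3, [0x01, 0x00]),
  ...section(10, [0x01, ...body([0x00, 0x20, 0x00, 0x6a, 0x0b])]),
]);

// (func (export "peek") (result i32) i32.const 65536 i32.load) with (memory 1).
const oobLoadBytes = new Uint8Array([
  ...header,
  ...section(1, [0x01, 0x60, 0x00, 0x01, 0x7f]), // () -> i32
  ...section(3, [0x01, 0x00]),
  ...section(5, [0x01, 0x00, 0x01]), // one page, no maximum
  ...section(7, [0x01, ...str("peek"), 0x00, 0x00]),
  // i32.const 65536 (LEB128: 80 80 04), i32.load align=2 offset=0, end
  ...section(10, [0x01, ...body([0x00, 0x41, 0x80, 0x80, 0x04, 0x28, 0x02, 0x00, 0x0b])]),
]);

const badTypesValid = WebAssembly.validate(badTypesBytes);
const oobLoadValid = WebAssembly.validate(oobLoadBytes);

// The bounds check happens when the load executes.
let loadTrap = null;
try {
  new WebAssembly.Instance(new WebAssembly.Module(oobLoadBytes)).exports.peek();
} catch (e) {
  loadTrap = e;
}

console.log(badTypesValid, oobLoadValid, loadTrap instanceof WebAssembly.RuntimeError);
// false true true
```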

Common Pitfalls

Expecting WAT to be a full programming language. It isn't. No macros, no high-level constructs, no loops you'd want to hand-write. Use it to read, not to author.

Confusing function indices with names. Names are optional annotations (debug info). Indices are the real references, and imported functions are numbered before locally defined ones. call 0 calls function 0 regardless of what you named it.

Forgetting crate-type = ["cdylib"] in Rust. Without it, you get a Rust library (.rlib), not a Wasm module.

Reading stripped WAT and getting confused. Production builds strip debug names. wasm2wat on a stripped module shows numeric references everywhere. wasm-objdump -d can help reconstruct structure.

Thinking stack-based means slow. Runtimes translate stack code to register code when they compile it, whether JIT or ahead of time. The stack model is for compact encoding and easy validation; compiled code runs at near-native speed.

Next Steps

Continue to 03-linear-memory.md to move data across the host boundary.