MLIR for Lox: Part 5 — Closures — When a Variable Outlives Its Stack Frame

You’ve built a garbage collector that tracks roots on the stack and frees everything unreachable from them. It works until a function returns another function that references a local variable. That variable should be dead (the stack frame is gone), but it isn’t (the returned closure still uses it). This is the closure problem, and it’s one of the trickier problems a garbage-collected runtime has to solve.

The Closure Problem

Consider this Lox code:

fun makeCounter() {
    var count = 0;
    
    fun counter() {
        count = count + 1;  // 'count' is from makeCounter's scope!
        return count;
    }
    
    return counter;
}

var c = makeCounter();  // makeCounter returns, but 'count' must live on!
print c();  // 1
print c();  // 2
print c();  // 3

The problem:

1. makeCounter() returns.
2. Its stack frame is destroyed.
3. But count must still exist, because counter captures it!
4. So where does count live?

Stack vs Heap

┌─────────────────────────────────────────────────────────────┐
│ WRONG: count on the stack                                   │
│                                                             │
│   makeCounter() called                                      │
│   ┌─────────────────────┐                                   │
│   │ count = 0           │  ← on the stack                   │
│   │ return counter      │                                   │
│   └─────────────────────┘                                   │
│          ↓                                                  │
│   makeCounter() returns                                     │
│   ┌─────────────────────┐                                   │
│   │ (freed!)            │  ← count is gone!                 │
│   └─────────────────────┘                                   │
│          ↓                                                  │
│   c() is called                                             │
│   counter tries to access count... CRASH!                   │
│                                                             │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│ RIGHT: count on the heap (in a closure environment)         │
│                                                             │
│   makeCounter() called                                      │
│   ┌─────────────────────┐      ┌─────────────────────┐      │
│   │ env = alloc()       │──────│ count = 0           │      │
│   │ return counter      │      │ (on the heap!)      │      │
│   └─────────────────────┘      └─────────────────────┘      │
│          ↓                           ↑                      │
│   makeCounter() returns              │                      │
│   (stack frame freed)                │                      │
│          ↓                           │                      │
│   c() is called ─────────────────────┘                      │
│   counter accesses count via env pointer                    │
│   count is still alive!                                     │
│                                                             │
└─────────────────────────────────────────────────────────────┘
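For comparison, the same idea can be written directly in Rust. This is only an analogy and not part of the Lox runtime: make_counter is a hypothetical name, and Rc<Cell<f64>> stands in for the heap-allocated environment slot that holds count.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Rust analogue of Lox's makeCounter (illustrative only).
fn make_counter() -> impl FnMut() -> f64 {
    // `count` lives in a reference-counted heap cell rather than in
    // make_counter's stack frame, so it survives the return below.
    let count = Rc::new(Cell::new(0.0_f64));

    // The closure takes ownership of the Rc handle. That handle plays the
    // role of the environment pointer stored in a Lox closure object.
    move || {
        count.set(count.get() + 1.0);
        count.get()
    }
}
```

Each call to make_counter allocates a fresh cell, so two counters created this way do not share their count.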

Closure Environments

A closure environment is a heap-allocated structure that holds captured variables.

Structure

Closure Environment:
┌────────────────────────────────────┐
│ Header (ObjHeader)                 │
│   marked: bool                     │
│   obj_type: ObjType::Environment   │
│   size: ...                        │
├────────────────────────────────────┤
│ Data                               │
│   enclosing: *mut Env (or null)    │  ← for nested closures
│   slot_count: usize                │  ← number of slots
│   slot[0]: value                   │
│   slot[1]: value                   │
│   ...                              │
└────────────────────────────────────┘

Closure Object

Closure Object:
┌────────────────────────────────────┐
│ Header (ObjHeader)                 │
│   marked: bool                     │
│   obj_type: ObjType::Closure       │
│   size: ...                        │
├────────────────────────────────────┤
│ Data                               │
│   function_index: usize            │  ← index into the function table
│   environment: *mut Env            │  ← captured variables
└────────────────────────────────────┘

Implementing Environments

Let’s add environment support to our runtime:

#![allow(unused)]
fn main() {
// src/runtime/object.rs (extended)

#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ObjType {
    Number = 0,
    String = 1,
    Environment = 2,   // NEW: closure environment
    Closure = 3,       // shifted from 2 → 3 (was a forward declaration in Part 2)
    Instance = 4,      // shifted from 3 → 4
}
}

Note the renumbering. Part 2 defined the enum with Closure = 2 and Instance = 3 as forward declarations (the Closure variant existed but wasn’t used until this part). Now we’re inserting Environment = 2 before Closure, which shifts Closure to 3 and Instance to 4. Explicit discriminants don’t prevent this shift, but they put it in plain sight: with implicit numbering, inserting a variant in the middle would renumber everything after it without a single visible change, and any code that interprets the raw obj_type byte would silently misdispatch in the GC. The explicit values make the contract between the enum and the heap header’s obj_type byte visible.
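One way to make that contract checkable is to decode the raw tag byte through a single function. The sketch below repeats the enum so it is self-contained; the TryFrom impl is a hypothetical helper, not something the runtime in this series is known to define.

```rust
use std::convert::TryFrom;

/// Mirror of the ObjType enum above, repeated so the sketch compiles alone.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ObjType {
    Number = 0,
    String = 1,
    Environment = 2,
    Closure = 3,
    Instance = 4,
}

/// Hypothetical helper: decode the tag byte stored in a heap header.
impl TryFrom<u8> for ObjType {
    type Error = u8;

    fn try_from(tag: u8) -> Result<Self, Self::Error> {
        match tag {
            0 => Ok(ObjType::Number),
            1 => Ok(ObjType::String),
            2 => Ok(ObjType::Environment),
            3 => Ok(ObjType::Closure),
            4 => Ok(ObjType::Instance),
            other => Err(other), // unknown tag: report it instead of guessing
        }
    }
}
```

With a decoder like this, a stale byte written under the old numbering at least has one place where it can be caught, instead of being reinterpreted wherever the header is read.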

#![allow(unused)]
fn main() {
/// An environment (holds captured variables)
#[repr(C)]
pub struct Environment {
    /// Pointer to enclosing environment (for nested closures)
    /// null = no enclosing environment (top-level)
    pub enclosing: *mut Environment,
    
    /// Number of variable slots
    pub slot_count: usize,
    
    /// Variable slots (flexible array member)
    /// The `[0]` is Rust's approximation of a C flexible array member —
    /// `alloc_environment` allocates extra space for the slots beyond the
    /// struct size, and `get`/`set` access them via pointer arithmetic
    /// on `as_ptr().add(index)`.
    pub slots: [*mut u8; 0],
}

impl Environment {
    /// Get a slot value
    pub fn get(&self, index: usize) -> *mut u8 {
        assert!(index < self.slot_count);
        unsafe { *self.slots.as_ptr().add(index) }
    }
    
    /// Set a slot value
    pub fn set(&mut self, index: usize, value: *mut u8) {
        assert!(index < self.slot_count);
        unsafe { *self.slots.as_mut_ptr().add(index) = value; }
    }
}

/// A closure (function + environment)
#[repr(C)]
pub struct Closure {
    /// Index of the function's code in our function table (not a raw pointer)
    pub function_index: usize,
    
    /// Pointer to the captured environment
    pub environment: *mut Environment,
}
}
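The flexible-array trick is easier to see without the GC in the way. The following sketch allocates a header-less stand-in for Environment with std::alloc and reaches the slots through pointer arithmetic. Env, env_layout, env_new, env_slot, and env_free are hypothetical names for this illustration; the real runtime goes through its own alloc function.

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};
use std::ptr;

/// Stand-in for the Environment struct above (no GC header in this sketch).
#[repr(C)]
struct Env {
    enclosing: *mut Env,
    slot_count: usize,
    slots: [*mut u8; 0], // zero-sized marker for where the slots begin
}

/// Layout of an Env followed by `slot_count` pointer-sized slots.
fn env_layout(slot_count: usize) -> Layout {
    let slots = Layout::array::<*mut u8>(slot_count).expect("slot count too large");
    let (layout, _slots_offset) = Layout::new::<Env>().extend(slots).expect("layout overflow");
    layout.pad_to_align()
}

/// Allocate a zeroed Env, so `enclosing` and every slot start out null.
fn env_new(slot_count: usize) -> *mut Env {
    unsafe {
        let env = alloc_zeroed(env_layout(slot_count)) as *mut Env;
        assert!(!env.is_null(), "allocation failed");
        (*env).slot_count = slot_count;
        env
    }
}

/// Address of slot `index`, derived from the pointer to the whole allocation.
unsafe fn env_slot(env: *mut Env, index: usize) -> *mut *mut u8 {
    unsafe {
        assert!(index < (*env).slot_count);
        (ptr::addr_of_mut!((*env).slots) as *mut *mut u8).add(index)
    }
}

/// Free an Env created by `env_new`.
unsafe fn env_free(env: *mut Env) {
    unsafe { dealloc(env as *mut u8, env_layout((*env).slot_count)) }
}
```

The sketch derives slot addresses from the raw allocation pointer rather than from a reference to the zero-length array, which keeps the pointer valid for the whole allocation and not just for the struct.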

Why do the slots store *mut u8 but the MLIR code loads f64? The Rust runtime stores *mut u8 pointers in each slot: pointers to GC-tracked heap objects. In the numbers-only model, those heap objects are Number values whose data area contains an f64. On the read side, lox_runtime_env_get returns a !llvm.ptr pointing to the Number’s data area, and llvm.load reads the f64 from there. On the write side, lox_runtime_env_set_number takes an f64, boxes it into a heap-allocated Number (the runtime handles this internally), and stores the pointer in the slot. We use env_set_number instead of the generic env_set (which takes void*) because passing an f64 where the callee expects a void* violates the C calling convention: on x86-64 System V, a double is passed in an XMM register while a void* is passed in a general-purpose register, so the callee would read garbage. In the tagged-union model (Part 7+), each slot would hold a (i8, i64) pair and the generic env_set with void* would be used instead.
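A rough sketch of the box-on-write, load-on-read pattern described above. DemoEnv, demo_env_set_number, and demo_env_get are hypothetical names; the real entry points are exported with a C ABI and allocate through the GC, while this version uses Box and frees by hand.

```rust
/// Simplified environment: each slot is a raw pointer to a boxed number.
pub struct DemoEnv {
    slots: Vec<*mut f64>,
}

impl DemoEnv {
    pub fn new(slot_count: usize) -> Self {
        DemoEnv { slots: vec![std::ptr::null_mut(); slot_count] }
    }
}

/// Typed setter: takes the f64 by value, boxes it, and stores the pointer.
pub fn demo_env_set_number(env: &mut DemoEnv, index: usize, value: f64) {
    let fresh = Box::into_raw(Box::new(value));
    let old = std::mem::replace(&mut env.slots[index], fresh);
    if !old.is_null() {
        // No GC in this sketch, so release the previous box explicitly.
        unsafe { drop(Box::from_raw(old)) };
    }
}

/// Getter: returns a pointer to the boxed number; the caller loads the f64.
pub fn demo_env_get(env: &DemoEnv, index: usize) -> *mut f64 {
    env.slots[index]
}

impl Drop for DemoEnv {
    fn drop(&mut self) {
        for &slot in &self.slots {
            if !slot.is_null() {
                unsafe { drop(Box::from_raw(slot)) };
            }
        }
    }
}
```

Note that every write allocates a new box in this scheme. That matches the description of env_set_number above, and it is the cost the tagged-union model removes later.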

Allocating an Environment

#![allow(unused)]
fn main() {
// src/runtime/gc.rs (extended)

/// Allocate a new environment with the given slot count
pub fn alloc_environment(slot_count: usize, enclosing: *mut Environment) -> *mut Environment {
    // Calculate size: header + environment struct + slots
    let env_data_size = std::mem::size_of::<Environment>() 
                      + slot_count * std::mem::size_of::<*mut u8>();
    
    let data_ptr = alloc(env_data_size, ObjType::Environment);
    let env = data_ptr as *mut Environment;
    
    unsafe {
        (*env).enclosing = enclosing;
        (*env).slot_count = slot_count;
        
        // Initialize all slots to null
        for i in 0..slot_count {
            (*env).set(i, std::ptr::null_mut());
        }
    }
    
    env
}

/// Allocate a new closure
pub fn alloc_closure(function_index: usize, environment: *mut Environment) -> *mut Closure {
    let data_ptr = alloc(std::mem::size_of::<Closure>(), ObjType::Closure);
    let closure = data_ptr as *mut Closure;
    
    unsafe {
        (*closure).function_index = function_index;
        (*closure).environment = environment;
    }
    
    closure
}
}

Marking Environments

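When we mark a closure, we must also mark its environment, and when we mark an environment, we must mark its enclosing environment and every non-null slot. This is the same mark_references function from Part 3, here with the Environment and Closure arms filled in: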

#![allow(unused)]
fn main() {
// src/runtime/gc.rs (extended)

fn mark_references(header: *mut ObjHeader) {
    let obj_type = unsafe { (*header).obj_type };
    let data = unsafe { (header as *mut u8).add(std::mem::size_of::<ObjHeader>()) };
    
    match obj_type {
        ObjType::Number | ObjType::String => {
            // No references
        }
        
        ObjType::Environment => {
            let env = data as *mut Environment;
            unsafe {
                // Mark enclosing environment
                if !(*env).enclosing.is_null() {
                    mark_object((*env).enclosing as *mut u8);
                }
                
                // Mark all slots
                for i in 0..(*env).slot_count {
                    let slot = (*env).get(i);
                    if !slot.is_null() {
                        mark_object(slot);
                    }
                }
            }
        }
        
        ObjType::Closure => {
            let closure = data as *mut Closure;
            unsafe {
                // Mark the environment
                let env = (*closure).environment;
                if !env.is_null() {
                    mark_object(env as *mut u8);
                }
            }
        }
        
        ObjType::Instance => {
            // (same as before)
        }
    }
}
}
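To see the reachability rule in isolation, here is a toy version of the mark phase over an index-based heap. It is a sketch under simplifying assumptions: objects refer to each other by Vec index instead of raw pointers, there is no ObjHeader, and Heap, Obj, and mark_from are hypothetical names. It also uses an explicit worklist where the code above recurses through mark_object.

```rust
/// Toy heap objects for this sketch; references are indices into the heap.
enum Obj {
    Number(f64),
    Environment { enclosing: Option<usize>, slots: Vec<Option<usize>> },
    Closure { function_index: usize, environment: Option<usize> },
}

struct Heap {
    objects: Vec<Obj>,
    marked: Vec<bool>,
}

impl Heap {
    fn new() -> Self {
        Heap { objects: Vec::new(), marked: Vec::new() }
    }

    fn alloc(&mut self, obj: Obj) -> usize {
        self.objects.push(obj);
        self.marked.push(false);
        self.objects.len() - 1
    }

    /// Mark everything reachable from `root`.
    fn mark_from(&mut self, root: usize) {
        let mut worklist = vec![root];
        while let Some(id) = worklist.pop() {
            if self.marked[id] {
                continue; // already visited
            }
            self.marked[id] = true;
            match &self.objects[id] {
                Obj::Number(_) => {} // no outgoing references
                Obj::Environment { enclosing, slots } => {
                    worklist.extend(enclosing.iter().copied());
                    worklist.extend(slots.iter().flatten().copied());
                }
                Obj::Closure { environment, .. } => {
                    worklist.extend(environment.iter().copied());
                }
            }
        }
    }
}
```

Marking a closure reaches its environment, and from there the boxed count; a number that nothing points to stays unmarked and would be swept.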

Code Generation for Closures

Now let’s generate code for closures in our compiler:

The Challenge

When compiling a closure, we need to:

1. Identify captured variables - which variables from outer scopes are used?
2. Allocate an environment - create heap storage for captured variables
3. Store captured values - copy values into the environment
4. Access via environment - when the closure reads/writes captured vars, go through the environment

Step 1: Variable Analysis

#![allow(unused)]
fn main() {
// src/analysis/captures.rs

use crate::ast::*;

/// A variable that is captured by a closure
#[derive(Debug, Clone)]
pub struct CapturedVar {
    pub name: String,
    pub depth: usize,      // How many enclosing environments to follow (0 = immediately enclosing)
    pub slot_index: usize, // Index in the environment
}

/// Analyze a function to find captured variables
pub fn find_captures(func: &FunctionStmt) -> Vec<CapturedVar> {
    let mut analyzer = CaptureAnalyzer::new();
    analyzer.analyze_function(func);
    analyzer.captures
}

struct CaptureAnalyzer {
    scopes: Vec<Vec<String>>,  // Stack of local variables in each scope
    captures: Vec<CapturedVar>,
    current_slot: usize,
}

impl CaptureAnalyzer {
    fn new() -> Self {
        Self {
            scopes: vec![vec![]],  // Start with one scope for parameters
            captures: Vec::new(),
            current_slot: 0,
        }
    }
    
    fn analyze_function(&mut self, func: &FunctionStmt) {
        // Parameters are in scope 0
        for param in &func.params {
            self.scopes[0].push(param.clone());
        }
        
        // Analyze body
        for stmt in &func.body {
            self.analyze_stmt(stmt);
        }
    }
    
    fn analyze_stmt(&mut self, stmt: &Stmt) {
        match stmt {
            Stmt::Var(v) => {
                self.scopes.last_mut().unwrap().push(v.name.clone());
                self.analyze_expr(&v.init);
            }
            Stmt::Block(b) => {
                self.scopes.push(vec![]);
                for s in &b.statements {
                    self.analyze_stmt(s);
                }
                self.scopes.pop();
            }
            Stmt::Print(p) => self.analyze_expr(&p.expression),
            Stmt::Return(r) => {
                if let Some(value) = &r.value {
                    self.analyze_expr(value);
                }
            }
            Stmt::Function(f) => {
                // The function name is local to this scope
                self.scopes.last_mut().unwrap().push(f.name.clone());
                // But we don't recurse into the body here —
                // the compiler processes each function separately
                // (see "Compilation Order: Inside Out" below).
            }
            Stmt::Expression(e) => self.analyze_expr(&e.expression),
            Stmt::If(i) => {
                self.analyze_expr(&i.condition);
                self.analyze_stmt(&i.then_branch);
                if let Some(else_branch) = &i.else_branch {
                    self.analyze_stmt(else_branch);
                }
            }
            Stmt::While(w) => {
                self.analyze_expr(&w.condition);
                self.analyze_stmt(&w.body);
            }
        }
    }
    
    fn analyze_expr(&mut self, expr: &Expr) {
        match expr {
            Expr::Variable(v) => {
                // Is this variable captured? (not in any local scope)
                if !self.is_local(&v.name) {
                    // It's a capture!
                    if !self.captures.iter().any(|c| c.name == v.name) {
                        self.captures.push(CapturedVar {
                            name: v.name.clone(),
                            // depth = how many enclosing environments to follow
                            // at runtime. A variable captured from the immediately
                            // enclosing function is at depth 0 (first env up).
                            // For nested closures (capturing from two+ levels up),
                            // extending the scope analysis to track how many
                            // function boundaries are between the reference and
                            // its definition would compute the actual depth.
                            // Our compiler handles single-level
                            // captures — captured variables are always in the
                            // immediately enclosing environment.
                            depth: 0,
                            slot_index: self.current_slot,
                        });
                        self.current_slot += 1;
                    }
                }
            }
            // Recursive cases — the capture analysis doesn't change
            // behavior for these, it traverses into sub-expressions
            // to find more variable references.
            Expr::Binary(b) => {
                self.analyze_expr(&b.left);
                self.analyze_expr(&b.right);
            }
            Expr::Unary(u) => self.analyze_expr(&u.right),
            Expr::Call(c) => {
                self.analyze_expr(&c.callee);
                for arg in &c.arguments {
                    self.analyze_expr(arg);
                }
            }
            Expr::Assign(a) => {
                // Assignment to a variable might be a capture too
                if !self.is_local(&a.name) {
                    if !self.captures.iter().any(|c| c.name == a.name) {
                        self.captures.push(CapturedVar {
                            name: a.name.clone(),
                            depth: 0,
                            slot_index: self.current_slot,
                        });
                        self.current_slot += 1;
                    }
                }
                self.analyze_expr(&a.value);
            }
            Expr::Literal(_) => {} // No sub-expressions
            Expr::Logical(l) => {
                self.analyze_expr(&l.left);
                self.analyze_expr(&l.right);
            }
            Expr::Grouping(g) => self.analyze_expr(&g.expression),
        }
    }
    
    fn is_local(&self, name: &str) -> bool {
        // Check ALL scopes — a variable from an enclosing block in the
        // same function is still local, not a capture.
        //
        // Variables from enclosing *functions* aren't in self.scopes at all
        // (we only track the current function's scopes), so they naturally
        // fail this check and are classified as captures. That's correct:
        // a variable that isn't local to this function must be captured.
        self.scopes.iter().any(|scope| scope.iter().any(|local| local == name))
    }
    
    // _depth_of is shown for reference but not used in our simplified
    // capture analysis — it computes the scope depth within the current
    // function. For captured variables (not in self.scopes), this would
    // panic. A full compiler would track the number of enclosing
    // function boundaries between the reference and the definition to
    // compute the environment chain depth.
    fn _depth_of(&self, name: &str) -> usize {
        for (depth, scope) in self.scopes.iter().rev().enumerate() {
            if scope.contains(&name.to_string()) {
                return depth;
            }
        }
        panic!("Variable not found: {}", name);
    }
}
}
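The analyzer above depends on the crate’s AST, so it cannot be run on its own. The following stripped-down sketch applies the same rule (a name that is neither a parameter nor a local is a capture, and slot indices follow first-use order) to a tiny stand-in AST. Expr, Stmt, and captured_names are hypothetical and much smaller than the real types.

```rust
/// Minimal stand-in AST for this sketch only.
enum Expr {
    Num(f64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    Assign(String, Box<Expr>),
}

enum Stmt {
    Var(String, Expr),
    Return(Expr),
    Print(Expr),
    Expression(Expr),
}

/// Names used in `body` that are neither parameters nor locals, in first-use
/// order. The position in the result doubles as the slot index.
fn captured_names(params: &[&str], body: &[Stmt]) -> Vec<String> {
    let mut locals: Vec<String> = params.iter().map(|p| p.to_string()).collect();
    let mut captures = Vec::new();
    for stmt in body {
        match stmt {
            Stmt::Var(name, init) => {
                // Same order as the analyzer above: declare, then visit the initializer.
                locals.push(name.clone());
                visit(init, &locals, &mut captures);
            }
            Stmt::Return(e) | Stmt::Print(e) | Stmt::Expression(e) => {
                visit(e, &locals, &mut captures)
            }
        }
    }
    captures
}

fn visit(expr: &Expr, locals: &[String], captures: &mut Vec<String>) {
    match expr {
        Expr::Num(_) => {}
        Expr::Var(name) => note_use(name, locals, captures),
        Expr::Add(left, right) => {
            visit(left, locals, captures);
            visit(right, locals, captures);
        }
        Expr::Assign(name, value) => {
            note_use(name, locals, captures);
            visit(value, locals, captures);
        }
    }
}

/// Record `name` as a capture unless it is local or already recorded.
fn note_use(name: &str, locals: &[String], captures: &mut Vec<String>) {
    let is_local = locals.iter().any(|l| l == name);
    if !is_local && !captures.iter().any(|c| c == name) {
        captures.push(name.to_string());
    }
}
```

Run on the body of counter, this reports count as the only capture; run on a body that only uses its own locals, it reports none. Like the analyzer above, it would also classify globals as captures, since it only knows about the current function.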

Step 2: Environment Allocation

#![allow(unused)]
fn main() {
// Continuing in src/codegen.rs — same imports as Part 4
use std::collections::HashMap;
// (BlockLike, RegionLike, OperationLike must be in scope)
// src/codegen/generator.rs (extended)

impl<'c> CodeGenerator<'c> {
    
    fn compile_function(&self, func: &FunctionStmt, variables: &mut HashMap<String, Value<'c, 'c>>, current_env: &mut Option<Value<'c, 'c>>) {
        // current_env is &mut because we assign to it below (*current_env = Some(env)),
        // not because alloc_environment needs mutability — it takes &Option<Value>.
        // ... setup ...
        
        // Find captured variables
        let captures = find_captures(func);
        
        // If we have captures, we need an environment
        if !captures.is_empty() {
            let env = self.alloc_environment(block, captures.len(), current_env);
            
            // Store captured values into the environment
            for capture in &captures {
                // get_capture_value looks up the variable in the enclosing
                // function's variables map, possibly traversing the scope
                // chain (capture.depth levels up) to find it.
                let value = self.get_capture_value(&capture.name, capture.depth);
                self.store_to_environment(block, env, capture.slot_index, value);
            }
            
            // The environment is now available for inner functions
            *current_env = Some(env);
        }
        
        // ... compile body ...
    }
    
    fn alloc_environment(&self, block: &Block<'c>, slot_count: usize, current_env: &Option<Value<'c, 'c>>) -> Value<'c, 'c> {
        let location = Location::unknown(self.context);
        
        // Call lox_runtime_alloc_environment(slot_count, enclosing)
        let slot_count_val = self.const_i64(slot_count as i64);
        let enclosing_val = current_env
            .unwrap_or_else(|| self.const_null());
        
        let call = block.append_operation(func::call(
            self.context,
            melior::ir::attribute::FlatSymbolRefAttribute::new(
                self.context, 
                "lox_runtime_alloc_environment"
            ),
            &[slot_count_val, enclosing_val],
            &[Type::parse(self.context, "!llvm.ptr").unwrap()],
            location,
        ));
        
        call.result(0).unwrap().into()
    }
    
    fn store_to_environment(&self, block: &Block<'c>, env: Value<'c, 'c>, index: usize, value: Value<'c, 'c>) {
        let location = Location::unknown(self.context);
        
        // The C runtime handles boxing the f64 into a heap-allocated
        // Number and storing the pointer in the environment slot.
        block.append_operation(func::call(
            self.context,
            melior::ir::attribute::FlatSymbolRefAttribute::new(
                self.context,
                "lox_runtime_env_set_number"
            ),
            &[env, self.const_i64(index as i64), value],
            &[],
            location,
        ));
    }
} // end of this part of impl CodeGenerator
}

Why lox_runtime_env_set_number instead of lox_runtime_env_set?

The generic lox_runtime_env_set takes a void* parameter. Our values are f64. You might think: “just pass the f64 where the void* goes.” On x86-64 System V, that breaks. The C calling convention passes void* arguments in general-purpose registers (rdi, rsi, rdx…) but passes double arguments in XMM registers (xmm0, xmm1…). If the callee expects a pointer in rdx but the caller put a double in xmm0, the callee reads the wrong register and gets garbage.

The type-specific lox_runtime_env_set_number takes double directly, so the compiler puts the value in the right register. The C runtime then boxes the double into a heap-allocated Number object and stores that pointer in the environment slot. This is the kind of calling-convention trap that only bites you when you cross language boundaries — MLIR’s func.call doesn’t save you from the C ABI’s register rules.

#![allow(unused)]
fn main() {
// Continuing impl CodeGenerator from the previous block
impl<'c> CodeGenerator<'c> {
    /// Get a pointer to a captured variable in the closure environment.
    /// The environment stores pointers to heap objects; `env_get` returns
    /// a pointer to the object's data area (past ObjHeader), so you can
    /// `llvm.load` the f64 directly.
    fn env_get_ptr(&self, block: &Block<'c>, env: Value<'c, 'c>, index: usize) -> Value<'c, 'c> {
        let location = Location::unknown(self.context);
        
        let call = block.append_operation(func::call(
            self.context,
            melior::ir::attribute::FlatSymbolRefAttribute::new(
                self.context,
                "lox_runtime_env_get"
            ),
            &[env, self.const_i64(index as i64)],
            &[Type::parse(self.context, "!llvm.ptr").unwrap()],
            location,
        ));
        
        call.result(0).unwrap().into()
    }
    
    /// Load a captured variable's value from the closure environment.
    /// Two steps: get the pointer (env_get_ptr), then load the f64.
    fn load_from_environment(&self, block: &Block<'c>, env: Value<'c, 'c>, index: usize) -> Value<'c, 'c> {
        let ptr = self.env_get_ptr(block, env, index);
        let location = Location::unknown(self.context);
        
        let loaded = block.append_operation(llvm::load(
            ptr,
            Type::float64(self.context),
            location,
        ));
        
        loaded.result(0).unwrap().into()
    }
}
}

How Does the Closure Receive Its Environment?

The code above shows the enclosing function creating an environment and storing captured values into it. But the closure itself is compiled as a separate function — and it needs an %env parameter to access those captured variables.

Here’s the missing link. When the compiler compiles a closure’s body, it does two things differently from a regular function:

1. The function signature includes an %env parameter. You’ll see this in the MLIR for @counter below: the first argument is always the environment pointer.
2. current_env is initialized from that parameter, not from alloc_environment. The allocation happened in the enclosing function, and the closure receives the pointer through the calling convention.

When the compiler encounters a reference to a captured variable like count, it calls load_from_environment with the %env parameter instead of looking up the variable in the stack-allocated variables HashMap. Local variables (the function’s own parameters and locals) still go through variables; only captured variables go through the environment.

This split is what find_captures determines — it classifies each variable reference as local or captured, and the compiler routes accordingly.
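The routing decision can be sketched as a small lookup. Everything here is illustrative: VarLocation and locate are hypothetical names, and the maps hold placeholder values where the real compiler stores MLIR values.

```rust
use std::collections::HashMap;

/// Where a variable reference should be compiled to.
#[derive(Debug, PartialEq)]
enum VarLocation {
    /// Found in the function's own `variables` map.
    Local,
    /// Not local: read or written through the %env parameter at this slot.
    Captured { slot_index: usize },
    /// Neither local nor captured (a global, say); not handled in this sketch.
    Unknown,
}

/// Decide how to compile a reference to `name`. Locals win over captures.
fn locate(
    name: &str,
    locals: &HashMap<String, f64>,
    capture_slots: &HashMap<String, usize>,
) -> VarLocation {
    if locals.contains_key(name) {
        VarLocation::Local
    } else if let Some(&slot_index) = capture_slots.get(name) {
        VarLocation::Captured { slot_index }
    } else {
        VarLocation::Unknown
    }
}
```

In the compiler, the Local case corresponds to a lookup in variables and the Captured case to a call to load_from_environment with the slot index.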

The Complete Picture

Here’s how everything connects for our makeCounter example:

Lox Source

fun makeCounter() {
    var count = 0;
    
    fun counter() {
        count = count + 1;
        return count;
    }
    
    return counter;
}

How Does the Runtime Call a Closure?

Before we look at the generated MLIR, there’s a question you should be asking: when the Lox program does c(), how does the runtime know which function to call and what to pass it?

The closure object has two fields: function_index and environment. The calling convention works like this:

When the Lox program does: c()

1. Load c from the stack (it's a pointer to a Closure object)
2. Read the Closure's function_index (e.g., 1)
3. Read the Closure's environment pointer (e.g., 0x2000)
4. Look up function_index 1 in a function table → @counter
5. Call @counter(environment)

The function table is an array the compiler builds at module creation time. Each function that can be called through a closure gets an index. When alloc_closure(1, %env) is called, the 1 refers to @counter’s position in that table.

In the generated MLIR, this looks like:

// In the caller (e.g., main or another function)
%closure = <load closure from variable>

// Read function_index and environment from the closure object.
// Byte offsets 0 and 8 assume a 64-bit target (usize and pointers are 8 bytes).
%func_idx = llvm.load %closure : !llvm.ptr -> i64
%env_addr = llvm.getelementptr %closure[8] : (!llvm.ptr) -> !llvm.ptr, i8
%env_ptr  = llvm.load %env_addr : !llvm.ptr -> !llvm.ptr

// Indirect call through the function table
%func_table = llvm.mlir.addressof @lox_function_table : !llvm.ptr
%func_ptr = llvm.getelementptr %func_table[%func_idx] : (!llvm.ptr, i64) -> !llvm.ptr, !llvm.ptr
%callee = llvm.load %func_ptr : !llvm.ptr -> !llvm.ptr
%result = llvm.call %callee(%env_ptr) : !llvm.ptr, (!llvm.ptr) -> f64
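The byte offsets 0 and 8 used for the two loads above follow from the #[repr(C)] layout of Closure on a 64-bit target, assuming %closure points at the object’s data area (past the ObjHeader). A small check, with the struct repeated so it compiles alone and with hypothetical constant names:

```rust
use std::mem::offset_of;

/// Opaque stand-in; only pointers to it are used in this sketch.
#[repr(C)]
pub struct Environment {
    _opaque: [u8; 0],
}

/// Same field order as the Closure struct shown earlier.
#[repr(C)]
pub struct Closure {
    pub function_index: usize,
    pub environment: *mut Environment,
}

/// Offsets the generated loads must agree with.
pub const FUNCTION_INDEX_OFFSET: usize = offset_of!(Closure, function_index);
pub const ENVIRONMENT_OFFSET: usize = offset_of!(Closure, environment);
```

On a 64-bit target ENVIRONMENT_OFFSET is 8; on a 32-bit target it would be 4, so a compiler that targets both would derive the offset from the target rather than hardcode it.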

This is an indirect call. The target function isn’t known at compile time — it depends on which closure the variable holds at runtime. That’s the cost of first-class functions. The benefit is that c() works regardless of which closure c points to, whether it’s a counter, an adder, or anything else.

For closures that take their own parameters (like @apply in the “Multiple Captured Variables” section below, which takes (%env: !llvm.ptr, %x: f64, %y: f64)), the environment pointer is followed by the function’s own arguments in the indirect call: llvm.call %callee(%env_ptr, %x, %y). The calling convention is: environment first, then the closure’s declared parameters.

Simplification: A production compiler would use func.call with a symbol reference when the target is known at compile time (direct call) and fall back to indirect calls only for closures stored in variables. Our compiler uses indirect calls for all closure invocations for simplicity.
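The five-step calling convention above can be mimicked in plain Rust with a table of function pointers. This is a sketch under simplifying assumptions: slots hold f64 directly (no boxing, no GC), the closure owns its environment instead of pointing at a shared one, and Env, LoxFn, FUNCTION_TABLE, and call_closure are hypothetical names.

```rust
/// Captured variables for this sketch: plain f64 slots.
pub struct Env {
    pub slots: Vec<f64>,
}

/// Every compiled closure body has the same shape: env first, then arguments.
pub type LoxFn = fn(&mut Env, &[f64]) -> f64;

pub struct Closure {
    pub function_index: usize,
    pub env: Env,
}

/// Body of `counter`: count = count + 1; return count;
fn counter(env: &mut Env, _args: &[f64]) -> f64 {
    env.slots[0] += 1.0;
    env.slots[0]
}

/// Body of `adder`: return x + y;  (x is slot 0, y is the first argument)
fn adder(env: &mut Env, args: &[f64]) -> f64 {
    env.slots[0] + args[0]
}

/// Hypothetical function table; the index assignment is arbitrary here,
/// chosen so that counter sits at index 1 as in the text.
pub static FUNCTION_TABLE: [LoxFn; 2] = [adder, counter];

/// Read the index and the environment, look up the code pointer,
/// then call it with the environment as the first argument.
pub fn call_closure(closure: &mut Closure, args: &[f64]) -> f64 {
    let callee = FUNCTION_TABLE[closure.function_index];
    callee(&mut closure.env, args)
}
```

The caller never names counter or adder. It only knows the index stored in the closure, which is what makes the call indirect.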

Now you know how closures are called. With that in place, the generated MLIR will make sense — when you see @counter(%env: !llvm.ptr), the %env comes from the calling convention above.

Generated MLIR (Simplified)

module {
  // makeCounter creates an environment for 'count'
  func.func @makeCounter() -> !llvm.ptr {
    // Push frame with 1 root: the environment must survive across
    // the alloc_closure call — if GC triggers there, %env needs to
    // be on the shadow stack or it could be collected.
    %frame = lox.push_frame root_count = 1 : !llvm.ptr
    
    // Allocate environment with 1 slot
    %env = func.call @lox_runtime_alloc_environment(1, null) : (i64, !llvm.ptr) -> !llvm.ptr
    
    // Root the environment before any further allocation
    lox.set_root index = 0, %env : !llvm.ptr
    
    // Initialize count = 0 (env_set_number boxes the f64 into a heap Number)
    %zero = arith.constant 0.0 : f64
    func.call @lox_runtime_env_set_number(%env, 0, %zero) : (!llvm.ptr, i64, f64)
    
    // Create closure for counter
    // (counter_index = 1, env = %env)
    // Safe: %env is rooted, so GC during alloc_closure can find it
    %closure = func.call @lox_runtime_alloc_closure(1, %env) : (i64, !llvm.ptr) -> !llvm.ptr
    
    lox.pop_frame
    func.return %closure : !llvm.ptr
  }
  
  // counter accesses 'count' via environment
  func.func @counter(%env: !llvm.ptr) -> f64 {
    %frame = lox.push_frame root_count = 1 : !llvm.ptr
    lox.set_root index = 0, %env : !llvm.ptr
    
    // Load count from environment (returns a pointer to the heap object)
    %count_ptr = func.call @lox_runtime_env_get(%env, 0) : (!llvm.ptr, i64) -> !llvm.ptr
    %count = llvm.load %count_ptr : !llvm.ptr -> f64
    
    // count = count + 1
    %one = arith.constant 1.0 : f64
    %new_count = arith.addf %count, %one : f64
    
    // Store back to environment (env_set_number boxes the new f64)
    func.call @lox_runtime_env_set_number(%env, 0, %new_count) : (!llvm.ptr, i64, f64)
    
    lox.pop_frame
    func.return %new_count : f64
  }
}

Memory Layout After `var c = makeCounter();`

Stack:
  ┌─────────────────┐
  │ c = 0x1000      │──┐
  └─────────────────┘  │
                       ▼
Heap:                  
  ┌─────────────────────────┐  0x1000: Closure
  │ header: { Closure }     │
  │ function_index: 1       │
  │ environment: 0x2000 ────│──┐
  └─────────────────────────┘  │
                               ▼
  ┌─────────────────────────┐  0x2000: Environment
  │ header: { Environment } │
  │ enclosing: null         │
  │ slot_count: 1           │
  │ slot[0]: ptr ───────────│──┐
  └─────────────────────────┘  │
                               │
  (count = 0.0) ◄──────────────┘

Nested Closures Deep Dive

Let’s trace through a second example, in which the nested function captures a parameter and the outer function is called twice:

The Code

fun makeAdder(x) {
    fun adder(y) {
        return x + y;  // Captures 'x' from makeAdder
    }
    return adder;
}

var add5 = makeAdder(5);
var add10 = makeAdder(10);

print add5(3);   // 8
print add10(3);  // 13

Step-by-Step Execution

1. makeAdder(5) is called:

Stack:
  ┌─────────────────────────────┐
  │ Frame: makeAdder            │
  │   x = 5 (parameter)         │
  └─────────────────────────────┘
        │
        ▼ Creates environment
  ┌─────────────────────────────┐
  │ Environment 0x1000          │
  │   slot[0]: 5 (x)            │
  └─────────────────────────────┘
        │
        ▼ Creates closure
  ┌─────────────────────────────┐
  │ Closure 0x2000 (adder)      │
  │   function_index: 1         │
  │   environment: 0x1000 ──────│──► env with x=5
  └─────────────────────────────┘

Returns closure 0x2000, assigned to add5
makeAdder stack frame is destroyed, but environment lives on!

2. makeAdder(10) is called:

Creates NEW environment 0x3000 with x=10
Creates NEW closure 0x4000 pointing to environment 0x3000

add5  → closure 0x2000 → env 0x1000 (x=5)
add10 → closure 0x4000 → env 0x3000 (x=10)

3. add5(3) is called:

Stack:
  ┌─────────────────────────────┐
  │ Frame: adder                │
  │   y = 3 (parameter)         │
  │   env = 0x1000 (via closure)│
  └─────────────────────────────┘

Load x from env[0] = 5
Compute 5 + 3 = 8
Return 8

4. add10(3) is called:

Same process, but env = 0x3000
Load x from env[0] = 10
Compute 10 + 3 = 13
Return 13

Key Insight: Each Call Creates a Separate Environment

makeAdder(5):
  → Creates env {x: 5}
  → Creates closure pointing to that env

makeAdder(10):
  → Creates NEW env {x: 10}
  → Creates NEW closure pointing to NEW env

The two closures share no state!
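The same property can be checked in a small Rust model. This is an illustration rather than the runtime’s representation: AdderEnv, Adder, and make_adder are hypothetical names, and Rc stands in for a GC-managed environment pointer.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// One-slot environment holding the captured `x`.
pub struct AdderEnv {
    pub x: Cell<f64>,
}

/// A closure value: here just a handle to its environment, since the
/// code (`x + y`) is the same for every adder.
#[derive(Clone)]
pub struct Adder {
    pub env: Rc<AdderEnv>,
}

/// Each call allocates a fresh environment, like makeAdder in Lox.
pub fn make_adder(x: f64) -> Adder {
    Adder { env: Rc::new(AdderEnv { x: Cell::new(x) }) }
}

impl Adder {
    /// Load x from the environment and add the argument.
    pub fn call(&self, y: f64) -> f64 {
        self.env.x.get() + y
    }
}
```

Two calls to make_adder give two distinct environments, so changing x through one closure leaves the other untouched. Copying a closure value, by contrast, copies only the handle, and both copies see the same environment.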

Multiple Captured Variables

What if we capture multiple variables?

fun makeOffsetter(offsetX, offsetY) {
    fun apply(x, y) {
        // Captures both offsetX and offsetY
        print offsetX + x;
        print offsetY + y;
    }
    return apply;
}

var shift = makeOffsetter(10, 20);
shift(1, 2);   // prints 11, then 22

Environment Layout

Environment:
  slot[0]: offsetX
  slot[1]: offsetY

When apply is called:
  1. Load offsetX from env[0]
  2. Load offsetY from env[1]
  3. Compute offsetX + x and print
  4. Compute offsetY + y and print
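
The slot numbers in this layout have to come from somewhere. One plausible scheme, shown here as an assumption rather than as what the tutorial's compiler does, is to hand out indices in the order captured names are first seen. The function name assign_slots is hypothetical.

```rust
use std::collections::HashMap;

/// Assigns each captured variable a slot index in first-seen order.
/// Repeated names keep the slot they were given the first time.
fn assign_slots(captured: &[&str]) -> HashMap<String, usize> {
    let mut slots = HashMap::new();
    for name in captured {
        // The next free index equals the number of slots handed out so far.
        let next = slots.len();
        slots.entry(name.to_string()).or_insert(next);
    }
    slots
}
```

Under this scheme, a body that references offsetX and then offsetY gets offsetX in slot 0 and offsetY in slot 1, matching the layout above.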

Generated Code

func.func @apply(%env: !llvm.ptr, %x: f64, %y: f64) {
    // Slot indices must be SSA values, so materialize them as constants
    %c0 = arith.constant 0 : i64
    %c1 = arith.constant 1 : i64

    // Load captured variables (env_get returns pointers, then we load the f64 values)
    %offsetX_ptr = func.call @lox_runtime_env_get(%env, %c0) : (!llvm.ptr, i64) -> !llvm.ptr
    %offsetX = llvm.load %offsetX_ptr : !llvm.ptr -> f64
    %offsetY_ptr = func.call @lox_runtime_env_get(%env, %c1) : (!llvm.ptr, i64) -> !llvm.ptr
    %offsetY = llvm.load %offsetY_ptr : !llvm.ptr -> f64
    
    // Compute
    %sum_x = arith.addf %offsetX, %x : f64
    %sum_y = arith.addf %offsetY, %y : f64
    
    // Print each result
    func.call @lox_print(%sum_x) : (f64) -> ()
    func.call @lox_print(%sum_y) : (f64) -> ()
    func.return
}

Nested Environments

Beyond our simplified model. The find_captures analysis hardcodes depth: 0 because our compiler handles only single-level captures, where the captured variable always lives in the immediately enclosing environment. The runtime machinery used by the nested example below is real: environment chains, depth traversal, and lox_runtime_env_get_enclosing all exist. Producing correct depth values, however, would require extending the compile-time scope analysis to count the function boundaries between a variable reference and its definition. That is a local computation during the existing scope walk, not a whole-program pass. The MLIR shown here is what a full compiler would generate; our simplified compiler emits depth: 0 for every capture and would not handle the inner case correctly.

Two gaps separate the code shown so far from the nested example below. First, CaptureAnalyzer doesn’t compute actual depths: it always emits depth: 0. A full implementation would replace the hardcoded value with a scope walk that counts function boundaries between the reference and the definition. Second, the Stmt::Function arm pushes the function name into scope but doesn’t recurse into the body, so captures from inner functions never flow upward to their enclosing functions. The “Compilation Order: Inside Out” section below explains the concept (inner functions determine what outer functions must capture). Here is a sketch of the mechanism a full compiler would use:

fn propagate_captures(
    inner_captures: &[CapturedVar],
    outer_scope: &Scope,
) -> Vec<CapturedVar> {
    // For each variable that the inner function captured
    // from beyond its immediately enclosing function,
    // the enclosing function must also capture it
    // (one level closer to the definition).
    inner_captures
        .iter()
        .filter(|c| c.depth > 0 || !outer_scope.is_local(&c.name))
        .map(|c| CapturedVar {
            name: c.name.clone(),
            depth: c.depth.saturating_sub(1),
            ..*c
        })
        .collect()
}

The key insight: if inner captures a at depth 1 (a lives in outer, one level beyond middle), then middle must capture a at depth 0 (outer is middle’s immediately enclosing function). Each function in the chain captures at one level less depth than the function nested inside it. This upward propagation is what our simplified find_captures doesn’t implement, and what a full compiler needs.
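
The other missing piece, computing the depth itself, can be sketched as a walk over the stack of enclosing function scopes. This is an illustration under stated assumptions: FunctionScope and capture_depth are hypothetical names, and the real analysis would work on the compiler's own scope structures rather than plain string lists.

```rust
/// One lexical function scope: the names declared directly inside it.
/// (Hypothetical; parameters and nested function names count as locals.)
struct FunctionScope {
    locals: Vec<String>,
}

/// Resolves `name` as seen from the innermost function.
/// `scopes` is ordered outermost first, innermost last.
/// Returns None when the name is a plain local or is not found, and
/// Some(depth) when it is captured, where depth 0 means "declared in
/// the immediately enclosing function".
fn capture_depth(scopes: &[FunctionScope], name: &str) -> Option<usize> {
    let (innermost, enclosing) = scopes.split_last()?;
    if innermost.locals.iter().any(|local| local == name) {
        return None; // a local variable is not a capture
    }
    // Walk outward; the position in the reversed list is the number of
    // function boundaries crossed beyond the immediate parent.
    enclosing
        .iter()
        .rev()
        .position(|scope| scope.locals.iter().any(|local| local == name))
}
```

For a scope stack of outer, middle, inner, this yields depth 0 for b and depth 1 for a, which are the values the nested example needs.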

Because our simplified compiler emits depth: 0 everywhere, it generates correct code only for single-level captures. When a variable is two or more scopes away, the output is wrong: in the example below, inner needs a at depth 1, which find_captures cannot compute. The example is included to show how the runtime would behave if the analysis produced the right depth values.

⚠️ The MLIR below is what a full compiler generates. The code shown earlier in this chapter does not produce this output for nested closures.

What if a closure captures a variable from two scopes up?

fun outer() {
    var a = 1;
    
    fun middle() {
        var b = 2;
        
        fun inner() {
            return a + b;  // a from outer, b from middle
        }
        
        return inner;
    }
    
    return middle;
}

Environment Chain

outer() creates:
  env_outer { a: 1 }

middle() creates:
  env_middle { b: 2, enclosing: env_outer }
                ↑
                Points to outer's environment

inner() closure:
  environment: env_middle

When inner() runs:
  1. Look up 'a': not in env_middle → follow enclosing → found in env_outer
  2. Look up 'b': found in env_middle

Variable Lookup Algorithm

/// Look up a variable by traversing the environment chain
fn lookup_variable(env: *mut Environment, depth: usize, slot: usize) -> *mut u8 {
    let mut current = env;
    
    // Walk up 'depth' levels through the enclosing chain.
    // depth=0 means the variable is in the current environment,
    // depth=1 means one level up, etc.
    for _ in 0..depth {
        current = unsafe { (*current).enclosing };
        assert!(!current.is_null(), "enclosing environment is null at depth > 0");
    }
    
    // Now access the slot
    unsafe { (*current).get(slot) }
}
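
The same traversal can be tried out without raw pointers. The following is a safe, self-contained sketch, assuming a simplified Env type that stores f64 slots directly; Env and lookup are hypothetical names, and Rc replaces the garbage-collected pointers of the real runtime.

```rust
use std::rc::Rc;

/// Simplified environment: f64 slots plus an optional parent link.
struct Env {
    slots: Vec<f64>,
    enclosing: Option<Rc<Env>>,
}

/// Follows `depth` parent links, then reads `slot`.
/// Returns None instead of panicking when the chain is too short
/// or the slot does not exist.
fn lookup(env: &Rc<Env>, depth: usize, slot: usize) -> Option<f64> {
    let mut current = env;
    for _ in 0..depth {
        current = current.enclosing.as_ref()?;
    }
    current.slots.get(slot).copied()
}
```

With env_outer holding a = 1 and env_middle holding b = 2 and pointing at env_outer, lookup(&env_middle, 0, 0) finds b and lookup(&env_middle, 1, 0) finds a.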

Compilation Order: Inside Out

There’s a subtle but important constraint on how we compile closures: inner functions must be compiled before outer functions.

When outer defines middle which defines inner, the compilation order must be:

1. Compile inner → discover it captures a (from outer) and b (from middle)
2. Compile middle → now we know inner captures b from middle’s scope, so middle must allocate an environment with b in it
3. Compile outer → now we know middle captures a from outer’s scope, so outer must allocate an environment with a in it

If we compiled top-down, outer wouldn’t know it needs an environment for a until middle is compiled — and middle wouldn’t know about b until inner is compiled. The capture information flows upward: inner functions determine what outer functions must capture.

This is why the find_captures function analyzes a single function’s body — it finds variables that aren’t local to that function. The compiler then uses this information when compiling the enclosing function to allocate the right environment slots.
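
The inside-out order falls out naturally from a post-order traversal. Below is a minimal sketch, assuming a hypothetical FunctionNode that lists each function's declared names, referenced names, and nested functions. It is not the tutorial's AST; it only shows how capture information flows upward.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified view of one function in the source program.
struct FunctionNode {
    name: &'static str,
    locals: Vec<&'static str>,   // parameters, variables, nested function names
    uses: Vec<&'static str>,     // names referenced directly in the body
    children: Vec<FunctionNode>, // functions declared inside this one
}

/// Returns the names `f` needs from enclosing functions.
/// Children are visited first, so whatever they need but `f` does not
/// declare is passed further up. `order` records the visit order.
fn free_variables(
    f: &FunctionNode,
    order: &mut Vec<&'static str>,
) -> BTreeSet<&'static str> {
    let mut needed: BTreeSet<&'static str> = f.uses.iter().copied().collect();
    for child in &f.children {
        needed.extend(free_variables(child, order));
    }
    order.push(f.name); // recorded after all children: inside-out
    for local in &f.locals {
        needed.remove(local); // satisfied here, no need to look further out
    }
    needed
}
```

For the outer/middle/inner program, the recorded order is inner, middle, outer. middle ends up needing a from its parent, and outer needs nothing from outside.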

Generated MLIR

func.func @inner(%env: !llvm.ptr) -> f64 {
    // Slot index 0 as an SSA value (call operands cannot be bare literals)
    %c0 = arith.constant 0 : i64

    // Load 'b' from current environment (depth=0, slot=0)
    %b_ptr = func.call @lox_runtime_env_get(%env, %c0) : (!llvm.ptr, i64) -> !llvm.ptr
    %b = llvm.load %b_ptr : !llvm.ptr -> f64
    
    // Load 'a' from enclosing environment (depth=1, slot=0)
    // lox_runtime_env_get_enclosing follows the parent pointer in
    // the Environment struct. It must be declared alongside
    // lox_runtime_env_get and lox_runtime_env_set in the runtime.
    // Signature: lox_runtime_env_get_enclosing(env: !llvm.ptr) -> !llvm.ptr
    %env_outer = func.call @lox_runtime_env_get_enclosing(%env) : (!llvm.ptr) -> !llvm.ptr
    %a_ptr = func.call @lox_runtime_env_get(%env_outer, %c0) : (!llvm.ptr, i64) -> !llvm.ptr
    %a = llvm.load %a_ptr : !llvm.ptr -> f64
    
    // Compute
    %sum = arith.addf %a, %b : f64
    
    func.return %sum : f64
}

Practice Exercises

Exercise 1: Trace Memory Layout

For this code:

fun factory(value) {
    fun getter() {
        return value;
    }
    fun setter(new_value) {
        value = new_value;
    }
    // In a real Lox program, you'd return both closures
    // via a class instance or global variables. For this
    // exercise, imagine both are available after calling
    // factory() — they share the same environment.
    return getter;
}

var get = factory(10);
print get();  // 10
// If setter were also available:
// setter(20);
// print get();  // 20

Draw the memory layout after calling factory(10) and assigning the result to get.

Answer:

Stack:
  ┌─────────────────────────────┐
  │ get = 0x1000 (closure)      │
  └─────────────────────────────┘

Heap:
  0x1000: Closure (getter)
    function_index: @getter
    environment: 0x3000 ──────────┐
                                  │
  (If setter were also returned:) │
  0x2000: Closure (setter)        │
    function_index: @setter       │
    environment: 0x3000 ──────────│── Same environment!
                                  │
  0x3000: Environment ◄───────────┘
    slot[0]: 10 (value)

Key insight: Both closures share the SAME environment. That’s why setter(20) would affect what getter() returns!
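
The sharing can be checked with a small Rust model. This is a sketch under assumptions: Getter, Setter, and factory are hypothetical names, the factory returns both closures as a tuple (which Lox itself cannot do directly), and Rc<RefCell<...>> stands in for a garbage-collected environment.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared, mutable environment with one f64 per slot.
type Env = Rc<RefCell<Vec<f64>>>;

struct Getter {
    environment: Env,
}

struct Setter {
    environment: Env,
}

impl Getter {
    /// Reads `value` from slot 0.
    fn call(&self) -> f64 {
        self.environment.borrow()[0]
    }
}

impl Setter {
    /// Writes `value` into slot 0.
    fn call(&self, new_value: f64) {
        self.environment.borrow_mut()[0] = new_value;
    }
}

/// Models `factory(value)`: one environment, two closures pointing at it.
fn factory(value: f64) -> (Getter, Setter) {
    let environment: Env = Rc::new(RefCell::new(vec![value]));
    let getter = Getter { environment: Rc::clone(&environment) };
    let setter = Setter { environment };
    (getter, setter)
}
```

After factory(10.0), calling the setter with 20.0 changes what the getter returns, because both hold a pointer to the same slot vector.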

Exercise 2: Variable Analysis

For this function, what variables are captured and where do they go?

fun outer(x, y) {
    var a = x + y;
    
    fun inner(b) {
        return a + b + x;  // Captures a and x
    }
    
    return inner;
}

Answer:

Captured variables:

a (local in outer) → slot 0
x (parameter in outer) → slot 1

Environment for inner:

env_inner:
  slot[0]: a
  slot[1]: x

Note: y is NOT captured (not used by inner), so it’s not in the environment.

Initialization order matters: the environment must be allocated early so it can be rooted on the shadow stack before any later allocation (like alloc_closure) triggers GC. If GC ran during alloc_closure and the environment wasn’t rooted, it could be collected — taking a and x with it. The sequence is: allocate env → root env → compute a = x + y → store a in env[0] → store x in env[1] → allocate closure. The stores into the environment can happen in any order after allocation, but the environment must be on the shadow stack before alloc_closure.
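
The ordering hazard can be reproduced with a toy heap. Everything in this sketch is hypothetical: ToyHeap collects on every allocation and frees anything not on its shadow stack, ignoring tracing through object references. A real collector is more selective, but the worst case it must survive is the same.

```rust
/// Toy heap that runs a "collection" before every allocation.
/// Objects are identified by index; live[i] == false means collected.
struct ToyHeap {
    live: Vec<bool>,
    shadow_stack: Vec<usize>, // ids of rooted objects
}

impl ToyHeap {
    fn new() -> Self {
        ToyHeap { live: Vec::new(), shadow_stack: Vec::new() }
    }

    /// Allocates a new object. Worst case: GC runs first and frees
    /// every object that is not rooted on the shadow stack.
    fn alloc(&mut self) -> usize {
        for id in 0..self.live.len() {
            if !self.shadow_stack.contains(&id) {
                self.live[id] = false;
            }
        }
        self.live.push(true);
        self.live.len() - 1
    }

    fn root(&mut self, id: usize) {
        self.shadow_stack.push(id);
    }

    fn is_live(&self, id: usize) -> bool {
        self.live[id]
    }
}

/// Runs the sequence from the text and reports whether the environment
/// survived the closure allocation.
fn build_closure(root_env_first: bool) -> bool {
    let mut heap = ToyHeap::new();
    let env = heap.alloc(); // allocate env
    if root_env_first {
        heap.root(env); // root env before anything else can allocate
    }
    // ... compute a = x + y, store a in env[0], store x in env[1] ...
    let _closure = heap.alloc(); // allocate closure (may trigger GC)
    heap.is_live(env)
}
```

build_closure(true) reports that the environment survived, while build_closure(false) reports that it was collected before the closure could point to it.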

Exercise 3: Why Not Copy Values?

Why can’t we copy captured values into the closure directly? Why do we need an environment?

fun example() {
    var count = 0;
    
    fun increment() {
        count = count + 1;  // MODIFIES count!
        return count;
    }
    
    return increment;
}

Answer:

If we copied the value of count into the closure, each call to increment() would modify only its own copy. The original count would never change, so multiple calls wouldn’t accumulate.

With an environment, every closure that captures count holds a pointer to the same storage. A modification made through one closure is visible to every other closure that references that environment, so state is properly shared.

The environment is essentially a shared “box” that holds the variable.
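
The difference can be seen side by side in Rust. This is an illustrative sketch, not the tutorial's runtime: by_copy and by_shared_box are hypothetical names, by_copy models a closure that holds a snapshot of count and works on a fresh copy each call, and Rc<Cell<f64>> plays the role of the shared box.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Copy semantics: each closure holds a snapshot of `count`, and every
/// call to `increment` works on its own temporary copy of that snapshot.
fn by_copy() -> (impl Fn() -> f64, impl Fn() -> f64) {
    let count = 0.0_f64;
    let increment = move || {
        let mut local = count; // fresh copy on every call
        local += 1.0;
        local
    };
    let read = move || count; // separate snapshot, never updated
    (increment, read)
}

/// Shared box: both closures point at the same heap cell.
fn by_shared_box() -> (impl Fn() -> f64, impl Fn() -> f64) {
    let count = Rc::new(Cell::new(0.0_f64));
    let for_increment = Rc::clone(&count);
    let increment = move || {
        for_increment.set(for_increment.get() + 1.0);
        for_increment.get()
    };
    let read = move || count.get();
    (increment, read)
}
```

With by_copy, increment returns 1 on every call and read still sees 0. With by_shared_box, increment counts 1, 2, and read observes the latest value.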

Next: Part 6 — Complete Reference — Closures are the hardest single feature. Before adding more, let’s see the complete numbers-only compiler in one place: every module, every pass, every runtime function. This is the working system that Parts 1–5 built, assembled and running end to end.

MLIR for Lox: A Compiler Tutorial