MLIR for Lox: Part 5 — Closures — When a Variable Outlives Its Stack Frame
You’ve built a garbage collector that tracks roots on the stack and frees everything else. It works — until a function returns another function that references a local variable. That variable should be dead (the stack frame is gone), but it isn’t (the returned closure still uses it). This is the closure problem, and it’s the hardest part of garbage collection.
The Closure Problem
Consider this Lox code:
fun makeCounter() {
var count = 0;
fun counter() {
count = count + 1; // 'count' is from makeCounter's scope!
return count;
}
return counter;
}
var c = makeCounter(); // makeCounter returns, but 'count' must live on!
print c(); // 1
print c(); // 2
print c(); // 3
The problem:
- makeCounter() returns
- Its stack frame is destroyed
- But count must still exist because counter captures it!
- Where does count live?
Stack vs Heap
┌─────────────────────────────────────────────────────────────┐
│ WRONG: count on the stack │
│ │
│ makeCounter() called │
│ ┌─────────────────────┐ │
│ │ count = 0 │ ← on the stack │
│ │ return counter │ │
│ └─────────────────────┘ │
│ ↓ │
│ makeCounter() returns │
│ ┌─────────────────────┐ │
│ │ (freed!) │ ← count is gone! │
│ └─────────────────────┘ │
│ ↓ │
│ c() is called │
│ counter tries to access count... CRASH! │
│ │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ RIGHT: count on the heap (in a closure environment) │
│ │
│ makeCounter() called │
│ ┌─────────────────────┐ ┌─────────────────────┐ │
│ │ env = alloc() │──────│ count = 0 │ │
│ │ return counter │ │ (on the heap!) │ │
│ └─────────────────────┘ └─────────────────────┘ │
│ ↓ ↑ │
│ makeCounter() returns │ │
│ (stack frame freed) │ │
│ ↓ │ │
│ c() is called ─────────────────────┘ │
│ counter accesses count via env pointer │
│ count is still alive! │
│ │
└─────────────────────────────────────────────────────────────┘
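The "RIGHT" picture can be imitated in plain Rust. This is only an analogy, not the tutorial's runtime: it uses reference counting (Rc) where our compiler uses a tracing GC, and the name make_counter is made up for illustration. The point is that count lives in a heap cell that outlives the function that created it.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical Rust analogue of Lox's makeCounter.
/// `count` lives in a heap cell, not in this function's stack frame,
/// so it is still there after make_counter returns.
fn make_counter() -> impl FnMut() -> f64 {
    // The Rc<Cell<..>> plays the role of the heap "environment".
    let count = Rc::new(Cell::new(0.0_f64));
    move || {
        count.set(count.get() + 1.0);
        count.get()
    }
}
```

Calling the returned closure three times yields 1, 2, 3, just like c() in the Lox program: the state survives between calls because it was never on make_counter's stack.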
Closure Environments
A closure environment is a heap-allocated structure that holds captured variables.
Structure
Closure Environment:
┌────────────────────────────────────┐
│ Header (ObjHeader) │
│ marked: bool │
│ obj_type: ObjType::Environment │
│ size: ... │
├────────────────────────────────────┤
│ Data │
│ enclosing: *mut Env (or null) │ ← for nested closures
│ count: usize │ ← number of slots
│ slot[0]: value │
│ slot[1]: value │
│ ... │
└────────────────────────────────────┘
Closure Object
Closure Object:
┌────────────────────────────────────┐
│ Header (ObjHeader) │
│ marked: bool │
│ obj_type: ObjType::Closure │
│ size: ... │
├────────────────────────────────────┤
│ Data │
│ function: *mut Function │ ← the code to execute
│ environment: *mut Env │ ← captured variables
└────────────────────────────────────┘
Implementing Environments
Let’s add environment support to our runtime:
#![allow(unused)]
fn main() {
// src/runtime/object.rs (extended)
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ObjType {
Number = 0,
String = 1,
Environment = 2, // NEW: closure environment
Closure = 3, // shifted from 2 → 3 (was a forward declaration in Part 2)
Instance = 4, // shifted from 3 → 4
}
}
Note the renumbering. Part 2 defined the enum with Closure = 2 and Instance = 3 as forward declarations (the Closure variant existed but wasn’t used until this part). Now we’re inserting Environment = 2 before Closure, which shifts Closure to 3 and Instance to 4. This is why we use explicit discriminants: if we relied on implicit numbering, inserting a variant in the middle would silently break the GC’s trace_object dispatch. The explicit values make the contract between the enum and the heap header’s obj_type byte visible.
#![allow(unused)]
fn main() {
/// An environment (holds captured variables)
#[repr(C)]
pub struct Environment {
/// Pointer to enclosing environment (for nested closures)
/// null = no enclosing environment (top-level)
pub enclosing: *mut Environment,
/// Number of variable slots
pub slot_count: usize,
/// Variable slots (flexible array member)
/// The `[0]` is Rust's approximation of a C flexible array member —
/// `alloc_environment` allocates extra space for the slots beyond the
/// struct size, and `get`/`set` access them via pointer arithmetic
/// on `as_ptr().add(index)`.
pub slots: [*mut u8; 0],
}
impl Environment {
/// Get a slot value
pub fn get(&self, index: usize) -> *mut u8 {
assert!(index < self.slot_count);
unsafe { *self.slots.as_ptr().add(index) }
}
/// Set a slot value
pub fn set(&mut self, index: usize, value: *mut u8) {
assert!(index < self.slot_count);
unsafe { *self.slots.as_mut_ptr().add(index) = value; }
}
}
/// A closure (function + environment)
#[repr(C)]
pub struct Closure {
/// Pointer to the function code (an index into our function table)
pub function_index: usize,
/// Pointer to the captured environment
pub environment: *mut Environment,
}
}
Why do the slots store *mut u8 but the MLIR code loads f64? The Rust runtime stores *mut u8 pointers in each slot: pointers to GC-tracked heap objects. In the numbers-only model, those heap objects are Number values whose data area contains an f64. The MLIR lox_runtime_env_get returns a !llvm.ptr pointing to the Number’s data area, and llvm.load reads the f64 from there. The MLIR lox_runtime_env_set_number takes an f64, boxes it into a heap-allocated Number (the C runtime handles this internally), and stores the pointer in the slot. We use env_set_number instead of the generic env_set (which takes void*) because passing an f64 through a void* parameter violates the C calling convention: on x86-64 System V, f64 is passed in an XMM register while void* is passed in a general-purpose register, so the callee would read garbage. In the tagged-union model (Part 7+), each slot would hold a (i8, i64) pair and the generic env_set with void* would be used instead.
Allocating an Environment
#![allow(unused)]
fn main() {
// src/runtime/gc.rs (extended)
/// Allocate a new environment with the given slot count
pub fn alloc_environment(slot_count: usize, enclosing: *mut Environment) -> *mut Environment {
// Data size: environment struct + slots. The ObjHeader is not counted
// here; alloc is expected to add it, since it returns the data pointer.
let env_data_size = std::mem::size_of::<Environment>()
+ slot_count * std::mem::size_of::<*mut u8>();
let data_ptr = alloc(env_data_size, ObjType::Environment);
let env = data_ptr as *mut Environment;
unsafe {
(*env).enclosing = enclosing;
(*env).slot_count = slot_count;
// Initialize all slots to null
for i in 0..slot_count {
(*env).set(i, std::ptr::null_mut());
}
}
env
}
/// Allocate a new closure
pub fn alloc_closure(function_index: usize, environment: *mut Environment) -> *mut Closure {
let data_ptr = alloc(std::mem::size_of::<Closure>(), ObjType::Closure);
let closure = data_ptr as *mut Closure;
unsafe {
(*closure).function_index = function_index;
(*closure).environment = environment;
}
closure
}
}
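If the size arithmetic feels abstract, the following standalone sketch may help. It assumes plain std::alloc instead of the tutorial's GC allocator (so there is no ObjHeader), and the names Env, env_layout, toy_alloc_env, slot_ptr, and toy_free_env are illustrative only.

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};

#[repr(C)]
struct Env {
    enclosing: *mut Env,
    slot_count: usize,
    slots: [*mut u8; 0], // slots live directly after the struct
}

/// Layout of an Env followed by `slot_count` pointer-sized slots.
fn env_layout(slot_count: usize) -> Layout {
    let size = std::mem::size_of::<Env>()
        + slot_count * std::mem::size_of::<*mut u8>();
    Layout::from_size_align(size, std::mem::align_of::<Env>())
        .expect("valid layout")
}

/// Allocate a zeroed Env, so every slot starts out as a null pointer.
fn toy_alloc_env(slot_count: usize, enclosing: *mut Env) -> *mut Env {
    let env = unsafe { alloc_zeroed(env_layout(slot_count)) } as *mut Env;
    assert!(!env.is_null(), "allocation failed");
    unsafe {
        (*env).enclosing = enclosing;
        (*env).slot_count = slot_count;
    }
    env
}

/// Address of slot `index`, computed from the end of the fixed fields.
fn slot_ptr(env: *mut Env, index: usize) -> *mut *mut u8 {
    unsafe {
        assert!(index < (*env).slot_count);
        (std::ptr::addr_of_mut!((*env).slots) as *mut *mut u8).add(index)
    }
}

/// Free an Env allocated by toy_alloc_env (the GC does this for us
/// in the real runtime).
fn toy_free_env(env: *mut Env) {
    unsafe {
        let layout = env_layout((*env).slot_count);
        dealloc(env as *mut u8, layout);
    }
}
```

The zero-length array contributes nothing to size_of::<Env>(), so the slot storage is exactly the extra slot_count * size_of::<*mut u8>() bytes requested on top of the struct.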
Marking Environments
When we mark a closure, we must also mark its environment. This is the same mark_references function from Part 3 — we’re adding the Environment arm for closures:
#![allow(unused)]
fn main() {
// src/runtime/gc.rs (extended)
fn mark_references(header: *mut ObjHeader) {
let obj_type = unsafe { (*header).obj_type };
let data = unsafe { (header as *mut u8).add(std::mem::size_of::<ObjHeader>()) };
match obj_type {
ObjType::Number | ObjType::String => {
// No references
}
ObjType::Environment => {
let env = data as *mut Environment;
unsafe {
// Mark enclosing environment
if !(*env).enclosing.is_null() {
mark_object((*env).enclosing as *mut u8);
}
// Mark all slots
for i in 0..(*env).slot_count {
let slot = (*env).get(i);
if !slot.is_null() {
mark_object(slot);
}
}
}
}
ObjType::Closure => {
let closure = data as *mut Closure;
unsafe {
// Mark the environment
let env = (*closure).environment;
if !env.is_null() {
mark_object(env as *mut u8);
}
}
}
ObjType::Instance => {
// (same as before)
}
}
}
}
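To see which objects a mark pass reaches, here is a toy model, offered as a sketch rather than the runtime's code. It assumes objects are identified by index into a Vec instead of raw pointers, and it uses an explicit worklist instead of the recursive mark_object calls above; Obj and mark_from are hypothetical names.

```rust
/// Toy heap object: only the reference-carrying shape matters here.
enum Obj {
    Number(f64),
    Environment { enclosing: Option<usize>, slots: Vec<Option<usize>> },
    Closure { environment: Option<usize> },
}

/// Mark everything reachable from `root`. `marked[i]` plays the role
/// of the `marked` bit in the object header.
fn mark_from(heap: &[Obj], marked: &mut [bool], root: usize) {
    let mut worklist = vec![root];
    while let Some(idx) = worklist.pop() {
        if marked[idx] {
            continue; // already visited
        }
        marked[idx] = true;
        match &heap[idx] {
            Obj::Number(_) => {} // no outgoing references
            Obj::Environment { enclosing, slots } => {
                worklist.extend(enclosing.iter().copied());
                worklist.extend(slots.iter().flatten().copied());
            }
            Obj::Closure { environment } => {
                worklist.extend(environment.iter().copied());
            }
        }
    }
}
```

Rooting only the closure is enough to keep its environment and the boxed count alive, while an unrelated Number stays unmarked and would be swept.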
Code Generation for Closures
Now let’s generate code for closures in our compiler:
The Challenge
When compiling a closure, we need to:
- Identify captured variables - which variables from outer scopes are used?
- Allocate an environment - create heap storage for captured variables
- Store captured values - copy values into the environment
- Access via environment - when the closure reads/writes captured vars, go through the environment
Step 1: Variable Analysis
#![allow(unused)]
fn main() {
// src/analysis/captures.rs
use crate::ast::*;
/// A variable that is captured by a closure
#[derive(Debug, Clone)]
pub struct CapturedVar {
pub name: String,
pub depth: usize, // How many enclosing environments to follow (0 = immediately enclosing)
pub slot_index: usize, // Index in the environment
}
/// Analyze a function to find captured variables
pub fn find_captures(func: &FunctionStmt) -> Vec<CapturedVar> {
let mut analyzer = CaptureAnalyzer::new();
analyzer.analyze_function(func);
analyzer.captures
}
struct CaptureAnalyzer {
scopes: Vec<Vec<String>>, // Stack of local variables in each scope
captures: Vec<CapturedVar>,
current_slot: usize,
}
impl CaptureAnalyzer {
fn new() -> Self {
Self {
scopes: vec![vec![]], // Start with one scope for parameters
captures: Vec::new(),
current_slot: 0,
}
}
fn analyze_function(&mut self, func: &FunctionStmt) {
// Parameters are in scope 0
for param in &func.params {
self.scopes[0].push(param.clone());
}
// Analyze body
for stmt in &func.body {
self.analyze_stmt(stmt);
}
}
fn analyze_stmt(&mut self, stmt: &Stmt) {
match stmt {
Stmt::Var(v) => {
self.scopes.last_mut().unwrap().push(v.name.clone());
self.analyze_expr(&v.init);
}
Stmt::Block(b) => {
self.scopes.push(vec![]);
for s in &b.statements {
self.analyze_stmt(s);
}
self.scopes.pop();
}
Stmt::Print(p) => self.analyze_expr(&p.expression),
Stmt::Return(r) => {
if let Some(value) = &r.value {
self.analyze_expr(value);
}
}
Stmt::Function(f) => {
// The function name is local to this scope
self.scopes.last_mut().unwrap().push(f.name.clone());
// But we don't recurse into the body here —
// the compiler processes each function separately
// (see "Compilation Order: Inside Out" below).
}
Stmt::Expression(e) => self.analyze_expr(&e.expression),
Stmt::If(i) => {
self.analyze_expr(&i.condition);
self.analyze_stmt(&i.then_branch);
if let Some(else_branch) = &i.else_branch {
self.analyze_stmt(else_branch);
}
}
Stmt::While(w) => {
self.analyze_expr(&w.condition);
self.analyze_stmt(&w.body);
}
}
}
fn analyze_expr(&mut self, expr: &Expr) {
match expr {
Expr::Variable(v) => {
// Is this variable captured? (not in any local scope)
if !self.is_local(&v.name) {
// It's a capture!
if !self.captures.iter().any(|c| c.name == v.name) {
self.captures.push(CapturedVar {
name: v.name.clone(),
// depth = how many enclosing environments to follow
// at runtime. A variable captured from the immediately
// enclosing function is at depth 0 (first env up).
// For nested closures (capturing from two+ levels up),
// extending the scope analysis to track how many
// function boundaries are between the reference and
// its definition would compute the actual depth.
// Our compiler handles single-level
// captures — captured variables are always in the
// immediately enclosing environment.
depth: 0,
slot_index: self.current_slot,
});
self.current_slot += 1;
}
}
}
// Recursive cases — the capture analysis doesn't change
// behavior for these, it traverses into sub-expressions
// to find more variable references.
Expr::Binary(b) => {
self.analyze_expr(&b.left);
self.analyze_expr(&b.right);
}
Expr::Unary(u) => self.analyze_expr(&u.right),
Expr::Call(c) => {
self.analyze_expr(&c.callee);
for arg in &c.arguments {
self.analyze_expr(arg);
}
}
Expr::Assign(a) => {
// Assignment to a variable might be a capture too
if !self.is_local(&a.name) {
if !self.captures.iter().any(|c| c.name == a.name) {
self.captures.push(CapturedVar {
name: a.name.clone(),
depth: 0,
slot_index: self.current_slot,
});
self.current_slot += 1;
}
}
self.analyze_expr(&a.value);
}
Expr::Literal(_) => {} // No sub-expressions
Expr::Logical(l) => {
self.analyze_expr(&l.left);
self.analyze_expr(&l.right);
}
Expr::Grouping(g) => self.analyze_expr(&g.expression),
}
}
fn is_local(&self, name: &str) -> bool {
// Check ALL scopes — a variable from an enclosing block in the
// same function is still local, not a capture.
//
// Variables from enclosing *functions* aren't in self.scopes at all
// (we only track the current function's scopes), so they naturally
// fail this check and are classified as captures. That's correct:
// a variable that isn't local to this function must be captured.
self.scopes.iter().any(|scope| scope.iter().any(|n| n == name))
}
// depth_of is shown for reference but not used in our simplified
// capture analysis — it computes the scope depth within the current
// function. For captured variables (not in self.scopes), this would
// panic. A full compiler would track the number of enclosing
// function boundaries between the reference and the definition to
// compute the environment chain depth.
fn _depth_of(&self, name: &str) -> usize {
for (depth, scope) in self.scopes.iter().rev().enumerate() {
if scope.contains(&name.to_string()) {
return depth;
}
}
panic!("Variable not found: {}", name);
}
}
}
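The core of the analysis fits in a few lines once the AST is stripped down. The sketch below assumes a made-up four-variant Expr and a flat list of local names; free_vars and note_use are hypothetical helpers, not part of the tutorial's compiler. A name's position in the output plays the role of slot_index.

```rust
/// Hypothetical mini-AST, just enough to show the idea.
enum Expr {
    Num(f64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    Assign(String, Box<Expr>),
}

/// Collect names used in `expr` that are not in `locals`, in
/// first-use order, without duplicates.
fn free_vars(expr: &Expr, locals: &[&str], out: &mut Vec<String>) {
    match expr {
        Expr::Num(_) => {}
        Expr::Var(name) => note_use(name, locals, out),
        Expr::Add(left, right) => {
            free_vars(left, locals, out);
            free_vars(right, locals, out);
        }
        Expr::Assign(name, value) => {
            // Writing to a non-local variable captures it too.
            note_use(name, locals, out);
            free_vars(value, locals, out);
        }
    }
}

/// Record `name` as captured unless it is local or already recorded.
fn note_use(name: &str, locals: &[&str], out: &mut Vec<String>) {
    let is_local = locals.iter().any(|l| *l == name);
    if !is_local && !out.iter().any(|n| n == name) {
        out.push(name.to_string());
    }
}
```

For the body of counter, count = count + 1, with no locals, this reports a single capture, count, in slot 0.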
Step 2: Environment Allocation
#![allow(unused)]
fn main() {
// Continuing in src/codegen.rs — same imports as Part 4
use std::collections::HashMap;
// (BlockLike, RegionLike, OperationLike must be in scope)
// src/codegen/generator.rs (extended)
impl<'c> CodeGenerator<'c> {
fn compile_function(&self, func: &FunctionStmt, variables: &mut HashMap<String, Value<'c, 'c>>, current_env: &mut Option<Value<'c, 'c>>) {
// current_env is &mut because we assign to it below (*current_env = Some(env)),
// not because alloc_environment needs mutability — it takes &Option<Value>.
// ... setup ...
// Find captured variables
let captures = find_captures(func);
// If we have captures, we need an environment
if !captures.is_empty() {
let env = self.alloc_environment(block, captures.len(), current_env);
// Store captured values into the environment
for capture in &captures {
// get_capture_value looks up the variable in the enclosing
// function's variables map, possibly traversing the scope
// chain (capture.depth levels up) to find it.
let value = self.get_capture_value(&capture.name, capture.depth);
self.store_to_environment(block, env, capture.slot_index, value);
}
// The environment is now available for inner functions
*current_env = Some(env);
}
// ... compile body ...
}
fn alloc_environment(&self, block: &Block<'c>, slot_count: usize, current_env: &Option<Value<'c, 'c>>) -> Value<'c, 'c> {
let location = Location::unknown(self.context);
// Call lox_runtime_alloc_environment(slot_count, enclosing)
let slot_count_val = self.const_i64(slot_count as i64);
let enclosing_val = current_env
.unwrap_or_else(|| self.const_null());
let call = block.append_operation(func::call(
self.context,
melior::ir::attribute::FlatSymbolRefAttribute::new(
self.context,
"lox_runtime_alloc_environment"
),
&[slot_count_val, enclosing_val],
&[Type::parse(self.context, "!llvm.ptr").unwrap()],
location,
));
call.result(0).unwrap().into()
}
fn store_to_environment(&self, block: &Block<'c>, env: Value<'c, 'c>, index: usize, value: Value<'c, 'c>) {
let location = Location::unknown(self.context);
// The C runtime handles boxing the f64 into a heap-allocated
// Number and storing the pointer in the environment slot.
block.append_operation(func::call(
self.context,
melior::ir::attribute::FlatSymbolRefAttribute::new(
self.context,
"lox_runtime_env_set_number"
),
&[env, self.const_i64(index as i64), value],
&[],
location,
));
}
}
Why
lox_runtime_env_set_numberinstead oflox_runtime_env_set?The generic
lox_runtime_env_settakes avoid*parameter. Our values aref64. You might think: “cast the f64 to a pointer and pass it.” On x86-64, that breaks. The C calling convention passesvoid*arguments in general-purpose registers (rdi, rsi, rdx…) but passesdoublearguments in XMM registers (xmm0, xmm1…). If the callee expects a pointer in rdx but you passed a double in xmm0, the callee reads the wrong register and gets garbage.The type-specific
lox_runtime_env_set_numbertakesdoubledirectly, so the compiler puts the value in the right register. The C runtime then boxes thedoubleinto a heap-allocatedNumberobject and stores that pointer in the environment slot. This is the kind of calling-convention trap that only bites you when you cross language boundaries — MLIR’sfunc.calldoesn’t save you from the C ABI’s register rules.
#![allow(unused)]
fn main() {
/// Get a pointer to a captured variable in the closure environment.
/// The environment stores pointers to heap objects; `env_get` returns
/// a pointer to the object's data area (past ObjHeader), so you can
/// `llvm.load` the f64 directly.
fn env_get_ptr(&self, block: &Block<'c>, env: Value<'c, 'c>, index: usize) -> Value<'c, 'c> {
let location = Location::unknown(self.context);
let call = block.append_operation(func::call(
self.context,
melior::ir::attribute::FlatSymbolRefAttribute::new(
self.context,
"lox_runtime_env_get"
),
&[env, self.const_i64(index as i64)],
&[Type::parse(self.context, "!llvm.ptr").unwrap()],
location,
));
call.result(0).unwrap().into()
}
/// Load a captured variable's value from the closure environment.
/// Two steps: get the pointer (env_get_ptr), then load the f64.
fn load_from_environment(&self, block: &Block<'c>, env: Value<'c, 'c>, index: usize) -> Value<'c, 'c> {
let ptr = self.env_get_ptr(block, env, index);
let location = Location::unknown(self.context);
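// Note: recent melior releases declare llvm::load as
// (context, addr, type, location, LoadStoreOptions). Check the
// signature of the melior version you build against.
let loaded = block.append_operation(llvm::load(
self.context,
ptr,
Type::float64(self.context),
location,
melior::dialect::llvm::LoadStoreOptions::default(),
));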
loaded.result(0).unwrap().into()
}
}
}
How Does the Closure Receive Its Environment?
The code above shows the enclosing function creating an environment and storing captured values into it. But the closure itself is compiled as a separate function — and it needs an %env parameter to access those captured variables.
Here’s the missing link. When the compiler compiles a closure’s body, it does two things differently from a regular function:
- The function signature includes an %env parameter. You’ll see this in the MLIR for @counter below: the first argument is always the environment pointer.
- current_env is initialized from that parameter, not from alloc_environment. The allocation happened in the enclosing function, and the closure receives the pointer through the calling convention.
When the compiler encounters a reference to a captured variable like count, it calls load_from_environment with the %env parameter instead of looking up the variable in the stack-allocated variables HashMap. Local variables (the function’s own parameters and locals) still go through variables; only captured variables go through the environment.
This split is what find_captures determines — it classifies each variable reference as local or captured, and the compiler routes accordingly.
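The routing decision can be pictured as a small classification step. This is a sketch under simplifying assumptions: locals and captures are plain name lists, a capture's position is its slot, and VarAccess and classify are hypothetical names rather than the compiler's actual API.

```rust
/// Where the code generator should look for a variable.
#[derive(Debug, PartialEq)]
enum VarAccess {
    /// Use the value in the `variables` map.
    Local,
    /// Go through the %env parameter at this slot.
    Captured { slot_index: usize },
}

/// Classify a variable reference inside a closure body.
/// Returns None when the name is neither local nor captured.
fn classify(name: &str, locals: &[&str], captures: &[&str]) -> Option<VarAccess> {
    if locals.iter().any(|l| *l == name) {
        return Some(VarAccess::Local);
    }
    captures
        .iter()
        .position(|c| *c == name)
        .map(|slot_index| VarAccess::Captured { slot_index })
}
```

For adder(y) capturing x, classify("y", ...) gives Local while classify("x", ...) gives Captured { slot_index: 0 }, which is the case where the compiler would emit load_from_environment.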
The Complete Picture
Here’s how everything connects for our makeCounter example:
Lox Source
fun makeCounter() {
var count = 0;
fun counter() {
count = count + 1;
return count;
}
return counter;
}
How Does the Runtime Call a Closure?
Before we look at the generated MLIR, there’s a question you should be asking: when the Lox program does c(), how does the runtime know which function to call and what to pass it?
The closure object has two fields: function_index and environment. The calling convention works like this:
When the Lox program does: c()
1. Load c from the stack (it's a pointer to a Closure object)
2. Read the Closure's function_index (e.g., 1)
3. Read the Closure's environment pointer (e.g., 0x2000)
4. Look up function_index 1 in a function table → @counter
5. Call @counter(environment)
The function table is an array the compiler builds at module creation time. Each closure-generating function gets an index. When alloc_closure(1, %env) is called, the 1 refers to @counter’s position in that table.
In the generated MLIR, this looks like:
// In the caller (e.g., main or another function)
%closure = <load closure from variable>
// Read function_index and environment from the closure object
%func_idx = llvm.load %closure[0] : !llvm.ptr -> i64
%env_ptr = llvm.load %closure[8] : !llvm.ptr -> !llvm.ptr
// Indirect call through the function table
%func_table = llvm.mlir.addressof @lox_function_table : !llvm.ptr
%func_ptr = llvm.getelementptr %func_table[%func_idx] : (!llvm.ptr, i64) -> !llvm.ptr
%callee = llvm.load %func_ptr : !llvm.ptr -> !llvm.ptr
%result = llvm.call %callee(%env_ptr) : !llvm.ptr, !llvm.ptr -> f64
This is an indirect call. The target function isn’t known at compile time — it depends on which closure the variable holds at runtime. That’s the cost of first-class functions. The benefit is that c() works regardless of which closure c points to, whether it’s a counter, an adder, or anything else.
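The calling convention described above can be mimicked in ordinary Rust with a table of function pointers. This is a rough sketch: ToyClosure, FUNCTION_TABLE, and call_closure are invented for illustration, and the environment is simplified to a slice of f64 instead of a GC-managed Environment.

```rust
/// Every compiled closure body takes its environment as first argument.
type ClosureFn = fn(&mut [f64]) -> f64;

/// Stand-in for the Closure heap object.
struct ToyClosure {
    function_index: usize,
    environment: Vec<f64>,
}

/// Body of a closure that only reads slot 0.
fn peek(env: &mut [f64]) -> f64 {
    env[0]
}

/// Body of the counter closure: increment slot 0 and return it.
fn counter(env: &mut [f64]) -> f64 {
    env[0] += 1.0;
    env[0]
}

/// Hypothetical function table, indexed by `function_index`.
const FUNCTION_TABLE: [ClosureFn; 2] = [peek, counter];

/// Table lookup followed by an indirect call with the environment.
fn call_closure(closure: &mut ToyClosure) -> f64 {
    let callee = FUNCTION_TABLE[closure.function_index];
    callee(&mut closure.environment)
}
```

Nothing in call_closure knows which function it is about to run; the closure object carries both the code (as an index) and the data (the environment), which is exactly what makes the call indirect.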
For closures that take their own parameters (like @apply in the “Multiple Captured Variables” section below, which takes (%env: !llvm.ptr, %x: f64, %y: f64)), the environment pointer is followed by the function’s own arguments in the indirect call: llvm.call %callee(%env_ptr, %x, %y). The calling convention is: environment first, then the closure’s declared parameters.
Simplification: A production compiler would use MLIR’s call operation with a symbol reference when the target is known at compile time (direct call) and fall back to indirect calls only for closures stored in variables. Our compiler uses indirect calls for all closure invocations for simplicity.
Now you know how closures are called. With that in place, the generated MLIR will make sense — when you see @counter(%env: !llvm.ptr), the %env comes from the calling convention above.
Generated MLIR (Simplified)
module {
// makeCounter creates an environment for 'count'
func.func @makeCounter() -> !llvm.ptr {
// Push frame with 1 root: the environment must survive across
// the alloc_closure call — if GC triggers there, %env needs to
// be on the shadow stack or it could be collected.
%frame = lox.push_frame root_count = 1 : !llvm.ptr
// Allocate environment with 1 slot
%env = func.call @lox_runtime_alloc_environment(1, null) : (i64, !llvm.ptr) -> !llvm.ptr
// Root the environment before any further allocation
lox.set_root index = 0, %env : !llvm.ptr
// Initialize count = 0 (env_set_number boxes the f64 into a heap Number)
%zero = arith.constant 0.0 : f64
func.call @lox_runtime_env_set_number(%env, 0, %zero) : (!llvm.ptr, i64, f64) -> ()
// Create closure for counter
// (counter_index = 1, env = %env)
// Safe: %env is rooted, so GC during alloc_closure can find it
%closure = func.call @lox_runtime_alloc_closure(1, %env) : (i64, !llvm.ptr) -> !llvm.ptr
lox.pop_frame
func.return %closure : !llvm.ptr
}
// counter accesses 'count' via environment
func.func @counter(%env: !llvm.ptr) -> f64 {
%frame = lox.push_frame root_count = 1 : !llvm.ptr
lox.set_root index = 0, %env : !llvm.ptr
// Load count from environment (returns a pointer to the heap object)
%count_ptr = func.call @lox_runtime_env_get(%env, 0) : (!llvm.ptr, i64) -> !llvm.ptr
%count = llvm.load %count_ptr : !llvm.ptr -> f64
// count = count + 1
%one = arith.constant 1.0 : f64
%new_count = arith.addf %count, %one : f64
// Store back to environment (env_set_number boxes the new f64)
func.call @lox_runtime_env_set_number(%env, 0, %new_count) : (!llvm.ptr, i64, f64) -> ()
lox.pop_frame
func.return %new_count : f64
}
}
Memory Layout After var c = makeCounter();
Stack:
┌─────────────────┐
│ c = 0x1000 │──┐
└─────────────────┘ │
▼
Heap:
┌─────────────────────────┐ 0x1000: Closure
│ header: { Closure } │
│ function_index: 1 │
│ environment: 0x2000 ────│──┐
└─────────────────────────┘ │
▼
┌─────────────────────────┐ 0x2000: Environment
│ header: { Environment } │
│ enclosing: null │
│ slot_count: 1 │
│ slot[0]: 0.0 ───────────│──┐
└─────────────────────────┘ │
│
(count = 0.0) ◄──────────────┘
Nested Closures Deep Dive
Let’s trace through a more complex example with nested closures:
The Code
fun makeAdder(x) {
fun adder(y) {
return x + y; // Captures 'x' from makeAdder
}
return adder;
}
var add5 = makeAdder(5);
var add10 = makeAdder(10);
print add5(3); // 8
print add10(3); // 13
Step-by-Step Execution
1. makeAdder(5) is called:
Stack:
┌─────────────────────────────┐
│ Frame: makeAdder │
│ x = 5 (parameter) │
└─────────────────────────────┘
│
▼ Creates environment
┌─────────────────────────────┐
│ Environment 0x1000 │
│ slot[0]: 5 (x) │
└─────────────────────────────┘
│
▼ Creates closure
┌─────────────────────────────┐
│ Closure 0x2000 (adder) │
│ function_index: 1 │
│ environment: 0x1000 ──────│──► env with x=5
└─────────────────────────────┘
Returns closure 0x2000, assigned to add5
makeAdder stack frame is destroyed, but environment lives on!
2. makeAdder(10) is called:
Creates NEW environment 0x3000 with x=10
Creates NEW closure 0x4000 pointing to environment 0x3000
add5 → closure 0x2000 → env 0x1000 (x=5)
add10 → closure 0x4000 → env 0x3000 (x=10)
3. add5(3) is called:
Stack:
┌─────────────────────────────┐
│ Frame: adder │
│ y = 3 (parameter) │
│ env = 0x1000 (from closure)│
└─────────────────────────────┘
Load x from env[0] = 5
Compute 5 + 3 = 8
Return 8
4. add10(3) is called:
Same process, but env = 0x3000
Load x from env[0] = 10
Compute 10 + 3 = 13
Return 13
Key Insight: Each Call Creates a Separate Environment
makeAdder(5):
→ Creates env {x: 5}
→ Creates closure pointing to that env
makeAdder(10):
→ Creates NEW env {x: 10}
→ Creates NEW closure pointing to NEW env
The two closures share no state!
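The same independence shows up in a Rust analogue. This is only an illustration, with make_adder as a made-up name: each call moves its own x into a fresh closure, much as each makeAdder call allocates a fresh environment.

```rust
/// Each call captures its own `x`; closures from different calls
/// share nothing.
fn make_adder(x: f64) -> impl Fn(f64) -> f64 {
    move |y| x + y
}
```

With add5 = make_adder(5.0) and add10 = make_adder(10.0), calling either one has no effect on the other.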
Multiple Captured Variables
What if we capture multiple variables?
fun makeOffsetter(offsetX, offsetY) {
fun apply(x, y) {
// Captures both offsetX and offsetY
print offsetX + x;
print offsetY + y;
}
return apply;
}
var shift = makeOffsetter(10, 20);
shift(1, 2); // prints 11, then 22
Environment Layout
Environment:
slot[0]: offsetX
slot[1]: offsetY
When apply is called:
1. Load offsetX from env[0]
2. Load offsetY from env[1]
3. Compute offsetX + x and print
4. Compute offsetY + y and print
Generated Code
func.func @apply(%env: !llvm.ptr, %x: f64, %y: f64) {
// Load captured variables (env_get returns pointers, then we load the f64 values)
%offsetX_ptr = func.call @lox_runtime_env_get(%env, 0) : (!llvm.ptr, i64) -> !llvm.ptr
%offsetX = llvm.load %offsetX_ptr : !llvm.ptr -> f64
%offsetY_ptr = func.call @lox_runtime_env_get(%env, 1) : (!llvm.ptr, i64) -> !llvm.ptr
%offsetY = llvm.load %offsetY_ptr : !llvm.ptr -> f64
// Compute
%sum_x = arith.addf %offsetX, %x : f64
%sum_y = arith.addf %offsetY, %y : f64
// Print each result
func.call @lox_print(%sum_x) : (f64) -> ()
func.call @lox_print(%sum_y) : (f64) -> ()
func.return
}
Nested Environments
Beyond our simplified model. The find_captures analysis hardcodes depth: 0 because our compiler handles single-level captures: captured variables are always in the immediately enclosing environment. The nested example below shows what the runtime can do: environment chains, depth traversal, and lox_runtime_env_get_enclosing are all real runtime machinery. But producing the correct depth values would require extending the compile-time scope analysis to track how many function boundaries are between the variable reference and its definition (a local depth computation during the existing scope walk, not a whole-program pass). The MLIR shown here is what a full compiler would generate; our simplified compiler would emit depth: 0 for all captures and wouldn’t handle the inner case correctly.
There are two gaps between the code shown and the nested example below. First, CaptureAnalyzer doesn’t compute actual depths: it always emits depth: 0. A full implementation would replace the hardcoded depth: 0 with a scope-walk that counts function boundaries between the reference and the definition. Second, the Stmt::Function arm pushes the function name into scope but doesn’t recurse into the body. This means captures from inner functions never flow upward to their enclosing functions. The “Compilation Order: Inside Out” section below explains the concept (inner functions determine what outer functions must capture). Here’s the mechanism that makes it work:
fn propagate_captures(
    inner_captures: &[CapturedVar],
    outer_scope: &Scope,
) -> Vec<CapturedVar> {
    // For each variable that the inner function captured
    // from beyond its immediately enclosing function,
    // the enclosing function must also capture it
    // (one level closer to the definition).
    inner_captures
        .iter()
        .filter(|c| c.depth > 0 || !outer_scope.is_local(&c.name))
        .map(|c| CapturedVar {
            name: c.name.clone(),
            depth: c.depth.saturating_sub(1),
            ..*c
        })
        .collect()
}
The key insight: if inner captures a at depth 1 (it’s in outer, one level beyond middle), then middle must capture a at depth 0 (it’s in outer, the immediately enclosing function). Each function in the chain captures at one level less depth than the function below it. This is the upward propagation that our simplified find_captures doesn’t implement, but that a full compiler needs.
Our simplified compiler emits depth: 0 for all captures. This is correct for single-level captures — the variable is always in the immediately enclosing environment. But it produces wrong code when a variable is two or more scopes away. The inner function needs a at depth 1, not depth 0, and our find_captures can’t compute that. The nested example below is here to show how the runtime would work if the analysis could produce the right depth values.
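One way the missing depth computation could look is sketched below. It assumes the analyzer keeps one list of declared names per enclosing function, outermost first; Resolution and resolve_depth are hypothetical names and this is not code from the tutorial's compiler.

```rust
/// Result of resolving a name from inside the innermost function.
#[derive(Debug, PartialEq)]
enum Resolution {
    /// Declared in the function that contains the reference.
    Local,
    /// Declared in an enclosing function; depth 0 is the
    /// immediately enclosing one.
    Captured { depth: usize },
}

/// `function_scopes` lists the names declared in each function on the
/// nesting path, outermost first; the last entry is the function
/// containing the reference.
fn resolve_depth(function_scopes: &[Vec<&str>], name: &str) -> Option<Resolution> {
    // Walk outward, counting function boundaries crossed so far.
    for (crossed, scope) in function_scopes.iter().rev().enumerate() {
        if scope.iter().any(|n| *n == name) {
            return Some(if crossed == 0 {
                Resolution::Local
            } else {
                // Crossing one boundary lands in the immediately
                // enclosing function, which is depth 0.
                Resolution::Captured { depth: crossed - 1 }
            });
        }
    }
    None // not declared in any enclosing function (for example, a global)
}
```

For a reference inside inner, with outer declaring a and middle declaring b, this yields depth 0 for b and depth 1 for a, matching the depths the runtime's chain traversal expects.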
⚠️ The MLIR below is what a full compiler generates. The code shown earlier in this chapter does not produce this output for nested closures.
What if a closure captures a variable from two scopes up?
fun outer() {
var a = 1;
fun middle() {
var b = 2;
fun inner() {
return a + b; // a from outer, b from middle
}
return inner;
}
return middle;
}
Environment Chain
outer() creates:
env_outer { a: 1 }
middle() creates:
env_middle { b: 2, enclosing: env_outer }
↑
Points to outer's environment
inner() closure:
environment: env_middle
When inner() runs:
1. Look up 'a': not in env_middle → follow enclosing → found in env_outer
2. Look up 'b': found in env_middle
Variable Lookup Algorithm
#![allow(unused)]
fn main() {
/// Look up a variable by traversing the environment chain
fn lookup_variable(env: *mut Environment, depth: usize, slot: usize) -> *mut u8 {
let mut current = env;
// Walk up 'depth' levels through the enclosing chain.
// depth=0 means the variable is in the current environment,
// depth=1 means one level up, etc.
for _ in 0..depth {
current = unsafe { (*current).enclosing };
assert!(!current.is_null(), "enclosing environment is null at depth > 0");
}
// Now access the slot
unsafe { (*current).get(slot) }
}
}
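For experimenting with the chain walk without raw pointers, a safe approximation is sketched here. It assumes environments are linked with Rc and hold plain f64 slots; EnvNode and lookup are illustrative names, and a missing level or slot returns None instead of asserting.

```rust
use std::rc::Rc;

/// Safe stand-in for Environment: a parent link plus number slots.
struct EnvNode {
    enclosing: Option<Rc<EnvNode>>,
    slots: Vec<f64>,
}

/// Follow `depth` enclosing links, then read `slot`.
/// Returns None if the chain is too short or the slot is out of range.
fn lookup(env: &EnvNode, depth: usize, slot: usize) -> Option<f64> {
    let mut current = env;
    for _ in 0..depth {
        current = current.enclosing.as_deref()?;
    }
    current.slots.get(slot).copied()
}
```

With env_outer holding a = 1 and env_middle holding b = 2 and pointing at env_outer, lookup(&env_middle, 0, 0) finds b and lookup(&env_middle, 1, 0) finds a.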
Compilation Order: Inside Out
There’s a subtle but important constraint on how we compile closures: inner functions must be compiled before outer functions.
When outer defines middle which defines inner, the compilation order must be:
- Compile inner → discover it captures a (from outer) and b (from middle)
- Compile middle → now we know inner captures b from middle’s scope, so middle must allocate an environment with b in it
- Compile outer → now we know middle captures a from outer’s scope, so outer must allocate an environment with a in it
If we compiled top-down, outer wouldn’t know it needs an environment for a until middle is compiled — and middle wouldn’t know about b until inner is compiled. The capture information flows upward: inner functions determine what outer functions must capture.
This is why the find_captures function analyzes a single function’s body — it finds variables that aren’t local to that function. The compiler then uses this information when compiling the enclosing function to allocate the right environment slots.
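Inside-out ordering is a post-order walk over the tree of nested function declarations. A minimal sketch, assuming a hypothetical FnDecl type that records only a name and its nested functions:

```rust
/// Hypothetical function declaration: a name plus nested functions.
struct FnDecl {
    name: &'static str,
    nested: Vec<FnDecl>,
}

/// Post-order traversal: visit every nested function before the
/// function that contains it.
fn compile_order(func: &FnDecl, out: &mut Vec<&'static str>) {
    for inner in &func.nested {
        compile_order(inner, out);
    }
    out.push(func.name);
}
```

For outer containing middle containing inner, the resulting order is inner, middle, outer, so each function's capture list is known by the time its enclosing function is compiled.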
Generated MLIR
func.func @inner(%env: !llvm.ptr) -> f64 {
// Load 'b' from current environment (depth=0, slot=0)
%b_ptr = func.call @lox_runtime_env_get(%env, 0) : (!llvm.ptr, i64) -> !llvm.ptr
%b = llvm.load %b_ptr : !llvm.ptr -> f64
// Load 'a' from enclosing environment (depth=1, slot=0)
// lox_runtime_env_get_enclosing follows the parent pointer in
// the Environment struct — it must be declared alongside
// lox_runtime_env_get and lox_runtime_env_set in the runtime.
// Signature: lox_runtime_env_get_enclosing(env: !llvm.ptr) -> !llvm.ptr
%env_outer = func.call @lox_runtime_env_get_enclosing(%env) : (!llvm.ptr) -> !llvm.ptr
%a_ptr = func.call @lox_runtime_env_get(%env_outer, 0) : (!llvm.ptr, i64) -> !llvm.ptr
%a = llvm.load %a_ptr : !llvm.ptr -> f64
// Compute
%sum = arith.addf %a, %b : f64
func.return %sum : f64
}
Practice Exercises
Exercise 1: Trace Memory Layout
For this code:
fun factory(value) {
fun getter() {
return value;
}
fun setter(new_value) {
value = new_value;
}
// In a real Lox program, you'd return both closures
// via a class instance or global variables. For this
// exercise, imagine both are available after calling
// factory() — they share the same environment.
return getter;
}
var get = factory(10);
print get(); // 10
// If setter were also available:
// setter(20);
// print get(); // 20
Draw the memory layout after calling factory(10) and assigning the result to get.
Answer:
Stack:
┌─────────────────────────────┐
│ get = 0x1000 (closure) │
└─────────────────────────────┘
Heap:
0x1000: Closure (getter)
function_index: @getter
environment: 0x3000 ──────────┐
│
(If setter were also returned:) │
0x2000: Closure (setter) │
function_index: @setter │
environment: 0x3000 ──────────│── Same environment!
│
0x3000: Environment ◄───────────┘
slot[0]: 10 (value)
Key insight: Both closures share the SAME environment. That’s why setter(20) would affect what getter() returns!
Exercise 2: Variable Analysis
For this function, what variables are captured and where do they go?
fun outer(x, y) {
var a = x + y;
fun inner(b) {
return a + b + x; // Captures a and x
}
return inner;
}
Answer:
Captured variables:
- a (local in outer) → slot 0
- x (parameter in outer) → slot 1
Environment for inner:
env_inner:
slot[0]: a
slot[1]: x
Note: y is NOT captured (not used by inner), so it’s not in the environment.
Initialization order matters: the environment must be allocated early so it can be rooted on the shadow stack before any later allocation (like alloc_closure) triggers GC. If GC ran during alloc_closure and the environment wasn’t rooted, it could be collected — taking a and x with it. The sequence is: allocate env → root env → compute a = x + y → store a in env[0] → store x in env[1] → allocate closure. The stores into the environment can happen in any order after allocation, but the environment must be on the shadow stack before alloc_closure.
Exercise 3: Why Not Copy Values?
Why can’t we copy captured values into the closure directly? Why do we need an environment?
fun example() {
var count = 0;
fun increment() {
count = count + 1; // MODIFIES count!
return count;
}
return increment;
}
Answer:
If we copied count into the closure, increment() would update only its own private copy. The variable in the enclosing function, and any other closure that captured count, would never see the change.
By using an environment, all closures share the same environment. Modifications are visible to every closure that references it, so state is properly shared.
The environment is essentially a shared “box” that holds the variable.
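The difference between copying and sharing can be observed directly in a Rust analogue. This is a sketch with invented names (make_pair_by_copy, make_pair_shared); Rc<Cell<f64>> stands in for the shared environment and is not how the Lox runtime is implemented.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Copy semantics: each closure gets its own snapshot of `count`.
fn make_pair_by_copy() -> (impl FnMut() -> f64, impl Fn() -> f64) {
    let count = 0.0_f64;
    let mut inc_copy = count; // increment's private copy
    let get_copy = count; // getter's private copy
    (
        move || {
            inc_copy += 1.0;
            inc_copy
        },
        move || get_copy,
    )
}

/// Shared box: both closures point at the same heap cell.
fn make_pair_shared() -> (impl Fn() -> f64, impl Fn() -> f64) {
    let count = Rc::new(Cell::new(0.0_f64));
    let for_inc = Rc::clone(&count);
    (
        move || {
            for_inc.set(for_inc.get() + 1.0);
            for_inc.get()
        },
        move || count.get(),
    )
}
```

With copies, the getter keeps returning 0 no matter how often the incrementer runs. With the shared cell, the getter observes every increment, which is the behavior Lox closures require.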
Next: Part 6 — Complete Reference — Closures are the hardest single feature. Before adding more, let’s see the complete numbers-only compiler in one place: every module, every pass, every runtime function. This is the working system that Parts 1–5 built, assembled and running end to end.