MLIR for Lox: Part 8 — Why We Did It This Way
If you’ve made it this far, you probably have questions. Why f64 for everything? Why pass blocks as parameters? Why scf.if instead of bitwise operators? Good — these are the questions any careful reader would ask. This chapter answers them.
Think of this as a checkpoint before we push further into the runtime and linking. The choices we made in Parts 1–7 were deliberate. Understanding why will make the code in Parts 9–11 easier to follow.
The questions:
- Why start with a “numbers only” subset when Lox is dynamically typed?
- Why pass blocks as parameters instead of storing them in a struct field?
- Why scf.if for logical operators instead of arith.andi/arith.ori?
- Why not show the lexer?
- Why a global RUNTIME instead of passing it through?
- How does the transition from f64 to tagged unions work?
- What does this tutorial not cover?
Why Start with a “Numbers Only” Subset?
The most consistent feedback: “Lox is dynamically typed, but the codegen treats everything as f64. That’s wrong.”
It is — for a production compiler. But a tutorial that introduces tagged unions, type-tag checking, and the from_raw/to_raw boundary before the reader understands how MLIR blocks, regions, and lowering work would be 200 lines of boilerplate before the first interesting operation. The reader would quit.
The approach this tutorial takes:
- Parts 1–6: All values are f64. true is 1.0, false is 0.0, nil is 0.0. Arithmetic is arith.addf. Comparison is arith.cmpf. This lets the reader focus on MLIR’s structure (blocks, regions, control flow, lowering) without drowning in tag-checking boilerplate.
- Part 7 (Classes and Instances): Introduces the tagged union representation (!llvm.struct<(i8, i64)>) because classes need LoxValue::Instance, LoxValue::String, and other types that can’t be represented as f64. Every arithmetic operation becomes “check tag → extract payload → operate → re-tag result.” This is the production path.
- Parts 9–11: The runtime and error reporting (Parts 9–10) use the tagged union representation. By this point, the reader understands why: they’ve seen the simpler model and can appreciate what the tags buy them. Part 11 (Cross-Module Linking) goes back to the numbers-only model for its IR examples, because the linking concepts are the same regardless of the value representation.
Honest simplification: A real Lox compiler would use tagged unions from the start. The “numbers only” model is a pedagogical choice, not an engineering one. If you’re building a production compiler, skip straight to the tagged representation.
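To make the cost of the simplification concrete, here is a minimal Rust sketch of the numbers-only encoding. The Literal enum and the encode and is_truthy helpers are hypothetical names invented for this illustration, not the tutorial's AST or codegen types. The truthiness rule (anything other than 0.0 is truthy) is an assumption that follows from false and nil both mapping to 0.0.

```rust
/// Hypothetical literal type, standing in for whatever the AST actually uses.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Literal {
    Number(f64),
    Bool(bool),
    Nil,
}

/// Encode a literal under the numbers-only model: everything becomes an f64.
fn encode(lit: Literal) -> f64 {
    match lit {
        Literal::Number(n) => n,
        Literal::Bool(true) => 1.0,
        Literal::Bool(false) => 0.0,
        Literal::Nil => 0.0,
    }
}

/// Assumed truthiness rule for this model: any value other than 0.0 is truthy.
fn is_truthy(value: f64) -> bool {
    value != 0.0
}
```

Once encoded, nil, false, and the number 0 are indistinguishable. That loss of information is what the tagged representation in Part 7 repairs.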
Why Parameter-Passing for Blocks?
The original code used current_block: Option<Block<'c>> — a struct field holding the current MLIR block. Review caught a critical issue: this doesn’t work with Melior’s ownership model.
In Melior, a Block is moved into a Region via region.append_block(block). After that move, the block is consumed — you can’t hold a reference to it in a struct field. The original code tried to do both:
self.current_block = Some(block); // moves the block into the struct field
region.append_block(block); // error[E0382]: use of moved value: `block`
This won’t compile. Rust’s ownership rules catch it at compile time.
The fix: each compile method takes &Block<'c> as a parameter. The block is created, operations are appended, and it’s moved into a region — all within a single method. No struct field, no double-move.
use std::collections::HashMap;

fn compile_binary(&self, binary: &BinaryExpr, block: &Block<'c>, variables: &mut HashMap<String, Value<'c, 'c>>) -> Value<'c, 'c> {
    let lhs = self.compile_expression(&binary.left, block, variables);
    let rhs = self.compile_expression(&binary.right, block, variables);
    // ...
}
This pattern is more verbose (every method takes block as a parameter), but it’s correct. It also teaches the reader something important: Melior’s ownership model isn’t a nuisance. It prevents you from accidentally creating dangling references to blocks that have been moved into regions.
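The borrow-then-move discipline can be tried without Melior installed. The sketch below uses stand-in Block and Region types written for this illustration; they are not Melior's API. The helper names (compile_constant, compile_add, build_region) are hypothetical as well. One difference to keep in mind: the mock mutates a Vec, so it borrows with &mut Block, whereas the tutorial's methods take &Block<'c>.

```rust
/// Stand-in for Melior's `Block`; it only records operation names.
struct Block {
    ops: Vec<String>,
}

/// Stand-in for Melior's `Region`; appending a block takes it by value.
struct Region {
    blocks: Vec<Block>,
}

impl Region {
    fn append_block(&mut self, block: Block) {
        // `block` is moved into the region here; the caller cannot use it again.
        self.blocks.push(block);
    }
}

/// Hypothetical compile step: borrows the block, appends one op, returns.
fn compile_constant(block: &mut Block, value: f64) {
    block.ops.push(format!("arith.constant {value:?} : f64"));
}

/// Hypothetical compile step for an addition.
fn compile_add(block: &mut Block) {
    block.ops.push("arith.addf".to_string());
}

/// Create the block, fill it through borrows, then move it exactly once.
fn build_region() -> Region {
    let mut region = Region { blocks: Vec::new() };
    let mut block = Block { ops: Vec::new() };
    compile_constant(&mut block, 1.0);
    compile_constant(&mut block, 2.0);
    compile_add(&mut block);
    region.append_block(block); // the single move
    region
}
```

Adding a line that touches block after the append_block call makes the compiler reject the program, which is the same failure mode the struct-field version ran into.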
Why scf.if for Logical Operators?
The original code used arith.andi and arith.ori for and/or. Review caught this: bitwise AND/OR don’t short-circuit.
In Lox, false and crash() should never call crash(). But arith.andi evaluates both operands unconditionally. The fix is scf.if:
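// a and b → if a { b } else { a }
// Lox's `and` returns the left operand when it's falsy,
// and the right operand when it's truthy.
%zero = arith.constant 0.0 : f64
%a_truthy = arith.cmpf une, %a, %zero : f64 // scf.if needs an i1 condition
%result = scf.if %a_truthy -> (f64) {
  // the operations that compute %b go here, so they run only when a is truthy
  scf.yield %b : f64
} else {
  scf.yield %a : f64
}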
This is more IR, but it’s correct. Short-circuit evaluation requires control flow, not bitwise operations. We use scf.if rather than generating cf.cond_br directly for the same reason the rest of the codegen uses scf: it’s structurally simpler (regions instead of explicit blocks and branches), and the --convert-scf-to-cf pass handles the lowering. Generating cf directly would mean managing block arguments and branch targets in every compile method, which is exactly the complexity scf hides.
In the numbers-only model, when a is falsy, a == 0.0, so yielding a in the else-branch is equivalent to yielding the constant 0.0. Yielding the left operand directly is more semantically correct and avoids a redundant constant.
Note: In the tagged union model (Part 7+), the else-branch would yield a tagged Nil value when a is falsy, and the then-branch would yield the second operand with its tag preserved.
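The short-circuit behavior is easy to check outside MLIR. This is a rough Rust model of the numbers-only semantics, not the tutorial's codegen: lox_and and lox_or are hypothetical helpers, and the right operand is passed as a closure so that, like the body of the then-region, it runs only when it is needed.

```rust
/// `a and b` in the numbers-only model. `b` is a closure so that it is
/// evaluated only when `a` is truthy, like the then-region of `scf.if`.
fn lox_and(a: f64, b: impl FnOnce() -> f64) -> f64 {
    if a != 0.0 { b() } else { a }
}

/// `a or b`: the mirror image, evaluating `b` only when `a` is falsy.
fn lox_or(a: f64, b: impl FnOnce() -> f64) -> f64 {
    if a != 0.0 { a } else { b() }
}
```

Calling lox_and(0.0, ...) with a closure that panics or has side effects never invokes that closure. That is the guarantee false and crash() relies on, and it is the one a plain bitwise AND over two already-computed values cannot give.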
Why Not Show the Lexer?
The parser references Token, TokenType, and a tokenize() function. None of these are shown in full.
Because the lexer isn’t interesting in the context of MLIR. It’s a standard scanner — skip whitespace, match keywords, read numbers and strings. Crafting Interpreters Chapter 4 covers it in detail, and the implementation here follows that chapter almost exactly. Showing 200 lines of match arms would add length without adding understanding.
The tutorial provides a LexValue enum and a tokenize() stub so the code compiles, then points the reader at Crafting Interpreters for the full implementation.
Why a Global RUNTIME Instead of Passing It Through?
The JIT execution engine registers function pointers (like lox_print) at build time. These are raw extern "C" functions — they can’t carry a &mut self parameter or a reference to a Runtime struct. So the runtime lives in a global:
use std::sync::{LazyLock, Mutex};

static RUNTIME: LazyLock<Mutex<Runtime>> = LazyLock::new(|| {
    Mutex::new(Runtime::new())
});
This is a pragmatic choice, not an elegant one. Thread-local storage would work, but adds complexity for no benefit in a single-threaded compiler. A global with a Mutex is simple, prevents data races if the compiler ever becomes multi-threaded, and is honest about the constraint.
A production compiler would likely use a more structured approach (like Cranelift’s UserState), but for a tutorial, the global is the right trade-off.
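As a self-contained illustration of the constraint, the sketch below shows an extern "C" function reaching the global. The Runtime struct and its output field are placeholders invented for this example, since the real Runtime is not shown in this part. The f64 parameter on lox_print is also an assumption, based on the numbers-only model. LazyLock requires Rust 1.80 or later.

```rust
use std::sync::{LazyLock, Mutex};

/// Hypothetical runtime state; the tutorial's real `Runtime` is not shown here.
struct Runtime {
    output: Vec<String>,
}

impl Runtime {
    fn new() -> Self {
        Runtime { output: Vec::new() }
    }
}

static RUNTIME: LazyLock<Mutex<Runtime>> = LazyLock::new(|| Mutex::new(Runtime::new()));

/// An `extern "C"` function has a fixed C signature, so it cannot take
/// `&mut Runtime`; it reaches the shared state through the global instead.
extern "C" fn lox_print(value: f64) {
    let mut runtime = RUNTIME.lock().unwrap();
    runtime.output.push(format!("{value}"));
}
```

The lock is held only for the duration of the call, so JIT-compiled code can call lox_print repeatedly without any setup on its side.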
The Tagged Union Graduation
Part 7 (Classes and Instances) switches from f64-only values to the tagged union representation. This is deliberate — classes need tagged unions because LoxValue::Instance, LoxValue::String, and other types can’t be represented as f64. Part 9 (Standard Library and Runtime) continues with tagged unions for the runtime functions (lox_print, lox_clock) that must handle all Lox types.
The transition looks like this:
Before (Parts 1–6, numbers-only):
%one = arith.constant 1.0 : f64
%two = arith.constant 2.0 : f64
%sum = arith.addf %one, %two : f64
After (Part 7+, tagged union):
%undef = llvm.mlir.undef : !llvm.struct<(i8, i64)> // empty struct to fill in
%one_tag = arith.constant 0 : i8 // TAG_NUMBER
%one_val = arith.constant 1.0 : f64
%one_bits = llvm.bitcast %one_val : f64 to i64 // f64 → i64 for the struct
%one_tmp = llvm.insertvalue %one_bits, %undef[1] : !llvm.struct<(i8, i64)>
%one = llvm.insertvalue %one_tag, %one_tmp[0] : !llvm.struct<(i8, i64)>
%two_tag = arith.constant 0 : i8
%two_val = arith.constant 2.0 : f64
%two_bits = llvm.bitcast %two_val : f64 to i64
%two_tmp = llvm.insertvalue %two_bits, %undef[1] : !llvm.struct<(i8, i64)>
%two = llvm.insertvalue %two_tag, %two_tmp[0] : !llvm.struct<(i8, i64)>
// Add: check tags match, extract payloads, bitcast back to f64, add, re-tag
%lhs_tag = llvm.extractvalue %one[0] : !llvm.struct<(i8, i64)>
%rhs_tag = llvm.extractvalue %two[0] : !llvm.struct<(i8, i64)>
%tags_match = arith.cmpi eq, %lhs_tag, %rhs_tag : i8
// ... (tag check + payload extraction + bitcast i64→f64 + arith.addf + bitcast f64→i64 + re-tag)
The “numbers only” model is 3 lines. The tagged model is 10+ lines with error handling. The reader needs to understand the 3-line version before the 10-line version makes sense.
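The elided tag check is easier to read in Rust than in IR. The sketch below mirrors the !llvm.struct<(i8, i64)> layout with a plain struct. TAG_NUMBER = 0 comes from the IR above, while TAG_NIL, the field names, and the add helper are assumptions made for this illustration. It checks that both tags are TAG_NUMBER, which is slightly stricter than the tag-equality comparison in the IR.

```rust
const TAG_NUMBER: u8 = 0; // same value as TAG_NUMBER in the IR above
const TAG_NIL: u8 = 1; // hypothetical value, chosen only for this sketch

/// Mirrors `!llvm.struct<(i8, i64)>`: a one-byte tag plus a 64-bit payload.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct LoxValue {
    tag: u8,
    payload: u64,
}

impl LoxValue {
    fn number(n: f64) -> Self {
        // Store the f64 bit pattern in the payload, like the llvm.bitcast in the IR.
        LoxValue { tag: TAG_NUMBER, payload: n.to_bits() }
    }

    fn nil() -> Self {
        LoxValue { tag: TAG_NIL, payload: 0 }
    }
}

/// check tag -> extract payload -> operate -> re-tag result
fn add(lhs: LoxValue, rhs: LoxValue) -> Result<LoxValue, String> {
    if lhs.tag != TAG_NUMBER || rhs.tag != TAG_NUMBER {
        return Err("operands must be numbers".to_string());
    }
    let sum = f64::from_bits(lhs.payload) + f64::from_bits(rhs.payload);
    Ok(LoxValue::number(sum))
}
```

Every arithmetic operation in the tagged model repeats this shape, which is why the IR grows so quickly once tags are introduced.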
What This Tutorial Doesn’t Cover
No tutorial covers everything. This one doesn’t write custom optimization passes, a significant topic that deserves its own guide. Debug info emission isn’t covered either: MLIR locations are shown, but DWARF generation isn’t. The GC’s integration with the JIT is hand-waved: the GC is implemented, but a production compiler needs precise stack roots, which means generating stack maps alongside the code, and the tutorial doesn’t do that. And string interning in the compiler is skipped: the AST uses String directly, while a production compiler would intern strings for faster comparison and lower memory usage.
Cross-module linking is now covered in Part 11.
Each of these is worth a tutorial of its own. This one focuses on getting the reader from “what is MLIR?” to “I can compile a non-trivial Lox program” without losing them along the way.
Next: Part 9 — Standard Library and Runtime — Our compiler generates MLIR, but it can’t do anything useful yet — there’s no print, no clock, no way to see the output. We’ll build the C runtime that bridges MLIR’s compiled code to the outside world, and see why extern "C" is the one boundary where Rust’s safety guarantees give way to trust.