Introduction
MLIR for Lox: A Practical Guide
This guide reimagines the MLIR tutorial for building a Lox compiler instead of the tensor-based Toy language. If you know Crafting Interpreters, this should feel familiar.
Why MLIR for Lox?
LLVM is powerful but low-level. It doesn't know about:
- Variable scoping rules
- Closure captures
- Dynamic typing
- Lox-specific optimizations
MLIR lets you define a dialect that represents Lox semantics directly, then progressively lower it to LLVM IR. This is how modern languages like Swift, Rust, and Julia work.
Part 1: The Lox Dialect
Instead of tensors, our dialect models Lox's types and operations.
Lox Types
// include/lox/Types.td (conceptual - type list, not code)
lox.nil - The nil type
lox.bool - Boolean type
lox.number - 64-bit float (Lox uses doubles)
lox.string - String type
lox.object - Instance/class type (dynamic)
A Sample Lox Program
// test/add.lox
fun add(a, b) {
return a + b;
}
print add(3, 4); // 7
The MLIR Representation
// output after: loxc test/add.lox --emit-mlir
module {
lox.func @add(%arg0: !lox.number, %arg1: !lox.number) -> !lox.number {
%result = lox.add %arg0, %arg1 : !lox.number
lox.return %result : !lox.number
}
lox.func @main() {
%three = lox.constant 3.0 : !lox.number
%four = lox.constant 4.0 : !lox.number
%sum = lox.call @add(%three, %four) : (!lox.number, !lox.number) -> !lox.number
lox.print %sum : !lox.number
lox.return
}
}
Part 2: Defining the Lox Dialect in TableGen
MLIR uses TableGen (.td files) to declaratively define dialects. This generates C++ boilerplate for you.
Dialect Definition
// include/lox/Lox.td
// The Lox dialect definition
def Lox_Dialect : Dialect {
let name = "lox";
let summary = "A dialect for the Lox programming language";
let description = [{
This dialect represents the Lox language at a high level,
enabling Lox-specific optimizations before lowering to LLVM.
}];
let cppNamespace = "lox";
}
Operation Definitions
// include/lox/Ops.td
// Base class for Lox operations
class Lox_Op<string mnemonic, list<Trait> traits = []> :
Op<Lox_Dialect, mnemonic, traits>;
// Arithmetic: lox.add
def AddOp : Lox_Op<"add", [Pure, SameOperandsAndResultType]> {
let summary = "Add two Lox numbers";
let arguments = (ins Lox_NumberType:$lhs, Lox_NumberType:$rhs);
let results = (outs Lox_NumberType:$result);
let assemblyFormat = "$lhs `,` $rhs attr-dict `:` type($result)";
}
// Subtraction, multiplication, division follow the same pattern
def SubOp : Lox_Op<"sub", [Pure, SameOperandsAndResultType]> { ... }
def MulOp : Lox_Op<"mul", [Pure, SameOperandsAndResultType]> { ... }
def DivOp : Lox_Op<"div", [Pure, SameOperandsAndResultType]> { ... }
// Comparison: lox.less, lox.equal, etc.
def LessOp : Lox_Op<"less", [Pure]> {
let arguments = (ins Lox_NumberType:$lhs, Lox_NumberType:$rhs);
let results = (outs Lox_BoolType:$result);
}
// Variables: lox.declare, lox.assign, lox.load
def DeclareOp : Lox_Op<"declare"> {
let arguments = (ins StrAttr:$name, AnyLoxType:$init);
let results = (outs AnyLoxType:$value);
}
def LoadOp : Lox_Op<"load", [Pure]> {
let arguments = (ins StrAttr:$name);
let results = (outs AnyLoxType:$value);
}
def AssignOp : Lox_Op<"assign"> {
let arguments = (ins StrAttr:$name, AnyLoxType:$value);
}
// Control flow: lox.if, lox.while, lox.for
def IfOp : Lox_Op<"if", [Terminator, NoTerminator]> {
let arguments = (ins Lox_BoolType:$condition);
let regions = (region SizedRegion<1>:$thenRegion,
OptionalRegion<1>:$elseRegion);
}
def WhileOp : Lox_Op<"while", [Terminator]> {
let arguments = (ins Lox_BoolType:$condition);
let regions = (region SizedRegion<1>:$body);
}
// Functions
// Lox semantics that differ from func.func:
// 1. Parameters have no declared types (dynamic typing)
// 2. Return type is optional (implicitly nil if missing)
// 3. Functions are first-class values (can be passed, returned, assigned)
// 4. May capture variables from enclosing scopes (closures)
// 5. Arity is checked at runtime, not compile time
// 6. No function overloading
def FuncOp : Lox_Op<"func", [AffineScope, IsolatedFromAbove]> {
let summary = "A Lox function definition";
let description = [{
Declares a Lox function. Unlike func.func, parameters have no declared
types (Lox is dynamically typed). The return type is also optional —
functions that don't explicitly return implicitly return nil.
Functions are first-class values: they can be passed as arguments,
returned from other functions, and assigned to variables.
}];
let arguments = (ins
SymbolNameAttr:$sym_name, // Function name
ArrayAttr:$param_names, // Parameter names (not types!)
OptionalAttr<TypeAttr>:$return_type // Optional return type
);
let regions = (region AnyRegion:$body);
let assemblyFormat = [{
$sym_name `(` $param_names `)` (`->` $return_type^)? attr-dict $body
}];
let skipDefaultBuilders = 1;
// Builder for functions with no explicit return type (returns nil)
let builders = [
OpBuilder<(ins "StringRef":$name, "ArrayRef<StringRef>":$params)>
];
}
def CallOp : Lox_Op<"call", [Pure]> {
let arguments = (ins FlatSymbolRefAttr:$callee,
Variadic<AnyLoxType>:$operands);
let results = (outs Variadic<AnyLoxType>:$results);
}
def ReturnOp : Lox_Op<"return", [Terminator]> {
let arguments = (ins Optional<AnyLoxType>:$value);
}
// Print builtin
def PrintOp : Lox_Op<"print"> {
let arguments = (ins AnyLoxType:$value);
}
Part 3: From AST to MLIR
This is like Chapter 2 of the Toy tutorial, but for Lox.
The Lox AST
Your parser (from Crafting Interpreters) produces an AST. Here's the full structure:
// include/lox/AST.h
// AST node types (used for dispatching in the visitor)
enum class StmtType {
FUNCTION, RETURN, VAR, IF, WHILE, PRINT, BLOCK, EXPRESSION
};
enum class ExprType {
BINARY, UNARY, LITERAL, GROUPING, VARIABLE, ASSIGN, CALL, LOGICAL
};
// ========================================================================
// Base classes
// ========================================================================
struct Location {
int line;
int column;
};
struct Expr {
ExprType type;
Location loc;
virtual ~Expr() = default;
};
struct Stmt {
StmtType type;
virtual ~Stmt() = default;
};
// ========================================================================
// Statements
// ========================================================================
// A whole program is just a list of top-level statements
struct Program {
std::vector<Stmt*> statements;
};
struct FunctionStmt : Stmt {
std::string name;
std::vector<std::string> params;
std::vector<Stmt*> body;
FunctionStmt(std::string n, std::vector<std::string> p, std::vector<Stmt*> b)
: Stmt{StmtType::FUNCTION}, name(std::move(n)), params(std::move(p)), body(std::move(b)) {}
};
struct ReturnStmt : Stmt {
Expr* value; // Can be null for bare "return;"
ReturnStmt(Expr* v) : Stmt{StmtType::RETURN}, value(v) {}
};
struct VarStmt : Stmt {
std::string name;
Expr* init;
VarStmt(std::string n, Expr* i) : Stmt{StmtType::VAR}, name(std::move(n)), init(i) {}
};
struct IfStmt : Stmt {
Expr* condition;
std::vector<Stmt*> thenBranch;
std::vector<Stmt*> elseBranch; // Empty if no else clause
IfStmt(Expr* c, std::vector<Stmt*> then, std::vector<Stmt*> els)
: Stmt{StmtType::IF}, condition(c), thenBranch(std::move(then)), elseBranch(std::move(els)) {}
};
struct WhileStmt : Stmt {
Expr* condition;
std::vector<Stmt*> body;
WhileStmt(Expr* c, std::vector<Stmt*> b)
: Stmt{StmtType::WHILE}, condition(c), body(std::move(b)) {}
};
struct PrintStmt : Stmt {
Expr* expression;
PrintStmt(Expr* e) : Stmt{StmtType::PRINT}, expression(e) {}
};
struct BlockStmt : Stmt {
std::vector<Stmt*> statements;
BlockStmt(std::vector<Stmt*> stmts) : Stmt{StmtType::BLOCK}, statements(std::move(stmts)) {}
};
struct ExpressionStmt : Stmt {
Expr* expression;
ExpressionStmt(Expr* e) : Stmt{StmtType::EXPRESSION}, expression(e) {}
};
// ========================================================================
// Expressions
// ========================================================================
struct BinaryExpr : Expr {
Expr* left;
TokenType op; // PLUS, MINUS, STAR, SLASH, etc.
Expr* right;
BinaryExpr(Expr* l, TokenType o, Expr* r)
: Expr{ExprType::BINARY}, left(l), op(o), right(r) {}
};
struct UnaryExpr : Expr {
TokenType op; // MINUS, BANG
Expr* right;
UnaryExpr(TokenType o, Expr* r) : Expr{ExprType::UNARY}, op(o), right(r) {}
};
struct LiteralExpr : Expr {
LoxValue value; // Your value type from Crafting Interpreters
LiteralExpr(LoxValue v) : Expr{ExprType::LITERAL}, value(std::move(v)) {}
};
struct GroupingExpr : Expr {
Expr* expression;
GroupingExpr(Expr* e) : Expr{ExprType::GROUPING}, expression(e) {}
};
struct VarExpr : Expr {
std::string name;
VarExpr(std::string n) : Expr{ExprType::VARIABLE}, name(std::move(n)) {}
};
struct AssignExpr : Expr {
std::string name;
Expr* value;
AssignExpr(std::string n, Expr* v)
: Expr{ExprType::ASSIGN}, name(std::move(n)), value(v) {}
};
struct CallExpr : Expr {
Expr* callee;
std::vector<Expr*> arguments;
CallExpr(Expr* c, std::vector<Expr*> args)
: Expr{ExprType::CALL}, callee(c), arguments(std::move(args)) {}
};
struct LogicalExpr : Expr {
Expr* left;
TokenType op; // AND, OR
Expr* right;
LogicalExpr(Expr* l, TokenType o, Expr* r)
: Expr{ExprType::LOGICAL}, left(l), op(o), right(r) {}
};
How the AST is Built
Your parser (from Crafting Interpreters) constructs these nodes while parsing:
// lib/Parser.cpp (simplified excerpt from Crafting Interpreters)
class Parser {
std::vector<Token> tokens;
size_t current = 0;
public:
Program* parse() {
Program* program = new Program();
while (!isAtEnd()) {
program->statements.push_back(declaration());
}
return program;
}
private:
// Top-level declarations (functions, variables, statements)
Stmt* declaration() {
if (match(TokenType::FUN)) return functionDeclaration();
if (match(TokenType::VAR)) return varDeclaration();
return statement();
}
Stmt* functionDeclaration() {
std::string name = consume(TokenType::IDENTIFIER, "Expect function name.");
consume(TokenType::LEFT_PAREN, "Expect '(' after function name.");
std::vector<std::string> params;
if (!check(TokenType::RIGHT_PAREN)) {
do {
params.push_back(consume(TokenType::IDENTIFIER, "Expect parameter name.").lexeme);
} while (match(TokenType::COMMA));
}
consume(TokenType::RIGHT_PAREN, "Expect ')' after parameters.");
consume(TokenType::LEFT_BRACE, "Expect '{' before function body.");
std::vector<Stmt*> body = block();
return new FunctionStmt(name, params, body);
}
Stmt* statement() {
if (match(TokenType::PRINT)) return printStatement();
if (match(TokenType::IF)) return ifStatement();
if (match(TokenType::WHILE)) return whileStatement();
if (match(TokenType::FOR)) return forStatement();
if (match(TokenType::RETURN)) return returnStatement();
if (match(TokenType::LEFT_BRACE)) return new BlockStmt(block());
return expressionStatement();
}
Stmt* printStatement() {
Expr* value = expression();
consume(TokenType::SEMICOLON, "Expect ';' after value.");
return new PrintStmt(value);
}
Stmt* ifStatement() {
consume(TokenType::LEFT_PAREN, "Expect '(' after 'if'.");
Expr* condition = expression();
consume(TokenType::RIGHT_PAREN, "Expect ')' after if condition.");
std::vector<Stmt*> thenBranch = { statement() };
std::vector<Stmt*> elseBranch;
if (match(TokenType::ELSE)) {
elseBranch.push_back(statement());
}
return new IfStmt(condition, thenBranch, elseBranch);
}
std::vector<Stmt*> block() {
std::vector<Stmt*> statements;
while (!check(TokenType::RIGHT_BRACE) && !isAtEnd()) {
statements.push_back(declaration());
}
consume(TokenType::RIGHT_BRACE, "Expect '}' after block.");
return statements;
}
// ... rest of parser ...
// Expression parsing (precedence climbing)
Expr* expression() { return assignment(); }
Expr* assignment() {
Expr* expr = orExpr();
if (match(TokenType::EQUAL)) {
Expr* value = assignment();
// Check if left side is a variable
if (VarExpr* var = dynamic_cast<VarExpr*>(expr)) {
return new AssignExpr(var->name, value);
}
error("Invalid assignment target.");
}
return expr;
}
Expr* orExpr() {
Expr* expr = andExpr();
while (match(TokenType::OR)) {
Expr* right = andExpr();
expr = new LogicalExpr(expr, TokenType::OR, right);
}
return expr;
}
// ... more expression parsing methods ...
Expr* term() {
Expr* expr = factor();
while (match(TokenType::MINUS, TokenType::PLUS)) {
Expr* right = factor();
expr = new BinaryExpr(expr, previous().type, right);
}
return expr;
}
};
Example: Parsing print 1 + 2;
Parser::parse()
└── declaration()
└── statement()
└── printStatement()
├── expression()
│ └── term()
│ └── factor()
│ ├── primary() → LiteralExpr(1)
│ ├── match(PLUS) → true
│ └── factor()
│ └── primary() → LiteralExpr(2)
│ → BinaryExpr(Literal(1), PLUS, Literal(2))
└── consume(SEMICOLON)
→ PrintStmt(BinaryExpr(...))
The MLIR Generator
You write a visitor that walks the AST and emits MLIR. The visitor pattern dispatches based on AST node type — each visit* method handles one node type.
// lib/IR/MLIRGen.cpp
#include "lox/Dialect.h"
#include "lox/Ops.h"
#include "lox/AST.h" // Your AST classes from Crafting Interpreters
class MLIRGenerator : public ExprVisitor, public StmtVisitor {
private:
mlir::OpBuilder builder;
mlir::ModuleOp module;
mlir::Location currentLoc;
public:
// ========================================================================
// Entry point: Convert a whole program (list of statements) to MLIR
// ========================================================================
mlir::ModuleOp generateModule(Program *program) {
module = mlir::ModuleOp::create(builder.getUnknownLoc());
// Visit each top-level statement (function declarations, etc.)
for (Stmt *stmt : program->statements) {
visitStmt(stmt);
}
return module;
}
// ========================================================================
// Statement visitors (called by visitStmt, which dispatches by type)
// ========================================================================
// The dispatcher: called for every statement, routes to the right visit method
void visitStmt(Stmt *stmt) {
// Each Stmt subclass has a type tag we switch on
switch (stmt->type) {
case StmtType::FUNCTION: return visitFunctionStmt((FunctionStmt*)stmt);
case StmtType::RETURN: return visitReturnStmt((ReturnStmt*)stmt);
case StmtType::VAR: return visitVarStmt((VarStmt*)stmt);
case StmtType::IF: return visitIfStmt((IfStmt*)stmt);
case StmtType::WHILE: return visitWhileStmt((WhileStmt*)stmt);
case StmtType::PRINT: return visitPrintStmt((PrintStmt*)stmt);
case StmtType::BLOCK: return visitBlockStmt((BlockStmt*)stmt);
case StmtType::EXPRESSION: return visitExpressionStmt((ExpressionStmt*)stmt);
}
}
void visitFunctionStmt(FunctionStmt *stmt) {
// Create a new function in the module
auto func = builder.create<FuncOp>(currentLoc, stmt->name, stmt->params);
// Push a new block for the function body
auto *entryBlock = func.getBody().emplaceBlock();
builder.setInsertionPointToStart(entryBlock);
// Add block arguments for each parameter
for (const std::string ¶m : stmt->params) {
entryBlock->addArgument(builder.getType<LoxDynamicType>());
}
// Visit all statements in the function body
for (Stmt *bodyStmt : stmt->body) {
visitStmt(bodyStmt);
}
// Add implicit return nil if the function doesn't end with a return
if (!endsWithReturn(func)) {
builder.create<ReturnOp>(currentLoc);
}
}
void visitReturnStmt(ReturnStmt *stmt) {
mlir::Value value = stmt->value ? visitExpr(stmt->value) : nullptr;
builder.create<ReturnOp>(currentLoc, value);
}
void visitVarStmt(VarStmt *stmt) {
mlir::Value init = visitExpr(stmt->init);
builder.create<DeclareOp>(currentLoc, stmt->name, init);
}
void visitIfStmt(IfStmt *stmt) {
mlir::Value cond = visitExpr(stmt->condition);
auto ifOp = builder.create<IfOp>(currentLoc, cond);
// Then branch
builder.setInsertionPointToStart(&ifOp.getThenRegion().front());
for (Stmt *thenStmt : stmt->thenBranch) {
visitStmt(thenStmt);
}
// Else branch (optional)
if (stmt->elseBranch) {
builder.setInsertionPointToStart(&ifOp.getElseRegion().front());
for (Stmt *elseStmt : stmt->elseBranch) {
visitStmt(elseStmt);
}
}
}
void visitWhileStmt(WhileStmt *stmt) {
mlir::Value cond = visitExpr(stmt->condition);
auto whileOp = builder.create<WhileOp>(currentLoc, cond);
builder.setInsertionPointToStart(&whileOp.getBody().front());
for (Stmt *bodyStmt : stmt->body) {
visitStmt(bodyStmt);
}
}
void visitPrintStmt(PrintStmt *stmt) {
mlir::Value value = visitExpr(stmt->expression);
builder.create<PrintOp>(currentLoc, value);
}
void visitExpressionStmt(ExpressionStmt *stmt) {
visitExpr(stmt->expression);
}
void visitBlockStmt(BlockStmt *stmt) {
for (Stmt *blockStmt : stmt->statements) {
visitStmt(blockStmt);
}
}
// ========================================================================
// Expression visitors (called by visitExpr, which dispatches by type)
// ========================================================================
// The dispatcher: called for every expression, routes to the right visit method
// This is what calls visitBinaryExpr, visitLiteralExpr, etc.
mlir::Value visitExpr(Expr *expr) {
currentLoc = convertLocation(expr->loc);
switch (expr->type) {
case ExprType::BINARY: return visitBinaryExpr((BinaryExpr*)expr);
case ExprType::UNARY: return visitUnaryExpr((UnaryExpr*)expr);
case ExprType::LITERAL: return visitLiteralExpr((LiteralExpr*)expr);
case ExprType::GROUPING: return visitGroupingExpr((GroupingExpr*)expr);
case ExprType::VARIABLE: return visitVarExpr((VarExpr*)expr);
case ExprType::ASSIGN: return visitAssignExpr((AssignExpr*)expr);
case ExprType::CALL: return visitCallExpr((CallExpr*)expr);
case ExprType::LOGICAL: return visitLogicalExpr((LogicalExpr*)expr);
}
return nullptr;
}
mlir::Value visitBinaryExpr(BinaryExpr *expr) {
// Recursively visit left and right operands
// These calls go through visitExpr, which dispatches to the right visitor
mlir::Value lhs = visitExpr(expr->left);
mlir::Value rhs = visitExpr(expr->right);
switch (expr->op) {
case TokenType::PLUS:
return builder.create<AddOp>(currentLoc, lhs, rhs);
case TokenType::MINUS:
return builder.create<SubOp>(currentLoc, lhs, rhs);
case TokenType::STAR:
return builder.create<MulOp>(currentLoc, lhs, rhs);
case TokenType::SLASH:
return builder.create<DivOp>(currentLoc, lhs, rhs);
case TokenType::LESS:
return builder.create<LessOp>(currentLoc, lhs, rhs);
case TokenType::LESS_EQUAL:
return builder.create<LessEqualOp>(currentLoc, lhs, rhs);
case TokenType::GREATER:
return builder.create<GreaterOp>(currentLoc, lhs, rhs);
case TokenType::GREATER_EQUAL:
return builder.create<GreaterEqualOp>(currentLoc, lhs, rhs);
case TokenType::EQUAL_EQUAL:
return builder.create<EqualOp>(currentLoc, lhs, rhs);
case TokenType::BANG_EQUAL:
return builder.create<NotEqualOp>(currentLoc, lhs, rhs);
default:
emitError(currentLoc, "Unknown binary operator");
return nullptr;
}
}
mlir::Value visitUnaryExpr(UnaryExpr *expr) {
mlir::Value operand = visitExpr(expr->right);
switch (expr->op) {
case TokenType::MINUS:
return builder.create<NegateOp>(currentLoc, operand);
case TokenType::BANG:
return builder.create<NotOp>(currentLoc, operand);
default:
emitError(currentLoc, "Unknown unary operator");
return nullptr;
}
}
mlir::Value visitLiteralExpr(LiteralExpr *expr) {
if (expr->value.isNumber()) {
return builder.create<ConstantOp>(currentLoc, expr->value.asNumber());
}
if (expr->value.isBool()) {
return builder.create<ConstantOp>(currentLoc, expr->value.asBool());
}
if (expr->value.isString()) {
return builder.create<ConstantOp>(currentLoc, expr->value.asString());
}
// nil
return builder.create<NilOp>(currentLoc);
}
mlir::Value visitGroupingExpr(GroupingExpr *expr) {
// Grouping is just parentheses - emit the inner expression directly
return visitExpr(expr->expression);
}
mlir::Value visitVarExpr(VarExpr *expr) {
return builder.create<LoadOp>(currentLoc, expr->name);
}
mlir::Value visitAssignExpr(AssignExpr *expr) {
mlir::Value value = visitExpr(expr->value);
builder.create<AssignOp>(currentLoc, expr->name, value);
return value;
}
mlir::Value visitCallExpr(CallExpr *expr) {
mlir::Value callee = visitExpr(expr->callee);
// Visit all arguments
llvm::SmallVector<mlir::Value, 4> args;
for (Expr *arg : expr->arguments) {
args.push_back(visitExpr(arg));
}
return builder.create<CallOp>(currentLoc, callee, args);
}
mlir::Value visitLogicalExpr(LogicalExpr *expr) {
mlir::Value left = visitExpr(expr->left);
// Logical AND/OR short-circuit, so they need control flow
// lox.and and lox.or operations handle this
switch (expr->op) {
case TokenType::AND:
return builder.create<AndOp>(currentLoc, left,
/* right is a lazy region */ expr->right);
case TokenType::OR:
return builder.create<OrOp>(currentLoc, left,
/* right is a lazy region */ expr->right);
default:
return nullptr;
}
}
};
How the visitor works:
generateProgram()
│
├── for each statement:
│ │
│ visitStmt(stmt) ───── dispatches based on stmt->type
│ │
│ ├── visitFunctionStmt() if StmtType::FUNCTION
│ ├── visitIfStmt() if StmtType::IF
│ ├── visitWhileStmt() if StmtType::WHILE
│ └── ...
│
└── statements call visitExpr() for their expressions:
│
visitExpr(expr) ───── dispatches based on expr->type
│
├── visitBinaryExpr() if ExprType::BINARY
│ │
│ ├── visitExpr(left) ── recursive!
│ └── visitExpr(right) ── recursive!
│
├── visitLiteralExpr() if ExprType::LITERAL
├── visitCallExpr() if ExprType::CALL
└── ...
Part 4: Lox-Specific Optimizations
This is where MLIR shines. You can write passes that understand Lox semantics.
Example: Constant Folding
// lib/Transforms/LoxOpt.cpp
#include "lox/Ops.h"
#include "mlir/Transforms/GreedyPatternRewriteDriver.h"
// Fold lox.add(constant, constant) -> constant
struct FoldConstantAdd : public OpRewritePattern<AddOp> {
using OpRewritePattern<AddOp>::OpRewritePattern;
LogicalResult matchAndRewrite(AddOp op, PatternRewriter &rewriter) const override {
// Check if both operands are constants
auto lhsDef = op.getLhs().getDefiningOp<ConstantOp>();
auto rhsDef = op.getRhs().getDefiningOp<ConstantOp>();
if (!lhsDef || !rhsDef) return failure();
// Fold the constants
double result = lhsDef.getValue() + rhsDef.getValue();
rewriter.replaceOpWithNewOp<ConstantOp>(op, result);
return success();
}
};
Example: Dead Variable Elimination
// lib/Transforms/LoxOpt.cpp (continued)
// Remove variables that are declared but never used
struct EliminateDeadVariable : public OpRewritePattern<DeclareOp> {
using OpRewritePattern<DeclareOp>::OpRewritePattern;
LogicalResult matchAndRewrite(DeclareOp op, PatternRewriter &rewriter) const override {
// Check if the variable is ever loaded
if (hasUses(op.getName())) return failure();
// Variable is never used, remove it
rewriter.eraseOp(op);
return success();
}
};
Example: Inline Simple Functions
// lib/Transforms/LoxOpt.cpp (continued)
// Inline functions that are small (e.g., just return an expression)
struct InlineSimpleFunctions : public OpRewritePattern<CallOp> {
using OpRewritePattern<CallOp>::OpRewritePattern;
LogicalResult matchAndRewrite(CallOp call, PatternRewriter &rewriter) const override {
FuncOp func = lookupFunction(call.getCallee());
// Only inline small functions
if (!isSimpleEnoughToInline(func)) return failure();
// Clone the function body and substitute arguments
inlineFunction(call, func, rewriter);
return success();
}
};
Part 5: Lowering to Standard Dialects
Now we progressively transform lox.* operations into lower-level MLIR dialects.
Step 1: Lox Dialect → SCF + Arith Dialects
Control flow becomes structured control flow (scf.if, scf.for), arithmetic becomes arith.addf, etc.
// lib/Transforms/LowerToLLVM.cpp
#include "lox/Ops.h"
#include "mlir/Conversion/ArithToLLVM/ArithToLLVM.h"
#include "mlir/Conversion/SCFToControlFlow/SCFToControlFlow.h"
// Convert lox.add to arith.addf
struct AddOpLowering : public OpConversionPattern<AddOp> {
using OpConversionPattern<AddOp>::OpConversionPattern;
LogicalResult matchAndRewrite(AddOp op, OpAdaptor adaptor,
ConversionPatternRewriter &rewriter) const override {
rewriter.replaceOpWithNewOp<mlir::arith::AddFOp>(op, adaptor.getLhs(), adaptor.getRhs());
return success();
}
};
// Convert lox.while to scf.while
struct WhileOpLowering : public OpConversionPattern<WhileOp> {
LogicalResult matchAndRewrite(WhileOp op, OpAdaptor adaptor,
ConversionPatternRewriter &rewriter) const override {
// Convert lox.while body to scf.while
auto whileOp = rewriter.create<scf::WhileOp>(
op.getLoc(), TypeRange{}, adaptor.getCondition());
// Convert the body region
rewriter.inlineRegionBefore(op.getBody(), whileOp.getBody(),
whileOp.getBody().begin());
rewriter.eraseOp(op);
return success();
}
};
Step 2: SCF + Arith → LLVM Dialect
This is handled by built-in MLIR passes:
# Run from build directory or project root
mlir-opt lox.mlir \
--convert-lox-to-scf \
--convert-scf-to-cf \
--convert-cf-to-llvm \
--convert-arith-to-llvm \
--convert-func-to-llvm \
-o lox_llvm.mlir
Step 3: LLVM Dialect → LLVM IR → Machine Code
# Run from build directory or project root
mlir-translate lox_llvm.mlir --mlir-to-llvmir -o lox.ll
llc lox.ll -o lox.s
clang lox.s -o lox
Part 6: Handling Dynamic Typing
Lox is dynamically typed, but MLIR is typed. Options:
Option A: Tagged Values
Every Lox value is a struct with a type tag:
// include/lox/Types.td (type definition)
!lox.value = struct<{
tag: i8, // 0=nil, 1=bool, 2=number, 3=string, 4=object
data: union<i1, f64, !lox.string, !lox.object>
}>
Operations check the tag at runtime:
// lib/Transforms/RuntimeChecks.cpp (generated MLIR for runtime type checking)
// lox.add checks that both operands are numbers
%lhs_tag = llvm.extractvalue %lhs[0] : !lox.value
%rhs_tag = llvm.extractvalue %rhs[0] : !lox.value
%tags_match = arith.cmpi eq, %lhs_tag, %rhs_tag : i8
cf.cond_br %tags_match, ^ok, ^type_error
^ok:
%lhs_num = llvm.extractvalue %lhs[1] : !lox.value
%rhs_num = llvm.extractvalue %rhs[1] : !lox.value
%result = arith.addf %lhs_num, %rhs_num : f64
// ... pack result back into tagged value ...
Option B: Specialize by Type
Generate specialized versions of functions for each type combination:
// lib/Transforms/TypeSpecialization.cpp (generated after type specialization)
// Generic call
lox.call @add(%a, %b)
// After specialization (if both are numbers)
lox.call @add_numbers(%a, %b)
// If strings, call string concatenation
lox.call @add_strings(%a, %b)
Part 7: Closures
Closures are the trickiest part of Lox. MLIR's regions help.
Captured Variables
// test/closure.lox
fun makeCounter() {
var count = 0;
fun counter() {
count = count + 1;
return count;
}
return counter;
}
MLIR Representation
// output after: loxc test/closure.lox --emit-mlir
lox.func @makeCounter() -> !lox.closure {
// Create a closure environment
%env = lox.alloc_env { size = 1 }
// Store count in the environment
%zero = lox.constant 0.0 : !lox.number
lox.env_store %env[0] = %zero : !lox.number
// Create the closure, capturing the environment
%closure = lox.make_closure @counter(%env)
lox.return %closure : !lox.closure
}
lox.func @counter(%env: !lox.env) -> !lox.number {
// Load count from the environment
%count = lox.env_load %env[0] : !lox.number
// Increment
%one = lox.constant 1.0 : !lox.number
%new_count = lox.add %count, %one : !lox.number
// Store back
lox.env_store %env[0] = %new_count : !lox.number
lox.return %new_count : !lox.number
}
Part 8: Project Structure
lox-mlir/
├── include/
│ └── lox/
│ ├── Lox.td # Dialect definition (TableGen)
│ ├── Ops.td # Operation definitions
│ ├── Types.td # Type definitions
│ ├── Dialect.h # Generated C++ header
│ └── Dialect.cpp # Dialect implementation
├── lib/
│ ├── Dialect.cpp # Dialect registration
│ ├── IR/
│ │ └── MLIRGen.cpp # AST → MLIR conversion
│ └── Transforms/
│ ├── LoxOpt.cpp # Lox-specific optimizations
│ └── LowerToLLVM.cpp # Lowering passes
├── tools/
│ └── loxc/
│ └── main.cpp # Compiler driver
└── test/
└── *.lox # Test programs
Part 9: Build System (CMake)
# CMakeLists.txt (project root)
cmake_minimum_required(VERSION 3.16)
project(lox-mlir)
find_package(MLIR REQUIRED CONFIG)
# Generate dialect files from TableGen
mlir_tablegen(LoxDialect.h.inc -gen-dialect-decls)
mlir_tablegen(LoxDialect.cpp.inc -gen-dialect-defs)
mlir_tablegen(LoxOps.h.inc -gen-op-decls)
mlir_tablegen(LoxOps.cpp.inc -gen-op-defs)
add_library(LoxDialect
Dialect.cpp
)
target_link_libraries(LoxDialect MLIRIR)
add_library(LoxTransforms
Transforms/LoxOpt.cpp
Transforms/LowerToLLVM.cpp
)
target_link_libraries(LoxTransforms LoxDialect MLIRTransformUtils)
add_executable(loxc tools/loxc/main.cpp)
target_link_libraries(loxc LoxDialect LoxTransforms MLIRParser)
Quick Reference: Lox → MLIR Mapping
| Lox Construct | MLIR Operation | Lowered To |
|---|---|---|
a + b | lox.add | arith.addf |
a - b | lox.sub | arith.subf |
a * b | lox.mul | arith.mulf |
a / b | lox.div | arith.divf |
a < b | lox.less | arith.cmpf |
a == b | lox.equal | runtime check |
var x = v | lox.declare | llvm.alloca + store |
x | lox.load | llvm.load |
x = v | lox.assign | llvm.store |
if (c) {...} | lox.if | scf.if → cf.cond_br |
while (c) {...} | lox.while | scf.while → cf.br/cf.cond_br |
fun f(...) {...} | lox.func | func.func → llvm.func |
f(args) | lox.call | func.call → llvm.call |
return v | lox.return | func.return → llvm.return |
print v | lox.print | runtime printf call |
Next Steps
- Start small: Just numbers and arithmetic. Get
print 1 + 2;working. - Add variables: Implement local variables with
lox.declare/lox.load/lox.assign. - Add control flow:
ifandwhilewithscfdialect. - Add functions:
lox.funcandlox.call. - Add closures: This is where it gets interesting.
- Add classes/objects: The full Lox experience.
The MLIR infrastructure handles the boring parts (SSA construction, dominance checking, pass management), letting you focus on Lox-specific concerns.
MLIR for Lox: A Rust Guide (using Melior)
This guide shows how to build a Lox compiler using Rust and the Melior crate instead of C++. If you know Crafting Interpreters, this should feel familiar — just with MLIR instead of tree-walk interpretation.
Why MLIR for Lox?
LLVM is powerful but low-level. It doesn't know about:
- Variable scoping rules
- Closure captures
- Dynamic typing
- Lox-specific optimizations
MLIR lets you define a dialect that represents Lox semantics directly, then progressively lower it to LLVM IR. This is how modern languages like Swift, Rust, and Julia work.
Why Rust + Melior?
- Memory safety without garbage collection
- Pattern matching for clean AST traversal
- No TableGen — Melior builds dialects directly in Rust
- Excellent for compiler toolchains
Part 1: Setup
Dependencies
# Cargo.toml
[package]
name = "lox-mlir"
version = "0.1.0"
edition = "2021"
[dependencies]
melior = "0.27"
Install MLIR
# macOS
brew install llvm@22
# Linux (Ubuntu/Debian)
# You may need to build from source or use a PPA for LLVM 22
# See: https://apt.llvm.org/
# Add to your shell config:
export LLVM_SYS_220_PREFIX=/opt/homebrew/opt/llvm@22 # macOS
# or
export LLVM_SYS_220_PREFIX=/usr/lib/llvm-22 # Linux
Part 2: The Lox AST
Hand-written Rust — not generated. This is exactly what you'd write following Crafting Interpreters.
#![allow(unused)] fn main() { // src/ast.rs use std::fmt; /// Source location for error messages #[derive(Debug, Clone, Copy)] pub struct Location { pub line: usize, pub column: usize, } /// A Lox value (dynamically typed) #[derive(Debug, Clone)] pub enum LoxValue { Nil, Bool(bool), Number(f64), String(String), } // ======================================================================== // Expressions // ======================================================================== #[derive(Debug, Clone)] pub enum Expr { Binary(BinaryExpr), Unary(UnaryExpr), Literal(LiteralExpr), Grouping(GroupingExpr), Variable(VariableExpr), Assign(AssignExpr), Call(CallExpr), Logical(LogicalExpr), } impl Expr { pub fn location(&self) -> Location { match self { Expr::Binary(e) => e.location, Expr::Unary(e) => e.location, Expr::Literal(e) => e.location, Expr::Grouping(e) => e.location, Expr::Variable(e) => e.location, Expr::Assign(e) => e.location, Expr::Call(e) => e.location, Expr::Logical(e) => e.location, } } } #[derive(Debug, Clone)] pub struct BinaryExpr { pub location: Location, pub left: Box<Expr>, pub op: BinaryOp, pub right: Box<Expr>, } #[derive(Debug, Clone, Copy)] pub enum BinaryOp { Add, Sub, Mul, Div, Less, LessEqual, Greater, GreaterEqual, Equal, NotEqual, } #[derive(Debug, Clone)] pub struct UnaryExpr { pub location: Location, pub op: UnaryOp, pub right: Box<Expr>, } #[derive(Debug, Clone, Copy)] pub enum UnaryOp { Negate, Not, } #[derive(Debug, Clone)] pub struct LiteralExpr { pub location: Location, pub value: LoxValue, } #[derive(Debug, Clone)] pub struct GroupingExpr { pub location: Location, pub expr: Box<Expr>, } #[derive(Debug, Clone)] pub struct VariableExpr { pub location: Location, pub name: String, } #[derive(Debug, Clone)] pub struct AssignExpr { pub location: Location, pub name: String, pub value: Box<Expr>, } #[derive(Debug, Clone)] pub struct CallExpr { pub location: Location, pub callee: Box<Expr>, pub arguments: Vec<Expr>, } #[derive(Debug, Clone)] pub struct LogicalExpr { pub location: Location, pub left: Box<Expr>, pub op: LogicalOp, pub right: Box<Expr>, } #[derive(Debug, Clone, Copy)] pub enum LogicalOp { And, Or, } // ======================================================================== // Statements // ======================================================================== #[derive(Debug, Clone)] pub enum Stmt { Function(FunctionStmt), Return(ReturnStmt), Var(VarStmt), If(IfStmt), While(WhileStmt), Print(PrintStmt), Block(BlockStmt), Expression(ExpressionStmt), } #[derive(Debug, Clone)] pub struct FunctionStmt { pub location: Location, // Location of 'fun' keyword pub name: String, pub name_location: Location, // Location of the function name pub params: Vec<String>, pub param_locations: Vec<Location>, // Location of each parameter name pub body: Vec<Stmt>, } #[derive(Debug, Clone)] pub struct ReturnStmt { pub location: Location, // Location of 'return' keyword pub value: Option<Expr>, } #[derive(Debug, Clone)] pub struct VarStmt { pub location: Location, // Location of 'var' keyword pub name: String, pub name_location: Location, // Location of the variable name pub init: Expr, } #[derive(Debug, Clone)] pub struct IfStmt { pub location: Location, // Location of 'if' keyword pub condition: Expr, pub then_branch: Vec<Stmt>, pub else_branch: Vec<Stmt>, } #[derive(Debug, Clone)] pub struct WhileStmt { pub location: Location, // Location of 'while' keyword pub condition: Expr, pub body: Vec<Stmt>, } #[derive(Debug, Clone)] pub struct PrintStmt { pub location: Location, // Location of 'print' keyword pub value: Expr, } #[derive(Debug, Clone)] pub struct BlockStmt { pub location: Location, // Location of opening '{' pub statements: Vec<Stmt>, } #[derive(Debug, Clone)] pub struct ExpressionStmt { pub location: Location, // Start of the expression pub expr: Expr, } // ======================================================================== // Program // ======================================================================== /// A Lox program is a list of top-level statements #[derive(Debug, Clone)] pub struct Program { pub statements: Vec<Stmt>, } // ======================================================================== // Helper trait for getting locations // ======================================================================== impl Stmt { /// Get the primary location of this statement pub fn location(&self) -> Location { match self { Stmt::Function(f) => f.location, Stmt::Return(r) => r.location, Stmt::Var(v) => v.location, Stmt::If(i) => i.location, Stmt::While(w) => w.location, Stmt::Print(p) => p.location, Stmt::Block(b) => b.location, Stmt::Expression(e) => e.location, } } } }
Part 3: The Parser
The parser follows Crafting Interpreters Chapter 6 closely. It expects a stream of Token values from a scanner. Here are the token types — the scanner itself is a standard lexical analysis exercise (see Crafting Interpreters Chapter 4), and not the focus of this tutorial.
#![allow(unused)] fn main() { // src/lexer.rs use crate::ast::Location; /// A single token produced by the scanner #[derive(Debug, Clone)] pub struct Token { pub token_type: TokenType, pub lexeme: String, pub literal: Option<LexValue>, pub location: Location, } /// The category of a token #[derive(Debug, Clone, Copy, PartialEq)] pub enum TokenType { // Single-character tokens LeftParen, RightParen, LeftBrace, RightBrace, Comma, Dot, Minus, Plus, Semicolon, Slash, Star, // One or two character tokens Bang, BangEqual, Equal, EqualEqual, Greater, GreaterEqual, Less, LessEqual, // Literals Identifier, String, Number, // Keywords And, Class, Else, False, Fun, For, If, Nil, Or, Print, Return, Super, This, Var, While, // Special Eof, } /// A literal value from the source #[derive(Debug, Clone)] pub enum LexValue { Boolean(bool), F64(f64), Str(String), } impl LexValue { pub fn as_number(&self) -> f64 { match self { LexValue::F64(n) => *n, _ => 0.0, } } pub fn as_string(&self) -> String { match self { LexValue::Str(s) => s.clone(), _ => String::new(), } } } impl TokenType { pub fn name(&self) -> &'static str { match self { TokenType::LeftParen => "(", TokenType::RightParen => ")", TokenType::LeftBrace => "{", TokenType::RightBrace => "}", TokenType::Comma => ",", TokenType::Dot => ".", TokenType::Minus => "-", TokenType::Plus => "+", TokenType::Semicolon => ";", TokenType::Slash => "/", TokenType::Star => "*", TokenType::Bang => "!", TokenType::BangEqual => "!=", TokenType::Equal => "=", TokenType::EqualEqual => "==", TokenType::Greater => ">", TokenType::GreaterEqual => ">=", TokenType::Less => "<", TokenType::LessEqual => "<=", TokenType::Identifier => "identifier", TokenType::String => "string", TokenType::Number => "number", TokenType::And => "and", TokenType::Class => "class", TokenType::Else => "else", TokenType::False => "false", TokenType::Fun => "fun", TokenType::For => "for", TokenType::If => "if", TokenType::Nil => "nil", TokenType::Or => "or", TokenType::Print => "print", TokenType::Return => "return", TokenType::Super => "super", TokenType::This => "this", TokenType::Var => "var", TokenType::While => "while", TokenType::Eof => "eof", } } } }
Now the parser, adapted to produce the AST from Part 2:
#![allow(unused)] fn main() { // src/parser.rs use crate::ast::*; use crate::lexer::{Token, TokenType, LexValue}; #[derive(Debug)] pub struct ParseError { pub message: String, pub location: Location, } impl ParseError { pub fn new(message: &str, location: Location) -> Self { Self { message: message.to_string(), location } } } pub struct Parser { tokens: Vec<Token>, current: usize, } impl Parser { pub fn new(tokens: Vec<Token>) -> Self { Self { tokens, current: 0 } } pub fn parse(&mut self) -> Result<Program, ParseError> { let mut statements = Vec::new(); while !self.is_at_end() { statements.push(self.declaration()?); } Ok(Program { statements }) } // ======================================================================== // Statement parsing // ======================================================================== fn declaration(&mut self) -> Result<Stmt, ParseError> { if self.match_token(TokenType::Fun) { return self.function_declaration(); } if self.match_token(TokenType::Var) { return self.var_declaration(); } self.statement() } fn function_declaration(&mut self) -> Result<Stmt, ParseError> { let location = self.previous().location; // Location of 'fun' let name = self.consume(TokenType::Identifier, "Expect function name.")?; let name_location = name.location; let name_str = name.lexeme.clone(); self.consume(TokenType::LeftParen, "Expect '(' after function name.")?; let mut params = Vec::new(); let mut param_locations = Vec::new(); if !self.check(TokenType::RightParen) { loop { let param = self.consume(TokenType::Identifier, "Expect parameter name.")?; params.push(param.lexeme.clone()); param_locations.push(param.location); if !self.match_token(TokenType::Comma) { break; } } } self.consume(TokenType::RightParen, "Expect ')' after parameters.")?; self.consume(TokenType::LeftBrace, "Expect '{' before function body.")?; let body = self.block()?; Ok(Stmt::Function(FunctionStmt { location, name: name_str, name_location, params, param_locations, body })) } fn var_declaration(&mut self) -> Result<Stmt, ParseError> { let location = self.previous().location; // Location of 'var' let name = self.consume(TokenType::Identifier, "Expect variable name.")?; let name_location = name.location; let name_str = name.lexeme.clone(); let init = if self.match_token(TokenType::Equal) { self.expression()? } else { Expr::Literal(LiteralExpr { location, value: LoxValue::Nil }) }; self.consume(TokenType::Semicolon, "Expect ';' after variable declaration.")?; Ok(Stmt::Var(VarStmt { location, name: name_str, name_location, init })) } fn statement(&mut self) -> Result<Stmt, ParseError> { if self.match_token(TokenType::Print) { return self.print_statement(); } if self.match_token(TokenType::If) { return self.if_statement(); } if self.match_token(TokenType::While) { return self.while_statement(); } if self.match_token(TokenType::Return) { return self.return_statement(); } if self.match_token(TokenType::LeftBrace) { let location = self.previous().location; return Ok(Stmt::Block(BlockStmt { location, statements: self.block()? })); } self.expression_statement() } fn print_statement(&mut self) -> Result<Stmt, ParseError> { let location = self.previous().location; // Location of 'print' let value = self.expression()?; self.consume(TokenType::Semicolon, "Expect ';' after value.")?; Ok(Stmt::Print(PrintStmt { location, value })) } fn if_statement(&mut self) -> Result<Stmt, ParseError> { let location = self.previous().location; // Location of 'if' self.consume(TokenType::LeftParen, "Expect '(' after 'if'.")?; let condition = self.expression()?; self.consume(TokenType::RightParen, "Expect ')' after if condition.")?; let then_branch = vec![self.statement()?]; let else_branch = if self.match_token(TokenType::Else) { vec![self.statement()?] } else { Vec::new() }; Ok(Stmt::If(IfStmt { location, condition, then_branch, else_branch })) } fn while_statement(&mut self) -> Result<Stmt, ParseError> { let location = self.previous().location; // Location of 'while' self.consume(TokenType::LeftParen, "Expect '(' after 'while'.")?; let condition = self.expression()?; self.consume(TokenType::RightParen, "Expect ')' after while condition.")?; let body = vec![self.statement()?]; Ok(Stmt::While(WhileStmt { location, condition, body })) } fn return_statement(&mut self) -> Result<Stmt, ParseError> { let location = self.previous().location; // Location of 'return' let value = if !self.check(TokenType::Semicolon) { Some(self.expression()?) } else { None }; self.consume(TokenType::Semicolon, "Expect ';' after return value.")?; Ok(Stmt::Return(ReturnStmt { location, value })) } fn expression_statement(&mut self) -> Result<Stmt, ParseError> { let expr = self.expression()?; let location = expr.location(); // Start of the expression self.consume(TokenType::Semicolon, "Expect ';' after expression.")?; Ok(Stmt::Expression(ExpressionStmt { location, expr })) } fn block(&mut self) -> Result<Vec<Stmt>, ParseError> { let mut statements = Vec::new(); while !self.check(TokenType::RightBrace) && !self.is_at_end() { statements.push(self.declaration()?); } self.consume(TokenType::RightBrace, "Expect '}' after block.")?; Ok(statements) } // ======================================================================== // Expression parsing (precedence climbing) // ======================================================================== fn expression(&mut self) -> Result<Expr, ParseError> { self.assignment() } fn assignment(&mut self) -> Result<Expr, ParseError> { let expr = self.or_expr()?; if self.match_token(TokenType::Equal) { let location = self.previous().location; let value = self.assignment()?; if let Expr::Variable(var) = expr { return Ok(Expr::Assign(AssignExpr { location, name: var.name, value: Box::new(value), })); } return Err(ParseError::new("Invalid assignment target.", location)); } Ok(expr) } fn or_expr(&mut self) -> Result<Expr, ParseError> { let mut expr = self.and_expr()?; while self.match_token(TokenType::Or) { let location = self.previous().location; let right = self.and_expr()?; expr = Expr::Logical(LogicalExpr { location, left: Box::new(expr), op: LogicalOp::Or, right: Box::new(right), }); } Ok(expr) } fn and_expr(&mut self) -> Result<Expr, ParseError> { let mut expr = self.equality()?; while self.match_token(TokenType::And) { let location = self.previous().location; let right = self.equality()?; expr = Expr::Logical(LogicalExpr { location, left: Box::new(expr), op: LogicalOp::And, right: Box::new(right), }); } Ok(expr) } fn equality(&mut self) -> Result<Expr, ParseError> { let mut expr = self.comparison()?; while self.match_any(&[TokenType::EqualEqual, TokenType::BangEqual]) { let location = self.previous().location; let op = match self.previous().token_type { TokenType::EqualEqual => BinaryOp::Equal, TokenType::BangEqual => BinaryOp::NotEqual, _ => unreachable!(), }; let right = self.comparison()?; expr = Expr::Binary(BinaryExpr { location, left: Box::new(expr), op, right: Box::new(right), }); } Ok(expr) } fn comparison(&mut self) -> Result<Expr, ParseError> { let mut expr = self.term()?; while self.match_any(&[ TokenType::Greater, TokenType::GreaterEqual, TokenType::Less, TokenType::LessEqual, ]) { let location = self.previous().location; let op = match self.previous().token_type { TokenType::Greater => BinaryOp::Greater, TokenType::GreaterEqual => BinaryOp::GreaterEqual, TokenType::Less => BinaryOp::Less, TokenType::LessEqual => BinaryOp::LessEqual, _ => unreachable!(), }; let right = self.term()?; expr = Expr::Binary(BinaryExpr { location, left: Box::new(expr), op, right: Box::new(right), }); } Ok(expr) } fn term(&mut self) -> Result<Expr, ParseError> { let mut expr = self.factor()?; while self.match_any(&[TokenType::Plus, TokenType::Minus]) { let location = self.previous().location; let op = match self.previous().token_type { TokenType::Plus => BinaryOp::Add, TokenType::Minus => BinaryOp::Sub, _ => unreachable!(), }; let right = self.factor()?; expr = Expr::Binary(BinaryExpr { location, left: Box::new(expr), op, right: Box::new(right), }); } Ok(expr) } fn factor(&mut self) -> Result<Expr, ParseError> { let mut expr = self.unary()?; while self.match_any(&[TokenType::Star, TokenType::Slash]) { let location = self.previous().location; let op = match self.previous().token_type { TokenType::Star => BinaryOp::Mul, TokenType::Slash => BinaryOp::Div, _ => unreachable!(), }; let right = self.unary()?; expr = Expr::Binary(BinaryExpr { location, left: Box::new(expr), op, right: Box::new(right), }); } Ok(expr) } fn unary(&mut self) -> Result<Expr, ParseError> { if self.match_any(&[TokenType::Bang, TokenType::Minus]) { let location = self.previous().location; let op = match self.previous().token_type { TokenType::Bang => UnaryOp::Not, TokenType::Minus => UnaryOp::Negate, _ => unreachable!(), }; let right = Box::new(self.unary()?); return Ok(Expr::Unary(UnaryExpr { location, op, right })); } self.primary() } fn primary(&mut self) -> Result<Expr, ParseError> { let token = self.peek(); let location = token.location; if self.match_token(TokenType::False) { return Ok(Expr::Literal(LiteralExpr { location, value: LoxValue::Bool(false) })); } if self.match_token(TokenType::True) { return Ok(Expr::Literal(LiteralExpr { location, value: LoxValue::Bool(true) })); } if self.match_token(TokenType::Nil) { return Ok(Expr::Literal(LiteralExpr { location, value: LoxValue::Nil })); } if self.match_token(TokenType::Number) { let value = token.literal.as_ref().unwrap().as_number(); return Ok(Expr::Literal(LiteralExpr { location, value: LoxValue::Number(value) })); } if self.match_token(TokenType::String) { let value = token.literal.as_ref().unwrap().as_string(); return Ok(Expr::Literal(LiteralExpr { location, value: LoxValue::String(value) })); } if self.match_token(TokenType::Identifier) { return Ok(Expr::Variable(VariableExpr { location, name: token.lexeme.clone() })); } if self.match_token(TokenType::LeftParen) { let expr = self.expression()?; self.consume(TokenType::RightParen, "Expect ')' after expression.")?; return Ok(Expr::Grouping(GroupingExpr { location, expr: Box::new(expr), })); } Err(ParseError::new("Expect expression.", location)) } // ======================================================================== // Helper methods // ======================================================================== fn match_token(&mut self, token_type: TokenType) -> bool { if self.check(token_type) { self.advance(); return true; } false } fn match_any(&mut self, types: &[TokenType]) -> bool { for t in types { if self.check(*t) { self.advance(); return true; } } false } fn check(&self, token_type: TokenType) -> bool { !self.is_at_end() && self.peek().token_type == token_type } fn advance(&mut self) -> Token { if !self.is_at_end() { self.current += 1; } self.previous() } fn is_at_end(&self) -> bool { self.peek().token_type == TokenType::Eof } fn peek(&self) -> Token { self.tokens[self.current].clone() } fn previous(&self) -> Token { self.tokens[self.current - 1].clone() } fn consume(&mut self, token_type: TokenType, message: &str) -> Result<Token, ParseError> { if self.check(token_type) { return Ok(self.advance()); } Err(ParseError::new(message, self.peek().location)) } } #[derive(Debug)] pub struct ParseError { pub message: String, pub location: Location, } impl ParseError { pub fn new(message: &str, location: Location) -> Self { Self { message: message.to_string(), location } } } }
Part 4: MLIR Code Generation with Melior
The core of the compiler — walks the AST and emits MLIR.
Note on the codegen model: The codegen uses a simplified
current_block: Option<Block>pattern for exposition. In production Melior code, blocks are created and immediately appended to regions (the "build then insert" pattern). Thecurrent_blockownership issue — where a block is moved into a region and can't be accessed again — is real, and production code avoids it by building regions first, then appending. For this tutorial, the simplified pattern helps you follow the logic without getting lost in ownership boilerplate. See the review notes for the production approach.
Scope note: In this part, we compile a subset of Lox that only supports numbers and arithmetic. This isn't a limitation of MLIR — it's a pedagogical choice. Dynamic typing with tagged unions adds 3-4x more code for every operation (check the tag, dispatch, unbox, compute, re-box). We'll cover dynamic typing in the "Tagged Unions" section below. For now, every value is an
f64.What this means in practice:
- All values are
f64—trueis1.0,falseis0.0,nilis0.0- Arithmetic operations use
arith.addf,arith.mulf, etc.- Comparisons use
arith.cmpf- Logical operators use
scf.iffor short-circuit evaluationlox.printcalls a runtime function that prints a floatThis is a simplification. A production Lox compiler would use tagged unions. But starting with "everything is a float" lets us focus on MLIR concepts without drowning in type-tag boilerplate.
Basic Code Generator
#![allow(unused)] fn main() { // src/codegen/mod.rs mod generator; pub use generator::generate_module; }
#![allow(unused)] fn main() { // src/codegen/generator.rs use crate::ast::*; use melior::{ Context, dialect::{arith, func, scf, DialectRegistry}, ir::{ attribute::{StringAttribute, TypeAttribute, FloatAttribute, IntegerAttribute}, r#type::FunctionType, Location, Module, Region, Block, Type, Value, operation::OperationBuilder, }, utility::register_all_dialects, }; /// State for code generation pub struct CodeGenerator<'c> { context: &'c Context, module: Module<'c>, current_block: Option<Block<'c>>, // Variable storage: maps variable names to their SSA values variables: std::collections::HashMap<String, Value<'c>>, } impl<'c> CodeGenerator<'c> { pub fn new(context: &'c Context) -> Self { let location = Location::unknown(context); let module = Module::new(location); Self { context, module, current_block: None, variables: std::collections::HashMap::new(), } } // ======================================================================== // Entry point // ======================================================================== pub fn generate(mut self, program: &Program) -> Module<'c> { for stmt in &program.statements { self.compile_statement(stmt); } self.module } // ======================================================================== // Statement compilation // ======================================================================== fn compile_statement(&mut self, stmt: &Stmt) { match stmt { Stmt::Function(f) => self.compile_function(f), Stmt::Return(r) => self.compile_return(r), Stmt::Var(v) => self.compile_var(v), Stmt::If(i) => self.compile_if(i), Stmt::While(w) => self.compile_while(w), Stmt::Print(p) => self.compile_print(p), Stmt::Block(b) => self.compile_block(b), Stmt::Expression(e) => { self.compile_expression(&e.expr); } } } fn compile_function(&mut self, func: &FunctionStmt) { let location = Location::unknown(self.context); let float_type = Type::float64(self.context); // Create parameter types (all f64 for now - dynamic typing) let param_types: Vec<Type> = func.params.iter().map(|_| float_type).collect(); let return_type = float_type; // Create the function type let function_type = FunctionType::new(self.context, ¶m_types, &[return_type]); // Create the function body region let region = Region::new(); let block = Block::new( ¶m_types.iter().map(|&t| (t, location)).collect::<Vec<_>>() ); // Store parameters as variables for (i, param_name) in func.params.iter().enumerate() { let arg = block.argument(i).unwrap(); self.variables.insert(param_name.clone(), arg.into()); } // Set current block for body compilation self.current_block = Some(block); // Compile the function body for stmt in &func.body { self.compile_statement(stmt); } // Add implicit return nil if no return at end let nil_value = self.compile_nil(); if let Some(block) = &self.current_block { block.append_operation(func::r#return(&[nil_value], location)); } // Append block to region if let Some(block) = self.current_block.take() { region.append_block(block); } // Add function to module self.module.body().append_operation(func::func( self.context, StringAttribute::new(self.context, &func.name), TypeAttribute::new(function_type.into()), region, &[], location, )); // Clear variables after function self.variables.clear(); } fn compile_return(&mut self, ret: &ReturnStmt) { let location = Location::unknown(self.context); let value = match &ret.value { Some(expr) => self.compile_expression(expr), None => self.compile_nil(), }; if let Some(block) = &self.current_block { block.append_operation(func::r#return(&[value], location)); } } fn compile_var(&mut self, var: &VarStmt) { let value = self.compile_expression(&var.init); self.variables.insert(var.name.clone(), value); } fn compile_if(&mut self, if_stmt: &IfStmt) { let location = Location::unknown(self.context); let condition = self.compile_expression(&if_stmt.condition); // Create scf.if operation let then_region = Region::new(); let then_block = then_region.append_block(Block::new(&[])); // Compile then branch let prev_block = self.current_block.replace(then_block); for stmt in &if_stmt.then_branch { self.compile_statement(stmt); } // Handle else branch let else_region = if !if_stmt.else_branch.is_empty() { let else_region = Region::new(); let else_block = else_region.append_block(Block::new(&[])); self.current_block = Some(else_block); for stmt in &if_stmt.else_branch { self.compile_statement(stmt); } // Restore block reference (Melior ownership is tricky here) // In a real impl, we'd need to handle this more carefully Some(else_region) } else { None }; self.current_block = prev_block; // Append scf.if to current block // Note: MLIR scf.if requires BOTH then and else regions. // An empty else region is used for single-branch if statements. if let Some(block) = &self.current_block { let else_region = Region::new(); let if_op = OperationBuilder::new("scf.if", location) .add_operand(condition) .add_region(then_region) .add_region(else_region) .build() .unwrap(); block.append_operation(if_op); } } fn compile_while(&mut self, while_stmt: &WhileStmt) { let location = Location::unknown(self.context); // For scf.while, we need: // 1. A "before" region that computes the condition // 2. An "after" region that is the loop body let before_region = Region::new(); let before_block = before_region.append_block(Block::new(&[])); // Compile condition in before block let prev_block = self.current_block.replace(before_block); let condition = self.compile_expression(&while_stmt.condition); // Add scf.condition let condition_op = OperationBuilder::new("scf.condition", location) .add_operand(condition) .build() .unwrap(); before_block.append_operation(condition_op); // Create after region (loop body) let after_region = Region::new(); let after_block = after_region.append_block(Block::new(&[])); self.current_block = Some(after_block); for stmt in &while_stmt.body { self.compile_statement(stmt); } // Add scf.yield let yield_op = OperationBuilder::new("scf.yield", location) .build() .unwrap(); if let Some(block) = &self.current_block { block.append_operation(yield_op); } self.current_block = prev_block; // Create scf.while if let Some(block) = &self.current_block { let while_op = OperationBuilder::new("scf.while", location) .add_region(before_region) .add_region(after_region) .build() .unwrap(); block.append_operation(while_op); } } fn compile_print(&mut self, print: &PrintStmt) { let location = Location::unknown(self.context); let value = self.compile_expression(&print.value); // Create a call to a runtime print function // For simplicity, we'll use a placeholder operation let print_op = OperationBuilder::new("lox.print", location) .add_operand(value) .build() .unwrap(); if let Some(block) = &self.current_block { block.append_operation(print_op); } } fn compile_block(&mut self, block: &BlockStmt) { for stmt in &block.statements { self.compile_statement(stmt); } } // ======================================================================== // Expression compilation // ======================================================================== fn compile_expression(&mut self, expr: &Expr) -> Value<'c> { let location = Location::unknown(self.context); let op = match expr { Expr::Binary(b) => self.compile_binary(b), Expr::Unary(u) => self.compile_unary(u), Expr::Literal(l) => return self.compile_literal(l), Expr::Grouping(g) => return self.compile_expression(&g.expr), Expr::Variable(v) => return self.compile_variable(v), Expr::Assign(a) => return self.compile_assign(a), Expr::Call(c) => self.compile_call(c), Expr::Logical(l) => self.compile_logical(l), }; // Get the result from the operation op.result(0).unwrap().into() } fn compile_binary(&mut self, binary: &BinaryExpr) -> melior::ir::Operation<'c> { let location = Location::unknown(self.context); let lhs = self.compile_expression(&binary.left); let rhs = self.compile_expression(&binary.right); let op = match binary.op { BinaryOp::Add => arith::addf(lhs, rhs, location), BinaryOp::Sub => arith::subf(lhs, rhs, location), BinaryOp::Mul => arith::mulf(lhs, rhs, location), BinaryOp::Div => arith::divf(lhs, rhs, location), BinaryOp::Less => arith::cmpf(self.context, arith::CmpfPredicate::Olt, lhs, rhs, location), BinaryOp::LessEqual => arith::cmpf(self.context, arith::CmpfPredicate::Ole, lhs, rhs, location), BinaryOp::Greater => arith::cmpf(self.context, arith::CmpfPredicate::Ogt, lhs, rhs, location), BinaryOp::GreaterEqual => arith::cmpf(self.context, arith::CmpfPredicate::Oge, lhs, rhs, location), BinaryOp::Equal => arith::cmpf(self.context, arith::CmpfPredicate::Oeq, lhs, rhs, location), BinaryOp::NotEqual => arith::cmpf(self.context, arith::CmpfPredicate::Une, lhs, rhs, location), }; // Append to current block if let Some(block) = &self.current_block { block.append_operation(op.clone()); } op } fn compile_unary(&mut self, unary: &UnaryExpr) -> melior::ir::Operation<'c> { let location = Location::unknown(self.context); let operand = self.compile_expression(&unary.right); match unary.op { UnaryOp::Negate => { let op = arith::negf(operand, location); if let Some(block) = &self.current_block { block.append_operation(op.clone()); } op } UnaryOp::Not => { // In our "numbers only" model, `not x` is: // if x == 0.0 { 1.0 } else { 0.0 } // We use scf.if since there's no direct float negation of boolean sense. let zero_const = arith::constant( self.context, FloatAttribute::new(self.context, 0.0, Type::float64(self.context)).into(), Location::unknown(self.context), ); // Build then region (x == 0.0 → result = 1.0) let then_region = { let r = Region::new(); let b = r.append_block(Block::new(&[])); let one = arith::constant( self.context, FloatAttribute::new(self.context, 1.0, Type::float64(self.context)).into(), Location::unknown(self.context), ); b.append_operation(one); r }; // Build else region (x != 0.0 → result = 0.0) let else_region = { let r = Region::new(); let b = r.append_block(Block::new(&[])); let zero = arith::constant( self.context, FloatAttribute::new(self.context, 0.0, Type::float64(self.context)).into(), Location::unknown(self.context), ); b.append_operation(zero); r }; // Compare operand to 0.0 if let Some(block) = &self.current_block { block.append_operation(zero_const.clone()); let is_zero = arith::cmpf( self.context, arith::CmpfPredicate::Oeq, operand, zero_const.result(0).unwrap().into(), Location::unknown(self.context), ); block.append_operation(is_zero); // scf.if(is_zero) { then_region } { else_region } -> f64 let if_op = OperationBuilder::new("scf.if", location) .add_operand(is_zero.result(0).unwrap().into()) .add_result(Type::float64(self.context)) .add_region(then_region) .add_region(else_region) .build() .unwrap(); block.append_operation(if_op); if_op } else { // Fallback if no current block (shouldn't happen in practice) zero_const } } } } fn compile_literal(&mut self, literal: &LiteralExpr) -> Value<'c> { let location = Location::unknown(self.context); let op = match &literal.value { LoxValue::Nil => { return self.compile_nil(); } LoxValue::Bool(b) => { // In our "numbers only" model, booleans are 1.0 and 0.0 arith::constant( self.context, FloatAttribute::new(self.context, if *b { 1.0 } else { 0.0 }, Type::float64(self.context)).into(), location, ) } LoxValue::Number(n) => { arith::constant( self.context, FloatAttribute::new(self.context, *n, Type::float64(self.context)).into(), location, ) } LoxValue::String(s) => { // String constants are global - no heap allocation! // See the String Constants section below return self.compile_string(s, location); } }; if let Some(block) = &self.current_block { block.append_operation(op.clone()); } op.result(0).unwrap().into() } fn compile_nil(&mut self) -> Value<'c> { // In our "numbers only" subset, nil is represented as 0.0 f64. // This is consistent with the simplified typing model. // A full implementation would use tagged unions. let location = Location::unknown(self.context); let op = arith::constant( self.context, FloatAttribute::new(self.context, 0.0, Type::float64(self.context)).into(), location, ); if let Some(block) = &self.current_block { block.append_operation(op.clone()); } op.result(0).unwrap().into() } fn compile_variable(&mut self, var: &VariableExpr) -> Value<'c> { // Look up the variable in the current scope self.variables.get(&var.name) .copied() .unwrap_or_else(|| self.compile_nil()) } fn compile_assign(&mut self, assign: &AssignExpr) -> Value<'c> { let value = self.compile_expression(&assign.value); self.variables.insert(assign.name.clone(), value); value } fn compile_call(&mut self, call: &CallExpr) -> melior::ir::Operation<'c> { let location = Location::unknown(self.context); // Compile arguments let args: Vec<Value> = call.arguments.iter() .map(|arg| self.compile_expression(arg)) .collect(); // For now, assume callee is a direct function call if let Expr::Variable(var) = call.callee.as_ref() { let call_op = func::call( self.context, melior::ir::attribute::FlatSymbolRefAttribute::new(self.context, &var.name), &args, &[Type::float64(self.context)], location, ); if let Some(block) = &self.current_block { block.append_operation(call_op.clone()); } return call_op; } // Indirect call (first-class function) - not implemented unimplemented!("Indirect function calls not yet supported") } fn compile_logical(&mut self, logical: &LogicalExpr) -> melior::ir::Operation<'c> { let location = Location::unknown(self.context); // Logical operations short-circuit in Lox, so we MUST use scf.if. // Using arith::andi/ori would be WRONG — those are bitwise, not short-circuit. // // `a and b` → if a { b } else { false } // `a or b` → if a { true } else { b } let left = self.compile_expression(&logical.left); // Convert left to i1 for the condition (nonzero = true) let zero = arith::constant( self.context, FloatAttribute::new(self.context, 0.0, Type::float64(self.context)).into(), location, ); if let Some(block) = &self.current_block { block.append_operation(zero.clone()); } let cond = arith::cmpf( self.context, arith::CmpfPredicate::One, // ordered not-equal (nonzero = true) left, zero.result(0).unwrap().into(), location, ); if let Some(block) = &self.current_block { block.append_operation(cond.clone()); } match logical.op { LogicalOp::And => { // if left { right } else { 0.0 } let then_block = Block::new(&[]); let else_block = Block::new(&[]); let prev = self.current_block.replace(then_block); let right = self.compile_expression(&logical.right); let then_block = self.current_block.take().unwrap(); self.current_block = Some(else_block); let false_val = arith::constant( self.context, FloatAttribute::new(self.context, 0.0, Type::float64(self.context)).into(), location, ); if let Some(block) = &self.current_block { block.append_operation(false_val.clone()); } let else_block = self.current_block.take().unwrap(); self.current_block = prev; let if_op = OperationBuilder::new("scf.if", location) .add_operand(cond.result(0).unwrap().into()) .add_result(Type::float64(self.context)) .add_region({ let mut region = Region::new(); region.append_block(then_block); region }) .add_region({ let mut region = Region::new(); region.append_block(else_block); region }) .build()?; if let Some(block) = &self.current_block { block.append_operation(if_op.clone()); } if_op } LogicalOp::Or => { // if left { 1.0 } else { right } let then_block = Block::new(&[]); let else_block = Block::new(&[]); let prev = self.current_block.replace(then_block); let true_val = arith::constant( self.context, FloatAttribute::new(self.context, 1.0, Type::float64(self.context)).into(), location, ); if let Some(block) = &self.current_block { block.append_operation(true_val.clone()); } let then_block = self.current_block.take().unwrap(); self.current_block = Some(else_block); let right = self.compile_expression(&logical.right); let else_block = self.current_block.take().unwrap(); self.current_block = prev; let if_op = OperationBuilder::new("scf.if", location) .add_operand(cond.result(0).unwrap().into()) .add_result(Type::float64(self.context)) .add_region({ let mut region = Region::new(); region.append_block(then_block); region }) .add_region({ let mut region = Region::new(); region.append_block(else_block); region }) .build()?; if let Some(block) = &self.current_block { block.append_operation(if_op.clone()); } if_op } } } /// Compile a string literal to a global constant fn compile_string(&mut self, value: &str, location: Location<'c>) -> Value<'c> { // Placeholder - see String Constants section below for full implementation let op = arith::constant( self.context, IntegerAttribute::new(0, Type::integer(self.context, 64)).into(), location, ); if let Some(block) = &self.current_block { block.append_operation(op.clone()); } op.result(0).unwrap().into() } } /// Main entry point for code generation pub fn generate_module(context: &Context, program: &Program) -> Module { let generator = CodeGenerator::new(context); generator.generate(program) } }
String Constants (No Allocation Needed!)
String literals are constants, not heap allocations. They live in the binary's data section, just like in C.
#![allow(unused)] fn main() { // src/codegen/strings.rs use melior::{ Context, Location, dialect::llvm, ir::{ attribute::StringAttribute, Type, Module, operation::OperationBuilder, }, }; /// Create a global string constant in the LLVM dialect /// /// This creates something like: /// llvm.mlir.global constant @str_0("hello") /// /// No heap allocation - the string lives in the data section. pub fn create_string_constant( module: &Module, context: &Context, name: &str, // e.g., "str_0" value: &str, location: Location, ) { module.body().append_operation( llvm::r#const( context, name, Type::parse(context, &format!("!llvm.array<{} x i8>", value.len())).unwrap(), StringAttribute::new(context, value), location, ) ); } }
Why This Works
| Approach | Memory Location | Allocation? |
|---|---|---|
llvm.mlir.global constant | Data section | No (static) |
| Heap allocation (malloc) | Heap | Yes (runtime) |
| Stack allocation (alloca) | Stack | No, but per-call |
String literals are static data — they exist for the lifetime of the program, embedded in the binary. No runtime cost.
Dynamic Typing with Tagged Unions
Lox is dynamically typed, so a function parameter can receive any type:
fun printValue(x) {
print x; // x could be number, string, bool, nil, or object
}
We need a tagged union type:
#![allow(unused)] fn main() { // src/codegen/types.rs use melior::ir::Type; use melior::Context; /// A Lox value is a tagged union: struct { tag: i8, data: i64 } pub fn lox_value_type(context: &Context) -> Type { Type::parse(context, "!llvm.struct<(i8, i64)>").unwrap() } /// Tag values for each Lox type pub const TAG_NIL: i8 = 0; pub const TAG_BOOL: i8 = 1; pub const TAG_NUMBER: i8 = 2; pub const TAG_STRING: i8 = 3; pub const TAG_OBJECT: i8 = 4; pub const TAG_CLOSURE: i8 = 5; }
Part 5: Source Locations in MLIR
Every operation in MLIR has a source location. Unlike LLVM where debug info is optional, in MLIR locations are core to the IR.
The Location API
MLIR operations carry source locations for error messages and debug output. Melior provides several ways to create them:
#![allow(unused)] fn main() { // src/location.rs use melior::ir::Location; use melior::Context; pub fn demonstrate_locations(context: &Context) { // Unknown location — for generated code with no source mapping let unknown = Location::unknown(context); // File location — specific file, line, column let file_loc = Location::new(context, "test.lox", 10, 5); // Name location — for generated code, use a descriptive name let name_loc = Location::name(context, "implicit_return"); } }
Updated Code Generator with Proper Locations
#![allow(unused)] fn main() { pub struct CodeGenerator<'c> { context: &'c Context, module: Module<'c>, current_block: Option<Block<'c>>, variables: std::collections::HashMap<String, Value<'c>>, filename: String, } impl<'c> CodeGenerator<'c> { pub fn new(context: &'c Context, filename: &str) -> Self { let location = Location::new(context, filename, 1, 1); let module = Module::new(location); Self { context, module, current_block: None, variables: std::collections::HashMap::new(), filename: filename.to_string(), } } /// Convert an AST location to an MLIR location fn loc(&self, ast_loc: crate::ast::Location) -> Location<'c> { Location::new(self.context, &self.filename, ast_loc.line, ast_loc.column) } /// Get a location for generated/implicit code fn generated_loc(&self, description: &str) -> Location<'c> { Location::name(self.context, description) } } }
What the IR Looks Like With Locations
Before (using Location::unknown):
module {
func.func @add(%arg0: f64, %arg1: f64) -> f64 {
%0 = arith.addf %arg0, %arg1 : f64
return %0 : f64
}
}
After (with proper locations, shown with -mlir-print-debuginfo):
module {
func.func @add(%arg0: f64, %arg1: f64) -> f64
loc("test.lox":1:1)
{
%0 = arith.addf %arg0, %arg1 : f64
loc("test.lox":2:14)
return %0 : f64 loc("test.lox":2:3)
} loc("test.lox":1:1)
} loc("test.lox":1:1)
Part 6: A Complete Example
// examples/simple_add.rs use melior::{ Context, dialect::{arith, func, DialectRegistry}, ir::{ attribute::{StringAttribute, TypeAttribute, FloatAttribute}, r#type::FunctionType, Location, Module, Region, Block, Type, }, utility::register_all_dialects, }; fn main() -> Result<(), Box<dyn std::error::Error>> { let registry = DialectRegistry::new(); register_all_dialects(®istry); let context = Context::new(); context.append_dialect_registry(®istry); context.load_all_available_dialects(); let location = Location::unknown(&context); let module = Module::new(location); // Create function type: (f64, f64) -> f64 let float_type = Type::float64(&context); let function_type = FunctionType::new(&context, &[float_type, float_type], &[float_type]); // Create function body let region = Region::new(); let block = region.append_block(Block::new(&[ (float_type, location), (float_type, location), ])); // %sum = arith.addf %arg0, %arg1 : f64 let sum = block.append_operation(arith::addf( block.argument(0).unwrap().into(), block.argument(1).unwrap().into(), location, )); // return %sum : f64 block.append_operation(func::r#return( &[sum.result(0).unwrap().into()], location, )); // Create the function module.body().append_operation(func::func( &context, StringAttribute::new(&context, "add"), TypeAttribute::new(function_type.into()), region, &[], location, )); // Verify and print assert!(module.as_operation().verify()); println!("{}", module.as_operation()); Ok(()) }
Output:
module {
func.func @add(%arg0: f64, %arg1: f64) -> f64 {
%0 = arith.addf %arg0, %arg1 : f64
return %0 : f64
}
}
Part 7: Lowering to LLVM IR
After generating MLIR, lower it to LLVM IR and compile to machine code:
#![allow(unused)] fn main() { // src/lib.rs pub mod ast; pub mod parser; pub mod codegen; use melior::{ Context, dialect::DialectRegistry, pass::PassManager, utility::register_all_dialects, }; pub fn compile_to_llvm(source: &str) -> Result<String, CompileError> { // 1. Parse let tokens = lexer::tokenize(source)?; let program = parser::Parser::new(tokens).parse()?; // 2. Generate MLIR let registry = DialectRegistry::new(); register_all_dialects(®istry); let context = Context::new(); context.append_dialect_registry(®istry); context.load_all_available_dialects(); let module = codegen::generate_module(&context, &program); // 3. Run lowering passes (MLIR → LLVM IR) // The exact pass names depend on your Melior version. // Common passes: convert-scf-to-cf, convert-arith-to-llvmir, // convert-func-to-llvmir let pass_manager = PassManager::new(&context); // pass_manager.add_pass(pass::convert_scf_to_cf()); // pass_manager.add_pass(pass::convert_arith_to_llvm()); // pass_manager.add_pass(pass::convert_func_to_llvm()); // pass_manager.run(&module)?; Ok(module.as_operation().to_string()) } }
Using the CLI
# Compile Lox to MLIR
cargo run -- compile input.lox --emit-mlir -o output.mlir
# Lower MLIR to LLVM IR
mlir-translate output.mlir --mlir-to-llvmir -o output.ll
# Compile to executable
clang output.ll -o output
Part 8: Project Structure
lox-mlir/
├── Cargo.toml
├── src/
│ ├── lib.rs # Library entry point
│ ├── main.rs # CLI entry point
│ ├── ast.rs # AST definitions
│ ├── lexer.rs # Tokenizer
│ ├── parser.rs # Parser
│ ├── codegen/
│ │ ├── mod.rs
│ │ ├── generator.rs # MLIR code generator
│ │ ├── types.rs # Tagged union types
│ │ └── strings.rs # String constant handling
│ └── runtime/
│ ├── mod.rs
│ └── print.c # Runtime print implementation
├── examples/
│ ├── simple_add.rs
│ └── *.lox
└── tests/
└── integration.rs
Quick Reference: Lox → MLIR Mapping
| Lox Construct | Rust Enum | MLIR Operation |
|---|---|---|
a + b | BinaryOp::Add | arith.addf |
a - b | BinaryOp::Sub | arith.subf |
a * b | BinaryOp::Mul | arith.mulf |
a / b | BinaryOp::Div | arith.divf |
a < b | BinaryOp::Less | arith.cmpf olt |
a == b | BinaryOp::Equal | arith.cmpf oeq |
var x = v | VarStmt | Store in HashMap |
x | VariableExpr | Load from HashMap |
if (c) {...} | IfStmt | scf.if |
while (c) {...} | WhileStmt | scf.while |
fun f(...) {...} | FunctionStmt | func.func |
f(args) | CallExpr | func.call |
return v | ReturnStmt | func.return |
Differences from C++ MLIR
| Aspect | C++ MLIR | Melior (Rust) |
|---|---|---|
| Dialect definition | TableGen (.td) | Rust code directly |
| Operations | Generated from ODS | Built with OperationBuilder |
| Ownership | Manual / raw pointers | RAII with lifetimes |
| Pattern rewriting | C++ classes | Closures / Rust traits |
| Error handling | LogicalResult | Result<T, Error> |
Next Steps
- Start small: Just numbers and arithmetic. Get
print 1 + 2;working. - Add variables: Implement local variables with SSA values.
- Add control flow:
ifandwhilewithscfdialect. - Add functions:
func.funcandfunc.call. - Add closures: Environment capture with heap allocation.
- Add classes/objects: The full Lox experience.
Melior provides a safe, idiomatic Rust interface to MLIR. The ownership model takes some getting used to (regions/blocks are moved rather than borrowed), but the type system prevents most common errors.
MLIR for Lox: Part 2 - Garbage Collection (From Scratch)
This guide starts from absolute zero and builds up GC understanding one concept at a time. No prior GC knowledge assumed.
Chapter 1: The Problem
What Happens to Memory Without GC?
When you allocate memory, it comes from a big pool called the heap. In C, you do:
void* obj = malloc(16); // Take 16 bytes from heap
// ... use obj ...
free(obj); // Put it back
The problem: When do you call free()?
Consider this Lox code:
var a = "hello";
var b = a;
a = nil;
// Is "hello" still needed? b still references it!
The string "hello" is referenced by b. If we freed it when a was set to nil, b would point to garbage.
The core question GC answers:
When is an object no longer needed, so it's safe to free?
Chapter 2: The Intuition
Think of it like a party at your house:
- Objects = guests
- References = guests holding a "ticket" to stay
- GC = you checking who still has a ticket
Guest A enters (allocate)
↓
Guest A gives ticket to Guest B (reference)
↓
Guest A leaves but Guest B still has ticket
↓
You check: "Does anyone have Guest A's ticket?"
↓
No? → Guest A can leave (free)
Key insight: An object is "live" if it's reachable from your program's variables. If nothing points to it, it's garbage.
Chapter 3: The Two Questions
GC must answer:
- What objects are "roots"? (starting points for reachability)
- What objects does each object reference? (follow the chain)
What Are Roots?
A "root" is something your program can directly access:
var x = 1; // 'x' is a root (global variable)
print x; // We can reach '1' from 'x'
Roots include:
- Global variables
- Local variables on the stack
- Temporaries in expressions (intermediate results)
What Is Reachability?
var a = "hello"; // "hello" is reachable from root 'a'
var b = a; // "hello" is reachable from 'b' too
a = nil; // Still reachable from 'b'
b = nil; // Now unreachable! Garbage!
Visual representation:
BEFORE: AFTER b = nil:
┌─────────┐ ┌─────────┐
│ root: a │──► "hello" │ root: a │──► nil
└─────────┘ └─────────┘
┌─────────┐ ┌─────────┐
│ root: b │──► "hello" │ root: b │──► nil
└─────────┘ └─────────┘
│
"hello" ← GARBAGE!
(nothing points to it)
Chapter 4: The Algorithm (Mark-Sweep)
The simplest GC algorithm is mark-sweep. It has two phases:
Phase 1: Mark
┌─────────────────────────────────────────────┐
│ MARK PHASE │
│ │
│ 1. Start with all roots │
│ 2. For each root, mark the object │
│ 3. Recursively mark objects it references │
│ 4. Repeat until nothing new to mark │
└─────────────────────────────────────────────┘
Phase 2: Sweep
┌─────────────────────────────────────────────┐
│ SWEEP PHASE │
│ │
│ 1. Walk all allocated objects │
│ 2. If marked: clear mark (still live) │
│ 3. If not marked: FREE IT (garbage) │
└─────────────────────────────────────────────┘
Visual Example
Before collection:
Objects in heap:
[A] ──► [B] ──► [C]
↑
root (global variable)
[D] ──► [E]
↑
nothing! (D was a local that went out of scope)
After mark phase:
[A*] ──► [B*] ──► [C*] (* = marked, still live)
↑
root
[D] ──► [E] (not marked, these are garbage)
After sweep phase:
[A] ──► [B] ──► [C] (live, marks cleared for next cycle)
[freed] [freed] (D and E are gone! memory reclaimed)
Chapter 5: The Implementation Challenge
Now we know WHAT to do. But HOW does the GC know:
- Where are the roots? (what variables are currently in scope)
- What does each object reference? (which other objects it points to)
These are the two hard problems. Let's tackle them one at a time.
Problem 1: Finding Roots
When the GC runs, your program is paused mid-execution. The stack has local variables. Those are roots.
fun example() {
var a = 1; // 'a' is on the stack → root
var b = 2; // 'b' is on the stack → root
// GC runs here!
// It needs to know about 'a' and 'b'
}
The challenge: In compiled code, variables are just CPU registers or stack slots. The GC doesn't know which ones are pointers vs integers.
Solutions:
- Conservative GC: Treat everything that LOOKS like a pointer as a pointer
- Precise GC: Tell the GC exactly where pointers are (this is what we'll do)
Problem 2: Finding Object References
Each object may reference other objects:
class Pair {
init(first, second) {
this.first = first; // this object references 'first'
this.second = second; // this object references 'second'
}
}
var pair = Pair(1, 2);
// The 'pair' object references two number objects
The challenge: The GC needs to know: given an object, what other objects does it point to?
Solution: Store type information with each object so the GC knows how to walk it.
Chapter 6: Our First Data Structure
Before writing any GC code, we need a way to represent objects. Every object needs:
- A header (metadata for the GC)
- The actual data (the object's fields)
┌────────────────────────────┐
│ Object Header │
│ - marked: bool │
│ - type: ObjType │
│ - size: usize │
├────────────────────────────┤
│ Object Data │
│ - field 1 │
│ - field 2 │
│ - ... │
└────────────────────────────┘
Why the Header?
When the GC walks all objects, it needs to:
- Know if an object is already marked (avoid infinite loops)
- Know what type it is (to find references)
- Know how big it is (for sweeping)
The Type Enum
Different Lox objects have different data:
| Lox Type | What It Stores | What It References |
|---|---|---|
| Number | f64 | Nothing |
| String | bytes | Nothing |
| Closure | function ptr, env | Environment variables |
| Instance | class ptr, fields | Field values |
Chapter 7: Let's Write Some Code
Now we understand the concepts. Let's implement the absolute minimum:
Step 1: Define the Object Header
#![allow(unused)] fn main() { // src/runtime/object.rs /// Type of a Lox object #[repr(u8)] #[derive(Debug, Clone, Copy, PartialEq)] pub enum ObjType { Number = 0, String = 1, Closure = 2, Instance = 3, } /// Header prepended to every heap object /// /// Memory layout: /// [header: ObjHeader][data: depends on ObjType] #[repr(C)] pub struct ObjHeader { /// Has this object been marked by the GC? pub marked: bool, /// What type of object is this? pub obj_type: ObjType, /// Size of the data portion (not including header) pub size: u32, /// Pointer to next object in the allocation list /// (the GC needs to walk all objects during sweep) pub next: Option<*mut ObjHeader>, } }
Step 2: The Global Object List
The GC needs to know about ALL objects to sweep them:
#![allow(unused)] fn main() { // src/runtime/gc.rs use std::cell::RefCell; /// All allocated objects, linked list style thread_local! { static ALL_OBJECTS: RefCell<Option<*mut ObjHeader>> = RefCell::new(None); static OBJECT_COUNT: RefCell<usize> = RefCell::new(0); } }
Why thread_local!? Because Lox is single-threaded, and this avoids passing a GC context everywhere.
Step 3: Allocation (Without GC Yet)
#![allow(unused)] fn main() { /// Allocate a new object /// /// This is like `malloc` but: /// 1. Adds a header for GC metadata /// 2. Tracks the object in a global list pub fn alloc(size: usize, obj_type: ObjType) -> *mut u8 { // Calculate total size: header + data let total_size = std::mem::size_of::<ObjHeader>() + size; // Allocate raw memory let layout = std::alloc::Layout::from_size_align(total_size, 8).unwrap(); let ptr = unsafe { std::alloc::alloc(layout) }; if ptr.is_null() { panic!("Out of memory!"); } // Write the header let header = ptr as *mut ObjHeader; unsafe { (*header).marked = false; (*header).obj_type = obj_type; (*header).size = size as u32; // Add to global list ALL_OBJECTS.with(|list| { (*header).next = *list.borrow(); *list.borrow_mut() = Some(header); }); } // Return pointer to DATA (after header) let data_ptr = unsafe { ptr.add(std::mem::size_of::<ObjHeader>()) }; // Increment count OBJECT_COUNT.with(|count| { *count.borrow_mut() += 1; }); data_ptr } }
Memory Layout Deep Dive
Let's trace through exactly what happens when we allocate:
#![allow(unused)] fn main() { // Calling: let ptr = alloc(16, ObjType::String); // Step 1: Calculate total size // sizeof(ObjHeader) = 16 bytes (bool + padding + u8 + padding + u32 + padding + pointer) // data_size = 16 bytes // total = 32 bytes // Step 2: Allocate raw memory // ptr = 0x1000 (example address) // Step 3: Write header at ptr // 0x1000: marked = false // 0x1001: obj_type = String (1) // 0x1004: size = 16 // 0x1008: next = previous head of list // Step 4: Return data pointer // data_ptr = 0x1010 (ptr + 16) }
Visual Memory Layout
Address Content
──────────────────────────────────────
0x1000 ┌─────────────────────────┐ ← Header starts here
0x1001 │ marked: false (1 byte) │
0x1002 │ padding (1 byte) │
0x1003 │ obj_type: 1 (1 byte) │
0x1004 │ padding (1 byte) │
0x1005 │ size: 16 (4 bytes) │
0x1008 │ padding (4 bytes) │
0x1010 │ next: 0x2000 (8 bytes) │ ← Points to previous allocation
├─────────────────────────┤
0x1018 │ │ ← Data starts here (returned ptr)
0x1020 │ Your 16 bytes of │
0x1028 │ string data │
0x1030 │ │
└─────────────────────────┘
Key insight: When you receive a pointer from alloc(), you can always find the header by subtracting 16 bytes (sizeof header).
Chapter 7.5: Practice Exercise
Before moving on, let's verify your understanding:
Exercise 1: Trace the Allocations
Given this code:
#![allow(unused)] fn main() { let a = alloc(8, ObjType::Number); // Allocates a number let b = alloc(24, ObjType::String); // Allocates a string // What does ALL_OBJECTS look like? // What are the header addresses for a and b? }
Click to reveal answer
ALL_OBJECTS → b's header → a's header → None
a's header:
- address: b's data ptr - 24 - 16 = depends on b's allocation
- size: 8
- type: Number
- next: None (it was first)
b's header:
- address: b_data_ptr - 16
- size: 24
- type: String
- next: a's header
The linked list is built in reverse order. Each new allocation becomes the new head.
Exercise 2: Why the Header?
Why do we need marked, obj_type, and size in the header? What would happen if we removed each one?
Click to reveal answer
-
Without
marked: We couldn't track which objects are live. The sweep phase wouldn't know what to free. We might free live objects or keep garbage forever. -
Without
obj_type: We couldn't walk object references. When marking a closure, we wouldn't know it has an environment to also mark. -
Without
size: We couldn't properly free memory (need to know how much to deallocate). We also couldn't debug allocation issues.
Exercise 3: Thread Safety
Why do we use thread_local! instead of a regular static?
Click to reveal answer
A regular static would be shared across all threads, requiring synchronization (mutexes, atomics). Since Lox is single-threaded, thread_local! gives us:
- No synchronization overhead
- Each thread could have its own GC (if we later add threads)
- Simpler code
If we used a regular static, we'd need static ALL_OBJECTS: Mutex<...> which adds overhead.
Checkpoint: Before Moving On
We have:
- Object header definition
- Global object list
- Allocation function
- Understanding of memory layout
Before we continue to Part 3, make sure you can answer:
- Where is the header relative to the returned pointer?
- How does the linked list of objects work?
- Why do we need each field in the header?
Next: How do we find roots during garbage collection?
MLIR for Lox: Part 3 - Finding Roots
Continuing from Part 2. We can allocate objects. Now we need to find them during GC.
Chapter 8: The Root Problem
When GC runs, it needs to know: what variables are currently holding object references?
fun example() {
var a = "hello"; // 'a' is on the stack
var b = 42; // 'b' is on the stack (but it's a number, not a reference)
var c = a; // 'c' is on the stack, points to "hello"
// GC RUNS HERE
// The GC needs to know:
// - 'a' is a reference to a string object
// - 'b' is NOT a reference (just a number)
// - 'c' is a reference to the same string object
}
Why This Is Hard
In compiled code, a, b, and c are just bytes in memory (stack slots or registers). The GC sees:
Stack memory:
[slot 0]: 0x7fff1234 <-- Is this a pointer? A number? Who knows!
[slot 1]: 42 <-- This looks like a number, but could be a pointer!
[slot 2]: 0x7fff1234 <-- Same as slot 0
The GC can't tell the difference between a pointer and an integer that happens to look like a valid address.
Two Approaches
| Approach | How It Works | Pros | Cons |
|---|---|---|---|
| Conservative | Treat anything that LOOKS like a pointer as a pointer | Simple, no compiler changes | Can't move objects, may keep garbage alive |
| Precise | Tell GC exactly where pointers are | Efficient, enables moving GC | Compiler must track pointer locations |
We'll use precise GC because:
- It's more educational
- It's what real language implementations use
- It enables better collectors later
Chapter 9: The Shadow Stack
The simplest way to do precise GC: maintain a linked list of stack frames.
The Concept
┌─────────────────────────────────────────────────────────────┐
│ SHADOW STACK │
│ │
│ Each function call pushes a "frame" that lists: │
│ - How many roots it has │
│ - Pointers to each root │
│ │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────┐ │
│ │ Frame for main() │ │
│ │ roots: [a, c] │──┐ │
│ └─────────────────────┘ │ │
│ ▼ │
│ ┌─────────────────────┐ │
│ │ Frame for example() │ │
│ │ roots: [x, y] │──┐ │
│ └─────────────────────┘ │ │
│ ▼ │
│ ┌─────────────────────┐ │
│ │ Frame for helper() │ │
│ │ roots: [temp] │──► NULL │
│ └─────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
When GC runs:
- Walk the linked list of frames
- For each frame, mark each root
- Done! All reachable objects are marked
The Frame Structure
#![allow(unused)] fn main() { // src/runtime/shadow_stack.rs /// A frame in the shadow stack /// /// This is pushed when a function is called, /// and popped when a function returns. #[repr(C)] pub struct StackFrame { /// Pointer to the next frame (up the call stack) pub next: Option<*mut StackFrame>, /// Number of roots in this frame pub root_count: usize, /// Array of root pointers /// (this is a flexible array member - actual size = root_count) pub roots: [*mut u8; 0], } impl StackFrame { /// Get a pointer to the roots array pub fn roots_ptr(&mut self) -> *mut *mut u8 { self.roots.as_mut_ptr() } /// Get a root by index pub fn get_root(&self, index: usize) -> *mut u8 { assert!(index < self.root_count); unsafe { *self.roots.as_ptr().add(index) } } /// Set a root by index pub fn set_root(&mut self, index: usize, value: *mut u8) { assert!(index < self.root_count); unsafe { *self.roots.as_mut_ptr().add(index) = value; } } } }
Global Stack Head
#![allow(unused)] fn main() { // src/runtime/shadow_stack.rs (continued) use std::cell::RefCell; /// The head of the shadow stack (most recent frame) thread_local! { pub static SHADOW_STACK_HEAD: RefCell<Option<*mut StackFrame>> = RefCell::new(None); } }
Push and Pop Operations
#![allow(unused)] fn main() { /// Push a new frame onto the shadow stack /// /// This allocates a frame with space for `root_count` roots. /// Returns a pointer to the roots array so the compiler can fill it in. /// /// # Safety /// The caller must call `pop_frame` before the function returns. #[no_mangle] pub unsafe extern "C" fn gc_push_frame(root_count: usize) -> *mut *mut u8 { // Calculate frame size: header + root pointers let frame_size = std::mem::size_of::<StackFrame>() + root_count * std::mem::size_of::<*mut u8>(); // Allocate the frame let layout = std::alloc::Layout::from_size_align(frame_size, 8).unwrap(); let frame_ptr = std::alloc::alloc(layout) as *mut StackFrame; if frame_ptr.is_null() { panic!("Out of memory allocating stack frame!"); } // Initialize the frame SHADOW_STACK_HEAD.with(|head| { (*frame_ptr).next = *head.borrow(); (*frame_ptr).root_count = root_count; // Initialize all roots to null for i in 0..root_count { (*frame_ptr).set_root(i, std::ptr::null_mut()); } // Update head *head.borrow_mut() = Some(frame_ptr); }); // Return pointer to roots array (*frame_ptr).roots_ptr() } /// Pop a frame from the shadow stack /// /// # Safety /// Must be called exactly once for each `gc_push_frame`. #[no_mangle] pub unsafe extern "C" fn gc_pop_frame() { SHADOW_STACK_HEAD.with(|head| { let frame_ptr = (*head.borrow()).expect("No frame to pop!"); // Calculate frame size for deallocation let root_count = (*frame_ptr).root_count; let frame_size = std::mem::size_of::<StackFrame>() + root_count * std::mem::size_of::<*mut u8>(); // Update head to next frame *head.borrow_mut() = (*frame_ptr).next; // Free the frame let layout = std::alloc::Layout::from_size_align(frame_size, 8).unwrap(); std::alloc::dealloc(frame_ptr as *mut u8, layout); }); } /// Set a root in the current frame #[no_mangle] pub unsafe extern "C" fn gc_set_root(root_index: usize, value: *mut u8) { SHADOW_STACK_HEAD.with(|head| { let frame_ptr = (*head.borrow()).expect("No frame!"); (*frame_ptr).set_root(root_index, value); }); } }
Chapter 10: Putting It Together - The Full GC
Now we have:
- Allocation (Part 2)
- Shadow stack for roots (this part)
Let's write the actual gc_collect() function.
Mark Phase
#![allow(unused)] fn main() { // src/runtime/gc.rs (continued) use crate::runtime::object::{ObjHeader, ObjType}; use crate::runtime::shadow_stack::{SHADOW_STACK_HEAD, StackFrame}; /// Mark all reachable objects fn gc_mark() { // Walk the shadow stack SHADOW_STACK_HEAD.with(|head| { let mut current = *head.borrow(); while let Some(frame_ptr) = current { let frame = unsafe { &*frame_ptr }; // Mark each root in this frame for i in 0..frame.root_count { let root = frame.get_root(i); if !root.is_null() { mark_object(root); } } // Move to next frame current = frame.next; } }); } /// Recursively mark an object and everything it references fn mark_object(obj_ptr: *mut u8) { // Get the header (before the data pointer) let header = unsafe { (obj_ptr as *mut ObjHeader).sub(1) }; // Already marked? Skip (avoid infinite loops) if unsafe { (*header).marked } { return; } // Mark it unsafe { (*header).marked = true; } // Now mark any objects this one references // (depends on object type) mark_references(header); } /// Mark objects referenced by this object fn mark_references(header: *mut ObjHeader) { let obj_type = unsafe { (*header).obj_type }; let data = unsafe { (header as *mut u8).add(std::mem::size_of::<ObjHeader>()) }; match obj_type { ObjType::Number | ObjType::String => { // These don't reference other objects } ObjType::Closure => { // A closure references its captured variables // Layout: [function_ptr][capture_count][capture_0][capture_1]... unsafe { let capture_count = *(data.add(8) as *const usize); let captures = data.add(16) as *const *mut u8; for i in 0..capture_count { let captured = *captures.add(i); if !captured.is_null() { mark_object(captured); } } } } ObjType::Instance => { // An instance references its field values // Layout: [class_ptr][field_count][field_0_key][field_0_value]... unsafe { let field_count = *(data.add(8) as *const usize); let fields = data.add(16) as *const *mut u8; // Each field is (key_ptr, value_ptr) for i in 0..field_count { let value = *fields.add(i * 2 + 1); if !value.is_null() { mark_object(value); } } } } } } }
Sweep Phase
#![allow(unused)] fn main() { /// Free all unmarked objects fn gc_sweep() { use std::collections::HashSet; // We need to rebuild the object list as we sweep let mut new_head: Option<*mut ObjHeader> = None; let mut freed_count = 0; ALL_OBJECTS.with(|objects| { let mut current = *objects.borrow(); let mut keep = Vec::new(); while let Some(header_ptr) = current { let header = unsafe { &*header_ptr }; let next = header.next; if header.marked { // Still alive! Clear mark for next cycle unsafe { (*header_ptr).marked = false; } keep.push(header_ptr); } else { // Dead! Free it let total_size = std::mem::size_of::<ObjHeader>() + header.size as usize; let layout = std::alloc::Layout::from_size_align(total_size, 8).unwrap(); unsafe { std::alloc::dealloc(header_ptr as *mut u8, layout); } freed_count += 1; } current = next; } // Rebuild linked list for (i, &header_ptr) in keep.iter().enumerate() { let next = if i + 1 < keep.len() { Some(keep[i + 1]) } else { None }; unsafe { (*header_ptr).next = next; } } *objects.borrow_mut() = keep.first().copied(); }); OBJECT_COUNT.with(|count| { *count.borrow_mut() -= freed_count; }); eprintln!("GC: Freed {} objects, {} remaining", freed_count, OBJECT_COUNT.with(|c| *c.borrow())); } }
The Main Entry Point
#![allow(unused)] fn main() { /// Run a garbage collection cycle /// /// This is automatically called when allocation threshold is hit, /// but can also be called manually. #[no_mangle] pub extern "C" fn gc_collect() { gc_mark(); gc_sweep(); } }
Chapter 11: When to Collect
We need to decide: when should gc_collect() run?
Options:
- Every allocation - Simple but slow
- Every N allocations - Simple, tunable
- When memory exceeds threshold - More sophisticated
- Never automatically - Manual only (good for debugging)
We'll use option 2 (every N allocations):
#![allow(unused)] fn main() { // src/runtime/gc.rs /// Trigger collection after this many allocations const GC_THRESHOLD: usize = 1024; /// Check if we should collect fn maybe_gc() { OBJECT_COUNT.with(|count| { if *count.borrow() >= GC_THRESHOLD { gc_collect(); } }); } /// Modified alloc to trigger GC pub fn alloc(size: usize, obj_type: ObjType) -> *mut u8 { maybe_gc(); // Check before allocating // ... rest of alloc code ... } }
Chapter 10.5: Step-by-Step Mark-Sweep Walkthrough
Let's trace through a complete GC cycle with a concrete example:
Setup: A Running Program
fun main() {
var a = "hello"; // Object A: String "hello"
var b = 42; // Not an object (number is inline)
var c = a; // c points to same string as a
// ... some code that creates garbage ...
var temp = "garbage"; // Object G: String "garbage"
temp = nil; // G is now garbage!
// GC RUNS HERE
print a; // Should still work
print c; // Should still work
}
State Before GC
Shadow Stack:
┌─────────────────────────┐
│ Frame for main() │
│ root[0]: 0x1000 (a) │
│ root[1]: null (b) │
│ root[2]: 0x1000 (c) │
└─────────────────────────┘
Heap Objects:
ALL_OBJECTS → 0x2000 (G) → 0x1000 (A) → None
Object A (at 0x1000):
header: { marked: false, type: String, size: 5, next: None }
data: "hello"
Object G (at 0x2000):
header: { marked: false, type: String, size: 7, next: 0x1000 }
data: "garbage"
Mark Phase: Step by Step
Step 1: Walk to frame (main)
Step 2: Process root[0] = 0x1000 (a)
- Get header at 0x1000 - 16 = 0x0FF0
- marked = false → mark it!
- Type is String, no references to mark
Result: Object A is now marked
Step 3: Process root[1] = null
- Skip null values
Result: No action
Step 4: Process root[2] = 0x1000 (c)
- Get header at 0x1000 - 16 = 0x0FF0
- marked = true → already marked, skip
Result: Already marked, no action
Step 5: No more frames
Mark phase complete!
State After Mark Phase
Shadow Stack: (unchanged)
┌─────────────────────────┐
│ Frame for main() │
│ root[0]: 0x1000 │
│ root[1]: null │
│ root[2]: 0x1000 │
└─────────────────────────┘
Heap Objects:
ALL_OBJECTS → 0x2000 (G) → 0x1000 (A) → None
Object A (at 0x1000):
header: { marked: TRUE, type: String, size: 5, next: None }
data: "hello"
Object G (at 0x2000):
header: { marked: FALSE, type: String, size: 7, next: 0x1000 }
data: "garbage"
Sweep Phase: Step by Step
Step 1: Walk to object 0x2000 (G)
- marked = false → FREE IT!
- Free 23 bytes (header + 7 data)
- freed_count = 1
Step 2: Walk to object 0x1000 (A)
- marked = true → KEEP IT
- Clear mark for next cycle
- marked = false
- Add to keep list
Step 3: Rebuild linked list
- keep = [0x1000]
- ALL_OBJECTS = 0x1000
- 0x1000.next = None
Result: 1 object freed, 1 remaining
State After Sweep Phase
Shadow Stack: (unchanged)
Heap Objects:
ALL_OBJECTS → 0x1000 (A) → None
Object A (at 0x1000):
header: { marked: false, type: String, size: 5, next: None }
data: "hello"
Object G: FREED! Memory returned to system.
Chapter 10.6: Common Mistakes and Debugging
Mistake 1: Forgetting to Register a Root
Code:
#![allow(unused)] fn main() { fn broken_function() { let obj = alloc(16, ObjType::String); // Oops! Forgot gc_set_root(0, obj) gc_collect(); // BAD! obj might be freed! // Now obj points to freed memory use_object(obj); // CRASH or corruption! } }
Fix:
#![allow(unused)] fn main() { fn correct_function() { gc_push_frame(1); // One root slot let obj = alloc(16, ObjType::String); gc_set_root(0, obj); // Register as root! gc_collect(); // Safe! obj is protected // ... use obj ... gc_pop_frame(); // Clean up } }
Mistake 2: Push/Pop Mismatch
Code:
#![allow(unused)] fn main() { fn broken() { gc_push_frame(2); // Oops! Forgot gc_pop_frame() } // Frame leaks! fn next_call() { gc_push_frame(1); // Creates NEW frame // But old frame is still in the list! // GC will mark stale pointers } }
Fix: Always use RAII pattern or ensure pop on all paths:
#![allow(unused)] fn main() { struct FrameGuard; impl FrameGuard { fn new(root_count: usize) -> Self { unsafe { gc_push_frame(root_count) }; FrameGuard } } impl Drop for FrameGuard { fn drop(&mut self) { unsafe { gc_pop_frame() }; } } fn correct() { let _guard = FrameGuard::new(2); // gc_pop_frame called automatically when _guard drops // Even if we panic or return early! } }
Mistake 3: Circular References Not Handled
Actually this IS handled! The marked flag prevents infinite loops:
Object A references Object B
Object B references Object A
mark(A):
A.marked = true
mark(B) // A's reference
B.marked = true
mark(A) // B's reference
A.marked = true → SKIP (already marked)
return
return
No infinite loop!
Debugging Tips
Add these diagnostic functions:
#![allow(unused)] fn main() { /// Check if the GC state is consistent pub fn gc_validate() { // Check shadow stack SHADOW_STACK_HEAD.with(|head| { let mut current = *head.borrow(); let mut frame_count = 0; while let Some(frame_ptr) = current { frame_count += 1; if frame_count > 1000 { panic!("Shadow stack too deep! Infinite loop?"); } current = unsafe { (*frame_ptr).next }; } }); // Check object list ALL_OBJECTS.with(|objects| { let mut current = *objects.borrow(); let mut obj_count = 0; while let Some(header_ptr) = current { obj_count += 1; if obj_count > 100000 { panic!("Object list too long! Infinite loop?"); } current = unsafe { (*header_ptr).next }; } // Verify count matches OBJECT_COUNT.with(|count| { assert_eq!(*count.borrow(), obj_count, "Object count mismatch!"); }); }); } }
Chapter 10.7: Practice Exercises
Exercise 1: Trace the Mark Phase
Given this state:
Shadow Stack:
Frame(main): roots = [0x1000, null, 0x2000]
Frame(helper): roots = [0x3000]
Heap:
0x1000: String "a" (not marked)
0x2000: String "b" (not marked)
0x3000: String "c" (not marked)
0x4000: String "garbage" (not marked)
Which objects get marked?
Click to reveal answer
- 0x1000: Marked (root in main frame, slot 0)
- 0x2000: Marked (root in main frame, slot 2)
- 0x3000: Marked (root in helper frame, slot 0)
- 0x4000: NOT marked (no root points to it) → will be freed
The mark phase walks all frames and marks every non-null root.
Exercise 2: Implement a Stack Frame Guard
Write a Rust struct that automatically pops the frame when it goes out of scope:
#![allow(unused)] fn main() { // TODO: Implement this struct StackFrameGuard { // What fields? } impl StackFrameGuard { fn new(root_count: usize) -> Self { // Push frame, return guard } fn set_root(&self, index: usize, value: *mut u8) { // Set a root } } impl Drop for StackFrameGuard { fn drop(&mut self) { // Pop frame } } }
Click to reveal answer
#![allow(unused)] fn main() { struct StackFrameGuard { root_count: usize, } impl StackFrameGuard { fn new(root_count: usize) -> Self { unsafe { gc_push_frame(root_count) }; Self { root_count } } fn set_root(&self, index: usize, value: *mut u8) { assert!(index < self.root_count); unsafe { gc_set_root(index, value) }; } } impl Drop for StackFrameGuard { fn drop(&mut self) { unsafe { gc_pop_frame() }; } } // Usage: fn example() { let frame = StackFrameGuard::new(2); let obj = alloc(16, ObjType::String); frame.set_root(0, obj); // ... use obj ... // frame automatically pops when function returns } }
Exercise 3: Why Not Mark During Sweep?
Why do we separate mark and sweep into two phases? Why not free objects as soon as we find they're unreachable?
Click to reveal answer
We need to mark ALL live objects before freeing ANY.
Consider:
Object A references Object B
We're walking the heap in order: A, then B
If we freed A immediately when we saw it was unmarked:
- We'd lose the reference to B
- B would become unreachable
- But B might be reachable from a root!
By marking first, we ensure we've found ALL live objects.
Then sweep can safely free everything else.
Two-phase approach ensures correctness.
Checkpoint 2
We now have a complete garbage collector:
- Object allocation with headers
- Shadow stack for tracking roots
- Mark phase (walk roots, recursively mark)
- Sweep phase (free unmarked objects)
- Automatic collection trigger
- Step-by-step walkthrough
- Common mistakes and debugging
- Practice exercises
Files created so far:
mlir-lox-guide-rust-part2.md- Concepts + allocationmlir-lox-guide-rust-part3.md- Roots + full GC (this file)
Next: Integration with MLIR code generation
MLIR for Lox: Part 4 - MLIR Integration
Now we have a working GC runtime. Let's integrate it with our MLIR code generator.
Chapter 12: The Runtime Module
First, let's organize our runtime into a proper Rust module:
#![allow(unused)] fn main() { // src/runtime/mod.rs pub mod object; pub mod shadow_stack; pub mod gc; // Re-export the public API pub use object::{ObjHeader, ObjType}; pub use gc::{alloc, gc_collect}; pub use shadow_stack::{gc_push_frame, gc_pop_frame, gc_set_root}; }
And the Cargo.toml needs to build it as a static library:
# Cargo.toml
[package]
name = "lox-runtime"
version = "0.1.0"
edition = "2021"
[lib]
name = "lox_runtime"
crate-type = ["staticlib", "cdylib"] # Both static and dynamic lib
[dependencies]
Chapter 13: The Lox MLIR Dialect
We need MLIR operations that correspond to our GC operations:
| Operation | Meaning | LLVM Lowering |
|---|---|---|
lox.alloc | Allocate a heap object | Call lox_runtime::alloc |
lox.gc_root | Register a pointer as a GC root | Store in shadow stack slot |
lox.gc_push_frame | Push a stack frame | Call gc_push_frame |
lox.gc_pop_frame | Pop a stack frame | Call gc_pop_frame |
Defining Operations in Melior
Melior doesn't use TableGen like C++ MLIR. We define operations directly in Rust:
#![allow(unused)] fn main() { // src/codegen/lox_dialect.rs use melior::{ Context, Location, dialect::Dialect, ir::{Operation, OperationBuilder, Region, Type, Value}, }; /// The Lox dialect namespace pub const DIALECT_NAME: &str = "lox"; /// Operation names pub mod ops { pub const ALLOC: &str = "lox.alloc"; pub const LOAD: &str = "lox.load"; pub const STORE: &str = "lox.store"; pub const PUSH_FRAME: &str = "lox.push_frame"; pub const POP_FRAME: &str = "lox.pop_frame"; pub const SET_ROOT: &str = "lox.set_root"; } /// Create a lox.alloc operation /// /// This allocates a heap object of the given type. /// Returns a pointer to the object data (after the header). pub fn create_alloc<'c>( context: &'c Context, obj_type: u8, // ObjType enum value size: usize, // Size of object data in bytes location: Location<'c>, ) -> melior::ir::Operation<'c> { // The lox.alloc operation takes: // - type: i8 (the ObjType enum) // - size: i64 (allocation size) // And returns: // - ptr: !llvm.ptr (pointer to object data) OperationBuilder::new(ops::ALLOC, location) .add_attribute("obj_type", melior::ir::attribute::IntegerAttribute::new(obj_type as i64, Type::integer(context, 8)).into()) .add_attribute("size", melior::ir::attribute::IntegerAttribute::new(size as i64, Type::integer(context, 64)).into()) .add_results(&[Type::parse(context, "!llvm.ptr").unwrap()]) .build() .unwrap() } /// Create a lox.push_frame operation /// /// This pushes a new shadow stack frame with the given number of root slots. /// Returns a pointer to the roots array. pub fn create_push_frame<'c>( context: &'c Context, root_count: usize, location: Location<'c>, ) -> melior::ir::Operation<'c> { OperationBuilder::new(ops::PUSH_FRAME, location) .add_attribute("root_count", melior::ir::attribute::IntegerAttribute::new(root_count as i64, Type::integer(context, 64)).into()) .add_results(&[Type::parse(context, "!llvm.ptr").unwrap()]) .build() .unwrap() } /// Create a lox.pop_frame operation pub fn create_pop_frame<'c>( context: &'c Context, location: Location<'c>, ) -> melior::ir::Operation<'c> { OperationBuilder::new(ops::POP_FRAME, location) .build() .unwrap() } /// Create a lox.set_root operation /// /// Sets a root in the current shadow stack frame. pub fn create_set_root<'c>( context: &'c Context, root_index: usize, value: Value<'c>, location: Location<'c>, ) -> melior::ir::Operation<'c> { OperationBuilder::new(ops::SET_ROOT, location) .add_attribute("index", melior::ir::attribute::IntegerAttribute::new(root_index as i64, Type::integer(context, 64)).into()) .add_operand(value) .build() .unwrap() } }
Chapter 14: Lowering to LLVM
Our lox.* operations need to be converted to LLVM IR. We do this with a lowering pass:
#![allow(unused)] fn main() { // src/codegen/lowering.rs use melior::{ Context, Location, PassManager, ir::{Block, Module, Operation, Region, Value}, dialect::{func, llvm}, pass::Pass, }; /// Lower lox.alloc to a runtime call fn lower_alloc(op: &Operation, block: &mut Block, context: &Context) { let location = op.location(); // Get attributes let obj_type = op.attribute("obj_type").unwrap() .as_integer().unwrap() as i64; let size = op.attribute("size").unwrap() .as_integer().unwrap() as i64; // Create constants for arguments let obj_type_val = create_const_i8(context, obj_type as i8); let size_val = create_const_i64(context, size); // Call lox_runtime_alloc(type, size) let call = func::call( context, melior::ir::attribute::FlatSymbolRefAttribute::new(context, "lox_runtime_alloc"), &[obj_type_val, size_val], &[Type::parse(context, "!llvm.ptr").unwrap()], location, ); block.append_operation(call.clone()); // Replace uses of the original result let result = op.result(0).unwrap(); let new_result = call.result(0).unwrap(); // (In real code, we'd track and replace all uses) } /// Lower lox.push_frame to a runtime call fn lower_push_frame(op: &Operation, block: &mut Block, context: &Context) { let location = op.location(); let root_count = op.attribute("root_count").unwrap() .as_integer().unwrap() as i64; let count_val = create_const_i64(context, root_count); // Call gc_push_frame(root_count) let call = func::call( context, melior::ir::attribute::FlatSymbolRefAttribute::new(context, "gc_push_frame"), &[count_val], &[Type::parse(context, "!llvm.ptr").unwrap()], location, ); block.append_operation(call.clone()); // Replace uses of the original result let result = op.result(0).unwrap(); let new_result = call.result(0).unwrap(); } /// Lower lox.pop_frame to a runtime call fn lower_pop_frame(op: &Operation, block: &mut Block, context: &Context) { let location = op.location(); // Call gc_pop_frame() let call = func::call( context, melior::ir::attribute::FlatSymbolRefAttribute::new(context, "gc_pop_frame"), &[], &[], location, ); block.append_operation(call); } /// Lower lox.set_root to a store instruction fn lower_set_root(op: &Operation, block: &mut Block, context: &Context) { let location = op.location(); let root_index = op.attribute("index").unwrap() .as_integer().unwrap() as i64; let value = op.operand(0).unwrap(); // Get the frame pointer (from the most recent push_frame) // In real code, we'd track this // Store value at frame_ptr[index] // llvm.store value, frame_ptr[index] } }
Chapter 15: Code Generation for Functions
Now we modify our function code generator to use the shadow stack:
#![allow(unused)] fn main() { // src/codegen/generator.rs (modified) impl<'c> CodeGenerator<'c> { /// Compile a function with GC support fn compile_function(&mut self, func: &FunctionStmt) { let location = self.loc(func.location); // === STEP 1: Count roots === // Roots = parameters + local variables let root_count = self.count_roots(func); // === STEP 2: Create function type === let float_type = Type::float64(self.context); let param_types: Vec<Type> = func.params.iter().map(|_| float_type).collect(); let function_type = FunctionType::new(self.context, ¶m_types, &[float_type]); // === STEP 3: Create function body === let region = Region::new(); let block = Block::new( ¶m_types.iter().map(|&t| (t, location)).collect::<Vec<_>>() ); // === STEP 4: Push shadow stack frame === let push_frame = create_push_frame(self.context, root_count, location); block.append_operation(push_frame.clone()); // The result is a pointer to the roots array let roots_ptr = push_frame.result(0).unwrap().into(); // === STEP 5: Store parameters as roots === for (i, param_name) in func.params.iter().enumerate() { let arg = block.argument(i).unwrap(); // Store the parameter value in roots[i] // (In a real implementation, we'd handle this based on type) self.set_root(&block, i, arg.into(), location); // Also track in our local variables map self.variables.insert(param_name.clone(), arg.into()); } // === STEP 6: Compile function body === self.current_block = Some(block); self.current_root_index = func.params.len(); // Next free root slot for stmt in &func.body { self.compile_statement(stmt); } // === STEP 7: Add implicit return if needed === // ... // === STEP 8: Pop shadow stack frame === let pop_frame = create_pop_frame(self.context, location); if let Some(block) = &self.current_block { block.append_operation(pop_frame); } // === STEP 9: Create the function === // ... append block to region, add func to module ... } /// Count the total number of roots needed for a function fn count_roots(&self, func: &FunctionStmt) -> usize { // Parameters are roots let mut count = func.params.len(); // Local variables are roots for stmt in &func.body { count += self.count_roots_in_stmt(stmt); } count } /// Count roots introduced by a statement fn count_roots_in_stmt(&self, stmt: &Stmt) -> usize { match stmt { Stmt::Var(v) => 1, // Each var declaration is a root Stmt::Block(b) => b.statements.iter() .map(|s| self.count_roots_in_stmt(s)) .sum(), Stmt::If(i) => { self.count_roots_in_stmt(&i.then_branch[0]) + i.else_branch.iter().map(|s| self.count_roots_in_stmt(s)).sum::<usize>() } Stmt::While(w) => self.count_roots_in_stmt(&w.body[0]), _ => 0, } } /// Set a root in the shadow stack fn set_root(&mut self, block: &Block<'c>, index: usize, value: Value<'c>, location: Location<'c>) { let set_root_op = create_set_root(self.context, index, value, location); block.append_operation(set_root_op); } } }
Chapter 16: The Lowering Pass
We need a pass that converts lox.* operations to LLVM calls:
#![allow(unused)] fn main() { // src/codegen/lowering_pass.rs use melior::{ Context, Module, PassManager, pass::Pass, }; /// Create the lowering pass manager pub fn create_lowering_pass_manager(context: &Context) -> PassManager { let pm = PassManager::new(context); // Lower Lox dialect to LLVM pm.add_pass(pass::convert_lox_to_llvm()); // Lower standard dialects to LLVM pm.add_pass(pass::convert_scf_to_cf()); pm.add_pass(pass::convert_cf_to_llvm()); pm.add_pass(pass::convert_arith_to_llvm()); pm.add_pass(pass::convert_func_to_llvm()); pm } /// Our custom Lox-to-LLVM pass mod pass { use super::*; pub fn convert_lox_to_llvm() -> Pass { Pass::from_info("lox-to-llvm", |module: &Module| { // Walk all operations // For each lox.* operation, replace with LLVM call module.as_operation().walk(|op| { let op_name = op.name(); match op_name { "lox.alloc" => { // Replace with call to lox_runtime_alloc } "lox.push_frame" => { // Replace with call to gc_push_frame } "lox.pop_frame" => { // Replace with call to gc_pop_frame } "lox.set_root" => { // Replace with store instruction } _ => {} } }); }) } } }
Chapter 17: Linking Everything Together
Now we need to link the MLIR-generated code with our Rust runtime:
Step 1: Compile the Runtime
cd lox-mlir
cargo build --release
This produces target/release/liblox_runtime.a (static lib) and liblox_runtime.so (dynamic lib).
Step 2: Generate MLIR
#![allow(unused)] fn main() { // Compile Lox source to MLIR let mlir = compile_to_mlir(source)?; println!("{}", mlir); }
Output:
module {
func.func @example() -> f64 {
%0 = lox.push_frame root_count = 3 : !llvm.ptr
// Allocate a string object
%1 = lox.alloc obj_type = 1, size = 5 : !llvm.ptr
// Store as root 0
lox.set_root index = 0, %1 : !llvm.ptr
// ... function body ...
lox.pop_frame
return %result : f64
}
}
Step 3: Lower to LLVM IR
# Lower MLIR to LLVM IR
mlir-translate output.mlir --mlir-to-llvmir -o output.ll
Step 4: Compile to Object File
# Compile LLVM IR to object file
llc output.ll -filetype=obj -o output.o
# Link with runtime
clang output.o -L./target/release -llox_runtime -o output
Step 5: Run
./output
Chapter 17.5: Complete MLIR Example
Let's trace through the complete compilation of a simple Lox function:
Input: Lox Source
fun add(a, b) {
return a + b;
}
print add(1, 2);
Stage 1: MLIR (Lox Dialect)
module {
// The 'add' function
func.func @add(%arg0: f64, %arg1: f64) -> f64
attributes {sym_name = "add"}
{
// Push shadow stack frame with 2 roots (for parameters)
%frame = lox.push_frame root_count = 2 : !llvm.ptr
// Register parameters as roots
lox.set_root index = 0, %arg0 : f64
lox.set_root index = 1, %arg1 : f64
// The addition
%sum = arith.addf %arg0, %arg1 : f64
// Pop frame before return
lox.pop_frame
// Return
return %sum : f64
}
// The main entry point
func.func @main() -> i32 {
%frame = lox.push_frame root_count = 0 : !llvm.ptr
// Call add(1, 2)
%one = arith.constant 1.0 : f64
%two = arith.constant 2.0 : f64
%result = func.call @add(%one, %two) : (f64, f64) -> f64
// Print the result
// (simplified - would call a print runtime function)
lox.pop_frame
%zero = arith.constant 0 : i32
return %zero : i32
}
}
Stage 2: After Lowering (LLVM Dialect)
module {
llvm.func @add(%arg0: f64, %arg1: f64) -> f64 {
// gc_push_frame(2)
%frame = llvm.call @gc_push_frame(%c2_i64) : (i64) -> !llvm.ptr
// gc_set_root(0, arg0) - converted to store
// (simplified representation)
llvm.store %arg0, %frame[%c0_i64] : f64, !llvm.ptr
// gc_set_root(1, arg1)
llvm.store %arg1, %frame[%c1_i64] : f64, !llvm.ptr
// Addition
%sum = llvm.fadd %arg0, %arg1 : f64
// gc_pop_frame()
llvm.call @gc_pop_frame() : () -> ()
llvm.return %sum : f64
}
llvm.func @main() -> i32 {
// ... similar lowering ...
llvm.return %c0_i32 : i32
}
// External declarations for runtime
llvm.func @gc_push_frame(i64) -> !llvm.ptr
llvm.func @gc_pop_frame() -> ()
llvm.func @lox_runtime_alloc(i64, i8) -> !llvm.ptr
}
Stage 3: LLVM IR
define double @add(double %0, double %1) {
entry:
; Push shadow stack frame
%frame = call i8* @gc_push_frame(i64 2)
; Store parameters as roots
%root0_ptr = getelementptr i8*, i8* %frame, i64 0
store double %0, double* %root0_ptr
%root1_ptr = getelementptr i8*, i8* %frame, i64 1
store double %1, double* %root1_ptr
; Addition
%sum = fadd double %0, %1
; Pop frame
call void @gc_pop_frame()
ret double %sum
}
define i32 @main() {
entry:
%result = call double @add(double 1.0, double 2.0)
; ... print result ...
ret i32 0
}
; External runtime functions
declare i8* @gc_push_frame(i64)
declare void @gc_pop_frame()
declare i8* @lox_runtime_alloc(i64, i8)
Stage 4: Assembly (x86-64, simplified)
add:
push rbp
mov rbp, rsp
; gc_push_frame(2)
mov rdi, 2
call gc_push_frame
mov rax, rax ; frame pointer
; Store roots (simplified)
movsd [rax], xmm0 ; store %0
movsd [rax + 8], xmm1 ; store %1
; Addition
addsd xmm0, xmm1 ; %sum = %0 + %1
; gc_pop_frame()
call gc_pop_frame
; Return
pop rbp
ret
main:
; ...
movsd xmm0, 1.0
movsd xmm1, 2.0
call add
; ...
Chapter 17.6: Handling Different Object Types
Let's see how we generate code for different object types:
String Allocation
var s = "hello";
Generated MLIR:
// Allocate string object (ObjType.String = 1, size = 5)
%str = lox.alloc obj_type = 1, size = 5 : !llvm.ptr
// Initialize string data
// (would fill in length, hash, and characters)
// Store as root
lox.set_root index = 0, %str : !llvm.ptr
Closure Allocation
fun makeCounter() {
var count = 0;
fun counter() {
count = count + 1;
return count;
}
return counter;
}
Generated MLIR (simplified):
func.func @makeCounter() -> !llvm.ptr {
// 1. Allocate environment for captured 'count'
%env = lox.alloc obj_type = 2, size = 16 : !llvm.ptr
// Environment layout: [enclosing, count_slots...]
// 2. Initialize count = 0 in environment
// (store at offset)
// 3. Allocate closure
%closure = lox.alloc obj_type = 3, size = 16 : !llvm.ptr
// Closure layout: [function_index, environment_ptr]
// 4. Link closure to environment
// (store function index and env pointer)
return %closure : !llvm.ptr
}
Chapter 17.7: Practice Exercises
Exercise 1: Generate MLIR for a Function
Write the MLIR (Lox dialect) for this Lox function:
fun multiply(x, y) {
var result = x * y;
return result;
}
Click to reveal answer
func.func @multiply(%arg0: f64, %arg1: f64) -> f64 {
// 3 roots: x, y, result
%frame = lox.push_frame root_count = 3 : !llvm.ptr
// Register parameters
lox.set_root index = 0, %arg0 : f64
lox.set_root index = 1, %arg1 : f64
// Compute x * y
%product = arith.mulf %arg0, %arg1 : f64
// Allocate 'result' (if it's a heap object)
// For simplicity, if result is just a number, we don't allocate
// But if it's an object:
%result_obj = lox.alloc obj_type = 0, size = 8 : !llvm.ptr
lox.set_root index = 2, %result_obj : !llvm.ptr
// Store product in result_obj
// (would need additional operations)
lox.pop_frame
return %product : f64
}
Note: For simple numbers, we might not need heap allocation. The actual implementation would depend on your Lox value representation.
Exercise 2: Trace the Compilation Pipeline
Given this Lox code:
var x = 1;
var y = 2;
print x + y;
What does each stage produce?
Click to reveal answer
MLIR (Lox Dialect):
func.func @main() -> i32 {
%frame = lox.push_frame root_count = 2 : !llvm.ptr
// var x = 1
%x_val = arith.constant 1.0 : f64
lox.set_root index = 0, %x_val : f64
// var y = 2
%y_val = arith.constant 2.0 : f64
lox.set_root index = 1, %y_val : f64
// x + y
%sum = arith.addf %x_val, %y_val : f64
// print (simplified)
// call print_runtime(%sum)
lox.pop_frame
return %c0_i32 : i32
}
After Lowering: The lox.push_frame becomes func.call @gc_push_frame, etc.
LLVM IR: Standard LLVM with calls to runtime functions.
Assembly: Native x86-64 or ARM code.
Exercise 3: Why Separate Dialects?
Why do we have a separate lox dialect instead of just generating LLVM IR directly?
Click to reveal answer
-
Abstraction Level: Lox dialect captures Lox semantics (allocation, GC roots) at a high level. We can optimize at this level before lowering.
-
Target Independence: MLIR can target WebAssembly, GPUs, or other backends. We don't lock ourselves into LLVM.
-
Debugging: We can inspect the IR at each stage (Lox dialect → LLVM dialect → LLVM IR).
-
Custom Optimizations: We can write passes that understand Lox semantics (e.g., eliminate unnecessary allocations).
-
Incremental Lowering: We can do some optimizations in the Lox dialect, then lower to LLVM for target-specific work.
Checkpoint 3
We now have complete MLIR integration:
-
Lox dialect operations (
lox.alloc,lox.push_frame, etc.) - Lowering to LLVM calls
- Function code generation with shadow stack
- Linking with Rust runtime
- Complete MLIR example walkthrough
- Different object types
- Practice exercises
Files created:
mlir-lox-guide-rust-part2.md- Concepts + allocationmlir-lox-guide-rust-part3.md- Roots + full GCmlir-lox-guide-rust-part4.md- MLIR integration (this file)
Next: Handling closures (the tricky part)
MLIR for Lox: Part 5 - Closures
Closures are the most challenging part of garbage collection. They capture variables from outer scopes, and those captured variables must stay alive as long as the closure exists.
Chapter 18: The Closure Problem
Consider this Lox code:
fun makeCounter() {
var count = 0;
fun counter() {
count = count + 1; // 'count' is from makeCounter's scope!
return count;
}
return counter;
}
var c = makeCounter(); // makeCounter returns, but 'count' must live on!
print c(); // 1
print c(); // 2
print c(); // 3
The problem:
makeCounter()returns- Its stack frame is destroyed
- But
countmust still exist becausecountercaptures it! - Where does
countlive?
Stack vs Heap
┌─────────────────────────────────────────────────────────────┐
│ WRONG: count on the stack │
│ │
│ makeCounter() called │
│ ┌─────────────────────┐ │
│ │ count = 0 │ ← on the stack │
│ │ return counter │ │
│ └─────────────────────┘ │
│ ↓ │
│ makeCounter() returns │
│ ┌─────────────────────┐ │
│ │ (freed!) │ ← count is gone! │
│ └─────────────────────┘ │
│ ↓ │
│ c() is called │
│ counter tries to access count... CRASH! │
│ │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ RIGHT: count on the heap (in a closure environment) │
│ │
│ makeCounter() called │
│ ┌─────────────────────┐ ┌─────────────────────┐ │
│ │ env = alloc() │──────│ count = 0 │ │
│ │ return counter │ │ (on the heap!) │ │
│ └─────────────────────┘ └─────────────────────┘ │
│ ↓ ↑ │
│ makeCounter() returns │ │
│ (stack frame freed) │ │
│ ↓ │ │
│ c() is called ─────────────────────┘ │
│ counter accesses count via env pointer │
│ count is still alive! │
│ │
└─────────────────────────────────────────────────────────────┘
Chapter 19: Closure Environments
A closure environment is a heap-allocated structure that holds captured variables.
Structure
Closure Environment:
┌────────────────────────────────────┐
│ Header (ObjHeader) │
│ marked: bool │
│ obj_type: ObjType::Environment │
│ size: ... │
├────────────────────────────────────┤
│ Data │
│ enclosing: *mut Env (or null) │ ← for nested closures
│ count: usize │ ← number of slots
│ slot[0]: value │
│ slot[1]: value │
│ ... │
└────────────────────────────────────┘
Closure Object
Closure Object:
┌────────────────────────────────────┐
│ Header (ObjHeader) │
│ marked: bool │
│ obj_type: ObjType::Closure │
│ size: ... │
├────────────────────────────────────┤
│ Data │
│ function: *mut Function │ ← the code to execute
│ environment: *mut Env │ ← captured variables
└────────────────────────────────────┘
Chapter 20: Implementing Environments
Let's add environment support to our runtime:
#![allow(unused)] fn main() { // src/runtime/object.rs (extended) #[repr(u8)] #[derive(Debug, Clone, Copy, PartialEq)] pub enum ObjType { Number = 0, String = 1, Environment = 2, // NEW: closure environment Closure = 3, // NEW: closure object Instance = 4, } /// An environment (holds captured variables) #[repr(C)] pub struct Environment { /// Pointer to enclosing environment (for nested closures) pub enclosing: Option<*mut Environment>, /// Number of variable slots pub slot_count: usize, /// Variable slots (flexible array member) pub slots: [*mut u8; 0], } impl Environment { /// Get a slot value pub fn get(&self, index: usize) -> *mut u8 { assert!(index < self.slot_count); unsafe { *self.slots.as_ptr().add(index) } } /// Set a slot value pub fn set(&mut self, index: usize, value: *mut u8) { assert!(index < self.slot_count); unsafe { *self.slots.as_mut_ptr().add(index) = value; } } } /// A closure (function + environment) #[repr(C)] pub struct Closure { /// Pointer to the function code (an index into our function table) pub function_index: usize, /// Pointer to the captured environment pub environment: *mut Environment, } }
Allocating an Environment
#![allow(unused)] fn main() { // src/runtime/gc.rs (extended) /// Allocate a new environment with the given slot count pub fn alloc_environment(slot_count: usize, enclosing: Option<*mut Environment>) -> *mut Environment { // Calculate size: header + environment struct + slots let env_data_size = std::mem::size_of::<Environment>() + slot_count * std::mem::size_of::<*mut u8>(); let data_ptr = alloc(env_data_size, ObjType::Environment); let env = data_ptr as *mut Environment; unsafe { (*env).enclosing = enclosing; (*env).slot_count = slot_count; // Initialize all slots to null for i in 0..slot_count { (*env).set(i, std::ptr::null_mut()); } } env } /// Allocate a new closure pub fn alloc_closure(function_index: usize, environment: *mut Environment) -> *mut Closure { let data_ptr = alloc(std::mem::size_of::<Closure>(), ObjType::Closure); let closure = data_ptr as *mut Closure; unsafe { (*closure).function_index = function_index; (*closure).environment = environment; } closure } }
Chapter 21: Marking Environments
When we mark a closure, we must also mark its environment:
#![allow(unused)] fn main() { // src/runtime/gc.rs (extended) fn mark_references(header: *mut ObjHeader) { let obj_type = unsafe { (*header).obj_type }; let data = unsafe { (header as *mut u8).add(std::mem::size_of::<ObjHeader>()) }; match obj_type { ObjType::Number | ObjType::String => { // No references } ObjType::Environment => { let env = data as *mut Environment; unsafe { // Mark enclosing environment if let Some(enclosing) = (*env).enclosing { mark_object(enclosing as *mut u8); } // Mark all slots for i in 0..(*env).slot_count { let slot = (*env).get(i); if !slot.is_null() { mark_object(slot); } } } } ObjType::Closure => { let closure = data as *mut Closure; unsafe { // Mark the environment let env = (*closure).environment; if !env.is_null() { mark_object(env as *mut u8); } } } ObjType::Instance => { // (same as before) } } } }
Chapter 22: Code Generation for Closures
Now let's generate code for closures in our compiler:
The Challenge
When compiling a closure, we need to:
- Identify captured variables - which variables from outer scopes are used?
- Allocate an environment - create heap storage for captured variables
- Store captured values - copy values into the environment
- Access via environment - when the closure reads/writes captured vars, go through the environment
Step 1: Variable Analysis
#![allow(unused)] fn main() { // src/analysis/captures.rs use crate::ast::*; /// A variable that is captured by a closure #[derive(Debug, Clone)] pub struct CapturedVar { pub name: String, pub depth: usize, // How many scopes up? (0 = immediate, 1 = one level up, etc.) pub slot_index: usize, // Index in the environment } /// Analyze a function to find captured variables pub fn find_captures(func: &FunctionStmt) -> Vec<CapturedVar> { let mut analyzer = CaptureAnalyzer::new(); analyzer.analyze_function(func); analyzer.captures } struct CaptureAnalyzer { scopes: Vec<Vec<String>>, // Stack of local variables in each scope captures: Vec<CapturedVar>, current_slot: usize, } impl CaptureAnalyzer { fn new() -> Self { Self { scopes: vec![vec![]], // Start with one scope for parameters captures: Vec::new(), current_slot: 0, } } fn analyze_function(&mut self, func: &FunctionStmt) { // Parameters are in scope 0 for param in &func.params { self.scopes[0].push(param.clone()); } // Analyze body for stmt in &func.body { self.analyze_stmt(stmt); } } fn analyze_stmt(&mut self, stmt: &Stmt) { match stmt { Stmt::Var(v) => { self.scopes.last_mut().unwrap().push(v.name.clone()); self.analyze_expr(&v.init); } Stmt::Block(b) => { self.scopes.push(vec![]); for s in &b.statements { self.analyze_stmt(s); } self.scopes.pop(); } // ... other cases ... _ => {} } } fn analyze_expr(&mut self, expr: &Expr) { match expr { Expr::Variable(v) => { // Is this variable captured? (not in current scope) if !self.is_local(&v.name) { // It's a capture! if !self.captures.iter().any(|c| c.name == v.name) { self.captures.push(CapturedVar { name: v.name.clone(), depth: self.depth_of(&v.name), slot_index: self.current_slot, }); self.current_slot += 1; } } } // ... other cases ... _ => {} } } fn is_local(&self, name: &str) -> bool { self.scopes.last().unwrap().contains(&name.to_string()) } fn depth_of(&self, name: &str) -> usize { for (depth, scope) in self.scopes.iter().rev().enumerate() { if scope.contains(&name.to_string()) { return depth; } } panic!("Variable not found: {}", name); } } }
Step 2: Environment Allocation
#![allow(unused)] fn main() { // src/codegen/generator.rs (extended) impl<'c> CodeGenerator<'c> { fn compile_function(&mut self, func: &FunctionStmt) { // ... setup ... // Find captured variables let captures = find_captures(func); // If we have captures, we need an environment if !captures.is_empty() { let env = self.alloc_environment(captures.len()); // Store captured values into the environment for capture in &captures { let value = self.get_capture_value(&capture.name, capture.depth); self.store_to_environment(env, capture.slot_index, value); } // The environment is now available for inner functions self.current_environment = Some(env); } // ... compile body ... } fn alloc_environment(&mut self, slot_count: usize) -> Value<'c> { let location = Location::unknown(self.context); // Call lox_runtime_alloc_environment(slot_count, enclosing) let slot_count_val = self.const_i64(slot_count as i64); let enclosing_val = self.current_environment .unwrap_or_else(|| self.const_null()); let call = func::call( self.context, melior::ir::attribute::FlatSymbolRefAttribute::new( self.context, "lox_runtime_alloc_environment" ), &[slot_count_val, enclosing_val], &[Type::parse(self.context, "!llvm.ptr").unwrap()], location, ); if let Some(block) = &self.current_block { block.append_operation(call.clone()); } call.result(0).unwrap().into() } fn store_to_environment(&mut self, env: Value<'c>, index: usize, value: Value<'c>) { let location = Location::unknown(self.context); // Calculate offset: sizeof(Environment) + index * sizeof(ptr) // Then store value at env + offset // For simplicity, call a runtime function let call = func::call( self.context, melior::ir::attribute::FlatSymbolRefAttribute::new( self.context, "lox_runtime_env_set" ), &[env, self.const_i64(index as i64), value], &[], location, ); if let Some(block) = &self.current_block { block.append_operation(call); } } fn load_from_environment(&mut self, env: Value<'c>, index: usize) -> Value<'c> { let location = Location::unknown(self.context); let call = func::call( self.context, melior::ir::attribute::FlatSymbolRefAttribute::new( self.context, "lox_runtime_env_get" ), &[env, self.const_i64(index as i64)], &[Type::parse(self.context, "!llvm.ptr").unwrap()], location, ); if let Some(block) = &self.current_block { block.append_operation(call.clone()); } call.result(0).unwrap().into() } } }
Chapter 23: The Complete Picture
Here's how everything connects for our makeCounter example:
Lox Source
fun makeCounter() {
var count = 0;
fun counter() {
count = count + 1;
return count;
}
return counter;
}
Generated MLIR (Simplified)
module {
// makeCounter creates an environment for 'count'
func.func @makeCounter() -> !llvm.ptr {
// Push frame with 0 roots (count goes in environment, not stack)
%frame = lox.push_frame root_count = 0 : !llvm.ptr
// Allocate environment with 1 slot
%env = call @lox_runtime_alloc_environment(1, null) : (!llvm.ptr)
// Initialize count = 0
%zero = arith.constant 0.0 : f64
call @lox_runtime_env_set(%env, 0, %zero) : (!llvm.ptr, i64, f64)
// Create closure for counter
// (counter_index = 1, env = %env)
%closure = call @lox_runtime_alloc_closure(1, %env) : (i64, !llvm.ptr)
lox.pop_frame
return %closure : !llvm.ptr
}
// counter accesses 'count' via environment
func.func @counter(%env: !llvm.ptr) -> f64 {
%frame = lox.push_frame root_count = 1 : !llvm.ptr
lox.set_root index = 0, %env : !llvm.ptr
// Load count from environment
%count = call @lox_runtime_env_get(%env, 0) : (!llvm.ptr, i64) -> f64
// count = count + 1
%one = arith.constant 1.0 : f64
%new_count = arith.addf %count, %one : f64
// Store back to environment
call @lox_runtime_env_set(%env, 0, %new_count) : (!llvm.ptr, i64, f64)
lox.pop_frame
return %new_count : f64
}
}
Memory Layout After var c = makeCounter();
Stack:
┌─────────────────┐
│ c = 0x1000 │──┐
└─────────────────┘ │
▼
Heap:
┌─────────────────────────┐ 0x1000: Closure
│ header: { Closure } │
│ function_index: 1 │
│ environment: 0x2000 ────│──┐
└─────────────────────────┘ │
▼
┌─────────────────────────┐ 0x2000: Environment
│ header: { Environment } │
│ enclosing: null │
│ slot_count: 1 │
│ slot[0]: 1.0 ───────────│──┐
└─────────────────────────┘ │
│
(count = 1.0) ◄──────────────┘
Chapter 23.5: Nested Closures Deep Dive
Let's trace through a more complex example with nested closures:
The Code
fun makeAdder(x) {
fun adder(y) {
return x + y; // Captures 'x' from makeAdder
}
return adder;
}
var add5 = makeAdder(5);
var add10 = makeAdder(10);
print add5(3); // 8
print add10(3); // 13
Step-by-Step Execution
1. makeAdder(5) is called:
Stack:
┌─────────────────────────────┐
│ Frame: makeAdder │
│ x = 5 (parameter) │
└─────────────────────────────┘
│
▼ Creates environment
┌─────────────────────────────┐
│ Environment 0x1000 │
│ slot[0]: 5 (x) │
└─────────────────────────────┘
│
▼ Creates closure
┌─────────────────────────────┐
│ Closure 0x2000 (adder) │
│ function_index: 1 │
│ environment: 0x1000 ──────│──► env with x=5
└─────────────────────────────┘
Returns closure 0x2000, assigned to add5
makeAdder stack frame is destroyed, but environment lives on!
2. makeAdder(10) is called:
Creates NEW environment 0x3000 with x=10
Creates NEW closure 0x4000 pointing to environment 0x3000
add5 → closure 0x2000 → env 0x1000 (x=5)
add10 → closure 0x4000 → env 0x3000 (x=10)
3. add5(3) is called:
Stack:
┌─────────────────────────────┐
│ Frame: adder │
│ y = 3 (parameter) │
│ env = 0x1000 (from closure)│
└─────────────────────────────┘
Load x from env[0] = 5
Compute 5 + 3 = 8
Return 8
4. add10(3) is called:
Same process, but env = 0x3000
Load x from env[0] = 10
Compute 10 + 3 = 13
Return 13
Key Insight: Each Call Creates Separate Environment
makeAdder(5):
→ Creates env {x: 5}
→ Creates closure pointing to that env
makeAdder(10):
→ Creates NEW env {x: 10}
→ Creates NEW closure pointing to NEW env
The two closures share no state!
Chapter 23.6: Multiple Captured Variables
What if we capture multiple variables?
fun makePoint(x, y) {
fun translate(dx, dy) {
// Captures both x and y
return (x + dx, y + dy);
}
return translate;
}
Environment Layout
Environment:
slot[0]: x
slot[1]: y
When translate is called:
1. Load x from env[0]
2. Load y from env[1]
3. Compute x + dx
4. Compute y + dy
Generated Code
func.func @translate(%env: !llvm.ptr, %dx: f64, %dy: f64) -> (f64, f64) {
// Load captured variables
%x = call @lox_runtime_env_get(%env, 0) : (!llvm.ptr, i64) -> f64
%y = call @lox_runtime_env_get(%env, 1) : (!llvm.ptr, i64) -> f64
// Compute
%new_x = arith.addf %x, %dx : f64
%new_y = arith.addf %y, %dy : f64
return %new_x, %new_y : f64, f64
}
Chapter 23.7: Nested Environments
What if a closure captures a variable from two scopes up?
fun outer() {
var a = 1;
fun middle() {
var b = 2;
fun inner() {
return a + b; // a from outer, b from middle
}
return inner;
}
return middle;
}
Environment Chain
outer() creates:
env_outer { a: 1 }
middle() creates:
env_middle { b: 2, enclosing: env_outer }
↑
Points to outer's environment
inner() closure:
environment: env_middle
When inner() runs:
1. Look up 'a': not in env_middle → follow enclosing → found in env_outer
2. Look up 'b': found in env_middle
Variable Lookup Algorithm
#![allow(unused)] fn main() { /// Look up a variable by traversing the environment chain fn lookup_variable(env: *mut Environment, depth: usize, slot: usize) -> *mut u8 { let mut current = env; // Walk up 'depth' levels for _ in 0..depth { current = unsafe { (*current).enclosing.unwrap() }; } // Now access the slot unsafe { (*current).get(slot) } } }
Generated MLIR
func.func @inner(%env: !llvm.ptr) -> f64 {
// Load 'b' from current environment (depth=0, slot=0)
%b = call @lox_runtime_env_get(%env, 0) : (!llvm.ptr, i64) -> f64
// Load 'a' from enclosing environment (depth=1, slot=0)
%env_outer = call @lox_runtime_env_get_enclosing(%env) : (!llvm.ptr) -> !llvm.ptr
%a = call @lox_runtime_env_get(%env_outer, 0) : (!llvm.ptr, i64) -> f64
// Compute
%sum = arith.addf %a, %b : f64
return %sum : f64
}
Chapter 23.8: Practice Exercises
Exercise 1: Trace Memory Layout
For this code:
fun factory(value) {
fun get() {
return value;
}
fun set(new_value) {
value = new_value;
}
return (get, set);
}
var (getter, setter) = factory(10);
print getter(); // 10
setter(20);
print getter(); // 20
Draw the memory layout after var (getter, setter) = factory(10);
Click to reveal answer
Stack:
┌─────────────────────────────┐
│ getter = 0x1000 (closure) │
│ setter = 0x2000 (closure) │
└─────────────────────────────┘
Heap:
0x1000: Closure (get)
function_index: @get
environment: 0x3000 ──────────┐
│
0x2000: Closure (set) │
function_index: @set │
environment: 0x3000 ──────────│── Same environment!
│
0x3000: Environment ◄───────────┘
slot[0]: 10 (value)
Key insight: Both closures share the SAME environment. That's why setter(20) affects what getter() returns!
Exercise 2: Variable Analysis
For this function, what variables are captured and where do they go?
fun outer(x, y) {
var a = x + y;
fun inner(b) {
return a + b + x; // Captures a and x
}
return inner;
}
Click to reveal answer
Captured variables:
a(local in outer) → slot 0x(parameter in outer) → slot 1
Environment for inner:
env_inner:
slot[0]: a
slot[1]: x
Note: y is NOT captured (not used by inner), so it's not in the environment.
Exercise 3: Why Not Copy Values?
Why can't we just copy captured values into the closure? Why do we need an environment?
fun example() {
var count = 0;
fun increment() {
count = count + 1; // MODIFIES count!
return count;
}
return increment;
}
Click to reveal answer
If we copied count into the closure:
- Each call to
increment()would modify its OWN copy - The outer
countwould never change - Multiple calls wouldn't accumulate
By using an environment:
- All closures share the same environment
- Modifications are visible to all
- State is properly shared
The environment is essentially a shared "box" that holds the variable.
Checkpoint 4
We've covered the hardest part: closures!
- Why captured variables can't live on the stack
- Environment structure (heap-allocated variable storage)
- Closure structure (function + environment pointer)
- Marking environments during GC
- Code generation for closures
- Nested closures deep dive
- Multiple captured variables
- Environment chains
- Practice exercises
Files created:
mlir-lox-guide-rust-part2.md- Concepts + allocationmlir-lox-guide-rust-part3.md- Roots + full GCmlir-lox-guide-rust-part4.md- MLIR integrationmlir-lox-guide-rust-part5.md- Closures (this file)
Next: Final review and complete reference
MLIR for Lox: Part 6 - Complete Reference
This final part provides a complete reference and review of the entire GC implementation.
Chapter 24: Complete File Structure
lox-mlir/
├── Cargo.toml
├── src/
│ ├── lib.rs # Library entry point
│ ├── main.rs # CLI
│ │
│ ├── ast/
│ │ ├── mod.rs
│ │ └── types.rs # AST type definitions
│ │
│ ├── lexer/
│ │ ├── mod.rs
│ │ └── tokenizer.rs # Tokenization
│ │
│ ├── parser/
│ │ ├── mod.rs
│ │ └── parser.rs # Parsing
│ │
│ ├── analysis/
│ │ ├── mod.rs
│ │ └── captures.rs # Closure capture analysis
│ │
│ ├── codegen/
│ │ ├── mod.rs
│ │ ├── generator.rs # MLIR code generation
│ │ ├── lox_dialect.rs # Lox MLIR operations
│ │ ├── lowering.rs # Lox → LLVM lowering
│ │ └── types.rs # Type utilities
│ │
│ └── runtime/
│ ├── mod.rs
│ ├── object.rs # Object header and types
│ ├── gc.rs # Garbage collector
│ └── shadow_stack.rs # Shadow stack implementation
│
├── examples/
│ ├── simple.lox # Basic example
│ ├── closures.lox # Closure example
│ └── gc_stress.lox # GC stress test
│
└── tests/
└── gc_test.rs # GC integration tests
Chapter 25: Complete Object Layout
Every object in Lox has this layout:
┌─────────────────────────────────────────────────────┐
│ ObjHeader (8 bytes) │
├─────────────────────────────────────────────────────┤
│ Field │ Offset │ Size │ Description │
├───────────────┼────────┼──────┼────────────────────┤
│ marked │ 0 │ 1 │ GC mark flag │
│ obj_type │ 1 │ 1 │ ObjType enum │
│ padding │ 2 │ 2 │ Alignment padding │
│ size │ 4 │ 4 │ Data size in bytes │
└─────────────────────────────────────────────────────┘
Total header size: 8 bytes
Data follows immediately after header.
Type-Specific Layouts
Number (ObjType::Number)
┌──────────────────────────────┐
│ Header (8 bytes) │
├──────────────────────────────┤
│ value: f64 (8 bytes) │
└──────────────────────────────┘
Total: 16 bytes
String (ObjType::String)
┌──────────────────────────────┐
│ Header (8 bytes) │
├──────────────────────────────┤
│ length: usize (8 bytes) │
│ hash: u64 (8 bytes) │
│ chars: [u8; length] │
└──────────────────────────────┘
Total: 16 + length bytes
Environment (ObjType::Environment)
┌──────────────────────────────┐
│ Header (8 bytes) │
├──────────────────────────────┤
│ enclosing: *mut Env (8) │
│ slot_count: usize (8) │
│ slots: [*mut u8; slot_count] │
└──────────────────────────────┘
Total: 24 + (8 × slot_count) bytes
Closure (ObjType::Closure)
┌──────────────────────────────┐
│ Header (8 bytes) │
├──────────────────────────────┤
│ function_index: usize (8) │
│ environment: *mut Env (8) │
└──────────────────────────────┘
Total: 24 bytes
Instance (ObjType::Instance)
┌──────────────────────────────┐
│ Header (8 bytes) │
├──────────────────────────────┤
│ class: *mut Class (8) │
│ field_count: usize (8) │
│ fields: [(*str, *val); n] │
└──────────────────────────────┘
Total: 24 + (16 × field_count) bytes
Chapter 26: Complete GC API Reference
Runtime Functions (C ABI)
// Allocation
void* lox_runtime_alloc(size_t size, uint8_t obj_type);
void* lox_runtime_alloc_environment(size_t slot_count, void* enclosing);
void* lox_runtime_alloc_closure(size_t function_index, void* environment);
// Environment access
void lox_runtime_env_set(void* env, size_t index, void* value);
void* lox_runtime_env_get(void* env, size_t index);
// Shadow stack
void** gc_push_frame(size_t root_count);
void gc_pop_frame(void);
void gc_set_root(size_t index, void* value);
// Collection
void gc_collect(void);
size_t gc_object_count(void);
MLIR Operations
lox.alloc :: (obj_type: i8, size: i64) -> !llvm.ptr
Allocates a heap object with the given type and size.
lox.push_frame :: (root_count: i64) -> !llvm.ptr
Pushes a shadow stack frame with the given number of root slots.
Returns pointer to roots array.
lox.pop_frame :: () -> ()
Pops the current shadow stack frame.
lox.set_root :: (index: i64, value: !llvm.ptr) -> ()
Sets a root in the current frame.
lox.env_set :: (env: !llvm.ptr, index: i64, value: !llvm.ptr) -> ()
Sets a slot in an environment.
lox.env_get :: (env: !llvm.ptr, index: i64) -> !llvm.ptr
Gets a slot from an environment.
Chapter 27: The Compilation Pipeline
┌─────────────────────────────────────────────────────────────┐
│ 1. LEXING │
│ "var x = 1 + 2;" → [VAR, IDENT, EQUAL, NUMBER, ...] │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 2. PARSING │
│ Tokens → AST │
│ VarStmt { name: "x", init: BinaryExpr { ... } } │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 3. ANALYSIS │
│ Find captured variables, compute root counts │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 4. CODE GENERATION │
│ AST → MLIR (Lox dialect) │
│ func.func @main() { │
│ %frame = lox.push_frame root_count = 1 │
│ %x = lox.alloc obj_type = 0, size = 8 │
│ lox.set_root index = 0, %x │
│ ... │
│ } │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 5. LOWERING │
│ MLIR (Lox dialect) → MLIR (LLVM dialect) │
│ lox.alloc → func.call @lox_runtime_alloc │
│ lox.push_frame → func.call @gc_push_frame │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 6. LLVM IR GENERATION │
│ MLIR → LLVM IR │
│ define i8* @lox_runtime_alloc(...) { ... } │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 7. NATIVE CODE GENERATION │
│ LLVM IR → Machine code │
│ Link with liblox_runtime.a │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 8. EXECUTION │
│ ./program │
│ Runtime GC manages memory │
└─────────────────────────────────────────────────────────────┘
Chapter 28: Debugging Your GC
Common Bugs
1. Use After Free
Symptom: Random crashes, corrupted data
Cause: Object freed while still referenced
Fix: Check that all references are registered as roots
2. Memory Leaks
Symptom: Memory grows unbounded
Cause: Objects not being freed
Fix: Check mark phase visits all reachable objects
3. Premature Collection
Symptom: Object disappears mid-function
Cause: Root not registered in shadow stack
Fix: Ensure gc_set_root is called for all local references
Debugging Tools
#![allow(unused)] fn main() { // Add to gc.rs /// Print all objects in the heap (for debugging) pub fn gc_debug_print_heap() { ALL_OBJECTS.with(|objects| { let mut current = *objects.borrow(); println!("=== HEAP CONTENTS ==="); while let Some(header_ptr) = current { unsafe { let header = &*header_ptr; println!(" {:p}: type={:?}, marked={}, size={}", header_ptr, header.obj_type, header.marked, header.size); current = header.next; } } }); } /// Print the shadow stack (for debugging) pub fn gc_debug_print_stack() { SHADOW_STACK_HEAD.with(|head| { let mut current = *head.borrow(); println!("=== SHADOW STACK ==="); let mut frame_num = 0; while let Some(frame_ptr) = current { unsafe { let frame = &*frame_ptr; println!(" Frame {} ({} roots):", frame_num, frame.root_count); for i in 0..frame.root_count { let root = frame.get_root(i); println!(" [{}] = {:p}", i, root); } current = frame.next; frame_num += 1; } } }); } }
Chapter 29: Performance Considerations
Allocation Overhead
| Operation | Time | Notes |
|---|---|---|
alloc() | O(1) | Plus possible GC trigger |
gc_push_frame() | O(1) | Allocate and link |
gc_pop_frame() | O(1) | Unlink and free |
gc_collect() | O(live objects) | Mark + sweep |
Tuning Parameters
#![allow(unused)] fn main() { // Adjust these based on your workload /// Trigger GC after this many allocations const GC_THRESHOLD: usize = 1024; /// Maximum objects before forced collection const GC_MAX_OBJECTS: usize = 10000; /// Enable debug output const GC_DEBUG: bool = false; }
Optimization Opportunities
- Generational GC: Most objects die young. Only scan recent allocations.
- Incremental GC: Don't pause for full collection. Do it in small chunks.
- Parallel GC: Use multiple threads for mark/sweep.
- Escape Analysis: If an object doesn't escape a function, put it on the stack.
Chapter 30: What's Next?
You now have a complete garbage collector! Here's what to explore next:
Immediate Next Steps
- Get it running - Compile the runtime, write a simple test program
- Add more object types - Arrays, maps, classes
- Write tests - Verify GC handles all edge cases
Advanced Topics
-
Generational Collection
- Young generation: frequent, small collections
- Old generation: rare, large collections
-
Incremental Collection
- Break mark/sweep into small chunks
- Reduce pause times
-
Compacting Collection
- Move objects to eliminate fragmentation
- Requires precise GC (we have it!)
-
Concurrent Collection
- Run GC in parallel with program execution
- Complex but powerful
Resources
- Crafting Interpreters - The GC chapter (you skipped it!)
- GC Handbook - The definitive reference
- MLIR Documentation - For advanced dialect features
- LLVM Statepoints - If you want relocating GC
Appendix A: Quick Reference Card
┌──────────────────────────────────────────────────────────────┐
│ GC QUICK REFERENCE │
├──────────────────────────────────────────────────────────────┤
│ │
│ MARK-SWEEP ALGORITHM │
│ ───────────────────── │
│ 1. Mark all roots (shadow stack) │
│ 2. Recursively mark referenced objects │
│ 3. Sweep: free all unmarked objects │
│ │
│ ROOTS │
│ ───── │
│ • Global variables │
│ • Local variables (in shadow stack frames) │
│ • Temporaries (also in shadow stack) │
│ │
│ OBJECT LIFECYCLE │
│ ───────────────── │
│ 1. alloc() → Object on heap │
│ 2. gc_set_root() → Register in shadow stack │
│ 3. Use object │
│ 4. gc_pop_frame() → Remove from shadow stack │
│ 5. gc_collect() → Free if unreachable │
│ │
│ CLOSURES │
│ ──────── │
│ • Captured vars → Environment (heap) │
│ • Closure = function pointer + environment pointer │
│ • Mark closure → Mark environment → Mark captured vars │
│ │
│ MLIR OPERATIONS │
│ ─────────────── │
│ lox.alloc → Allocate object │
│ lox.push_frame → Push shadow stack frame │
│ lox.pop_frame → Pop shadow stack frame │
│ lox.set_root → Register root │
│ │
└──────────────────────────────────────────────────────────────┘
Appendix B: Error Messages
| Error | Cause | Fix |
|---|---|---|
No frame to pop! | gc_pop_frame called without matching gc_push_frame | Check function prologue/epilogue pairing |
Out of memory! | Allocation failed | Check for memory leaks, tune GC threshold |
Variable not found: x | Capture analysis failed | Check scope tracking in analyzer |
Object not marked during sweep | Logic error | Call gc_debug_print_heap() before sweep |
Appendix C: Study Roadmap
Use this checklist to track your progress through the GC tutorial:
Phase 1: Understanding (Read-Only)
- Part 2, Chapters 1-4: Understand the GC problem and mark-sweep algorithm
- Part 2, Chapter 5: Understand the two hard problems (roots and references)
- Part 2, Chapters 6-7: Understand object headers and allocation
- Part 2, Chapter 7.5: Complete practice exercises
- Part 3, Chapters 8-9: Understand the shadow stack concept
- Part 3, Chapter 10: Understand mark and sweep phases
- Part 3, Chapter 10.5: Walk through the step-by-step example
- Part 3, Chapter 10.6: Read about common mistakes
- Part 3, Chapter 10.7: Complete practice exercises
- Part 4, Chapters 12-14: Understand MLIR dialect and lowering
- Part 4, Chapter 15: Understand function code generation
- Part 4, Chapter 17.5: Trace through the complete MLIR example
- Part 4, Chapter 17.6: Understand different object types
- Part 4, Chapter 17.7: Complete practice exercises
- Part 5, Chapters 18-19: Understand the closure problem
- Part 5, Chapters 20-21: Understand environments and marking
- Part 5, Chapter 22: Understand closure code generation
- Part 5, Chapter 23: Understand the complete picture
- Part 5, Chapters 23.5-23.7: Understand nested and complex closures
- Part 5, Chapter 23.8: Complete practice exercises
Phase 2: Implementation (Hands-On)
- Create the runtime module structure
-
Implement
ObjHeaderandObjType -
Implement
alloc()function -
Implement shadow stack (
StackFrame,gc_push_frame,gc_pop_frame) -
Implement mark phase (
gc_mark,mark_object,mark_references) -
Implement sweep phase (
gc_sweep) -
Implement
gc_collect()entry point - Test allocation without GC
- Test mark phase with simple objects
- Test full GC cycle
-
Add debug utilities (
gc_debug_print_heap, etc.) - Implement environment allocation
- Implement closure allocation
- Test closure GC
Phase 3: MLIR Integration
- Set up Melior dependency
- Define Lox dialect operations
- Implement lowering pass
- Generate MLIR for simple functions
- Lower to LLVM IR
- Link with runtime
- Test end-to-end execution
Phase 4: Advanced Features
- Add string type
- Add instance type
- Implement nested closure environments
- Add generational GC (optional)
- Add incremental collection (optional)
Appendix D: Glossary
| Term | Definition |
|---|---|
| Heap | The pool of memory where objects are allocated |
| Stack | Memory for function calls and local variables |
| Root | A pointer that the GC treats as always live (globals, locals) |
| Reachable | An object that can be found by following pointers from roots |
| Mark | To flag an object as "still live" during GC |
| Sweep | To free all objects that weren't marked |
| Shadow Stack | A separate stack that tracks GC roots for each function call |
| Environment | A heap-allocated structure holding captured variables for closures |
| Closure | A function together with its captured environment |
| Dialect | A set of operations in MLIR for a specific domain |
| Lowering | Converting high-level IR to lower-level IR |
Appendix E: Further Reading
Books
- "Crafting Interpreters" by Bob Nystrom - Chapter on garbage collection
- "The Garbage Collection Handbook" by Jones et al. - Comprehensive GC reference
- "Engineering a Compiler" by Cooper & Torczon - GC in the context of compilers
Papers
- "Uniprocessor Garbage Collection Techniques" - Wilson (1992) - Classic survey
- "A Unified Theory of Garbage Collection" - Bacon et al. (2004) - Theoretical foundation
Code
- Lua 5.4 source - Simple, readable GC implementation
- MicroPython - Mark-sweep GC for embedded systems
- mruby - Lightweight Ruby with simple GC
MLIR/LLVM
- MLIR Documentation - https://mlir.llvm.org/docs/
- Melior Repository - Rust bindings for MLIR
- LLVM GC Documentation - https://llvm.org/docs/GarbageCollection.html
Conclusion
You've learned:
- Why GC is needed - Manual memory management doesn't work for closures
- Mark-sweep algorithm - The simplest correct GC
- Shadow stack - How to track roots precisely
- Closures - The hardest part: captured variables in environments
- MLIR integration - How to generate code that uses the GC
The key insight: GC is not magic. It's just:
- Track what's reachable (roots)
- Mark everything reachable
- Free everything not marked
Everything else is optimization.
File Summary
| File | Content |
|---|---|
mlir-lox-guide-rust-part2.md | GC concepts, object headers, allocation |
mlir-lox-guide-rust-part3.md | Shadow stack, mark phase, sweep phase |
mlir-lox-guide-rust-part4.md | MLIR dialect, lowering to LLVM |
mlir-lox-guide-rust-part5.md | Closures and environments |
mlir-lox-guide-rust-part6.md | Complete reference (this file) |
Good luck building your Lox compiler! 🦀
MLIR for Lox: Part 7 - Classes and Instances
Classes are the last major feature in Crafting Interpreters. They combine everything we've built — heap allocation, GC roots, closures — and add a new layer: method dispatch, inheritance, and this binding.
This part assumes you've read Parts 2–6. We'll extend the GC runtime from Part 6 and the MLIR codegen from Part 4.
What We're Building
By the end, this Lox program should work:
class Doughnut {
cook(flavor) {
print "Frying " + flavor + " doughnut";
}
}
class FilledDoughnut < Doughnut {
cook(flavor) {
super.cook(flavor);
print "Injecting " + flavor + " filling";
}
}
var d = FilledDoughnut();
d.cook("custard");
Output:
Frying custard doughnut
Injecting custard filling
The Object Model
Lox has three class-related object types:
| Object | What It Holds | Analogy |
|---|---|---|
ObjClass | Name, methods, superclass | A blueprint |
ObjInstance | Class pointer, field table | A house built from the blueprint |
ObjBoundMethod | Receiver + closure | A method "bound" to an instance |
These are all heap-allocated, GC-managed objects. They extend the ObjHeader we built in Part 2.
Updated Object Types
#![allow(unused)] fn main() { // src/runtime/object.rs use std::cell::UnsafeCell; use std::collections::HashMap; use crate::runtime::value::Value; use crate::runtime::heap::{ObjHeader, ObjType}; /// A Lox class object pub struct ObjClass { pub header: ObjHeader, pub name: String, /// Methods defined directly on this class (not inherited) pub methods: HashMap<String, Value>, /// Superclass, if any (nil for root classes) pub superclass: Option<*mut ObjClass>, } impl ObjClass { /// Look up a method, walking the inheritance chain pub fn find_method(&self, name: &str) -> Option<Value> { if let Some(value) = self.methods.get(name) { return Some(*value); } // Walk up the inheritance chain if let Some(super_ptr) = self.superclass { // SAFETY: GC guarantees the object is alive when we have a reference unsafe { (*super_ptr).find_method(name) } } else { None } } } /// A Lox instance object pub struct ObjInstance { pub header: ObjHeader, /// The class this instance belongs to pub class: *mut ObjClass, /// Instance fields (set at runtime, not on the class) pub fields: UnsafeCell<HashMap<String, Value>>, } impl ObjInstance { pub fn new(class: *mut ObjClass) -> Self { Self { header: ObjHeader::new(ObjType::Instance), class, fields: UnsafeCell::new(HashMap::new()), } } /// Get a field value, or look up a method on the class pub fn get_property(&self, name: &str) -> Option<Value> { // Fields shadow methods let fields = unsafe { &*self.fields.get() }; if let Some(value) = fields.get(name) { return Some(*value); } // Look up method on the class unsafe { (*self.class).find_method(name).map(|method| { // Bind the method to this instance // (We'll define ObjBoundMethod below) Value::bound_method(self as *const ObjInstance as *mut ObjInstance, method) }) } } /// Set a field value pub fn set_property(&self, name: String, value: Value) { let fields = unsafe { &mut *self.fields.get() }; fields.insert(name, value); } } /// A method bound to a specific receiver instance pub struct ObjBoundMethod { pub header: ObjHeader, /// The instance that receives the method call (the `this` value) pub receiver: *mut ObjInstance, /// The underlying closure pub method: Value, } }
Updated ObjType Enum
#![allow(unused)] fn main() { // src/runtime/object.rs (continued) #[repr(u8)] #[derive(Debug, Clone, Copy, PartialEq, Eq)] pub enum ObjType { String, Closure, // From Part 5 Class, Instance, BoundMethod, } }
GC Tracing for Classes
Every new object type needs to report its outgoing references. Miss one and you get use-after-free. This is the same trace function from Part 3, extended.
#![allow(unused)] fn main() { // src/runtime/gc.rs impl GC { /// Trace all reachable objects from `obj` pub fn trace_object(&mut self, obj: *mut ObjHeader) { let header = unsafe { &mut *obj }; if header.is_marked { return; } header.is_marked = true; match header.obj_type { ObjType::String => { // Strings have no outgoing references } ObjType::Closure => { let closure = unsafe { &*(obj as *const ObjClosure) }; // Trace the upvalue references (from Part 5) for &upvalue_ptr in &closure.upvalues { self.trace_value(unsafe { (*upvalue_ptr).closed }); } } ObjType::Class => { let class = unsafe { &*(obj as *const ObjClass) }; // Trace method values for value in class.methods.values() { self.trace_value(*value); } // Trace superclass if let Some(super_ptr) = class.superclass { self.trace_object(super_ptr as *mut ObjHeader); } } ObjType::Instance => { let instance = unsafe { &*(obj as *const ObjInstance) }; // Trace the class reference self.trace_object(instance.class as *mut ObjHeader); // Trace all field values let fields = unsafe { &*instance.fields.get() }; for value in fields.values() { self.trace_value(*value); } } ObjType::BoundMethod => { let bound = unsafe { &*(obj as *const ObjBoundMethod) }; // Trace the receiver instance self.trace_object(bound.receiver as *mut ObjHeader); // Trace the method closure self.trace_value(bound.method); } } } } }
Key insight: ObjClass traces its methods and its superclass. ObjInstance traces its class and all field values. ObjBoundMethod traces both the receiver and the closure. Every edge in the object graph must be walked.
this Binding
When a method is called, this refers to the receiver instance. We implement this the same way closures capture upvalues (Part 5) — the method's closure has an implicit upvalue that points to this.
How It Works
When we create a class, each method closure gets an extra upvalue slot for this. When a method is bound to an instance (via ObjBoundMethod), we fill that slot with the instance pointer.
#![allow(unused)] fn main() { // src/runtime/bind.rs use crate::runtime::object::{ObjBoundMethod, ObjInstance, ObjHeader, ObjType}; use crate::runtime::value::Value; use crate::runtime::heap::Heap; impl Heap { /// Bind a method closure to a receiver instance pub fn bind_method( &mut self, receiver: *mut ObjInstance, method: Value, ) -> *mut ObjBoundMethod { let bound = ObjBoundMethod { header: ObjHeader::new(ObjType::BoundMethod), receiver, method, }; self.allocate(bound) } } }
The Method Call Protocol
When the VM encounters a method call like d.cook("custard"):
- Evaluate
d→ get theObjInstancepointer - Look up
"cook"on the instance → get anObjBoundMethod - Call the bound method's closure with the provided arguments
- Inside the closure,
thisresolves to the bound receiver
No vtable needed. Method dispatch is just a hash map lookup that walks the superclass chain.
Inheritance: Just Linked Lists
Lox's inheritance is single-inheritance only. That means the class hierarchy is a linked list:
FilledDoughnut → Doughnut → nil
When we look up a method, we walk the chain:
#![allow(unused)] fn main() { // Already defined above in ObjClass::find_method pub fn find_method(&self, name: &str) -> Option<Value> { if let Some(value) = self.methods.get(name) { return Some(*value); } if let Some(super_ptr) = self.superclass { unsafe { (*super_ptr).find_method(name) } } else { None } } }
super Calls
A super.cook(flavor) expression needs two things:
- The superclass of the enclosing class (not the receiver's class)
- The method name
We resolve super at compile time, not runtime. During codegen, when we're inside a class method, we know which class we're in and therefore what the superclass is. We store this as a hidden upvalue on the closure — just like this.
#![allow(unused)] fn main() { // During compilation, inside a class method: // The method closure gets two implicit upvalues: // [0] = this (the receiver instance) // [1] = super (the enclosing class's superclass) }
This means super is free at runtime — no lookup needed. The superclass pointer is already captured in the closure.
MLIR Code Generation for Classes
Now the interesting part: generating MLIR for class declarations, instance creation, property access, and method calls.
New AST Nodes
#![allow(unused)] fn main() { // src/ast.rs (additions) #[derive(Debug, Clone)] pub enum Expr { // ... existing variants ... Get(GetExpr), Set(SetExpr), This(ThisExpr), Super(SuperExpr), } #[derive(Debug, Clone)] pub struct GetExpr { pub location: Location, pub object: Box<Expr>, pub name: String, } #[derive(Debug, Clone)] pub struct SetExpr { pub location: Location, pub object: Box<Expr>, pub name: String, pub value: Box<Expr>, } #[derive(Debug, Clone)] pub struct ThisExpr { pub location: Location, } #[derive(Debug, Clone)] pub struct SuperExpr { pub location: Location, pub method: String, } #[derive(Debug, Clone)] pub enum Stmt { // ... existing variants ... Class(ClassStmt), } #[derive(Debug, Clone)] pub struct ClassStmt { pub location: Location, pub name: String, pub superclass: Option<String>, pub methods: Vec<FunctionStmt>, } }
Runtime Calls as External Functions
Class operations are too complex for pure MLIR. We emit calls to runtime functions instead:
#![allow(unused)] fn main() { // src/codegen/classes.rs use melior::{ Context, Location, dialect::func, ir::{ attribute::{FlatSymbolRefAttribute, StringAttribute, TypeAttribute}, r#type::FunctionType, Type, Value, Block, operation::OperationBuilder, }, }; use crate::codegen::types::lox_value_type; /// Declare runtime functions needed for class operations pub fn declare_runtime_functions(context: &Context, module: &mut Module) { let location = Location::unknown(context); let lox_val = lox_value_type(context); // lox.create_class(name_ptr: !llvm.ptr, superclass: lox_val) -> lox_val let create_class_type = FunctionType::new( context, &[Type::parse(context, "!llvm.ptr").unwrap(), lox_val], &[lox_val], ); declare_external(module, context, "lox_create_class", create_class_type, location); // lox.instance_from_class(class: lox_val) -> lox_val let instance_type = FunctionType::new(context, &[lox_val], &[lox_val]); declare_external(module, context, "lox_instance_from_class", instance_type, location); // lox.get_property(instance: lox_val, name_ptr: !llvm.ptr) -> lox_val let get_prop_type = FunctionType::new( context, &[lox_val, Type::parse(context, "!llvm.ptr").unwrap()], &[lox_val], ); declare_external(module, context, "lox_get_property", get_prop_type, location); // lox.set_property(instance: lox_val, name_ptr: !llvm.ptr, value: lox_val) -> lox_val let set_prop_type = FunctionType::new( context, &[lox_val, Type::parse(context, "!llvm.ptr").unwrap(), lox_val], &[lox_val], ); declare_external(module, context, "lox_set_property", set_prop_type, location); // lox.bind_method(receiver: lox_val, method: lox_val) -> lox_val let bind_method_type = FunctionType::new(context, &[lox_val, lox_val], &[lox_val]); declare_external(module, context, "lox_bind_method", bind_method_type, location); } fn declare_external( module: &mut Module, context: &Context, name: &str, fn_type: FunctionType, location: Location, ) { module.body().append_operation(func::func( context, StringAttribute::new(context, name), TypeAttribute::new(fn_type.into()), Region::new(), &[], location, )); } }
Compiling Class Declarations
#![allow(unused)] fn main() { // src/codegen/generator.rs (additions) impl<'c> CodeGenerator<'c> { fn compile_class(&mut self, class: &ClassStmt) { let location = self.loc(class.location); // 1. Resolve superclass (if any) let superclass_val = if let Some(super_name) = &class.superclass { // Look up the superclass variable — it should be a class object self.compile_variable(&VariableExpr { location: class.location, name: super_name.clone(), }) } else { self.compile_nil() }; // 2. Create a global string constant for the class name let name_global = self.create_string_constant(&class.name); // 3. Call runtime: lox_create_class(name, superclass) let lox_val = lox_value_type(self.context); let create_class = func::call( self.context, FlatSymbolRefAttribute::new(self.context, "lox_create_class"), &[name_global, superclass_val], &[lox_val], location, ); if let Some(block) = &self.current_block { block.append_operation(create_class.clone()); } let class_val = create_class.result(0).unwrap().into(); // 4. Store each method on the class for method in &class.methods { self.compile_method(class_val, method); } // 5. Store the class object as a variable self.variables.insert(class.name.clone(), class_val); } fn compile_method(&mut self, class_val: Value<'c>, method: &FunctionStmt) { let location = self.loc(method.location); // Compile the method as a closure with two extra upvalues: // - upvalue[0] = this (will be bound at call time) // - upvalue[1] = super (the superclass, for super calls) // This is the same as compile_function, but with the extra upvalue slots. // For now, compile it as a regular function // (A full implementation would add the implicit upvalues) self.compile_function(method); // Then call a runtime function to attach the method to the class let method_val = self.variables.get(&method.name).copied().unwrap(); let name_global = self.create_string_constant(&method.name); let attach = func::call( self.context, FlatSymbolRefAttribute::new(self.context, "lox_set_method"), &[class_val, name_global, method_val], &[lox_value_type(self.context)], location, ); if let Some(block) = &self.current_block { block.append_operation(attach); } } } }
Compiling Property Access and Assignment
#![allow(unused)] fn main() { impl<'c> CodeGenerator<'c> { fn compile_get(&mut self, get: &GetExpr) -> Value<'c> { let location = self.loc(get.location); let object = self.compile_expression(&get.object); // Create a global string constant for the property name let name_global = self.create_string_constant(&get.name); // Call runtime: lox_get_property(instance, "name") let op = func::call( self.context, FlatSymbolRefAttribute::new(self.context, "lox_get_property"), &[object, name_global], &[lox_value_type(self.context)], location, ); if let Some(block) = &self.current_block { block.append_operation(op.clone()); } op.result(0).unwrap().into() } fn compile_set(&mut self, set: &SetExpr) -> Value<'c> { let location = self.loc(set.location); let object = self.compile_expression(&set.object); let value = self.compile_expression(&set.value); let name_global = self.create_string_constant(&set.name); // Call runtime: lox_set_property(instance, "name", value) let op = func::call( self.context, FlatSymbolRefAttribute::new(self.context, "lox_set_property"), &[object, name_global, value], &[lox_value_type(self.context)], location, ); if let Some(block) = &self.current_block { block.append_operation(op.clone()); } // set expressions return the assigned value (like assignment) op.result(0).unwrap().into() } } }
Compiling this and super
#![allow(unused)] fn main() { impl<'c> CodeGenerator<'c> { fn compile_this(&mut self, this: &ThisExpr) -> Value<'c> { // `this` is just a variable lookup — it's stored as an upvalue // by the method binding mechanism self.compile_variable(&VariableExpr { location: this.location, name: "this".to_string(), }) } fn compile_super(&mut self, super_expr: &SuperExpr) -> Value<'c> { let location = self.loc(super_expr.location); // `super.method` resolves to: // 1. Get the superclass from the implicit upvalue // 2. Look up the method on the superclass // 3. Bind it to `this` let super_class = self.compile_variable(&VariableExpr { location: super_expr.location, name: "super".to_string(), }); let method_name = self.create_string_constant(&super_expr.method); let this_val = self.compile_this(&ThisExpr { location: super_expr.location }); // Call runtime: lox_super_lookup(superclass, "method", this) let op = func::call( self.context, FlatSymbolRefAttribute::new(self.context, "lox_super_lookup"), &[super_class, method_name, this_val], &[lox_value_type(self.context)], location, ); if let Some(block) = &self.current_block { block.append_operation(op.clone()); } op.result(0).unwrap().into() } } }
What the Generated MLIR Looks Like
Given our doughnut example from the top:
module {
// Runtime function declarations
func.func @lox_create_class(!llvm.ptr, !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>
func.func @lox_instance_from_class(!llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>
func.func @lox_get_property(!llvm.struct<(i8, i64)>, !llvm.ptr) -> !llvm.struct<(i8, i64)>
func.func @lox_set_property(!llvm.struct<(i8, i64)>, !llvm.ptr, !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>
func.func @lox_bind_method(!llvm.struct<(i8, i64)>, !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>
// Global string constants
llvm.mlir.global constant @str_0("Doughnut")
llvm.mlir.global constant @str_1("cook")
llvm.mlir.global constant @str_2("flavor")
llvm.mlir.global constant @str_3("FilledDoughnut")
llvm.mlir.global constant @str_4("custard")
// Doughnut.cook method
func.func @Doughnut_cook(%arg0: !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)> {
// print "Frying " + flavor + " doughnut"
// ... (string concatenation via runtime calls)
func.return %nil : !llvm.struct<(i8, i64)>
}
// FilledDoughnut.cook method
func.func @FilledDoughnut_cook(%arg0: !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)> {
// super.cook(flavor)
// print "Injecting " + flavor + " filling"
// ... (runtime calls)
func.return %nil : !llvm.struct<(i8, i64)>
}
// Top-level code
func.func @main() -> !llvm.struct<(i8, i64)> {
// Create Doughnut class
%doughnut = func.call @lox_create_class(@str_0, %nil) : (!llvm.ptr, !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>
// Create FilledDoughnut class (inherits from Doughnut)
%filled = func.call @lox_create_class(@str_3, %doughnut) : (!llvm.ptr, !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>
// var d = FilledDoughnut()
%d = func.call @lox_instance_from_class(%filled) : (!llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>
// d.cook("custard")
%method = func.call @lox_get_property(%d, @str_1) : (!llvm.struct<(i8, i64)>, !llvm.ptr) -> !llvm.struct<(i8, i64)>
%custard = ... // string constant for "custard"
func.call @lox_call(%method, %custard) : (!llvm.struct<(i8, i64)>, !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>
func.return %nil : !llvm.struct<(i8, i64)>
}
}
The IR is verbose, but that's the point — it's an intermediate representation, not hand-written code. Each operation has clear semantics and the lowering passes can optimize it.
The C Runtime
The runtime functions are simple C — they operate on the same tagged union and heap we built in Parts 2–5.
// src/runtime/class_runtime.c
#include "runtime.h"
#include "gc.h"
#include <string.h>
// Create a new class object
LoxValue lox_create_class(const char* name, LoxValue superclass) {
ObjClass* klass = gc_allocate(sizeof(ObjClass));
klass->header.type = OBJ_CLASS;
klass->header.is_marked = false;
klass->name = strdup(name);
klass->methods = NULL; // empty hash map
klass->method_count = 0;
klass->superclass = IS_NIL(superclass) ? NULL : AS_CLASS(superclass);
return MAKE_OBJ(klass);
}
// Create an instance of a class
LoxValue lox_instance_from_class(LoxValue class_val) {
ObjClass* klass = AS_CLASS(class_val);
ObjInstance* instance = gc_allocate(sizeof(ObjInstance));
instance->header.type = OBJ_INSTANCE;
instance->header.is_marked = false;
instance->klass = klass;
instance->fields = NULL; // empty hash map
instance->field_count = 0;
return MAKE_OBJ(instance);
}
// Get a property on an instance
LoxValue lox_get_property(LoxValue instance_val, const char* name) {
ObjInstance* instance = AS_INSTANCE(instance_val);
// Check fields first (fields shadow methods)
for (int i = 0; i < instance->field_count; i++) {
if (strcmp(instance->fields[i].key, name) == 0) {
return instance->fields[i].value;
}
}
// Look up method on the class
ObjClass* klass = instance->klass;
while (klass != NULL) {
for (int i = 0; i < klass->method_count; i++) {
if (strcmp(klass->methods[i].key, name) == 0) {
// Bind the method to this instance
return lox_bind_method(instance_val, klass->methods[i].value);
}
}
klass = klass->superclass;
}
// Runtime error: undefined property
fprintf(stderr, "Undefined property '%s'.\n", name);
exit(1);
}
// Set a property on an instance
LoxValue lox_set_property(LoxValue instance_val, const char* name, LoxValue value) {
ObjInstance* instance = AS_INSTANCE(instance_val);
// Check if field already exists
for (int i = 0; i < instance->field_count; i++) {
if (strcmp(instance->fields[i].key, name) == 0) {
instance->fields[i].value = value;
return value;
}
}
// Add new field
int idx = instance->field_count++;
instance->fields = gc_reallocate(
instance->fields,
idx * sizeof(FieldEntry),
(idx + 1) * sizeof(FieldEntry)
);
instance->fields[idx].key = strdup(name);
instance->fields[idx].value = value;
return value;
}
// Bind a method to a receiver
LoxValue lox_bind_method(LoxValue receiver, LoxValue method) {
ObjBoundMethod* bound = gc_allocate(sizeof(ObjBoundMethod));
bound->header.type = OBJ_BOUND_METHOD;
bound->header.is_marked = false;
bound->receiver = AS_INSTANCE(receiver);
bound->method = method;
return MAKE_OBJ(bound);
}
Design Decisions and Trade-offs
Why Runtime Calls Instead of Pure MLIR?
You could emit pure MLIR for class operations — llvm.alloca for field storage, llvm.insertvalue/llvm.extractvalue for field access, etc. But:
| Approach | Pros | Cons |
|---|---|---|
| Runtime calls | Simple, correct, GC-aware | Can't optimize across boundary |
| Pure MLIR | Optimizable, no FFI overhead | Must teach MLIR about GC roots, field layout, dispatch |
For a tutorial, runtime calls are the right call. A production compiler would progressively move more into MLIR as it proves correctness. Start simple, optimize later.
Why No VTable?
VTables are an optimization for static dispatch. Lox's dispatch is dynamic — methods can be added at runtime, classes are first-class values. A hash map lookup per dispatch is the honest representation. If profiling shows it's a bottleneck, you add inline caches later.
Why Not Generic lox_call?
In the generated IR, method calls go through lox_get_property (returns a bound method) then lox_call (invokes the closure). Two runtime calls per method invocation. An optimization would be a single lox_invoke(instance, method_name, args) that combines both. We keep them separate for clarity — each function does one thing.
Full Update to the Expression Compiler
Adding the new expression types to the main compile_expression dispatch:
#![allow(unused)] fn main() { // src/codegen/generator.rs (updated match arm) fn compile_expression(&mut self, expr: &Expr) -> Value<'c> { match expr { Expr::Binary(b) => { let op = self.compile_binary(b); op.result(0).unwrap().into() } Expr::Unary(u) => { let op = self.compile_unary(u); op.result(0).unwrap().into() } Expr::Literal(l) => self.compile_literal(l), Expr::Grouping(g) => self.compile_expression(&g.expr), Expr::Variable(v) => self.compile_variable(v), Expr::Assign(a) => self.compile_assign(a), Expr::Call(c) => { let op = self.compile_call(c); op.result(0).unwrap().into() } Expr::Logical(l) => { let op = self.compile_logical(l); op.result(0).unwrap().into() } Expr::Get(g) => self.compile_get(g), Expr::Set(s) => self.compile_set(s), Expr::This(t) => self.compile_this(t), Expr::Super(s) => self.compile_super(s), } } }
Testing
Unit Tests for Method Lookup
#![allow(unused)] fn main() { #[cfg(test)] mod tests { use super::*; #[test] fn method_lookup_walks_inheritance() { let mut base_class = ObjClass { header: ObjHeader::new(ObjType::Class), name: "Base".to_string(), methods: HashMap::new(), superclass: None, }; base_class.methods.insert("greet".to_string(), Value::Nil); let mut derived_class = ObjClass { header: ObjHeader::new(ObjType::Class), name: "Derived".to_string(), methods: HashMap::new(), superclass: Some(&base_class as *const ObjClass as *mut ObjClass), }; // Derived doesn't have "greet", but Base does assert!(derived_class.find_method("greet").is_some()); // Derived doesn't have "missing" and neither does Base assert!(derived_class.find_method("missing").is_none()); } #[test] fn fields_shadow_methods() { let mut class = ObjClass { header: ObjHeader::new(ObjType::Class), name: "Test".to_string(), methods: HashMap::new(), superclass: None, }; class.methods.insert("x".to_string(), Value::Number(42.0)); let mut instance = ObjInstance::new(&class as *const ObjClass as *mut ObjClass); // Set field "x" to a different value instance.set_property("x".to_string(), Value::Number(99.0)); // Field should shadow the method let result = instance.get_property("x").unwrap(); assert_eq!(result, Value::Number(99.0)); } } }
Summary
| Concept | How We Implemented It |
|---|---|
| Class declaration | Runtime call lox_create_class |
| Instance creation | Runtime call lox_instance_from_class |
| Property access | Runtime call lox_get_property (fields before methods) |
| Property assignment | Runtime call lox_set_property |
| Method binding | ObjBoundMethod wraps receiver + closure |
this | Implicit upvalue, filled when method is bound |
super | Implicit upvalue holding the superclass, resolved at compile time |
| Inheritance | Linked list of superclass pointers, walked during method lookup |
| GC tracing | Walk methods, superclass, fields, receiver, and bound method |
Classes tie together every system we've built: the GC heap, closures, upvalues, and MLIR code generation. There's no new fundamental mechanism — just new combinations of what already exists. That's how you know the architecture is right.
Next: The series is complete. The "Next Steps" from Part 1 are all covered. To go further, consider adding: inline caches for method dispatch, a custom Lox dialect (instead of using only arith/scf/func), or JIT compilation via ORCv2.
REVIEW.md — MLIR for Lox Tutorial (Rust + Melior)
Reviewed by Esme, 2026-04-21
Overall
This tutorial covers ambitious ground: a full Lox compiler using MLIR via the Melior crate. The scope is impressive — AST, parsing, codegen, string constants, tagged unions, source locations, lowering — all in one document. However, this breadth comes at the cost of depth and correctness. The code examples have significant issues that would prevent them from compiling, and some MLIR concepts are presented in ways that could mislead. The strongest sections are the conceptual explanations (why MLIR, tagged unions, string constants); the weakest are the codegen examples.
Part 1: Setup
[error] The melior = "0.27" version should be verified. Melior's API changes frequently between minor versions, and the code examples use APIs that may not exist in 0.27. The arith::addf, arith::cmpf, func::func, etc. helper functions have changed signatures across versions.
[error] LLVM 22 is very recent. Most readers won't have it installed, and the Linux instructions ("You may need to build from source or use a PPA") are unhelpful. Consider targeting LLVM 18 or 19, which are more widely available, or provide actual build-from-source instructions.
[suggestion] The nom = "7.1" dependency is listed as optional but then never used. The parser in Part 3 is hand-written. Remove it or explain when it would be used.
Part 2: The Lox AST
[clarity] The AST definition is thorough and well-structured. The Location on every node is a good practice that pays off in Part 5. However, this is ~250 lines of struct definitions with no explanation of design choices. Why separate BinaryExpr, UnaryExpr, etc. instead of an enum with inline variants? (Answer: because each variant needs named fields and location tracking.) A brief note would help.
[suggestion] The LoxValue enum has String(String) — heap-allocated. For a compiler that's generating MLIR, you'd want to use interned strings or string references. This is fine for the AST level but worth a note.
Part 3: The Parser
[error] The parser references Token, TokenType, and ParseError types that are never defined. The Token struct and TokenType enum are needed for the parser to compile but aren't provided. A lexer is referenced (lexer::tokenize in Part 7) but never shown.
[error] The LoxValue::Number(f64) variant's as_number() and as_string() methods are used in primary() but never defined.
[clarity] The parser is a nearly line-for-line port of Crafting Interpreters' Java parser. This is fine for familiarity, but a note saying "This follows Crafting Interpreters Chapter 6 — we're not covering the scanner here" would set expectations.
[suggestion] The if_statement and while_statement methods wrap single statements in vec![] (then_branch: vec![self.statement()?]). This differs from Crafting Interpreters, which uses a single declaration. It's fine but means the then_branch and else_branch are always blocks, which affects codegen later.
Part 4: MLIR Code Generation (THE BIG ONE)
This is the core of the tutorial and has the most issues.
[error — CRITICAL] The CodeGenerator struct holds current_block: Option<Block<'c>> and attempts to append operations to blocks that are already inserted into regions. In Melior's ownership model, blocks are moved into regions via region.append_block(). After that, you can't hold a reference to the block separately — you access it through the region. The code does things like self.current_block = Some(block) and then later region.append_block(block) — but the block is already stored in self.current_block. This creates a double-use of the same moved value, which won't compile in Rust. The entire current_block pattern needs rethinking to work with Melior's ownership model.
[error — CRITICAL] The compile_function method does:
#![allow(unused)] fn main() { let block = Block::new(...); // ... add operations to block via self.current_block ... self.current_block = Some(block); // ... compile body ... region.append_block(block); // ERROR: block was already moved into self.current_block }
This won't compile. The block is moved into self.current_block, then the code tries to move it again into region.append_block().
[error] The compile_if method creates then_region and appends a block to it, then tries to use that block via self.current_block. But after then_region.append_block(then_block), the then_block is consumed and can't be referenced later. The pattern of "create block, insert into region, then reference via Option" doesn't work with Melior's move semantics.
[error] The compile_while has the same block-ownership issues as compile_if.
[error] The compile_logical method uses arith::andi and arith::ori for short-circuit logical operators. This is WRONG — bitwise AND/OR do NOT short-circuit. Lox's and and or must use scf.if for short-circuit evaluation. The code even has a comment saying "Logical operations short-circuit, so we need scf.if" but then uses bitwise ops instead!
[error] The compile_unary method for Not creates a constant true value and uses arith::xori. But the operand might be an i1 (boolean) or f64 — the types don't match. Lox's not operator needs to work on the tagged union, not on raw i1 values.
[error] The code generator treats all values as f64 but then introduces a tagged union !llvm.struct<(i8, i64)> in Part 4's types section. These two approaches are inconsistent. The codegen uses arith::addf (float add) but a dynamically-typed language needs type-tag checking before arithmetic. This is a fundamental architectural problem.
[clarity] The "String Constants" section is conceptually great — explaining that string literals go in the data section, not the heap. But the compile_string method in the generator just returns a placeholder i64 constant. The string constant infrastructure (create_string_constant) is never connected to the code generator.
[suggestion] The compile_print method emits a lox.print custom operation. This requires defining a Lox dialect, but no dialect definition is shown. In Melior, you'd need to either (a) define a custom dialect, or (b) use func.call to an external runtime function. The tutorial should pick one approach and show it.
Part 5: Source Locations
[error] The Location::new(context, filename, line, column) API shown doesn't match Melior 0.27's actual API. In Melior, file locations are typically created via Location::new(context, filename, line, column) or similar, but the exact constructor signature varies by version. Needs verification.
[error] Location::callsite and Location::fused may not exist in Melior 0.27's API. The demonstrate_locations function is pseudocode.
[clarity] The concept is well-explained: "Unlike LLVM where debug info is optional, in MLIR locations are core to the IR." This is a valuable insight that distinguishes MLIR from LLVM.
[suggestion] The "Before/After" IR comparison showing locations is excellent. More before/after IR examples throughout would help the reader see what they're building.
Part 6: Complete Example
[error] The example uses pass_manager.add_pass(pass::convert_scf_to_cf()) etc., but these pass functions aren't imported and may not exist in the Melior API with those exact names. The pass module path is wrong — in Melior, conversion passes are typically in melior::pass::convert::*.
[error] The compile_to_llvm function references lexer::tokenize which is never defined.
[error] The module.as_operation().verify() call returns a Result in Melior, not a bool. The assert!() usage is incorrect.
Part 7: Project Structure
[suggestion] The project structure shows runtime/print.c but no C runtime code is provided. For a dynamically-typed language, the runtime needs at least: print, type-tag checking, and string comparison functions. A minimal runtime would make the tutorial complete.
Cross-Cutting Issues
[error — CRITICAL] The fundamental problem: the codegen treats Lox as if it were statically typed with all-f64 values, but Lox is dynamically typed. The tagged union type (!llvm.struct<(i8, i64)>) is introduced but never used in codegen. Every arithmetic operation should check the type tag, but none do. This means the generated code would produce wrong results for any non-trivial Lox program. The tutorial needs to either:
- (a) Commit to the tagged union approach and show the tag-checking code (verbose but correct), or
- (b) Start with a "numbers only" subset of Lox and explicitly state the simplification, then add dynamic typing later
[clarity] The "Why MLIR for Lox?" section is compelling. The table comparing "What LLVM doesn't know" vs "What MLIR lets you do" is effective.
[clarity] The "Differences from C++ MLIR" table is helpful. The "No TableGen — Melior builds dialects directly in Rust" point is a key selling feature that deserves more emphasis.
[style] The tutorial is structured as a single long document with "Parts" as sections. This makes it hard to navigate. Consider splitting into separate files like the other tutorials.
[suggestion] The "Quick Reference: Lox → MLIR Mapping" table is excellent. More reference tables like this throughout.
[suggestion] No example shows the actual MLIR output of compiling a non-trivial Lox program. Showing the IR at each stage (Lox source → Lox dialect MLIR → Standard dialect MLIR → LLVM dialect MLIR) would make the lowering pipeline tangible.
What's Working
- The conceptual explanations are strong: why MLIR, tagged unions, string constants, source locations
- The AST is well-designed with proper location tracking
- The parser is a faithful port of Crafting Interpreters
- The before/after IR comparison in Part 5 is effective
- The quick reference tables are useful
- The C++ comparison table helps readers coming from the official MLIR tutorial
Priority Fixes
- Fix block ownership — the
current_block: Option<Block<'c>>pattern doesn't work with Melior's move semantics. This is the single biggest correctness issue. - Fix logical operators —
andi/oridon't short-circuit; usescf.if. - Commit to one typing strategy — either use tagged unions throughout or explicitly scope the tutorial to "numbers only" Lox.
- Provide missing definitions — Token, TokenType, lexer, LoxValue methods.
- Verify all Melior API calls against the actual 0.27 API.