Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

MLIR for Lox: Part 1 — From Lox Source to MLIR Dialect

Here’s the problem with compiling a dynamic language to LLVM: LLVM doesn’t know what a Lox value is. It knows about 64-bit floats, integers, and pointers. But a + b in Lox could be number addition, string concatenation, or a runtime type error — and you can’t express that in a single fadd instruction. You need to check the type tag, extract the payload, dispatch to the right operation, and re-box the result. In LLVM, that’s 15 instructions where you wanted one. And if you ever change how Lox values are represented — say, from a (i8, i64) struct to NaN-boxed floats — you rewrite every one of those 15-instruction sequences.

MLIR solves this by letting you define a dialect that represents Lox semantics directly. lox.add means “add two Lox values, with tag checking and dispatch” — a single operation that captures the semantics, not the implementation. Then you lower that dialect through progressively simpler representations until you reach LLVM IR. Each level is independently verifiable, and a change to the value representation only affects the lowering pass, not every operation that uses values. This is how modern language compilers work: Swift and Fortran (via Flang) use MLIR this way in production, and there are research efforts to bring MLIR to Julia as well — define your semantics, then lower them.

Our compiler takes a pragmatic approach: we use MLIR’s built-in arith and func dialects for the numbers-only model, and define custom lox.* operations only when the built-in dialects can’t express what we need (garbage collection, heap allocation, root tracking). This is the same tradeoff real compilers make — use standard dialects when you can, custom dialects when you must.

This guide builds a Lox compiler using Rust and the Melior crate. If you know Crafting Interpreters, this should feel familiar — with MLIR instead of tree-walk interpretation.

Why Rust and Melior? Memory safety without a garbage collector — important when you’re building one. Pattern matching for clean AST traversal. And no TableGen: Melior builds MLIR dialects directly in Rust, which means one fewer language to learn. These are the same reasons Rust compiler projects chose this stack.


Setup

Dependencies

# Cargo.toml
[package]
name = "lox-mlir"
version = "0.1.0"
edition = "2021"

[dependencies]
melior = "0.27"
anyhow = "1"
# anyhow provides flexible error handling — we use Result<T, anyhow::Error>
# throughout the codebase instead of String-based errors

Melior and LLVM Versions

Melior binds to a specific LLVM version through the mlir-sys crate. The version numbers line up directly: mlir-sys 220.x requires LLVM 22, 210.x requires LLVM 21, and so on. You can’t mix versions — mlir-sys 220.x won’t find an LLVM 20 installation no matter what environment variables you set.

meliormlir-sysLLVM requiredLLVM_SYS_*_PREFIX variable
0.27220.xLLVM 22LLVM_SYS_220_PREFIX
0.26210.xLLVM 21LLVM_SYS_210_PREFIX
0.23–0.250.5.xLLVM 20LLVM_SYS_200_PREFIX
0.20–0.220.4.xLLVM 19LLVM_SYS_190_PREFIX
0.190.3.xLLVM 18LLVM_SYS_180_PREFIX

If you can’t install LLVM 22 on your system, pick the melior version that matches the LLVM you have. The MLIR operations and lowering passes are the same across versions — but the Melior Rust API changes significantly between versions. This tutorial’s code examples require melior 0.27.

Install MLIR

macOS

brew install llvm@22

# Add to your shell config (~/.zshrc or ~/.bashrc):
export LLVM_SYS_220_PREFIX=/opt/homebrew/opt/llvm@22
export PATH="/opt/homebrew/opt/llvm@22/bin:$PATH"

Linux (Ubuntu/Debian)

Ubuntu’s default repos ship LLVM 20 (on 24.04/24.10). LLVM 22 is available through the official LLVM apt repository at apt.llvm.org.

Option A: LLVM 22 via apt.llvm.org (recommended, matches melior 0.27)

# Use the official LLVM install script
wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
sudo ./llvm.sh 22 all

# Install MLIR development headers and libraries
sudo apt install llvm-22-dev libmlir-22-dev clang-22 libpolly-22-dev libzstd-dev

# Add to your shell config (~/.bashrc or ~/.zshrc):
export LLVM_SYS_220_PREFIX=/usr/lib/llvm-22
export PATH=/usr/lib/llvm-22/bin:$PATH
export BINDGEN_EXTRA_CLANG_ARGS="-I/usr/lib/llvm-22/lib/clang/22/include"

That BINDGEN_EXTRA_CLANG_ARGS variable tells bindgen where to find Clang’s built-in headers. Without it, you’ll get compilation errors about missing stddef.h or stdarg.h when mlir-sys generates Rust bindings from C headers.

Option B: LLVM 20 from Ubuntu repos — not recommended for this tutorial

Ubuntu’s default repos ship LLVM 20, which works with melior 0.23–0.25. However, this tutorial’s code is written against the melior 0.27 API, and older melior versions have significant API differences (different method signatures, renamed types, moved modules). Downgrading produces dozens of compilation errors.

If you want to use LLVM 20 anyway — perhaps for a different project — here are the packages and variables you’d need:

sudo apt install llvm-20-dev libmlir-20-dev clang-20 libpolly-20-dev libzstd-dev

# In Cargo.toml, change to:
#   melior = "0.25"

export LLVM_SYS_200_PREFIX=/usr/lib/llvm-20
export PATH=/usr/lib/llvm-20/bin:$PATH
export BINDGEN_EXTRA_CLANG_ARGS="-I/usr/lib/llvm-20/lib/clang/20/include"

For this tutorial, use Option A (LLVM 22 via apt.llvm.org) instead.

Verify Your Setup

# Check that mlir-opt is available and reports the right version
mlir-opt --version
# Should print: LLVM 22.x.x (if using LLVM 22)

# Check that the environment variable is set
echo $LLVM_SYS_220_PREFIX
# Should print: /usr/lib/llvm-22  (or /opt/homebrew/opt/llvm@22 on macOS)

If mlir-opt --version reports a different LLVM than what your melior version expects, cargo build will fail with linking errors. Double-check that the right LLVM_SYS_*_PREFIX is set and that mlir-opt comes from the same LLVM installation.

Common Build Errors

could not find native static library mlir — You’re missing libmlir-XX-dev. Install it with sudo apt install libmlir-22-dev (replace 22 with your LLVM version).

fatal error: 'stddef.h' file not found during mlir-sys build — Set BINDGEN_EXTRA_CLANG_ARGS as shown above. This tells bindgen where Clang’s built-in headers live.

linking with cc failed: exit code: 1 with undefined MLIR symbols — Your LLVM_SYS_*_PREFIX doesn’t match the LLVM version your melior expects. Check the version table above and make sure they align.


The Lox AST

Hand-written Rust — not generated. This is exactly what you’d write following Crafting Interpreters. If you’ve read Crafting Interpreters, you can skim this section — the important parts are the LoxValue enum and the Expr/Stmt sum types. The details of each variant don’t matter until we compile them in the MLIR Code Generation section.

Design choice: Why separate BinaryExpr, UnaryExpr, etc. instead of an enum with inline variants? Because each variant needs named fields and location tracking. An enum with inline variants would require Binary(Box<Expr>, BinaryOp, Box<Expr>, Location) — and you’d have to remember which field is which. Named structs are self-documenting.

#![allow(unused)]
fn main() {
// src/ast.rs
/// Source location for error messages
#[derive(Debug, Clone, Copy)]
pub struct Location {
    pub line: usize,
    pub column: usize,
}

/// A Lox value (dynamically typed)
#[derive(Debug, Clone)]
pub enum LoxValue {
    Nil,
    Bool(bool),
    Number(f64),
    String(String), // Heap-allocated — fine for the AST, but a compiler
                    // generating MLIR would use interned strings instead.
                    // We'll deal with this below in "String Constants."
}

// ========================================================================
// Expressions
// ========================================================================

#[derive(Debug, Clone)]
pub enum Expr {
    Binary(BinaryExpr),
    Unary(UnaryExpr),
    Literal(LiteralExpr),
    Grouping(GroupingExpr),
    Variable(VariableExpr),
    Assign(AssignExpr),
    Call(CallExpr),
    Logical(LogicalExpr),
}

impl Expr {
    pub fn location(&self) -> Location {
        match self {
            Expr::Binary(e) => e.location,
            Expr::Unary(e) => e.location,
            Expr::Literal(e) => e.location,
            Expr::Grouping(e) => e.location,
            Expr::Variable(e) => e.location,
            Expr::Assign(e) => e.location,
            Expr::Call(e) => e.location,
            Expr::Logical(e) => e.location,
        }
    }
}

#[derive(Debug, Clone)]
pub struct BinaryExpr {
    pub location: Location,
    pub left: Box<Expr>,
    pub op: BinaryOp,
    pub right: Box<Expr>,
}

#[derive(Debug, Clone, Copy)]
pub enum BinaryOp {
    Add, Sub, Mul, Div,
    Less, LessEqual, Greater, GreaterEqual,
    Equal, NotEqual,
}

#[derive(Debug, Clone)]
pub struct UnaryExpr {
    pub location: Location,
    pub op: UnaryOp,
    pub right: Box<Expr>,
}

#[derive(Debug, Clone, Copy)]
pub enum UnaryOp {
    Negate, Not,
}

#[derive(Debug, Clone)]
pub struct LiteralExpr {
    pub location: Location,
    pub value: LoxValue,
}

#[derive(Debug, Clone)]
pub struct GroupingExpr {
    pub location: Location,
    pub expr: Box<Expr>,
}

#[derive(Debug, Clone)]
pub struct VariableExpr {
    pub location: Location,
    pub name: String,
}

#[derive(Debug, Clone)]
pub struct AssignExpr {
    pub location: Location,
    pub name: String,
    pub value: Box<Expr>,
}

#[derive(Debug, Clone)]
pub struct CallExpr {
    pub location: Location,
    pub callee: Box<Expr>,
    pub arguments: Vec<Expr>,
}

#[derive(Debug, Clone)]
pub struct LogicalExpr {
    pub location: Location,
    pub left: Box<Expr>,
    pub op: LogicalOp,
    pub right: Box<Expr>,
}

#[derive(Debug, Clone, Copy)]
pub enum LogicalOp {
    And, Or,
}

// ========================================================================
// Statements
// ========================================================================

#[derive(Debug, Clone)]
pub enum Stmt {
    Function(FunctionStmt),
    Return(ReturnStmt),
    Var(VarStmt),
    If(IfStmt),
    While(WhileStmt),
    Print(PrintStmt),
    Block(BlockStmt),
    Expression(ExpressionStmt),
}

#[derive(Debug, Clone)]
pub struct FunctionStmt {
    pub location: Location,  // Location of 'fun' keyword
    pub name: String,
    pub name_location: Location,  // Location of the function name
    pub params: Vec<String>,
    pub param_locations: Vec<Location>,  // Location of each parameter name
    pub body: Vec<Stmt>,
}

#[derive(Debug, Clone)]
pub struct ReturnStmt {
    pub location: Location,  // Location of 'return' keyword
    pub value: Option<Expr>,
}

#[derive(Debug, Clone)]
pub struct VarStmt {
    pub location: Location,  // Location of 'var' keyword
    pub name: String,
    pub name_location: Location,  // Location of the variable name
    pub init: Expr,
}

#[derive(Debug, Clone)]
pub struct IfStmt {
    pub location: Location,  // Location of 'if' keyword
    pub condition: Expr,
    pub then_branch: Vec<Stmt>,
    pub else_branch: Vec<Stmt>,
}

#[derive(Debug, Clone)]
pub struct WhileStmt {
    pub location: Location,  // Location of 'while' keyword
    pub condition: Expr,
    pub body: Vec<Stmt>,
}

#[derive(Debug, Clone)]
pub struct PrintStmt {
    pub location: Location,  // Location of 'print' keyword
    pub value: Expr,
}

#[derive(Debug, Clone)]
pub struct BlockStmt {
    pub location: Location,  // Location of opening '{'
    pub statements: Vec<Stmt>,
}

#[derive(Debug, Clone)]
pub struct ExpressionStmt {
    pub location: Location,  // Start of the expression
    pub expr: Expr,
}

// ========================================================================
// Program
// ========================================================================

/// A Lox program is a list of top-level statements
#[derive(Debug, Clone)]
pub struct Program {
    pub statements: Vec<Stmt>,
}

// ========================================================================
// Helper trait for getting locations
// ========================================================================

impl Stmt {
    /// Get the primary location of this statement
    pub fn location(&self) -> Location {
        match self {
            Stmt::Function(f) => f.location,
            Stmt::Return(r) => r.location,
            Stmt::Var(v) => v.location,
            Stmt::If(i) => i.location,
            Stmt::While(w) => w.location,
            Stmt::Print(p) => p.location,
            Stmt::Block(b) => b.location,
            Stmt::Expression(e) => e.location,
        }
    }
}
}

The Parser

The parser is a standard recursive-descent parser with precedence climbing. It follows Crafting Interpreters closely — if you’ve read that book, this will feel familiar. The scanner (lexer) is also straight out of Crafting Interpreters Chapter 4.

What matters for this tutorial is the output: the AST we defined above. The parser is how we get there, but it’s not what we’re here to study. Here’s the shape of the thing.

Tokens

The scanner turns source text into a Vec<Token>. Each token carries its type, the raw lexeme, an optional literal value, and a source location:

#![allow(unused)]
fn main() {
// src/lexer.rs
use anyhow::Result;
use crate::ast::Location;

#[derive(Debug, Clone)]
pub struct Token {
    pub token_type: TokenType,
    pub lexeme: String,
    pub literal: Option<LexValue>,
    pub location: Location,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TokenType {
    // Single-character tokens
    LeftParen, RightParen, LeftBrace, RightBrace,
    Comma, Dot, Minus, Plus, Semicolon, Slash, Star,
    // One or two character tokens
    Bang, BangEqual, Equal, EqualEqual, Greater, GreaterEqual,
    Less, LessEqual,
    // Literals
    Identifier, String, Number,
    // Keywords
    And, Class, Else, False, True, Fun, For, If, Nil, Or,
    Print, Return, Super, This, Var, While,
    // Special
    Eof,
}

#[derive(Debug, Clone)]
pub enum LexValue {
    Boolean(bool),
    F64(f64),
    Str(String),
}
}

The scanner itself is ~300 lines of character-by-character processing. It’s not the focus of this tutorial — see Crafting Interpreters Chapter 4 for a complete implementation. The important thing is the output: a Vec<Token> that the parser consumes.

The Parser Structure

The parser walks the token stream and builds the AST. Two things to understand about its structure:

Statements are dispatched by keyword. fun → function declaration, var → variable declaration, print → print statement, if → if statement, and so on. Each statement parser consumes tokens until it has a complete Stmt node.

Expressions use precedence climbing. Each binary operator level has its own method — equality() handles ==/!=, comparison() handles >/>=/</<=, term() handles +/-, factor() handles *//. Each method parses its level, then calls the next-tighter level. The result is a left-associative tree: 1 + 2 * 3 parses as Binary(1, Add, Binary(2, Mul, 3)).

The precedence ladder, from loosest to tightest:

assignment  →  or  →  and  →  equality  →  comparison  →  term  →  factor  →  unary  →  primary

Here’s the skeleton of the parser — the parts that matter for understanding what the code generator receives:

#![allow(unused)]
fn main() {
// src/parser.rs
use crate::ast::*;
use crate::lexer::{Token, TokenType, LexValue};

#[derive(Debug)]
pub struct ParseError {
    pub message: String,
    pub location: Location,
}

pub struct Parser {
    tokens: Vec<Token>,
    current: usize,
}

impl Parser {
    pub fn new(tokens: Vec<Token>) -> Self {
        Self { tokens, current: 0 }
    }

    pub fn parse(&mut self) -> Result<Program, ParseError> {
        let mut statements = Vec::new();
        while !self.is_at_end() {
            statements.push(self.declaration()?);
        }
        Ok(Program { statements })
    }

    fn declaration(&mut self) -> Result<Stmt, ParseError> {
        if self.match_token(TokenType::Fun) {
            return self.function_declaration();
        }
        if self.match_token(TokenType::Var) {
            return self.var_declaration();
        }
        self.statement()
    }

    // ... statement parsers (if, while, print, return, block, expression)
    // Each one consumes tokens and produces the corresponding Stmt variant.
    // The pattern is the same: match the keyword, parse the sub-expressions,
    // build the Stmt node.

    fn expression(&mut self) -> Result<Expr, ParseError> {
        self.assignment()
    }

    // Each binary operator level follows the same pattern:
    //   1. Parse the next-tighter level
    //   2. While the current token is one of our operators, consume it,
    //      parse the right operand, and build a BinaryExpr node
    //
    // For example, term() handles + and -:
    fn term(&mut self) -> Result<Expr, ParseError> {
        let mut expr = self.factor()?;  // next-tighter level
        while self.match_any(&[TokenType::Plus, TokenType::Minus]) {
            let location = self.previous().location;
            let op = match self.previous().token_type {
                TokenType::Plus => BinaryOp::Add,
                TokenType::Minus => BinaryOp::Sub,
                _ => unreachable!(),
            };
            let right = self.factor()?;
            expr = Expr::Binary(BinaryExpr {
                location,
                left: Box::new(expr),
                op,
                right: Box::new(right),
            });
        }
        Ok(expr)
    }

    // ... equality, comparison, factor, unary, primary follow the same pattern
    // ... assignment and logical operators (or, and) handle right-associativity
    //     and short-circuit evaluation respectively

    // Helper methods: match_token, match_any, check, advance, peek, previous, consume
    // Standard recursive-descent utilities — see Crafting Interpreters Chapter 6
}
}

The full parser is about 500 lines of straightforward recursive-descent code. It’s not reproduced here in full because it’s not the focus — the code generator doesn’t care how the AST was built, only what shape it is. If you’re building along, follow Crafting Interpreters Chapter 6 and adapt the AST nodes to match the ones we defined above. The key difference from the book’s Java implementation: our AST carries Location fields on every node, because MLIR’s error diagnostics need source positions.


MLIR Code Generation with Melior

The core of the compiler — walks the AST and emits MLIR.

Scope note: In this part, we compile a subset of Lox that only supports numbers and arithmetic. This isn’t a limitation of MLIR — it’s a pedagogical choice. Dynamic typing with tagged unions adds 3-4x more code for every operation (check the tag, dispatch, unbox, compute, re-box). The “Dynamic Typing with Tagged Unions” subsection below explains the representation conceptually. Part 7 (Classes and Instances) shows the working implementation — every arith.addf becomes tag-check-extract-compute-repack. For now, every value is an f64.

What this means in practice:

  • All values are f64true is 1.0, false is 0.0, nil is 0.0
  • Arithmetic operations use arith.addf, arith.mulf, etc.
  • Comparisons use arith.cmpf
  • Logical operators use scf.if for short-circuit evaluation
  • lox.print calls a runtime function that prints a float

This is a simplification. A production Lox compiler would use tagged unions. But starting with “everything is a float” lets us focus on MLIR concepts without drowning in type-tag boilerplate.

How Melior’s Block Ownership Works

Before we write any codegen, there’s one thing you need to understand about Melior’s ownership model — it’s different from what you might expect.

In Melior, Block<'c> is an owned value. When you call region.append_block(block), the block is consumed — moved into the region. After that, you can’t use the Block anymore. Instead, append_block returns a BlockRef<'c, '_> — a borrowed reference to the block inside the region.

This means you can’t store a Block in a struct field, compile some operations into it, and then append it to a region later. The block gets moved when you append it, and the struct field is left empty.

Block lifecycle:

  Create              Append ops              Move into region
  ──────              ──────────              ─────────────────
  let block =         block.append_operation  region.append_block(block)
    Block::new(...);    (some_op);           // block is consumed — gone
                       block.append_operation  // you get a BlockRef back,
                         (another_op);        // but usually don't need it

  You own it.         Still own it.           Region owns it now.
  Add anything         No restrictions.        Can't modify the Block
  you want.                                    directly anymore.

  ❌ Can't do this:
  self.current_block = Some(block);   // store it
  region.append_block(block);          // move it — DOUBLE USE

  ✅ Instead: pass blocks as parameters.
  Each method takes &Block, emits operations,
  and returns. No struct fields, no double-moves.

The pattern that works: Create a Block, append operations to it (it’s not in any region yet, so you own it freely), then move it into a Region. After that, the Block is gone — you get a BlockRef back, but you usually don’t need it because the region is complete.

#![allow(unused)]
fn main() {
// ✅ Works: build the block, then move it into a region
let block = Block::new(&[(float_type, location)]);
block.append_operation(some_op);
block.append_operation(another_op);

let region = Region::new();
region.append_block(block);  // block is consumed here
}

For our code generator, this means we pass the current block as a parameter to compilation methods, rather than storing it in a struct. When a statement needs to create a new region (like scf.if), it builds the inner blocks inline, moves them into a region, and then creates the operation — all without touching the outer block.

We also don’t store the variable map in the struct — for a different reason. See the next section for why.

Why Separate Context from State?

You might notice something unusual about our CodeGenerator: it doesn’t have a variables field. Most compiler tutorials store the variable map in the code generator struct. We pass it as a separate &mut HashMap parameter instead.

Here’s why: in Melior 0.27, Value has two lifetime parameters — Value<'c, 'a>. The 'c is the context lifetime, and 'a is tied to the operation that produced the value. When you store Value<'c, 'a> in a struct and then call a method that takes &mut self, the returned Value keeps &mut self borrowed — and then the same method can’t access self.context or do anything else with self without conflicting with that borrow.

This isn’t a workaround. It’s a real consequence of how Rust’s borrow checker works with MLIR’s ownership model. The fix is to recognize that our code generator has two kinds of state:

Immutable context — the MLIR Context and Module. These never change during compilation.

Mutable compilation state — the variable map, which grows as we declare variables.

By putting immutable context in the struct (accessed via &self) and passing mutable state as a parameter (accessed via &mut HashMap), we make the borrow structure explicit. compile_expression takes &self because it only needs self.context. The variables HashMap gets passed separately because it needs mutable access.

This is actually cleaner than the alternative. It forces each method to declare exactly what it needs — and the borrow checker enforces that declaration.

Basic Code Generator

Version note: The code examples below use Melior 0.27’s API. Melior’s API changes frequently between minor versions — helper function availability and signatures differ between releases. In melior 0.27.0, the melior::dialect::arith module only exports constant, cmpf, cmpi, and select as Rust helper functions. There are no addf, subf, mulf, divf, or negf helpers in that module. The code below defines local helper functions that use OperationBuilder to construct arithmetic operations — the same approach melior itself uses for the operations it does export (see the source of arith::constant and arith::cmpf). The concepts (which operations to use, how regions work) are stable across versions — only the Rust API surface changes. If something doesn’t compile, check Melior’s docs for the exact API (replace latest with your version in the URL if needed).

ODS alternative: melior::dialect::ods::arith auto-generates helpers from MLIR’s TableGen definitions, including addi, andi, ori, addf, subf, mulf, divf, negf, and many more. These work, but the ODS module is marked experimental and its generated signatures may change between releases. If you prefer the ODS helpers, replace the local helpers with calls like ods::arith::addf(lhs, rhs, location) and add use melior::dialect::ods; to your imports. Note that andi (used in Part 10) is only available via this ODS path — dialect::arith doesn’t export it.

Version-resilient alternative: If you’re working with a different Melior version and the helper functions don’t compile, you can use OperationBuilder instead. This constructs the MLIR operation directly by name, bypassing the version-specific Rust wrappers entirely. We already use this approach for scf.if, scf.while, scf.condition, and scf.yield (which don’t have stable helpers across versions). The same pattern works for func.func and func.call:

#![allow(unused)]
fn main() {
// Instead of func::func(context, name_attr, type_attr, region, attrs, location):
let func_op = OperationBuilder::new("func.func", location)
    .add_attributes(&[
        (
            Identifier::new(context, "sym_name"),
            StringAttribute::new(context, &func.name).into(),
        ),
        (
            Identifier::new(context, "function_type"),
            TypeAttribute::new(function_type.into()).into(),
        ),
    ])
    .add_regions([region])
    .build()
    .expect("valid func.func");
module.body().append_operation(func_op);

// Instead of func::call(context, symbol_ref, args, result_types, location):
let call_op = OperationBuilder::new("func.call", location)
    .add_attributes(&[(
        Identifier::new(context, "callee"),
        FlatSymbolRefAttribute::new(context, "lox_print").into(),
    )])
    .add_operands(&args)
    .add_results(&result_types)
    .build()
    .expect("valid func.call");
block.append_operation(call_op);
}

The OperationBuilder approach is more verbose but works across Melior versions because it targets the stable MLIR operation names ("func.func", "func.call") rather than the version-specific Rust function signatures. Later parts of this tutorial use the helper functions (func::func, func::call) for readability, but if you hit a compilation error, substitute the OperationBuilder version above.

#![allow(unused)]
fn main() {
// src/codegen/mod.rs
mod generator;

pub use generator::generate_module;
}
#![allow(unused)]
fn main() {
// src/codegen/generator.rs
use crate::ast::*;
use melior::{
    Context,
    dialect::{arith, func, DialectRegistry},
    ir::{
        attribute::{StringAttribute, TypeAttribute, FloatAttribute, FlatSymbolRefAttribute},
        r#type::FunctionType,
        Location, Module, Region, Block, Type, Value, ValueLike,
        operation::{OperationBuilder, OperationLike},
        block::BlockLike, RegionLike,
    },
    utility::register_all_dialects,
};
use std::collections::HashMap;

// ========================================================================
// Arith helper functions
// ========================================================================
// Melior 0.27's arith module only provides helpers for `constant`, `cmpf`,
// `cmpi`, and `select`. The arithmetic operations (addf, subf, mulf, divf,
// negf) must be built with OperationBuilder. These helpers match the
// signature style of the built-in helpers for consistency.

/// Create an `arith.addf` operation.
fn arith_addf<'c>(
    lhs: Value<'c, 'c>,
    rhs: Value<'c, 'c>,
    location: Location<'c>,
) -> melior::ir::Operation<'c> {
    OperationBuilder::new("arith.addf", location)
        .add_operands(&[lhs, rhs])
        .add_results(&[lhs.r#type()])
        .build()
        .expect("valid arith.addf")
}

/// Create an `arith.subf` operation.
fn arith_subf<'c>(
    lhs: Value<'c, 'c>,
    rhs: Value<'c, 'c>,
    location: Location<'c>,
) -> melior::ir::Operation<'c> {
    OperationBuilder::new("arith.subf", location)
        .add_operands(&[lhs, rhs])
        .add_results(&[lhs.r#type()])
        .build()
        .expect("valid arith.subf")
}

/// Create an `arith.mulf` operation.
fn arith_mulf<'c>(
    lhs: Value<'c, 'c>,
    rhs: Value<'c, 'c>,
    location: Location<'c>,
) -> melior::ir::Operation<'c> {
    OperationBuilder::new("arith.mulf", location)
        .add_operands(&[lhs, rhs])
        .add_results(&[lhs.r#type()])
        .build()
        .expect("valid arith.mulf")
}

/// Create an `arith.divf` operation.
fn arith_divf<'c>(
    lhs: Value<'c, 'c>,
    rhs: Value<'c, 'c>,
    location: Location<'c>,
) -> melior::ir::Operation<'c> {
    OperationBuilder::new("arith.divf", location)
        .add_operands(&[lhs, rhs])
        .add_results(&[lhs.r#type()])
        .build()
        .expect("valid arith.divf")
}

/// Create an `arith.negf` operation.
fn arith_negf<'c>(
    operand: Value<'c, 'c>,
    location: Location<'c>,
) -> melior::ir::Operation<'c> {
    OperationBuilder::new("arith.negf", location)
        .add_operands(&[operand])
        .add_results(&[operand.r#type()])
        .build()
        .expect("valid arith.negf")
}

/// State for code generation.
///
/// Note: we do NOT store the variables HashMap or the "current block"
/// as struct fields. Both are passed as parameters instead.
///
/// In Melior 0.27+, `Value` has two lifetime parameters (`'c` and `'a`).
/// Storing `Value<'c, 'a>` in a struct field and then calling methods that
/// take `&mut self` creates a borrow conflict — the returned `Value` keeps
/// `&mut self` alive, and then the same method can't access `self.context`
/// or other struct fields without conflicting with that borrow.
///
/// The fix: separate immutable context from mutable compilation state.
/// `compile_expression` only needs `&self` (for `self.context`). The
/// variables HashMap and current block are passed as separate parameters.
/// This is actually a cleaner design — it makes the borrow structure
/// explicit rather than hiding it in the struct.
pub struct CodeGenerator<'c> {
    context: &'c Context,
    module: Module<'c>,
}

impl<'c> CodeGenerator<'c> {
    pub fn new(context: &'c Context) -> Self {
        let location = Location::unknown(context);
        let module = Module::new(location);
        Self {
            context,
            module,
        }
    }

    // ========================================================================
    // Entry point
    // ========================================================================

    pub fn generate(self, program: &Program) -> Module<'c> {
        // Each function gets its own variable scope.
        // We don't store variables in the struct — see the note above.
        //
        // NOTE: We only compile top-level functions here. Top-level
        // statements (var, print, if, while, expression statements) are
        // handled by wrapping them in a synthetic @main function — see
        // the "Top-Level Statements" section below.
        for stmt in &program.statements {
            if let Stmt::Function(f) = stmt {
                self.compile_function(f);
            }
        }
        self.module
    }

    // A complete implementation would add a method like
    // `generate_main(program, ...)` that wraps top-level non-function
    // statements into a synthetic @main function. The walkthrough
    // examples below show the @main output you'd get. The pattern is
    // straightforward: create a func.func @main, then compile each
    // non-function statement into its body. We leave the implementation
    // as an exercise to keep the code generator focused on the core
    // compilation logic.

    // ========================================================================
    // Statement compilation
    // ========================================================================

    fn compile_statement(&self, stmt: &Stmt, block: &Block<'c>, variables: &mut HashMap<String, Value<'c, 'c>>) {
        match stmt {
            Stmt::Function(f) => self.compile_function(f),
            Stmt::Return(r) => self.compile_return(r, block, variables),
            Stmt::Var(v) => self.compile_var(v, block, variables),
            Stmt::If(i) => self.compile_if(i, block, variables),
            Stmt::While(w) => self.compile_while(w, block, variables),
            Stmt::Print(p) => self.compile_print(p, block, variables),
            Stmt::Block(b) => self.compile_block_stmt(b, block, variables),
            Stmt::Expression(e) => { self.compile_expression(&e.expr, block, variables); }
        }
    }

    fn compile_function(&self, func: &FunctionStmt) {
        let location = Location::unknown(self.context);
        let float_type = Type::float64(self.context);

        // Create parameter types (all f64 for now — dynamic typing)
        let param_types: Vec<Type> = func.params.iter().map(|_| float_type).collect();
        let return_type = float_type;

        // Create the function type
        let function_type = FunctionType::new(self.context, &param_types, &[return_type]);

        // Create the function body block (not yet in a region)
        let block = Block::new(
            &param_types.iter().map(|&t| (t, location)).collect::<Vec<_>>()
        );

        // Each function gets its own variable scope.
        // We don't store variables in the struct — see the note above.
        let mut variables: HashMap<String, Value<'c, 'c>> = HashMap::new();

        // Store parameters as variables
        for (i, param_name) in func.params.iter().enumerate() {
            let arg: Value = block.argument(i).unwrap().into();
            variables.insert(param_name.clone(), arg);
        }

        // Compile the function body into the block
        for stmt in &func.body {
            self.compile_statement(stmt, &block, &mut variables);
        }

        // Add implicit return nil ONLY if the last statement wasn't a return.
        // Without this guard, a function like `fun f() { return 1; }` would
        // have two return operations in the same block — MLIR rejects operations
        // after a terminator.
        let needs_implicit_return = func.body.last().map_or(true, |stmt| {
            !matches!(stmt, Stmt::Return(_))
        });
        if needs_implicit_return {
            let nil_value = self.compile_nil(&block);
            block.append_operation(func::r#return(&[nil_value], location));
        }

        // Move the block into a region
        let region = Region::new();
        region.append_block(block);

        // Add function to module
        self.module.body().append_operation(func::func(
            self.context,
            StringAttribute::new(self.context, &func.name),
            TypeAttribute::new(function_type.into()),
            region,
            &[],
            location,
        ));

        // Variables are dropped automatically — no need to clear
    }

    fn compile_return(&self, ret: &ReturnStmt, block: &Block<'c>, variables: &mut HashMap<String, Value<'c, 'c>>) {
        let location = Location::unknown(self.context);
        let value = match &ret.value {
            Some(expr) => self.compile_expression(expr, block, variables),
            None => self.compile_nil(block),
        };

        block.append_operation(func::r#return(&[value], location));
    }

    fn compile_var(&self, var: &VarStmt, block: &Block<'c>, variables: &mut HashMap<String, Value<'c, 'c>>) {
        let value = self.compile_expression(&var.init, block, variables);
        variables.insert(var.name.clone(), value);
    }

    fn compile_if(&self, if_stmt: &IfStmt, current_block: &Block<'c>, variables: &mut HashMap<String, Value<'c, 'c>>) {
        let location = Location::unknown(self.context);

        // Compile the condition into the current (outer) block
        let condition = self.compile_expression(&if_stmt.condition, current_block, variables);

        // Build then region: create a block, compile into it, move into region
        //
        // CAUTION: We pass the same `variables` HashMap to both branches.
        // Variable declarations in one branch leak into the other, but the
        // SSA value lives in the wrong block — using it produces invalid MLIR.
        // A production compiler would track scope depth or use scf.if with
        // results. For this tutorial, don't declare variables inside if-branches
        // and use them outside.
        let then_block = Block::new(&[]);
        for stmt in &if_stmt.then_branch {
            self.compile_statement(stmt, &then_block, variables);
        }
        let then_region = Region::new();
        then_region.append_block(then_block);

        // Build else region (even if empty — scf.if requires both regions).
        // No scf.yield needed here: MLIR's scf.if without result types
        // implicitly yields at the end of each region. (scf.if *with* result
        // types needs explicit scf.yield — see the logical operator
        // compilation below for an example.)
        let else_block = Block::new(&[]);
        for stmt in &if_stmt.else_branch {
            self.compile_statement(stmt, &else_block, variables);
        }
        let else_region = Region::new();
        else_region.append_block(else_block);

        // Create scf.if and append to the outer block
        let if_op = OperationBuilder::new("scf.if", location)
            .add_operands(&[condition])
            .add_regions([then_region, else_region])
            .build()
            .expect("valid scf.if operation");
        current_block.append_operation(if_op);
    }

    fn compile_while(&self, while_stmt: &WhileStmt, current_block: &Block<'c>, variables: &mut HashMap<String, Value<'c, 'c>>) {
        let location = Location::unknown(self.context);

        // For scf.while, we need:
        // 1. A "before" region that computes the condition and calls scf.condition
        // 2. An "after" region that is the loop body and calls scf.yield
        //
        // CAUTION: Same variable-scoping caveat as compile_if — we pass the same
        // `variables` HashMap to both the before and after regions. This causes
        // two problems:
        //
        // 1. Variable declarations inside the loop body leak into the enclosing
        //    scope, but the SSA value lives in the wrong block. Don't declare
        //    variables inside while-loops and use them outside.
        //
        // 2. Re-assigning a variable inside the loop (e.g., `i = i + 1`) updates
        //    the HashMap with a Value from the after region. On the next iteration,
        //    the before region tries to use that Value — but the after region
        //    doesn't dominate the before region. This produces invalid MLIR.
        //    The example in the "A Third Example: while and Back-Edges" section
        //    shows the fix: thread loop-modified variables as scf.while operands
        //    instead of tracking them through the HashMap.
        //
        // A production compiler would thread loop-carried values through
        // scf.while's operand list and scf.yield, which keeps the SSA values
        // in the correct dominance relationship.

        // Build the before region (condition check)
        let before_block = Block::new(&[]);
        let condition = self.compile_expression(&while_stmt.condition, &before_block, variables);

        // scf.condition takes the condition value and any loop-carried values
        let condition_op = OperationBuilder::new("scf.condition", location)
            .add_operands(&[condition])
            .build()
            .expect("valid scf.condition");
        before_block.append_operation(condition_op);

        let before_region = Region::new();
        before_region.append_block(before_block);

        // Build the after region (loop body)
        let after_block = Block::new(&[]);
        for stmt in &while_stmt.body {
            self.compile_statement(stmt, &after_block, variables);
        }
        // scf.while's after region must end with scf.yield
        let yield_op = OperationBuilder::new("scf.yield", location)
            .build()
            .expect("valid scf.yield");
        after_block.append_operation(yield_op);

        let after_region = Region::new();
        after_region.append_block(after_block);

        // Create scf.while and append to the outer block
        let while_op = OperationBuilder::new("scf.while", location)
            .add_regions([before_region, after_region])
            .build()
            .expect("valid scf.while");
        current_block.append_operation(while_op);
    }

    fn compile_print(&self, print: &PrintStmt, block: &Block<'c>, variables: &mut HashMap<String, Value<'c, 'c>>) {
        let location = Location::unknown(self.context);
        let value = self.compile_expression(&print.value, block, variables);

        // Call a runtime print function: func.call @lox_print(value)
        let print_op = func::call(
            self.context,
            FlatSymbolRefAttribute::new(self.context, "lox_print"),
            &[value],
            &[],
            location,
        );
        block.append_operation(print_op);
    }

    fn compile_block_stmt(&self, block_stmt: &BlockStmt, current_block: &Block<'c>, variables: &mut HashMap<String, Value<'c, 'c>>) {
        for stmt in &block_stmt.statements {
            self.compile_statement(stmt, current_block, variables);
        }
    }

    // ========================================================================
    // Expression compilation
    // ========================================================================

    fn compile_expression(&self, expr: &Expr, block: &Block<'c>, variables: &mut HashMap<String, Value<'c, 'c>>) -> Value<'c, 'c> {
        match expr {
            Expr::Binary(b) => self.compile_binary(b, block, variables),
            Expr::Unary(u) => self.compile_unary(u, block, variables),
            Expr::Literal(l) => self.compile_literal(l, block, variables),
            Expr::Grouping(g) => self.compile_expression(&g.expr, block, variables),
            Expr::Variable(v) => self.compile_variable(v, variables),
            Expr::Assign(a) => self.compile_assign(a, block, variables),
            Expr::Call(c) => self.compile_call(c, block, variables),
            Expr::Logical(l) => self.compile_logical(l, block, variables),
        }
    }

    fn compile_binary(&self, binary: &BinaryExpr, block: &Block<'c>, variables: &mut HashMap<String, Value<'c, 'c>>) -> Value<'c, 'c> {
        let location = Location::unknown(self.context);

        let lhs = self.compile_expression(&binary.left, block, variables);
        let rhs = self.compile_expression(&binary.right, block, variables);

        let op = match binary.op {
            BinaryOp::Add => arith_addf(lhs, rhs, location),
            BinaryOp::Sub => arith_subf(lhs, rhs, location),
            BinaryOp::Mul => arith_mulf(lhs, rhs, location),
            BinaryOp::Div => arith_divf(lhs, rhs, location),
            BinaryOp::Less => arith::cmpf(self.context, arith::CmpfPredicate::Olt, lhs, rhs, location),
            BinaryOp::LessEqual => arith::cmpf(self.context, arith::CmpfPredicate::Ole, lhs, rhs, location),
            BinaryOp::Greater => arith::cmpf(self.context, arith::CmpfPredicate::Ogt, lhs, rhs, location),
            BinaryOp::GreaterEqual => arith::cmpf(self.context, arith::CmpfPredicate::Oge, lhs, rhs, location),
            BinaryOp::Equal => arith::cmpf(self.context, arith::CmpfPredicate::Oeq, lhs, rhs, location),
            BinaryOp::NotEqual => arith::cmpf(self.context, arith::CmpfPredicate::Une, lhs, rhs, location),
        };

        let result_ref = block.append_operation(op);
        result_ref.result(0).unwrap().into()
    }

    fn compile_unary(&self, unary: &UnaryExpr, block: &Block<'c>, variables: &mut HashMap<String, Value<'c, 'c>>) -> Value<'c, 'c> {
        let location = Location::unknown(self.context);

        match unary.op {
            UnaryOp::Negate => {
                let operand = self.compile_expression(&unary.right, block, variables);
                let op = arith_negf(operand, location);
                let result = block.append_operation(op);
                result.result(0).unwrap().into()
            }
            UnaryOp::Not => {
                // In our "numbers only" model, `not x` is:
                //   if x == 0.0 { 1.0 } else { 0.0 }
                // We use scf.if since there's no direct float-to-boolean negation.
                //
                // The generated MLIR looks like this:
                //   %zero = arith.constant 0.0 : f64
                //   %is_zero = arith.cmpf oeq, %x, %zero : f64
                //   %result = scf.if %is_zero -> (f64) {
                //     %one = arith.constant 1.0 : f64
                //     scf.yield %one : f64
                //   } else {
                //     %zero2 = arith.constant 0.0 : f64
                //     scf.yield %zero2 : f64
                //   }
                //
                // `not 0.0` → 1.0 (falsy becomes truthy)
                // `not 5.0` → 0.0 (truthy becomes falsy)

                let operand = self.compile_expression(&unary.right, block, variables);

                // Compare operand to 0.0
                let zero_val = block.append_operation(arith::constant(
                    self.context,
                    FloatAttribute::new(self.context, Type::float64(self.context), 0.0).into(),
                    location,
                ));
                let is_zero = block.append_operation(arith::cmpf(
                    self.context,
                    arith::CmpfPredicate::Oeq,
                    operand,
                    zero_val.result(0).unwrap().into(),
                    location,
                ));

                // Build then region (x == 0.0 → result = 1.0)
                let then_region = {
                    let then_block = Block::new(&[]);
                    let one_val = then_block.append_operation(arith::constant(
                        self.context,
                        FloatAttribute::new(self.context, Type::float64(self.context), 1.0).into(),
                        location,
                    ));
                    then_block.append_operation(
                        OperationBuilder::new("scf.yield", location)
                            .add_operands(&[one_val.result(0).unwrap().into()])
                            .build()
                            .expect("valid scf.yield")
                    );
                    let r = Region::new();
                    r.append_block(then_block);
                    r
                };

                // Build else region (x != 0.0 → result = 0.0)
                let else_region = {
                    let else_block = Block::new(&[]);
                    let zero_val = else_block.append_operation(arith::constant(
                        self.context,
                        FloatAttribute::new(self.context, Type::float64(self.context), 0.0).into(),
                        location,
                    ));
                    else_block.append_operation(
                        OperationBuilder::new("scf.yield", location)
                            .add_operands(&[zero_val.result(0).unwrap().into()])
                            .build()
                            .expect("valid scf.yield")
                    );
                    let r = Region::new();
                    r.append_block(else_block);
                    r
                };

                // scf.if(is_zero) { then } { else } -> f64
                let if_op = OperationBuilder::new("scf.if", location)
                    .add_operands(&[is_zero.result(0).unwrap().into()])
                    .add_results(&[Type::float64(self.context)])
                    .add_regions([then_region, else_region])
                    .build()
                    .expect("valid scf.if");
                let result = block.append_operation(if_op);
                result.result(0).unwrap().into()
            }
        }
    }

    fn compile_literal(&self, literal: &LiteralExpr, block: &Block<'c>, variables: &mut HashMap<String, Value<'c, 'c>>) -> Value<'c, 'c> {
        let location = Location::unknown(self.context);

        match &literal.value {
            LoxValue::Nil => self.compile_nil(block),
            LoxValue::Bool(b) => {
                // In our "numbers only" model, booleans are 1.0 and 0.0
                let op = arith::constant(
                    self.context,
                    FloatAttribute::new(self.context, Type::float64(self.context), if *b { 1.0 } else { 0.0 }).into(),
                    location,
                );
                let result = block.append_operation(op);
                result.result(0).unwrap().into()
            }
            LoxValue::Number(n) => {
                let op = arith::constant(
                    self.context,
                    FloatAttribute::new(self.context, Type::float64(self.context), *n).into(),
                    location,
                );
                let result = block.append_operation(op);
                result.result(0).unwrap().into()
            }
            LoxValue::String(_) => {
                // String constants can't be represented as a single f64 — they
                // need the tagged union layout (tag + data). Our values are still
                // f64, so we compile strings to nil for now. This means `print "hello"`
                // prints `0` in the numbers-only model, which is wrong but honest —
                // the tagged union model in Part 7 fixes this.
                //
                // The `create_string_constant` function below builds the
                // `llvm.mlir.global constant` operation, but it only makes
                // sense once we switch to the tagged union representation.
                self.compile_nil(block)
            }
        }
    }

    fn compile_nil(&self, block: &Block<'c>) -> Value<'c, 'c> {
        // In our "numbers only" subset, nil is represented as 0.0 f64.
        // This is consistent with the simplified typing model.
        // The tagged union model in Part 7 uses (TAG_NIL, 0) instead,
        // which distinguishes nil from false and 0.0.
        let location = Location::unknown(self.context);
        let op = arith::constant(
            self.context,
            FloatAttribute::new(self.context, Type::float64(self.context), 0.0).into(),
            location,
        );
        let result = block.append_operation(op);
        result.result(0).unwrap().into()
    }

    fn compile_variable(&self, var: &VariableExpr, variables: &HashMap<String, Value<'c, 'c>>) -> Value<'c, 'c> {
        // Look up the variable in the current scope
        variables.get(&var.name)
            .copied()
            .unwrap_or_else(|| panic!("Undefined variable: {}", var.name))
    }

    fn compile_assign(&self, assign: &AssignExpr, block: &Block<'c>, variables: &mut HashMap<String, Value<'c, 'c>>) -> Value<'c, 'c> {
        let value = self.compile_expression(&assign.value, block, variables);
        variables.insert(assign.name.clone(), value);
        value
    }

    fn compile_call(&self, call: &CallExpr, block: &Block<'c>, variables: &mut HashMap<String, Value<'c, 'c>>) -> Value<'c, 'c> {
        let location = Location::unknown(self.context);

        // Compile arguments
        let args: Vec<Value> = call.arguments.iter()
            .map(|arg| self.compile_expression(arg, block, variables))
            .collect();

        // For now, assume callee is a direct function call
        if let Expr::Variable(var) = call.callee.as_ref() {
            let call_op = func::call(
                self.context,
                FlatSymbolRefAttribute::new(self.context, &var.name),
                &args,
                &[Type::float64(self.context)],
                location,
            );

            let result_ref = block.append_operation(call_op);
            return result_ref.result(0).unwrap().into();
        }

        // Indirect call (first-class function) - not implemented
        unimplemented!("Indirect function calls not yet supported")
    }
}

Logical Operators: Why scf.if Is Mandatory

Lox’s and and or are different from most languages. They don’t return a boolean — they return one of their operands:

  • 5 and 3 returns 3 (left is truthy, so return the right)
  • nil and 3 returns nil (left is falsy, so return the left)
  • 5 or 3 returns 5 (left is truthy, so return the left)
  • nil or 3 returns 3 (left is falsy, so return the right)

We can’t use bitwise operations. arith.andi and arith.ori operate on bits, not control flow. They’d evaluate both sides unconditionally — no short-circuiting. In Lox, nil and crash() must never call crash() because nil is falsy.

The result comes from a branch, not a computation. a and b returns either a or b — the actual value that flowed through the branch. That’s what scf.if with results gives us: each branch yields a value, and the scf.if produces whichever one was taken.

The translation pattern:

  • a and bscf.if a { yield b } else { yield a }
  • a or bscf.if a { yield a } else { yield b }

In the numbers-only model, yielding a when it’s falsy is the same as yielding 0.0 — but we yield a directly because it’s semantically correct and avoids creating a redundant constant. (The tagged union model in Part 7 uses the same pattern with (tag, payload) pairs instead of bare f64 values.)

#![allow(unused)]
fn main() {
    fn compile_logical(&self, logical: &LogicalExpr, block: &Block<'c>, variables: &mut HashMap<String, Value<'c, 'c>>) -> Value<'c, 'c> {
        let location = Location::unknown(self.context);

        let left = self.compile_expression(&logical.left, block, variables);

        // Convert left to i1 for the condition (nonzero = true)
        let zero_val = block.append_operation(arith::constant(
            self.context,
            FloatAttribute::new(self.context, Type::float64(self.context), 0.0).into(),
            location,
        ));
        let cond = block.append_operation(arith::cmpf(
            self.context,
            arith::CmpfPredicate::One,  // ordered not-equal (nonzero = true)
            left,
            zero_val.result(0).unwrap().into(),
            location,
        ));

        match logical.op {
            LogicalOp::And => {
                // `a and b` → if a { yield b } else { yield a }

                // Then region: evaluate right (left is truthy)
                let then_region = {
                    let then_block = Block::new(&[]);
                    let right = self.compile_expression(&logical.right, &then_block, variables);
                    then_block.append_operation(
                        OperationBuilder::new("scf.yield", location)
                            .add_operands(&[right])
                            .build()
                            .expect("valid scf.yield")
                    );
                    let r = Region::new();
                    r.append_block(then_block);
                    r
                };

                // Else region: return left (it's falsy, so pass it through)
                let else_region = {
                    let else_block = Block::new(&[]);
                    // `left` is defined in the enclosing scope, so it's
                    // visible inside the scf.if region. We yield it directly
                    // instead of creating a constant 0.0.
                    else_block.append_operation(
                        OperationBuilder::new("scf.yield", location)
                            .add_operands(&[left])
                            .build()
                            .expect("valid scf.yield")
                    );
                    let r = Region::new();
                    r.append_block(else_block);
                    r
                };

                let if_op = OperationBuilder::new("scf.if", location)
                    .add_operands(&[cond.result(0).unwrap().into()])
                    .add_results(&[Type::float64(self.context)])
                    .add_regions([then_region, else_region])
                    .build()
                    .expect("valid scf.if");
                let result = block.append_operation(if_op);
                result.result(0).unwrap().into()
            }
            LogicalOp::Or => {
                // `a or b` → if a { yield a } else { yield b }

                // Then region: return left (it's truthy, so pass it through)
                let then_region = {
                    let then_block = Block::new(&[]);
                    // `left` is defined in the enclosing scope, so it's
                    // visible inside the scf.if region. We yield it directly
                    // instead of creating a constant 1.0.
                    then_block.append_operation(
                        OperationBuilder::new("scf.yield", location)
                            .add_operands(&[left])
                            .build()
                            .expect("valid scf.yield")
                    );
                    let r = Region::new();
                    r.append_block(then_block);
                    r
                };

                // Else region: evaluate right
                let else_region = {
                    let else_block = Block::new(&[]);
                    let right = self.compile_expression(&logical.right, &else_block, variables);
                    else_block.append_operation(
                        OperationBuilder::new("scf.yield", location)
                            .add_operands(&[right])
                            .build()
                            .expect("valid scf.yield")
                    );
                    let r = Region::new();
                    r.append_block(else_block);
                    r
                };

                let if_op = OperationBuilder::new("scf.if", location)
                    .add_operands(&[cond.result(0).unwrap().into()])
                    .add_results(&[Type::float64(self.context)])
                    .add_regions([then_region, else_region])
                    .build()
                    .expect("valid scf.if");
                let result = block.append_operation(if_op);
                result.result(0).unwrap().into()
            }
        }
    }
}

/// Main entry point for code generation
pub fn generate_module<'c>(context: &'c Context, program: &Program) -> Module<'c> {
    let generator = CodeGenerator::new(context);
    generator.generate(program)
}
}

Why left Is Visible Inside scf.if Regions

The compile_logical code uses left — a value computed in the outer block — inside the then and else regions of an scf.if. That might look wrong: we created a new Block for each region, so how can a value from a different block be visible?

It’s not wrong — it’s how MLIR’s dominance rules work. In MLIR, a value is dominant over any region nested inside the operation that uses it. left was added to the outer block before the scf.if was created. The scf.if operation is inside that outer block. Any region inside the scf.if can see values from the outer block because the outer block dominates them — control flow must pass through the outer block to reach the inner region.

This is the same rule that makes outer variable scoping work in most programming languages. A variable declared before an if statement is visible inside both branches — you can’t reach the branch without passing the declaration. MLIR formalizes this as dominance, and the verifier enforces it: if you try to use a value that isn’t dominant (say, a value defined in the then region inside the else region), the verifier rejects the IR.

In Melior’s API, this shows up as a Value<'c, 'c> being passable to operations in any nested region. The two lifetime parameters on Value<'c, 'a> are 'c (the context) and 'a (the parent operation). These are separate concerns: 'c ties the value to the Context that owns the entire IR, while 'a ties it to its defining operation. In our code, both lifetimes resolve to 'c because the context outlives both the value and any region that uses it — Rust’s borrow checker sees a single shared lifetime and allows the reference. The actual dominance check happens later, at verification time — Melior doesn’t enforce it at Rust compile time.


String Constants (No Allocation Needed!)

String literals are constants, not heap allocations. They live in the binary’s data section, the same way string constants work in C.

Note: Melior 0.27 doesn’t provide a direct helper for llvm.mlir.global constant. You’d build it with OperationBuilder. Here’s the conceptual approach — the exact API details may vary with your MLIR version.

#![allow(unused)]
fn main() {
// src/codegen/strings.rs
use melior::{
    Context, Location,
    ir::{
        attribute::{StringAttribute, TypeAttribute, UnitAttribute}, Identifier, Type, Module,
        operation::OperationBuilder,
    },
};

/// Create a global string constant using OperationBuilder.
///
/// This creates something like:
///   llvm.mlir.global constant @str_0("hello")
///
/// No heap allocation — the string lives in the data section.
pub fn create_string_constant(
    module: &Module,
    context: &Context,
    name: &str,        // e.g., "str_0"
    value: &str,
    location: Location,
) {
    let array_type = Type::parse(
        context,
        &format!("!llvm.array<{} x i8>", value.len()),
    ).expect("valid array type");

    let op = OperationBuilder::new("llvm.mlir.global", location)
        .add_attributes(&[
            (Identifier::new(context, "sym_name"), StringAttribute::new(context, name).into()),
            (Identifier::new(context, "type"), TypeAttribute::new(array_type.into()).into()),
            (Identifier::new(context, "constant"), UnitAttribute::new(context).into()),
            (Identifier::new(context, "value"), StringAttribute::new(context, value).into()),
        ])
        .build()
        .expect("valid llvm.mlir.global");

    module.body().append_operation(op);
}
}

The constant attribute on llvm.mlir.global is a unit attribute — it has no value. Its presence means “constant” and its absence means “mutable.” We use UnitAttribute::new(context), not StringAttribute::new(context, "true").

The value attribute uses StringAttribute with the string content. MLIR accepts a StringAttr directly for !llvm.array<N x i8> globals — the bytes are embedded in the data section. For non-string globals (integers, structs), you’d use a DenseElementsAttr or an IntegerAttr instead.

Why This Works

ApproachMemory LocationAllocation?
llvm.mlir.global constantData sectionNo (static)
Heap allocation (malloc)HeapYes (runtime)
Stack allocation (alloca)StackNo, but per-call

String literals are static data — they exist for the lifetime of the program, embedded in the binary. No runtime cost.


Dynamic Typing with Tagged Unions

Lox is dynamically typed, so a function parameter can receive any type:

fun printValue(x) {
  print x;  // x could be number, string, bool, nil, or object
}

We need a tagged union type:

#![allow(unused)]
fn main() {
// src/codegen/types.rs
use melior::ir::Type;
use melior::Context;

/// A Lox value is a tagged union: struct { tag: i8, data: i64 }
pub fn lox_value_type(context: &Context) -> Type {
    Type::parse(context, "!llvm.struct<(i8, i64)>").unwrap()
}

/// Tag values for each Lox type
pub const TAG_NIL: i8 = 0;
pub const TAG_BOOL: i8 = 1;
pub const TAG_NUMBER: i8 = 2;
pub const TAG_STRING: i8 = 3;
pub const TAG_CLOSURE: i8 = 4;    // matches compiled value tags from Part 7
pub const TAG_INSTANCE: i8 = 5;   // matches compiled value tags from Part 7
}

These are not the same as the GC’s ObjType discriminants. The TAG_* constants number the types that appear in compiled values — TAG_NIL and TAG_BOOL are here because they’re Lox value types, even though they’re not heap objects. The ObjType enum in Part 2 (Number = 0, String = 1, etc.) numbers the types for the GC’s trace_object dispatch — it includes Environment (a GC-internal type) and doesn’t include Nil or Bool. Same concepts, different numbering, different purpose. The runtime’s from_raw/to_raw functions (Part 9) translate between them.

Why we’re not using this yet: The codegen above uses all-f64 values to keep the examples focused on MLIR concepts. Tagged unions require tag-checking before every operation, which adds significant boilerplate. A production Lox compiler would use lox_value_type() for every value and insert tag checks before arithmetic. The “numbers only” model is a stepping stone — once you understand the codegen patterns, adding tagged unions is a mechanical extension.

What it looks like in practice: Part 9 (Standard Library and Runtime) shows the tagged-union MLIR for 1 + 2 — every arith.addf becomes “check both tags are TAG_NUMBER → extract the f64 payloads → add → re-pack as (TAG_NUMBER, result).” Same core operation, more wrapping. Part 7 (Classes and Instances) introduces the representation; Part 9 shows the runtime that uses it.


Source Locations in MLIR

Every operation in MLIR has a source location. Unlike LLVM where debug info is optional, in MLIR locations are core to the IR.

The Location API

MLIR operations carry source locations for error messages and debug output. Melior provides several ways to create them:

#![allow(unused)]
fn main() {
// src/location.rs
use melior::ir::Location;
use melior::Context;

pub fn demonstrate_locations(context: &Context) {
    // Unknown location — for generated code with no source mapping
    let unknown = Location::unknown(context);
    
    // File location — specific file, line, column
    // Note: the exact Location::new signature varies between Melior versions.
    // In some versions, file locations use Location::file() or a builder.
    // If Location::new(context, filename, line, col) doesn't compile, try:
    //   Location::new(context, filename)        // newer versions: filename only
    //   Location::file(context, filename, line, col)  // some versions
    // Or use a name location as a fallback:
    //   Location::name(context, "test.lox:10:5", unknown)
    let file_loc = Location::new(context, "test.lox", 10, 5);
    
    // Name location — for generated code, use a descriptive name
    let name_loc = Location::name(context, "implicit_return", unknown);
    
    // Callsite location — links a callee's definition to its call site
    // Note: Location::call_site may not exist in all Melior versions.
    // Some versions use Location::callsite (no underscore), fused locations,
    // or you can fall back to using the file location directly.
    // If this doesn't compile, replace with: let callsite_loc = file_loc;
    let callsite_loc = Location::call_site(file_loc, unknown);
}
}

Updated Code Generator with Proper Locations

#![allow(unused)]
fn main() {
pub struct CodeGenerator<'c> {
    context: &'c Context,
    module: Module<'c>,
    filename: String,
    // Note: variables are NOT stored in the struct.
    // See the "Why Separate Context from State?" section for details.
}

impl<'c> CodeGenerator<'c> {
    pub fn new(context: &'c Context, filename: &str) -> Self {
        // Note: Location::new(context, filename, 1, 1) may not compile in all
        // Melior versions. See the "Version-resilient alternative" note above.
        // If it doesn't compile, try Location::new(context, filename) or
        // Location::unknown(context) as a fallback.
        let location = Location::new(context, filename, 1, 1);
        let module = Module::new(location);
        Self { 
            context, 
            module, 
            filename: filename.to_string(),
        }
    }

    /// Convert an AST location to an MLIR location
    /// Note: same version caveat as CodeGenerator::new — if Location::new
    /// doesn't compile with 4 arguments, try Location::new(context, filename)
    /// or fall back to Location::name(context, "file:line:col", unknown).
    fn loc(&self, ast_loc: crate::ast::Location) -> Location<'c> {
        Location::new(self.context, &self.filename, ast_loc.line, ast_loc.column)
    }

    /// Get a location for generated/implicit code
    fn generated_loc(&self, description: &str) -> Location<'c> {
        Location::name(self.context, description, Location::unknown(self.context))
    }
}
}

What the IR Looks Like With Locations

Before (using Location::unknown):

module {
  func.func @add(%arg0: f64, %arg1: f64) -> f64 {
    %0 = arith.addf %arg0, %arg1 : f64
    func.return %0 : f64
  }
}

After (with proper locations, shown with -mlir-print-debuginfo):

module {
  func.func @add(%arg0: f64, %arg1: f64) -> f64 
      loc("test.lox":1:1) 
  {
    %0 = arith.addf %arg0, %arg1 : f64 
        loc("test.lox":2:14)
    func.return %0 : f64 loc("test.lox":2:3)
  } loc("test.lox":1:1)
} loc("test.lox":1:1)

A Complete Example

Here’s a minimal working example that creates an add function using Melior’s API. This follows the same pattern as our code generator: build the block, append operations, then move the block into a region.

Note: This example defines the arith_addf helper function locally because Melior 0.27’s arith module doesn’t provide an addf helper. The same helper is defined in the code generator above — if you’re building incrementally, you already have it. If you’re building this example standalone, the helper is included below.

// examples/simple_add.rs
use melior::{
    Context,
    dialect::{func, DialectRegistry},
    ir::{
        attribute::{StringAttribute, TypeAttribute},
        r#type::FunctionType,
        Location, Module, Region, Block, Type, Value, ValueLike,
        operation::{OperationBuilder, OperationLike},
        block::BlockLike, RegionLike,
    },
    utility::register_all_dialects,
};

/// Create an `arith.addf` operation.
/// We define this locally because Melior 0.27's `arith` module
/// doesn't provide an `addf` helper — only `constant`, `cmpf`,
/// `cmpi`, and `select`. See the code generator above for the
/// full set of arithmetic helpers.
fn arith_addf<'c>(
    lhs: Value<'c, 'c>,
    rhs: Value<'c, 'c>,
    location: Location<'c>,
) -> melior::ir::Operation<'c> {
    OperationBuilder::new("arith.addf", location)
        .add_operands(&[lhs, rhs])
        .add_results(&[lhs.r#type()])
        .build()
        .expect("valid arith.addf")
}

fn main() {
    let registry = DialectRegistry::new();
    register_all_dialects(&registry);
    
    let context = Context::new();
    context.append_dialect_registry(&registry);
    context.load_all_available_dialects();
    
    let location = Location::unknown(&context);
    let module = Module::new(location);
    
    // Create function type: (f64, f64) -> f64
    let float_type = Type::float64(&context);
    let function_type = FunctionType::new(&context, &[float_type, float_type], &[float_type]);
    
    // Create the function body block (not yet in a region)
    let block = Block::new(&[
        (float_type, location),
        (float_type, location),
    ]);
    
    // %sum = arith.addf %arg0, %arg1 : f64
    let sum = block.append_operation(arith_addf(
        block.argument(0).unwrap().into(),
        block.argument(1).unwrap().into(),
        location,
    ));
    
    // return %sum : f64
    block.append_operation(func::r#return(
        &[sum.result(0).unwrap().into()],
        location,
    ));
    
    // Move block into region
    let region = Region::new();
    region.append_block(block);
    
    // Create the function
    module.body().append_operation(func::func(
        &context,
        StringAttribute::new(&context, "add"),
        TypeAttribute::new(function_type.into()),
        region,
        &[],
        location,
    ));
    
    // Verify and print
    // Note: verify() requires the OperationLike trait to be in scope.
    // It returns bool in Melior 0.27 — use assert! instead of .expect().
    assert!(module.as_operation().verify(), "module verification failed");
    println!("{}", module.as_operation());
}

Output:

module {
  func.func @add(%arg0: f64, %arg1: f64) -> f64 {
    %0 = arith.addf %arg0, %arg1 : f64
    func.return %0 : f64
  }
}

Lowering to LLVM IR

You’ve built a code generator that produces MLIR in the arith, func, and scf dialects. That IR is correct but it can’t run on hardware — those dialects are abstractions, not machine instructions. To produce an executable, you need to lower the IR through a pipeline of transformation passes, each one replacing higher-level operations with lower-level ones until you reach LLVM IR (which maps directly to machine code).

The Lowering Pipeline

MLIR lowering happens in stages. Each stage converts one dialect to a lower-level dialect. The order matters — you can’t lower scf.if to LLVM until you’ve first converted it to cf.cond_br (control flow in the cf dialect), because scf operations don’t have LLVM equivalents.

Here’s the pipeline for our Lox compiler:

Lox source
    ↓  (parse + codegen)
MLIR: arith + func + scf    ← What our code generator produces
    ↓  (scf-to-cf)
MLIR: arith + func + cf      ← Structured control flow → branches
    ↓  (arith-to-llvm)
MLIR: func + cf + llvm        ← Arithmetic ops → LLVM integer/float ops
    ↓  (func-to-llvm + cf-to-llvm)
MLIR: llvm only               ← Functions → LLVM functions, branches → LLVM br/cond_br
    ↓  (mlir-translate)
LLVM IR                       ← Textual LLVM IR (.ll)
    ↓  (clang / llc)
Machine code                  ← Native executable

Each arrow is a pass. Each pass converts one dialect’s operations into operations from a lower dialect. The key insight: lowering is compositional. You don’t need a “Lox to LLVM” pass. You need a series of small passes, each handling one dialect.

A Concrete Example: var x = 1 + 2; print x;

Here’s what this Lox program looks like at each stage of the pipeline. We’re using the numbers-only model — every value is an f64, and print takes a single float argument.

After code generation (arith + func):

module {
  func.func private @lox_print(f64)

  func.func @main() -> f64 {
    %one = arith.constant 1.0 : f64
    %two = arith.constant 2.0 : f64
    %sum = arith.addf %one, %two : f64
    func.call @lox_print(%sum) : (f64) -> ()
    %nil = arith.constant 0.0 : f64
    func.return %nil : f64
  }
}

This is what our code generator produces — arith operations for math, func operations for functions and calls. No scf here because there’s no control flow — straight-line code all the way through. Every value is an f64. The @lox_print declaration tells the verifier that an external function with that name exists — the linker will resolve it to our runtime.

After arith-to-llvm (func still present):

module {
  func.func private @lox_print(f64)

  func.func @main() -> f64 {
    %one = llvm.mlir.constant(1.0 : f64) : f64
    %two = llvm.mlir.constant(2.0 : f64) : f64
    %sum = llvm.fadd %one, %two : f64
    func.call @lox_print(%sum) : (f64) -> ()
    %nil = llvm.mlir.constant(0.0 : f64) : f64
    func.return %nil : f64
  }
}

arith.addf became llvm.fadd. arith.constant became llvm.mlir.constant. The func operations are untouched — this pass only converts arithmetic. This is the key insight: each pass handles one dialect and ignores everything else.

After func-to-llvm (llvm only):

module {
  llvm.func private @lox_print(f64)

  llvm.func @main() -> f64 {
    %one = llvm.mlir.constant(1.0 : f64) : f64
    %two = llvm.mlir.constant(2.0 : f64) : f64
    %sum = llvm.fadd %one, %two : f64
    llvm.call @lox_print(%sum) : (f64) -> ()
    %nil = llvm.mlir.constant(0.0 : f64) : f64
    llvm.return %nil : f64
  }
}

func.func became llvm.func, func.call became llvm.call, func.return became llvm.return. Every operation now lives in the llvm dialect — one translation step away from actual LLVM IR.

After mlir-translate (LLVM IR):

declare void @lox_print(double)

define double @main() {
entry:
  %0 = fadd double 1.0, 2.0
  call void @lox_print(double %0)
  ret double 0.0
}

This is textual LLVM IR — what llc or clang consumes to produce machine code. The names changed (llvm.faddfadd, llvm.mlir.constant → inline constants), but the operations are the same. The @lox_print declaration is an external symbol — the linker will resolve it to our runtime function.

Four stages, one program. Each stage is a mechanical translation. No magic — dialects converting to lower dialects until you hit something the hardware understands.

A Second Example: if and the scf Dialect

The first example had no control flow, so scf never appeared. Let’s see what happens when it does. Here’s a Lox program with an if statement:

fun max(a, b) {
  if (a < b) {
    print b;
  } else {
    print a;
  }
}

After code generation (arith + func + scf):

module {
  func.func private @lox_print(f64)

  func.func @max(%arg0: f64, %arg1: f64) -> f64 {
    %cond = arith.cmpf olt, %arg0, %arg1 : f64
    scf.if %cond {
      func.call @lox_print(%arg1) : (f64) -> ()
    } else {
      func.call @lox_print(%arg0) : (f64) -> ()
    }
    %nil = arith.constant 0.0 : f64
    func.return %nil : f64
  }
}

Now there’s scf.if — structured control flow with two regions (then and else). The condition comes from arith.cmpf (ordered less-than). Notice that scf.if doesn’t produce a value here — both branches call lox_print and fall through. The implicit 0.0 return comes after the scf.if.

After scf-to-cf (arith + func + cf):

module {
  func.func private @lox_print(f64)

  func.func @max(%arg0: f64, %arg1: f64) -> f64 {
    %cond = arith.cmpf olt, %arg0, %arg1 : f64
    cf.cond_br %cond, ^then, ^else
      ^then:
        func.call @lox_print(%arg1) : (f64) -> ()
        cf.br ^continue
      ^else:
        func.call @lox_print(%arg0) : (f64) -> ()
        cf.br ^continue
      ^continue:
    %nil = arith.constant 0.0 : f64
    func.return %nil : f64
  }
}

The scf.if became three blocks: ^then, ^else, and ^continue. Each branch ends with cf.br ^continue — the structured guarantee that both paths rejoin is now explicit.

Indented blocks are a presentation choice. In these examples, ^then, ^else, and ^continue are indented under cf.cond_br / llvm.cond_br to show their logical relationship. In MLIR’s actual textual format, blocks are siblings within the function body — they’re all at the same level, not nested under the branch. A reader who feeds this into mlir-translate would need to remove the indentation. The indented presentation makes the control flow structure easier to follow at the cost of not being valid MLIR syntax.

cf.cond_br takes the condition and two block labels — if the condition is true, jump to ^then; otherwise jump to ^else.

Variable scoping in compile_if. The code generator passes the same variables HashMap to both branches of scf.if. This means a variable declared inside one branch leaks into the other’s namespace — but the SSA value lives in the wrong block, so using it produces invalid MLIR (a dominance violation). In the max example above this doesn’t matter because both branches only call lox_print and don’t declare variables. But if you wrote if (cond) { var x = 1; } print x;, the code generator would put x in the HashMap with an SSA value from the then block, and print x would try to use that value in the outer block — where it isn’t dominant. A production compiler would either track scope depth (each branch gets its own variable namespace, and declarations don’t escape) or use scf.if with results (yield the variable’s value from both branches). Our simplified compiler does neither. Don’t declare variables inside if branches and use them outside. The same caveat applies to compile_while — the “A Third Example” section below shows the production-ready pattern with loop-carried values.

module {
  llvm.func private @lox_print(f64)

  llvm.func @max(%arg0: f64, %arg1: f64) -> f64 {
    %cond = llvm.fcmp "olt" %arg0, %arg1 : f64
    llvm.cond_br %cond, ^then, ^else
      ^then:
        llvm.call @lox_print(%arg1) : (f64) -> ()
        llvm.br ^continue
      ^else:
        llvm.call @lox_print(%arg0) : (f64) -> ()
        llvm.br ^continue
      ^continue:
    %nil = llvm.mlir.constant(0.0 : f64) : f64
    llvm.return %nil : f64
  }
}

Every operation is now in the llvm dialect. arith.cmpf olt became llvm.fcmp "olt". cf.cond_br became llvm.cond_br. cf.br became llvm.br. The block structure is identical — only the dialect names changed.

After mlir-translate (LLVM IR):

declare void @lox_print(double)

define double @max(double %0, double %1) {
entry:
  %cond = fcmp olt double %0, %1
  br i1 %cond, label %then, label %else

then:
  call void @lox_print(double %1)
  br label %continue

else:
  call void @lox_print(double %0)
  br label %continue

continue:
  ret double 0.0
}

Notice what happened to the condition: arith.cmpf produces an i1 (single-bit integer) after lowering, and LLVM’s br i1 uses it directly. In the MLIR stages, the condition was typed as the result of arith.cmpf / llvm.fcmp; in LLVM IR it’s a plain i1. This is LLVM’s representation of boolean values — comparison results are always i1, a single-bit integer. MLIR’s arith.cmpf and llvm.fcmp produce the same thing; LLVM IR makes the i1 type explicit. (The C calling convention governs how function arguments and return values are passed between callers and callees — it’s a separate concern from how LLVM represents booleans internally.)

A Third Example: while and Back-Edges

The if example showed branches that jump forward — both paths end up at ^continue. A while loop is different: the body jumps backward to the condition check. This back-edge is what makes loops work, and it’s what scf.while hides behind a region-based API.

var i = 0;
while (i < 5) {
  print i;
  i = i + 1;
}

About this example: The MLIR below shows scf.while with loop-carried values — the counter %i threads through the scf.condition and scf.yield as an operand. This is the correct approach for while loops that modify variables used in the condition. Our code generator above uses a simpler approach (tracking variables through the HashMap) that works for read-only conditions but produces invalid MLIR when the loop body reassigns a variable. The CAUTION note in compile_while explains the dominance issue. The example here shows the production-ready pattern.

After code generation (arith + func + scf):

module {
  func.func private @lox_print(f64)

  func.func @main() -> f64 {
    %zero = arith.constant 0.0 : f64
    %five = arith.constant 5.0 : f64
    %one = arith.constant 1.0 : f64
    %loop_result = scf.while (%i = %zero) : f64 {
      %cond = arith.cmpf olt, %i, %five : f64
      scf.condition(%cond) %i : f64
    } do {
    ^bb0(%i: f64):
      func.call @lox_print(%i) : (f64) -> ()
      %next = arith.addf %i, %one : f64
      scf.yield %next : f64
    }
    func.return %loop_result : f64
  }
}

scf.while has two regions. The before region checks the condition and calls scf.condition — if true, execution enters the after region; if false, the loop exits with the current iterator value. The after region runs the body and yields the updated value back to the before region. The loop variable %i flows between the two regions: initialized to %zero, updated by the after region, and checked again by the before region on each iteration.

After scf-to-cf (arith + func + cf):

module {
  func.func private @lox_print(f64)

  func.func @main() -> f64 {
    %zero = arith.constant 0.0 : f64
    %five = arith.constant 5.0 : f64
    %one = arith.constant 1.0 : f64
    cf.br ^before(%zero : f64)
    ^before(%i: f64):
      %cond = arith.cmpf olt, %i, %five : f64
      cf.cond_br %cond, ^body(%i : f64), ^exit(%i : f64)
    ^body(%i: f64):
      func.call @lox_print(%i) : (f64) -> ()
      %next = arith.addf %i, %one : f64
      cf.br ^before(%next : f64)     ← the back-edge
    ^exit(%i: f64):
      func.return %i : f64
  }
}

Now the back-edge is explicit: cf.br ^before(%next : f64) at the bottom of ^body jumps back to ^before. The loop is three blocks with branches — no special “while” construct, only control flow. cf.cond_br at the top of ^before decides whether to enter the body or exit. The %i argument threads through every block, carrying the current loop counter.

Compare this to the if example. Both use cf.cond_br for the branch decision. The difference is what happens next: the if jumps forward to a continue block, while the while jumps backward to the condition. That’s the only structural difference between a conditional and a loop at the cf level.

After arith-to-llvm + func-to-llvm + cf-to-llvm (llvm only):

module {
  llvm.func private @lox_print(f64)

  llvm.func @main() -> f64 {
    %zero = llvm.mlir.constant(0.0 : f64) : f64
    %five = llvm.mlir.constant(5.0 : f64) : f64
    %one = llvm.mlir.constant(1.0 : f64) : f64
    llvm.br ^before(%zero : f64)
    ^before(%i: f64):
      %cond = llvm.fcmp "olt" %i, %five : f64
      llvm.cond_br %cond, ^body(%i : f64), ^exit(%i : f64)
    ^body(%i: f64):
      llvm.call @lox_print(%i) : (f64) -> ()
      %next = llvm.fadd %i, %one : f64
      llvm.br ^before(%next : f64)
    ^exit(%i: f64):
      llvm.return %i : f64
  }
}

Same block structure as the cf stage. Only the dialect names changed. The back-edge — llvm.br ^before — is still there. By this point, MLIR is done. The final step is mlir-translate to LLVM IR, which produces the same block structure with LLVM IR syntax.

After mlir-translate (LLVM IR):

declare void @lox_print(double)

define double @main() {
entry:
  br label %before

before:                                             ; ← the back-edge target
  %i = phi double [ 0.0, %entry ], [ %next, %body ]
  %cond = fcmp olt double %i, 5.0
  br i1 %cond, label %body, label %exit

body:
  call void @lox_print(double %i)
  %next = fadd double %i, 1.0
  br label %before                                 ; ← the back-edge

exit:
  ret double %i
}

Something new appears in the LLVM IR that wasn’t in the MLIR: the phi instruction. In LLVM IR, each block is a separate scope — values from other blocks aren’t directly accessible. The phi instruction is LLVM’s way of merging values from different predecessors: “if we came from entry, the value is 0.0; if we came from body, the value is %next.” MLIR handles this implicitly through block arguments (^before(%i: f64)), but LLVM IR requires an explicit phi.

This is one of the clearest differences between MLIR and LLVM IR. MLIR uses block arguments; LLVM uses phi nodes. They express the same thing — a value that depends on which predecessor block we came from — but MLIR’s syntax is simpler. When you read LLVM IR and see a phi, think “that was a block argument in MLIR.”

Three examples, same pipeline. The first showed arithmetic. The second added forward branching. The third added back-edges and introduced phi nodes. Every Lox program is a combination of these three patterns — arithmetic, conditionals, and loops — and every one follows the same lowering path.

Tagged union model: The examples above use the numbers-only model. In the tagged union model (Parts 7–12), the same programs would pass (i8, i64) pairs to @lox_print instead of bare f64 values, and 1.0 would be wrapped in a struct with tag=2 (TAG_NUMBER). The lowering pipeline is the same — only the code generation step changes.

Why this ordering? The first example (arithmetic only) didn’t have control flow — straight-line code, so only arith-to-llvm and func-to-llvm did anything. But a program with if or while produces scf operations, and those need their own pass. Here’s the full ordering for programs that have control flow:

  1. scf-to-cf first. scf.if and scf.while are structured control flow — they use regions to guarantee well-formedness. But LLVM doesn’t have regions; it has basic blocks and branches. The scf-to-cf pass converts structured control flow into explicit cf.cond_br (conditional branch) and cf.br (unconditional branch) operations.

  2. arith-to-llvm next. arith.addf, arith.cmpf, and friends become llvm.fadd, llvm.fcmp, etc. This pass doesn’t touch control flow — it only converts arithmetic and comparison operations.

  3. func-to-llvm and cf-to-llvm last. These convert the remaining high-level operations. func.func becomes llvm.func, func.call becomes llvm.call, func.return becomes llvm.return. cf.cond_br and cf.br become llvm.cond_br and llvm.br.

The order of arith-to-llvm relative to func-to-llvm/cf-to-llvm doesn’t matter — they touch different operations. We list arith-to-llvm first because it produces the simpler transformation. Only scf-to-cf before cf-to-llvm is a hard ordering constraint.

The passes don’t interfere with each other because they operate on different dialects. arith-to-llvm ignores scf and func operations; func-to-llvm ignores arith and cf operations. This separation is MLIR’s key advantage over LLVM — each dialect is independently optimizable and lowerable.

Implementing the Pipeline

#![allow(unused)]
fn main() {
// src/lib.rs
pub mod ast;
pub mod lexer;
pub mod parser;
pub mod codegen;

use anyhow::Result;
use melior::{
    Context,
    dialect::DialectRegistry,
    pass::PassManager,
    utility::register_all_dialects,
};

pub fn compile_to_llvm(source: &str) -> Result<String> {
    // 1. Tokenize and parse
    let tokens = lexer::tokenize(source)?;
    let program = parser::Parser::new(tokens).parse()?;
    
    // 2. Generate MLIR
    let registry = DialectRegistry::new();
    register_all_dialects(&registry);
    
    let context = Context::new();
    context.append_dialect_registry(&registry);
    context.load_all_available_dialects();
    
    let module = codegen::generate_module(&context, &program);
    
    // 3. Run lowering passes (MLIR → LLVM IR)
    //
    //    The pipeline converts dialects from high to low:
    //
    //    scf → cf     (structured control flow → explicit branches)
    //    arith → llvm  (arithmetic ops → LLVM float/integer ops)
    //    func → llvm  (functions → LLVM functions)
    //    cf → llvm    (branches → LLVM branches)
    //
    //    In Melior 0.27, pass registration and invocation looks like:
    //
    //    let pass_manager = PassManager::new(&context);
    //    pass_manager.add_pass(melior::pass::conversion::create_scf_to_control_flow());
    //    pass_manager.add_pass(melior::pass::conversion::create_arith_to_llvm());
    //    pass_manager.add_pass(melior::pass::conversion::create_func_to_llvm());
    //    pass_manager.add_pass(melior::pass::conversion::create_control_flow_to_llvm());
    //    pass_manager.run(&module)?;
    //
    //    The exact pass function names may differ between Melior versions.
    //    Check the `melior::pass` module for available conversion passes.
    //    The function names follow the pattern `convert_<dialect>_to_<dialect>`
    //    (e.g., `create_scf_to_control_flow`, `create_arith_to_llvm`). If a function
    //    name doesn't match, search for the dialect name in the `pass` module —
    //    the underlying MLIR pass is the same regardless of what Melior calls it.
    
    Ok(module.as_operation().to_string())
}
}

What About Control Flow?

For a program with an if statement, the pipeline adds one more step:

// Before scf-to-cf:
scf.if %cond {
  // then region
} else {
  // else region
}

// After scf-to-cf:
cf.cond_br %cond, ^then_block, ^else_block
^then_block:
  // then operations
  cf.br ^join
^else_block:
  // else operations
  cf.br ^join
^join:
  // continue

scf.if uses regions (nested blocks inside the operation). cf.cond_br uses basic blocks and explicit branches. The scf-to-cf pass unwraps the regions into flat control flow. Then cf-to-llvm converts cf.cond_br and cf.br into llvm.cond_br and llvm.br.

The key difference: scf.if guarantees well-formedness (the regions can’t be jumped into from outside). cf.cond_br does not — a malformed program could branch into the middle of an if-else. The scf dialect’s structure is a safety net that the lowering pass removes once the IR is correct.

For simplicity, this example shows an scf.if without results. When scf.if produces a value (like var x = if (cond) { a } else { b }), each region ends with scf.yield to provide the result, and the scf-to-cf pass threads that result through block arguments on cf.br:

// scf.if with a result:
%x = scf.if %cond -> f64 {
  scf.yield %a : f64
} else {
  scf.yield %b : f64
}

// After scf-to-cf, the result is threaded through block arguments:
cf.cond_br %cond, ^then, ^else
^then:
  cf.br ^join(%a : f64)       // pass %a as a block argument to ^join
^else:
  cf.br ^join(%b : f64)       // pass %b as a block argument to ^join
^join(%result: f64):           // %result is %a or %b, depending on the branch
  // use %result here

The ^join block takes a parameter (%result: f64) that receives the value from whichever branch was taken. This is how cf represents the single-assignment result that scf.if computed structurally. Every branch into ^join must provide the same number and types of block arguments — if they don’t match, the verifier catches it.

The C Runtime

Our generated MLIR declares @lox_print as an external function — the verifier accepts it, the lowering passes leave it alone, and the final LLVM IR emits a declare void @lox_print(double). But clang can’t link an executable without a definition for that symbol. We need a runtime.

The runtime is tiny for the numbers-only model — one function that prints a float:

// runtime/print.c
#include <stdio.h>

void lox_print(double value) {
    printf("%g\n", value);
}

That’s it. %g prints numbers without trailing zeros — 1.0 prints as 1, 3.14 prints as 3.14. It picks the shorter of %f (fixed-point) and %e (scientific). This matches Lox’s print behavior for numbers.

The obvious problem: nil, false, and 0 are all 0.0 in the numbers-only model, so lox_print prints 0 for all three. We can’t tell them apart because we threw away the type information when we decided every value is an f64. The tagged union model in Part 7 fixes this — lox_print gets the tag alongside the payload and can dispatch to printf("nil"), printf("false"), or printf("%g", payload) as appropriate. But for now, one function, one format specifier, and we move on.

Using the CLI

With the runtime in place, the full compile pipeline is:

# Compile Lox to MLIR
cargo run -- compile input.lox --emit-mlir -o output.mlir

# Lower MLIR to LLVM IR
mlir-translate output.mlir --mlir-to-llvmir -o output.ll

# Compile to executable (link the runtime)
clang output.ll runtime/print.c -o output -lm

# Run it
./output

That -lm links the C math library. We don’t use it yet, but it’s good practice — Lox’s standard library (Part 10) calls fmod, floor, and other math functions.

If you get undefined reference to lox_print at link time, make sure runtime/print.c is on the compile command and the path is correct relative to where you’re running clang.


Project Structure

lox-mlir/
├── Cargo.toml
├── src/
│   ├── lib.rs              # Library entry point
│   ├── main.rs             # CLI entry point
│   ├── ast.rs              # AST definitions
│   ├── lexer.rs            # Tokenizer (Token, TokenType, LexValue)
│   ├── parser.rs           # Parser (ParseError, Parser)
│   ├── codegen/
│   │   ├── mod.rs
│   │   ├── generator.rs    # MLIR code generator
│   │   ├── types.rs        # Tagged union types
│   │   └── strings.rs      # String constant handling
│   └── runtime/
│       └── print.c         # C runtime — lox_print for the numbers-only model
├── examples/
│   ├── simple_add.rs
│   └── *.lox
└── tests/
    └── integration.rs

Quick Reference: Lox → MLIR Mapping

Lox ConstructRust EnumMLIR Operation
a + bBinaryOp::Addarith.addf
a - bBinaryOp::Subarith.subf
a * bBinaryOp::Mularith.mulf
a / bBinaryOp::Divarith.divf
a < bBinaryOp::Lessarith.cmpf olt
a == bBinaryOp::Equalarith.cmpf oeq
not xUnaryOp::Notarith.cmpf oeqscf.if (1.0 / 0.0)
var x = vVarStmtStore in HashMap
xVariableExprLoad from HashMap
if (c) {...}IfStmtscf.if
while (c) {...}WhileStmtscf.while
fun f(...) {...}FunctionStmtfunc.func
f(args)CallExprfunc.call
return vReturnStmtfunc.return

Tagged Union Model (Parts 7–12)

When using the !llvm.struct<(i8, i64)> tagged union, the same Lox constructs produce more MLIR operations — each one checks the tag, extracts the payload, operates, and re-packs the result:

Lox ConstructNumbers-Only MLIRTagged Union MLIR
a + barith.addftag check → llvm.extractvaluellvm.bitcastarith.addfllvm.bitcastllvm.insertvalue
a < barith.cmpf olttag check → extract → arith.cmpf olt → re-pack
a and bscf.if → resultscf.if on tag → extract → scf.if for short-circuit → re-pack
var x = vStore f64 in HashMapStore (tag, payload) pair
nilarith.constant 0.0llvm.undef + llvm.insertvalue (tag=0, payload undef)
true / falsearith.constant 1.0/0.0 (no real boolean support)llvm.insertvalue (tag=1, payload 1/0)
"hello"Compiles to nil (0.0)llvm.mlir.addressof + llvm.insertvalue (tag=3)
print xfunc.call @lox_print(%x)func.call @lox_print(%tag, %payload) or pass the whole struct

The numbers-only model keeps each row to one MLIR operation. The tagged union model expands each row to 3–6 operations. That’s the cost of dynamic typing at the MLIR level — and it’s why we start with the simpler model in Parts 1–6.

A few notes on the numbers-only column.

true/false as 1.0/0.0 — the numbers-only model has no real boolean support. 1.0 and 0.0 are a C-style convention, not a faithful representation. Lox programs that use booleans will compile, but the semantics are approximate.

"hello" compiles to nil (0.0) — strings can’t be represented as a single f64. The code generator’s compile_literal returns compile_nil() for string values, which means print "hello" prints 0 in the numbers-only model. Strings require the tagged union model (Part 7).

print x with tag+payload — the tagged-union column shows the two-argument form (pass tag and payload separately). You could also pass the whole !llvm.struct<(i8, i64)> and let the callee destructure it — both are valid design choices. The two-argument form is more explicit about what’s happening at the ABI level.


Differences from C++ MLIR

AspectC++ MLIRMelior (Rust)
Dialect definitionTableGen (.td)Rust code directly
OperationsGenerated from ODSBuilt with OperationBuilder
OwnershipManual / raw pointersRAII with lifetimes
Pattern rewritingC++ classesClosures / Rust traits
Error handlingLogicalResultResult<T, Error>

Next Steps

You’ve seen the full pipeline: parse Lox, generate MLIR, lower through dialects, emit LLVM IR. Every value is an f64, and we hand off arith and func operations to MLIR’s built-in lowering passes. That works — but it’s leaky. Every variable sits in a Rust HashMap that disappears after codegen. No garbage collection. No heap-allocated objects. No way for the compiled program to manage memory at runtime.

Part 2 adds a garbage collector. Not a library — we generate MLIR that implements one: shadow stack roots, mark-and-sweep traversal, and custom lox.* operations for allocation and rooting that have no standard MLIR equivalent. This is where the tutorial stops being “emit arith operations and let MLIR handle the rest” and starts being “define your own semantics in MLIR.”

Next: Part 2 — Garbage Collection from Scratch — We can compile Lox to MLIR, but our compiled code has a problem: every string, every closure, every instance leaks memory forever. We need a garbage collector. We’ll build mark-sweep from scratch — the same algorithm Lua and the JVM use, scaled down.