MLIR for Lox: A Rust Guide (using Melior)

This guide shows how to build a Lox compiler using Rust and the Melior crate instead of C++. If you know Crafting Interpreters, this should feel familiar — just with MLIR instead of tree-walk interpretation.

Why MLIR for Lox?

LLVM is powerful but low-level. It doesn't know about:

  • Variable scoping rules
  • Closure captures
  • Dynamic typing
  • Lox-specific optimizations

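MLIR lets you define a dialect that represents Lox semantics directly, then progressively lower it to LLVM IR. Swift (SIL), Rust (MIR), and Julia follow the same general idea of lowering through a language-specific high-level IR before reaching LLVM, each with its own custom IR rather than MLIR.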

Why Rust + Melior?

  • Memory safety without garbage collection
  • Pattern matching for clean AST traversal
  • No TableGen — Melior builds dialects directly in Rust
  • Excellent for compiler toolchains

Part 1: Setup

Dependencies

# Cargo.toml
[package]
name = "lox-mlir"
version = "0.1.0"
edition = "2021"

[dependencies]
melior = "0.27"

Install MLIR

# macOS
brew install llvm@22

# Linux (Ubuntu/Debian)
# You may need to build from source or use a PPA for LLVM 22
# See: https://apt.llvm.org/

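# Add to your shell config:
export LLVM_SYS_220_PREFIX=/opt/homebrew/opt/llvm@22  # macOS
# or
export LLVM_SYS_220_PREFIX=/usr/lib/llvm-22            # Linux

# Melior's build dependencies (mlir-sys and tblgen) look for their own
# variables. If the build cannot find MLIR, point these at the same prefix:
export MLIR_SYS_220_PREFIX="$LLVM_SYS_220_PREFIX"
export TABLEGEN_220_PREFIX="$LLVM_SYS_220_PREFIX"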

Part 2: The Lox AST

Hand-written Rust — not generated. This is exactly what you'd write following Crafting Interpreters.

#![allow(unused)]
fn main() {
// src/ast.rs
use std::fmt;

/// Source location for error messages
#[derive(Debug, Clone, Copy)]
pub struct Location {
    pub line: usize,
    pub column: usize,
}

/// A Lox value (dynamically typed)
#[derive(Debug, Clone)]
pub enum LoxValue {
    Nil,
    Bool(bool),
    Number(f64),
    String(String),
}

// ========================================================================
// Expressions
// ========================================================================

#[derive(Debug, Clone)]
pub enum Expr {
    Binary(BinaryExpr),
    Unary(UnaryExpr),
    Literal(LiteralExpr),
    Grouping(GroupingExpr),
    Variable(VariableExpr),
    Assign(AssignExpr),
    Call(CallExpr),
    Logical(LogicalExpr),
}

impl Expr {
    pub fn location(&self) -> Location {
        match self {
            Expr::Binary(e) => e.location,
            Expr::Unary(e) => e.location,
            Expr::Literal(e) => e.location,
            Expr::Grouping(e) => e.location,
            Expr::Variable(e) => e.location,
            Expr::Assign(e) => e.location,
            Expr::Call(e) => e.location,
            Expr::Logical(e) => e.location,
        }
    }
}

#[derive(Debug, Clone)]
pub struct BinaryExpr {
    pub location: Location,
    pub left: Box<Expr>,
    pub op: BinaryOp,
    pub right: Box<Expr>,
}

#[derive(Debug, Clone, Copy)]
pub enum BinaryOp {
    Add, Sub, Mul, Div,
    Less, LessEqual, Greater, GreaterEqual,
    Equal, NotEqual,
}

#[derive(Debug, Clone)]
pub struct UnaryExpr {
    pub location: Location,
    pub op: UnaryOp,
    pub right: Box<Expr>,
}

#[derive(Debug, Clone, Copy)]
pub enum UnaryOp {
    Negate, Not,
}

#[derive(Debug, Clone)]
pub struct LiteralExpr {
    pub location: Location,
    pub value: LoxValue,
}

#[derive(Debug, Clone)]
pub struct GroupingExpr {
    pub location: Location,
    pub expr: Box<Expr>,
}

#[derive(Debug, Clone)]
pub struct VariableExpr {
    pub location: Location,
    pub name: String,
}

#[derive(Debug, Clone)]
pub struct AssignExpr {
    pub location: Location,
    pub name: String,
    pub value: Box<Expr>,
}

#[derive(Debug, Clone)]
pub struct CallExpr {
    pub location: Location,
    pub callee: Box<Expr>,
    pub arguments: Vec<Expr>,
}

#[derive(Debug, Clone)]
pub struct LogicalExpr {
    pub location: Location,
    pub left: Box<Expr>,
    pub op: LogicalOp,
    pub right: Box<Expr>,
}

#[derive(Debug, Clone, Copy)]
pub enum LogicalOp {
    And, Or,
}

// ========================================================================
// Statements
// ========================================================================

#[derive(Debug, Clone)]
pub enum Stmt {
    Function(FunctionStmt),
    Return(ReturnStmt),
    Var(VarStmt),
    If(IfStmt),
    While(WhileStmt),
    Print(PrintStmt),
    Block(BlockStmt),
    Expression(ExpressionStmt),
}

#[derive(Debug, Clone)]
pub struct FunctionStmt {
    pub location: Location,  // Location of 'fun' keyword
    pub name: String,
    pub name_location: Location,  // Location of the function name
    pub params: Vec<String>,
    pub param_locations: Vec<Location>,  // Location of each parameter name
    pub body: Vec<Stmt>,
}

#[derive(Debug, Clone)]
pub struct ReturnStmt {
    pub location: Location,  // Location of 'return' keyword
    pub value: Option<Expr>,
}

#[derive(Debug, Clone)]
pub struct VarStmt {
    pub location: Location,  // Location of 'var' keyword
    pub name: String,
    pub name_location: Location,  // Location of the variable name
    pub init: Expr,
}

#[derive(Debug, Clone)]
pub struct IfStmt {
    pub location: Location,  // Location of 'if' keyword
    pub condition: Expr,
    pub then_branch: Vec<Stmt>,
    pub else_branch: Vec<Stmt>,
}

#[derive(Debug, Clone)]
pub struct WhileStmt {
    pub location: Location,  // Location of 'while' keyword
    pub condition: Expr,
    pub body: Vec<Stmt>,
}

#[derive(Debug, Clone)]
pub struct PrintStmt {
    pub location: Location,  // Location of 'print' keyword
    pub value: Expr,
}

#[derive(Debug, Clone)]
pub struct BlockStmt {
    pub location: Location,  // Location of opening '{'
    pub statements: Vec<Stmt>,
}

#[derive(Debug, Clone)]
pub struct ExpressionStmt {
    pub location: Location,  // Start of the expression
    pub expr: Expr,
}

// ========================================================================
// Program
// ========================================================================

/// A Lox program is a list of top-level statements
#[derive(Debug, Clone)]
pub struct Program {
    pub statements: Vec<Stmt>,
}

// ========================================================================
// Helper method for getting statement locations
// ========================================================================

impl Stmt {
    /// Get the primary location of this statement
    pub fn location(&self) -> Location {
        match self {
            Stmt::Function(f) => f.location,
            Stmt::Return(r) => r.location,
            Stmt::Var(v) => v.location,
            Stmt::If(i) => i.location,
            Stmt::While(w) => w.location,
            Stmt::Print(p) => p.location,
            Stmt::Block(b) => b.location,
            Stmt::Expression(e) => e.location,
        }
    }
}
}

Part 3: The Parser

The parser follows the recursive descent parser from Crafting Interpreters closely (Chapter 6 for expressions, Chapters 8 to 10 for statements, control flow, and functions). It expects a stream of Token values from a scanner. The token types are shown here; the scanner itself is a standard lexical analysis exercise (see Crafting Interpreters Chapter 4) and is not the focus of this tutorial.

#![allow(unused)]
fn main() {
// src/lexer.rs
use crate::ast::Location;

/// A single token produced by the scanner
#[derive(Debug, Clone)]
pub struct Token {
    pub token_type: TokenType,
    pub lexeme: String,
    pub literal: Option<LexValue>,
    pub location: Location,
}

/// The category of a token
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TokenType {
    // Single-character tokens
    LeftParen, RightParen, LeftBrace, RightBrace,
    Comma, Dot, Minus, Plus, Semicolon, Slash, Star,
    // One or two character tokens
    Bang, BangEqual, Equal, EqualEqual, Greater, GreaterEqual,
    Less, LessEqual,
    // Literals
    Identifier, String, Number,
    // Keywords
    And, Class, Else, False, Fun, For, If, Nil, Or,
    Print, Return, Super, This, True, Var, While,
    // Special
    Eof,
}

/// A literal value from the source
#[derive(Debug, Clone)]
pub enum LexValue {
    Boolean(bool),
    F64(f64),
    Str(String),
}

impl LexValue {
    pub fn as_number(&self) -> f64 {
        match self {
            LexValue::F64(n) => *n,
            _ => 0.0,
        }
    }

    pub fn as_string(&self) -> String {
        match self {
            LexValue::Str(s) => s.clone(),
            _ => String::new(),
        }
    }
}

impl TokenType {
    pub fn name(&self) -> &'static str {
        match self {
            TokenType::LeftParen => "(",
            TokenType::RightParen => ")",
            TokenType::LeftBrace => "{",
            TokenType::RightBrace => "}",
            TokenType::Comma => ",",
            TokenType::Dot => ".",
            TokenType::Minus => "-",
            TokenType::Plus => "+",
            TokenType::Semicolon => ";",
            TokenType::Slash => "/",
            TokenType::Star => "*",
            TokenType::Bang => "!",
            TokenType::BangEqual => "!=",
            TokenType::Equal => "=",
            TokenType::EqualEqual => "==",
            TokenType::Greater => ">",
            TokenType::GreaterEqual => ">=",
            TokenType::Less => "<",
            TokenType::LessEqual => "<=",
            TokenType::Identifier => "identifier",
            TokenType::String => "string",
            TokenType::Number => "number",
            TokenType::And => "and",
            TokenType::Class => "class",
            TokenType::Else => "else",
            TokenType::False => "false",
            TokenType::Fun => "fun",
            TokenType::For => "for",
            TokenType::If => "if",
            TokenType::Nil => "nil",
            TokenType::Or => "or",
            TokenType::Print => "print",
            TokenType::Return => "return",
            TokenType::Super => "super",
            TokenType::This => "this",
            TokenType::True => "true",
            TokenType::Var => "var",
            TokenType::While => "while",
            TokenType::Eof => "eof",
        }
    }
}
}
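
Since the scanner itself is not shown, here is a minimal sketch of one for a numbers-and-operators subset, just enough to see how source text becomes a token stream. It uses its own cut-down Kind enum and a scan function (both hypothetical names, not part of the guide's lexer.rs), returns plain (kind, lexeme) pairs instead of full Token values, and skips strings, keywords, comments, locations, and error reporting.

```rust
/// Cut-down token categories for this sketch only.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Number,
    Identifier,
    Plus,
    Minus,
    Star,
    Slash,
    LeftParen,
    RightParen,
    Semicolon,
    Eof,
}

/// Turns source text into (kind, lexeme) pairs, ending with `Eof`.
/// Unknown characters are silently skipped to keep the sketch short.
fn scan(source: &str) -> Vec<(Kind, String)> {
    let chars: Vec<char> = source.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;

    while i < chars.len() {
        let c = chars[i];
        let single = match c {
            '+' => Some(Kind::Plus),
            '-' => Some(Kind::Minus),
            '*' => Some(Kind::Star),
            '/' => Some(Kind::Slash),
            '(' => Some(Kind::LeftParen),
            ')' => Some(Kind::RightParen),
            ';' => Some(Kind::Semicolon),
            _ => None,
        };

        if let Some(kind) = single {
            tokens.push((kind, c.to_string()));
            i += 1;
        } else if c.is_ascii_digit() {
            // Consume digits and at most one fractional part, e.g. `12.5`.
            let start = i;
            while i < chars.len() && chars[i].is_ascii_digit() {
                i += 1;
            }
            if i + 1 < chars.len() && chars[i] == '.' && chars[i + 1].is_ascii_digit() {
                i += 1;
                while i < chars.len() && chars[i].is_ascii_digit() {
                    i += 1;
                }
            }
            tokens.push((Kind::Number, chars[start..i].iter().collect()));
        } else if c.is_alphabetic() || c == '_' {
            // Identifiers: a letter or underscore, then letters, digits, underscores.
            let start = i;
            while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            tokens.push((Kind::Identifier, chars[start..i].iter().collect()));
        } else {
            // Whitespace and anything unrecognised.
            i += 1;
        }
    }

    tokens.push((Kind::Eof, String::new()));
    tokens
}
```

A full scanner would also record a Location for each token and distinguish keywords from identifiers, which is what the Token struct above is designed to hold.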

Now the parser, adapted to produce the AST from Part 2:

#![allow(unused)]
fn main() {
// src/parser.rs
use crate::ast::*;
use crate::lexer::{Token, TokenType, LexValue};

#[derive(Debug)]
pub struct ParseError {
    pub message: String,
    pub location: Location,
}

impl ParseError {
    pub fn new(message: &str, location: Location) -> Self {
        Self { message: message.to_string(), location }
    }
}

pub struct Parser {
    tokens: Vec<Token>,
    current: usize,
}

impl Parser {
    pub fn new(tokens: Vec<Token>) -> Self {
        Self { tokens, current: 0 }
    }

    pub fn parse(&mut self) -> Result<Program, ParseError> {
        let mut statements = Vec::new();
        while !self.is_at_end() {
            statements.push(self.declaration()?);
        }
        Ok(Program { statements })
    }

    // ========================================================================
    // Statement parsing
    // ========================================================================

    fn declaration(&mut self) -> Result<Stmt, ParseError> {
        if self.match_token(TokenType::Fun) {
            return self.function_declaration();
        }
        if self.match_token(TokenType::Var) {
            return self.var_declaration();
        }
        self.statement()
    }

    fn function_declaration(&mut self) -> Result<Stmt, ParseError> {
        let location = self.previous().location;  // Location of 'fun'
        let name = self.consume(TokenType::Identifier, "Expect function name.")?;
        let name_location = name.location;
        let name_str = name.lexeme.clone();
        
        self.consume(TokenType::LeftParen, "Expect '(' after function name.")?;
        
        let mut params = Vec::new();
        let mut param_locations = Vec::new();
        if !self.check(TokenType::RightParen) {
            loop {
                let param = self.consume(TokenType::Identifier, "Expect parameter name.")?;
                params.push(param.lexeme.clone());
                param_locations.push(param.location);
                if !self.match_token(TokenType::Comma) { break; }
            }
        }
        self.consume(TokenType::RightParen, "Expect ')' after parameters.")?;
        self.consume(TokenType::LeftBrace, "Expect '{' before function body.")?;
        
        let body = self.block()?;
        
        Ok(Stmt::Function(FunctionStmt { 
            location, 
            name: name_str, 
            name_location,
            params, 
            param_locations,
            body 
        }))
    }

    fn var_declaration(&mut self) -> Result<Stmt, ParseError> {
        let location = self.previous().location;  // Location of 'var'
        let name = self.consume(TokenType::Identifier, "Expect variable name.")?;
        let name_location = name.location;
        let name_str = name.lexeme.clone();
        
        let init = if self.match_token(TokenType::Equal) {
            self.expression()?
        } else {
            Expr::Literal(LiteralExpr { location, value: LoxValue::Nil })
        };
        
        self.consume(TokenType::Semicolon, "Expect ';' after variable declaration.")?;
        Ok(Stmt::Var(VarStmt { location, name: name_str, name_location, init }))
    }

    fn statement(&mut self) -> Result<Stmt, ParseError> {
        if self.match_token(TokenType::Print) { return self.print_statement(); }
        if self.match_token(TokenType::If) { return self.if_statement(); }
        if self.match_token(TokenType::While) { return self.while_statement(); }
        if self.match_token(TokenType::Return) { return self.return_statement(); }
        if self.match_token(TokenType::LeftBrace) {
            let location = self.previous().location;
            return Ok(Stmt::Block(BlockStmt { location, statements: self.block()? }));
        }
        self.expression_statement()
    }

    fn print_statement(&mut self) -> Result<Stmt, ParseError> {
        let location = self.previous().location;  // Location of 'print'
        let value = self.expression()?;
        self.consume(TokenType::Semicolon, "Expect ';' after value.")?;
        Ok(Stmt::Print(PrintStmt { location, value }))
    }

    fn if_statement(&mut self) -> Result<Stmt, ParseError> {
        let location = self.previous().location;  // Location of 'if'
        self.consume(TokenType::LeftParen, "Expect '(' after 'if'.")?;
        let condition = self.expression()?;
        self.consume(TokenType::RightParen, "Expect ')' after if condition.")?;
        
        let then_branch = vec![self.statement()?];
        let else_branch = if self.match_token(TokenType::Else) {
            vec![self.statement()?]
        } else {
            Vec::new()
        };
        
        Ok(Stmt::If(IfStmt { location, condition, then_branch, else_branch }))
    }

    fn while_statement(&mut self) -> Result<Stmt, ParseError> {
        let location = self.previous().location;  // Location of 'while'
        self.consume(TokenType::LeftParen, "Expect '(' after 'while'.")?;
        let condition = self.expression()?;
        self.consume(TokenType::RightParen, "Expect ')' after while condition.")?;
        
        let body = vec![self.statement()?];
        
        Ok(Stmt::While(WhileStmt { location, condition, body }))
    }

    fn return_statement(&mut self) -> Result<Stmt, ParseError> {
        let location = self.previous().location;  // Location of 'return'
        let value = if !self.check(TokenType::Semicolon) {
            Some(self.expression()?)
        } else {
            None
        };
        self.consume(TokenType::Semicolon, "Expect ';' after return value.")?;
        Ok(Stmt::Return(ReturnStmt { location, value }))
    }

    fn expression_statement(&mut self) -> Result<Stmt, ParseError> {
        let expr = self.expression()?;
        let location = expr.location();  // Start of the expression
        self.consume(TokenType::Semicolon, "Expect ';' after expression.")?;
        Ok(Stmt::Expression(ExpressionStmt { location, expr }))
    }

    fn block(&mut self) -> Result<Vec<Stmt>, ParseError> {
        let mut statements = Vec::new();
        while !self.check(TokenType::RightBrace) && !self.is_at_end() {
            statements.push(self.declaration()?);
        }
        self.consume(TokenType::RightBrace, "Expect '}' after block.")?;
        Ok(statements)
    }

    // ========================================================================
    // Expression parsing (recursive descent, one method per precedence level)
    // ========================================================================

    fn expression(&mut self) -> Result<Expr, ParseError> {
        self.assignment()
    }

    fn assignment(&mut self) -> Result<Expr, ParseError> {
        let expr = self.or_expr()?;
        
        if self.match_token(TokenType::Equal) {
            let location = self.previous().location;
            let value = self.assignment()?;
            
            if let Expr::Variable(var) = expr {
                return Ok(Expr::Assign(AssignExpr {
                    location,
                    name: var.name,
                    value: Box::new(value),
                }));
            }
            return Err(ParseError::new("Invalid assignment target.", location));
        }
        Ok(expr)
    }

    fn or_expr(&mut self) -> Result<Expr, ParseError> {
        let mut expr = self.and_expr()?;
        
        while self.match_token(TokenType::Or) {
            let location = self.previous().location;
            let right = self.and_expr()?;
            expr = Expr::Logical(LogicalExpr {
                location,
                left: Box::new(expr),
                op: LogicalOp::Or,
                right: Box::new(right),
            });
        }
        Ok(expr)
    }

    fn and_expr(&mut self) -> Result<Expr, ParseError> {
        let mut expr = self.equality()?;
        
        while self.match_token(TokenType::And) {
            let location = self.previous().location;
            let right = self.equality()?;
            expr = Expr::Logical(LogicalExpr {
                location,
                left: Box::new(expr),
                op: LogicalOp::And,
                right: Box::new(right),
            });
        }
        Ok(expr)
    }

    fn equality(&mut self) -> Result<Expr, ParseError> {
        let mut expr = self.comparison()?;
        
        while self.match_any(&[TokenType::EqualEqual, TokenType::BangEqual]) {
            let location = self.previous().location;
            let op = match self.previous().token_type {
                TokenType::EqualEqual => BinaryOp::Equal,
                TokenType::BangEqual => BinaryOp::NotEqual,
                _ => unreachable!(),
            };
            let right = self.comparison()?;
            expr = Expr::Binary(BinaryExpr {
                location,
                left: Box::new(expr),
                op,
                right: Box::new(right),
            });
        }
        Ok(expr)
    }

    fn comparison(&mut self) -> Result<Expr, ParseError> {
        let mut expr = self.term()?;
        
        while self.match_any(&[
            TokenType::Greater, TokenType::GreaterEqual,
            TokenType::Less, TokenType::LessEqual,
        ]) {
            let location = self.previous().location;
            let op = match self.previous().token_type {
                TokenType::Greater => BinaryOp::Greater,
                TokenType::GreaterEqual => BinaryOp::GreaterEqual,
                TokenType::Less => BinaryOp::Less,
                TokenType::LessEqual => BinaryOp::LessEqual,
                _ => unreachable!(),
            };
            let right = self.term()?;
            expr = Expr::Binary(BinaryExpr {
                location,
                left: Box::new(expr),
                op,
                right: Box::new(right),
            });
        }
        Ok(expr)
    }

    fn term(&mut self) -> Result<Expr, ParseError> {
        let mut expr = self.factor()?;
        
        while self.match_any(&[TokenType::Plus, TokenType::Minus]) {
            let location = self.previous().location;
            let op = match self.previous().token_type {
                TokenType::Plus => BinaryOp::Add,
                TokenType::Minus => BinaryOp::Sub,
                _ => unreachable!(),
            };
            let right = self.factor()?;
            expr = Expr::Binary(BinaryExpr {
                location,
                left: Box::new(expr),
                op,
                right: Box::new(right),
            });
        }
        Ok(expr)
    }

    fn factor(&mut self) -> Result<Expr, ParseError> {
        let mut expr = self.unary()?;
        
        while self.match_any(&[TokenType::Star, TokenType::Slash]) {
            let location = self.previous().location;
            let op = match self.previous().token_type {
                TokenType::Star => BinaryOp::Mul,
                TokenType::Slash => BinaryOp::Div,
                _ => unreachable!(),
            };
            let right = self.unary()?;
            expr = Expr::Binary(BinaryExpr {
                location,
                left: Box::new(expr),
                op,
                right: Box::new(right),
            });
        }
        Ok(expr)
    }

    fn unary(&mut self) -> Result<Expr, ParseError> {
        if self.match_any(&[TokenType::Bang, TokenType::Minus]) {
            let location = self.previous().location;
            let op = match self.previous().token_type {
                TokenType::Bang => UnaryOp::Not,
                TokenType::Minus => UnaryOp::Negate,
                _ => unreachable!(),
            };
            let right = Box::new(self.unary()?);
            return Ok(Expr::Unary(UnaryExpr { location, op, right }));
        }
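        self.call()
    }

    /// Parses a primary expression followed by zero or more call suffixes,
    /// e.g. `f(1)(2)`, so that `Expr::Call` nodes are actually produced.
    fn call(&mut self) -> Result<Expr, ParseError> {
        let mut expr = self.primary()?;

        while self.match_token(TokenType::LeftParen) {
            let location = self.previous().location;
            let mut arguments = Vec::new();
            if !self.check(TokenType::RightParen) {
                loop {
                    arguments.push(self.expression()?);
                    if !self.match_token(TokenType::Comma) { break; }
                }
            }
            self.consume(TokenType::RightParen, "Expect ')' after arguments.")?;
            expr = Expr::Call(CallExpr {
                location,
                callee: Box::new(expr),
                arguments,
            });
        }
        Ok(expr)
    }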

    fn primary(&mut self) -> Result<Expr, ParseError> {
        let token = self.peek();
        let location = token.location;
        
        if self.match_token(TokenType::False) {
            return Ok(Expr::Literal(LiteralExpr { location, value: LoxValue::Bool(false) }));
        }
        if self.match_token(TokenType::True) {
            return Ok(Expr::Literal(LiteralExpr { location, value: LoxValue::Bool(true) }));
        }
        if self.match_token(TokenType::Nil) {
            return Ok(Expr::Literal(LiteralExpr { location, value: LoxValue::Nil }));
        }
        if self.match_token(TokenType::Number) {
            let value = token.literal.as_ref().unwrap().as_number();
            return Ok(Expr::Literal(LiteralExpr { location, value: LoxValue::Number(value) }));
        }
        if self.match_token(TokenType::String) {
            let value = token.literal.as_ref().unwrap().as_string();
            return Ok(Expr::Literal(LiteralExpr { location, value: LoxValue::String(value) }));
        }
        if self.match_token(TokenType::Identifier) {
            return Ok(Expr::Variable(VariableExpr { location, name: token.lexeme.clone() }));
        }
        if self.match_token(TokenType::LeftParen) {
            let expr = self.expression()?;
            self.consume(TokenType::RightParen, "Expect ')' after expression.")?;
            return Ok(Expr::Grouping(GroupingExpr {
                location,
                expr: Box::new(expr),
            }));
        }
        
        Err(ParseError::new("Expect expression.", location))
    }

    // ========================================================================
    // Helper methods
    // ========================================================================

    fn match_token(&mut self, token_type: TokenType) -> bool {
        if self.check(token_type) {
            self.advance();
            return true;
        }
        false
    }

    fn match_any(&mut self, types: &[TokenType]) -> bool {
        for t in types {
            if self.check(*t) {
                self.advance();
                return true;
            }
        }
        false
    }

    fn check(&self, token_type: TokenType) -> bool {
        !self.is_at_end() && self.peek().token_type == token_type
    }

    fn advance(&mut self) -> Token {
        if !self.is_at_end() {
            self.current += 1;
        }
        self.previous()
    }

    fn is_at_end(&self) -> bool {
        self.peek().token_type == TokenType::Eof
    }

    fn peek(&self) -> Token {
        self.tokens[self.current].clone()
    }

    fn previous(&self) -> Token {
        self.tokens[self.current - 1].clone()
    }

    fn consume(&mut self, token_type: TokenType, message: &str) -> Result<Token, ParseError> {
        if self.check(token_type) {
            return Ok(self.advance());
        }
        Err(ParseError::new(message, self.peek().location))
    }
}

}
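
The layering of term, factor, and unary is what encodes operator precedence: each method only calls the next-tighter level for its operands. The stand-alone sketch below shrinks that idea to single-digit numbers and four operators so the resulting tree shape can be printed and checked. MiniExpr, MiniParser, and to_sexpr are hypothetical names for illustration, not part of the guide's parser.

```rust
/// Tiny expression tree for the sketch.
#[derive(Debug)]
enum MiniExpr {
    Number(f64),
    Binary(Box<MiniExpr>, char, Box<MiniExpr>),
}

/// Parser over single characters; whitespace is removed up front.
struct MiniParser {
    chars: Vec<char>,
    current: usize,
}

impl MiniParser {
    fn new(source: &str) -> Self {
        let chars = source.chars().filter(|c| !c.is_whitespace()).collect();
        Self { chars, current: 0 }
    }

    /// Consumes the next character if it is one of `ops`.
    fn match_any(&mut self, ops: &[char]) -> Option<char> {
        let c = *self.chars.get(self.current)?;
        if ops.contains(&c) {
            self.current += 1;
            Some(c)
        } else {
            None
        }
    }

    /// Lowest precedence: `+` and `-`, left-associative.
    fn term(&mut self) -> Option<MiniExpr> {
        let mut expr = self.factor()?;
        while let Some(op) = self.match_any(&['+', '-']) {
            let right = self.factor()?;
            expr = MiniExpr::Binary(Box::new(expr), op, Box::new(right));
        }
        Some(expr)
    }

    /// Higher precedence: `*` and `/`.
    fn factor(&mut self) -> Option<MiniExpr> {
        let mut expr = self.primary()?;
        while let Some(op) = self.match_any(&['*', '/']) {
            let right = self.primary()?;
            expr = MiniExpr::Binary(Box::new(expr), op, Box::new(right));
        }
        Some(expr)
    }

    /// Single digits only, to keep the sketch short.
    fn primary(&mut self) -> Option<MiniExpr> {
        let digit = self.chars.get(self.current)?.to_digit(10)?;
        self.current += 1;
        Some(MiniExpr::Number(f64::from(digit)))
    }
}

/// Renders the tree as an S-expression so its shape is visible.
fn to_sexpr(expr: &MiniExpr) -> String {
    match expr {
        MiniExpr::Number(n) => n.to_string(),
        MiniExpr::Binary(l, op, r) => format!("({} {} {})", op, to_sexpr(l), to_sexpr(r)),
    }
}
```

Parsing "1 + 2 * 3" with term() and rendering it with to_sexpr gives (+ 1 (* 2 3)): the multiplication ends up deeper in the tree because factor is called from inside term, never the other way around.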

Part 4: MLIR Code Generation with Melior

The core of the compiler — walks the AST and emits MLIR.

Note on the codegen model: The codegen uses a simplified current_block: Option<Block> pattern for exposition. In production Melior code, blocks are created and immediately appended to regions (the "build then insert" pattern). The current_block ownership issue — where a block is moved into a region and can't be accessed again — is real, and production code avoids it by building regions first, then appending. For this tutorial, the simplified pattern helps you follow the logic without getting lost in ownership boilerplate. See the review notes for the production approach.
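
The ownership issue described in this note can be reproduced without Melior. The sketch below uses stand-in Region and Block types (hypothetical, and deliberately not Melior's API) to show why a block that has been moved into a region can no longer be used through its old binding, and two ways around that: finish the block before inserting it, or keep working through the reference that the append call hands back.

```rust
// Stand-in types for illustration only; these are NOT Melior's types.
#[derive(Debug, Default)]
struct Block {
    ops: Vec<String>,
}

impl Block {
    fn append_operation(&mut self, op: &str) {
        self.ops.push(op.to_string());
    }
}

#[derive(Debug, Default)]
struct Region {
    blocks: Vec<Block>,
}

impl Region {
    /// Takes ownership of the block and hands back a mutable reference to it.
    fn append_block(&mut self, block: Block) -> &mut Block {
        self.blocks.push(block);
        self.blocks.last_mut().expect("a block was just pushed")
    }
}

/// Pattern 1: build the block completely, then insert it.
fn build_then_insert() -> Region {
    let mut block = Block::default();
    block.append_operation("arith.constant");
    block.append_operation("func.return");

    let mut region = Region::default();
    region.append_block(block);
    // `block` has been moved; using it here would be a compile error.
    region
}

/// Pattern 2: insert first, then keep working through the returned reference.
fn insert_then_build() -> Region {
    let mut region = Region::default();
    let block = region.append_block(Block::default());
    block.append_operation("arith.constant");
    block.append_operation("func.return");
    region
}
```

Both functions produce the same region. The simplified current_block: Option<Block> field in this guide sits between the two patterns, which is why it reads easily but needs care around moves.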

Scope note: In this part, we compile a subset of Lox that only supports numbers and arithmetic. This isn't a limitation of MLIR — it's a pedagogical choice. Dynamic typing with tagged unions adds 3-4x more code for every operation (check the tag, dispatch, unbox, compute, re-box). We'll cover dynamic typing in the "Tagged Unions" section below. For now, every value is an f64.

What this means in practice:

  • All values are f64: true is 1.0, false is 0.0, nil is 0.0
  • Arithmetic operations use arith.addf, arith.mulf, etc.
  • Comparisons use arith.cmpf
  • Logical operators use scf.if for short-circuit evaluation
  • lox.print calls a runtime function that prints a float
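
As a sketch of the "everything is an f64" convention listed above, the following stand-alone function maps Lox literals to the float the compiler would work with. The lower_literal and is_truthy names are illustrative, and both the handling of strings (reported as unsupported) and the non-zero truthiness rule are assumptions of this sketch rather than something the guide specifies.

```rust
/// Mirror of the guide's `LoxValue`, repeated so the example stands alone.
#[derive(Debug, Clone, PartialEq)]
enum LoxValue {
    Nil,
    Bool(bool),
    Number(f64),
    String(String),
}

/// Lowers a literal to the f64 used in this part of the guide.
/// Returns `None` for strings, which the numbers-only subset cannot represent.
fn lower_literal(value: &LoxValue) -> Option<f64> {
    match value {
        LoxValue::Nil => Some(0.0),
        LoxValue::Bool(false) => Some(0.0),
        LoxValue::Bool(true) => Some(1.0),
        LoxValue::Number(n) => Some(*n),
        LoxValue::String(_) => None,
    }
}

/// Assumed rule for this sketch: a condition holds when the float is non-zero.
fn is_truthy(encoded: f64) -> bool {
    encoded != 0.0
}
```

One consequence of this encoding is that nil, false, and the number 0 all become 0.0 and cannot be told apart afterwards. In standard Lox only nil and false are falsy, so a program that branches on 0 would behave differently under this simplification.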

This is a simplification. A production Lox compiler would use tagged unions. But starting with "everything is a float" lets us focus on MLIR concepts without drowning in type-tag boilerplate.
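
To make the cost of tagged unions concrete, here is a plain-Rust sketch of what a dynamically typed + has to do: check the tags, dispatch, unbox, compute, and re-box. The Tagged enum and the dynamic_add and static_add functions are hypothetical names for illustration; a real compiler would emit the equivalent checks as MLIR operations rather than run them in Rust.

```rust
/// A boxed Lox value: the enum discriminant plays the role of the type tag.
#[derive(Debug, Clone, PartialEq)]
enum Tagged {
    Nil,
    Bool(bool),
    Number(f64),
    Str(String),
}

/// Dynamically typed `+`: numbers add, strings concatenate, anything else
/// is a runtime type error.
fn dynamic_add(left: &Tagged, right: &Tagged) -> Result<Tagged, String> {
    match (left, right) {
        // Check both tags, unbox the floats, compute, re-box.
        (Tagged::Number(a), Tagged::Number(b)) => Ok(Tagged::Number(a + b)),
        // Same steps for the string case.
        (Tagged::Str(a), Tagged::Str(b)) => Ok(Tagged::Str(format!("{a}{b}"))),
        _ => Err("Operands must be two numbers or two strings.".to_string()),
    }
}

/// The numbers-only subset used in this part needs none of that.
fn static_add(left: f64, right: f64) -> f64 {
    left + right
}
```

Every binary operator, comparison, and unary operator needs its own version of the match in dynamic_add, which is where the extra code volume comes from.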

Basic Code Generator

#![allow(unused)]
fn main() {
// src/codegen/mod.rs
mod generator;

pub use generator::generate_module;
}

#![allow(unused)]
fn main() {
// src/codegen/generator.rs
use crate::ast::*;
use melior::{
    Context,
    dialect::{arith, func, scf, DialectRegistry},
    ir::{
        attribute::{StringAttribute, TypeAttribute, FloatAttribute, IntegerAttribute},
        r#type::FunctionType,
        Location, Module, Region, Block, Type, Value,
        operation::OperationBuilder,
    },
    utility::register_all_dialects,
};

/// State for code generation
pub struct CodeGenerator<'c> {
    context: &'c Context,
    module: Module<'c>,
    current_block: Option<Block<'c>>,
    // Variable storage: maps variable names to their SSA values
    variables: std::collections::HashMap<String, Value<'c>>,
}

impl<'c> CodeGenerator<'c> {
    pub fn new(context: &'c Context) -> Self {
        let location = Location::unknown(context);
        let module = Module::new(location);
        Self { 
            context, 
            module, 
            current_block: None,
            variables: std::collections::HashMap::new(),
        }
    }

    // ========================================================================
    // Entry point
    // ========================================================================

    pub fn generate(mut self, program: &Program) -> Module<'c> {
        for stmt in &program.statements {
            self.compile_statement(stmt);
        }
        self.module
    }

    // ========================================================================
    // Statement compilation
    // ========================================================================

    fn compile_statement(&mut self, stmt: &Stmt) {
        match stmt {
            Stmt::Function(f) => self.compile_function(f),
            Stmt::Return(r) => self.compile_return(r),
            Stmt::Var(v) => self.compile_var(v),
            Stmt::If(i) => self.compile_if(i),
            Stmt::While(w) => self.compile_while(w),
            Stmt::Print(p) => self.compile_print(p),
            Stmt::Block(b) => self.compile_block(b),
            Stmt::Expression(e) => { self.compile_expression(&e.expr); }
        }
    }

    fn compile_function(&mut self, func: &FunctionStmt) {
        let location = Location::unknown(self.context);
        let float_type = Type::float64(self.context);
        
        // Create parameter types (all f64 for now; dynamic typing comes later)
        let param_types: Vec<Type> = func.params.iter().map(|_| float_type).collect();
        let return_type = float_type;
        
        // Create the function type
        let function_type = FunctionType::new(self.context, &param_types, &[return_type]);
        
        // Create the function body region
        let region = Region::new();
        let block = Block::new(
            &param_types.iter().map(|&t| (t, location)).collect::<Vec<_>>()
        );
        
        // Store parameters as variables
        for (i, param_name) in func.params.iter().enumerate() {
            let arg = block.argument(i).unwrap();
            self.variables.insert(param_name.clone(), arg.into());
        }
        
        // Set current block for body compilation
        self.current_block = Some(block);
        
        // Compile the function body
        for stmt in &func.body {
            self.compile_statement(stmt);
        }
        
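        // Add an implicit `return nil` unless the body already ends in a
        // return; a block may only contain one terminator.
        if !matches!(func.body.last(), Some(Stmt::Return(_))) {
            let nil_value = self.compile_nil();
            if let Some(block) = &self.current_block {
                block.append_operation(func::r#return(&[nil_value], location));
            }
        }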
        
        // Append block to region
        if let Some(block) = self.current_block.take() {
            region.append_block(block);
        }
        
        // Add function to module
        self.module.body().append_operation(func::func(
            self.context,
            StringAttribute::new(self.context, &func.name),
            TypeAttribute::new(function_type.into()),
            region,
            &[],
            location,
        ));
        
        // Clear variables after function
        self.variables.clear();
    }

    fn compile_return(&mut self, ret: &ReturnStmt) {
        let location = Location::unknown(self.context);
        let value = match &ret.value {
            Some(expr) => self.compile_expression(expr),
            None => self.compile_nil(),
        };
        
        if let Some(block) = &self.current_block {
            block.append_operation(func::r#return(&[value], location));
        }
    }

    fn compile_var(&mut self, var: &VarStmt) {
        let value = self.compile_expression(&var.init);
        self.variables.insert(var.name.clone(), value);
    }

    fn compile_if(&mut self, if_stmt: &IfStmt) {
        let location = Location::unknown(self.context);
        let condition = self.compile_expression(&if_stmt.condition);
        
        // Create scf.if operation
        let then_region = Region::new();
        let then_block = then_region.append_block(Block::new(&[]));
        
        // Compile then branch
        let prev_block = self.current_block.replace(then_block);
        for stmt in &if_stmt.then_branch {
            self.compile_statement(stmt);
        }
        
        // Handle else branch
        let else_region = if !if_stmt.else_branch.is_empty() {
            let else_region = Region::new();
            let else_block = else_region.append_block(Block::new(&[]));
            self.current_block = Some(else_block);
            
            for stmt in &if_stmt.else_branch {
                self.compile_statement(stmt);
            }
            
            // Restore block reference (Melior ownership is tricky here)
            // In a real impl, we'd need to handle this more carefully
            Some(else_region)
        } else {
            None
        };
        
        self.current_block = prev_block;
        
        // Append scf.if to current block
        // Note: scf.if always carries two regions (then and else).
        // For a single-branch if statement the else region is left empty.
        if let Some(block) = &self.current_block {
            // Reuse the compiled else region; fall back to an empty one.
            let else_region = else_region.unwrap_or_else(Region::new);
            let if_op = OperationBuilder::new("scf.if", location)
                .add_operand(condition)
                .add_region(then_region)
                .add_region(else_region)
                .build()
                .unwrap();
            block.append_operation(if_op);
        }
    }

    fn compile_while(&mut self, while_stmt: &WhileStmt) {
        let location = Location::unknown(self.context);
        
        // For scf.while, we need:
        // 1. A "before" region that computes the condition
        // 2. An "after" region that is the loop body
        
        let before_region = Region::new();
        let before_block = before_region.append_block(Block::new(&[]));
        
        // Compile condition in before block
        let prev_block = self.current_block.replace(before_block);
        let condition = self.compile_expression(&while_stmt.condition);
        
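        // Add scf.condition to the before block, which now lives in
        // self.current_block (it was handed over by `replace` above)
        let condition_op = OperationBuilder::new("scf.condition", location)
            .add_operand(condition)
            .build()
            .unwrap();
        if let Some(block) = &self.current_block {
            block.append_operation(condition_op);
        }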
        
        // Create after region (loop body)
        let after_region = Region::new();
        let after_block = after_region.append_block(Block::new(&[]));
        
        self.current_block = Some(after_block);
        for stmt in &while_stmt.body {
            self.compile_statement(stmt);
        }
        
        // Add scf.yield
        let yield_op = OperationBuilder::new("scf.yield", location)
            .build()
            .unwrap();
        if let Some(block) = &self.current_block {
            block.append_operation(yield_op);
        }
        
        self.current_block = prev_block;
        
        // Create scf.while
        if let Some(block) = &self.current_block {
            let while_op = OperationBuilder::new("scf.while", location)
                .add_region(before_region)
                .add_region(after_region)
                .build()
                .unwrap();
            block.append_operation(while_op);
        }
    }

    fn compile_print(&mut self, print: &PrintStmt) {
        let location = Location::unknown(self.context);
        let value = self.compile_expression(&print.value);
        
        // Create a call to a runtime print function
        // For simplicity, we'll use a placeholder operation
        let print_op = OperationBuilder::new("lox.print", location)
            .add_operand(value)
            .build()
            .unwrap();
        
        if let Some(block) = &self.current_block {
            block.append_operation(print_op);
        }
    }

    fn compile_block(&mut self, block: &BlockStmt) {
        for stmt in &block.statements {
            self.compile_statement(stmt);
        }
    }

    // ========================================================================
    // Expression compilation
    // ========================================================================

    fn compile_expression(&mut self, expr: &Expr) -> Value<'c> {
        let location = Location::unknown(self.context);
        
        let op = match expr {
            Expr::Binary(b) => self.compile_binary(b),
            Expr::Unary(u) => self.compile_unary(u),
            Expr::Literal(l) => return self.compile_literal(l),
            Expr::Grouping(g) => return self.compile_expression(&g.expr),
            Expr::Variable(v) => return self.compile_variable(v),
            Expr::Assign(a) => return self.compile_assign(a),
            Expr::Call(c) => self.compile_call(c),
            Expr::Logical(l) => self.compile_logical(l),
        };
        
        // Get the result from the operation
        op.result(0).unwrap().into()
    }

    fn compile_binary(&mut self, binary: &BinaryExpr) -> melior::ir::Operation<'c> {
        let location = Location::unknown(self.context);
        
        let lhs = self.compile_expression(&binary.left);
        let rhs = self.compile_expression(&binary.right);
        
        let op = match binary.op {
            BinaryOp::Add => arith::addf(lhs, rhs, location),
            BinaryOp::Sub => arith::subf(lhs, rhs, location),
            BinaryOp::Mul => arith::mulf(lhs, rhs, location),
            BinaryOp::Div => arith::divf(lhs, rhs, location),
            BinaryOp::Less => arith::cmpf(self.context, arith::CmpfPredicate::Olt, lhs, rhs, location),
            BinaryOp::LessEqual => arith::cmpf(self.context, arith::CmpfPredicate::Ole, lhs, rhs, location),
            BinaryOp::Greater => arith::cmpf(self.context, arith::CmpfPredicate::Ogt, lhs, rhs, location),
            BinaryOp::GreaterEqual => arith::cmpf(self.context, arith::CmpfPredicate::Oge, lhs, rhs, location),
            BinaryOp::Equal => arith::cmpf(self.context, arith::CmpfPredicate::Oeq, lhs, rhs, location),
            BinaryOp::NotEqual => arith::cmpf(self.context, arith::CmpfPredicate::Une, lhs, rhs, location),
        };
        
        // Append to current block
        if let Some(block) = &self.current_block {
            block.append_operation(op.clone());
        }
        
        op
    }

    fn compile_unary(&mut self, unary: &UnaryExpr) -> melior::ir::Operation<'c> {
        let location = Location::unknown(self.context);
        let operand = self.compile_expression(&unary.right);
        
        match unary.op {
            UnaryOp::Negate => {
                let op = arith::negf(operand, location);
                if let Some(block) = &self.current_block {
                    block.append_operation(op.clone());
                }
                op
            }
            UnaryOp::Not => {
                // In our "numbers only" model, `not x` is:
                //   if x == 0.0 { 1.0 } else { 0.0 }
                // We use scf.if since there's no direct float negation of boolean sense.
                
                let zero_const = arith::constant(
                    self.context,
                    FloatAttribute::new(self.context, 0.0, Type::float64(self.context)).into(),
                    Location::unknown(self.context),
                );
                
                // Build then region (x == 0.0 → result = 1.0)
                let then_region = {
                    let r = Region::new();
                    let b = r.append_block(Block::new(&[]));
                    let one = b.append_operation(arith::constant(
                        self.context,
                        FloatAttribute::new(self.context, 1.0, Type::float64(self.context)).into(),
                        location,
                    ));
                    // A result-bearing scf.if region must end in scf.yield
                    b.append_operation(
                        OperationBuilder::new("scf.yield", location)
                            .add_operand(one.result(0).unwrap().into())
                            .build()
                            .unwrap(),
                    );
                    r
                };
                
                // Build else region (x != 0.0 → result = 0.0)
                let else_region = {
                    let r = Region::new();
                    let b = r.append_block(Block::new(&[]));
                    let zero = b.append_operation(arith::constant(
                        self.context,
                        FloatAttribute::new(self.context, 0.0, Type::float64(self.context)).into(),
                        location,
                    ));
                    b.append_operation(
                        OperationBuilder::new("scf.yield", location)
                            .add_operand(zero.result(0).unwrap().into())
                            .build()
                            .unwrap(),
                    );
                    r
                };
                
                // Compare operand to 0.0
                if let Some(block) = &self.current_block {
                    block.append_operation(zero_const.clone());
                    let is_zero = arith::cmpf(
                        self.context,
                        arith::CmpfPredicate::Oeq,
                        operand,
                        zero_const.result(0).unwrap().into(),
                        Location::unknown(self.context),
                    );
                    // Append a clone: is_zero is still needed for the scf.if operand
                    block.append_operation(is_zero.clone());
                    
                    // scf.if(is_zero) { then_region } { else_region } -> f64
                    let if_op = OperationBuilder::new("scf.if", location)
                        .add_operand(is_zero.result(0).unwrap().into())
                        .add_result(Type::float64(self.context))
                        .add_region(then_region)
                        .add_region(else_region)
                        .build()
                        .unwrap();
                    // Append a clone: if_op itself is returned to the caller
                    block.append_operation(if_op.clone());
                    
                    if_op
                } else {
                    // Fallback if no current block (shouldn't happen in practice)
                    zero_const
                }
            }
        }
    }

    fn compile_literal(&mut self, literal: &LiteralExpr) -> Value<'c> {
        let location = Location::unknown(self.context);
        
        let op = match &literal.value {
            LoxValue::Nil => {
                return self.compile_nil();
            }
            LoxValue::Bool(b) => {
                // In our "numbers only" model, booleans are 1.0 and 0.0
                arith::constant(
                    self.context,
                    FloatAttribute::new(self.context, if *b { 1.0 } else { 0.0 }, Type::float64(self.context)).into(),
                    location,
                )
            }
            LoxValue::Number(n) => {
                arith::constant(
                    self.context,
                    FloatAttribute::new(self.context, *n, Type::float64(self.context)).into(),
                    location,
                )
            }
            LoxValue::String(s) => {
                // String constants are global - no heap allocation!
                // See the String Constants section below
                return self.compile_string(s, location);
            }
        };
        
        if let Some(block) = &self.current_block {
            block.append_operation(op.clone());
        }
        
        op.result(0).unwrap().into()
    }

    fn compile_nil(&mut self) -> Value<'c> {
        // In our "numbers only" subset, nil is represented as 0.0 f64.
        // This is consistent with the simplified typing model.
        // A full implementation would use tagged unions.
        let location = Location::unknown(self.context);
        let op = arith::constant(
            self.context,
            FloatAttribute::new(self.context, 0.0, Type::float64(self.context)).into(),
            location,
        );
        
        if let Some(block) = &self.current_block {
            block.append_operation(op.clone());
        }
        
        op.result(0).unwrap().into()
    }

    fn compile_variable(&mut self, var: &VariableExpr) -> Value<'c> {
        // Look up the variable in the current scope
        self.variables.get(&var.name)
            .copied()
            .unwrap_or_else(|| self.compile_nil())
    }

    fn compile_assign(&mut self, assign: &AssignExpr) -> Value<'c> {
        let value = self.compile_expression(&assign.value);
        self.variables.insert(assign.name.clone(), value);
        value
    }

    fn compile_call(&mut self, call: &CallExpr) -> melior::ir::Operation<'c> {
        let location = Location::unknown(self.context);
        
        // Compile arguments
        let args: Vec<Value> = call.arguments.iter()
            .map(|arg| self.compile_expression(arg))
            .collect();
        
        // For now, assume callee is a direct function call
        if let Expr::Variable(var) = call.callee.as_ref() {
            let call_op = func::call(
                self.context,
                melior::ir::attribute::FlatSymbolRefAttribute::new(self.context, &var.name),
                &args,
                &[Type::float64(self.context)],
                location,
            );
            
            if let Some(block) = &self.current_block {
                block.append_operation(call_op.clone());
            }
            
            return call_op;
        }
        
        // Indirect call (first-class function) - not implemented
        unimplemented!("Indirect function calls not yet supported")
    }

    fn compile_logical(&mut self, logical: &LogicalExpr) -> melior::ir::Operation<'c> {
        let location = Location::unknown(self.context);
        
        // Logical operations short-circuit in Lox, so we MUST use scf.if.
        // Using arith::andi/ori would be WRONG — those are bitwise, not short-circuit.
        //
        // `a and b` → if a { b } else { false }
        // `a or b`   → if a { true } else { b }

        let left = self.compile_expression(&logical.left);

        // Convert left to i1 for the condition (nonzero = true)
        let zero = arith::constant(
            self.context,
            FloatAttribute::new(self.context, 0.0, Type::float64(self.context)).into(),
            location,
        );
        if let Some(block) = &self.current_block {
            block.append_operation(zero.clone());
        }
        let cond = arith::cmpf(
            self.context,
            arith::CmpfPredicate::One,  // ordered not-equal (nonzero = true)
            left,
            zero.result(0).unwrap().into(),
            location,
        );
        if let Some(block) = &self.current_block {
            block.append_operation(cond.clone());
        }

        match logical.op {
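            LogicalOp::And => {
                // if left { right } else { 0.0 }
                let then_block = Block::new(&[]);
                let else_block = Block::new(&[]);

                // Then region: evaluate the right operand and yield it
                let prev = self.current_block.replace(then_block);
                let right = self.compile_expression(&logical.right);
                let then_yield = OperationBuilder::new("scf.yield", location)
                    .add_operand(right)
                    .build()
                    .unwrap();
                if let Some(block) = &self.current_block {
                    block.append_operation(then_yield);
                }
                let then_block = self.current_block.take().unwrap();

                // Else region: yield 0.0 (false) without evaluating the right operand
                self.current_block = Some(else_block);
                let false_val = arith::constant(
                    self.context,
                    FloatAttribute::new(self.context, 0.0, Type::float64(self.context)).into(),
                    location,
                );
                let else_yield = OperationBuilder::new("scf.yield", location)
                    .add_operand(false_val.result(0).unwrap().into())
                    .build()
                    .unwrap();
                if let Some(block) = &self.current_block {
                    block.append_operation(false_val.clone());
                    block.append_operation(else_yield);
                }
                let else_block = self.current_block.take().unwrap();

                self.current_block = prev;

                // This function returns an Operation, not a Result, so unwrap instead of `?`
                let if_op = OperationBuilder::new("scf.if", location)
                    .add_operand(cond.result(0).unwrap().into())
                    .add_result(Type::float64(self.context))
                    .add_region({
                        let region = Region::new();
                        region.append_block(then_block);
                        region
                    })
                    .add_region({
                        let region = Region::new();
                        region.append_block(else_block);
                        region
                    })
                    .build()
                    .unwrap();

                if let Some(block) = &self.current_block {
                    block.append_operation(if_op.clone());
                }
                if_op
            }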
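            LogicalOp::Or => {
                // if left { 1.0 } else { right }
                let then_block = Block::new(&[]);
                let else_block = Block::new(&[]);

                // Then region: yield 1.0 (true) without evaluating the right operand
                let prev = self.current_block.replace(then_block);
                let true_val = arith::constant(
                    self.context,
                    FloatAttribute::new(self.context, 1.0, Type::float64(self.context)).into(),
                    location,
                );
                let then_yield = OperationBuilder::new("scf.yield", location)
                    .add_operand(true_val.result(0).unwrap().into())
                    .build()
                    .unwrap();
                if let Some(block) = &self.current_block {
                    block.append_operation(true_val.clone());
                    block.append_operation(then_yield);
                }
                let then_block = self.current_block.take().unwrap();

                // Else region: evaluate the right operand and yield it
                self.current_block = Some(else_block);
                let right = self.compile_expression(&logical.right);
                let else_yield = OperationBuilder::new("scf.yield", location)
                    .add_operand(right)
                    .build()
                    .unwrap();
                if let Some(block) = &self.current_block {
                    block.append_operation(else_yield);
                }
                let else_block = self.current_block.take().unwrap();

                self.current_block = prev;

                // This function returns an Operation, not a Result, so unwrap instead of `?`
                let if_op = OperationBuilder::new("scf.if", location)
                    .add_operand(cond.result(0).unwrap().into())
                    .add_result(Type::float64(self.context))
                    .add_region({
                        let region = Region::new();
                        region.append_block(then_block);
                        region
                    })
                    .add_region({
                        let region = Region::new();
                        region.append_block(else_block);
                        region
                    })
                    .build()
                    .unwrap();

                if let Some(block) = &self.current_block {
                    block.append_operation(if_op.clone());
                }
                if_op
            }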
        }
    }
    
    /// Compile a string literal to a global constant
    fn compile_string(&mut self, value: &str, location: Location<'c>) -> Value<'c> {
        // Placeholder - see String Constants section below for full implementation
        let op = arith::constant(
            self.context,
            IntegerAttribute::new(0, Type::integer(self.context, 64)).into(),
            location,
        );
        if let Some(block) = &self.current_block {
            block.append_operation(op.clone());
        }
        op.result(0).unwrap().into()
    }
}

/// Main entry point for code generation
pub fn generate_module<'c>(context: &'c Context, program: &Program) -> Module<'c> {
    let generator = CodeGenerator::new(context);
    generator.generate(program)
}
}
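
The lowering of `!`, `and`, and `or` in the generator can be checked against a plain-Rust model of the same "numbers only" convention. This is a sketch for reasoning about expected results, not part of the generator: the function names `is_truthy`, `lox_not`, `lox_and`, and `lox_or` are hypothetical. The right operand is passed as a closure so that it runs only when the corresponding scf.if region would be entered.

```rust
/// Truthiness as compile_logical computes it: `arith.cmpf one, x, 0.0`
/// (ordered not-equal), so NaN counts as false.
pub fn is_truthy(x: f64) -> bool {
    x < 0.0 || x > 0.0
}

/// `!x` as compile_unary lowers it: 1.0 when x == 0.0, otherwise 0.0.
pub fn lox_not(x: f64) -> f64 {
    if x == 0.0 { 1.0 } else { 0.0 }
}

/// `a and b` → if a { b } else { 0.0 }.
/// `right` is only called when `left` is truthy.
pub fn lox_and(left: f64, right: impl FnOnce() -> f64) -> f64 {
    if is_truthy(left) { right() } else { 0.0 }
}

/// `a or b` → if a { 1.0 } else { b }.
/// `right` is only called when `left` is falsy.
pub fn lox_or(left: f64, right: impl FnOnce() -> f64) -> f64 {
    if is_truthy(left) { 1.0 } else { right() }
}
```

Note that this model, like the generator, produces 1.0 for a truthy left operand of `or`. Full Lox returns the operand itself, which becomes possible once values are tagged rather than plain f64.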

String Constants (No Allocation Needed!)

String literals are constants, not heap allocations. They live in the binary's data section, just like in C.

// src/codegen/strings.rs
use melior::{
    Context,
    dialect::llvm,
    ir::{attribute::StringAttribute, Location, Module, Type},
};

/// Create a global string constant in the LLVM dialect
/// 
/// This creates something like:
///   llvm.mlir.global constant @str_0("hello")
///
/// No heap allocation - the string lives in the data section.
pub fn create_string_constant(
    module: &Module,
    context: &Context,
    name: &str,        // e.g., "str_0"
    value: &str,
    location: Location,
) {
    // The array length is the UTF-8 byte length of the literal
    let array_type =
        Type::parse(context, &format!("!llvm.array<{} x i8>", value.len())).unwrap();
    module.body().append_operation(
        llvm::r#const(
            context,
            name,
            array_type,
            StringAttribute::new(context, value),
            location,
        )
    );
}

Why This Works

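| Approach                  | Memory Location | Allocation?      |
|---------------------------|-----------------|------------------|
| llvm.mlir.global constant | Data section    | No (static)      |
| Heap allocation (malloc)  | Heap            | Yes (runtime)    |
| Stack allocation (alloca) | Stack           | No, but per-call |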

String literals are static data — they exist for the lifetime of the program, embedded in the binary. No runtime cost.
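
Because each literal becomes a named global, the generator needs a way to hand out symbol names such as "str_0" and to reuse one global when the same literal appears twice. The following is a minimal sketch of that bookkeeping in plain Rust. `StringTable`, `intern`, and `array_type_for` are hypothetical names introduced here, and deduplicating identical literals is an assumption rather than something the guide prescribes.

```rust
use std::collections::HashMap;

/// Hands out one global symbol name per distinct string literal.
#[derive(Default)]
pub struct StringTable {
    names: HashMap<String, String>,
}

impl StringTable {
    /// Returns the symbol for `value`, creating "str_N" on first use.
    pub fn intern(&mut self, value: &str) -> String {
        // N is the number of distinct literals seen so far.
        let next = self.names.len();
        self.names
            .entry(value.to_string())
            .or_insert_with(|| format!("str_{}", next))
            .clone()
    }

    /// Number of distinct literals, i.e. globals to emit.
    pub fn len(&self) -> usize {
        self.names.len()
    }
}

/// Textual LLVM-dialect array type sized to the literal's UTF-8 bytes.
pub fn array_type_for(value: &str) -> String {
    format!("!llvm.array<{} x i8>", value.len())
}
```

The symbol returned by `intern` would be passed as the `name` argument of `create_string_constant`, and the global would be emitted only the first time a literal is seen.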


Dynamic Typing with Tagged Unions

Lox is dynamically typed, so a function parameter can receive any type:

fun printValue(x) {
  print x;  // x could be number, string, bool, nil, or object
}

We need a tagged union type:

// src/codegen/types.rs
use melior::ir::Type;
use melior::Context;

/// A Lox value is a tagged union: struct { tag: i8, data: i64 }
pub fn lox_value_type(context: &Context) -> Type {
    Type::parse(context, "!llvm.struct<(i8, i64)>").unwrap()
}

/// Tag values for each Lox type
pub const TAG_NIL: i8 = 0;
pub const TAG_BOOL: i8 = 1;
pub const TAG_NUMBER: i8 = 2;
pub const TAG_STRING: i8 = 3;
pub const TAG_OBJECT: i8 = 4;
pub const TAG_CLOSURE: i8 = 5;
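
To make the { tag: i8, data: i64 } layout concrete, here is a host-side sketch of how scalar values could be packed into and unpacked from the 64-bit payload. It is an illustration under assumptions: `Packed`, `Scalar`, `pack`, and `unpack` are hypothetical names, and storing a number as its raw f64 bit pattern is one possible encoding that the guide does not specify. Strings, objects, and closures would carry a pointer in `data` and are left out.

```rust
// Tag values repeated from the guide so the sketch is self-contained.
pub const TAG_NIL: i8 = 0;
pub const TAG_BOOL: i8 = 1;
pub const TAG_NUMBER: i8 = 2;

/// Host-side model of `!llvm.struct<(i8, i64)>`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Packed {
    pub tag: i8,
    pub data: i64,
}

/// The scalar subset of Lox values covered by this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Scalar {
    Nil,
    Bool(bool),
    Number(f64),
}

pub fn pack(value: Scalar) -> Packed {
    match value {
        Scalar::Nil => Packed { tag: TAG_NIL, data: 0 },
        Scalar::Bool(b) => Packed { tag: TAG_BOOL, data: b as i64 },
        // Reinterpret the f64 bit pattern; no numeric conversion happens.
        Scalar::Number(n) => Packed { tag: TAG_NUMBER, data: n.to_bits() as i64 },
    }
}

pub fn unpack(packed: Packed) -> Option<Scalar> {
    match packed.tag {
        TAG_NIL => Some(Scalar::Nil),
        TAG_BOOL => Some(Scalar::Bool(packed.data != 0)),
        TAG_NUMBER => Some(Scalar::Number(f64::from_bits(packed.data as u64))),
        // Pointer-carrying tags are outside this sketch.
        _ => None,
    }
}
```

In generated code the same dispatch would be a comparison on the tag field followed by a bitcast of the payload, so a runtime function such as print can branch on the tag before interpreting `data`.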

Part 5: Source Locations in MLIR

Every MLIR operation carries a source location. Unlike LLVM, where debug info is optional metadata, MLIR makes locations a core part of the IR.

The Location API

MLIR operations carry source locations for error messages and debug output. Melior provides several ways to create them:

// src/location.rs
use melior::ir::Location;
use melior::Context;

pub fn demonstrate_locations(context: &Context) {
    // Unknown location: for generated code with no source mapping
    let unknown = Location::unknown(context);
    
    // File location: specific file, line, column
    let file_loc = Location::new(context, "test.lox", 10, 5);
    
    // Name location: a descriptive name wrapping a child location
    // (assumes Location::name takes the child location as its third argument)
    let name_loc = Location::name(context, "implicit_return", unknown);
}

Updated Code Generator with Proper Locations

pub struct CodeGenerator<'c> {
    context: &'c Context,
    module: Module<'c>,
    current_block: Option<Block<'c>>,
    variables: std::collections::HashMap<String, Value<'c>>,
    filename: String,
}

impl<'c> CodeGenerator<'c> {
    pub fn new(context: &'c Context, filename: &str) -> Self {
        let location = Location::new(context, filename, 1, 1);
        let module = Module::new(location);
        Self { 
            context, 
            module, 
            current_block: None,
            variables: std::collections::HashMap::new(),
            filename: filename.to_string(),
        }
    }

    /// Convert an AST location to an MLIR location
    fn loc(&self, ast_loc: crate::ast::Location) -> Location<'c> {
        Location::new(self.context, &self.filename, ast_loc.line, ast_loc.column)
    }

    /// Get a location for generated/implicit code
    /// (assumes Location::name takes a child location as its third argument)
    fn generated_loc(&self, description: &str) -> Location<'c> {
        Location::name(self.context, description, Location::unknown(self.context))
    }
}

What the IR Looks Like With Locations

Before (using Location::unknown):

module {
  func.func @add(%arg0: f64, %arg1: f64) -> f64 {
    %0 = arith.addf %arg0, %arg1 : f64
    return %0 : f64
  }
}

After (with proper locations, shown with -mlir-print-debuginfo):

module {
  func.func @add(%arg0: f64, %arg1: f64) -> f64 
      loc("test.lox":1:1) 
  {
    %0 = arith.addf %arg0, %arg1 : f64 
        loc("test.lox":2:14)
    return %0 : f64 loc("test.lox":2:3)
  } loc("test.lox":1:1)
} loc("test.lox":1:1)
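
The `loc(...)` suffixes shown above follow MLIR's textual location syntax. As a rough aid for reading such dumps or writing test expectations, the sketch below renders the three location kinds used in this guide as text. `Loc` is a hypothetical host-side enum, not a Melior type, and it covers only unknown, file, and simple name locations.

```rust
use std::fmt;

/// Host-side model of the three location kinds used in this guide.
pub enum Loc {
    Unknown,
    File { file: String, line: usize, column: usize },
    Name(String),
}

impl fmt::Display for Loc {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Loc::Unknown => write!(f, "loc(unknown)"),
            // File locations print as loc("file":line:column)
            Loc::File { file, line, column } => {
                write!(f, "loc(\"{}\":{}:{})", file, line, column)
            }
            // Name locations print the name in quotes
            Loc::Name(name) => write!(f, "loc(\"{}\")", name),
        }
    }
}
```

For example, a `Loc::File` built from an AST location at line 2, column 14 of test.lox formats to the same suffix that appears on the arith.addf line above.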

Part 6: A Complete Example

// examples/simple_add.rs
use melior::{
    Context,
    dialect::{arith, func, DialectRegistry},
    ir::{
        attribute::{StringAttribute, TypeAttribute, FloatAttribute},
        r#type::FunctionType,
        Location, Module, Region, Block, Type,
    },
    utility::register_all_dialects,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let registry = DialectRegistry::new();
    register_all_dialects(&registry);
    
    let context = Context::new();
    context.append_dialect_registry(&registry);
    context.load_all_available_dialects();
    
    let location = Location::unknown(&context);
    let module = Module::new(location);
    
    // Create function type: (f64, f64) -> f64
    let float_type = Type::float64(&context);
    let function_type = FunctionType::new(&context, &[float_type, float_type], &[float_type]);
    
    // Create function body
    let region = Region::new();
    let block = region.append_block(Block::new(&[
        (float_type, location),
        (float_type, location),
    ]));
    
    // %sum = arith.addf %arg0, %arg1 : f64
    let sum = block.append_operation(arith::addf(
        block.argument(0).unwrap().into(),
        block.argument(1).unwrap().into(),
        location,
    ));
    
    // return %sum : f64
    block.append_operation(func::r#return(
        &[sum.result(0).unwrap().into()],
        location,
    ));
    
    // Create the function
    module.body().append_operation(func::func(
        &context,
        StringAttribute::new(&context, "add"),
        TypeAttribute::new(function_type.into()),
        region,
        &[],
        location,
    ));
    
    // Verify and print
    assert!(module.as_operation().verify());
    println!("{}", module.as_operation());
    
    Ok(())
}

Output:

module {
  func.func @add(%arg0: f64, %arg1: f64) -> f64 {
    %0 = arith.addf %arg0, %arg1 : f64
    return %0 : f64
  }
}

Part 7: Lowering to LLVM IR

After generating MLIR, lower it to LLVM IR and compile to machine code:

// src/lib.rs
pub mod ast;
pub mod lexer;
pub mod parser;
pub mod codegen;

use melior::{
    Context,
    dialect::DialectRegistry,
    pass::PassManager,
    utility::register_all_dialects,
};

pub fn compile_to_llvm(source: &str) -> Result<String, CompileError> {
    // 1. Parse
    let tokens = lexer::tokenize(source)?;
    let program = parser::Parser::new(tokens).parse()?;
    
    // 2. Generate MLIR
    let registry = DialectRegistry::new();
    register_all_dialects(&registry);
    
    let context = Context::new();
    context.append_dialect_registry(&registry);
    context.load_all_available_dialects();
    
    let module = codegen::generate_module(&context, &program);
    
    // 3. Run lowering passes (scf/arith/func → LLVM dialect)
    //    The exact constructor names depend on your Melior version.
    //    The underlying MLIR passes are convert-scf-to-cf,
    //    convert-arith-to-llvm, and convert-func-to-llvm.
    let pass_manager = PassManager::new(&context);
    // pass_manager.add_pass(melior::pass::conversion::create_scf_to_control_flow());
    // pass_manager.add_pass(melior::pass::conversion::create_arith_to_llvm());
    // pass_manager.add_pass(melior::pass::conversion::create_func_to_llvm());
    // pass_manager.run(&mut module)?;  // requires `let mut module` above
    
    // The returned text is still MLIR (LLVM dialect once the passes run);
    // mlir-translate converts it to LLVM IR.
    Ok(module.as_operation().to_string())
}
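
While the in-process pass manager calls are still commented out, the same lowering can be tried from the command line with mlir-opt, which accepts a textual pipeline. The sketch below only builds that pipeline string. Using mlir-opt and its `--pass-pipeline` option is an assumption beyond what this guide covers, as is the exact pass list (including `convert-cf-to-llvm` and `reconcile-unrealized-casts`); `LOWERING_PASSES` and `pipeline_string` are hypothetical names. The placeholder `lox.print` operation has no lowering here and would need its own handling.

```rust
/// Lowering passes in the order this sketch assumes they run.
pub const LOWERING_PASSES: [&str; 5] = [
    "convert-scf-to-cf",
    "convert-cf-to-llvm",
    "convert-arith-to-llvm",
    "convert-func-to-llvm",
    "reconcile-unrealized-casts",
];

/// Builds the textual pipeline anchored on the top-level module,
/// in the form `builtin.module(pass-a,pass-b,...)`.
pub fn pipeline_string(passes: &[&str]) -> String {
    format!("builtin.module({})", passes.join(","))
}
```

The resulting string would be passed as `mlir-opt --pass-pipeline="<string>" output.mlir`, which makes it easy to add or drop a pass while debugging a verifier failure.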

Using the CLI

# Compile Lox to MLIR
cargo run -- compile input.lox --emit-mlir -o output.mlir

# Translate MLIR (already lowered to the LLVM dialect) to LLVM IR
mlir-translate output.mlir --mlir-to-llvmir -o output.ll

# Compile to executable
clang output.ll -o output
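
The last two steps can also be driven from the Rust CLI instead of the shell. The sketch below constructs the same two invocations with std::process::Command without running them. `translate_command` and `link_command` are hypothetical helper names, and the tools are assumed to be on PATH.

```rust
use std::path::Path;
use std::process::Command;

/// Builds `mlir-translate <input> --mlir-to-llvmir -o <output>`.
/// The command is returned unexecuted so the caller decides how to run it.
pub fn translate_command(input: &Path, output: &Path) -> Command {
    let mut cmd = Command::new("mlir-translate");
    cmd.arg(input).arg("--mlir-to-llvmir").arg("-o").arg(output);
    cmd
}

/// Builds `clang <ll_file> -o <executable>`.
pub fn link_command(ll_file: &Path, executable: &Path) -> Command {
    let mut cmd = Command::new("clang");
    cmd.arg(ll_file).arg("-o").arg(executable);
    cmd
}
```

A caller would typically invoke `.status()` on each command and stop at the first non-zero exit code, so that a failed translation never reaches the clang step.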

Part 8: Project Structure

lox-mlir/
├── Cargo.toml
├── src/
│   ├── lib.rs              # Library entry point
│   ├── main.rs             # CLI entry point
│   ├── ast.rs              # AST definitions
│   ├── lexer.rs            # Tokenizer
│   ├── parser.rs           # Parser
│   ├── codegen/
│   │   ├── mod.rs
│   │   ├── generator.rs    # MLIR code generator
│   │   ├── types.rs        # Tagged union types
│   │   └── strings.rs      # String constant handling
│   └── runtime/
│       ├── mod.rs
│       └── print.c         # Runtime print implementation
├── examples/
│   ├── simple_add.rs
│   └── *.lox
└── tests/
    └── integration.rs

Quick Reference: Lox → MLIR Mapping

| Lox Construct    | Rust Enum              | MLIR Operation    |
|------------------|------------------------|-------------------|
| a + b            | BinaryOp::Add          | arith.addf        |
| a - b            | BinaryOp::Sub          | arith.subf        |
| a * b            | BinaryOp::Mul          | arith.mulf        |
| a / b            | BinaryOp::Div          | arith.divf        |
| a < b            | BinaryOp::Less         | arith.cmpf olt    |
| a == b           | BinaryOp::Equal        | arith.cmpf oeq    |
| var x = v        | VarStmt                | Store in HashMap  |
| x                | VariableExpr           | Load from HashMap |
| if (c) {...}     | IfStmt                 | scf.if            |
| while (c) {...}  | WhileStmt              | scf.while         |
| fun f(...) {...} | FunctionStmt           | func.func         |
| f(args)          | CallExpr               | func.call         |
| return v         | ReturnStmt             | func.return       |
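
The "Store in HashMap" and "Load from HashMap" rows describe a single flat map that is cleared after each function. Lox blocks introduce nested scopes, so a later version would likely need a stack of maps. The sketch below shows one way to model that. `Scopes` and its methods are hypothetical names, and `V` stands in for whatever handle the generator stores per variable (an SSA value in this guide).

```rust
use std::collections::HashMap;

/// A stack of lexical scopes mapping variable names to values of type `V`.
pub struct Scopes<V> {
    stack: Vec<HashMap<String, V>>,
}

impl<V: Copy> Scopes<V> {
    /// Starts with a single outermost scope.
    pub fn new() -> Self {
        Self { stack: vec![HashMap::new()] }
    }

    /// Enter a block: `{`
    pub fn push(&mut self) {
        self.stack.push(HashMap::new());
    }

    /// Leave a block: `}`. The outermost scope is never popped.
    pub fn pop(&mut self) {
        if self.stack.len() > 1 {
            self.stack.pop();
        }
    }

    /// `var x = v;` declares in the innermost scope, shadowing outer ones.
    pub fn declare(&mut self, name: &str, value: V) {
        self.stack
            .last_mut()
            .expect("at least one scope")
            .insert(name.to_string(), value);
    }

    /// `x = v;` updates the nearest enclosing declaration.
    /// Returns false when the name was never declared.
    pub fn assign(&mut self, name: &str, value: V) -> bool {
        for scope in self.stack.iter_mut().rev() {
            if let Some(slot) = scope.get_mut(name) {
                *slot = value;
                return true;
            }
        }
        false
    }

    /// Resolves a name from the innermost scope outward.
    pub fn lookup(&self, name: &str) -> Option<V> {
        self.stack.iter().rev().find_map(|scope| scope.get(name).copied())
    }
}
```

This only models name resolution. A value assigned inside an scf.if or scf.while region is not visible outside that region, so assignments in branches and loops would still have to be passed out through scf.yield results or stored in memory.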

Differences from C++ MLIR

| Aspect             | C++ MLIR              | Melior (Rust)               |
|--------------------|-----------------------|-----------------------------|
| Dialect definition | TableGen (.td)        | Rust code directly          |
| Operations         | Generated from ODS    | Built with OperationBuilder |
| Ownership          | Manual / raw pointers | RAII with lifetimes         |
| Pattern rewriting  | C++ classes           | Closures / Rust traits      |
| Error handling     | LogicalResult         | Result<T, Error>            |

Next Steps

  1. Start small: Just numbers and arithmetic. Get print 1 + 2; working.
  2. Add variables: Implement local variables with SSA values.
  3. Add control flow: if and while with scf dialect.
  4. Add functions: func.func and func.call.
  5. Add closures: Environment capture with heap allocation.
  6. Add classes/objects: The full Lox experience.

Melior provides a safe, idiomatic Rust interface to MLIR. The ownership model takes some getting used to (regions/blocks are moved rather than borrowed), but the type system prevents most common errors.