MLIR for Lox: Part 9 — print and clock — The Built-In Functions Lox Can’t Run Without

Your Lox compiler can parse code, generate MLIR, lower it through dialects, and JIT-compile the result. It has garbage collection, closures, and classes. But when someone writes print "hello"; or var start = clock();, nothing happens. The compiler generates MLIR that calls lox_print and lox_clock, but those functions don’t exist yet. We never built the runtime.

What Is a Runtime, Exactly?

When our compiler lowers Lox to MLIR, it generates calls to functions that don’t exist in MLIR itself. Things like:

lox_print(value) — print a Lox value
lox_clock() — return the current time
lox_alloc(size) — allocate on the GC heap

These are runtime functions. They’re implemented in Rust (or C), compiled into the host binary, and linked at JIT time. The generated MLIR code calls them; the runtime defines them.

This split is fundamental to how compiled languages work. The compiler generates the what; the runtime provides the how. Lox’s runtime is small — only a handful of functions — but the pattern scales to any size.

A Note on Representation: Tagged Unions in the Runtime

Parts 1 through 6 work with f64 values — a numbers-only subset of Lox. Part 7 introduced the tagged union representation because classes need LoxValue::Instance, LoxValue::String, and other types that can’t fit in an f64. The runtime functions in this part work with the full tagged union — lox_print must handle every LoxValue variant, not only numbers.

If you’re coming from the numbers-only model, the change is mechanical: every operation gets wrapped in tag checks, payload extraction, and re-packing. The concept hasn’t changed; the representation is richer. Where the numbers-only model does arith.addf %lhs, %rhs : f64, the tagged union model checks tags, extracts payloads, does the arithmetic, and re-packs the result as (TAG_NUMBER, payload). You’ve seen this pattern in Parts 7 and 8 — the runtime uses the same (i8, i64) pairs, viewed from the Rust side of the boundary.
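To make the check, extract, compute, re-pack sequence concrete, here is a minimal Rust-side sketch of what the generated code does for a numeric add. The tag values are assumed to match the C header excerpt later in this part, and `add_tagged` with its `Option` return is a hypothetical helper for illustration, not part of the tutorial’s runtime (real Lox `+` also accepts two strings, which this sketch ignores).

```rust
// Tag values assumed to match the C header excerpt (TAG_NIL = 0, TAG_NUMBER = 2).
const TAG_NIL: u8 = 0;
const TAG_NUMBER: u8 = 2;

/// Hypothetical helper: add two tagged values, or return None on a type mismatch.
fn add_tagged(lhs: (u8, i64), rhs: (u8, i64)) -> Option<(u8, i64)> {
    // 1. Check the tags on both operands.
    if lhs.0 != TAG_NUMBER || rhs.0 != TAG_NUMBER {
        return None;
    }
    // 2. Extract the payloads: reinterpret the i64 bits as f64.
    let a = f64::from_bits(lhs.1 as u64);
    let b = f64::from_bits(rhs.1 as u64);
    // 3. Do the arithmetic, then 4. re-pack as (TAG_NUMBER, payload).
    Some((TAG_NUMBER, (a + b).to_bits() as i64))
}

fn main() {
    let x = (TAG_NUMBER, 1.5f64.to_bits() as i64);
    let y = (TAG_NUMBER, 2.0f64.to_bits() as i64);
    let sum = add_tagged(x, y).expect("both operands are numbers");
    println!("{}", f64::from_bits(sum.1 as u64)); // prints 3.5

    // A nil operand fails the tag check instead of producing garbage.
    assert!(add_tagged(x, (TAG_NIL, 0)).is_none());
}
```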

If you want to build the numbers-only runtime first (a good exercise), have lox_print handle only TAG_NUMBER and return early for everything else. Then add the other tags one at a time.
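A numbers-only starting point might look like the sketch below. The function names are hypothetical, and the formatting is split into its own function only so it is easy to test; `TAG_NUMBER = 2` is assumed to match the C header excerpt later in this part.

```rust
const TAG_NUMBER: u8 = 2; // assumed to match the C header excerpt

/// Hypothetical numbers-only formatter: Some(text) for numbers, None otherwise.
fn format_number_only(tag: u8, payload: i64) -> Option<String> {
    if tag != TAG_NUMBER {
        return None; // return early for every other tag
    }
    Some(f64::from_bits(payload as u64).to_string())
}

/// Numbers-only print: prints numbers and silently ignores everything else.
fn lox_print_numbers_only(tag: u8, payload: i64) {
    if let Some(text) = format_number_only(tag, payload) {
        println!("{text}");
    }
}

fn main() {
    lox_print_numbers_only(TAG_NUMBER, 42.0f64.to_bits() as i64); // prints 42
    lox_print_numbers_only(0, 0); // a nil value: prints nothing yet
}
```

Adding a tag later means adding one more branch to the formatter, which keeps each step small enough to test on its own.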

Reconstructing Lox Values: `from_raw` and `to_raw`

Before we define the runtime functions, we need the bridge between MLIR’s raw (tag, payload) pairs and Rust’s typed LoxValue enum.

LoxValue::from_raw(tag, payload) reconstructs a LoxValue from a raw pair, and its companion to_raw() does the inverse:

#![allow(unused)]
fn main() {
impl LoxValue {
    /// Reconstruct a LoxValue from its raw (tag, payload) representation.
    ///
    /// The tag identifies the type. The payload is type-specific:
    /// - Nil: payload is 0 (unused)
    /// - Bool: payload is 0 (false) or 1 (true)
    /// - Number: payload is the f64 bitcast to i64
    /// - String: payload is a pointer to the GC-managed string object
    /// - Instance: payload is a pointer to the GC-managed instance object
    pub fn from_raw(tag: u8, payload: i64) -> Self {
        match tag {
            TAG_NIL => LoxValue::Nil,
            TAG_BOOL => LoxValue::Bool(payload != 0),
            TAG_NUMBER => {
                // Bitcast i64 back to f64 (same bits, different type)
                LoxValue::Number(f64::from_bits(payload as u64))
            }
            TAG_STRING => {
                // payload is a raw pointer to a GcString on the GC heap.
                // Safety: the caller must ensure the value is still rooted
                // (reachable from a GC root). A value that's on the stack
                // or in a runtime function's local variables is rooted by
                // the GC's stack-scanning — the collector won't free it
                // while the pointer is in use.
                //
                // `(*ptr).clone()` copies the GcString pointer wrapper,
                // not the string data on the GC heap. Both the original
                // and the clone point to the same GC object. A deep copy
                // of the string data (e.g., `String::from(&**ptr)`)
                // would allocate a separate buffer outside the GC heap
                // that the collector never tracks, and `GcString::clone()`
                // avoids that.
                // The mark-sweep collector will find the object as long
                // as any GcString (or other root) still references it.
                let ptr = payload as *const GcString;
                LoxValue::String(unsafe { (*ptr).clone() }) // copy the pointer wrapper, not the data
            }
            TAG_INSTANCE => {
                // Same principle: wrap the existing GC pointer, don't deep-copy.
                let ptr = payload as *const GcInstance;
                LoxValue::Instance(unsafe { (*ptr).clone() }) // wrap existing GC pointer
            }
            // ⚠️ TAG_CLOSURE, TAG_CLASS, and TAG_BOUND hit this catchall
            // and silently become Nil. If you add a runtime function that
            // receives one of these types (e.g., lox_call for closures),
            // add an explicit arm that reconstructs the Gc pointer,
            // following the same pattern as TAG_STRING above.
            _ => LoxValue::Nil, // unknown tag → nil (defensive)
        }
    }

    /// Convert a LoxValue to its raw (tag, payload) representation.
    /// Used when passing values from the runtime back to MLIR.
    pub fn to_raw(&self) -> (u8, i64) {
        match self {
            LoxValue::Nil => (TAG_NIL, 0),
            LoxValue::Bool(b) => (TAG_BOOL, if *b { 1 } else { 0 }),
            LoxValue::Number(n) => {
                // Bitcast f64 to i64 (same bits, different type)
                (TAG_NUMBER, n.to_bits() as i64)
            }
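            // Note: `s` and `i` are references into this LoxValue, so these
            // casts yield the address of the wrapper stored inside `self`.
            // The payload is only valid while that LoxValue stays alive.
            LoxValue::String(s) => (TAG_STRING, s as *const GcString as i64),
            LoxValue::Instance(i) => (TAG_INSTANCE, i as *const GcInstance as i64),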
            // ⚠️ Same catchall as from_raw — LoxValue::Closure, Class, and
            // Bound silently become (TAG_NIL, 0), which destroys the value.
            // Add explicit arms when you need to pass these types to MLIR.
            _ => (TAG_NIL, 0),
        }
    }
}
}

Why not store LoxValue directly in MLIR? Because MLIR is a low-level IR — it doesn’t know about Rust enums, String, or GcString. It works with primitive types: integers, floats, and pointers. The tagged union representation (i8 tag + i64 payload) is the ABI between generated code and the runtime. from_raw/to_raw translate across that boundary.
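The round trip across that boundary can be sketched with a cut-down value type. `MiniValue` is a hypothetical stand-in that covers only the pointer-free variants, and the tag values are assumed to match the C header excerpt later in this part.

```rust
const TAG_NIL: u8 = 0;
const TAG_BOOL: u8 = 1;
const TAG_NUMBER: u8 = 2;

/// Hypothetical stand-in for LoxValue with only the pointer-free variants.
#[derive(Debug, Clone, Copy, PartialEq)]
enum MiniValue {
    Nil,
    Bool(bool),
    Number(f64),
}

impl MiniValue {
    /// Rust side -> ABI: produce the (tag, payload) pair MLIR works with.
    fn to_raw(self) -> (u8, i64) {
        match self {
            MiniValue::Nil => (TAG_NIL, 0),
            MiniValue::Bool(b) => (TAG_BOOL, b as i64),
            MiniValue::Number(n) => (TAG_NUMBER, n.to_bits() as i64),
        }
    }

    /// ABI -> Rust side: rebuild the typed value from the raw pair.
    fn from_raw(tag: u8, payload: i64) -> Self {
        match tag {
            TAG_BOOL => MiniValue::Bool(payload != 0),
            TAG_NUMBER => MiniValue::Number(f64::from_bits(payload as u64)),
            _ => MiniValue::Nil, // same defensive fallback as the tutorial code
        }
    }
}

fn main() {
    for value in [MiniValue::Nil, MiniValue::Bool(true), MiniValue::Number(-0.5)] {
        let (tag, payload) = value.to_raw();
        // Crossing the boundary and back yields the original value.
        assert_eq!(MiniValue::from_raw(tag, payload), value);
        println!("{value:?} <-> ({tag}, {payload:#x})");
    }
}
```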

Watch out for the catchalls. Both from_raw and to_raw have _ => Nil fallback arms. For from_raw, that means TAG_CLOSURE, TAG_CLASS, and TAG_BOUND silently become LoxValue::Nil — no error, no warning. For to_raw, passing a LoxValue::Closure produces (TAG_NIL, 0), which destroys the value. The catchall exists so the tutorial doesn’t need to show every arm up front, but if you’re building a runtime function that handles closures, classes, or bound methods, add explicit match arms before you wonder why you’re getting nil everywhere.
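One way to make a forgotten arm loud instead of silent is to return an error for tags the function doesn’t handle. This is an alternative design, not what the tutorial’s from_raw does; `try_from_raw`, `UnhandledTag`, and `MiniValue` are hypothetical names, and the tag values are assumed to match the C header excerpt later in this part.

```rust
const TAG_NIL: u8 = 0;
const TAG_BOOL: u8 = 1;
const TAG_NUMBER: u8 = 2;
const TAG_CLOSURE: u8 = 4;

/// Hypothetical cut-down value type (pointer-free variants only).
#[derive(Debug, PartialEq)]
enum MiniValue {
    Nil,
    Bool(bool),
    Number(f64),
}

/// Error carrying the tag that had no match arm.
#[derive(Debug, PartialEq)]
struct UnhandledTag(u8);

/// Like from_raw, but an unknown tag is an error rather than a silent nil.
fn try_from_raw(tag: u8, payload: i64) -> Result<MiniValue, UnhandledTag> {
    match tag {
        TAG_NIL => Ok(MiniValue::Nil),
        TAG_BOOL => Ok(MiniValue::Bool(payload != 0)),
        TAG_NUMBER => Ok(MiniValue::Number(f64::from_bits(payload as u64))),
        other => Err(UnhandledTag(other)),
    }
}

fn main() {
    assert_eq!(try_from_raw(TAG_BOOL, 1), Ok(MiniValue::Bool(true)));
    // A closure tag is reported instead of quietly turning into nil.
    assert_eq!(try_from_raw(TAG_CLOSURE, 0x1000), Err(UnhandledTag(TAG_CLOSURE)));
    println!("unhandled tags are reported, not dropped");
}
```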

Why wrap the existing GC pointer instead of cloning the data? A GcString is a thin wrapper around a pointer to the GC heap; it’s not the string data itself. GcString::clone() copies the pointer wrapper, so both the original and the clone point to the same GC object. The mark-sweep collector (from Part 2, with root tracking from Part 3) will find the object as long as any root still references it. A deep copy of the string data, like String::from(&**ptr), would instead allocate a new buffer outside the GC heap on every call. The collector never sees that buffer, and the copy is an ordinary Rust String rather than a GcString, so it can’t be stored in LoxValue::String as-is. The key distinction: (*ptr).clone() on a GcString copies the thin pointer wrapper (cheap, still GC-managed); String::from(&**ptr) copies the actual string bytes into untracked memory.
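The difference between copying a pointer wrapper and copying the bytes can be seen with a standard-library stand-in. `Rc<str>` is used here only as an analogy: it is reference-counted rather than mark-sweep managed, so it says nothing about how GcString tracks liveness, only about what a shallow clone shares.

```rust
use std::rc::Rc;

/// Address of the first byte of the string data.
fn data_ptr(s: &str) -> *const u8 {
    s.as_ptr()
}

/// True when both handles point at the same underlying allocation.
fn shares_allocation(a: &Rc<str>, b: &Rc<str>) -> bool {
    Rc::ptr_eq(a, b)
}

fn main() {
    let original: Rc<str> = Rc::from("hello");

    // Shallow clone: copies the pointer wrapper, not the bytes.
    let shared = Rc::clone(&original);
    assert!(shares_allocation(&original, &shared));

    // Deep copy: a new buffer holding the same bytes at a different address.
    let deep = String::from(&*original);
    assert_eq!(deep, "hello");
    assert_ne!(data_ptr(&original), data_ptr(&deep));

    println!("shallow clone shares the allocation; deep copy does not");
}
```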

The Runtime Interface

Now we can define the runtime functions. These take raw types (u8, i64, f64), not LoxValue directly — JIT-compiled code works with the raw representation. The runtime uses from_raw to reconstruct LoxValue when it needs to.

We’ll put these in a runtime module:

#![allow(unused)]
fn main() {
// src/runtime/mod.rs

use crate::gc::Gc;
use crate::value::LoxValue;

/// Functions that the JIT-compiled code can call.
/// These are registered as symbols when creating the execution engine.
pub struct Runtime {
    gc: Gc,
}

impl Runtime {
    pub fn new() -> Self {
        Runtime { gc: Gc::new() }
    }

    /// Print a Lox value to stdout.
    /// Called from MLIR as `lox_print(i8, i64) -> void`.
    pub fn lox_print(&self, tag: u8, payload: i64) {
        let value = LoxValue::from_raw(tag, payload);
        match &value {
            LoxValue::Nil => println!("nil"),
            LoxValue::Bool(b) => println!("{}", b),
            LoxValue::Number(n) => println!("{}", n),
            LoxValue::String(s) => {
                // Safety: print doesn't allocate on the GC heap, so the
                // collector can't run while we're using the string pointer.
                // The GC only triggers when the allocator determines a
                // collection is needed (see Part 4). The
                // caller (lox_print_wrapper) holds the RUNTIME mutex, but
                // that's not what makes this safe — it's the absence of
                // any allocation during this call.
                println!("{}", s.as_str());
            }
            LoxValue::Instance(_) => {
                // No safety concern here: we're printing a placeholder,
                // not dereferencing the instance's fields. If a future
                // version prints field values, it would need the same
                // GC safety analysis as the String arm above.
                println!("<instance>")
            }
            _ => println!("<object>"),
        }
    }

    /// Return the current clock time in seconds.
    /// Called from MLIR as `lox_clock() -> f64`.
    pub fn lox_clock(&self) -> f64 {
        use std::time::SystemTime;
        let duration = SystemTime::now()
            .duration_since(std::time::UNIX_EPOCH)
            .expect("time went backwards");
        duration.as_secs_f64()
    }

    /// Allocate a new GC-managed object.
    /// Called from MLIR as `lox_alloc(i32) -> i64` (returns raw pointer).
    ///
    /// `self.gc.allocate()` returns a raw pointer (`*mut u8`) to the
    /// newly allocated heap region. We cast it to `i64` so it can flow
    /// through MLIR as a plain integer (MLIR doesn't have pointer types
    /// at the Lox dialect level).
    ///
    /// **Platform note:** `ptr as i64` is lossless when pointers are at
    /// most 64 bits wide. On 32-bit targets the pointer is widened; only
    /// a target with pointers wider than 64 bits would truncate. For this
    /// tutorial, we target x86-64 / AArch64, where
    /// `size_of::<*mut u8>() == size_of::<i64>()`.
    pub fn lox_alloc(&mut self, size: i32) -> i64 {
        let ptr = self.gc.allocate(size as usize);
        ptr as i64
    }

    /// Trigger a garbage collection cycle.
    /// Called from MLIR as `lox_gc_collect() -> void`.
    pub fn lox_gc_collect(&mut self) {
        self.gc.collect();
    }
}
}

Registering Native Functions with the JIT

Now the question: how does the JIT-compiled MLIR code find these functions? When we create the MLIR execution engine, we need to register symbols — map string names to function pointers.

Melior’s ExecutionEngine wraps LLVM’s ORC JIT. The way to expose symbols depends on your Melior version, but the general approach is:

Compile your runtime wrappers into the host binary
Create the execution engine with shared library paths (for external runtimes), or register symbols directly

Here’s the pattern:

#![allow(unused)]
fn main() {
// src/jit.rs

use melior::{
    execution_engine::ExecutionEngine,
    Context,
};
use std::sync::{LazyLock, Mutex};
use crate::runtime::Runtime;

/// Global runtime instance, shared between the JIT and the host.
///
/// The JIT's symbol resolver works with raw function pointers — there's no way
/// to pass a `&mut Runtime` through a function pointer. Using a global
/// `LazyLock<Mutex<Runtime>>` is the simplest approach. In a production
/// compiler you might use thread-local storage or a more sophisticated
/// registration mechanism, but for a tutorial this keeps things clear.
static RUNTIME: LazyLock<Mutex<Runtime>> = LazyLock::new(|| {
    Mutex::new(Runtime::new())
});

/// Create an execution engine.
///
/// Note: the exact `ExecutionEngine` constructor signature varies between
/// Melior versions. In Melior 0.27, it takes five parameters:
///   ExecutionEngine::new(&module, optimization_level, &shared_lib_paths, enable_object_dump, enable_pic)
/// The last two are `bool` flags — `enable_object_dump` and `enable_pic`.
/// Some versions also take `&module` differently.
///
/// Symbol registration also varies, so check the `ExecutionEngine` API docs
/// for your version. The pattern shown below is conceptual. The underlying
/// MLIR C API call is `mlirExecutionEngineRegisterSymbol`; Melior versions
/// we are aware of expose it as an `unsafe` method named `register_symbol`.
pub fn create_engine(module: &melior::ir::Module) -> ExecutionEngine {
    let engine = ExecutionEngine::new(module, 2, &[], false, false);
    // ^^^ Melior 0.27 takes 5 params: module, opt_level, shared_libs, enable_object_dump, enable_pic

    // Register runtime functions as JIT symbols.
    // The MLIR code calls these by name (e.g., `func.call @lox_print(...)`),
    // and the JIT resolves them through the symbol table.
    //
    // Registration looks roughly like this (method name assumed; verify it
    // against your Melior version):
    //   unsafe { engine.register_symbol("lox_print", lox_print_wrapper as *mut ()); }
    //
    // The registered name ("lox_print") is the one the MLIR declaration
    // uses; the pointer is the Rust wrapper defined in the next section.

    engine
}
}
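Independent of Melior, the name-to-pointer mapping that symbol registration sets up can be sketched with the standard library alone. `SymbolTable`, `fake_clock`, and `call_nullary_f64` are hypothetical names for illustration; a real JIT resolver is far more involved, but the core idea is the same: a string name resolves to an untyped function pointer, and the caller must already know the signature.

```rust
use std::collections::HashMap;

/// Stand-in for a runtime wrapper; returns a fixed value so output is predictable.
extern "C" fn fake_clock() -> f64 {
    42.0
}

/// Hypothetical symbol table: conceptually what a JIT's resolver consults.
struct SymbolTable {
    symbols: HashMap<String, *const ()>,
}

impl SymbolTable {
    fn new() -> Self {
        SymbolTable { symbols: HashMap::new() }
    }

    /// Map a symbol name (as used in the MLIR) to a host function pointer.
    fn register(&mut self, name: &str, ptr: *const ()) {
        self.symbols.insert(name.to_string(), ptr);
    }

    fn lookup(&self, name: &str) -> Option<*const ()> {
        self.symbols.get(name).copied()
    }
}

/// Resolve `name` and call it as `extern "C" fn() -> f64`.
/// Sound only because this sketch registers nothing but functions of that type.
fn call_nullary_f64(table: &SymbolTable, name: &str) -> Option<f64> {
    let ptr = table.lookup(name)?;
    let f: extern "C" fn() -> f64 = unsafe { std::mem::transmute(ptr) };
    Some(f())
}

fn main() {
    let mut table = SymbolTable::new();
    // The MLIR-facing name and the Rust function name need not match.
    table.register("lox_clock", fake_clock as *const ());

    println!("{:?}", call_nullary_f64(&table, "lox_clock")); // prints Some(42.0)
    println!("{:?}", call_nullary_f64(&table, "lox_missing")); // prints None
}
```

The pointer carries no type information, which is why a mismatch between the MLIR declaration and the Rust wrapper’s signature shows up as a crash rather than a compile error.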

The Wrapper Problem

There’s a catch. The JIT calls functions using the C calling convention, and our Runtime methods take &self or &mut self. We need wrapper functions that bridge from the C ABI to our Rust runtime.

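The #[no_mangle] attribute keeps the Rust compiler from mangling the function name, and extern "C" selects the C calling convention. Note that the wrappers below are exported as lox_print_wrapper, lox_clock_wrapper, and so on, while the MLIR declares lox_print and lox_clock. Either register each wrapper explicitly under the MLIR name, or rename the wrappers to match if you rely on name-based lookup.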

#![allow(unused)]
fn main() {
// src/jit.rs (continued)

/// C-compatible wrapper for lox_print.
///
/// MLIR signature: `(i8, i64) -> void`
/// We receive the tagged value as two arguments.
#[no_mangle]
pub extern "C" fn lox_print_wrapper(tag: u8, payload: i64) {
    let rt = RUNTIME.lock().unwrap();
    rt.lox_print(tag, payload);
}

/// C-compatible wrapper for lox_clock.
///
/// MLIR signature: `() -> f64`
#[no_mangle]
pub extern "C" fn lox_clock_wrapper() -> f64 {
    let rt = RUNTIME.lock().unwrap();
    rt.lox_clock()
}

/// C-compatible wrapper for lox_alloc.
///
/// MLIR signature: `(i32) -> i64`
#[no_mangle]
pub extern "C" fn lox_alloc_wrapper(size: i32) -> i64 {
    let mut rt = RUNTIME.lock().unwrap();
    rt.lox_alloc(size)
}
}

Why a global? The JIT’s symbol resolver works with raw function pointers — there’s no way to pass a &mut Runtime through a function pointer. Using a global LazyLock<Mutex<Runtime>> is the simplest approach. In a production compiler you might use thread-local storage or a more sophisticated registration mechanism, but for a tutorial this keeps things clear.
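The global-plus-wrapper pattern can be tried in isolation with a toy runtime. `ToyRuntime`, `TOY_RUNTIME`, and `toy_print_wrapper` are hypothetical names, and the sketch leaves out #[no_mangle] because it calls the wrapper through a function pointer rather than by symbol name.

```rust
use std::sync::{LazyLock, Mutex};

/// Toy runtime: just counts how many times "print" was called.
struct ToyRuntime {
    prints: u64,
}

impl ToyRuntime {
    fn record_print(&mut self) -> u64 {
        self.prints += 1;
        self.prints
    }
}

/// Global instance, created on first use and guarded by a mutex.
static TOY_RUNTIME: LazyLock<Mutex<ToyRuntime>> =
    LazyLock::new(|| Mutex::new(ToyRuntime { prints: 0 }));

/// C-ABI wrapper: there is no `self` parameter, so it reaches the
/// runtime through the global and forwards to the method.
extern "C" fn toy_print_wrapper() -> u64 {
    let mut rt = TOY_RUNTIME.lock().unwrap();
    rt.record_print()
}

fn main() {
    // Call through a plain C function pointer, the way JIT-compiled code would.
    let f: extern "C" fn() -> u64 = toy_print_wrapper;
    let first = f();
    let second = f();
    // State persists across calls because both go through the same global.
    assert_eq!(second, first + 1);
    println!("calls so far: {second}");
}
```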

Generating Calls to Runtime Functions

Now that the runtime functions exist and are registered, we need to generate MLIR that calls them. The key principle:

Function declarations go at the module level. Function calls go inside blocks.

A func.func declaration (external function with no body) is a module-level operation. It should be appended to the module’s body block, not inside a function’s body block. The func.call operation is what goes inside a function’s body.

To avoid declaring the same external function multiple times (which would be an error), we declare all runtime functions once during initialization, then only emit func.call operations during codegen. In a complete compiler, declare_runtime_functions would include every runtime function from every part — lox_print and lox_clock from this part, gc_push_frame and gc_pop_frame from Part 3, lox_alloc from Part 4, and the class-related functions (lox_create_class, lox_instance_from_class, lox_get_property, lox_set_property, lox_bind_method, lox_set_method, lox_super_lookup, lox_call) from Part 7. The example below shows only lox_print and lox_clock; add the rest following the same pattern.

Declaring Runtime Functions at Module Level

#![allow(unused)]
fn main() {
use melior::dialect::func;
use melior::ir::attribute::{StringAttribute, TypeAttribute};
use melior::ir::r#type::{FunctionType, Type};
use melior::ir::{Identifier, Location, Region};

// In CodeGenerator — call this once during initialization
fn declare_runtime_functions(&self) {
    let location = Location::unknown(self.context);
    let i8_type: Type = Type::parse(self.context, "i8").unwrap();
    let i64_type: Type = Type::parse(self.context, "i64").unwrap();
    let f64_type = Type::float64(self.context); // used for clock's return type below

    // Declare lox_print: (i8, i64) -> ()
    let print_type = FunctionType::new(self.context, &[i8_type, i64_type], &[]);
    self.module.body().append_operation(func::func(
        self.context,
        StringAttribute::new(self.context, "lox_print"),
        TypeAttribute::new(print_type.into()),
        Region::new(),  // empty region = declaration, not definition
        &[
            (Identifier::new(self.context, "sym_visibility"),
             StringAttribute::new(self.context, "private").into()),
        ],
        location,
    ));

    // Declare lox_clock: () -> f64
    let clock_type = FunctionType::new(self.context, &[], &[f64_type]);
    self.module.body().append_operation(func::func(
        self.context,
        StringAttribute::new(self.context, "lox_clock"),
        TypeAttribute::new(clock_type.into()),
        Region::new(),
        &[
            (Identifier::new(self.context, "sym_visibility"),
             StringAttribute::new(self.context, "private").into()),
        ],
        location,
    ));
}
}
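The declare-once, call-many rule is easier to see in MLIR’s textual form. The sketch below is a text-only toy that does not use Melior: `ModuleSketch` is a hypothetical helper that deduplicates declarations by symbol name and prints them at module level, with the calls inside a single function body. The emitted text is hand-written for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical text-only module builder.
struct ModuleSketch {
    decls: BTreeMap<String, String>, // symbol name -> declaration text
    calls: Vec<String>,              // call ops inside one function body
}

impl ModuleSketch {
    fn new() -> Self {
        ModuleSketch { decls: BTreeMap::new(), calls: Vec::new() }
    }

    /// Declare an external function once; repeated declarations are ignored.
    fn declare(&mut self, name: &str, signature: &str) {
        self.decls
            .entry(name.to_string())
            .or_insert_with(|| format!("func.func private @{name}{signature}"));
    }

    /// Emit a call inside the function body. Calls may repeat freely.
    fn call(&mut self, op_text: &str) {
        self.calls.push(op_text.to_string());
    }

    /// Declarations go at module level, calls go inside the block.
    fn render(&self) -> String {
        let mut out = String::from("module {\n");
        for decl in self.decls.values() {
            out.push_str(&format!("  {decl}\n"));
        }
        out.push_str("  func.func @main() {\n");
        for call in &self.calls {
            out.push_str(&format!("    {call}\n"));
        }
        out.push_str("    func.return\n  }\n}\n");
        out
    }
}

fn main() {
    let mut module = ModuleSketch::new();
    module.declare("lox_clock", "() -> f64");
    module.declare("lox_clock", "() -> f64"); // second declaration is a no-op
    module.call("%t0 = func.call @lox_clock() : () -> f64");
    module.call("%t1 = func.call @lox_clock() : () -> f64");
    print!("{}", module.render());
}
```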

The Print Statement

When the parser sees print expr;, we:

Compile the expression to get a (tag, payload) value
Call lox_print(tag, payload) — this is a func.call, which goes in the block

That first step uses compile_expression_tagged — the tagged-union version of compile_expression. In the numbers-only model (Parts 1–6), compile_expression returns a single Value<'c, 'c> (the f64). In the tagged union model, every expression produces a (tag, payload) pair, so the code generator needs a version that returns both. Every function that produces a LoxValue now returns (Value<'c, 'c>, Value<'c, 'c>) — the tag and the payload.

The transformation is mechanical but not invisible. Here’s how compile_number_literal changes:

#![allow(unused)]
fn main() {
use melior::dialect::arith;
use melior::ir::attribute::{FloatAttribute, IntegerAttribute};
use melior::ir::operation::OperationBuilder;
use melior::ir::r#type::Type;
use melior::ir::{Block, Location, Value};

// Numbers-only (Part 1): returns a single f64 Value
fn compile_number_literal(&self, value: f64, block: &Block<'c>) -> Value<'c, 'c> {
    let location = Location::unknown(self.context);
    let op = arith::constant(self.context, FloatAttribute::new(
        Type::float64(self.context), value
    ).into(), location);
    block.append_operation(op).result(0).unwrap().into()
}

// Tagged union (Part 9): returns (tag, payload) pair
fn compile_number_literal_tagged(&self, value: f64, block: &Block<'c>) -> (Value<'c, 'c>, Value<'c, 'c>) {
    let location = Location::unknown(self.context);

    // tag = TAG_NUMBER
    let tag = block.append_operation(arith::constant(
        self.context,
        IntegerAttribute::new(Type::parse(self.context, "i8").unwrap(), TAG_NUMBER as i64).into(),
        location,
    ));

    // payload = f64 bitcast to i64
    let f64_val = block.append_operation(arith::constant(
        self.context,
        FloatAttribute::new(Type::float64(self.context), value).into(),
        location,
    ));
    // llvm.bitcast reinterprets the f64 bits as i64 (preserving the bit
    // pattern). We use the LLVM dialect's bitcast rather than arith.bitcast
    // because arith.bitcast would be lowered by `convert-arith-to-llvm`
    // anyway. Both operations perform the same bit-for-bit reinterpretation
    // between types of equal width (including float↔integer). See the
    // "Why llvm.bitcast?" note after compile_clock for the full explanation.
    let payload = block.append_operation(OperationBuilder::new("llvm.bitcast", location)
        // `.into()` converts the OperationResult into the Value that add_operands expects.
        .add_operands(&[f64_val.result(0).unwrap().into()])
        .add_results(&[Type::parse(self.context, "i64").unwrap().into()])
        .build().expect("valid bitcast"));

    (tag.result(0).unwrap().into(), payload.result(0).unwrap().into())
}
}

Same computation at the core (arith.constant to produce the value), but now wrapped in a tag constant and a bitcast. Every expression type gets the same treatment: compute the value, then wrap it in the appropriate tag. Binary operations like lox.add are more involved (they need tag checks on the inputs before extracting payloads), but the pattern is the same. With compile_expression_tagged in place, compile_print itself is short:

#![allow(unused)]
fn main() {
use melior::dialect::func;
use melior::ir::attribute::FlatSymbolRefAttribute;
use melior::ir::r#type::Type;
use melior::ir::{Block, Location, Value};
use std::collections::HashMap;

// In CodeGenerator::compile_print
fn compile_print(&self, print: &PrintStmt, block: &Block<'c>, variables: &mut HashMap<String, Value<'c, 'c>>) {
    let location = Location::unknown(self.context);
    let (tag, payload) = self.compile_expression_tagged(&print.value, block);

    // Call the already-declared lox_print function.
    // Only the call goes in the block — the declaration is at module level.
    let call_op = func::call(
        self.context,
        FlatSymbolRefAttribute::new(self.context, "lox_print"),
        &[tag, payload],
        &[],
        location,
    );
    block.append_operation(call_op);
}
}

The Clock Function

clock() is simpler — it takes no arguments and returns f64:

#![allow(unused)]
fn main() {
use melior::dialect::{arith, func};
use melior::ir::attribute::{FlatSymbolRefAttribute, IntegerAttribute};
use melior::ir::operation::OperationBuilder;
use melior::ir::r#type::Type;
use melior::ir::{Block, Location, Value};

fn compile_clock(&self, block: &Block<'c>) -> (Value<'c, 'c>, Value<'c, 'c>) {
    let location = Location::unknown(self.context);

    // Call the already-declared lox_clock function.
    let call_op = func::call(
        self.context,
        FlatSymbolRefAttribute::new(self.context, "lox_clock"),
        &[],
        &[Type::float64(self.context)],
        location,
    );
    let result = block.append_operation(call_op);
    let f64_result: Value = result.result(0).unwrap().into();

    // Box the f64 into a tagged value: tag = TAG_NUMBER, payload = bits
    let tag = block.append_operation(arith::constant(
        self.context,
        IntegerAttribute::new(
            Type::parse(self.context, "i8").unwrap(),
            TAG_NUMBER as i64,
        ).into(),
        location,
    ));

    // Reinterpret the f64 bits as i64 for the payload.
    // We use `llvm.bitcast`, which reinterprets bits between types of
    // the same width. `arith.bitcast` performs the same float↔integer
    // reinterpretation and lowers to `llvm.bitcast`; emitting the LLVM
    // op directly skips that lowering step.
    //
    // Alternative approaches if `llvm.bitcast` is unavailable:
    //   - Store the f64 to memory and load it as i64 (always works)
    //   - Use `builtin.unrealized_conversion_cast` as a placeholder
    //
    // We use `OperationBuilder` with the raw operation name for clarity.
    // Melior may provide a typed `llvm::bitcast()` helper in your
    // version — check the docs if you prefer the idiomatic API.
    let payload = block.append_operation(OperationBuilder::new("llvm.bitcast", location)
        .add_operands(&[f64_result])
        .add_results(&[Type::parse(self.context, "i64").unwrap().into()])
        .build()
        .expect("valid bitcast"));

    (tag.result(0).unwrap().into(), payload.result(0).unwrap().into())
}
}

The key insight: clock() returns an f64, but our Lox values are (tag, payload) pairs. So we tag the result with TAG_NUMBER and reinterpret the f64 bits as i64 for the payload.

Why llvm.bitcast instead of arith.bitcast? Both operations perform the same bit-for-bit reinterpretation between types of equal width, including f64 ↔ i64. The difference is dialect level. arith.bitcast would work too: the convert-arith-to-llvm pass lowers it to llvm.bitcast. Emitting llvm.bitcast directly skips that step for this one operation, even though the surrounding arith.constant and func.call operations still go through their own lowering passes. If you prefer the higher-level approach, arith.bitcast %val : f64 to i64 produces the same result after lowering.
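On the Rust side of the boundary, the same reinterpretation is spelled `f64::to_bits` and `f64::from_bits`. The sketch below contrasts it with a numeric `as` conversion, which changes the bits; the helper names are hypothetical.

```rust
/// Reinterpret the bits of an f64 as an i64 (what a bitcast does).
fn bitcast_f64_to_i64(x: f64) -> i64 {
    x.to_bits() as i64
}

/// Reinterpret the bits of an i64 as an f64 (the inverse bitcast).
fn bitcast_i64_to_f64(bits: i64) -> f64 {
    f64::from_bits(bits as u64)
}

fn main() {
    let x = 1.0f64;

    // Bitcast: same 64 bits, viewed as an integer.
    let payload = bitcast_f64_to_i64(x);
    assert_eq!(payload, 0x3FF0_0000_0000_0000);

    // Numeric conversion: a different operation with a different result.
    assert_eq!(x as i64, 1);

    // The bitcast round-trips exactly, including fractional values.
    assert_eq!(bitcast_i64_to_f64(bitcast_f64_to_i64(0.1)), 0.1);
    println!("{payload:#x}"); // prints 0x3ff0000000000000
}
```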

GC Safety in Runtime Functions

There’s a subtlety with any runtime function that receives a GC-managed pointer. The string data lives on the GC heap, so if a collection happens while the runtime function is using that pointer, the memory could be freed out from under it (a moving collector could also relocate it).

For print, this can’t happen, because println! doesn’t allocate on the GC heap, but it’s a real concern for more complex runtime functions. The general rule:

Any runtime function that receives a GC-managed pointer must ensure the GC cannot collect it while the pointer is in use.

Our lox_print is safe without explicit rooting because print doesn’t allocate on the GC heap, so the collector can’t run during the call. The GC only triggers when the allocator determines a collection is needed (see Part 4). The global RUNTIME mutex in the wrapper function does prevent concurrent GC — lox_gc_collect needs the same lock — but the real guarantee is the absence of allocation, not the lock. A runtime function that does allocate (like lox_concat) would need to root any GC pointers before the allocation point, even with the mutex held, because the allocation itself could trigger collection.

For instances, the current code prints <instance> as a placeholder. A real implementation would need to root the instance with push_root/pop_root (the same pattern from Part 3) before accessing its fields. Since push_root/pop_root require &mut self, and our lox_print takes &self, we’d need to either:

Change lox_print to take &mut self (simple but overkill — print doesn’t allocate)
Use the global RUNTIME mutex’s interior mutability to root temporarily
Accept that print can’t trigger GC (it doesn’t allocate), so rooting isn’t necessary for print specifically — the GC only triggers during allocation (Part 4)

We take the third approach — print doesn’t allocate, so GC can’t run during print. A production runtime would root aggressively and use &mut self.

Honest simplification: We’re trading perfect safety for simplicity here. If you add a runtime function that does allocate (like a concat function that creates new strings), you must root any GC pointers before the allocation point.
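To see why the allocation point matters, here is a toy heap in which allocation is the only thing that can start a collection. Everything in it is a simplification for illustration: `ToyHeap` is not the tutorial’s Gc, handles are vector indices rather than pointers, and marking is reduced to “is this handle in the root list”.

```rust
/// Toy heap: a slot index plays the role of a pointer.
struct ToyHeap {
    slots: Vec<Option<String>>,
    roots: Vec<usize>,
    threshold: usize, // collect when this many slots are live
}

impl ToyHeap {
    fn new(threshold: usize) -> Self {
        ToyHeap { slots: Vec::new(), roots: Vec::new(), threshold }
    }

    fn live(&self) -> usize {
        self.slots.iter().filter(|slot| slot.is_some()).count()
    }

    /// Free every slot that is not rooted.
    fn collect(&mut self) {
        for (index, slot) in self.slots.iter_mut().enumerate() {
            if !self.roots.contains(&index) {
                *slot = None;
            }
        }
    }

    /// Allocation is the only place a collection can start.
    fn allocate(&mut self, text: String) -> usize {
        if self.live() >= self.threshold {
            self.collect();
        }
        self.slots.push(Some(text));
        self.slots.len() - 1
    }

    fn get(&self, handle: usize) -> Option<&str> {
        self.slots.get(handle)?.as_deref()
    }

    fn push_root(&mut self, handle: usize) {
        self.roots.push(handle);
    }

    fn pop_root(&mut self) {
        self.roots.pop();
    }
}

/// Concatenate two heap strings. `root_inputs` toggles the safe pattern.
fn concat(heap: &mut ToyHeap, a: usize, b: usize, root_inputs: bool) -> Option<usize> {
    if root_inputs {
        heap.push_root(a);
        heap.push_root(b);
    }
    // Allocating the result may trigger a collection.
    let result = heap.allocate(String::new());
    let joined = match (heap.get(a), heap.get(b)) {
        (Some(x), Some(y)) => Some(format!("{x}{y}")),
        _ => None, // an input was collected out from under us
    };
    if root_inputs {
        heap.pop_root();
        heap.pop_root();
    }
    let joined = joined?;
    heap.slots[result] = Some(joined);
    Some(result)
}

fn main() {
    // Without rooting, the allocation inside concat frees both inputs.
    let mut heap = ToyHeap::new(2);
    let (a, b) = (heap.allocate("foo".into()), heap.allocate("bar".into()));
    assert_eq!(concat(&mut heap, a, b, false), None);

    // With rooting, the inputs survive the collection.
    let mut heap = ToyHeap::new(2);
    let (a, b) = (heap.allocate("foo".into()), heap.allocate("bar".into()));
    let joined = concat(&mut heap, a, b, true).expect("inputs were rooted");
    println!("{}", heap.get(joined).unwrap()); // prints foobar
}
```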

Extending the Standard Library

print and clock are the built-ins from Crafting Interpreters, but a real Lox implementation might want more. The pattern is always the same:

Declare the function in the MLIR module as an external func.func — once, at module level
Register the symbol with the JIT execution engine
Write a #[no_mangle] extern "C" wrapper that bridges to the Rust runtime
Generate a call from the code generator when the built-in is used — only func.call, never func.func

Here’s a sketch for adding len(string):

#![allow(unused)]
fn main() {
// Runtime function
pub fn lox_len(&self, tag: u8, payload: i64) -> f64 {
    if tag != TAG_STRING {
        // In a production runtime, you'd return a runtime error
        // (or set an error flag the caller checks). This sketch
        // returns 0.0 as a sentinel — it works for the tutorial's
        // test cases but silently hides type mismatches. A better
        // approach: set a last_error field on the Runtime, return
        // a TAG_ERROR tagged value, or use a result type that the
        // caller must check before using the return value.
        eprintln!("len() expects a string argument");
        // Silent sentinel — not great. Part 10 shows how to report
        // type mismatches as runtime errors with source locations.
        return 0.0;
    }
    // payload is a pointer to the GC string object.
    // We need to dereference it through the GcString type, not as a raw C string.
    let string_obj = unsafe { &*(payload as *const GcString) };
    string_obj.len() as f64
}

// C-compatible wrapper
#[no_mangle]
pub extern "C" fn lox_len_wrapper(tag: u8, payload: i64) -> f64 {
    let rt = RUNTIME.lock().unwrap();
    rt.lox_len(tag, payload)
}
}

And in the code generator, you’d add the declaration in declare_runtime_functions and the call in a new compile_len method — following the same pattern as compile_print and compile_clock.
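The last_error idea mentioned in the comments above could look roughly like this. It is a sketch under simplifying assumptions: `ToyRuntime` is hypothetical, the payload is an index into a plain Vec instead of a GC pointer (so no unsafe code is needed), and `TAG_STRING = 3` is assumed to match the C header excerpt later in this part.

```rust
const TAG_NUMBER: u8 = 2;
const TAG_STRING: u8 = 3;

/// Toy runtime with an error slot the caller can check after each call.
struct ToyRuntime {
    strings: Vec<String>, // stand-in for the GC heap; payload = index
    last_error: Option<String>,
}

impl ToyRuntime {
    /// Length in bytes of a string value; records an error on misuse.
    fn lox_len(&mut self, tag: u8, payload: i64) -> f64 {
        if tag != TAG_STRING {
            self.last_error = Some(format!("len() expects a string, got tag {tag}"));
            return 0.0;
        }
        match self.strings.get(payload as usize) {
            Some(s) => s.len() as f64,
            None => {
                self.last_error = Some("len(): invalid string handle".to_string());
                0.0
            }
        }
    }

    /// Fetch and clear the pending error, if any.
    fn take_error(&mut self) -> Option<String> {
        self.last_error.take()
    }
}

fn main() {
    let mut rt = ToyRuntime { strings: vec!["hello".to_string()], last_error: None };

    assert_eq!(rt.lox_len(TAG_STRING, 0), 5.0);
    assert!(rt.take_error().is_none());

    // Wrong type: the return value is still 0.0, but the error is recorded.
    assert_eq!(rt.lox_len(TAG_NUMBER, 0), 0.0);
    println!("{:?}", rt.take_error());
}
```

The return value alone still can’t distinguish an empty string from a failure, so generated code would need to check the error slot after the call.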

Connecting to the Class System

In Part 7, we built classes and instances. Instances are GC-allocated objects with a methods table and fields dictionary. When a runtime function receives an instance (tag = TAG_INSTANCE, payload = pointer to GcInstance), it can access the instance’s data:

#![allow(unused)]
fn main() {
pub fn lox_instance_get_field(&self, instance_ptr: i64, field_name_ptr: i64) -> (u8, i64) {
    let instance = unsafe { &*(instance_ptr as *const GcInstance) };
    let field_name = unsafe { &*(field_name_ptr as *const GcString) };

    match instance.fields.get(field_name.as_str()) {
        Some(value) => {
            // Return the full (tag, payload) pair — the caller needs
            // the tag to know what type the field is. Returning only
            // the payload (value.to_raw().1) would lose type information:
            // a string field, an instance field, and a nil field would
            // all produce the same i64, and the caller couldn't tell
            // them apart.
            value.to_raw()
        }
        None => (TAG_NIL, 0), // nil: tag + payload
    }
}
}

Notice the pattern: lox_print uses from_raw to reconstruct a LoxValue because it needs to branch on the tag (different print behavior for each type). lox_instance_get_field works with raw pointers directly because it already knows the types — the tag check happened in the MLIR code before the call. Runtime functions that need to branch on the LoxValue type use from_raw; functions that already know the types work with raw pointers directly.

Safety note: This unsafe block dereferences raw pointers. The MLIR code must guarantee the pointer is valid and the tag is TAG_INSTANCE before calling this. Without that guarantee, the unsafe block is a crash waiting to happen. In a production runtime, you’d add tag-check assertions in debug builds.
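Debug-build tag checks of that kind might be sketched as follows. `ToyInstance` and `get_field_checked` are hypothetical, the accessor takes the tag alongside the pointer (unlike lox_instance_get_field above) so there is something to assert on, and the tag values are assumed to match the C header excerpt later in this part.

```rust
use std::collections::HashMap;

const TAG_NIL: u8 = 0;
const TAG_NUMBER: u8 = 2;
const TAG_INSTANCE: u8 = 5;

/// Stand-in for GcInstance; fields hold raw (tag, payload) pairs.
struct ToyInstance {
    fields: HashMap<String, (u8, i64)>,
}

/// Hypothetical checked accessor. The assertions run only in debug builds.
///
/// Safety: `payload` must point to a live ToyInstance.
unsafe fn get_field_checked(tag: u8, payload: i64, name: &str) -> (u8, i64) {
    debug_assert_eq!(tag, TAG_INSTANCE, "get_field called on a non-instance");
    debug_assert!(payload != 0, "null instance pointer");
    let instance = unsafe { &*(payload as *const ToyInstance) };
    instance.fields.get(name).copied().unwrap_or((TAG_NIL, 0))
}

fn main() {
    let mut fields = HashMap::new();
    fields.insert("x".to_string(), (TAG_NUMBER, 3.0f64.to_bits() as i64));

    // Box stands in for a GC allocation so the sketch has a stable address.
    let ptr = Box::into_raw(Box::new(ToyInstance { fields }));
    let payload = ptr as i64;

    let (tag, bits) = unsafe { get_field_checked(TAG_INSTANCE, payload, "x") };
    assert_eq!(tag, TAG_NUMBER);
    println!("{}", f64::from_bits(bits as u64)); // prints 3

    // A missing field comes back as nil, tag included.
    assert_eq!(unsafe { get_field_checked(TAG_INSTANCE, payload, "y") }, (TAG_NIL, 0));

    // Free the stand-in allocation.
    unsafe { drop(Box::from_raw(ptr)) };
}
```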

The Runtime as a C Library

An alternative approach — and one that’s more common in production compilers — is to write the runtime in C and link it as a shared library. This gives you a stable ABI (no Rust mangling concerns), language independence (the same runtime works with any frontend), and simpler linking (add -lruntime to the linker flags).

The tutorial includes a compilable C runtime in runtime/lox_runtime.h and runtime/lox_runtime.c. Here’s the key idea — every Lox value is a (tag, payload) pair, and the C runtime knows how to print each tag:

// runtime/lox_runtime.h (excerpt)
#define TAG_NIL        0
#define TAG_BOOL       1
#define TAG_NUMBER     2
#define TAG_STRING     3
#define TAG_CLOSURE    4   /* matches compiled value tags from Part 7 */
#define TAG_INSTANCE   5   /* matches compiled value tags from Part 7 */
#define TAG_CLASS      6   /* matches compiled value tags from Part 7 */
#define TAG_BOUND      7   /* matches compiled value tags from Part 7 */

void lox_print(int8_t tag, int64_t payload);
double lox_clock(void);
int64_t lox_alloc_string(const char *data, int64_t length);

// runtime/lox_runtime.c (excerpt)
#include <stdio.h>    /* printf */
#include <string.h>   /* memcpy */
#include "lox_runtime.h"

void lox_print(int8_t tag, int64_t payload) {
    switch (tag) {
        case TAG_NIL:    printf("nil\n"); break;
        case TAG_BOOL:   printf("%s\n", payload ? "true" : "false"); break;
        case TAG_NUMBER: {
            double value;
            memcpy(&value, &payload, sizeof(double));  // reinterpret i64 as f64
            printf("%g\n", value);
            break;
        }
        case TAG_STRING: {
            /*
             * ┌─────────────────────────────────────────────────────────┐
             * │ ⚠️  PAYLOAD FORMAT DIFFERS FROM THE RUST RUNTIME        │
             * │                                                         │
             * │ Rust runtime: payload = pointer to GcString on GC heap  │
             * │ C runtime:    payload = const char* (malloc, null-term) │
             * │                                                         │
             * │ Same TAG_STRING value, DIFFERENT payload format.        │
             * │ Do NOT mix Rust-allocated strings with C-allocated ones.│
             * │ The C runtime is a standalone test harness only.        │
             * └─────────────────────────────────────────────────────────┘
             */
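            /* Go through intptr_t so the integer-to-pointer cast is well-sized. */
            const char *str = (const char *)(intptr_t)payload;
            if (str) printf("%s\n", str);
            break;
        }
        /* %lld with a long long cast: `long` is only 32 bits on some platforms. */
        case TAG_INSTANCE: printf("<instance at %lld>\n", (long long)payload); break;
        case TAG_CLOSURE:  printf("<closure at %lld>\n", (long long)payload); break;
        case TAG_CLASS:    printf("<class at %lld>\n", (long long)payload); break;
        case TAG_BOUND:    printf("<bound method at %lld>\n", (long long)payload); break;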
        default: printf("<unknown tag %d>\n", tag); break;
    }
}

The C runtime uses null-terminated C strings allocated with malloc — no GC, no GcString wrapper. The lox_alloc_string function creates these strings so the MLIR code generator can store string data. See runtime/README.md for build instructions and a comparison with the Rust runtime.

Build and link:

gcc -shared -fPIC -o liblox_runtime.so runtime/lox_runtime.c

let engine = ExecutionEngine::new(module, 2, &["liblox_runtime.so"], false, false);

The C approach trades the safety of Rust for simplicity and universality. We’ll stick with the Rust runtime for the rest of this tutorial, but the C approach is worth knowing about: many production compilers rely on it (the LLVM project, for example, ships C runtimes such as compiler-rt).

Pulling It All Together

Let’s see the full picture of how a Lox program goes from source to execution, with the runtime in the loop:

// src/main.rs

use anyhow::Result;
use melior::Context;

mod ast;
mod lexer;
mod parser;
mod codegen;
mod runtime;
mod jit;
mod gc;
mod value;

fn main() -> Result<()> {
    let source = r#"
        var start = clock();
        print "Computing...";
        for (var i = 0; i < 1000; i = i + 1) {
            print i;
        }
        var elapsed = clock() - start;
        print elapsed;
    "#;

    // 1. Parse
    let tokens = lexer::tokenize(source)?;
    let ast = parser::Parser::new(tokens).parse()?;

    // 2. Create MLIR context and module
    let context = Context::new();
    let module = codegen::compile(&context, &ast)?;

    // 3. Create execution engine with runtime symbols registered
    let engine = jit::create_engine(&module)?;

    // 4. Invoke the compiled main function
    //
    // `invoke_packed` is unsafe: it runs JIT-compiled code with no bounds
    // checks. Its return type varies between Melior versions. In some it
    // returns (), in others Result. The `?` handles the Result case; if
    // your version returns (), call without the `?` operator. Part 11
    // uses `result.map_err(...)` to turn JIT failures into `anyhow`
    // errors; we use `?` here for simplicity.
    //
    // `@main` returns `!llvm.struct<(i8, i64)>` (the tagged-union result).
    // MLIR's packed calling convention stores a non-void return value
    // through the last pointer in the argument array, so we pass a result
    // slot even though we discard its contents. A real REPL would display
    // the result. `RawResult` is a local helper mirroring the struct layout.
    #[repr(C)]
    struct RawResult { tag: i8, payload: i64 }
    let mut result = RawResult { tag: 0, payload: 0 };
    unsafe {
        engine.invoke_packed("main", &mut [&mut result as *mut RawResult as *mut ()])?;
    }

    Ok(())
}

When this runs:

clock() calls our lox_clock_wrapper, which returns the current time
print "Computing..." calls lox_print_wrapper with tag=TAG_STRING and a pointer to the string data
print i calls lox_print_wrapper with tag=TAG_NUMBER and the f64 reinterpreted as i64
The elapsed time is computed in MLIR as a float subtraction, then printed
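The "f64 reinterpreted as i64" step is a pure bit-level cast, not a numeric conversion. As a rough sketch of what the Rust side of the boundary does with a number payload (the helper names `number_to_raw` and `number_from_raw` are illustrative, not taken from the tutorial's runtime):

```rust
// Tag value as listed in the tutorial's tag table.
const TAG_NUMBER: i8 = 2;

/// Pack an f64 as a (tag, payload) pair by reinterpreting its 64 bits.
/// No rounding happens: 1.0 does not become the integer 1.
fn number_to_raw(value: f64) -> (i8, i64) {
    (TAG_NUMBER, value.to_bits() as i64)
}

/// Recover the f64 from a number payload by reversing the bit cast.
fn number_from_raw(payload: i64) -> f64 {
    f64::from_bits(payload as u64)
}
```

Packing 1.0 yields the payload 0x3FF0000000000000, the IEEE 754 bit pattern for 1.0, and unpacking restores the original value exactly. This mirrors the llvm.bitcast in the generated code.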

The generated MLIR for this program would look something like:

module {
  func.func private @lox_print(i8, i64)
  func.func private @lox_clock() -> f64

  func.func @main() -> !llvm.struct<(i8, i64)> {
    %clock_result = func.call @lox_clock() : () -> f64
    %start_tag = arith.constant 2 : i8
    %start_payload = llvm.bitcast %clock_result : f64 to i64

    %str_tag = arith.constant 3 : i8
    // String constants use llvm.mlir.global for storage and llvm.mlir.addressof
    // to get a pointer. Shown here as a placeholder — the actual mechanism
    // is covered in the "String Constants" section of Part 1.
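    %str_ptr = llvm.mlir.addressof @"str_Computing" : !llvm.ptr
    // lox_print takes the payload as i64, so convert the pointer first.
    %str_payload = llvm.ptrtoint %str_ptr : !llvm.ptr to i64
    func.call @lox_print(%str_tag, %str_payload) : (i8, i64) -> ()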

    // ... loop body ...

    %clock_result2 = func.call @lox_clock() : () -> f64
    %elapsed = arith.subf %clock_result2, %clock_result : f64
    %elapsed_tag = arith.constant 2 : i8
    %elapsed_payload = llvm.bitcast %elapsed : f64 to i64
    func.call @lox_print(%elapsed_tag, %elapsed_payload) : (i8, i64) -> ()

    // Construct the nil return value: tag = TAG_NIL (0), payload = 0
    %nil_tag = arith.constant 0 : i8
    %nil_payload = arith.constant 0 : i64
    %nil_undef = llvm.mlir.undef : !llvm.struct<(i8, i64)>
    %nil_tagged1 = llvm.insertvalue %nil_tag, %nil_undef[0] : !llvm.struct<(i8, i64)>
    %nil_tagged2 = llvm.insertvalue %nil_payload, %nil_tagged1[1] : !llvm.struct<(i8, i64)>
    func.return %nil_tagged2 : !llvm.struct<(i8, i64)>
  }
}

Notice the structure: func.func private @lox_print is a module-level declaration (no body), while func.call @lox_print(...) is the actual call inside a function. Every func.call to a lox_* function is resolved at JIT time to our registered wrappers, which delegate to the Runtime struct. The MLIR code never knows about Rust, the GC, or LoxValue — it passes raw integers and floats.

What We Built

Across nine parts, we’ve gone from zero to a working Lox compiler. We built an AST and parser faithful to Crafting Interpreters. We generated MLIR through a Lox → MLIR dialect pipeline, then added tagged unions for proper dynamic typing with (tag, payload) pairs. Source locations came via MLIR’s first-class location tracking, and a lowering pipeline carried us from the Lox dialect through standard dialects to LLVM IR. JIT execution let us compile and run in the same process. We added garbage collection with mark-and-sweep and root tracking, closures with captured variables and upvalues, and classes and instances as GC-managed objects with methods and fields. Finally, we added a standard library with print, clock, and a pattern for extending it.

That’s the core. Two more pieces make it production-ready: error reporting that points to the source line (Part 10), and cross-module linking so programs can span multiple files (Part 11).

Where to Go Next

If you want to keep building, here are some directions.

Cross-file programs — Part 11 shows how to compile and link multiple Lox files into a single program.

More built-ins like len(), str(), input(), and sqrt().

Error reporting with runtime type errors and line numbers (now covered in Part 10).

AOT compilation — write the LLVM IR to an object file instead of JIT.

Optimization passes between lowering stages.

A standard library in Lox — write list.lox, math.lox, etc. in Lox itself.

Debugging support — generate DWARF debug info from the source locations.

The runtime pattern we established here — declare once at module level, register with JIT, write a wrapper, call from the block — scales to any number of native functions. Each new built-in follows the same recipe. The hard part isn’t the runtime; it’s making sure the GC stays correct and the type tags are handled consistently. But you already know how to do that.
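To make the recipe concrete, here is a hedged sketch of a `sqrt` built-in. Everything named here is hypothetical (`lox_sqrt`, `lox_sqrt_wrapper`, `native_symbols`), and it simplifies by taking a raw f64 the way lox_clock returns one, on the assumption that the generated MLIR has already checked for TAG_NUMBER and extracted the payload. The symbol table stands in for whatever registration call your JIT setup uses.

```rust
/// Step 1: the declaration the code generator would emit once at module
/// level, shown as text so this sketch needs no MLIR bindings.
const SQRT_DECL: &str = "func.func private @lox_sqrt(f64) -> f64";

/// Step 3: the wrapper. `extern "C"` gives it the C calling convention
/// that JIT-compiled code uses when it calls into the host.
extern "C" fn lox_sqrt_wrapper(x: f64) -> f64 {
    x.sqrt()
}

/// Step 2: a hypothetical registration table pairing each symbol name
/// with the wrapper's address. A real setup would hand these pairs to
/// the execution engine before invoking `main`.
fn native_symbols() -> Vec<(&'static str, *mut ())> {
    vec![("lox_sqrt", lox_sqrt_wrapper as *mut ())]
}
```

Step 4, the call from the block, is a func.call @lox_sqrt in the generated code, just like the lox_clock calls shown earlier. Because the JIT resolves the symbol by registered address, the Rust function's own name never needs to match the MLIR name.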

Next: Part 10 — Error Reporting and Debugging — Our compiler works — until it doesn’t. When a runtime error crashes the program, the user sees Segmentation fault and nothing else. No line number, no function name, no hint about what went wrong. We’ll thread source locations through the MLIR pipeline so runtime errors can say error on line 7: undefined variable 'x' — the difference between a usable compiler and a frustrating one.
