MLIR for Lox: Part 7 — Classes Without a Type System — Compiling Dynamic Dispatch in MLIR

Classes are the last major feature in Crafting Interpreters. They combine everything we’ve built — heap allocation, GC roots, closures — and add a new layer: method dispatch, inheritance, and this binding. Every concept from Parts 2–6 shows up again here, but harder: objects reference other objects (GC marking chains), methods capture this (closure environments), and inheritance walks a linked list of class objects (pointer chasing across the heap).

This part assumes you’ve read Parts 2–6. We’ll extend the GC runtime from Parts 2–3 and the MLIR codegen from Part 4.


What We’re Building

By the end, this Lox program should work:

class Doughnut {
  cook(flavor) {
    print "Frying " + flavor + " doughnut";
  }
}

class FilledDoughnut < Doughnut {
  cook(flavor) {
    super.cook(flavor);
    print "Injecting " + flavor + " filling";
  }
}

var d = FilledDoughnut();
d.cook("custard");

Output:

Frying custard doughnut
Injecting custard filling

The Object Model

Lox has three class-related object types:

Object           What It Holds                  Analogy
──────           ─────────────                  ───────
ObjClass         Name, methods, superclass      A blueprint
ObjInstance      Class pointer, field table     A house built from the blueprint
ObjBoundMethod   Receiver + closure             A method “bound” to an instance

These are all heap-allocated, GC-managed objects. They extend the ObjHeader we built in Part 2.

Updated Object Types

#![allow(unused)]
fn main() {
// src/runtime/object.rs

use std::cell::UnsafeCell;
use std::collections::HashMap;
use crate::runtime::value::LoxValue;
use crate::runtime::heap::{ObjHeader, ObjType};

/// A Lox class object
pub struct ObjClass {
    pub header: ObjHeader,
    pub name: String,
    /// Methods defined directly on this class (not inherited)
    pub methods: HashMap<String, LoxValue>,
    /// Superclass, if any (null for root classes)
    pub superclass: *mut ObjClass,
}

impl ObjClass {
    /// Look up a method, walking the inheritance chain
    pub fn find_method(&self, name: &str) -> Option<LoxValue> {
        if let Some(value) = self.methods.get(name) {
            return Some(value.clone());
        }
        // Walk up the inheritance chain
        if !self.superclass.is_null() {
            // SAFETY: GC guarantees the object is alive when we have a reference
            unsafe { (*self.superclass).find_method(name) }
        } else {
            None
        }
    }
}
}

Updated LoxValue Enum

This part adds three new variants to the LoxValue enum from Part 2. The cumulative version follows a short note on the Gc aliases it uses.

What are GcClass, GcInstance, GcBoundMethod? The Gc<T> wrapper tells the garbage collector that these values live on the heap and need to be traced. Raw pointers are invisible to the GC — if the collector moves or frees the underlying object, you get a dangling pointer. Gc<T> wraps a raw pointer and implements Trace so the collector can walk the object graph.

These are type aliases:

type GcClass = Gc<ObjClass>;
type GcInstance = Gc<ObjInstance>;
type GcBoundMethod = Gc<ObjBoundMethod>;

The aliases exist because writing Gc<ObjClass> everywhere gets old fast. You saw the same pattern in Parts 2 (GcString) and 5 (GcClosure).

#![allow(unused)]
fn main() {
// src/runtime/value.rs

#[repr(C)]
#[derive(Debug, Clone)]
pub enum LoxValue {
    Nil,
    Bool(bool),
    Number(f64),
    String(GcString),       // Part 2: heap-allocated string (GC object)
    Closure(GcClosure),     // Part 5: closure with captured environment
    Instance(GcInstance),   // Part 7: class instance
    Class(GcClass),         // Part 7: class object
    BoundMethod(GcBoundMethod), // Part 7: method bound to a receiver
}

impl LoxValue {
    /// Create a bound method value
    ///
    /// `method` is a LoxValue (specifically LoxValue::Closure) rather than a raw
    /// pointer — the GC needs to trace through it. The underlying closure is
    /// wrapped in LoxValue so the collector can walk the object graph.
    pub fn bound_method(receiver: *mut ObjInstance, method: LoxValue) -> Self {
        LoxValue::BoundMethod(GcBoundMethod::new(receiver, method))
    }
}
}

Where does the allocation happen? ObjInstance::get_property calls bound_method when it finds a method on the class and needs to bind it to the receiver. GcBoundMethod::new() allocates the ObjBoundMethod on the GC heap and returns the Gc wrapper. If you’re building the runtime incrementally, you can use Heap::bind_method instead (shown below) — that version takes &mut Heap and calls self.allocate() directly, which is more explicit about the allocation. The two paths produce the same result; the difference is whether the allocation happens through the Gc::new constructor or through an explicit Heap::allocate call.

#![allow(unused)]
fn main() {
use std::collections::HashMap;

/// A method bound to a specific receiver instance
///
/// When you write `instance.method()`, the runtime needs to combine the method's
/// closure with the instance that receives the call (the `this` value). This struct
/// holds both pieces — a pointer to the receiver and the closure that implements the
/// method body. The GC can trace through both: the receiver pointer is a heap object,
/// and the method's `LoxValue` tag tells the collector whether it contains a heap reference.
pub struct ObjBoundMethod {
    pub header: ObjHeader,
    /// The instance that receives the method call (the `this` value)
    pub receiver: *mut ObjInstance,
    /// The underlying closure
    pub method: LoxValue,
}

/// A Lox instance object
pub struct ObjInstance {
    pub header: ObjHeader,
    /// The class this instance belongs to
    pub class: *mut ObjClass,
    /// Instance fields (set at runtime, not on the class)
    ///
    /// Why `UnsafeCell`? `get_property` takes `&self` (shared reference),
    /// but `set_property` needs to mutate the fields. In normal Rust,
    /// `&self` means "no mutation." `UnsafeCell` is Rust's opt-out mechanism
    /// for interior mutability — it's the only way to soundly mutate through
    /// a shared reference. The actual `&mut` extraction still requires an
    /// `unsafe` block (as shown in `get_property` and `set_property` below),
    /// but `UnsafeCell` makes it *legal* — without it, any such mutation
    /// would be undefined behavior, even inside `unsafe`.
    ///
    /// The GC code holds `&ObjInstance` while tracing (read-only) and needs
    /// `&mut HashMap` when setting fields. `UnsafeCell` makes this possible
    /// without restructuring the API into `RefCell<HashMap>` (which would
    /// add runtime borrow-check overhead on every field access).
    pub fields: UnsafeCell<HashMap<String, LoxValue>>,
}

impl ObjInstance {
    pub fn new(class: *mut ObjClass) -> Self {
        Self {
            header: ObjHeader::new(ObjType::Instance),
            class,
            fields: UnsafeCell::new(HashMap::new()),
        }
    }

    /// Get a field value, or look up a method on the class
    pub fn get_property(&self, name: &str) -> Option<LoxValue> {
        // Fields shadow methods
        let fields = unsafe { &*self.fields.get() };
        if let Some(value) = fields.get(name) {
            return Some(value.clone());
        }

        // Look up method on the class
        unsafe {
            (*self.class).find_method(name).map(|method| {
                // Bind the method to this instance
                //
                // The *const → *mut cast looks wrong, but it's a standard
                // pattern in GC-managed code: `self` is `&ObjInstance`, so
                // we can only derive a *const pointer. ObjBoundMethod needs
                // *mut to match the receiver field's type. This is safe
                // because the GC manages the lifetime — the pointer is only
                // dereferenced while the object is alive and the GC isn't
                // collecting.
                //
                // **Soundness caveat:** This only works if `self` refers to
                // a GC-managed heap object (e.g., `self` comes from
                // dereferencing a `Gc<ObjInstance>`). If someone constructs
                // an `ObjInstance` on the stack and calls `get_property`,
                // the resulting pointer will dangle after the stack frame
                // returns. In a complete implementation, `get_property`
                // would receive `&GcInstance` and call `GcInstance::as_ptr()`
                // instead of casting `self` — that guarantees the pointer
                // points to a GC-managed heap object.
                LoxValue::bound_method(self as *const ObjInstance as *mut ObjInstance, method)
            })
        }
    }

    /// Set a field value
    pub fn set_property(&self, name: String, value: LoxValue) {
        let fields = unsafe { &mut *self.fields.get() };
        fields.insert(name, value);
    }
}
}
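The UnsafeCell pattern is easier to see in isolation. The following is a minimal sketch, not the runtime's actual code: `FieldTable` is a hypothetical stand-in for ObjInstance that stores plain f64 values instead of LoxValue, and it assumes single-threaded use, as the runtime does.

```rust
use std::cell::UnsafeCell;
use std::collections::HashMap;

/// Hypothetical stand-in for an instance's field table.
struct FieldTable {
    fields: UnsafeCell<HashMap<String, f64>>,
}

impl FieldTable {
    fn new() -> Self {
        Self { fields: UnsafeCell::new(HashMap::new()) }
    }

    /// Read through a shared reference. The value is copied out, so no
    /// borrow of the map outlives this call.
    fn get(&self, name: &str) -> Option<f64> {
        // SAFETY: single-threaded, and no `&mut` to the map is live here.
        let fields = unsafe { &*self.fields.get() };
        fields.get(name).copied()
    }

    /// Write through a shared reference, which plain `&self` would forbid.
    fn set(&self, name: &str, value: f64) {
        // SAFETY: single-threaded, and the `&mut` is dropped before this
        // function returns, so it never overlaps another reference.
        let fields = unsafe { &mut *self.fields.get() };
        fields.insert(name.to_string(), value);
    }
}
```

Both methods take `&self`, yet `set` mutates the map. The safety argument rests on never letting a reference obtained from the cell escape the method that created it.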

Updated ObjType Enum

Each part adds new object types. Here’s the cumulative enum after Part 7. Every variant from Parts 2 and 5 is still here, with Instance, Class, and BoundMethod appended:

// src/runtime/object.rs (continued)

#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ObjType {
    Number = 0,       // Introduced in Part 2 (GC)
    String = 1,       // Introduced in Part 2 (GC)
    Environment = 2,  // Introduced in Part 5 (closures)
    Closure = 3,      // Introduced in Part 5 (closures)
    Instance = 4,     // Part 7
    Class = 5,        // Part 7
    BoundMethod = 6,  // Part 7
}

Why explicit discriminants? The discriminant values are the obj_type byte stored in every heap object’s header. If you change the numbering, the GC’s trace_object dispatch (which reads header.obj_type and matches on the numeric value) will trace the wrong object type. Explicit discriminants make this contract visible.
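One way to make that contract checkable is to decode the header byte through a fallible conversion, so an out-of-range byte becomes an error instead of a mis-traced object. This is a sketch under assumptions: the `TryFrom<u8>` impl is not part of the runtime shown in this series, and the enum is repeated here only to keep the example self-contained.

```rust
use std::convert::TryFrom;

#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ObjType {
    Number = 0,
    String = 1,
    Environment = 2,
    Closure = 3,
    Instance = 4,
    Class = 5,
    BoundMethod = 6,
}

/// Hypothetical helper: decode the `obj_type` byte read from a header.
impl TryFrom<u8> for ObjType {
    type Error = u8;

    fn try_from(byte: u8) -> Result<Self, Self::Error> {
        match byte {
            0 => Ok(ObjType::Number),
            1 => Ok(ObjType::String),
            2 => Ok(ObjType::Environment),
            3 => Ok(ObjType::Closure),
            4 => Ok(ObjType::Instance),
            5 => Ok(ObjType::Class),
            6 => Ok(ObjType::BoundMethod),
            // Unknown byte: hand it back so the caller can report it.
            other => Err(other),
        }
    }
}
```

Because the match arms spell out the same numbers as the enum, renumbering a variant without updating the decoder shows up as a failed round trip in a test rather than as heap corruption.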

What’s Number doing here? Number is a stack value, not a heap object — the GC never traces it. So why is it in ObjType? Because the GC runtime is simpler if every variant gets a match arm, even if some arms are no-ops. The GC’s match arm for Number does nothing (there are no outgoing references to trace).

Don’t confuse ObjType discriminants with compiled value tags. These are two different numbering schemes for two different purposes:

ObjType (GC dispatch)            Compiled value tags (codegen)
─────────────────────            ─────────────────────────────
0 = Number      (stack value)    0 = Nil         (stack value)
1 = String      (heap object)    1 = Bool        (stack value)
2 = Environment (GC-internal)    2 = Number      (stack value)
3 = Closure     (heap object)    3 = String      (heap pointer)
4 = Instance    (heap object)    4 = Closure     (heap pointer)
5 = Class       (heap object)    5 = Instance    (heap pointer)
6 = BoundMethod (heap object)    6 = Class       (heap pointer)
                                 7 = BoundMethod (heap pointer)

Two differences: the compiled tags include Nil and Bool (they’re Lox values the codegen must represent, even though they’re not heap objects), and they exclude Environment (it’s a GC-internal type that never appears as a Lox value). When compiling trace_object, convert ObjType discriminants to compiled value tags — don’t use them interchangeably.
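The conversion can be written down as a single function. In this sketch, TAG_STRING and TAG_CLOSURE are names the GC comments in this part already use; the other constant names and the `value_tag_for` helper are assumptions made for illustration. The numeric values come straight from the table above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ObjType {
    Number = 0,
    String = 1,
    Environment = 2,
    Closure = 3,
    Instance = 4,
    Class = 5,
    BoundMethod = 6,
}

// Compiled value tags (right-hand column of the table).
const TAG_NUMBER: u8 = 2;
const TAG_STRING: u8 = 3;
const TAG_CLOSURE: u8 = 4;
const TAG_INSTANCE: u8 = 5;
const TAG_CLASS: u8 = 6;
const TAG_BOUND_METHOD: u8 = 7;

/// Hypothetical helper: map a GC object type to the codegen value tag.
/// Returns None for Environment, which never appears as a Lox value.
fn value_tag_for(obj_type: ObjType) -> Option<u8> {
    match obj_type {
        ObjType::Number => Some(TAG_NUMBER),
        ObjType::String => Some(TAG_STRING),
        ObjType::Environment => None,
        ObjType::Closure => Some(TAG_CLOSURE),
        ObjType::Instance => Some(TAG_INSTANCE),
        ObjType::Class => Some(TAG_CLASS),
        ObjType::BoundMethod => Some(TAG_BOUND_METHOD),
    }
}
```

Note that a closure is 3 on the GC side and 4 on the codegen side. Casting one numbering directly to the other would silently treat closures as strings.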


GC Tracing for Classes

Every new object type needs to report its outgoing references. Miss one and you get use-after-free.

Classes force a change to the GC API. Parts 2–6 used standalone mark_object and mark_references functions that operated on raw *mut u8 pointers. That works when every reference is a raw heap pointer. Classes break that assumption: ObjInstance has a raw *mut ObjClass pointer for its class reference, but ObjBoundMethod stores its method as a LoxValue tagged union. You can’t call the same trace function on both — a LoxValue might be Nil or Number (not a heap pointer at all), while a raw pointer is always a heap object. The fix: two trace methods. trace_object handles raw heap pointers. trace_value unwraps LoxValue tags and forwards heap variants to trace_object. Same tracing logic, but the type distinction is now explicit.

#![allow(unused)]
fn main() {
// src/runtime/gc.rs

impl GC {
    /// Trace a tagged LoxValue — dispatches based on the value's tag.
    ///
    /// Heap objects (String, Closure, Instance, Class, BoundMethod) get
    /// forwarded to `trace_object`. Stack values (Nil, Bool, Number) have
    /// no heap references — nothing to trace.
    ///
    /// This method exists because several object types store `LoxValue`
    /// fields (method tables, instance fields, bound methods). We can't
    /// call `trace_object` on a `LoxValue` directly because a `LoxValue`
    /// might be `Nil` or `Number` — not a heap pointer at all.
    pub fn trace_value(&mut self, value: LoxValue) {
        match value {
            LoxValue::Nil | LoxValue::Bool(_) | LoxValue::Number(_) => {
                // Stack values — no heap references to trace
            }
            LoxValue::String(s) => {
                self.trace_object(s.as_ptr() as *mut ObjHeader);
            }
            LoxValue::Closure(c) => {
                self.trace_object(c.as_ptr() as *mut ObjHeader);
            }
            LoxValue::Instance(i) => {
                self.trace_object(i.as_ptr() as *mut ObjHeader);
            }
            LoxValue::Class(c) => {
                self.trace_object(c.as_ptr() as *mut ObjHeader);
            }
            LoxValue::BoundMethod(b) => {
                self.trace_object(b.as_ptr() as *mut ObjHeader);
            }
        }
    }

    /// Trace all reachable objects from `obj`
    pub fn trace_object(&mut self, obj: *mut ObjHeader) {
        let header = unsafe { &mut *obj };
        if header.is_marked {
            return;
        }
        header.is_marked = true;

        match header.obj_type {
            ObjType::Number => {
                // Number is a stack value, not a heap object.
                // It's in the enum for completeness but the GC
                // never actually traces it — there are no outgoing references.
            }
            ObjType::String => {
                // Strings have no outgoing references
            }
            ObjType::Environment => {
                // Environment uses the prepended-header model from Parts 2–4:
                // the ObjHeader sits before the data, so we skip past it.
                // (ObjClass, ObjInstance, and ObjBoundMethod embed the header as
                // their first field, so the cast works directly. Environment and
                // Closure don't — they're allocated with the header prepended.)
                let data_ptr = unsafe { (obj as *const u8).add(std::mem::size_of::<ObjHeader>()) };
                let env = unsafe { &*(data_ptr as *const Environment) };
                // Trace the enclosing environment — if the outer
                // environment is only reachable through this pointer,
                // the GC must walk this edge or it could collect it.
                if !env.enclosing.is_null() {
                    self.trace_object(env.enclosing as *mut ObjHeader);
                }
                // Trace each variable slot.
                for i in 0..env.slot_count {
                    let slot_ptr = unsafe { *env.slots.as_ptr().add(i) };
                    if !slot_ptr.is_null() {
                        self.trace_object(slot_ptr as *mut ObjHeader);
                    }
                }
                // NOTE: The code above uses the *mut u8 slot model from Part 5,
                // where each slot is a raw heap pointer. If you've switched to
                // the tagged-union (i8, i64) model (described in the "What the
                // Generated MLIR Looks Like" section below), replace the null
                // check with a tag check: trace the slot only when the tag byte
                // indicates a heap object (TAG_STRING, TAG_CLOSURE, etc.).
                // Nil, Bool, and Number tags have no heap references.
            }
            ObjType::Closure => {
                // Same prepended-header model as Environment above
                let data_ptr = unsafe { (obj as *const u8).add(std::mem::size_of::<ObjHeader>()) };
                let closure = unsafe { &*(data_ptr as *const Closure) };
                // Trace the captured environment. Part 5's Closure struct has
                // two fields: `function_index` and `environment` (a single
                // pointer to the captured Environment). Mark the environment
                // object itself rather than only its slots: otherwise the
                // sweep phase could free the environment while the closure
                // still points to it. The Environment arm above then traces
                // the enclosing chain and every slot.
                //
                // This assumes `closure.environment` points at the
                // environment's ObjHeader, the same convention the
                // Environment arm uses for `env.enclosing`.
                if !closure.environment.is_null() {
                    self.trace_object(closure.environment as *mut ObjHeader);
                }
                // A full implementation adds an `upvalues` array to the closure
                // — one entry per captured variable — so that the GC can trace
                // each captured value independently. The `environment` pointer
                // from Part 5 is equivalent to a single upvalue (an environment
                // is a flat array of captured values), but real closures often
                // capture from multiple enclosing scopes, which requires
                // separate upvalue objects. The trace logic is the same either
                // way: walk the captured values and mark them. With upvalues:
                //
                //     for &upvalue_ptr in &closure.upvalues {
                //         self.trace_value(unsafe { (*upvalue_ptr).closed });
                //     }
            }
            ObjType::Class => {
                let class = unsafe { &*(obj as *const ObjClass) };
                // Trace method values
                for value in class.methods.values() {
                    self.trace_value(value.clone());
                }
                // Trace superclass
                if !class.superclass.is_null() {
                    self.trace_object(class.superclass as *mut ObjHeader);
                }
            }
            ObjType::Instance => {
                let instance = unsafe { &*(obj as *const ObjInstance) };
                // Trace the class reference
                self.trace_object(instance.class as *mut ObjHeader);
                // Trace all field values
                let fields = unsafe { &*instance.fields.get() };
                for value in fields.values() {
                    self.trace_value(value.clone());
                }
            }
            ObjType::BoundMethod => {
                let bound = unsafe { &*(obj as *const ObjBoundMethod) };
                // Trace the receiver instance
                self.trace_object(bound.receiver as *mut ObjHeader);
                // Trace the method closure
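                // `bound` is a shared reference and LoxValue is not Copy,
                // so clone the value instead of moving it out.
                self.trace_value(bound.method.clone());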
            }
        }
    }
}
}

Key insight: ObjClass traces its methods and its superclass. ObjInstance traces its class and all field values. ObjBoundMethod traces both the receiver and the closure. Every edge in the object graph must be walked.
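To see those edges being walked without any unsafe code, here is a toy mark phase. It is a sketch under simplifying assumptions: objects live in a Vec and refer to each other by index instead of raw pointers, `Value` is a cut-down stand-in for LoxValue, and `mark_from` is a hypothetical helper. The split between trace_value and trace_object mirrors the real code above.

```rust
#[derive(Clone, Copy)]
enum Value {
    Nil,
    Number(f64),
    Obj(usize), // index of a heap object
}

enum Obj {
    Closure,
    Class { methods: Vec<Value>, superclass: Option<usize> },
    Instance { class: usize, fields: Vec<Value> },
    BoundMethod { receiver: usize, method: Value },
}

struct Heap {
    objects: Vec<Obj>,
    marked: Vec<bool>,
}

impl Heap {
    /// Tagged values: only heap variants are forwarded.
    fn trace_value(&mut self, value: Value) {
        if let Value::Obj(index) = value {
            self.trace_object(index);
        }
    }

    /// Heap objects: mark, then follow every outgoing edge.
    fn trace_object(&mut self, index: usize) {
        if self.marked[index] {
            return;
        }
        self.marked[index] = true;
        // Collect the edges first so the borrow of `objects` ends
        // before we recurse.
        let edges: Vec<Value> = match &self.objects[index] {
            Obj::Closure => Vec::new(),
            Obj::Class { methods, superclass } => {
                let mut edges = methods.clone();
                if let Some(parent) = superclass {
                    edges.push(Value::Obj(*parent));
                }
                edges
            }
            Obj::Instance { class, fields } => {
                let mut edges = fields.clone();
                edges.push(Value::Obj(*class));
                edges
            }
            Obj::BoundMethod { receiver, method } => {
                vec![Value::Obj(*receiver), *method]
            }
        };
        for edge in edges {
            self.trace_value(edge);
        }
    }
}

/// Hypothetical helper: mark everything reachable from one root.
fn mark_from(objects: Vec<Obj>, root: Value) -> Vec<bool> {
    let mut heap = Heap { marked: vec![false; objects.len()], objects };
    heap.trace_value(root);
    heap.marked
}
```

Starting from a single bound method, the walk reaches the receiver, its class, the superclass, and the method closure. Dropping any one of the edge pushes leaves a live object unmarked, which is exactly the use-after-free scenario.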


this Binding

When a method is called, this refers to the receiver instance. We implement this the same way closures capture upvalues (Part 5) — the method’s closure has an implicit upvalue that points to this.

Implementation note: The compile_method code below compiles methods as regular functions for clarity. A complete implementation would add two implicit upvalue slots to each method’s closure: upvalue[0] = this (the receiver, bound at call time) and upvalue[1] = super (the superclass). The compile_this and compile_super generators assume these slots exist in the variables map — wiring them into the compiler is a straightforward extension of the closure capture logic from Part 5, but we leave the actual upvalue insertion as an exercise to keep the codegen example focused.

How It Works

When we create a class, each method closure gets an extra upvalue slot for this. When a method is bound to an instance (via ObjBoundMethod), we fill that slot with the instance pointer.

// src/runtime/bind.rs

use crate::runtime::object::{ObjBoundMethod, ObjInstance, ObjHeader, ObjType};
use crate::runtime::value::LoxValue;
use crate::runtime::heap::Heap;

impl Heap {
    /// Bind a method closure to a receiver instance
    pub fn bind_method(
        &mut self,
        receiver: *mut ObjInstance,
        method: LoxValue,
    ) -> *mut ObjBoundMethod {
        let bound = ObjBoundMethod {
            header: ObjHeader::new(ObjType::BoundMethod),
            receiver,
            method,
        };
        self.allocate(bound)
    }
}

The Method Call Protocol

When compiled code executes a method call like d.cook("custard"):

  1. Evaluate d → get the ObjInstance pointer
  2. Look up "cook" on the instance → get an ObjBoundMethod
  3. Call the bound method’s closure with the provided arguments
  4. Inside the closure, this resolves to the bound receiver

No vtable needed. Method dispatch is a hash map lookup that walks the superclass chain.
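The four steps can be sketched in safe Rust. Everything here is a simplification for illustration: methods are plain function pointers rather than LoxValue closures, there is no inheritance, and the names `Method`, `get_method`, and `label` are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical method signature: `this` plus a single argument.
type Method = fn(&Instance<'_>, &str) -> String;

struct Class {
    methods: HashMap<String, Method>,
}

struct Instance<'c> {
    class: &'c Class,
    label: String,
}

/// A method paired with the receiver it was looked up on.
struct BoundMethod<'i> {
    receiver: &'i Instance<'i>,
    method: Method,
}

/// Step 2: look up the name on the instance's class and bind the receiver.
fn get_method<'i>(instance: &'i Instance<'i>, name: &str) -> Option<BoundMethod<'i>> {
    let method = *instance.class.methods.get(name)?;
    Some(BoundMethod { receiver: instance, method })
}

impl BoundMethod<'_> {
    /// Step 3: call the underlying function, passing the bound receiver.
    fn call(&self, arg: &str) -> String {
        (self.method)(self.receiver, arg)
    }
}

/// Step 4: inside the method body, `this` is the bound receiver.
fn cook(this: &Instance<'_>, flavor: &str) -> String {
    format!("{} fries a {} doughnut", this.label, flavor)
}
```

The caller never passes the receiver explicitly. It travels inside the BoundMethod, which is why a method value can be stored in a variable and called later with `this` still intact.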


Inheritance: Linked Lists All the Way Down

Lox’s inheritance is single-inheritance only. That means the class hierarchy is a linked list:

FilledDoughnut → Doughnut → nil

When we look up a method, we walk the chain. The find_method method we defined earlier walks the superclass linked list, checking each class’s method table until it finds a match or reaches the end.
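The same walk can be written as a loop instead of recursion. This is a sketch that swaps the raw superclass pointer for an index into a class table, so it runs without unsafe code. The `serve` method is made up for the example, and method bodies are represented by labels.

```rust
use std::collections::HashMap;

struct Class {
    /// Method name mapped to a label standing in for the compiled closure.
    methods: HashMap<&'static str, &'static str>,
    /// Index of the superclass in the class table, if any.
    superclass: Option<usize>,
}

/// Walk from `start` toward the root until some class defines `name`.
fn find_method(classes: &[Class], start: usize, name: &str) -> Option<&'static str> {
    let mut current = Some(start);
    while let Some(index) = current {
        let class = &classes[index];
        if let Some(body) = class.methods.get(name) {
            return Some(*body);
        }
        current = class.superclass;
    }
    None
}

/// Index 0 is Doughnut, index 1 is FilledDoughnut < Doughnut.
fn doughnut_classes() -> Vec<Class> {
    vec![
        Class {
            methods: HashMap::from([
                ("cook", "Doughnut.cook"),
                ("serve", "Doughnut.serve"),
            ]),
            superclass: None,
        },
        Class {
            methods: HashMap::from([("cook", "FilledDoughnut.cook")]),
            superclass: Some(0),
        },
    ]
}
```

Looking up `cook` on FilledDoughnut stops at the first class, since the override shadows the inherited method. Looking up `serve` takes one more hop, and a missing name costs a full walk to the root.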

super Calls

A super.cook(flavor) expression needs two things:

  1. The superclass of the enclosing class (not the receiver’s class)
  2. The method name

We resolve super at compile time, not runtime. During codegen, when we’re inside a class method, we know which class we’re in and therefore what the superclass is. We store this as a hidden upvalue on the closure — the same mechanism as this.

// During compilation, inside a class method:
// The method closure gets two implicit upvalues:
//   [0] = this   (the receiver instance)
//   [1] = super  (the enclosing class's superclass)

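This means locating the superclass is free at runtime: the pointer is already captured in the closure, so no variable lookup is needed. The method lookup itself still happens at runtime, through the lox_super_lookup runtime call declared in the codegen section.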
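Why the enclosing class and not the receiver’s class? A three-level hierarchy shows the difference. The sketch below uses a class table indexed by position, and the third class, GlazedFilledDoughnut, is hypothetical. The function returns the index of the class whose method would run.

```rust
use std::collections::HashSet;

struct Class {
    methods: HashSet<&'static str>,
    superclass: Option<usize>,
}

/// Walk upward from `start` and return the first class defining `name`.
fn defining_class(classes: &[Class], start: Option<usize>, name: &str) -> Option<usize> {
    let mut current = start;
    while let Some(index) = current {
        if classes[index].methods.contains(name) {
            return Some(index);
        }
        current = classes[index].superclass;
    }
    None
}

/// Resolve `super.name` from the class whose method body contains the call.
fn super_lookup(classes: &[Class], enclosing: usize, name: &str) -> Option<usize> {
    defining_class(classes, classes[enclosing].superclass, name)
}

/// 0 = Doughnut, 1 = FilledDoughnut < Doughnut,
/// 2 = GlazedFilledDoughnut < FilledDoughnut (defines no `cook`).
fn three_level_chain() -> Vec<Class> {
    vec![
        Class { methods: HashSet::from(["cook"]), superclass: None },
        Class { methods: HashSet::from(["cook"]), superclass: Some(0) },
        Class { methods: HashSet::new(), superclass: Some(1) },
    ]
}
```

Suppose `cook` is called on a GlazedFilledDoughnut instance and FilledDoughnut.cook runs `super.cook(flavor)`. Resolving from the enclosing class (index 1) reaches Doughnut.cook. Resolving from the receiver’s class (index 2) would find FilledDoughnut.cook again and recurse forever.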


MLIR Code Generation for Classes

Now the interesting part: generating MLIR for class declarations, instance creation, property access, and method calls.

New AST Nodes

#![allow(unused)]
fn main() {
// src/ast.rs (additions)

#[derive(Debug, Clone)]
pub enum Expr {
    // ... existing variants ...
    Get(GetExpr),
    Set(SetExpr),
    This(ThisExpr),
    Super(SuperExpr),
}

#[derive(Debug, Clone)]
pub struct GetExpr {
    pub location: Location,
    pub object: Box<Expr>,
    pub name: String,
}

#[derive(Debug, Clone)]
pub struct SetExpr {
    pub location: Location,
    pub object: Box<Expr>,
    pub name: String,
    pub value: Box<Expr>,
}

#[derive(Debug, Clone)]
pub struct ThisExpr {
    pub location: Location,
}

#[derive(Debug, Clone)]
pub struct SuperExpr {
    pub location: Location,
    pub method: String,
}

#[derive(Debug, Clone)]
pub enum Stmt {
    // ... existing variants ...
    Class(ClassStmt),
}

#[derive(Debug, Clone)]
pub struct ClassStmt {
    pub location: Location,
    pub name: String,
    pub superclass: Option<String>,
    pub methods: Vec<FunctionStmt>,
}
}

Runtime Calls as External Functions

Class operations depend on hash map lookups and GC allocation, which are impractical to express directly in MLIR’s built-in dialects. We emit calls to runtime functions instead:

#![allow(unused)]
fn main() {
// src/codegen/classes.rs

use melior::{
    Context, Location,
    dialect::func,
    ir::{
        attribute::{FlatSymbolRefAttribute, StringAttribute, TypeAttribute},
        r#type::FunctionType,
        Block, BlockLike, Module, Region, Type, Value,
    },
};
use crate::codegen::types::lox_value_type;
// FlatSymbolRefAttribute, Value, and Block are used by compile_class, compile_method,
// and compile_get/compile_set below. They're included here because these functions
// live in the same module.

/// Declare runtime functions needed for class operations
pub fn declare_runtime_functions(context: &Context, module: &mut Module) {
    let location = Location::unknown(context);
    let lox_val = lox_value_type(context);

    // lox.create_class(name_ptr: !llvm.ptr, superclass: lox_val) -> lox_val
    let create_class_type = FunctionType::new(
        context,
        &[Type::parse(context, "!llvm.ptr").unwrap(), lox_val],
        &[lox_val],
    );
    declare_external(module, context, "lox_create_class", create_class_type, location);

    // lox.instance_from_class(class: lox_val) -> lox_val
    let instance_type = FunctionType::new(context, &[lox_val], &[lox_val]);
    declare_external(module, context, "lox_instance_from_class", instance_type, location);

    // lox.get_property(instance: lox_val, name_ptr: !llvm.ptr) -> lox_val
    let get_prop_type = FunctionType::new(
        context,
        &[lox_val, Type::parse(context, "!llvm.ptr").unwrap()],
        &[lox_val],
    );
    declare_external(module, context, "lox_get_property", get_prop_type, location);

    // lox.set_property(instance: lox_val, name_ptr: !llvm.ptr, value: lox_val) -> lox_val
    let set_prop_type = FunctionType::new(
        context,
        &[lox_val, Type::parse(context, "!llvm.ptr").unwrap(), lox_val],
        &[lox_val],
    );
    declare_external(module, context, "lox_set_property", set_prop_type, location);

    // lox.bind_method(receiver: lox_val, method: lox_val) -> lox_val
    let bind_method_type = FunctionType::new(context, &[lox_val, lox_val], &[lox_val]);
    declare_external(module, context, "lox_bind_method", bind_method_type, location);

    // lox.set_method(class: lox_val, name_ptr: !llvm.ptr, method: lox_val) -> lox_val
    // Attaches a compiled method to a class object's method table
    let set_method_type = FunctionType::new(
        context,
        &[lox_val, Type::parse(context, "!llvm.ptr").unwrap(), lox_val],
        &[lox_val],
    );
    declare_external(module, context, "lox_set_method", set_method_type, location);

    // lox.super_lookup(superclass: lox_val, name_ptr: !llvm.ptr, this: lox_val) -> lox_val
    // Walks the class hierarchy starting from the superclass, finds the method,
    // and binds `this` as the receiver — used for `super` method calls
    let super_lookup_type = FunctionType::new(
        context,
        &[lox_val, Type::parse(context, "!llvm.ptr").unwrap(), lox_val],
        &[lox_val],
    );
    declare_external(module, context, "lox_super_lookup", super_lookup_type, location);

    // lox.call(callee: lox_val, arg: lox_val) -> lox_val
    // Invokes a Lox closure — loads the function pointer and environment from
    // the closure object, then performs an indirect call through the function table
    let call_type = FunctionType::new(context, &[lox_val, lox_val], &[lox_val]);
    declare_external(module, context, "lox_call", call_type, location);
}

fn declare_external(
    module: &mut Module,
    context: &Context,
    name: &str,
    fn_type: FunctionType,
    location: Location,
) {
    module.body().append_operation(func::func(
        context,
        StringAttribute::new(context, name),
        TypeAttribute::new(fn_type.into()),
        Region::new(),
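        // External declarations (no body) must be private: MLIR rejects a
        // symbol declaration with public visibility.
        &[(
            melior::ir::Identifier::new(context, "sym_visibility"),
            StringAttribute::new(context, "private").into(),
        )],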
        location,
    ));
}
}

Compiling Class Declarations

#![allow(unused)]
fn main() {
// src/codegen/generator.rs (additions)
use std::collections::HashMap;

impl<'c> CodeGenerator<'c> {
    fn compile_class(&self, class: &ClassStmt, block: &Block<'c>, variables: &mut HashMap<String, Value<'c, 'c>>) {
        let location = self.loc(class.location);

        // 1. Resolve superclass (if any)
        let superclass_val = if let Some(super_name) = &class.superclass {
            // Look up the superclass variable — it should be a class object
            self.compile_variable(&VariableExpr {
                location: class.location,
                name: super_name.clone(),
            }, variables)
        } else {
            self.compile_nil(block)
        };

        // 2. Create a global string constant for the class name
        let name_global = self.create_string_constant(&class.name);

        // 3. Call runtime: lox_create_class(name, superclass)
        let lox_val = lox_value_type(self.context);
        let create_class_op = func::call(
            self.context,
            FlatSymbolRefAttribute::new(self.context, "lox_create_class"),
            &[name_global, superclass_val],
            &[lox_val],
            location,
        );

        let class_val = block.append_operation(create_class_op)
            .result(0).unwrap().into();

        // 4. Store each method on the class
        for method in &class.methods {
            self.compile_method(block, &class.name, class_val, method, variables);
        }

        // 5. Store the class object as a variable
        variables.insert(class.name.clone(), class_val);
    }

    fn compile_method(&self, block: &Block<'c>, class_name: &str, class_val: Value<'c, 'c>, method: &FunctionStmt, variables: &mut HashMap<String, Value<'c, 'c>>) {
        let location = self.loc(method.location);

        // Mangle the method name: Doughnut.cook → Doughnut_cook
        // Two classes can both have a `cook` method, so the MLIR
        // function name must be unique within the module.
        let mangled = format!("{}_{}", class_name, method.name);

        // Compile the method body. We compile it as a regular function
        // here — a complete implementation would add two implicit upvalues
        // (this, super) before compiling the body. See "What We're Simplifying"
        // for the full explanation of what's omitted and why.
        //
        // We pass the mangled name so `compile_function` creates `@Doughnut_cook`
        // instead of `@cook`. The variables map stores the value under the
        // mangled key, so we look it up the same way below.
        let mut method_with_name = method.clone();
        method_with_name.name = mangled.clone();
        self.compile_function(&method_with_name, variables);

        // Retrieve the compiled method's value from the variables map.
        // `compile_function` creates the `func.func` operation in the module
        // and stores a tagged-union LoxValue (TAG_CLOSURE with the function
        // pointer as payload) in the variables map under the function's name —
        // which is now the mangled name. This tagged value is what
        // `lox_set_method` receives: a `(i8, i64)` pair where tag=4 means
        // closure and the i64 payload points to the compiled function.
        let method_val = variables.get(&mangled).copied()
            .expect("compile_method: method not found in variables after compile_function");
        
        // The property name on the class uses the *original* method name,
        // not the mangled one. `d.cook("custard")` looks up "cook",
        // not "Doughnut_cook".
        let name_global = self.create_string_constant(&method.name);
        
        let attach_op = func::call(
            self.context,
            FlatSymbolRefAttribute::new(self.context, "lox_set_method"),
            &[class_val, name_global, method_val],
            &[lox_value_type(self.context)],
            location,
        );

        block.append_operation(attach_op);
    }
}
}
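One caveat with the `Class_method` scheme: Lox identifiers may themselves contain underscores, so two different (class, method) pairs can mangle to the same symbol. The sketch below is not part of the tutorial's code generator. `mangle_naive` mirrors the `format!` call above, and `mangle_dotted` is a hypothetical alternative that relies on `.` never appearing in a Lox identifier. It assumes the backend accepts dots in symbol names, which is worth checking against your MLIR version.

```rust
/// Mirrors the tutorial's scheme: `Doughnut` + `cook` -> `Doughnut_cook`.
fn mangle_naive(class_name: &str, method_name: &str) -> String {
    format!("{}_{}", class_name, method_name)
}

/// Hypothetical alternative: join with a character that cannot occur in a
/// Lox identifier, so the (class, method) pair is always recoverable and
/// cannot clash with a user-defined top-level function.
fn mangle_dotted(class_name: &str, method_name: &str) -> String {
    format!("{}.{}", class_name, method_name)
}
```

With the naive scheme, class `A_b` with method `c` and class `A` with method `b_c` both become `A_b_c`; the dotted variant keeps them apart as `A_b.c` and `A.b_c`.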

Compiling Property Access and Assignment

#![allow(unused)]
fn main() {
use std::collections::HashMap;

impl<'c> CodeGenerator<'c> {
    fn compile_get(&self, block: &Block<'c>, get: &GetExpr, variables: &mut HashMap<String, Value<'c, 'c>>) -> Value<'c, 'c> {
        let location = self.loc(get.location);
        let object = self.compile_expression(&get.object, block, variables);

        // Create a global string constant for the property name
        let name_global = self.create_string_constant(&get.name);

        // Call runtime: lox_get_property(instance, "name")
        let op = func::call(
            self.context,
            FlatSymbolRefAttribute::new(self.context, "lox_get_property"),
            &[object, name_global],
            &[lox_value_type(self.context)],
            location,
        );

        block.append_operation(op).result(0).unwrap().into()
    }

    fn compile_set(&self, block: &Block<'c>, set: &SetExpr, variables: &mut HashMap<String, Value<'c, 'c>>) -> Value<'c, 'c> {
        let location = self.loc(set.location);
        let object = self.compile_expression(&set.object, block, variables);
        let value = self.compile_expression(&set.value, block, variables);

        let name_global = self.create_string_constant(&set.name);

        // Call runtime: lox_set_property(instance, "name", value)
        let op = func::call(
            self.context,
            FlatSymbolRefAttribute::new(self.context, "lox_set_property"),
            &[object, name_global, value],
            &[lox_value_type(self.context)],
            location,
        );

        // set expressions return the assigned value (like assignment)
        block.append_operation(op).result(0).unwrap().into()
    }
}
}

Compiling this and super

#![allow(unused)]
fn main() {
use std::collections::HashMap;

impl<'c> CodeGenerator<'c> {
    fn compile_this(&self, this: &ThisExpr, variables: &HashMap<String, Value<'c, 'c>>) -> Value<'c, 'c> {
        // `this` is a variable lookup — it's stored as an upvalue
        // by the method binding mechanism.
        //
        // NOTE: This only works if `compile_method` has added "this"
        // to the variables map as an implicit upvalue (see the
        // "What We're Simplifying" section). The current `compile_method`
        // omits that wiring — calling `compile_this` will panic at
        // compile time with "Undefined variable: this" unless you
        // add the upvalue insertion described there.
        self.compile_variable(&VariableExpr {
            location: this.location,
            name: "this".to_string(),
        }, variables)
    }

    fn compile_super(&self, block: &Block<'c>, super_expr: &SuperExpr, variables: &HashMap<String, Value<'c, 'c>>) -> Value<'c, 'c> {
        let location = self.loc(super_expr.location);

        // `super.method` resolves to:
        // 1. Get the superclass from the implicit upvalue
        // 2. Look up the method on the superclass
        // 3. Bind it to `this`
        //
        // NOTE: Same caveat as `compile_this` — "super" must be in
        // the variables map as an implicit upvalue. Without it,
        // `compile_variable` panics with "Undefined variable: super".
        // The current `compile_method` doesn't add it. See
        // "What We're Simplifying."

        let super_class = self.compile_variable(&VariableExpr {
            location: super_expr.location,
            name: "super".to_string(),
        }, variables);

        let method_name = self.create_string_constant(&super_expr.method);
        let this_val = self.compile_this(&ThisExpr { location: super_expr.location }, variables);

        // Call runtime: lox_super_lookup(superclass, "method", this)
        let op = func::call(
            self.context,
            FlatSymbolRefAttribute::new(self.context, "lox_super_lookup"),
            &[super_class, method_name, this_val],
            &[lox_value_type(self.context)],
            location,
        );

        block.append_operation(op).result(0).unwrap().into()
    }
}
}
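Why does `super` need its own implicit upvalue instead of being derived from `this`? The following is a small model, not the tutorial's runtime: classes live in a `Vec` and refer to each other by index, and all names here are illustrative. It compares a lookup that starts at the superclass of the class that declares the method with one that starts at the superclass of the receiver's class.

```rust
/// A class in a tiny arena-based model; `superclass` is an index into the arena.
struct Class {
    name: &'static str,
    methods: Vec<&'static str>,
    superclass: Option<usize>,
}

/// Walk the superclass chain starting at `start`, in the spirit of
/// `lox_super_lookup`. Returns the name of the class that defines `method`.
fn find_method(classes: &[Class], start: Option<usize>, method: &str) -> Option<&'static str> {
    let mut current = start;
    while let Some(idx) = current {
        if classes[idx].methods.contains(&method) {
            return Some(classes[idx].name);
        }
        current = classes[idx].superclass;
    }
    None
}

/// Build A <- B <- C, where A and B both define `cook` and C defines nothing.
fn build_hierarchy() -> Vec<Class> {
    vec![
        Class { name: "A", methods: vec!["cook"], superclass: None },
        Class { name: "B", methods: vec!["cook"], superclass: Some(0) },
        Class { name: "C", methods: vec![], superclass: Some(1) },
    ]
}

/// Lexical lookup: `super` inside B.cook starts at the superclass of the
/// class that declares the method, whatever the receiver's class is.
fn super_lookup_lexical(classes: &[Class], declaring_class: usize, method: &str) -> Option<&'static str> {
    find_method(classes, classes[declaring_class].superclass, method)
}

/// Receiver-based lookup: starts at the superclass of the receiver's class.
/// For a C instance running B.cook, this finds B.cook again.
fn super_lookup_from_receiver(classes: &[Class], receiver_class: usize, method: &str) -> Option<&'static str> {
    find_method(classes, classes[receiver_class].superclass, method)
}
```

For an instance of `C` executing `B.cook`, the lexical version resolves `super.cook` to `A`, while the receiver-based version resolves it to `B`, which would call the running method again and recurse forever. Capturing the superclass when the method is compiled avoids that.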

What the Generated MLIR Looks Like

A note before we look at the IR: this is the first part where the generated MLIR uses the tagged union representation. The codegen in Parts 1–6 compiled every value as a bare f64 (the “numbers only” model). Classes break that model: you can’t represent a class instance, a bound method, or a string as a floating-point number. The tagged union represents every value as a struct with two fields: a tag byte (i8) that says what kind of value this is, and a payload word (i64) that holds the data. Here’s the mapping:

Tag  Value Type   Payload
───────────────────────────────
 0   Nil          (unused)
 1   Bool         0 or 1
 2   Number       f64 bits (stored as i64)
 3   String       pointer to heap string
 4   Closure      pointer to closure object
 5   Instance     pointer to instance object
 6   Class        pointer to class object
 7   BoundMethod  pointer to bound method

In MLIR, this is !llvm.struct<(i8, i64)>. Every function that produces a Lox value now returns this struct instead of f64. Every operation that was a single arith.addf becomes: check both tags → extract the f64 payloads → add → re-pack as (TAG_NUMBER, result). The core logic doesn’t change, but every operation now carries the bookkeeping of tag checking and repacking. Part 1 introduced this concept (see the “Dynamic Typing with Tagged Unions” subsection) and explained why the numbers-only model defers it; this is where we start using it.
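To make the "check tags, extract, add, re-pack" sequence concrete, here is a minimal Rust model of the `(i8, i64)` pair. It is a sketch rather than the runtime's actual `LoxValue` type: the struct layout and the helper names (`number`, `nil`, `add_numbers`) are assumptions for illustration, and it covers only the number case of `+`, not string concatenation.

```rust
const TAG_NIL: i8 = 0;
const TAG_NUMBER: i8 = 2;

/// Model of the `!llvm.struct<(i8, i64)>` pair: a tag byte plus a payload word.
#[derive(Clone, Copy, Debug, PartialEq)]
struct LoxValue {
    tag: i8,
    payload: i64,
}

/// Pack an f64 by storing its raw bits in the payload.
fn number(n: f64) -> LoxValue {
    LoxValue { tag: TAG_NUMBER, payload: n.to_bits() as i64 }
}

/// Nil carries no meaningful payload; zero is used here for determinism.
fn nil() -> LoxValue {
    LoxValue { tag: TAG_NIL, payload: 0 }
}

/// What a single `arith.addf` turns into under the tagged representation.
fn add_numbers(a: LoxValue, b: LoxValue) -> Result<LoxValue, String> {
    // 1. Check both tags.
    if a.tag != TAG_NUMBER || b.tag != TAG_NUMBER {
        return Err("Operands must be numbers.".to_string());
    }
    // 2. Extract the f64 payloads by reinterpreting the bits.
    let lhs = f64::from_bits(a.payload as u64);
    let rhs = f64::from_bits(b.payload as u64);
    // 3. Add, then 4. re-pack as (TAG_NUMBER, result).
    Ok(number(lhs + rhs))
}
```

The bit casts are the Rust counterpart of a bitcast between `i64` and `f64`: no numeric conversion happens, only reinterpretation.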

Given our doughnut example from the top:

module {
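  // Runtime function declarations. A func.func without a body must be
  // marked private; the verifier rejects public declarations.
  func.func private @lox_create_class(!llvm.ptr, !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>
  func.func private @lox_instance_from_class(!llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>
  func.func private @lox_make_string(!llvm.ptr) -> !llvm.struct<(i8, i64)>
  func.func private @lox_get_property(!llvm.struct<(i8, i64)>, !llvm.ptr) -> !llvm.struct<(i8, i64)>
  func.func private @lox_set_property(!llvm.struct<(i8, i64)>, !llvm.ptr, !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>
  func.func private @lox_bind_method(!llvm.struct<(i8, i64)>, !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>
  func.func private @lox_set_method(!llvm.struct<(i8, i64)>, !llvm.ptr, !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>
  func.func private @lox_super_lookup(!llvm.struct<(i8, i64)>, !llvm.ptr, !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>
  func.func private @lox_call(!llvm.struct<(i8, i64)>, !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>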

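  // Global string constants. MLIR does not append a trailing zero to string
  // globals, and the C runtime reads them with strcmp/strdup, so each one
  // carries an explicit \00 terminator.
  llvm.mlir.global constant @str_0("Doughnut\00")
  llvm.mlir.global constant @str_1("cook\00")
  llvm.mlir.global constant @str_2("flavor\00")
  llvm.mlir.global constant @str_3("FilledDoughnut\00")
  llvm.mlir.global constant @str_4("custard\00")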

  // Doughnut.cook method — note the mangled name `@Doughnut_cook`.
  // MLIR function names must be unique within the module. Two classes
  // can both have a `cook` method, so we prefix the class name.
  // The `compile_method` code mangles the name before calling
  // `compile_function`, and looks it up in the variables map the same way.
  func.func @Doughnut_cook(%arg0: !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)> {
    // print "Frying " + flavor + " doughnut"
    // ... (string concatenation via runtime calls)
    // %nil_tagged constructed as in @main: tag=0, payload left undef
    func.return %nil_tagged : !llvm.struct<(i8, i64)>
  }

  // FilledDoughnut.cook method — same mangling pattern
  func.func @FilledDoughnut_cook(%arg0: !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)> {
    // super.cook(flavor)
    // print "Injecting " + flavor + " filling"
    // ... (runtime calls)
    // %nil_tagged constructed as in @main: tag=0, payload left undef
    func.return %nil_tagged : !llvm.struct<(i8, i64)>
  }

  // Top-level code
  func.func @main() -> !llvm.struct<(i8, i64)> {
    %nil_tag = arith.constant 0 : i8  // nil's tag is 0 in the tagged union
    %nil_val = llvm.undef : !llvm.struct<(i8, i64)>
    %nil_tagged = llvm.insertvalue %nil_tag, %nil_val[0] : !llvm.struct<(i8, i64)>  // payload left undef — nil has no meaningful payload

    // Materialize pointers to the string constants. A symbol such as @str_0
    // can't be passed as an operand directly; llvm.mlir.addressof turns it
    // into an SSA value of type !llvm.ptr.
    %doughnut_name = llvm.mlir.addressof @str_0 : !llvm.ptr
    %cook_name = llvm.mlir.addressof @str_1 : !llvm.ptr
    %filled_name = llvm.mlir.addressof @str_3 : !llvm.ptr

    // Create Doughnut class
    %doughnut = func.call @lox_create_class(%doughnut_name, %nil_tagged) : (!llvm.ptr, !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>

    // Attach Doughnut.cook method to the class
    // compile_method compiles the method as @Doughnut_cook, then calls
    // lox_set_method to store the closure in the class's method table.
    // Without this, lox_get_property would find an empty method table.
    %doughnut_cook_val = ...  // tagged union: tag=4 (TAG_CLOSURE), payload = pointer to @Doughnut_cook
    func.call @lox_set_method(%doughnut, %cook_name, %doughnut_cook_val) : (!llvm.struct<(i8, i64)>, !llvm.ptr, !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>

    // Create FilledDoughnut class (inherits from Doughnut)
    %filled = func.call @lox_create_class(%filled_name, %doughnut) : (!llvm.ptr, !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>

    // Attach FilledDoughnut.cook method to the subclass
    %filled_cook_val = ...  // tagged union: tag=4 (TAG_CLOSURE), payload = pointer to @FilledDoughnut_cook
    func.call @lox_set_method(%filled, %cook_name, %filled_cook_val) : (!llvm.struct<(i8, i64)>, !llvm.ptr, !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>

    // var d = FilledDoughnut()
    %d = func.call @lox_instance_from_class(%filled) : (!llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>

    // d.cook("custard")
    %method = func.call @lox_get_property(%d, %cook_name) : (!llvm.struct<(i8, i64)>, !llvm.ptr) -> !llvm.struct<(i8, i64)>
    %custard_ptr = llvm.mlir.addressof @str_4 : !llvm.ptr
    // Wrap the string in a LoxValue: lox_make_string creates a tagged union
    // with tag=3 (TAG_STRING) and the string pointer as the payload.
    %custard = func.call @lox_make_string(%custard_ptr) : (!llvm.ptr) -> !llvm.struct<(i8, i64)>
    func.call @lox_call(%method, %custard) : (!llvm.struct<(i8, i64)>, !llvm.struct<(i8, i64)>) -> !llvm.struct<(i8, i64)>

    func.return %nil_tagged : !llvm.struct<(i8, i64)>
  }
}

The IR is verbose, but that’s the point: it’s an intermediate representation, not hand-written code. Each operation has clear semantics, and optimization passes can work on everything around the runtime calls, whose bodies stay opaque to MLIR.


The C Runtime

The runtime functions are simple C — they operate on the same tagged union and heap we built in Parts 2–5. The C runtime uses arrays (method_count + methods[]) instead of Rust’s HashMap — C doesn’t have a hash map in the standard library, and the linear scan is fast enough for the small method tables you’d find in a Lox program.

Key types used below: FieldEntry is a simple key-value pair, typedef struct { const char* key; LoxValue value; } FieldEntry;. The gc_reallocate(ptr, old_size, new_size) function resizes a heap allocation and updates the GC’s internal bookkeeping (allocated byte count, trigger threshold for the next collection). Note that the callers below grow the array by exactly one entry per insert, so every insert reallocates; unlike Rust’s Vec::push, they don’t over-allocate to amortize growth. That avoids a capacity field in each object at the cost of more realloc calls.

The implementation lives in the companion repo’s runtime/gc.c. We don’t show it inline here because it’s a straightforward wrapper around realloc plus bookkeeping, and the interesting part is how callers use it (like the GC-safety pattern in lox_set_property below).
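The `gc_reallocate` calls in `lox_set_property` and `lox_set_method` request room for exactly one more entry each time. To get a feel for what that costs compared with a doubling strategy, here is a small simulation that only counts reallocations. It is a model, not runtime code; the starting capacity of 8 and the function names are assumptions, and a real doubling version would need a hypothetical capacity field next to `field_count`.

```rust
/// Count how many reallocations `inserts` appends trigger under a given
/// growth policy. `next_capacity` maps the current capacity to the new one.
fn count_reallocs(inserts: usize, next_capacity: fn(usize) -> usize) -> usize {
    let mut capacity = 0usize;
    let mut len = 0usize;
    let mut reallocs = 0usize;
    for _ in 0..inserts {
        if len == capacity {
            // The array is full: grow it before writing the new entry.
            capacity = next_capacity(capacity);
            reallocs += 1;
        }
        len += 1;
    }
    reallocs
}

/// The policy used by the calls in this section: one extra slot per insert.
fn grow_by_one(capacity: usize) -> usize {
    capacity + 1
}

/// A doubling policy in the style of `Vec`: start at 8 slots, then double.
fn grow_by_doubling(capacity: usize) -> usize {
    if capacity == 0 { 8 } else { capacity * 2 }
}
```

For 1000 inserts, growing by one reallocates 1000 times, while doubling reallocates 8 times (capacities 8, 16, and so on up to 1024). For the handful of fields in a typical Lox instance the difference is small, which is why the simpler policy is a reasonable choice here.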

API note: The MLIR-declared lox_bind_method takes two LoxValue arguments (matching the tagged union type). The Rust implementation extracts the raw *mut ObjInstance from the receiver LoxValue before calling Heap::bind_method. The C runtime works directly with LoxValue arguments and uses AS_INSTANCE() to unwrap them.

// src/runtime/class_runtime.c

#include "runtime.h"
#include "gc.h"
#include <stdio.h>   // fprintf
#include <stdlib.h>  // exit
#include <string.h>  // strcmp, strdup

// Create a new class object
LoxValue lox_create_class(const char* name, LoxValue superclass) {
    ObjClass* klass = gc_allocate(sizeof(ObjClass));
    klass->header.type = OBJ_CLASS;
    klass->header.is_marked = false;
    klass->name = strdup(name);
    klass->methods = NULL;       // empty method table (linear array, not a hash map)
    klass->method_count = 0;
    klass->superclass = IS_NIL(superclass) ? NULL : AS_CLASS(superclass);
    return MAKE_OBJ(klass);
}

// Create an instance of a class
LoxValue lox_instance_from_class(LoxValue class_val) {
    ObjClass* klass = AS_CLASS(class_val);
    ObjInstance* instance = gc_allocate(sizeof(ObjInstance));
    instance->header.type = OBJ_INSTANCE;
    instance->header.is_marked = false;
    instance->klass = klass;
    instance->fields = NULL;     // empty field array
    instance->field_count = 0;
    return MAKE_OBJ(instance);
}

// Get a property on an instance
LoxValue lox_get_property(LoxValue instance_val, const char* name) {
    ObjInstance* instance = AS_INSTANCE(instance_val);
    
    // Check fields first (fields shadow methods)
    for (int i = 0; i < instance->field_count; i++) {
        if (strcmp(instance->fields[i].key, name) == 0) {
            return instance->fields[i].value;
        }
    }
    
    // Look up method on the class
    ObjClass* klass = instance->klass;
    while (klass != NULL) {
        for (int i = 0; i < klass->method_count; i++) {
            if (strcmp(klass->methods[i].key, name) == 0) {
                // Bind the method to this instance
                return lox_bind_method(instance_val, klass->methods[i].value);
            }
        }
        klass = klass->superclass;
    }
    
    // Runtime error: undefined property
    fprintf(stderr, "Undefined property '%s'.\n", name);
    exit(1);
}

// Set a property on an instance
LoxValue lox_set_property(LoxValue instance_val, const char* name, LoxValue value) {
    ObjInstance* instance = AS_INSTANCE(instance_val);
    
    // Check if field already exists
    for (int i = 0; i < instance->field_count; i++) {
        if (strcmp(instance->fields[i].key, name) == 0) {
            instance->fields[i].value = value;
            return value;
        }
    }
    
    // Add new field
    // IMPORTANT: increment field_count AFTER writing the field.
    // If gc_reallocate triggers a collection, the GC traces
    // fields[0..field_count]. Incrementing first would expose an
    // uninitialized FieldEntry (garbage key pointer and LoxValue) to the GC.
    // NOTE: This relies on our GC being mark-sweep (non-moving). A moving
    // collector could relocate objects during that collection and
    // invalidate raw pointers such as `instance`, so this pattern would
    // need pinning or a different allocation strategy.
    instance->fields = gc_reallocate(
        instance->fields,
        instance->field_count * sizeof(FieldEntry),
        (instance->field_count + 1) * sizeof(FieldEntry)
    );
    instance->fields[instance->field_count].key = strdup(name);
    instance->fields[instance->field_count].value = value;
    instance->field_count++;
    return value;
}

// Bind a method to a receiver
LoxValue lox_bind_method(LoxValue receiver, LoxValue method) {
    ObjBoundMethod* bound = gc_allocate(sizeof(ObjBoundMethod));
    bound->header.type = OBJ_BOUND_METHOD;
    bound->header.is_marked = false;
    bound->receiver = AS_INSTANCE(receiver);
    bound->method = method;
    return MAKE_OBJ(bound);
}

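// Set a method on a class's method table
LoxValue lox_set_method(LoxValue class_val, const char* name, LoxValue method) {
    ObjClass* klass = AS_CLASS(class_val);

    // If a class body declares the same method name twice, the later
    // definition replaces the earlier one. Lookup returns the first match,
    // so appending a duplicate would leave the old definition in effect.
    for (int i = 0; i < klass->method_count; i++) {
        if (strcmp(klass->methods[i].key, name) == 0) {
            klass->methods[i].value = method;
            return method;
        }
    }

    // Same GC-safety pattern as lox_set_property: increment method_count
    // AFTER writing the new entry, so the GC only traces initialized slots.
    klass->methods = gc_reallocate(
        klass->methods,
        klass->method_count * sizeof(FieldEntry),
        (klass->method_count + 1) * sizeof(FieldEntry)
    );
    klass->methods[klass->method_count].key = strdup(name);
    klass->methods[klass->method_count].value = method;
    klass->method_count++;
    return method;
}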

// Look up a method starting from the superclass, then bind it to `this`
LoxValue lox_super_lookup(LoxValue superclass_val, const char* name, LoxValue this_val) {
    ObjClass* klass = AS_CLASS(superclass_val);
    // Walk the class hierarchy starting from the superclass
    while (klass != NULL) {
        for (int i = 0; i < klass->method_count; i++) {
            if (strcmp(klass->methods[i].key, name) == 0) {
                // Bind the found method to `this` as the receiver
                return lox_bind_method(this_val, klass->methods[i].value);
            }
        }
        klass = klass->superclass;
    }
    fprintf(stderr, "Undefined property '%s' on superclass.\n", name);
    exit(1);
}

// Call a Lox closure — loads the function pointer and environment from the
// closure object, then performs an indirect call.
//
// The actual implementation depends on your closure calling convention.
// At minimum, it:
//   1. Extracts the function pointer from the closure object
//   2. Passes the environment pointer as an implicit first argument
//   3. Calls through the function pointer with the remaining arguments
//
// A complete implementation is in the companion repo's runtime/closure.c.
LoxValue lox_call(LoxValue callee, LoxValue arg) {
    // See the companion repository for the full implementation.
    // The closure calling convention is covered in Part 5 (Closures).
    fprintf(stderr, "lox_call: not implemented in this excerpt\n");
    exit(1);
}

Why show lox_call as a stub? The closure calling convention (how we pass the environment, how the function pointer is stored in the closure object) is already covered in Part 5. Duplicating it here would add 30+ lines of pointer arithmetic without teaching anything new — the class-specific parts (lookup, binding) are in lox_super_lookup and lox_bind_method. If you’re building along, the companion repo has the full lox_call implementation.
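One class-specific detail of calling is worth a sketch even so: `lox_get_property` hands back a bound method (tag 7), not a bare closure (tag 4), so the call path has to unwrap it and forward the receiver. The Rust model below is not the companion repo's `lox_call`. The `Callee` enum, the `MethodFn` signature, and the use of strings for values are all assumptions made to keep the example short.

```rust
/// Hypothetical compiled-method signature: receiver first, then one argument.
type MethodFn = fn(&str, &str) -> String;

/// The two callable shapes relevant here.
enum Callee {
    /// A plain closure: no receiver is attached.
    Closure(MethodFn),
    /// What binding produces: a receiver paired with the underlying closure.
    BoundMethod { receiver: String, method: MethodFn },
}

/// Dispatch on the callee kind, in the spirit of a tag check in `lox_call`.
fn call(callee: &Callee, arg: &str) -> String {
    match callee {
        // No receiver: pass a placeholder for `this`.
        Callee::Closure(function) => function("nil", arg),
        // Unwrap the bound method and pass its receiver as `this`.
        Callee::BoundMethod { receiver, method } => method(receiver, arg),
    }
}

/// A stand-in method body that reports which receiver it ran with.
fn cook(receiver: &str, arg: &str) -> String {
    format!("{} cooks {}", receiver, arg)
}
```

The point is only the second match arm: the receiver stored at bind time is what the method later sees as `this`.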


What We’re Simplifying

This part makes several simplifications that a production compiler would handle differently:

Method compilation doesn’t wire this and super upvalues. The compile_method code compiles each method as a regular function. A complete implementation would add two implicit upvalue slots — upvalue[0] = this and upvalue[1] = super — before compiling the method body, and insert them into the compiler’s variable map so that compile_this and compile_super can find them by name. The implementation note in the this Binding section above describes this wiring; adding it to the codegen example would double the code without teaching a new concept — it’s the same closure capture logic from Part 5 applied to two more variables.
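As a rough picture of that wiring, the sketch below builds the name-to-slot map a method body would be compiled against. It is a simplification under stated assumptions: `Slot` stands in for the code generator's `Value<'c, 'c>`, `method_scope` is a hypothetical helper, and the slot numbers follow the upvalue[0] = this, upvalue[1] = super convention described above.

```rust
use std::collections::HashMap;

/// Stand-in for the code generator's value type: here just an upvalue index.
type Slot = usize;

/// Build the variable map for a method body: everything visible in the
/// enclosing scope, plus the implicit `this` and (if present) `super`.
fn method_scope(outer: &HashMap<String, Slot>, has_superclass: bool) -> HashMap<String, Slot> {
    let mut scope = outer.clone();
    // `this` and `super` are keywords in Lox, so these keys cannot collide
    // with user-declared variables.
    scope.insert("this".to_string(), 0);
    if has_superclass {
        scope.insert("super".to_string(), 1);
    }
    scope
}
```

With a map like this in place, the lookups in `compile_this` and `compile_super` would find their names instead of reporting an undefined variable; a class without a superclass gets no `super` entry, so using `super` there still fails at compile time.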

No vtable or inline caches. Every method dispatch does a linear scan through the class hierarchy. For small programs this is fine. For programs with deep inheritance chains or hot call sites, you’d want a vtable (to turn the scan into an index lookup) or inline caches (to remember which method a call site resolved to last time). The “Why No VTable?” section below explains the tradeoff.

Two runtime calls per method invocation. lox_get_property returns a bound method, then lox_call invokes it. Each function does one job — find the method, call the closure — which makes them easier to understand and debug. The tradeoff is a temporary ObjBoundMethod allocation per method call. A production compiler would combine these into a single lox_invoke call to avoid the allocation. See the Design Decisions section for the full tradeoff analysis.

C runtime uses linear arrays instead of hash maps. The Rust ObjClass stores methods in a HashMap<String, LoxValue>, but the C runtime uses a methods[] array with linear lookup. C doesn’t have a hash map in the standard library, and the linear scan is fast enough for the small method tables in a Lox program. A real implementation would use a hash map (or a sorted array with binary search) once method count exceeds ~10–20 entries.


Design Decisions and Trade-offs

Why Runtime Calls Instead of Pure MLIR?

You could emit pure MLIR for class operations — llvm.alloca for field storage, llvm.insertvalue/llvm.extractvalue for field access, etc. But:

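Approach       Pros                          Cons
───────────────────────────────────────────────────────────────────────────────────────────────────
Runtime calls  Simple, correct, GC-aware     Can’t optimize across boundary
Pure MLIR      Optimizable, no FFI overhead  Must teach MLIR about GC roots, field layout, dispatch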

For a tutorial, runtime calls are the right call. A production compiler would progressively move more into MLIR as it proves correctness. Start simple, optimize later.

Why No VTable?

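VTables work when every method has a slot index known at compile time, which requires knowing the receiver’s class layout statically. Lox gives the compiler no such information: the receiver’s class is only known at runtime, classes are first-class values, and an instance field can shadow a method of the same name. A by-name lookup per dispatch (a hash map in the Rust model, a linear scan in the C runtime) is the honest representation. If profiling shows it’s a bottleneck, you add inline caches later.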
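For a sense of what an inline cache would look like, here is a minimal monomorphic cache in Rust. It is a sketch, not code from this series: classes are indices into a slice, methods are opaque ids, and `InlineCache` and `doughnut_classes` are illustrative names. It also assumes method tables never change after class creation and ignores field shadowing, which a real cache would have to check.

```rust
use std::collections::HashMap;

struct Class {
    methods: HashMap<String, usize>, // method name -> opaque method id
    superclass: Option<usize>,       // index of the superclass, if any
}

/// Slow path: walk the hierarchy by name.
fn find_method(classes: &[Class], mut class_id: usize, name: &str) -> Option<usize> {
    loop {
        if let Some(&method) = classes[class_id].methods.get(name) {
            return Some(method);
        }
        class_id = classes[class_id].superclass?;
    }
}

/// One cache per call site: remembers the last receiver class seen there
/// and the method it resolved to.
#[derive(Default)]
struct InlineCache {
    entry: Option<(usize, usize)>, // (class id, method id)
    hits: usize,
    misses: usize,
}

impl InlineCache {
    fn lookup(&mut self, classes: &[Class], class_id: usize, name: &str) -> Option<usize> {
        if let Some((cached_class, method)) = self.entry {
            if cached_class == class_id {
                self.hits += 1;
                return Some(method); // fast path: one comparison, no scan
            }
        }
        self.misses += 1;
        let method = find_method(classes, class_id, name)?;
        self.entry = Some((class_id, method));
        Some(method)
    }
}

/// Doughnut (id 0) defines cook and serve; FilledDoughnut (id 1) overrides cook.
fn doughnut_classes() -> Vec<Class> {
    let mut base = HashMap::new();
    base.insert("cook".to_string(), 10);
    base.insert("serve".to_string(), 11);
    let mut derived = HashMap::new();
    derived.insert("cook".to_string(), 20);
    vec![
        Class { methods: base, superclass: None },
        Class { methods: derived, superclass: Some(0) },
    ]
}
```

A call site that always sees `FilledDoughnut` receivers pays for the hierarchy walk once and then hits the cache; a receiver of a different class simply falls back to the slow path and replaces the entry.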

Why Not Combine Lookup and Call?

Two separate calls — lox_get_property then lox_call — mean two separate jobs: one function finds the method, another calls it. That’s easier to understand, easier to debug, and means lox_get_property works the same whether you’re calling the result or storing it. The cost is a temporary ObjBoundMethod allocation per method call. A production compiler would merge these into a single lox_invoke(instance, method_name, args) that skips the allocation, but the savings only matter in hot loops — for a tutorial, two functions that each do one thing is the right call.
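The difference between the two paths can be modeled in a few lines of Rust. This is an illustration only: `Runtime`, `Instance`, and `invoke` are hypothetical names, the class's method table is flattened into the instance for brevity, and a counter stands in for the `ObjBoundMethod` allocation.

```rust
use std::collections::HashMap;

/// Hypothetical callable: takes one argument, returns what it would print.
type MethodFn = fn(&str) -> String;

struct Instance {
    fields: HashMap<String, MethodFn>,  // simplified: fields hold only callables
    methods: HashMap<String, MethodFn>, // stands in for the class's method table
}

struct Runtime {
    bound_allocations: usize, // how many bound-method objects were created
}

impl Runtime {
    /// Models lox_get_property: fields first, then methods. A method hit
    /// allocates a bound method before the caller can invoke it.
    fn get_property(&mut self, instance: &Instance, name: &str) -> Option<MethodFn> {
        if let Some(&field) = instance.fields.get(name) {
            return Some(field);
        }
        let method = *instance.methods.get(name)?;
        self.bound_allocations += 1;
        Some(method)
    }

    /// Fused lookup-and-call: same lookup order, no bound-method allocation.
    fn invoke(&self, instance: &Instance, name: &str, arg: &str) -> Option<String> {
        let callee = instance.fields.get(name).or_else(|| instance.methods.get(name))?;
        Some((*callee)(arg))
    }
}

fn fry(flavor: &str) -> String {
    format!("Frying {} doughnut", flavor)
}

fn doughnut() -> Instance {
    let mut methods: HashMap<String, MethodFn> = HashMap::new();
    methods.insert("cook".to_string(), fry);
    Instance { fields: HashMap::new(), methods }
}
```

Note that the fused path still has to check fields first: a field holding a callable shadows a method of the same name, so `invoke` cannot skip straight to the method table.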


Full Update to the Expression Compiler

Adding the new expression types to the main compile_expression dispatch:

#![allow(unused)]
fn main() {
// src/codegen/generator.rs (updated match arms)
use std::collections::HashMap;

impl<'c> CodeGenerator<'c> {
    fn compile_expression(&self, expr: &Expr, block: &Block<'c>, variables: &mut HashMap<String, Value<'c, 'c>>) -> Value<'c, 'c> {
        match expr {
            Expr::Binary(b) => self.compile_binary(b, block, variables),
            Expr::Unary(u) => self.compile_unary(u, block, variables),
            Expr::Literal(l) => self.compile_literal(l, block, variables),
            Expr::Grouping(g) => self.compile_expression(&g.expr, block, variables),
            Expr::Variable(v) => self.compile_variable(v, variables),
            Expr::Assign(a) => self.compile_assign(a, block, variables),
            Expr::Call(c) => self.compile_call(c, block, variables),
            Expr::Logical(l) => self.compile_logical(l, block, variables),
            // New in this part: property access, assignment, this, and super
            Expr::Get(g) => self.compile_get(g, block, variables),
            Expr::Set(s) => self.compile_set(s, block, variables),
            Expr::This(t) => self.compile_this(t, variables),
            Expr::Super(s) => self.compile_super(s, block, variables),
        }
    }
}
}

Testing

Unit Tests for Method Lookup

These tests create objects on the stack, not the GC heap. When the test function returns, the stack pointers dangle. This is fine for unit tests — all access happens within the function scope — but don’t copy this pattern into production code. A complete implementation would use Heap::allocate to get GcClass and GcInstance values back.

#![allow(unused)]
fn main() {
use std::collections::HashMap;

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn method_lookup_walks_inheritance() {
        let mut base_class = ObjClass {
            header: ObjHeader::new(ObjType::Class),
            name: "Base".to_string(),
            methods: HashMap::new(),
            superclass: std::ptr::null_mut(),
        };
        base_class.methods.insert("greet".to_string(), LoxValue::Nil);

        let derived_class = ObjClass {
            header: ObjHeader::new(ObjType::Class),
            name: "Derived".to_string(),
            methods: HashMap::new(),
            superclass: &base_class as *const ObjClass as *mut ObjClass,
        };

        // Derived doesn't have "greet", but Base does
        assert!(derived_class.find_method("greet").is_some());
        // Derived doesn't have "missing" and neither does Base
        assert!(derived_class.find_method("missing").is_none());
    }

    #[test]
    fn fields_shadow_methods() {
        let mut class = ObjClass {
            header: ObjHeader::new(ObjType::Class),
            name: "Test".to_string(),
            methods: HashMap::new(),
            superclass: std::ptr::null_mut(),
        };
        class.methods.insert("x".to_string(), LoxValue::Number(42.0));

        let mut instance = ObjInstance::new(&class as *const ObjClass as *mut ObjClass);
        // Set field "x" to a different value
        instance.set_property("x".to_string(), LoxValue::Number(99.0));

        // Field should shadow the method
        let result = instance.get_property("x").unwrap();
        assert_eq!(result, LoxValue::Number(99.0));
    }
}
}

How Each Concept Maps to Code

Concept              How We Implemented It
──────────────────────────────────────────────────────────────────────────────────────
Class declaration    Runtime call lox_create_class
Instance creation    Runtime call lox_instance_from_class
Property access      Runtime call lox_get_property (fields before methods)
Property assignment  Runtime call lox_set_property
Method binding       ObjBoundMethod wraps receiver + closure
this                 Implicit upvalue, filled when method is bound
super                Implicit upvalue holding the superclass; the method itself is looked up at runtime by lox_super_lookup
Inheritance          Linked list of superclass pointers, walked during method lookup
GC tracing           Walk methods, superclass, fields, receiver, and bound method
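The GC tracing row can be illustrated with a small marking pass over an index-based heap. This is a model rather than the collector from Parts 2–3: `Obj`, `Heap`, and `example_heap` are illustrative names, object references are indices instead of pointers, and field values are assumed to be heap objects.

```rust
/// Arena-based stand-in for the GC heap; references are indices into `objects`.
enum Obj {
    Class { methods: Vec<usize>, superclass: Option<usize> },
    Instance { class: usize, fields: Vec<usize> },
    BoundMethod { receiver: usize, method: usize },
    Closure,
}

struct Heap {
    objects: Vec<Obj>,
    marked: Vec<bool>,
}

impl Heap {
    fn new(objects: Vec<Obj>) -> Self {
        let marked = vec![false; objects.len()];
        Heap { objects, marked }
    }

    /// Mark everything reachable from `root` using an explicit worklist.
    fn mark_from(&mut self, root: usize) {
        let mut worklist = vec![root];
        while let Some(id) = worklist.pop() {
            if self.marked[id] {
                continue; // already visited; also guards against cycles
            }
            self.marked[id] = true;
            match &self.objects[id] {
                Obj::Class { methods, superclass } => {
                    worklist.extend(methods.iter().copied());
                    worklist.extend(superclass.iter().copied());
                }
                Obj::Instance { class, fields } => {
                    worklist.push(*class);
                    worklist.extend(fields.iter().copied());
                }
                Obj::BoundMethod { receiver, method } => {
                    worklist.push(*receiver);
                    worklist.push(*method);
                }
                Obj::Closure => {}
            }
        }
    }
}

/// The doughnut program's objects, plus one closure nothing refers to.
fn example_heap() -> Heap {
    Heap::new(vec![
        Obj::Closure,                                         // 0: Doughnut.cook
        Obj::Class { methods: vec![0], superclass: None },    // 1: Doughnut
        Obj::Closure,                                         // 2: FilledDoughnut.cook
        Obj::Class { methods: vec![2], superclass: Some(1) }, // 3: FilledDoughnut
        Obj::Instance { class: 3, fields: vec![] },           // 4: d
        Obj::BoundMethod { receiver: 4, method: 2 },          // 5: d.cook
        Obj::Closure,                                         // 6: unreachable
    ])
}
```

Marking from the bound method alone keeps the instance, both classes, and both `cook` closures alive, because each object kind forwards to the references listed in the table; only the unreferenced closure stays unmarked and would be swept.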

Classes tie together every system we’ve built: the GC heap, closures, upvalues, and MLIR code generation. There’s no new fundamental mechanism — only new combinations of what already exists. That’s how you know the architecture is right.


Next: Part 8 — Why We Did It This Way — Why numbers-only first? Why parameter-passing for blocks instead of a struct field? Why scf.if for logical operators? This chapter answers the questions that came up during review — and the answers tell you as much about MLIR as the code does.