MLIR for Lox: Part 6 - Complete Reference

This final part provides a complete reference and review of the entire GC implementation.


Chapter 24: Complete File Structure

lox-mlir/
├── Cargo.toml
├── src/
│   ├── lib.rs                    # Library entry point
│   ├── main.rs                   # CLI
│   │
│   ├── ast/
│   │   ├── mod.rs
│   │   └── types.rs              # AST type definitions
│   │
│   ├── lexer/
│   │   ├── mod.rs
│   │   └── tokenizer.rs          # Tokenization
│   │
│   ├── parser/
│   │   ├── mod.rs
│   │   └── parser.rs             # Parsing
│   │
│   ├── analysis/
│   │   ├── mod.rs
│   │   └── captures.rs           # Closure capture analysis
│   │
│   ├── codegen/
│   │   ├── mod.rs
│   │   ├── generator.rs          # MLIR code generation
│   │   ├── lox_dialect.rs        # Lox MLIR operations
│   │   ├── lowering.rs           # Lox → LLVM lowering
│   │   └── types.rs              # Type utilities
│   │
│   └── runtime/
│       ├── mod.rs
│       ├── object.rs             # Object header and types
│       ├── gc.rs                 # Garbage collector
│       └── shadow_stack.rs       # Shadow stack implementation
│
├── examples/
│   ├── simple.lox                # Basic example
│   ├── closures.lox              # Closure example
│   └── gc_stress.lox             # GC stress test
│
└── tests/
    └── gc_test.rs                # GC integration tests

Chapter 25: Complete Object Layout

Every object in Lox has this layout:

┌─────────────────────────────────────────────────────┐
│ ObjHeader (8 bytes)                                 │
├─────────────────────────────────────────────────────┤
│ Field         │ Offset │ Size │ Description        │
├───────────────┼────────┼──────┼────────────────────┤
│ marked        │ 0      │ 1    │ GC mark flag       │
│ obj_type      │ 1      │ 1    │ ObjType enum       │
│ padding       │ 2      │ 2    │ Alignment padding  │
│ size          │ 4      │ 4    │ Data size in bytes │
└─────────────────────────────────────────────────────┘

Total header size: 8 bytes
Data follows immediately after header.

Type-Specific Layouts

Number (ObjType::Number)

┌──────────────────────────────┐
│ Header (8 bytes)             │
├──────────────────────────────┤
│ value: f64 (8 bytes)         │
└──────────────────────────────┘
Total: 16 bytes

String (ObjType::String)

┌──────────────────────────────┐
│ Header (8 bytes)             │
├──────────────────────────────┤
│ length: usize (8 bytes)      │
│ hash: u64 (8 bytes)          │
│ chars: [u8; length]          │
└──────────────────────────────┘
Total: 16 + length bytes

Environment (ObjType::Environment)

┌──────────────────────────────┐
│ Header (8 bytes)             │
├──────────────────────────────┤
│ enclosing: *mut Env (8)      │
│ slot_count: usize (8)        │
│ slots: [*mut u8; slot_count] │
└──────────────────────────────┘
Total: 24 + (8 × slot_count) bytes

Closure (ObjType::Closure)

┌──────────────────────────────┐
│ Header (8 bytes)             │
├──────────────────────────────┤
│ function_index: usize (8)    │
│ environment: *mut Env (8)    │
└──────────────────────────────┘
Total: 24 bytes

Instance (ObjType::Instance)

┌──────────────────────────────┐
│ Header (8 bytes)             │
├──────────────────────────────┤
│ class: *mut Class (8)        │
│ field_count: usize (8)       │
│ fields: [(*str, *val); n]    │
└──────────────────────────────┘
Total: 24 + (16 × field_count) bytes

Chapter 26: Complete GC API Reference

Runtime Functions (C ABI)

// Allocation
void* lox_runtime_alloc(size_t size, uint8_t obj_type);
void* lox_runtime_alloc_environment(size_t slot_count, void* enclosing);
void* lox_runtime_alloc_closure(size_t function_index, void* environment);

// Environment access
void lox_runtime_env_set(void* env, size_t index, void* value);
void* lox_runtime_env_get(void* env, size_t index);

// Shadow stack
void** gc_push_frame(size_t root_count);
void gc_pop_frame(void);
void gc_set_root(size_t index, void* value);

// Collection
void gc_collect(void);
size_t gc_object_count(void);

MLIR Operations

lox.alloc :: (obj_type: i8, size: i64) -> !llvm.ptr
  Allocates a heap object with the given type and size.

lox.push_frame :: (root_count: i64) -> !llvm.ptr
  Pushes a shadow stack frame with the given number of root slots.
  Returns pointer to roots array.

lox.pop_frame :: () -> ()
  Pops the current shadow stack frame.

lox.set_root :: (index: i64, value: !llvm.ptr) -> ()
  Sets a root in the current frame.

lox.env_set :: (env: !llvm.ptr, index: i64, value: !llvm.ptr) -> ()
  Sets a slot in an environment.

lox.env_get :: (env: !llvm.ptr, index: i64) -> !llvm.ptr
  Gets a slot from an environment.

Chapter 27: The Compilation Pipeline

┌─────────────────────────────────────────────────────────────┐
│ 1. LEXING                                                   │
│    "var x = 1 + 2;" → [VAR, IDENT, EQUAL, NUMBER, ...]      │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│ 2. PARSING                                                  │
│    Tokens → AST                                             │
│    VarStmt { name: "x", init: BinaryExpr { ... } }          │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│ 3. ANALYSIS                                                 │
│    Find captured variables, compute root counts             │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│ 4. CODE GENERATION                                          │
│    AST → MLIR (Lox dialect)                                 │
│    func.func @main() {                                      │
│      %frame = lox.push_frame root_count = 1                 │
│      %x = lox.alloc obj_type = 0, size = 8                  │
│      lox.set_root index = 0, %x                             │
│      ...                                                    │
│    }                                                        │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│ 5. LOWERING                                                 │
│    MLIR (Lox dialect) → MLIR (LLVM dialect)                 │
│    lox.alloc → func.call @lox_runtime_alloc                 │
│    lox.push_frame → func.call @gc_push_frame                │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│ 6. LLVM IR GENERATION                                       │
│    MLIR → LLVM IR                                           │
│    define i8* @lox_runtime_alloc(...) { ... }               │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│ 7. NATIVE CODE GENERATION                                   │
│    LLVM IR → Machine code                                   │
│    Link with liblox_runtime.a                               │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│ 8. EXECUTION                                                │
│    ./program                                                │
│    Runtime GC manages memory                                │
└─────────────────────────────────────────────────────────────┘

Chapter 28: Debugging Your GC

Common Bugs

1. Use After Free

Symptom: Random crashes, corrupted data
Cause: Object freed while still referenced
Fix: Check that all references are registered as roots

2. Memory Leaks

Symptom: Memory grows unbounded
Cause: Objects not being freed
Fix: Check mark phase visits all reachable objects

3. Premature Collection

Symptom: Object disappears mid-function
Cause: Root not registered in shadow stack
Fix: Ensure gc_set_root is called for all local references

Debugging Tools

#![allow(unused)]
fn main() {
// Add to gc.rs

/// Print all objects in the heap (for debugging)
pub fn gc_debug_print_heap() {
    ALL_OBJECTS.with(|objects| {
        let mut current = *objects.borrow();
        println!("=== HEAP CONTENTS ===");
        
        while let Some(header_ptr) = current {
            unsafe {
                let header = &*header_ptr;
                println!("  {:p}: type={:?}, marked={}, size={}",
                    header_ptr, header.obj_type, header.marked, header.size);
                current = header.next;
            }
        }
    });
}

/// Print the shadow stack (for debugging)
pub fn gc_debug_print_stack() {
    SHADOW_STACK_HEAD.with(|head| {
        let mut current = *head.borrow();
        println!("=== SHADOW STACK ===");
        let mut frame_num = 0;
        
        while let Some(frame_ptr) = current {
            unsafe {
                let frame = &*frame_ptr;
                println!("  Frame {} ({} roots):", frame_num, frame.root_count);
                for i in 0..frame.root_count {
                    let root = frame.get_root(i);
                    println!("    [{}] = {:p}", i, root);
                }
                current = frame.next;
                frame_num += 1;
            }
        }
    });
}
}

Chapter 29: Performance Considerations

Allocation Overhead

OperationTimeNotes
alloc()O(1)Plus possible GC trigger
gc_push_frame()O(1)Allocate and link
gc_pop_frame()O(1)Unlink and free
gc_collect()O(live objects)Mark + sweep

Tuning Parameters

#![allow(unused)]
fn main() {
// Adjust these based on your workload

/// Trigger GC after this many allocations
const GC_THRESHOLD: usize = 1024;

/// Maximum objects before forced collection
const GC_MAX_OBJECTS: usize = 10000;

/// Enable debug output
const GC_DEBUG: bool = false;
}

Optimization Opportunities

  1. Generational GC: Most objects die young. Only scan recent allocations.
  2. Incremental GC: Don't pause for full collection. Do it in small chunks.
  3. Parallel GC: Use multiple threads for mark/sweep.
  4. Escape Analysis: If an object doesn't escape a function, put it on the stack.

Chapter 30: What's Next?

You now have a complete garbage collector! Here's what to explore next:

Immediate Next Steps

  1. Get it running - Compile the runtime, write a simple test program
  2. Add more object types - Arrays, maps, classes
  3. Write tests - Verify GC handles all edge cases

Advanced Topics

  1. Generational Collection

    • Young generation: frequent, small collections
    • Old generation: rare, large collections
  2. Incremental Collection

    • Break mark/sweep into small chunks
    • Reduce pause times
  3. Compacting Collection

    • Move objects to eliminate fragmentation
    • Requires precise GC (we have it!)
  4. Concurrent Collection

    • Run GC in parallel with program execution
    • Complex but powerful

Resources

  • Crafting Interpreters - The GC chapter (you skipped it!)
  • GC Handbook - The definitive reference
  • MLIR Documentation - For advanced dialect features
  • LLVM Statepoints - If you want relocating GC

Appendix A: Quick Reference Card

┌──────────────────────────────────────────────────────────────┐
│                    GC QUICK REFERENCE                        │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│  MARK-SWEEP ALGORITHM                                        │
│  ─────────────────────                                       │
│  1. Mark all roots (shadow stack)                            │
│  2. Recursively mark referenced objects                      │
│  3. Sweep: free all unmarked objects                         │
│                                                              │
│  ROOTS                                                       │
│  ─────                                                       │
│  • Global variables                                          │
│  • Local variables (in shadow stack frames)                  │
│  • Temporaries (also in shadow stack)                        │
│                                                              │
│  OBJECT LIFECYCLE                                            │
│  ─────────────────                                           │
│  1. alloc() → Object on heap                                 │
│  2. gc_set_root() → Register in shadow stack                 │
│  3. Use object                                               │
│  4. gc_pop_frame() → Remove from shadow stack                │
│  5. gc_collect() → Free if unreachable                       │
│                                                              │
│  CLOSURES                                                    │
│  ────────                                                    │
│  • Captured vars → Environment (heap)                        │
│  • Closure = function pointer + environment pointer          │
│  • Mark closure → Mark environment → Mark captured vars      │
│                                                              │
│  MLIR OPERATIONS                                             │
│  ───────────────                                             │
│  lox.alloc         → Allocate object                         │
│  lox.push_frame    → Push shadow stack frame                 │
│  lox.pop_frame     → Pop shadow stack frame                  │
│  lox.set_root      → Register root                           │
│                                                              │
└──────────────────────────────────────────────────────────────┘

Appendix B: Error Messages

ErrorCauseFix
No frame to pop!gc_pop_frame called without matching gc_push_frameCheck function prologue/epilogue pairing
Out of memory!Allocation failedCheck for memory leaks, tune GC threshold
Variable not found: xCapture analysis failedCheck scope tracking in analyzer
Object not marked during sweepLogic errorCall gc_debug_print_heap() before sweep

Appendix C: Study Roadmap

Use this checklist to track your progress through the GC tutorial:

Phase 1: Understanding (Read-Only)

  • Part 2, Chapters 1-4: Understand the GC problem and mark-sweep algorithm
  • Part 2, Chapter 5: Understand the two hard problems (roots and references)
  • Part 2, Chapters 6-7: Understand object headers and allocation
  • Part 2, Chapter 7.5: Complete practice exercises
  • Part 3, Chapters 8-9: Understand the shadow stack concept
  • Part 3, Chapter 10: Understand mark and sweep phases
  • Part 3, Chapter 10.5: Walk through the step-by-step example
  • Part 3, Chapter 10.6: Read about common mistakes
  • Part 3, Chapter 10.7: Complete practice exercises
  • Part 4, Chapters 12-14: Understand MLIR dialect and lowering
  • Part 4, Chapter 15: Understand function code generation
  • Part 4, Chapter 17.5: Trace through the complete MLIR example
  • Part 4, Chapter 17.6: Understand different object types
  • Part 4, Chapter 17.7: Complete practice exercises
  • Part 5, Chapters 18-19: Understand the closure problem
  • Part 5, Chapters 20-21: Understand environments and marking
  • Part 5, Chapter 22: Understand closure code generation
  • Part 5, Chapter 23: Understand the complete picture
  • Part 5, Chapters 23.5-23.7: Understand nested and complex closures
  • Part 5, Chapter 23.8: Complete practice exercises

Phase 2: Implementation (Hands-On)

  • Create the runtime module structure
  • Implement ObjHeader and ObjType
  • Implement alloc() function
  • Implement shadow stack (StackFrame, gc_push_frame, gc_pop_frame)
  • Implement mark phase (gc_mark, mark_object, mark_references)
  • Implement sweep phase (gc_sweep)
  • Implement gc_collect() entry point
  • Test allocation without GC
  • Test mark phase with simple objects
  • Test full GC cycle
  • Add debug utilities (gc_debug_print_heap, etc.)
  • Implement environment allocation
  • Implement closure allocation
  • Test closure GC

Phase 3: MLIR Integration

  • Set up Melior dependency
  • Define Lox dialect operations
  • Implement lowering pass
  • Generate MLIR for simple functions
  • Lower to LLVM IR
  • Link with runtime
  • Test end-to-end execution

Phase 4: Advanced Features

  • Add string type
  • Add instance type
  • Implement nested closure environments
  • Add generational GC (optional)
  • Add incremental collection (optional)

Appendix D: Glossary

TermDefinition
HeapThe pool of memory where objects are allocated
StackMemory for function calls and local variables
RootA pointer that the GC treats as always live (globals, locals)
ReachableAn object that can be found by following pointers from roots
MarkTo flag an object as "still live" during GC
SweepTo free all objects that weren't marked
Shadow StackA separate stack that tracks GC roots for each function call
EnvironmentA heap-allocated structure holding captured variables for closures
ClosureA function together with its captured environment
DialectA set of operations in MLIR for a specific domain
LoweringConverting high-level IR to lower-level IR

Appendix E: Further Reading

Books

  • "Crafting Interpreters" by Bob Nystrom - Chapter on garbage collection
  • "The Garbage Collection Handbook" by Jones et al. - Comprehensive GC reference
  • "Engineering a Compiler" by Cooper & Torczon - GC in the context of compilers

Papers

  • "Uniprocessor Garbage Collection Techniques" - Wilson (1992) - Classic survey
  • "A Unified Theory of Garbage Collection" - Bacon et al. (2004) - Theoretical foundation

Code

  • Lua 5.4 source - Simple, readable GC implementation
  • MicroPython - Mark-sweep GC for embedded systems
  • mruby - Lightweight Ruby with simple GC

MLIR/LLVM

  • MLIR Documentation - https://mlir.llvm.org/docs/
  • Melior Repository - Rust bindings for MLIR
  • LLVM GC Documentation - https://llvm.org/docs/GarbageCollection.html

Conclusion

You've learned:

  1. Why GC is needed - Manual memory management doesn't work for closures
  2. Mark-sweep algorithm - The simplest correct GC
  3. Shadow stack - How to track roots precisely
  4. Closures - The hardest part: captured variables in environments
  5. MLIR integration - How to generate code that uses the GC

The key insight: GC is not magic. It's just:

  1. Track what's reachable (roots)
  2. Mark everything reachable
  3. Free everything not marked

Everything else is optimization.


File Summary

FileContent
mlir-lox-guide-rust-part2.mdGC concepts, object headers, allocation
mlir-lox-guide-rust-part3.mdShadow stack, mark phase, sweep phase
mlir-lox-guide-rust-part4.mdMLIR dialect, lowering to LLVM
mlir-lox-guide-rust-part5.mdClosures and environments
mlir-lox-guide-rust-part6.mdComplete reference (this file)

Good luck building your Lox compiler! 🦀