MLIR for Lox: Part 6 - Complete Reference
This final part provides a complete reference and review of the entire GC implementation.
Chapter 24: Complete File Structure
lox-mlir/
├── Cargo.toml
├── src/
│ ├── lib.rs # Library entry point
│ ├── main.rs # CLI
│ │
│ ├── ast/
│ │ ├── mod.rs
│ │ └── types.rs # AST type definitions
│ │
│ ├── lexer/
│ │ ├── mod.rs
│ │ └── tokenizer.rs # Tokenization
│ │
│ ├── parser/
│ │ ├── mod.rs
│ │ └── parser.rs # Parsing
│ │
│ ├── analysis/
│ │ ├── mod.rs
│ │ └── captures.rs # Closure capture analysis
│ │
│ ├── codegen/
│ │ ├── mod.rs
│ │ ├── generator.rs # MLIR code generation
│ │ ├── lox_dialect.rs # Lox MLIR operations
│ │ ├── lowering.rs # Lox → LLVM lowering
│ │ └── types.rs # Type utilities
│ │
│ └── runtime/
│ ├── mod.rs
│ ├── object.rs # Object header and types
│ ├── gc.rs # Garbage collector
│ └── shadow_stack.rs # Shadow stack implementation
│
├── examples/
│ ├── simple.lox # Basic example
│ ├── closures.lox # Closure example
│ └── gc_stress.lox # GC stress test
│
└── tests/
└── gc_test.rs # GC integration tests
Chapter 25: Complete Object Layout
Every object in Lox has this layout:
┌─────────────────────────────────────────────────────┐
│ ObjHeader (8 bytes) │
├─────────────────────────────────────────────────────┤
│ Field │ Offset │ Size │ Description │
├───────────────┼────────┼──────┼────────────────────┤
│ marked │ 0 │ 1 │ GC mark flag │
│ obj_type │ 1 │ 1 │ ObjType enum │
│ padding │ 2 │ 2 │ Alignment padding │
│ size │ 4 │ 4 │ Data size in bytes │
└─────────────────────────────────────────────────────┘
Total header size: 8 bytes
Data follows immediately after header.
Type-Specific Layouts
Number (ObjType::Number)
┌──────────────────────────────┐
│ Header (8 bytes) │
├──────────────────────────────┤
│ value: f64 (8 bytes) │
└──────────────────────────────┘
Total: 16 bytes
String (ObjType::String)
┌──────────────────────────────┐
│ Header (8 bytes) │
├──────────────────────────────┤
│ length: usize (8 bytes) │
│ hash: u64 (8 bytes) │
│ chars: [u8; length] │
└──────────────────────────────┘
Total: 16 + length bytes
Environment (ObjType::Environment)
┌──────────────────────────────┐
│ Header (8 bytes) │
├──────────────────────────────┤
│ enclosing: *mut Env (8) │
│ slot_count: usize (8) │
│ slots: [*mut u8; slot_count] │
└──────────────────────────────┘
Total: 24 + (8 × slot_count) bytes
Closure (ObjType::Closure)
┌──────────────────────────────┐
│ Header (8 bytes) │
├──────────────────────────────┤
│ function_index: usize (8) │
│ environment: *mut Env (8) │
└──────────────────────────────┘
Total: 24 bytes
Instance (ObjType::Instance)
┌──────────────────────────────┐
│ Header (8 bytes) │
├──────────────────────────────┤
│ class: *mut Class (8) │
│ field_count: usize (8) │
│ fields: [(*str, *val); n] │
└──────────────────────────────┘
Total: 24 + (16 × field_count) bytes
Chapter 26: Complete GC API Reference
Runtime Functions (C ABI)
// Allocation
void* lox_runtime_alloc(size_t size, uint8_t obj_type);
void* lox_runtime_alloc_environment(size_t slot_count, void* enclosing);
void* lox_runtime_alloc_closure(size_t function_index, void* environment);
// Environment access
void lox_runtime_env_set(void* env, size_t index, void* value);
void* lox_runtime_env_get(void* env, size_t index);
// Shadow stack
void** gc_push_frame(size_t root_count);
void gc_pop_frame(void);
void gc_set_root(size_t index, void* value);
// Collection
void gc_collect(void);
size_t gc_object_count(void);
MLIR Operations
lox.alloc :: (obj_type: i8, size: i64) -> !llvm.ptr
Allocates a heap object with the given type and size.
lox.push_frame :: (root_count: i64) -> !llvm.ptr
Pushes a shadow stack frame with the given number of root slots.
Returns pointer to roots array.
lox.pop_frame :: () -> ()
Pops the current shadow stack frame.
lox.set_root :: (index: i64, value: !llvm.ptr) -> ()
Sets a root in the current frame.
lox.env_set :: (env: !llvm.ptr, index: i64, value: !llvm.ptr) -> ()
Sets a slot in an environment.
lox.env_get :: (env: !llvm.ptr, index: i64) -> !llvm.ptr
Gets a slot from an environment.
Chapter 27: The Compilation Pipeline
┌─────────────────────────────────────────────────────────────┐
│ 1. LEXING │
│ "var x = 1 + 2;" → [VAR, IDENT, EQUAL, NUMBER, ...] │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 2. PARSING │
│ Tokens → AST │
│ VarStmt { name: "x", init: BinaryExpr { ... } } │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 3. ANALYSIS │
│ Find captured variables, compute root counts │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 4. CODE GENERATION │
│ AST → MLIR (Lox dialect) │
│ func.func @main() { │
│ %frame = lox.push_frame root_count = 1 │
│ %x = lox.alloc obj_type = 0, size = 8 │
│ lox.set_root index = 0, %x │
│ ... │
│ } │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 5. LOWERING │
│ MLIR (Lox dialect) → MLIR (LLVM dialect) │
│ lox.alloc → func.call @lox_runtime_alloc │
│ lox.push_frame → func.call @gc_push_frame │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 6. LLVM IR GENERATION │
│ MLIR → LLVM IR │
│ define i8* @lox_runtime_alloc(...) { ... } │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 7. NATIVE CODE GENERATION │
│ LLVM IR → Machine code │
│ Link with liblox_runtime.a │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ 8. EXECUTION │
│ ./program │
│ Runtime GC manages memory │
└─────────────────────────────────────────────────────────────┘
Chapter 28: Debugging Your GC
Common Bugs
1. Use After Free
Symptom: Random crashes, corrupted data
Cause: Object freed while still referenced
Fix: Check that all references are registered as roots
2. Memory Leaks
Symptom: Memory grows unbounded
Cause: Objects not being freed
Fix: Check mark phase visits all reachable objects
3. Premature Collection
Symptom: Object disappears mid-function
Cause: Root not registered in shadow stack
Fix: Ensure gc_set_root is called for all local references
Debugging Tools
#![allow(unused)] fn main() { // Add to gc.rs /// Print all objects in the heap (for debugging) pub fn gc_debug_print_heap() { ALL_OBJECTS.with(|objects| { let mut current = *objects.borrow(); println!("=== HEAP CONTENTS ==="); while let Some(header_ptr) = current { unsafe { let header = &*header_ptr; println!(" {:p}: type={:?}, marked={}, size={}", header_ptr, header.obj_type, header.marked, header.size); current = header.next; } } }); } /// Print the shadow stack (for debugging) pub fn gc_debug_print_stack() { SHADOW_STACK_HEAD.with(|head| { let mut current = *head.borrow(); println!("=== SHADOW STACK ==="); let mut frame_num = 0; while let Some(frame_ptr) = current { unsafe { let frame = &*frame_ptr; println!(" Frame {} ({} roots):", frame_num, frame.root_count); for i in 0..frame.root_count { let root = frame.get_root(i); println!(" [{}] = {:p}", i, root); } current = frame.next; frame_num += 1; } } }); } }
Chapter 29: Performance Considerations
Allocation Overhead
| Operation | Time | Notes |
|---|---|---|
alloc() | O(1) | Plus possible GC trigger |
gc_push_frame() | O(1) | Allocate and link |
gc_pop_frame() | O(1) | Unlink and free |
gc_collect() | O(live objects) | Mark + sweep |
Tuning Parameters
#![allow(unused)] fn main() { // Adjust these based on your workload /// Trigger GC after this many allocations const GC_THRESHOLD: usize = 1024; /// Maximum objects before forced collection const GC_MAX_OBJECTS: usize = 10000; /// Enable debug output const GC_DEBUG: bool = false; }
Optimization Opportunities
- Generational GC: Most objects die young. Only scan recent allocations.
- Incremental GC: Don't pause for full collection. Do it in small chunks.
- Parallel GC: Use multiple threads for mark/sweep.
- Escape Analysis: If an object doesn't escape a function, put it on the stack.
Chapter 30: What's Next?
You now have a complete garbage collector! Here's what to explore next:
Immediate Next Steps
- Get it running - Compile the runtime, write a simple test program
- Add more object types - Arrays, maps, classes
- Write tests - Verify GC handles all edge cases
Advanced Topics
-
Generational Collection
- Young generation: frequent, small collections
- Old generation: rare, large collections
-
Incremental Collection
- Break mark/sweep into small chunks
- Reduce pause times
-
Compacting Collection
- Move objects to eliminate fragmentation
- Requires precise GC (we have it!)
-
Concurrent Collection
- Run GC in parallel with program execution
- Complex but powerful
Resources
- Crafting Interpreters - The GC chapter (you skipped it!)
- GC Handbook - The definitive reference
- MLIR Documentation - For advanced dialect features
- LLVM Statepoints - If you want relocating GC
Appendix A: Quick Reference Card
┌──────────────────────────────────────────────────────────────┐
│ GC QUICK REFERENCE │
├──────────────────────────────────────────────────────────────┤
│ │
│ MARK-SWEEP ALGORITHM │
│ ───────────────────── │
│ 1. Mark all roots (shadow stack) │
│ 2. Recursively mark referenced objects │
│ 3. Sweep: free all unmarked objects │
│ │
│ ROOTS │
│ ───── │
│ • Global variables │
│ • Local variables (in shadow stack frames) │
│ • Temporaries (also in shadow stack) │
│ │
│ OBJECT LIFECYCLE │
│ ───────────────── │
│ 1. alloc() → Object on heap │
│ 2. gc_set_root() → Register in shadow stack │
│ 3. Use object │
│ 4. gc_pop_frame() → Remove from shadow stack │
│ 5. gc_collect() → Free if unreachable │
│ │
│ CLOSURES │
│ ──────── │
│ • Captured vars → Environment (heap) │
│ • Closure = function pointer + environment pointer │
│ • Mark closure → Mark environment → Mark captured vars │
│ │
│ MLIR OPERATIONS │
│ ─────────────── │
│ lox.alloc → Allocate object │
│ lox.push_frame → Push shadow stack frame │
│ lox.pop_frame → Pop shadow stack frame │
│ lox.set_root → Register root │
│ │
└──────────────────────────────────────────────────────────────┘
Appendix B: Error Messages
| Error | Cause | Fix |
|---|---|---|
No frame to pop! | gc_pop_frame called without matching gc_push_frame | Check function prologue/epilogue pairing |
Out of memory! | Allocation failed | Check for memory leaks, tune GC threshold |
Variable not found: x | Capture analysis failed | Check scope tracking in analyzer |
Object not marked during sweep | Logic error | Call gc_debug_print_heap() before sweep |
Appendix C: Study Roadmap
Use this checklist to track your progress through the GC tutorial:
Phase 1: Understanding (Read-Only)
- Part 2, Chapters 1-4: Understand the GC problem and mark-sweep algorithm
- Part 2, Chapter 5: Understand the two hard problems (roots and references)
- Part 2, Chapters 6-7: Understand object headers and allocation
- Part 2, Chapter 7.5: Complete practice exercises
- Part 3, Chapters 8-9: Understand the shadow stack concept
- Part 3, Chapter 10: Understand mark and sweep phases
- Part 3, Chapter 10.5: Walk through the step-by-step example
- Part 3, Chapter 10.6: Read about common mistakes
- Part 3, Chapter 10.7: Complete practice exercises
- Part 4, Chapters 12-14: Understand MLIR dialect and lowering
- Part 4, Chapter 15: Understand function code generation
- Part 4, Chapter 17.5: Trace through the complete MLIR example
- Part 4, Chapter 17.6: Understand different object types
- Part 4, Chapter 17.7: Complete practice exercises
- Part 5, Chapters 18-19: Understand the closure problem
- Part 5, Chapters 20-21: Understand environments and marking
- Part 5, Chapter 22: Understand closure code generation
- Part 5, Chapter 23: Understand the complete picture
- Part 5, Chapters 23.5-23.7: Understand nested and complex closures
- Part 5, Chapter 23.8: Complete practice exercises
Phase 2: Implementation (Hands-On)
- Create the runtime module structure
-
Implement
ObjHeaderandObjType -
Implement
alloc()function -
Implement shadow stack (
StackFrame,gc_push_frame,gc_pop_frame) -
Implement mark phase (
gc_mark,mark_object,mark_references) -
Implement sweep phase (
gc_sweep) -
Implement
gc_collect()entry point - Test allocation without GC
- Test mark phase with simple objects
- Test full GC cycle
-
Add debug utilities (
gc_debug_print_heap, etc.) - Implement environment allocation
- Implement closure allocation
- Test closure GC
Phase 3: MLIR Integration
- Set up Melior dependency
- Define Lox dialect operations
- Implement lowering pass
- Generate MLIR for simple functions
- Lower to LLVM IR
- Link with runtime
- Test end-to-end execution
Phase 4: Advanced Features
- Add string type
- Add instance type
- Implement nested closure environments
- Add generational GC (optional)
- Add incremental collection (optional)
Appendix D: Glossary
| Term | Definition |
|---|---|
| Heap | The pool of memory where objects are allocated |
| Stack | Memory for function calls and local variables |
| Root | A pointer that the GC treats as always live (globals, locals) |
| Reachable | An object that can be found by following pointers from roots |
| Mark | To flag an object as "still live" during GC |
| Sweep | To free all objects that weren't marked |
| Shadow Stack | A separate stack that tracks GC roots for each function call |
| Environment | A heap-allocated structure holding captured variables for closures |
| Closure | A function together with its captured environment |
| Dialect | A set of operations in MLIR for a specific domain |
| Lowering | Converting high-level IR to lower-level IR |
Appendix E: Further Reading
Books
- "Crafting Interpreters" by Bob Nystrom - Chapter on garbage collection
- "The Garbage Collection Handbook" by Jones et al. - Comprehensive GC reference
- "Engineering a Compiler" by Cooper & Torczon - GC in the context of compilers
Papers
- "Uniprocessor Garbage Collection Techniques" - Wilson (1992) - Classic survey
- "A Unified Theory of Garbage Collection" - Bacon et al. (2004) - Theoretical foundation
Code
- Lua 5.4 source - Simple, readable GC implementation
- MicroPython - Mark-sweep GC for embedded systems
- mruby - Lightweight Ruby with simple GC
MLIR/LLVM
- MLIR Documentation - https://mlir.llvm.org/docs/
- Melior Repository - Rust bindings for MLIR
- LLVM GC Documentation - https://llvm.org/docs/GarbageCollection.html
Conclusion
You've learned:
- Why GC is needed - Manual memory management doesn't work for closures
- Mark-sweep algorithm - The simplest correct GC
- Shadow stack - How to track roots precisely
- Closures - The hardest part: captured variables in environments
- MLIR integration - How to generate code that uses the GC
The key insight: GC is not magic. It's just:
- Track what's reachable (roots)
- Mark everything reachable
- Free everything not marked
Everything else is optimization.
File Summary
| File | Content |
|---|---|
mlir-lox-guide-rust-part2.md | GC concepts, object headers, allocation |
mlir-lox-guide-rust-part3.md | Shadow stack, mark phase, sweep phase |
mlir-lox-guide-rust-part4.md | MLIR dialect, lowering to LLVM |
mlir-lox-guide-rust-part5.md | Closures and environments |
mlir-lox-guide-rust-part6.md | Complete reference (this file) |
Good luck building your Lox compiler! 🦀