Chapter 14: Recursive Types — When a Type References Itself
You can write this in Lua:
---@class Node
---@field value number
---@field next Node|nil
local list = { value = 1, next = { value = 2, next = nil } }
Node has a field of type Node. Not a copy of Node — the same Node. A linked list that points to itself. A tree whose children are more trees. A JSON value that might contain other JSON values.
This is a recursive type. And if you try to represent it the way we’ve been representing types — by expanding the definition everywhere it appears — you’ll loop forever. Node contains Node contains Node contains Node…
Our type checker needs to handle this without spinning. The fix is simpler than you’d expect: don’t expand recursive types. Refer to them by name.
Expanding Node forever: Using a name reference:
Table { Table {
value: Number, value: Number,
next: Union([ next: Union([
Table { Ref("Node"), ← stops here
value: Number, Nil
next: Union([ ])
Table { }
value: Number,
next: Union([ Ref("Node") is a lazy pointer —
Table { "the type named Node" without
... ♾️ expanding what Node contains.
}, Compare names, not structures.
Nil One comparison, no loop.
])
}),
Nil
])
}),
Nil
])
}
The Problem: Substitution Loops
When our type checker encounters ---@field next Node|nil, it does two things:
- Look up
Nodein the environment → gets the class’s type - Use that type as the field type
If Node’s type is Table { fields: [("value", Number), ("next", Union([Node, Nil])] }, step 2 tries to resolve Node inside the definition. Which means looking up Node again. Which means expanding the same type. Which means looking up Node again. Infinite loop.
The same problem shows up with generic recursive types. Imagine:
---@class Tree<T>
---@field value T
---@field children Tree<T>[]
Tree<T> contains Tree<T>[] — itself, parameterized. Expanding it produces another Tree<T>, which produces another…
This isn’t a bug in our type checker. It’s a fundamental property of recursive definitions. The solution isn’t to prevent recursion — it’s to represent recursive types without expanding them.
Step 1: Names as References, Not Expansions
Right now, when check_stmt encounters ---@class Node, it registers the class in the environment with its full type:
#![allow(unused)]
fn main() {
// Current approach: store the expanded type
env.extend(name, Type::Table { fields: resolved_fields });
}
The problem: resolved_fields might contain Node, which triggers another lookup, which triggers another resolution…
Instead of expanding the class name into its full type, we can keep it as a name. When the type checker encounters Node as a field type, it stores… Node. Not the expanded definition. The name, and nothing more.
We already have the infrastructure for this: Type::Var stores a name. But Var means “a type variable to be resolved by substitution.” A recursive reference is different — it’s a name that shouldn’t be substituted away.
We need a new variant:
#![allow(unused)]
fn main() {
/// A named type reference — used for recursive types.
/// `Ref("Node")` means "the type named Node" without expanding it.
/// Unlike `Var("T")`, a `Ref` is never substituted — it's a final value.
Ref(String),
}
Ref and Var both store names, but they behave differently:
Var("T") | Ref("Node") | |
|---|---|---|
| Meaning | “Resolve me via substitution” | “I am a named type” |
| Substitution | Replaced with the mapped type | Left as-is |
| Compatibility | Compatible with everything (unknown) | Compatible with itself (same name) |
| When to use | Generic type parameters | Recursive type references |
Step 2: Class Registration with Ref
When check_stmt processes ---@class Node, it registers the class in the environment. Instead of expanding the class’s fields eagerly, we store a Ref:
#![allow(unused)]
fn main() {
Statement::Class { name, fields } => {
// Note: In our implementation, class registration happens via
// Annotation::Class in type_check's first pass, not here.
// This arm is shown for completeness — it demonstrates the
// registration pattern you'd use if classes were statements.
// Register the class name FIRST, before processing fields.
// This way, fields that reference the class itself will find it.
env.extend(name.clone(), Type::Ref(name.clone()));
// Now resolve field types. Any reference to this class
// will find the Ref we registered above — no infinite loop.
let resolved_fields: Vec<(String, Type)> = fields.iter()
.map(|(fname, ftype)| {
(fname.clone(), ftype.clone()) // parse_type already produces Ref for class names
})
.collect();
// Update the environment with the full table type.
// The Ref is still there for compatibility checks.
env.extend(name, Type::Table { fields: resolved_fields });
}
}
Let’s trace through what happens when we process ---@class Node:
---@class Nodearrives- Register
Node → Ref("Node")in the environment — this is the key step - Process
---@field value number→("value", Number) - Process
---@field next Node|nil→ look upNode→ findRef("Node")→ field type isUnion([Ref("Node"), Nil]) - Register
Node → Table { fields: [("value", Number), ("next", Union([Ref("Node"), Nil]))] }
The Table contains Ref("Node"), not another Table. No infinite expansion. The Ref acts as a lazy pointer — it names the type without expanding it.
The registration order matters: we register Ref("Node") before processing the fields (step 2 before steps 3-4). This way, when the next field looks up Node, it finds the Ref immediately. Step 5 then overwrites the Ref with the full Table, but by then all fields have already been resolved. This pattern — “declare the name first, fill in the details later” — is the same one C uses with forward declarations, and it shows up everywhere in compiler design.
Step 3: Compatibility for Ref
How does is_compatible_with handle Ref? Two Ref("Node") values are compatible because they refer to the same type:
#![allow(unused)]
fn main() {
(Type::Ref(a), Type::Ref(b)) => a == b,
}
What about Ref("Node") vs Table { fields: [...] }? If the Table is Node’s definition, they’re the same type — different representations of the same thing. This is where nominal vs structural typing comes back. We have two choices:
Option A: Nominal. Ref("Node") is only compatible with Ref("Node"). If the type checker has resolved Node to a Table, they’re different. Simple, consistent, but means you can’t compare a Ref with its expansion.
Option B: Structural with a lookup. When comparing Ref("Node") with a Table, look up Node in the environment and compare the Table structurally. More flexible, but requires access to the environment inside is_compatible_with.
We’ll go with Option A — nominal comparison. It’s simpler, it matches how LuaCATS ---@class works (classes are nominal, not structural), and it avoids threading the environment through every compatibility check.
#![allow(unused)]
fn main() {
(Type::Ref(a), Type::Ref(b)) => a == b,
// Ref is NOT automatically compatible with Table (in either direction).
// A Ref("Node") might have fields the Table doesn't; a Table
// might lack the class name. Use ---@type Node to bridge.
// (A full implementation would resolve Ref→Table via the
// environment and check structural compatibility.)
(Type::Ref(_), Type::Table { .. }) | (Type::Table { .. }, Type::Ref(_)) => false,
}
What about Ref vs Var, Ref vs Dynamic, or Ref vs Error? Those cases are handled by earlier wildcard arms in is_compatible_with — (Type::Var(_), _) | (_, Type::Var(_)) => true matches before the Ref arms, and (Type::Dynamic, _) | (_, Type::Dynamic) => true and (Type::Error, _) | (_, Type::Error) => true match even earlier. This is correct: Var is a placeholder that could be instantiated to any type (including a Ref), Dynamic is compatible with everything, and Error suppresses cascading errors.
The upshot: we only need the two Ref-specific match arms shown above — (Ref, Ref) for same-name checks and (Ref, Table) | (Table, Ref) to reject nominal-vs-structural mismatches. The catch-all _ => false handles any other Ref-vs-non-Ref pairing. Dynamic and Error are compatible with Ref as expected, but that flows through their own wildcard arms, not the Ref ones.
Step 4: Substitution Leaves Ref Alone
Our Substitution::substitute method replaces Var with mapped types. Ref should be left as-is:
#![allow(unused)]
fn main() {
Type::Var(name) => {
match self.get(name) {
Some(t) => t.substitute(self),
None => ty.clone(), // Unbound variable stays as-is
}
}
Type::Ref(_) => ty.clone(), // Never substitute a Ref
}
This is the key difference between Var and Ref. Var("T") is a placeholder — substitution fills it in. Ref("Node") is a final value — it names a type, and that name is the answer.
Step 5: parse_type_name Produces Ref for Known Classes
When the parser encounters a type name that corresponds to a class, it should produce a Ref instead of Dynamic. But parse_type_name doesn’t have access to the environment — it doesn’t know which names are classes.
In Chapter 12, we treated all uppercase names as type variables. Now that we have Ref for class references, we need a way to tell them apart. The convention: single uppercase letters (T, U) are type variables; multi-character uppercase names (Node, Tree) are class references. If a name starts with an uppercase letter and is more than one character, parse it as Ref. Unknown class names still become Ref — if the class doesn’t exist, the compatibility check fails cleanly.
#![allow(unused)]
fn main() {
fn parse_type_name(s: &str) -> Type {
match s.trim() {
"number" => Type::Number,
"string" => Type::String,
"boolean" => Type::Boolean,
"nil" => Type::Nil,
// Single uppercase letter → type variable
name if name.len() == 1 && name.chars().next().unwrap().is_uppercase() => {
Type::Var(name.to_string())
}
// Multi-char uppercase name → class reference
name if name.chars().next().map_or(false, |c| c.is_uppercase()) => {
Type::Ref(name.to_string())
}
_ => Type::Dynamic,
}
}
}
The rule: uppercase single letter = type variable, uppercase word = class reference.
A word of caution about this heuristic. In LuaCATS, only names declared in
---@genericare type parameters — everything else is a concrete type. Our convention (single uppercase letter = type variable) is a shortcut that works for the patterns in this tutorial (T,Ufor generics,Nodefor classes), but it has a blind spot: if someone writes---@param x String(capitalized), the parser treats it as a class referenceRef("String"), not the built-inType::String. The built-in types only match their lowercase spellings ("number","string","boolean","nil"). A production type checker would cross-reference the name against declared generics before deciding it’s aVar— we use the length heuristic to keep the code simple.
Step 6: Display for Ref
The Display implementation:
#![allow(unused)]
fn main() {
Type::Ref(name) => write!(f, "{name}"),
}
Simple — a Ref displays as its name. Ref("Node") shows as Node. This matches the annotation syntax.
Putting It Together: A Linked List
Let’s trace through a complete example:
---@class Node
---@field value number
---@field next Node|nil
local head = { value = 1, next = { value = 2, next = nil } }
-
---@class Node→ registerNode → Ref("Node")in the environment -
---@field value number→("value", Number) -
---@field next Node|nil→parse_type("Node|nil")→Union([Ref("Node"), Nil]) -
Update environment:
Node → Table { fields: [("value", Number), ("next", Union([Ref("Node"), Nil]))] } -
local head = { value = 1, next = { value = 2, next = nil } }→ type isTable { fields: [("value", Number), ("next", Union([Ref("Node"), Nil]))] }A note on table constructors: our simplified parser doesn’t handle table literals —
{ value = 1, next = ... }would be inferred asNilwithout an annotation. With---@type Node,headis typed asRef("Node")regardless of the literal’s inferred type, which is the point of the annotation. The trace above shows what a full implementation would produce; our code relies on the annotation to skip table-constructor inference entirely. -
Compatibility check:
head’s type is compatible withNode?TablevsRef("Node")→false(nominal)
Step 6 says the table literal isn’t compatible with Node! That’s because we’re comparing structurally-typed table literals against nominally-typed class references. The table literal has the right fields, but the type checker doesn’t recognize it as a Node — it’s an anonymous Table.
This is a real limitation of nominal typing. The fix is explicit annotation: add ---@type Node before the local head line, and the type checker treats the variable as Ref("Node") instead of an anonymous Table:
---@type Node
local head = { value = 1, next = { value = 2, next = nil } }
Now head’s type is Ref("Node"), and compatibility checks against Node work. Real LuaCATS implementations do the same thing — classes are nominal, and constructing instances that the type checker recognizes requires the annotation. This isn’t a hack; it’s the intended design. Structural subtyping (“any table with the right fields is a Node”) is possible but introduces its own complexity — cycle detection in the structural comparator, variance rules for field types, and the question of whether extra fields are allowed.
What We’re Simplifying
No structural subtyping for classes. A table with the right fields isn’t automatically a Node. Real LuaCATS doesn’t do structural subtyping either; classes are nominal.
No structural equivalence checking for Ref types. Ref("Node") and Ref("LinkedList") are different types, even if they have the same fields.
No generic recursive types. ---@class Tree<T> isn’t supported — generic classes require tracking type parameters on the class itself, which is a significant extension. The Ref mechanism works the same way — you’d need Ref("Tree", [Type::Var("T")]) instead of Ref("Tree").
No cycle detection in class definitions. ---@class A\n---@field b B and ---@class B\n---@field a A is a mutual recursion. Our approach handles it correctly (both become Refs), but a full checker would verify the recursion terminates at nil or Dynamic.
Next: Chapter 15: Type Narrowing — Union types tell you what a value could be. Type narrowing tells you what it is right now. After if type(x) == "number" then, x is definitely a number inside that branch.