Chapter 12: Generic Functions — One Definition, Many Types

Our type checker can annotate functions, infer types, and handle unions and classes. But there’s a gap that gets wider the more you use it. Consider the simplest possible function:

local function identity(x)
    return x
end

What’s the type of identity? It takes something and returns the same thing. With what we have so far, the best we can write is:

---@param x any
---@return any

But that’s wrong. Not “slightly imprecise” wrong — semantically wrong. If you call identity(42), you know you’ll get a number back. The type checker doesn’t. It thinks you might get anything back, because any means “I have no idea.” You’ve lost information that you — the programmer — had.

Or with unions:

---@param x number|string
---@return number|string

Still wrong. Pass a number, get back number|string. The return type is wider than it should be. The type checker should know that identity(42) returns number, not number|string. The output should be as specific as the input.

This is what generic functions solve. A generic function has type parameters — variables that stand in for types. When you call the function, the type parameters are replaced (substituted) with the actual types of the arguments. The return type is computed from that substitution.

---@generic T
---@param x T
---@return T
local function identity(x)
    return x
end

Now identity(42) substitutes T = number and returns number. identity("hello") substitutes T = string and returns string. The type of the return value tracks the type of the argument. No information lost.

The Problem: Information Loss

Here’s the pattern that generic functions fix. You have a function where the output depends on the input, but not in a way that a fixed type signature can express:

Function	Input	Output you want	Output you get with `any`
`identity(x)`	`number`	`number`	`any`
`first(arr)`	`{string}`	`string`	`any`
`map(arr, f)`	`{number}`, `number → string`	`{string}`	`any`

Every row has the same problem: the relationship between input and output is lost. The type checker can’t express “the output type depends on the input type” without generics.

With type parameters, these relationships become explicit:

identity is (T) → T — same type in and out
first is ({T}) → T — element type of the array
map is ({T}, (T) → U) → {U} — transform element types

The letters T and U are type variables. They’re placeholders that get resolved at the call site.

Type::Var — Type Variables

We add one new variant to Type:

#![allow(unused)]
fn main() {
pub enum Type {
    Number, String, Boolean, Nil,
    Function { params: Vec<Type>, ret: Box<Type> },
    Table { fields: Vec<(String, Type)> },
    Union(Vec<Type>),
    Var(String),     // NEW: type variable (T, K, V, etc.)
    Dynamic, Error,
}
}

Type::Var("T".to_string()) means “some type that we’ll figure out later.” It’s a promise: when we know what T is, we’ll replace it.

Type variables interact with the existing type system in three ways.

Compatibility: Var("T") is compatible with everything, like Dynamic. An unresolved type variable could be anything — we can’t reject it. This is a simplification. A full type checker would track constraints on type variables and reject inconsistent ones, but that requires constraint solving, which is beyond a gradual type checker’s scope.

Substitution: The whole point of Var is that it gets replaced. We’ll add a substitute method that walks a type and replaces Var("T") with whatever the substitution map says.

Display: Var("T") displays as T. After substitution, it displays as the concrete type.

Parsing `---@generic`

LuaCATS uses ---@generic to declare type parameters:

---@generic T
---@param x T
---@return T
local function identity(x) return x end

Multiple type parameters are comma-separated:

---@generic K, V
---@param t table
---@return K, V
local function next(t) end

We add a new Annotation variant:

#![allow(unused)]
fn main() {
pub enum Annotation {
    // ... existing variants ...
    Generic { func_name: String, type_params: Vec<String> },
}
}

The parsing follows the same pattern as ---@param: when we see ---@generic, we collect the type parameter names and attach them to the next function declaration.

One subtlety: how do we distinguish a type variable from a type name in ---@param annotations? When the user writes ---@param x T, is T a class named T or a type variable?

We use a convention: uppercase names are type variables. If a type name starts with an uppercase letter and contains only alphanumeric characters, it’s a type variable. This matches LuaCATS practice — T, K, V are universally used as type parameters, while number, string, Point, User are concrete types.

#![allow(unused)]
fn main() {
fn parse_type_name(s: &str) -> Type {
    // ...
    _ if s.chars().next().map(|c| c.is_uppercase()).unwrap_or(false)
        && s.chars().all(|c| c.is_alphanumeric()) => Type::Var(s.to_string()),
    _ => Type::Dynamic,
}
}

This is a heuristic. A real type checker would cross-reference the ---@generic declaration — names listed there are type variables, everything else is a type name. Our convention gets the same result in practice without that extra bookkeeping. The one case where it breaks down: a capitalized built-in name like String would be parsed as Type::Var("String") instead of Type::String. In practice, LuaCATS annotations use lowercase for built-in types, so this rarely comes up — but a production checker should check ---@generic declarations instead of relying on casing alone.

Substitution: The Core Mechanism

When a generic function is called, we need to figure out what the type variables resolve to. The mechanism is substitution — a mapping from type variable names to concrete types.

Here’s how it works for identity(42):

identity has type (T) → T
The argument 42 has type Number
Match formal parameter T against actual argument Number → T = Number
Substitute in the return type: T → Number
Result: Number

The substitution map is { T → Number }. We apply it to the return type, and the Var("T") becomes Number.

The Substitution struct

#![allow(unused)]
fn main() {
pub struct Substitution {
    map: Vec<(String, Type)>,
}
}

A simple list of (variable name, concrete type) pairs. We could use a HashMap, but for the small maps we’re dealing with (1-3 variables), a linear scan is simpler and avoids the hashing overhead.

Building a substitution from function arguments

The key operation: given a function’s formal parameter types (which may contain Var) and the actual argument types (which are concrete), build a substitution.

#![allow(unused)]
fn main() {
impl Substitution {
    fn from_args(formal_params: &[Type], actual_args: &[Type]) -> Substitution {
        let mut subst = Substitution::new();
        for (formal, actual) in formal_params.iter().zip(actual_args.iter()) {
            subst.unify(formal, actual);
        }
        subst
    }
}
}

The unify method matches a pattern (the formal parameter type, which may contain variables) against a concrete type, and extends the substitution with any variable bindings it discovers:

#![allow(unused)]
fn main() {
fn unify(&mut self, pattern: &Type, actual: &Type) {
    match (pattern, actual) {
        // Pattern is a type variable → bind it
        (Type::Var(name), ty) => {
            self.insert(name.clone(), ty.clone());
        }
        // Both are functions → unify params and returns
        (Type::Function { params: p1, ret: r1 },
         Type::Function { params: p2, ret: r2 }) if p1.len() == p2.len() => {
            for (fp, ap) in p1.iter().zip(p2.iter()) {
                self.unify(fp, ap);
            }
            self.unify(r1, r2);
        }
        // Both are tables → unify fields
        (Type::Table { fields: f1 }, Type::Table { fields: f2 }) => {
            for (name, ft) in f1 {
                if let Some((_, at)) = f2.iter().find(|(n, _)| n == name) {
                    self.unify(ft, at);
                }
            }
        }
        // Unions → try each variant
        //
        // ⚠️ Simplification: unifying a union pattern against a concrete
        // type tries every variant and collects all bindings. This is
        // imprecise — a union means "one of these, not all of them."
        // If Union([T, U]) unifies with Number, we'd bind both T → Number
        // and U → Number, even though the union means "T or U, not both."
        // For our gradual checker, this rarely causes problems because the
        // bindings tend to be consistent, but a full type checker would try
        // each variant separately and keep the first consistent binding.
        (Type::Union(variants), actual) | (actual, Type::Union(variants)) => {
            for v in variants { self.unify(v, actual); }
        }
        // Concrete types matching → nothing to bind
        _ => {}
    }
}
}

The name unify is intentional — this is a simplified form of unification, the algorithm that powers full type inference systems (where the compiler deduces types without any annotations at all, by solving constraints). Full unification handles bidirectional constraints and occurs-check (preventing T = Vec<T>). Our version is one-directional: we match the formal pattern against the actual type. This is enough for a gradual type checker.

What if a type variable maps to two different types? If identity had two parameters with the same variable — fun(x: T, y: T): T — and you called it with (42, "hello"), the substitution would try to map T → Number from the first argument and T → String from the second. The insert method detects this: if a variable is already mapped to a different concrete type, it keeps the first mapping and the inconsistency surfaces as a type error during checking. A full type checker would report the conflict immediately (“T cannot be both Number and String”), but for a gradual checker, letting the downstream error propagate is simpler and still catches the bug.

Applying the substitution

The substitute method walks a type and replaces variables:

#![allow(unused)]
fn main() {
impl Type {
    fn substitute(&self, subst: &Substitution) -> Type {
        match self {
            Type::Var(name) => subst.get(name).cloned()
                .unwrap_or_else(|| self.clone()),
            Type::Function { params, ret } => Type::Function {
                params: params.iter().map(|p| p.substitute(subst)).collect(),
                ret: Box::new(ret.substitute(subst)),
            },
            Type::Table { fields } => Type::Table {
                fields: fields.iter()
                    .map(|(n, t)| (n.clone(), t.substitute(subst)))
                    .collect(),
            },
            Type::Union(variants) => Type::union(
                variants.iter().map(|v| v.substitute(subst)).collect()
            ),
            other => other.clone(),
        }
    }
}
}

Atomic types (Number, String, etc.) substitute to themselves. Type variables are looked up in the map. Composite types (Function, Table, Union) recurse into their components.

If a variable isn’t in the substitution map, it stays as Var. This happens when a type variable appears in the return type but not in any parameter — there’s no argument to bind it from. The unbound variable remains as Var, which is compatible with everything (like Dynamic). A production type checker would report an error for unbound type variables.

The Call Site: Where Substitution Happens

The substitution is built and applied when a generic function is called, not when it’s defined. Here’s the change to infer_type for function calls:

#![allow(unused)]
fn main() {
Expression::FunctionCall { ref func, ref args } => {
    let ft = infer_type(db, source, (**func).clone(), env.clone());
    let arg_types: Vec<Type> = args.iter()
        .map(|a| infer_type(db, source, a.clone(), env.clone()))
        .collect();

    match ft {
        Type::Function { params, ret } => {
            let subst = Substitution::from_args(&params, &arg_types);
            ret.substitute(&subst)
        }
        Type::Dynamic => Type::Dynamic,
        _ => Type::Error,
    }
}
}

This uses ref func and ref args to borrow from the expression rather than taking ownership, consistent with the match patterns in earlier chapters. The (**func).clone() extracts and clones the boxed expression; args.iter() borrows the argument vector instead of consuming it.

The key addition: Substitution::from_args(&params, &arg_types) followed by ret.substitute(&subst).

This works for every function call, not only generic ones. If the function has no type variables, the substitution is empty and the return type passes through unchanged. No special-casing needed.

Let’s trace through the examples:

identity(42):

identity has type (T) → T
Argument type: Number
Substitution: { T → Number }
Return type after substitution: Number

identity("hello"):

identity has type (T) → T
Argument type: String
Substitution: { T → String }
Return type after substitution: String

identity(true):

identity has type (T) → T
Argument type: Boolean
Substitution: { T → Boolean }
Return type after substitution: Boolean

Same function, different call sites, different return types. That’s the point.

Two Type Parameters: Map

The identity function uses one type parameter. Real generic functions often use more. Here’s map:

---@generic T, U
---@param arr table
---@param f fun(x: T): U
---@return table
local function map(arr, f)
    return arr
end

T is the element type of the input array. U is the element type of the output array. The function type fun(x: T): U tells the type checker: “the callback takes a T and returns a U.”

A note before you try this: our parser can’t represent function types in annotations yet (that would require parsing fun(x: T): U as a Type::Function), so for now the ---@param f annotation falls through to Dynamic. The ---@generic T, U declaration is in place, but until Chapter 13 adds function type parsing, it won’t change the type checker’s behavior for this function. The substitution infrastructure works — when we add function type parsing, map’s type would be:

({T}, (T) → U) → {U}

And a call like map({1, 2, 3}, tostring) would produce the substitution { T → Number, U → String }, giving the return type {String}.

The infrastructure works. What’s missing is function type parsing in annotations — fun(x: T): U — which Chapter 13 adds. The ---@generic parsing is already in place from this chapter.

Inside the Function Body

When type-checking the body of a generic function, we don’t know what the type variables resolve to. identity’s parameter x is typed as T, but inside the function, T could be anything. We substitute type variables with Dynamic inside function bodies:

#![allow(unused)]
fn main() {
// Build the substitution directly: each type param → Dynamic
// Inside the function body, we don't know what T resolves to,
// so we treat it as Dynamic. The substitution happens at the
// call site, not inside the function.
let mut body_subst = Substitution::new();
for tp in &type_params {
    body_subst.insert(tp.clone(), Type::Dynamic);
}
func_env = func_env.extend(p.clone(), pt.substitute(&body_subst));
}

We build the substitution directly instead of using from_args for a reason: unify(Dynamic, Dynamic) hits the wildcard arm and produces no binding. If we passed [Dynamic, Dynamic, ...] as both formal and actual, the substitution would be empty and Var("T") would stay as Var("T") instead of becoming Dynamic.

This means inside identity, x has type Dynamic, not T. That’s correct — the function body doesn’t know the concrete type, so it can’t assume anything about T. The substitution from the call site doesn’t flow into the body.

A more sophisticated type checker would carry the type parameter constraints into the body. But for a gradual type checker, Dynamic inside the body is the right tradeoff: simple, correct, and it doesn’t produce false positives.

What We’re Simplifying

No constraint solving. Full type inference systems build a set of constraints and solve them. We match arguments to parameters one-directionally. This means we can’t infer types that flow “backwards” — if a function’s return type constrains its input type, we miss it. For a gradual type checker with explicit annotations, this is fine — the annotations provide the constraints.

No occurs check. In full unification, T = Table<T> would be rejected (a type can’t contain itself). Our substitution allows this. It would produce an infinitely recursive type, which our Display implementation would loop on. For the generic function annotations we’re handling here, self-referential type variables don’t arise — a function’s type parameters are always resolved to concrete types at the call site. Recursive class types (like Node containing Node|nil) are a different problem, which Chapter 14 handles with named references instead of expansion.

No higher-kinded types. You can’t write ---@generic M where M is a type constructor (like List or Option). Type parameters are always concrete types. This matches LuaCATS’s design.

No function type parsing in annotations. ---@param f fun(x: T): U would need a parser for function type syntax. Our parse_type_name maps fun(...) to Dynamic. A complete implementation would parse the function type, potentially with type variables.

Type variable convention, not declaration check. We treat uppercase names as type variables. A production checker would verify that type variables are declared in ---@generic. Our approach works for well-formed annotations but doesn’t catch typos like ---@param x Typpo where Typpo isn’t a declared type parameter.

Unbound type variables are Dynamic-like. If a type variable appears in a return type but not in any parameter type, it can’t be resolved at the call site. It stays as Var, which is compatible with everything. A production checker would report an error.

Next: Chapter 13: Function Types in Annotations — We’ve inferred function types from their bodies, but we haven’t let the programmer declare them. ---@param and ---@return are half the story — fun(number): string is the other half.

Keyboard shortcuts

Building a Gradual Type Checker for Lua with Salsa