Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Chapter 13: Function Types — Specifying What Goes In and What Comes Out

In Chapter 12, we wrote a generic map function with this annotation:

---@param f fun(x: T): U

But our parser couldn’t handle fun(x: T): U. It saw “fun” — not a known type name — and fell through to Dynamic. The type checker never learned what the callback does. f was Dynamic, and map couldn’t connect the callback’s parameter type to the array’s element type.

This chapter fixes that. We add a parser for function type syntax, which lets us express higher-order functions with full type precision.

The Problem: fun(...) Falls Through to Dynamic

Our parse_type_name function from Chapter 10 handles simple names:

#![allow(unused)]
fn main() {
fn parse_type_name(s: &str) -> Type {
    match s.trim() {
        "number" => Type::Number,
        "string" => Type::String,
        // ... other known types
        _ if looks_like_type_var(s) => Type::Var(s.to_string()),
        _ => Type::Dynamic,  // ← "fun(x: T): U" hits this branch
    }
}
}

The fun(...) syntax isn’t a simple name — it’s a structured type with parameters and a return type. We need a recursive parser that can handle it.

The Grammar

LuaCATS function type syntax:

fun_type       = "fun" "(" param_list ")" ":" type
param_list     = param ("," param)*
param          = name ":" type | type
type           = simple_name | fun_type | union_type
union_type     = type ("|" type)*

Examples and what they produce:

AnnotationParsed Type
fun(x: number): stringFunction([Number], String)
fun(x: T): UFunction([Var("T")], Var("U"))Var("U") is the return type, not a param
fun(number): booleanFunction([Number], Boolean) — unnamed params
fun(f: fun(T): U): tableFunction([Function([Var("T")], Var("U"))], Dynamic) — nested
fun(x: number, y: string): booleanFunction([Number, String], Boolean) — multiple params

The parser is recursive: fun(f: fun(T): U): table has a function type inside a function type. This is how you express higher-order functions — functions that take other functions as arguments.

Step 1: From parse_type_name to parse_type

The old function handled one thing: simple names. The new function handles three:

#![allow(unused)]
fn main() {
fn parse_type(s: &str) -> Type {
    let s = s.trim();

    // Function type: fun(...): return_type
    if s.starts_with("fun(") || s.starts_with("fun (") {
        return parse_function_type(s);
    }

    // Union type: "number|string"
    //
    // Caveat: this naive split on '|' breaks on nested unions inside
    // function types, e.g. "fun(x: number|string): boolean". The '|' inside
    // the function type's parameter gets split incorrectly. A production
    // parser would track parenthesis depth and only split on '|' at depth 0
    // — the same pattern we use for commas in parse_param_types below.
    // For the annotations in this tutorial (no nested unions in function
    // params), the naive split works. See "What We're Simplifying" at the end.
    if s.contains('|') {
        let variants: Vec<Type> = s.split('|')
            .map(|part| parse_type(part.trim()))
            .collect();
        return Type::union(variants);
    }

    // Simple type name (the old parse_type_name logic)
    parse_type_name(s)
}
}

The order matters: fun( must be checked first because function type syntax can contain | in its return type (like fun(T): U|nil). If we checked unions first, we’d split incorrectly on the | inside the return type. Then check for top-level unions, then fall back to simple names.

Wait — does fun(T): U|nil work with our parser? Let’s trace through it:

  1. s = "fun(T): U|nil" — starts with fun(, so we call parse_function_type
  2. parse_function_type finds the matching ), extracts T as the param
  3. It finds : U|nil after the closing paren
  4. It calls parse_type("U|nil") for the return type
  5. That hits the union branch: parse_type("U")Var("U"), parse_type("nil")Nil
  6. Result: Function([Var("T")], Union([Var("U"), Nil]))

It works because the return type is parsed by the same parse_type function — recursion handles the union automatically.

Step 2: Parsing the Function Type

The function type parser has three jobs:

  1. Find the matching closing paren (respecting nesting)
  2. Parse parameter types from the content inside the parens
  3. Parse the return type after the colon
#![allow(unused)]
fn main() {
fn parse_function_type(s: &str) -> Type {
    // Skip "fun" and optional space
    let after_fun = if s.starts_with("fun(") { &s[3..] } else { &s[4..] };

    // Find matching closing paren
    let paren_content = match find_matching_paren(after_fun) {
        Some(content) => content,
        None => return Type::Dynamic, // malformed
    };

    // Find the return type after the closing paren
    // Byte offset of the character after the closing ')'
    let after_paren = if s.starts_with("fun(") {
        4 + paren_content.len() + 1  // "fun(" + content + ")"
    } else {
        5 + paren_content.len() + 1  // "fun (" + content + ")"
    };

    let return_type = if let Some(rest) = s.get(after_paren..) {
        let rest = rest.trim();
        if let Some(rest) = rest.strip_prefix(':') {
            parse_type(rest)
        } else {
            Type::Dynamic  // no return type → Dynamic
        }
    } else {
        Type::Dynamic
    };

    // Parse parameter types from paren_content
    let params = parse_param_types(paren_content);

    Type::Function { params, ret: Box::new(return_type) }
}
}

Finding the matching paren. We can’t find the first ) and call it done — function types can be nested. fun(f: fun(T): U): V has two closing parens, and we need the outer one. So we count nesting depth:

#![allow(unused)]
fn main() {
fn find_matching_paren(s: &str) -> Option<&str> {
    if !s.starts_with('(') { return None; }
    let mut depth = 1;
    let mut i = 1;
    while i < s.len() && depth > 0 {
        match s.as_bytes()[i] {
            b'(' => depth += 1,
            b')' => depth -= 1,
            _ => {}
        }
        i += 1;
    }
    if depth == 0 { Some(&s[1..i-1]) } else { None }
}
}

Step 3: Parameter Parsing — Named vs Unnamed

LuaCATS supports both named and unnamed parameters:

fun(x: number): string    — named: x is number
fun(number): string        — unnamed: only the type

For type checking, we only care about the type — the name is documentation. Our parser handles both:

#![allow(unused)]
fn main() {
fn parse_param_type(s: &str) -> Type {
    let s = s.trim();
    // Find the colon that isn't inside parens
    let mut depth = 0;
    for (i, c) in s.char_indices() {
        match c {
            '(' => depth += 1,
            ')' => depth -= 1,
            ':' if depth == 0 => {
                // Named param: "x: type" — take everything after the colon
                return parse_type(&s[i + 1..]);
            }
            _ => {}
        }
    }
    // Unnamed param: the whole thing is a type
    parse_type(s)
}
}

Why track depth when looking for the colon? Because a parameter could be a function type: f: fun(T): U. The first colon (after f) is the name separator. The second colon (after fun(T)) is inside the function type. We only want the first one at depth 0.

Step 4: Commas Need Depth-Tracking Too

Parameter lists are comma-separated: fun(x: number, y: string): boolean. But a parameter can be a function type: fun(f: fun(T): U, x: T): U. The comma between U and x is a parameter separator. But there’s no comma inside fun(T): U — so a naive split(',') would break.

We split on commas at depth 0:

#![allow(unused)]
fn main() {
fn parse_param_types(s: &str) -> Vec<Type> {
    if s.trim().is_empty() {
        return Vec::new(); // fun() — no parameters
    }

    let mut params = Vec::new();
    let mut depth = 0;
    let mut start = 0;

    for (i, c) in s.char_indices() {
        match c {
            '(' => depth += 1,
            ')' => depth -= 1,
            ',' if depth == 0 => {
                let param = s[start..i].trim();
                if !param.is_empty() {
                    params.push(parse_param_type(param));
                }
                start = i + 1;
            }
            _ => {}
        }
    }
    // Last param (after the final comma, or the only param if no commas)
    let last = s[start..].trim();
    if !last.is_empty() {
        params.push(parse_param_type(last));
    }
    params
}
}

This pattern — track depth, split at depth 0 — shows up everywhere in recursive parsing. It’s the same technique you’d use for parsing nested parentheses in any language.

The Result: map Finally Has a Real Callback Type

Before (Chapter 12):

---@param f fun(x: T): U
-- parse_type_name("fun(x: T): U") → Dynamic
-- f's type: Dynamic (useless)

After (this chapter):

---@param f fun(x: T): U
-- parse_type("fun(x: T): U") → Function([Var("T")], Var("U"))
-- f's type: Function([Var("T")], Var("U"))

Now when map is called with a fun(number): string callback, the substitution correctly maps T → Number and U → String. The type checker can verify that the callback’s parameter type matches the array’s element type.

Higher-Order Function Composition

A higher-order function is a function that takes another function as an argument or returns one as a result. map is a higher-order function — it takes a callback. So is compose, which takes two functions and returns a new one. Function types in annotations let us express more than map alone. Here’s composition:

---@generic T, U, V
---@param f fun(x: U): V
---@param g fun(x: T): U
---@return fun(x: T): V
local function compose(f, g)
    return f
end

The return type fun(x: T): V is a function type with type variables. When we call compose with concrete functions:

  • f: fun(string): boolean
  • g: fun(number): string

The substitution maps T → Number, U → String, V → Boolean, and the return type resolves to fun(number): boolean — exactly what you’d expect from composing those two functions.

This is the power of combining generics (Chapter 12) with function types (this chapter): the type checker can reason about higher-order functions end-to-end, not treat every callback as Dynamic by default.

What We’re Simplifying

No table types in function params. You can’t write fun(t: {x: number}): number yet — our parser doesn’t handle inline table types inside function annotations. You’d need to define a ---@class and use the class name instead.

No optional parameters. LuaCATS supports fun(x?: number): string for optional parameters. We don’t parse the ? syntax.

No variadic parameters. fun(...: number) is valid LuaCATS but not handled here.

No overloads. fun(number): string | fun(string): number (multiple signatures for the same function) requires more work — you’d need to parse it as a union of function types and check each branch during inference.

Union splitting doesn’t track depth. The parse_type function splits on | with s.split('|'), which doesn’t respect nesting. This works for number|string and even number|fun(x: number): boolean (no | inside the function type). But number|fun(x: number|string): boolean would split incorrectly on the | inside the function type’s parameter. A depth-aware split — the same pattern we use for commas and colons — would fix this. For the annotations in this tutorial, the naive split is sufficient; if you’re building a production parser, add depth-tracking to the union branch.

These are all solvable with the same recursive parsing approach — each adds more branches to parse_type.

Running

cargo run --bin ch13-function-types

This chapter has 10 test cases covering the parser, annotation extraction, and substitution with function types:

cargo test --bin ch13-function-types

Next: Chapter 14: Recursive Types and Self-Reference — A linked list node that references itself. A tree where each branch is another tree. These types show up constantly in real Lua code — and they break naive type checking because typeof(field) points back to the type we’re still defining. We’ll handle self-reference with named type references — instead of expanding Node forever, we stop at Ref("Node") and compare by name.