Chapter 10: Type Annotations — Teaching the Type Checker What You Know

Our type checker is pretty good at figuring things out on its own. 42 is a Number. "hello" is a String. x * 2 returns Number because * only works on numbers. But it has a blind spot: it doesn’t know what the programmer intended.

Consider:

local function greet(name)
    return "Hello, " .. name
end

greet(42)  -- passes without complaint

Our type checker infers that greet returns String (from the concatenation). Good. But it types name as Dynamic — it has no idea the programmer meant name to be a String. That means greet(42) passes without complaint, even though passing a number to a function that concatenates it is almost certainly a bug.

Type annotations fix this. They let the programmer say “I know something the type checker doesn’t” — and the type checker uses that knowledge to catch more mistakes.

The LuaCATS Convention

Lua doesn’t have type syntax built into the language. If you write local x: string = "hello", that’s a syntax error. But the Lua ecosystem has converged on a convention: type annotations in comments, using the LuaCATS format.

LuaCATS is the annotation format used by Lua Language Server and the editor tooling built on it. It looks like this:

---@type string
local name = "hello"

---@param x number
---@param y number
---@return number
local function add(x, y)
    return x + y
end

Three dashes. An @ directive. A type name. That’s it.

These are comments — Lua ignores them entirely. But a type checker can parse them and use them as input to type checking. The Lua runtime never sees them. They’re purely for tooling.

Annotations as Salsa Inputs

Here’s the key architectural insight: annotations are derived from source text, but they play the same role as inputs for downstream queries. The programmer writes them, they are authoritative, and they change only when the programmer edits the file, at which point every downstream result that depends on them is invalidated.

We don’t model annotations as a separate #[salsa::input]. Instead, they’re extracted from the source text by a tracked function. This is the right choice because annotations are derived from the source — they’re a parsed view of what’s already in the input. Making them a separate input would mean keeping them in sync with the source text manually. By extracting them via a tracked function, Salsa handles invalidation automatically: change the source → annotations re-extract → type checking re-runs.

The subtle point: annotations are semantically inputs (the programmer wrote them, they’re authoritative) but architecturally derived (extracted from source via a tracked function). That’s not a contradiction — it’s a design win. Salsa’s derived queries give us the invalidation tracking for free.

┌─────────────────┐      ┌──────────────────┐      ┌─────────────┐
│  SourceFile     │      │  extract_        │      │  type_check │
│  (input)        │─────►│  annotations     │─────►│  (tracked)  │
│                 │      │  (tracked)       │      │             │
│  text contains  │      │  parses ---@     │      │  uses both  │
│  both code and  │      │  comments into   │      │  inferred & │
│  ---@ comments  │      │  Annotation      │      │  annotated  │
└─────────────────┘      │  values          │      │  types      │
                         └──────────────────┘      └─────────────┘
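
To make the "derived, not input" idea concrete, here is a hand-rolled toy that mimics the invalidation chain without using Salsa at all. Everything in it (ToyDb, set_text, extract_runs) is a hypothetical name invented for this sketch, and Salsa's real dependency tracking is far more fine-grained. The only point is that a value derived from the source text cannot drift out of sync with it.

```rust
/// Toy stand-in for a database holding one source file (not Salsa's API).
struct ToyDb {
    text: String,
    revision: u64,
    /// Cached annotation lines, tagged with the revision they were computed at.
    cached: Option<(u64, Vec<String>)>,
    /// How many times the extraction actually ran.
    extract_runs: u32,
}

impl ToyDb {
    fn new(text: &str) -> Self {
        ToyDb { text: text.to_string(), revision: 0, cached: None, extract_runs: 0 }
    }

    /// Editing the input bumps the revision, which invalidates the cache.
    fn set_text(&mut self, text: &str) {
        self.text = text.to_string();
        self.revision += 1;
    }

    /// Derived value: recomputed only if the input changed since last time.
    fn annotations(&mut self) -> Vec<String> {
        if let Some((rev, cached)) = &self.cached {
            if *rev == self.revision {
                return cached.clone();
            }
        }
        self.extract_runs += 1;
        let found: Vec<String> = self
            .text
            .lines()
            .map(str::trim)
            .filter(|l| l.starts_with("---@"))
            .map(String::from)
            .collect();
        self.cached = Some((self.revision, found.clone()));
        found
    }
}
```

Asking for the annotations twice in a row runs the extraction once. Calling set_text and asking again runs it a second time. There is no separate "annotations input" that someone has to remember to update.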

The Annotation Enum

Before we parse anything, we need a type to represent the parsed result:

/// A parsed type annotation extracted from a LuaCATS comment.
#[derive(Debug, Clone, PartialEq, Eq, Hash, salsa::Update)]
pub enum Annotation {
    /// ---@type <type>
    /// Binds a variable to a specific type.
    /// `name` is empty for position-based matching (annotation on the
    /// line before the variable). A real implementation would use
    /// line numbers instead of this placeholder.
    Type { name: String, ty: Type },

    /// ---@param <name> <type>
    /// Annotates a function parameter.
    /// `func_name` scopes this annotation to the function it belongs to.
    Param { func_name: String, name: String, ty: Type },

    /// ---@return <type>
    /// Annotates a function's return type.
    /// `func_name` scopes this annotation to the function it belongs to.
    Return { func_name: String, ty: Type },
}

Three variants, one per LuaCATS directive. Param and Return both carry func_name so we know which function they belong to — that’s how we avoid mixing up ---@param x number above function add() with ---@param x string above function greet().
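
A small sketch of that scoping in action. The Type enum here is a trimmed stand-in for the checker's real type, the salsa::Update derive is left out so the snippet compiles on its own, and param_type and sample_annotations are hypothetical helpers written only for this illustration.

```rust
/// Stand-in for the checker's `Type` (assumed shape, trimmed for the example).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Type { Number, String, Boolean, Nil, Dynamic }

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Annotation {
    Type { name: String, ty: Type },
    Param { func_name: String, name: String, ty: Type },
    Return { func_name: String, ty: Type },
}

/// Hypothetical helper: the declared type of parameter `param` of function `func`.
pub fn param_type(annotations: &[Annotation], func: &str, param: &str) -> Option<Type> {
    annotations.iter().find_map(|a| match a {
        Annotation::Param { func_name, name, ty } if func_name == func && name == param => {
            Some(ty.clone())
        }
        _ => None,
    })
}

/// What the parser would produce for `---@param x number` / `---@return number`
/// above `add` and `---@param x string` above `greet`.
pub fn sample_annotations() -> Vec<Annotation> {
    vec![
        Annotation::Param { func_name: "add".into(), name: "x".into(), ty: Type::Number },
        Annotation::Return { func_name: "add".into(), ty: Type::Number },
        Annotation::Param { func_name: "greet".into(), name: "x".into(), ty: Type::String },
    ]
}
```

With these sample values, looking up x under add gives Some(Type::Number) and looking up x under greet gives Some(Type::String). The two parameters share a name but never collide.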

Parsing Annotations

The parser is straightforward — scan each line for ---@ prefixes:

#[salsa::tracked]
pub fn extract_annotations(db: &dyn salsa::Database, source: SourceFile) -> Vec<Annotation> {
    let text = source.text(db);
    let mut annotations = Vec::new();

    // We track the most recent ---@param annotations so we can associate
    // them with the function that follows. LuaCATS convention: params
    // and return annotations appear directly above the function.
    let mut pending_params: Vec<(String, Type)> = Vec::new();
    let mut pending_return: Option<Type> = None;

    for line in text.lines() {
        let trimmed = line.trim();

        // ---@type <typename>
        if let Some(rest) = trimmed.strip_prefix("---@type ") {
            let ty = parse_type_name(rest.trim());
            annotations.push(Annotation::Type { name: String::new(), ty });
            continue;
        }

        // ---@param <name> <typename>
        if let Some(rest) = trimmed.strip_prefix("---@param ") {
            let parts: Vec<&str> = rest.splitn(2, ' ').collect();
            if parts.len() == 2 {
                let name = parts[0].to_string();
                let ty = parse_type_name(parts[1].trim());
                pending_params.push((name, ty));
            }
            continue;
        }

        // ---@return <typename>
        if let Some(rest) = trimmed.strip_prefix("---@return ") {
            pending_return = Some(parse_type_name(rest.trim()));
            continue;
        }

        // Non-annotation line: check if it's a function declaration.
        // LuaCATS convention: annotations must appear directly above
        // the thing they annotate. Blank lines are skipped.
        if !trimmed.is_empty() {
            if let Some(func_name) = extract_function_name(trimmed) {
                // Flush pending params/returns, scoped to this function.
                for (name, ty) in pending_params.drain(..) {
                    annotations.push(Annotation::Param {
                        func_name: func_name.clone(),
                        name,
                        ty,
                    });
                }
                if let Some(ty) = pending_return.take() {
                    annotations.push(Annotation::Return {
                        func_name: func_name.clone(),
                        ty,
                    });
                }
            } else {
                // Not a function line. Stray non-annotation lines between
                // annotations and their target break the association.
                if !pending_params.is_empty() || pending_return.is_some() {
                    pending_params.clear();
                    pending_return = None;
                }
            }
        }
    }

    annotations
}

There are three annotation types:

Annotation                    Meaning                            Example
---@type <typename>           Variable has this type             ---@type string
---@param <name> <typename>   Function parameter has this type   ---@param x number
---@return <typename>         Function returns this type         ---@return string

The supported type names are the same ones we’ve been using: number, string, boolean, nil, and any (which maps to Dynamic). This is a simplification — real LuaCATS supports unions (number|string), arrays (string[]), and generic function types. But the parsing logic is the same pattern: match the syntax, produce a Type.

The parse_type_name helper converts a string like "number" into a Type:

fn parse_type_name(name: &str) -> Type {
    match name {
        "number"  => Type::Number,
        "string"  => Type::String,
        "boolean" => Type::Boolean,
        "nil"     => Type::Nil,
        "any"     => Type::Dynamic,
        _         => Type::Dynamic, // unknown type name → Dynamic
    }
}

Unknown type names silently become Dynamic. A real checker would emit “unknown type name” instead — we’ll note that in simplifications below.
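
One possible shape for that diagnostic, as a sketch. The names parse_type_name_checked and parse_type_name_lenient are hypothetical, the Type enum is a trimmed stand-in, and collecting messages into a plain Vec<String> is an assumption made here. It is not how the chapter's checker reports errors.

```rust
/// Stand-in for the checker's `Type` (assumed shape, trimmed for the example).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Type { Number, String, Boolean, Nil, Dynamic }

/// Like `parse_type_name`, but reports unknown names instead of hiding them.
pub fn parse_type_name_checked(name: &str) -> Result<Type, String> {
    match name {
        "number" => Ok(Type::Number),
        "string" => Ok(Type::String),
        "boolean" => Ok(Type::Boolean),
        "nil" => Ok(Type::Nil),
        "any" => Ok(Type::Dynamic),
        other => Err(format!("unknown type name: {}", other)),
    }
}

/// Keeps the chapter's behaviour (fall back to Dynamic) while recording
/// a diagnostic for every name that was not recognised.
pub fn parse_type_name_lenient(name: &str, diagnostics: &mut Vec<String>) -> Type {
    parse_type_name_checked(name).unwrap_or_else(|msg| {
        diagnostics.push(msg);
        Type::Dynamic
    })
}
```

Checking stays gradual, because an unknown name still becomes Dynamic. The difference is that the typo now leaves a trace the programmer can see.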

The extract_function_name helper picks the function name out of a line like local function add(x, y) — a quick pattern match, not a full parse:

fn extract_function_name(line: &str) -> Option<String> {
    // Match "local function name" or "function name"
    let after_fn = line.strip_prefix("local function ")
        .or_else(|| line.strip_prefix("function "))?;
    // The function name is everything before the opening paren.
    after_fn.split('(').next().map(|s| s.trim().to_string())
}

This helper handles the two most common function forms: local function name and function name. For method-style declarations (function Foo:bar()) and dot-separated names (function M.foo()) it returns the full string before the paren (Foo:bar, M.foo) as the function name. The pending params and returns are therefore flushed under that full name rather than cleared by the stray-line rule, and they only take effect if the type checker looks the function up under exactly the same string. A production checker would parse the dot-separated and method portions separately.
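
The behaviour on qualified names is easy to check directly. The sketch below repeats extract_function_name unchanged and adds split_qualified, a hypothetical helper (not part of the chapter's code) showing one way a checker might separate the qualifier from the simple name.

```rust
fn extract_function_name(line: &str) -> Option<String> {
    // Match "local function name" or "function name"
    let after_fn = line.strip_prefix("local function ")
        .or_else(|| line.strip_prefix("function "))?;
    // The function name is everything before the opening paren.
    after_fn.split('(').next().map(|s| s.trim().to_string())
}

/// Hypothetical refinement: split "M.foo" or "Foo:bar" into
/// (qualifier, simple name). Unqualified names have no qualifier.
fn split_qualified(name: &str) -> (Option<&str>, &str) {
    match name.rfind(|c: char| c == '.' || c == ':') {
        // '.' and ':' are single-byte, so slicing at i and i + 1 is safe.
        Some(i) => (Some(&name[..i]), &name[i + 1..]),
        None => (None, name),
    }
}
```

For function M.foo() the helper returns Some("M.foo"), and split_qualified("M.foo") then gives (Some("M"), "foo"). A line such as local x = 1 yields None.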

The Position Rule

LuaCATS has a rule: annotations must appear on the line immediately above the thing they annotate. This is how the type checker knows which function a ---@param belongs to:

---@param x number    ← belongs to the function below
---@return number     ← belongs to the function below
local function double(x)
    return x * 2
end

Our parser implements this by tracking “pending” params and returns, then flushing them when it sees a function declaration on a non-annotation line. Each flushed annotation records the function name it belongs to, which prevents a ---@param x number above function add() from being mistakenly applied to the x parameter of function greet(). When a non-blank line that is neither an annotation nor a function declaration appears while annotations are pending, the pending state is cleared: LuaCATS annotations must be directly above their target. Blank lines are skipped and do not clear anything.
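
Here is a condensed, Salsa-free version of the same pending/flush/clear logic, run against a small sample. It is a sketch for experimentation: the function is called extract to mark it as a stand-in, ---@type handling is left out, and the Type enum is trimmed to the names this chapter supports.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum Type { Number, String, Boolean, Nil, Dynamic }

#[derive(Debug, Clone, PartialEq, Eq)]
enum Annotation {
    Param { func_name: String, name: String, ty: Type },
    Return { func_name: String, ty: Type },
}

fn parse_type_name(name: &str) -> Type {
    match name {
        "number" => Type::Number,
        "string" => Type::String,
        "boolean" => Type::Boolean,
        "nil" => Type::Nil,
        _ => Type::Dynamic,
    }
}

fn extract_function_name(line: &str) -> Option<String> {
    let after_fn = line.strip_prefix("local function ")
        .or_else(|| line.strip_prefix("function "))?;
    after_fn.split('(').next().map(|s| s.trim().to_string())
}

/// Stand-in for `extract_annotations`: same association rules, plain `&str` input.
fn extract(text: &str) -> Vec<Annotation> {
    let mut out = Vec::new();
    let mut pending_params: Vec<(String, Type)> = Vec::new();
    let mut pending_return: Option<Type> = None;
    for line in text.lines() {
        let trimmed = line.trim();
        if let Some(rest) = trimmed.strip_prefix("---@param ") {
            if let Some((name, ty)) = rest.split_once(' ') {
                pending_params.push((name.to_string(), parse_type_name(ty.trim())));
            }
        } else if let Some(rest) = trimmed.strip_prefix("---@return ") {
            pending_return = Some(parse_type_name(rest.trim()));
        } else if let Some(func_name) = extract_function_name(trimmed) {
            // Function line: flush everything pending, scoped to this function.
            for (name, ty) in pending_params.drain(..) {
                out.push(Annotation::Param { func_name: func_name.clone(), name, ty });
            }
            if let Some(ty) = pending_return.take() {
                out.push(Annotation::Return { func_name: func_name.clone(), ty });
            }
        } else if !trimmed.is_empty() {
            // Any other non-blank line breaks the association.
            pending_params.clear();
            pending_return = None;
        }
    }
    out
}

/// `double` is annotated correctly; a stray line separates `shout` from its annotation.
const SAMPLE: &str = "\
---@param x number
---@return number
local function double(x)
    return x * 2
end

---@param s string
print(\"stray line\")
local function shout(s)
    return s
end
";
```

Running extract(SAMPLE) produces exactly two annotations, both scoped to double. The ---@param s string above the print call is dropped, so shout stays unannotated.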

How Annotations Change Type Checking

Before this chapter, check_stmt for a Function always typed parameters as Dynamic. Now it asks: did the programmer annotate this parameter? If yes, use their annotation. If no, fall back to Dynamic.

First, we call extract_annotations to get the annotations for this source file:

// Salsa memoizes this call: the file is scanned once per revision of the
// source, no matter how many times the result is requested.
let annotations = extract_annotations(db, source);

Then, when we encounter a function declaration, we look for matching ---@param annotations:

Statement::Function { name, params, body, .. } => {
    let mut fe = env.clone();
    let mut param_types = Vec::new();
    for p in params {
        // Use the programmer's annotation for this parameter if there is one.
        let annotated = annotations.iter().find_map(|a| match a {
            Annotation::Param { func_name, name: n, ty }
                if func_name == name && n == p => Some(ty.clone()),
            _ => None,
        });
        // No annotation: fall back to Dynamic.
        let pt = annotated.unwrap_or(Type::Dynamic);
        param_types.push(pt.clone());
        fe = fe.extend(p.clone(), pt);
    }
    // ...
}

The annotation flows into the type environment before inference runs on the function body, so inference can use the annotated types rather than fight them. If the programmer declares ---@param x number and the body does x .. " items" (string concatenation on a number), the type checker catches it because the environment says x is a Number, not Dynamic.

Consuming ---@type Annotations

---@param and ---@return are easy — they carry a func_name so they only match one function. ---@type is harder. It sits above a variable assignment, and our parser stores it with an empty name field. So when check_stmt processes an assignment, how does it know which ---@type to apply?

The answer: consume it after use. The first variable that asks for a ---@type annotation gets it, and the annotation is removed from the list so it’s never applied again:

fn apply_annotation(annotations: &mut Vec<Annotation>, name: &str, inferred: Type) -> Type {
    // Look for a @type annotation. The `n == name` branch is for a future
    // named @type feature (e.g. ---@type string myVar). The parser currently
    // always sets `name` to empty, so this branch never matches; it's here so
    // the code is ready when the parser is extended.
    //
    // Empty-name @type is position-based: apply to the next variable
    // that asks, then consume it.
    let mut found_idx = None;
    let mut found_ty = None;
    for (i, a) in annotations.iter().enumerate() {
        if let Annotation::Type { name: n, ty } = a {
            if n == name || n.is_empty() {
                if !inferred.is_compatible_with(ty)
                    && !matches!(inferred, Type::Dynamic | Type::Error)
                {
                    // Mismatch between declared and inferred.
                    // The annotation wins, but a production checker
                    // would emit a warning here.
                }
                found_idx = Some(i);
                found_ty = Some(ty.clone());
                break;
            }
        }
    }
    if let (Some(idx), Some(ty)) = (found_idx, found_ty) {
        // Consume the annotation so it's not applied again.
        annotations.remove(idx);
        ty
    } else {
        inferred
    }
}

The key line is annotations.remove(idx) — once a ---@type annotation is used, it’s gone. Without this, a single ---@type string would apply to every variable in scope, which is wrong. The annotation goes with the variable directly below it.

The assignment arm of check_stmt calls this for each target:

Statement::Assignment { local: _, targets, values } => {
    let mut e = env.clone();
    for (t, v) in targets.iter().zip(values.iter()) {
        let mut vt = infer_type(db, source, v.clone(), e.clone());
        if let Expression::Name(n) = t {
            vt = apply_annotation(annotations, n, vt);
            e = e.extend(n.clone(), vt);
        }
    }
    StmtResult { env: e, return_type: None }
}

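This is an approximation. Real LuaCATS matches ---@type to the variable on the next line by line number. Our consume-on-read pattern works for the common case (one annotation, one variable, right below it), but it pays no attention to where the annotation sits. ---@type annotations are pushed straight into the list and never held in the pending state, so the first assignment that check_stmt processes consumes the first remaining annotation. If an unannotated assignment comes before the annotated one, it takes the annotation meant for the later variable. Line number tracking would fix this, and it’s the right thing for a production checker.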
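
A sketch of what line-number matching could look like. It assumes that each assignment in the AST knows its own 1-based source line, which the chapter's AST is not shown to provide. TypeAnnotation, extract_type_annotations, and annotation_for are hypothetical names for this illustration.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum Type { Number, String, Dynamic }

/// Hypothetical `---@type` record that remembers where it was written.
#[derive(Debug, Clone, PartialEq, Eq)]
struct TypeAnnotation { line: usize, ty: Type }

/// Collect `---@type` annotations together with their 1-based line numbers.
fn extract_type_annotations(text: &str) -> Vec<TypeAnnotation> {
    text.lines()
        .enumerate()
        .filter_map(|(i, line)| {
            let rest = line.trim().strip_prefix("---@type ")?;
            let ty = match rest.trim() {
                "number" => Type::Number,
                "string" => Type::String,
                _ => Type::Dynamic,
            };
            Some(TypeAnnotation { line: i + 1, ty })
        })
        .collect()
}

/// An annotation applies to a declaration only if it sits on the line
/// directly above it. Nothing is consumed, so lookup order does not matter.
fn annotation_for(annotations: &[TypeAnnotation], decl_line: usize) -> Option<Type> {
    annotations
        .iter()
        .find(|a| a.line + 1 == decl_line)
        .map(|a| a.ty.clone())
}
```

With an unannotated local a = 1 on line 1, ---@type string on line 2, and local b = "x" on line 3, only the declaration on line 3 receives the annotation. The earlier assignment cannot take it.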

There’s a subtlety in how we call apply_annotation. The annotations vector must be extracted once and shared across all statements in the file. If each check_stmt call creates its own copy, the consume-on-read is a no-op, because the removal happens on a local copy that is then discarded. Here’s how the top-level check function should look:

#[salsa::tracked]
pub fn type_check(db: &dyn salsa::Database, source: SourceFile) -> TypeCheckResult {
    let ast = parse(db, source);
    // Extract once, then pass a *mutable* reference down so that
    // consume-on-read persists across all statements in the file.
    let mut annotations = extract_annotations(db, source);
    let mut result_env = TypeEnv::empty();
    for stmt in ast.statements(db) {
        let sr = check_stmt(db, source, stmt, result_env, &mut annotations);
        result_env = sr.env;
    }
    TypeCheckResult::new(db, source, result_env)
}

Now check_stmt takes &mut Vec<Annotation> instead of creating its own copy, and the annotations.remove(idx) in apply_annotation actually removes the annotation for subsequent calls.
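
The difference between sharing and copying is easy to demonstrate in isolation. This sketch reduces apply_annotation to its consume-on-read core, with a plain list of pending types and no names. check_shared and check_with_copies are hypothetical helpers that each type two assignments in a row.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum Type { Number, String, Dynamic }

/// Simplified consume-on-read: take the first remaining `---@type`
/// annotation if there is one, otherwise keep the inferred type.
fn apply_annotation(pending: &mut Vec<Type>, inferred: Type) -> Type {
    if pending.is_empty() { inferred } else { pending.remove(0) }
}

/// Both statements share one list, so the annotation is used exactly once.
fn check_shared(mut pending: Vec<Type>) -> (Type, Type) {
    let first = apply_annotation(&mut pending, Type::Dynamic);
    let second = apply_annotation(&mut pending, Type::Number);
    (first, second)
}

/// Each statement works on its own copy, so nothing is ever consumed
/// and the same annotation is applied to both variables.
fn check_with_copies(pending: Vec<Type>) -> (Type, Type) {
    let first = apply_annotation(&mut pending.clone(), Type::Dynamic);
    let second = apply_annotation(&mut pending.clone(), Type::Number);
    (first, second)
}
```

Starting from a single pending String annotation, check_shared returns (String, Number), while check_with_copies returns (String, String). The second variable wrongly inherits the annotation.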

Return Type: Annotation vs Inference

Return types have a subtlety. The programmer might annotate a return type that disagrees with what the body actually returns:

---@return number
local function oops()
    return "not a number"  -- declared number, returned string
end

Our type checker handles this by:

  1. Inferring the return type from the body (as before)
  2. If there’s a ---@return annotation, checking that the inferred type is compatible
  3. Using the annotated type for the function’s external signature
  4. Emitting a diagnostic if they disagree

The annotation wins for the signature — if you declare ---@return number, callers see number regardless of what the body does. But the mismatch is flagged, because it’s almost certainly a bug.

This is the same principle as TypeScript: declaration overrides inference, but contradictions are reported.
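
The four steps can be sketched as a single function. This is an illustration under stated assumptions: resolve_return and is_compatible are hypothetical names, the Type enum is a trimmed stand-in, and the compatibility rule (Dynamic matches anything, otherwise types must be equal) is assumed from how this chapter describes Dynamic.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum Type { Number, String, Dynamic }

/// Assumed compatibility rule: Dynamic is compatible with anything,
/// otherwise the two types must be equal.
fn is_compatible(actual: &Type, expected: &Type) -> bool {
    matches!(actual, Type::Dynamic) || matches!(expected, Type::Dynamic) || actual == expected
}

/// Returns the type callers see, plus a diagnostic if the body disagrees
/// with the declaration. `inferred` is the type computed from the body.
fn resolve_return(func: &str, annotated: Option<Type>, inferred: Type) -> (Type, Option<String>) {
    match annotated {
        // No annotation: the inferred type is the signature.
        None => (inferred, None),
        // Annotation present: it wins, but a mismatch is reported.
        Some(declared) => {
            let diagnostic = if is_compatible(&inferred, &declared) {
                None
            } else {
                Some(format!(
                    "function `{}` is declared to return {:?} but its body returns {:?}",
                    func, declared, inferred
                ))
            };
            (declared, diagnostic)
        }
    }
}
```

For the oops example, resolve_return("oops", Some(Type::Number), Type::String) yields Number as the external signature together with a diagnostic. Without an annotation, the inferred type passes through untouched.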

The Gradual Guarantee

The most important property of a gradual type system: unannotated code still works. No annotation → Dynamic → no type errors. You can annotate one function in a thousand-file project and get value from that one annotation.

-- No annotations — all Dynamic, no errors
local function greet(name)
    return "Hello, " .. name
end
greet(42)  -- no error (Dynamic is compatible with everything)

-- With annotations — type checking activates
---@param name string
local function greet(name)
    return "Hello, " .. name
end
greet(42)  -- ERROR: argument 1 has type Number but parameter expects String

This is why we don’t default unannotated parameters to some bottom type or require annotations everywhere. Dynamic is the escape hatch. It’s not a failure of the type checker — it’s the whole point of gradual typing.
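
A minimal sketch of how the two greet calls end up with different outcomes. check_argument is a hypothetical helper, the Type enum is a trimmed stand-in, and the compatibility rule is assumed to be "Dynamic matches anything, otherwise types must be equal".

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum Type { Number, String, Dynamic }

/// Check one call argument against the parameter's declared type.
/// `declared` is `None` when the parameter has no `---@param` annotation.
fn check_argument(index: usize, arg: &Type, declared: Option<&Type>) -> Result<(), String> {
    // No annotation means the parameter is Dynamic.
    let param = declared.cloned().unwrap_or(Type::Dynamic);
    if matches!(arg, Type::Dynamic) || matches!(param, Type::Dynamic) || *arg == param {
        Ok(())
    } else {
        Err(format!(
            "argument {} has type {:?} but parameter expects {:?}",
            index, arg, param
        ))
    }
}
```

Passing a Number to an unannotated parameter returns Ok(()). Passing the same Number to a parameter declared as String returns the error shown in the Lua example above.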

What We’re Simplifying

No complex type parsing. Real LuaCATS supports unions (number|string), arrays (string[]), optional types (string?), and generic function types — our parser only handles simple type names. The extension point is clear: make parse_type_name recursive.
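
As a rough idea of what "recursive" could mean here, the sketch below handles unions, arrays, and optional types. The Array and Union variants are hypothetical additions to Type, parse_type is a made-up name, and the grammar is deliberately naive: no parentheses, no function types, no generics.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum Type {
    Number,
    String,
    Boolean,
    Nil,
    Dynamic,
    Array(Box<Type>), // hypothetical variant for `T[]`
    Union(Vec<Type>), // hypothetical variant for `A|B`
}

/// Naive recursive parser for simple names, `A|B`, `T[]`, and `T?`.
fn parse_type(src: &str) -> Type {
    let src = src.trim();
    // Unions bind loosest: split on `|` first and parse each alternative.
    if src.contains('|') {
        return Type::Union(src.split('|').map(parse_type).collect());
    }
    // `T?` is shorthand for `T|nil`.
    if let Some(inner) = src.strip_suffix('?') {
        return Type::Union(vec![parse_type(inner), Type::Nil]);
    }
    // `T[]` is an array whose elements have type T.
    if let Some(inner) = src.strip_suffix("[]") {
        return Type::Array(Box::new(parse_type(inner)));
    }
    match src {
        "number" => Type::Number,
        "string" => Type::String,
        "boolean" => Type::Boolean,
        "nil" => Type::Nil,
        // `any` and unknown names both fall back to Dynamic, as in the chapter.
        _ => Type::Dynamic,
    }
}
```

Each suffix or separator peels off one layer and recurses on what remains, so number[]|nil parses as a union of an array of Number and Nil.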

No line number tracking for ---@type. Our annotation carries a name field that’s always empty, a placeholder for a future named-annotation feature the parser doesn’t yet support. In LuaCATS, ---@type is position-based (the annotation goes above the assignment), so the type checker matches it to the next variable declaration by line number. We approximate this with a consume-on-read pattern: the first variable that asks for an annotation gets it, and the annotation is removed so it’s not reused. This only works when annotations are extracted once and shared across all statements: the type_check function holds the mutable vector and passes it down. A real implementation would track line numbers for precise matching, which also handles edge cases such as an unannotated assignment that appears before the annotated one.

No ---@field or ---@class. LuaCATS can declare table shapes with a ---@class Point line followed by ---@field x number and ---@field y number lines, which would connect naturally to our Type::Table variant. The pattern is the same: parse the annotation, produce a Type::Table, store it in the environment.

Unknown type names return Dynamic silently. If you write ---@param x nubmer (a typo for “number”), our parser returns Dynamic without warning. A real checker would emit “unknown type name: nubmer.”

Next: Chapter 11: Union Types and Table Classes — Our type checker knows about Number, String, Boolean, and Dynamic. But real Lua code doesn’t always fit into single boxes. We’ll add union types (number|string) and named table classes (---@class Point).