Building a Gradual Type Checker for Lua
A hands-on tutorial that teaches Salsa (the incremental computation framework) by building a gradual type checker for Lua, powered by the analisar parser and lex_lua lexer.
Who Is This For?
- Rust developers curious about Salsa and incremental computation
- People who want to understand how rust-analyzer's query system works under the hood
- Anyone interested in type checking, gradual typing, or Lua tooling
You should be comfortable with Rust. No prior Salsa or type-theory experience needed — we build from zero.
The Stack
| Crate | Role | Version |
|---|---|---|
| `lex_lua` | Lua lexer — tokens, spans, comments | 0.2 |
| `analisar` | Lua parser — AST from tokens | 0.4 |
| `salsa` | Incremental query framework | 0.26 |
Chapters
1. **Hello Salsa: Inputs and Tracked Functions** — Your first Salsa database, input structs, `#[salsa::tracked]`, and the revision model.
2. **Parsing Lua with Analisar** — Wire analisar into Salsa. Source text → AST, incrementally.
3. **Interned Symbols and Name Resolution** — `#[salsa::interned]` for variable/function names. Building a symbol table.
4. **Tracked Structs: The Typed AST** — `#[salsa::tracked]` on structs. Converting analisar's AST into a Salsa-aware typed IR.
5. **Type Inference: The Core Query** — The big one. A `#[salsa::tracked]` function that walks the AST and assigns types. Incrementality in action.
6. **Diagnostics as Accumulators** — `#[salsa::accumulator]` for errors and warnings without poisoning the query graph.
7. **Putting It Together: The Language Server** — Wire everything into a simple LSP. Watch Salsa skip work on keystrokes.
Running the Code
Each chapter has a `src/` directory with runnable code. From the repo root:

```sh
cargo run --bin ch01
cargo run --bin ch02
# etc.
```
Why Salsa?
Salsa solves a real problem: when you're building a tool that re-runs on every keystroke (IDE, linter, type checker), you can't afford to recompute everything. Salsa gives you:
- Automatic caching — tracked functions memoize their results
- Incremental invalidation — when an input changes, only affected queries re-run
- Cycle detection — recursive queries that would loop forever are caught
- Accumulators — side-channels for diagnostics that don't break the query graph
You'll learn all of these by building something real, not by reading docs in a vacuum.
License
MIT
Chapter 1: Hello Salsa — Inputs and Tracked Functions
This is your first Salsa program. No Lua yet — just the core concepts.
What You'll Learn
- What a Salsa database is and why you need one
- How `#[salsa::input]` defines the roots of your computation
- How `#[salsa::tracked]` defines derived (cached) queries
- What revisions are and how Salsa decides what to re-run
- Why this matters for a type checker
The Mental Model
Think of Salsa as a spreadsheet:
- Inputs are the cells you type into (source code, config)
- Tracked functions are the formulas (`=A1+B1`)
- Revisions are like "after you edit a cell"
When you change an input, only the formulas that depend on it re-evaluate. The rest return their cached values instantly.
But there's a difference. A spreadsheet tracks dependencies at the cell level. Salsa tracks at the query level. A tracked function that reads source.text(db) is recorded as depending on that specific input. If the input changes in a new revision, the function re-runs. If it doesn't change, the cached result comes back for free.
Why does this matter? Because in a real type checker, you might have hundreds of source files and thousands of queries. You can't recompute everything on every keystroke. Salsa gives you the infrastructure to skip work — automatically.
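The revision model described above can be sketched in plain Rust. This toy `Db` is my own stand-in, not Salsa's actual implementation: a memo stores the revision it was computed in, and the query body only re-runs when the input changed since then.

```rust
// A toy model of revisions and memoization — a sketch, not Salsa's internals.
struct Input {
    text: String,
    changed_at: u64, // revision in which this input was last set
}

struct Db {
    revision: u64,                       // the "internal clock"
    source: Input,
    line_count_memo: Option<(u64, u32)>, // (revision computed at, value)
    executions: u32,                     // counts real runs of the query body
}

impl Db {
    fn set_text(&mut self, text: String) {
        self.revision += 1; // every input mutation bumps the clock
        self.source = Input { text, changed_at: self.revision };
    }

    fn line_count(&mut self) -> u32 {
        if let Some((computed_at, value)) = self.line_count_memo {
            if self.source.changed_at <= computed_at {
                return value; // input unchanged since the memo: cache hit
            }
        }
        self.executions += 1; // cache miss: actually run the "query body"
        let value = self.source.text.lines().count() as u32;
        self.line_count_memo = Some((self.revision, value));
        value
    }
}
```

Calling `line_count` twice without a `set_text` in between runs the body only once; a `set_text` invalidates the memo and forces a re-run. Salsa automates exactly this bookkeeping, for every query and every input.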
Step 1: Define Your Input
In Salsa, inputs are the facts that come from outside the system. For a type checker, the obvious input is: the source code.
```rust
#[salsa::input]
pub struct SourceFile {
    #[returns(ref)]
    pub path: String,
    #[returns(ref)]
    pub text: String,
}
```
#[salsa::input] does a lot of work behind the scenes. It generates:
- A constructor: `SourceFile::new(&db, path, text)`
- Getter methods: `source.path(db)`, `source.text(db)`
- Setter methods: `source.set_text(&mut db).to(new_text)`
The #[returns(ref)] attribute tells Salsa that the getter should return a reference (&String) instead of an owned value. This avoids cloning strings every time you read an input.
Key idea: Inputs are the only way data enters the system. Everything else is derived from them. This is what makes Salsa's incrementality work — if you can trace every piece of data back to an input, you know exactly what needs to re-run when an input changes.
Step 2: Define a Tracked Function
A #[salsa::tracked] function is a pure function of its inputs. Pure means: no side effects, no reading from global state, no randomness. Everything goes through the database.
```rust
#[salsa::tracked]
pub fn line_count(db: &dyn salsa::Database, source: SourceFile) -> u32 {
    let text = source.text(db);
    text.lines().count() as u32
}
```
The first argument is always the database (we'll refine this in later chapters). The remaining arguments are the "keys" — what distinguishes one call from another.
When you call line_count(&db, source), Salsa does this:
- Check: have I seen this query before, with these arguments, in this revision?
- If yes → return the cached result. No re-execution.
- If no → run the function, cache the result, return it.
The function body calls source.text(db). This read is tracked. Salsa records: "line_count(source) depends on source.text." Later, if you change the source text, Salsa knows this cache entry is stale.
Let's add one more query to see how dependencies work:
```rust
#[salsa::tracked]
pub fn contains_text(db: &dyn salsa::Database, source: SourceFile, needle: String) -> bool {
    let text = source.text(db);
    text.contains(&needle)
}
```
Same pattern: read an input, compute a result, cache it. The needle parameter becomes part of the cache key — contains_text(db, source, "print") and contains_text(db, source, "local") are independent cache entries.
Step 3: Define Your Database
A database is the container that holds all the cached query results.
```rust
#[salsa::db]
#[derive(Default)]
pub struct Database {
    storage: salsa::Storage<Self>,
}

#[salsa::db]
impl salsa::Database for Database {}
```
You need three things:
- A `salsa::Storage<Self>` field — this is where Salsa keeps its memo tables, revision counters, and dependency graphs.
- An impl of `salsa::Database`, marked with `#[salsa::db]`.
- The same `#[salsa::db]` attribute on the struct itself.
Right now our database is empty — it doesn't have any custom behavior. In later chapters, we'll add methods and custom traits. For now, it's just a container.
Step 4: Use It
```rust
fn main() {
    let mut db = Database::default();
    let source = SourceFile::new(
        &db,
        "main.lua".to_string(),
        "local x = 1\nlocal y = 2\nprint(x + y)\n".to_string(),
    );
    let count = line_count(&db, source);
    assert_eq!(count, 3);
}
```
Nothing magical yet — we create a database, create an input, and query it. The result is computed and cached.
The Magic: Revisions
Now let's change the input and see what happens:
```rust
source.set_text(&mut db).to("local z = 99\n".to_string());
let new_count = line_count(&db, source);
assert_eq!(new_count, 1);
```
When we call set_text, Salsa increments its revision counter. This is Salsa's internal clock — every input mutation bumps the revision. Each cached query remembers which revision it was computed in.
When we query line_count again, Salsa checks: "Is the current revision newer than when I last computed this?" Yes. "Did the inputs this query depends on actually change?" Yes — we set new text. So it re-runs the function.
If we had queried line_count without changing the text, Salsa would return the cached result instantly. No re-execution.
Per-Input Isolation
Here's the key insight that makes Salsa work at IDE scale:
```rust
let other = SourceFile::new(&db, "other.lua".to_string(), "return 42\n".to_string());
let other_count = line_count(&db, other); // computed, cached

source.set_text(&mut db).to("local a = 1\nlocal b = 2\n".to_string());

let other_count_again = line_count(&db, other); // cached! No re-run!
assert_eq!(other_count_again, 1);
```
We changed source, not other. Salsa knows line_count(other) doesn't depend on source's text. So it returns the cached value for other without re-running anything.
In a real type checker with hundreds of files, typing in one file only invalidates queries that read that file. Queries for other files are still cached. This is why rust-analyzer can respond in milliseconds even on large projects — it's not re-type-checking the whole world on every keystroke.
Running
```sh
cargo run --bin ch01-hello-salsa
```
Key Takeaways
- **Inputs are the source of truth.** They come from outside (files, user input). Setting an input creates a new revision.
- **Tracked functions are pure.** No side effects, no reading from global state. Everything goes through the database.
- **Incrementality is automatic.** You don't write cache invalidation logic. Salsa does it by tracking which inputs each query reads.
- **Per-input isolation.** Changing one file doesn't invalidate queries for other files. This is the key insight that makes IDE-scale projects feasible.
- **The dependency graph is implicit.** You don't declare "query A depends on query B." Salsa infers it from what you read. This means the graph is always correct — it can't get out of sync with the code.
What's Next
Chapter 2: Parsing Lua with Analisar — We'll parse actual Lua source code using the analisar parser and wire it into Salsa as a tracked query. Same incremental model, but now doing real work.
Chapter 2: Parsing Lua with Analisar
Wire a real Lua parser into Salsa. Source text → AST, incrementally.
What You'll Learn
- How to make parsing a tracked query (so it only re-runs when source changes)
- How analisar's AST is structured
- Why parsing needs to be separate from type-checking (incrementality!)
- The difference between Salsa-owned data and borrowed data in tracked functions
Why Separate Parsing from Type-Checking?
Here's a question you might be asking: why not just type-check directly from the source text? Why have a parsing step at all?
Two reasons:
1. **Parsing is expensive, and type-checking is more expensive.** If you mash them together, you lose the ability to cache them independently. A change that only affects one function's body shouldn't require re-parsing the entire file — and it shouldn't require re-type-checking unrelated functions either.
2. **Different invalidation granularity.** The parser runs on the whole source text; the type checker runs on individual expressions. By separating them, Salsa can skip the type checker when the source changes but the parse result stays the same — think comment or whitespace edits that don't affect the AST structure.
The pipeline so far:
```text
SourceFile (input)
       │
       ▼
    parse()   ← tracked, cached
       │
       ▼
    LuaAst    ← owned data
       │
       ▼
derived queries (type checking, name resolution, etc.)
```
Step 1: The Same Input
We reuse the SourceFile input from Chapter 1:
```rust
#[salsa::input]
pub struct SourceFile {
    #[returns(ref)]
    pub path: String,
    #[returns(ref)]
    pub text: String,
}
```
Nothing new here. The source text is our input, and Salsa tracks when it changes.
Step 2: An Owned AST
Analisar's AST uses borrowed strings (Cow<'a, str> for identifiers). This makes sense for analisar — it avoids allocating copies. But Salsa tracked functions need to return owned data. Why? Because the cached result has to survive across revisions. If the result borrows from the source text, and the source text changes, you've got a dangling reference.
So we define our own owned AST types:
```rust
#[derive(Debug, Clone, PartialEq, Eq, Hash, salsa::Update)]
pub struct LuaAst {
    pub statements: Vec<Statement>,
}

#[derive(Debug, Clone, PartialEq, Eq, Hash, salsa::Update)]
pub enum Statement {
    Assignment {
        local: bool,
        targets: Vec<Expression>,
        values: Vec<Expression>,
    },
    Expression(Expression),
    Return(Vec<Expression>),
    Function {
        local: bool,
        name: String,
        params: Vec<String>,
        body: Vec<Statement>,
    },
    // ...
}
```
Notice the salsa::Update derive. This is required for any type used as a tracked function's return value. It tells Salsa how to update the cached value when needed (for structural sharing in advanced use cases — for now, just know it's required).
But wait — isn't cloning expensive? Yes, for large files. That's why in Chapter 3 we'll use #[salsa::interned] for strings, and in Chapter 4 we'll use #[salsa::tracked] for AST nodes. Both avoid cloning. For now, we keep it simple with owned Strings.
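The borrowed-to-owned conversion at the boundary can be sketched with stand-in types. `BorrowedName` below is a hypothetical miniature of analisar's borrowed AST, not its actual API; the point is only that `Cow::into_owned` detaches the data from the source buffer.

```rust
use std::borrow::Cow;

// A hypothetical borrowed node, standing in for analisar's AST types.
struct BorrowedName<'a> {
    text: Cow<'a, str>,
}

// Our owned, Salsa-compatible counterpart.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct OwnedName {
    text: String,
}

fn to_owned_name(n: &BorrowedName<'_>) -> OwnedName {
    // into_owned() allocates if the Cow was borrowed, so the result no
    // longer depends on the lifetime of the source text.
    OwnedName { text: n.text.clone().into_owned() }
}
```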
Step 3: The Parse Query
```rust
#[salsa::tracked]
pub fn parse(db: &dyn salsa::Database, source: SourceFile) -> LuaAst {
    let text = source.text(db);
    convert_ast(text)
}
```
This is the key tracked function. It takes a SourceFile (input) and returns a LuaAst (owned data). Salsa caches this — if the source text hasn't changed since the last revision, the cached AST is returned without re-parsing.
The convert_ast function is where we bridge analisar's borrowed world into our owned world. We walk analisar's AST node by node and build our own representation. It's mechanical — mapping one enum variant to another — but it's important to understand why we do it:
- analisar gives us a correct but borrowed AST
- Our AST is owned, hashable, and Salsa-compatible
- The conversion is the boundary between "external library" and "Salsa system"
This boundary pattern — converting external data into Salsa-owned data at the query boundary — is common in Salsa projects. rust-analyzer does the same thing with the Rust parser.
Step 4: Query Chaining
Now that we have parse(), we can build derived queries on top of it:
```rust
#[salsa::tracked]
pub fn top_level_names(db: &dyn salsa::Database, source: SourceFile) -> Vec<String> {
    let ast = parse(db, source); // depends on parse
    let mut names = Vec::new();
    for stmt in &ast.statements {
        match stmt {
            Statement::Assignment { local: true, targets, .. } => {
                for target in targets {
                    if let Expression::Name(name) = target {
                        names.push(name.clone());
                    }
                }
            }
            Statement::Function { local: true, name, .. } => {
                names.push(name.clone());
            }
            _ => {}
        }
    }
    names
}
```
top_level_names calls parse(db, source), so it depends on the parse result. If the source changes:
1. `parse()` re-runs (its input changed)
2. `top_level_names()` re-runs (its dependency — `parse` — produced a new result)
But here's a subtlety: if parse() returns a different LuaAst (the source changed in a way that affects the AST), top_level_names() re-runs. If parse() returns the same LuaAst (e.g., you changed a comment that analisar doesn't include in the AST), top_level_names() returns its cached value.
Salsa checks the value of dependencies, not just whether they were re-executed: it re-runs `parse()`, compares the result to the old cached value, and only propagates the change if the value is actually different (Salsa calls this mechanism backdating).
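This value-comparison step can be sketched in a few lines of plain Rust (my own toy model, not Salsa's internals): when a re-run produces an equal value, the memo keeps its old "changed at" stamp, so downstream queries still see their dependency as unchanged.

```rust
// A memoized result stamped with the revision in which it last *changed*.
struct Memo<T> {
    value: T,
    changed_at: u64,
}

// Returns true if the value actually changed and downstream must re-run.
fn update<T: PartialEq>(memo: &mut Memo<T>, new_value: T, current_rev: u64) -> bool {
    if memo.value == new_value {
        // Same value: keep the old changed_at stamp ("backdating").
        // Downstream caches that depend on this memo remain valid.
        false
    } else {
        memo.value = new_value;
        memo.changed_at = current_rev;
        true
    }
}
```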
Running
```sh
cargo run --bin ch02-parsing-lua
```
Key Takeaways
- **Parsing is a query.** By making `parse()` a tracked function, we get caching for free. This is the first step in any Salsa pipeline: raw input → parsed representation.
- **Owned data at the boundary.** External libraries often return borrowed data. Convert to owned at the Salsa boundary. Later we'll use interning to avoid the cloning cost.
- **Query chaining.** `top_level_names` depends on `parse`. Changing the source triggers both. But if `parse` returns the same value, downstream queries are skipped.
- **The pipeline is already incremental.** We have two layers of caching (parse + derived). Each layer only re-runs when its inputs change. In a real type checker, you'd have many more layers.
What's Next
Chapter 3: Interned Symbols — We'll use #[salsa::interned] for identifiers, which lets us compare names by ID instead of by string comparison. This is how compilers handle identifiers efficiently.
Chapter 3: Interned Symbols and Name Resolution
Use `#[salsa::interned]` to deduplicate identifiers and enable fast ID comparison.
What You'll Learn
- What interning is and why it matters for identifiers
- How `#[salsa::interned]` works vs `#[salsa::input]`
- How to build a symbol table (name → definition) as a tracked query
- Why interning enables fast ID comparison for names
The Problem: String Comparison Is Expensive
In Chapter 2, our AST used String for every identifier. When the type checker looks up a variable name, it compares strings byte-by-byte. In a real codebase with thousands of name lookups, that adds up.
Compilers solve this with interning: allocate each unique string once, and refer to it by ID. The first time you see "print", you store it and get back ID 42. Every subsequent "print" maps to the same ID 42. Now name comparison is a single integer comparison.
This isn't a new idea — Lisp systems have done it since the 1960s. Salsa gives you a first-class way to do it.
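The core idea can be sketched in a few lines of plain Rust. This is a minimal interner of my own, not Salsa's implementation: each unique string is stored once and identified by a small integer ID.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct SymbolId(u32);

#[derive(Default)]
pub struct Interner {
    map: HashMap<String, SymbolId>, // text → ID
    strings: Vec<String>,           // ID → text (index = ID)
}

impl Interner {
    pub fn intern(&mut self, text: &str) -> SymbolId {
        if let Some(&id) = self.map.get(text) {
            return id; // seen before: same ID, no new allocation
        }
        let id = SymbolId(self.strings.len() as u32);
        self.strings.push(text.to_string());
        self.map.insert(text.to_string(), id);
        id
    }

    pub fn resolve(&self, id: SymbolId) -> &str {
        &self.strings[id.0 as usize]
    }
}
```

After interning, comparing two `SymbolId` values is a single integer comparison, regardless of how long the underlying strings are. Salsa's `#[salsa::interned]` gives you this behavior inside the database, with the IDs tracked across revisions.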
#[salsa::interned] vs #[salsa::input]
Both live in the database. Both get IDs. But they're different:
|  | `#[salsa::input]` | `#[salsa::interned]` |
|---|---|---|
| Created by | Outside code (your `main` function) | Inside tracked functions |
| Deduplication | No — each `new()` call creates a new entry | Yes — same field values → same ID |
| Mutable? | Yes — setters for each field | No — immutable once created |
| Lifecycle | Survives across revisions | Survives across revisions |
| Lifetime | No `'db` parameter | Has `'db` lifetime |
Inputs are for data that comes from outside the system and can change (source files, configuration). Interned values are for data that's derived inside queries and never changes (identifiers, keywords).
Step 1: Define an Interned Symbol
```rust
#[salsa::interned(debug)]
pub struct Symbol<'db> {
    #[returns(ref)]
    pub text: String,
}
```
The `'db` lifetime ties the symbol to the database — you can't hold a `Symbol` after the database is dropped. The `#[returns(ref)]` on `text` means `symbol.text(db)` returns `&String` instead of an owned `String`, avoiding a clone.
When you call `Symbol::new(db, "print".to_string())`:
- Salsa checks: have I seen `"print"` before?
- If yes → return the existing ID. No allocation.
- If no → store the string, assign a new ID, return it.
Step 2: The Guarantee
```rust
let s1 = Symbol::new(&db, "print".to_string());
let s2 = Symbol::new(&db, "print".to_string());
let s3 = Symbol::new(&db, "write".to_string());

assert_eq!(s1, s2); // same string → same ID → equal
assert_ne!(s1, s3); // different string → different ID → not equal
```
s1 == s2 is a single integer comparison. Not a string comparison. In a type checker doing millions of name lookups, this matters.
Step 3: Symbol Table as a Tracked Query
Now we can build a symbol table that uses interned comparison:
```rust
#[salsa::tracked]
pub fn lookup_name(
    db: &dyn salsa::Database,
    source: SourceFile,
    name: String,
) -> Option<Definition> {
    let symbol = Symbol::new(db, name.clone()); // intern the lookup name
    let defs = symbol_table(db, source);
    for def in defs {
        let def_symbol = Symbol::new(db, def.name.clone()); // intern each def name
        if def_symbol == symbol {
            // integer comparison!
            return Some(def);
        }
    }
    None
}
```
In this chapter's code, we're still using String in the AST (we'll switch to Symbol in the AST itself in Chapter 4). But the lookup already uses interned comparison at the boundary.
When to Intern vs When Not To
Intern: identifiers compared frequently (variable names, function names, type names). The whole point is deduplication + fast comparison.
Don't intern: one-off strings, large strings (you'd waste memory storing them forever), strings rarely compared. Not everything needs to be interned just because you can.
A good rule of thumb: if you'd put it in a HashSet or use it as a HashMap key, intern it. If you'd just display it once and forget it, don't.
Running
```sh
cargo run --bin ch03-interned-symbols
```
Key Takeaways
- **Interned = deduplicated.** Same text → same ID. Always. This is how compilers handle identifiers efficiently.
- **Comparison by ID, not by value.** Two `Symbol` values with the same text have the same ID, so equality is O(1).
- **Created inside queries.** Unlike inputs (which come from outside), interned values are created inside tracked functions. They live in the database and persist across revisions.
- **Lifetime-tied.** `Symbol<'db>` can't outlive the database. This prevents stale references after database mutations.
- **Inputs vs interned: different jobs.** Inputs are mutable facts from outside. Interned values are immutable deduplicated data from inside. Don't confuse them.
What's Next
Chapter 4: Tracked Structs — The backbone of a Salsa-aware AST. We'll convert our String-based AST into one that uses Salsa IDs, giving each AST node stable identity that survives edits.
Chapter 4: Tracked Structs — Entity Identity in Salsa
Learn #[salsa::tracked] on structs: giving AST nodes stable identity that survives edits.
What You'll Learn
- What tracked structs are and why they matter
- The difference between "data" and "identity"
- How tracked structs enable fine-grained incrementality
- The pattern used in rust-analyzer's IR
The Problem: Plain Data Has No Identity
Consider two Lua functions:
```lua
function add(a, b) return a + b end
function mul(a, b) return a * b end
```
Both have the same shape — two parameters, a body that's a binary operation. But they're different functions. In a plain AST, they're just data. If you change add's body, how does Salsa know that mul is unaffected?
Answer: it can't — not with plain data. Both functions are part of the same LuaAst struct. If anything in the AST changes, the whole LuaAst is considered different, and every query that depends on it re-runs.
You need identity. Each function needs a stable ID that survives edits to other functions. That's what tracked structs give you.
Step 1: Define a Tracked Struct
```rust
#[salsa::tracked(debug)]
pub struct FuncDef<'db> {
    pub name: FuncName<'db>, // interned name (from ch03)
    pub param_count: u32,    // tracked field
    #[returns(ref)]
    pub body_text: String,   // the body as text
}
```
#[salsa::tracked] on a struct makes it database-resident:
- Each instance gets a unique Salsa ID
- It's stored in the database, not on the stack
- It has a lifetime `'db` tying it to the database
- Fields can be queried independently (fine-grained dependencies)
This is different from #[salsa::interned]. Interned structs are deduplicated by content — same fields, same ID. Tracked structs have unique IDs even if their fields are identical. Think of it as the difference between a value type and a reference type:
- **Interned**: two `Symbol("print")` instances are the same symbol
- **Tracked**: two `FuncDef` instances with identical fields are different function definitions
Step 2: Identity vs Content
Here's the key distinction:
```text
┌─────────────────────────────────────┐
│ Plain data:                         │
│   Two {name: "add", params: 2}      │
│   → Equal by value                  │
│   → No way to tell them apart       │
├─────────────────────────────────────┤
│ Tracked struct:                     │
│   Two FuncDef {name: "add", ...}    │
│   → Different IDs                   │
│   → "Which function?" matters       │
└─────────────────────────────────────┘
```
When you change add's body, you create a new FuncDef with a new ID. Salsa sees: "function #1 changed." Queries that depended on function #2 (mul) aren't invalidated because function #2's ID is unchanged.
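The identity idea can be sketched without Salsa. In this toy registry (my own illustration, not Salsa's machinery), every definition gets a fresh ID at creation, so two definitions that are equal by value are still distinct entities.

```rust
// Content: plain data, compared by value.
#[derive(Debug, Clone, PartialEq)]
struct FuncData {
    name: String,
    param_count: u32,
}

// Identity: a fresh ID per definition, compared by integer.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct FuncId(u32);

struct Registry {
    next: u32,
    funcs: Vec<(FuncId, FuncData)>,
}

impl Registry {
    fn new() -> Self {
        Registry { next: 0, funcs: Vec::new() }
    }

    fn define(&mut self, data: FuncData) -> FuncId {
        let id = FuncId(self.next); // never reused, never deduplicated
        self.next += 1;
        self.funcs.push((id, data));
        id
    }
}
```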
Step 3: Per-Field Dependencies
Tracked structs have another superpower: field-level dependency tracking. When a tracked function reads func.param_count(db), Salsa records a dependency on that specific field. If you change func.body_text() but not func.param_count(), queries that only read param_count return their cached values.
```rust
#[salsa::tracked]
pub fn func_complexity(db: &dyn salsa::Database, func: FuncDef<'_>) -> u32 {
    let body = func.body_text(db);   // depends on body_text
    let ops = /* count operators in body */;
    ops + func.param_count(db)       // depends on param_count
}
```
If you change only the body text, func_complexity re-runs (it depends on body_text). But if you added a query that only reads param_count, it would return its cached value.
Step 4: The Wrapper/Data Pattern
In practice, you often see this pattern in Salsa projects:
- Tracked struct — provides identity ("which function is this?")
- Plain data inside — provides content ("what's in this function?")
This is what rust-analyzer does. Each function, struct, enum, and impl in the Rust source becomes a tracked struct. The content (the AST, the type, the body) is stored as plain data inside the struct.
For our tutorial, we're keeping it simpler: FuncDef is a tracked struct with a few fields. In a production system, you'd have tracked structs for every named entity, and the AST nodes would reference them by ID rather than by string.
Running
```sh
cargo run --bin ch04-tracked-structs
```
Key Takeaways
- **Tracked structs have identity.** Unlike plain data, each instance has a Salsa ID. Two `FuncDef` instances with the same content are still different entities.
- **Fields can be tracked independently.** Change only the body? Queries that read `param_count` don't re-run. This is fine-grained incrementality.
- **The wrapper/data pattern.** Tracked struct (identity) + plain data (content). This separates "which thing" from "what's in the thing."
- **Lifetime `'db`.** Tracked structs can't outlive the database. This prevents stale references after mutations.
- **Why not tracked structs for the entire AST?** You can, and rust-analyzer does. But it adds complexity: every node type needs a tracked struct definition, helper functions need lifetime annotations, and the `Update` trait must be implemented for all inner types. For this tutorial, we use tracked structs for key entities (functions) and plain data for the AST content.
What's Next
Chapter 5: Type Inference — The heart of the tutorial. We'll implement type inference as a Salsa tracked query and see incrementality in action. This is where everything comes together.
Chapter 5: Type Inference — The Core Query
The heart of the tutorial: implement type inference as a Salsa tracked query and see incrementality in action.
What You'll Learn
- How gradual typing works: types where you want them, flexibility where you don't
- How to implement type inference as a Salsa tracked query
- How recursive tracked queries work (each call is memoized independently)
- The relationship between query granularity and incrementality
The Mental Model: Gradual Typing
Lua is dynamically typed. You don't write local x: number = 42. But that doesn't mean we can't check types — we just do it gradually:
- If the type is known, check it. `42 + 1` → both are numbers → OK.
- If the type is unknown, allow it. `unknown_var + 1` → unknown is "dynamic" → OK (we trust the programmer).
- If there's a contradiction, that's an error. `42 + "hello"` → number + string → Error.
The key distinction: Dynamic ≠ Error. Dynamic means "we don't know and that's fine." Error means "we found a contradiction." One is intentional flexibility; the other is a bug.
Our Type enum:
- `Type::Number` — we know it's a number
- `Type::String` — we know it's a string
- `Type::Boolean` — we know it's a boolean
- `Type::Nil` — we know it's nil
- `Type::Function` — we know it's a function (with param/return types)
- `Type::Dynamic` — we don't know, and that's OK
- `Type::Error` — type checking found a contradiction
Compatibility check: is Type::Dynamic compatible with Type::Number? Yes. Is Type::Error compatible with anything? Yes (to avoid cascading errors). Is Type::Number compatible with Type::String? No.
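These rules can be written down directly. The variant set below matches the tutorial's `Type` enum; the `compatible` function is a sketch of the rules just described, not necessarily the tutorial's exact code.

```rust
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Type {
    Number,
    String,
    Boolean,
    Nil,
    Function { params: Vec<Type>, returns: Vec<Type> },
    Dynamic, // unknown, and that's OK
    Error,   // a contradiction was found
}

pub fn compatible(a: &Type, b: &Type) -> bool {
    match (a, b) {
        // Dynamic is compatible with everything: gradual typing's escape hatch.
        (Type::Dynamic, _) | (_, Type::Dynamic) => true,
        // Error is compatible with everything, to avoid cascading errors.
        (Type::Error, _) | (_, Type::Error) => true,
        // Otherwise the types must match exactly.
        _ => a == b,
    }
}
```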
Step 1: The Inference Query
```rust
#[salsa::tracked]
pub fn infer_type(
    db: &dyn salsa::Database,
    source: SourceFile,
    expr: Expression,
    env: TypeEnv,
) -> Type {
    match expr {
        Expression::Nil => Type::Nil,
        Expression::True | Expression::False => Type::Boolean,
        Expression::Number(_) => Type::Number,
        Expression::StringLiteral(_) => Type::String,
        Expression::Name(ref name) => env.lookup(name),
        Expression::BinaryOp { ref left, op, ref right } => {
            let left_type = infer_type(db, source, (**left).clone(), env.clone());
            let right_type = infer_type(db, source, (**right).clone(), env.clone());
            // ... type-check the operation
        }
        // ...
    }
}
```
This is a tracked function that takes an expression and an environment, and returns a type. The environment (a list of name→type bindings) is part of the query key — same expression, different environment = different result.
Recursive calls are fine. infer_type(a + b) calls infer_type(a) and infer_type(b). Salsa memoizes each call independently. If only b changes, infer_type(a) returns its cached value, and only infer_type(b) re-runs.
Step 2: The Type Environment
```rust
#[derive(Debug, Clone, PartialEq, Eq, Hash, salsa::Update)]
pub struct TypeEnv {
    pub bindings: Vec<(String, Type)>,
}
```
The environment maps variable names to types. When we see local x = 42, we add ("x", Type::Number) to the environment. When we look up x later, we find it's a number.
Why is TypeEnv part of the query key? Because the same variable name in different scopes can have different types. infer_type(db, source, expr_a, env_1) and infer_type(db, source, expr_a, env_2) are different queries with potentially different results.
This is correct but has a cost: if the environment changes (because you changed a variable's type), many queries re-run. In a production system, you'd use tracked structs for the environment to get finer-grained invalidation.
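Environment lookup itself is simple. The sketch below uses a trimmed stand-in `Type` enum (no `Function` variant) and an assumed `lookup` shape: search newest-first so inner bindings shadow outer ones, and fall back to `Dynamic` for unbound names, which is the gradual-typing default.

```rust
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Type {
    Number,
    String,
    Boolean,
    Nil,
    Dynamic,
    Error,
}

#[derive(Debug, Clone, PartialEq, Eq, Hash, Default)]
pub struct TypeEnv {
    pub bindings: Vec<(String, Type)>,
}

impl TypeEnv {
    pub fn bind(&mut self, name: &str, ty: Type) {
        self.bindings.push((name.to_string(), ty));
    }

    pub fn lookup(&self, name: &str) -> Type {
        // Newest-first: a later binding for the same name shadows earlier ones.
        self.bindings
            .iter()
            .rev()
            .find(|(n, _)| n == name)
            .map(|(_, t)| t.clone())
            // Unbound names are Dynamic, not errors — gradual typing.
            .unwrap_or(Type::Dynamic)
    }
}
```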
Step 3: Type-Checking Statements
Type inference handles expressions. Type-checking handles statements — it walks the AST, calls infer_type for each expression, and updates the environment:
```text
Statement::Assignment { local, targets, values }
  → infer the value type
  → extend the environment with target = value_type

Statement::Function { name, params, body }
  → create a new env with params as Dynamic
  → type-check the body in that env
  → add name = Dynamic to the outer env

Statement::If { test, then_block, else_block }
  → infer the test type
  → check both blocks
```
Statement checking is not a tracked query in our implementation — it's a regular function called from type_check_program. This means the granularity of our type checker is per-file: changing any statement invalidates the entire file's type check.
Is this a problem? For a tutorial, no. For a production system, yes — you'd want per-function granularity. That's where tracked structs (Chapter 4) become essential: each function gets its own tracked struct, and changing one function only invalidates that function's type check.
Step 4: The Full Pipeline
```text
SourceFile (input)
       │
       ▼
    parse() → LuaAst (plain data)
       │
       ▼
top_level_types() → check each statement
       │                 │
       │                 ▼
       │            infer_type() — recursive, memoized
       │                 │
       │       (recursive calls, each cached)
       │
       ▼
Vec<(String, Type)> — the top-level bindings and their types
```
Step 5: Incrementality in Action
```rust
let source = SourceFile::new(
    &db,
    "example.lua".to_string(),
    "local x = 42\nlocal y = \"hello\"\n".to_string(),
);
let types = top_level_types(&db, source); // x: Number, y: String

source.set_text(&mut db).to("local a = 1 + 2\n".to_string());
let new_types = top_level_types(&db, source); // a: Number — re-computed because source changed
```
When we change the source text, parse() re-runs, top_level_types() re-runs, and infer_type() re-runs for each expression. Everything that depended on the old source text is invalidated.
But here's what doesn't re-run: queries for other source files. If we have two files and only change one, the other file's type check returns from cache instantly. This is per-input isolation from Chapter 1, now applied to real work.
Running
```sh
cargo run --bin ch05-type-inference
```
Key Takeaways
- **Gradual typing = Types + Flexibility.** Known types are checked. Unknown types are allowed. Contradictions are errors. This is the sweet spot between static and dynamic typing.

- **The environment is part of the query key.** Same expression, different environment = different result. This is correct: `x` in one scope is not `x` in another.

- **Recursive queries are fine.** `infer_type(a + b)` calls `infer_type(a)` and `infer_type(b)`. Each is memoized independently. This is one of Salsa's most powerful features.

- **Granularity matters.** With a plain-data AST, the granularity is per-file. With tracked structs, it's per-node. For a tutorial, per-file is fine. For a production system, use tracked structs.

- **The pipeline is a dependency graph.** `SourceFile → parse → type_check → infer_type`. Salsa builds this graph automatically from what you read. You don't declare dependencies — they're inferred.
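The compatibility relation behind the first takeaway can be sketched without Salsa. This is an assumed reconstruction of the chapter's `Type` and `is_compatible_with`, not the exact tutorial code:

```rust
// Sketch of gradual compatibility: Dynamic is "unknown by choice",
// Error is "we found a contradiction"; both are permissive on purpose.
#[derive(Debug, Clone, PartialEq)]
enum Type {
    Number,
    String,
    Dynamic, // the programmer chose not to annotate this
    Error,   // a contradiction was found earlier
}

impl Type {
    fn is_compatible_with(&self, other: &Type) -> bool {
        match (self, other) {
            // Dynamic is compatible with everything.
            (Type::Dynamic, _) | (_, Type::Dynamic) => true,
            // Error is treated as compatible to avoid cascading errors.
            (Type::Error, _) | (_, Type::Error) => true,
            // Concrete types must match exactly.
            (a, b) => a == b,
        }
    }
}

fn main() {
    assert!(Type::Number.is_compatible_with(&Type::Number));
    assert!(Type::Dynamic.is_compatible_with(&Type::String));
    assert!(Type::Error.is_compatible_with(&Type::Number));
    assert!(!Type::Number.is_compatible_with(&Type::String));
    println!("compatibility checks pass");
}
```

Note the design choice in the second arm: treating `Error` as compatible with everything suppresses error cascades, at the cost of hiding some follow-on errors.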
What's Next
Chapter 6: Diagnostics as Accumulators — How to report errors without poisoning the query graph. This is essential for a real type checker: you want partial results even when there are type errors.
Chapter 6: Diagnostics as Accumulators
Report errors without poisoning the query graph.
What You'll Learn
- Why returning errors from queries is problematic
- How `#[salsa::accumulator]` provides a side channel for diagnostics
- How accumulators keep the query graph healthy
- How to build a type checker that reports errors properly
The Problem: Errors That Poison the Graph
You wrote a type checker. It returns Type. What happens when there's a type error?
Option 1: Return Result<Type, Error>
If infer_type returns Err, downstream queries can't use the result. The error "poisons" everything that depends on it. You lose all the type information you did compute. In a gradual type system, partial type info is valuable — you want to keep it.
Option 2: Return Type::Error (sentinel value)
Better. Downstream queries see Error and can decide what to do (propagate it, ignore it, etc.). But you lose the error message. You know something went wrong, but not what. "Cannot add Number and String" is gone — all you have is Type::Error.
Option 3: Accumulators (the Salsa way)
The query returns a Type (possibly Type::Error), and emits a diagnostic as a side effect. The diagnostic doesn't affect the return value or the query graph. It's collected after the query runs. You get both the type information AND the error message.
Step 1: Define an Accumulator
```rust
#[salsa::accumulator]
#[derive(Debug, Clone)]
pub struct Diagnostic {
    pub severity: Severity,
    pub message: String,
}
```
An accumulator is a type that can be "pushed to" from inside tracked functions. It's a side channel — separate from the return value.
Step 2: Emit Diagnostics from Inside Queries
```rust
#[salsa::tracked]
pub fn infer_type(
    db: &dyn salsa::Database,
    source: SourceFile,
    expr: Expression,
    env: TypeEnv,
) -> Type {
    match expr {
        Expression::BinaryOp { ref left, op, ref right } => {
            let lt = infer_type(db, source, (**left).clone(), env.clone());
            let rt = infer_type(db, source, (**right).clone(), env.clone());
            match op {
                BinOp::Add => {
                    if !lt.is_compatible_with(&Type::Number) {
                        Diagnostic::error(format!("cannot use {:?} in arithmetic", lt))
                            .emit(db); // side effect!
                    }
                    if !rt.is_compatible_with(&Type::Number) {
                        Diagnostic::error(format!("cannot use {:?} in arithmetic", rt))
                            .emit(db);
                    }
                    if lt == Type::Error || rt == Type::Error {
                        Type::Error
                    } else {
                        Type::Number
                    }
                }
                // ...
            }
        }
        // ...
    }
}
```
When we find a type error, we do two things:

- Return `Type::Error` — tells downstream queries "something went wrong"
- Emit a `Diagnostic` — tells the user what went wrong
These are separate concerns. The return value is for the query graph. The accumulator is for the human.
Step 3: Collect Accumulated Diagnostics
```rust
type_check(&db, source);

let diags: Vec<_> = type_check::accumulated::<Diagnostic>(&db, source)
    .into_iter()
    .cloned()
    .collect();
```
After running type_check, you call type_check::accumulated::<Diagnostic> to get all diagnostics that were emitted during that query (and any queries it called). This is the read side of the accumulator.
Step 4: Incremental Accumulators
When a query re-runs, its old accumulated diagnostics are discarded and replaced. No stale diagnostics. If you fix a type error and the query re-runs successfully, the diagnostic from the previous revision disappears automatically.
```rust
// Bad program → diagnostics
bad.set_text(&mut db)
    .to("local x = 42\nlocal y = \"hello\"\nlocal z = x + y\n".to_string());
type_check(&db, bad);
let bad_diags = type_check::accumulated::<Diagnostic>(&db, bad);
assert!(!bad_diags.is_empty());

// Fix it → no diagnostics
bad.set_text(&mut db)
    .to("local x = 42\nlocal y = 1\nlocal z = x + y\n".to_string());
type_check(&db, bad);
let fixed_diags = type_check::accumulated::<Diagnostic>(&db, bad);
assert!(fixed_diags.is_empty());
```
The Pattern
```text
┌──────────────────────────────────────────────┐
│              TRACKED FUNCTION                │
│                                              │
│  input ──→ computation ──→ return value      │
│                 │                            │
│                 └──→ accumulate(diagnostics) │
└──────────────────────────────────────────────┘
```
Return value → used by downstream queries (the graph)
Accumulator → used by the human (the report)
These are intentionally separate. The graph shouldn't break because of a diagnostic. The diagnostic shouldn't be suppressed because the graph needs a valid value.
Why Not Result<Type, Diagnostic>?
Because Result is all-or-nothing: you get the type OR the error, never both. In gradual typing, partial information is valuable. If x is Number and y is String and z = x + y is an error, you still want to know that x is Number and y is String. With accumulators, you do.
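In miniature, the same separation works without Salsa at all. Here a plain `Vec<String>` stands in for the accumulator, and `check_add` is a hypothetical helper, not the tutorial's code:

```rust
// The pattern in miniature: return a (possibly Error) type for the
// graph, push the human-readable message to a side channel.
#[derive(Debug, Clone, PartialEq)]
enum Type { Number, String, Error }

fn check_add(lhs: Type, rhs: Type, diags: &mut Vec<String>) -> Type {
    if lhs != Type::Number || rhs != Type::Number {
        diags.push(format!("cannot add {:?} and {:?}", lhs, rhs));
        return Type::Error; // sentinel keeps downstream consumers alive
    }
    Type::Number
}

fn main() {
    let mut diags = Vec::new();
    // z = x + y where x: Number, y: String
    let z = check_add(Type::Number, Type::String, &mut diags);
    assert_eq!(z, Type::Error); // the graph sees a sentinel value
    assert_eq!(diags.len(), 1); // the human sees the message
    println!("{}", diags[0]);
}
```

With `Result` instead, the `Err` path would have returned only the message and discarded the type; with the side channel, both survive.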
Running
```sh
cargo run --bin ch06-diagnostics
```
Key Takeaways
- **Accumulators are side channels.** They don't affect the return value. They're collected after the query runs.

- **Errors don't poison the graph.** Return `Type::Error` to tell downstream queries something went wrong. Emit a `Diagnostic` to tell the user what. These are separate.

- **Accumulators are incremental.** When a query re-runs, old diagnostics are discarded and replaced. No stale diagnostics.

- **The pattern: return value for the graph, accumulator for the human.** This is how rust-analyzer reports diagnostics. It works.

- **Partial results are valuable.** Don't throw away type information just because there's an error somewhere. Gradual typing is about embracing partial knowledge.
What's Next
Chapter 7: The Language Server — Wire everything into a language server simulation and watch Salsa skip work on edits. This is the payoff for everything you've learned.
Chapter 7: Putting It Together — The Language Server
Wire everything into a language server simulation and watch Salsa skip work on edits.
What You'll Learn
- How the full pipeline fits together: parse → type-check → diagnostics
- How a language server uses Salsa's incrementality on every keystroke
- Per-file isolation in a multi-file project
- The architecture that powers rust-analyzer
The Language Server Loop
A language server does the same thing over and over:
1. User opens a file → parse + type check + report diagnostics
2. User types a character → update the source → re-parse + re-type-check → report diagnostics
3. User saves → same as step 2, but the source is now on disk
The key constraint: step 2 has to be fast. Sub-100ms. The user is typing, and they expect the red squiggles to appear (or disappear) instantly. You can't re-type-check the entire project on every keystroke.
This is exactly the problem Salsa solves. Let's see how.
The LanguageServer Struct
```rust
struct LanguageServer {
    db: Database,
    files: Vec<SourceFile>,
}

impl LanguageServer {
    fn open_file(&mut self, path: &str, text: &str) -> usize {
        let file = SourceFile::new(&self.db, path.to_string(), text.to_string());
        let idx = self.files.len();
        self.files.push(file);
        type_check(&self.db, file); // initial check
        idx
    }

    fn edit_file(&mut self, idx: usize, new_text: &str) {
        self.files[idx].set_text(&mut self.db).to(new_text.to_string());
        type_check(&self.db, self.files[idx]); // incremental check
    }

    fn diagnostics(&self, idx: usize) -> Vec<Diagnostic> {
        type_check::accumulated::<Diagnostic>(&self.db, self.files[idx])
            .into_iter()
            .cloned()
            .collect()
    }
}
```
Three operations: open, edit, get diagnostics. On each edit, we set the new source text and re-type-check. Salsa handles the rest — it only re-runs queries whose inputs actually changed.
The Edit Cycle
User types "x + y" where "x" is Number and "y" is String:
```text
1. set_text()    → new revision
2. type_check()  → re-runs (source changed)
3. parse()       → re-runs (source changed)
4. infer_type()  → re-runs for x + y
5. infer_type(x) → cached (x didn't change)
6. infer_type(y) → cached (y didn't change)
7. Diagnostic::error("cannot add Number and String") → accumulated
8. diagnostics() → [Error: "cannot add Number and String"]
```
User fixes to "x + 1":
```text
1. set_text()    → new revision
2. type_check()  → re-runs
3. parse()       → re-runs (source changed)
4. infer_type()  → re-runs for x + 1
5. infer_type(x) → cached! (same revision, same env)
6. infer_type(1) → cached! (literal, no dependencies)
7. No diagnostics emitted
8. diagnostics() → [] (old diagnostics automatically discarded)
```
Steps 5-6 are the magic. Even though the source changed and parse() re-ran, the individual infer_type calls for unchanged sub-expressions return from cache. The type checker only does work for the part that actually changed.
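What Salsa does in steps 5 and 6 can be mimicked with a hand-rolled memo table. The names below (`infer`, `Ty`, the `work` counter) are illustrative stand-ins, not the tutorial's code:

```rust
use std::collections::HashMap;

// Hand-rolled stand-in for Salsa's memoization: infer() consults a memo
// table keyed by the expression text, so a sub-expression that already
// has a cached result costs no inference work on a later check.
#[derive(Debug, Clone, PartialEq)]
enum Ty { Number, String }

fn infer(expr: &str, memo: &mut HashMap<String, Ty>, work: &mut u32) -> Ty {
    if let Some(t) = memo.get(expr) {
        return t.clone(); // cache hit: no work done
    }
    *work += 1; // count actual inference work
    let t = if expr.starts_with('"') { Ty::String } else { Ty::Number };
    memo.insert(expr.to_string(), t.clone());
    t
}

fn main() {
    let mut memo = HashMap::new();
    let mut work = 0;
    // first check: x + "y"
    infer("x", &mut memo, &mut work);
    infer("\"y\"", &mut memo, &mut work);
    assert_eq!(work, 2);
    // after the edit to x + 1: x is still cached, only 1 is new
    infer("x", &mut memo, &mut work);
    infer("1", &mut memo, &mut work);
    assert_eq!(work, 3); // one extra unit of work, not two
    println!("work = {work}");
}
```

Salsa adds what this sketch lacks: automatic invalidation of stale entries when an input changes, and dependency tracking so only affected entries are recomputed.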
Per-File Isolation
```rust
let main = server.open_file("main.lua", "local x = 42\n");
let other = server.open_file("other.lua", "local a = 1 + \"hello\"\n");

// other.lua has an error
server.diagnostics(other); // → [Error: cannot add Number and String]

// main.lua is clean — and it stays cached
server.diagnostics(main); // → [] (no re-checking!)
```
When we open a second file with an error, the first file isn't affected. Its cached type check is still valid. Salsa knows that main.lua's queries don't depend on other.lua's source text, so it returns the cached result instantly.
In a real project with hundreds of files, this is the difference between "responsive on every keystroke" and "freezes for seconds after each edit."
The Full Architecture
```text
┌──────────────────────────────────────────────────────────────┐
│                      LANGUAGE SERVER                         │
│              (diagnostics, hover, completion)                │
└──────────────┬───────────────────────────────────────────────┘
               │ accumulated diagnostics / cached types
               ▼
┌──────────────────────────────────────────────────────────────┐
│                       TYPE CHECKER                           │
│      type_check() → check_stmt() → infer_type()              │
│          ↓              ↓                ↓                   │
│  [accumulators]  [env updates]  [recursive memoized calls]   │
└──────────────┬───────────────────────────────────────────────┘
               │ depends on
               ▼
┌──────────────────────────────────────────────────────────────┐
│                          PARSER                              │
│           parse() — tracked, cached, uses analisar           │
└──────────────┬───────────────────────────────────────────────┘
               │ depends on
               ▼
┌──────────────────────────────────────────────────────────────┐
│                          INPUTS                              │
│           SourceFile { path, text } — set on edit            │
└──────────────────────────────────────────────────────────────┘
```
Incrementality in action:

- `set_text()` creates a new revision
- `parse()` re-runs (source changed)
- `type_check()` re-runs (parse result changed)
- `infer_type()` re-runs only for affected expressions
- Diagnostics are re-accumulated
- Other files are NOT re-checked
This is how rust-analyzer stays fast. And now you know how.
Running
```sh
cargo run --bin ch07-language-server
```
What You Built
Over seven chapters, you built a gradual type checker for Lua powered by Salsa:
- Ch1 — Salsa basics: inputs, tracked functions, revisions
- Ch2 — Parsing Lua: wiring analisar into Salsa
- Ch3 — Interned symbols: fast name comparison by ID
- Ch4 — Tracked structs: entity identity in Salsa
- Ch5 — Type inference: the core tracked query, recursive and cached
- Ch6 — Accumulators: diagnostics without poisoning the graph
- Ch7 — Language server: the full pipeline, incremental on edits
Next Steps
This tutorial covers the fundamentals. To go further:
- Build a real LSP using `tower-lsp` — the simulation here shows the loop, but a real server needs JSON-RPC, text document sync, etc.
- Add type annotations — Teal-style `local x: number = 42`. Parse annotations and use them as hints in inference.
- Fine-grained incrementality — Use tracked structs for every AST node, not just function definitions. This is what rust-analyzer does.
- Go-to-definition, completion, hover — These are queries too. `lookup_name` from Chapter 3 is the start of go-to-definition.
- Cross-file references — Our type checker is per-file. A real one needs to handle `require()` and resolve names across files. Tracked structs make this tractable: each file's definitions are tracked, and cross-file lookups are cached.
REVIEW.md — Building a Gradual Type Checker for Lua
Reviewed by Esme, 2026-04-21
Overall
This is a well-structured tutorial that builds incrementally from "hello Salsa" to a working language server simulation. The progression is natural — each chapter introduces one new Salsa concept and shows why it matters. The writing is clear and conversational. The main issues are around code correctness, significant code duplication across chapters, and a few conceptual gaps.
Chapter 1: Hello Salsa
[clarity] The spreadsheet analogy (inputs = cells, tracked functions = formulas, revisions = "after you edit a cell") is excellent. This is exactly the kind of concrete-before-abstract approach that works.
[error] The contains_text tracked function takes needle: String as a key parameter. This means every unique needle string creates a separate cache entry. For a tutorial, this is fine to demonstrate key parameters, but it's worth noting that using String as a query key means allocations on every call. A reader might assume this is idiomatic — it's not, for hot paths.
[suggestion] The README links point to chapters/ch01-hello-salsa.md but the actual files are chapters/ch01-hello-salsa/README.md. The links in README.md won't resolve correctly.
[style] The code comments are thorough and well-organized with section headers (Step 1, Step 2, etc.). This is great for learning — keep this pattern.
Chapter 2: Parsing Lua
[error] The salsa::Update derive macro is used on LuaAst, Statement, Expression, BinOp, UnaryOp. The salsa::Update trait is not a standard derive in Salsa 0.26 — it may be salsa::Update or may not exist as a derive. This needs verification against the actual Salsa 0.26 API. If it doesn't exist, the code won't compile.
[clarity] The README says "Why Convert the AST?" but doesn't fully explain the 'db lifetime issue. The comment in code mentions it, but the README should note that tracked function return values must be 'static (or at least not borrow from the input), which is why analisar's Cow<'a, str> AST can't be returned directly.
[error] The Suffixed expression handling has a logic problem. When s.computed is true (index access a[b]), the match arm ast::Expression::Suffixed(s) if !s.computed && !s.method won't match, and it falls through to the wildcard _ => Expression::Nil. This silently drops index expressions. Should at least have a comment noting this is intentionally simplified.
[clarity] The top_level_names function's comment about Salsa skipping re-runs when parse returns the same LuaAst is slightly misleading. Salsa re-runs top_level_names if parse is re-invoked in a new revision — it doesn't compare the LuaAst value. The comment says "same value" which implies value equality, but Salsa tracks at the revision level, then checks if the inputs to the derived query changed. This distinction matters for understanding Salsa's actual invalidation strategy.
Chapter 3: Interned Symbols
[error] The Symbol<'db> interned struct is defined but never actually used in the symbol_table or lookup_name queries in a way that demonstrates the benefit. The lookup_name function interns both the lookup name and each definition name, then compares by ID — but it's doing Symbol::new(db, def.name.clone()) inside a loop, creating new interned values on every comparison. The real benefit of interning is that you store Symbol in your data structures (not String) and then comparison is free. The current code actually does MORE work than a plain string comparison (intern + compare vs. just compare). This undermines the lesson.
[suggestion] The chapter should show Definition { name: Symbol<'db>, ... } instead of Definition { name: String, ... } to demonstrate the actual benefit: store symbols once, compare by ID forever after.
[clarity] The README's "Interned vs Input" table is great — clear and concise. More of these comparison tables throughout.
[error] The Symbol<'db> lifetime parameter means it can't be stored in Definition (which is plain data with salsa::Update). This is a real design tension that the tutorial doesn't address. The reader will try to put Symbol<'db> in their data structures and hit lifetime issues. This deserves explicit discussion.
Chapter 4: Tracked Structs
[error] The FuncDef::new(db, func_name, param_count, body) constructor signature doesn't match how #[salsa::tracked] structs work in Salsa 0.26. Tracked structs typically use a builder pattern or have specific field registration. The exact API needs verification.
[clarity] The "wrapper/data pattern" in the README is a key concept that deserves more explanation. The README mentions it but the code doesn't really demonstrate it — FuncDef has body_text: String as a tracked field, but there's no separate "data" struct inside it. The pattern is: tracked struct holds an ID, a separate data struct holds the content. The current code puts the content directly in the tracked struct, which is the simpler approach. Either demonstrate the actual wrapper/data pattern or remove the reference.
[error] The parsing in parse_functions is extremely fragile — it uses simple string splitting to parse function name(params) body end. This is fine for a demo, but the line if let Some(close_paren) = after_paren.find(')') will break on nested parentheses in the body. A comment acknowledging this simplification would help.
[suggestion] This chapter drops the analisar parser entirely in favor of manual string parsing. The transition is jarring — chapters 2-3 built up the analisar-based AST, and now we're parsing by hand. A brief explanation of why (tracked structs want a different granularity than the file-level parse) would help the reader understand the architectural shift.
Chapter 5: Type Inference
[error] The infer_type function passes Expression (a large clone-heavy enum) and TypeEnv as query keys. Every recursive call clones the entire environment and expression tree. In a real Salsa program, this would be catastrophically slow for non-trivial programs. The code acknowledges this in the "IMPORTANT DESIGN NOTE" comment, but the comment says "the granularity is per-file" — this isn't quite right. The granularity is actually per-(source, expr, env) triple, which means different expressions get separate cache entries, but the env cloning makes this extremely expensive.
[clarity] The "IMPORTANT DESIGN NOTE" is good but buried in a comment. This should be in the README and the main text — it's a critical architectural limitation that the reader needs to understand.
[error] Expression::String was renamed to Expression::StringLiteral in this chapter (vs String in ch02-04). This is actually a good rename (avoids confusion with Rust's String type) but it's inconsistent with earlier chapters. Should be backported or noted.
[clarity] The Dynamic type explanation is excellent. "Dynamic is NOT the same as Error. Dynamic means 'the programmer chose not to annotate this' — it's intentional. Error means 'we found a contradiction' — it's a bug." This is the kind of precise distinction that makes a tutorial valuable.
[suggestion] The is_compatible_with method treats Error as compatible with everything. This means once you have one error, it can suppress other real errors (error cascading is hidden). A brief note about this design choice would be helpful.
Chapter 6: Diagnostics
[error] The Diagnostic::emit method calls self.accumulate(db). In Salsa 0.26, the accumulator API may be different — accumulate might not be a method on the accumulated value itself. This needs verification against the actual Salsa 0.26 accumulator API.
[clarity] The "Why Not Result<Type, Diagnostic>?" section in the code comments is excellent — it clearly explains the three-option tradeoff. This belongs in the README too.
[style] The code formatting in ch06 and ch07 is significantly compressed compared to earlier chapters (e.g., fn new() -> Self { TypeEnv { bindings: Vec::new() } } on one line). This makes the code harder to read, especially in a tutorial context. The earlier chapters' formatting was better.
Chapter 7: Language Server
[error] The LanguageServer struct holds db: Database (owned). But SourceFile::set_text requires &mut db, and after mutation, any SourceFile references obtained before the mutation are still valid (Salsa guarantees this). However, the diagnostics method takes &self, which means it calls type_check::accumulated with an immutable reference. This should work, but the pattern of holding owned Database in a struct while also holding SourceFile IDs is worth a note — it's a common pattern in Salsa but not immediately obvious.
[suggestion] The language server simulation doesn't demonstrate the key benefit: timing. Add std::time::Instant measurements to show that re-checking after a small edit is faster than the initial check. Without timing, the "Salsa skips work" claim is invisible to the reader.
[clarity] The architecture diagram in the code comments is great. Should be in the README.
Cross-Cutting Issues
[error] MASSIVE CODE DUPLICATION. The AST types (Statement, Expression, BinOp, UnaryOp), the convert_stmt/convert_expr functions, TypeEnv, and infer_type are copy-pasted across chapters 2-7 with minor variations. This is ~200 lines duplicated per chapter. While this makes each chapter self-contained (good), it means any bug found in one chapter exists in all of them. Consider: (a) extracting shared code into a library crate that chapters depend on, or (b) explicitly noting that this is intentional for self-containedness and that production code would share the definitions.
[error] The salsa::Update derive macro is used extensively but never explained. The reader has no idea what it does or why it's needed. A one-paragraph explanation (it enables Salsa to compare old and new values for fine-grained invalidation) would help.
[clarity] The README chapter links use .md extensions pointing to files that don't exist. The actual content is in chapters/chXX-name/README.md.
[suggestion] No chapter discusses how Salsa handles cycles. The README mentions "Cycle detection — recursive queries that would loop forever are caught" but no chapter demonstrates this. A brief example (even just in an appendix) would round out the coverage.
[style] The code quality degrades noticeably from ch05 onward — compressed formatting, longer lines, less whitespace. Earlier chapters are much more readable.
What's Working
- The chapter progression is excellent — each one builds naturally on the last
- The Salsa concepts are introduced at the right pace: inputs → tracked functions → interned → tracked structs → accumulators
- The "key concepts" recaps at the end of each chapter are valuable
- The spreadsheet analogy in ch1 is memorable and accurate
- The gradual typing explanation in ch5 is precise and well-articulated
- The accumulator pattern (sentinel return + accumulated diagnostic) is clearly explained
- The per-file isolation demo in ch7 effectively shows Salsa's incremental value
Deep Dive Additions (2026-04-21 — Second Pass)
I re-read all 7 chapters' source code, compiled each one, and ran all binaries. The initial review raised API concerns about Salsa 0.26 and analisar 0.4 — these were verified:
- `salsa::Update` derive macro exists in Salsa 0.26 ✓
- The accumulator API `type_check::accumulated::<Diagnostic>(&db, file)` works as shown ✓
- All chapters compile and run successfully ✓
However, the deeper code-level review found these new issues:
[error] Suffixed expression regression in ch03, ch05, ch06, ch07
Chapter 2's convert_expr handles ast::Expression::Suffixed completely: field access (a.b), index access (a[b]), and method calls (a:f()). But chapters 3, 5, 6, and 7 changed the match arm to:
```rust
ast::Expression::Suffixed(s) if !s.computed && !s.method => { ... }
```
The guard `!s.computed && !s.method` means:

- `a[b]` (index expression, `s.computed == true`) → falls to `_ => Expression::Nil`
- `a:f()` (method call, `s.method == true`) → falls to `_ => Expression::Nil`
This is a regression from Chapter 2. Index and method expressions are silently converted to Nil across 4 chapters. No comment explains this is intentional.
Fix: Restore the full Suffixed handling from Chapter 2, or add a comment explaining that index/method expressions are intentionally unhandled.
[error] convert_args drops table arguments silently
In convert_args, the ast::Args::Table(_) arm returns vec![]:
```rust
ast::Args::Table(_) => vec![], // Table args dropped silently
```
Function calls with table arguments like foo({a = 1}) are converted to zero arguments. For a tutorial showing Lua parsing, this is misleading — readers writing Lua code with table arguments will get broken behavior without explanation.
[error] BinOp::Concat returns Type::String unconditionally
In both ch05 (type inference) and ch06 (diagnostics), the concatenation operator returns:
```rust
BinOp::Concat => Type::String,
```
This returns `Type::String` regardless of whether the operands are actually strings. In a gradual type checker:

- `String .. String` → `Type::String` (correct)
- `Dynamic .. anything` → should be `Type::Dynamic` (currently `Type::String` — wrong)
- `Number .. String` → should be `Type::Error` (currently `Type::String` — wrong)
The current code has no type checking for .. — it's a blind shortcut. By contrast, BinOp::Add (and other arithmetic ops) in ch06 correctly checks type compatibility and emits diagnostics. BinOp::Concat should do the same.
Note: There's no test case exercising the .. operator, so this bug isn't caught by the asserts.
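A salsa-free sketch of the suggested arm's shape (the chapter's `Type` variants are assumed; the diagnostic emission is elided):

```rust
// Suggested behaviour for the Concat arm, as a standalone function:
// errors propagate, Dynamic stays Dynamic, only String .. String is
// definitely String, and incompatible concrete operands become Error.
#[derive(Debug, Clone, PartialEq)]
enum Type { Number, String, Dynamic, Error }

fn concat_type(lt: &Type, rt: &Type) -> Type {
    match (lt, rt) {
        (Type::Error, _) | (_, Type::Error) => Type::Error,       // propagate
        (Type::Dynamic, _) | (_, Type::Dynamic) => Type::Dynamic, // unknown stays unknown
        (Type::String, Type::String) => Type::String,
        // Incompatible concrete types, e.g. Number .. String: return
        // Error and, in the real checker, emit a diagnostic here.
        // (Lua itself coerces numbers in `..`, so a checker could
        // alternatively choose to accept Number operands as String.)
        _ => Type::Error,
    }
}

fn main() {
    assert_eq!(concat_type(&Type::String, &Type::String), Type::String);
    assert_eq!(concat_type(&Type::Dynamic, &Type::Number), Type::Dynamic);
    assert_eq!(concat_type(&Type::Number, &Type::String), Type::Error);
    println!("concat typing ok");
}
```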
[error] Unused import / dead code warnings
- ch03: `use salsa::Setter;` — unused import generates a warning
- ch03: `let mut db = Database::default()` — never calls setters, so `mut` generates a warning
- ch07: `fn display(&self) -> &str` — dead code, never called, generates a warning
These generate compiler warnings that would confuse readers following along.
[suggestion] No test case exercises the .. operator
None of the chapter tests use Lua's string concatenation operator (..). Adding a test like "hello" .. " world" and 42 .. "text" would verify the BinOp::Concat case and expose the bug described above.
[clarity] Per-file granularity claim is technically imprecise
Chapter 5's README says "the granularity is per-file" but the actual granularity is per-statement — check_stmt is called for each statement in the AST, and infer_type is called for each expression within each statement. A change to any statement in the file invalidates all downstream queries for that file. The README should clarify: "granularity is per-file for now; tracked structs (ch04) would enable per-node granularity."
[suggestion] Add comment explaining per-file granularity limitation
The README says "per-file isolation" in Chapter 7 but the actual behavior is: when a file is edited, ALL statements in that file are re-checked because type_check calls check_stmt for each statement in sequence. Only other files' queries return cached values. The README should clarify that the per-file isolation means "other files are not re-checked" not "other statements in the same file are not re-checked."
Priority Fixes from Second Pass
1. Fix the suffixed expression regression — ch03, ch05, ch06, and ch07 silently drop `a[b]` and `a:f()` expressions. Either restore the full handling from Chapter 2 or add a comment explaining it's intentional.
2. Fix `BinOp::Concat` type checking — it should return `Type::Dynamic` when either operand is `Dynamic`, and return `Type::Error` plus emit a diagnostic when the operands are incompatible concrete types.
3. Clean up compiler warnings — remove the unused `Setter` import in ch03, change `let mut db` to `let db` in ch03, and remove the dead `display` method in ch07.
4. Add table argument handling or a comment — `convert_args` silently drops table arguments in all chapters. Either handle them or explain the simplification.
5. Add a `..` operator test case — it's currently untested, and the bug from #2 can't be caught without it.