REVIEW.md — Building a Gradual Type Checker for Lua

Reviewed by Esme, 2026-04-21

Overall

This is a well-structured tutorial that builds incrementally from "hello Salsa" to a working language server simulation. The progression is natural — each chapter introduces one new Salsa concept and shows why it matters. The writing is clear and conversational. The main issues are around code correctness, significant code duplication across chapters, and a few conceptual gaps.


Chapter 1: Hello Salsa

[clarity] The spreadsheet analogy (inputs = cells, tracked functions = formulas, revisions = "after you edit a cell") is excellent. This is exactly the kind of concrete-before-abstract approach that works.

[error] The contains_text tracked function takes needle: String as a key parameter. This means every unique needle string creates a separate cache entry. For a tutorial, this is fine to demonstrate key parameters, but it's worth noting that using String as a query key means allocations on every call. A reader might assume this is idiomatic — it's not, for hot paths.

[suggestion] The README links point to chapters/ch01-hello-salsa.md but the actual files are chapters/ch01-hello-salsa/README.md. The links in README.md won't resolve correctly.

[style] The code comments are thorough and well-organized with section headers (Step 1, Step 2, etc.). This is great for learning — keep this pattern.


Chapter 2: Parsing Lua

[error] The salsa::Update derive macro is used on LuaAst, Statement, Expression, BinOp, UnaryOp. The salsa::Update trait is not a standard derive in Salsa 0.26 — it may be salsa::Update or may not exist as a derive. This needs verification against the actual Salsa 0.26 API. If it doesn't exist, the code won't compile.

[clarity] The README says "Why Convert the AST?" but doesn't fully explain the 'db lifetime issue. The comment in code mentions it, but the README should note that tracked function return values must be 'static (or at least not borrow from the input), which is why analisar's Cow<'a, str> AST can't be returned directly.

[error] The Suffixed expression handling has a logic problem. When s.computed is true (index access a[b]), the match arm ast::Expression::Suffixed(s) if !s.computed && !s.method won't match, and it falls through to the wildcard _ => Expression::Nil. This silently drops index expressions. Should at least have a comment noting this is intentionally simplified.

[clarity] The top_level_names function's comment about Salsa skipping re-runs when parse returns the same LuaAst is slightly misleading. Salsa re-runs top_level_names if parse is re-invoked in a new revision — it doesn't compare the LuaAst value. The comment says "same value" which implies value equality, but Salsa tracks at the revision level, then checks if the inputs to the derived query changed. This distinction matters for understanding Salsa's actual invalidation strategy.


Chapter 3: Interned Symbols

[error] The Symbol<'db> interned struct is defined but never actually used in the symbol_table or lookup_name queries in a way that demonstrates the benefit. The lookup_name function interns both the lookup name and each definition name, then compares by ID — but it's doing Symbol::new(db, def.name.clone()) inside a loop, creating new interned values on every comparison. The real benefit of interning is that you store Symbol in your data structures (not String) and then comparison is free. The current code actually does MORE work than a plain string comparison (intern + compare vs. just compare). This undermines the lesson.

[suggestion] The chapter should show Definition { name: Symbol<'db>, ... } instead of Definition { name: String, ... } to demonstrate the actual benefit: store symbols once, compare by ID forever after.

[clarity] The README's "Interned vs Input" table is great — clear and concise. More of these comparison tables throughout.

[error] The Symbol<'db> lifetime parameter means it can't be stored in Definition (which is plain data with salsa::Update). This is a real design tension that the tutorial doesn't address. The reader will try to put Symbol<'db> in their data structures and hit lifetime issues. This deserves explicit discussion.


Chapter 4: Tracked Structs

[error] The FuncDef::new(db, func_name, param_count, body) constructor signature doesn't match how #[salsa::tracked] structs work in Salsa 0.26. Tracked structs typically use a builder pattern or have specific field registration. The exact API needs verification.

[clarity] The "wrapper/data pattern" in the README is a key concept that deserves more explanation. The README mentions it but the code doesn't really demonstrate it — FuncDef has body_text: String as a tracked field, but there's no separate "data" struct inside it. The pattern is: tracked struct holds an ID, a separate data struct holds the content. The current code puts the content directly in the tracked struct, which is the simpler approach. Either demonstrate the actual wrapper/data pattern or remove the reference.

[error] The parsing in parse_functions is extremely fragile — it uses simple string splitting to parse function name(params) body end. This is fine for a demo, but the line if let Some(close_paren) = after_paren.find(')') will break on nested parentheses in the body. A comment acknowledging this simplification would help.

[suggestion] This chapter drops the analisar parser entirely in favor of manual string parsing. The transition is jarring — chapters 2-3 built up the analisar-based AST, and now we're parsing by hand. A brief explanation of why (tracked structs want a different granularity than the file-level parse) would help the reader understand the architectural shift.


Chapter 5: Type Inference

[error] The infer_type function passes Expression (a large clone-heavy enum) and TypeEnv as query keys. Every recursive call clones the entire environment and expression tree. In a real Salsa program, this would be catastrophically slow for non-trivial programs. The code acknowledges this in the "IMPORTANT DESIGN NOTE" comment, but the comment says "the granularity is per-file" — this isn't quite right. The granularity is actually per-(source, expr, env) triple, which means different expressions get separate cache entries, but the env cloning makes this extremely expensive.

[clarity] The "IMPORTANT DESIGN NOTE" is good but buried in a comment. This should be in the README and the main text — it's a critical architectural limitation that the reader needs to understand.

[error] Expression::String was renamed to Expression::StringLiteral in this chapter (vs String in ch02-04). This is actually a good rename (avoids confusion with Rust's String type) but it's inconsistent with earlier chapters. Should be backported or noted.

[clarity] The Dynamic type explanation is excellent. "Dynamic is NOT the same as Error. Dynamic means 'the programmer chose not to annotate this' — it's intentional. Error means 'we found a contradiction' — it's a bug." This is the kind of precise distinction that makes a tutorial valuable.

[suggestion] The is_compatible_with method treats Error as compatible with everything. This means once you have one error, it can suppress other real errors (error cascading is hidden). A brief note about this design choice would be helpful.


Chapter 6: Diagnostics

[error] The Diagnostic::emit method calls self.accumulate(db). In Salsa 0.26, the accumulator API may be different — accumulate might not be a method on the accumulated value itself. This needs verification against the actual Salsa 0.26 accumulator API.

[clarity] The "Why Not Result<Type, Diagnostic>?" section in the code comments is excellent — it clearly explains the three-option tradeoff. This belongs in the README too.

[style] The code formatting in ch06 and ch07 is significantly compressed compared to earlier chapters (e.g., fn new() -> Self { TypeEnv { bindings: Vec::new() } } on one line). This makes the code harder to read, especially in a tutorial context. The earlier chapters' formatting was better.


Chapter 7: Language Server

[error] The LanguageServer struct holds db: Database (owned). But SourceFile::set_text requires &mut db, and after mutation, any SourceFile references obtained before the mutation are still valid (Salsa guarantees this). However, the diagnostics method takes &self, which means it calls type_check::accumulated with an immutable reference. This should work, but the pattern of holding owned Database in a struct while also holding SourceFile IDs is worth a note — it's a common pattern in Salsa but not immediately obvious.

[suggestion] The language server simulation doesn't demonstrate the key benefit: timing. Add std::time::Instant measurements to show that re-checking after a small edit is faster than the initial check. Without timing, the "Salsa skips work" claim is invisible to the reader.

[clarity] The architecture diagram in the code comments is great. Should be in the README.


Cross-Cutting Issues

[error] MASSIVE CODE DUPLICATION. The AST types (Statement, Expression, BinOp, UnaryOp), the convert_stmt/convert_expr functions, TypeEnv, and infer_type are copy-pasted across chapters 2-7 with minor variations. This is ~200 lines duplicated per chapter. While this makes each chapter self-contained (good), it means any bug found in one chapter exists in all of them. Consider: (a) extracting shared code into a library crate that chapters depend on, or (b) explicitly noting that this is intentional for self-containedness and that production code would share the definitions.

[error] The salsa::Update derive macro is used extensively but never explained. The reader has no idea what it does or why it's needed. A one-paragraph explanation (it enables Salsa to compare old and new values for fine-grained invalidation) would help.

[clarity] The README chapter links use .md extensions pointing to files that don't exist. The actual content is in chapters/chXX-name/README.md.

[suggestion] No chapter discusses how Salsa handles cycles. The README mentions "Cycle detection — recursive queries that would loop forever are caught" but no chapter demonstrates this. A brief example (even just in an appendix) would round out the coverage.

[style] The code quality degrades noticeably from ch05 onward — compressed formatting, longer lines, less whitespace. Earlier chapters are much more readable.


What's Working

  • The chapter progression is excellent — each one builds naturally on the last
  • The Salsa concepts are introduced at the right pace: inputs → tracked functions → interned → tracked structs → accumulators
  • The "key concepts" recaps at the end of each chapter are valuable
  • The spreadsheet analogy in ch1 is memorable and accurate
  • The gradual typing explanation in ch5 is precise and well-articulated
  • The accumulator pattern (sentinel return + accumulated diagnostic) is clearly explained
  • The per-file isolation demo in ch7 effectively shows Salsa's incremental value

Deep Dive Additions (2026-04-21 — Second Pass)

I re-read all 7 chapters' source code, compiled each one, and ran all binaries. The initial review raised API concerns about Salsa 0.26 and analisar 0.4 — these were verified:

  • salsa::Update derive macro exists in Salsa 0.26 ✓
  • The accumulator API type_check::accumulated::<Diagnostic>(&db, file) works as shown ✓
  • All chapters compile and run successfully ✓

However, the deeper code-level review found these new issues:

[error] Suffixed expression regression in ch03, ch05, ch06, ch07

Chapter 2's convert_expr handles ast::Expression::Suffixed completely: field access (a.b), index access (a[b]), and method calls (a:f()). But chapters 3, 5, 6, and 7 changed the match arm to:

#![allow(unused)]
fn main() {
ast::Expression::Suffixed(s) if !s.computed && !s.method => { ... }
}

The guard !s.computed && !s.method means:

  • a[b] (index expression, s.computed == true) → falls to _ => Expression::Nil
  • a:f() (method call, s.method == true) → falls to _ => Expression::Nil

This is a regression from Chapter 2. Index and method expressions are silently converted to Nil across 4 chapters. No comment explains this is intentional.

Fix: Restore the full Suffixed handling from Chapter 2, or add a comment explaining that index/method expressions are intentionally unhandled.

[error] convert_args drops table arguments silently

In convert_args, the ast::Args::Table(_) arm returns vec![]:

#![allow(unused)]
fn main() {
ast::Args::Table(_) => vec![], // Table args dropped silently
}

Function calls with table arguments like foo({a = 1}) are converted to zero arguments. For a tutorial showing Lua parsing, this is misleading — readers writing Lua code with table arguments will get broken behavior without explanation.

[error] BinOp::Concat returns Type::String unconditionally

In both ch05 (type inference) and ch06 (diagnostics), the concatenation operator returns:

#![allow(unused)]
fn main() {
BinOp::Concat => Type::String,
}

This returns Type::String regardless of whether the operands are actually strings. In a gradual type checker:

  • String .. StringType::String (correct)
  • Dynamic .. anything → should be Type::Dynamic (currently Type::String — wrong)
  • Number .. String → should be Type::Error (currently Type::String — wrong)

The current code has no type checking for .. — it's a blind shortcut. By contrast, BinOp::Add (and other arithmetic ops) in ch06 correctly checks type compatibility and emits diagnostics. BinOp::Concat should do the same.

Note: There's no test case exercising the .. operator, so this bug isn't caught by the asserts.

[error] Unused import / dead code warnings

  • ch03: use salsa::Setter; — unused import generates warning
  • ch03: let mut db = Database::default() — never calls setters, mut generates warning
  • ch07: fn display(&self) -> &str — dead code, never called, generates warning

These generate compiler warnings that would confuse readers following along.

[suggestion] No test case exercises the .. operator

None of the chapter tests use Lua's string concatenation operator (..). Adding a test like "hello" .. " world" and 42 .. "text" would verify the BinOp::Concat case and expose the bug described above.

[clarity] Per-file granularity claim is technically imprecise

Chapter 5's README says "the granularity is per-file" but the actual granularity is per-statementcheck_stmt is called for each statement in the AST, and infer_type is called for each expression within each statement. A change to any statement in the file invalidates all downstream queries for that file. The README should clarify: "granularity is per-file for now; tracked structs (ch04) would enable per-node granularity."

[suggestion] Add comment explaining per-file granularity limitation

The README says "per-file isolation" in Chapter 7 but the actual behavior is: when a file is edited, ALL statements in that file are re-checked because type_check calls check_stmt for each statement in sequence. Only other files' queries return cached values. The README should clarify that the per-file isolation means "other files are not re-checked" not "other statements in the same file are not re-checked."


Priority Fixes from Second Pass

  1. Fix suffixed expression regression — ch03, ch05, ch06, ch07 silently drop a[b] and a:f() expressions. Either restore full handling or add a comment explaining it's intentional.
  2. Fix BinOp::Concat type checking — should return Type::Dynamic when either operand is Dynamic, and Type::Error + emit diagnostic when both are incompatible concrete types.
  3. Clean up compiler warnings — remove unused Setter import in ch03, change let mut db to let db in ch03, remove dead display method in ch07.
  4. Add table argument handling or commentconvert_args silently drops table arguments in all chapters. Either handle them or explain the simplification.
  5. Add .. operator test case — currently untested, and the bug from #2 can't be caught without it.