Chapter 1: Hello Salsa — Inputs and Tracked Functions

This is your first Salsa program. No Lua yet — just the core concepts.

What You'll Learn

  • What a Salsa database is and why you need one
  • How #[salsa::input] defines the roots of your computation
  • How #[salsa::tracked] defines derived (cached) queries
  • What revisions are and how Salsa decides what to re-run
  • Why this matters for a type checker

The Mental Model

Think of Salsa as a spreadsheet:

  • Inputs are the cells you type into (source code, config)
  • Tracked functions are the formulas (=A1+B1)
  • Revisions are the successive states of the sheet: every edit to a cell starts a new one

When you change an input, only the formulas that depend on it re-evaluate. The rest return their cached values instantly.

But there's a difference. A spreadsheet tracks dependencies at the cell level. Salsa tracks at the query level. A tracked function that reads source.text(db) is recorded as depending on that specific input. If the input changes in a new revision, the function re-runs. If it doesn't change, the cached result comes back for free.

Why does this matter? Because in a real type checker, you might have hundreds of source files and thousands of queries. You can't recompute everything on every keystroke. Salsa gives you the infrastructure to skip work — automatically.

Step 1: Define Your Input

In Salsa, inputs are the facts that come from outside the system. For a type checker, the obvious input is: the source code.

#[salsa::input]
pub struct SourceFile {
    #[returns(ref)]
    pub path: String,
    #[returns(ref)]
    pub text: String,
}

#[salsa::input] does a lot of work behind the scenes. It generates:

  • A constructor: SourceFile::new(&db, path, text)
  • Getter methods: source.path(db), source.text(db)
  • Setter methods: source.set_text(&mut db).to(new_text)

The #[returns(ref)] attribute tells Salsa that the getter should return a reference (&String) instead of an owned value. This avoids cloning strings every time you read an input.

Key idea: Inputs are the only way data enters the system. Everything else is derived from them. This is what makes Salsa's incrementality work — if you can trace every piece of data back to an input, you know exactly what needs to re-run when an input changes.
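To make the generated constructor, getters, and setter less abstract, here is a hand-written, std-only analogue. It is a sketch of the idea, not the code Salsa generates: ToyDb, FileData, and ToySourceFile are hypothetical names. The toy constructor also takes &mut db, whereas the real SourceFile::new takes &db. The point it illustrates is that the input struct is a small copyable handle and the data lives inside the database.

```rust
// Toy stand-in for a Salsa database: it owns the data behind every input.
// All names in this sketch are hypothetical, not Salsa APIs.
#[derive(Default)]
pub struct ToyDb {
    files: Vec<FileData>,
    revision: u64,
}

struct FileData {
    path: String,
    text: String,
}

impl ToyDb {
    /// Number of input writes seen so far.
    pub fn revision(&self) -> u64 {
        self.revision
    }
}

// The handle is only an index, so it is cheap to copy and pass around.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ToySourceFile(usize);

impl ToySourceFile {
    // Analogue of the generated constructor.
    pub fn new(db: &mut ToyDb, path: String, text: String) -> Self {
        db.files.push(FileData { path, text });
        ToySourceFile(db.files.len() - 1)
    }

    // Analogue of a getter that returns a reference: nothing is cloned.
    pub fn path(self, db: &ToyDb) -> &str {
        &db.files[self.0].path
    }

    pub fn text(self, db: &ToyDb) -> &str {
        &db.files[self.0].text
    }

    // Analogue of the setter: writing needs `&mut` access to the database
    // and starts a new revision.
    pub fn set_text(self, db: &mut ToyDb, text: String) {
        db.files[self.0].text = text;
        db.revision += 1;
    }
}
```

Because every write needs &mut access to the database, no reader can hold a reference into it while an input changes. The real setter's &mut db parameter gives the same guarantee.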

Step 2: Define a Tracked Function

A #[salsa::tracked] function is a pure function of its inputs. Pure means: no side effects, no reading from global state, no randomness. Everything goes through the database.

#[salsa::tracked]
pub fn line_count(db: &dyn salsa::Database, source: SourceFile) -> u32 {
    let text = source.text(db);
    text.lines().count() as u32
}

The first argument is always the database (we'll refine this in later chapters). The remaining arguments are the "keys" — what distinguishes one call from another.

When you call line_count(&db, source), Salsa does this:

  1. Check: is there a cached result for this query with these arguments that is still valid in the current revision?
  2. If yes → return the cached result. No re-execution.
  3. If no → run the function, cache the result, return it.

The function body calls source.text(db). This read is tracked. Salsa records: "line_count(source) depends on source.text." Later, if you change the source text, Salsa knows this cache entry is stale.
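The lookup steps and the dependency recording described above can be modeled in a few lines of plain Rust. This is a toy under stated assumptions, not Salsa's implementation: ToyDb, its fields, and the dependencies helper are hypothetical, files are identified by a plain index, and revisions are left out entirely.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashMap;

// Toy database: input texts plus a memo table for one query.
// All names are hypothetical; Salsa's real storage is more elaborate.
pub struct ToyDb {
    texts: Vec<String>,
    // Memo table: query argument -> (cached value, inputs read while computing it).
    memos: RefCell<HashMap<usize, (u32, Vec<usize>)>>,
    // Inputs read by the query body that is currently executing.
    reads: RefCell<Vec<usize>>,
    // How many times the query body actually ran.
    pub executions: Cell<u32>,
}

impl ToyDb {
    pub fn new(texts: Vec<String>) -> Self {
        ToyDb {
            texts,
            memos: RefCell::new(HashMap::new()),
            reads: RefCell::new(Vec::new()),
            executions: Cell::new(0),
        }
    }

    // Every input read goes through the database, so it can be logged.
    fn text(&self, file: usize) -> &str {
        self.reads.borrow_mut().push(file);
        &self.texts[file]
    }

    // The inputs recorded for a cached `line_count(file)` entry.
    pub fn dependencies(&self, file: usize) -> Vec<usize> {
        self.memos
            .borrow()
            .get(&file)
            .map(|(_, deps)| deps.clone())
            .unwrap_or_default()
    }
}

pub fn line_count(db: &ToyDb, file: usize) -> u32 {
    // Steps 1 and 2: a hit in the memo table returns the cached value.
    let cached = db.memos.borrow().get(&file).map(|(value, _)| *value);
    if let Some(value) = cached {
        return value;
    }
    // Step 3: run the body, record what it read, cache the result.
    db.reads.borrow_mut().clear();
    db.executions.set(db.executions.get() + 1);
    let value = db.text(file).lines().count() as u32;
    let deps = db.reads.borrow().clone();
    db.memos.borrow_mut().insert(file, (value, deps));
    value
}
```

Calling line_count twice with the same file runs the body once. The recorded dependency list is what a real system would consult later to decide whether the entry is stale.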

Let's add one more query to see how dependencies work:

#[salsa::tracked]
pub fn contains_text(db: &dyn salsa::Database, source: SourceFile, needle: String) -> bool {
    let text = source.text(db);
    text.contains(&needle)
}

Same pattern: read an input, compute a result, cache it. The needle parameter becomes part of the cache key, so contains_text(db, source, "print".to_string()) and contains_text(db, source, "local".to_string()) are independent cache entries.
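A minimal way to picture "arguments are the cache key" is a hash map keyed by the whole argument tuple. The sketch below is std-only and hypothetical (ContainsCache and entry_count are illustration names, and files are plain indices into a slice). It only shows that different needles never share an entry.

```rust
use std::collections::HashMap;

// Hypothetical memo table for a two-argument query. The key is the full
// argument tuple, so each (file, needle) pair gets its own entry.
#[derive(Default)]
pub struct ContainsCache {
    entries: HashMap<(usize, String), bool>,
    // How many times the search actually ran.
    pub executions: u32,
}

impl ContainsCache {
    pub fn contains_text(&mut self, texts: &[&str], file: usize, needle: &str) -> bool {
        let key = (file, needle.to_string());
        if let Some(&hit) = self.entries.get(&key) {
            return hit; // same file and same needle: reuse the cached answer
        }
        self.executions += 1;
        let result = texts[file].contains(needle);
        self.entries.insert(key, result);
        result
    }

    pub fn entry_count(&self) -> usize {
        self.entries.len()
    }
}
```

Asking about "print" and then "local" creates two entries and runs the search twice. Asking about "print" again is a lookup only.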

Step 3: Define Your Database

A database is the container that holds all the cached query results.

#[salsa::db]
#[derive(Default)]
pub struct Database {
    storage: salsa::Storage<Self>,
}

#[salsa::db]
impl salsa::Database for Database {}

You need three things:

  1. A salsa::Storage<Self> field — this is where Salsa keeps its memo tables, revision counters, and dependency graphs.
  2. An impl of salsa::Database — marked with #[salsa::db].
  3. The same #[salsa::db] attribute on the struct itself.

Right now our database is empty — it doesn't have any custom behavior. In later chapters, we'll add methods and custom traits. For now, it's just a container.

Step 4: Use It

fn main() {
    let mut db = Database::default();

    let source = SourceFile::new(
        &db,
        "main.lua".to_string(),
        "local x = 1\nlocal y = 2\nprint(x + y)\n".to_string(),
    );

    let count = line_count(&db, source);
    assert_eq!(count, 3);
}

Nothing magical yet — we create a database, create an input, and query it. The result is computed and cached.

The Magic: Revisions

Now let's change the input and see what happens:

// Continues inside main() from Step 4.
use salsa::Setter; // the trait that provides `.to(...)` on the setter

source.set_text(&mut db).to("local z = 99\n".to_string());

let new_count = line_count(&db, source);
assert_eq!(new_count, 1);

When we call set_text, Salsa increments its revision counter. This is Salsa's internal clock — every input mutation bumps the revision. Each cached query remembers which revision it was computed in.

When we query line_count again, Salsa checks: "Is the current revision newer than when I last computed this?" Yes. "Did the inputs this query depends on actually change?" Yes — we set new text. So it re-runs the function.

If we had queried line_count without changing the text, Salsa would return the cached result instantly. No re-execution.
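The revision check can be sketched with two counters: the revision in which the input last changed, and the revision in which the cached value was last known to be good. This is a std-only toy with one input and one query. The field names changed_at and verified_at are illustrative and make no claim about Salsa's internal layout.

```rust
// Toy model of revision checking for a single input and a single query.
pub struct ToyDb {
    pub revision: u64,    // global clock, bumped by every input write
    text: String,         // the one input
    text_changed_at: u64, // revision in which the input last changed
    memo: Option<Memo>,
    pub executions: u32,  // how many times the query body ran
}

struct Memo {
    value: u32,
    verified_at: u64, // last revision in which this value was known to be valid
}

impl ToyDb {
    pub fn new(text: &str) -> Self {
        ToyDb {
            revision: 1,
            text: text.to_string(),
            text_changed_at: 1,
            memo: None,
            executions: 0,
        }
    }

    pub fn set_text(&mut self, text: &str) {
        self.revision += 1;
        self.text = text.to_string();
        self.text_changed_at = self.revision;
    }

    pub fn line_count(&mut self) -> u32 {
        if let Some(memo) = &mut self.memo {
            // The memo is still good if the input has not changed since
            // the memo was last verified.
            if self.text_changed_at <= memo.verified_at {
                memo.verified_at = self.revision;
                return memo.value;
            }
        }
        // No memo yet, or the input changed: run the body and cache the result.
        self.executions += 1;
        let value = self.text.lines().count() as u32;
        self.memo = Some(Memo { value, verified_at: self.revision });
        value
    }
}
```

Querying twice without a write leaves executions at 1. A set_text call moves the clock forward, so the next query recomputes.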

Per-Input Isolation

Here's the key insight that makes Salsa work at IDE scale:

let other = SourceFile::new(&db, "other.lua".to_string(), "return 42\n".to_string());
let other_count = line_count(&db, other); // computed, cached

source.set_text(&mut db).to("local a = 1\nlocal b = 2\n".to_string());

let other_count_again = line_count(&db, other); // cached! No re-run!
assert_eq!(other_count_again, 1);

We changed source, not other. Salsa knows line_count(other) doesn't depend on source's text. So it returns the cached value for other without re-running anything.

In a real type checker with hundreds of files, typing in one file only invalidates queries that read that file. Queries for other files are still cached. This is how rust-analyzer, which is built on Salsa, stays responsive on large projects: it doesn't re-type-check the whole world on every keystroke.
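Isolation follows from keeping a "last changed" revision per input rather than one global dirty flag. The std-only toy below assumes plain index handles and hypothetical names (ToyDb, add_file, runs). A write to one file moves the global clock, yet the memo for the other file still validates because its own input did not change.

```rust
use std::collections::HashMap;

struct Input {
    text: String,
    changed_at: u64, // revision in which this particular input last changed
}

struct Memo {
    value: u32,
    verified_at: u64, // last revision in which the value was known to be valid
}

// Hypothetical toy database; files are identified by their index.
#[derive(Default)]
pub struct ToyDb {
    revision: u64,
    inputs: Vec<Input>,
    memos: HashMap<usize, Memo>,
    executions: HashMap<usize, u32>,
}

impl ToyDb {
    pub fn add_file(&mut self, text: &str) -> usize {
        self.inputs.push(Input { text: text.to_string(), changed_at: self.revision });
        self.inputs.len() - 1
    }

    pub fn set_text(&mut self, file: usize, text: &str) {
        self.revision += 1; // the clock is global...
        self.inputs[file] = Input { text: text.to_string(), changed_at: self.revision };
        // ...but only this one input is stamped as changed.
    }

    pub fn line_count(&mut self, file: usize) -> u32 {
        let changed_at = self.inputs[file].changed_at;
        if let Some(memo) = self.memos.get_mut(&file) {
            if changed_at <= memo.verified_at {
                memo.verified_at = self.revision; // still valid in the new revision
                return memo.value;
            }
        }
        *self.executions.entry(file).or_insert(0) += 1;
        let value = self.inputs[file].text.lines().count() as u32;
        self.memos.insert(file, Memo { value, verified_at: self.revision });
        value
    }

    // How many times the query body ran for `file`.
    pub fn runs(&self, file: usize) -> u32 {
        self.executions.get(&file).copied().unwrap_or(0)
    }
}
```

After editing the first file, querying the second returns its cached count with runs still at 1. Only the edited file's query executes again.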

Running

cargo run --bin ch01-hello-salsa

Key Takeaways

  1. Inputs are the source of truth. They come from outside (files, user input). Setting an input creates a new revision.

  2. Tracked functions are pure. No side effects, no reading from global state. Everything goes through the database.

  3. Incrementality is automatic. You don't write cache invalidation logic. Salsa does it by tracking which inputs each query reads.

  4. Per-input isolation. Changing one file doesn't invalidate queries for other files. This is the key insight that makes IDE-scale projects feasible.

  5. The dependency graph is implicit. You don't declare "query A depends on query B." Salsa infers it from what each query reads through the database. As long as tracked functions stay pure, the graph can't get out of sync with the code.

What's Next

Chapter 2: Parsing Lua with Analisar — We'll parse actual Lua source code using the analisar parser and wire it into Salsa as a tracked query. Same incremental model, but now doing real work.