Part 8 — Wiring It to the Web

The snake is trained. The weights are saved to a model file on disk. Right now, that file is useless — it’s numbers on disk. We need to load those numbers into a network, feed it a board state, and return a move over HTTP.

This part takes everything we’ve built — the server from Part 1, the encoder from Part 2, the network from Part 3, the trained weights from Parts 5–7 — and connects it. The snake goes live.

What we’re building

The server from Part 1 was simple: receive a move request, return a direction. The trained snake server is the same shape, but the decision comes from the network instead of a heuristic or a random pick.

The pipeline for each move request:

HTTP request (JSON)
    │
    ▼  deserialize
Board + my_snake
    │
    ▼  encode_board()
Tensor(8, 11, 11)
    │
    ▼  reshape + batch dim
Tensor(1, 968)
    │
    ▼  net.forward() → argmax
direction index
    │
    ▼  map to string
"up" / "down" / "left" / "right"

Every turn, the server has to finish this entire pipeline and get its answer back to the engine within the BattleSnake timeout, which is typically 500 milliseconds. On CPU, inference takes about 1ms for our small MLP. Plenty of headroom.
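The last two stages of the pipeline (argmax, then index to string) can be sketched without any tensor library. This is a minimal illustration, not the code from snake_ml: the argmax and direction_name helpers and the constant names are hypothetical, and the index order (up, down, left, right) is the one used later in this part.

```rust
/// Feature planes and board size used by the encoder in this series.
const PLANES: usize = 8;
const BOARD: usize = 11;

/// Length of the flattened input vector: 8 * 11 * 11 = 968.
const INPUT_LEN: usize = PLANES * BOARD * BOARD;

/// Index of the largest logit. NaN values are skipped; the first maximum wins on ties.
/// Returns None if the slice is empty or contains only NaN.
fn argmax(logits: &[f32]) -> Option<usize> {
    let mut best: Option<(usize, f32)> = None;
    for (i, &v) in logits.iter().enumerate() {
        if v.is_nan() {
            continue;
        }
        if best.map_or(true, |(_, b)| v > b) {
            best = Some((i, v));
        }
    }
    best.map(|(i, _)| i)
}

/// Map a direction index to the string the BattleSnake API expects.
fn direction_name(index: usize) -> &'static str {
    match index {
        0 => "up",
        1 => "down",
        2 => "left",
        3 => "right",
        _ => "up", // out-of-range index: fall back to a fixed default
    }
}
```

For example, logits of [0.1, 2.0, -1.0, 0.5] give index 1, which maps to "down".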

Loading the model at startup

The model loads once, when the server starts. We don’t want to reload the weights on every request — that would be slow and the server would miss its timeout.

use actix_web::{web, App, HttpServer, HttpResponse};
use burn::tensor::Tensor;
use burn::tensor::backend::Backend as BurnBackend;
use burn::tensor::activation::softmax;
use burn::module::Module;
use burn::record::{DefaultFileRecorder, FullPrecisionSettings, Recorder};
use snake_ml::{SnakeNet, encode_board, Board, Snake, Point};
use serde::{Deserialize, Serialize};
use std::sync::Mutex;

type Backend = burn::backend::Cpu;

// Shared model state — loaded once, used by every request
struct AppState {
    net: SnakeNet<Backend>,
    device: <Backend as BurnBackend>::Device,
}

#[derive(Deserialize)]
struct MoveRequest {
    board: Board,
    you: Snake,
    #[allow(dead_code)]
    game: GameInfo,
    #[allow(dead_code)]
    turn: u32,
}

#[derive(Deserialize)]
struct GameInfo {
    #[allow(dead_code)]
    id: String,
    #[allow(dead_code)]
    timeout: u32,
}

#[derive(Serialize)]
struct MoveResponse {
    #[serde(rename = "move")]
    direction: String,
    shout: Option<String>,
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    let model_path = std::env::args()
        .nth(1)
        .unwrap_or_else(|| "model-final".to_string());

    println!("Loading model from {model_path}...");

    let device = <Backend as burn::tensor::backend::Backend>::Device::default();
    let recorder = DefaultFileRecorder::<FullPrecisionSettings>::new();
    let record = recorder
        .load(model_path.into(), &device)
        .expect("failed to load model weights");
    let net = SnakeNet::new(&device).load_record(record);

    println!("Model loaded. Starting server on port 8080...");

    let state = web::Data::new(Mutex::new(AppState { net, device: device.clone() }));

    HttpServer::new(move || {
        App::new()
            .app_data(state.clone())
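            .route("/", web::get().to(info_handler))
            .route("/start", web::post().to(start_handler))
            .route("/move", web::post().to(move_handler))
            .route("/end", web::post().to(end_handler))
    })
    .bind(("0.0.0.0", 8080))?
    .run()
    .await
}

// GET /: the BattleSnake engine reads the snake's metadata and appearance from here
async fn info_handler() -> HttpResponse {
    HttpResponse::Ok().json(serde_json::json!({
        "apiversion": "1",
        "author": "your-name",
        "color": "#7B2D8E",
        "head": "default",
        "tail": "default"
    }))
}

// POST /start: the engine ignores the response body, so an empty 200 is enough
async fn start_handler() -> HttpResponse {
    HttpResponse::Ok().finish()
}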

async fn move_handler(
    state: web::Data<Mutex<AppState>>,
    req: web::Json<MoveRequest>,
) -> HttpResponse {
    let direction = decide_move(&state, &req.board, &req.you)
        .unwrap_or_else(|e| {
            eprintln!("Inference error: {e}");
            "up".to_string()
        });

    HttpResponse::Ok().json(MoveResponse {
        direction,
        shout: None,
    })
}

async fn end_handler() -> HttpResponse {
    HttpResponse::Ok().finish()
}

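The Mutex wraps the model so the shared state is safe to hand to Actix's worker threads. web::Data already provides the reference counting (it wraps an Arc), so the Mutex only adds mutual exclusion. Our network is read-only during inference, so nothing is ever mutated behind the lock. Each request holds it for a single forward pass, about 1ms, which makes contention negligible at BattleSnake request rates. A std::sync::RwLock would let concurrent requests read in parallel, but Mutex is correct and simple.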
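The sharing pattern itself can be tried with only the standard library. The sketch below is an illustration under assumptions: Model is a hypothetical stand-in for the loaded network, and plain threads stand in for Actix workers. Each "request" takes the lock, runs a read-only computation, and releases it.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the loaded network: a fixed weight vector.
struct Model {
    weights: Vec<f32>,
}

impl Model {
    /// Read-only "inference": a dot product. Takes &self, so weights never change.
    fn score(&self, input: &[f32]) -> f32 {
        self.weights.iter().zip(input).map(|(w, x)| w * x).sum()
    }
}

/// Run `n_threads` concurrent "requests" against one shared model and
/// collect the results in request order.
fn serve_concurrently(model: Model, n_threads: usize) -> Vec<f32> {
    let shared = Arc::new(Mutex::new(model));
    let handles: Vec<_> = (0..n_threads)
        .map(|i| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                // The guard is dropped at the end of the closure, releasing the lock.
                let guard = shared.lock().expect("lock poisoned");
                guard.score(&[i as f32, 1.0])
            })
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("thread panicked"))
        .collect()
}
```

Because score only needs &self, swapping Mutex for RwLock (and lock for read) would allow the requests to overlap; with a forward pass this short, the difference is unlikely to be measurable.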

Blocking in async handlers. Burn's tensor operations are CPU-bound and block the thread they run on. In an async runtime like Tokio (which Actix uses under the hood), blocking an IO worker thread stalls every other request scheduled on that thread until the blocking operation finishes. The fix: wrap the inference call in tokio::task::spawn_blocking, which moves it to a dedicated thread pool. For our small MLP, at about 1ms per inference, this is a minor concern, but it's the right pattern for production:

async fn move_handler(state: web::Data<Mutex<AppState>>, req: web::Json<MoveRequest>) -> HttpResponse {
    // The closure must own its data, so clone what it needs.
    // This assumes Board and Snake derive Clone.
    let board = req.board.clone();
    let my_snake = req.you.clone();
    let state = state.clone();

    // Run inference on the blocking thread pool
    let direction = tokio::task::spawn_blocking(move || {
        // Box<dyn Error> is not Send, so convert the error to a String
        // before it crosses the thread boundary.
        decide_move(&state, &board, &my_snake).map_err(|e| e.to_string())
    })
    .await
    .unwrap_or_else(|_| Err("task join failed".to_string()))
    .unwrap_or_else(|_| "up".to_string());

    HttpResponse::Ok().json(MoveResponse { direction, shout: None })
}

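Loading the model synchronously in main() before binding means the server accepts no connections until the weights are in memory. For a small model that is the simplest correct behavior, since no request can arrive before the network is ready. If startup latency matters, an alternative is to start the load in a spawn_blocking task, bind the server right away, and share the model through an Arc once it is ready. Handlers then need a fallback move for any request that arrives while the model is still loading.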
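One way to sketch the load-in-the-background idea with only the standard library is a write-once slot that the loader fills and the handlers read. Everything here is hypothetical scaffolding: Model, ModelSlot, start_loading, and decide are illustrative names, a plain thread stands in for spawn_blocking, and the "model" is a single number rather than real weights.

```rust
use std::sync::{Arc, OnceLock};
use std::thread;

/// Hypothetical stand-in for the loaded network.
struct Model {
    bias: i32,
}

/// A slot that is filled exactly once, when loading finishes.
type ModelSlot = Arc<OnceLock<Model>>;

/// Start loading on a background thread and return its handle.
fn start_loading(slot: ModelSlot) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        // Stand-in for reading the weights from disk.
        let model = Model { bias: 1 };
        // set() fails only if the slot is already filled; ignore that case.
        let _ = slot.set(model);
    })
}

/// Decide a move: use the model if it is loaded, otherwise a fixed fallback.
fn decide(slot: &ModelSlot) -> &'static str {
    match slot.get() {
        Some(model) if model.bias > 0 => "right",
        Some(_) => "left",
        None => "up", // model not loaded yet
    }
}
```

Before the loader finishes, decide returns the fallback "up"; afterwards it consults the model. In the real server the fallback would more sensibly be the survival check than a fixed direction.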

The decision function

This is the heart of the live snake. It’s the full pipeline in one function: encode → forward pass → pick a direction.

fn decide_move(
    state: &web::Data<Mutex<AppState>>,
    board: &Board,
    my_snake: &Snake,
) -> Result<String, Box<dyn std::error::Error>> {
    let app = state.lock().expect("state lock poisoned");

    // 1. Encode the board into a feature-plane tensor
    let tensor = encode_board::<Backend>(board, my_snake, &app.device);

    // 2. Flatten to (1, 968): batch dimension of 1 for a single board
    let flat: Tensor<Backend, 2> = tensor.reshape([1, 968]);

    // 3. Forward pass → logits → pick the best direction
    let best = app.net.pick_direction(flat).index();

    // 4. Map index to direction string
    let direction = match best {
        0 => "up",
        1 => "down",
        2 => "left",
        3 => "right",
        _ => "up", // shouldn't happen
    };

    Ok(direction.to_string())
}

That’s the entire inference pipeline. On CPU, this takes about 1ms. The BattleSnake timeout is typically 500ms, so roughly 499ms of the budget is left over for network latency and everything else.

Handling the edge case: what if the network picks a deadly move?

The network can pick a direction that walks into a wall or another snake. During training, that’s fine — the reward signal teaches it not to. During a live game, one bad move and the snake is dead.

The practical fix: validate the network’s choice. If it picks a direction that leads to immediate death, fall back to the first safe direction. This is the same survival check the heuristic used in Part 4, applied as a safety net.

use burn::tensor::activation::softmax;

fn decide_move_safe(
    state: &web::Data<Mutex<AppState>>,
    board: &Board,
    my_snake: &Snake,
) -> Result<String, Box<dyn std::error::Error>> {
    // Compute safe directions (not walls, not bodies) before taking the lock
    let safe = safe_directions(board, my_snake);

    if safe.is_empty() {
        // No safe move: we're going to die regardless.
        // decide_move takes the lock itself, so we must not hold it here
        // (locking a std Mutex twice on one thread deadlocks or panics).
        return decide_move(state, board, my_snake);
    }

    let app = state.lock().expect("state lock poisoned");

    // Get the network's preference (probability for each direction)
    let tensor = encode_board::<Backend>(board, my_snake, &app.device);
    let flat: Tensor<Backend, 2> = tensor.reshape([1, 968]);
    let logits = app.net.forward(flat);
    let probs: Tensor<Backend, 2> = softmax(logits, 1);

    // Rank directions by probability, then pick the highest-probability safe one
    let probs_vec: Vec<f32> = probs.to_data().to_vec::<f32>()?;
    let mut ranked: Vec<(usize, f32)> = probs_vec
        .iter()
        .enumerate()
        .map(|(i, &p)| (i, p))
        .collect();
    // total_cmp gives a total order, so a NaN output cannot panic the sort
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));

    let direction_names = ["up", "down", "left", "right"];

    for (idx, _prob) in ranked {
        if safe.contains(&idx) {
            return Ok(direction_names[idx].to_string());
        }
    }

    // Fall back to the first safe direction
    Ok(direction_names[safe[0]].to_string())
}

fn safe_directions(board: &Board, my_snake: &Snake) -> Vec<usize> {
    let head = &my_snake.head;
    let width = board.width as i32;
    let height = board.height as i32;

    // Collect all body positions (our own + enemies)
    let bodies: Vec<&Point> = board.snakes.iter()
        .flat_map(|s| s.body.iter())
        .collect();

    let mut safe = Vec::with_capacity(4);
    // (direction index, dx, dy); y grows upward on the BattleSnake board
    let candidates = [
        (0, 0, 1),   // up
        (1, 0, -1),  // down
        (2, -1, 0),  // left
        (3, 1, 0),   // right
    ];

    for (idx, dx, dy) in candidates {
        let nx = head.x + dx;
        let ny = head.y + dy;

        // Wall check
        if nx < 0 || nx >= width || ny < 0 || ny >= height {
            continue;
        }

        // Body check
        let hit_body = bodies.iter().any(|b| b.x == nx && b.y == ny);
        if hit_body {
            continue;
        }

        safe.push(idx);
    }

    safe
}

This is the “network with a safety net” pattern. The network makes the strategic decision. The safety net prevents tactical blunders — moves that kill the snake on the very next turn. Between the two, the snake plays well and doesn’t do obviously suicidal things.

This isn’t cheating. The network should learn to avoid walls and bodies on its own, because the reward signal punishes death. But a learned policy is only an approximation, and on any given turn it might rank a fatal move highest. The safety net catches the most obvious ones.
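To see the safety net override the network, here is a small worked case using only the standard library. It is a sketch under assumptions: Point, safe_mask, and best_safe are hypothetical stand-ins for the snake_ml types and the functions above, and the logits are made up. Since softmax preserves ordering, ranking raw logits picks the same direction as ranking probabilities.

```rust
/// Hypothetical minimal board coordinate (the real type lives in snake_ml).
#[derive(Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Direction deltas and names in the index order up, down, left, right.
const DELTAS: [(i32, i32); 4] = [(0, 1), (0, -1), (-1, 0), (1, 0)];
const NAMES: [&str; 4] = ["up", "down", "left", "right"];

/// For each direction, true if the next square is on the board and unoccupied.
fn safe_mask(head: Point, width: i32, height: i32, occupied: &[Point]) -> [bool; 4] {
    let mut mask = [false; 4];
    for (i, (dx, dy)) in DELTAS.iter().enumerate() {
        let next = Point { x: head.x + dx, y: head.y + dy };
        let in_bounds = next.x >= 0 && next.x < width && next.y >= 0 && next.y < height;
        mask[i] = in_bounds && !occupied.contains(&next);
    }
    mask
}

/// Highest-logit direction among the safe ones, or None if nothing is safe.
fn best_safe(logits: [f32; 4], mask: [bool; 4]) -> Option<&'static str> {
    (0..4)
        .filter(|&i| mask[i])
        .max_by(|&a, &b| logits[a].total_cmp(&logits[b]))
        .map(|i| NAMES[i])
}
```

With the head in the bottom-left corner (0, 0) of an 11×11 board and a body segment at (1, 0), only "up" is safe. If the network's logits are [0.1, 3.0, 0.2, 1.0], its top pick is "down", straight into the wall, and best_safe returns "up" instead.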

The GET / response

BattleSnake reads metadata about your snake from the response to a GET request on the server's root URL. The body returned from /start is ignored by the engine, so this JSON belongs on the root route:

{
  "apiversion": "1",
  "author": "your-name",
  "color": "#7B2D8E",
  "head": "smart-caterpillar",
  "tail": "pixel",
  "version": "0.1.0"
}

The color, head, and tail fields control the snake’s appearance on the game board. They’re cosmetic, but picking a distinctive color makes it easier to spot your snake during playback.

Running the live snake

Build and run with the path to your trained model:

cargo run -- model-final

The server starts on port 8080. For local testing, tunnel it so the BattleSnake engine can reach it:

# Using cloudflared
cloudflared tunnel --url http://localhost:8080

# Or using ngrok
ngrok http 8080

Then go to play.battlesnake.com, register your snake with your tunnel URL, and start a game. Watch it play.

What to look for

A trained snake should:

  • Navigate toward food consistently, not only when it happens to be in the right direction
  • Avoid walls and its own body even in tight spaces
  • React to the opponent — not walk straight into the enemy’s body
  • Survive past turn 10 — a random snake typically dies within 5–10 turns on an 11×11 board

A well-trained self-play snake should also:

  • Control space — stay near the center when there’s no food pressure
  • Win head-to-head — avoid walking into a larger snake, but challenge a smaller one
  • Manage health — not starve because it was doing something else

If the snake is dying on turn 3, the model weights are the problem — either training didn’t converge, or the reward signal isn’t working. Go back to training, check the loss and win rate curves, and make sure the network is learning at all before worrying about strategy.

Monitoring in production

BattleSnake games happen in real time. If something goes wrong — the server is slow, the model produces garbage, the encoding is wrong — you need to know.

Add logging to the move handler:

async fn move_handler(
    state: web::Data<Mutex<AppState>>,
    req: web::Json<MoveRequest>,
) -> HttpResponse {
    let turn = req.turn;
    let health = req.you.health;

    let direction = decide_move_safe(&state, &req.board, &req.you)
        .unwrap_or_else(|e| {
            eprintln!("Turn {turn}: inference error: {e}");
            "up".to_string()
        });

    println!("Turn {turn}: health={health} → {direction}");

    HttpResponse::Ok().json(MoveResponse {
        direction,
        shout: None,
    })
}

The log line gives you the turn number, the snake’s health, and the chosen direction. If you see health=0 → up, the snake starved. If you see the same direction repeated every turn, the network is stuck. If you see inference error, something is wrong with the model or the encoding.
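The log line above does not reveal a slow server, which is the first failure mode listed. A minimal sketch of per-move timing with std::time::Instant follows. The timed and log_line helpers are hypothetical, and the 400ms BUDGET is an assumed threshold chosen to leave headroom under the 500ms timeout, not a value from the BattleSnake API.

```rust
use std::time::{Duration, Instant};

/// Assumed warning threshold: below the 500ms timeout to allow for network latency.
const BUDGET: Duration = Duration::from_millis(400);

/// Run `decide` and return its result together with how long it took.
fn timed<T>(decide: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = decide();
    (out, start.elapsed())
}

/// Format one log line, flagging turns that exceeded the budget.
fn log_line(turn: u32, health: u32, direction: &str, elapsed: Duration) -> String {
    let flag = if elapsed > BUDGET { " SLOW" } else { "" };
    format!(
        "Turn {turn}: health={health} -> {direction} ({}ms){flag}",
        elapsed.as_millis()
    )
}
```

In the handler, the call to decide_move_safe would go inside the closure passed to timed, and the returned Duration would feed log_line. Searching the logs for SLOW then shows which turns came close to the timeout.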

The complete server implementation

The full working server is in snake-server/src/main.rs. It includes:

  • A* pathfinding to the nearest food (pure Rust, BinaryHeap + HashMap, no external crates)
  • Survival check — immediately safe directions checked before A* runs
  • Fallback chain — A* failure drops to the first safe direction, then "up"
  • spawn_blocking wrapping on every request — the decision function runs on the blocking thread pool, not the async IO threads

You can run it right now without any trained weights. The A* heuristic is competitive enough to play a decent game against a random opponent on an 11×11 board.

Using the model: when you have trained weights, load them into a SnakeNet via the recorder pattern (DefaultFileRecorder, load, load_record) at startup and add the forward pass to decide(). The fallback chain means the server still works if the model is wrong or missing.
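The fallback chain can be written as a chain of Option combinators. This is a sketch of the idea, not the code in snake-server/src/main.rs: choose_move is a hypothetical name, and putting the model's pick ahead of A* is an assumption about how the forward pass would slot into decide().

```rust
/// Pick a move by trying, in order: the model's choice, the A* choice,
/// the first safe direction, and finally a fixed "up".
/// Directions are indices in the order up, down, left, right.
fn choose_move(
    model_pick: Option<usize>,
    astar_pick: Option<usize>,
    safe: &[usize],
) -> &'static str {
    const NAMES: [&str; 4] = ["up", "down", "left", "right"];

    model_pick
        // Accept the model's pick only if it passes the survival check.
        .filter(|d| safe.contains(d))
        // Otherwise try A*, under the same check.
        .or_else(|| astar_pick.filter(|d| safe.contains(d)))
        // Otherwise take any safe direction.
        .or_else(|| safe.first().copied())
        // Map the index to a name; out-of-range indices fall through to "up".
        .and_then(|d| NAMES.get(d).copied())
        .unwrap_or("up")
}
```

A missing model is just model_pick being None, so the same function covers the server with and without trained weights.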

Part 9 covers scaling up: GPU training, convolutional architectures, and what to try next if you want a genuinely competitive snake.

Previous: Part 7 — Self-Play Training · Next: Part 9 — Scaling Up