Part 4 — The Heuristic Baseline
Here’s the situation: we have a neural network that takes a board encoding and outputs move probabilities. But the weights are random. It’s useless.
Before we can train the network to play well, we need a teacher — a snake that makes smart enough decisions to generate training data. That’s what we build in this part.
The teacher doesn’t have to be brilliant. It needs to be better than random. We’ll use a simple rule-based snake that:
- Never moves into a wall, itself, or another snake (survival)
- Uses A* pathfinding to find a path to the nearest food
This becomes the “ground truth” for imitation learning in Part 5.
A* pathfinding
A* (pronounced “A-star”) is a graph search algorithm that finds the shortest path between two points. It’s the same algorithm used in video games for character navigation and in GPS systems for route planning.
The core idea: start at the current position, explore neighbors by priority, where priority is cost so far + estimated distance to goal. Always expand the lowest-priority node next.
open_set: priority queue sorted by (g + h)
closed_set: positions we've already explored
start at current position
while open_set not empty:
pop the position with lowest priority
if it's the goal: reconstruct path, done
for each neighbor:
if neighbor not in closed_set and not blocked:
tentative_g = g(current) + 1
if this path to neighbor is better than any previous:
update parent, update g score
push neighbor to open_set
The heuristic h (estimated distance to goal) is what makes A* admissible — it never overestimates, so the solution it finds is guaranteed to be optimal. For a grid, Manhattan distance works: |x1 - x2| + |y1 - y2|.
Here’s the implementation in pure Rust, no external crates:
#![allow(unused)]
fn main() {
use std::collections::{BinaryHeap, HashMap, HashSet};
use std::cmp::Ordering;
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Pos {
pub x: i32,
pub y: i32,
}
impl Pos {
pub fn manhattan(&self, other: &Pos) -> i32 {
(self.x - other.x).abs() + (self.y - other.y).abs()
}
}
// Node in the A* search
#[derive(Clone)]
struct Node {
pos: Pos,
g: i32, // cost from start
f: i32, // g + h
}
impl PartialEq for Node {
fn eq(&self, other: &Self) -> bool { self.f == other.f }
}
impl Eq for Node {}
impl Ord for Node {
fn cmp(&self, other: &Self) -> Ordering {
// BinaryHeap is a max-heap; we want min-f, so flip the comparison
other.f.cmp(&self.f)
}
}
impl PartialOrd for Node {
fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
Some(self.cmp(other))
}
}
/// A* pathfinder for a rectangular grid.
/// Returns the first step toward the goal, or None if no path exists.
pub struct Astar {
width: i32,
height: i32,
blocked: HashSet<Pos>,
}
impl Astar {
pub fn new(width: i32, height: i32) -> Self {
Self {
width,
height,
blocked: HashSet::new(),
}
}
/// Mark a position as impassable (walls, other snakes' bodies)
pub fn block(&mut self, pos: Pos) {
self.blocked.insert(pos);
}
/// Find the first step toward `goal` from `start`.
/// Returns `None` if no path exists.
pub fn step(&self, start: &Pos, goal: &Pos) -> Option<Pos> {
let mut open_set = BinaryHeap::new();
let mut came_from: HashMap<Pos, Pos> = HashMap::new();
let mut g_score: HashMap<Pos, i32> = HashMap::new();
let mut closed: HashSet<Pos> = HashSet::new();
g_score.insert(start.clone(), 0);
open_set.push(Node {
pos: start.clone(),
g: 0,
f: start.manhattan(goal),
});
while let Some(Node { pos, g, f: _ }) = open_set.pop() {
if pos == *goal {
// Reconstruct: find the step right after `start`
return self.reconstruct_first_step(&came_from, start, goal);
}
if closed.contains(&pos) {
continue;
}
closed.insert(pos.clone());
for neighbor in self.neighbors(&pos) {
if closed.contains(&neighbor) || self.blocked.contains(&neighbor) {
continue;
}
let tentative_g = g + 1;
let is_better = g_score
.get(&neighbor)
.map(|current| tentative_g < *current)
.unwrap_or(true);
if is_better {
came_from.insert(neighbor.clone(), pos.clone());
g_score.insert(neighbor.clone(), tentative_g);
open_set.push(Node {
pos: neighbor,
g: tentative_g,
f: tentative_g + neighbor.manhattan(goal),
});
}
}
}
None // no path found
}
fn neighbors(&self, pos: &Pos) -> Vec<Pos> {
let mut n = Vec::with_capacity(4);
if pos.y + 1 < self.height { n.push(Pos { x: pos.x, y: pos.y + 1 }); } // up
if pos.y - 1 >= 0 { n.push(Pos { x: pos.x, y: pos.y - 1 }); } // down
if pos.x - 1 >= 0 { n.push(Pos { x: pos.x - 1, y: pos.y }); } // left
if pos.x + 1 < self.width { n.push(Pos { x: pos.x + 1, y: pos.y }); } // right
n
}
fn reconstruct_first_step(&self, came_from: &HashMap<Pos, Pos>, start: &Pos, goal: &Pos) -> Option<Pos> {
// Walk backward from goal to start, then return the step after start
let mut current = goal.clone();
while let Some(parent) = came_from.get(¤t) {
if parent == start {
return Some(current);
}
current = parent.clone();
}
None
}
}
}
Note that we store blocked positions inside Astar. In practice you’ll create a fresh Astar for each query and populate it from the current game state. That’s fine — the search space is small (11×11 = 121 nodes).
Survival rules
One subtle edge case: when a snake moves, its tail vacates a square before the head arrives there. If you’re checking safety against the current board state, you might incorrectly think the tail is still occupying its current square. To handle this correctly, the board representation needs a
next_state()method that simulates the next tick — or equivalently, treat the last segment of each snake’s body as not blocking for the current move (it’ll vacate). The survival rule insafe_directionsbelow doesn’t need this because it only looks at walls and other snakes’ bodies directly; the game loop handles tail movement when it updates the board after each tick.
A* gets us to food, but it’s not enough on its own. The snake needs to also avoid immediately dying. Before following the A* path, we check which directions are immediately safe:
#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Direction { Up, Down, Left, Right }
impl Direction {
pub fn delta(&self) -> (i32, i32) {
match self {
Self::Up => (0, 1),
Self::Down => (0, -1),
Self::Left => (-1, 0),
Self::Right => (1, 0),
}
}
}
pub struct HeuristicSnake {
width: i32,
height: i32,
}
impl HeuristicSnake {
pub fn new(width: i32, height: i32) -> Self {
Self { width, height }
}
/// Pick the best direction using survival + A* pathfinding to nearest food
pub fn decide(&self, my_snake: &Snake, board: &Board) -> Direction {
let my_head = Pos { x: my_snake.head.x, y: my_snake.head.y };
// 1. Find immediate safe moves (not walls, not bodies, not enemy heads)
let safe = self.safe_directions(&my_head, board, my_snake);
// 2. If no safe moves, we have to pick something (we'll die)
if safe.is_empty() {
return Direction::Up;
}
// 3. Find nearest food
let nearest_food = board.food.iter()
.map(|f| Pos { x: f.x, y: f.y })
.min_by_key(|f| my_head.manhattan(f))
.unwrap_or(Pos { x: self.width / 2, y: self.height / 2 });
// 4. Build A* pathfinder with current snake bodies blocked
let mut astar = Astar::new(self.width, self.height);
// Block all body squares (including our own)
for snake in &board.snakes {
for segment in &snake.body {
astar.block(Pos { x: segment.x, y: segment.y });
}
}
// 5. Find A* path to food
let target = astar.step(&my_head, &nearest_food);
// 6. Choose the safe direction that gets us closest to food
let best = target.and_then(|goal| {
// Direction that moves toward the first step
self.step_toward(&my_head, &goal)
});
match best {
Some(dir) if safe.contains(&dir) => dir,
_ => {
// Fall back to the first safe direction
safe.into_iter().next().unwrap()
}
}
}
fn safe_directions(&self, head: &Pos, board: &Board, my_snake: &Snake) -> Vec<Direction> {
let mut safe = Vec::with_capacity(4);
for dir in [Direction::Up, Direction::Down, Direction::Left, Direction::Right] {
let (dx, dy) = dir.delta();
let next = Pos { x: head.x + dx, y: head.y + dy };
// Out of bounds
if next.x < 0 || next.x >= self.width || next.y < 0 || next.y >= self.height {
continue;
}
// Own body (any body segment)
if my_snake.body.iter().any(|s| s.x == next.x && s.y == next.y) {
continue;
}
// Any enemy body
let body_hit = board.snakes.iter()
.filter(|s| s.id != my_snake.id)
.any(|s| s.body.iter().any(|b| b.x == next.x && b.y == next.y));
if body_hit {
continue;
}
safe.push(dir);
}
safe
}
fn step_toward(&self, from: &Pos, goal: &Pos) -> Option<Direction> {
if goal.x > from.x { Some(Direction::Right) }
else if goal.x < from.x { Some(Direction::Left) }
else if goal.y > from.y { Some(Direction::Up) }
else if goal.y < from.y { Some(Direction::Down) }
else { None }
}
}
}
Notice what we’re blocking: body segments (not only the head). In BattleSnake, you die if you move into any snake’s body. Enemy heads are passable (you can move into the square in front of an enemy head — the restriction is the neck, not the head itself).
This heuristic is competitive. It will beat a random snake every time. It knows where food is, it avoids dying immediately, and it plans paths rather than stumbling toward food.
What it’s missing
The heuristic is good, but it’s not perfect. It has blind spots:
- Head-to-head collisions: If two snakes are moving toward each other, both might see the space in front of the opponent’s head as valid. A* doesn’t know about this.
- Long snakes closing off paths: A* finds a path to food, but the path might go through an area that fills up with body segments on future turns.
- Health management: It always goes for the nearest food, regardless of health. If you’re at full health and there’s food nearby, wandering toward food might be a waste of a turn.
Part 5 uses this heuristic to generate training data anyway. It’s good enough to teach the network something useful. And the network, trained on this data, will learn to avoid some of the heuristic’s blind spots by seeing patterns the heuristic couldn’t.
Wiring it into the server
Back in snake-server/src/main.rs, swap out the wall-check for the heuristic:
#![allow(unused)]
fn main() {
use snake_server::HeuristicSnake;
async fn move_handler(req: web::Json<MoveRequest>) -> HttpResponse {
let snake = HeuristicSnake::new(req.board.width, req.board.height);
let direction = snake.decide(&req.you, &req.board);
HttpResponse::Ok().json(MoveResponse {
direction: direction.as_str().to_string(),
})
}
}
Run it, point it at BattleSnake, and watch it play. It won’t be world-class, but it’ll survive longer than anything random.
That’s the teacher. Next: we use it to generate training data and train the network to copy it.
Previous: Part 3 — Your First Network · Next: Part 5 — Imitation Learning