Example of presenting algorithm in meta-language DataScript (.ds3)
Introduction to the A* Algorithm
A* search algorithm - Wikipedia
A* search algorithm - Rosetta Code
A* Search Algorithm — Tutorial
What Is A*?
A* (pronounced "A-star") finds the shortest path between two points in a graph or grid.
Invented in 1968 by Peter Hart, Nils Nilsson, and Bertram Raphael at Stanford Research Institute.
It's the standard pathfinding algorithm in games, robotics, and navigation.
The Family of Search Algorithms
A* doesn't exist in isolation — it's the final step in a natural progression:
1. Breadth-First Search (BFS)
Explores in all directions equally. Like dropping a stone in water — the frontier
expands as a ring. Finds shortest path when all moves cost the same.
frontier: int[] = [] # FIFO — first in, first out
ds_push(frontier, start)
came_from[start] = -1
while len(frontier) > 0
current := frontier[0]
# shift front (dequeue)
for i in 0..<len(frontier) - 1
frontier[i] = frontier[i + 1]
ds_pop(frontier)
for d in 0..<8
nx := current_x + DX[d]
ny := current_y + DY[d]
ni := to_index(nx, ny)
if came_from[ni] == -2 # not visited
ds_push(frontier, ni)
came_from[ni] = current
- Guarantees: shortest path (uniform cost)
- Problem: explores everything — slow when you know where the goal is
2. Dijkstra's Algorithm (1956)
Like BFS, but handles different movement costs (roads vs mud vs mountains).
Uses a priority queue ordered by actual cost from start.
# open set as Node[] with f = cost_so_far (no heuristic)
open: Node[] = []
ds_push(open, Node { x: sx, y: sy, f: 0 })
cost_so_far[to_index(sx, sy)] = 0
while len(open) > 0
# find node with lowest f (cost_so_far)
best := 0
for bi in 1..<len(open)
if open[bi].f < open[best].f
best = bi
cur := open[best]
open[best] = open[len(open) - 1]
ds_pop(open)
ci := to_index(cur.x, cur.y)
for d in 0..<8
nx := cur.x + DX[d]
ny := cur.y + DY[d]
ni := to_index(nx, ny)
new_cost := cost_so_far[ci] + move_cost(ci, ni)
if new_cost < cost_so_far[ni]
cost_so_far[ni] = new_cost
came_from[ni] = ci
ds_push(open, Node { x: nx, y: ny, f: new_cost })
- Guarantees: optimal path with varying costs
- Problem: still explores in all directions — doesn't "aim" for the goal
3. Greedy Best-First Search
Uses a heuristic to aim straight for the goal. Fast but can be fooled by obstacles.
# same as Dijkstra but priority = heuristic only (ignores actual cost)
f := heuristic(nx, ny, gx, gy) # estimated distance to goal
ds_push(open, Node { x: nx, y: ny, f: f })
- Fast: explores far fewer nodes than Dijkstra
- Problem: NOT optimal — can find longer paths, gets stuck behind walls
4. A* — The Best of Both
The key insight: combine actual cost with estimated remaining cost.
f(n) = g(n) + h(n)
g(n) = actual cost from start to n (like Dijkstra)
h(n) = estimated cost from n to goal (like Greedy)
f(n) = total estimated cost through n (priority for the queue)
open: Node[] = []
ds_push(open, Node { x: sx, y: sy, f: heuristic(sx, sy, gx, gy) })
g_score[to_index(sx, sy)] = 0
while len(open) > 0
# find node with lowest f = g + h
best := 0
for bi in 1..<len(open)
if open[bi].f < open[best].f
best = bi
cur := open[best]
open[best] = open[len(open) - 1]
ds_pop(open)
ci := to_index(cur.x, cur.y)
if cur.x == gx and cur.y == gy
break # early exit!
for d in 0..<8
nx := cur.x + DX[d]
ny := cur.y + DY[d]
ni := to_index(nx, ny)
new_g := g_score[ci] + move_cost(ci, ni)
if new_g < g_score[ni]
g_score[ni] = new_g
came_from[ni] = ci
f := new_g + heuristic(nx, ny, gx, gy)
ds_push(open, Node { x: nx, y: ny, f: f })
- Guarantees: optimal path (if heuristic is admissible)
- Fast: explores far fewer nodes than Dijkstra
- Early exit: can stop as soon as the goal is reached
The Heuristic — The Critical Choice
The heuristic h(n) estimates remaining distance. The choice of heuristic depends on
how movement works. This is where most implementations go wrong.
Rules
- Admissible: must never overestimate the true cost (otherwise path may not be optimal)
- Consistent:
h(a) <= cost(a,b) + h(b)for any edge (a,b) — guarantees nodes are
processed at most once - Higher is better (as long as still admissible) — prunes more of the search space
Common Heuristics for Grids
| Movement | Heuristic | Formula |
|---|---|---|
| 4-direction (no diagonals) | Manhattan | ` |
| 8-direction (king moves) | Chebyshev | `max( |
| Any angle (free movement) | Euclidean | sqrt(dx² + dy²) |
Why Most Rosetta Code Implementations Are Wrong
The Rosetta Code A* task uses 8-directional king movement (all moves cost 1,
including diagonals). The correct heuristic is Chebyshev distance: max(|dx|, |dy|).
| Heuristic Used | Admissible? | Problem |
|---|---|---|
Chebyshev max(|dx|, |dy|) | Yes | Correct — exact match for king moves |
Manhattan |dx| + |dy| | No | Overestimates diagonals (says 2, truth is 1) |
Euclidean sqrt(dx²+dy²) | Yes | Works but underestimates — explores too many nodes |
Squared Euclidean dx²+dy² | No | Grossly overestimates — completely wrong |
| Zero | Yes | Degrades to Dijkstra — "not really A*" |
The C, C++, Java, JavaScript, Dart, and Common Lisp implementations on Rosetta Code
were all flagged for using wrong heuristics.
Rule of thumb: match your heuristic to your movement model.
Path Reconstruction
A* doesn't store the path during search — it stores came_from[node] = previous_node.
After reaching the goal, walk backwards:
path: int[] = []
pi := to_index(gx, gy)
while pi != -1
ds_push(path, pi)
pi = came_from[pi]
# reverse
lo := 0
hi := len(path) - 1
while lo < hi
tmp := path[lo]
path[lo] = path[hi]
path[hi] = tmp
lo = lo + 1
hi = hi - 1
Complexity
| Aspect | Value |
|---|---|
| Time | O(E log V) with binary heap — E edges, V vertices |
| Space | O(V) — stores g-scores, came_from, open/closed sets |
| Optimal | Yes, if h(n) is admissible |
| Complete | Yes, on finite graphs |
In practice, a good heuristic makes A* explore far fewer nodes than Dijkstra.
On a grid with a clear path, A* can be 10-100x faster.
When to Use What
| Situation | Algorithm |
|---|---|
| Uniform cost, explore everything | BFS |
| Varying costs, no target | Dijkstra |
| Need speed, don't need optimal | Greedy Best-First |
| Need optimal path to a specific goal | A* |
| Very large graphs, need suboptimal fast | Weighted A* (f = g + w*h, w > 1) |
DataScript Implementation
The DS3 implementation in source/ds3/a_search.ds3 follows the Rosetta Code task:
- Grid: 8×8, cells are 0 (open) or 1 (barrier, cost 100)
- Movement: 8 directions (king moves), cost 1 per step
- Heuristic: Chebyshev distance —
max(|dx|, |dy|)— correct for this movement model - Data structures: arrays for g_score, came_from, closed; Node records in open list
- Priority queue: simple scan-for-minimum (fine for 64-cell grid; for large grids, use a heap)
Expected output: path of 11 steps avoiding the U-shaped barrier.
No comments:
Post a Comment