- Zig 70.3%
- Python 29.7%
| Filename | Latest commit message | Latest commit date |
|---|---|---|
| zig-raptor | ||
| .gitignore | ||
| hub_set.json | ||
| idxrail.py | ||
| IMPLEMENTATION_PLAN.md | ||
| rail_query.py | ||
| README.md | ||
| solve.py | ||
| solve_fast.py | ||
UK Rail All-Pairs Shortest Path Solver
Computes fastest rail journeys between all ~2,605 UK National Rail stations (~6.8M directed pairs) from Network Rail timetable data. Replaced an 8-hour Python implementation with a Zig solver that crunches in ~38 seconds.
Performance
| Metric | Python (old) | Zig (new) | Speedup |
|---|---|---|---|
| Full crunch | ~8 hours | ~38 seconds | ~750× |
| Query lookup | SQLite query (~0.1s) | Binary mmap (~0.005s) | ~20× |
| Parser | Hours | ~25s (with mmap) | — |
| Dijkstra | ~3 hours | ~10s CPU time | ~1,000× |
Quick Start
cd zig-raptor
zig build -Doptimize=ReleaseFast
# Build day-specific index (mon/tue/wed/thu/fri/sat/sun)
./zig-out/bin/raptor solve tue
# Query a route (O(1) lookup from binary index)
./zig-out/bin/raptor query tue RDG LDS
# → RDG → LDS: 3h01m, 2 transfers, dep 05:48
How It Works
Algorithm
Time-dependent Dijkstra with earliest-arrival optimization and transfer minimization. Run once per source station (2,605 × single-source Dijkstra).
Priority queue ordering: (arrival_time, transfers, -first_dep) — finds fastest arrival, then fewest changes, then latest departure (shortest duration).
Transfer handling:
- Boarding a train: +1 transfer
- London Underground between terminals: +1 transfer (20 min fixed)
- Staying seated through stops: 0 transfers
- Smart limit: Up to 15 transfers for remote stations (e.g., INE → WTN needs 11)
Data Flow
- Parse 3GB JSONL timetable with POSIX
mmap - Filter by day-of-week (e.g., Tuesday =
days[1] == '1') - Build per-station sorted departure lists (binary search for connections)
- Solve 2,605 single-source Dijkstras with pre-allocated buffers
- Write binary index with predecessor matrices for route reconstruction
Output Format
Binary index file (rail_index_tue.bin, ~84MB):
- Header: magic + version + station count
- Station names (4-byte CRS codes)
- Duration matrix (u16 per pair, minutes)
- Transfer matrix (u8 per pair)
- Departure matrix (u16 per pair, minutes from midnight)
- Predecessor matrices (for route reconstruction)
Query returns: duration, transfers, departure time, and full route with train UIDs.
Commands
# Build indices for specific days
./raptor solve mon # Monday service
./raptor solve tue # Tuesday service
./raptor solve sun # Sunday service (fewer trains, ~38s)
# Build all 7 days at once (~4-5 minutes)
./raptor solve-all
# Query routes (O(1) from binary index)
./raptor query tue RDG LDS # Reading → Leeds
./raptor query tue PAD EUS # Paddington → Euston (underground)
./raptor query tue INE WTN # Inverness → Watton (11 transfers!)
# Validate route reconstruction (loads timetable + rebuilds full route with UIDs)
./raptor validate tue RDG LDS
Example Output
$ ./raptor query tue RDG LDS
RDG → LDS: 3h01m, 2 transfers, dep 05:48
Train W34850: RDG → PAD (05:48 – 06:12)
Underground: PAD → KGX
Train C02149: KGX → LDS (06:40 – 08:49)
Key Optimizations
- Binary heap priority queue — O(log n) push/pop vs O(n log n) ArrayList+sort
- Pre-allocated reusable buffers —
best_arrand PQ buffer allocated once, reused per source - Stack-allocated search patterns — Parser builds
"key":"on stack instead ofpage_allocatoralloc/free millions of times (was 436s system time → 26s) - POSIX mmap — Kernel pages in 3GB file on demand instead of read() syscall overhead
- Zig ReleaseFast — Aggressive inlining, no bounds checks in hot loops
Coverage
- 97.3% of station pairs reachable (6.6M / 6.8M)
- Max 12 transfers for remote routes
- Max 18h15m journey time
- 2.7% unreachable: remote/island stations with genuinely no rail service
Day-Specific Routing
The timetable contains different schedules for different days:
| Day | Schedules | Crunch Time | Example: RDG→LDS |
|---|---|---|---|
| Tuesday | ~81K | ~34s | 3h01m |
| Sunday | ~44K | ~20s | 4h29m (slower service) |
Some routes only exist on weekdays (e.g., INE → WTN: 6h26m on Tue, no service on Sun).
Architecture
zig-raptor/
├── src/
│ ├── main.zig # CLI: solve/query/validate commands
│ ├── dijkstra.zig # Core solver: time-dependent Dijkstra
│ ├── parser.zig # Fast JSONL parser with mmap
│ ├── index.zig # Binary index I/O format
│ ├── parallel_dijkstra.zig # Multi-threaded attempt (WIP)
│ ├── hub_dp.zig # Hub-based dynamic programming (WIP)
│ └── raptor.zig # Multi-round DP experiment (abandoned)
├── build.zig # Zig build file
└── zig-out/bin/raptor # Compiled binary
Data Sources
- Timetable: Network Rail CIF/SCHEDULE feed (
toc-full.json, ~3GB JSONL) - Stations:
UK Railway Stations.csv— 2,605 stations with CRS codes
Limitations
- Timetable snapshot: Single-day schedules. Cancellations, engineering works, and real-time delays not included.
- Departure window: 05:30–17:30 (390-minute window). Best journey within this window is returned.
- London Underground: Fixed 20-minute hops between 11 major termini only. Actual TfL network not modelled.
- No bus/ferry: National Rail only. Some rural connections may be missing.
- Fixed interchange: 5 minutes at all stations. Some complex interchanges need longer.
Development
# Build debug version
zig build
# Build optimized release
zig build -Doptimize=ReleaseFast
# Run tests
zig build test
License
MIT