bug: bench command fails with error.UseForwardPrefillDecode #55

Closed
opened 2026-05-20 21:02:19 +02:00 by sleepy · 0 comments
Owner

The bench subcommand in src/bench/bench.zig calls model.forward(), which returns error.UseForwardPrefillDecode. It should call forward_prefill() and forward_decode() instead.

Reproduction

./zig-out/bin/sleepy-llm bench --model ~/.sleepy-llm/models/Qwen3.5-0.8B --prompt "Hello" --max-tokens 128

Result: Error: UseForwardPrefillDecode

Expected

The benchmark should run and report prefill and decode throughput in tok/s.

Fix

Update bench_prefill to call model.forward_prefill() and bench_decode to call model.forward_decode() with proper KV-cache management.
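
A rough sketch of what the split benchmark could look like, for discussion only. Only the names bench_prefill, bench_decode, forward_prefill and forward_decode come from this issue. StubModel, its capacity and kv_len fields, reset_cache, error.KvCacheFull, PrefillResult, tokPerSec and all signatures are hypothetical stand-ins, since the real model API is not shown here. It assumes Zig 0.11 or later, with std.time.Timer available.

```zig
const std = @import("std");

// Hypothetical stand-in for the real model type. The KV cache is modelled
// as a plain length counter so the sketch stays self-contained.
const StubModel = struct {
    capacity: usize,
    kv_len: usize = 0,

    // Assumed behaviour: process the whole prompt in one pass and fill the cache.
    fn forward_prefill(self: *StubModel, tokens: []const u32) !u32 {
        if (tokens.len > self.capacity) return error.KvCacheFull;
        self.kv_len = tokens.len;
        return 0; // placeholder for the first sampled token
    }

    // Assumed behaviour: consume one token and append one cache entry.
    fn forward_decode(self: *StubModel, token: u32) !u32 {
        if (self.kv_len >= self.capacity) return error.KvCacheFull;
        self.kv_len += 1;
        return token +% 1; // placeholder for the next sampled token
    }

    fn reset_cache(self: *StubModel) void {
        self.kv_len = 0;
    }
};

const PrefillResult = struct { next_token: u32, elapsed_ns: u64 };

// Times a single prefill pass over the prompt, starting from an empty cache.
fn bench_prefill(model: *StubModel, prompt: []const u32) !PrefillResult {
    model.reset_cache();
    var timer = try std.time.Timer.start();
    const next = try model.forward_prefill(prompt);
    return .{ .next_token = next, .elapsed_ns = timer.read() };
}

// Times max_tokens single-token decode steps, reusing the cache left by prefill.
fn bench_decode(model: *StubModel, first: u32, max_tokens: usize) !u64 {
    var timer = try std.time.Timer.start();
    var tok = first;
    var i: usize = 0;
    while (i < max_tokens) : (i += 1) {
        tok = try model.forward_decode(tok);
    }
    return timer.read();
}

fn tokPerSec(n: usize, ns: u64) f64 {
    const safe_ns: u64 = if (ns == 0) 1 else ns; // avoid dividing by zero
    return @as(f64, @floatFromInt(n)) / (@as(f64, @floatFromInt(safe_ns)) / 1e9);
}

pub fn main() !void {
    var model = StubModel{ .capacity = 256 };
    const prompt = [_]u32{ 1, 2, 3, 4 };
    const max_tokens: usize = 128;

    const prefill = try bench_prefill(&model, &prompt);
    const decode_ns = try bench_decode(&model, prefill.next_token, max_tokens);

    std.debug.print("prefill: {d:.1} tok/s\n", .{tokPerSec(prompt.len, prefill.elapsed_ns)});
    std.debug.print("decode:  {d:.1} tok/s\n", .{tokPerSec(max_tokens, decode_ns)});
}
```

The point of the sketch is the ordering: the cache is reset before prefill, and decode then runs against the cache that prefill populated, so the two phases can be timed separately.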

Reference
sleepy/sleepy-llm#55