bug: generate command produces garbage output on Qwen3.5-0.8B #57

Closed

opened 2026-05-20 23:58:25 +02:00 by sleepy · 0 comments

sleepy commented

2026-05-20 23:58:25 +02:00

Owner

Running sleepy-llm generate with Qwen3.5-0.8B produces garbled Unicode tokens instead of coherent text.

Reproduction

mkdir -p ~/.sleepy-llm/models/Qwen3.5-0.8B
ln -sf ~/.cache/huggingface/hub/models--Qwen--Qwen3.5-0.8B/snapshots/*/config.json ~/.sleepy-llm/models/Qwen3.5-0.8B/
ln -sf ~/.cache/huggingface/hub/models--Qwen--Qwen3.5-0.8B/snapshots/*/model.safetensors-* ~/.sleepy-llm/models/Qwen3.5-0.8B/
ln -sf ~/.cache/huggingface/hub/models--Qwen--Qwen3.5-0.8B/snapshots/*/tokenizer.json ~/.sleepy-llm/models/Qwen3.5-0.8B/
./zig-out/bin/sleepy-llm generate --model ~/.sleepy-llm/models/Qwen3.5-0.8B --prompt "Hello" --max-tokens 5

Result: åĴĮåĴĮåĴĮåĴĮåĴĮ (garbage)

Expected: Coherent English text like "Hello! How can I help..."
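
The Result string has the shape of byte-level BPE token text that was printed without being mapped back to raw bytes. The sketch below tests that reading. It assumes the Qwen3.5 tokenizer uses the GPT-2 style byte-to-Unicode table; that is an assumption, not something confirmed in this issue, and `undo_byte_level` is a hypothetical helper name, not part of sleepy-llm.

```python
def bytes_to_unicode() -> dict[int, str]:
    """GPT-2 style table: every byte value gets a printable Unicode character."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    table = {b: chr(b) for b in printable}
    shift = 0
    for b in range(256):
        if b not in table:
            # Non-printable bytes are remapped to code points starting at U+0100.
            table[b] = chr(256 + shift)
            shift += 1
    return table


def undo_byte_level(text: str) -> str:
    """Map byte-level token text back to bytes, then decode those bytes as UTF-8.

    Raises KeyError if `text` contains a character outside the table.
    """
    reverse = {ch: b for b, ch in bytes_to_unicode().items()}
    raw = bytes(reverse[ch] for ch in text)
    return raw.decode("utf-8", errors="replace")


# The three-character unit from the Result line (U+00E5 U+0134 U+012E), repeated 5 times.
observed = "\u00e5\u0134\u012e" * 5
print(undo_byte_level(observed))  # prints 和和和和和
```

If the assumption holds, the output is one token (和) repeated five times. That would point to two separate things to check: whether the decode path skips the byte-level step, and why the model keeps emitting the same token for the prompt "Hello".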

Notes

- The model weights load without errors (no safetensors parse failures)
- The tokenizer appears to work (no encode/decode errors)
- Possible causes: incorrect weight format parsing, wrong dtype conversion, a Metal kernel producing incorrect results, or a tokenizer vocab mismatch
- Next step: verify against reference mlx-lm output with the same seed
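
To narrow down the dtype-conversion hypothesis, it may help to list which dtypes the checkpoint actually declares before looking at the loader. The sketch below reads only the safetensors header (an 8-byte little-endian length followed by a JSON table) using the Python standard library. The function names are hypothetical, and the demo runs on a tiny synthetic blob, not on the real model files.

```python
import json
import struct


def read_safetensors_header(blob: bytes) -> dict:
    """Parse the JSON header at the start of a safetensors byte string."""
    (header_len,) = struct.unpack("<Q", blob[:8])  # u64, little-endian
    return json.loads(blob[8:8 + header_len])


def dtype_summary(header: dict) -> dict[str, int]:
    """Count tensors per declared dtype, skipping the optional metadata entry."""
    counts: dict[str, int] = {}
    for name, entry in header.items():
        if name == "__metadata__":
            continue
        counts[entry["dtype"]] = counts.get(entry["dtype"], 0) + 1
    return counts


def make_demo_blob() -> bytes:
    """Build a minimal synthetic file: one BF16 tensor with two elements."""
    header = {
        "__metadata__": {"format": "pt"},
        "demo.weight": {"dtype": "BF16", "shape": [2], "data_offsets": [0, 4]},
    }
    encoded = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(encoded)) + encoded + b"\x00" * 4


summary = dtype_summary(read_safetensors_header(make_demo_blob()))
print(summary)  # prints {'BF16': 1}
```

For a real shard, the same two functions can be applied to the file's bytes. If the summary shows a dtype the loader does not convert explicitly, that is a candidate for the wrong-dtype explanation.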

Related

- Issue #55 (bench command also broken)
- Correctness tests in src/tests/correctness.zig are also hardcoded to the 4B model

sleepy added the bug label

2026-05-20 23:58:41 +02:00

sleepy referenced this issue from a commit

2026-05-21 00:06:17 +02:00

fix(#57): make tokenizer test path robust, skip if model not found

sleepy referenced this issue

2026-05-21 00:06:47 +02:00

[fix] Make tokenizer test path robust, skip if model not found (#57) #59

sleepy referenced this issue from a commit

2026-05-21 00:07:26 +02:00