Run smoke test to validate all fixes before full training #3

Open
opened 2026-05-01 14:01:26 +02:00 by sleepy · 0 comments
sleepy commented 2026-05-01 14:01:26 +02:00 (Migrated from localhost:18431)

Goal

Verify that the fixed modules produce reasonable loss values and that gradients flow correctly.

Test (fixes/smoke_test.py)

  • Forward pass: loss should be no higher than about 10.4, the uniform-prediction baseline ln(32K) for the 32K-token vocabulary
  • Backward pass: gradients should exist and be non-zero
  • RoPE: verify correct frequency interleaving
  • Data alignment: input/labels match properly
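
The checks above can be sketched in plain Python, independent of the actual contents of fixes/smoke_test.py (which this issue does not show). Everything here is an assumption for illustration: the function names are hypothetical, "32K" is tried as both 32,000 and 32,768, data alignment is assumed to mean the usual next-token shift, and RoPE is assumed to use the interleaved-pair convention with base 10000. If the repo splits the vector into halves instead, the pairing in `rope_rotate` would need to change.

```python
import cmath
import math


def uniform_cross_entropy(vocab_size):
    """Cross-entropy in nats of a uniform prediction over vocab_size tokens."""
    return math.log(vocab_size)


def is_next_token_aligned(tokens, inputs, labels):
    """True if inputs/labels are the one-step shift of tokens (assumed layout)."""
    tokens = list(tokens)
    return list(inputs) == tokens[:-1] and list(labels) == tokens[1:]


def rope_rotate(vec, pos, base=10000.0):
    """Rotate an even-length vector by position `pos`.

    Interleaved convention (an assumption): the pair (vec[2i], vec[2i+1])
    is rotated by the angle pos * base**(-2i/d).
    """
    d = len(vec)
    out = []
    for i in range(d // 2):
        theta = pos * base ** (-2 * i / d)
        z = complex(vec[2 * i], vec[2 * i + 1]) * cmath.exp(1j * theta)
        out.extend([z.real, z.imag])
    return out


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


# Baseline loss for both readings of "32K"; both are close to 10.4.
for vocab_size in (32_000, 32_768):
    print(vocab_size, uniform_cross_entropy(vocab_size))

# Alignment: labels are the inputs shifted left by one token.
tokens = [5, 9, 2, 7]
print(is_next_token_aligned(tokens, tokens[:-1], tokens[1:]))

# RoPE property: the q.k score depends only on the position offset.
q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]
print(dot(rope_rotate(q, 3), rope_rotate(k, 1)))
print(dot(rope_rotate(q, 10), rope_rotate(k, 8)))
```

The last two printed values should agree up to floating-point error, because both pairs of positions differ by 2. This relative-position property holds for any pairing convention, so a failure points at the rotation angles or the pairing rather than at the choice of convention.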

Command

cd /home/sleepy/ternary && CUDA_VISIBLE_DEVICES=0 python fixes/train_fixed.py --smoke

Expected

  • Initial loss ~10.4 (random initialization), decreasing within the first 50 steps
  • No NaN/Inf in gradients
  • RoPE test passes
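
One way to turn the first two expectations into pass/fail checks is sketched below. The helper names, the 10-step averaging window, and the synthetic numbers are all assumptions made for illustration. They are not taken from fixes/train_fixed.py and are not real training output.

```python
import math


def gradients_ok(grads):
    """True if every gradient value is finite and at least one is non-zero."""
    flat = list(grads)
    all_finite = all(math.isfinite(g) for g in flat)
    any_nonzero = any(g != 0.0 for g in flat)
    return all_finite and any_nonzero


def loss_decreased(losses, window=10):
    """True if the mean of the last `window` losses is below the first `window`.

    Averaging over a window (an arbitrary choice here) makes the check
    less sensitive to step-to-step noise than comparing two single values.
    """
    if len(losses) < 2 * window:
        raise ValueError("need at least 2 * window loss values")
    head = sum(losses[:window]) / window
    tail = sum(losses[-window:]) / window
    return tail < head


# Synthetic 50-step loss curve starting near ln(32K); illustrative only.
synthetic_losses = [10.4 - 0.02 * step for step in range(50)]
print(loss_decreased(synthetic_losses))

# Synthetic gradient values: one healthy set, one containing a NaN.
print(gradients_ok([0.01, -0.3, 0.0]))
print(gradients_ok([0.01, float("nan")]))
```

In a real run the gradient values would come from the model parameters after the backward pass, flattened into a sequence of floats before being passed to `gradients_ok`.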

Status

  • [ ] Smoke test run and passed
Reference
sleepy/ternary#3