[MEDIUM] _infer_quant_params rejects models with non-standard head dimensions #106

Closed
opened 2026-05-20 12:33:15 +02:00 by sleepy · 0 comments
Owner

In omlx/cache/type_handlers.py, line 397, _infer_quant_params contains the check: if head_dim <= 0 or head_dim % 64 != 0: continue. This skips every model whose head_dim is not divisible by 64 (e.g. head_dim=80 or 96). The function then falls back to (64, 8), and because 64 may not divide head_dim, mx.quantize can fail later.

Fix

Remove or relax the head_dim % 64 != 0 check.
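
One possible way to relax the check is sketched below. It is not the actual code from type_handlers.py: pick_group_size and SUPPORTED_GROUP_SIZES are hypothetical names, and the candidate list assumes mx.quantize accepts group sizes of 32, 64 and 128 in its default mode, which should be verified against the installed MLX version. The idea is to accept any head_dim that some supported group size divides, instead of requiring divisibility by 64.

```python
from typing import Optional, Sequence, Tuple

# Assumption: group sizes accepted by mx.quantize (default mode).
# 64 is tried first so that existing models keep their current setting.
SUPPORTED_GROUP_SIZES: Tuple[int, ...] = (64, 32, 128)


def pick_group_size(
    head_dim: int,
    candidates: Sequence[int] = SUPPORTED_GROUP_SIZES,
) -> Optional[int]:
    """Return the first candidate group size that divides head_dim.

    Returns None when head_dim is non-positive or no candidate divides it,
    so the caller can skip quantization instead of failing later.
    """
    if head_dim <= 0:
        return None
    for group_size in candidates:
        if head_dim % group_size == 0:
            return group_size
    return None


if __name__ == "__main__":
    for head_dim in (64, 80, 96, 128):
        print(head_dim, pick_group_size(head_dim))
```

Under these assumptions head_dim=96 maps to a group size of 32, while head_dim=80 has no valid candidate (none of 32, 64 or 128 divides 80). Relaxing the check alone therefore would not make head_dim=80 quantizable; such models would still need an explicit unquantized path rather than the (64, 8) fallback.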

Reference
sleepy/omlx#106