[MEDIUM] _infer_quant_params rejects models with non-standard head dimensions #106
In omlx/cache/type_handlers.py, line 397, _infer_quant_params contains the check: if head_dim <= 0 or head_dim % 64 != 0: continue. This skips every candidate whose head_dim is not a multiple of 64 (e.g. head_dim=80 or 96). For such models the function falls back to (64, 8), and since 64 may not divide head_dim, mx.quantize can fail later.
Fix
Remove or relax the head_dim % 64 != 0 check.
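
One possible way to relax the check, sketched in plain Python: instead of requiring a multiple of 64, accept any head_dim that some supported group size divides. The helper name pick_group_size and the candidate list (128, 64, 32) are assumptions for illustration; they are not taken from omlx and should be verified against what the installed mx.quantize accepts.

```python
from typing import Optional, Sequence

# Assumption: group sizes the quantizer accepts. Verify against the
# mx.quantize version in use before relying on this list.
ASSUMED_GROUP_SIZES = (128, 64, 32)


def pick_group_size(
    head_dim: int,
    candidates: Sequence[int] = ASSUMED_GROUP_SIZES,
) -> Optional[int]:
    """Return the largest candidate group size that divides head_dim.

    Hypothetical helper, not part of omlx. Returns None when head_dim is
    non-positive or no candidate divides it, so the caller can skip
    quantization instead of falling back to a size that does not fit.
    """
    if head_dim <= 0:
        return None
    for group_size in sorted(candidates, reverse=True):
        if head_dim % group_size == 0:
            return group_size
    return None


if __name__ == "__main__":
    for hd in (64, 80, 96, 128):
        print(hd, pick_group_size(hd))
```

Under these assumptions, head_dim=96 resolves to a group size of 32, while head_dim=80 yields None because none of the assumed sizes divides it. That second case suggests relaxing the check alone may not cover every model, and the fallback path would still need to handle a head_dim that cannot be quantized.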