[metal] extend bin op fusion to MUL/SUB/DIV chains (#28)
Merged via squash. Coherence test passed (token output byte-identical to master).
IQ4_XS tg4096 anomaly (45 vs 76 tok/s on 4B)
Eliminate zero-ops (VIEW/RESHAPE/TRANSPOSE/PERMUTE)
Reduce GPU dispatch count (1151 per tick)