Georgi Gerganov
744c0c7310
llama : rotate activations for better quantization (#21038)
* llama : rotate activations for better quantization
* cont : rotate V more + refactor
* cont : rotate caches separately + support non-power-of-2 head sizes
* cont : simplify
* cont : add reference for V rotation
* cont : refactor
* cont : support context shift
* cont : consolidate
* cont : dedup + allow different types for the rotation matrix
* cont : add env variable to disable rotation
* cont : simplify attn rot kv cache logic + rename env
* cont : pre-compute the Hadamard matrices
2026-04-01 16:58:01 +03:00
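
The commit message names the technique (rotating activations with pre-computed Hadamard matrices before quantization) but does not show it. The following is a minimal standalone sketch of the general idea, not the code from this commit. All function names (`hadamard`, `fake_quant`, and so on) are hypothetical, the 4-bit absmax quantizer is a stand-in for the real cache types, and only power-of-two sizes are handled, whereas the commit also mentions non-power-of-2 head sizes.

```python
import math

def hadamard(n):
    """Orthonormal Sylvester-type Hadamard matrix (hypothetical helper)."""
    if n < 1 or n & (n - 1):
        raise ValueError("this sketch only handles power-of-two sizes")
    h = [[1.0]]
    while len(h) < n:
        # Sylvester step: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    s = 1.0 / math.sqrt(n)  # scaling makes the rows orthonormal
    return [[x * s for x in row] for row in h]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fake_quant(v, bits=4):
    """Symmetric absmax round-to-nearest quantization followed by
    dequantization. A stand-in quantizer, assumed for illustration."""
    amax = max(abs(x) for x in v)
    if amax == 0.0:
        return list(v)
    scale = amax / (2 ** (bits - 1) - 1)
    return [round(x / scale) * scale for x in v]

def sq_err(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

H = hadamard(8)
Ht = transpose(H)  # inverse of H, because H is orthonormal

# One outlier (8.0) among small values: the outlier sets the quantization
# scale, so the small values all round to zero without rotation.
v = [0.5, -0.5, 0.5, 0.5, -0.5, 8.0, 0.5, -0.5]

err_plain = sq_err(v, fake_quant(v))

# Rotate, quantize in the rotated basis, then rotate back.
v_back = matvec(Ht, fake_quant(matvec(H, v)))
err_rot = sq_err(v, v_back)

# Rotating both operands leaves a dot product (e.g. an attention score)
# unchanged, so rotated values can be stored and used without un-rotating.
q = [1.0, 2.0, -1.0, 0.5, 0.0, 3.0, -2.0, 1.5]
score_plain = dot(q, v)
score_rot = dot(matvec(H, q), matvec(H, v))

print("squared error without rotation:", err_plain)
print("squared error with rotation:   ", err_rot)
print("dot product before/after:      ", score_plain, score_rot)
```

For this particular input the squared error drops from about 1.75 to about 0.22, because the rotation spreads the outlier's energy across all eight components. The gain depends on the data: a vector without outliers may see little or no improvement.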