intel-gpu-llm-diagnosis/repos/patch/phase2-sycl-kernel/0003-tune-dmmv-xy-common-hpp.patch
sleepy 6ad84d543c feat: phased patch system for Intel Arc GPU performance fixes
A 3-model council (GLM-5.1, Minimax-M2.7, Kimi k2p5) analyzed Intel Arc GPU
performance issues and produced the following patches for llama.cpp:

Phase 1 - SYCL Sync: Enable graph execution by default (GGML_SYCL_DISABLE_GRAPH)
Phase 2 - SYCL Kernel: Fix VER_GEN12/13 thresholds; tune DMMV_X (32 -> 64) and MMV_Y (1 -> 2)
Phase 3 - Vulkan Intel: Arc 140T device-ID Xe2 override

Includes:
- Phased apply script (apply-phase.sh [1|2|3|all])
- Master apply script with --status/--reverse/--dry-run
- Per-phase READMEs with testing checklists
- Council deliberation logs (gitignored in logs/)

Verified: all patches apply/reverse cleanly via git apply.
Static verification: VER_GEN arithmetic and DMMV_X divisibility pass.
2026-04-15 14:53:40 +02:00
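
The "DMMV_X divisibility" check named in the commit message can be sketched as a compile-time test. This is only an illustration: it assumes the dequantize_mul_mat_vec path requires each row length (ncols) to be a multiple of GGML_SYCL_DMMV_X, and the row lengths below are illustrative hidden sizes rather than values taken from this repository. The names `kDmmvX`, `kRowLengths`, and `all_divisible` are hypothetical.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for the patched GGML_SYCL_DMMV_X value.
constexpr int kDmmvX = 64;

// Illustrative row lengths (ncols) of typical LLM weight matrices.
// Assumed for this sketch, not taken from the patch or its logs.
constexpr std::array<int, 5> kRowLengths = {2048, 3072, 4096, 5120, 11008};

// Returns true when every row length is a multiple of dmmv_x.
constexpr bool all_divisible(const std::array<int, 5>& cols, int dmmv_x) {
    for (std::size_t i = 0; i < cols.size(); ++i) {
        if (cols[i] % dmmv_x != 0) {
            return false;
        }
    }
    return true;
}

// Fails the build if any listed row length is not a multiple of 64.
static_assert(all_divisible(kRowLengths, kDmmvX),
              "a row length is not a multiple of DMMV_X");
```

A check of this kind fails at compile time rather than at kernel launch, which is why it suits a static verification step.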

diff --git a/ggml/src/ggml-sycl/common.hpp b/ggml/src/ggml-sycl/common.hpp
index fd84c91..dd5cf1a 100644
--- a/ggml/src/ggml-sycl/common.hpp
+++ b/ggml/src/ggml-sycl/common.hpp
@@ -103,10 +103,10 @@ extern int g_ggml_sycl_enable_flash_attention;
 // dmmv = dequantize_mul_mat_vec
 #ifndef GGML_SYCL_DMMV_X
-#define GGML_SYCL_DMMV_X 32
+#define GGML_SYCL_DMMV_X 64
 #endif
 #ifndef GGML_SYCL_MMV_Y
-#define GGML_SYCL_MMV_Y 1
+#define GGML_SYCL_MMV_Y 2
 #endif
 typedef sycl::queue *queue_ptr;
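
To see what the two changed defaults could imply, the sketch below derives a launch geometry before and after the patch. It assumes the dequantize_mul_mat_vec launcher follows the common ggml pattern (`iter_stride = 2 * DMMV_X`, `vals_per_iter = iter_stride / WARP_SIZE`, `block_num_y = ceil(nrows / MMV_Y)`) and a sub-group size of 16. `DmmvGeometry`, `dmmv_geometry`, and the 4096-row matrix are hypothetical, chosen for illustration and not taken from the patch.

```cpp
// Derived launch parameters for one mat-vec call (hypothetical struct).
struct DmmvGeometry {
    int iter_stride;    // columns covered per loop iteration by one sub-group
    int vals_per_iter;  // values each work-item handles per iteration
    int block_num_y;    // number of work-groups along the row dimension
};

// Assumed formulas, mirroring the usual ggml dmmv launcher layout.
constexpr DmmvGeometry dmmv_geometry(int nrows, int dmmv_x, int mmv_y, int warp_size) {
    return DmmvGeometry{
        2 * dmmv_x,                   // iter_stride
        (2 * dmmv_x) / warp_size,     // vals_per_iter
        (nrows + mmv_y - 1) / mmv_y   // block_num_y, rounded up
    };
}

// Old defaults: DMMV_X = 32, MMV_Y = 1.
constexpr DmmvGeometry kBefore = dmmv_geometry(4096, 32, 1, 16);
// Patched defaults: DMMV_X = 64, MMV_Y = 2.
constexpr DmmvGeometry kAfter = dmmv_geometry(4096, 64, 2, 16);

static_assert(kBefore.vals_per_iter == 4 && kAfter.vals_per_iter == 8,
              "each work-item handles twice as many values per iteration");
static_assert(kBefore.block_num_y == 4096 && kAfter.block_num_y == 2048,
              "half as many work-groups, each covering two rows");
```

Under these assumptions the patch doubles the work per work-item and halves the number of work-groups. Whether that helps on a given Arc GPU has to be measured. Because both macros sit inside `#ifndef` guards, either value can still be overridden at build time with a compiler definition such as `-DGGML_SYCL_DMMV_X=32`.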