sleepy/llama.cpp
Issues
8 Open 3 Closed
Implement MXFP4 GGUF converter [feature]
#37 opened 2026-04-30 18:11:37 +02:00 by sleepy

Compare llama.cpp and MLX dispatch structure [profiling]
#36 opened 2026-04-30 18:11:37 +02:00 by sleepy

Profile concurrent encoding effectiveness [profiling]
#35 opened 2026-04-30 18:11:37 +02:00 by sleepy

Profile graph fusion effectiveness [profiling]
#34 opened 2026-04-30 18:11:36 +02:00 by sleepy

KV cache IO scaling with context length [perf]
#32 opened 2026-04-30 18:11:35 +02:00 by sleepy

Investigate CPY overhead (159 MB/tick at 9B) [perf]
#31 opened 2026-04-30 18:11:35 +02:00 by sleepy

Investigate GET_ROWS overhead (678 MB/tick at 9B) [perf]
#30 opened 2026-04-30 18:11:35 +02:00 by sleepy

Port contiguous weight reads to Q4_0 MUL_MAT kernel [kernel]
#29 opened 2026-04-30 18:11:34 +02:00 by sleepy