sleepy/llama.cpp
llama.cpp/ggml at 98dc1418ea0491d62948f712ed534ece3b773564
Latest commit: 9725a313be by Johannes Gäßler, 2026-04-25 14:15:03 +02:00
CUDA: reduce MMQ stream-k overhead (#22298)
* CUDA: reduce MMQ stream-k overhead
* use 32 bit integers for kbc
Name            Last commit                                                        Date
cmake           ggml: backend-agnostic tensor parallelism (experimental) (#19378)  2026-04-09 16:42:19 +02:00
include         CUDA: manage NCCL communicators in context (#21891)                2026-04-15 15:58:40 +02:00
src             CUDA: reduce MMQ stream-k overhead (#22298)                        2026-04-25 14:15:03 +02:00
.gitignore      vulkan : cmake integration (#8119)                                 2024-07-13 18:12:39 +02:00
CMakeLists.txt  HIP: flip GGML_HIP_GRAPHS to default on (#22254)                   2026-04-23 02:34:31 +02:00