llama.cpp/tools at f896d2c34f7bb502c13986830b3ed7d85aac67d9 - llama.cpp - Sleepy Git

Xuan-Son Nguyen f896d2c34f server: improve speed of speculative decoding (#17808)

* server: improve speed of speculative decoding

* fix small draft case

* add link to the PR

* server : fix generation time measurement

* server : fix draft acceptance logs (add SRV_CNT, SLT_CNT macros)

* server : add comment

* add PR to docs

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

2025-12-08 14:35:28 +01:00

batched-bench : add "separate text gen" mode (#17103)

2025-11-10 12:59:29 +02:00

cvector-generator

cmake : Do not install tools on iOS targets (#15903)

2025-09-16 09:54:44 +07:00

cmake : Do not install tools on iOS targets (#15903)

2025-09-16 09:54:44 +07:00

ci : use smaller model (#16168)

2025-09-22 09:11:39 +03:00

Manually link -lbsd to resolve flock symbol on AIX (#16610)

2025-10-23 19:37:31 +08:00

ggml-zendnn : add ZenDNN backend for AMD CPUs (#17690)

2025-12-07 00:13:33 +08:00

cli: add migration warning (#17620)

2025-11-30 15:32:43 +01:00

mtmd: fix --no-warmup (#17695)

2025-12-02 22:48:08 +01:00

perplexity : show more kl-divergence data (#16321)

2025-09-29 09:30:45 +03:00

ci : use smaller model (#16168)

2025-09-22 09:11:39 +03:00

Install rpc-server when GGML_RPC is ON. (#17149)

2025-11-11 10:53:59 +00:00

Manually link -lbsd to resolve flock symbol on AIX (#16610)

2025-10-23 19:37:31 +08:00

server: improve speed of speculative decoding (#17808)

2025-12-08 14:35:28 +01:00

cmake : Do not install tools on iOS targets (#15903)

2025-09-16 09:54:44 +07:00

model : Apertus model implementation (#15852)

2025-10-02 20:43:22 +03:00

CMakeLists.txt

mtmd : rename llava directory to mtmd (#13311)

2025-05-05 16:02:55 +02:00