llama.cpp/examples/speculative
Demonstration of speculative decoding and tree-based speculative decoding techniques
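
As a rough illustration of the basic (non-tree) idea, the sketch below shows one way a draft-then-verify loop could look under greedy decoding. It is not the code in this example directory: `speculative_step`, `generate`, `target_model`, and `draft_model` are hypothetical names, and the "models" are toy functions that map a token prefix to a next token. Real implementations typically verify all drafted positions in a single batched pass of the target model and, when sampling rather than decoding greedily, use a probabilistic accept/reject rule.

```python
from typing import Callable, List

# Hypothetical model type: maps a token prefix to its greedy next token.
Model = Callable[[List[int]], int]


def speculative_step(target: Model, draft: Model,
                     prefix: List[int], n_draft: int) -> List[int]:
    """Run one draft-then-verify round and return the tokens to append."""
    # 1. The cheap draft model proposes n_draft tokens autoregressively.
    proposed: List[int] = []
    ctx = list(prefix)
    for _ in range(n_draft):
        tok = draft(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # 2. The target model checks each proposed token in order.
    #    (Called once per position here for clarity only.)
    accepted: List[int] = []
    ctx = list(prefix)
    for tok in proposed:
        expected = target(ctx)
        if tok != expected:
            # First mismatch: keep the target's token, discard the rest.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # 3. Every draft token matched, so the target adds one extra token.
    accepted.append(target(ctx))
    return accepted


def generate(target: Model, draft: Model, prompt: List[int],
             n_new: int, n_draft: int = 4) -> List[int]:
    """Generate n_new tokens after the prompt using repeated rounds."""
    out = list(prompt)
    while len(out) - len(prompt) < n_new:
        out.extend(speculative_step(target, draft, out, n_draft))
    return out[: len(prompt) + n_new]


# Toy models (assumptions for illustration): the target counts upward
# modulo 10; the draft agrees except right after token 4.
def target_model(ctx: List[int]) -> int:
    return (ctx[-1] + 1) % 10


def draft_model(ctx: List[int]) -> int:
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10
```

In this greedy variant the output is identical to decoding with the target model alone; the draft model only affects how many tokens each round yields. Each round produces at least one token and at most `n_draft + 1`.
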
More info: