llama.cpp

Jesus Talavera 6137c325a1 chat : add Granite 4.0 chat template with correct tool_call role mapping (#20804)

* chat : add Granite 4.0 chat template with correct tool_call role mapping

Introduce `LLM_CHAT_TEMPLATE_GRANITE_4_0` alongside the existing Granite
3.x template (renamed `LLM_CHAT_TEMPLATE_GRANITE_3_X`).

The Granite 4.0 Jinja template uses `<tool_call>` XML tags and maps the
`assistant_tool_call` role to `<|start_of_role|>assistant<|end_of_role|><|tool_call|>`.
Without a matching C++ handler, the fallback path emits the literal role
`assistant_tool_call`, which the model does not recognize. This breaks tool
calling when `--jinja` is not used.
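
The role mapping described above can be sketched as follows. This is a minimal illustration, not the handler added by the commit: the function name `granite4_role_header` is hypothetical, and the assumption that every other role is wrapped under its own name is based on the general Granite role-marker format rather than on the commit text.

```cpp
#include <string>

// Hypothetical helper (not the actual llama.cpp handler): build the role
// header that precedes a message body in a Granite 4.0 style prompt.
std::string granite4_role_header(const std::string & role) {
    if (role == "assistant_tool_call") {
        // Map the synthetic role to the assistant role followed by the
        // tool-call marker, as the commit message describes.
        return "<|start_of_role|>assistant<|end_of_role|><|tool_call|>";
    }
    // Assumption: all other roles are emitted under their own name.
    return "<|start_of_role|>" + role + "<|end_of_role|>";
}
```

Without the special case, the first branch would fall through to the generic one and produce a header for a role named `assistant_tool_call`, which is the failure the commit describes.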

Changes:
- Rename `LLM_CHAT_TEMPLATE_GRANITE` to `LLM_CHAT_TEMPLATE_GRANITE_3_X`
  (preserves existing 3.x behavior unchanged)
- Add `LLM_CHAT_TEMPLATE_GRANITE_4_0` enum, map entry, and handler
- Detection: `<|start_of_role|>` + (`<tool_call>` or `<tools>`) → 4.0,
  otherwise → 3.x
- Add production Granite 4.0 Jinja template
- Add tests for both 3.x and 4.0 template paths (C++ and Jinja)
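
The detection rule in the list above can be sketched as a substring check on the template source. The enum and function names here are hypothetical and chosen for illustration; they are not the identifiers used in llama.cpp, and the real detection code may check additional conditions.

```cpp
#include <string>

// Hypothetical result type for this sketch only.
enum class granite_template { none, v3_x, v4_0 };

// Classify a chat template string using the rule from the change list:
// `<|start_of_role|>` plus (`<tool_call>` or `<tools>`) means 4.0,
// `<|start_of_role|>` alone means 3.x.
granite_template detect_granite_template(const std::string & tmpl) {
    auto contains = [&tmpl](const char * needle) {
        return tmpl.find(needle) != std::string::npos;
    };
    if (!contains("<|start_of_role|>")) {
        return granite_template::none; // not a Granite-style template
    }
    if (contains("<tool_call>") || contains("<tools>")) {
        return granite_template::v4_0;
    }
    return granite_template::v3_x;
}
```

Because the 4.0 check runs before the 3.x fallback, a template that contains both the role marker and a tool tag is never classified as 3.x.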

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Code review: follow standard format and use common logic in test-chat-template.cpp

* Rename the custom_conversation variable to extra_conversation to give it a more meaningful name

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

2026-04-02 11:28:56 +02:00

Apertus-8B-Instruct.jinja

Autoparser - complete refactoring of parser architecture (#18675)

2026-03-06 21:01:00 +01:00

Apriel-1.6-15b-Thinker-fixed.jinja

common/parser: add proper reasoning tag prefill reading (#20424)

2026-03-19 16:58:21 +01:00

Bielik-11B-v3.0-Instruct.jinja

Autoparser - complete refactoring of parser architecture (#18675)

2026-03-06 21:01:00 +01:00

ByteDance-Seed-OSS.jinja

chat : Seed OSS thinking + tool call support (#15552)

2025-08-29 14:53:41 +02:00

CohereForAI-c4ai-command-r7b-12-2024-tool_use.jinja

Autoparser - complete refactoring of parser architecture (#18675)

2026-03-06 21:01:00 +01:00

CohereForAI-c4ai-command-r-plus-tool_use.jinja

Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)

2025-01-30 19:13:58 +00:00

deepseek-ai-DeepSeek-R1-Distill-Llama-8B.jinja

Autoparser - complete refactoring of parser architecture (#18675)

2026-03-06 21:01:00 +01:00

deepseek-ai-DeepSeek-R1-Distill-Qwen-32B.jinja

common/parser: add proper reasoning tag prefill reading (#20424)

2026-03-19 16:58:21 +01:00

deepseek-ai-DeepSeek-V3.1.jinja

common/parser: add proper reasoning tag prefill reading (#20424)

2026-03-19 16:58:21 +01:00

fireworks-ai-llama-3-firefunction-v2.jinja

Autoparser - complete refactoring of parser architecture (#18675)

2026-03-06 21:01:00 +01:00

GigaChat3-10B-A1.8B.jinja

common/parser: add GigaChatV3/3.1 models support (#19931)

2026-03-12 01:22:25 +01:00

GigaChat3.1-10B-A1.8B.jinja

common/parser: add GigaChatV3/3.1 models support (#19931)

2026-03-12 01:22:25 +01:00

GLM-4.6.jinja

common : Generalized XML-style tool-call parsing with streaming support (GLM 4.5/4.6 + MiniMax M2 + SeedOSS + Kimi-K2 + Qwen3-Coder + Apriel-1.5 + Xiaomi-MiMo) (#16932)

2025-11-18 18:54:15 +01:00

GLM-4.7-Flash.jinja

Autoparser - complete refactoring of parser architecture (#18675)

2026-03-06 21:01:00 +01:00

google-gemma-2-2b-it.jinja

Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)

2025-01-30 19:13:58 +00:00

HuggingFaceTB-SmolLM3-3B.jinja

common/autoparser : detect reasoning markers when enable_thinking changes system prompt (#20859)

2026-03-23 08:35:27 +01:00

ibm-granite-granite-3.3-2B-Instruct.jinja

chat : support Granite model reasoning and tool call (#14864)

2025-08-06 20:27:30 +02:00

ibm-granite-granite-4.0.jinja

chat : add Granite 4.0 chat template with correct tool_call role mapping (#20804)

2026-04-02 11:28:56 +02:00

Kimi-K2-Instruct.jinja

Fix Kimi-K2 tool-call parsing issues (#17376)

2025-12-08 14:32:04 +01:00

Kimi-K2-Thinking.jinja

Fix Kimi-K2 tool-call parsing issues (#17376)

2025-12-08 14:32:04 +01:00

LFM2-8B-A1B.jinja

PEG parser for LFM2 (#20251)

2026-03-09 01:11:22 +01:00

LFM2.5-Instruct.jinja

fix: tool call parsing for LFM2 and LFM2.5 models (#21242)

2026-04-01 16:22:44 +02:00

llama-cpp-deepseek-r1.jinja

common/parser: add proper reasoning tag prefill reading (#20424)

2026-03-19 16:58:21 +01:00

llama-cpp-rwkv-world.jinja

llama : add jinja template for rwkv-world (#14665)

2025-07-14 07:43:43 +08:00

meetkai-functionary-medium-v3.1.jinja

common/parser: add proper reasoning tag prefill reading (#20424)

2026-03-19 16:58:21 +01:00

meetkai-functionary-medium-v3.2.jinja

Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)

2025-01-30 19:13:58 +00:00

meta-llama-Llama-3.1-8B-Instruct.jinja

Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)

2025-01-30 19:13:58 +00:00

meta-llama-Llama-3.2-3B-Instruct.jinja

Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)

2025-01-30 19:13:58 +00:00

meta-llama-Llama-3.3-70B-Instruct.jinja

Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)

2025-01-30 19:13:58 +00:00

microsoft-Phi-3.5-mini-instruct.jinja

Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)

2025-01-30 19:13:58 +00:00

MiMo-VL.jinja

common : Generalized XML-style tool-call parsing with streaming support (GLM 4.5/4.6 + MiniMax M2 + SeedOSS + Kimi-K2 + Qwen3-Coder + Apriel-1.5 + Xiaomi-MiMo) (#16932)

2025-11-18 18:54:15 +01:00

MiniMax-M2.jinja

common : Generalized XML-style tool-call parsing with streaming support (GLM 4.5/4.6 + MiniMax M2 + SeedOSS + Kimi-K2 + Qwen3-Coder + Apriel-1.5 + Xiaomi-MiMo) (#16932)

2025-11-18 18:54:15 +01:00

Mistral-Small-3.2-24B-Instruct-2506.jinja

jinja : Add Mistral-Small-3.2-24B-Instruct-2506.jinja (#14349)

2025-06-24 09:17:58 +03:00

mistralai-Ministral-3-14B-Reasoning-2512.jinja

common : add parser for ministral/mistral large 3/devstral 2 (#17713)

2025-12-09 17:31:04 -06:00

mistralai-Mistral-Nemo-Instruct-2407.jinja

Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)

2025-01-30 19:13:58 +00:00

moonshotai-Kimi-K2.jinja

Autoparser - complete refactoring of parser architecture (#18675)

2026-03-06 21:01:00 +01:00

NousResearch-Hermes-2-Pro-Llama-3-8B-tool_use.jinja

Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)

2025-01-30 19:13:58 +00:00

NousResearch-Hermes-3-Llama-3.1-8B-tool_use.jinja

Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)

2025-01-30 19:13:58 +00:00

NVIDIA-Nemotron-3-Nano-30B-A3B-BF16.jinja

common : implement new jinja template engine (#18462)

2026-01-16 11:22:06 +01:00

NVIDIA-Nemotron-Nano-v2.jinja

chat : nemotron thinking & toolcalling support (#15676)

2025-09-05 01:22:22 +02:00

openai-gpt-oss-120b.jinja

gpt-oss: implement harmony parsing (#15181)

2025-08-14 17:23:11 +03:00

Qwen3-Coder.jinja

Autoparser - complete refactoring of parser architecture (#18675)

2026-03-06 21:01:00 +01:00

Qwen3.5-4B.jinja

common/parser: fix handling of tool definition with missing properties key (#21128)

2026-03-28 20:41:32 +01:00

Qwen-Qwen2.5-7B-Instruct.jinja

Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)

2025-01-30 19:13:58 +00:00

Qwen-Qwen3-0.6B.jinja

server: add --reasoning-budget 0 to disable thinking (incl. qwen3 w/ enable_thinking:false) (#13771)

2025-05-26 00:30:51 +01:00

Qwen-QwQ-32B.jinja

Autoparser - complete refactoring of parser architecture (#18675)

2026-03-06 21:01:00 +01:00

README.md

chat : Deepseek V3.1 reasoning and tool calling support (OpenAI Style) (#15533)

2025-09-08 16:59:48 +02:00

StepFun3.5-Flash.jinja

Autoparser - complete refactoring of parser architecture (#18675)

2026-03-06 21:01:00 +01:00

stepfun-ai-Step-3.5-Flash.jinja

common : fix Step-3.5-Flash format detection and thinking support (#19635)

2026-02-19 22:40:52 +01:00

unsloth-Apriel-1.5.jinja

Autoparser - complete refactoring of parser architecture (#18675)

2026-03-06 21:01:00 +01:00

unsloth-mistral-Devstral-Small-2507.jinja

mtmd : add support for Voxtral (#14862)

2025-07-28 15:01:48 +02:00

upstage-Solar-Open-100B.jinja

chat : add parsing for solar-open-100b (#18540)

2026-01-29 16:06:15 +01:00

README.md

These templates can be updated with the following commands:

./scripts/get_chat_template.py CohereForAI/c4ai-command-r-plus tool_use      > models/templates/CohereForAI-c4ai-command-r-plus-tool_use.jinja
./scripts/get_chat_template.py CohereForAI/c4ai-command-r7b-12-2024 default  > models/templates/CohereForAI-c4ai-command-r7b-12-2024-default.jinja
./scripts/get_chat_template.py CohereForAI/c4ai-command-r7b-12-2024 rag      > models/templates/CohereForAI-c4ai-command-r7b-12-2024-rag.jinja
./scripts/get_chat_template.py CohereForAI/c4ai-command-r7b-12-2024 tool_use > models/templates/CohereForAI-c4ai-command-r7b-12-2024-tool_use.jinja
./scripts/get_chat_template.py deepseek-ai/DeepSeek-R1-Distill-Llama-8B      > models/templates/deepseek-ai-DeepSeek-R1-Distill-Llama-8B.jinja
./scripts/get_chat_template.py deepseek-ai/DeepSeek-R1-Distill-Qwen-32B      > models/templates/deepseek-ai-DeepSeek-R1-Distill-Qwen-32B.jinja
./scripts/get_chat_template.py fireworks-ai/llama-3-firefunction-v2          > models/templates/fireworks-ai-llama-3-firefunction-v2.jinja
./scripts/get_chat_template.py google/gemma-2-2b-it                          > models/templates/google-gemma-2-2b-it.jinja
./scripts/get_chat_template.py meetkai/functionary-medium-v3.1               > models/templates/meetkai-functionary-medium-v3.1.jinja
./scripts/get_chat_template.py meetkai/functionary-medium-v3.2               > models/templates/meetkai-functionary-medium-v3.2.jinja
./scripts/get_chat_template.py meta-llama/Llama-3.1-8B-Instruct              > models/templates/meta-llama-Llama-3.1-8B-Instruct.jinja
./scripts/get_chat_template.py meta-llama/Llama-3.2-3B-Instruct              > models/templates/meta-llama-Llama-3.2-3B-Instruct.jinja
./scripts/get_chat_template.py meta-llama/Llama-3.3-70B-Instruct             > models/templates/meta-llama-Llama-3.3-70B-Instruct.jinja
./scripts/get_chat_template.py microsoft/Phi-3.5-mini-instruct               > models/templates/microsoft-Phi-3.5-mini-instruct.jinja
./scripts/get_chat_template.py mistralai/Mistral-Nemo-Instruct-2407          > models/templates/mistralai-Mistral-Nemo-Instruct-2407.jinja
./scripts/get_chat_template.py NousResearch/Hermes-2-Pro-Llama-3-8B tool_use > models/templates/NousResearch-Hermes-2-Pro-Llama-3-8B-tool_use.jinja
./scripts/get_chat_template.py NousResearch/Hermes-3-Llama-3.1-8B tool_use   > models/templates/NousResearch-Hermes-3-Llama-3.1-8B-tool_use.jinja
./scripts/get_chat_template.py Qwen/Qwen2.5-7B-Instruct                      > models/templates/Qwen-Qwen2.5-7B-Instruct.jinja
./scripts/get_chat_template.py Qwen/QwQ-32B                                  > models/templates/Qwen-QwQ-32B.jinja
./scripts/get_chat_template.py Qwen/Qwen3-0.6B                               > models/templates/Qwen-Qwen3-0.6B.jinja
./scripts/get_chat_template.py zai-org/GLM-4.5                               > models/templates/zai-org-GLM-4.5.jinja
./scripts/get_chat_template.py deepseek-ai/DeepSeek-V3.1                     > models/templates/deepseek-ai-DeepSeek-V3.1.jinja
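
The commands above follow a visible naming convention: the `/` in the model ID becomes `-`, an optional variant such as `tool_use` is appended after another `-`, and the result is written under `models/templates/` with a `.jinja` extension. A minimal sketch of that convention, assuming it holds for every entry (the helper name `template_path` is hypothetical and not part of the repository):

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: derive the output path used in the commands above
// from a model ID ("org/name") and an optional template variant.
std::string template_path(std::string model_id, const std::string & variant = "") {
    // "org/name" -> "org-name"
    std::replace(model_id.begin(), model_id.end(), '/', '-');
    std::string path = "models/templates/" + model_id;
    if (!variant.empty()) {
        path += "-" + variant; // e.g. "-tool_use"
    }
    return path + ".jinja";
}
```

For example, `template_path("Qwen/QwQ-32B")` yields `models/templates/Qwen-QwQ-32B.jinja`, matching the corresponding command in the list.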