server: add built-in tools backend support (#20898)

* wip: server_tools

* refactor

* displayName -> display_name

* snake_case everywhere

* rm redundant field

* change arg to --tools all

* add readme mention

* llama-gen-docs
Author: Xuan-Son Nguyen
Date: 2026-03-27 10:07:11 +01:00
Committed by: GitHub
Parent: ba38f3becc
Commit: 20197b6fe3
8 changed files with 922 additions and 0 deletions
@@ -194,6 +194,7 @@ For the full list of features, please refer to [server's changelog](https://gith
| `--webui-config JSON` | JSON that provides default WebUI settings (overrides WebUI defaults)<br/>(env: LLAMA_ARG_WEBUI_CONFIG) |
| `--webui-config-file PATH` | JSON file that provides default WebUI settings (overrides WebUI defaults)<br/>(env: LLAMA_ARG_WEBUI_CONFIG_FILE) |
| `--webui-mcp-proxy, --no-webui-mcp-proxy` | experimental: whether to enable MCP CORS proxy - do not enable in untrusted environments (default: disabled)<br/>(env: LLAMA_ARG_WEBUI_MCP_PROXY) |
| `--tools TOOL1,TOOL2,...` | experimental: whether to enable built-in tools for AI agents - do not enable in untrusted environments (default: no tools)<br/>specify "all" to enable all tools<br/>available tools: read_file, file_glob_search, grep_search, exec_shell_command, write_file, edit_file, apply_diff<br/>(env: LLAMA_ARG_TOOLS) |
| `--webui, --no-webui` | whether to enable the Web UI (default: enabled)<br/>(env: LLAMA_ARG_WEBUI) |
| `--embedding, --embeddings` | restrict to only support embedding use case; use only with dedicated embedding models (default: disabled)<br/>(env: LLAMA_ARG_EMBEDDINGS) |
| `--rerank, --reranking` | enable reranking endpoint on server (default: disabled)<br/>(env: LLAMA_ARG_RERANKING) |
@@ -293,6 +294,12 @@ It is currently available in the following endpoints:
For more details, please refer to [multimodal documentation](../../docs/multimodal.md)
### Built-in tools support
The server includes a set of built-in tools that enable the LLM to access the local file system directly from the Web UI.
To use this feature, start the server with `--tools all`. You can also enable only specific tools by passing a comma-separated list: `--tools name1,name2,...`. Run `llama-server --help` for the full list of available tool names.
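For example (the model path is illustrative; the tool names come from the list above):
```sh
# Enable all built-in tools (do not use in untrusted environments)
llama-server -m model.gguf --tools all

# Enable only a read-oriented subset of tools
llama-server -m model.gguf --tools read_file,grep_search,file_glob_search

# The same can be configured via the environment variable
LLAMA_ARG_TOOLS=all llama-server -m model.gguf
```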
## Build
`llama-server` is built alongside everything else from the root of the project
@@ -1438,6 +1445,14 @@ curl http://localhost:8080/v1/messages/count_tokens \
{"input_tokens": 10}
```
## Server built-in tools
The server exposes a REST API under `/tools` that allows the Web UI to call built-in tools. This endpoint is intended for internal use by the Web UI and is subject to change or removal in the future.
**Please do NOT use this endpoint in a downstream application**
For further documentation about this endpoint, please refer to [server internal documentation](./README-dev.md)
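That said, for local debugging you can probe the endpoint directly. The request below is only a sketch and assumes the server was started with `--tools` enabled; the concrete routes and request/response schema are described in the internal documentation:
```sh
# Sketch for local debugging only: probe the internal /tools endpoint.
# The exact routes and payload schema may differ; see README-dev.md.
curl http://localhost:8080/tools
```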
## Using multiple models
`llama-server` can be launched in a **router mode** that exposes an API for dynamically loading and unloading models. The main process (the "router") automatically forwards each request to the appropriate model instance.