llama.cpp/scripts
Latest commit 26c9ce1288 (Ryan Goulden): server: Add cached_tokens info to oaicompat responses (#19361)
* tests : fix fetch_server_test_models.py

* server: to_json_oaicompat cached_tokens

Adds OpenAI- and Anthropic-compatible information about the number of cached prompt tokens used in a response.
Committed 2026-03-19 19:09:33 +01:00
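
A minimal sketch of reading the new field from a chat completion against a locally running llama.cpp server. It assumes the server listens on localhost:8080 and that the field follows OpenAI's usage.prompt_tokens_details.cached_tokens layout; the exact placement in llama.cpp's response is an assumption here, not confirmed by the commit message.

```python
# Query the OpenAI-compatible chat completions endpoint of a llama.cpp
# server and print the cached prompt token count from the usage block.
# Assumptions: server at localhost:8080; the cached-token field mirrors
# OpenAI's usage.prompt_tokens_details.cached_tokens path.
import json
import urllib.request

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps({
        "model": "default",
        "messages": [{"role": "user", "content": "Hello!"}],
    }).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

usage = body.get("usage", {})
cached = usage.get("prompt_tokens_details", {}).get("cached_tokens")
print(f"cached prompt tokens: {cached}")
```

Running the same prompt twice should show a nonzero cached count on the second call when the server reuses the prompt's KV cache.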