server : support multi-modal context checkpoints (#19849)

* Modify llama-memory-hybrid-iswa.cpp

* Modify llama-memory-recurrent.cpp

* Modify server-common.cpp

* Modify server-common.h

* Modify server-context.cpp

* Modify server-task.h

* Added comment to llama-memory-hybrid-iswa.cpp

* Remove comment from server-context.cpp

* Stylistic fix server-context.cpp

* Fix an issue when seq_rm isn't called in server-context.cpp

* cont : alternative impl

* cont : cleanup

* cont : n_tokens -> int64_t

---------

Co-authored-by: timkhronos <timkhronos@gmail.com>
Author: Georgi Gerganov
Date: 2026-02-25 15:14:27 +02:00
Committed by: GitHub
Parent: c747294b2d
Commit: d7d826b3c1
5 changed files with 100 additions and 35 deletions
@@ -557,6 +557,8 @@ struct server_prompt_checkpoint {
     llama_pos pos_min;
     llama_pos pos_max;
     int64_t n_tokens;
     std::vector<uint8_t> data;

     size_t size() const {