Make graph_max_nodes vary by ubatch size (#17794)

* Make graph_max_nodes vary by ubatch size for models where chunking might explode the graph

* Update src/llama-context.h

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Add missing const

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
This commit is contained in:
Piotr Wilkin (ilintar)
2025-12-08 14:32:41 +01:00
committed by GitHub
parent 636fc17a37
commit e4e9c4329c
2 changed files with 7 additions and 7 deletions
+1 -1
View File
@@ -197,7 +197,7 @@ private:
//
public:
uint32_t graph_max_nodes() const;
uint32_t graph_max_nodes(uint32_t n_tokens) const;
// can reuse the llm_graph_result instance of the context (for example to update a memory module)
llm_graph_result * get_gf_res_reserve() const;