llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653)

* llama: automatically fit args to free memory llama-fit-params tool
* fix CI
* hints for bug reports, ensure no reallocation
* fix segfault with Vulkan
* add llama-fit-params to CI
* fix CI
* fix CI
* fix CI
* minor adjustments
* fix assignment of 1 dense layer
* fix logger not being reset on model load failure
* remove --n-gpu-layer hint on model load failure
* fix llama-fit-params verbosity
* fix edge case
* fix typo [no ci]
2025-12-15 09:24:59 +01:00
parent 4aced7a631
commit b1f3a6e5db
26 changed files with 1075 additions and 63 deletions
@@ -37,4 +37,5 @@ else()
        add_subdirectory(cvector-generator)
        add_subdirectory(export-lora)
    endif()
+    add_subdirectory(fit-params)
 endif()