cli: new CLI experience (#17824)

* wip

* wip

* fix logging, add display info

* handle commands

* add args

* wip

* move old cli to llama-completion

* rm deprecation notice

* move server to a shared library

* move ci to llama-completion

* add loading animation

* add --show-timings arg

* add /read command, improve LOG_ERR

* add args for speculative decoding, enable show timings by default

* add arg --image and --audio

* fix windows build

* support reasoning_content

* fix llama2c workflow

* color default is auto

* fix merge conflicts

* properly fix color problem

Co-authored-by: bandoti <bandoti@users.noreply.github.com>

* better loading spinner

* make sure to clean color on force-exit

* also clear input files on "/clear"

* simplify common_log_flush

* add warning in mtmd-cli

* implement console writer

* fix data race

* add attribute

* fix llama-completion and mtmd-cli

* add some notes about console::log

* fix compilation

---------

Co-authored-by: bandoti <bandoti@users.noreply.github.com>
Author: Xuan-Son Nguyen
Date: 2025-12-10 15:28:59 +01:00
Committed by: GitHub
Parent: b677721819
Commit: 6c2131773c
26 changed files with 742 additions and 148 deletions
@@ -310,6 +310,9 @@ int main(int argc, char ** argv) {
     if (g_is_interrupted) return 130;
 
+    LOG_WRN("WARN: This is an experimental CLI for testing multimodal capability.\n");
+    LOG_WRN("      For normal use cases, please use the standard llama-cli\n");
+
     if (is_single_turn) {
         g_is_generating = true;
         if (params.prompt.find(mtmd_default_marker()) == std::string::npos) {
@@ -349,11 +352,11 @@ int main(int argc, char ** argv) {
     while (!g_is_interrupted) {
         g_is_generating = false;
         LOG("\n> ");
-        console::set_display(console::user_input);
+        console::set_display(DISPLAY_TYPE_USER_INPUT);
         std::string line;
         console::readline(line, false);
         if (g_is_interrupted) break;
-        console::set_display(console::reset);
+        console::set_display(DISPLAY_TYPE_RESET);
         line = string_strip(line);
         if (line.empty()) {
             continue;