model : add tokenizer from LFM2.5-Audio-1.5B (#19687)

* model : Add tokenizer from LFM2.5-Audio-1.5B

[LFM2.5-Audio-1.5B](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B) introduced lightweight audio tokenizer.

Tokenizer based on LFM2 architecture and acts as "embedding" model with
different input `n_embd` and output `n_embd_out`.

To be used in https://github.com/ggml-org/llama.cpp/pull/18641.

To convert use

```shell
python3 convert_hf_to_gguf.py /path/to/LFM2.5-Audio-1.5B/audio_detokenizer
```

* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Formatting

* Rework check for attention layers

* Add LFM2 SWA model support

* Address PR feedback

* Set vocab to none

* Move helper function definitions to cpp file

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
This commit is contained in:
Tarek Dakhran
2026-02-19 09:54:48 +01:00
committed by GitHub
parent eacb4b67a2
commit 8004f3a8d1
7 changed files with 185 additions and 137 deletions
+5 -1
View File
@@ -2417,8 +2417,9 @@ llm_graph_input_mem_hybrid_iswa * llm_graph_context::build_inp_mem_hybrid_iswa()
void llm_graph_context::build_dense_out(
ggml_tensor * dense_2,
ggml_tensor * dense_2_b,
ggml_tensor * dense_3) const {
if (!cparams.embeddings || !(dense_2 || dense_3)) {
if (!cparams.embeddings || !(dense_2 || dense_2_b || dense_3)) {
return;
}
ggml_tensor * cur = res->t_embd_pooled != nullptr ? res->t_embd_pooled : res->t_embd;
@@ -2427,6 +2428,9 @@ void llm_graph_context::build_dense_out(
if (dense_2) {
cur = ggml_mul_mat(ctx0, dense_2, cur);
}
if (dense_2_b) {
cur = ggml_add(ctx0, cur, dense_2_b);
}
if (dense_3) {
cur = ggml_mul_mat(ctx0, dense_3, cur);
}