mtmd: add llama-mtmd-debug binary (#20508)

* mtmd: add llama-mtmd-debug binary

* adapt

* fixes

* fix compile error

* fix windows compile error

* rm legacy clip_debug_encode()

* add MTMD_API to fix build
Xuan-Son Nguyen
2026-03-14 15:52:29 +01:00
committed by GitHub
parent a93c0ef0fa
commit 94d0262277
7 changed files with 392 additions and 15 deletions
@@ -0,0 +1,25 @@
# mtmd-debug
## Debugging encode pass
Example of debugging the encode pass with a gray input image (raw, not preprocessed):
```py
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained(...)

def test_vision():
    img_size = 896  # image side length in pixels
    # constant gray image: every channel value is 0.5
    pixel_values = torch.zeros(1, 3, img_size, img_size) + 0.5
    with torch.no_grad():
        outputs = model.model.get_image_features(pixel_values=pixel_values)
    print("last_hidden_state shape:", outputs.last_hidden_state.shape)
    print("last_hidden_state:", outputs.last_hidden_state)

test_vision()
```
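Once reference values are printed, they can be compared numerically against values obtained elsewhere. The following is a minimal sketch, assuming both sets of values have already been copied into flat Python lists of equal length. The helper name `compare_embeddings` and the sample numbers are hypothetical and not part of this commit, and the output format of `llama-mtmd-debug` is not assumed here.

```py
import math

def compare_embeddings(ref, got):
    """Return (max_abs_diff, cosine_similarity) for two equal-length float sequences."""
    if len(ref) != len(got):
        raise ValueError(f"length mismatch: {len(ref)} vs {len(got)}")
    # largest element-wise deviation between the two sequences
    max_abs_diff = max((abs(r - g) for r, g in zip(ref, got)), default=0.0)
    dot = sum(r * g for r, g in zip(ref, got))
    norm_ref = math.sqrt(sum(r * r for r in ref))
    norm_got = math.sqrt(sum(g * g for g in got))
    if norm_ref == 0.0 or norm_got == 0.0:
        # cosine similarity is undefined for a zero vector
        cosine = float("nan")
    else:
        cosine = dot / (norm_ref * norm_got)
    return max_abs_diff, cosine

# hypothetical sample values, for illustration only
ref = [0.5, -1.0, 2.0]
got = [0.5, -1.0, 2.001]
diff, cos = compare_embeddings(ref, got)
print("max abs diff:", diff)
print("cosine similarity:", cos)
```

A small maximum absolute difference together with a cosine similarity close to 1 suggests the two outputs agree. What tolerance is acceptable depends on the model and the precision in use.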
## Debugging preprocess pass
(TODO)