llama: end-to-end tests (#19802)

* tests: add end-to-end tests per model architecture
* fixup for rebase
* fix use-after-free in llama-model-loader.cpp
* fix CI
* fix WebGPU
* fix CI
* disable CI for macOS-latest-cmake-arm64
* use expert_weights_scale only if != 0.0f
* comments
2026-03-08 12:30:21 +01:00
parent a95047979a
commit a976ff081b
33 changed files with 1607 additions and 633 deletions
@@ -0,0 +1,11 @@
+# Results
+
+The `llama-results` tool writes the outputs of a model to a file. With `--check`, it instead compares the outputs against a previously written file (e.g. one produced on an earlier commit) to detect whether they have changed.
+Example usage:
+
+``` sh
+llama-results --model model.gguf --output results.gguf --prompt "People die when they are killed."  # writes results to file
+llama-results --model model.gguf --output results.gguf --prompt "People die when they are killed." --check  # compares results vs file
+```
+
+The metric by which the results are compared is the normalized mean squared error (NMSE) with a tolerance of $10^{-6}$.
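
To make the comparison metric concrete, here is a minimal Python sketch of an NMSE check. It is not taken from the `llama-results` source. It assumes the common definition in which the summed squared error is divided by the summed squared reference values, and it assumes the check passes when the NMSE is at or below the tolerance. The names `nmse`, `outputs_match`, and `TOLERANCE` are hypothetical and chosen for illustration only.

```python
# Illustrative sketch only: the normalization and the pass/fail rule are
# assumptions, not the confirmed behavior of llama-results.

TOLERANCE = 1e-6  # tolerance stated in the README


def nmse(reference, candidate):
    """Return sum((r - c)^2) / sum(r^2) for two equal-length sequences."""
    if len(reference) != len(candidate):
        raise ValueError("reference and candidate must have the same length")
    squared_error = sum((r - c) ** 2 for r, c in zip(reference, candidate))
    reference_power = sum(r * r for r in reference)
    if reference_power == 0.0:
        raise ValueError("NMSE is undefined for an all-zero reference")
    return squared_error / reference_power


def outputs_match(reference, candidate, tolerance=TOLERANCE):
    """Assumed rule: outputs count as unchanged if NMSE <= tolerance."""
    return nmse(reference, candidate) <= tolerance
```

Under this definition the metric is relative to the magnitude of the reference outputs, so scaling both sets of outputs by the same factor leaves the NMSE unchanged.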