ggml-webgpu: compute pass batching and removing profiling overhead (#21873)

* Update register tiling matmul to use f32 accumulation

* Fix profiling code

* Fix register tiling matmul for Chrome; I'm blaming Dawn

* Update batch tuning value for iOS

* Compile fix

* Fix use of new load function

* Move to a single query set for GPU profiling

* Move to batching compute passes when not profiling

* Refactor build_multi

* Remove iOS throttling now that we're batching compute passes

Reese Levine

2026-04-16 01:12:19 -07:00

committed by GitHub

parent 8612ed18b7

commit 82677a6ede

1 changed file with 349 additions and 452 deletions

ggml/src/ggml-webgpu/ggml-webgpu.cpp

+349 -452
