[infra] Runtime Metal shader compilation when metallib unavailable (#60) #61

Open
sleepy wants to merge 4 commits from fix/60-runtime-metal-compilation into main
Owner

Summary

Unblocks building on systems without the Metal Developer Tools (metallib compiler).

Problem

  • xcrun metallib not available on fresh Xcode installs
  • zig build creates empty .metallib → LibraryLoadFailed at runtime

Solution

  • Added sleepy_mtl_compile_library() to compile .metal sources at runtime using MTLDevice.newLibraryWithSource()
  • Modified PipelineCache to fall back to runtime compilation when .metallib is empty/invalid
  • Fixed shim.m for ARC compatibility (removed manual release, added void* casts)
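
The fallback order described above can be sketched roughly as follows. This is an illustration in Python, not the project's Zig/Objective-C code: `PipelineCacheSketch`, `load_precompiled`, and `compile_source` are hypothetical names, and the two callables stand in for loading a `.metallib` and for the runtime compilation path that `sleepy_mtl_compile_library()` provides.

```python
from typing import Callable, Dict, Optional


class PipelineCacheSketch:
    """Hypothetical stand-in for PipelineCache; shows only the fallback order."""

    def __init__(
        self,
        load_precompiled: Callable[[bytes], Optional[object]],
        compile_source: Callable[[str], object],
    ) -> None:
        # load_precompiled returns None when the .metallib bytes are invalid.
        self._load_precompiled = load_precompiled
        # compile_source stands in for runtime compilation of .metal source.
        self._compile_source = compile_source
        self._libraries: Dict[str, object] = {}

    def get_library(self, name: str, metallib_bytes: bytes, metal_source: str) -> object:
        # Subsequent loads reuse the cached result, so compilation happens once.
        if name in self._libraries:
            return self._libraries[name]

        library = None
        # An empty .metallib (what the build emits without the toolchain) is skipped.
        if metallib_bytes:
            library = self._load_precompiled(metallib_bytes)
        # Fall back to compiling from source when nothing usable was loaded.
        if library is None:
            library = self._compile_source(metal_source)

        self._libraries[name] = library
        return library
```

With an empty `metallib_bytes`, the first `get_library` call goes through `compile_source` and later calls for the same `name` return the cached object, which matches the one-time cost noted under Trade-offs.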

Test Results

  • zig build --release=fast compiles successfully
  • ./zig-out/bin/sleepy-llm generate runs without LibraryLoadFailed

Trade-offs

  • First kernel load triggers compilation (~300-500ms one-time cost)
  • Subsequent loads use cached pipelines

Future Work

  • Install Metal Developer Tools for precompiled .metallib (faster startup)
  • Model still produces garbage output (separate issue)

Closes #60

Token IDs like 95793 were decoded as literal UTF-8 strings (e.g. 'åĴĮ')
instead of being mapped back through the GPT-2 bytes_to_unicode table
to reconstruct the original UTF-8 bytes (e.g. '和').
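
A minimal Python sketch of the mapping this message refers to. `bytes_to_unicode` follows the table construction used by the GPT-2 tokenizer; `decode_token` is a hypothetical helper name, and the project's own decoder (not shown here) may be structured differently.

```python
from typing import Dict


def bytes_to_unicode() -> Dict[int, str]:
    """Build the GPT-2 table mapping each byte value to a printable character."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is assigned a code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: printable character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token_text: str) -> str:
    """Map a vocabulary string back to raw bytes, then decode those as UTF-8."""
    # Raises KeyError for characters outside the table; a real decoder
    # would need a policy for that case.
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token_text)
    return raw.decode("utf-8", errors="replace")
```

For the example in this message, `decode_token("åĴĮ")` maps the three characters to the bytes `E5 92 8C`, which decode to `和`.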

fix(loader): detect model.safetensors-*.safetensors shard files

detectBf16Model() and loadZeroCopy() only matched model-*.safetensors,
missing the model.safetensors-00001-of-00001.safetensors naming scheme
used by Qwen3.5. This caused GPU zero-copy loading to be skipped
entirely, falling back to CPU load path.
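
One way to accept both naming schemes is a single pattern with an optional `.safetensors` infix. The sketch below is in Python for illustration only; `SHARD_RE` and `shard_index` are hypothetical names, and the actual matching logic in `detectBf16Model()` and `loadZeroCopy()` is not shown in this commit message.

```python
import re
from typing import Optional, Tuple

# Matches model-00001-of-00002.safetensors as well as
# model.safetensors-00001-of-00001.safetensors.
SHARD_RE = re.compile(r"^model(?:\.safetensors)?-(\d+)-of-(\d+)\.safetensors$")


def shard_index(filename: str) -> Optional[Tuple[int, int]]:
    """Return (shard number, shard count) for a shard file, else None."""
    match = SHARD_RE.match(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

A plain `model.safetensors` file does not match and returns `None`, so an unsharded checkpoint would still need its own check.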
- Added sleepy_mtl_compile_library() to compile shaders from source at runtime
- Modified PipelineCache to fall back to runtime compilation when .metallib is empty
- Fixed shim.m for ARC compatibility (removed manual release, added void* casts)
- Updated build to work without metallib toolchain

This unblocks building on systems without the Metal Developer Tools.