# LiteRT-LM on iOS - Current Status

## ⚠️ Critical Information
LiteRT-LM Swift APIs are "coming soon" as of 2025 (per Google AI Edge).
## Current iOS Support
| Feature | Android | iOS Current | iOS Future |
|---|---|---|---|
| Kotlin/Swift API | ✅ Full | ❌ Not yet | ⏳ Coming soon |
| C++ API | ✅ Available | ✅ Available | ✅ Available |
| Gemma 4 Models | ✅ Yes | ✅ Yes (.litertlm) | ✅ Yes |
| KV Cache | ✅ Managed | ⚠️ Manual (C++) | ✅ Managed |
| Conversation API | ✅ Yes | ⚠️ Manual (C++) | ✅ Yes |
| Tool Use | ✅ Yes | ⚠️ Manual (C++) | ✅ Yes |
| Metal GPU | N/A | ✅ Yes | ✅ Yes |
| CoreML NPU | N/A | ✅ Yes | ✅ Yes |
## Integration Options

### Option 1: C++ Bridge (Recommended for Production)

Use the LiteRT-LM C++ API behind an Objective-C++ bridge.

Files needed:

- `LlmEngineBridge.h` - Objective-C header
- `LlmEngineBridge.mm` - Objective-C++ implementation
- `LlmEngine.swift` - Swift wrapper
Example:

```objc
// LlmEngineBridge.h
#import <Foundation/Foundation.h>

@interface LlmEngineBridge : NSObject
- (BOOL)loadModel:(NSString *)path error:(NSError **)error;
- (NSString *)generate:(NSString *)prompt;
@end
```
```objc
// LlmEngineBridge.mm
#import "LlmEngineBridge.h"

#include "litert_lm/engine.h"

@implementation LlmEngineBridge {
    std::unique_ptr<litert::lm::Engine> engine;
}

- (BOOL)loadModel:(NSString *)path error:(NSError **)error {
    auto config = litert::lm::EngineConfig{
        .model_path = [path UTF8String]
    };
    auto result = litert::lm::Engine::Create(config);
    if (!result.ok()) {
        if (error) {
            // Report the failure to the Swift side as an NSError.
            *error = [NSError errorWithDomain:@"LlmEngineBridge"
                                         code:1
                                     userInfo:@{NSLocalizedDescriptionKey:
                                                    @"Failed to create LiteRT-LM engine"}];
        }
        return NO;
    }
    engine = std::move(*result);
    return YES;
}

- (NSString *)generate:(NSString *)prompt {
    // Generation call elided; wire this to the LiteRT-LM inference API.
    return @"";
}

@end
```
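On the Swift side, a thin wrapper keeps C++ out of the app code entirely. A minimal sketch, assuming `LlmEngineBridge.h` is exposed through the project's bridging header (the bridge's `error:` parameter imports into Swift as a throwing method):

```swift
// LlmEngine.swift
import Foundation

/// Thin wrapper so view models never touch Objective-C++ directly.
final class LlmEngine {
    private let bridge = LlmEngineBridge()

    /// Loads a .litertlm model; bridge failures surface as thrown NSErrors.
    func loadModel(at path: String) throws {
        try bridge.loadModel(path)
    }

    /// Blocking generation; call off the main thread to keep the UI responsive.
    func generate(prompt: String) -> String {
        bridge.generate(prompt) ?? ""
    }
}
```

Keeping the wrapper this thin means the rest of the app compiles unchanged whether the backend is the real bridge or the stub described below.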
Pros:
- Full LiteRT-LM features (KV cache, tool use, multimodal)
- Best performance (Metal/CoreML delegates)
- Production-ready
Cons:
- Requires Objective-C++ knowledge
- More complex build setup
- Bridge code maintenance
### Option 2: TensorFlowLiteSwift (Limited)

Use the standard TensorFlow Lite Swift pod:

```ruby
pod 'TensorFlowLiteSwift', '~> 2.16.0'
```
Pros:
- Pure Swift
- Easy integration
- Stable API
Cons:
- ❌ No KV cache management
- ❌ No conversation handling
- ❌ No tool use support
- ❌ No streaming generation
- Manual tokenization required
Verdict: Not suitable for LLM chat apps.
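For context, here is roughly what raw interpreter access looks like (a sketch only, with a placeholder tensor layout); the tokenizer, autoregressive decode loop, KV cache, and sampling would all have to be hand-rolled on top:

```swift
import Foundation
import TensorFlowLite

// One forward pass with the raw interpreter: no tokenizer, no KV cache,
// no decode loop; all of that is left to the caller.
func singleForwardPass(modelPath: String, inputIDs: [Int32]) throws -> Data {
    let interpreter = try Interpreter(modelPath: modelPath)
    try interpreter.allocateTensors()

    // Token IDs must come from a tokenizer you implement yourself.
    try inputIDs.withUnsafeBufferPointer { buffer in
        _ = try interpreter.copy(Data(buffer: buffer), toInputAt: 0)
    }
    try interpreter.invoke()

    // The output tensor holds raw logits; sampling and detokenization
    // are also on you.
    return try interpreter.output(at: 0).data
}
```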
### Option 3: Wait for Swift APIs

Monitor for the official Swift API release.

Timeline: unknown (marked as "coming soon" since 2024).
## What Works Now

The current implementation uses stub/fallback mode (a sketch of the fallback follows the list below):
- ✅ UI fully functional
- ✅ Audio recording/playback
- ✅ TTS
- ✅ Web search
- ✅ Model download
- ✅ Conversation management
- ❌ LLM inference (stubbed)
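A hypothetical sketch of the fallback path (the names are illustrative, not the actual project code):

```swift
// Hypothetical fallback so UI, TTS, and conversation flows stay testable
// while inference is stubbed out.
extension LlmEngine {
    func generateStubbed(prompt: String) -> String {
        """
        [Demo mode] LLM inference is not wired up yet: LiteRT-LM Swift \
        APIs are still "coming soon" on iOS. You said: \(prompt)
        """
    }
}
```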
## To Enable Full LLM Support

### Step 1: Add C++ Bridge
- Create the bridge files:

  ```bash
  # In your project
  touch SleepyAgent/Inference/Bridge/LlmEngineBridge.h
  touch SleepyAgent/Inference/Bridge/LlmEngineBridge.mm
  ```

- Download LiteRT-LM iOS binaries:

  ```bash
  # From GitHub releases or build from source
  # https://github.com/google-ai-edge/LiteRT-LM/releases
  ```

- Link libraries:
  - `liblitert_lm.a` (static library)
  - `libtensorflow-lite.a`
  - Metal framework
  - CoreML framework
### Step 2: Update Build Settings

In Xcode:

- Set `Compile Sources As` to `Objective-C++` for `.mm` files
- Add header search paths for LiteRT-LM
- Link required frameworks
### Step 3: Implement Bridge Methods

See the TODO comments in `LlmEngine.swift` for the specific methods to implement.
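One plausible shape for the streaming path, assuming the bridge eventually gains a per-token callback (a hypothetical method; nothing in the current bridge provides it):

```swift
// Hypothetical: assumes a future bridge method such as -generate:onToken:
// that calls its block once per decoded token.
extension LlmEngine {
    func generateStream(prompt: String) -> AsyncStream<String> {
        AsyncStream { continuation in
            // bridge.generate(prompt) { token in continuation.yield(token) }
            continuation.finish()  // placeholder until the bridge method exists
        }
    }
}
```

Callers could then render tokens incrementally with `for await token in engine.generateStream(prompt: text)`.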
## Testing Without LLM
The app works in "demo mode" with stub responses. To test:
- Build and run
- Type any message
- See stub response about LiteRT-LM integration
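To automate that check, a small smoke test could pin the demo-mode behavior down (hypothetical, building on the stub sketch above):

```swift
import XCTest

final class DemoModeTests: XCTestCase {
    // Asserts the stubbed engine always returns a visible response.
    func testStubResponseIsNonEmpty() {
        let engine = LlmEngine()
        XCTAssertFalse(engine.generateStubbed(prompt: "hello").isEmpty)
    }
}
```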
## References
- LiteRT-LM GitHub: https://github.com/google-ai-edge/LiteRT-LM
- iOS C++ Guide: https://ai.google.dev/edge/litert-lm/cpp
- CocoaPods: https://cocoapods.org/pods/TensorFlowLiteSwift
- Models: https://huggingface.co/litert-community
- Sample App: https://github.com/google-ai-edge/gallery (AI Edge Gallery)
## Recommendation
For a developer build/demo:
- Use current stub implementation to test UI/features
- Add C++ bridge when ready for production LLM support
- Monitor for official Swift API release
The architecture is ready; only the inference backend integration is missing.