# LiteRT-LM on iOS - Current Status

## ⚠️ Critical Information
LiteRT-LM Swift APIs are "coming soon" as of 2025 (per Google AI Edge).
## Current iOS Support
| Feature | Android | iOS Current | iOS Future |
|---|---|---|---|
| Kotlin/Swift API | ✅ Full | ❌ Not yet | ⏳ Coming soon |
| C++ API | ✅ Available | ✅ Available | ✅ Available |
| Gemma 4 Models | ✅ Yes | ✅ Yes (.litertlm) | ✅ Yes |
| KV Cache | ✅ Managed | ⚠️ Manual (C++) | ✅ Managed |
| Conversation API | ✅ Yes | ⚠️ Manual (C++) | ✅ Yes |
| Tool Use | ✅ Yes | ⚠️ Manual (C++) | ✅ Yes |
| Metal GPU | N/A | ✅ Yes | ✅ Yes |
| CoreML NPU | N/A | ✅ Yes | ✅ Yes |
## Integration Options

### Option 1: C++ Bridge (Recommended for Production)

Use the LiteRT-LM C++ API behind an Objective-C++ bridge.

Files needed:

- `LlmEngineBridge.h` - Objective-C header
- `LlmEngineBridge.mm` - Objective-C++ implementation
- `LlmEngine.swift` - Swift wrapper
Example:

```objc
// LlmEngineBridge.h
#import <Foundation/Foundation.h>

@interface LlmEngineBridge : NSObject
- (BOOL)loadModel:(NSString *)path error:(NSError **)error;
- (NSString *)generate:(NSString *)prompt;
@end
```
```objc
// LlmEngineBridge.mm
#import "LlmEngineBridge.h"

#include "litert_lm/engine.h"

@implementation LlmEngineBridge {
    std::unique_ptr<litert::lm::Engine> engine;
}

- (BOOL)loadModel:(NSString *)path error:(NSError **)error {
    auto config = litert::lm::EngineConfig{
        .model_path = [path UTF8String]
    };
    auto result = litert::lm::Engine::Create(config);
    if (!result.ok()) {
        if (error) {
            // Report the failure to the Swift side as an NSError.
            *error = [NSError errorWithDomain:@"LlmEngineBridge"
                                         code:1
                                     userInfo:@{NSLocalizedDescriptionKey:
                                                    @"Failed to create LiteRT-LM engine"}];
        }
        return NO;
    }
    engine = std::move(*result);
    return YES;
}

- (NSString *)generate:(NSString *)prompt {
    // Generation call elided; wire this to the LiteRT-LM inference API.
    return @"";
}

@end
```
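On the Swift side, a thin wrapper keeps C++ out of the app code entirely. A minimal sketch, assuming `LlmEngineBridge.h` is exposed through the project's bridging header (the bridge's `error:` parameter imports into Swift as a throwing method):

```swift
// LlmEngine.swift
import Foundation

/// Thin wrapper so view models never touch Objective-C++ directly.
final class LlmEngine {
    private let bridge = LlmEngineBridge()

    /// Loads a .litertlm model; bridge failures surface as thrown NSErrors.
    func loadModel(at path: String) throws {
        try bridge.loadModel(path)
    }

    /// Blocking generation; call off the main thread to keep the UI responsive.
    func generate(prompt: String) -> String {
        bridge.generate(prompt) ?? ""
    }
}
```

Keeping the wrapper this thin means the rest of the app compiles unchanged whether the backend is the real bridge or the stub described below.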
Pros:
- Full LiteRT-LM features (KV cache, tool use, multimodal)
- Best performance (Metal/CoreML delegates)
- Production-ready
Cons:
- Requires Objective-C++ knowledge
- More complex build setup
- Bridge code maintenance
### Option 2: TensorFlowLiteSwift (Limited)

Use the standard TensorFlow Lite Swift pod:

```ruby
pod 'TensorFlowLiteSwift', '~> 2.16.0'
```
Pros:
- Pure Swift
- Easy integration
- Stable API
Cons:
- ❌ No KV cache management
- ❌ No conversation handling
- ❌ No tool use support
- ❌ No streaming generation
- Manual tokenization required
Verdict: Not suitable for LLM chat apps.
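For context, here is roughly what raw interpreter access looks like (a sketch only, with a placeholder tensor layout); the tokenizer, autoregressive decode loop, KV cache, and sampling would all have to be hand-rolled on top:

```swift
import Foundation
import TensorFlowLite

// One forward pass with the raw interpreter: no tokenizer, no KV cache,
// no decode loop; all of that is left to the caller.
func singleForwardPass(modelPath: String, inputIDs: [Int32]) throws -> Data {
    let interpreter = try Interpreter(modelPath: modelPath)
    try interpreter.allocateTensors()

    // Token IDs must come from a tokenizer you implement yourself.
    try inputIDs.withUnsafeBufferPointer { buffer in
        _ = try interpreter.copy(Data(buffer: buffer), toInputAt: 0)
    }
    try interpreter.invoke()

    // The output tensor holds raw logits; sampling and detokenization
    // are also on you.
    return try interpreter.output(at: 0).data
}
```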
### Option 3: Wait for Swift APIs

Monitor for the official Swift API release.

Timeline: unknown (marked as "coming soon" since 2024).
## What Works Now

The current implementation uses stub/fallback mode (a sketch of the fallback follows the list below):
- ✅ UI fully functional
- ✅ Audio recording/playback
- ✅ TTS
- ✅ Web search
- ✅ Model download
- ✅ Conversation management
- ❌ LLM inference (stubbed)
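A hypothetical sketch of the fallback path (the names are illustrative, not the actual project code):

```swift
// Hypothetical fallback so UI, TTS, and conversation flows stay testable
// while inference is stubbed out.
extension LlmEngine {
    func generateStubbed(prompt: String) -> String {
        """
        [Demo mode] LLM inference is not wired up yet: LiteRT-LM Swift \
        APIs are still "coming soon" on iOS. You said: \(prompt)
        """
    }
}
```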
## To Enable Full LLM Support

### Step 1: Add C++ Bridge
- Create the bridge files:

  ```bash
  # In your project
  touch SleepyAgent/Inference/Bridge/LlmEngineBridge.h
  touch SleepyAgent/Inference/Bridge/LlmEngineBridge.mm
  ```

- Download LiteRT-LM iOS binaries:

  ```bash
  # From GitHub releases or build from source
  # https://github.com/google-ai-edge/LiteRT-LM/releases
  ```

- Link libraries:
  - `liblitert_lm.a` (static library)
  - `libtensorflow-lite.a`
  - Metal framework
  - CoreML framework
### Step 2: Update Build Settings

In Xcode:

- Set `Compile Sources As` to `Objective-C++` for `.mm` files
- Add header search paths for LiteRT-LM
- Link required frameworks
### Step 3: Implement Bridge Methods

See the TODO comments in `LlmEngine.swift` for the specific methods to implement.
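One plausible shape for the streaming path, assuming the bridge eventually gains a per-token callback (a hypothetical method; nothing in the current bridge provides it):

```swift
// Hypothetical: assumes a future bridge method such as -generate:onToken:
// that calls its block once per decoded token.
extension LlmEngine {
    func generateStream(prompt: String) -> AsyncStream<String> {
        AsyncStream { continuation in
            // bridge.generate(prompt) { token in continuation.yield(token) }
            continuation.finish()  // placeholder until the bridge method exists
        }
    }
}
```

Callers could then render tokens incrementally with `for await token in engine.generateStream(prompt: text)`.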
## Testing Without LLM
The app works in "demo mode" with stub responses. To test:
- Build and run
- Type any message
- See stub response about LiteRT-LM integration
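To automate that check, a small smoke test could pin the demo-mode behavior down (hypothetical, building on the stub sketch above):

```swift
import XCTest

final class DemoModeTests: XCTestCase {
    // Asserts the stubbed engine always returns a visible response.
    func testStubResponseIsNonEmpty() {
        let engine = LlmEngine()
        XCTAssertFalse(engine.generateStubbed(prompt: "hello").isEmpty)
    }
}
```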
## References
- LiteRT-LM GitHub: https://github.com/google-ai-edge/LiteRT-LM
- iOS C++ Guide: https://ai.google.dev/edge/litert-lm/cpp
- CocoaPods: https://cocoapods.org/pods/TensorFlowLiteSwift
- Models: https://huggingface.co/litert-community
- Sample App: https://github.com/google-ai-edge/gallery (AI Edge Gallery)
## Recommendation
For a developer build/demo:
- Use current stub implementation to test UI/features
- Add C++ bridge when ready for production LLM support
- Monitor for official Swift API release
The architecture is ready; only the inference backend integration is missing.