# LiteRT-LM on iOS - Current Status
## ⚠️ Critical Information
**LiteRT-LM Swift APIs are "coming soon"** as of 2025 (per [Google AI Edge](https://ai.google.dev/edge/litert-lm)).
## Current iOS Support
| Feature | Android | iOS Current | iOS Future |
|---------|---------|-------------|------------|
| Kotlin/Swift API | ✅ Full | ❌ Not yet | ⏳ Coming soon |
| C++ API | ✅ Available | ✅ Available | ✅ Available |
| Gemma Models | ✅ Yes | ✅ Yes (.litertlm) | ✅ Yes |
| KV Cache | ✅ Managed | ⚠️ Manual (C++) | ✅ Managed |
| Conversation API | ✅ Yes | ⚠️ Manual (C++) | ✅ Yes |
| Tool Use | ✅ Yes | ⚠️ Manual (C++) | ✅ Yes |
| Metal GPU | N/A | ✅ Yes | ✅ Yes |
| CoreML NPU | N/A | ✅ Yes | ✅ Yes |

## Integration Options
### Option 1: C++ Bridge (Recommended for Production)
Use LiteRT-LM C++ API with Objective-C++ bridging.
**Files needed:**

- `LlmEngineBridge.h` - Objective-C header
- `LlmEngineBridge.mm` - Objective-C++ implementation
- `LlmEngine.swift` - Swift wrapper

**Example:**

```objc
// LlmEngineBridge.h
#import <Foundation/Foundation.h>

@interface LlmEngineBridge : NSObject
- (BOOL)loadModel:(NSString *)path error:(NSError **)error;
- (NSString *)generate:(NSString *)prompt;
@end
```

```objc
// LlmEngineBridge.mm
#import "LlmEngineBridge.h"

#include <memory>
#include <string>

#include "litert_lm/engine.h"

@implementation LlmEngineBridge {
    std::unique_ptr<litert::lm::Engine> engine;
}

- (BOOL)loadModel:(NSString *)path error:(NSError **)error {
    auto config = litert::lm::EngineConfig{
        .model_path = [path UTF8String]
    };
    auto result = litert::lm::Engine::Create(config);
    if (!result.ok()) {
        if (error) {
            // Surface the absl::Status message to the caller.
            std::string message(result.status().message());
            *error = [NSError errorWithDomain:@"LlmEngineBridge"
                                         code:result.status().raw_code()
                                     userInfo:@{NSLocalizedDescriptionKey:
                                                    [NSString stringWithUTF8String:message.c_str()]}];
        }
        return NO;
    }
    engine = std::move(*result);
    return YES;
}

@end
```

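On the Swift side, the wrapper can stay thin, because Swift imports the trailing `NSError **` parameter as a throwing method. A minimal sketch, assuming `LlmEngineBridge.h` from the example above is visible through the project's bridging header:

```swift
// LlmEngine.swift - thin Swift wrapper over the Objective-C++ bridge.
import Foundation

final class LlmEngine {
    private let bridge = LlmEngineBridge()

    /// -loadModel:error: is imported into Swift as a throwing method.
    func loadModel(at path: String) throws {
        try bridge.loadModel(path)
    }

    func generate(prompt: String) -> String {
        bridge.generate(prompt)
    }
}
```
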
**Pros:**

- Full LiteRT-LM features (KV cache, tool use, multimodal)
- Best performance (Metal/CoreML delegates)
- Production-ready

**Cons:**

- Requires Objective-C++ knowledge
- More complex build setup
- Bridge code maintenance

### Option 2: TensorFlowLiteSwift (Limited)

Use the standard TensorFlow Lite Swift pod.

```ruby
pod 'TensorFlowLiteSwift', '~> 2.16.0'
```

**Pros:**

- Pure Swift
- Easy integration
- Stable API

**Cons:**

- ❌ No KV cache management
- ❌ No conversation handling
- ❌ No tool use support
- ❌ No streaming generation
- Manual tokenization required

**Verdict:** Not suitable for LLM chat apps.

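For context, this is roughly all the standard pod provides: a tensor-in/tensor-out `Interpreter`. Everything an LLM needs on top (tokenizer, KV cache, sampling loop) must be built by hand; the model path and token IDs below are placeholders:

```swift
import TensorFlowLite

// Raw TFLite leaves the entire LLM loop to the caller.
let interpreter = try Interpreter(modelPath: "model.tflite")
try interpreter.allocateTensors()

// You must tokenize the prompt yourself and feed token IDs as raw bytes...
let tokenIds: [Int32] = [1, 2, 3]  // placeholder; a real tokenizer is required
let inputData = tokenIds.withUnsafeBufferPointer { Data(buffer: $0) }
try interpreter.copy(inputData, toInputAt: 0)
try interpreter.invoke()

// ...then read logits and implement sampling plus the KV cache by hand.
let logits = try interpreter.output(at: 0)
```
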
### Option 3: Wait for Swift APIs

Monitor for the official Swift API release:

- https://ai.google.dev/edge/litert-lm
- https://github.com/google-ai-edge/LiteRT-LM

**Timeline:** Unknown (marked as "coming soon" since 2024)

## What Works Now

The current implementation uses **stub/fallback mode**:

- ✅ UI fully functional
- ✅ Audio recording/playback
- ✅ TTS
- ✅ Web search
- ✅ Model download
- ✅ Conversation management
- ❌ LLM inference (stubbed)

## To Enable Full LLM Support

### Step 1: Add C++ Bridge

1. Create the bridge source files:

```bash
# In your project
touch SleepyAgent/Inference/Bridge/LlmEngineBridge.h
touch SleepyAgent/Inference/Bridge/LlmEngineBridge.mm
```

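For Swift to see the bridge class, it also has to be imported in the target's Objective-C bridging header (the file name below is an assumption; use whatever bridging header Xcode configured for your target):

```objc
// SleepyAgent-Bridging-Header.h (hypothetical name)
#import "LlmEngineBridge.h"
```
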
2. Download LiteRT-LM iOS binaries:

```bash
# From GitHub releases or build from source
# https://github.com/google-ai-edge/LiteRT-LM/releases
```

3. Link libraries:

- `liblitert_lm.a` (static library)
- `libtensorflow-lite.a`
- Metal framework
- CoreML framework

### Step 2: Update Build Settings

In Xcode:

1. Set `Compile Sources As` to `Objective-C++` for `.mm` files
2. Add header search paths for LiteRT-LM
3. Link the required frameworks

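These settings can also be captured in an `.xcconfig` fragment. The build setting names are standard Xcode settings, but the paths below are placeholders for wherever the LiteRT-LM binaries were unpacked:

```
// Hypothetical .xcconfig fragment (paths are assumptions)
HEADER_SEARCH_PATHS = $(inherited) $(SRCROOT)/ThirdParty/litert-lm/include
LIBRARY_SEARCH_PATHS = $(inherited) $(SRCROOT)/ThirdParty/litert-lm/lib
OTHER_LDFLAGS = $(inherited) -llitert_lm -ltensorflow-lite -framework Metal -framework CoreML
CLANG_CXX_LANGUAGE_STANDARD = c++17
```
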
### Step 3: Implement Bridge Methods

See `LlmEngine.swift` TODO comments for specific methods to implement.

## Testing Without LLM

The app works in "demo mode" with stub responses. To test:

1. Build and run
2. Type any message
3. Observe the stub response describing the LiteRT-LM integration status

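The stub layer can be as simple as a type that satisfies the same interface the real engine will eventually use (names here are illustrative, not the actual project code):

```swift
import Foundation

// Illustrative stand-in for the real engine while inference is stubbed.
struct StubLlmEngine {
    func generate(prompt: String) -> String {
        """
        [demo mode] LiteRT-LM Swift APIs are not yet available on iOS; \
        this is a stub response. Prompt received: \(prompt)
        """
    }
}
```
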
## References

- **LiteRT-LM GitHub:** https://github.com/google-ai-edge/LiteRT-LM
- **iOS C++ Guide:** https://ai.google.dev/edge/litert-lm/cpp
- **CocoaPods:** https://cocoapods.org/pods/TensorFlowLiteSwift
- **Models:** https://huggingface.co/litert-community
- **Sample App:** https://github.com/google-ai-edge/gallery (AI Edge Gallery)

## Recommendation

For a developer build/demo:

1. Use the current stub implementation to test UI/features
2. Add the C++ bridge when ready for production LLM support
3. Monitor for the official Swift API release

The architecture is ready; only the inference backend integration remains.