Files

T

sleepy 47df14c952 Initial release: Sleepy Agent v1.0

A fully local AI assistant for Android powered by Google's Gemma 4 models.

Features:
- Fully local inference (voice/image/text on-device)
- Voice input with Voice Activity Detection
- Image understanding with camera/gallery support
- Text chat with markdown rendering
- Gemma 4 via LiteRT-LM (E2B/E4B variants)
- Model download manager
- Session management with persistent history
- Smart TTS with auto-detect mode
- Device RAM info for model selection

2026-04-06 04:17:39 +02:00

2.9 KiB

Raw Permalink Blame History

Sleepy Agent

A fully local AI assistant for Android powered by Google's Gemma 4 models via LiteRT-LM. Your conversations stay on your device - no cloud required. Can search the web when you need up-to-date information.

Features

🔒 Fully Local Inference

Voice, image, and text processing all happens on-device
No internet connection required for inference (except for web search tool)
Conversations stay private - no data sent to external AI services

🎙️ Voice Input

Tap the mic button and speak naturally
Voice Activity Detection (VAD) automatically stops recording after you finish speaking
Optional TTS (Text-to-Speech) responses when using voice input

🖼️ Image Understanding

Send images from your gallery or take a photo
Ask questions about what's in the image
Works with text prompts alongside images

📝 Text Chat

Full markdown support including tables and code blocks
Persistent conversation history
Navigate between multiple chat sessions

🧠 Gemma 4 via LiteRT-LM

Powered by Google's official LiteRT-LM SDK
Choose between E2B (2B params, ~2.7GB, faster) or E4B (4B params, ~4.5GB, higher quality)
16K token context window
KV cache reuse for faster multi-turn conversations
Performance: E2B runs at ~25-30 tokens/sec on a Samsung Galaxy Z Fold 5 (personal testing)

📥 Easy Model Setup

Download directly in the app: Settings → Download Gemma 4 E2B/E4B
Or select your own model: Use any .litertlm file from HuggingFace LiteRT Community
Device info card shows your RAM to help choose the right model

💾 Session Management

Navigation drawer shows all your past conversations
Continue previous chats or start fresh
Auto-saved conversation history

🔊 Smart TTS

Optional text-to-speech for responses
Auto-detect mode: speaks when you use voice input, silent for text input

Work in Progress

Floating Bubble: Quick access overlay (requires additional permissions)
Home Server Delegation: Optionally route requests to your own server

Requirements

Android 8.0+ (API 26)
4GB+ RAM recommended
~3GB free storage for E2B model (~5GB for E4B)

Building

See DEVELOPMENT.md for detailed build instructions and how to configure your own SearXNG server.

Quick build:

./gradlew :app:assembleDebug

The release APK is ~50MB (arm64-v8a only).

Web Search Setup

The app can search the web using a SearXNG server. To set up your own, see DEVELOPMENT.md.

Model Sources

Download .litertlm models from:

HuggingFace LiteRT Community
Gemma 4 E2B: gemma-4-E2B-it-litert-lm
Gemma 4 E4B: gemma-4-E4B-it-litert-lm

License

MIT License - See LICENSE file

2.9 KiB Raw Permalink Blame History