Initial release: Sleepy Agent v1.0

A fully local AI assistant for Android powered by Google's Gemma 4 models.

Features:
- Fully local inference (voice/image/text on-device)
- Voice input with Voice Activity Detection
- Image understanding with camera/gallery support
- Text chat with markdown rendering
- Gemma 4 via LiteRT-LM (E2B/E4B variants)
- Model download manager
- Session management with persistent history
- Smart TTS with auto-detect mode
- Device RAM info for model selection
2026-04-05 02:18:42 +02:00
commit 47df14c952
65 changed files with 7214 additions and 0 deletions
@@ -0,0 +1,75 @@
# Gradle
.gradle/
build/
!gradle/wrapper/gradle-wrapper.jar
!**/src/main/**/build/
!**/src/test/**/build/
# Local configuration
local.properties
*.properties
!.gradle.properties
!gradle.properties
# Android Studio
.idea/
*.iml
*.iws
*.ipr
.navigation/
captures/
.externalNativeBuild/
.cxx/
*.apk
*.ap_
*.aab
# macOS
.DS_Store
.AppleDouble
.LSOverride
# Windows
Thumbs.db
ehthumbs.db
Desktop.ini
# Linux
*~
.nfs*
# Vim swap files
*.swp
*.swo
# Firebase / google-services
google-services.json
# Keystore files
*.jks
*.keystore
# Output JSON files generated by the Android Gradle plugin
output.json
# Build outputs
**/build/
app/release/
app/debug/
!app-release.apk
!sleepy-agent-*.apk
# Temporary files
*.tmp
*.temp
*.log
# Test outputs
test-results/
screenshots/
# Coverage
*.ec
# NDK
obj/
@@ -0,0 +1,27 @@
MIT License
Copyright (c) 2026 Sleepy Agent Contributors
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
---
Note: While not required, the authors appreciate attribution when this software
is used in derivative works. A simple mention or link back to the original
project is always welcome!
@@ -0,0 +1,87 @@
# Sleepy Agent
A fully local AI assistant for Android powered by Google's Gemma 4 models via LiteRT-LM. Your conversations stay on your device; no cloud required. It can also search the web when you need up-to-date information.
<p align="center">
<img src="docs/screen.jpg" alt="Sleepy Agent Screenshot" width="300">
</p>
## Features
### 🔒 Fully Local Inference
- **Voice, image, and text processing** all happens on-device
- No internet connection required for inference (except for web search tool)
- Conversations stay private; no data is sent to external AI services
### 🎙️ Voice Input
- Tap the mic button and speak naturally
- Voice Activity Detection (VAD) automatically stops recording after you finish speaking
- Optional TTS (Text-to-Speech) responses when using voice input
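The VAD's job is to decide when you have finished speaking. A minimal sketch of the idea, threshold-based endpointing on the RMS level of each PCM16 frame in dB; the names here are illustrative, not the app's actual `VoiceActivityDetector` API:

```kotlin
import kotlin.math.log10
import kotlin.math.sqrt

// RMS level of a PCM16 frame in dB relative to full scale.
fun levelDb(samples: ShortArray): Double {
    if (samples.isEmpty()) return -100.0
    val meanSquare = samples.sumOf { s -> s.toDouble() * s.toDouble() } / samples.size
    val rms = sqrt(meanSquare)
    return if (rms > 0) 20 * log10(rms / Short.MAX_VALUE) else -100.0
}

// Speech starts when a frame rises above speechDb; the utterance ends
// after silenceMs of frames below silenceDb.
class SimpleVad(
    private val speechDb: Double = -40.0,
    private val silenceDb: Double = -50.0,
    private val silenceMs: Long = 2000L,
) {
    private var speaking = false
    private var silentSince: Long? = null

    /** Returns true once the utterance is considered finished. */
    fun onFrame(db: Double, nowMs: Long): Boolean {
        if (db > speechDb) {
            speaking = true
            silentSince = null
        } else if (speaking && db < silenceDb) {
            val since = silentSince ?: nowMs.also { silentSince = it }
            if (nowMs - since >= silenceMs) return true
        }
        return false
    }
}
```

The two-threshold (hysteresis) design keeps brief pauses or breaths between words from ending the recording prematurely.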
### 🖼️ Image Understanding
- Send images from your gallery or take a photo
- Ask questions about what's in the image
- Works with text prompts alongside images
### 📝 Text Chat
- Full markdown support including tables and code blocks
- Persistent conversation history
- Navigate between multiple chat sessions
### 🧠 Gemma 4 via LiteRT-LM
- Powered by Google's official LiteRT-LM SDK
- Choose between **E2B** (2B params, ~2.7GB, faster) or **E4B** (4B params, ~4.5GB, higher quality)
- 16K token context window
- KV cache reuse for faster multi-turn conversations
- **Performance**: E2B runs at ~25-30 tokens/sec on a Samsung Galaxy Z Fold 5 (personal testing)
### 📥 Easy Model Setup
- **Download directly in the app**: Settings → Download Gemma 4 E2B/E4B
- **Or select your own model**: Use any `.litertlm` file from HuggingFace LiteRT Community
- Device info card shows your RAM to help choose the right model
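On Android the total-RAM figure behind the device info card comes from `ActivityManager.getMemoryInfo()` (`MemoryInfo.totalMem`); the recommendation itself can be a pure function. A sketch with illustrative thresholds, not the app's exact cutoffs:

```kotlin
// Map total device RAM (in GB) to a suggested model variant.
// The 8 GB threshold is an assumption for illustration only.
fun suggestModel(totalRamGb: Double): String =
    if (totalRamGb >= 8.0) "Gemma 4 E4B (~4.5 GB)" else "Gemma 4 E2B (~2.7 GB)"
```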
### 💾 Session Management
- Navigation drawer shows all your past conversations
- Continue previous chats or start fresh
- Auto-saved conversation history
### 🔊 Smart TTS
- Optional text-to-speech for responses
- Auto-detect mode: speaks when you use voice input, silent for text input
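Auto-detect reduces to one decision per response: speak aloud only when the prompt arrived by voice. A sketch of that logic (enum and function names are illustrative, not the app's actual API):

```kotlin
enum class TtsMode { ALWAYS, NEVER, AUTO }
enum class InputKind { VOICE, TEXT }

// Decide whether a model response should be spoken aloud.
fun shouldSpeak(mode: TtsMode, input: InputKind): Boolean = when (mode) {
    TtsMode.ALWAYS -> true
    TtsMode.NEVER -> false
    TtsMode.AUTO -> input == InputKind.VOICE
}
```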
## Work in Progress
- **Floating Bubble**: Quick access overlay (requires additional permissions)
- **Home Server Delegation**: Optionally route requests to your own server
## Requirements
- Android 8.0+ (API 26)
- 4GB+ RAM recommended
- ~3GB free storage for E2B model (~5GB for E4B)
## Building
See [DEVELOPMENT.md](docs/DEVELOPMENT.md) for detailed build instructions and how to configure your own SearXNG server.
Quick build:
```bash
./gradlew :app:assembleDebug
```
The release APK is ~50MB (arm64-v8a only).
## Web Search Setup
The app can search the web using a SearXNG server. To set up your own, see [DEVELOPMENT.md](docs/DEVELOPMENT.md).
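SearXNG exposes a JSON API at `/search?q=<query>&format=json` (the instance must have the `json` output format enabled in its settings). A minimal sketch of building the request URL; presumably the app then issues the GET with its Ktor client:

```kotlin
import java.net.URLEncoder

// Build a SearXNG JSON search URL for a self-hosted instance.
// The base URL is a placeholder; point it at your own server.
fun searxngSearchUrl(baseUrl: String, query: String): String {
    val q = URLEncoder.encode(query, "UTF-8")
    return "${baseUrl.trimEnd('/')}/search?q=$q&format=json"
}
```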
## Model Sources
Download `.litertlm` models from:
- [HuggingFace LiteRT Community](https://huggingface.co/litert-community)
- Gemma 4 E2B: `gemma-4-E2B-it-litert-lm`
- Gemma 4 E4B: `gemma-4-E4B-it-litert-lm`
## License
MIT License - See LICENSE file
@@ -0,0 +1,31 @@
<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:tools="http://schemas.android.com/tools">
<!-- Runtime permissions -->
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.CAMERA" />
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
<application
android:allowBackup="true"
android:dataExtractionRules="@xml/data_extraction_rules"
android:icon="@mipmap/ic_launcher"
android:label="@string/app_name"
android:roundIcon="@mipmap/ic_launcher_round"
android:supportsRtl="true"
android:theme="@style/Theme.SleepyAgent"
tools:targetApi="31">
<activity
android:name=".MainActivity"
android:exported="true"
android:theme="@style/Theme.SleepyAgent">
<intent-filter>
<action android:name="android.intent.action.MAIN" />
<category android:name="android.intent.category.LAUNCHER" />
</intent-filter>
</activity>
</application>
</manifest>
@@ -0,0 +1,140 @@
plugins {
id("com.android.application")
id("org.jetbrains.kotlin.android")
id("org.jetbrains.kotlin.plugin.serialization")
// KSP removed due to KSP2/Hilt incompatibility - using manual DI instead
id("org.jetbrains.kotlin.plugin.compose")
}
android {
namespace = "com.sleepy.agent"
compileSdk = 35
defaultConfig {
applicationId = "com.sleepy.agent"
minSdk = 26
targetSdk = 35
versionCode = 1
versionName = "1.0"
testInstrumentationRunner = "androidx.test.runner.AndroidJUnitRunner"
vectorDrawables {
useSupportLibrary = true
}
}
buildTypes {
release {
isMinifyEnabled = true
isShrinkResources = true
proguardFiles(
getDefaultProguardFile("proguard-android-optimize.txt"),
"proguard-rules.pro"
)
}
}
// Only ship arm64-v8a (modern phones). Saves ~62MB!
defaultConfig {
ndk {
abiFilters += listOf("arm64-v8a")
}
}
compileOptions {
sourceCompatibility = JavaVersion.VERSION_17
targetCompatibility = JavaVersion.VERSION_17
}
kotlin {
compilerOptions {
jvmTarget = org.jetbrains.kotlin.gradle.dsl.JvmTarget.JVM_17
}
}
buildFeatures {
compose = true
}
// composeOptions.kotlinCompilerExtensionVersion removed: the Compose Compiler
// Gradle plugin (applied above) pins the compiler to the Kotlin version.
packaging {
resources {
excludes += "/META-INF/{AL2.0,LGPL2.1}"
}
}
}
dependencies {
// Core Android
implementation("androidx.core:core-ktx:1.13.1")
implementation("androidx.lifecycle:lifecycle-runtime-ktx:2.8.2")
implementation("androidx.activity:activity-compose:1.9.0")
// Compose BOM - Latest stable
implementation(platform("androidx.compose:compose-bom:2024.12.01"))
implementation("androidx.compose.ui:ui")
implementation("androidx.compose.ui:ui-graphics")
implementation("androidx.compose.ui:ui-tooling-preview")
implementation("androidx.compose.material3:material3")
implementation("androidx.compose.material:material-icons-extended")
// Navigation
implementation("androidx.navigation:navigation-compose:2.7.7")
// CameraX - Latest stable
val cameraxVersion = "1.3.4"
implementation("androidx.camera:camera-core:$cameraxVersion")
implementation("androidx.camera:camera-camera2:$cameraxVersion")
implementation("androidx.camera:camera-lifecycle:$cameraxVersion")
implementation("androidx.camera:camera-view:$cameraxVersion")
implementation("androidx.camera:camera-video:$cameraxVersion")
// Room removed - using in-memory storage instead due to KSP2 issues
// Ktor Client - Latest stable
val ktorVersion = "2.3.11"
implementation("io.ktor:ktor-client-core:$ktorVersion")
implementation("io.ktor:ktor-client-okhttp:$ktorVersion")
implementation("io.ktor:ktor-client-content-negotiation:$ktorVersion")
implementation("io.ktor:ktor-serialization-kotlinx-json:$ktorVersion")
implementation("io.ktor:ktor-client-logging:$ktorVersion")
// DataStore - Latest stable
implementation("androidx.datastore:datastore-preferences:1.1.1")
implementation("androidx.datastore:datastore:1.1.1")
// Kotlinx Serialization
implementation("org.jetbrains.kotlinx:kotlinx-serialization-json:1.6.3")
// LiteRT-LM for Gemma 4 inference (Google's current recommendation)
implementation("com.google.ai.edge.litertlm:litertlm-android:0.10.0") {
exclude(group = "org.jetbrains.kotlin", module = "kotlin-stdlib")
exclude(group = "org.jetbrains.kotlin", module = "kotlin-stdlib-jdk8")
exclude(group = "org.jetbrains.kotlin", module = "kotlin-reflect")
}
// LiteRT base library
implementation("com.google.ai.edge.litert:litert:2.1.0") {
exclude(group = "org.jetbrains.kotlin", module = "kotlin-stdlib")
}
// Note: MediaPipe removed - using LiteRT-LM for all inference
// Coroutines
implementation("org.jetbrains.kotlinx:kotlinx-coroutines-android:1.8.1")
implementation("org.jetbrains.kotlinx:kotlinx-coroutines-play-services:1.8.1")
// WorkManager (for background tasks)
implementation("androidx.work:work-runtime-ktx:2.9.0")
// Markdown rendering with tables and code blocks
implementation("com.github.jeziellago:compose-markdown:0.5.2")
// Testing
testImplementation("junit:junit:4.13.2")
testImplementation("org.jetbrains.kotlinx:kotlinx-coroutines-test:1.8.1")
androidTestImplementation("androidx.test.ext:junit:1.1.5")
androidTestImplementation("androidx.test.espresso:espresso-core:3.5.1")
androidTestImplementation(platform("androidx.compose:compose-bom:2024.12.01")) // keep in sync with the main BOM
androidTestImplementation("androidx.compose.ui:ui-test-junit4")
debugImplementation("androidx.compose.ui:ui-tooling")
debugImplementation("androidx.compose.ui:ui-test-manifest")
}
@@ -0,0 +1,36 @@
# Add project specific ProGuard rules here.
# You can control the set of applied configuration files using the
# proguardFiles setting in build.gradle.kts.
# Keep MediaPipe classes
-keep class com.google.mediapipe.** { *; }
-dontwarn com.google.mediapipe.**
# Keep TensorFlow Lite classes
-keep class org.tensorflow.** { *; }
-dontwarn org.tensorflow.**
# Keep Ktor classes
-keep class io.ktor.** { *; }
-dontwarn io.ktor.**
# Keep Kotlinx Serialization
-keepattributes *Annotation*, InnerClasses
-dontnote kotlinx.serialization.AnnotationsKt
-keepclassmembers class kotlinx.serialization.json.** { *; }
# SLF4J
-dontwarn org.slf4j.**
-keep class org.slf4j.** { *; }
# LiteRT-LM
-keep class com.google.ai.edge.litertlm.** { *; }
-dontwarn com.google.ai.edge.litertlm.**
# Gson (used by LiteRT-LM)
-keep class com.google.gson.** { *; }
-dontwarn com.google.gson.**
# Markdown rendering
-keep class dev.jeziellago.compose.markdowntext.** { *; }
-dontwarn dev.jeziellago.compose.markdowntext.**
@@ -0,0 +1,82 @@
<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:tools="http://schemas.android.com/tools">
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.CAMERA" />
<!-- Storage permissions for model files -->
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE"
android:maxSdkVersion="32" />
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"
android:maxSdkVersion="32" />
<uses-permission android:name="android.permission.READ_MEDIA_AUDIO" />
<uses-permission android:name="android.permission.READ_MEDIA_IMAGES" />
<uses-permission android:name="android.permission.SYSTEM_ALERT_WINDOW" />
<uses-permission android:name="android.permission.FOREGROUND_SERVICE" />
<uses-permission android:name="android.permission.FOREGROUND_SERVICE_SPECIAL_USE" />
<uses-feature android:name="android.hardware.camera" android:required="false" />
<uses-feature android:name="android.hardware.camera.autofocus" android:required="false" />
<uses-feature android:name="android.hardware.microphone" android:required="true" />
<queries>
<!-- For file picker -->
<intent>
<action android:name="android.intent.action.OPEN_DOCUMENT" />
<data android:mimeType="*/*" />
</intent>
<!-- For camera app -->
<intent>
<action android:name="android.media.action.IMAGE_CAPTURE" />
</intent>
</queries>
<application
android:name=".SleepyAgentApplication"
android:allowBackup="true"
android:dataExtractionRules="@xml/data_extraction_rules"
android:fullBackupContent="@xml/backup_rules"
android:icon="@mipmap/ic_launcher"
android:label="@string/app_name"
android:roundIcon="@mipmap/ic_launcher_round"
android:supportsRtl="true"
android:theme="@style/Theme.SleepyAgent"
android:largeHeap="true"
android:networkSecurityConfig="@xml/network_security_config"
tools:targetApi="34">
<!-- Native libraries for LiteRT-LM GPU support -->
<uses-native-library
android:name="libvndksupport.so"
android:required="false"/>
<uses-native-library
android:name="libOpenCL.so"
android:required="false"/>
<activity
android:name=".MainActivity"
android:exported="true"
android:theme="@style/Theme.SleepyAgent"
android:windowSoftInputMode="adjustResize">
<intent-filter>
<action android:name="android.intent.action.MAIN" />
<category android:name="android.intent.category.LAUNCHER" />
</intent-filter>
</activity>
<!-- Floating Button Service -->
<service
android:name=".service.FloatingButtonService"
android:enabled="true"
android:exported="false"
android:foregroundServiceType="specialUse">
<property
android:name="android.app.PROPERTY_SPECIAL_USE_FGS_SUBTYPE"
android:value="system_alert_window" />
</service>
</application>
</manifest>
@@ -0,0 +1,294 @@
package com.sleepy.agent
import android.Manifest
import android.content.Context
import android.content.Intent
import android.content.pm.PackageManager
import android.graphics.Bitmap
import android.graphics.BitmapFactory
import android.media.projection.MediaProjectionManager
import android.net.Uri
import android.os.Build
import android.os.Bundle
import android.provider.Settings
import android.widget.Toast
import androidx.activity.ComponentActivity
import androidx.activity.compose.rememberLauncherForActivityResult
import androidx.activity.compose.setContent
import androidx.activity.enableEdgeToEdge
import androidx.activity.result.ActivityResultLauncher
import androidx.activity.result.contract.ActivityResultContracts
import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.Surface
import androidx.compose.runtime.LaunchedEffect
import androidx.compose.runtime.getValue
import androidx.compose.runtime.mutableStateOf
import androidx.compose.runtime.remember
import androidx.compose.runtime.setValue
import androidx.compose.ui.Modifier
import androidx.core.content.ContextCompat
import androidx.lifecycle.Lifecycle
import androidx.lifecycle.lifecycleScope
import androidx.lifecycle.repeatOnLifecycle
import androidx.lifecycle.viewmodel.compose.viewModel
import androidx.navigation.compose.NavHost
import androidx.navigation.compose.composable
import androidx.navigation.compose.rememberNavController
import com.sleepy.agent.di.AppModule
import com.sleepy.agent.service.FloatingButtonService
import com.sleepy.agent.ui.screens.MainScreen
import com.sleepy.agent.ui.screens.MainViewModel
import com.sleepy.agent.ui.screens.MainViewModelFactory
import com.sleepy.agent.ui.screens.SettingsScreen
import com.sleepy.agent.ui.screens.SettingsViewModel
import com.sleepy.agent.ui.theme.SleepyAgentTheme
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.launch
class MainActivity : ComponentActivity() {
private lateinit var appModule: AppModule
private val requiredPermissions = if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.TIRAMISU) {
arrayOf(
Manifest.permission.RECORD_AUDIO,
Manifest.permission.CAMERA,
Manifest.permission.INTERNET,
Manifest.permission.ACCESS_NETWORK_STATE,
Manifest.permission.READ_MEDIA_IMAGES
)
} else {
arrayOf(
Manifest.permission.RECORD_AUDIO,
Manifest.permission.CAMERA,
Manifest.permission.INTERNET,
Manifest.permission.ACCESS_NETWORK_STATE,
Manifest.permission.READ_EXTERNAL_STORAGE,
Manifest.permission.WRITE_EXTERNAL_STORAGE
)
}
private val permissionLauncher = registerForActivityResult(
ActivityResultContracts.RequestMultiplePermissions()
) { permissions ->
permissions.entries.forEach { entry ->
android.util.Log.d("MainActivity", "Permission ${entry.key}: ${entry.value}")
}
}
// For MediaProjection (screenshot capture)
private var mediaProjectionLauncher: ActivityResultLauncher<Intent>? = null
override fun onCreate(savedInstanceState: Bundle?) {
super.onCreate(savedInstanceState)
appModule = SleepyAgentApplication.getAppModule(application)
// Setup MediaProjection launcher for screenshots
mediaProjectionLauncher = registerForActivityResult(
ActivityResultContracts.StartActivityForResult()
) { result ->
if (result.resultCode == RESULT_OK && result.data != null) {
val mediaProjectionManager = getSystemService(Context.MEDIA_PROJECTION_SERVICE) as MediaProjectionManager
FloatingButtonService.mediaProjection = mediaProjectionManager.getMediaProjection(result.resultCode, result.data!!)
Toast.makeText(this, "Screen capture enabled", Toast.LENGTH_SHORT).show()
}
}
// minSdk is 26, so runtime-permission APIs are always available; no SDK check needed.
requestPermissionsIfNeeded()
// Handle intent from floating button with screenshot
handleIntent(intent)
enableEdgeToEdge()
setContent {
SleepyAgentTheme {
Surface(
modifier = Modifier.fillMaxSize(),
color = MaterialTheme.colorScheme.background
) {
val navController = rememberNavController()
var screenshotBitmap by remember { mutableStateOf<Bitmap?>(null) }
var autoAnalyze by remember { mutableStateOf(false) }
// Handle screenshot from intent
LaunchedEffect(Unit) {
intent.getStringExtra("screenshot_path")?.let { path ->
screenshotBitmap = BitmapFactory.decodeFile(path)
autoAnalyze = intent.getBooleanExtra("auto_analyze", false)
}
}
NavHost(
navController = navController,
startDestination = "main"
) {
composable("main") { backStackEntry ->
val viewModel: MainViewModel = viewModel(
viewModelStoreOwner = backStackEntry,
factory = MainViewModelFactory(appModule, applicationContext, backStackEntry)
)
var pendingImageText by remember { mutableStateOf("") }
val imagePickerLauncher = rememberLauncherForActivityResult(
contract = ActivityResultContracts.GetContent()
) { uri: Uri? ->
uri?.let {
val bitmap = loadBitmapFromUri(it)
viewModel.onImageSelected(bitmap, pendingImageText)
pendingImageText = ""
}
}
// Handle screenshot from floating button
LaunchedEffect(screenshotBitmap) {
screenshotBitmap?.let { bitmap ->
if (autoAnalyze) {
viewModel.onImageSelected(bitmap, "Analyze this screenshot and tell me what you see. Ask if I have follow-up questions.")
} else {
viewModel.onImageSelected(bitmap, "")
}
screenshotBitmap = null
autoAnalyze = false
}
}
MainScreen(
onNavigateToSettings = {
navController.navigate("settings")
},
viewModel = viewModel,
onPickImage = { text ->
pendingImageText = text
imagePickerLauncher.launch("image/*")
}
)
}
composable("settings") {
val viewModel = appModule.createSettingsViewModel()
SettingsScreen(
onNavigateBack = {
navController.popBackStack()
},
viewModel = viewModel,
onRequestOverlayPermission = {
requestOverlayPermission()
},
onRequestMediaProjection = {
requestMediaProjection()
}
)
}
}
}
}
}
}
override fun onNewIntent(intent: Intent) {
super.onNewIntent(intent)
handleIntent(intent)
}
private fun handleIntent(intent: Intent?) {
intent?.let {
// Check if we need to handle screenshot
if (it.getBooleanExtra("from_floating_button", false)) {
android.util.Log.d("MainActivity", "Opened from floating button")
}
}
}
private fun requestPermissionsIfNeeded() {
// Only ask for permissions on first launch - store flag in SharedPreferences
val prefs = getSharedPreferences("app_prefs", Context.MODE_PRIVATE)
val hasRequestedPermissions = prefs.getBoolean("has_requested_permissions", false)
val permissionsToRequest = requiredPermissions.filter {
ContextCompat.checkSelfPermission(this, it) != PackageManager.PERMISSION_GRANTED
}.toTypedArray()
if (permissionsToRequest.isNotEmpty() && !hasRequestedPermissions) {
permissionLauncher.launch(permissionsToRequest)
prefs.edit().putBoolean("has_requested_permissions", true).apply()
}
}
private fun requestOverlayPermission() {
if (!Settings.canDrawOverlays(this)) {
val intent = Intent(
Settings.ACTION_MANAGE_OVERLAY_PERMISSION,
Uri.parse("package:$packageName")
)
startActivity(intent)
}
}
private fun requestMediaProjection() {
val mediaProjectionManager = getSystemService(Context.MEDIA_PROJECTION_SERVICE) as MediaProjectionManager
mediaProjectionLauncher?.launch(mediaProjectionManager.createScreenCaptureIntent())
}
override fun onResume() {
super.onResume()
// Check floating button setting and start/stop service
lifecycleScope.launch {
try {
val userSettings = appModule.userSettings
val floatingEnabled = userSettings.floatingButtonEnabled.first()
val hasOverlayPermission = Settings.canDrawOverlays(this@MainActivity)
if (floatingEnabled && hasOverlayPermission) {
// Only start if not already running
if (!FloatingButtonService.isRunning) {
FloatingButtonService.start(this@MainActivity)
}
} else {
// Stop if running
if (FloatingButtonService.isRunning) {
FloatingButtonService.stop(this@MainActivity)
}
// If enabled but no permission, disable the setting
if (floatingEnabled && !hasOverlayPermission) {
userSettings.setFloatingButtonEnabled(false)
android.util.Log.w("MainActivity", "Floating button disabled: overlay permission not granted")
}
}
} catch (e: Exception) {
android.util.Log.e("MainActivity", "Error managing floating button service", e)
}
}
}
private fun loadBitmapFromUri(uri: Uri): Bitmap? {
return try {
contentResolver.openInputStream(uri)?.use { stream ->
val options = BitmapFactory.Options().apply {
inJustDecodeBounds = true
}
BitmapFactory.decodeStream(stream, null, options)
var sampleSize = 1
while (options.outWidth / sampleSize > 2048 || options.outHeight / sampleSize > 2048) {
sampleSize *= 2
}
contentResolver.openInputStream(uri)?.use { stream2 ->
val decodeOptions = BitmapFactory.Options().apply {
inSampleSize = sampleSize
inPreferredConfig = Bitmap.Config.ARGB_8888
}
BitmapFactory.decodeStream(stream2, null, decodeOptions)
}
}
} catch (e: Exception) {
android.util.Log.e("MainActivity", "Error loading bitmap", e)
null
}
}
}
@@ -0,0 +1,21 @@
package com.sleepy.agent
import android.app.Application
import com.sleepy.agent.di.AppModule
class SleepyAgentApplication : Application() {
lateinit var appModule: AppModule
private set
override fun onCreate() {
super.onCreate()
appModule = AppModule(this)
}
companion object {
fun getAppModule(application: Application): AppModule {
return (application as SleepyAgentApplication).appModule
}
}
}
@@ -0,0 +1,303 @@
package com.sleepy.agent.audio
import android.Manifest
import android.content.Context
import android.content.pm.PackageManager
import android.media.AudioFormat
import android.media.AudioRecord
import android.media.MediaRecorder
import android.util.Log
import androidx.core.content.ContextCompat
import kotlinx.coroutines.*
import java.io.ByteArrayOutputStream
import java.nio.ByteBuffer
import java.nio.ByteOrder
import java.util.concurrent.atomic.AtomicBoolean
interface AudioRecorder {
fun startRecording(): Result<Unit>
suspend fun stopRecording(): ByteArray
fun isRecording(): Boolean
fun setOnAudioChunkListener(listener: (ByteArray) -> Unit)
fun setOnSilenceDetectedListener(listener: (() -> Unit)?)
}
class AudioRecorderImpl(
private val context: Context
) : AudioRecorder {
companion object {
private const val TAG = "AudioRecorder"
private const val SAMPLE_RATE = 16000
private const val CHANNEL_CONFIG = AudioFormat.CHANNEL_IN_MONO
private const val AUDIO_FORMAT = AudioFormat.ENCODING_PCM_16BIT
private const val BYTES_PER_SAMPLE = 2 // 16-bit = 2 bytes
private const val CHUNK_DURATION_MS = 1000 // 1 second chunks
private const val SILENCE_TIMEOUT_MS = 2000L // Stop after 2 seconds of silence
}
private var audioRecord: AudioRecord? = null
private var recordingJob: Job? = null
private val isRecordingState = AtomicBoolean(false)
private val recordedData = ByteArrayOutputStream()
private var audioChunkListener: ((ByteArray) -> Unit)? = null
private var silenceListener: (() -> Unit)? = null
private val scope = CoroutineScope(SupervisorJob() + Dispatchers.IO)
// VAD for auto-stopping on silence
private val vad = VoiceActivityDetector(
silenceThresholdMs = SILENCE_TIMEOUT_MS,
speechThresholdDb = -40.0,
silenceThresholdDb = -50.0
)
private val bufferSize: Int by lazy {
val minBufferSize = AudioRecord.getMinBufferSize(
SAMPLE_RATE,
CHANNEL_CONFIG,
AUDIO_FORMAT
)
val oneSecondBufferSize = SAMPLE_RATE * BYTES_PER_SAMPLE
maxOf(minBufferSize, oneSecondBufferSize)
}
private val chunkSize: Int
get() = SAMPLE_RATE * BYTES_PER_SAMPLE // 1 second = 32000 bytes
override fun setOnAudioChunkListener(listener: (ByteArray) -> Unit) {
audioChunkListener = listener
}
override fun setOnSilenceDetectedListener(listener: (() -> Unit)?) {
silenceListener = listener
vad.setCallbacks(
onSpeechStart = null,
onSpeechEnd = listener,
onAudioLevel = null
)
}
override fun startRecording(): Result<Unit> {
if (isRecordingState.get()) {
return Result.failure(IllegalStateException("Already recording"))
}
if (!hasRecordAudioPermission()) {
return Result.failure(SecurityException("RECORD_AUDIO permission not granted"))
}
return try {
initializeAudioRecord()
startRecordingInternal()
Result.success(Unit)
} catch (e: Exception) {
Log.e(TAG, "Failed to start recording", e)
cleanup()
Result.failure(e)
}
}
private fun hasRecordAudioPermission(): Boolean {
return ContextCompat.checkSelfPermission(
context,
Manifest.permission.RECORD_AUDIO
) == PackageManager.PERMISSION_GRANTED
}
private fun initializeAudioRecord() {
audioRecord = AudioRecord(
MediaRecorder.AudioSource.MIC,
SAMPLE_RATE,
CHANNEL_CONFIG,
AUDIO_FORMAT,
bufferSize
)
if (audioRecord?.state != AudioRecord.STATE_INITIALIZED) {
throw IllegalStateException("AudioRecord initialization failed")
}
}
private fun startRecordingInternal() {
audioRecord?.startRecording()
isRecordingState.set(true)
recordedData.reset()
// Start VAD
vad.setCallbacks(
onSpeechStart = { Log.d(TAG, "VAD: Speech detected") },
onSpeechEnd = {
Log.d(TAG, "VAD: Silence detected, auto-stopping")
silenceListener?.invoke()
},
onAudioLevel = { db ->
// Log audio level for debugging
// Log.v(TAG, "Audio level: ${db.toInt()} dB")
}
)
vad.start()
recordingJob = scope.launch {
recordAudioChunks()
}
Log.d(TAG, "Recording started with VAD (sampleRate=$SAMPLE_RATE, bufferSize=$bufferSize)")
}
private suspend fun recordAudioChunks() {
val audioRecord = this.audioRecord ?: return
val buffer = ShortArray(chunkSize / BYTES_PER_SAMPLE)
// Check the running coroutine's own context: `recordingJob` may not be
// assigned yet when this coroutine starts executing on Dispatchers.IO.
while (isRecordingState.get() && currentCoroutineContext().isActive) {
val bytesRead = audioRecord.read(buffer, 0, buffer.size)
if (bytesRead > 0) {
val byteBuffer = ByteBuffer.allocate(bytesRead * BYTES_PER_SAMPLE)
byteBuffer.order(ByteOrder.LITTLE_ENDIAN)
buffer.take(bytesRead).forEach { short ->
byteBuffer.putShort(short)
}
val chunkBytes = byteBuffer.array()
synchronized(recordedData) {
recordedData.write(chunkBytes)
}
// Process for VAD
val audioLevel = calculateAudioLevelDb(chunkBytes)
vad.processAudio(chunkBytes, audioLevel)
audioChunkListener?.invoke(chunkBytes)
} else if (bytesRead < 0) {
Log.e(TAG, "AudioRecord read error: $bytesRead")
break
}
}
}
override suspend fun stopRecording(): ByteArray {
if (!isRecordingState.get()) {
Log.w(TAG, "stopRecording called but not recording")
return byteArrayOf()
}
Log.d(TAG, "Stopping recording...")
// Cancel job first, then stop state
recordingJob?.cancel()
withContext(Dispatchers.IO) {
recordingJob?.join()
}
isRecordingState.set(false)
// Stop VAD
vad.stop()
audioRecord?.let { record ->
try {
record.stop()
Log.d(TAG, "AudioRecord stopped successfully")
} catch (e: IllegalStateException) {
Log.w(TAG, "AudioRecord stop failed (may not have started)", e)
}
}
val result = synchronized(recordedData) {
recordedData.toByteArray()
}
cleanup()
// Validate the audio data
if (result.isEmpty()) {
Log.w(TAG, "No audio data recorded")
return byteArrayOf()
}
if (result.size < 6400) { // Less than 200ms at 16kHz 16-bit
Log.w(TAG, "Audio too short: ${result.size} bytes (${result.size / 32}ms)")
return byteArrayOf()
}
Log.d(TAG, "Recording stopped, captured ${result.size} bytes (${result.size / 32}ms of audio)")
return result
}
private fun cleanup() {
recordingJob?.cancel()
recordingJob = null
audioRecord?.release()
audioRecord = null
vad.stop()
recordedData.reset()
isRecordingState.set(false)
}
/**
* Calculate audio level in dB from PCM16 buffer.
*/
private fun calculateAudioLevelDb(buffer: ByteArray): Double {
if (buffer.size < 2) return -100.0
val byteBuffer = ByteBuffer.wrap(buffer)
byteBuffer.order(ByteOrder.LITTLE_ENDIAN)
var sum = 0.0
var count = 0
while (byteBuffer.remaining() >= 2) {
val sample = byteBuffer.short.toInt()
sum += sample * sample
count++
}
if (count == 0) return -100.0
val rms = kotlin.math.sqrt(sum / count)
return if (rms > 0) {
20 * kotlin.math.log10(rms / Short.MAX_VALUE)
} else {
-100.0
}
}
override fun isRecording(): Boolean {
return isRecordingState.get()
}
fun getRecordingDurationMs(): Long {
val byteCount = synchronized(recordedData) {
recordedData.size()
}
return (byteCount / BYTES_PER_SAMPLE * 1000L) / SAMPLE_RATE
}
fun getAudioLevel(buffer: ByteArray): Double {
if (buffer.isEmpty()) return 0.0
val byteBuffer = ByteBuffer.wrap(buffer)
byteBuffer.order(ByteOrder.LITTLE_ENDIAN)
var sum = 0.0
var count = 0
while (byteBuffer.remaining() >= 2) {
val sample = byteBuffer.short.toInt()
sum += sample * sample
count++
}
if (count == 0) return 0.0
val rms = kotlin.math.sqrt(sum / count)
return 20 * kotlin.math.log10(rms / Short.MAX_VALUE)
}
}
class AudioPermissionException(message: String) : SecurityException(message)
@@ -0,0 +1,147 @@
package com.sleepy.agent.audio
import android.content.Context
import android.content.Intent
import android.net.Uri
import android.speech.tts.TextToSpeech
import android.speech.tts.UtteranceProgressListener
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.asStateFlow
import java.util.Locale
enum class TtsState { INITIALIZING, READY, ERROR }
class TtsService(
private val context: Context
) {
private var textToSpeech: TextToSpeech? = null
private val _state = MutableStateFlow(TtsState.INITIALIZING)
val state: StateFlow<TtsState> = _state.asStateFlow()
private var isInitialized = false
private var pendingCompletionCallback: (() -> Unit)? = null
/**
* Initializes the TTS engine and emits state changes.
* Emits: INITIALIZING → READY (or ERROR if TTS not available)
*/
fun initialize(): Flow<TtsState> {
if (isInitialized && textToSpeech != null) {
_state.value = TtsState.READY
return state
}
_state.value = TtsState.INITIALIZING
textToSpeech = TextToSpeech(context) { status ->
when (status) {
TextToSpeech.SUCCESS -> {
textToSpeech?.let { tts ->
// Set up progress listener for completion callbacks
tts.setOnUtteranceProgressListener(object : UtteranceProgressListener() {
override fun onStart(utteranceId: String?) {
// Speech started
}
override fun onDone(utteranceId: String?) {
pendingCompletionCallback?.invoke()
pendingCompletionCallback = null
}
override fun onError(utteranceId: String?) {
pendingCompletionCallback?.invoke()
pendingCompletionCallback = null
}
})
// Set default language
val result = tts.setLanguage(Locale.getDefault())
if (result == TextToSpeech.LANG_MISSING_DATA ||
result == TextToSpeech.LANG_NOT_SUPPORTED
) {
// Language not available, but TTS is still functional
// Fallback to US English
tts.setLanguage(Locale.US)
}
isInitialized = true
_state.value = TtsState.READY
}
}
TextToSpeech.ERROR -> {
_state.value = TtsState.ERROR
// TTS engine not installed - redirect to install
redirectToTtsInstall()
}
}
}
return state
}
/**
* Speaks the given text. Optional callback invoked when speech completes.
*/
fun speak(text: String, onComplete: (() -> Unit)? = null) {
if (!isInitialized || textToSpeech == null) {
onComplete?.invoke()
return
}
// Stop any current speech before registering the new callback, so a
// terminal event from the previous utterance cannot consume it early
stop()
pendingCompletionCallback = onComplete
val utteranceId = System.currentTimeMillis().toString()
textToSpeech?.speak(text, TextToSpeech.QUEUE_FLUSH, null, utteranceId)
}
/**
* Stops the current speech.
*/
fun stop() {
textToSpeech?.stop()
}
/**
* Returns true if currently speaking.
*/
fun isSpeaking(): Boolean {
return textToSpeech?.isSpeaking ?: false
}
/**
* Shuts down the TTS engine and releases resources.
*/
fun shutdown() {
textToSpeech?.stop()
textToSpeech?.shutdown()
textToSpeech = null
isInitialized = false
pendingCompletionCallback = null
_state.value = TtsState.INITIALIZING
}
/**
* Redirects user to install TTS engine from Play Store.
*/
private fun redirectToTtsInstall() {
try {
val intent = Intent(Intent.ACTION_VIEW).apply {
data = Uri.parse("market://details?id=com.google.android.tts")
flags = Intent.FLAG_ACTIVITY_NEW_TASK
}
context.startActivity(intent)
} catch (e: Exception) {
// Play Store not available, open in browser
val intent = Intent(Intent.ACTION_VIEW).apply {
data = Uri.parse("https://play.google.com/store/apps/details?id=com.google.android.tts")
flags = Intent.FLAG_ACTIVITY_NEW_TASK
}
context.startActivity(intent)
}
}
}
@@ -0,0 +1,116 @@
package com.sleepy.agent.audio
import android.util.Log
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicBoolean
import java.util.concurrent.atomic.AtomicLong
/**
* Simple Voice Activity Detector that monitors audio levels and triggers
* callbacks when speech starts and stops.
*/
class VoiceActivityDetector(
private val silenceThresholdMs: Long = 2000, // Stop after 2 seconds of silence
private val speechThresholdDb: Double = -40.0, // Consider speech above -40dB
private val silenceThresholdDb: Double = -50.0 // Consider silence below -50dB
) {
companion object {
private const val TAG = "VAD"
private const val MIN_SPEECH_DURATION_MS = 500 // Minimum speech before we start monitoring for silence
}
private val isRunning = AtomicBoolean(false)
private val lastSpeechTime = AtomicLong(0)
private val speechStartTime = AtomicLong(0)
private val hasDetectedSpeech = AtomicBoolean(false)
private var onSpeechStart: (() -> Unit)? = null
private var onSpeechEnd: (() -> Unit)? = null
private var onAudioLevel: ((Double) -> Unit)? = null
private var monitoringJob: Job? = null
private val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default)
fun setCallbacks(
onSpeechStart: (() -> Unit)? = null,
onSpeechEnd: (() -> Unit)? = null,
onAudioLevel: ((Double) -> Unit)? = null
) {
this.onSpeechStart = onSpeechStart
this.onSpeechEnd = onSpeechEnd
this.onAudioLevel = onAudioLevel
}
fun start() {
isRunning.set(true)
hasDetectedSpeech.set(false)
lastSpeechTime.set(System.currentTimeMillis())
speechStartTime.set(0)
// Start monitoring job
monitoringJob = scope.launch {
while (isRunning.get()) {
checkForSilenceTimeout()
delay(100)
}
}
Log.d(TAG, "VAD started")
}
fun stop() {
isRunning.set(false)
monitoringJob?.cancel()
monitoringJob = null
hasDetectedSpeech.set(false)
Log.d(TAG, "VAD stopped")
}
/**
* Process audio buffer for voice activity detection.
* Call this with each audio chunk received.
*/
fun processAudio(buffer: ByteArray, audioLevelDb: Double) {
if (!isRunning.get()) return
onAudioLevel?.invoke(audioLevelDb)
val now = System.currentTimeMillis()
if (audioLevelDb > speechThresholdDb) {
// Speech detected
if (!hasDetectedSpeech.get()) {
// First speech detection
hasDetectedSpeech.set(true)
speechStartTime.set(now)
lastSpeechTime.set(now)
onSpeechStart?.invoke()
Log.d(TAG, "Speech started")
} else {
// Ongoing speech
lastSpeechTime.set(now)
}
}
}
private fun checkForSilenceTimeout() {
if (!hasDetectedSpeech.get()) return
val now = System.currentTimeMillis()
val speechStart = speechStartTime.get()
val lastSpeech = lastSpeechTime.get()
val speechDuration = now - speechStart
val silenceDuration = now - lastSpeech
// Only check for silence after minimum speech duration
if (speechDuration > MIN_SPEECH_DURATION_MS && silenceDuration > silenceThresholdMs) {
Log.d(TAG, "Silence detected for ${silenceDuration}ms, triggering speech end")
onSpeechEnd?.invoke()
// Reset to prevent multiple triggers
hasDetectedSpeech.set(false)
}
}
fun isActive(): Boolean = isRunning.get()
fun hasDetectedSpeech(): Boolean = hasDetectedSpeech.get()
}
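The detector's hysteresis reduces to a pure function over (timestamp, level) samples. A standalone sketch of the same start/stop rules, using the constructor's default thresholds (names here are illustrative, not part of the class above):

```kotlin
// Replays a sequence of (timestampMs, levelDb) frames through the same rules:
// speech starts above speechThresholdDb; speech ends once at least minSpeechMs
// have elapsed since the first speech frame and silence has lasted silenceMs.
// Returns the timestamp at which onSpeechEnd would fire, or null.
fun detectSpeechEnd(
    frames: List<Pair<Long, Double>>,
    speechThresholdDb: Double = -40.0,
    silenceMs: Long = 2000,
    minSpeechMs: Long = 500
): Long? {
    var speechStart = -1L
    var lastSpeech = -1L
    for ((t, db) in frames) {
        if (db > speechThresholdDb) {
            if (speechStart < 0) speechStart = t // first speech frame
            lastSpeech = t
        } else if (speechStart >= 0 &&
            t - speechStart > minSpeechMs &&
            t - lastSpeech > silenceMs
        ) {
            return t // speech end would fire here
        }
    }
    return null
}
```

With one second of speech followed by silence, the end fires once the silence gap exceeds 2000 ms; a sequence with no speech at all never fires.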
@@ -0,0 +1,82 @@
package com.sleepy.agent.audio
import java.io.ByteArrayOutputStream
import java.nio.ByteBuffer
import java.nio.ByteOrder
/**
* Converts raw PCM16 audio to WAV format.
* Gemma 4 E2B's miniaudio decoder expects WAV format with proper headers.
*/
object WavConverter {
private const val SAMPLE_RATE = 16000
private const val CHANNELS = 1 // Mono
private const val BITS_PER_SAMPLE = 16
/**
* Converts raw PCM16 audio data to WAV format with proper headers.
*
* @param pcmData Raw PCM16 little-endian audio data
* @return WAV formatted byte array with RIFF headers
*/
fun pcmToWav(pcmData: ByteArray): ByteArray {
if (pcmData.isEmpty()) {
return byteArrayOf()
}
val byteRate = SAMPLE_RATE * CHANNELS * BITS_PER_SAMPLE / 8
val blockAlign = CHANNELS * BITS_PER_SAMPLE / 8
val dataSize = pcmData.size
val totalSize = 36 + dataSize // RIFF chunk size: 44-byte header minus the 8 bytes for "RIFF" + size field, plus data
val output = ByteArrayOutputStream()
// RIFF chunk descriptor
output.write("RIFF".toByteArray(Charsets.US_ASCII))
output.writeInt(totalSize)
output.write("WAVE".toByteArray(Charsets.US_ASCII))
// fmt sub-chunk
output.write("fmt ".toByteArray(Charsets.US_ASCII))
output.writeInt(16) // Subchunk size (16 for PCM)
output.writeShort(1) // Audio format (1 = PCM)
output.writeShort(CHANNELS.toShort())
output.writeInt(SAMPLE_RATE)
output.writeInt(byteRate)
output.writeShort(blockAlign.toShort())
output.writeShort(BITS_PER_SAMPLE.toShort())
// data sub-chunk
output.write("data".toByteArray(Charsets.US_ASCII))
output.writeInt(dataSize)
output.write(pcmData)
return output.toByteArray()
}
/**
* Check if audio data already has WAV headers (starts with "RIFF")
*/
fun isWav(data: ByteArray): Boolean {
return data.size >= 4 &&
data[0] == 'R'.code.toByte() &&
data[1] == 'I'.code.toByte() &&
data[2] == 'F'.code.toByte() &&
data[3] == 'F'.code.toByte()
}
private fun ByteArrayOutputStream.writeInt(value: Int) {
val buffer = ByteBuffer.allocate(4)
buffer.order(ByteOrder.LITTLE_ENDIAN)
buffer.putInt(value)
write(buffer.array())
}
private fun ByteArrayOutputStream.writeShort(value: Short) {
val buffer = ByteBuffer.allocate(2)
buffer.order(ByteOrder.LITTLE_ENDIAN)
buffer.putShort(value)
write(buffer.array())
}
}
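The 44-byte RIFF header that `pcmToWav` emits can be checked field by field. A self-contained sketch of the same little-endian layout (16 kHz mono PCM16, matching the object's constants; `wavHeader` is an illustrative name, not part of the file above):

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

fun wavHeader(dataSize: Int, sampleRate: Int = 16_000, channels: Int = 1, bits: Int = 16): ByteArray {
    val byteRate = sampleRate * channels * bits / 8
    val blockAlign = channels * bits / 8
    val buf = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN)
    buf.put("RIFF".toByteArray(Charsets.US_ASCII))
    buf.putInt(36 + dataSize)              // chunk size = file size - 8
    buf.put("WAVE".toByteArray(Charsets.US_ASCII))
    buf.put("fmt ".toByteArray(Charsets.US_ASCII))
    buf.putInt(16)                         // PCM fmt sub-chunk size
    buf.putShort(1.toShort())              // audio format: 1 = PCM
    buf.putShort(channels.toShort())
    buf.putInt(sampleRate)
    buf.putInt(byteRate)
    buf.putShort(blockAlign.toShort())
    buf.putShort(bits.toShort())
    buf.put("data".toByteArray(Charsets.US_ASCII))
    buf.putInt(dataSize)
    return buf.array()
}
```

Field offsets worth remembering when debugging a rejected file: bytes 4-7 hold the chunk size, 22-23 the channel count, 24-27 the sample rate, 40-43 the data size.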
@@ -0,0 +1,173 @@
package com.sleepy.agent.camera
import android.content.Context
import android.graphics.Bitmap
import android.graphics.BitmapFactory
import android.graphics.ImageFormat
import android.graphics.Matrix
import android.graphics.Rect
import android.graphics.YuvImage
import android.util.Log
import androidx.camera.core.CameraSelector
import androidx.camera.core.ImageAnalysis
import androidx.camera.core.ImageCapture
import androidx.camera.core.ImageCaptureException
import androidx.camera.core.ImageProxy
import androidx.camera.core.Preview
import androidx.camera.lifecycle.ProcessCameraProvider
import androidx.core.content.ContextCompat
import androidx.lifecycle.LifecycleOwner
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext
import java.io.ByteArrayOutputStream
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException
import kotlin.coroutines.suspendCoroutine
/**
* Simple camera capture utility for taking photos to send to the multimodal model.
*/
class CameraCapture(private val context: Context) {
companion object {
private const val TAG = "CameraCapture"
}
private var imageCapture: ImageCapture? = null
private var cameraExecutor: ExecutorService = Executors.newSingleThreadExecutor()
/**
* Starts camera preview and returns a capture function.
* Call this from a Compose AndroidView or similar.
*/
suspend fun startCamera(
lifecycleOwner: LifecycleOwner,
previewView: androidx.camera.view.PreviewView
): Result<Unit> = withContext(Dispatchers.Main) {
try {
val cameraProvider = suspendCoroutine<ProcessCameraProvider> { continuation ->
ProcessCameraProvider.getInstance(context).apply {
addListener({
try {
continuation.resume(get())
} catch (e: Exception) {
continuation.resumeWithException(e)
}
}, ContextCompat.getMainExecutor(context))
}
}
// Set up preview
val preview = Preview.Builder()
.build()
.also { it.setSurfaceProvider(previewView.surfaceProvider) }
// Set up image capture
imageCapture = ImageCapture.Builder()
.setCaptureMode(ImageCapture.CAPTURE_MODE_MINIMIZE_LATENCY)
.build()
// Select back camera
val cameraSelector = CameraSelector.DEFAULT_BACK_CAMERA
// Unbind all use cases and rebind
cameraProvider.unbindAll()
cameraProvider.bindToLifecycle(
lifecycleOwner,
cameraSelector,
preview,
imageCapture
)
Result.success(Unit)
} catch (e: Exception) {
Log.e(TAG, "Failed to start camera", e)
Result.failure(e)
}
}
/**
* Captures a photo and returns it as a Bitmap.
*/
suspend fun capturePhoto(): Result<Bitmap> = withContext(Dispatchers.IO) {
try {
val capture = imageCapture ?: throw IllegalStateException("Camera not started")
val bitmap = suspendCoroutine<Bitmap> { continuation ->
capture.takePicture(
cameraExecutor,
object : ImageCapture.OnImageCapturedCallback() {
override fun onCaptureSuccess(image: ImageProxy) {
try {
val bitmap = imageProxyToBitmap(image)
image.close()
continuation.resume(bitmap)
} catch (e: Exception) {
image.close()
continuation.resumeWithException(e)
}
}
override fun onError(exception: ImageCaptureException) {
continuation.resumeWithException(exception)
}
}
)
}
Result.success(bitmap)
} catch (e: Exception) {
Log.e(TAG, "Failed to capture photo", e)
Result.failure(e)
}
}
/**
* Converts ImageProxy to Bitmap, handling rotation.
*/
private fun imageProxyToBitmap(image: ImageProxy): Bitmap {
val bitmap = when (image.format) {
ImageFormat.JPEG -> {
// JPEG captures arrive fully encoded in plane[0]
val buffer = image.planes[0].buffer
val bytes = ByteArray(buffer.remaining())
buffer.get(bytes)
BitmapFactory.decodeByteArray(bytes, 0, bytes.size)
}
ImageFormat.YUV_420_888 -> {
// Interleave all three planes into NV21; plane[0] alone only carries
// luma. Assumes tightly packed planes (no row/pixel padding).
val yBuffer = image.planes[0].buffer
val uBuffer = image.planes[1].buffer
val vBuffer = image.planes[2].buffer
val ySize = yBuffer.remaining()
val uSize = uBuffer.remaining()
val vSize = vBuffer.remaining()
val nv21 = ByteArray(ySize + uSize + vSize)
yBuffer.get(nv21, 0, ySize)
vBuffer.get(nv21, ySize, vSize) // NV21 stores V before U
uBuffer.get(nv21, ySize + vSize, uSize)
val yuvImage = YuvImage(nv21, ImageFormat.NV21, image.width, image.height, null)
val out = ByteArrayOutputStream()
yuvImage.compressToJpeg(Rect(0, 0, image.width, image.height), 100, out)
BitmapFactory.decodeByteArray(out.toByteArray(), 0, out.size())
}
else -> throw IllegalArgumentException("Unsupported image format: ${image.format}")
}
// Handle rotation
val matrix = Matrix()
matrix.postRotate(image.imageInfo.rotationDegrees.toFloat())
return Bitmap.createBitmap(
bitmap,
0,
0,
bitmap.width,
bitmap.height,
matrix,
true
)
}
fun shutdown() {
cameraExecutor.shutdown()
}
}
@@ -0,0 +1,151 @@
package com.sleepy.agent.data
import android.content.Context
import android.util.Log
import com.sleepy.agent.ui.screens.ConversationMessage
import kotlinx.serialization.Serializable
import kotlinx.serialization.encodeToString
import kotlinx.serialization.json.Json
import java.io.File
import java.util.UUID
/**
* Simple JSON-based conversation storage.
* Stores chat history as files in app's private directory.
*/
class ConversationStorage(private val context: Context) {
private val storageDir = File(context.filesDir, "conversations")
private val json = Json { ignoreUnknownKeys = true }
init {
storageDir.mkdirs()
}
companion object {
private const val TAG = "ConversationStorage"
private const val MAX_CONVERSATIONS = 50
}
/**
* Saves a conversation to storage.
*/
fun saveConversation(id: String, messages: List<ConversationMessage>): Boolean {
return try {
val conversation = SavedConversation(
id = id,
title = generateTitle(messages),
timestamp = System.currentTimeMillis(),
messages = messages
)
val file = File(storageDir, "$id.json")
file.writeText(json.encodeToString(conversation))
// Clean up old conversations if needed
cleanupOldConversations()
Log.d(TAG, "Saved conversation $id with ${messages.size} messages")
true
} catch (e: Exception) {
Log.e(TAG, "Failed to save conversation", e)
false
}
}
/**
* Loads a conversation from storage.
*/
fun loadConversation(id: String): List<ConversationMessage>? {
return try {
val file = File(storageDir, "$id.json")
if (!file.exists()) return null
val conversation = json.decodeFromString<SavedConversation>(file.readText())
Log.d(TAG, "Loaded conversation $id with ${conversation.messages.size} messages")
conversation.messages
} catch (e: Exception) {
Log.e(TAG, "Failed to load conversation", e)
null
}
}
/**
* Deletes a conversation.
*/
fun deleteConversation(id: String): Boolean {
return try {
val file = File(storageDir, "$id.json")
file.delete()
} catch (e: Exception) {
Log.e(TAG, "Failed to delete conversation", e)
false
}
}
/**
* Gets all saved conversations sorted by most recent.
*/
fun getAllConversations(): List<ConversationInfo> {
return try {
storageDir.listFiles { file -> file.extension == "json" }
?.mapNotNull { file ->
try {
val conversation = json.decodeFromString<SavedConversation>(file.readText())
ConversationInfo(
id = conversation.id,
title = conversation.title,
timestamp = conversation.timestamp,
messageCount = conversation.messages.size
)
} catch (e: Exception) {
null
}
}
?.sortedByDescending { it.timestamp }
?: emptyList()
} catch (e: Exception) {
Log.e(TAG, "Failed to list conversations", e)
emptyList()
}
}
/**
* Creates a new conversation ID.
*/
fun createNewConversationId(): String {
return UUID.randomUUID().toString()
}
private fun generateTitle(messages: List<ConversationMessage>): String {
// Use first user message as title, truncated
val firstUserMessage = messages.firstOrNull { it.isUser }
return firstUserMessage?.text?.take(50)?.let {
if (firstUserMessage.text.length > 50) "$it..." else it
} ?: "New Chat"
}
private fun cleanupOldConversations() {
val conversations = getAllConversations()
if (conversations.size > MAX_CONVERSATIONS) {
conversations.drop(MAX_CONVERSATIONS).forEach {
deleteConversation(it.id)
}
}
}
}
@Serializable
data class SavedConversation(
val id: String,
val title: String,
val timestamp: Long,
val messages: List<ConversationMessage>
)
data class ConversationInfo(
val id: String,
val title: String,
val timestamp: Long,
val messageCount: Int
)
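Both the title rule in `generateTitle` and the retention policy in `cleanupOldConversations` are pure list/string operations. A standalone sketch of each (names illustrative, not part of the class above):

```kotlin
// Retention: given all conversation timestamps, return those that should be
// deleted under a keep-the-newest-N policy (same sortedByDescending + drop).
fun staleConversations(timestamps: List<Long>, maxKept: Int = 50): List<Long> =
    timestamps.sortedDescending().drop(maxKept)

// Title rule: first 50 chars of the first user message, with an ellipsis
// when truncated, falling back to "New Chat".
fun titleFor(firstUserMessage: String?): String =
    firstUserMessage?.let { if (it.length > 50) it.take(50) + "..." else it } ?: "New Chat"
```

With 60 stored conversations, the ten oldest are returned for deletion; a 60-character message yields a 53-character title (50 chars plus the ellipsis).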
@@ -0,0 +1,111 @@
package com.sleepy.agent.di
import android.content.Context
import androidx.datastore.core.DataStore
import androidx.datastore.preferences.core.Preferences
import androidx.datastore.preferences.preferencesDataStore
import com.sleepy.agent.audio.AudioRecorder
import com.sleepy.agent.audio.AudioRecorderImpl
import com.sleepy.agent.audio.TtsService
import com.sleepy.agent.download.ModelDownloadManager
import com.sleepy.agent.inference.Agent
import com.sleepy.agent.inference.ConversationContext
import com.sleepy.agent.inference.LiteRtLlmEngine
import com.sleepy.agent.inference.LlmEngine
import com.sleepy.agent.settings.UserSettings
import com.sleepy.agent.tools.ServerTool
import com.sleepy.agent.tools.Tool
import com.sleepy.agent.tools.WebSearchTool
import com.sleepy.agent.ui.screens.MainViewModel
import com.sleepy.agent.ui.screens.SettingsViewModel
import io.ktor.client.HttpClient
import io.ktor.client.engine.okhttp.OkHttp
import io.ktor.client.plugins.HttpTimeout
import io.ktor.client.plugins.contentnegotiation.ContentNegotiation
import io.ktor.serialization.kotlinx.json.json
import kotlinx.serialization.json.Json
private val Context.dataStore: DataStore<Preferences> by preferencesDataStore(name = "settings")
/**
* Manual Dependency Injection container.
* Replaces Hilt/KSP for compatibility with LiteRT-LM's Kotlin 2.3.0 requirement.
*/
class AppModule(private val context: Context) {
// Core
val dataStore: DataStore<Preferences> by lazy { context.dataStore }
// Settings
val userSettings: UserSettings by lazy { UserSettings(dataStore) }
// Network
val ktorClient: HttpClient by lazy {
HttpClient(OkHttp) {
install(ContentNegotiation) {
json(Json { ignoreUnknownKeys = true })
}
install(HttpTimeout) {
requestTimeoutMillis = 60_000
}
}
}
// Audio
val audioRecorder: AudioRecorder by lazy { AudioRecorderImpl(context) }
val ttsService: TtsService by lazy { TtsService(context) }
// LLM
val llmEngine: LlmEngine by lazy { LiteRtLlmEngine(context) }
val conversationContext: ConversationContext by lazy { ConversationContext() }
// System prompt
val systemPrompt: String by lazy {
"""You are a helpful AI assistant with access to tools.
|
|Available tools:
|- web_search: Search the web for information
|- home_server: Execute commands on the home server
""".trimMargin()
}
// Tools
val webSearchTool: WebSearchTool by lazy { WebSearchTool(ktorClient, "http://sleepy-think:7777") }
val serverTool: ServerTool by lazy { ServerTool(ktorClient, "http://sleepy-think:8000") }
// Download Manager
val downloadManager: ModelDownloadManager by lazy { ModelDownloadManager(context) }
val tools: Map<String, Tool> by lazy {
mapOf(
webSearchTool.name to webSearchTool,
serverTool.name to serverTool
)
}
// Agent
val agent: Agent by lazy { Agent(llmEngine, conversationContext, tools) }
fun createMainViewModel(savedStateHandle: androidx.lifecycle.SavedStateHandle): MainViewModel {
return MainViewModel(
savedStateHandle = savedStateHandle,
context = context,
audioRecorder = audioRecorder,
ttsService = ttsService,
agent = agent,
llmEngine = llmEngine,
userSettings = userSettings,
webSearchTool = webSearchTool
)
}
fun createSettingsViewModel(): SettingsViewModel {
return SettingsViewModel(
userSettings = userSettings,
httpClient = ktorClient,
llmEngine = llmEngine,
context = context,
downloadManager = downloadManager
)
}
}
@@ -0,0 +1,194 @@
package com.sleepy.agent.download
import android.content.Context
import android.util.Log
import androidx.work.*
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.asStateFlow
import java.io.File
import java.util.concurrent.TimeUnit
/**
* Manages downloading the Gemma 4 model from HuggingFace.
* Uses WorkManager for reliable background downloads that persist across app updates.
*/
class ModelDownloadManager(private val context: Context) {
companion object {
private const val TAG = "ModelDownloadManager"
// E2B Model
const val E2B_MODEL_URL = "https://huggingface.co/litert-community/gemma-4-E2B-it-litert-lm/resolve/main/gemma-4-E2B-it.litertlm"
const val E2B_MODEL_FILE_NAME = "gemma-4-E2B-it.litertlm"
const val E2B_MODEL_SIZE_BYTES = 2717263232L // ~2.53 GB
// E4B Model
const val E4B_MODEL_URL = "https://huggingface.co/litert-community/gemma-4-E4B-it-litert-lm/resolve/main/gemma-4-E4B-it.litertlm"
const val E4B_MODEL_FILE_NAME = "gemma-4-E4B-it.litertlm"
const val E4B_MODEL_SIZE_BYTES = 4831838208L // ~4.5 GB
fun getModelsDir(context: Context): File {
return File(context.getExternalFilesDir(null), "models").apply { mkdirs() }
}
// E2B Methods
fun getE2BModelFile(context: Context): File {
return File(getModelsDir(context), E2B_MODEL_FILE_NAME)
}
fun isE2BDownloaded(context: Context): Boolean {
val file = getE2BModelFile(context)
return file.exists() && file.length() > 100000000L // At least 100MB (partial download check)
}
// E4B Methods
fun getE4BModelFile(context: Context): File {
return File(getModelsDir(context), E4B_MODEL_FILE_NAME)
}
fun isE4BDownloaded(context: Context): Boolean {
val file = getE4BModelFile(context)
return file.exists() && file.length() > 100000000L // At least 100MB (partial download check)
}
// Legacy methods (for backward compatibility, default to E2B)
fun getModelFile(context: Context): File = getE2BModelFile(context)
fun isModelDownloaded(context: Context): Boolean = isE2BDownloaded(context)
fun getDownloadProgress(context: Context): Float {
val file = getE2BModelFile(context)
return if (!file.exists()) 0f else (file.length().toFloat() / E2B_MODEL_SIZE_BYTES).coerceIn(0f, 1f)
}
fun getDownloadedSize(context: Context): String {
val file = getE2BModelFile(context)
return formatBytes(file.length())
}
fun formatBytes(bytes: Long): String {
val mb = bytes / (1024 * 1024)
val gb = mb / 1024f
return if (gb >= 1) String.format("%.2f GB", gb) else "$mb MB"
}
// New variant-aware methods
fun downloadModelVariant(context: Context, variant: String): Boolean {
val url = when (variant) {
"e2b" -> E2B_MODEL_URL
"e4b" -> E4B_MODEL_URL
else -> return false
}
// Start download via WorkManager
val workManager = WorkManager.getInstance(context)
val workRequest = OneTimeWorkRequestBuilder<ModelDownloadWorker>()
.setInputData(workDataOf(
"MODEL_URL" to url,
"MODEL_VARIANT" to variant
))
.setConstraints(
Constraints.Builder()
.setRequiredNetworkType(NetworkType.CONNECTED)
.setRequiresStorageNotLow(true)
.build()
)
.addTag("model_download_$variant")
.build()
workManager.enqueueUniqueWork(
"model_download_$variant",
ExistingWorkPolicy.REPLACE,
workRequest
)
return true
}
fun deleteModelVariant(context: Context, variant: String) {
val file = when (variant) {
"e2b" -> getE2BModelFile(context)
"e4b" -> getE4BModelFile(context)
else -> return
}
if (file.exists()) {
file.delete()
}
}
}
private val _downloadState = MutableStateFlow<DownloadState>(DownloadState.Idle)
val downloadState: StateFlow<DownloadState> = _downloadState.asStateFlow()
sealed class DownloadState {
object Idle : DownloadState()
object Checking : DownloadState()
data class Downloading(val progress: Float, val bytesDownloaded: Long) : DownloadState()
object Completed : DownloadState()
data class Error(val message: String) : DownloadState()
}
/**
* Starts the model download using WorkManager for reliability.
*/
fun startDownload(): androidx.work.Operation {
_downloadState.value = DownloadState.Checking
val constraints = Constraints.Builder()
.setRequiredNetworkType(NetworkType.CONNECTED)
.setRequiresStorageNotLow(true)
.build()
val downloadWork = OneTimeWorkRequestBuilder<ModelDownloadWorker>()
.setConstraints(constraints)
.setBackoffCriteria(
BackoffPolicy.EXPONENTIAL,
WorkRequest.MIN_BACKOFF_MILLIS,
TimeUnit.MILLISECONDS
)
.addTag("model_download")
.build()
val workManager = WorkManager.getInstance(context)
// Observe work progress; detach once the work reaches a terminal state
// so the process-lifetime observer registered here does not leak
val liveData = workManager.getWorkInfoByIdLiveData(downloadWork.id)
val observer = object : androidx.lifecycle.Observer<WorkInfo?> {
override fun onChanged(workInfo: WorkInfo?) {
when (workInfo?.state) {
WorkInfo.State.RUNNING -> {
val progress = workInfo.progress.getFloat(ModelDownloadWorker.PROGRESS, 0f)
val bytes = workInfo.progress.getLong(ModelDownloadWorker.BYTES_DOWNLOADED, 0L)
_downloadState.value = DownloadState.Downloading(progress, bytes)
}
WorkInfo.State.SUCCEEDED -> {
_downloadState.value = DownloadState.Completed
liveData.removeObserver(this)
}
WorkInfo.State.FAILED -> {
val error = workInfo.outputData.getString(ModelDownloadWorker.ERROR_MSG) ?: "Download failed"
_downloadState.value = DownloadState.Error(error)
liveData.removeObserver(this)
}
WorkInfo.State.CANCELLED -> {
_downloadState.value = DownloadState.Idle
liveData.removeObserver(this)
}
else -> {}
}
}
}
liveData.observeForever(observer)
return workManager.enqueue(downloadWork)
}
fun cancelDownload() {
WorkManager.getInstance(context).cancelAllWorkByTag("model_download")
_downloadState.value = DownloadState.Idle
}
fun deleteModel() {
val file = getModelFile(context)
if (file.exists()) {
file.delete()
Log.d(TAG, "Deleted model file")
}
_downloadState.value = DownloadState.Idle
}
}
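`formatBytes` and the download-progress ratio are small pure calculations, sketched standalone below. One deliberate difference from the version above: `Locale.ROOT` keeps the decimal separator stable across device locales (the original uses the default locale, which can emit "2,53 GB").

```kotlin
import java.util.Locale

// Human-readable size: whole MB below 1 GB, two-decimal GB above
fun formatBytes(bytes: Long): String {
    val mb = bytes / (1024 * 1024)
    val gb = mb / 1024f
    return if (gb >= 1) String.format(Locale.ROOT, "%.2f GB", gb) else "$mb MB"
}

// Fraction downloaded, clamped to [0, 1] like getDownloadProgress
fun downloadProgress(downloadedBytes: Long, totalBytes: Long): Float =
    if (downloadedBytes <= 0) 0f
    else (downloadedBytes.toFloat() / totalBytes).coerceIn(0f, 1f)
```

For the E2B model size constant (2,717,263,232 bytes) this prints "2.53 GB", matching the comment in the companion object.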
@@ -0,0 +1,175 @@
package com.sleepy.agent.download
import android.content.Context
import android.util.Log
import androidx.work.CoroutineWorker
import androidx.work.Data
import androidx.work.WorkerParameters
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.isActive
import kotlinx.coroutines.withContext
import java.io.File
import java.io.FileOutputStream
import java.net.HttpURLConnection
import java.net.URL
import kotlin.coroutines.coroutineContext
/**
* WorkManager worker for downloading the Gemma 4 model.
* Handles resumable downloads and reports progress.
*/
class ModelDownloadWorker(
context: Context,
params: WorkerParameters
) : CoroutineWorker(context, params) {
companion object {
private const val TAG = "ModelDownloadWorker"
const val PROGRESS = "progress"
const val BYTES_DOWNLOADED = "bytes_downloaded"
const val ERROR_MSG = "error_msg"
private const val CHUNK_SIZE = 8192 // 8KB chunks
private const val PROGRESS_UPDATE_INTERVAL = 500L // Update every 500ms
}
override suspend fun doWork(): Result = withContext(Dispatchers.IO) {
try {
// Get variant from input data (e2b or e4b)
val variant = inputData.getString("MODEL_VARIANT") ?: "e2b"
val modelFile = when (variant) {
"e2b" -> ModelDownloadManager.getE2BModelFile(applicationContext)
"e4b" -> ModelDownloadManager.getE4BModelFile(applicationContext)
else -> ModelDownloadManager.getE2BModelFile(applicationContext)
}
val tempFile = File(modelFile.parentFile, "${modelFile.name}.tmp")
// Check if we have a partial download to resume
val resumeFrom = if (tempFile.exists()) tempFile.length() else 0L
val totalSize = when (variant) {
"e2b" -> ModelDownloadManager.E2B_MODEL_SIZE_BYTES
"e4b" -> ModelDownloadManager.E4B_MODEL_SIZE_BYTES
else -> ModelDownloadManager.E2B_MODEL_SIZE_BYTES
}
val modelUrl = when (variant) {
"e2b" -> ModelDownloadManager.E2B_MODEL_URL
"e4b" -> ModelDownloadManager.E4B_MODEL_URL
else -> ModelDownloadManager.E2B_MODEL_URL
}
Log.d(TAG, "Starting download of $variant from byte $resumeFrom, total: $totalSize")
val url = URL(modelUrl)
val connection = url.openConnection() as HttpURLConnection
// Set up connection
connection.apply {
setRequestProperty("User-Agent", "SleepyAgent/1.0")
connectTimeout = 30000
readTimeout = 30000
// Resume partial download if exists
if (resumeFrom > 0) {
setRequestProperty("Range", "bytes=$resumeFrom-")
Log.d(TAG, "Resuming download from $resumeFrom")
}
}
connection.connect()
val responseCode = connection.responseCode
if (responseCode != HttpURLConnection.HTTP_OK && responseCode != HttpURLConnection.HTTP_PARTIAL) {
val error = "HTTP error: $responseCode"
Log.e(TAG, error)
return@withContext Result.failure(
Data.Builder().putString(ERROR_MSG, error).build()
)
}
// Get content length (might be partial content length or full)
val contentLength = connection.contentLengthLong
val actualTotalSize = if (responseCode == HttpURLConnection.HTTP_PARTIAL && resumeFrom > 0 && contentLength > 0) {
resumeFrom + contentLength
} else if (contentLength > 0) {
contentLength
} else {
totalSize // Fallback to expected size
}
Log.d(TAG, "Content length: $contentLength, actual total: $actualTotalSize")
connection.inputStream.use { input ->
FileOutputStream(tempFile, resumeFrom > 0).use { output ->
val buffer = ByteArray(CHUNK_SIZE)
var bytesRead: Int
var totalRead = resumeFrom
var lastProgressUpdate = System.currentTimeMillis()
while (input.read(buffer).also { bytesRead = it } != -1) {
// Check if work was cancelled
if (!coroutineContext.isActive) {
Log.d(TAG, "Download cancelled")
return@withContext Result.failure()
}
output.write(buffer, 0, bytesRead)
totalRead += bytesRead
// Update progress periodically
val now = System.currentTimeMillis()
if (now - lastProgressUpdate > PROGRESS_UPDATE_INTERVAL) {
val progress = totalRead.toFloat() / actualTotalSize
setProgress(
Data.Builder()
.putFloat(PROGRESS, progress)
.putLong(BYTES_DOWNLOADED, totalRead)
.build()
)
lastProgressUpdate = now
Log.d(TAG, "Progress: ${(progress * 100).toInt()}%")
}
}
// Final progress update
setProgress(
Data.Builder()
.putFloat(PROGRESS, 1f)
.putLong(BYTES_DOWNLOADED, totalRead)
.build()
)
}
}
connection.disconnect()
// Verify download completed
if (tempFile.length() < actualTotalSize - 1000) { // Allow 1KB tolerance
val error = "Download incomplete: ${tempFile.length()} / $actualTotalSize"
Log.e(TAG, error)
return@withContext Result.failure(
Data.Builder().putString(ERROR_MSG, error).build()
)
}
// Move temp file to final location; renameTo can fail silently, so
// check its result instead of reporting success unconditionally
if (modelFile.exists()) {
modelFile.delete()
}
if (!tempFile.renameTo(modelFile)) {
val error = "Failed to move ${tempFile.name} to ${modelFile.name}"
Log.e(TAG, error)
return@withContext Result.failure(
Data.Builder().putString(ERROR_MSG, error).build()
)
}
Log.d(TAG, "Download completed: ${modelFile.absolutePath}")
Result.success()
} catch (e: Exception) {
Log.e(TAG, "Download failed", e)
Result.failure(
Data.Builder()
.putString(ERROR_MSG, e.message ?: "Unknown error")
.build()
)
}
}
}
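The resume arithmetic in `doWork` (Range request offset plus the 206 response's Content-Length) is easy to get wrong. A standalone sketch of the same total-size resolution (a pure function mirroring the worker's logic, not an HTTP client; raw status codes stand in for the `HttpURLConnection` constants):

```kotlin
// Resolves the expected final file size from the response, mirroring doWork:
// a 206 (partial content) response reports only the remaining bytes, so the
// resume offset is added back; a 200 response reports the full size; a
// missing Content-Length falls back to the known model size.
fun resolveTotalSize(
    responseCode: Int,
    resumeFrom: Long,
    contentLength: Long,
    expectedSize: Long
): Long = when {
    responseCode == 206 && resumeFrom > 0 && contentLength > 0 -> resumeFrom + contentLength
    contentLength > 0 -> contentLength
    else -> expectedSize
}
```

Resuming 1 MB into a 10 MB file, the server reports 9 MB remaining and the resolved total is again 10 MB; without this offset the final completeness check would reject every resumed download.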
@@ -0,0 +1,405 @@
package com.sleepy.agent.inference
import android.util.Log
import com.sleepy.agent.tools.Tool
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.channelFlow
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.jsonObject
import kotlinx.serialization.json.jsonPrimitive
import java.util.UUID
/**
* Agent that manages conversation with the LLM, including tool calling.
*
* Supports multiple Gemma 4 tool call formats:
* 1. JSON format: <|tool_call>call:tool_name{"name": "tool_name", "arguments": {...}}<tool_call|>
* 2. Direct args: <|tool_call>call:tool_name{query: "value"}<tool_call|>
* 3. Special tokens: <|tool_call>call:tool_name{query:<|"|>value<|"|>}<tool_call|>
* 4. Old format: <tool_call>{"name": "tool_name", ...}</tool_call>
*/
class Agent(
private val llmEngine: LlmEngine,
private val context: ConversationContext,
private val tools: Map<String, Tool>
) {
enum class State { IDLE, GENERATING, EXECUTING_TOOL, STREAMING, ERROR }
private val _state = MutableStateFlow(State.IDLE)
val state: StateFlow<State> = _state
companion object {
private const val TAG = "Agent"
private const val MAX_ITERATIONS = 5
private val jsonParser = Json { ignoreUnknownKeys = true }
// Supported tool call patterns
private val TOOL_PATTERNS = listOf(
Pair("<|tool_call>call:", "<tool_call|>"),
Pair("<|tool_call>", "<tool_call|>"),
Pair("<tool_call>", "</tool_call>")
)
}
private var conversation: Conversation? = null
private val systemPrompt = buildString {
appendLine("You are a helpful AI assistant with access to tools.")
appendLine()
appendLine("Available tools:")
appendLine("- web_search: Search the web for information. Parameters: query (string, required)")
appendLine("- home_server: Execute commands on the home server. Parameters: command (string, required), args (string, optional)")
appendLine()
appendLine("When you need to use a tool, you MUST output EXACTLY in this format:")
appendLine("<|tool_call>call:web_search{\"name\": \"web_search\", \"arguments\": {\"query\": \"your search query here\"}}<tool_call|>")
appendLine()
appendLine("IMPORTANT:")
appendLine("- Replace 'web_search' with the actual tool name you want to use")
appendLine("- Do NOT use 'tool_name' as a placeholder - use the real tool name")
appendLine("- For web_search, use: {\"name\": \"web_search\", \"arguments\": {\"query\": \"...\"}}")
appendLine("- For home_server, use: {\"name\": \"home_server\", \"arguments\": {\"command\": \"...\"}}")
appendLine()
appendLine("After receiving tool results, provide a helpful response to the user.")
}
private fun ensureConversation(): Conversation {
if (conversation?.isAlive != true) {
Log.d(TAG, "Creating new conversation")
conversation = llmEngine.createConversation(systemPrompt)
}
return conversation!!
}
suspend fun prewarmCache() {
try {
Log.d(TAG, "Pre-warming KV cache with system prompt...")
ensureConversation()
Log.d(TAG, "KV cache pre-warmed and ready")
} catch (e: Exception) {
Log.e(TAG, "Failed to pre-warm cache", e)
}
}
suspend fun processInput(
input: String,
audioData: ByteArray? = null,
images: List<android.graphics.Bitmap>? = null,
onToken: ((String) -> Unit)? = null
): Flow<AgentEvent> = channelFlow {
Log.d(TAG, "processInput called with input: ${input.take(50)}...")
_state.value = State.GENERATING
context.addMessage(Message.User(input))
val conv = ensureConversation()
Log.d(TAG, "Conversation ensured, alive: ${conv.isAlive}")
var iteration = 0
var audioDataForIteration = audioData
var imagesForIteration = images
try {
while (iteration < MAX_ITERATIONS) {
iteration++
Log.d(TAG, "Iteration $iteration")
val prompt = if (iteration == 1) input else context.buildPrompt()
_state.value = State.GENERATING
val responseBuilder = StringBuilder()
try {
Log.d(TAG, "Calling llmEngine.generateStream...")
llmEngine.generateStream(
conversation = conv,
prompt = prompt,
audioData = audioDataForIteration,
images = imagesForIteration
) { token ->
responseBuilder.append(token)
}
Log.d(TAG, "generateStream completed, response length: ${responseBuilder.length}")
} catch (e: Exception) {
Log.e(TAG, "Error during generation", e)
send(AgentEvent.Error("Generation failed: ${e.message}"))
_state.value = State.ERROR
return@channelFlow
}
audioDataForIteration = null
imagesForIteration = null
val fullResponse = responseBuilder.toString()
Log.d(TAG, "Raw response length: ${fullResponse.length}, content: ${fullResponse.take(200)}...")
val toolCalls = parseToolCalls(fullResponse)
if (toolCalls.isEmpty()) {
_state.value = State.STREAMING
// Emit the response character-by-character so the UI can animate it
// like a live stream (generateStream already returned the full text)
fullResponse.chunked(1).forEach { chunk ->
send(AgentEvent.Token(chunk))
onToken?.invoke(chunk)
kotlinx.coroutines.delay(5)
}
context.addMessage(Message.Assistant(content = fullResponse, toolCalls = null))
send(AgentEvent.Complete(fullResponse))
_state.value = State.IDLE
return@channelFlow
}
Log.d(TAG, "Found ${toolCalls.size} tool call(s): ${toolCalls.map { it.name }}")
_state.value = State.EXECUTING_TOOL
val contentBeforeTools = extractContentBeforeTools(fullResponse)
toolCalls.forEach { toolCall ->
val toolDisplayName = tools[toolCall.name]?.displayName ?: toolCall.name
send(AgentEvent.ExecutingTool(toolDisplayName, toolCall.arguments))
}
val toolResults = executeToolCalls(toolCalls)
context.addMessage(Message.Assistant(content = contentBeforeTools, toolCalls = toolCalls))
toolResults.forEach { result ->
context.addMessage(Message.ToolResult(toolCallId = result.id, toolName = result.name, result = result.result))
send(AgentEvent.ToolResult(result.name, result.result))
}
Log.d(TAG, "Tool results added, getting final response")
}
_state.value = State.GENERATING
val finalResponse = llmEngine.generate(
conversation = conv,
prompt = context.buildPrompt(),
audioData = null,
images = null
)
_state.value = State.STREAMING
finalResponse.chunked(1).forEach { chunk ->
send(AgentEvent.Token(chunk))
onToken?.invoke(chunk)
kotlinx.coroutines.delay(5)
}
context.addMessage(Message.Assistant(content = finalResponse, toolCalls = null))
send(AgentEvent.Complete(finalResponse))
_state.value = State.IDLE
} catch (e: Exception) {
Log.e(TAG, "Error processing input", e)
_state.value = State.ERROR
send(AgentEvent.Error(e.message ?: "Unknown error"))
throw e
}
}
fun reset() {
Log.d(TAG, "Resetting conversation")
conversation?.close()
conversation = null
context.clear()
_state.value = State.IDLE
}
/**
* Parse tool calls from model response.
* Handles multiple Gemma 4 formats.
*/
private fun parseToolCalls(response: String): List<ToolCall> {
    val toolCalls = mutableListOf<ToolCall>()
    for ((startTag, endTag) in TOOL_PATTERNS) {
        var currentIndex = 0
        while (true) {
            val startIdx = response.indexOf(startTag, currentIndex)
            if (startIdx == -1) break
            val endIdx = response.indexOf(endTag, startIdx + startTag.length)
            if (endIdx == -1) break
            val content = response.substring(startIdx + startTag.length, endIdx).trim()
            parseToolCallContent(content)?.let { toolCalls.add(it) }
            currentIndex = endIdx + endTag.length
        }
        // The patterns overlap ("<|tool_call>" matches everything that
        // "<|tool_call>call:" matches), so stop at the first pattern that
        // produced calls; otherwise the same tool would be executed twice.
        if (toolCalls.isNotEmpty()) break
    }
    return toolCalls
}
/**
* Parse individual tool call content.
* Handles:
* - tool_name{"name": "...", "arguments": {...}}
* - tool_name{query: "value"}
* - tool_name{query:<|"|>value<|"|>}
* - {"name": "...", ...} (legacy)
*/
private fun parseToolCallContent(content: String): ToolCall? {
Log.d(TAG, "Parsing tool call content: $content")
// Clean up special quote tokens first
val cleaned = content
.replace("<|\">", "\"")
.replace("<|\"|>", "\"")
.replace("\">|>", "\"")
return when {
// Has braces - parse as {args} or tool_name{args}
cleaned.contains("{") -> {
val braceIdx = cleaned.indexOf("{")
val toolNamePart = cleaned.substring(0, braceIdx).trim()
val inner = cleaned.substring(braceIdx)
// Try JSON first, then direct args
parseAsJson(toolNamePart, inner)
?: parseAsDirectArgs(toolNamePart, inner)
}
            // Note: pure-JSON payloads ("{...}") are covered by the branch
            // above (braceIdx == 0 yields an empty tool-name prefix), so a
            // separate startsWith("{") branch would be unreachable.
else -> {
Log.w(TAG, "Unrecognized tool call format: $content")
null
}
}
}
private fun parseAsJson(toolNamePrefix: String, jsonStr: String): ToolCall? {
return try {
val obj = jsonParser.parseToJsonElement(jsonStr).jsonObject
// Get tool name from "name" field or prefix
val toolName = obj["name"]?.jsonPrimitive?.content
?: toolNamePrefix.takeIf { it.isNotEmpty() }
?: return null
        // Get arguments from "arguments" field or top-level
        val args = obj["arguments"]?.jsonObject?.let { argsObj ->
            argsObj.entries.associate { (k, v) ->
                // jsonPrimitive throws on nested objects/arrays, so cast safely
                k to ((v as? JsonPrimitive)?.content ?: v.toString())
            }
        } ?: obj.entries
            .filter { it.key != "name" }
            .associate { (k, v) ->
                k to ((v as? JsonPrimitive)?.content ?: v.toString())
            }
ToolCall(id = generateToolCallId(), name = toolName, arguments = args)
} catch (e: Exception) {
Log.d(TAG, "JSON parse failed: ${e.message}")
null
}
}
private fun parseAsDirectArgs(toolName: String, argsStr: String): ToolCall? {
val args = mutableMapOf<String, String>()
// Extract content between outer braces
val inner = argsStr.trim().trim('{', '}')
// Split by comma, but be careful with nested structures
var depth = 0
var current = StringBuilder()
val parts = mutableListOf<String>()
for (char in inner) {
when (char) {
'{', '[' -> {
depth++
current.append(char)
}
'}', ']' -> {
depth--
current.append(char)
}
',' -> {
if (depth == 0) {
parts.add(current.toString().trim())
current = StringBuilder()
} else {
current.append(char)
}
}
else -> current.append(char)
}
}
if (current.isNotEmpty()) {
parts.add(current.toString().trim())
}
// Parse each key:value pair
for (part in parts) {
val colonIdx = part.indexOf(':')
if (colonIdx != -1) {
val key = part.substring(0, colonIdx).trim().trim('"', '\'')
var value = part.substring(colonIdx + 1).trim()
// Clean up value
value = value.trim('"', '\'', '{', '}')
if (key.isNotEmpty()) {
args[key] = value
}
}
}
return if (args.isNotEmpty() && toolName.isNotEmpty()) {
ToolCall(id = generateToolCallId(), name = toolName, arguments = args)
} else null
}
private fun extractContentBeforeTools(response: String): String {
var firstIdx = -1
for ((startTag, _) in TOOL_PATTERNS) {
val idx = response.indexOf(startTag)
if (idx != -1 && (firstIdx == -1 || idx < firstIdx)) {
firstIdx = idx
}
}
return if (firstIdx != -1) {
response.substring(0, firstIdx).trim()
} else {
response.trim()
}
}
private suspend fun executeToolCalls(toolCalls: List<ToolCall>): List<ToolCallResult> = coroutineScope {
toolCalls.map { toolCall ->
val tool = tools[toolCall.name]
val result = if (tool != null) {
try {
Log.d(TAG, "Executing tool: ${toolCall.name}")
tool.execute(toolCall.arguments)
} catch (e: Exception) {
Log.e(TAG, "Error executing tool '${toolCall.name}'", e)
"Error: ${e.message}"
}
} else {
"Tool '${toolCall.name}' not found"
}
ToolCallResult(toolCall.id, toolCall.name, result)
}
}
private fun generateToolCallId(): String = "call_${UUID.randomUUID().toString().take(8)}"
private data class ToolCallResult(val id: String, val name: String, val result: String)
}
sealed class AgentEvent {
data class Token(val text: String) : AgentEvent()
data class ExecutingTool(val toolName: String, val arguments: Map<String, String>) : AgentEvent()
data class ToolResult(val toolName: String, val result: String) : AgentEvent()
data class Complete(val response: String) : AgentEvent()
data class Error(val message: String) : AgentEvent()
}
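The depth-aware comma splitting inside `parseAsDirectArgs` is the subtle part of the parser above: commas only separate arguments at brace/bracket depth zero. A standalone sketch of that loop (`splitTopLevel` is an illustrative name, not part of the app):

```kotlin
// Splits "key: value" pairs on top-level commas only, so nested
// structures such as {opts: {a, b}} survive intact.
fun splitTopLevel(inner: String): List<String> {
    val parts = mutableListOf<String>()
    val current = StringBuilder()
    var depth = 0
    for (ch in inner) {
        when (ch) {
            '{', '[' -> { depth++; current.append(ch) }
            '}', ']' -> { depth--; current.append(ch) }
            ',' -> if (depth == 0) {
                parts.add(current.toString().trim())
                current.setLength(0)
            } else {
                current.append(ch)
            }
            else -> current.append(ch)
        }
    }
    if (current.isNotEmpty()) parts.add(current.toString().trim())
    return parts
}
```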
@@ -0,0 +1,186 @@
package com.sleepy.agent.inference
import kotlinx.serialization.Serializable
@Serializable
sealed class Message {
@Serializable
data class User(val content: String) : Message()
@Serializable
data class Assistant(val content: String, val toolCalls: List<ToolCall>? = null) : Message()
@Serializable
data class ToolResult(val toolCallId: String, val toolName: String, val result: String) : Message()
@Serializable
data class System(val content: String) : Message()
}
@Serializable
data class ToolCall(val id: String, val name: String, val arguments: Map<String, String>)
class ConversationContext(
private val systemPrompt: String = "You are a helpful AI assistant.",
    private val maxTokens: Int = 16384, // must not exceed the engine's window (LiteRtLlmEngine.MAX_TOKENS = 16k)
    private val reservedForResponse: Int = 4096 // reserve headroom for longer responses
) {
private val messages = mutableListOf<Message>()
private val effectiveBudget = maxTokens - reservedForResponse
companion object {
const val CHARS_PER_TOKEN = 4
}
/**
* Adds a message to the conversation context.
* Automatically prunes oldest non-system messages if token budget is exceeded.
*
* @param message The message to add
* @return true if message was added successfully, false if it couldn't fit even after pruning
*/
fun addMessage(message: Message): Boolean {
val messageTokens = estimateTokens(message.toText())
// If even with empty context this message exceeds budget, reject it
if (messageTokens > effectiveBudget) {
return false
}
messages.add(message)
// Prune if necessary to stay within budget
pruneIfNeeded()
return true
}
/**
* Returns all messages in the conversation, including system prompt as first message.
*/
fun getMessages(): List<Message> {
return listOf(Message.System(systemPrompt)) + messages
}
/**
* Builds a formatted prompt string with XML-style tags for LLM consumption.
*/
fun buildPrompt(): String {
val sb = StringBuilder()
// System message first
sb.append("<system>").append(escapeXml(systemPrompt)).append("</system>\n")
// User messages
messages.forEach { message ->
when (message) {
is Message.User -> {
sb.append("<user>").append(escapeXml(message.content)).append("</user>\n")
}
is Message.Assistant -> {
sb.append("<assistant>")
sb.append(escapeXml(message.content))
message.toolCalls?.let { toolCalls ->
toolCalls.forEach { toolCall ->
sb.append("\n<tool_call id=\"").append(escapeXml(toolCall.id)).append("\"")
sb.append(" name=\"").append(escapeXml(toolCall.name)).append("\">")
val argsStr = toolCall.arguments.entries.joinToString(", ") { (k, v) ->
"\"${escapeXml(k)}\": \"${escapeXml(v)}\""
}
sb.append("{").append(argsStr).append("}")
sb.append("</tool_call>")
}
}
sb.append("</assistant>\n")
}
is Message.ToolResult -> {
sb.append("<tool_result")
sb.append(" id=\"").append(escapeXml(message.toolCallId)).append("\"")
sb.append(" tool=\"").append(escapeXml(message.toolName)).append("\">")
sb.append(escapeXml(message.result))
sb.append("</tool_result>\n")
}
is Message.System -> {
// System messages from the list shouldn't appear here
// (only the constructor systemPrompt is used)
}
}
}
return sb.toString().trimEnd()
}
/**
* Estimates token count for given text.
* Uses a simple heuristic: ~4 characters per token for English text.
*/
fun estimateTokens(text: String): Int {
return text.length / CHARS_PER_TOKEN
}
/**
* Clears all conversation messages (except system prompt).
*/
fun clear() {
messages.clear()
}
/**
* Returns the current total token count including system prompt.
*/
fun getTokenCount(): Int {
val systemTokens = estimateTokens(systemPrompt)
val messagesTokens = messages.sumOf { estimateTokens(it.toText()) }
return systemTokens + messagesTokens
}
/**
 * Returns the remaining token budget for new messages.
 * Uses the same accounting as pruneIfNeeded(): getTokenCount() already
 * includes the system prompt, so this is simply what is left of the budget.
 */
fun getAvailableTokens(): Int {
    return (effectiveBudget - getTokenCount()).coerceAtLeast(0)
}
/**
* Returns all messages for serialization (for state restoration).
*/
fun getSerializableMessages(): List<Message> = messages.toList()
/**
* Restores messages from serialization.
*/
fun restoreMessages(msgs: List<Message>) {
messages.clear()
messages.addAll(msgs)
}
private fun pruneIfNeeded() {
while (getTokenCount() > effectiveBudget && messages.isNotEmpty()) {
// Find and remove the oldest non-system message
// Keep removing from the beginning until we're under budget
val indexToRemove = messages.indexOfFirst { it !is Message.System }
if (indexToRemove == -1) {
// No more non-system messages to remove
break
}
messages.removeAt(indexToRemove)
}
}
private fun Message.toText(): String {
return when (this) {
is Message.User -> content
is Message.Assistant -> content + (toolCalls?.joinToString { "${it.name} ${it.arguments}" } ?: "")
is Message.ToolResult -> "$toolName $result"
is Message.System -> content
}
}
private fun escapeXml(text: String): String {
return text
.replace("&", "&amp;")
.replace("<", "&lt;")
.replace(">", "&gt;")
.replace("\"", "&quot;")
}
}
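The budget accounting in `ConversationContext` reduces to a char-count heuristic plus a drop-oldest loop. A minimal standalone sketch of that behavior (plain strings instead of `Message`; names are illustrative):

```kotlin
// ~4 characters per token, as in ConversationContext.CHARS_PER_TOKEN
const val CHARS_PER_TOKEN = 4

fun estimateTokens(text: String): Int = text.length / CHARS_PER_TOKEN

// Drop the oldest entries until the running estimate fits the budget,
// mirroring pruneIfNeeded() for a plain list of message texts.
fun prune(messages: MutableList<String>, budgetTokens: Int) {
    while (messages.isNotEmpty() && messages.sumOf { estimateTokens(it) } > budgetTokens) {
        messages.removeAt(0)
    }
}
```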
@@ -0,0 +1,298 @@
package com.sleepy.agent.inference
import android.content.Context
import android.graphics.Bitmap
import android.util.Log
import com.google.ai.edge.litertlm.*
import com.sleepy.agent.audio.WavConverter
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.catch
import kotlinx.coroutines.flow.collect
import kotlinx.coroutines.withContext
import java.io.ByteArrayOutputStream
import java.io.File
/**
* LLM Engine interface for text generation with optional multimodal inputs.
*/
interface LlmEngine {
suspend fun loadModel(modelPath: String): Result<Unit>
/**
* Creates a new conversation with the given system prompt.
* This should be called once per chat session to enable KV cache reuse.
*/
fun createConversation(systemPrompt: String): Conversation
/**
* Generate a response within an existing conversation.
* This reuses the KV cache from previous turns.
*/
suspend fun generate(
conversation: Conversation,
prompt: String,
audioData: ByteArray? = null,
images: List<Bitmap>? = null
): String
suspend fun generateStream(
conversation: Conversation,
prompt: String,
audioData: ByteArray? = null,
images: List<Bitmap>? = null,
onToken: (String) -> Unit
)
fun isLoaded(): Boolean
fun unload()
}
/**
* Wrapper for LiteRT-LM Conversation to manage lifecycle.
*/
class Conversation(
internal val liteRtConversation: com.google.ai.edge.litertlm.Conversation
) : AutoCloseable {
private var isClosed = false
val isAlive: Boolean get() = !isClosed && liteRtConversation.isAlive
override fun close() {
if (!isClosed) {
isClosed = true
liteRtConversation.close()
}
}
}
/**
* LiteRT-LM based LLM Engine implementation for Gemma 4 E2B/E4B models.
*
* Uses .litertlm model format - download from HuggingFace LiteRT Community:
* https://huggingface.co/litert-community
*
* Gemma 4 E2B is natively multimodal - it can process text, images, and audio directly.
*/
class LiteRtLlmEngine(
private val context: Context
) : LlmEngine {
companion object {
private const val TAG = "LiteRtLlmEngine"
        private const val MAX_TOKENS = 16384 // 16k token context window
}
private var engine: Engine? = null
private var currentModelPath: String? = null
private val lock = Any()
override suspend fun loadModel(modelPath: String): Result<Unit> = withContext(Dispatchers.Default) {
runCatching {
synchronized(lock) {
unload()
val modelFile = File(modelPath)
if (!modelFile.exists()) {
throw IllegalArgumentException("Model file not found: $modelPath")
}
Log.d(TAG, "Loading model from: $modelPath")
// Ensure cache directory exists
val cacheDir = File(context.cacheDir, "litertlm_cache")
cacheDir.mkdirs()
Log.d(TAG, "Using cache dir: ${cacheDir.absolutePath}")
val engineConfig = EngineConfig(
modelPath = modelPath,
backend = Backend.CPU(),
visionBackend = Backend.CPU(), // Required for image processing!
audioBackend = Backend.CPU(), // Required for audio processing!
maxNumTokens = MAX_TOKENS,
cacheDir = cacheDir.absolutePath
)
val newEngine = Engine(engineConfig)
newEngine.initialize()
engine = newEngine
currentModelPath = modelPath
Log.d(TAG, "Model loaded successfully with 16k context")
Unit
}
}.onFailure { e ->
Log.e(TAG, "Failed to load model", e)
}
}
override fun createConversation(systemPrompt: String): Conversation {
val eng = synchronized(lock) { engine } ?: throw IllegalStateException("Model not loaded")
val conversationConfig = ConversationConfig(
systemInstruction = Contents.of(systemPrompt)
)
val liteRtConversation = eng.createConversation(conversationConfig)
return Conversation(liteRtConversation)
}
override suspend fun generate(
conversation: Conversation,
prompt: String,
audioData: ByteArray?,
images: List<Bitmap>?
): String = withContext(Dispatchers.Default) {
try {
if (!conversation.isAlive) {
Log.e(TAG, "Conversation is closed")
return@withContext "Error: Conversation closed"
}
val contents = buildContents(prompt, audioData, images)
val response = conversation.liteRtConversation.sendMessage(contents)
response.toString()
} catch (e: Exception) {
Log.e(TAG, "Generation failed", e)
"Error: ${e.message}"
}
}
override suspend fun generateStream(
conversation: Conversation,
prompt: String,
audioData: ByteArray?,
images: List<Bitmap>?,
onToken: (String) -> Unit
) {
withContext(Dispatchers.Default) {
try {
if (!conversation.isAlive) {
Log.e(TAG, "Conversation is closed or not alive")
onToken("Error: Conversation closed")
return@withContext
}
// For multimodal inputs, we need to use the Contents API
if (audioData != null || !images.isNullOrEmpty()) {
Log.d(TAG, "Processing multimodal input (audio=${audioData?.size ?: 0} bytes, images=${images?.size ?: 0})")
try {
val contents = buildContents(prompt, audioData, images)
if (contents.contents.isEmpty()) {
Log.w(TAG, "No valid content to send")
onToken("Error: No valid audio or image data")
return@withContext
}
Log.d(TAG, "Built contents with ${contents.contents.size} parts, sending to model...")
val response = conversation.liteRtConversation.sendMessage(contents)
Log.d(TAG, "Got response: ${response.toString().take(100)}...")
onToken(response.toString())
} catch (e: Exception) {
Log.e(TAG, "Multimodal processing failed", e)
onToken("Error processing input: ${e.message}")
}
} else {
// Text-only streaming - reuses KV cache!
Log.d(TAG, "Starting text-only streaming with prompt: ${prompt.take(100)}...")
conversation.liteRtConversation.sendMessageAsync(prompt)
.catch { e -> Log.e(TAG, "Stream error", e) }
.collect { message ->
Log.d(TAG, "Token received: ${message.toString().take(30)}...")
onToken(message.toString())
}
Log.d(TAG, "Text streaming completed")
}
} catch (e: Exception) {
Log.e(TAG, "Streaming generation failed", e)
onToken("Error: ${e.message}")
}
}
}
/**
* Builds a Contents object with text, audio, and/or images.
*/
private fun buildContents(
prompt: String,
audioData: ByteArray?,
images: List<Bitmap>?
): Contents {
val contentList = mutableListOf<Content>()
// Add images first if provided
images?.take(1)?.forEachIndexed { index, bitmap ->
try {
Log.d(TAG, "Processing image $index: ${bitmap.width}x${bitmap.height}")
if (bitmap.isRecycled) {
Log.e(TAG, "Bitmap is recycled!")
return@forEachIndexed
}
// Resize to max 512x512
val resized = resizeBitmap(bitmap, 512, 512)
val stream = ByteArrayOutputStream()
resized.compress(Bitmap.CompressFormat.JPEG, 85, stream)
val bytes = stream.toByteArray()
if (bytes.isNotEmpty()) {
contentList.add(Content.ImageBytes(bytes))
Log.d(TAG, "Added image: ${bytes.size} bytes")
}
if (resized !== bitmap) resized.recycle()
} catch (e: Exception) {
Log.e(TAG, "Failed to process image", e)
}
}
// Add audio if provided
audioData?.let { audio ->
try {
                // Skip clips shorter than ~200 ms (6400 bytes, assuming 16 kHz, 16-bit mono PCM)
                if (audio.size >= 6400) {
val wavData = if (WavConverter.isWav(audio)) audio else WavConverter.pcmToWav(audio)
if (wavData.isNotEmpty()) {
contentList.add(Content.AudioBytes(wavData))
Log.d(TAG, "Added audio: ${wavData.size} bytes")
}
}
} catch (e: Exception) {
Log.e(TAG, "Failed to add audio", e)
}
}
contentList.add(Content.Text(prompt))
return Contents.of(contentList)
}
private fun resizeBitmap(bitmap: Bitmap, maxWidth: Int, maxHeight: Int): Bitmap {
val width = bitmap.width
val height = bitmap.height
if (width <= maxWidth && height <= maxHeight) return bitmap
val ratio = minOf(maxWidth.toFloat() / width, maxHeight.toFloat() / height)
val newWidth = (width * ratio).toInt()
val newHeight = (height * ratio).toInt()
return Bitmap.createScaledBitmap(bitmap, newWidth, newHeight, true)
}
override fun isLoaded(): Boolean = synchronized(lock) {
engine != null
}
override fun unload() {
synchronized(lock) {
engine?.close()
engine = null
currentModelPath = null
Log.d(TAG, "Model unloaded")
}
}
}
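The scaling math in `resizeBitmap` preserves aspect ratio by taking the smaller of the two shrink factors. The same computation in plain Kotlin, without Android types (`targetSize` is a made-up name for illustration):

```kotlin
// Returns the (width, height) a bitmap would be scaled to, keeping aspect
// ratio and never exceeding maxW x maxH; already-small images pass through.
fun targetSize(w: Int, h: Int, maxW: Int, maxH: Int): Pair<Int, Int> {
    if (w <= maxW && h <= maxH) return w to h
    val ratio = minOf(maxW.toFloat() / w, maxH.toFloat() / h)
    return (w * ratio).toInt() to (h * ratio).toInt()
}
```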
@@ -0,0 +1,374 @@
package com.sleepy.agent.service
import android.app.NotificationChannel
import android.app.NotificationManager
import android.app.Service
import android.content.Context
import android.content.Intent
import android.graphics.PixelFormat
import android.graphics.Bitmap
import android.graphics.drawable.GradientDrawable
import android.hardware.display.DisplayManager
import android.hardware.display.VirtualDisplay
import android.media.ImageReader
import android.media.projection.MediaProjection
import android.media.projection.MediaProjectionManager
import android.net.Uri
import android.os.Build
import android.os.Handler
import android.os.IBinder
import android.os.Looper
import android.provider.Settings
import android.util.Log
import android.view.Gravity
import android.view.MotionEvent
import android.view.View
import android.view.WindowManager
import android.widget.ImageView
import androidx.core.app.NotificationCompat
import androidx.core.content.ContextCompat
import androidx.lifecycle.Lifecycle
import androidx.lifecycle.LifecycleOwner
import androidx.lifecycle.LifecycleRegistry
import com.sleepy.agent.MainActivity
import com.sleepy.agent.R
import java.io.File
import java.io.FileOutputStream
import java.nio.ByteBuffer
/**
* Floating button service that provides an overlay button for quick access.
*
* Features:
* - Tap: Opens Sleepy Agent (compact mode)
* - Long press: Takes screenshot and opens Sleepy Agent
* - Drag: Moves button around screen
*/
class FloatingButtonService : Service(), LifecycleOwner {
private lateinit var windowManager: WindowManager
private var floatingButton: View? = null
private var params: WindowManager.LayoutParams? = null
private var initialX = 0
private var initialY = 0
private var initialTouchX = 0f
private var initialTouchY = 0f
private var isDragging = false
private var longPressHandler: Handler? = null
private var isLongPress = false
private lateinit var lifecycleRegistry: LifecycleRegistry
companion object {
private const val TAG = "FloatingButtonService"
private const val LONG_PRESS_DURATION = 800L // ms
private const val DRAG_THRESHOLD = 10f // pixels
private const val CHANNEL_ID = "floating_button_channel"
private const val NOTIFICATION_ID = 1001
var mediaProjection: MediaProjection? = null
@Volatile
var isRunning = false
fun start(context: Context) {
val intent = Intent(context, FloatingButtonService::class.java)
context.startForegroundService(intent)
}
fun stop(context: Context) {
val intent = Intent(context, FloatingButtonService::class.java)
context.stopService(intent)
}
}
override fun onCreate() {
super.onCreate()
try {
lifecycleRegistry = LifecycleRegistry(this)
lifecycleRegistry.currentState = Lifecycle.State.CREATED
windowManager = getSystemService(Context.WINDOW_SERVICE) as WindowManager
longPressHandler = Handler(Looper.getMainLooper())
// Create notification channel first
createNotificationChannel()
// Then start as foreground service
startForeground()
} catch (e: Exception) {
Log.e(TAG, "Error creating service", e)
stopSelf()
}
}
override fun onStartCommand(intent: Intent?, flags: Int, startId: Int): Int {
try {
// Check overlay permission before showing button
if (!Settings.canDrawOverlays(this)) {
Log.e(TAG, "Overlay permission not granted, stopping service")
stopSelf()
return START_NOT_STICKY
}
if (!isRunning) {
showFloatingButton()
isRunning = true
lifecycleRegistry.currentState = Lifecycle.State.STARTED
}
} catch (e: Exception) {
Log.e(TAG, "Error in onStartCommand", e)
stopSelf()
}
return START_STICKY
}
private fun startForeground() {
val notification = NotificationCompat.Builder(this, CHANNEL_ID)
.setContentTitle("Sleepy Agent")
.setContentText("Floating button active")
.setSmallIcon(R.drawable.ic_launcher_foreground)
.setOngoing(true)
.build()
startForeground(NOTIFICATION_ID, notification)
}
private fun createNotificationChannel() {
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.O) {
            val channel = NotificationChannel(
                CHANNEL_ID,
                "Floating Button",
                NotificationManager.IMPORTANCE_LOW
            )
val manager = getSystemService(NotificationManager::class.java)
manager.createNotificationChannel(channel)
}
}
private fun showFloatingButton() {
try {
val buttonSize = 150 // dp
val sizePx = (buttonSize * resources.displayMetrics.density).toInt()
// Create floating button view
floatingButton = ImageView(this).apply {
setImageResource(R.drawable.ic_launcher_foreground)
val drawable = GradientDrawable().apply {
shape = GradientDrawable.OVAL
setColor(ContextCompat.getColor(this@FloatingButtonService, android.R.color.white))
}
background = drawable
setPadding(20, 20, 20, 20)
elevation = 10f
setOnTouchListener { view, event ->
handleTouch(event, view)
true
}
}
// Set up window parameters
params = WindowManager.LayoutParams(
sizePx,
sizePx,
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.O) {
WindowManager.LayoutParams.TYPE_APPLICATION_OVERLAY
} else {
WindowManager.LayoutParams.TYPE_PHONE
},
WindowManager.LayoutParams.FLAG_NOT_FOCUSABLE or
WindowManager.LayoutParams.FLAG_LAYOUT_IN_SCREEN,
PixelFormat.TRANSLUCENT
).apply {
gravity = Gravity.TOP or Gravity.START
x = resources.displayMetrics.widthPixels - sizePx - 50
y = 200
}
windowManager.addView(floatingButton, params)
Log.d(TAG, "Floating button shown")
} catch (e: Exception) {
Log.e(TAG, "Failed to show floating button", e)
}
}
private fun handleTouch(event: MotionEvent, view: View) {
when (event.action) {
MotionEvent.ACTION_DOWN -> {
initialX = params?.x ?: 0
initialY = params?.y ?: 0
initialTouchX = event.rawX
initialTouchY = event.rawY
isDragging = false
isLongPress = false
// Start long press timer
longPressHandler?.postDelayed({
if (!isDragging) {
isLongPress = true
performLongPress()
}
}, LONG_PRESS_DURATION)
}
MotionEvent.ACTION_MOVE -> {
val deltaX = event.rawX - initialTouchX
val deltaY = event.rawY - initialTouchY
// Check if we're dragging
                if (!isDragging && (kotlin.math.abs(deltaX) > DRAG_THRESHOLD || kotlin.math.abs(deltaY) > DRAG_THRESHOLD)) {
isDragging = true
longPressHandler?.removeCallbacksAndMessages(null)
}
if (isDragging) {
params?.x = (initialX + deltaX).toInt()
params?.y = (initialY + deltaY).toInt()
windowManager.updateViewLayout(view, params)
}
}
MotionEvent.ACTION_UP -> {
longPressHandler?.removeCallbacksAndMessages(null)
if (!isDragging && !isLongPress) {
// Simple tap - open app
openMainActivity()
}
isDragging = false
}
MotionEvent.ACTION_CANCEL -> {
longPressHandler?.removeCallbacksAndMessages(null)
isDragging = false
}
}
}
private fun performLongPress() {
Log.d(TAG, "Long press detected - taking screenshot")
takeScreenshot()
}
private fun takeScreenshot() {
val projection = mediaProjection ?: run {
Log.e(TAG, "MediaProjection not available")
// Fallback: just open app without screenshot
openMainActivity()
return
}
try {
val displayMetrics = resources.displayMetrics
val width = displayMetrics.widthPixels
val height = displayMetrics.heightPixels
val density = displayMetrics.densityDpi
val imageReader = ImageReader.newInstance(width, height, PixelFormat.RGBA_8888, 2)
val virtualDisplay = projection.createVirtualDisplay(
"screenshot",
width, height, density,
DisplayManager.VIRTUAL_DISPLAY_FLAG_AUTO_MIRROR,
imageReader.surface, null, null
)
// Wait for image
Handler(Looper.getMainLooper()).postDelayed({
try {
val image = imageReader.acquireLatestImage()
if (image != null) {
val bitmap = imageToBitmap(image)
val screenshotPath = saveBitmap(bitmap)
image.close()
Log.d(TAG, "Screenshot saved: $screenshotPath")
// Open app with screenshot
openMainActivityWithImage(screenshotPath)
} else {
Log.e(TAG, "Failed to acquire image")
openMainActivity()
}
virtualDisplay.release()
imageReader.close()
} catch (e: Exception) {
Log.e(TAG, "Error processing screenshot", e)
openMainActivity()
}
}, 500) // Small delay to ensure capture
} catch (e: Exception) {
Log.e(TAG, "Failed to take screenshot", e)
openMainActivity()
}
}
private fun imageToBitmap(image: android.media.Image): Bitmap {
val planes = image.planes
val buffer: ByteBuffer = planes[0].buffer
val pixelStride = planes[0].pixelStride
val rowStride = planes[0].rowStride
val rowPadding = rowStride - pixelStride * image.width
        // Buffer rows are padded, so allocate the extra columns to keep
        // copyPixelsFromBuffer aligned; crop to image.width if exact size matters
val bitmap = Bitmap.createBitmap(
image.width + rowPadding / pixelStride,
image.height,
Bitmap.Config.ARGB_8888
)
bitmap.copyPixelsFromBuffer(buffer)
return bitmap
}
private fun saveBitmap(bitmap: Bitmap): String {
val filename = "screenshot_${System.currentTimeMillis()}.png"
val file = File(cacheDir, filename)
FileOutputStream(file).use { out ->
bitmap.compress(Bitmap.CompressFormat.PNG, 100, out)
}
return file.absolutePath
}
private fun openMainActivity() {
val intent = Intent(this, MainActivity::class.java).apply {
flags = Intent.FLAG_ACTIVITY_NEW_TASK or Intent.FLAG_ACTIVITY_CLEAR_TOP
putExtra("from_floating_button", true)
}
startActivity(intent)
}
private fun openMainActivityWithImage(imagePath: String) {
val intent = Intent(this, MainActivity::class.java).apply {
flags = Intent.FLAG_ACTIVITY_NEW_TASK or Intent.FLAG_ACTIVITY_CLEAR_TOP
putExtra("from_floating_button", true)
putExtra("screenshot_path", imagePath)
putExtra("auto_analyze", true)
}
startActivity(intent)
}
override fun onBind(intent: Intent?): IBinder? = null
override fun onDestroy() {
super.onDestroy()
lifecycleRegistry.currentState = Lifecycle.State.DESTROYED
isRunning = false
floatingButton?.let {
try {
windowManager.removeView(it)
} catch (e: Exception) {
Log.e(TAG, "Error removing floating button", e)
}
}
floatingButton = null
}
override val lifecycle: Lifecycle
get() = lifecycleRegistry
}
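The touch handling above resolves each gesture by two signals: movement beyond `DRAG_THRESHOLD` cancels the pending long press, and an unmoved press resolves by hold time. A self-contained decision table capturing that logic (`classifyGesture` is illustrative, not an app API):

```kotlin
// Mirrors handleTouch: drag wins over everything, then hold time decides
// between long press and tap. Thresholds default to the service's constants.
fun classifyGesture(
    dx: Float,
    dy: Float,
    heldMs: Long,
    threshold: Float = 10f,    // DRAG_THRESHOLD
    longPressMs: Long = 800L   // LONG_PRESS_DURATION
): String = when {
    kotlin.math.abs(dx) > threshold || kotlin.math.abs(dy) > threshold -> "drag"
    heldMs >= longPressMs -> "long_press"
    else -> "tap"
}
```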
@@ -0,0 +1,18 @@
package com.sleepy.agent.settings
import androidx.datastore.preferences.core.booleanPreferencesKey
import androidx.datastore.preferences.core.stringPreferencesKey
object Preferences {
val SERVER_URL = stringPreferencesKey("server_url")
val MODEL_PATH = stringPreferencesKey("model_path")
val ENABLE_SERVER_DELEGATION = booleanPreferencesKey("enable_server_delegation")
val MODEL_SOURCE = stringPreferencesKey("model_source")
val SELECTED_SERVER_MODEL = stringPreferencesKey("selected_server_model")
}
enum class ModelSource(val displayName: String, val description: String) {
FILE_PATH("Local File", "Load a .litertlm model file from storage"),
E2B("Gemma 4 E2B", "2B parameter model (fast, ~1.5GB)"),
E4B("Gemma 4 E4B", "4B parameter model (capable, ~2.5GB)")
}
@@ -0,0 +1,149 @@
package com.sleepy.agent.settings
import androidx.datastore.core.DataStore
import androidx.datastore.preferences.core.Preferences
import androidx.datastore.preferences.core.booleanPreferencesKey
import androidx.datastore.preferences.core.edit
import androidx.datastore.preferences.core.stringPreferencesKey
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.map
class UserSettings(
private val dataStore: DataStore<Preferences>
) {
companion object {
// Server URLs
val SERVER_URL = stringPreferencesKey("server_url")
val SEARCH_SERVER_URL = stringPreferencesKey("search_server_url")
val DELEGATE_SERVER_URL = stringPreferencesKey("delegate_server_url")
// Model settings
val MODEL_PATH = stringPreferencesKey("model_path")
val ENABLE_SERVER_DELEGATION = booleanPreferencesKey("enable_server_delegation")
val MODEL_SOURCE = stringPreferencesKey("model_source")
val SELECTED_SERVER_MODEL = stringPreferencesKey("selected_server_model")
// TTS settings
val TTS_ENABLED = booleanPreferencesKey("tts_enabled")
val TTS_AUTO_MODE = booleanPreferencesKey("tts_auto_mode")
val TTS_PREFERRED_INPUT = stringPreferencesKey("tts_preferred_input")
// Experimental features
val FLOATING_BUTTON_ENABLED = booleanPreferencesKey("floating_button_enabled")
}
// Search server (for web search tool) - empty default, user must configure
val searchServerUrl: Flow<String> = dataStore.data.map { prefs ->
prefs[SEARCH_SERVER_URL] ?: ""
}
// Delegate server (for LLM inference) - empty default, user must configure
val delegateServerUrl: Flow<String> = dataStore.data.map { prefs ->
prefs[DELEGATE_SERVER_URL] ?: ""
}
// Legacy combined URL
val serverUrl: Flow<String> = dataStore.data.map { prefs ->
prefs[SERVER_URL] ?: ""
}
val modelPath: Flow<String> = dataStore.data.map { prefs ->
prefs[MODEL_PATH] ?: ""
}
val enableServerDelegation: Flow<Boolean> = dataStore.data.map { prefs ->
prefs[ENABLE_SERVER_DELEGATION] ?: false
}
val modelSource: Flow<ModelSource> = dataStore.data.map { prefs ->
// Guard against stale or invalid stored names; valueOf would throw.
prefs[MODEL_SOURCE]?.let { stored -> runCatching { ModelSource.valueOf(stored) }.getOrNull() } ?: ModelSource.FILE_PATH
}
val selectedServerModel: Flow<String> = dataStore.data.map { prefs ->
prefs[SELECTED_SERVER_MODEL] ?: ""
}
// TTS settings
val ttsEnabled: Flow<Boolean> = dataStore.data.map { prefs ->
prefs[TTS_ENABLED] ?: true // Default to enabled
}
val ttsAutoMode: Flow<Boolean> = dataStore.data.map { prefs ->
prefs[TTS_AUTO_MODE] ?: true // Default to auto-detect
}
val ttsPreferredInput: Flow<String> = dataStore.data.map { prefs ->
prefs[TTS_PREFERRED_INPUT] ?: "" // Empty = not set yet, "voice" or "text"
}
suspend fun setSearchServerUrl(url: String) {
dataStore.edit { prefs ->
prefs[SEARCH_SERVER_URL] = url
}
}
suspend fun setDelegateServerUrl(url: String) {
dataStore.edit { prefs ->
prefs[DELEGATE_SERVER_URL] = url
}
}
suspend fun setServerUrl(url: String) {
dataStore.edit { prefs ->
prefs[SERVER_URL] = url
}
}
suspend fun setModelPath(path: String) {
dataStore.edit { prefs ->
prefs[MODEL_PATH] = path
}
}
suspend fun setEnableServerDelegation(enabled: Boolean) {
dataStore.edit { prefs ->
prefs[ENABLE_SERVER_DELEGATION] = enabled
}
}
suspend fun setModelSource(source: ModelSource) {
dataStore.edit { prefs ->
prefs[MODEL_SOURCE] = source.name
}
}
suspend fun setSelectedServerModel(model: String) {
dataStore.edit { prefs ->
prefs[SELECTED_SERVER_MODEL] = model
}
}
suspend fun setTtsEnabled(enabled: Boolean) {
dataStore.edit { prefs ->
prefs[TTS_ENABLED] = enabled
}
}
suspend fun setTtsAutoMode(auto: Boolean) {
dataStore.edit { prefs ->
prefs[TTS_AUTO_MODE] = auto
}
}
suspend fun setTtsPreferredInput(input: String) {
dataStore.edit { prefs ->
prefs[TTS_PREFERRED_INPUT] = input
}
}
// Floating button (experimental feature)
val floatingButtonEnabled: Flow<Boolean> = dataStore.data.map { prefs ->
prefs[FLOATING_BUTTON_ENABLED] ?: false // Default off
}
suspend fun setFloatingButtonEnabled(enabled: Boolean) {
dataStore.edit { prefs ->
prefs[FLOATING_BUTTON_ENABLED] = enabled
}
}
}
@@ -0,0 +1,50 @@
package com.sleepy.agent.tools
import io.ktor.client.HttpClient
import io.ktor.client.call.body
import io.ktor.client.request.post
import io.ktor.client.request.setBody
import io.ktor.http.ContentType
import io.ktor.http.contentType
import kotlinx.serialization.json.JsonObject
import kotlinx.serialization.json.buildJsonObject
import kotlinx.serialization.json.put
class ServerTool(
private val client: HttpClient,
private val baseUrl: String = "http://sleepy-think:8000"
) : Tool {
override val name: String = "home_server"
override val displayName: String = "Home Server"
override val description: String = "Execute commands on the home server. Parameters: command (string), args (optional string)"
override suspend fun execute(arguments: Map<String, String>): String {
val command = arguments["command"] ?: return "Error: 'command' parameter is required"
val args = arguments["args"]?.split(",")?.map { it.trim() } ?: emptyList()
return executeSync(command, args)
}
/**
* Synchronous version for tool calling.
*/
fun executeSync(command: String, args: List<String> = emptyList()): String {
return try {
kotlinx.coroutines.runBlocking {
val response: JsonObject = client.post("$baseUrl/execute") {
contentType(ContentType.Application.Json)
setBody(buildJsonObject {
put("command", command)
put("args", args.joinToString(","))
})
}.body()
// Read the primitive's content so string values are not returned with the
// JSON quotes that JsonElement.toString() would include.
(response["result"] as? kotlinx.serialization.json.JsonPrimitive)?.content
?: (response["output"] as? kotlinx.serialization.json.JsonPrimitive)?.content
?: "Command executed successfully"
}
} catch (e: Exception) {
"Error executing server command: ${e.message}"
}
}
}
@@ -0,0 +1,11 @@
package com.sleepy.agent.tools
/**
* Interface for tools that can be called by the Agent.
*/
interface Tool {
val name: String
val displayName: String
val description: String
suspend fun execute(arguments: Map<String, String>): String
}
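A minimal sketch of how an agent might route a parsed tool call through this abstraction. `execute` is non-suspending here purely to keep the example self-contained; the real interface is suspending, and `SimpleTool`, `EchoTool`, and `dispatch` are hypothetical names for illustration.

```kotlin
// Simplified, non-suspending mirror of the Tool interface above.
interface SimpleTool {
    val name: String
    fun execute(arguments: Map<String, String>): String
}

// Toy tool that echoes back its "text" argument.
class EchoTool : SimpleTool {
    override val name = "echo"
    override fun execute(arguments: Map<String, String>): String =
        arguments["text"] ?: "Error: 'text' parameter is required"
}

// Look a tool up by name and run it, keeping the string-in / string-out
// convention the tools use; unknown names fall back to an error string.
fun dispatch(tools: List<SimpleTool>, name: String, args: Map<String, String>): String =
    tools.firstOrNull { it.name == name }?.execute(args)
        ?: "Error: unknown tool '$name'"
```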
@@ -0,0 +1,76 @@
package com.sleepy.agent.tools
import io.ktor.client.HttpClient
import io.ktor.client.call.body
import io.ktor.client.request.get
import io.ktor.client.request.parameter
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json
class WebSearchTool(
private val client: HttpClient,
private var baseUrl: String
) : Tool {
override val name: String = "web_search"
override val displayName: String = "Web Search"
override val description: String = "Search the web for information. Parameters: query (string)"
/**
* Updates the base URL for the search endpoint.
* Call this when the server URL setting changes.
*/
fun updateBaseUrl(newUrl: String) {
baseUrl = newUrl.trim().trimEnd('/')
}
private val json = Json { ignoreUnknownKeys = true }
override suspend fun execute(arguments: Map<String, String>): String {
val query = arguments["query"] ?: return "Error: 'query' parameter is required"
return executeSync(query)
}
/**
* Synchronous version for tool calling.
*/
fun executeSync(query: String): String {
return try {
kotlinx.coroutines.runBlocking {
val response: SearxngResponse = client.get("$baseUrl/search") {
parameter("q", query)
parameter("format", "json")
parameter("safesearch", "0")
}.body()
if (response.results.isEmpty()) {
"No results found for '$query'"
} else {
response.results.take(5).joinToString("\n\n") { result ->
buildString {
append("Title: ${result.title}\n")
append("URL: ${result.url}\n")
append("Content: ${result.content}")
}
}
}
}
} catch (e: Exception) {
"Error performing web search: ${e.message}"
}
}
@Serializable
data class SearxngResponse(
val query: String = "",
val results: List<SearchResult> = emptyList()
)
@Serializable
data class SearchResult(
val title: String = "",
val url: String = "",
val content: String = "",
val engine: String = ""
)
}
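The formatting step of `executeSync` can be restated as a pure function, which makes the truncation-to-five and the joined layout easy to test in isolation. `ResultRow` and `formatResults` are illustrative names; the fields mirror `SearchResult` above.

```kotlin
// Sketch of the result formatting WebSearchTool.executeSync performs:
// keep the top results and join them into a plain-text block for the model.
data class ResultRow(val title: String, val url: String, val content: String)

fun formatResults(query: String, results: List<ResultRow>, limit: Int = 5): String =
    if (results.isEmpty()) "No results found for '$query'"
    else results.take(limit).joinToString("\n\n") { r ->
        "Title: ${r.title}\nURL: ${r.url}\nContent: ${r.content}"
    }
```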
@@ -0,0 +1 @@
test
@@ -0,0 +1,546 @@
package com.sleepy.agent.ui.screens
import androidx.compose.foundation.background
import androidx.compose.foundation.clickable
import androidx.compose.foundation.layout.Arrangement
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.PaddingValues
import androidx.compose.foundation.layout.Row
import androidx.compose.foundation.layout.Spacer
import androidx.compose.foundation.layout.WindowInsets
import androidx.compose.foundation.layout.fillMaxHeight
import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.foundation.layout.fillMaxWidth
import androidx.compose.foundation.layout.height
import androidx.compose.foundation.layout.imePadding
import androidx.compose.foundation.layout.navigationBars
import androidx.compose.foundation.layout.padding
import androidx.compose.foundation.layout.size
import androidx.compose.foundation.layout.width
import androidx.compose.foundation.layout.windowInsetsPadding
import androidx.compose.foundation.lazy.LazyColumn
import androidx.compose.foundation.lazy.items
import androidx.compose.foundation.lazy.rememberLazyListState
import androidx.compose.foundation.shape.RoundedCornerShape
import androidx.compose.foundation.text.KeyboardActions
import androidx.compose.foundation.text.KeyboardOptions
import androidx.compose.foundation.text.selection.SelectionContainer
import dev.jeziellago.compose.markdowntext.MarkdownText
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.automirrored.filled.ArrowBack
import androidx.compose.material.icons.automirrored.filled.Send
import androidx.compose.material.icons.filled.Add
import androidx.compose.material.icons.filled.Delete
import androidx.compose.material.icons.filled.Image
import androidx.compose.material.icons.filled.Menu
import androidx.compose.material.icons.filled.Mic
import androidx.compose.material.icons.filled.Settings
import androidx.compose.material3.Button
import androidx.compose.material3.Card
import androidx.compose.material3.CardDefaults
import androidx.compose.material3.CircularProgressIndicator
import androidx.compose.material3.DrawerValue
import androidx.compose.material3.ExperimentalMaterial3Api
import androidx.compose.material3.FloatingActionButton
import androidx.compose.material3.FloatingActionButtonDefaults
import androidx.compose.material3.HorizontalDivider
import androidx.compose.material3.Icon
import androidx.compose.material3.IconButton
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.ModalDrawerSheet
import androidx.compose.material3.ModalNavigationDrawer
import androidx.compose.material3.OutlinedTextField
import androidx.compose.material3.Scaffold
import androidx.compose.material3.Text
import androidx.compose.material3.TopAppBar
import androidx.compose.material3.TopAppBarDefaults
import androidx.compose.material3.rememberDrawerState
import androidx.compose.runtime.Composable
import androidx.compose.runtime.LaunchedEffect
import androidx.compose.runtime.getValue
import androidx.compose.runtime.mutableStateOf
import androidx.compose.runtime.remember
import androidx.compose.runtime.rememberCoroutineScope
import androidx.compose.runtime.setValue
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.draw.clip
import androidx.compose.ui.input.nestedscroll.nestedScroll
import androidx.compose.ui.platform.LocalSoftwareKeyboardController
import androidx.compose.ui.text.input.ImeAction
import androidx.compose.ui.text.style.TextOverflow
import androidx.compose.ui.unit.dp
import androidx.lifecycle.compose.collectAsStateWithLifecycle
import com.sleepy.agent.data.ConversationInfo
import kotlinx.coroutines.launch
import java.text.SimpleDateFormat
import java.util.Date
import java.util.Locale
@OptIn(ExperimentalMaterial3Api::class)
@Composable
fun MainScreen(
onNavigateToSettings: () -> Unit,
viewModel: MainViewModel,
onPickImage: (String) -> Unit = {}
) {
val uiState by viewModel.uiState.collectAsStateWithLifecycle()
val responseText by viewModel.responseText.collectAsStateWithLifecycle()
val messages by viewModel.messages.collectAsStateWithLifecycle()
val conversations by viewModel.conversations.collectAsStateWithLifecycle()
val listState = rememberLazyListState()
val scrollBehavior = TopAppBarDefaults.pinnedScrollBehavior()
val drawerState = rememberDrawerState(initialValue = DrawerValue.Closed)
val scope = rememberCoroutineScope()
// Text state managed here so it can be passed to image picker
var currentText by remember { mutableStateOf("") }
// Scroll to bottom when new messages arrive
LaunchedEffect(messages.size, responseText) {
if (messages.isNotEmpty()) {
listState.animateScrollToItem(messages.size - 1)
}
}
ModalNavigationDrawer(
drawerState = drawerState,
drawerContent = {
ChatHistoryDrawer(
conversations = conversations,
onConversationClick = { id ->
viewModel.loadConversation(id)
scope.launch { drawerState.close() }
},
onNewChat = {
viewModel.startNewConversation()
scope.launch { drawerState.close() }
},
onDeleteConversation = { id ->
viewModel.deleteConversation(id)
}
)
}
) {
Scaffold(
modifier = Modifier.nestedScroll(scrollBehavior.nestedScrollConnection),
topBar = {
TopAppBar(
title = { Text("Sleepy Agent") },
navigationIcon = {
IconButton(onClick = { scope.launch { drawerState.open() } }) {
Icon(Icons.Default.Menu, contentDescription = "Menu")
}
},
actions = {
IconButton(onClick = onNavigateToSettings) {
Icon(Icons.Default.Settings, contentDescription = "Settings")
}
},
scrollBehavior = scrollBehavior
)
}
) { paddingValues ->
Column(
modifier = Modifier
.fillMaxSize()
.padding(paddingValues)
.imePadding()
) {
// Messages list
if (messages.isEmpty()) {
Box(
modifier = Modifier
.weight(1f)
.fillMaxWidth(),
contentAlignment = Alignment.Center
) {
Column(
horizontalAlignment = Alignment.CenterHorizontally,
verticalArrangement = Arrangement.Center
) {
Text(
text = "👋 Welcome to Sleepy Agent",
style = MaterialTheme.typography.headlineSmall,
color = MaterialTheme.colorScheme.primary
)
Spacer(modifier = Modifier.height(8.dp))
Text(
text = "Tap the microphone to start speaking\nor type a message below",
style = MaterialTheme.typography.bodyMedium,
color = MaterialTheme.colorScheme.onSurfaceVariant
)
Spacer(modifier = Modifier.height(16.dp))
if (uiState == UIState.ERROR) {
Text(
text = responseText,
style = MaterialTheme.typography.bodySmall,
color = MaterialTheme.colorScheme.error,
modifier = Modifier.padding(horizontal = 32.dp)
)
}
}
}
} else {
LazyColumn(
modifier = Modifier
.weight(1f)
.fillMaxWidth(),
state = listState,
contentPadding = PaddingValues(16.dp),
verticalArrangement = Arrangement.spacedBy(8.dp)
) {
items(
items = messages,
key = { it.id }
) { message ->
MessageBubble(
message = message,
modifier = Modifier.fillMaxWidth()
)
}
// Show streaming response if any
if ((uiState == UIState.SPEAKING || uiState == UIState.EXECUTING_TOOL) && responseText.isNotEmpty() &&
messages.lastOrNull()?.isUser == true) {
item {
MessageBubble(
message = ConversationMessage(
text = responseText,
isUser = false,
isToolCall = uiState == UIState.EXECUTING_TOOL
),
modifier = Modifier.fillMaxWidth()
)
}
}
}
}
// Bottom input area
BottomInputBar(
text = currentText,
onTextChange = { currentText = it },
onSendMessage = { message ->
viewModel.sendTextMessage(message)
currentText = ""
},
onMicClick = {
when (uiState) {
UIState.IDLE -> viewModel.startRecording()
UIState.LISTENING -> viewModel.stopRecording()
else -> { }
}
},
onPickImage = { onPickImage(currentText) },
isRecording = uiState == UIState.LISTENING,
isProcessing = uiState == UIState.PROCESSING,
isExecutingTool = uiState == UIState.EXECUTING_TOOL,
modifier = Modifier
.fillMaxWidth()
.windowInsetsPadding(WindowInsets.navigationBars)
)
}
}
}
}
@Composable
private fun ChatHistoryDrawer(
conversations: List<ConversationInfo>,
onConversationClick: (String) -> Unit,
onNewChat: () -> Unit,
onDeleteConversation: (String) -> Unit
) {
ModalDrawerSheet {
Column(
modifier = Modifier
.fillMaxHeight()
.padding(16.dp)
) {
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.SpaceBetween,
verticalAlignment = Alignment.CenterVertically
) {
Text(
text = "Chat History",
style = MaterialTheme.typography.titleLarge
)
IconButton(onClick = onNewChat) {
Icon(Icons.Default.Add, contentDescription = "New Chat")
}
}
Spacer(modifier = Modifier.height(16.dp))
// New Chat button
Button(
onClick = onNewChat,
modifier = Modifier.fillMaxWidth()
) {
Icon(Icons.Default.Add, contentDescription = null)
Spacer(Modifier.width(8.dp))
Text("New Chat")
}
Spacer(modifier = Modifier.height(16.dp))
HorizontalDivider()
Spacer(modifier = Modifier.height(8.dp))
// Conversations list
LazyColumn(
modifier = Modifier.weight(1f),
verticalArrangement = Arrangement.spacedBy(4.dp)
) {
items(conversations) { conversation ->
ConversationItem(
conversation = conversation,
onClick = { onConversationClick(conversation.id) },
onDelete = { onDeleteConversation(conversation.id) }
)
}
}
if (conversations.isEmpty()) {
Box(
modifier = Modifier.weight(1f),
contentAlignment = Alignment.Center
) {
Text(
text = "No previous chats",
style = MaterialTheme.typography.bodyMedium,
color = MaterialTheme.colorScheme.onSurfaceVariant
)
}
}
}
}
}
@Composable
private fun ConversationItem(
conversation: ConversationInfo,
onClick: () -> Unit,
onDelete: () -> Unit
) {
val dateFormat = remember { SimpleDateFormat("MMM dd, HH:mm", Locale.getDefault()) }
Card(
modifier = Modifier
.fillMaxWidth()
.clickable(onClick = onClick),
colors = CardDefaults.cardColors(
containerColor = MaterialTheme.colorScheme.surfaceVariant
)
) {
Row(
modifier = Modifier
.padding(12.dp)
.fillMaxWidth(),
horizontalArrangement = Arrangement.SpaceBetween,
verticalAlignment = Alignment.CenterVertically
) {
Column(modifier = Modifier.weight(1f)) {
Text(
text = conversation.title,
style = MaterialTheme.typography.bodyMedium,
maxLines = 1,
overflow = TextOverflow.Ellipsis
)
Text(
text = "${dateFormat.format(Date(conversation.timestamp))} · ${conversation.messageCount} messages",
style = MaterialTheme.typography.bodySmall,
color = MaterialTheme.colorScheme.onSurfaceVariant
)
}
IconButton(onClick = onDelete) {
Icon(
Icons.Default.Delete,
contentDescription = "Delete",
tint = MaterialTheme.colorScheme.error
)
}
}
}
}
@Composable
private fun BottomInputBar(
text: String,
onTextChange: (String) -> Unit,
onSendMessage: (String) -> Unit,
onMicClick: () -> Unit,
onPickImage: () -> Unit,
isRecording: Boolean,
isProcessing: Boolean,
isExecutingTool: Boolean = false,
modifier: Modifier = Modifier
) {
val keyboardController = LocalSoftwareKeyboardController.current
Row(
modifier = modifier
.padding(horizontal = 16.dp, vertical = 8.dp),
verticalAlignment = Alignment.CenterVertically,
horizontalArrangement = Arrangement.spacedBy(8.dp)
) {
// Text input field
OutlinedTextField(
value = text,
onValueChange = onTextChange,
placeholder = {
Text(
when {
isExecutingTool -> "🔧 Executing tool..."
isProcessing -> "Thinking..."
else -> "Type a message..."
}
)
},
modifier = Modifier.weight(1f),
enabled = !isProcessing && !isExecutingTool,
keyboardOptions = KeyboardOptions(imeAction = ImeAction.Send),
keyboardActions = KeyboardActions(
onSend = {
if (text.isNotBlank()) {
onSendMessage(text)
onTextChange("")
keyboardController?.hide()
}
}
),
singleLine = false,
maxLines = 4,
shape = RoundedCornerShape(24.dp)
)
// Image picker button
IconButton(
onClick = onPickImage,
enabled = !isProcessing && !isExecutingTool,
modifier = Modifier.size(48.dp)
) {
Icon(
imageVector = Icons.Default.Image,
contentDescription = "Send Image",
modifier = Modifier.size(24.dp)
)
}
// Mic button
FloatingActionButton(
onClick = onMicClick,
modifier = Modifier.size(48.dp),
containerColor = when {
isRecording -> MaterialTheme.colorScheme.error
isProcessing || isExecutingTool -> MaterialTheme.colorScheme.secondary
else -> MaterialTheme.colorScheme.primary
},
elevation = FloatingActionButtonDefaults.elevation(0.dp, 0.dp, 0.dp, 0.dp)
) {
when {
isProcessing || isExecutingTool -> CircularProgressIndicator(
modifier = Modifier.size(24.dp),
color = MaterialTheme.colorScheme.onSecondary,
strokeWidth = 2.dp
)
else -> Icon(
imageVector = Icons.Default.Mic,
contentDescription = if (isRecording) "Stop Recording" else "Start Recording",
modifier = Modifier.size(24.dp)
)
}
}
// Send button
IconButton(
onClick = {
if (text.isNotBlank()) {
onSendMessage(text)
onTextChange("")
keyboardController?.hide()
}
},
enabled = text.isNotBlank() && !isProcessing && !isExecutingTool,
modifier = Modifier.size(48.dp)
) {
Icon(
imageVector = Icons.AutoMirrored.Filled.Send,
contentDescription = "Send",
modifier = Modifier.size(24.dp)
)
}
}
}
@Composable
private fun MessageBubble(
message: ConversationMessage,
modifier: Modifier = Modifier
) {
val alignment = if (message.isUser) Alignment.CenterEnd else Alignment.CenterStart
val (backgroundColor, textColor) = when {
message.isToolCall -> Pair(
MaterialTheme.colorScheme.tertiaryContainer,
MaterialTheme.colorScheme.onTertiaryContainer
)
message.isUser -> Pair(
MaterialTheme.colorScheme.primaryContainer,
MaterialTheme.colorScheme.onPrimaryContainer
)
else -> Pair(
MaterialTheme.colorScheme.secondaryContainer,
MaterialTheme.colorScheme.onSecondaryContainer
)
}
Box(
modifier = modifier,
contentAlignment = alignment
) {
Card(
colors = CardDefaults.cardColors(
containerColor = backgroundColor,
contentColor = textColor
),
shape = RoundedCornerShape(
topStart = 16.dp,
topEnd = 16.dp,
bottomStart = if (message.isUser) 16.dp else 4.dp,
bottomEnd = if (message.isUser) 4.dp else 16.dp
),
modifier = Modifier.padding(horizontal = 8.dp, vertical = 4.dp)
) {
// Use Markdown for AI messages, plain text for user
if (message.isUser) {
SelectionContainer {
Text(
text = message.text,
style = MaterialTheme.typography.bodyLarge,
modifier = Modifier.padding(12.dp)
)
}
} else {
// Key on message id and streamed length so markdown re-renders predictably
androidx.compose.runtime.key(message.id, message.text.length) {
Box(modifier = Modifier.padding(12.dp)) {
MarkdownText(
markdown = message.text,
modifier = Modifier.fillMaxWidth(),
style = if (message.isToolCall) {
MaterialTheme.typography.bodyMedium.copy(color = textColor)
} else {
MaterialTheme.typography.bodyLarge.copy(color = textColor)
}
)
}
}
}
}
}
}
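The streaming-bubble condition in `MainScreen` can be restated as a pure predicate, with the state enum redeclared locally so the snippet stands alone (`BubbleState` and `shouldShowStreamingBubble` are hypothetical names): show the in-progress bubble only while the agent is generating and the last saved message is still the user's.

```kotlin
// Local stand-in for UIState, so this sketch is self-contained.
enum class BubbleState { IDLE, LISTENING, PROCESSING, EXECUTING_TOOL, SPEAKING, ERROR }

// True only while tokens are streaming (or a tool is running), there is
// partial text to show, and the assistant reply has not been persisted yet.
fun shouldShowStreamingBubble(
    state: BubbleState,
    responseText: String,
    lastMessageIsUser: Boolean?
): Boolean =
    (state == BubbleState.SPEAKING || state == BubbleState.EXECUTING_TOOL) &&
        responseText.isNotEmpty() &&
        lastMessageIsUser == true
```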
@@ -0,0 +1,656 @@
package com.sleepy.agent.ui.screens
import android.content.Context
import android.util.Log
import androidx.lifecycle.SavedStateHandle
import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import com.sleepy.agent.audio.AudioRecorder
import com.sleepy.agent.audio.TtsService
import com.sleepy.agent.data.ConversationInfo
import com.sleepy.agent.data.ConversationStorage
import com.sleepy.agent.download.ModelDownloadManager
import com.sleepy.agent.inference.Agent
import com.sleepy.agent.inference.AgentEvent
import com.sleepy.agent.inference.LlmEngine
import com.sleepy.agent.settings.UserSettings
import com.sleepy.agent.tools.WebSearchTool
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.asStateFlow
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.launch
import kotlinx.serialization.Serializable
import kotlinx.serialization.encodeToString
import kotlinx.serialization.json.Json
enum class UIState {
IDLE,
LISTENING,
PROCESSING,
EXECUTING_TOOL,
SPEAKING,
ERROR
}
@Serializable
data class ConversationMessage(
val id: String = java.util.UUID.randomUUID().toString(), // millis alone can collide for rapid messages; id is used as a LazyColumn key
val text: String,
val isUser: Boolean,
val isToolCall: Boolean = false,
val timestamp: Long = System.currentTimeMillis()
)
class MainViewModel(
private val savedStateHandle: SavedStateHandle,
private val context: Context,
private val audioRecorder: AudioRecorder,
private val ttsService: TtsService,
private val agent: Agent,
private val llmEngine: LlmEngine,
private val userSettings: UserSettings,
private val webSearchTool: WebSearchTool
) : ViewModel() {
private val conversationStorage = ConversationStorage(context)
private val _uiState = MutableStateFlow(UIState.IDLE)
val uiState: StateFlow<UIState> = _uiState.asStateFlow()
private val _responseText = MutableStateFlow("")
val responseText: StateFlow<String> = _responseText.asStateFlow()
private val _messages = MutableStateFlow<List<ConversationMessage>>(emptyList())
val messages: StateFlow<List<ConversationMessage>> = _messages.asStateFlow()
private val _conversations = MutableStateFlow<List<ConversationInfo>>(emptyList())
val conversations: StateFlow<List<ConversationInfo>> = _conversations.asStateFlow()
private var currentConversationId: String = savedStateHandle.get<String>("conversation_id")
?: conversationStorage.createNewConversationId()
private var recordedAudio: ByteArray = byteArrayOf()
// Track if user started with voice or text for TTS auto mode
private var firstInputWasVoice: Boolean? = null
companion object {
private const val TAG = "MainViewModel"
private const val KEY_MESSAGES = "messages"
}
init {
ttsService.initialize()
restoreState()
loadConversationsList()
viewModelScope.launch {
var modelPath = userSettings.modelPath.first()
// Auto-detect downloaded model if path is empty
if (modelPath.isEmpty()) {
if (ModelDownloadManager.isE2BDownloaded(context)) {
modelPath = ModelDownloadManager.getE2BModelFile(context).absolutePath
userSettings.setModelPath(modelPath)
Log.d(TAG, "Auto-detected E2B model at: $modelPath")
} else if (ModelDownloadManager.isE4BDownloaded(context)) {
modelPath = ModelDownloadManager.getE4BModelFile(context).absolutePath
userSettings.setModelPath(modelPath)
Log.d(TAG, "Auto-detected E4B model at: $modelPath")
}
}
if (modelPath.isNotEmpty() && !llmEngine.isLoaded()) {
loadModel(modelPath)
}
}
// Update web search URL when settings change
viewModelScope.launch {
userSettings.searchServerUrl.collect { url ->
webSearchTool.updateBaseUrl(url)
Log.d(TAG, "Updated web search URL to: $url")
}
}
}
private fun restoreState() {
// Try to load from persistent storage first
conversationStorage.loadConversation(currentConversationId)?.let { messages ->
_messages.value = messages
Log.d(TAG, "Loaded ${messages.size} messages from storage")
return
}
// Fallback to SavedStateHandle for rotation
savedStateHandle.get<String>(KEY_MESSAGES)?.let { json ->
try {
val restored = Json.decodeFromString<List<ConversationMessage>>(json)
_messages.value = restored
Log.d(TAG, "Restored ${restored.size} messages from SavedState")
} catch (e: Exception) {
Log.e(TAG, "Failed to restore messages", e)
}
}
}
private fun saveState() {
// Save to persistent storage
if (_messages.value.isNotEmpty()) {
conversationStorage.saveConversation(currentConversationId, _messages.value)
}
// Also save to SavedStateHandle for rotation
try {
val json = Json.encodeToString(_messages.value)
savedStateHandle[KEY_MESSAGES] = json
savedStateHandle["conversation_id"] = currentConversationId
} catch (e: Exception) {
Log.e(TAG, "Failed to save messages", e)
}
}
private fun loadConversationsList() {
_conversations.value = conversationStorage.getAllConversations()
}
fun loadConversation(id: String) {
conversationStorage.loadConversation(id)?.let { messages ->
// Save current conversation before switching
if (_messages.value.isNotEmpty()) {
conversationStorage.saveConversation(currentConversationId, _messages.value)
}
currentConversationId = id
_messages.value = messages
agent.reset() // Reset agent for new conversation context
savedStateHandle[KEY_MESSAGES] = Json.encodeToString(messages)
savedStateHandle["conversation_id"] = id
}
loadConversationsList()
}
fun startNewConversation() {
// Save current conversation
if (_messages.value.isNotEmpty()) {
conversationStorage.saveConversation(currentConversationId, _messages.value)
}
// Create new one
currentConversationId = conversationStorage.createNewConversationId()
_messages.value = emptyList()
_responseText.value = ""
agent.reset()
firstInputWasVoice = null
saveState()
loadConversationsList()
}
fun deleteConversation(id: String) {
conversationStorage.deleteConversation(id)
if (id == currentConversationId) {
startNewConversation()
} else {
loadConversationsList()
}
}
private suspend fun loadModel(modelPath: String) {
Log.d(TAG, "Auto-loading model from: $modelPath")
llmEngine.loadModel(modelPath)
.onSuccess {
Log.d(TAG, "Model loaded successfully")
// Pre-warm KV cache with system prompt for faster first response
agent.prewarmCache()
}
.onFailure { e -> Log.e(TAG, "Failed to load model", e) }
}
fun startRecording() {
// Track that first input was voice for TTS auto mode
if (firstInputWasVoice == null) {
firstInputWasVoice = true
Log.d(TAG, "First input was voice - TTS auto-enabled")
}
audioRecorder.setOnSilenceDetectedListener {
Log.d(TAG, "Auto-stopping recording due to silence")
stopRecording()
}
val result = audioRecorder.startRecording()
result.fold(
onSuccess = {
recordedAudio = byteArrayOf()
_uiState.value = UIState.LISTENING
Log.d(TAG, "Started recording with auto-stop")
},
onFailure = { e ->
Log.e(TAG, "Failed to start recording", e)
_uiState.value = UIState.ERROR
_responseText.value = "Error: ${e.message}"
}
)
}
fun stopRecording() {
viewModelScope.launch {
val audioData = audioRecorder.stopRecording()
recordedAudio = audioData
if (audioData.isEmpty()) {
Log.w(TAG, "No audio recorded or recording too short")
_uiState.value = UIState.IDLE
_responseText.value = "Recording too short, please try again"
return@launch
}
Log.d(TAG, "Audio recorded: ${audioData.size} bytes")
_uiState.value = UIState.PROCESSING
val useServer = userSettings.enableServerDelegation.first()
if (useServer) {
processAudioWithServer(audioData)
} else {
processAudioWithLocalModel(audioData)
}
}
}
private suspend fun processAudioWithLocalModel(audioData: ByteArray) {
try {
if (!llmEngine.isLoaded()) {
val modelPath = userSettings.modelPath.first()
if (modelPath.isNotEmpty()) {
_responseText.value = "Loading model..."
val result = llmEngine.loadModel(modelPath)
result.onFailure { e ->
_uiState.value = UIState.ERROR
_responseText.value = "Failed to load model: ${e.message}"
return@processAudioWithLocalModel
}
agent.prewarmCache()
} else {
_uiState.value = UIState.ERROR
_responseText.value = "No model loaded. Please go to Settings and load a model."
return
}
}
val userMessage = ConversationMessage(
text = "🎤 [Voice message]",
isUser = true
)
_messages.value = _messages.value + userMessage
saveState()
Log.d(TAG, "Processing audio: ${audioData.size} bytes")
val responseBuilder = StringBuilder()
// Send empty text with audio - the model will process the audio as the user's message
agent.processInput(
input = "",
audioData = audioData
).collect { event ->
when (event) {
is AgentEvent.Token -> {
responseBuilder.append(event.text)
_responseText.value = responseBuilder.toString()
_uiState.value = UIState.SPEAKING
}
is AgentEvent.ExecutingTool -> {
_uiState.value = UIState.EXECUTING_TOOL
_responseText.value = "🔧 Using ${event.toolName}..."
}
is AgentEvent.ToolResult -> {
// Tool completed, will continue to next iteration
}
is AgentEvent.Complete -> {
val aiMessage = ConversationMessage(
text = event.response,
isUser = false
)
_messages.value = _messages.value + aiMessage
saveState()
// Speak response if TTS enabled (auto mode is on since first input was voice)
speakResponse(event.response)
_uiState.value = UIState.IDLE
}
is AgentEvent.Error -> {
Log.e(TAG, "Agent error: ${event.message}")
_responseText.value = "Error: ${event.message}"
_uiState.value = UIState.ERROR
}
else -> {} // Handle other events if needed
}
}
} catch (e: Exception) {
Log.e(TAG, "Error processing audio", e)
_uiState.value = UIState.ERROR
_responseText.value = "Error: ${e.message}"
}
}
private suspend fun processAudioWithServer(audioData: ByteArray) {
val userMessage = ConversationMessage(
text = "🎤 [Voice message]",
isUser = true
)
_messages.value = _messages.value + userMessage
saveState()
val aiMessage = ConversationMessage(
text = "Server mode doesn't support native audio understanding yet. Please use local model for voice input.",
isUser = false
)
_messages.value = _messages.value + aiMessage
saveState()
_uiState.value = UIState.IDLE
}
fun sendTextMessage(text: String) {
viewModelScope.launch {
if (text.isBlank()) return@launch
Log.d(TAG, "sendTextMessage called with: ${text.take(50)}...")
Log.d(TAG, "Model loaded: ${llmEngine.isLoaded()}")
// Track that first input was text for TTS auto mode
if (firstInputWasVoice == null) {
firstInputWasVoice = false
Log.d(TAG, "First input was text - TTS auto-disabled")
}
val useServer = userSettings.enableServerDelegation.first()
Log.d(TAG, "useServer: $useServer")
if (useServer) {
processTextWithServer(text)
} else {
processTextWithLocalModel(text)
}
}
}
private suspend fun processTextWithLocalModel(text: String) {
Log.d(TAG, "processTextWithLocalModel started")
val userMessage = ConversationMessage(
text = text,
isUser = true
)
_messages.value = _messages.value + userMessage
saveState()
_uiState.value = UIState.PROCESSING
try {
if (!llmEngine.isLoaded()) {
Log.d(TAG, "Model not loaded, attempting to load...")
val modelPath = userSettings.modelPath.first()
Log.d(TAG, "Model path from settings: $modelPath")
if (modelPath.isNotEmpty()) {
_responseText.value = "Loading model..."
val result = llmEngine.loadModel(modelPath)
result.onFailure { e ->
Log.e(TAG, "Failed to load model", e)
_uiState.value = UIState.ERROR
_responseText.value = "Failed to load model: ${e.message}"
return // non-local return (exits this function); allowed because onFailure is inline
}
Log.d(TAG, "Model loaded successfully")
// Pre-warm cache after successful load
agent.prewarmCache()
} else {
Log.w(TAG, "No model path configured")
_uiState.value = UIState.ERROR
_responseText.value = "No model loaded. Please go to Settings and load a model."
return
}
}
Log.d(TAG, "Starting agent.processInput...")
val responseBuilder = StringBuilder()
agent.processInput(input = text).collect { event ->
Log.d(TAG, "Agent event: $event")
when (event) {
is AgentEvent.Token -> {
responseBuilder.append(event.text)
_responseText.value = responseBuilder.toString()
_uiState.value = UIState.SPEAKING
}
is AgentEvent.ExecutingTool -> {
_uiState.value = UIState.EXECUTING_TOOL
_responseText.value = "🔧 Using ${event.toolName}..."
}
is AgentEvent.ToolResult -> {
// Tool completed, will continue to next iteration
}
is AgentEvent.Complete -> {
val aiMessage = ConversationMessage(
text = event.response,
isUser = false
)
_messages.value = _messages.value + aiMessage
saveState()
// Speak response if TTS conditions met
speakResponse(event.response)
_uiState.value = UIState.IDLE
}
is AgentEvent.Error -> {
Log.e(TAG, "Agent error: ${event.message}")
_responseText.value = "Error: ${event.message}"
_uiState.value = UIState.ERROR
}
else -> {}
}
}
} catch (e: Exception) {
Log.e(TAG, "Error processing message", e)
_uiState.value = UIState.ERROR
_responseText.value = "Error: ${e.message}"
}
}
private suspend fun speakResponse(response: String) {
val ttsEnabled = userSettings.ttsEnabled.first()
val ttsAutoMode = userSettings.ttsAutoMode.first()
val shouldSpeak = when {
// TTS disabled completely
!ttsEnabled -> false
// Auto mode: speak if first input was voice, don't speak if first input was text
ttsAutoMode -> firstInputWasVoice == true
// Manual mode: speak if enabled
else -> true
}
if (shouldSpeak) {
Log.d(TAG, "Speaking response (firstInputWasVoice=$firstInputWasVoice, autoMode=$ttsAutoMode)")
ttsService.speak(response) {
// Callback when done speaking
}
} else {
Log.d(TAG, "Skipping TTS (firstInputWasVoice=$firstInputWasVoice, autoMode=$ttsAutoMode)")
}
}
private suspend fun processTextWithServer(text: String) {
val userMessage = ConversationMessage(
text = text,
isUser = true
)
_messages.value = _messages.value + userMessage
saveState()
_uiState.value = UIState.PROCESSING
val aiMessage = ConversationMessage(
text = "Server mode not yet implemented. Please use local model.",
isUser = false
)
_messages.value = _messages.value + aiMessage
saveState()
_uiState.value = UIState.IDLE
}
fun setResponse(text: String) {
_responseText.value = text
_uiState.value = UIState.SPEAKING
}
fun clearResponse() {
_responseText.value = ""
}
fun clearMessages() {
_messages.value = emptyList()
firstInputWasVoice = null // Reset TTS auto mode
agent.reset()
saveState()
}
fun setError(message: String) {
_responseText.value = message
_uiState.value = UIState.ERROR
}
fun resetToIdle() {
_uiState.value = UIState.IDLE
}
fun onImageSelected(bitmap: android.graphics.Bitmap?, text: String = "") {
if (bitmap == null) {
setError("Failed to load image")
return
}
// Validate bitmap
if (bitmap.width == 0 || bitmap.height == 0) {
setError("Invalid image dimensions")
return
}
Log.d(TAG, "Image selected: ${bitmap.width}x${bitmap.height}, text: '$text'")
viewModelScope.launch {
// Add image message to chat (with text if provided)
val displayText = if (text.isNotBlank()) "🖼️ $text" else "🖼️ [Image]"
val userMessage = ConversationMessage(
text = displayText,
isUser = true
)
_messages.value = _messages.value + userMessage
saveState()
firstInputWasVoice = false // Image is not voice input
_uiState.value = UIState.PROCESSING
try {
if (!llmEngine.isLoaded()) {
val modelPath = userSettings.modelPath.first()
if (modelPath.isNotEmpty()) {
_responseText.value = "Loading model..."
val result = llmEngine.loadModel(modelPath)
result.onFailure { e ->
_uiState.value = UIState.ERROR
_responseText.value = "Failed to load model: ${e.message}"
return@launch
}
agent.prewarmCache()
} else {
_uiState.value = UIState.ERROR
_responseText.value = "No model loaded. Please go to Settings and load a model."
return@launch
}
}
val responseBuilder = StringBuilder()
Log.d(TAG, "Processing image with model...")
// Pass the user's caption (may be empty) together with the image
agent.processInput(
input = text, // Use the text the user typed (may be empty)
images = listOf(bitmap)
).collect { event ->
when (event) {
is AgentEvent.Token -> {
responseBuilder.append(event.text)
_responseText.value = responseBuilder.toString()
_uiState.value = UIState.SPEAKING
}
is AgentEvent.ExecutingTool -> {
_uiState.value = UIState.EXECUTING_TOOL
_responseText.value = "🔧 Using ${event.toolName}..."
}
is AgentEvent.ToolResult -> {
// Tool completed
}
is AgentEvent.Complete -> {
val aiMessage = ConversationMessage(
text = event.response,
isUser = false
)
_messages.value = _messages.value + aiMessage
saveState()
speakResponse(event.response)
_uiState.value = UIState.IDLE
}
is AgentEvent.Error -> {
Log.e(TAG, "Agent error: ${event.message}")
_responseText.value = "Error: ${event.message}"
_uiState.value = UIState.ERROR
}
else -> {}
}
}
} catch (e: Exception) {
Log.e(TAG, "Error processing image", e)
_uiState.value = UIState.ERROR
_responseText.value = "Error processing image: ${e.message}"
// Add error message to chat
val errorMessage = ConversationMessage(
text = "❌ Failed to process image: ${e.message}",
isUser = false
)
_messages.value = _messages.value + errorMessage
saveState()
}
}
}
override fun onCleared() {
super.onCleared()
// Any in-flight recording is abandoned with the ViewModel scope; nothing to await here
ttsService.shutdown()
llmEngine.unload()
}
}
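The three-way TTS decision in `speakResponse()` is easy to extract into a pure function for unit testing. A minimal sketch under that assumption — the standalone `shouldSpeak` helper is ours for illustration, not part of this commit; it mirrors the ViewModel's settings flags:

```kotlin
// Sketch of the TTS gating in speakResponse(), with the same three branches:
// TTS disabled -> never speak; auto mode -> speak only when the session started
// with voice input; manual mode -> always speak.
fun shouldSpeak(ttsEnabled: Boolean, ttsAutoMode: Boolean, firstInputWasVoice: Boolean?): Boolean =
    when {
        !ttsEnabled -> false
        ttsAutoMode -> firstInputWasVoice == true
        else -> true
    }

fun main() {
    check(!shouldSpeak(ttsEnabled = false, ttsAutoMode = true, firstInputWasVoice = true))
    check(shouldSpeak(ttsEnabled = true, ttsAutoMode = true, firstInputWasVoice = true))
    check(!shouldSpeak(ttsEnabled = true, ttsAutoMode = true, firstInputWasVoice = null))
    check(shouldSpeak(ttsEnabled = true, ttsAutoMode = false, firstInputWasVoice = false))
}
```

Because the predicate takes `firstInputWasVoice` as a nullable `Boolean?`, the "no input yet" state naturally falls through to silence in auto mode.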
@@ -0,0 +1,36 @@
package com.sleepy.agent.ui.screens
import android.content.Context
import androidx.lifecycle.AbstractSavedStateViewModelFactory
import androidx.lifecycle.SavedStateHandle
import androidx.lifecycle.ViewModel
import androidx.savedstate.SavedStateRegistryOwner
import com.sleepy.agent.di.AppModule
class MainViewModelFactory(
private val appModule: AppModule,
private val context: Context,
owner: SavedStateRegistryOwner
) : AbstractSavedStateViewModelFactory(owner, null) {
@Suppress("UNCHECKED_CAST")
override fun <T : ViewModel> create(
key: String,
modelClass: Class<T>,
handle: SavedStateHandle
): T {
if (modelClass.isAssignableFrom(MainViewModel::class.java)) {
return MainViewModel(
savedStateHandle = handle,
context = context,
audioRecorder = appModule.audioRecorder,
ttsService = appModule.ttsService,
agent = appModule.agent,
llmEngine = appModule.llmEngine,
userSettings = appModule.userSettings,
webSearchTool = appModule.webSearchTool
) as T
}
throw IllegalArgumentException("Unknown ViewModel class: ${modelClass.name}")
}
}
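For context, a hypothetical wiring sketch showing how this factory might be used from an Activity. `SleepyAgentApp` and its `appModule` accessor are assumptions for illustration, not types defined in this commit:

```kotlin
import androidx.activity.ComponentActivity
import androidx.activity.viewModels

class MainActivity : ComponentActivity() {
    // AbstractSavedStateViewModelFactory needs a SavedStateRegistryOwner;
    // the Activity itself serves as `owner`.
    private val viewModel: MainViewModel by viewModels {
        MainViewModelFactory(
            appModule = (application as SleepyAgentApp).appModule, // assumed Application subclass
            context = applicationContext,
            owner = this
        )
    }
}
```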
@@ -0,0 +1,654 @@
package com.sleepy.agent.ui.screens
import android.app.Activity
import android.app.ActivityManager
import android.content.Context
import android.content.Intent
import android.os.Build
import android.provider.Settings
import androidx.activity.compose.rememberLauncherForActivityResult
import androidx.activity.result.contract.ActivityResultContracts
import androidx.compose.foundation.background
import androidx.compose.foundation.layout.Arrangement
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.Row
import androidx.compose.foundation.layout.Spacer
import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.foundation.layout.fillMaxWidth
import androidx.compose.foundation.layout.height
import androidx.compose.foundation.layout.padding
import androidx.compose.foundation.layout.size
import androidx.compose.foundation.layout.width
import androidx.compose.foundation.rememberScrollState
import androidx.compose.foundation.shape.RoundedCornerShape
import androidx.compose.foundation.verticalScroll
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.automirrored.filled.ArrowBack
import androidx.compose.material.icons.filled.Delete
import androidx.compose.material.icons.filled.Download
import androidx.compose.material.icons.filled.FolderOpen
import androidx.compose.material3.Button
import androidx.compose.material3.ButtonDefaults
import androidx.compose.material3.Card
import androidx.compose.material3.CardDefaults
import androidx.compose.material3.CircularProgressIndicator
import androidx.compose.material3.ExperimentalMaterial3Api
import androidx.compose.material3.HorizontalDivider
import androidx.compose.material3.Icon
import androidx.compose.material3.IconButton
import androidx.compose.material3.LinearProgressIndicator
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.OutlinedButton
import androidx.compose.material3.OutlinedTextField
import androidx.compose.material3.Scaffold
import androidx.compose.material3.Switch
import androidx.compose.material3.Text
import androidx.compose.material3.TextButton
import androidx.compose.material3.TopAppBar
import androidx.compose.runtime.Composable
import androidx.compose.runtime.LaunchedEffect
import androidx.compose.runtime.getValue
import androidx.compose.runtime.mutableStateOf
import androidx.compose.runtime.remember
import androidx.compose.runtime.setValue
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.draw.clip
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.text.input.PasswordVisualTransformation
import androidx.compose.ui.text.input.VisualTransformation
import androidx.compose.ui.unit.dp
import androidx.lifecycle.compose.collectAsStateWithLifecycle
import com.sleepy.agent.download.ModelDownloadManager
@OptIn(ExperimentalMaterial3Api::class)
@Composable
fun SettingsScreen(
onNavigateBack: () -> Unit,
viewModel: SettingsViewModel,
onRequestOverlayPermission: () -> Unit = {},
onRequestMediaProjection: () -> Unit = {}
) {
val uiState by viewModel.uiState.collectAsStateWithLifecycle()
val context = LocalContext.current
val filePickerLauncher = rememberLauncherForActivityResult(
ActivityResultContracts.StartActivityForResult()
) { result ->
if (result.resultCode == Activity.RESULT_OK) {
result.data?.data?.let { uri ->
context.contentResolver.takePersistableUriPermission(
uri,
Intent.FLAG_GRANT_READ_URI_PERMISSION
)
viewModel.onModelFileSelected(uri.toString())
}
}
}
LaunchedEffect(Unit) {
viewModel.loadSettings()
}
Scaffold(
topBar = {
TopAppBar(
title = { Text("Settings") },
navigationIcon = {
IconButton(onClick = onNavigateBack) {
Icon(
imageVector = Icons.AutoMirrored.Filled.ArrowBack,
contentDescription = "Back"
)
}
}
)
}
) { paddingValues ->
Column(
modifier = Modifier
.fillMaxSize()
.padding(paddingValues)
.verticalScroll(rememberScrollState())
.padding(16.dp),
verticalArrangement = Arrangement.spacedBy(16.dp)
) {
// Model Section
ModelSection(
uiState = uiState,
viewModel = viewModel,
onSelectModel = {
val intent = Intent(Intent.ACTION_OPEN_DOCUMENT).apply {
addCategory(Intent.CATEGORY_OPENABLE)
type = "*/*"
putExtra(Intent.EXTRA_MIME_TYPES, arrayOf("application/octet-stream", "*/*"))
}
filePickerLauncher.launch(intent)
},
onLoadModel = { viewModel.loadModel() }
)
HorizontalDivider()
// Server Section
ServerSection(
searchServerUrl = uiState.searchServerUrl,
delegateServerUrl = uiState.delegateServerUrl,
onSearchServerChange = { viewModel.setSearchServerUrl(it) },
onDelegateServerChange = { viewModel.setDelegateServerUrl(it) }
)
HorizontalDivider()
// TTS Section
TtsSection(
enabled = uiState.ttsEnabled,
autoMode = uiState.ttsAutoMode,
onEnabledChange = { viewModel.setTtsEnabled(it) },
onAutoModeChange = { viewModel.setTtsAutoMode(it) }
)
HorizontalDivider()
// Experimental Features
ExperimentalSection(
floatingButtonEnabled = uiState.floatingButtonEnabled,
overlayPermissionGranted = Settings.canDrawOverlays(context),
onFloatingButtonChange = { enabled ->
if (enabled && !Settings.canDrawOverlays(context)) {
onRequestOverlayPermission()
}
viewModel.setFloatingButtonEnabled(enabled)
},
onRequestMediaProjection = onRequestMediaProjection
)
HorizontalDivider()
// Device Info
DeviceSection()
HorizontalDivider()
// About
AboutSection()
}
}
}
@Composable
private fun ModelSection(
uiState: SettingsUiState,
viewModel: SettingsViewModel,
onSelectModel: () -> Unit,
onLoadModel: () -> Unit
) {
Column(verticalArrangement = Arrangement.spacedBy(12.dp)) {
Text(
text = "Models",
style = MaterialTheme.typography.titleMedium
)
// Current model status
when {
uiState.isLoadingModel -> {
Row(
verticalAlignment = Alignment.CenterVertically,
horizontalArrangement = Arrangement.spacedBy(8.dp)
) {
CircularProgressIndicator(modifier = Modifier.size(20.dp), strokeWidth = 2.dp)
Text("Loading model...", style = MaterialTheme.typography.bodyMedium)
}
}
uiState.modelLoaded -> {
Text(
text = "✓ Model loaded: ${uiState.modelPath.takeLast(40)}",
style = MaterialTheme.typography.bodyMedium,
color = MaterialTheme.colorScheme.primary
)
}
uiState.modelLoadError != null -> {
Text(
text = "${uiState.modelLoadError}",
style = MaterialTheme.typography.bodyMedium,
color = MaterialTheme.colorScheme.error
)
}
}
// Download progress (for variant downloads)
if (uiState.downloadingVariant != null) {
Column {
LinearProgressIndicator(
progress = { uiState.downloadProgress / 100f },
modifier = Modifier.fillMaxWidth()
)
Text(
text = "Downloading ${uiState.downloadingVariant.uppercase()}: ${uiState.downloadProgress.toInt()}%",
style = MaterialTheme.typography.bodySmall
)
}
}
// Gemma 4 E2B Card
ModelCard(
name = "Gemma 4 E2B",
description = "2B params, fastest, good for most tasks (~2.7GB)",
isDownloaded = uiState.isE2BDownloaded,
isSelected = uiState.selectedModelVariant == "e2b",
onSelect = { viewModel.selectModelVariant("e2b") },
onDownload = { viewModel.downloadModel("e2b") },
onDelete = { viewModel.deleteModel("e2b") },
isDownloading = uiState.downloadingVariant == "e2b",
enabled = !uiState.isLoadingModel
)
// Gemma 4 E4B Card
ModelCard(
name = "Gemma 4 E4B",
description = "4B params, better quality, slower (~4.5GB)",
isDownloaded = uiState.isE4BDownloaded,
isSelected = uiState.selectedModelVariant == "e4b",
onSelect = { viewModel.selectModelVariant("e4b") },
onDownload = { viewModel.downloadModel("e4b") },
onDelete = { viewModel.deleteModel("e4b") },
isDownloading = uiState.downloadingVariant == "e4b",
enabled = !uiState.isLoadingModel
)
// Select from file button
OutlinedButton(
onClick = onSelectModel,
enabled = !uiState.isLoadingModel,
modifier = Modifier.fillMaxWidth()
) {
Icon(Icons.Default.FolderOpen, contentDescription = null)
Spacer(Modifier.width(8.dp))
Text("Select from file")
}
// Load button (if model selected but not loaded)
if (uiState.modelPath.isNotEmpty() && !uiState.modelLoaded && !uiState.isLoadingModel) {
Button(
onClick = onLoadModel,
modifier = Modifier.fillMaxWidth()
) {
Text("Load Selected Model")
}
}
}
}
@Composable
private fun ModelCard(
name: String,
description: String,
isDownloaded: Boolean,
isSelected: Boolean,
onSelect: () -> Unit,
onDownload: () -> Unit,
onDelete: () -> Unit,
isDownloading: Boolean,
enabled: Boolean
) {
Card(
modifier = Modifier.fillMaxWidth(),
colors = CardDefaults.cardColors(
containerColor = if (isSelected)
MaterialTheme.colorScheme.primaryContainer
else
MaterialTheme.colorScheme.surfaceVariant
),
border = if (isSelected) {
androidx.compose.foundation.BorderStroke(2.dp, MaterialTheme.colorScheme.primary)
} else null
) {
Column(
modifier = Modifier.padding(16.dp),
verticalArrangement = Arrangement.spacedBy(12.dp)
) {
// Model info
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.SpaceBetween,
verticalAlignment = Alignment.CenterVertically
) {
Column(modifier = Modifier.weight(1f)) {
Text(
text = name,
style = MaterialTheme.typography.titleSmall
)
Text(
text = description,
style = MaterialTheme.typography.bodySmall,
color = MaterialTheme.colorScheme.onSurfaceVariant
)
}
if (isDownloaded) {
Text(
text = "✓",
style = MaterialTheme.typography.titleMedium,
color = MaterialTheme.colorScheme.primary
)
}
}
// Action buttons
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.spacedBy(8.dp)
) {
// Select button (only enabled if downloaded)
OutlinedButton(
onClick = onSelect,
enabled = enabled && isDownloaded && !isSelected,
modifier = Modifier.weight(1f)
) {
Text(if (isSelected) "Selected" else "Select")
}
// Download or Delete button
if (isDownloaded) {
OutlinedButton(
onClick = onDelete,
enabled = enabled && !isDownloading,
modifier = Modifier.weight(1f),
colors = ButtonDefaults.outlinedButtonColors(
contentColor = MaterialTheme.colorScheme.error
)
) {
Icon(Icons.Default.Delete, contentDescription = null, modifier = Modifier.size(18.dp))
Spacer(Modifier.width(4.dp))
Text("Delete")
}
} else {
Button(
onClick = onDownload,
enabled = enabled && !isDownloading,
modifier = Modifier.weight(1f)
) {
if (isDownloading) {
CircularProgressIndicator(
modifier = Modifier.size(16.dp),
strokeWidth = 2.dp,
color = MaterialTheme.colorScheme.onPrimary
)
} else {
Icon(Icons.Default.Download, contentDescription = null, modifier = Modifier.size(18.dp))
}
Spacer(Modifier.width(4.dp))
Text(if (isDownloading) "..." else "Download")
}
}
}
}
}
}
@Composable
private fun ServerSection(
searchServerUrl: String,
delegateServerUrl: String,
onSearchServerChange: (String) -> Unit,
onDelegateServerChange: (String) -> Unit
) {
Column(verticalArrangement = Arrangement.spacedBy(12.dp)) {
Text(
text = "Servers",
style = MaterialTheme.typography.titleMedium
)
// Search Server
OutlinedTextField(
value = searchServerUrl,
onValueChange = onSearchServerChange,
label = { Text("Search Server (SearXNG)") },
placeholder = { Text("http://your-server:8080") },
modifier = Modifier.fillMaxWidth(),
singleLine = true
)
// Delegate Server
OutlinedTextField(
value = delegateServerUrl,
onValueChange = onDelegateServerChange,
label = { Text("Delegate Server (LLM)") },
placeholder = { Text("http://your-server:7777") },
modifier = Modifier.fillMaxWidth(),
singleLine = true
)
Text(
text = "Leave empty to disable server features. URLs are saved automatically.",
style = MaterialTheme.typography.bodySmall,
color = MaterialTheme.colorScheme.onSurfaceVariant
)
}
}
@Composable
private fun TtsSection(
enabled: Boolean,
autoMode: Boolean,
onEnabledChange: (Boolean) -> Unit,
onAutoModeChange: (Boolean) -> Unit
) {
Column(verticalArrangement = Arrangement.spacedBy(12.dp)) {
Text(
text = "Text to Speech",
style = MaterialTheme.typography.titleMedium
)
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.SpaceBetween,
verticalAlignment = Alignment.CenterVertically
) {
Text("Enable TTS")
Switch(checked = enabled, onCheckedChange = onEnabledChange)
}
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.SpaceBetween,
verticalAlignment = Alignment.CenterVertically
) {
Text("Auto-detect mode (voice → speak, text → silent)")
Switch(
checked = autoMode,
onCheckedChange = onAutoModeChange,
enabled = enabled
)
}
}
}
@Composable
private fun ExperimentalSection(
floatingButtonEnabled: Boolean,
overlayPermissionGranted: Boolean,
onFloatingButtonChange: (Boolean) -> Unit,
onRequestMediaProjection: () -> Unit
) {
Card(
modifier = Modifier.fillMaxWidth(),
colors = CardDefaults.cardColors(
containerColor = MaterialTheme.colorScheme.tertiaryContainer.copy(alpha = 0.3f)
)
) {
Column(
modifier = Modifier.padding(16.dp),
verticalArrangement = Arrangement.spacedBy(12.dp)
) {
Text(
text = "🧪 Experimental Features",
style = MaterialTheme.typography.titleMedium
)
// Floating Button
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.SpaceBetween,
verticalAlignment = Alignment.CenterVertically
) {
Column {
Text("Floating Button")
Text(
"Tap to open • Hold for screenshot",
style = MaterialTheme.typography.bodySmall,
color = MaterialTheme.colorScheme.onSurfaceVariant
)
}
Switch(
checked = floatingButtonEnabled,
// Overlay permission is requested by the caller (SettingsScreen) before enabling
onCheckedChange = onFloatingButtonChange
)
}
if (floatingButtonEnabled && !overlayPermissionGranted) {
Text(
text = "⚠️ Overlay permission required. Please enable in system settings.",
style = MaterialTheme.typography.bodySmall,
color = MaterialTheme.colorScheme.error
)
}
// MediaProjection button (for screenshot functionality)
if (floatingButtonEnabled && overlayPermissionGranted) {
OutlinedButton(
onClick = onRequestMediaProjection,
modifier = Modifier.fillMaxWidth()
) {
Text("Enable Screen Capture (for screenshots)")
}
}
Text(
text = "Experimental features may be unstable.",
style = MaterialTheme.typography.bodySmall,
color = MaterialTheme.colorScheme.onSurfaceVariant
)
}
}
}
@Composable
private fun DeviceSection() {
val context = LocalContext.current
val activityManager = context.getSystemService(Context.ACTIVITY_SERVICE) as ActivityManager
val memoryInfo = ActivityManager.MemoryInfo()
activityManager.getMemoryInfo(memoryInfo)
// Total RAM, rounded to the nearest GB (plain integer division would truncate 7.6 GB to "7 GB")
val totalRam = memoryInfo.totalMem
val totalRamGb = Math.round(totalRam / (1024.0 * 1024 * 1024))
// Available RAM
val availableRam = memoryInfo.availMem
val availableRamGb = availableRam / (1024 * 1024 * 1024)
val availableRamMb = availableRam / (1024 * 1024)
Card(
modifier = Modifier.fillMaxWidth(),
colors = CardDefaults.cardColors(
containerColor = MaterialTheme.colorScheme.surfaceVariant
)
) {
Column(
modifier = Modifier.padding(16.dp),
verticalArrangement = Arrangement.spacedBy(8.dp)
) {
Text(
text = "Your Device",
style = MaterialTheme.typography.titleMedium
)
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.SpaceBetween
) {
Text(
text = "Total RAM",
style = MaterialTheme.typography.bodyMedium
)
Text(
text = "${totalRamGb} GB",
style = MaterialTheme.typography.bodyMedium,
color = MaterialTheme.colorScheme.primary
)
}
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.SpaceBetween
) {
Text(
text = "Available RAM",
style = MaterialTheme.typography.bodyMedium
)
Text(
text = if (availableRamGb > 0) "${availableRamGb} GB" else "${availableRamMb} MB",
style = MaterialTheme.typography.bodyMedium,
color = MaterialTheme.colorScheme.primary
)
}
// Device model
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.SpaceBetween
) {
Text(
text = "Device",
style = MaterialTheme.typography.bodyMedium
)
Text(
text = "${Build.MANUFACTURER} ${Build.MODEL}",
style = MaterialTheme.typography.bodyMedium,
color = MaterialTheme.colorScheme.onSurfaceVariant
)
}
// SDK version
Row(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.SpaceBetween
) {
Text(
text = "Android",
style = MaterialTheme.typography.bodyMedium
)
Text(
text = "API ${Build.VERSION.SDK_INT}",
style = MaterialTheme.typography.bodyMedium,
color = MaterialTheme.colorScheme.onSurfaceVariant
)
}
}
}
}
@Composable
private fun AboutSection() {
Column(
modifier = Modifier.fillMaxWidth(),
horizontalAlignment = Alignment.CenterHorizontally
) {
Text(
text = "Sleepy Agent",
style = MaterialTheme.typography.titleMedium
)
Text(
text = "Local LLM inference with Gemma 4",
style = MaterialTheme.typography.bodySmall,
color = MaterialTheme.colorScheme.onSurfaceVariant
)
}
}
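The LITERTLM header check performed in `SettingsViewModel.loadModel()` (further down in this commit) can be sketched as a standalone, API-level-independent helper. `hasLitertlmMagic` is our name for illustration:

```kotlin
import java.io.File

// Reads the first 8 bytes with InputStream.read() (readNBytes needs API 33+ on
// Android) and compares them against the "LITERTLM" container tag.
fun hasLitertlmMagic(file: File): Boolean {
    val magic = ByteArray(8)
    val read = file.inputStream().use { it.read(magic) }
    return read == 8 && String(magic, Charsets.US_ASCII) == "LITERTLM"
}

fun main() {
    val good = File.createTempFile("model", ".litertlm").apply { writeText("LITERTLM payload") }
    val bad = File.createTempFile("model", ".bin").apply { writeText("GGUF") }
    check(hasLitertlmMagic(good))
    check(!hasLitertlmMagic(bad))
    good.delete()
    bad.delete()
}
```

Rejecting files without the expected magic before handing them to the engine gives a clearer error than a deep failure inside native inference code.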
@@ -0,0 +1,601 @@
package com.sleepy.agent.ui.screens
import android.content.Context
import android.net.Uri
import android.util.Log
import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import androidx.work.WorkManager
import com.sleepy.agent.download.ModelDownloadManager
import com.sleepy.agent.download.ModelDownloadWorker
import com.sleepy.agent.inference.LlmEngine
import com.sleepy.agent.settings.ModelSource
import com.sleepy.agent.settings.UserSettings
import io.ktor.client.HttpClient
import io.ktor.client.request.get
import io.ktor.client.statement.HttpResponse
import io.ktor.client.statement.bodyAsText
import io.ktor.http.isSuccess
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.asStateFlow
import kotlinx.coroutines.flow.collectLatest
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.launch
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.jsonArray
import kotlinx.serialization.json.jsonObject
import kotlinx.serialization.json.jsonPrimitive
import java.io.File
data class SettingsUiState(
val modelSource: ModelSource = ModelSource.FILE_PATH,
val modelPath: String = "",
val modelUri: String = "",
val serverEnabled: Boolean = false,
val searchServerUrl: String = "",
val delegateServerUrl: String = "",
val searchServerHealthy: Boolean? = null,
val delegateServerHealthy: Boolean? = null,
val serverModels: List<String> = emptyList(),
val selectedModel: String = "",
val isCheckingSearchHealth: Boolean = false,
val isCheckingDelegateHealth: Boolean = false,
val searchHealthError: String? = null,
val delegateHealthError: String? = null,
val isLoading: Boolean = true,
val isLoadingModel: Boolean = false,
val modelLoaded: Boolean = false,
val modelLoadError: String? = null,
// TTS settings
val ttsEnabled: Boolean = true,
val ttsAutoMode: Boolean = true,
// Download state
val downloadState: ModelDownloadManager.DownloadState = ModelDownloadManager.DownloadState.Idle,
val downloadProgress: Float = 0f,
val downloadedSize: String = "0 MB",
val isModelDownloaded: Boolean = false,
// Experimental features
val floatingButtonEnabled: Boolean = false,
// Model variants (E2B and E4B)
val isE2BDownloaded: Boolean = false,
val isE4BDownloaded: Boolean = false,
val selectedModelVariant: String = "e2b", // "e2b" or "e4b"
val downloadingVariant: String? = null
)
class SettingsViewModel(
private val userSettings: UserSettings,
private val httpClient: HttpClient,
private val llmEngine: LlmEngine,
private val context: Context,
private val downloadManager: ModelDownloadManager = ModelDownloadManager(context)
) : ViewModel() {
companion object {
private const val TAG = "SettingsViewModel"
}
private val _uiState = MutableStateFlow(SettingsUiState())
val uiState: StateFlow<SettingsUiState> = _uiState.asStateFlow()
init {
loadSettings()
observeDownloadState()
observeVariantDownloads()
checkModelDownloaded()
}
private fun observeVariantDownloads() {
val workManager = WorkManager.getInstance(context)
viewModelScope.launch {
// Observe E2B download work
workManager.getWorkInfosByTagFlow("model_download_e2b").collect { workInfos ->
val workInfo = workInfos.firstOrNull()
when (workInfo?.state) {
androidx.work.WorkInfo.State.RUNNING -> {
val progress = workInfo.progress.getFloat(ModelDownloadWorker.PROGRESS, 0f)
_uiState.value = _uiState.value.copy(
downloadingVariant = "e2b",
downloadProgress = progress * 100
)
}
androidx.work.WorkInfo.State.SUCCEEDED -> {
_uiState.value = _uiState.value.copy(
downloadingVariant = null,
isE2BDownloaded = ModelDownloadManager.isE2BDownloaded(context),
downloadProgress = 100f
)
}
androidx.work.WorkInfo.State.FAILED, androidx.work.WorkInfo.State.CANCELLED -> {
_uiState.value = _uiState.value.copy(
downloadingVariant = null
)
}
else -> {}
}
}
}
viewModelScope.launch {
// Observe E4B download work
workManager.getWorkInfosByTagFlow("model_download_e4b").collect { workInfos ->
val workInfo = workInfos.firstOrNull()
when (workInfo?.state) {
androidx.work.WorkInfo.State.RUNNING -> {
val progress = workInfo.progress.getFloat(ModelDownloadWorker.PROGRESS, 0f)
_uiState.value = _uiState.value.copy(
downloadingVariant = "e4b",
downloadProgress = progress * 100
)
}
androidx.work.WorkInfo.State.SUCCEEDED -> {
_uiState.value = _uiState.value.copy(
downloadingVariant = null,
isE4BDownloaded = ModelDownloadManager.isE4BDownloaded(context),
downloadProgress = 100f
)
}
androidx.work.WorkInfo.State.FAILED, androidx.work.WorkInfo.State.CANCELLED -> {
_uiState.value = _uiState.value.copy(
downloadingVariant = null
)
}
else -> {}
}
}
}
}
private fun observeDownloadState() {
viewModelScope.launch {
downloadManager.downloadState.collectLatest { state ->
_uiState.value = _uiState.value.copy(
downloadState = state,
downloadProgress = when (state) {
is ModelDownloadManager.DownloadState.Downloading -> state.progress
is ModelDownloadManager.DownloadState.Completed -> 1f
else -> 0f
}
)
if (state is ModelDownloadManager.DownloadState.Completed ||
state is ModelDownloadManager.DownloadState.Downloading) {
_uiState.value = _uiState.value.copy(
downloadedSize = ModelDownloadManager.getDownloadedSize(context)
)
}
if (state is ModelDownloadManager.DownloadState.Completed) {
checkModelDownloaded()
val modelFile = ModelDownloadManager.getModelFile(context)
setModelPath(modelFile.absolutePath)
}
}
}
}
private fun checkModelDownloaded() {
val isDownloaded = ModelDownloadManager.isModelDownloaded(context)
_uiState.value = _uiState.value.copy(
isModelDownloaded = isDownloaded,
downloadedSize = ModelDownloadManager.getDownloadedSize(context)
)
}
fun loadSettings() {
viewModelScope.launch {
_uiState.value = _uiState.value.copy(isLoading = true)
val modelSource = userSettings.modelSource.first()
val modelPath = userSettings.modelPath.first()
val serverEnabled = userSettings.enableServerDelegation.first()
val searchServerUrl = userSettings.searchServerUrl.first()
val delegateServerUrl = userSettings.delegateServerUrl.first()
val selectedModel = userSettings.selectedServerModel.first()
val ttsEnabled = userSettings.ttsEnabled.first()
val ttsAutoMode = userSettings.ttsAutoMode.first()
val floatingButtonEnabled = userSettings.floatingButtonEnabled.first()
val finalModelPath = if (modelPath.isEmpty() && ModelDownloadManager.isModelDownloaded(context)) {
ModelDownloadManager.getModelFile(context).absolutePath
} else {
modelPath
}
_uiState.value = SettingsUiState(
modelSource = modelSource,
modelPath = finalModelPath,
modelUri = finalModelPath,
serverEnabled = serverEnabled,
searchServerUrl = searchServerUrl,
delegateServerUrl = delegateServerUrl,
selectedModel = selectedModel,
isLoading = false,
modelLoaded = llmEngine.isLoaded(),
isModelDownloaded = ModelDownloadManager.isModelDownloaded(context),
downloadedSize = ModelDownloadManager.getDownloadedSize(context),
ttsEnabled = ttsEnabled,
ttsAutoMode = ttsAutoMode,
floatingButtonEnabled = floatingButtonEnabled,
isE2BDownloaded = ModelDownloadManager.isE2BDownloaded(context),
isE4BDownloaded = ModelDownloadManager.isE4BDownloaded(context)
)
}
}
fun setModelSource(source: ModelSource) {
_uiState.value = _uiState.value.copy(modelSource = source)
viewModelScope.launch {
userSettings.setModelSource(source)
}
}
fun setModelPath(path: String) {
_uiState.value = _uiState.value.copy(modelPath = path)
viewModelScope.launch {
userSettings.setModelPath(path)
}
}
fun setModelUri(uriString: String) {
_uiState.value = _uiState.value.copy(
modelUri = uriString,
modelPath = uriString,
modelLoaded = false,
modelLoadError = null
)
viewModelScope.launch {
userSettings.setModelPath(uriString)
}
}
fun loadModel() {
viewModelScope.launch {
_uiState.value = _uiState.value.copy(
isLoadingModel = true,
modelLoadError = null
)
try {
val modelPath = if (_uiState.value.modelPath.isEmpty() &&
ModelDownloadManager.isModelDownloaded(context)) {
ModelDownloadManager.getModelFile(context).absolutePath
} else {
_uiState.value.modelPath
}
if (modelPath.isEmpty()) {
_uiState.value = _uiState.value.copy(
isLoadingModel = false,
modelLoadError = "No model file selected"
)
return@launch
}
// getPathFromUri copies content:// URIs into app storage and returns null on
// failure; plain file paths fall through unchanged.
val finalPath = getPathFromUri(modelPath)
    ?: modelPath.takeUnless { it.startsWith("content:") }
if (finalPath == null) {
_uiState.value = _uiState.value.copy(
isLoadingModel = false,
modelLoadError = "Cannot access model file. Please select a valid .litertlm file."
)
return@launch
}
val modelFile = File(finalPath)
if (!modelFile.exists()) {
_uiState.value = _uiState.value.copy(
isLoadingModel = false,
modelLoadError = "Model file does not exist: $finalPath"
)
return@launch
}
if (modelFile.length() == 0L) {
_uiState.value = _uiState.value.copy(
isLoadingModel = false,
modelLoadError = "Model file is empty"
)
return@launch
}
// InputStream.readNBytes needs API 33+; a plain read() works on all supported devices.
val magicBytes = ByteArray(8)
modelFile.inputStream().use { it.read(magicBytes) }
val magicString = String(magicBytes)
Log.d(TAG, "File magic bytes: $magicString")
if (!magicString.startsWith("LITERTLM")) {
_uiState.value = _uiState.value.copy(
isLoadingModel = false,
modelLoadError = "Invalid file format. Expected LITERTLM, got: $magicString"
)
return@launch
}
Log.d(TAG, "Loading model from: $finalPath (${modelFile.length()} bytes)")
val result = llmEngine.loadModel(finalPath)
result.fold(
onSuccess = {
_uiState.value = _uiState.value.copy(
isLoadingModel = false,
modelLoaded = true,
modelLoadError = null,
modelPath = finalPath
)
userSettings.setModelPath(finalPath)
},
onFailure = { error ->
Log.e(TAG, "Failed to load model", error)
_uiState.value = _uiState.value.copy(
isLoadingModel = false,
modelLoaded = false,
modelLoadError = "Failed to load model: ${error.message}"
)
}
)
} catch (e: Exception) {
Log.e(TAG, "Error loading model", e)
_uiState.value = _uiState.value.copy(
isLoadingModel = false,
modelLoaded = false,
modelLoadError = "Error: ${e.message}"
)
}
}
}
// Copying a multi-GB model is blocking I/O, so hop off the main thread.
private suspend fun getPathFromUri(uriString: String): String? =
kotlinx.coroutines.withContext(kotlinx.coroutines.Dispatchers.IO) {
try {
val uri = Uri.parse(uriString)
Log.d(TAG, "URI: $uri")
if (uri.scheme == "file") {
return@withContext uri.path
}
var displayName: String? = null
context.contentResolver.query(uri, null, null, null, null)?.use { cursor ->
if (cursor.moveToFirst()) {
val nameIndex = cursor.getColumnIndex(android.provider.OpenableColumns.DISPLAY_NAME)
if (nameIndex >= 0) {
displayName = cursor.getString(nameIndex)
}
}
}
val originalFileName = displayName ?: uri.lastPathSegment?.substringAfterLast("/") ?: "model.litertlm"
val fileName = if (originalFileName.endsWith(".litertlm", ignoreCase = true)) {
originalFileName
} else {
"$originalFileName.litertlm"
}
val modelsDir = File(context.filesDir, "models")
modelsDir.mkdirs()
val destFile = File(modelsDir, fileName)
val input = context.contentResolver.openInputStream(uri)
    ?: return@withContext null
input.use { stream ->
destFile.outputStream().use { output ->
stream.copyTo(output)
}
}
if (!destFile.exists()) return@withContext null
destFile.absolutePath
} catch (e: Exception) {
Log.e(TAG, "Error copying file", e)
null
}
}
fun startModelDownload() {
downloadManager.startDownload()
}
fun cancelDownload() {
downloadManager.cancelDownload()
}
fun deleteDownloadedModel() {
downloadManager.deleteModel()
checkModelDownloaded()
}
fun setServerEnabled(enabled: Boolean) {
_uiState.value = _uiState.value.copy(serverEnabled = enabled)
viewModelScope.launch {
userSettings.setEnableServerDelegation(enabled)
}
}
fun setSearchServerUrl(url: String) {
_uiState.value = _uiState.value.copy(searchServerUrl = url, searchServerHealthy = null)
viewModelScope.launch {
userSettings.setSearchServerUrl(url)
}
}
fun setDelegateServerUrl(url: String) {
_uiState.value = _uiState.value.copy(delegateServerUrl = url, delegateServerHealthy = null)
viewModelScope.launch {
userSettings.setDelegateServerUrl(url)
}
}
fun saveSearchServerUrl() {
viewModelScope.launch {
userSettings.setSearchServerUrl(_uiState.value.searchServerUrl)
}
}
fun saveDelegateServerUrl() {
viewModelScope.launch {
userSettings.setDelegateServerUrl(_uiState.value.delegateServerUrl)
}
}
fun checkSearchServerHealth() {
viewModelScope.launch {
_uiState.value = _uiState.value.copy(
isCheckingSearchHealth = true,
searchHealthError = null,
searchServerHealthy = null
)
try {
val url = _uiState.value.searchServerUrl.trim().trimEnd('/')
val response: HttpResponse = httpClient.get("$url/search?q=test&format=json")
_uiState.value = _uiState.value.copy(
searchServerHealthy = response.status.isSuccess(),
isCheckingSearchHealth = false
)
if (response.status.isSuccess()) {
userSettings.setSearchServerUrl(url)
} else {
_uiState.value = _uiState.value.copy(
searchHealthError = "Server returned ${response.status}"
)
}
} catch (e: Exception) {
_uiState.value = _uiState.value.copy(
searchServerHealthy = false,
isCheckingSearchHealth = false,
searchHealthError = e.message ?: "Unknown error"
)
}
}
}
fun checkDelegateServerHealth() {
viewModelScope.launch {
_uiState.value = _uiState.value.copy(
isCheckingDelegateHealth = true,
delegateHealthError = null,
delegateServerHealthy = null
)
try {
val url = _uiState.value.delegateServerUrl.trim().trimEnd('/')
val response: HttpResponse = httpClient.get("$url/v1/models")
if (response.status.isSuccess()) {
val body = response.bodyAsText()
val models = parseModelsFromResponse(body)
_uiState.value = _uiState.value.copy(
delegateServerHealthy = true,
serverModels = models,
isCheckingDelegateHealth = false,
selectedModel = models.firstOrNull() ?: ""
)
userSettings.setDelegateServerUrl(url)
models.firstOrNull()?.let { userSettings.setSelectedServerModel(it) }
} else {
_uiState.value = _uiState.value.copy(
delegateServerHealthy = false,
isCheckingDelegateHealth = false,
delegateHealthError = "Server returned ${response.status}"
)
}
} catch (e: Exception) {
_uiState.value = _uiState.value.copy(
delegateServerHealthy = false,
isCheckingDelegateHealth = false,
delegateHealthError = e.message ?: "Unknown error"
)
}
}
}
private fun parseModelsFromResponse(json: String): List<String> {
return try {
val jsonElement = Json.parseToJsonElement(json)
val jsonObject = jsonElement.jsonObject
val data = jsonObject["data"]?.jsonArray
data?.mapNotNull {
it.jsonObject["id"]?.jsonPrimitive?.content
} ?: emptyList()
} catch (e: Exception) {
emptyList()
}
}
fun setSelectedModel(model: String) {
_uiState.value = _uiState.value.copy(selectedModel = model)
viewModelScope.launch {
userSettings.setSelectedServerModel(model)
}
}
// TTS settings
fun setTtsEnabled(enabled: Boolean) {
_uiState.value = _uiState.value.copy(ttsEnabled = enabled)
viewModelScope.launch {
userSettings.setTtsEnabled(enabled)
}
}
fun setTtsAutoMode(auto: Boolean) {
_uiState.value = _uiState.value.copy(ttsAutoMode = auto)
viewModelScope.launch {
userSettings.setTtsAutoMode(auto)
}
}
// Floating button (experimental)
fun setFloatingButtonEnabled(enabled: Boolean) {
_uiState.value = _uiState.value.copy(floatingButtonEnabled = enabled)
viewModelScope.launch {
userSettings.setFloatingButtonEnabled(enabled)
}
}
fun onModelFileSelected(uri: String) {
setModelUri(uri)
}
// Model variant selection (E2B / E4B)
fun selectModelVariant(variant: String) {
_uiState.value = _uiState.value.copy(selectedModelVariant = variant)
// Set the model path based on variant
val path = when (variant) {
"e2b" -> ModelDownloadManager.getE2BModelFile(context)?.absolutePath ?: ""
"e4b" -> ModelDownloadManager.getE4BModelFile(context)?.absolutePath ?: ""
else -> ""
}
if (path.isNotEmpty()) {
setModelPath(path)
}
}
fun downloadModel(variant: String) {
viewModelScope.launch {
_uiState.value = _uiState.value.copy(downloadingVariant = variant)
ModelDownloadManager.downloadModelVariant(context, variant)
_uiState.value = _uiState.value.copy(
downloadingVariant = null,
isE2BDownloaded = ModelDownloadManager.isE2BDownloaded(context),
isE4BDownloaded = ModelDownloadManager.isE4BDownloaded(context)
)
}
}
fun deleteModel(variant: String) {
viewModelScope.launch {
ModelDownloadManager.deleteModelVariant(context, variant)
_uiState.value = _uiState.value.copy(
isE2BDownloaded = ModelDownloadManager.isE2BDownloaded(context),
isE4BDownloaded = ModelDownloadManager.isE4BDownloaded(context)
)
}
}
}
@@ -0,0 +1,11 @@
package com.sleepy.agent.ui.theme
import androidx.compose.ui.graphics.Color
val Purple80 = Color(0xFFD0BCFF)
val PurpleGrey80 = Color(0xFFCCC2DC)
val Pink80 = Color(0xFFEFB8C8)
val Purple40 = Color(0xFF6650a4)
val PurpleGrey40 = Color(0xFF625b71)
val Pink40 = Color(0xFF7D5260)
@@ -0,0 +1,59 @@
package com.sleepy.agent.ui.theme
import android.app.Activity
import android.os.Build
import androidx.compose.foundation.isSystemInDarkTheme
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.darkColorScheme
import androidx.compose.material3.dynamicDarkColorScheme
import androidx.compose.material3.dynamicLightColorScheme
import androidx.compose.material3.lightColorScheme
import androidx.compose.runtime.Composable
import androidx.compose.runtime.SideEffect
import androidx.compose.ui.graphics.toArgb
import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.platform.LocalView
import androidx.core.view.WindowCompat
private val DarkColorScheme = darkColorScheme(
primary = Purple80,
secondary = PurpleGrey80,
tertiary = Pink80
)
private val LightColorScheme = lightColorScheme(
primary = Purple40,
secondary = PurpleGrey40,
tertiary = Pink40
)
@Composable
fun SleepyAgentTheme(
darkTheme: Boolean = isSystemInDarkTheme(),
dynamicColor: Boolean = true,
content: @Composable () -> Unit
) {
val colorScheme = when {
dynamicColor && Build.VERSION.SDK_INT >= Build.VERSION_CODES.S -> {
val context = LocalContext.current
if (darkTheme) dynamicDarkColorScheme(context) else dynamicLightColorScheme(context)
}
darkTheme -> DarkColorScheme
else -> LightColorScheme
}
val view = LocalView.current
if (!view.isInEditMode) {
SideEffect {
val window = (view.context as Activity).window
window.statusBarColor = colorScheme.primary.toArgb()
WindowCompat.getInsetsController(window, view).isAppearanceLightStatusBars = darkTheme
}
}
MaterialTheme(
colorScheme = colorScheme,
typography = Typography,
content = content
)
}
@@ -0,0 +1,17 @@
package com.sleepy.agent.ui.theme
import androidx.compose.material3.Typography
import androidx.compose.ui.text.TextStyle
import androidx.compose.ui.text.font.FontFamily
import androidx.compose.ui.text.font.FontWeight
import androidx.compose.ui.unit.sp
val Typography = Typography(
bodyLarge = TextStyle(
fontFamily = FontFamily.Default,
fontWeight = FontWeight.Normal,
fontSize = 16.sp,
lineHeight = 24.sp,
letterSpacing = 0.5.sp
)
)
@@ -0,0 +1,10 @@
<?xml version="1.0" encoding="utf-8"?>
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="108dp"
android:height="108dp"
android:viewportWidth="108"
android:viewportHeight="108">
<path
android:fillColor="#0F3460"
android:pathData="M0,0h108v108h-108z"/>
</vector>
@@ -0,0 +1,54 @@
<?xml version="1.0" encoding="utf-8"?>
<vector xmlns:android="http://schemas.android.com/apk/res/android"
android:width="108dp"
android:height="108dp"
android:viewportWidth="108"
android:viewportHeight="108">
<!-- Background circle -->
<path
android:fillColor="#1A1A2E"
android:pathData="M54,54m-40,0a40,40 0,1 1,80 0a40,40 0,1 1,-80 0"/>
<!-- Robot head -->
<path
android:fillColor="#16213E"
android:strokeColor="#E94560"
android:strokeWidth="2"
android:pathData="M34,38h40v32h-40z"/>
<!-- Antenna -->
<path
android:strokeColor="#E94560"
android:strokeWidth="2"
android:pathData="M54,38L54,28"/>
<!-- Antenna ball -->
<path
android:fillColor="#E94560"
android:pathData="M54,26m-3,0a3,3 0,1 1,6 0a3,3 0,1 1,-6 0"/>
<!-- Eyes (sleepy/closed) -->
<path
android:strokeColor="#E94560"
android:strokeWidth="2"
android:strokeLineCap="round"
android:pathData="M42,52h6"/>
<path
android:strokeColor="#E94560"
android:strokeWidth="2"
android:strokeLineCap="round"
android:pathData="M60,52h6"/>
<!-- Mouth -->
<path
android:strokeColor="#533483"
android:strokeWidth="2"
android:strokeLineCap="round"
android:pathData="M46,62c2.5,2 7.5,2 10,0"/>
<!-- Zzz -->
<path
android:fillColor="#E94560"
android:pathData="M74,32h4l-4,5h4"
android:strokeColor="#E94560"
android:strokeWidth="1.5"/>
</vector>
@@ -0,0 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<adaptive-icon xmlns:android="http://schemas.android.com/apk/res/android">
<background android:drawable="@drawable/ic_launcher_background"/>
<foreground android:drawable="@drawable/ic_launcher_foreground"/>
</adaptive-icon>
@@ -0,0 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<adaptive-icon xmlns:android="http://schemas.android.com/apk/res/android">
<background android:drawable="@drawable/ic_launcher_background"/>
<foreground android:drawable="@drawable/ic_launcher_foreground"/>
</adaptive-icon>
@@ -0,0 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<adaptive-icon xmlns:android="http://schemas.android.com/apk/res/android">
<background android:drawable="@drawable/ic_launcher_background"/>
<foreground android:drawable="@drawable/ic_launcher_foreground"/>
</adaptive-icon>
@@ -0,0 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<adaptive-icon xmlns:android="http://schemas.android.com/apk/res/android">
<background android:drawable="@drawable/ic_launcher_background"/>
<foreground android:drawable="@drawable/ic_launcher_foreground"/>
</adaptive-icon>
@@ -0,0 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<adaptive-icon xmlns:android="http://schemas.android.com/apk/res/android">
<background android:drawable="@drawable/ic_launcher_background"/>
<foreground android:drawable="@drawable/ic_launcher_foreground"/>
</adaptive-icon>
@@ -0,0 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<adaptive-icon xmlns:android="http://schemas.android.com/apk/res/android">
<background android:drawable="@drawable/ic_launcher_background"/>
<foreground android:drawable="@drawable/ic_launcher_foreground"/>
</adaptive-icon>
@@ -0,0 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<adaptive-icon xmlns:android="http://schemas.android.com/apk/res/android">
<background android:drawable="@drawable/ic_launcher_background"/>
<foreground android:drawable="@drawable/ic_launcher_foreground"/>
</adaptive-icon>
@@ -0,0 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<adaptive-icon xmlns:android="http://schemas.android.com/apk/res/android">
<background android:drawable="@drawable/ic_launcher_background"/>
<foreground android:drawable="@drawable/ic_launcher_foreground"/>
</adaptive-icon>
@@ -0,0 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<adaptive-icon xmlns:android="http://schemas.android.com/apk/res/android">
<background android:drawable="@drawable/ic_launcher_background"/>
<foreground android:drawable="@drawable/ic_launcher_foreground"/>
</adaptive-icon>
@@ -0,0 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<adaptive-icon xmlns:android="http://schemas.android.com/apk/res/android">
<background android:drawable="@drawable/ic_launcher_background"/>
<foreground android:drawable="@drawable/ic_launcher_foreground"/>
</adaptive-icon>
@@ -0,0 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<adaptive-icon xmlns:android="http://schemas.android.com/apk/res/android">
<background android:drawable="@drawable/ic_launcher_background"/>
<foreground android:drawable="@drawable/ic_launcher_foreground"/>
</adaptive-icon>
@@ -0,0 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<adaptive-icon xmlns:android="http://schemas.android.com/apk/res/android">
<background android:drawable="@drawable/ic_launcher_background"/>
<foreground android:drawable="@drawable/ic_launcher_foreground"/>
</adaptive-icon>
@@ -0,0 +1,3 @@
<resources>
<string name="app_name">Sleepy Agent</string>
</resources>
@@ -0,0 +1,3 @@
<resources>
<style name="Theme.SleepyAgent" parent="android:Theme.Material.Light.NoActionBar" />
</resources>
@@ -0,0 +1,4 @@
<?xml version="1.0" encoding="utf-8"?>
<full-backup-content>
<exclude domain="sharedpref" path="."/>
</full-backup-content>
@@ -0,0 +1,7 @@
<?xml version="1.0" encoding="utf-8"?>
<data-extraction-rules>
<cloud-backup>
<exclude domain="sharedpref" path="."/>
<exclude domain="root" path="."/>
</cloud-backup>
</data-extraction-rules>
@@ -0,0 +1,21 @@
<?xml version="1.0" encoding="utf-8"?>
<network-security-config>
<!-- Allow cleartext traffic to home server -->
<domain-config cleartextTrafficPermitted="true">
<domain includeSubdomains="true">sleepy-think</domain>
<domain includeSubdomains="true">sleepy-wsl</domain>
<domain includeSubdomains="true">192.168.1.100</domain>
<domain includeSubdomains="true">10.0.0.2</domain>
<domain includeSubdomains="true">localhost</domain>
<domain includeSubdomains="true">127.0.0.1</domain>
</domain-config>
<!-- Allow cleartext for search server too (common for local SearXNG) -->
<domain-config cleartextTrafficPermitted="true">
<domain includeSubdomains="true">searx</domain>
<domain includeSubdomains="true">search</domain>
</domain-config>
<!-- Default: require HTTPS for everything else -->
<base-config cleartextTrafficPermitted="false" />
</network-security-config>
@@ -0,0 +1,215 @@
package com.sleepy.agent
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.jsonObject
import kotlinx.serialization.json.jsonPrimitive
/**
* Standalone test harness for debugging tool call parsing.
 * Compile and run as a regular Kotlin program (main() below); requires
 * kotlinx-serialization-json on the classpath. Note that `kotlinc -script`
 * expects a .kts file, so rename it first if you want to run it as a script.
*/
class ToolCallParserTest {
private val jsonParser = Json { ignoreUnknownKeys = true }
// Test the exact format from user's bug report
fun testUserReportedFormat() {
val input = """<|tool_call>call:tool_call{"name": "web_search", "arguments": {"query": "latest litelm-rt model"}}<tool_call|>"""
println("=== Testing User Reported Format ===")
println("Input: $input")
println()
val result = parseToolCalls(input)
println("Parsed ${result.size} tool call(s):")
result.forEach { tc ->
println(" - name: ${tc.name}")
println(" args: ${tc.arguments}")
}
println()
}
// Test various formats the model might output
fun testVariousFormats() {
val testCases = listOf(
// Format 1: Original simple format
"<tool_call>{\"name\": \"web_search\", \"arguments\": {\"query\": \"test\"}}</tool_call>",
// Format 2: Gemma 4 format with call: prefix
"<|tool_call>call:web_search{\"name\": \"web_search\", \"arguments\": {\"query\": \"hello\"}}<tool_call|>",
// Format 3: User's exact bug report
"<|tool_call>call:tool_call{\"name\": \"web_search\", \"arguments\": {\"query\": \"latest litelm-rt model\"}}<tool_call|>",
// Format 4: With text before tool call
"I'll search for that. <|tool_call>call:web_search{\"name\": \"web_search\", \"arguments\": {\"query\": \"weather\"}}<tool_call|>",
// Format 5: Multiple tool calls
"""First search: <|tool_call>call:web_search{"name": "web_search", "arguments": {"query": "A"}}<tool_call|>
Second search: <|tool_call>call:web_search{"name": "web_search", "arguments": {"query": "B"}}<tool_call|>""".trimIndent(),
// Format 6: Nested quotes (potential edge case)
"<|tool_call>call:web_search{\"name\": \"web_search\", \"arguments\": {\"query\": \"what's new\"}}<tool_call|>",
// Format 7: No spaces in JSON
"<|tool_call>call:web_search{\"name\":\"web_search\",\"arguments\":{\"query\":\"test\"}}<tool_call|>",
// Format 8: Tool call in middle of sentence
"Let me <|tool_call>call:web_search{\"name\": \"web_search\", \"arguments\": {\"query\": \"help\"}}<tool_call|> for you."
)
println("=== Testing Various Formats ===")
testCases.forEachIndexed { index, input ->
println("Test ${index + 1}:")
println(" Input: ${input.take(80)}${if (input.length > 80) "..." else ""}")
val result = parseToolCalls(input)
if (result.isEmpty()) {
println(" Result: FAILED - No tool calls parsed")
} else {
println(" Result: SUCCESS - ${result.size} tool call(s)")
result.forEach { tc ->
println(" - ${tc.name}: ${tc.arguments}")
}
}
println()
}
}
// Simulate full agent flow
fun simulateAgentFlow() {
println("=== Simulating Agent Tool Calling Flow ===")
val userInput = "What's the weather in Tokyo?"
println("User: $userInput")
println()
// Simulate model response
val modelResponse = """I'll check the weather for you.
<|tool_call>call:web_search{"name": "web_search", "arguments": {"query": "Tokyo weather today"}}<tool_call|>""".trimIndent()
println("Model raw output:")
println(modelResponse)
println()
// Parse tool calls
val toolCalls = parseToolCalls(modelResponse)
println("Parsed ${toolCalls.size} tool call(s)")
if (toolCalls.isNotEmpty()) {
println("\nExecuting tools...")
toolCalls.forEach { tc ->
println(" Executing: ${tc.name} with args ${tc.arguments}")
val result = executeDummyTool(tc.name, tc.arguments)
println(" Result: $result")
}
println("\nFinal response would be generated with tool results in context")
} else {
println("ERROR: No tool calls found in model response!")
println("This is the bug we need to fix.")
}
}
// The actual parsing logic (copy of Agent.kt implementation)
private fun parseToolCalls(response: String): List<ToolCall> {
val toolCalls = mutableListOf<ToolCall>()
var currentIndex = 0
// Support multiple tool call formats
val patterns = listOf(
Pair("<|tool_call>call:", "<tool_call|>"),
Pair("<tool_call>", "</tool_call>")
)
while (true) {
// Find the next tool call using any of the supported patterns
var foundPattern: Pair<String, String>? = null
var startIndex = -1
for ((startTag, endTag) in patterns) {
val idx = response.indexOf(startTag, currentIndex)
if (idx != -1 && (startIndex == -1 || idx < startIndex)) {
startIndex = idx
foundPattern = Pair(startTag, endTag)
}
}
if (foundPattern == null || startIndex == -1) break
val (startTag, endTag) = foundPattern
val endIndex = response.indexOf(endTag, startIndex + startTag.length)
if (endIndex == -1) break
// Extract content between tags
val contentStart = startIndex + startTag.length
val content = response.substring(contentStart, endIndex).trim()
try {
// For <|tool_call>call: format, find where JSON starts
val jsonStr = if (content.startsWith("tool_call")) {
// Skip "tool_call" prefix and find JSON
content.substring(content.indexOf("{"))
} else if (content.startsWith("{")) {
// Already JSON
content
} else {
// Try to find JSON anywhere in content
val jsonStart = content.indexOf("{")
if (jsonStart != -1) content.substring(jsonStart) else content
}
println(" [DEBUG] Parsing JSON: $jsonStr")
val jsonObject = jsonParser.parseToJsonElement(jsonStr).jsonObject
// `continue` is not allowed inside a lambda, so check the name explicitly.
val name = jsonObject["name"]?.jsonPrimitive?.content
if (name == null) {
println("  [DEBUG] Missing 'name' field in: $jsonStr")
currentIndex = endIndex + endTag.length
continue
}
val arguments = jsonObject["arguments"]?.jsonObject?.let { argsObj ->
argsObj.entries.associate { (k, v) ->
k to v.jsonPrimitive.content
}
} ?: emptyMap()
toolCalls.add(ToolCall(name = name, arguments = arguments))
println(" [DEBUG] Successfully parsed: $name")
} catch (e: Exception) {
println(" [DEBUG] Failed to parse: $content - ${e.message}")
}
currentIndex = endIndex + endTag.length
}
return toolCalls
}
private fun executeDummyTool(name: String, arguments: Map<String, String>): String {
return when (name) {
"web_search" -> "Found results for: ${arguments["query"]}"
"home_server" -> "Executed: ${arguments["command"]}"
else -> "Unknown tool: $name"
}
}
data class ToolCall(val name: String, val arguments: Map<String, String>)
}
// Main entry point
fun main() {
val test = ToolCallParserTest()
println("╔════════════════════════════════════════════════════════════╗")
println("║ Sleepy Agent - Tool Call Parser Debug Harness ║")
println("╚════════════════════════════════════════════════════════════╝")
println()
test.testUserReportedFormat()
test.testVariousFormats()
test.simulateAgentFlow()
println("\n=== Tests Complete ===")
}
@@ -0,0 +1,9 @@
// Root build.gradle.kts
plugins {
id("com.android.application") version "8.9.1" apply false
id("com.android.library") version "8.9.1" apply false
id("org.jetbrains.kotlin.android") version "2.3.20" apply false
id("org.jetbrains.kotlin.plugin.serialization") version "2.3.20" apply false
id("com.google.devtools.ksp") version "2.3.6" apply false
id("org.jetbrains.kotlin.plugin.compose") version "2.3.20" apply false
}
@@ -0,0 +1,153 @@
# Development Guide
## Building from Source
### Prerequisites
- Android Studio Ladybug (2024.2.1) or newer
- JDK 17
- Android SDK with API 35
- A device or emulator running Android 8.0+ (API 26)
### Build Steps
1. **Clone the repository**
```bash
git clone https://github.com/sleepyeldrazi/sleepy-agent.git
cd sleepy-agent
```
2. **Open in Android Studio**
- Open the project folder in Android Studio
- Let Gradle sync complete
3. **Build Debug APK**
```bash
./gradlew :app:assembleDebug
```
Output: `app/build/outputs/apk/debug/app-debug.apk`
4. **Build Release APK**
```bash
./gradlew :app:assembleRelease
```
Output: `app/build/outputs/apk/release/app-release-unsigned.apk`
To sign the release APK:
```bash
# Generate a keystore (one-time)
keytool -genkey -v -keystore my-release-key.keystore -alias sleepyagent -keyalg RSA -keysize 2048 -validity 10000
# Align the APK (must run before signing with apksigner)
zipalign -v 4 app-release-unsigned.apk sleepy-agent-aligned.apk
# Sign with apksigner; jarsigner only produces v1 signatures, which devices
# running Android 11+ reject
apksigner sign --ks my-release-key.keystore --ks-key-alias sleepyagent --out sleepy-agent-signed.apk sleepy-agent-aligned.apk
# Verify the signature
apksigner verify sleepy-agent-signed.apk
```
## Adding Your SearXNG Server
To set up a SearXNG server, refer to the [SearXNG documentation](https://github.com/searxng/searxng) or [Docker setup guide](https://docs.searxng.org/admin/installation-docker.html).
By default, the app allows cleartext (HTTP) connections to:
- `sleepy-think` / `sleepy-wsl` (local hostnames)
- `192.168.1.100` / `10.0.0.2` (local network)
- `localhost` / `127.0.0.1`
- `searx` / `search` (common SearXNG hostnames)
If your SearXNG server uses a different IP or hostname, you need to update the network security config.
### Step 1: Edit `network_security_config.xml`
File: `app/src/main/res/xml/network_security_config.xml`
Add your domain or IP to the `<domain includeSubdomains="true">` list:
```xml
<?xml version="1.0" encoding="utf-8"?>
<network-security-config>
<base-config cleartextTrafficPermitted="false">
<trust-anchors>
<certificates src="system"/>
</trust-anchors>
</base-config>
<!-- Allow cleartext to local/SearXNG servers -->
<domain-config cleartextTrafficPermitted="true">
<!-- Default allowed domains -->
<domain includeSubdomains="true">sleepy-think</domain>
<domain includeSubdomains="true">192.168.1.100</domain>
<domain includeSubdomains="true">localhost</domain>
<!-- ADD YOUR SERVER HERE -->
<domain includeSubdomains="true">192.168.1.50</domain>
<domain includeSubdomains="true">my-searx-server</domain>
<domain includeSubdomains="true">searx.mydomain.com</domain>
</domain-config>
</network-security-config>
```
### Step 2: Rebuild and Install
```bash
./gradlew :app:assembleDebug
adb install -r app/build/outputs/apk/debug/app-debug.apk
```
### Alternative: Use HTTPS
If you have HTTPS set up on your SearXNG server (recommended), no config changes are needed. Just enter the HTTPS URL in Settings:
```
https://your-searx-server.com
```
## Common Issues
### Gradle Sync Fails
- Make sure you're using JDK 17
- Check that Android SDK API 35 is installed
### App Crashes on Launch
- Check `adb logcat` for the specific error
- Common cause: Missing permissions or incompatible Compose version
### Model Won't Load
- Verify the model file is a valid `.litertlm` format
- Check that the file isn't corrupted (compare checksums if available)
- Ensure you have enough free RAM (4GB+ recommended)
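If a downloaded file refuses to load, you can check the header outside the app: the loader rejects anything whose first 8 bytes are not the ASCII magic `LITERTLM`. A minimal sketch in plain Kotlin (the `hasLitertlmMagic` helper name is ours, for illustration only):

```kotlin
import java.io.DataInputStream
import java.io.File

// A valid .litertlm file starts with the 8-byte ASCII magic "LITERTLM".
fun hasLitertlmMagic(file: File): Boolean {
    if (!file.exists() || file.length() < 8) return false
    val header = ByteArray(8)
    // readFully guarantees all 8 bytes are read (the file is at least that long)
    DataInputStream(file.inputStream()).use { it.readFully(header) }
    return String(header, Charsets.US_ASCII) == "LITERTLM"
}

fun main() {
    val good = File.createTempFile("model", ".litertlm").apply {
        writeBytes("LITERTLM".toByteArray() + ByteArray(16))
        deleteOnExit()
    }
    val bad = File.createTempFile("model", ".litertlm").apply {
        writeText("not a model")
        deleteOnExit()
    }
    println(hasLitertlmMagic(good)) // true
    println(hasLitertlmMagic(bad))  // false
}
```

Run it against your model file; if the magic check fails, re-download the model rather than retrying the load.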
### Web Search Not Working
- Verify your SearXNG server is accessible from the device
- Check that the server URL includes the protocol and port:
- Correct: `http://192.168.1.100:8080`
- Wrong: `192.168.1.100:8080`
- If using HTTP on a custom domain, make sure you added it to `network_security_config.xml`
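Before probing the server, the app trims whitespace and trailing slashes from the URL you entered, but it does not add a missing `http://` for you. A minimal sketch of that cleanup, with an extra scheme check that is our addition rather than app behavior:

```kotlin
// Mirrors the app's URL cleanup before the health probe (trim whitespace,
// drop trailing slashes), plus a scheme check so a bare "host:port" is
// rejected early instead of failing at request time.
fun normalizeSearchUrl(raw: String): String? {
    val url = raw.trim().trimEnd('/')
    if (!url.startsWith("http://") && !url.startsWith("https://")) return null
    return url
}

fun main() {
    println(normalizeSearchUrl(" http://192.168.1.100:8080/ ")) // http://192.168.1.100:8080
    println(normalizeSearchUrl("192.168.1.100:8080"))           // null (scheme missing)
}
```

The normalized URL is what the health check actually requests, so entering `http://192.168.1.100:8080/` and `http://192.168.1.100:8080` behave identically.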
## Project Structure
```
app/src/main/java/com/sleepy/agent/
├── MainActivity.kt # Main entry point
├── SleepyAgentApplication.kt # Application class
├── audio/ # Audio recording, TTS, VAD
├── camera/ # Camera capture
├── data/ # Conversation storage
├── di/ # Dependency injection
├── download/ # Model download manager
├── inference/ # LLM engine, Agent, Conversation
├── service/ # Floating button service
├── settings/ # User preferences
├── tools/ # Web search, server tools
└── ui/ # Screens, ViewModels, Theme
```
## Contributing
1. Fork the repository
2. Create a feature branch: `git checkout -b my-feature`
3. Commit your changes: `git commit -am 'Add new feature'`
4. Push to the branch: `git push origin my-feature`
5. Open a Pull Request
## License
MIT License - See LICENSE file
Binary image file added (483 KiB, not shown).
@@ -0,0 +1,13 @@
# Gradle settings
org.gradle.jvmargs=-Xmx6g -XX:MaxMetaspaceSize=1g -XX:+HeapDumpOnOutOfMemoryError -Dfile.encoding=UTF-8
org.gradle.parallel=true
org.gradle.caching=false
org.gradle.configureondemand=false
# Android settings
android.useAndroidX=true
android.enableJetifier=true
kotlin.code.style=official
android.nonTransitiveRClass=true
# KSP settings - KSP2 is required for Kotlin 2.3.20+
ksp.useKSP2=true
Binary file not shown.
@@ -0,0 +1,249 @@
#!/bin/sh
#
# Copyright © 2015-2021 the original authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
##############################################################################
#
# Gradle start up script for POSIX generated by Gradle.
#
# Important for running:
#
# (1) You need a POSIX-compliant shell to run this script. If your /bin/sh is
# noncompliant, but you have some other compliant shell such as ksh or
# bash, then to run this script, type that shell name before the whole
# command line, like:
#
# ksh Gradle
#
# Busybox and similar reduced shells will NOT work, because this script
# requires all of these POSIX shell features:
# * functions;
# * expansions «$var», «${var}», «${var:-default}», «${var+SET}»,
# «${var#prefix}», «${var%suffix}», and «$( cmd )»;
# * compound commands having a testable exit status, especially «case»;
# * various built-in commands including «command», «set», and «ulimit».
#
# Important for patching:
#
# (2) This script targets any POSIX shell, so it avoids extensions provided
# by Bash, Ksh, etc; in particular arrays are avoided.
#
# The "traditional" practice of packing multiple parameters into a
# space-separated string is a well documented source of bugs and security
# problems, so this is (mostly) avoided, by progressively accumulating
# options in "$@", and eventually passing that to Java.
#
# Where the inherited environment variables (DEFAULT_JVM_OPTS, JAVA_OPTS,
# and GRADLE_OPTS) rely on word-splitting, this is performed explicitly;
# see the in-line comments for details.
#
# There are tweaks for specific operating systems such as AIX, CygWin,
# Darwin, MinGW, and NonStop.
#
# (3) This script is generated from the Groovy template
# https://github.com/gradle/gradle/blob/HEAD/subprojects/plugins/src/main/resources/org/gradle/api/internal/plugins/unixStartScript.txt
# within the Gradle project.
#
# You can find Gradle at https://github.com/gradle/gradle/.
#
##############################################################################
# Attempt to set APP_HOME
# Resolve links: $0 may be a link
app_path=$0
# Need this for daisy-chained symlinks.
while
APP_HOME=${app_path%"${app_path##*/}"} # leaves a trailing /; empty if no leading path
[ -h "$app_path" ]
do
ls=$( ls -ld "$app_path" )
link=${ls#*' -> '}
case $link in #(
/*) app_path=$link ;; #(
*) app_path=$APP_HOME$link ;;
esac
done
# This is normally unused
# shellcheck disable=SC2034
APP_BASE_NAME=${0##*/}
# Discard cd standard output in case $CDPATH is set (https://github.com/gradle/gradle/issues/25036)
APP_HOME=$( cd "${APP_HOME:-./}" > /dev/null && pwd -P ) || exit
# Use the maximum available, or set MAX_FD != -1 to use that value.
MAX_FD=maximum
warn () {
    echo "$*"
} >&2

die () {
    echo
    echo "$*"
    echo
    exit 1
} >&2
# OS specific support (must be 'true' or 'false').
cygwin=false
msys=false
darwin=false
nonstop=false
case "$( uname )" in #(
CYGWIN* ) cygwin=true ;; #(
Darwin* ) darwin=true ;; #(
MSYS* | MINGW* ) msys=true ;; #(
NONSTOP* ) nonstop=true ;;
esac
CLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar
# Determine the Java command to use to start the JVM.
if [ -n "$JAVA_HOME" ] ; then
    if [ -x "$JAVA_HOME/jre/sh/java" ] ; then
        # IBM's JDK on AIX uses strange locations for the executables
        JAVACMD=$JAVA_HOME/jre/sh/java
    else
        JAVACMD=$JAVA_HOME/bin/java
    fi
    if [ ! -x "$JAVACMD" ] ; then
        die "ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME

Please set the JAVA_HOME variable in your environment to match the
location of your Java installation."
    fi
else
    JAVACMD=java
    if ! command -v java >/dev/null 2>&1
    then
        die "ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.

Please set the JAVA_HOME variable in your environment to match the
location of your Java installation."
    fi
fi
# Increase the maximum file descriptors if we can.
if ! "$cygwin" && ! "$darwin" && ! "$nonstop" ; then
case $MAX_FD in #(
max*)
# In POSIX sh, ulimit -H is undefined. That's why the result is checked to see if it worked.
# shellcheck disable=SC2039,SC3045
MAX_FD=$( ulimit -H -n ) ||
warn "Could not query maximum file descriptor limit"
esac
case $MAX_FD in #(
'' | soft) :;; #(
*)
# In POSIX sh, ulimit -n is undefined. That's why the result is checked to see if it worked.
# shellcheck disable=SC2039,SC3045
ulimit -n "$MAX_FD" ||
warn "Could not set maximum file descriptor limit to $MAX_FD"
esac
fi
# Collect all arguments for the java command, stacking in reverse order:
#   * args from the command line
#   * the main class name
#   * -classpath
#   * -D...appname settings
#   * --module-path (only if needed)
#   * DEFAULT_JVM_OPTS, JAVA_OPTS, and GRADLE_OPTS environment variables.
# For Cygwin or MSYS, switch paths to Windows format before running java
if "$cygwin" || "$msys" ; then
APP_HOME=$( cygpath --path --mixed "$APP_HOME" )
CLASSPATH=$( cygpath --path --mixed "$CLASSPATH" )
JAVACMD=$( cygpath --unix "$JAVACMD" )
# Now convert the arguments - kludge to limit ourselves to /bin/sh
for arg do
if
case $arg in #(
-*) false ;; # don't mess with options #(
/?*) t=${arg#/} t=/${t%%/*} # looks like a POSIX filepath
[ -e "$t" ] ;; #(
*) false ;;
esac
then
arg=$( cygpath --path --ignore --mixed "$arg" )
fi
# Roll the args list around exactly as many times as the number of
# args, so each arg winds up back in the position where it started, but
# possibly modified.
#
# NB: a `for` loop captures its iteration list before it begins, so
# changing the positional parameters here affects neither the number of
# iterations, nor the values presented in `arg`.
shift # remove old arg
set -- "$@" "$arg" # push replacement arg
done
fi
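The shift-and-push rotation used in that loop can be tried on its own; this standalone sketch uses uppercase conversion as a stand-in for `cygpath`, and after one full pass every argument is back in its original position, possibly rewritten:

```shell
set -- one two three
for arg do                                      # iteration list is captured up front
    arg=$( printf '%s' "$arg" | tr 'a-z' 'A-Z' )  # stand-in for cygpath conversion
    shift                                       # drop the old copy from the front
    set -- "$@" "$arg"                          # push the rewritten copy onto the back
done
printf '%s\n' "$@"                              # ONE TWO THREE, in the original order
```

Because `for arg do` snapshots the list before the first iteration, mutating `$@` inside the body is safe: the loop still runs exactly once per original argument.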
# Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.
DEFAULT_JVM_OPTS='-Dfile.encoding=UTF-8 "-Xmx64m" "-Xms64m"'
# Collect all arguments for the java command:
#   * DEFAULT_JVM_OPTS, JAVA_OPTS, and optsEnvironmentVar are not allowed to contain shell fragments,
#     and any embedded shellness will be escaped.
#   * For example: A user cannot expect ${Hostname} to be expanded, as it is an environment variable and will be
#     treated as '${Hostname}' itself on the command line.
set -- \
        "-Dorg.gradle.appname=$APP_BASE_NAME" \
        -classpath "$CLASSPATH" \
        org.gradle.wrapper.GradleWrapperMain \
        "$@"
# Stop when "xargs" is not available.
if ! command -v xargs >/dev/null 2>&1
then
    die "xargs is not available"
fi
# Use "xargs" to parse quoted args.
#
# With -n1 it outputs one arg per line, with the quotes and backslashes removed.
#
# In Bash we could simply go:
#
#   readarray ARGS < <( xargs -n1 <<<"$var" ) &&
#   set -- "${ARGS[@]}" "$@"
#
# but POSIX shell has neither arrays nor process substitution, so instead we
# post-process each arg (as a line of input to sed) to backslash-escape any
# character that might be a shell metacharacter, then use eval to reverse
# that process (while maintaining the separation between arguments), and wrap
# the whole thing up as a single "set" statement.
#
# This will of course break if any of these variables contains a newline or
# an unmatched quote.
#
eval "set -- $(
printf '%s\n' "$DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS" |
xargs -n1 |
sed ' s~[^-[:alnum:]+,./:=@_]~\\&~g; ' |
tr '\n' ' '
)" '"$@"'
exec "$JAVACMD" "$@"
@@ -0,0 +1,19 @@
pluginManagement {
    repositories {
        google()
        mavenCentral()
        gradlePluginPortal()
    }
}

dependencyResolutionManagement {
    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)
    repositories {
        google()
        mavenCentral()
        maven { url = uri("https://jitpack.io") }
    }
}

rootProject.name = "SleepyAgent"
include(":app")
Binary file not shown.