# Feature Feedback and User Experience
**Collection Date:** 2026-04-09
**Sources:** GitHub issues, blog posts, community discussions, documentation
---
## Skills System
### Positive Feedback
**Self-Improvement Loop:**
> "The agent can transform what it learns into reusable skills, improve them through experience, store useful information, and even search for previous conversations."
**Progressive Disclosure:**
- Level 0: Skill names/descriptions (~3,000 tokens)
- Level 1: Full skill content when needed
- Level 2: Specific reference files
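A minimal sketch of how the three disclosure levels could be modelled. The `Skill` class and the function names are hypothetical and only illustrate the idea; they are not Hermes's actual schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    # Hypothetical container; field names are illustrative only.
    name: str
    description: str
    body: str
    references: dict[str, str] = field(default_factory=dict)


def level0_index(skills: list[Skill]) -> str:
    """Level 0: one line per skill, small enough to sit in every prompt."""
    return "\n".join(f"- {s.name}: {s.description}" for s in skills)


def level1_content(skills: list[Skill], name: str) -> str:
    """Level 1: the full skill body, loaded only when the skill is needed."""
    for s in skills:
        if s.name == name:
            return s.body
    raise KeyError(name)


def level2_reference(skills: list[Skill], name: str, ref: str) -> str:
    """Level 2: one specific reference file belonging to a skill."""
    for s in skills:
        if s.name == name:
            return s.references[ref]
    raise KeyError(name)
```

The point of the split is that only the Level 0 index is paid for on every request; the larger bodies and reference files cost tokens only when they are pulled in.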
**Skill Creation:**
- Auto-generated after complex tasks (5+ tool calls)
- Can be hand-written
- Installable from Skills Hub
- Shareable via agentskills.io format
### Community Contributions
**Awesome Hermes Agent:** https://github.com/0xNyk/awesome-hermes-agent
- Curated list of skills, tools, integrations
- Four plugins covering common operational needs
- Inter-agent bridge for multiple Hermes instances
- Hermes-skill-factory (auto-generates skills from workflows)
---
## Memory System
### Architecture
**Three Layers:**
1. **Short-term** - Recent context in conversation
2. **Long-term** - MEMORY.md (facts, conventions, lessons)
3. **Episodic** - SQLite FTS5 search across all sessions
**Storage:**
- `MEMORY.md` (~2,200 chars) - Always in context
- `USER.md` (~1,375 chars) - User preferences
- `~/.hermes/state.db` - SQLite with full-text search
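To illustrate what full-text session search over SQLite FTS5 looks like, here is a small sketch using Python's standard `sqlite3` module. The table name `messages` and its columns are assumptions for the example, not the real layout of `state.db`, and the sketch requires a SQLite build with FTS5 enabled.

```python
import sqlite3


def open_session_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create a hypothetical FTS5 table for messages."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS messages "
        "USING fts5(session_id, content)"
    )
    return conn


def add_message(conn: sqlite3.Connection, session_id: str, content: str) -> None:
    """Index one message under its session id."""
    conn.execute(
        "INSERT INTO messages (session_id, content) VALUES (?, ?)",
        (session_id, content),
    )


def search_sessions(conn: sqlite3.Connection, query: str, limit: int = 5):
    """Return (session_id, content) rows matching the query, best match first."""
    rows = conn.execute(
        "SELECT session_id, content FROM messages "
        "WHERE messages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()
```

A search like `search_sessions(conn, "database")` only runs when something asks for it, which matches the on-demand nature of the episodic layer.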
### User Confusion Points
**Source:** https://vectorize.io/articles/hermes-agent-memory-not-working
> "Memory is for critical facts that should always be in context. Session search is for 'did we discuss X last week?' queries where the agent needs to recall — it doesn't happen automatically before every response."
**Common Misconception:** The agent automatically remembers everything.
**Reality:** The user must explicitly ask the agent to remember a fact, for example: "Remember that my production database runs on port 5433"
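One way to picture an explicit "remember" request is as an append to a small, size-bounded file. The sketch below is an assumption about the mechanics, not Hermes's implementation: the `remember` function is hypothetical, and the 2,200-character budget is the approximate figure listed under Storage.

```python
from pathlib import Path

MEMORY_BUDGET_CHARS = 2200  # approximate MEMORY.md size from the Storage list


def remember(memory_file: Path, fact: str, budget: int = MEMORY_BUDGET_CHARS) -> bool:
    """Append one fact as a bullet; refuse if the file would exceed the budget."""
    existing = memory_file.read_text(encoding="utf-8") if memory_file.exists() else ""
    entry = f"- {fact.strip()}\n"
    if len(existing) + len(entry) > budget:
        return False  # caller must prune or summarise before adding more
    memory_file.write_text(existing + entry, encoding="utf-8")
    return True
```

Because the file is always in context, a hard budget keeps the per-request cost predictable, which is also why not everything gets stored automatically.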
---
## Delegation and Subagents
### Performance Benefits
> "Use delegate_task with parallel subtasks. Each subagent runs independently with its own context, and only the final summaries come back — massively reducing your main conversation's token usage."
### Best Practices
1. **Set max_iterations lower** for simple tasks (default: 50)
2. **Be specific in goals** - "Fix the TypeError in api/handlers.py line 47" not "Fix the bug"
3. **Include file paths** - Subagents don't know your project structure
4. **Use for context isolation** - Prevents main conversation bloat
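The practices above can be sketched as a small fan-out where each subtask carries a specific goal, explicit file paths, and a reduced iteration cap, and only summaries return to the caller. `Subtask`, `run_subagent`, and `delegate_parallel` are hypothetical names for illustration; this is not the signature of `delegate_task`.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Subtask:
    goal: str                 # specific, e.g. names the error, file, and line
    files: list[str]          # explicit paths; the subagent has no project map
    max_iterations: int = 10  # lower than the default of 50 for simple tasks


def run_subagent(task: Subtask) -> str:
    """Hypothetical stand-in for a subagent run; returns only a short summary."""
    return f"done: {task.goal} ({', '.join(task.files)})"


def delegate_parallel(tasks: list[Subtask], runner=run_subagent) -> list[str]:
    """Run subtasks concurrently and collect their summaries in input order."""
    if not tasks:
        return []
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        return list(pool.map(runner, tasks))
```

Only the returned strings enter the main conversation, which is where the token savings described in the quote come from.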
### Multi-Agent Architecture (Future)
**Issue #344 Proposal:**
- L0: Current (exists today)
- L1: Workflow engine
- L2: Checkpointing and recovery
- L3: Full orchestration
---
## Cron and Scheduling
### Use Cases
**Examples:**
> "Every morning at 9am, check Hacker News for AI news and send me a summary on Telegram."
> "Weekly dependency audit every Sunday at 6 AM"
### Features
- Output automatically delivered to configured platform
- Job output saved to `~/.hermes/cron/output/<job-id>/<timestamp>.md`
- Test with `/cron run <job_id>` before scheduling
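As a rough illustration, the two example schedules map onto standard five-field cron expressions, and the output location can be built from the job id and a timestamp. The job ids, the timestamp format, and the assumption that schedules are stored as cron expressions are all hypothetical here.

```python
from datetime import datetime
from pathlib import Path

# Standard five-field cron: minute hour day-of-month month day-of-week
SCHEDULES = {
    "hn-summary": "0 9 * * *",        # every day at 09:00
    "dependency-audit": "0 6 * * 0",  # every Sunday at 06:00
}


def output_path(
    job_id: str,
    when: datetime,
    root: Path = Path("~/.hermes/cron/output"),
) -> Path:
    """Build <root>/<job-id>/<timestamp>.md; the timestamp format is assumed."""
    stamp = when.strftime("%Y-%m-%dT%H-%M-%S")
    return root.expanduser() / job_id / f"{stamp}.md"
```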
### Limitations
- Agent only sees script stdout
- Background execution requires proper setup
---
## Gateway and Messaging
### Supported Platforms
**Full List:**
- Telegram
- Discord
- Slack
- WhatsApp
- Signal
- Email
- SMS
- Home Assistant
- Matrix/Mattermost
- DingTalk/Feishu/WeCom
### Cross-Platform Continuity
> "Instructions are given via Telegram in the morning, and progress is checked via Discord at night. It's seamless."
### Voice Support
- Voice memo transcription on all platforms
- TTS output with `/voice` command
- Discord voice channel support
---
## Terminal Backends
### Options
1. **Local** (default)
2. **Docker** (sandboxed)
3. **SSH** (remote server)
4. **Daytona** (serverless persistence)
5. **Singularity**
6. **Modal** (serverless, hibernates when idle)
### Security
- Container hardening with read-only root
- Dropped capabilities
- Namespace isolation
- Dangerous command approval system
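The hardening measures listed above correspond to well-known `docker run` flags, and an approval gate can be approximated with pattern matching. This is a sketch under assumptions: the helper names and the pattern list are hypothetical and far from exhaustive, and Hermes's actual flags and rules may differ.

```python
import re


def hardened_docker_args(image: str, command: list[str]) -> list[str]:
    """Build a docker run invocation with a read-only root and no capabilities."""
    return [
        "docker", "run", "--rm",
        "--read-only",                          # read-only root filesystem
        "--cap-drop=ALL",                       # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # block privilege escalation
        image, *command,
    ]  # namespace isolation is provided by the container runtime itself


# Illustrative patterns only; a real gate would need a much broader list.
DANGEROUS_PATTERNS = [
    re.compile(r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f|\brm\s+-[a-zA-Z]*f[a-zA-Z]*r"),
    re.compile(r"\bmkfs(\.\w+)?\b"),
    re.compile(r"\bdd\s+.*\bof=/dev/"),
]


def needs_approval(command: str) -> bool:
    """Return True when a command matches a pattern that should be confirmed."""
    return any(p.search(command) for p in DANGEROUS_PATTERNS)
```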
---
## Browser and Vision
### Browser Tools
**Set:**
- `browser_navigate`
- `browser_click`
- `browser_snapshot`
- `browser_type`
- plus others (11 tools total)
**Cost Impact:**
- Browser tool definitions add ~1,258 tokens to every request, even in messaging sessions that never use them
- Screenshots + vision analysis are high-token operations
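A quick worked calculation shows how the fixed overhead accumulates. The 1,258-token figure comes from the list above; the price per million input tokens is a hypothetical parameter that the caller supplies.

```python
BROWSER_TOOL_OVERHEAD_TOKENS = 1258  # approximate per-request overhead


def overhead_tokens(requests: int, per_request: int = BROWSER_TOOL_OVERHEAD_TOKENS) -> int:
    """Total tokens spent on browser tool definitions across all requests."""
    return requests * per_request


def overhead_cost_usd(requests: int, usd_per_million_input_tokens: float) -> float:
    """Cost of that overhead at a given (hypothetical) input-token price."""
    return overhead_tokens(requests) * usd_per_million_input_tokens / 1_000_000
```

For example, 100 requests carry 125,800 tokens of browser tool definitions before any browsing happens.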
### Vision Analysis
**Supported:**
- Image URLs via `vision_analyze`
- Image paste in CLI (with xclip/X11 forwarding)
- Images via messaging platforms
---
## Voice Mode
### Features
- **STT:** faster-whisper (local, free)
- **TTS:** Microsoft Edge TTS (free)
- **Recording:** Ctrl+B in CLI
- **Cross-platform:** Works in Telegram, Discord, etc.
---
## Comparison: Hermes vs OpenClaw
### Hermes Advantages
| Aspect | Winner | Reason |
|--------|--------|--------|
| Personal companion | Hermes | Continuous learning, personalization |
| Repetitive task automation | Hermes | Skill learning adapts to workflows |
| Voice interaction | Hermes | Native voice support |
| Lightweight deployment | Hermes | 20MB vs 200MB+ |
| Signal support | Hermes | Better multi-platform support |
| Local model support | Hermes | Works better with Ollama/llama.cpp |
### OpenClaw Advantages
| Aspect | Winner | Reason |
|--------|--------|--------|
| Multi-agent coordination | OpenClaw | Better fleet management |
| Browser automation | OpenClaw | More mature plugin ecosystem |
| Community/plugins | OpenClaw | 307k stars vs 6k |
| MCP ecosystem | OpenClaw | More mature |
### Community Recommendation
> "Use both. OpenClaw as the 'fleet commander' for multi-agent coordination, Hermes as your 'personal advisor' for one-on-one tasks."
---
## User Experience Feedback
### Positive
> "Hermes optimizes for depth of learning. It is smaller, more opinionated, and built by a team that trains the underlying models."
> "For repetitive workflows where agent improvement creates measurable value over time, Hermes is the stronger choice."
> "It just works — installation to first conversation is minutes, not hours."
### Areas for Improvement
1. **Token overhead transparency** - Users surprised by costs
2. **Memory system education** - Users expect automatic memory
3. **Local model guidance** - Need better model recommendations
4. **Gateway debugging** - Error messages can be cryptic
5. **Migration experience** - OpenClaw migration has rough edges
---
## Summary
**Strengths:**
- Self-improving skill system
- Excellent multi-platform support
- Strong memory architecture
- Good local model support
- Active development
**Weaknesses:**
- Token overhead can surprise users
- Some migration/tooling rough edges
- Documentation gaps for advanced features
- Memory system requires user education