# Feature Feedback and User Experience

**Collection Date:** 2026-04-09

**Sources:** GitHub issues, blog posts, community discussions, documentation

---

## Skills System

### Positive Feedback

**Self-Improvement Loop:**
> "The agent can transform what it learns into reusable skills, improve them through experience, store useful information, and even search for previous conversations."

**Progressive Disclosure:**
- Level 0: Skill names/descriptions (~3,000 tokens)
- Level 1: Full skill content when needed
- Level 2: Specific reference files

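The three disclosure levels above can be pictured as three lookups of increasing cost. The following is a minimal sketch under stated assumptions, not Hermes's actual implementation: the `Skill` class, its field names, and the three helper functions are hypothetical and exist only to illustrate the idea that the prompt carries a compact index while full content and reference files are fetched on demand.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    # Hypothetical structure, for illustration only.
    name: str
    description: str  # Level 0: always listed in the prompt
    body: str         # Level 1: loaded only when the skill is selected
    references: dict[str, str] = field(default_factory=dict)  # Level 2: per file


def level0_index(skills: list[Skill]) -> str:
    """Compact index: one line per skill, names and descriptions only."""
    return "\n".join(f"- {s.name}: {s.description}" for s in skills)


def level1_body(skills: list[Skill], name: str) -> str:
    """Full content of one skill, fetched on demand."""
    return next(s.body for s in skills if s.name == name)


def level2_reference(skills: list[Skill], name: str, ref: str) -> str:
    """A single reference file that belongs to one skill."""
    return next(s.references[ref] for s in skills if s.name == name)


skills = [
    Skill("deploy", "Deploy the service", "1. Build\n2. Push\n3. Roll out",
          {"rollback.md": "Run the previous release tag."}),
    Skill("triage", "Triage failing tests", "1. Reproduce\n2. Bisect"),
]

# Only the index is paid for on every request; the rest is loaded when needed.
print(level0_index(skills))
```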

**Skill Creation:**
- Auto-generated after complex tasks (5+ tool calls)
- Can be hand-written
- Installable from Skills Hub
- Shareable via agentskills.io format

### Community Contributions

**Awesome Hermes Agent:** https://github.com/0xNyk/awesome-hermes-agent
- Curated list of skills, tools, integrations
- Four plugins covering common operational needs
- Inter-agent bridge for multiple Hermes instances
- Hermes-skill-factory (auto-generates skills from workflows)

---

## Memory System

### Architecture

**Three Layers:**
1. **Short-term** - Recent context in conversation
2. **Long-term** - MEMORY.md (facts, conventions, lessons)
3. **Episodic** - SQLite FTS5 search across all sessions

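The episodic layer is the least familiar of the three, so a small example may help. This is a sketch only: the table name, column names, and `search_sessions` helper are assumptions made for illustration, and the real schema of the Hermes state database is not documented here. It relies on Python's built-in `sqlite3` module and assumes the SQLite build includes the FTS5 extension, which is the case for most standard Python distributions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for an on-disk state database

# Hypothetical schema: one row per message, indexed for full-text search.
conn.execute("CREATE VIRTUAL TABLE messages USING fts5(session_id, content)")
conn.executemany(
    "INSERT INTO messages (session_id, content) VALUES (?, ?)",
    [
        ("s1", "The production database runs on port 5433"),
        ("s2", "We agreed to pin the linter version"),
    ],
)


def search_sessions(query: str, limit: int = 5) -> list[tuple[str, str]]:
    """Return (session_id, content) rows matching an FTS5 query, best match first."""
    return conn.execute(
        "SELECT session_id, content FROM messages "
        "WHERE messages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()


# The search runs only when something asks for it, not before every response.
hits = search_sessions("database")
```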

**Storage:**
- `MEMORY.md` (~2,200 chars) - Always in context
- `USER.md` (~1,375 chars) - User preferences
- `~/.hermes/state.db` - SQLite with full-text search

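Because `MEMORY.md` is always in context, its size limit acts as a hard budget on what can be remembered. The sketch below shows one way such a budget could be enforced. It is an assumption for illustration: the `add_memory` function and the reject-on-overflow policy are hypothetical, and how Hermes actually handles a full memory file is not stated in the sources above.

```python
MEMORY_BUDGET_CHARS = 2200  # approximate figure reported above


def add_memory(entries: list[str], new_entry: str,
               budget: int = MEMORY_BUDGET_CHARS) -> bool:
    """Append new_entry only if the joined file would stay within the budget."""
    candidate = "\n".join(entries + [new_entry])
    if len(candidate) > budget:
        return False  # caller must consolidate or drop an older entry first
    entries.append(new_entry)
    return True


entries: list[str] = []
ok = add_memory(entries, "Production database runs on port 5433")
too_big = add_memory(entries, "x" * 3000)  # rejected: exceeds the budget
```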

### User Confusion Points

**Source:** https://vectorize.io/articles/hermes-agent-memory-not-working

> "Memory is for critical facts that should always be in context. Session search is for 'did we discuss X last week?' queries where the agent needs to recall — it doesn't happen automatically before every response."

**Common Misconception:** The agent automatically remembers everything.

**Reality:** The user must explicitly ask the agent to remember a fact, for example: "Remember that my production database runs on port 5433"

---

## Delegation and Subagents

### Performance Benefits

> "Use delegate_task with parallel subtasks. Each subagent runs independently with its own context, and only the final summaries come back — massively reducing your main conversation's token usage."

### Best Practices

1. **Set `max_iterations` lower** for simple tasks (default: 50)
2. **Be specific in goals** - "Fix the TypeError in api/handlers.py line 47" not "Fix the bug"
3. **Include file paths** - Subagents don't know your project structure
4. **Use for context isolation** - Prevents main conversation bloat

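The practices above can be turned into a quick pre-flight check on a task before it is delegated. This is a hypothetical sketch: `SubagentTask`, `lint_task`, and the vague-goal heuristic are invented for illustration and do not reflect the real `delegate_task` parameters, which are not documented in the sources quoted here. Only the default of 50 iterations comes from the list above.

```python
from dataclasses import dataclass, field


@dataclass
class SubagentTask:
    # Hypothetical task description, not the real delegate_task signature.
    goal: str
    file_paths: list[str] = field(default_factory=list)
    max_iterations: int = 50  # default reported above


VAGUE_GOALS = {"fix the bug", "make it work", "clean up"}


def lint_task(task: SubagentTask) -> list[str]:
    """Return warnings for a task that ignores the practices listed above."""
    warnings = []
    if task.goal.strip().lower() in VAGUE_GOALS or len(task.goal.split()) < 4:
        warnings.append("goal is too vague; name the error, file, and line")
    if not task.file_paths:
        warnings.append("no file paths; the subagent does not know the project layout")
    if task.max_iterations >= 50:
        warnings.append("max_iterations left at default; lower it for simple tasks")
    return warnings


good = SubagentTask(
    goal="Fix the TypeError in api/handlers.py line 47",
    file_paths=["api/handlers.py"],
    max_iterations=15,
)
bad = SubagentTask(goal="Fix the bug")
```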

### Multi-Agent Architecture (Future)

**Issue #344 Proposal:**
- L0: Current (exists today)
- L1: Workflow engine
- L2: Checkpointing and recovery
- L3: Full orchestration

---

## Cron and Scheduling

### Use Cases

**Examples:**
> "Every morning at 9am, check Hacker News for AI news and send me a summary on Telegram."

> "Weekly dependency audit every Sunday at 6 AM"

### Features
- Output automatically delivered to configured platform
- Job output saved to `~/.hermes/cron/output/<job-id>/<timestamp>.md`
- Test with `/cron run <job_id>` before scheduling

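For readers who think in crontab syntax, the two example schedules above correspond to standard five-field cron expressions, and the output layout can be expressed as a simple path builder. This is a sketch with assumptions: whether Hermes accepts raw cron expressions is not confirmed by the sources, and the `output_path` helper, the job id, and the timestamp format are hypothetical. Only the directory layout comes from the feature list.

```python
from datetime import datetime
from pathlib import Path

# Standard cron fields: minute hour day-of-month month day-of-week
SCHEDULES = {
    "daily news summary at 9 AM": "0 9 * * *",
    "weekly dependency audit, Sunday 6 AM": "0 6 * * 0",
}


def output_path(job_id: str, when: datetime, home: Path) -> Path:
    """Build a path that follows the layout quoted above.

    The timestamp format is an assumption; the source only shows <timestamp>.
    """
    stamp = when.strftime("%Y%m%d-%H%M%S")
    return home / ".hermes" / "cron" / "output" / job_id / f"{stamp}.md"


# "job-123" and the home directory are placeholder values.
p = output_path("job-123", datetime(2026, 4, 9, 9, 0, 0), home=Path("/home/user"))
```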

### Limitations
- Agent only sees script stdout
- Background execution requires proper setup

---

## Gateway and Messaging

### Supported Platforms

**Full List:**
- Telegram
- Discord
- Slack
- WhatsApp
- Signal
- Email
- SMS
- Home Assistant
- Matrix/Mattermost
- DingTalk/Feishu/WeCom

### Cross-Platform Continuity

> "Instructions are given via Telegram in the morning, and progress is checked via Discord at night. It's seamless."

### Voice Support

- Voice memo transcription on all platforms
- TTS output with `/voice` command
- Discord voice channel support

---

## Terminal Backends

### Options

1. **Local** (default)
2. **Docker** (sandboxed)
3. **SSH** (remote server)
4. **Daytona** (serverless persistence)
5. **Singularity**
6. **Modal** (serverless, hibernates when idle)

### Security

- Container hardening with read-only root
- Dropped capabilities
- Namespace isolation
- Dangerous command approval system

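To make the hardening items above concrete, here is a sketch of what they could look like for the Docker backend. It is not Hermes's actual configuration: the two functions, the image name, and the pattern list are assumptions for illustration. The Docker flags themselves (`--read-only`, `--tmpfs`, `--cap-drop`, `--security-opt`) are standard `docker run` options; `no-new-privileges` is an extra that goes beyond the list above. Namespace isolation needs no flag because containers get their own namespaces by default.

```python
import re


def hardened_docker_argv(image: str, command: list[str]) -> list[str]:
    """Build a `docker run` argument list with a locked-down container."""
    return [
        "docker", "run", "--rm",
        "--read-only",                          # read-only root filesystem
        "--tmpfs", "/tmp",                      # writable scratch space only
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # extra, not from the list above
        image, *command,
    ]


# Hypothetical deny-list; a real approval system would be more thorough.
DANGEROUS = [re.compile(p) for p in (r"\brm\s+-rf\b", r"\bmkfs\b", r"\bdd\s+if=")]


def needs_approval(cmd: str) -> bool:
    """Flag commands that should be confirmed by the user before running."""
    return any(p.search(cmd) for p in DANGEROUS)


argv = hardened_docker_argv("python:3.12-slim", ["python", "-c", "print('hi')"])
```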

---

## Browser and Vision

### Browser Tools

**Tool set:**
- `browser_navigate`
- `browser_click`
- `browser_snapshot`
- `browser_type`
- and others (11 tools total)

**Cost Impact:**
- Browser tools add ~1,258 tokens to every request (even when unused in messaging)
- Screenshots + vision analysis are high-token operations

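The per-request overhead is small on its own but adds up across a busy gateway. The arithmetic can be sketched as follows. The request count and the price per million input tokens are hypothetical inputs chosen for illustration; only the ~1,258 token figure comes from the feedback above.

```python
BROWSER_TOOL_TOKENS = 1258  # approximate per-request overhead reported above


def browser_overhead(requests: int,
                     usd_per_million_input_tokens: float) -> tuple[int, float]:
    """Return (extra input tokens, extra cost in USD) from browser tool schemas."""
    tokens = requests * BROWSER_TOOL_TOKENS
    return tokens, tokens / 1_000_000 * usd_per_million_input_tokens


# Hypothetical workload: 1,000 requests at an assumed 3.00 USD per million tokens.
tokens, cost = browser_overhead(requests=1000, usd_per_million_input_tokens=3.0)
# tokens is 1,258,000 and cost is about 3.77 USD, even if no browser tool is called.
```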

### Vision Analysis

**Supported:**
- Image URLs via `vision_analyze`
- Image paste in CLI (with xclip/X11 forwarding)
- Images via messaging platforms

---

## Voice Mode

### Features

- **STT:** faster-whisper (local, free)
- **TTS:** Microsoft Edge TTS (free)
- **Recording:** Ctrl+B in CLI
- **Cross-platform:** Works in Telegram, Discord, etc.

---

## Comparison: Hermes vs OpenClaw

### Hermes Advantages

| Aspect | Winner | Reason |
|--------|--------|--------|
| Personal companion | Hermes | Continuous learning, personalization |
| Repetitive task automation | Hermes | Skill learning adapts to workflows |
| Voice interaction | Hermes | Native voice support |
| Lightweight deployment | Hermes | 20 MB vs 200 MB+ |
| Signal support | Hermes | Better multi-platform |
| Local model support | Hermes | Works better with Ollama/llama.cpp |

### OpenClaw Advantages

| Aspect | Winner | Reason |
|--------|--------|--------|
| Multi-agent coordination | OpenClaw | Better fleet management |
| Browser automation | OpenClaw | More mature plugin ecosystem |
| Community/plugins | OpenClaw | 307k stars vs 6k |
| MCP ecosystem | OpenClaw | More mature |

### Community Recommendation

> "Use both. OpenClaw as the 'fleet commander' for multi-agent coordination, Hermes as your 'personal advisor' for one-on-one tasks."

---

## User Experience Feedback

### Positive

> "Hermes optimizes for depth of learning. It is smaller, more opinionated, and built by a team that trains the underlying models."

> "For repetitive workflows where agent improvement creates measurable value over time, Hermes is the stronger choice."

> "It just works — installation to first conversation is minutes, not hours."

### Areas for Improvement

1. **Token overhead transparency** - Users surprised by costs
2. **Memory system education** - Users expect automatic memory
3. **Local model guidance** - Need better model recommendations
4. **Gateway debugging** - Error messages can be cryptic
5. **Migration experience** - OpenClaw migration has rough edges

---

## Summary

**Strengths:**
- Self-improving skill system
- Excellent multi-platform support
- Strong memory architecture
- Good local model support
- Active development

**Weaknesses:**
- Token overhead can surprise users
- Some migration/tooling rough edges
- Documentation gaps for advanced features
- Memory system requires user education