Harnesses under analysis:
- opencode (Go-based coding agent)
- pi (minimal terminal coding harness by Mario Zechner)
- hermes (Nous Research agent)
- forgecode (AI pair programmer with sub-agents)

Each harness folder contains:
- repo/: Source code from respective repositories
- feedback/localllm/: Community feedback for local/smaller models
- feedback/frontier/: Community feedback for frontier models

Research focus: Tool handling, skills systems, prompt engineering, context management, and best practices for smaller/local models.
Feature Feedback and User Experience
Collection Date: 2026-04-09
Sources: GitHub issues, blog posts, community discussions, documentation
Skills System
Positive Feedback
Self-Improvement Loop:
"The agent can transform what it learns into reusable skills, improve them through experience, store useful information, and even search for previous conversations."
Progressive Disclosure:
- Level 0: Skill names/descriptions (~3,000 tokens)
- Level 1: Full skill content when needed
- Level 2: Specific reference files
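The three-level scheme above can be sketched as a loader that keeps only the cheap catalog in every request and pays for deeper levels on demand. A minimal Python sketch; the `Skill`/`SkillIndex` names are illustrative, not the harness's actual API:

```python
# Minimal sketch of progressive skill disclosure: the context starts
# with only names/descriptions (Level 0) and loads deeper levels on
# demand. Class and method names are illustrative, not Hermes's API.
from dataclasses import dataclass, field

@dataclass
class Skill:
    name: str
    description: str          # Level 0: always in context
    content: str = ""         # Level 1: loaded when the skill is invoked
    references: dict = field(default_factory=dict)  # Level 2: per-file

class SkillIndex:
    def __init__(self, skills):
        self.skills = {s.name: s for s in skills}

    def level0(self):
        """Cheap catalog injected into every request (~3,000 tokens)."""
        return "\n".join(f"{s.name}: {s.description}" for s in self.skills.values())

    def level1(self, name):
        """Full skill body, fetched only when the agent selects the skill."""
        return self.skills[name].content

    def level2(self, name, ref):
        """A specific reference file inside the skill."""
        return self.skills[name].references[ref]

index = SkillIndex([
    Skill("git-bisect", "Binary-search a regression",
          content="1. git bisect start ...",
          references={"cheatsheet.md": "bisect run <script>"}),
])
print(index.level0())   # the always-on catalog
```

The point of the split is that Level 1 and Level 2 costs are only incurred for the skill the agent actually selects.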
Skill Creation:
- Auto-generated after complex tasks (5+ tool calls)
- Can be hand-written
- Installable from Skills Hub
- Shareable via agentskills.io format
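The "auto-generated after complex tasks" trigger reduces to a simple heuristic over the finished task's tool-call log. A sketch under that assumption; the function name and the exact trigger logic are hypothetical:

```python
# Hypothetical sketch of the skill auto-generation trigger: a finished
# task that needed 5+ tool calls (per the feedback above) is flagged
# as a candidate for distillation into a reusable skill.
TOOL_CALL_THRESHOLD = 5

def should_create_skill(tool_calls: list[str]) -> bool:
    """Flag a completed task as skill-worthy if it used 5+ tool calls."""
    return len(tool_calls) >= TOOL_CALL_THRESHOLD

assert should_create_skill(["read", "edit", "run", "read", "edit"])
assert not should_create_skill(["read", "edit"])
```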
Community Contributions
Awesome Hermes Agent: https://github.com/0xNyk/awesome-hermes-agent
- Curated list of skills, tools, integrations
- Four plugins covering common operational needs
- Inter-agent bridge for multiple Hermes instances
- Hermes-skill-factory (auto-generates skills from workflows)
Memory System
Architecture
Three Layers:
- Short-term - Recent context in conversation
- Long-term - MEMORY.md (facts, conventions, lessons)
- Episodic - SQLite FTS5 search across all sessions
Storage:
- MEMORY.md (~2,200 chars) - Always in context
- USER.md (~1,375 chars) - User preferences
- ~/.hermes/state.db - SQLite with full-text search
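The episodic layer can be sketched with SQLite's FTS5 full-text index, which is what the feedback says backs session search. A minimal sketch; the table name and schema are assumptions, not the real layout of `~/.hermes/state.db`:

```python
# Sketch of the episodic layer: full-text search over past sessions
# via SQLite FTS5. The schema is an assumption for illustration.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE sessions USING fts5(ts, content)")
db.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [("2026-04-01", "discussed the production database on port 5433"),
     ("2026-04-03", "fixed a TypeError in api/handlers.py")],
)

def recall(query: str):
    """'Did we discuss X last week?' -> matching past sessions."""
    return db.execute(
        "SELECT ts, content FROM sessions WHERE sessions MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()

print(recall("database"))  # matches only the 2026-04-01 session
```

This is exactly the "session search" half of the split: nothing here is injected automatically; the agent has to run the query.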
User Confusion Points
Source: https://vectorize.io/articles/hermes-agent-memory-not-working
"Memory is for critical facts that should always be in context. Session search is for 'did we discuss X last week?' queries where the agent needs to recall — it doesn't happen automatically before every response."
Common Misconception: The agent should automatically remember everything.
Reality: The user must explicitly ask the agent to remember: "Remember that my production database runs on port 5433"
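The explicit-only behavior can be sketched as a gate in front of the long-term store: nothing is written unless the user asks. The trigger phrase, file path, and function name below are illustrative, not the harness's actual logic:

```python
# Sketch of explicit long-term memory: a fact is appended to MEMORY.md
# only when the user explicitly asks. Trigger phrase and path are
# illustrative assumptions.
from pathlib import Path

MEMORY_FILE = Path("MEMORY.md")  # stands in for the real memory file

def maybe_remember(user_message: str) -> bool:
    """Write to long-term memory only on an explicit 'Remember that ...'."""
    prefix = "remember that "
    if not user_message.lower().startswith(prefix):
        return False  # nothing is stored automatically
    fact = user_message[len(prefix):].strip()
    with MEMORY_FILE.open("a") as f:
        f.write(f"- {fact}\n")
    return True

maybe_remember("Remember that my production database runs on port 5433")
```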
Delegation and Subagents
Performance Benefits
"Use delegate_task with parallel subtasks. Each subagent runs independently with its own context, and only the final summaries come back — massively reducing your main conversation's token usage."
Best Practices
- Set max_iterations lower for simple tasks (default: 50)
- Be specific in goals - "Fix the TypeError in api/handlers.py line 47" not "Fix the bug"
- Include file paths - Subagents don't know your project structure
- Use for context isolation - Prevents main conversation bloat
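The best practices above can be sketched as parallel fan-out where each subtask carries a specific goal (with file paths) and a tuned iteration cap, and only short summaries flow back. `run_subagent` below is a stand-in for the real `delegate_task` tool, not its actual signature:

```python
# Sketch of context isolation via parallel delegation: each subagent
# runs in its own context; only a short summary returns to the main
# conversation. run_subagent is a stand-in, not the real tool API.
from concurrent.futures import ThreadPoolExecutor

def run_subagent(goal: str, max_iterations: int = 50) -> str:
    """Does the work in an isolated context, returns ONLY a summary."""
    # ... the full tool-call transcript stays inside the subagent ...
    return f"done ({max_iterations} iter cap): {goal}"

goals = [
    "Fix the TypeError in api/handlers.py line 47",   # specific, with a path
    "Add a regression test in tests/test_handlers.py",
]
with ThreadPoolExecutor() as pool:
    # Simple tasks get a lower iteration cap, per the best practices above.
    summaries = list(pool.map(lambda g: run_subagent(g, max_iterations=10), goals))

for s in summaries:
    print(s)  # only these summaries reach the main context
```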
Multi-Agent Architecture (Future)
Issue #344 Proposal:
- L0: Current (exists today)
- L1: Workflow engine
- L2: Checkpointing and recovery
- L3: Full orchestration
Cron and Scheduling
Use Cases
Examples:
"Every morning at 9am, check Hacker News for AI news and send me a summary on Telegram."
"Weekly dependency audit every Sunday at 6 AM"
Features
- Output automatically delivered to configured platform
- Job output saved to ~/.hermes/cron/output/<job-id>/<timestamp>.md
- Test with /cron run <job_id> before scheduling
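The output location follows directly from the path pattern above. A sketch of how such a path could be built; the helper name and the exact timestamp format are assumptions (only the `~/.hermes/cron/output/<job-id>/<timestamp>.md` pattern comes from the docs):

```python
# Sketch of the cron output path pattern
# ~/.hermes/cron/output/<job-id>/<timestamp>.md. The timestamp format
# used here is an assumption for illustration.
from datetime import datetime, timezone
from pathlib import Path

def cron_output_path(job_id: str, home: Path = Path.home()) -> Path:
    ts = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return home / ".hermes" / "cron" / "output" / job_id / f"{ts}.md"

print(cron_output_path("hn-digest"))
```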
Limitations
- Agent only sees script stdout
- Background execution requires proper setup
Gateway and Messaging
Supported Platforms
Full List:
- Telegram
- Discord
- Slack
- Signal
- SMS
- Home Assistant
- Matrix/Mattermost
- DingTalk/Feishu/WeCom
Cross-Platform Continuity
"Instructions are given via Telegram in the morning, and progress is checked via Discord at night. It's seamless."
Voice Support
- Voice memo transcription on all platforms
- TTS output with the /voice command
- Discord voice channel support
Terminal Backends
Options
- Local (default)
- Docker (sandboxed)
- SSH (remote server)
- Daytona (serverless persistence)
- Singularity
- Modal (serverless, hibernates when idle)
Security
- Container hardening with read-only root
- Dropped capabilities
- Namespace isolation
- Dangerous command approval system
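A dangerous-command approval system typically pattern-matches shell commands before execution and holds risky ones for user confirmation. A minimal sketch; the pattern list is illustrative, not Hermes's actual rule set:

```python
# Sketch of a dangerous-command approval gate: commands matching risky
# patterns require explicit user approval before they run. The pattern
# list below is illustrative, not the harness's real rules.
import re

DANGEROUS_PATTERNS = [
    r"\brm\s+-rf\b",              # recursive force delete
    r"\bmkfs\b",                  # filesystem format
    r"\bdd\s+if=",                # raw disk writes
    r"curl\s+[^|]*\|\s*(ba)?sh",  # pipe-to-shell installs
]

def needs_approval(command: str) -> bool:
    """True if the command must be confirmed by the user before running."""
    return any(re.search(p, command) for p in DANGEROUS_PATTERNS)

assert needs_approval("rm -rf /tmp/build")
assert not needs_approval("ls -la")
```

Pairing a gate like this with a read-only root and dropped capabilities gives defense in depth: the gate catches intent, the container limits blast radius.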
Browser and Vision
Browser Tools
Tool set: browser_navigate, browser_click, browser_snapshot, browser_type, etc. (11 tools total)
Cost Impact:
- Browser tools add ~1,258 tokens to every request (even when unused in messaging)
- Screenshots + vision analysis are high-token operations
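One mitigation for the ~1,258-token overhead is to register browser tool schemas only on requests that can plausibly use them. A sketch of that idea; the tool lists, channel names, and helper are illustrative assumptions, not a documented Hermes feature:

```python
# Sketch of trimming the fixed browser-tool token overhead: only attach
# browser tool schemas when the current request can actually use them.
# Tool and channel names here are illustrative assumptions.
BROWSER_TOOLS = {"browser_navigate", "browser_click",
                 "browser_snapshot", "browser_type"}  # 4 of the 11
CORE_TOOLS = {"read_file", "write_file", "run_command"}

def tools_for_request(channel: str, wants_browser: bool) -> set[str]:
    """Drop browser schemas on messaging requests that never use them."""
    tools = set(CORE_TOOLS)
    if wants_browser:
        tools |= BROWSER_TOOLS
    return tools

# A messaging request with no browsing need pays zero browser overhead:
assert "browser_click" not in tools_for_request("telegram", wants_browser=False)
assert "browser_click" in tools_for_request("cli", wants_browser=True)
```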
Vision Analysis
Supported:
- Image URLs via vision_analyze
- Image paste in CLI (with xclip/X11 forwarding)
- Images via messaging platforms
Voice Mode
Features
- STT: faster-whisper (local, free)
- TTS: Microsoft Edge TTS (free)
- Recording: Ctrl+B in CLI
- Cross-platform: Works in Telegram, Discord, etc.
Comparison: Hermes vs OpenClaw
Hermes Advantages
| Aspect | Winner | Reason |
|---|---|---|
| Personal companion | Hermes | Continuous learning, personalization |
| Repetitive task automation | Hermes | Skill learning adapts to workflows |
| Voice interaction | Hermes | Native voice support |
| Lightweight deployment | Hermes | 20MB vs 200MB+ |
| Signal support | Hermes | Better multi-platform |
| Local model support | Hermes | Works better with Ollama/llama.cpp |
OpenClaw Advantages
| Aspect | Winner | Reason |
|---|---|---|
| Multi-agent coordination | OpenClaw | Better fleet management |
| Browser automation | OpenClaw | More mature plugin ecosystem |
| Community/plugins | OpenClaw | 307k stars vs 6k |
| MCP ecosystem | OpenClaw | More mature |
Community Recommendation
"Use both. OpenClaw as the 'fleet commander' for multi-agent coordination, Hermes as your 'personal advisor' for one-on-one tasks."
User Experience Feedback
Positive
"Hermes optimizes for depth of learning. It is smaller, more opinionated, and built by a team that trains the underlying models."
"For repetitive workflows where agent improvement creates measurable value over time, Hermes is the stronger choice."
"It just works — installation to first conversation is minutes, not hours."
Areas for Improvement
- Token overhead transparency - Users surprised by costs
- Memory system education - Users expect automatic memory
- Local model guidance - Need better model recommendations
- Gateway debugging - Error messages can be cryptic
- Migration experience - OpenClaw migration has rough edges
Summary
Strengths:
- Self-improving skill system
- Excellent multi-platform support
- Strong memory architecture
- Good local model support
- Active development
Weaknesses:
- Token overhead can surprise users
- Some migration/tooling rough edges
- Documentation gaps for advanced features
- Memory system requires user education