# Claude Sonnet Feedback for Hermes Agent

**Source reference:** GitHub issues, community discussions, official docs

---

## Claude Sonnet 4.5/4.6 - Primary Recommendation

**Status:** Excellent performance, commonly used as default

### Token Usage Reality Check

**Source:** https://hermes-agent.ai/blog/hermes-agent-token-overhead

| Scenario | API Calls | Est. Cost (Sonnet 4.5) |
|----------|-----------|------------------------|
| Simple bug fix | 20 | ~$6 |
| Feature implementation | 100 | ~$34 |
| Large refactor | 500 | ~$187 |
| Full project build | 1,000 | ~$405 |

### Real-World Usage Example

**Source:** GitHub Issue #4379

**Single Evening Deployment (3 Active Sessions):**

| Session | Platform | Messages | Est. API Calls | Est. Input Tokens |
|---------|----------|----------|----------------|-------------------|
| Chat session | Telegram | 168 | ~84 | ~1.6M |
| Group chat | WhatsApp | 122 | ~61 | ~1.2M |
| Group chat | WhatsApp | 64 | ~32 | ~574K |
| **Total** | | **354** | **~207** | **~3.9M** |

---

## Token Overhead Analysis (All Models)

**Critical Finding:** On average, about 73% of the input tokens in each API request are fixed overhead (~13.9K tokens of tool definitions and system prompt).

| Component | Tokens | % of Request |
|-----------|--------|--------------|
| Tool definitions (31 tools) | 8,759 | 46.1% |
| System prompt (SOUL.md + skills) | 5,176 | 27.2% |
| Messages (conversation) | 3,000-8,775 | 26.7% avg |
| **Total per request** | **~17,000-23,000** | |

**Impact:** This overhead is the same whether you use Sonnet, Haiku, Llama, or any OpenRouter model.

---

## Performance Comparison

**Source:** https://www.buildmvpfast.com/blog/hermes-agent-v04-open-source-agent-infrastructure-2026

> "One developer reported that a task taking OpenClaw 50+ tool calls and steps took Hermes 5 correct tool calls and finished 2.5 minutes faster."

---

## Best Practices for Cost Management

### 1. Use Cheaper Models for Routine Tasks

Reserve Claude/GPT-4 for complex reasoning only:

- File organization → Use Kimi, MiniMax, DeepSeek
- Simple responses → Budget models
- Complex architecture → Claude Sonnet

### 2. Enable Caching (Where Available)

| Provider | Cache Support / Discount | Notes |
|----------|--------------------------|-------|
| DeepSeek | 90% off | Best option |
| Kimi K2.5 | 75% off | Good option |
| Anthropic | Full | Cache markers visible |
| OpenRouter | Partial | Depends on upstream |
| Gemini/GLM | None | Full price |

### 3. Keep Sessions Short

Start a fresh session for unrelated tasks:

```bash
hermes --fresh
```

---

## User Experience Feedback

### Positive

- Excellent tool calling reliability
- Strong reasoning for complex multi-step tasks
- Good context understanding

### Cost Concerns

> "4 million tokens in 2 hours of light usage" (Reddit user who quit)

**High-token triggers:**

- Terminal tool spawning
- Browser automation with screenshots
- Complex code execution with large file reads

---

## Configuration Tips

### Auxiliary Vision Model

For vision tasks, consider using a cheaper model:

```yaml
auxiliary:
  vision:
    provider: "openrouter"
    model: "google/gemini-2.5-flash"
```

Or use Codex for vision (ChatGPT Pro/Plus):

```yaml
auxiliary:
  vision:
    provider: "codex"  # Uses ChatGPT OAuth token
```

---

## Summary

Claude Sonnet performs very well with Hermes Agent, but users should keep the following in mind:

1. Every request carries a fixed overhead of ~13.9K tokens
2. Costs can accumulate quickly with active usage
3. It is best used selectively for complex tasks
4. Cheaper alternatives are worth considering for routine work
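
The fixed-overhead figures from the Token Overhead Analysis table can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the 5,065-token message size is an assumption, back-derived from the table (8,759 tool-definition tokens at 46.1% implies a total of roughly 19,000 tokens per request). It is not part of Hermes Agent.

```python
# Token counts taken from the Token Overhead Analysis table.
TOOL_DEFINITION_TOKENS = 8_759
SYSTEM_PROMPT_TOKENS = 5_176
FIXED_OVERHEAD_TOKENS = TOOL_DEFINITION_TOKENS + SYSTEM_PROMPT_TOKENS  # 13,935


def overhead_share(message_tokens: int) -> float:
    """Return the fraction of a request's input tokens that is fixed overhead."""
    if message_tokens < 0:
        raise ValueError("message_tokens must be non-negative")
    return FIXED_OVERHEAD_TOKENS / (FIXED_OVERHEAD_TOKENS + message_tokens)


def fixed_overhead_for_calls(api_calls: int) -> int:
    """Return the total fixed-overhead tokens sent across a number of API calls."""
    if api_calls < 0:
        raise ValueError("api_calls must be non-negative")
    return FIXED_OVERHEAD_TOKENS * api_calls


if __name__ == "__main__":
    # 3,000 and 8,775 are the table's message range; 5,065 is the assumed
    # average message size implied by the table's percentages.
    for message_tokens in (3_000, 5_065, 8_775):
        share = overhead_share(message_tokens)
        print(f"{message_tokens:>5} message tokens -> {share:.1%} fixed overhead")

    # Scenario size borrowed from the "Feature implementation" row (100 calls).
    print(f"Fixed overhead over 100 calls: {fixed_overhead_for_calls(100):,} tokens")
```

Under these assumptions the fixed share ranges from about 61% for the longest conversations in the table to about 82% for the shortest, with the 73% headline figure sitting at the assumed average message size.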