# General Local LLM Feedback for Hermes Agent

**Collection Date:** 2026-04-09
**Sources:** Reddit r/LocalLLaMA, r/LocalLLM, GitHub issues, blog posts, community discussions

---

## Overall Assessment

Hermes Agent is widely reported to work "way better" with local models than OpenClaw. However, users face challenges with configuration complexity and model selection.

---

## Positive Feedback

### Better Than OpenClaw for Local Models

**Source:** https://www.reddit.com/r/LocalLLM/comments/1rye221/anyone_working_with_hermes_agent/

> "its worknig better for me than openclaw, this i mean with local models, when i use openclaw i cant even load up 4b models, i am not sure why but i decided to see if the same problem would persist with hermes and i dint get this issue."

**Source:** https://www.reddit.com/r/LocalLLaMA/comments/1rwhi2h/running_hermes_agent_locally_with_lm_studio/

> "This Hermes agent already works way way better than Open Claw and it actually works pretty well locally. I have to be super careful about exposing this to the outside world because the model is not smart enough, probably, to catch sophisticated..."

### Architecture Appreciation

**Source:** https://www.reddit.com/r/LocalLLM/comments/1scglgq/i_looked_into_hermes_agent_architecture_to_dig/

> "It identified 11 websites from pure text and hit 60% testing WebArena tasks without tuning"

---

## Challenges and Issues

### Tool Calling Reliability

**Issue:** Models work initially but forget which tools to use after the first call
**Affected:** Smaller models (4B-7B range)

> "tool calls not always work i use ollama and qwen3.5:4b qwen2.5:7b and they all tool call once than they forget which one to use"

### Context Management Confusion

**Source:** https://www.reddit.com/r/LocalLLM/comments/1sc82o8/hermesagent_what_is_this_message_about/

> "Context exceeded your setting. Either your Hermes context or your llm server context setting for that particular model. By default context is usually set to something comically low."

A configuration sketch addressing the low server-side default appears after the token-overhead section below.

### System Prompt Size Concerns

**Source:** https://www.reddit.com/r/LocalLLaMA/comments/1rwhi2h/running_hermes_agent_locally_with_lm_studio/

> "Hermes has a huge system prompt. When I try to run it with Qwen-3.5 35B it's difficult..."

---

## Model-Specific Feedback

### Recommended for Local Use

1. **Qwen 3.5 27B** - Best overall performance
   - Requires: 24GB+ VRAM
   - Speed: ~25 t/s with proper quantization
   - Tool use: Excellent
2. **Qwen 3.5 14B** - Good balance
   - Requires: 16GB VRAM
   - Decent tool-use reliability
3. **Qwen 3.5 8B** - Minimum viable
   - Requires: 8GB VRAM
   - Tool use may be inconsistent

### Not Recommended

- Very small models (4B and below) for complex agent tasks
- Models without strong tool-calling fine-tuning

---

## Token Overhead Impact on Local Models

**Critical Issue:** Even local models incur roughly 13.9K tokens of fixed overhead per request.

**Source:** GitHub Issue #4379

| Component | Tokens |
|-----------|--------|
| Tool definitions (31 tools) | 8,759 |
| System prompt | 5,176 |
| **Fixed overhead (total)** | **~13,935** |

**Impact:** Local models with smaller context windows hit their limits quickly because of this overhead.
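To make the impact concrete, the arithmetic below is a minimal sketch built from the figures in the table above; the context-window sizes in the loop are illustrative values common among local models, not measured Hermes configurations.

```python
# Rough context-budget arithmetic for Hermes Agent's fixed per-request
# overhead (figures from GitHub Issue #4379, quoted above).
TOOL_DEFINITIONS = 8_759  # 31 tool schemas
SYSTEM_PROMPT = 5_176
FIXED_OVERHEAD = TOOL_DEFINITIONS + SYSTEM_PROMPT  # 13,935 tokens

# Illustrative context-window sizes; adjust to your own setup.
for window in (8_192, 16_384, 32_768, 131_072):
    remaining = window - FIXED_OVERHEAD
    print(f"{window:>7}-token window: {remaining:>7} tokens left for the conversation")
```

Note that an 8,192-token window is exhausted before the conversation even starts: the fixed overhead alone exceeds it by roughly 5.7K tokens.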
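On the remedy side, the "comically low" server default quoted under Context Management Confusion can be raised per request when the backend is Ollama, via the `num_ctx` option. The sketch below assumes a local Ollama server on its default port; the model name and the 16,384-token window are placeholders, not recommendations.

```python
# Minimal sketch: overriding Ollama's default context window per request.
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "qwen2.5:7b",  # placeholder; use whatever model you serve
        "messages": [{"role": "user", "content": "Hello"}],
        # num_ctx sets the context window for this request; it needs to
        # clear the ~13.9K fixed overhead with room to spare.
        "options": {"num_ctx": 16384},
        "stream": False,
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```

The same override can be made permanent with a `PARAMETER num_ctx` line in an Ollama Modelfile, and LM Studio exposes an equivalent context-length setting in its model configuration.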
---

## Community Suggestions

1. **Better documentation** for local model setup
2. **Recommended model list** with VRAM requirements
3. **Tool calling reliability benchmarks** by model size
4. **Reduced toolset option** for resource-constrained setups
5. **Better context management guidance**

---

## Summary Table

| Aspect | Rating | Notes |
|--------|--------|-------|
| Local model support | ⭐⭐⭐⭐⭐ | Better than alternatives |
| Setup ease | ⭐⭐⭐ | Requires technical knowledge |
| Tool calling (8B+) | ⭐⭐⭐⭐ | Good with the right models |
| Tool calling (4B) | ⭐⭐ | Inconsistent |
| Documentation | ⭐⭐⭐ | Improving, but gaps remain |
| Community support | ⭐⭐⭐⭐⭐ | Active and helpful |