Harnesses under analysis:

- **opencode** - Go-based coding agent
- **pi** - minimal terminal coding harness by Mario Zechner
- **hermes** - Nous Research agent
- **forgecode** - AI pair programmer with sub-agents

Each harness folder contains:

- `repo/` - source code from the respective repository
- `feedback/localllm/` - community feedback for local/smaller models
- `feedback/frontier/` - community feedback for frontier models

Research focus: tool handling, skills systems, prompt engineering, context management, and best practices for smaller/local models.
# General Local LLM Feedback for Hermes Agent
**Collection Date:** 2026-04-09

**Sources:** Reddit r/LocalLLaMA, r/LocalLLM, GitHub issues, blog posts, community discussions
---
## Overall Assessment
Hermes Agent is widely reported to work "way better" with local models than OpenClaw. However, users face challenges with configuration complexity and model selection.
---

## Positive Feedback
### Better Than OpenClaw for Local Models
**Source:** https://www.reddit.com/r/LocalLLM/comments/1rye221/anyone_working_with_hermes_agent/
> "its worknig better for me than openclaw, this i mean with local models, when i use openclaw i cant even load up 4b models, i am not sure why but i decided to see if the same problem would persist with hermes and i dint get this issue."
**Source:** https://www.reddit.com/r/LocalLLaMA/comments/1rwhi2h/running_hermes_agent_locally_with_lm_studio/
> "This Hermes agent already works way way better than Open Claw and it actually works pretty well locally. I have to be super careful about exposing this to the outside world because the model is not smart enough, probably, to catch sophisticated..."
### Architecture Appreciation

**Source:** https://www.reddit.com/r/LocalLLM/comments/1scglgq/i_looked_into_hermes_agent_architecture_to_dig/
> "It identified 11 websites from pure text and hit 60% testing WebArena tasks without tuning"
---

## Challenges and Issues

### Tool Calling Reliability
**Issue:** Models make the first tool call successfully, then lose track of which tools are available.

**Affected:** Smaller models (4B–7B range)
> "tool calls not always work i use ollama and qwen3.5:4b qwen2.5:7b and they all tool call once than they forget which one to use"
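The failure mode above can be mitigated harness-side. A hedged sketch (not Hermes' actual code; the tool names are illustrative): validate each tool call against the registered tool names and, on failure, build a corrective message that re-lists the valid tools for the model.

```python
# Hedged sketch: validate a model's tool call and, when a small model
# "forgets" which tools exist, produce a reminder listing valid names.
# Tool names here are hypothetical, not Hermes' real toolset.
import json

REGISTERED_TOOLS = {"read_file", "write_file", "run_shell"}

def check_tool_call(raw: str) -> tuple[bool, str]:
    """Return (ok, message). On failure, message re-lists valid tools."""
    valid = ", ".join(sorted(REGISTERED_TOOLS))
    try:
        call = json.loads(raw)
        name = call["name"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return False, f"Malformed tool call. Valid tools: {valid}"
    if name not in REGISTERED_TOOLS:
        return False, f"Unknown tool '{name}'. Valid tools: {valid}"
    return True, name
```

Feeding the failure message back as the tool result gives the model another chance instead of silently derailing the session.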
### Context Management Confusion

**Source:** https://www.reddit.com/r/LocalLLM/comments/1sc82o8/hermesagent_what_is_this_message_about/
> "Context exceeded your setting. Either your Hermes context or your llm server context setting for that particular model. By default context is usually set to something comically low."
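For the server side of that "comically low" default, here is one hedged example assuming an Ollama backend (the model tag is illustrative): raise the context window via a Modelfile rather than relying on the default.

```shell
# Hedged example (assumes an Ollama backend): write a Modelfile that
# raises the context window, which defaults far below agent needs.
cat > Modelfile <<'EOF'
FROM qwen2.5:7b
PARAMETER num_ctx 16384
EOF
# Then register it (requires a running Ollama daemon):
#   ollama create qwen2.5-16k -f Modelfile
cat Modelfile
```

The harness-side context setting must be raised to match, or the error persists regardless of the server config.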
### System Prompt Size Concerns

**Source:** https://www.reddit.com/r/LocalLLaMA/comments/1rwhi2h/running_hermes_agent_locally_with_lm_studio/
> "Hermes has a huge system prompt. When I try to run it with Qwen-3.5 35B it's difficult..."
---

## Model-Specific Feedback

### Recommended for Local Use
1. **Qwen 3.5 27B** - Best overall performance
   - Requires: 24GB+ VRAM
   - Speed: ~25 t/s with proper quantization
   - Tool use: excellent

2. **Qwen 3.5 14B** - Good balance
   - Requires: 16GB VRAM
   - Tool use: decent reliability

3. **Qwen 3.5 8B** - Minimum viable
   - Requires: 8GB VRAM
   - Tool use: may be inconsistent
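The VRAM figures above can be sanity-checked with back-of-envelope arithmetic. A rough sketch, not a precise formula (it uses a flat allowance for KV cache and runtime buffers, which in reality grow with context length):

```python
# Rough VRAM estimator for quantized local models. The flat 2 GB
# overhead allowance is an assumption, not a measured figure.
def vram_gb(params_b: float, bits: int, overhead_gb: float = 2.0) -> float:
    """Approximate VRAM: weights at the given quantization width,
    plus a flat allowance for KV cache and runtime buffers."""
    weights_gb = params_b * bits / 8  # GB per billion params at N bits
    return round(weights_gb + overhead_gb, 1)

for name, params in [("27B", 27), ("14B", 14), ("8B", 8)]:
    print(f"Qwen 3.5 {name} @ Q4: ~{vram_gb(params, 4)} GB VRAM")
```

At 4-bit quantization the 27B estimate lands around 15.5 GB, consistent with the community's "24GB+ to run it comfortably with long context" guidance.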
### Not Recommended
- Very small models (4B and below) for complex agent tasks
- Models without good tool-calling fine-tuning
---
## Token Overhead Impact on Local Models

**Critical Issue:** Even local models face ~13.9K tokens of fixed overhead per request

**Source:** GitHub Issue #4379
| Component | Tokens |
|-----------|--------|
| Tool definitions (31 tools) | 8,759 |
| System prompt | 5,176 |
| **Total fixed overhead** | **~13,935** |
**Impact:** Local models with smaller context windows hit limits quickly due to this overhead.
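The arithmetic makes the impact concrete. Using the figures from the table above:

```python
# How much usable context remains after the fixed per-request
# overhead reported in GitHub Issue #4379.
TOOL_DEFS = 8_759       # 31 tool definitions
SYSTEM_PROMPT = 5_176
FIXED_OVERHEAD = TOOL_DEFS + SYSTEM_PROMPT  # 13,935 tokens

def usable_context(window: int) -> int:
    """Tokens left for conversation and files after fixed overhead."""
    return max(0, window - FIXED_OVERHEAD)

for window in (4_096, 8_192, 16_384, 32_768):
    print(f"{window:>6}-token window -> {usable_context(window):>6} usable")
```

A common local default of 8,192 tokens leaves nothing at all; even 16,384 leaves under 2,500 tokens of working room, which explains why local users hit limits almost immediately.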
---

## Community Suggestions
1. **Better documentation** for local model setup
2. **Recommended model list** with VRAM requirements
3. **Tool calling reliability benchmarks** by model size
4. **Reduced toolset option** for resource-constrained setups
5. **Better context management guidance**
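Suggestion 4 is the most direct lever on the overhead problem. A hedged sketch of what a reduced-toolset option could look like; the tool names and per-tool token counts below are illustrative, not Hermes' real definitions:

```python
# Hypothetical reduced-toolset filter: expose only a core subset of
# tools to resource-constrained local models. All figures illustrative.
FULL_TOOLSET = {  # name -> approx. definition size in tokens
    "read_file": 250, "write_file": 260, "run_shell": 300,
    "web_search": 400, "browser": 900, "image_gen": 700,
}
CORE_TOOLS = {"read_file", "write_file", "run_shell"}

def reduced_toolset(tools: dict[str, int], keep: set[str]) -> dict[str, int]:
    """Keep only the named tools' definitions."""
    return {name: size for name, size in tools.items() if name in keep}

minimal = reduced_toolset(FULL_TOOLSET, CORE_TOOLS)
saved = sum(FULL_TOOLSET.values()) - sum(minimal.values())
print(f"Dropped {len(FULL_TOOLSET) - len(minimal)} tools, saved ~{saved} tokens")
```

Since tool definitions are the single largest overhead component (8,759 of ~13,935 tokens), even a coarse filter like this would meaningfully extend small context windows.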
---
|
|
|
|
## Summary Table
|
|
|
|
| Aspect | Rating | Notes |
|--------|--------|-------|
| Local model support | ⭐⭐⭐⭐⭐ | Better than alternatives |
| Setup ease | ⭐⭐⭐ | Requires technical knowledge |
| Tool calling (8B+) | ⭐⭐⭐⭐ | Good with the right models |
| Tool calling (4B) | ⭐⭐ | Inconsistent |
| Documentation | ⭐⭐⭐ | Improving but gaps remain |
| Community support | ⭐⭐⭐⭐⭐ | Active and helpful |