# General Local LLM Feedback for Hermes Agent
**Collection Date:** 2026-04-09
**Sources:** Reddit r/LocalLLaMA, r/LocalLLM, GitHub issues, blog posts, community discussions
---
## Overall Assessment
Hermes Agent is widely reported to work "way better" with local models than OpenClaw. However, users face challenges with configuration complexity and model selection.
---
## Positive Feedback
### Better Than OpenClaw for Local Models
**Source:** https://www.reddit.com/r/LocalLLM/comments/1rye221/anyone_working_with_hermes_agent/
> "its worknig better for me than openclaw, this i mean with local models, when i use openclaw i cant even load up 4b models, i am not sure why but i decided to see if the same problem would persist with hermes and i dint get this issue."
**Source:** https://www.reddit.com/r/LocalLLaMA/comments/1rwhi2h/running_hermes_agent_locally_with_lm_studio/
> "This Hermes agent already works way way better than Open Claw and it actually works pretty well locally. I have to be super careful about exposing this to the outside world because the model is not smart enough, probably, to catch sophisticated..."
### Architecture Appreciation
**Source:** https://www.reddit.com/r/LocalLLM/comments/1scglgq/i_looked_into_hermes_agent_architecture_to_dig/
> "It identified 11 websites from pure text and hit 60% testing WebArena tasks without tuning"
---
## Challenges and Issues
### Tool Calling Reliability
**Issue:** Models work initially but forget which tools to use after the first call
**Affected:** Smaller models (4B, 7B range)
> "tool calls not always work i use ollama and qwen3.5:4b qwen2.5:7b and they all tool call once than they forget which one to use"
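One workaround users in this situation could try is re-injecting a short tool reminder before each model call, so small models keep seeing the available tool names in recent context. The sketch below is hypothetical, not Hermes' actual message handling; the tool names and message shape are illustrative.

```python
# Illustrative tool list; real harnesses attach full JSON schemas per tool.
TOOLS = {
    "read_file": "Read a file from the workspace",
    "write_file": "Write content to a file",
    "run_shell": "Execute a shell command",
}

def with_tool_reminder(messages: list[dict]) -> list[dict]:
    """Return a copy of the chat history with a short tool reminder appended,
    so the model re-reads the available tool names on every turn."""
    reminder = "Available tools: " + ", ".join(
        f"{name} ({desc})" for name, desc in TOOLS.items()
    )
    return messages + [{"role": "system", "content": reminder}]

history = [{"role": "user", "content": "Fix the failing test in utils.py"}]
prompt = with_tool_reminder(history)
```

The cost is a few dozen extra tokens per turn, which is usually cheaper than a failed tool call on a 4B-7B model.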
### Context Management Confusion
**Source:** https://www.reddit.com/r/LocalLLM/comments/1sc82o8/hermesagent_what_is_this_message_about/
> "Context exceeded your setting. Either your Hermes context or your llm server context setting for that particular model. By default context is usually set to something comically low."
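The "comically low" default the quote refers to can be raised on the server side. For Ollama specifically, one way is a custom Modelfile that sets `num_ctx` (the model tag and 16384 figure below are illustrative, not a recommendation):

```
# Modelfile: raise the context window for a local model
FROM qwen2.5:7b
PARAMETER num_ctx 16384
```

Registering it with `ollama create qwen2.5-7b-16k -f Modelfile` creates a variant with the larger window. Note the Hermes-side context setting must be raised to match, or the same message will reappear.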
### System Prompt Size Concerns
**Source:** https://www.reddit.com/r/LocalLLaMA/comments/1rwhi2h/running_hermes_agent_locally_with_lm_studio/
> "Hermes has a huge system prompt. When I try to run it with Qwen-3.5 35B it's difficult..."
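To see how much of a model's window a prompt like this eats, the common ~4-characters-per-token heuristic gives a quick estimate. This is a crude approximation (real counts depend on the model's tokenizer), and the stand-in prompt below is obviously not Hermes' actual system prompt:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 chars/token rule of thumb."""
    return max(1, len(text) // 4)

# Stand-in for a large system prompt; the real one is several thousand tokens.
system_prompt = "You are Hermes, a coding agent. " * 500
print(estimate_tokens(system_prompt))
```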
---
## Model-Specific Feedback
### Recommended for Local Use
1. **Qwen 3.5 27B** - Best overall performance
- Requires: 24GB+ VRAM
- Speed: ~25 t/s with proper quantization
- Tool use: Excellent
2. **Qwen 3.5 14B** - Good balance
- Requires: 16GB VRAM
- Decent tool use reliability
3. **Qwen 3.5 8B** - Minimum viable
- Requires: 8GB VRAM
- Tool use may be inconsistent
### Not Recommended
- Very small models (4B and below) for complex agent tasks
- Models without good tool calling fine-tuning
---
## Token Overhead Impact on Local Models
**Critical Issue:** Even local models incur a fixed overhead of roughly 13.9K tokens per request
**Source:** GitHub Issue #4379
| Component | Tokens |
|-----------|--------|
| Tool definitions (31 tools) | 8,759 |
| System prompt | 5,176 |
| Fixed overhead | ~13,935 |
**Impact:** Local models with smaller context windows hit limits quickly due to this overhead.
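The arithmetic makes the impact concrete. Using the figures from the table, this sketch shows how much room is left at common local context sizes (an 8K window cannot even fit the fixed overhead):

```python
TOOL_DEF_TOKENS = 8_759      # 31 tool definitions (from issue #4379)
SYSTEM_PROMPT_TOKENS = 5_176
OVERHEAD = TOOL_DEF_TOKENS + SYSTEM_PROMPT_TOKENS  # 13,935 fixed tokens

for window in (8_192, 16_384, 32_768):
    free = window - OVERHEAD  # negative means the request cannot fit at all
    pct = 100 * OVERHEAD / window
    print(f"{window:>6} ctx: {free:>6} tokens free ({pct:.0f}% eaten by overhead)")
```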
---
## Community Suggestions
1. **Better documentation** for local model setup
2. **Recommended model list** with VRAM requirements
3. **Tool calling reliability benchmarks** by model size
4. **Reduced toolset option** for resource-constrained setups
5. **Better context management guidance**
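Suggestion 4 amounts to filtering the tool definitions before building the request, which would reclaim most of the ~8.8K tokens of tool-definition overhead. The sketch below is hypothetical: the tool names and definition dicts are placeholders, not Hermes' actual schema.

```python
# Placeholder tool registry; a real one maps names to full JSON schemas.
ALL_TOOLS = {
    "read_file": {"description": "Read a file"},
    "write_file": {"description": "Write a file"},
    "run_shell": {"description": "Run a shell command"},
    "web_search": {"description": "Search the web"},
    "browser": {"description": "Drive a headless browser"},
}

CORE_TOOLS = {"read_file", "write_file", "run_shell"}

def reduced_toolset(tools: dict, keep: set) -> dict:
    """Keep only whitelisted tools, dropping the rest from the request."""
    return {name: spec for name, spec in tools.items() if name in keep}

request_tools = reduced_toolset(ALL_TOOLS, CORE_TOOLS)
print(sorted(request_tools))  # → ['read_file', 'run_shell', 'write_file']
```

A resource-constrained setup could expose such a whitelist as a config option, trading capability for context headroom.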
---
## Summary Table
| Aspect | Rating | Notes |
|--------|--------|-------|
| Local model support | ⭐⭐⭐⭐⭐ | Better than alternatives |
| Setup ease | ⭐⭐⭐ | Requires technical knowledge |
| Tool calling (8B+) | ⭐⭐⭐⭐ | Good with right models |
| Tool calling (4B) | ⭐⭐ | Inconsistent |
| Documentation | ⭐⭐⭐ | Improving but gaps remain |
| Community support | ⭐⭐⭐⭐⭐ | Active and helpful |