sleepy 51123212c4 Initial commit: coding harness feedback analysis

Harnesses under analysis:
- opencode (Go-based coding agent)
- pi (minimal terminal coding harness by Mario Zechner)
- hermes (Nous Research agent)
- forgecode (AI pair programmer with sub-agents)

Each harness folder contains:
- repo/: Source code from respective repositories
- feedback/localllm/: Community feedback for local/smaller models
- feedback/frontier/: Community feedback for frontier models

Research focus: Tool handling, skills systems, prompt engineering,
context management, and best practices for smaller/local models.

2026-04-09 15:13:45 +02:00

3.2 KiB

Hermes Agent Feedback Collection

Last Updated: 2026-04-09
Purpose: Community feedback and performance data for the hermes-agent harness

Folder Structure

feedback/
├── localllm/          # Community feedback for local/smaller models
├── frontier/          # Community feedback for frontier models
├── general/           # General feature feedback, issues, benchmarks
└── README.md          # This file

Feedback Format

Each feedback file includes:

Model/Feature used (name, size, provider)
Benchmark results or task performance
Issues encountered
What worked well
Source reference: URL or site where feedback came from
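The five fields above can be turned into a reusable skeleton and a quick completeness check. This is a minimal sketch: the README lists the fields but does not prescribe a file layout, so the plain-text layout and the helper names `feedback_skeleton` and `missing_fields` are hypothetical.

```python
# Fields listed under "Feedback Format"; the file layout below is an assumption.
FEEDBACK_FIELDS = [
    "Model/Feature used",
    "Benchmark results or task performance",
    "Issues encountered",
    "What worked well",
    "Source reference",
]


def feedback_skeleton(title: str) -> str:
    """Build a plain-text skeleton with one empty section per required field."""
    lines = [title, ""]
    for field in FEEDBACK_FIELDS:
        lines += [f"{field}:", "", ""]
    return "\n".join(lines).rstrip() + "\n"


def missing_fields(text: str) -> list[str]:
    """Return the required fields that do not appear anywhere in the text."""
    return [field for field in FEEDBACK_FIELDS if field not in text]


if __name__ == "__main__":
    draft = feedback_skeleton("Example model feedback")
    print(draft)
    print("Missing fields:", missing_fields(draft))
```

Running `missing_fields` on a draft before committing it is one way to catch a forgotten source reference early.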

Local/Small Models

File	Topic
localllm/qwen-models-feedback.md	Qwen 3.5 performance
localllm/gemma-models-feedback.md	Gemma 4 comparison
localllm/local-setup-issues.md	Setup challenges
localllm/general-local-llm-feedback.md	Overview

Frontier Models

File	Topic
frontier/claude-sonnet-feedback.md	Claude performance
frontier/openai-gpt-feedback.md	OpenAI integration
frontier/budget-providers-feedback.md	Kimi, DeepSeek, MiniMax
frontier/general-frontier-feedback.md	Overview

General

File	Topic
general/bug-reports-and-issues.md	Known issues
general/feature-feedback.md	Features & UX
general/terminal-bench-benchmarks.md	Benchmarks

Key Findings Summary

Token Overhead (All Models)

Critical: about 73% of a typical API call is fixed overhead (~13.9K tokens)

Component	Tokens
Tool definitions (31 tools)	8,759
System prompt	5,176
Total fixed overhead	13,935

Impact: Even simple queries can cost 15K-20K tokens
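The figures above can be checked with a little arithmetic. The sketch below sums the two components from the table and shows how the overhead share changes with call size. The helper name `overhead_share` and the example call sizes are illustrative assumptions, not measurements from hermes-agent.

```python
# Token counts taken from the table above.
TOOL_DEFINITION_TOKENS = 8_759
SYSTEM_PROMPT_TOKENS = 5_176
FIXED_OVERHEAD_TOKENS = TOOL_DEFINITION_TOKENS + SYSTEM_PROMPT_TOKENS  # 13,935


def overhead_share(total_call_tokens: int) -> float:
    """Return the fraction of a call's tokens consumed by fixed overhead."""
    if total_call_tokens < FIXED_OVERHEAD_TOKENS:
        raise ValueError("a call cannot be smaller than its fixed overhead")
    return FIXED_OVERHEAD_TOKENS / total_call_tokens


if __name__ == "__main__":
    # Hypothetical call sizes spanning the 15K-20K range quoted above.
    for total in (15_000, 19_000, 20_000):
        print(f"{total:>6} tokens: {overhead_share(total):.0%} overhead")
```

Under these assumptions, a call of roughly 19K tokens corresponds to the 73% share quoted above; shorter calls are dominated even more heavily by the fixed portion.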

Best Local Models

Model	VRAM	Rating
Qwen 3.5 27B	24GB	⭐⭐⭐⭐⭐
Qwen 3.5 14B	16GB	⭐⭐⭐⭐
Qwen 3.5 8B	8GB	⭐⭐⭐
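The table can be read as a simple lookup: pick the highest-rated model that fits the available VRAM. The sketch below assumes the VRAM column is a minimum requirement, which the table does not state explicitly; `best_model_for` is a hypothetical helper.

```python
# (model, VRAM in GB, star rating) copied from the table above.
LOCAL_MODELS = [
    ("Qwen 3.5 27B", 24, 5),
    ("Qwen 3.5 14B", 16, 4),
    ("Qwen 3.5 8B", 8, 3),
]


def best_model_for(vram_gb: int):
    """Return the highest-rated model whose listed VRAM fits, or None."""
    candidates = [model for model in LOCAL_MODELS if model[1] <= vram_gb]
    if not candidates:
        return None
    # Assumption: the star rating is the only ranking criterion.
    return max(candidates, key=lambda model: model[2])[0]


if __name__ == "__main__":
    for vram in (6, 8, 16, 24):
        print(f"{vram} GB -> {best_model_for(vram)}")
```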

Cost-Effective Providers

Provider	Cache Discount	Use Case
DeepSeek	90%	Maximum savings
Kimi K2.5	75%	Daily driver
MiniMax	None	Fast, capable
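To see why the cache discount matters for this harness, the discount can be applied to the fixed overhead from the token table. This is a rough sketch that assumes the discount applies to input tokens served from the provider's prompt cache and that the whole fixed overhead can be cached; actual billing rules vary by provider, and `effective_overhead_tokens` is a hypothetical helper.

```python
FIXED_OVERHEAD_TOKENS = 13_935  # from the token overhead table

# Cache discounts as listed in the provider table (fraction off the input price).
CACHE_DISCOUNT = {"DeepSeek": 0.90, "Kimi K2.5": 0.75, "MiniMax": 0.0}


def effective_overhead_tokens(provider: str, cache_hit_rate: float = 1.0) -> float:
    """Full-price-equivalent tokens billed for the fixed overhead on one call."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    discount = CACHE_DISCOUNT[provider]
    cached = FIXED_OVERHEAD_TOKENS * cache_hit_rate
    uncached = FIXED_OVERHEAD_TOKENS - cached
    # Cached tokens are billed at the discounted rate, the rest at full price.
    return uncached + cached * (1.0 - discount)


if __name__ == "__main__":
    for name in CACHE_DISCOUNT:
        print(f"{name}: {effective_overhead_tokens(name):,.0f} equivalent tokens")
```

With a full cache hit, the 13,935 overhead tokens cost roughly the same as 1,394 full-price tokens at a 90% discount, while a provider with no discount bills all 13,935.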

Critical Issues

Issue	Severity	Status
#4146 - Sandbox bypass	Critical	Open
#1071 - llama-server compatibility	Critical	Fix ready
#4469 - Message queue bug	High	Open

Contributing

To add feedback:

Create a new file in the appropriate folder
Follow the feedback format
Include source URLs
Update this README if needed

External Resources

GitHub: https://github.com/NousResearch/hermes-agent
Docs: https://hermes-agent.nousresearch.com/
Discord: Community discussions
Reddit: r/LocalLLaMA, r/LocalLLM
