mid_model_research/hermes/feedback/README.md
sleepy 51123212c4 Initial commit: coding harness feedback analysis
Harnesses under analysis:
- opencode (Go-based coding agent)
- pi (minimal terminal coding harness by Mario Zechner)
- hermes (Nous Research agent)
- forgecode (AI pair programmer with sub-agents)

Each harness folder contains:
- repo/: Source code from respective repositories
- feedback/localllm/: Community feedback for local/smaller models
- feedback/frontier/: Community feedback for frontier models

Research focus: Tool handling, skills systems, prompt engineering,
context management, and best practices for smaller/local models.
2026-04-09 15:13:45 +02:00


Hermes Agent Feedback Collection

Last Updated: 2026-04-09
Purpose: Community feedback and performance data for the hermes-agent harness


Folder Structure

feedback/
├── localllm/          # Community feedback for local/smaller models
├── frontier/          # Community feedback for frontier models
├── general/           # General feature feedback, issues, benchmarks
└── README.md          # This file

Feedback Format

Each feedback file includes:

  • Model/Feature used (name, size, provider)
  • Benchmark results or task performance
  • Issues encountered
  • What worked well
  • Source reference: URL or site where feedback came from

Quick Navigation

Local/Small Models

| File | Topic |
|------|-------|
| localllm/qwen-models-feedback.md | Qwen 3.5 performance |
| localllm/gemma-models-feedback.md | Gemma 4 comparison |
| localllm/local-setup-issues.md | Setup challenges |
| localllm/general-local-llm-feedback.md | Overview |

Frontier Models

| File | Topic |
|------|-------|
| frontier/claude-sonnet-feedback.md | Claude performance |
| frontier/openai-gpt-feedback.md | OpenAI integration |
| frontier/budget-providers-feedback.md | Kimi, DeepSeek, MiniMax |
| frontier/general-frontier-feedback.md | Overview |

General

| File | Topic |
|------|-------|
| general/bug-reports-and-issues.md | Known issues |
| general/feature-feedback.md | Features & UX |
| general/terminal-bench-benchmarks.md | Benchmarks |

Key Findings Summary

Token Overhead (All Models)

Critical: fixed overhead (~13.9K tokens) accounts for about 73% of a typical API call

| Component | Tokens |
|-----------|--------|
| Tool definitions (31 tools) | 8,759 |
| System prompt | 5,176 |
| Fixed overhead | ~13,935 |

Impact: Even simple queries can cost 15K-20K tokens
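
As a rough illustration of how the fixed overhead dominates small calls, the sketch below computes the overhead share for a few call sizes. The two component counts come from the table above; the variable token counts (user message, history, and response) are hypothetical values chosen only for illustration.

```python
# Token counts taken from the "Token Overhead" table.
TOOL_DEFINITION_TOKENS = 8_759   # 31 tool definitions
SYSTEM_PROMPT_TOKENS = 5_176
FIXED_OVERHEAD_TOKENS = TOOL_DEFINITION_TOKENS + SYSTEM_PROMPT_TOKENS  # 13,935


def overhead_share(variable_tokens: int) -> float:
    """Return the fraction of a call's tokens that is fixed overhead.

    variable_tokens is everything that is not tool definitions or system
    prompt (user message, history, response); the values used below are
    hypothetical.
    """
    if variable_tokens < 0:
        raise ValueError("variable_tokens must be non-negative")
    total = FIXED_OVERHEAD_TOKENS + variable_tokens
    return FIXED_OVERHEAD_TOKENS / total


if __name__ == "__main__":
    for variable in (1_000, 5_000, 20_000):
        total = FIXED_OVERHEAD_TOKENS + variable
        print(f"{variable:>6} variable tokens -> {total:>6} total, "
              f"{overhead_share(variable):.0%} overhead")
```

With about 5,000 variable tokens the share comes out at roughly 74%, which is in line with the 73% figure and the 15K-20K range quoted above.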

Best Local Models

| Model | VRAM | Rating |
|-------|------|--------|
| Qwen 3.5 27B | 24 GB | |
| Qwen 3.5 14B | 16 GB | |
| Qwen 3.5 8B | 8 GB | |

Cost-Effective Providers

| Provider | Cache Discount | Use Case |
|----------|----------------|----------|
| DeepSeek | 90% | Maximum savings |
| Kimi K2.5 | 75% | Daily driver |
| MiniMax | None | Fast, capable |
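
To see why the cache discount matters given the fixed overhead, here is a minimal sketch that estimates how many full-price-equivalent input tokens a single call is billed for. It assumes the discount applies to cached input tokens, that the whole fixed overhead (~13,935 tokens) is served from cache, and that the rest of the call is billed at full price. Those assumptions, the function name, and the 5,000-token example are illustrative and not taken from any provider's documentation.

```python
FIXED_OVERHEAD_TOKENS = 13_935  # tool definitions + system prompt (see Token Overhead)

# Cache discounts from the table above, as fractions of the normal input price.
CACHE_DISCOUNT = {"DeepSeek": 0.90, "Kimi K2.5": 0.75, "MiniMax": 0.0}


def billed_input_equivalent(provider: str, variable_tokens: int) -> float:
    """Estimate full-price-equivalent input tokens for one cache-hit call.

    Assumption: the entire fixed overhead is a cache hit and only that part
    receives the discount; variable tokens are billed at full price.
    """
    discount = CACHE_DISCOUNT[provider]
    return FIXED_OVERHEAD_TOKENS * (1.0 - discount) + variable_tokens


if __name__ == "__main__":
    for provider in CACHE_DISCOUNT:
        equivalent = billed_input_equivalent(provider, 5_000)
        print(f"{provider:<10} {equivalent:>10,.1f} full-price-equivalent tokens")
```

Under these assumptions the discount only reduces the cached prefix, so the savings are largest on short calls where the fixed overhead makes up most of the input.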

Critical Issues

| Issue | Severity | Status |
|-------|----------|--------|
| #4146 - Sandbox bypass | Critical | Open |
| #1071 - llama-server compatibility | Critical | Fix ready |
| #4469 - Message queue bug | High | Open |

Contributing

To add feedback:

  1. Create a new file in the appropriate folder
  2. Follow the feedback format
  3. Include source URLs
  4. Update this README if needed
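
The steps above could be scripted with something like the following sketch, which writes a skeleton feedback file containing the sections listed under Feedback Format. The helper names, the heading syntax, and the `TODO` placeholders are assumptions made for illustration; the repository does not prescribe a tool or template for this.

```python
from pathlib import Path

# Section titles mirror the "Feedback Format" list; the markdown heading
# style used in the skeleton is an assumption, not a documented requirement.
SECTIONS = (
    "Model/Feature used (name, size, provider)",
    "Benchmark results or task performance",
    "Issues encountered",
    "What worked well",
    "Source reference",
)
FOLDERS = ("localllm", "frontier", "general")


def render_skeleton(title: str) -> str:
    """Build the text of an empty feedback file with one heading per section."""
    lines = [f"# {title}", ""]
    for section in SECTIONS:
        lines += [f"## {section}", "", "TODO", ""]
    return "\n".join(lines)


def create_feedback_file(root: Path, folder: str, name: str, title: str) -> Path:
    """Write a skeleton file to root/folder/name, refusing to overwrite."""
    if folder not in FOLDERS:
        raise ValueError(f"folder must be one of {FOLDERS}, got {folder!r}")
    path = root / folder / name
    if path.exists():
        raise FileExistsError(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_skeleton(title), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical file name and title, shown for illustration only.
    created = create_feedback_file(
        Path("feedback"), "localllm", "example-model-feedback.md", "Example Model Feedback"
    )
    print(f"Created {created}")
```

Refusing to overwrite an existing file keeps the script from clobbering feedback that someone else has already collected.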

External Resources