# Coding Harness Feedback Analysis

**Last Updated:** April 9, 2026

Research analyzing four coding agent harnesses (opencode, pi, hermes, forgecode) to understand what works best for local/smaller models (7B-27B parameters).

## What Was Done

1. **Repository Analysis**: Each harness was analyzed for prompts, tools, parsing, and skills-system suitability for local models
2. **Community Feedback Synthesis**: GitHub issues, Reddit discussions, and Discord reports compiled per harness
3. **Research Integration**: Findings cross-referenced with agent systems research (prompting, orchestration, evaluation)

## Key Output

**`conclusion.md`** — Comprehensive analysis covering:

- What's working well across all four harnesses
- Critical gaps for local model compatibility
- Research-backed recommendations with citations
- Priority fixes (immediate, short-term, medium-term)

## Folder Structure

```
├── conclusion.md          # Main findings and recommendations
├── AGENTS.md              # Original project scope and strategy
├── opencode/
│   ├── REPO_FEEDBACK.md   # Repository analysis (prompts, tools, parsing)
│   └── feedback/          # Community feedback by model tier
│       ├── frontier/      # GPT-5.4, Claude, Gemini
│       └── localllm/      # Qwen, Gemma, local model issues
├── pi/
│   ├── REPO_FEEDBACK.md   # Repository analysis
│   └── feedback/
│       ├── frontier/      # Frontier model feedback
│       └── localllm/      # Local model feedback
├── hermes/
│   ├── REPO_FEEDBACK.md   # Repository analysis
│   └── feedback/
│       ├── frontier/      # Claude, GPT feedback
│       ├── localllm/      # Qwen, Gemma, local setup
│       └── general/       # Bug reports, benchmarks
└── forgecode/
    ├── REPO_FEEDBACK.md   # Repository analysis
    └── feedback/
        ├── frontier/      # GPT-5.4, Claude, pricing
        └── localllm/      # Qwen, MiniMax, GLM, DeepSeek
```

## Quick Reference

| Harness | Best For | Key Limitation |
|---------|----------|----------------|
| **pi-mono** | Local models (7B+) | Minimal overhead, needs JSON retry layer |
| **hermes** | Frontier & 27B+ | 14K token overhead, needs tiered toolsets |
| **forgecode** | Sub-agent workflows | Multiple system messages break Qwen3.5 |
| **opencode** | Frontier models | Verbose prompts, no JSON repair |

## Research Sources

Analysis cross-references findings from:

- SOLVE-Med / MATA (small-model orchestration)
- ATLAS (generate-verify-repair with 14B models)
- StateFlow (FSM-based agent loops)
- JetBrains (observation masking vs summarization)
- Anthropic (Building Effective AI Agents)
- Anthropic (Harness Design for Long-Running Apps)

See `Research*.md` for full research notes.
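The "JSON retry layer" and "JSON repair" gaps flagged in the Quick Reference refer to recovering when a local model emits almost-valid tool-call JSON (markdown fences, trailing commas). A minimal sketch of what such a layer could look like — the function name and repair rules are hypothetical, not taken from any of the four harnesses:

```python
import json

def parse_tool_call(raw: str, max_retries: int = 2):
    """Attempt to parse a model's tool-call output as JSON,
    applying simple textual repairs between attempts. Returns
    the parsed object, or None if all attempts fail (at which
    point a harness might re-prompt the model)."""
    attempt = raw
    for _ in range(max_retries + 1):
        try:
            return json.loads(attempt)
        except json.JSONDecodeError:
            # Common local-model failure modes:
            # markdown code fences and trailing commas.
            attempt = attempt.strip()
            if attempt.startswith("```"):
                attempt = attempt.strip("`")
                if attempt.startswith("json"):
                    attempt = attempt[4:]
            attempt = attempt.replace(",}", "}").replace(",]", "]")
    return None

# A fenced, trailing-comma tool call that strict parsing rejects:
print(parse_tool_call('```json\n{"tool": "read", "path": "a.txt",}\n```'))
```

Real repair layers (e.g. grammar-constrained decoding or a dedicated repair prompt) are more robust, but even rules this simple cover a large share of small-model formatting failures.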