diff --git a/README.md b/README.md
new file mode 100644
index 0000000..b7b24e5
--- /dev/null
+++ b/README.md
@@ -0,0 +1,61 @@
+# Coding Harness Feedback Analysis
+
+Research on four coding agent harnesses to understand what works best for different model sizes, particularly smaller/local models.
+
+## Folder Structure
+
+```
+├── AGENTS.md                    # Project overview and data collection strategy
+├── Research*.md                 # Prompt research and orchestration strategies
+│
+├── opencode/                    # Go-based coding agent
+│   ├── feedback/
+│   │   ├── frontier/            # GPT-5.4, Claude Opus, Gemini feedback
+│   │   └── localllm/            # Local model feedback (prompting, tool handling)
+│   └── repo/                    # Source code (submodule)
+│
+├── pi/                          # Minimal terminal coding harness by Mario Zechner
+│   ├── feedback/
+│   │   ├── frontier/            # (empty - in progress)
+│   │   └── localllm/            # (empty - in progress)
+│   └── repo/                    # Source code (submodule)
+│
+├── hermes/                      # Nous Research's agent
+│   ├── feedback/
+│   │   ├── frontier/            # Claude, GPT, budget provider feedback
+│   │   ├── localllm/            # Qwen, Gemma, local model feedback
+│   │   └── general/             # Bug reports, benchmarks, features
+│   └── repo/                    # Source code (submodule)
+│
+└── forgecode/                   # AI pair programmer with sub-agents
+    ├── feedback/
+    │   ├── frontier/            # GPT-5.4, Claude, Gemini, pricing, security
+    │   └── localllm/            # Qwen, MiniMax, GLM, DeepSeek feedback
+    └── repo/                    # Source code (submodule)
+```
+
+## Quick Navigation
+
+| Harness | Feedback Location | Key Topics |
+|---------|------------------|------------|
+| **opencode** | `opencode/feedback/` | Tool calling, local model prompting |
+| **pi** | `pi/feedback/` | (Being researched) |
+| **hermes** | `hermes/feedback/` | Terminal-bench results, local setup |
+| **forgecode** | `forgecode/feedback/` | Pricing, benchmarks, security |
+
+## Feedback Format
+
+Each feedback file includes:
+- Model name/size/provider
+- Task performance or benchmark results
+- Issues encountered
+- What worked well
+- Source reference (URL, Discord, GitHub issues)
+
+## Research Focus
+
+- Tool handling and capabilities
+- Skills system effectiveness
+- Prompt engineering strategies
+- Context management
+- Error recovery
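
The "Feedback Format" list in the README above reads like a record schema, so one way to keep feedback files consistent is to model it as a typed record. The sketch below is only an illustration and is not part of this diff: the `FeedbackEntry` class, its field names, and the `missing_fields` helper are all hypothetical, chosen to mirror the five bullets.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FeedbackEntry:
    """Hypothetical record mirroring the README's 'Feedback Format' bullets."""

    model: str                                            # model name/size/provider
    results: str                                          # task performance or benchmark results
    issues: List[str] = field(default_factory=list)       # issues encountered
    worked_well: List[str] = field(default_factory=list)  # what worked well
    source: str = ""                                      # URL, Discord, or GitHub issue reference


def missing_fields(entry: FeedbackEntry) -> List[str]:
    """Return the names of fields that are still empty, in declaration order."""
    return [name for name, value in vars(entry).items() if not value]


if __name__ == "__main__":
    # Placeholder values only; no real feedback data is implied.
    draft = FeedbackEntry(model="example-model", results="placeholder result")
    print(missing_fields(draft))
```

A check like `missing_fields` could be run over each entry before it is filed under a harness's `feedback/` folder, so that incomplete entries are flagged early.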