sleepy 2623737ad2 Add pi (pi-mono) feedback analysis

- Comprehensive feedback document covering tool handling, UX, performance
- Frontier model feedback (Claude, GPT, Gemini)
- Local LLM feedback (context window issues, prompting strategies)
- Source references from GitHub issues and community

2026-04-09 15:40:56 +02:00

forgecode

Initial commit: coding harness feedback analysis

2026-04-09 15:13:45 +02:00

hermes

Initial commit: coding harness feedback analysis

2026-04-09 15:13:45 +02:00

opencode

Initial commit: coding harness feedback analysis

2026-04-09 15:13:45 +02:00

pi

Add pi (pi-mono) feedback analysis

2026-04-09 15:40:56 +02:00

AGENTS.md

Initial commit: coding harness feedback analysis

2026-04-09 15:13:45 +02:00

prompt.md

Initial commit: coding harness feedback analysis

2026-04-09 15:13:45 +02:00

README.md

Add README with folder navigation

2026-04-09 15:15:28 +02:00

Research-orchestration.md

Initial commit: coding harness feedback analysis

2026-04-09 15:13:45 +02:00

Research-prompt.md

Initial commit: coding harness feedback analysis

2026-04-09 15:13:45 +02:00

Research.md

Initial commit: coding harness feedback analysis

2026-04-09 15:13:45 +02:00

README.md

Coding Harness Feedback Analysis

Research on four coding-agent harnesses (opencode, pi, hermes, forgecode) to understand what works best at different model sizes, particularly smaller, local models.

Folder Structure

├── AGENTS.md              # Project overview and data collection strategy
├── Research*.md           # Prompt research and orchestration strategies
│
├── opencode/              # Go-based coding agent
│   ├── feedback/
│   │   ├── frontier/      # GPT-5.4, Claude Opus, Gemini feedback
│   │   └── localllm/      # Local model feedback (prompting, tool handling)
│   └── repo/              # Source code (submodule)
│
├── pi/                    # Minimal terminal coding harness by Mario Zechner
│   ├── feedback/
│   │   ├── frontier/      # (empty - in progress)
│   │   └── localllm/      # (empty - in progress)
│   └── repo/              # Source code (submodule)
│
├── hermes/                # Nous Research's agent
│   ├── feedback/
│   │   ├── frontier/      # Claude, GPT, budget provider feedback
│   │   ├── localllm/      # Qwen, Gemma, local model feedback
│   │   └── general/       # Bug reports, benchmarks, features
│   └── repo/              # Source code (submodule)
│
└── forgecode/             # AI pair programmer with sub-agents
    ├── feedback/
    │   ├── frontier/      # GPT-5.4, Claude, Gemini, pricing, security
    │   └── localllm/      # Qwen, MiniMax, GLM, DeepSeek feedback
    └── repo/              # Source code (submodule)

Quick Navigation

| Harness | Feedback Location | Key Topics |
|---|---|---|
| opencode | `opencode/feedback/` | Tool calling, local model prompting |
| pi | `pi/feedback/` | (Being researched) |
| hermes | `hermes/feedback/` | Terminal-bench results, local setup |
| forgecode | `forgecode/feedback/` | Pricing, benchmarks, security |
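
For navigating the feedback folders programmatically, the following is a minimal sketch that assumes the `<harness>/feedback/` layout described above. The function name `feedback_files` and the `HARNESSES` tuple are illustrative and not part of the repository.

```python
from pathlib import Path

# Harness folder names as listed in the folder structure above.
HARNESSES = ("opencode", "pi", "hermes", "forgecode")


def feedback_files(root: Path) -> dict[str, list[Path]]:
    """Map each harness to the files found under <root>/<harness>/feedback/.

    Hypothetical helper: a harness whose feedback folder is missing or
    empty maps to an empty list.
    """
    found: dict[str, list[Path]] = {}
    for harness in HARNESSES:
        feedback_dir = root / harness / "feedback"
        if feedback_dir.is_dir():
            # rglob("*") descends into frontier/, localllm/, general/, etc.
            found[harness] = sorted(p for p in feedback_dir.rglob("*") if p.is_file())
        else:
            found[harness] = []
    return found
```

Calling `feedback_files(Path("."))` from the repository root would return one entry per harness, which makes it easy to spot folders that are still empty.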

Feedback Format

Each feedback file includes:

- Model name/size/provider
- Task performance or benchmark results
- Issues encountered
- What worked well
- Source reference (URL, Discord, GitHub issues)
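
The items above could be captured as a small record type. This is only a sketch: the class name `FeedbackEntry`, its field names, and the `missing_fields` helper are assumptions for illustration, not a schema the repository defines.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackEntry:
    """Hypothetical record mirroring the five items listed above."""

    model: str                                             # model name/size/provider
    results: str                                           # task performance or benchmark results
    issues: list[str] = field(default_factory=list)        # issues encountered
    worked_well: list[str] = field(default_factory=list)   # what worked well
    source: str = ""                                       # URL, Discord, or GitHub issue reference

    def missing_fields(self) -> list[str]:
        """Return the names of items that are still empty."""
        return [name for name, value in vars(self).items() if not value]
```

A check such as `entry.missing_fields()` could flag feedback files that lack, for example, a source reference before they are committed.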

Research Focus

- Tool handling and capabilities
- Skills system effectiveness
- Prompt engineering strategies
- Context management
- Error recovery
