Initial commit: coding harness feedback analysis

Harnesses under analysis:
- opencode (Go-based coding agent)
- pi (minimal terminal coding harness by Mario Zechner)
- hermes (Nous Research agent)
- forgecode (AI pair programmer with sub-agents)

Each harness folder contains:
- repo/: Source code from respective repositories
- feedback/localllm/: Community feedback for local/smaller models
- feedback/frontier/: Community feedback for frontier models

Research focus: Tool handling, skills systems, prompt engineering, context management, and best practices for smaller/local models.
2026-04-09 15:13:45 +02:00
commit 51123212c4
46 changed files with 7213 additions and 0 deletions
@@ -0,0 +1,204 @@
+# ForgeCode Best Practices - Summary
+
+**Compiled from:** Community feedback, GitHub issues, blog posts, documentation  
+**Date Compiled:** April 9, 2026
+
+---
+
+## Quick Start Best Practices
+
+### 1. Disable Telemetry
+```bash
+export FORGE_TRACKER=false
+```
+Add to `~/.zshrc` for persistence.
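A minimal sketch of persisting the setting without duplicating the line on repeated runs. The function name `persist_forge_tracker` is hypothetical; it relies only on standard `grep`, `touch`, and `printf`, and takes the rc file path as an optional argument so it can be tried on a scratch file first.

```bash
# Hypothetical helper: append the export line to an rc file only if it is
# not already there, so re-running it never creates duplicates.
persist_forge_tracker() {
  rc_file="${1:-$HOME/.zshrc}"
  line='export FORGE_TRACKER=false'
  touch "$rc_file" || return 1
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  grep -qxF "$line" "$rc_file" || printf '%s\n' "$line" >> "$rc_file"
}
```

Calling `persist_forge_tracker` with no argument targets `~/.zshrc`; open a new shell or run `source ~/.zshrc` afterwards for the change to take effect.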
+
+### 2. Configure API Keys Properly
+```bash
+forge provider login  # Set up providers
+```
+Consider API key helpers (requested in #2888) for security.
+
+### 3. Verify ZSH Integration
+```bash
+forge zsh doctor   # Check for issues
+forge zsh setup    # Re-run if needed
+```
+
+---
+
+## Model Selection Best Practices
+
+### For Speed
+- **Opus 4.6** through ForgeCode: Fastest real-world performance
+- **Avoid GPT 5.4** through ForgeCode: Unstable tool calling
+
+### For Cost
+- **MiniMax M2.1:** Near-SOTA performance at $0.30/$1.20 per million tokens
+- **LongCat-Flash-Lite:** Budget option at $0.10/$0.40
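To compare the two price points on a concrete workload, a rough estimate can be computed as below. This sketch assumes the first figure in each pair is the input price and the second the output price (the summary does not say so explicitly); the helper name `token_cost` and the token counts are made up for illustration.

```bash
# Hypothetical helper: estimate spend in USD from per-million-token prices.
# Usage: token_cost <input_price> <output_price> <input_tokens> <output_tokens>
token_cost() {
  awk -v pin="$1" -v pout="$2" -v tin="$3" -v tout="$4" \
    'BEGIN { printf "%.2f\n", (tin / 1000000) * pin + (tout / 1000000) * pout }'
}

# Example workload (made up): 2M input tokens and 500k output tokens.
token_cost 0.30 1.20 2000000 500000   # MiniMax M2.1 prices
token_cost 0.10 0.40 2000000 500000   # LongCat-Flash-Lite prices
```

Under these assumptions the workload costs $1.20 on MiniMax M2.1 and $0.40 on LongCat-Flash-Lite.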
+
+### For Reliability
+- **Claude Sonnet 4.5:** Best independent benchmark scores
+- **Avoid:** Models with known tool calling issues (Qwen 3.5 with current bug)
+
+---
+
+## Agent Usage Best Practices
+
+### Workflow Pattern
+1. **Start with `muse`** for planning complex changes
+2. **Switch to `forge`** for implementation
+3. **Rely on `sage`** for research; it is invoked automatically
+
+### Command Reference
+```bash
+:muse    # Planning mode
+:forge   # Implementation mode
+:agent   # View all agents
+:new     # Fresh conversation
+:compact # Free up token budget
+```
+
+---
+
+## Context Management
+
+### Strengths
+- **~90% context reduction** vs full-file inclusion
+- Function signature indexing
+- Selective context pulling
+
+### Limitations
+- **No auto-compaction** (unlike Claude Code)
+- **No checkpoints/rewind**
+- Manual `:compact` required when the context is full
+
+### Tips
+- Use `@filename` for file tagging
+- Run `:compact` before long tasks
+- Start with `:new` for unrelated tasks
+
+---
+
+## Tool Calling Best Practices
+
+### For Harness Developers
+1. Use `old_string`/`new_string` argument names
+2. Put `required` before `properties` in JSON schema
+3. Flatten nested schemas
+4. Add explicit truncation reminders
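The four points above could look like the following tool definition. This is an illustrative sketch, not ForgeCode's actual schema: the tool name `edit_file`, the `file_path` argument, and the wording of the truncation reminder in the description are all assumptions. The JSON is emitted from a shell function so it can be piped into other tooling.

```bash
# Hypothetical edit-tool schema, printed as JSON from a quoted heredoc.
# - argument names use old_string/new_string
# - "required" is listed before "properties"
# - all arguments are flat strings (no nested objects)
# - the description carries an explicit truncation reminder
edit_tool_schema() {
  cat <<'EOF'
{
  "name": "edit_file",
  "description": "Replace old_string with new_string in one file. Long outputs may be truncated; re-read the file if a result looks cut off.",
  "parameters": {
    "type": "object",
    "required": ["file_path", "old_string", "new_string"],
    "properties": {
      "file_path": { "type": "string" },
      "old_string": { "type": "string" },
      "new_string": { "type": "string" }
    }
  }
}
EOF
}
```

Key order has no meaning in JSON itself; the ordering advice concerns how the schema reads to the model when it is serialized into the prompt.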
+
+### For Users
+1. **Verify tool calls** - don't blindly accept
+2. **Check file paths** - AI can hallucinate paths
+3. **Review diffs** - especially for large changes
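For the path check, a small guard can flag paths that do not exist before an edit is accepted. The helper name `check_paths` is hypothetical, and the sketch assumes the agent is editing existing files; a path the agent intends to create will be reported as missing, which is expected.

```bash
# Hypothetical helper: report proposed paths that do not exist.
# Prints each missing path to stderr and returns 0 only if all exist.
check_paths() {
  missing=0
  for p in "$@"; do
    if [ ! -e "$p" ]; then
      printf 'missing: %s\n' "$p" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_paths src/main.rs src/lib.rs` before approving an edit that touches both files. Diffs can then be reviewed with `git diff` once the change has been applied.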
+
+---
+
+## Pricing Optimization
+
+### Cost Control
+1. **Use Sonnet** for routine tasks (cheaper than Opus)
+2. **Limit sub-agent spawning** - burns tokens
+3. **Use context efficiently** - ForgeCode's indexing helps
+4. **Monitor daily limits** - Free tier allows 10-50 requests per day
+
+### Plan Selection
+- **Free:** Testing, small projects
+- **Pro ($20):** Regular use (<1,000 requests/day)
+- **Max ($100):** Power users (1,000-5,000 requests/day)
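The thresholds above can be expressed as a simple lookup. This is only a sketch of the tiers as listed in this summary: the function name `suggest_plan` is hypothetical, and treating 50 requests/day as the Free ceiling is an assumption, since the free limit is given as a 10-50 range.

```bash
# Hypothetical helper: map expected requests per day to a plan name,
# using the tier boundaries quoted in this summary.
suggest_plan() {
  requests_per_day="$1"
  if [ "$requests_per_day" -le 50 ]; then
    echo "Free"              # assumption: upper end of the 10-50 range
  elif [ "$requests_per_day" -lt 1000 ]; then
    echo "Pro"               # <1,000 requests/day
  elif [ "$requests_per_day" -le 5000 ]; then
    echo "Max"               # 1,000-5,000 requests/day
  else
    echo "above Max range"   # no tier listed for this volume
  fi
}
```

For instance, `suggest_plan 400` prints `Pro` and `suggest_plan 1000` prints `Max`.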
+
+---
+
+## Project Configuration
+
+### AGENTS.md
+Create at project root or `~/forge/AGENTS.md`:
+```markdown
+# Development Guidelines
+
+## Runtime
+- NEVER restart the dev server (runs on port 3000)
+- Use npm exclusively (not yarn/pnpm)
+
+## Code Style
+- TypeScript strict mode
+- Functional programming preferred
+```
+
+### Tips
+- Be specific and actionable
+- Include negative constraints ("NEVER...")
+- Reference existing code patterns
+
+---
+
+## Common Pitfalls
+
+### 1. Expecting Claude Code Features
+- **Missing:** Checkpoints, auto-memory, IDE extensions
+- **Workaround:** Use git commits frequently
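One way to approximate checkpoints with git is a small wrapper run before each agent task. The function name `checkpoint` and the default commit message are hypothetical; the sketch uses only standard `git add`, `git diff --cached --quiet`, and `git commit`, and assumes it is run inside a git repository with a configured identity.

```bash
# Hypothetical helper: commit the current working tree as a checkpoint.
# Does nothing (and succeeds) when there is nothing new to record.
checkpoint() {
  msg="${1:-checkpoint before agent run}"
  git add -A || return 1
  # --quiet makes git diff exit 0 when the index matches the last commit
  if git diff --cached --quiet; then
    echo "nothing to checkpoint"
    return 0
  fi
  git commit -q -m "$msg"
}
```

To rewind, inspect `git log --oneline` and restore a checkpoint with `git reset --hard <commit>`; note that this discards all uncommitted changes.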
+
+### 2. Ignoring Daily Limits
+- **Problem:** Task stops mid-execution when limit reached
+- **Solution:** Monitor usage, upgrade plan, or switch providers
+
+### 3. Using GPT 5.4 for Research
+- **Problem:** Tool calling failures, infinite loops
+- **Solution:** Use Opus 4.6 or Sonnet instead
+
+### 4. Privacy Concerns
+- **Problem:** Telemetry collects SSH/git data by default
+- **Solution:** Set `FORGE_TRACKER=false`
+
+---
+
+## When to Use ForgeCode vs Alternatives
+
+### Use ForgeCode When:
+- Terminal-first workflow
+- Speed is priority
+- Multi-model flexibility needed
+- Open source/auditable code required
+- Privacy control essential (with telemetry disabled)
+
+### Use Claude Code When:
+- Team collaboration (shared CLAUDE.md)
+- Need checkpoints/rewind
+- Want auto-memory across sessions
+- IDE extensions needed
+- Prefer subscription pricing (no separate API costs)
+
+### Use Cursor When:
+- IDE-native experience preferred
+- GUI features important
+- Team using VS Code exclusively
+
+---
+
+## Debugging Tips
+
+### Tool Call Failures
+1. Check model compatibility (avoid Qwen 3.5 currently)
+2. Verify JSON schema format
+3. Try `:retry` to resend
+
+### Performance Issues
+1. Use `:compact` to free context
+2. Switch to faster model (Sonnet vs Opus)
+3. Tag only the files the task needs with `@[filename]`; extra tagged files consume context
+
+### Integration Issues
+1. Run `forge zsh doctor`
+2. Verify Nerd Font installed
+3. Check terminal compatibility (Ghostty has resize bug)
+
+---
+
+## Source References
+
+1. **ForgeCode Docs:** https://forgecode.dev/docs/
+2. **ZSH Support:** https://forgecode.dev/docs/zsh-support/
+3. **Operating Agents:** https://forgecode.dev/docs/operating-agents/
+4. **DEV Community:** https://dev.to/liran_baba/forgecode-vs-claude-code-which-ai-coding-agent-actually-wins-36c
+5. **GitHub Issues:** https://github.com/antinomyhq/forgecode/issues