ForgeCode Best Practices - Summary

Compiled from: Community feedback, GitHub issues, blog posts, documentation
Date Compiled: April 9, 2026


Quick Start Best Practices

1. Disable Telemetry

export FORGE_TRACKER=false

Add to ~/.zshrc for persistence.
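To make both steps concrete, a minimal sketch (assumes zsh with the default ~/.zshrc; adjust the path for other shells):

```shell
# Persist the telemetry opt-out across sessions:
echo 'export FORGE_TRACKER=false' >> ~/.zshrc
# Apply it to the current session as well:
export FORGE_TRACKER=false
```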

2. Configure API Keys Properly

forge provider login  # Set up providers

Consider API key helpers (a feature requested in #2888) for improved key security.

3. Verify ZSH Integration

forge zsh doctor   # Check for issues
forge zsh setup    # Re-run if needed

Model Selection Best Practices

For Speed

  • Opus 4.6 through ForgeCode: Fastest real-world performance
  • Avoid GPT 5.4 through ForgeCode: Unstable tool calling

For Cost

  • MiniMax M2.1: Near-SOTA performance at $0.30/$1.20 per million tokens
  • LongCat-Flash-Lite: Budget option at $0.10/$0.40
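To make the rates concrete, here is a back-of-the-envelope estimate for a hypothetical task consuming 2M input and 0.5M output tokens, assuming the quoted prices follow the usual input/output per-million-token convention:

```shell
# Estimated cost of one task: 2M input tokens, 0.5M output tokens.
# Rates are the input/output prices quoted above (assumed convention).
awk 'BEGIN {
  in_m = 2.0; out_m = 0.5   # token counts, in millions
  printf "MiniMax M2.1:       $%.2f\n", in_m * 0.30 + out_m * 1.20   # $1.20
  printf "LongCat-Flash-Lite: $%.2f\n", in_m * 0.10 + out_m * 0.40   # $0.40
}'
```

At this workload the budget option is roughly a third of the price; the gap widens on output-heavy tasks, where the per-token spread is larger.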

For Reliability

  • Claude Sonnet 4.5: Best independent benchmark scores
  • Avoid: Models with known tool calling issues (e.g., Qwen 3.5, which has an open bug at the time of writing)

Agent Usage Best Practices

Workflow Pattern

  1. Start with muse for planning complex changes
  2. Switch to forge for implementation
  3. Let sage handle research (it is invoked automatically)

Command Reference

:muse    # Planning mode
:forge   # Implementation mode
:agent   # View all agents
:new     # Fresh conversation
:compact # Free up token budget

Context Management

Strengths

  • ~90% context reduction vs full-file inclusion
  • Function signature indexing
  • Selective context pulling

Limitations

  • No auto-compaction (unlike Claude Code)
  • No checkpoints/rewind
  • Manual :compact required when the context window fills

Tips

  • Use @filename for file tagging
  • Run :compact before long tasks
  • Start with :new for unrelated tasks

Tool Calling Best Practices

For Harness Developers

  1. Use old_string/new_string argument names
  2. Put required before properties in JSON schema
  3. Flatten nested schemas
  4. Add explicit truncation reminders
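A minimal sketch of a tool definition following rules 1-3 (the tool name, descriptions, and file path are illustrative, not ForgeCode's actual schema; note that JSON objects are formally unordered, but key order in the serialized text is what the model sees):

```shell
# Hypothetical flat edit-tool schema: old_string/new_string argument
# names, "required" serialized before "properties", no nested objects.
cat > /tmp/edit_tool_schema.json <<'EOF'
{
  "name": "edit_file",
  "description": "Replace one exact occurrence of old_string with new_string.",
  "parameters": {
    "type": "object",
    "required": ["path", "old_string", "new_string"],
    "properties": {
      "path": {"type": "string", "description": "File to edit"},
      "old_string": {"type": "string", "description": "Exact text to replace"},
      "new_string": {"type": "string", "description": "Replacement text"}
    }
  }
}
EOF
# Sanity-check that the schema is well-formed JSON:
python3 -m json.tool /tmp/edit_tool_schema.json > /dev/null && echo "valid JSON"
```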

For Users

  1. Verify tool calls - don't accept them blindly
  2. Check file paths - the model can hallucinate them
  3. Review diffs - especially for large changes
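The review habit can lean on plain git. A throwaway-repo sketch (paths and file contents are illustrative; in real use, run the diff commands inside your project before accepting AI edits):

```shell
# Disposable demo repo; the echo stands in for an AI-generated edit.
rm -rf /tmp/review-demo && mkdir -p /tmp/review-demo && cd /tmp/review-demo
git init -q
git config user.email "demo@example.com" && git config user.name "Demo"
echo "v1" > app.txt && git add app.txt && git commit -qm "baseline"
echo "v2" > app.txt        # simulated AI edit
git diff --stat            # quick summary: files touched, lines changed
git diff                   # full diff for line-by-line review
git checkout -- app.txt    # revert the file if the change looks wrong
```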

Pricing Optimization

Cost Control

  1. Use Sonnet for routine tasks (cheaper than Opus)
  2. Limit sub-agent spawning - burns tokens
  3. Use context efficiently - ForgeCode's indexing helps
  4. Monitor daily limits - the free tier allows 10-50 requests per day

Plan Selection

  • Free: Testing, small projects
  • Pro ($20): Regular use (<1,000 requests/day)
  • Max ($100): Power users (1,000-5,000 requests/day)

Project Configuration

AGENTS.md

Create at project root or ~/forge/AGENTS.md:

# Development Guidelines

## Runtime
- NEVER restart the dev server (runs on port 3000)
- Use npm exclusively (not yarn/pnpm)

## Code Style
- TypeScript strict mode
- Functional programming preferred

Tips

  • Be specific and actionable
  • Include negative constraints ("NEVER...")
  • Reference existing code patterns

Common Pitfalls

1. Expecting Claude Code Features

  • Missing: Checkpoints, auto-memory, IDE extensions
  • Workaround: Commit to git frequently as manual restore points

2. Ignoring Daily Limits

  • Problem: Task stops mid-execution when limit reached
  • Solution: Monitor usage, upgrade plan, or switch providers

3. Using GPT 5.4 for Research

  • Problem: Tool calling failures, infinite loops
  • Solution: Use Opus 4.6 or Sonnet instead

4. Privacy Concerns

  • Problem: Telemetry collects SSH/git data by default
  • Solution: Set FORGE_TRACKER=false

When to Use ForgeCode vs Alternatives

Use ForgeCode When:

  • Terminal-first workflow
  • Speed is priority
  • Multi-model flexibility needed
  • Open source/auditable code required
  • Privacy control essential (with telemetry disabled)

Use Claude Code When:

  • Team collaboration (shared CLAUDE.md)
  • Need checkpoints/rewind
  • Want auto-memory across sessions
  • IDE extensions needed
  • Prefer subscription pricing (no separate API costs)

Use Cursor When:

  • IDE-native experience preferred
  • GUI features important
  • Team using VS Code exclusively

Debugging Tips

Tool Call Failures

  1. Check model compatibility (avoid Qwen 3.5 currently)
  2. Verify JSON schema format
  3. Try :retry to resend

Performance Issues

  1. Use :compact to free context
  2. Switch to faster model (Sonnet vs Opus)
  3. Close unnecessary files with @[filename]

Integration Issues

  1. Run forge zsh doctor
  2. Verify Nerd Font installed
  3. Check terminal compatibility (Ghostty has a known resize bug)

Source References

  1. ForgeCode Docs: https://forgecode.dev/docs/
  2. ZSH Support: https://forgecode.dev/docs/zsh-support/
  3. Operating Agents: https://forgecode.dev/docs/operating-agents/
  4. DEV Community: https://dev.to/liran_baba/forgecode-vs-claude-code-which-ai-coding-agent-actually-wins-36c
  5. GitHub Issues: https://github.com/antinomyhq/forgecode/issues