ForgeCode Best Practices - Summary

Compiled from: Community feedback, GitHub issues, blog posts, documentation
Date Compiled: April 9, 2026


Quick Start Best Practices

1. Disable Telemetry

export FORGE_TRACKER=false

Add to ~/.zshrc for persistence.
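To make both steps concrete, a minimal sketch (assumes zsh with the default ~/.zshrc; adjust the path for other shells):

```shell
# Persist the telemetry opt-out across sessions:
echo 'export FORGE_TRACKER=false' >> ~/.zshrc
# Apply it to the current session as well:
export FORGE_TRACKER=false
```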

2. Configure API Keys Properly

forge provider login  # Set up providers

Consider API key helpers (a feature requested in #2888) for improved key security.

3. Verify ZSH Integration

forge zsh doctor   # Check for issues
forge zsh setup    # Re-run if needed

Model Selection Best Practices

For Speed

  • Opus 4.6 through ForgeCode: Fastest real-world performance
  • Avoid GPT 5.4 through ForgeCode: Unstable tool calling

For Cost

  • MiniMax M2.1: Near-SOTA performance at $0.30/$1.20 per million tokens
  • LongCat-Flash-Lite: Budget option at $0.10/$0.40
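To make the rates concrete, here is a back-of-the-envelope estimate for a hypothetical task consuming 2M input and 0.5M output tokens, assuming the quoted prices follow the usual input/output per-million-token convention:

```shell
# Estimated cost of one task: 2M input tokens, 0.5M output tokens.
# Rates are the input/output prices quoted above (assumed convention).
awk 'BEGIN {
  in_m = 2.0; out_m = 0.5   # token counts, in millions
  printf "MiniMax M2.1:       $%.2f\n", in_m * 0.30 + out_m * 1.20   # $1.20
  printf "LongCat-Flash-Lite: $%.2f\n", in_m * 0.10 + out_m * 0.40   # $0.40
}'
```

At this workload the budget option is roughly a third of the price; the gap widens on output-heavy tasks, where the per-token spread is larger.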

For Reliability

  • Claude Sonnet 4.5: Best independent benchmark scores
  • Avoid: Models with known tool calling issues (e.g., Qwen 3.5, which has an open bug at the time of writing)

Agent Usage Best Practices

Workflow Pattern

  1. Start with muse for planning complex changes
  2. Switch to forge for implementation
  3. Let sage handle research (it is invoked automatically)

Command Reference

:muse    # Planning mode
:forge   # Implementation mode
:agent   # View all agents
:new     # Fresh conversation
:compact # Free up token budget

Context Management

Strengths

  • ~90% context reduction vs full-file inclusion
  • Function signature indexing
  • Selective context pulling

Limitations

  • No auto-compaction (unlike Claude Code)
  • No checkpoints/rewind
  • Manual :compact required when the context window fills

Tips

  • Use @filename for file tagging
  • Run :compact before long tasks
  • Start with :new for unrelated tasks

Tool Calling Best Practices

For Harness Developers

  1. Use old_string/new_string argument names
  2. Put required before properties in JSON schema
  3. Flatten nested schemas
  4. Add explicit truncation reminders
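A minimal sketch of a tool definition following rules 1-3 (the tool name, descriptions, and file path are illustrative, not ForgeCode's actual schema; note that JSON objects are formally unordered, but key order in the serialized text is what the model sees):

```shell
# Hypothetical flat edit-tool schema: old_string/new_string argument
# names, "required" serialized before "properties", no nested objects.
cat > /tmp/edit_tool_schema.json <<'EOF'
{
  "name": "edit_file",
  "description": "Replace one exact occurrence of old_string with new_string.",
  "parameters": {
    "type": "object",
    "required": ["path", "old_string", "new_string"],
    "properties": {
      "path": {"type": "string", "description": "File to edit"},
      "old_string": {"type": "string", "description": "Exact text to replace"},
      "new_string": {"type": "string", "description": "Replacement text"}
    }
  }
}
EOF
# Sanity-check that the schema is well-formed JSON:
python3 -m json.tool /tmp/edit_tool_schema.json > /dev/null && echo "valid JSON"
```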

For Users

  1. Verify tool calls - don't accept them blindly
  2. Check file paths - the model can hallucinate them
  3. Review diffs - especially for large changes
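The review habit can lean on plain git. A throwaway-repo sketch (paths and file contents are illustrative; in real use, run the diff commands inside your project before accepting AI edits):

```shell
# Disposable demo repo; the echo stands in for an AI-generated edit.
rm -rf /tmp/review-demo && mkdir -p /tmp/review-demo && cd /tmp/review-demo
git init -q
git config user.email "demo@example.com" && git config user.name "Demo"
echo "v1" > app.txt && git add app.txt && git commit -qm "baseline"
echo "v2" > app.txt        # simulated AI edit
git diff --stat            # quick summary: files touched, lines changed
git diff                   # full diff for line-by-line review
git checkout -- app.txt    # revert the file if the change looks wrong
```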

Pricing Optimization

Cost Control

  1. Use Sonnet for routine tasks (cheaper than Opus)
  2. Limit sub-agent spawning - burns tokens
  3. Use context efficiently - ForgeCode's indexing helps
  4. Monitor daily limits - the free tier allows 10-50 requests per day

Plan Selection

  • Free: Testing, small projects
  • Pro ($20): Regular use (<1,000 requests/day)
  • Max ($100): Power users (1,000-5,000 requests/day)

Project Configuration

AGENTS.md

Create at project root or ~/forge/AGENTS.md:

# Development Guidelines

## Runtime
- NEVER restart the dev server (runs on port 3000)
- Use npm exclusively (not yarn/pnpm)

## Code Style
- TypeScript strict mode
- Functional programming preferred

Tips

  • Be specific and actionable
  • Include negative constraints ("NEVER...")
  • Reference existing code patterns

Common Pitfalls

1. Expecting Claude Code Features

  • Missing: Checkpoints, auto-memory, IDE extensions
  • Workaround: Commit to git frequently as manual restore points

2. Ignoring Daily Limits

  • Problem: Task stops mid-execution when limit reached
  • Solution: Monitor usage, upgrade plan, or switch providers

3. Using GPT 5.4 for Research

  • Problem: Tool calling failures, infinite loops
  • Solution: Use Opus 4.6 or Sonnet instead

4. Privacy Concerns

  • Problem: Telemetry collects SSH/git data by default
  • Solution: Set FORGE_TRACKER=false

When to Use ForgeCode vs Alternatives

Use ForgeCode When:

  • Terminal-first workflow
  • Speed is priority
  • Multi-model flexibility needed
  • Open source/auditable code required
  • Privacy control essential (with telemetry disabled)

Use Claude Code When:

  • Team collaboration (shared CLAUDE.md)
  • Need checkpoints/rewind
  • Want auto-memory across sessions
  • IDE extensions needed
  • Prefer subscription pricing (no separate API costs)

Use Cursor When:

  • IDE-native experience preferred
  • GUI features important
  • Team using VS Code exclusively

Debugging Tips

Tool Call Failures

  1. Check model compatibility (avoid Qwen 3.5 currently)
  2. Verify JSON schema format
  3. Try :retry to resend

Performance Issues

  1. Use :compact to free context
  2. Switch to faster model (Sonnet vs Opus)
  3. Close unnecessary files with @[filename]

Integration Issues

  1. Run forge zsh doctor
  2. Verify Nerd Font installed
  3. Check terminal compatibility (Ghostty has a known resize bug)

Source References

  1. ForgeCode Docs: https://forgecode.dev/docs/
  2. ZSH Support: https://forgecode.dev/docs/zsh-support/
  3. Operating Agents: https://forgecode.dev/docs/operating-agents/
  4. DEV Community: https://dev.to/liran_baba/forgecode-vs-claude-code-which-ai-coding-agent-actually-wins-36c
  5. GitHub Issues: https://github.com/antinomyhq/forgecode/issues