Initial commit: coding harness feedback analysis

Harnesses under analysis:
- opencode (Go-based coding agent)
- pi (minimal terminal coding harness by Mario Zechner)
- hermes (Nous Research agent)
- forgecode (AI pair programmer with sub-agents)

Each harness folder contains:
- repo/: Source code from respective repositories
- feedback/localllm/: Community feedback for local/smaller models
- feedback/frontier/: Community feedback for frontier models

Research focus: Tool handling, skills systems, prompt engineering, context
management, and best practices for smaller/local models.
@@ -0,0 +1,145 @@
# Community Sources & Ongoing Monitoring

**Last Updated:** April 9, 2026

---

## Official Channels

### Discord
- **URL:** https://discord.gg/kRZBPpkgwq
- **Purpose:** Community support, feature announcements, feedback
- **Activity:** Active (referenced in docs and GitHub)

### GitHub
- **Issues:** https://github.com/antinomyhq/forgecode/issues (48 open, 433 closed)
- **Discussions:** https://github.com/antinomyhq/forgecode/discussions
- **Releases:** https://github.com/antinomyhq/forgecode/releases

### Reddit
- **r/forgecode:** https://www.reddit.com/r/forgecode/ (official subreddit)
- **r/ClaudeCode:** Frequently discusses ForgeCode comparisons
- **r/cursor:** Pricing and feature comparisons
- **r/LocalLLaMA:** Local model usage with ForgeCode

---

## Key External References

### Benchmarks
- **Terminal-Bench 2.0:** https://tbench.ai/leaderboard/terminal-bench/2.0
- **llm-stats.com:** https://llm-stats.com/benchmarks/terminal-bench
- **SWE-bench:** https://www.swebench.com/ (independent validation)

### Documentation
- **Official Docs:** https://forgecode.dev/docs/
- **Installation:** https://forgecode.dev/docs/installation/
- **ZSH Support:** https://forgecode.dev/docs/zsh-support/
- **Blog:** https://forgecode.dev/blog/

### Articles & Reviews
- **DEV Community:** Multiple comparison articles
- **TechGig:** Feature overview (August 2025)
- **Artificial Analysis:** Independent benchmark tracking

---

## Notable GitHub Issues to Watch

### Critical (Open)
| Issue | Description | Status |
|-------|-------------|--------|
| #2904 | Use models.dev as registry | Open |
| #2894 | Qwen 3.5 system messages bug | Open |
| #2893 | Ghostty resize bug | Open, PR linked |
| #2888 | API key helpers | Open |
| #2884 | Muse mode blocked | Open |

### Historical (Closed but relevant)
| Issue | Description |
|-------|-------------|
| #2813 | Fixed in response to Reddit feedback |
| #2485 | Mac installation issues |
| #1296 | Daily FORGE limit stops tasks |
| #1318 | Telemetry concerns |

---

## Research Papers

### Terminal-Bench
- **arXiv:** https://arxiv.org/html/2601.11868v1
- **OpenReview:** https://openreview.net/forum?id=a7Qa4CcHak
- **Published:** ICLR 2026

---

## Monitoring Recommendations

### Weekly Checks
1. GitHub issues for new bugs affecting model compatibility
2. Discord announcements for feature updates
3. Reddit for user experience reports

### Monthly Reviews
1. Benchmark leaderboard updates (llm-stats.com)
2. New model support announcements
3. Pricing changes

### Quarterly Analysis
1. Comparative reviews (DEV Community, blogs)
2. Feature gap analysis vs competitors
3. Local model compatibility updates
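
The weekly GitHub check above can be partly scripted. The following is a minimal sketch, not an official ForgeCode tool: it builds a query for GitHub's public REST endpoint for listing repository issues and filters the decoded response for titles that mention a model family. The keyword list and both function names are assumptions made for illustration.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.github.com"
REPO = "antinomyhq/forgecode"  # repository named in this document

# Hypothetical keyword list; tune it to the models you track.
MODEL_KEYWORDS = ("qwen", "llama", "mistral", "ollama", "local model")


def issues_url(since_iso: str, state: str = "open", per_page: int = 100) -> str:
    """Build the REST URL for issues updated since an ISO 8601 timestamp."""
    query = urlencode({"state": state, "since": since_iso, "per_page": per_page})
    return f"{API_ROOT}/repos/{REPO}/issues?{query}"


def model_related(issues: list[dict]) -> list[dict]:
    """Keep real issues (not pull requests) whose title mentions a keyword."""
    hits = []
    for issue in issues:
        # The issues endpoint also returns pull requests; those entries
        # carry a "pull_request" key, so skip them.
        if "pull_request" in issue:
            continue
        title = issue.get("title", "").lower()
        if any(keyword in title for keyword in MODEL_KEYWORDS):
            hits.append(issue)
    return hits
```

To use it, fetch `issues_url(...)` with any HTTP client, decode the JSON array, and pass the result to `model_related`. Unauthenticated requests to the GitHub API are rate limited, so a weekly run is a better fit than frequent polling.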

---

## Data Collection Notes

### Exhaustive Search Performed
- Web search across multiple query angles
- GitHub issue extraction
- Documentation review
- Blog post analysis
- Community forum monitoring

### Sources Checked
- GitHub (antinomyhq/forgecode)
- Reddit (r/forgecode, r/ClaudeCode, r/cursor, r/LocalLLaMA)
- DEV Community
- ForgeCode official blog
- Independent benchmark sites
- Academic papers

### Limitations
- Reddit verification challenges prevented some thread extraction
- Discord content not directly accessible (requires login)
- Some GitHub issues require authentication for full details

---

## Contribution Guidelines

When adding new feedback:

1. **Follow the format:**
   - Model/Topic header
   - Source references with URLs
   - What worked / What didn't
   - Specific issues encountered

2. **Include dates:** When was the feedback collected?

3. **Categorize correctly:**
   - `frontier/` for closed-weight models (GPT, Claude, Gemini, etc.)
   - `localllm/` for open-weight models (Qwen, Llama, Mistral, etc.)

4. **Update README.md:** If adding major new categories
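
As an illustration of steps 1 to 3, the sketch below picks the target folder from a model name and emits an empty entry skeleton. It is a hypothetical helper, not part of this repository: the function names, the prefix matching, and the exact heading layout are assumptions, and only the model families and folder names come from the guidelines above.

```python
from datetime import date

# Families taken from the categorization rule above; extend as needed.
FRONTIER_FAMILIES = ("gpt", "claude", "gemini")
LOCAL_FAMILIES = ("qwen", "llama", "mistral")


def feedback_folder(model: str) -> str:
    """Return 'frontier' or 'localllm' based on the model name prefix."""
    name = model.strip().lower()
    if name.startswith(FRONTIER_FAMILIES):
        return "frontier"
    if name.startswith(LOCAL_FAMILIES):
        return "localllm"
    raise ValueError(f"Unknown model family, categorize by hand: {model!r}")


def entry_skeleton(model: str, source_url: str, collected: date) -> str:
    """Render an empty feedback entry (assumed layout, adjust to taste)."""
    return "\n".join(
        [
            f"## {model}",
            "",
            f"**Collected:** {collected.isoformat()}",
            f"**Source:** {source_url}",
            "",
            "### What worked",
            "",
            "### What didn't",
            "",
            "### Specific issues encountered",
            "",
        ]
    )
```

A name prefix is only a heuristic. Some families ship both closed-weight and open-weight models, so check the returned folder by hand before committing an entry.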

---

## Contact

For questions about this research:
- Check the GitHub repository for updates
- Join the ForgeCode Discord
- File issues against this research folder