# Feedback File Structure Template

Use this structure for all feedback files to maintain consistency across the repository.

## Standard Header

```markdown
# [Model Name] with [Harness] - Feedback Report

**Model:** [Full model name]
**Size:** [Parameters, e.g., 27B, 30B-A3B MoE]
**Provider:** [Company/API, e.g., OpenAI, Anthropic, Ollama]
**Harness:** [Harness name, e.g., OpenCode, Hermes, ForgeCode, pi]
**Date Compiled:** [YYYY-MM-DD]
**Source References:** [Primary sources]
```

---

## Required Sections

### 1. Quick Reference

```markdown
## Quick Reference

| Attribute | Value |
|-----------|-------|
| Model | [Name] |
| Size | [Parameters] |
| Context Window | [e.g., 128K, 1M] |
| Best For | [Use case summary] |
| Cost | [If applicable] |
```

### 2. Benchmark Results

```markdown
## Benchmark Results

### [Benchmark Name]
- **Score:** [X%] (Rank #Y)
- **Harness:** [If Terminal-Bench or harness-specific]
- **Date:** [When tested]
- **Note:** [Any important context]
```

**Important:** For Terminal-Bench, always note that scores are harness+model combinations.

### 3. What Worked Well

```markdown
## What Worked Well

1. **[Key Point]**
   - Detailed explanation
   - Supporting evidence

2. **[Key Point]**
   - Details
```

### 4. Issues Encountered

```markdown
## Issues Encountered

1. **[Issue Title]**
   - **Severity:** [Critical/Major/Minor]
   - **Description:** Details
   - **Workaround:** If any

2. **[Issue Title]**
   - Details
```

### 5. Configuration (Optional)

````markdown
## Configuration

```json
[Configuration example]
```

Or for CLI flags:

```bash
[Command line options]
```
````

### 6. Source References

```markdown
## Source References

1. **[Source Name]**: [URL]
   - [Brief description of what it covers]

2. **[Source Name]**: [URL]
   - Description
```
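The required-sections list above can be checked mechanically. A minimal sketch of such a check (the helper name, regex, and the check itself are illustrative assumptions, not an existing repository tool):

```python
# Sketch: verify that a feedback file contains the required H2 section
# headings described in this template (hypothetical helper).
import re

REQUIRED_SECTIONS = [
    "Quick Reference",
    "Benchmark Results",
    "What Worked Well",
    "Issues Encountered",
    "Source References",
]

def missing_sections(markdown_text: str) -> list[str]:
    """Return required section titles absent from the given markdown text."""
    # Collect the titles of all '## ...' headings (H3 '###' lines do not match,
    # because '#' is not whitespace).
    headings = set(re.findall(r"^##\s+(.+?)\s*$", markdown_text, flags=re.MULTILINE))
    return [s for s in REQUIRED_SECTIONS if s not in headings]

sample = "## Quick Reference\n...\n## Benchmark Results\n...\n## Source References\n"
print(missing_sections(sample))  # → ['What Worked Well', 'Issues Encountered']
```

A check like this could run in CI so that incomplete feedback files are caught before merge.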

## For Multi-Model Files

If a file covers multiple models, use this structure:

```markdown
# [Topic] Feedback for [Harness]

**Date Compiled:** [YYYY-MM-DD]
**Source References:** [Primary sources]

---

## Model Reference Guide

| Model | Size | Provider | Notes |
|-------|------|----------|-------|
| [Name] | [Size] | [Provider] | [Key info] |

---

## [Model 1]

[Follow standard sections above]

---

## [Model 2]

[Follow standard sections above]
```

## Style Guidelines

1. Use tables for comparative data
2. Use bullet points for lists
3. Use numbered lists for sequential steps or ranked items
4. Use **bold** for key terms and metrics
5. Use *italics* for emphasis
6. Use `code` formatting for commands, file names, and technical terms
7. Always cite sources with full URLs
8. Note dates for time-sensitive information

## Special Notes

### Terminal-Bench

Always clarify that Terminal-Bench scores represent harness+model combinations, not raw model capability. Include the harness name in the benchmark table.

### Qwen Models

Include the Model Reference Guide when discussing Qwen models to avoid confusion between the Qwen3, Qwen3.5, and Qwen2.5 families.

Current Qwen3.5 MoE models include: 27B, 35B-A3B, 122B-A10B, 397B-A17B.

### Verified vs Self-Reported

Note when benchmark scores are:

- **Verified:** Independently validated (e.g., SWE-bench Verified)
- **Self-Reported:** Submitted by the harness developers themselves