e1781947f4
Reverted incorrect changes - Qwen3.5-35B-A3B IS a real model:
- 35B total / 3B active parameters (MoE)
- 262k native context (up to 1M extended)
- Apache 2.0 license
- Available on HuggingFace: Qwen/Qwen3.5-35B-A3B

Updated files:
- opencode/opencode/feedback/localllm/local-llm-feedback.md
- opencode/opencode/feedback/SUMMARY.md
- FEEDBACK_TEMPLATE.md

Added correct specs:
- MMLU-Pro: 85.3%
- SWE-bench Verified: 69.2%
- Context: 262k native, 1M extended
161 lines
3.2 KiB
Markdown
# Feedback File Structure Template

Use this structure for all feedback files to maintain consistency across the repository.

## Standard Header

```markdown
# [Model Name] with [Harness] - Feedback Report

**Model:** [Full model name]
**Size:** [Parameters, e.g., 27B, 30B-A3B MoE]
**Provider:** [Company/API, e.g., OpenAI, Anthropic, Ollama]
**Harness:** [Harness name, e.g., OpenCode, Hermes, ForgeCode, pi]
**Date Compiled:** [YYYY-MM-DD]
**Source References:** [Primary sources]

---
```
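For reference, here is the header filled in using the Qwen3.5-35B-A3B specs noted at the top of this file (the provider, harness, and date shown are illustrative placeholders, not a real report):

```markdown
# Qwen3.5-35B-A3B with OpenCode - Feedback Report

**Model:** Qwen3.5-35B-A3B
**Size:** 35B total / 3B active (MoE)
**Provider:** Ollama
**Harness:** OpenCode
**Date Compiled:** [YYYY-MM-DD]
**Source References:** Qwen/Qwen3.5-35B-A3B on HuggingFace

---
```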
## Required Sections

### 1. Quick Reference

```markdown
## Quick Reference

| Attribute | Value |
|-----------|-------|
| Model | [Name] |
| Size | [Parameters] |
| Context Window | [e.g., 128K, 1M] |
| Best For | [Use case summary] |
| Cost | [If applicable] |
```
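As an illustration, the same table filled in with the Qwen3.5-35B-A3B specs cited at the top of this file (the Best For and Cost rows are hypothetical examples):

```markdown
| Attribute | Value |
|-----------|-------|
| Model | Qwen3.5-35B-A3B |
| Size | 35B total / 3B active (MoE) |
| Context Window | 262k native, 1M extended |
| Best For | [Use case summary] |
| Cost | Free to self-host (Apache 2.0) |
```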
### 2. Benchmark Results

```markdown
## Benchmark Results

### [Benchmark Name]
- **Score:** [X%] (Rank #Y)
- **Harness:** [If Terminal-Bench or harness-specific]
- **Date:** [When tested]
- **Note:** [Any important context]

**Important:** For Terminal-Bench, always note that scores are harness+model combinations.
```
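For example, a benchmark entry using the SWE-bench Verified figure cited at the top of this file (the harness, date, and note are placeholders; check the live leaderboard when compiling):

```markdown
### SWE-bench Verified
- **Score:** 69.2%
- **Harness:** [Harness used, if any]
- **Date:** [When tested]
- **Note:** Figure from the model's published specs; not an independent run.
```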
### 3. What Worked Well

```markdown
## What Worked Well

1. **[Key Point]**
   - Detailed explanation
   - Supporting evidence

2. **[Key Point]**
   - Details
```
### 4. Issues Encountered

```markdown
## Issues Encountered

1. **[Issue Title]**
   - **Severity:** [Critical/Major/Minor]
   - **Description:** Details
   - **Workaround:** If any

2. **[Issue Title]**
   - Details
```
### 5. Configuration (Optional)

````markdown
## Configuration

```json
[Configuration example]
```

Or for CLI flags:

```bash
[Command line options]
```
````
### 6. Source References

```markdown
## Source References

1. **[Source Name]**: [URL]
   - [Brief description of what it covers]

2. **[Source Name]**: [URL]
   - Description
```
## For Multi-Model Files

If a file covers multiple models, use this structure:

```markdown
# [Topic] Feedback for [Harness]

**Date Compiled:** [YYYY-MM-DD]
**Source References:** [Primary sources]

---

## Model Reference Guide

| Model | Size | Provider | Notes |
|-------|------|----------|-------|
| [Name] | [Size] | [Provider] | [Key info] |

---

## [Model 1]

[Follow standard sections above]

---

## [Model 2]

[Follow standard sections above]
```
## Style Guidelines

1. **Use tables** for comparative data
2. **Use bullet points** for lists
3. **Use numbered lists** for sequential steps or ranked items
4. **Bold** key terms and metrics
5. **Italic** for emphasis
6. `Code formatting` for commands, file names, and technical terms
7. **Always cite sources** with full URLs
8. **Note dates** for time-sensitive information
## Special Notes

### Terminal-Bench

Always clarify that Terminal-Bench scores represent **harness+model** combinations, not raw model capability. Include the harness name in the benchmark table.
### Qwen Models

Include the Model Reference Guide when discussing Qwen models to avoid confusion between the Qwen3.5, Qwen3, and Qwen2.5 families.

Current Qwen3.5 models include: 27B, 35B-A3B, 122B-A10B, and 397B-A17B (the A-suffix models are MoE, with the active parameter count after the "A").
### Verified vs Self-Reported

Note whether benchmark scores are:

- **Verified:** Independently validated (e.g., SWE-bench Verified)
- **Self-Reported:** Submitted by the harness developers themselves