Fix Qwen3.5-35B-A3B model references
Reverted incorrect changes - Qwen3.5-35B-A3B IS a real model: - 35B total / 3B active parameters (MoE) - 262k native context (up to 1M extended) - Apache 2.0 license - Available on HuggingFace: Qwen/Qwen3.5-35B-A3B Updated files: - opencode/opencode/feedback/localllm/local-llm-feedback.md - opencode/opencode/feedback/SUMMARY.md - FEEDBACK_TEMPLATE.md Added correct specs: - MMLU-Pro: 85.3% - SWE-bench Verified: 69.2% - Context: 262k native, 1M extended
This commit is contained in:
@@ -152,6 +152,8 @@ Always clarify that Terminal-Bench scores represent **harness+model** combinatio
|
|||||||
### Qwen Models
|
### Qwen Models
|
||||||
Include the Model Reference Guide when discussing Qwen models to avoid confusion between Qwen3, Qwen 3.5, and Qwen2.5 families.
|
Include the Model Reference Guide when discussing Qwen models to avoid confusion between Qwen3, Qwen 3.5, and Qwen2.5 families.
|
||||||
|
|
||||||
|
Current Qwen 3.5 MoE models include: 27B, 35B-A3B, 122B-A10B, 397B-A17B.
|
||||||
|
|
||||||
### Verified vs Self-Reported
|
### Verified vs Self-Reported
|
||||||
Note when benchmark scores are:
|
Note when benchmark scores are:
|
||||||
- **Verified:** Independently validated (e.g., SWE-bench Verified)
|
- **Verified:** Independently validated (e.g., SWE-bench Verified)
|
||||||
|
|||||||
@@ -16,14 +16,12 @@ This document provides a comprehensive summary of community feedback, benchmark
|
|||||||
|
|
||||||
| Rank | Model | Strengths | Best For |
|
| Rank | Model | Strengths | Best For |
|
||||||
|------|-------|-----------|----------|
|
|------|-------|-----------|----------|
|
||||||
| 1 | **Qwen3-30B-A3B** | Best balance of speed, accuracy, context (128k) | General coding, long-context tasks |
|
| 1 | **Qwen3.5-35B-A3B** | Best balance of speed, accuracy, context (262k native, 1M extended) | General coding, long-context tasks |
|
||||||
| 2 | **Gemma 4 26B-A4B** | Excellent on M-series Mac, 8W power usage | Laptop development, M5 MacBook |
|
| 2 | **Gemma 4 26B-A4B** | Excellent on M-series Mac, 8W power usage | Laptop development, M5 MacBook |
|
||||||
| 3 | **GLM-5.1** | SWE-Bench Pro #1 (58.4), 8-hour autonomy | Long-horizon tasks, enterprise |
|
| 3 | **GLM-5.1** | SWE-Bench Pro #1 (58.4), 8-hour autonomy | Long-horizon tasks, enterprise |
|
||||||
| 4 | **Nemotron 3 Super** | PinchBench 85.6%, 1M context | Agentic reasoning, GPU clusters |
|
| 4 | **Nemotron 3 Super** | PinchBench 85.6%, 1M context | Agentic reasoning, GPU clusters |
|
||||||
| 5 | **Gemma 4 8B** | Runs on 16GB RAM, fast | Quick tasks, modest hardware |
|
| 5 | **Gemma 4 8B** | Runs on 16GB RAM, fast | Quick tasks, modest hardware |
|
||||||
|
|
||||||
**Note:** "Qwen3.5-35B-A3B" community references likely mean **Qwen3-30B-A3B**. Qwen 3.5 MoE sizes: 27B, 122B-A10B, 397B-A17B.
|
|
||||||
|
|
||||||
### 2. Best Frontier Models for OpenCode
|
### 2. Best Frontier Models for OpenCode
|
||||||
|
|
||||||
| Rank | Model | Strengths | Best For |
|
| Rank | Model | Strengths | Best For |
|
||||||
@@ -96,7 +94,7 @@ This document provides a comprehensive summary of community feedback, benchmark
|
|||||||
**File:** `opencode/feedback/localllm/local-llm-feedback.md`
|
**File:** `opencode/feedback/localllm/local-llm-feedback.md`
|
||||||
|
|
||||||
**Contents:**
|
**Contents:**
|
||||||
- Qwen3-30B-A3B (MoE) - Detailed performance data (Note: community "Qwen3.5-35B-A3B" references)
|
- Qwen3.5-35B-A3B (MoE) - Detailed performance data
|
||||||
- Gemma 4 26B-A4B - M-series Mac optimization
|
- Gemma 4 26B-A4B - M-series Mac optimization
|
||||||
- GLM-4.7 Flash - API performance
|
- GLM-4.7 Flash - API performance
|
||||||
- GLM-5.1 - 8-hour autonomous capability
|
- GLM-5.1 - 8-hour autonomous capability
|
||||||
@@ -237,7 +235,7 @@ This document provides a comprehensive summary of community feedback, benchmark
|
|||||||
## Recommendations
|
## Recommendations
|
||||||
|
|
||||||
### For Local Development
|
### For Local Development
|
||||||
1. **Qwen3-30B-A3B** - Best overall local model (Note: community references to "Qwen3.5-35B-A3B")
|
1. **Qwen3.5-35B-A3B** - Best overall local model (35B/3B MoE, 262k context)
|
||||||
2. **Gemma 4 26B-A4B** - Best for M-series Mac
|
2. **Gemma 4 26B-A4B** - Best for M-series Mac
|
||||||
3. **Increase context to 32K+**
|
3. **Increase context to 32K+**
|
||||||
4. **Use corrected chat templates**
|
4. **Use corrected chat templates**
|
||||||
@@ -280,7 +278,7 @@ This document provides a comprehensive summary of community feedback, benchmark
|
|||||||
The OpenCode ecosystem has matured significantly with strong support for both local and frontier models. Key findings:
|
The OpenCode ecosystem has matured significantly with strong support for both local and frontier models. Key findings:
|
||||||
|
|
||||||
1. **Local models are viable** for most coding tasks with proper configuration
|
1. **Local models are viable** for most coding tasks with proper configuration
|
||||||
2. **Qwen3-30B-A3B** (often referenced as "Qwen3.5-35B-A3B") is the best local model overall
|
2. **Qwen3.5-35B-A3B** is the best local model overall (35B/3B MoE, Apache 2.0)
|
||||||
3. **GLM-5.1** is the best frontier model (SWE-Bench Pro #1)
|
3. **GLM-5.1** is the best frontier model (SWE-Bench Pro #1)
|
||||||
4. **Context management** is critical for long-running sessions
|
4. **Context management** is critical for long-running sessions
|
||||||
5. **Hybrid setups** offer the best of both worlds
|
5. **Hybrid setups** offer the best of both worlds
|
||||||
|
|||||||
@@ -12,31 +12,34 @@ This document compiles community feedback, benchmark results, and performance ob
|
|||||||
| Model Family | Available Sizes | Type | Notes |
|
| Model Family | Available Sizes | Type | Notes |
|
||||||
|--------------|-----------------|------|-------|
|
|--------------|-----------------|------|-------|
|
||||||
| **Qwen 3.5** | 0.8B, 2B, 4B, 9B | Dense | Released Feb 2026 |
|
| **Qwen 3.5** | 0.8B, 2B, 4B, 9B | Dense | Released Feb 2026 |
|
||||||
| **Qwen 3.5** | 27B, 122B-A10B, 397B-A17B | MoE | Released Feb 2026 |
|
| **Qwen 3.5** | 27B, 35B-A3B, 122B-A10B, 397B-A17B | MoE | Released Feb 2026 |
|
||||||
| **Qwen3** | 0.6B, 1.7B, 4B, 8B, 14B, 32B | Dense | Released April 2025 |
|
| **Qwen3** | 0.6B, 1.7B, 4B, 8B, 14B, 32B | Dense | Released April 2025 |
|
||||||
| **Qwen3** | 30B-A3B, 235B-A22B | MoE | Released April 2025 |
|
| **Qwen3** | 30B-A3B, 235B-A22B | MoE | Released April 2025 |
|
||||||
| **Qwen2.5** | 0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B | Dense | + Coder variants |
|
| **Qwen2.5** | 0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B | Dense | + Coder variants |
|
||||||
|
|
||||||
> **Note:** "Qwen3.5-35B-A3B" references in community posts likely mean **Qwen3-30B-A3B** (from the Qwen3 MoE family) or are speculative. Qwen 3.5 MoE sizes are 27B, 122B-A10B, and 397B-A17B.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### Qwen3-30B-A3B (MoE) [Most likely model referenced]
|
### Qwen3.5-35B-A3B (MoE)
|
||||||
**Model:** Qwen3-30B-A3B (not Qwen 3.5)
|
**Model:** Qwen3.5-35B-A3B
|
||||||
**Size:** 30B total / 3B active parameters
|
**Size:** 35B total / 3B active parameters
|
||||||
**Quantization:** Q4_K_M, Q8_0, UD-Q4_K_XL
|
**Quantization:** Q4_K_M, Q8_0, UD-Q4_K_XL, GPTQ-Int4
|
||||||
**Provider:** llama.cpp / Ollama / HuggingFace
|
**Provider:** llama.cpp / Ollama / vLLM / HuggingFace
|
||||||
|
**Context:** 262k native, up to 1M extended
|
||||||
|
**License:** Apache 2.0
|
||||||
|
|
||||||
**Benchmark Results:**
|
**Benchmark Results:**
|
||||||
- **Performance:** 3-5x faster than dense variants (~60-100 tok/s)
|
- **Performance:** 3-5x faster than dense variants (~60-100 tok/s)
|
||||||
- **Context:** Supports up to 128k context
|
- **Context:** Supports up to 262k context (1M extended)
|
||||||
|
- **MMLU-Pro:** 85.3%
|
||||||
|
- **SWE-bench Verified:** 69.2%
|
||||||
- **Accuracy:** Excellent on coding tasks, comparable to cloud models
|
- **Accuracy:** Excellent on coding tasks, comparable to cloud models
|
||||||
|
|
||||||
**What Worked Well:**
|
**What Worked Well:**
|
||||||
- Long context handling (128k tested)
|
- Long context handling (262k tested, 1M extended)
|
||||||
- Fast inference due to MoE architecture
|
- Fast inference due to MoE architecture
|
||||||
- Good tool calling with corrected chat templates
|
- Good tool calling with corrected chat templates
|
||||||
- Works well with OpenCode's skill system
|
- Works well with OpenCode's skill system
|
||||||
|
- Apache 2.0 license (open source)
|
||||||
|
|
||||||
**Issues Encountered:**
|
**Issues Encountered:**
|
||||||
- Default chat template breaks tool-calling in OpenCode
|
- Default chat template breaks tool-calling in OpenCode
|
||||||
@@ -52,7 +55,7 @@ This document compiles community feedback, benchmark results, and performance ob
|
|||||||
--batch-size 2048
|
--batch-size 2048
|
||||||
--ubatch-size 512
|
--ubatch-size 512
|
||||||
--jinja
|
--jinja
|
||||||
--chat-template-file qwen3-chat-template-corrected.jinja
|
--chat-template-file qwen35-chat-template-corrected.jinja
|
||||||
--context-shift
|
--context-shift
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -324,14 +327,12 @@ docker model configure --context-size=100000 gpt-oss:20B-UD-Q8_K_XL
|
|||||||
|
|
||||||
### Best Local Models for OpenCode (Ranked)
|
### Best Local Models for OpenCode (Ranked)
|
||||||
|
|
||||||
1. **Qwen3-30B-A3B** (or Qwen 3.5 27B-A3B if available) - Best balance of speed, accuracy, context
|
1. **Qwen3.5-35B-A3B** - Best overall balance of speed, accuracy, context (262k native, 1M extended)
|
||||||
2. **Gemma 4 26B-A4B** - Best for M-series Mac, very efficient
|
2. **Gemma 4 26B-A4B** - Best for M-series Mac, very efficient
|
||||||
3. **GLM-5.1** - Best for long-horizon tasks (requires enterprise hardware)
|
3. **GLM-5.1** - Best for long-horizon tasks (requires enterprise hardware)
|
||||||
4. **Nemotron 3 Super** - Best for agentic reasoning (enterprise hardware)
|
4. **Nemotron 3 Super** - Best for agentic reasoning (enterprise hardware)
|
||||||
5. **Gemma 4 8B** - Best for quick tasks on modest hardware
|
5. **Gemma 4 8B** - Best for quick tasks on modest hardware
|
||||||
|
|
||||||
**Note:** Community references to "Qwen3.5-35B-A3B" likely mean **Qwen3-30B-A3B** from the Qwen3 family (not Qwen 3.5). Qwen 3.5 MoE models come in 27B, 122B-A10B, and 397B-A17B sizes.
|
|
||||||
|
|
||||||
### Hybrid Setup Strategy
|
### Hybrid Setup Strategy
|
||||||
- **Local models:** Lightweight tasks, repetitive work, privacy-sensitive
|
- **Local models:** Lightweight tasks, repetitive work, privacy-sensitive
|
||||||
- **Cloud models:** Complex reasoning, multi-file refactors, deep analysis
|
- **Cloud models:** Complex reasoning, multi-file refactors, deep analysis
|
||||||
|
|||||||
Reference in New Issue
Block a user