
Qwen 3.5 with ForgeCode - Feedback Report

Model: Qwen 3.5
Provider: Alibaba Cloud (via local inference)
Harness: ForgeCode
Source References: GitHub Issue #2894, Reddit r/LocalLLaMA
Date Compiled: April 9, 2026


Known Issues

Multiple System Messages Bug

GitHub Issue: #2894 (Open as of April 8, 2026)

Problem: Multiple system messages break models with strict chat templates (e.g., Qwen 3.5)

Error Manifestation:

  • Models with strict chat templates fail to parse message structure correctly
  • Tool calling may fail or produce incorrect results
  • Agent behavior becomes unpredictable

Impact:

  • Affects local inference with llama.cpp, Ollama, and similar servers
  • Qwen 3.5 is specifically mentioned as affected

Workaround Status: No official fix yet; issue under investigation
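Until an official fix lands, one client-side mitigation for strict-template backends is to collapse all system messages into a single leading system message before the request is sent. A minimal sketch, assuming OpenAI-style message dicts; the `merge_system_messages` helper is hypothetical, not part of ForgeCode:

```python
def merge_system_messages(messages):
    """Collapse multiple system messages into one leading system message.

    Strict chat templates (e.g. Qwen 3.5's) may mis-parse conversations
    that contain more than one system-role message.
    """
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    if not system_parts:
        return rest
    merged = {"role": "system", "content": "\n\n".join(system_parts)}
    return [merged] + rest


# Example: a ForgeCode-style payload carrying two system messages
messages = [
    {"role": "system", "content": "You are a coding agent."},
    {"role": "user", "content": "Refactor utils.py"},
    {"role": "system", "content": "Tool definitions: ..."},
]
fixed = merge_system_messages(messages)
assert sum(m["role"] == "system" for m in fixed) == 1
assert fixed[0]["role"] == "system"
```

This preserves all instruction text while presenting the backend with exactly one system message, at the cost of losing the original message ordering.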


Tool Calling with Qwen Models

General Observations from Community

  1. Qwen3-Coder Next shows promise; community reports call it the "first usable coding model < 60GB"

  2. Tool calling reliability varies by inference backend:

    • LM Studio 0.4.9 reportedly handles Qwen 3.5's XML tool parsing more reliably than raw llama.cpp
    • Running llama.cpp with the --jinja flag (which applies the model's chat template) helps with tool calling
  3. Community reports describe the finish_reason issue as annoying to debug
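When debugging finish_reason behavior, it helps to log how each response actually terminated before assuming a tool call succeeded. A minimal sketch against an OpenAI-compatible response shape; the `classify_finish` helper is hypothetical:

```python
def classify_finish(choice):
    """Return a human-readable note for an OpenAI-style response choice."""
    reason = choice.get("finish_reason")
    if reason == "tool_calls":
        return "model requested a tool call"
    if reason == "stop":
        # Some local backends reportedly return "stop" even when the text
        # contains tool-call markup, one source of the debugging pain above.
        return "normal stop; check text for stray tool-call markup"
    if reason == "length":
        return "hit max_tokens; output likely truncated"
    return f"unexpected finish_reason: {reason!r}"


print(classify_finish({"finish_reason": "length"}))
# → hit max_tokens; output likely truncated
```

Logging this per response makes it obvious when a backend mislabels the stop condition rather than silently dropping the tool call.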


Recommendations for Local Use

  1. Prefer LM Studio over raw llama.cpp for more reliable tool parsing
  2. Monitor the system message count; ForgeCode's multi-message approach triggers the known issue above (#2894)
  3. Test thoroughly before relying on Qwen 3.5 via ForgeCode for production tasks

Source References

  1. GitHub Issue: https://github.com/antinomyhq/forgecode/issues/2894
  2. Reddit r/LocalLLaMA: https://www.reddit.com/r/LocalLLaMA/comments/1sdhvc5/qwen_35_tool_calling_fixes_for_agentic_use_whats/