Harnesses under analysis:
- opencode (Go-based coding agent)
- pi (minimal terminal coding harness by Mario Zechner)
- hermes (Nous Research agent)
- forgecode (AI pair programmer with sub-agents)

Each harness folder contains:
- repo/: Source code from the respective repositories
- feedback/localllm/: Community feedback for local/smaller models
- feedback/frontier/: Community feedback for frontier models

Research focus: Tool handling, skills systems, prompt engineering, context management, and best practices for smaller/local models.
Qwen 3.5 with ForgeCode - Feedback Report
Model: Qwen 3.5
Provider: Alibaba Cloud (via local inference)
Harness: ForgeCode
Source References: GitHub Issue #2894, Reddit r/LocalLLaMA
Date Compiled: April 9, 2026
Known Issues
Multiple System Messages Bug
GitHub Issue: #2894 (Open as of April 8, 2026)
Problem: Multiple system messages break models with strict chat templates (e.g., Qwen3.5)
Error Manifestation:
- Models with strict chat templates fail to parse message structure correctly
- Tool calling may fail or produce incorrect results
- Agent behavior becomes unpredictable
Impact:
- Affects local inference with llama.cpp, Ollama, and similar servers
- Qwen3.5 specifically mentioned as affected
Workaround Status: No official fix yet; issue under investigation
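Since there is no official fix yet, one client-side mitigation is to collapse the conversation's system messages into a single leading one before the request reaches the backend. The sketch below is a hypothetical helper (not part of ForgeCode), assuming OpenAI-style message dicts with `role` and `content` keys:

```python
def merge_system_messages(messages):
    """Collapse all system messages into a single leading one.

    Strict chat templates (e.g. Qwen3.5's) may reject or mis-parse
    conversations containing more than one system message, so this
    concatenates their contents and emits exactly one at the start.
    """
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    if not system_parts:
        return rest
    merged = {"role": "system", "content": "\n\n".join(system_parts)}
    return [merged] + rest
```

Note that merging changes where instructions appear in the transcript, which can subtly alter model behavior; it trades fidelity to the harness's multi-message design for template compatibility.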
Tool Calling with Qwen Models
General Observations from Community
- Qwen3-Coder Next shows promise as the "first usable coding model < 60GB"
- Tool calling reliability varies by inference backend:
  - LM Studio 0.4.9 reportedly handles Qwen3.5 XML tool parsing more reliably than raw llama.cpp
  - llama.cpp with the `--jinja` flag helps with tool calling
- The `finish_reason` issue is annoying to debug, according to community reports
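Because the `finish_reason` failures surface only as odd downstream agent behavior, it can help to inspect responses before handing them to the agent loop. A minimal sketch, assuming an OpenAI-compatible chat completion response represented as a plain dict (the `check_finish_reason` helper is hypothetical):

```python
def check_finish_reason(response):
    """Flag suspicious finish_reason values in an OpenAI-compatible
    chat completion response (a plain dict here).

    Local backends sometimes report "length", None, or another
    unexpected value when a tool call was truncated or mis-parsed,
    which silently breaks agent loops; surfacing it early makes the
    failure mode visible instead of leaving it to guesswork.
    """
    problems = []
    for choice in response.get("choices", []):
        reason = choice.get("finish_reason")
        if reason not in ("stop", "tool_calls"):
            problems.append((choice.get("index", 0), reason))
    return problems
```

Logging whatever this returns alongside the raw response is usually enough to tell a truncation problem from a template-parsing one.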
Recommendations for Local Use
- Prefer LM Studio over raw llama.cpp for more reliable tool parsing
- Monitor the system message count: ForgeCode's multi-message approach is a known issue with strict chat templates
- Test thoroughly before relying on Qwen 3.5 for production tasks via ForgeCode
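The monitoring recommendation above can be reduced to a small pre-flight check run before each dispatch. A sketch with a hypothetical `preflight_check` helper, again assuming OpenAI-style message dicts:

```python
import logging

def preflight_check(messages, max_system=1):
    """Warn when a request contains more system messages than a
    strict chat template is likely to tolerate.

    Returns the count so callers can decide whether to merge,
    drop, or abort before dispatching to the backend.
    """
    count = sum(1 for m in messages if m.get("role") == "system")
    if count > max_system:
        logging.warning(
            "Request has %d system messages; strict chat templates "
            "(e.g. Qwen3.5) may mis-parse it.", count
        )
    return count
```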