mid_model_research/opencode/opencode/feedback/localllm/prompt-engineering-feedback.md
sleepy 51123212c4 Initial commit: coding harness feedback analysis
Harnesses under analysis:
- opencode (Go-based coding agent)
- pi (minimal terminal coding harness by Mario Zechner)
- hermes (Nous Research agent)
- forgecode (AI pair programmer with sub-agents)

Each harness folder contains:
- repo/: Source code from respective repositories
- feedback/localllm/: Community feedback for local/smaller models
- feedback/frontier/: Community feedback for frontier models

Research focus: Tool handling, skills systems, prompt engineering,
context management, and best practices for smaller/local models.
2026-04-09 15:13:45 +02:00


Prompt Engineering Strategies Feedback

Overview

This document compiles feedback on prompt engineering strategies for local and frontier models in OpenCode. It focuses on what works well, common pitfalls, and optimization techniques.


Model-Specific Prompt Strategies

Qwen3.5-35B-A3B

Recommended Temperature: 0.6 (default for Qwen models)

Prompt Structure:

You are an expert coding assistant. Your task is to:
1. Analyze the codebase
2. Identify the issue
3. Propose a solution
4. Implement the fix

Focus on:
- Code quality and best practices
- Performance implications
- Edge cases and error handling

What Worked Well:

  • Clear role definition improves output quality
  • Structured task breakdown helps MoE routing
  • Explicit focus areas guide model attention

Issues Encountered:

  • Default template breaks tool calling
  • Requires corrected Jinja template
  • System message ordering critical
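A sketch of the reported workaround, assuming the model is served through llama-server (the template filename is illustrative): the broken default chat template can be overridden at startup with a corrected Jinja file.

```shell
# llama-server can load a corrected chat template from disk;
# qwen3-fixed.jinja is a placeholder for the fixed template file.
llama-server \
  -m qwen3.5-35b-a3b.gguf \
  --jinja \
  --chat-template-file qwen3-fixed.jinja \
  --temp 0.6
```

Because the system message ordering is critical, the corrected template should be verified with a tool-calling request before longer runs.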

Source References:


Gemma 4 26B-A4B

Recommended Temperature: 0.1 (more deterministic)

Prompt Structure:

You are a code reviewer. Focus on:
- Code quality and best practices
- Potential bugs and edge cases
- Performance implications
- Security considerations

Provide constructive feedback without making direct changes.

What Worked Well:

  • Lower temperature (0.1) improves consistency
  • Clear constraints reduce hallucinations
  • Short thinking traces work well

Issues Encountered:

  • Requires more specific guidance than other models
  • Default 4K context causes truncation
  • Needs tool_call: true in config
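To confirm whether the 4K default is in effect before raising it, the model's context length can be inspected (a sketch assuming an Ollama install; the model tag is illustrative):

```shell
# `ollama show` prints model metadata, including the context length
ollama show gemma4:e4b | grep -i context
```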

Source References:


GLM-5.1

Recommended Temperature: Auto (model-specific defaults)

Prompt Structure:

You are an autonomous coding agent. Your task is to:
1. Understand the requirements
2. Plan the implementation
3. Execute the changes
4. Verify the results

You can run for up to 8 hours autonomously.

What Worked Well:

  • Excels at long-horizon tasks
  • 1,700+ autonomous steps possible
  • MIT license allows commercial use

Source References:


Temperature Settings

| Model | Temperature | Use Case |
|---|---|---|
| Qwen3.5-35B-A3B | 0.6 | Default, balanced |
| Gemma 4 26B-A4B | 0.1 | Deterministic, review |
| GLM-5.1 | Auto | Model-specific |
| GPT-5.4 | 0.3-0.5 | General coding |
| Claude Opus 4.6 | 0.3-0.5 | Complex tasks |
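Whichever harness applies them, these values travel in the completion request itself. A minimal curl sketch against an OpenAI-compatible local endpoint (the URL, port, and model name are assumptions):

```shell
# temperature is set per request in the chat completions payload
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.5-35b-a3b",
    "temperature": 0.6,
    "messages": [{"role": "user", "content": "Refactor this function."}]
  }'
```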

Source References:


Context Window Optimization

Increasing Context Window

Ollama:

ollama run gemma4:e4b
/set parameter num_ctx 32768
/save gemma4:e4b-32k
/bye

Docker Model Runner:

docker model configure --context-size=100000 gpt-oss:20B-UD-Q8_K_XL

llama-server:

--ctx-size 65536
--parallel 1
--batch-size 2048
--ubatch-size 512

Source References:


Compaction Threshold

Problem: Hardcoded 75% Threshold

Impact:

| Model | Degradation Start | Compaction Trigger | Result |
|---|---|---|---|
| Gemini | ~30% (300k) | 75% (786k) | 2-3x slower, hallucinations |
| Claude | ~50% | 75% | Significant quality drops |

Proposed Solution:

{
  "compaction": {
    "threshold": 0.40,
    "strategy": "summarize",
    "preserveRecentMessages": 10,
    "preserveSystemPrompt": true
  }
}

Source References:


Prompt Engineering Best Practices

1. Define Agent Role

You are an expert [role] with [X] years of experience.
Your task is to [specific task].

2. Enforce Structured Tool Use

Use the following tools in order:
1. read - to understand the codebase
2. edit - to make changes
3. bash - to verify the changes

3. Require Thorough Testing

After making changes:
- Run existing tests
- Add new tests if needed
- Verify edge cases

4. Set Markdown Standards

Format your response in Markdown:
- Use code blocks for code
- Use bullet points for lists
- Use headers for sections

Source References:


Mode-Specific Prompts

Build Mode (Default)

You are in build mode. Full access to:
- write - create new files
- edit - modify existing files
- bash - execute shell commands
- read - read file contents
- grep - search file contents
- glob - find files by pattern

Plan Mode

You are in plan mode. Limited access:
- read - read file contents
- grep - search file contents
- glob - find files by pattern
- list - list directory contents

Disabled:
- write - cannot create new files
- edit - cannot modify files
- bash - cannot execute commands

Source References:


Custom Mode Examples

Code Review Mode

---
model: anthropic/claude-sonnet-4-20250514
temperature: 0.1
tools:
  write: false
  edit: false
  bash: false
---

You are in code review mode. Focus on:
- Code quality and best practices
- Potential bugs and edge cases
- Performance implications
- Security considerations

Provide constructive feedback without making direct changes.

Documentation Mode

{
  "mode": {
    "docs": {
      "prompt": "{file:./prompts/documentation.txt}",
      "tools": {
        "write": true,
        "edit": true,
        "bash": false,
        "read": true,
        "grep": true,
        "glob": true
      }
    }
  }
}

Source References:


Prompt Variants

Built-in Variants

Anthropic:

  • high (default)
  • max

OpenAI:

  • none
  • minimal
  • low
  • medium
  • high
  • xhigh

Google:

  • low
  • high

Custom Variants

{
  "provider": {
    "openai": {
      "models": {
        "gpt-5": {
          "variants": {
            "thinking": {
              "reasoningEffort": "high",
              "textVerbosity": "low"
            },
            "fast": {
              "disabled": true
            }
          }
        }
      }
    }
  }
}

Source References:


Context Management Strategies

Keep Model Loaded

# Prevent Ollama from unloading model
launchctl setenv OLLAMA_KEEP_ALIVE "-1"
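keep_alive can also be set per request through Ollama's HTTP API, which avoids relying on launchctl environment variables (the endpoint is Ollama's default; the model tag is illustrative):

```shell
# a keep_alive of -1 pins the model in memory until Ollama restarts
curl -s http://localhost:11434/api/generate \
  -d '{"model": "gemma4:latest", "keep_alive": -1}'
```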

Auto-Preload on Startup

# Create LaunchAgent to keep model warm
cat << 'EOF' > ~/Library/LaunchAgents/com.ollama.preload-gemma4.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.ollama.preload-gemma4</string>
    <key>ProgramArguments</key>
    <array>
        <string>/opt/homebrew/bin/ollama</string>
        <string>run</string>
        <string>gemma4:latest</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>StartInterval</key>
    <integer>300</integer>
</dict>
</plist>
EOF

launchctl load ~/Library/LaunchAgents/com.ollama.preload-gemma4.plist
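After loading the agent, a quick sanity check confirms it is registered and the model is resident:

```shell
# verify the LaunchAgent is registered and the model stays warm
launchctl list | grep com.ollama.preload-gemma4
ollama ps
```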

Source References:


Data Sources Summary

| Source Type | Count | Topics Covered |
|---|---|---|
| Reddit Threads | 2 | Prompt strategies, user experiences |
| GitHub Issues | 1 | Configuration problems |
| Blog Posts | 4 | Setup guides, optimization |
| Documentation | 4 | Official docs, configuration |
| Technical Blogs | 2 | Architecture, performance |

Total Sources: 13 unique sources