mid_model_research/opencode/opencode/feedback/localllm/prompt-engineering-feedback.md
sleepy 51123212c4 Initial commit: coding harness feedback analysis
Harnesses under analysis:
- opencode (Go-based coding agent)
- pi (minimal terminal coding harness by Mario Zechner)
- hermes (Nous Research agent)
- forgecode (AI pair programmer with sub-agents)

Each harness folder contains:
- repo/: Source code from respective repositories
- feedback/localllm/: Community feedback for local/smaller models
- feedback/frontier/: Community feedback for frontier models

Research focus: Tool handling, skills systems, prompt engineering,
context management, and best practices for smaller/local models.
2026-04-09 15:13:45 +02:00


Prompt Engineering Strategies Feedback

Overview

This document compiles feedback on prompt engineering strategies for local and frontier models in OpenCode. It focuses on what works well, common pitfalls, and optimization techniques.


Model-Specific Prompt Strategies

Qwen3.5-35B-A3B

Recommended Temperature: 0.6 (default for Qwen models)

Prompt Structure:

You are an expert coding assistant. Your task is to:
1. Analyze the codebase
2. Identify the issue
3. Propose a solution
4. Implement the fix

Focus on:
- Code quality and best practices
- Performance implications
- Edge cases and error handling

What Worked Well:

  • Clear role definition improves output quality
  • Structured task breakdown helps MoE routing
  • Explicit focus areas guide model attention

Issues Encountered:

  • Default template breaks tool calling
  • Requires corrected Jinja template
  • System message ordering critical
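A sketch of the reported workaround, assuming the model is served through llama-server (the template filename is illustrative): the broken default chat template can be overridden at startup with a corrected Jinja file.

```shell
# llama-server can load a corrected chat template from disk;
# qwen3-fixed.jinja is a placeholder for the fixed template file.
llama-server \
  -m qwen3.5-35b-a3b.gguf \
  --jinja \
  --chat-template-file qwen3-fixed.jinja \
  --temp 0.6
```

Because the system message ordering is critical, the corrected template should be verified with a tool-calling request before longer runs.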

Source References:


Gemma 4 26B-A4B

Recommended Temperature: 0.1 (more deterministic)

Prompt Structure:

You are a code reviewer. Focus on:
- Code quality and best practices
- Potential bugs and edge cases
- Performance implications
- Security considerations

Provide constructive feedback without making direct changes.

What Worked Well:

  • Lower temperature (0.1) improves consistency
  • Clear constraints reduce hallucinations
  • Short thinking traces work well

Issues Encountered:

  • Requires more specific guidance than other models
  • Default 4K context causes truncation
  • Needs tool_call: true in config
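To confirm whether the 4K default is in effect before raising it, the model's context length can be inspected (a sketch assuming an Ollama install; the model tag is illustrative):

```shell
# `ollama show` prints model metadata, including the context length
ollama show gemma4:e4b | grep -i context
```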

Source References:


GLM-5.1

Recommended Temperature: Auto (model-specific defaults)

Prompt Structure:

You are an autonomous coding agent. Your task is to:
1. Understand the requirements
2. Plan the implementation
3. Execute the changes
4. Verify the results

You can run for up to 8 hours autonomously.

What Worked Well:

  • Excels at long-horizon tasks
  • 1,700+ autonomous steps possible
  • MIT license allows commercial use

Source References:


Temperature Settings

| Model | Temperature | Use Case |
|---|---|---|
| Qwen3.5-35B-A3B | 0.6 | Default, balanced |
| Gemma 4 26B-A4B | 0.1 | Deterministic, review |
| GLM-5.1 | Auto | Model-specific |
| GPT-5.4 | 0.3-0.5 | General coding |
| Claude Opus 4.6 | 0.3-0.5 | Complex tasks |
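Whichever harness applies them, these values travel in the completion request itself. A minimal curl sketch against an OpenAI-compatible local endpoint (the URL, port, and model name are assumptions):

```shell
# temperature is set per request in the chat completions payload
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.5-35b-a3b",
    "temperature": 0.6,
    "messages": [{"role": "user", "content": "Refactor this function."}]
  }'
```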

Source References:


Context Window Optimization

Increasing Context Window

Ollama:

ollama run gemma4:e4b
/set parameter num_ctx 32768
/save gemma4:e4b-32k
/bye

Docker Model Runner:

docker model configure --context-size=100000 gpt-oss:20B-UD-Q8_K_XL

llama-server:

--ctx-size 65536
--parallel 1
--batch-size 2048
--ubatch-size 512

Source References:


Compaction Threshold

Problem: Hardcoded 75% Threshold

Impact:

| Model | Degradation Start | Compaction Trigger | Result |
|---|---|---|---|
| Gemini | ~30% (300k) | 75% (786k) | 2-3x slower, hallucinations |
| Claude | ~50% | 75% | Significant quality drops |

Proposed Solution:

{
  "compaction": {
    "threshold": 0.40,
    "strategy": "summarize",
    "preserveRecentMessages": 10,
    "preserveSystemPrompt": true
  }
}

Source References:


Prompt Engineering Best Practices

1. Define Agent Role

You are an expert [role] with [X] years of experience.
Your task is to [specific task].

2. Enforce Structured Tool Use

Use the following tools in order:
1. read - to understand the codebase
2. edit - to make changes
3. bash - to verify the changes

3. Require Thorough Testing

After making changes:
- Run existing tests
- Add new tests if needed
- Verify edge cases

4. Set Markdown Standards

Format your response in Markdown:
- Use code blocks for code
- Use bullet points for lists
- Use headers for sections

Source References:


Mode-Specific Prompts

Build Mode (Default)

You are in build mode. Full access to:
- write - create new files
- edit - modify existing files
- bash - execute shell commands
- read - read file contents
- grep - search file contents
- glob - find files by pattern

Plan Mode

You are in plan mode. Limited access:
- read - read file contents
- grep - search file contents
- glob - find files by pattern
- list - list directory contents

Disabled:
- write - cannot create new files
- edit - cannot modify files
- bash - cannot execute commands

Source References:


Custom Mode Examples

Code Review Mode

---
model: anthropic/claude-sonnet-4-20250514
temperature: 0.1
tools:
  write: false
  edit: false
  bash: false
---

You are in code review mode. Focus on:
- Code quality and best practices
- Potential bugs and edge cases
- Performance implications
- Security considerations

Provide constructive feedback without making direct changes.

Documentation Mode

{
  "mode": {
    "docs": {
      "prompt": "{file:./prompts/documentation.txt}",
      "tools": {
        "write": true,
        "edit": true,
        "bash": false,
        "read": true,
        "grep": true,
        "glob": true
      }
    }
  }
}

Source References:


Prompt Variants

Built-in Variants

Anthropic:

  • high (default)
  • max

OpenAI:

  • none
  • minimal
  • low
  • medium
  • high
  • xhigh

Google:

  • low
  • high

Custom Variants

{
  "provider": {
    "openai": {
      "models": {
        "gpt-5": {
          "variants": {
            "thinking": {
              "reasoningEffort": "high",
              "textVerbosity": "low"
            },
            "fast": {
              "disabled": true
            }
          }
        }
      }
    }
  }
}

Source References:


Context Management Strategies

Keep Model Loaded

# Prevent Ollama from unloading model
launchctl setenv OLLAMA_KEEP_ALIVE "-1"
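keep_alive can also be set per request through Ollama's HTTP API, which avoids relying on launchctl environment variables (the endpoint is Ollama's default; the model tag is illustrative):

```shell
# a keep_alive of -1 pins the model in memory until Ollama restarts
curl -s http://localhost:11434/api/generate \
  -d '{"model": "gemma4:latest", "keep_alive": -1}'
```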

Auto-Preload on Startup

# Create LaunchAgent to keep model warm
cat << 'EOF' > ~/Library/LaunchAgents/com.ollama.preload-gemma4.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.ollama.preload-gemma4</string>
    <key>ProgramArguments</key>
    <array>
        <string>/opt/homebrew/bin/ollama</string>
        <string>run</string>
        <string>gemma4:latest</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>StartInterval</key>
    <integer>300</integer>
</dict>
</plist>
EOF

launchctl load ~/Library/LaunchAgents/com.ollama.preload-gemma4.plist
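After loading the agent, a quick sanity check confirms it is registered and the model is resident:

```shell
# verify the LaunchAgent is registered and the model stays warm
launchctl list | grep com.ollama.preload-gemma4
ollama ps
```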

Source References:


Data Sources Summary

| Source Type | Count | Topics Covered |
|---|---|---|
| Reddit Threads | 2 | Prompt strategies, user experiences |
| GitHub Issues | 1 | Configuration problems |
| Blog Posts | 4 | Setup guides, optimization |
| Documentation | 4 | Official docs, configuration |
| Technical Blogs | 2 | Architecture, performance |

Total Sources: 13 unique sources