# Prompt Engineering Strategies Feedback

## Overview

This document compiles feedback on prompt engineering strategies for local and frontier models in OpenCode. It focuses on what works well, common pitfalls, and optimization techniques.
## Model-Specific Prompt Strategies

### Qwen3.5-35B-A3B

Recommended Temperature: 0.6 (default for Qwen models)

Prompt Structure:

```
You are an expert coding assistant. Your task is to:
1. Analyze the codebase
2. Identify the issue
3. Propose a solution
4. Implement the fix

Focus on:
- Code quality and best practices
- Performance implications
- Edge cases and error handling
```
What Worked Well:
- Clear role definition improves output quality
- Structured task breakdown helps MoE routing
- Explicit focus areas guide model attention
Issues Encountered:
- Default template breaks tool calling
- Requires corrected Jinja template
- System message ordering critical
Source References:
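The message-ordering pitfall noted above can be guarded against in harness code. A minimal sketch, assuming a simple role/content message shape; the helper name is illustrative, not OpenCode's API:

```python
# Illustrative sketch: Qwen-style chat templates expect the system message
# to come first, before tool definitions and user turns.
def order_messages(messages):
    """Place system messages before all other turns, preserving relative order."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest

msgs = [
    {"role": "user", "content": "Fix the failing test"},
    {"role": "system", "content": "You are an expert coding assistant."},
]
ordered = order_messages(msgs)  # system message now leads the conversation
```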
### Gemma 4 26B-A4B

Recommended Temperature: 0.1 (more deterministic)

Prompt Structure:

```
You are a code reviewer. Focus on:
- Code quality and best practices
- Potential bugs and edge cases
- Performance implications
- Security considerations

Provide constructive feedback without making direct changes.
```

What Worked Well:
- Lower temperature (0.1) improves consistency
- Clear constraints reduce hallucinations
- Short thinking traces work well

Issues Encountered:
- Requires more specific guidance than other models
- Default 4K context causes truncation
- Needs `tool_call: true` in config
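As an illustration of the last point, such a flag would live in the model's configuration. The key path below is an assumption for illustration only, not a confirmed OpenCode schema:

```json
{
  "model": {
    "gemma4-26b-a4b": {
      "tool_call": true
    }
  }
}
```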
Source References:
### GLM-5.1

Recommended Temperature: Auto (model-specific defaults)

Prompt Structure:

```
You are an autonomous coding agent. Your task is to:
1. Understand the requirements
2. Plan the implementation
3. Execute the changes
4. Verify the results

You can run for up to 8 hours autonomously.
```
What Worked Well:
- Excels at long-horizon tasks
- 1,700+ autonomous steps possible
- MIT license allows commercial use
Source References:
## Temperature Settings

### Recommended Temperatures by Model
| Model | Temperature | Use Case |
|---|---|---|
| Qwen3.5-35B-A3B | 0.6 | Default, balanced |
| Gemma 4 26B-A4B | 0.1 | Deterministic, review |
| GLM-5.1 | Auto | Model-specific |
| GPT-5.4 | 0.3-0.5 | General coding |
| Claude Opus 4.6 | 0.3-0.5 | Complex tasks |
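These defaults can be centralized in harness configuration. A minimal Python sketch of the table above, where the model keys and the fallback value are illustrative assumptions:

```python
# Per-model sampling defaults, transcribed from the table above.
# None means "defer to the model's own default" (the GLM-5.1 "Auto" case).
TEMPERATURE_DEFAULTS = {
    "qwen3.5-35b-a3b": 0.6,   # balanced default
    "gemma4-26b-a4b": 0.1,    # deterministic, review-oriented
    "glm-5.1": None,          # model-specific default
    "gpt-5.4": 0.4,           # midpoint of the 0.3-0.5 range
    "claude-opus-4.6": 0.4,   # midpoint of the 0.3-0.5 range
}

def temperature_for(model: str, fallback: float = 0.3):
    """Look up a sampling temperature; unknown models get a conservative fallback."""
    return TEMPERATURE_DEFAULTS.get(model.lower(), fallback)
```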
Source References:
## Context Window Optimization

### Increasing Context Window

Ollama:

```shell
ollama run gemma4:e4b
/set parameter num_ctx 32768
/save gemma4:e4b-32k
/bye
```

Docker Model Runner:

```shell
docker model configure --context-size=100000 gpt-oss:20B-UD-Q8_K_XL
```

llama-server:

```shell
--ctx-size 65536
--parallel 1
--batch-size 2048
--ubatch-size 512
```
Source References:
## Compaction Threshold

### Problem: Hardcoded 75% Threshold

Impact:
| Model | Degradation Start | Compaction Trigger | Result |
|---|---|---|---|
| Gemini | ~30% (300k) | 75% (786k) | 2-3x slower, hallucinations |
| Claude | ~50% | 75% | Significant quality drops |
Proposed Solution:

```json
{
  "compaction": {
    "threshold": 0.40,
    "strategy": "summarize",
    "preserveRecentMessages": 10,
    "preserveSystemPrompt": true
  }
}
```
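The proposed threshold reduces to a single comparison. A sketch, assuming token counts are available at the call site; the field names mirror the JSON above, and the harness integration is not shown:

```python
# Configurable compaction trigger: fire once usage crosses a fraction of the window,
# rather than the hardcoded 75% that lets quality degrade first.
def should_compact(used_tokens: int, context_window: int, threshold: float = 0.40) -> bool:
    """Return True when context usage reaches the configured threshold."""
    return used_tokens / context_window >= threshold
```

With a 1M-token window, a 0.40 threshold fires at 400k tokens instead of 750k, much closer to the degradation points the table reports.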
Source References:
## Prompt Engineering Best Practices

### 1. Define Agent Role

```
You are an expert [role] with [X] years of experience.
Your task is to [specific task].
```

### 2. Enforce Structured Tool Use

```
Use the following tools in order:
1. read - to understand the codebase
2. edit - to make changes
3. bash - to verify the changes
```

### 3. Require Thorough Testing

```
After making changes:
- Run existing tests
- Add new tests if needed
- Verify edge cases
```

### 4. Set Markdown Standards

```
Format your response in Markdown:
- Use code blocks for code
- Use bullet points for lists
- Use headers for sections
```
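Taken together, the four practices compose into one system prompt. A hedged sketch: the builder function is illustrative, and the section texts are the ones above:

```python
# Assemble a system prompt from the four best-practice sections:
# role definition, structured tool use, testing requirements, Markdown standards.
def build_system_prompt(role: str, task: str) -> str:
    sections = [
        f"You are an expert {role} with 10 years of experience.\n"
        f"Your task is to {task}.",
        "Use the following tools in order:\n"
        "1. read - to understand the codebase\n"
        "2. edit - to make changes\n"
        "3. bash - to verify the changes",
        "After making changes:\n"
        "- Run existing tests\n"
        "- Add new tests if needed\n"
        "- Verify edge cases",
        "Format your response in Markdown:\n"
        "- Use code blocks for code\n"
        "- Use bullet points for lists\n"
        "- Use headers for sections",
    ]
    return "\n\n".join(sections)

prompt = build_system_prompt("Go developer", "fix the failing build")
```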
Source References:
## Mode-Specific Prompts

### Build Mode (Default)

```
You are in build mode. Full access to:
- write - create new files
- edit - modify existing files
- bash - execute shell commands
- read - read file contents
- grep - search file contents
- glob - find files by pattern
```

### Plan Mode

```
You are in plan mode. Limited access:
- read - read file contents
- grep - search file contents
- glob - find files by pattern
- list - list directory contents

Disabled:
- write - cannot create new files
- edit - cannot modify files
- bash - cannot execute commands
```
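The build/plan split above amounts to a per-mode allowlist. A minimal sketch, with tool and mode names taken from this document and the helper itself assumed:

```python
# Mode-specific tool allowlists, as described in the prompts above.
MODE_TOOLS = {
    "build": {"write", "edit", "bash", "read", "grep", "glob"},
    "plan": {"read", "grep", "glob", "list"},
}

def allowed(mode: str, tool: str) -> bool:
    """Check whether a tool call is permitted in the given mode."""
    return tool in MODE_TOOLS.get(mode, set())
```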
Source References:
## Custom Mode Examples

### Code Review Mode

```markdown
---
model: anthropic/claude-sonnet-4-20250514
temperature: 0.1
tools:
  write: false
  edit: false
  bash: false
---
You are in code review mode. Focus on:
- Code quality and best practices
- Potential bugs and edge cases
- Performance implications
- Security considerations

Provide constructive feedback without making direct changes.
```
### Documentation Mode

```json
{
  "mode": {
    "docs": {
      "prompt": "{file:./prompts/documentation.txt}",
      "tools": {
        "write": true,
        "edit": true,
        "bash": false,
        "read": true,
        "grep": true,
        "glob": true
      }
    }
  }
}
```
Source References:
## Prompt Variants

### Built-in Variants

Anthropic:
- `high` (default)
- `max`

OpenAI:
- `none`
- `minimal`
- `low`
- `medium`
- `high`
- `xhigh`

Google:
- `low`
- `high`
### Custom Variants

```json
{
  "provider": {
    "openai": {
      "models": {
        "gpt-5": {
          "variants": {
            "thinking": {
              "reasoningEffort": "high",
              "textVerbosity": "low"
            },
            "fast": {
              "disabled": true
            }
          }
        }
      }
    }
  }
}
```
Source References:
## Context Management Strategies

### Keep Model Loaded

```shell
# Prevent Ollama from unloading the model
launchctl setenv OLLAMA_KEEP_ALIVE "-1"
```
### Auto-Preload on Startup

```shell
# Create LaunchAgent to keep model warm
cat << 'EOF' > ~/Library/LaunchAgents/com.ollama.preload-gemma4.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.ollama.preload-gemma4</string>
  <key>ProgramArguments</key>
  <array>
    <string>/opt/homebrew/bin/ollama</string>
    <string>run</string>
    <string>gemma4:latest</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
  <key>StartInterval</key>
  <integer>300</integer>
</dict>
</plist>
EOF
launchctl load ~/Library/LaunchAgents/com.ollama.preload-gemma4.plist
```
Source References:
## Data Sources Summary
| Source Type | Count | Topics Covered |
|---|---|---|
| Reddit Threads | 2 | Prompt strategies, user experiences |
| GitHub Issues | 1 | Configuration problems |
| Blog Posts | 4 | Setup guides, optimization |
| Documentation | 4 | Official docs, configuration |
| Technical Blogs | 2 | Architecture, performance |
Total Sources: 13 unique sources