# Investigation: 31k Token Context Issue

## Problem

When requests are routed through opencode to local_swarm, the LLM receives ~31k tokens of context, even for a simple query against an empty directory.
## Root Cause Identified

**NOT an issue with this repo's codebase.** This is expected behavior for function calling.

**How it works:**
- opencode sends tool definitions in the system message using OpenAI's function calling format
- Each tool definition is ~450 tokens (name + description + parameters)
- opencode has ~60 tools (read, write, bash, glob, grep, edit, question, webfetch, task, etc.)
- Total across all tool definitions: ~27,000 tokens
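For reference, a single definition in OpenAI's function-calling format looks roughly like the sketch below. This is an abbreviated, illustrative schema; the description text and the `offset`/`limit` parameters are modeled on opencode's `read` tool, not copied from it:

```python
# Illustrative (abbreviated) tool definition in OpenAI function-calling format.
# Description and parameter details are placeholders, not opencode's exact wording.
read_tool = {
    "type": "function",
    "function": {
        "name": "read",
        "description": "Read a file or directory from the local filesystem. "
                       "Returns file contents with line numbers, or a directory listing.",
        "parameters": {
            "type": "object",
            "properties": {
                "filePath": {"type": "string", "description": "Absolute path to read"},
                "offset": {"type": "integer", "description": "Line to start reading from"},
                "limit": {"type": "integer", "description": "Maximum number of lines"},
            },
            "required": ["filePath"],
        },
    },
}
```

The real definitions carry much longer descriptions, which is where most of the ~450 tokens per tool goes.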
**Calculation:**

```
Single tool definition:     ~450 tokens
Number of tools:             ~60
Tool schemas total:      ~27,000 tokens
System message:             ~500 tokens
User query:                 ~100 tokens
----------------------------------------
Total:                   ~27,600 tokens
```
This accounts for the bulk of the observed ~31k tokens.
## Why This Happens
OpenAI's function calling protocol requires sending the complete function schemas to the LLM with every request. This is how the model:
- Knows what tools are available
- Understands parameter requirements
- Knows how to format tool calls
All major LLM providers using function calling work this way (OpenAI, Anthropic, local models, etc.).
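As a concrete illustration of the recurring cost, here is roughly what each request looks like through the OpenAI Python client. The `base_url`, model name, and `ALL_TOOL_SCHEMAS` are placeholders, not the actual wiring between opencode and local_swarm:

```python
from openai import OpenAI

# Hypothetical endpoint/model: stands in for however opencode reaches local_swarm.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

# Elided: in practice this list holds all ~60 opencode schemas, each a dict
# shaped like the `read` example above (~450 tokens apiece).
ALL_TOOL_SCHEMAS: list[dict] = [...]

response = client.chat.completions.create(
    model="local_swarm",
    messages=[{"role": "user", "content": "List the files in this directory."}],
    tools=ALL_TOOL_SCHEMAS,  # the full ~27k tokens of schemas ride along on every request
)
print(response.choices[0].message)
```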
## Verification

```bash
python -c "
import tiktoken
enc = tiktoken.get_encoding('cl100k_base')
# Example from an actual opencode tool definition
read_tool_schema = '''{\"type\": \"function\", \"function\": {\"name\": \"read\", \"description\": \"Read a file or directory from the local filesystem...[full description]\", \"parameters\": {...}}}'''
print(f'Single tool schema: {len(enc.encode(read_tool_schema))} tokens')
print(f'Estimated 60 tools: {len(enc.encode(read_tool_schema)) * 60:,} tokens')
"
```
**Result:**
- Single tool definition: ~451 tokens
- 60 tools: ~27,060 tokens
- Plus system + user message: ~27,660 total
## This Is NOT a Bug
The 31k token context is correct and expected for function calling with 60+ tools. This is how:
- OpenAI API works
- Claude API works
- Local models with function calling work
## Potential Optimizations (Optional)

If reducing context size is critical, consider:

**Option 1: Dynamic Tool Selection**
- Only send tools relevant to current task
- Example: For file operations, only send [read, write, glob, edit]
- Trade-off: Requires opencode to intelligently filter tools (see the sketch below)
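A minimal sketch of the idea, reusing the hypothetical `ALL_TOOL_SCHEMAS` list from above; the task-to-tool mapping and `select_tools` helper are illustrative, not anything opencode currently implements:

```python
# Hypothetical task -> tool mapping; opencode exposes nothing like this today.
TASK_TOOLS = {
    "file_ops": {"read", "write", "glob", "edit"},
    "search": {"grep", "glob", "read"},
    "shell": {"bash"},
}

def select_tools(all_schemas: list[dict], task: str) -> list[dict]:
    """Keep only the schemas whose tool names are relevant to the task."""
    wanted = TASK_TOOLS.get(task, set())
    return [s for s in all_schemas if s["function"]["name"] in wanted]

# 4 schemas at ~450 tokens each is ~1,800 tokens instead of ~27,000.
file_op_schemas = select_tools(ALL_TOOL_SCHEMAS, "file_ops")
```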
**Option 2: Compressed Tool Descriptions**
- Shorten tool descriptions to essentials
- Example: "Read file at path (required: filePath)"
- Trade-off: Model may make more errors with less guidance (savings measured below)
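The savings are easy to gauge with the same tiktoken approach used in the verification above; the verbose description here is a stand-in, not opencode's actual text:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# Stand-in for a verbose description; opencode's actual wording differs.
verbose = (
    "Read a file or directory from the local filesystem. Returns the file "
    "contents with line numbers prepended, or a directory listing. Prefer "
    "this tool over shelling out to cat or ls. Large files are truncated."
)
compressed = "Read file at path (required: filePath)"

print(f"verbose:    {len(enc.encode(verbose))} tokens")
print(f"compressed: {len(enc.encode(compressed))} tokens")
```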
**Option 3: Tool Grouping**

- Group similar tools into a single combined definition (e.g. one `tools: [read, write, glob]`-style parameter)
- Trade-off: Breaks OpenAI compatibility
## Recommendation

**NO ACTION REQUIRED.** The ~31k token context is:
- Standard for function calling with many tools
- Within capabilities of modern LLMs (32k-128k context windows)
- Not caused by this repo's code
The .opencodeignore created earlier will help with opencode's own system prompt, but doesn't affect the LLM context sent to local_swarm.
## Additional Finding
While investigating, verified:
- `config/prompts/tool_instructions.txt`: 125 tokens ✅
- This repo's tool execution code: no token bloat ✅
- The issue is purely opencode's function calling protocol ✅