# General Frontier Model Feedback

**Collection Date:** 2026-04-09
**Sources:** GitHub issues, blog posts, community discussions, official documentation

---

## Provider Support Matrix

| Provider | Status | Special Features |
|----------|--------|------------------|
| OpenAI | ✅ Full | Codex OAuth support |
| Anthropic | ✅ Full | Claude Code credential store |
| OpenRouter | ✅ Full | 200+ models, flexible |
| Nous Portal | ✅ Full | OAuth, subscription |
| Kimi/Moonshot | ✅ Full | 75% cache discount |
| DeepSeek | ✅ Full | 90% cache discount |
| MiniMax | ✅ Full | Token plan support |
| z.ai/GLM | ✅ Full | China/global endpoints |
| Gemini | ✅ Full | Via OpenRouter or direct |

---

## Key Feedback Themes

### 1. Token Overhead Is the Hidden Cost

**Critical Issue:** Every API call includes ~13.9K tokens of fixed overhead

**Source:** GitHub Issue #4379

> "The dollar cost depends on the model/provider, but the token overhead is constant — these 13.9K tokens are sent on every call whether using Sonnet, Haiku, Llama, or any other model via OpenRouter."

**Breakdown (of a typical ~19K-token call):**
- Tool definitions: 8,759 tokens (46%)
- System prompt: 5,176 tokens (27%)
- Actual messages: ~5,000 tokens (27%)

The tool definitions and system prompt together make up the ~13.9K tokens of fixed overhead; only the messages vary per request.

**Impact on Costs:**
- A "simple weather query" can cost 21,000 tokens when the agent spawns a terminal
- One user reported: "4 million tokens in 2 hours of light usage"

### 2. CLI vs. Gateway Token Disparity (Fixed in v0.6.0)

**Bug (pre-v0.6.0):** Telegram used 2-3x more tokens than the CLI

| Access Method | Tokens/Request |
|---------------|----------------|
| CLI | 6,000-8,000 |
| Telegram (old) | 15,000-20,000 |

**Root Cause:** The gateway started in the repo directory instead of the home directory

**Fix:** Update to v0.6.0+ and restart the gateway

### 3. Tool Reliability by Provider

**Most Reliable:**
1. Claude Sonnet (excellent tool calling)
2. GPT-4 class models (very reliable)
3. Kimi K2.5 (good for the price)

**Acceptable:**
- MiniMax
- DeepSeek
- Gemini

**Variable:**
- Depends on specific task complexity
- Budget models may struggle with novel tools

---

## Cost Management Strategies

### Strategy 1: Tiered Model Usage

```
Complex reasoning → Claude Sonnet / GPT-4
Routine tasks → Kimi K2.5 / MiniMax
Vision tasks → Gemini Flash / GPT-4o
Maximum savings → DeepSeek with cache
```

### Strategy 2: Session Management

- Use `hermes --fresh` for unrelated tasks
- Run token-intensive work in the CLI rather than through the gateway
- Monitor with the `/usage` command

### Strategy 3: Toolset Optimization

- Disable unused skill categories (~2,200 tokens saved)
- Use platform-specific toolsets (~1,300 tokens saved)
- Keep MEMORY.md lean

---

## Provider-Specific Notes

### OpenRouter
- **Best for:** Flexibility, trying different models
- **Pros:** 200+ models, single API key
- **Cons:** Cache support depends on upstream

### Anthropic/Claude
- **Best for:** Complex reasoning, reliability
- **Pros:** Excellent tool calling, context understanding
- **Cons:** Higher cost, no special cache discounts

### Nous Portal
- **Best for:** Supporting the project, native integration
- **Pros:** OAuth, built-in support
- **Cons:** Subscription model

### Budget Providers (Kimi, DeepSeek, MiniMax)
- **Best for:** High volume, routine tasks
- **Pros:** 50-90% cost savings, fast
- **Cons:** May struggle with complex tasks

---

## Community Quotes

**On Cost:**
> "Choosing a cache-friendly provider is the single biggest lever for reducing costs."

**On Performance:**
> "One developer reported that a task taking OpenClaw 50+ tool calls and steps took Hermes 5 correct tool calls and finished 2.5 minutes faster."

**On Model Selection:**
> "For most users: Kimi K2.5 from Moonshot or MiniMax as a daily driver — both are fast, capable, and inexpensive."
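The tiered-usage idea from Strategy 1 can be sketched as a small routing table. This is purely illustrative: the `route_model` helper and the model identifier strings are hypothetical, not part of any real API; the tier-to-model mapping mirrors the strategy block above.

```python
# Illustrative sketch of Strategy 1 (tiered model usage).
# Model name strings and route_model() are hypothetical examples,
# not real API identifiers.

TIERS = {
    "complex_reasoning": "claude-sonnet",  # or a GPT-4-class model
    "routine": "kimi-k2.5",                # or MiniMax
    "vision": "gemini-flash",              # or GPT-4o
    "budget": "deepseek-chat",             # maximum savings with cache
}

def route_model(task_type: str) -> str:
    """Pick a model for a task category, defaulting to the budget tier."""
    return TIERS.get(task_type, TIERS["budget"])

print(route_model("routine"))       # kimi-k2.5
print(route_model("misc_chatter"))  # deepseek-chat (fallback)
```

The point of the fallback to the budget tier is that misclassified tasks fail cheap: a routine task sent to an expensive model wastes money, while a hard task sent to a budget model can simply be retried one tier up.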
---

## Summary Table

| Provider | Cost | Reliability | Speed | Cache |
|----------|------|-------------|-------|-------|
| Claude Sonnet | $$$$ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | Partial |
| GPT-4 | $$$$ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | Partial |
| Kimi K2.5 | $$ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | 75% |
| DeepSeek | $ | ⭐⭐⭐ | ⭐⭐⭐⭐ | 90% |
| MiniMax | $$ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | No |
| OpenRouter | Varies | Varies | Varies | Varies |

**Legend:**
- $ = Budget friendly (more $ = more expensive)
- ⭐ = Performance rating (out of 5)
- Cache % = Discount on cached tokens
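The Cache column can be connected back to the ~13.9K-token fixed overhead with some back-of-the-envelope integer arithmetic. The sketch below assumes, simplifying, that the entire fixed prefix (tool definitions + system prompt) is a cache hit on repeat calls and that cached tokens bill at the discounted rate; real cache behavior varies by provider.

```python
# Rough arithmetic: tokens effectively billed per call for the fixed
# overhead, under a provider's cache discount. The 13,935 figure is
# tool definitions (8,759) + system prompt (5,176) from the breakdown
# above; discount rates are from the provider matrix. Assumes the whole
# fixed prefix is a cache hit, which is a simplification.

FIXED_OVERHEAD = 13_935  # tokens, per call

def effective_overhead(discount_pct: int) -> int:
    """Overhead tokens effectively billed, given a cache discount in %."""
    return FIXED_OVERHEAD * (100 - discount_pct) // 100

print(effective_overhead(0))   # no cache:       13935
print(effective_overhead(75))  # Kimi/Moonshot:   3483
print(effective_overhead(90))  # DeepSeek:        1393
```

Under these assumptions, a 90% cache discount shrinks the per-call overhead from ~13.9K to ~1.4K effective tokens, which is why the community quote above calls a cache-friendly provider "the single biggest lever for reducing costs."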