# Budget Providers Feedback (Kimi, DeepSeek, MiniMax)

**Source reference:** Community guides, official integration docs, API documentation

---

## Kimi / Moonshot AI (K2.5)

**Recommendation:** Primary budget-friendly option

### Why Kimi K2.5?

**Source:** https://hermes-agent.ai/blog/hermes-agent-api-keys

> "For most users: Kimi K2.5 from Moonshot or MiniMax as a daily driver — both are fast, capable, and inexpensive. Use Claude Sonnet or GPT-4 only for complex reasoning tasks where the extra capability is worth the significantly higher per-token cost."

### Caching Benefits

| Provider | Cache Discount |
|----------|----------------|
| Kimi K2.5 | 75% off on cache hits |
| DeepSeek | 90% off on cache hits |
| Claude/Anthropic | Full price (no special discount) |

### Cost Comparison

**Feature implementation scenario (~100 API calls):**

- Claude Sonnet 4.5: ~$34
- Kimi K2.5: ~$3-8 (depending on caching)
- DeepSeek (cache hits): Under $1

---

## DeepSeek

**Best for:** Maximum cost savings with caching

### Caching Advantage

**Source:** https://hermes-agent.ai/blog/hermes-agent-token-overhead

> "DeepSeek (90% off on cache) — Biggest cost lever"

### Use Cases

- Routine file organization
- Simple message responses
- Cron job executions
- Research lookups

---

## MiniMax

**Integration:** Official partnership/support

**Source:** https://platform.minimax.io/docs/token-plan/hermes-agent

> "Use MiniMax-M2.7 in Hermes Agent for autonomous AI-powered development."
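The cache discounts and per-scenario estimates above come down to simple arithmetic. A minimal sketch of that math in Python, using illustrative per-million-token rates and token counts (assumptions for demonstration, not official pricing):

```python
# Hypothetical cost model: N calls, each sending `input_tokens`, where a
# fraction `cache_rate` of input tokens hits the provider's prompt cache.
# All rates below are illustrative placeholders, not published prices.

def estimated_cost(calls, input_tokens, output_tokens,
                   in_rate, out_rate, cache_discount, cache_rate):
    """Return total USD cost for a batch of API calls.

    in_rate / out_rate: USD per million tokens.
    cache_discount: fraction off for cached input tokens (e.g. 0.75).
    cache_rate: fraction of input tokens served from cache.
    """
    cached = input_tokens * cache_rate
    fresh = input_tokens - cached
    per_call = (fresh * in_rate
                + cached * in_rate * (1 - cache_discount)
                + output_tokens * out_rate) / 1_000_000
    return calls * per_call

# ~100-call feature-implementation scenario, 100k-token agent context,
# 80% cache hit rate (all assumed figures)
kimi = estimated_cost(100, 100_000, 2_000, in_rate=0.60, out_rate=2.50,
                      cache_discount=0.75, cache_rate=0.8)
deepseek = estimated_cost(100, 100_000, 2_000, in_rate=0.27, out_rate=1.10,
                          cache_discount=0.90, cache_rate=0.8)
print(f"Kimi: ${kimi:.2f}, DeepSeek: ${deepseek:.2f}")
```

With these placeholder rates the sketch lands near the ballpark figures quoted above (a few dollars for Kimi, under a dollar for DeepSeek); the point is that the cache discount applies only to the cached share of input tokens, so the effective savings depend heavily on how much of each prompt repeats between calls.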
### Token Plan

- Different from pay-as-you-go API keys
- Subscribe to a Token Plan first
- Create a Token Plan API key from the Token Plan page

---

## Other Budget Options

### Z.AI / ZhipuAI (GLM Models)

- Good for Chinese-language tasks
- Competitive pricing
- OpenAI-compatible endpoint

### Alibaba Cloud DashScope

- Qwen model access
- Regional availability advantages

### OpenCode Zen / Go

- Curated model access
- Budget-friendly options

---

## Provider Selection Strategy

### Tier 1: Daily Driver (High Volume, Lower Cost)

- **Kimi K2.5** - 75% cache discount, good capabilities
- **DeepSeek** - 90% cache discount, cheapest option
- **MiniMax** - Fast, capable, inexpensive

### Tier 2: Complex Tasks (Selective Use)

- **Claude Sonnet** - Best reasoning, highest cost
- **GPT-4** - Good for specific use cases

### Tier 3: Auxiliary Tasks

- **Gemini Flash** - Vision tasks, cheap
- **Local models** - Free, but require hardware

---

## Configuration Example

```yaml
# config.yaml for cost optimization
model:
  default: "moonshot/kimi-k2.5"  # Daily driver
  auxiliary:
    vision:
      provider: "openrouter"
      model: "google/gemini-2.5-flash"  # Cheap vision
```

---

## Community Experience

**Positive feedback on budget providers:**

- "Fast, capable, and inexpensive"
- Significant cost savings vs. frontier models
- Good enough for roughly 80% of tasks

**Trade-offs:**

- May struggle with complex multi-step reasoning
- Tool calling slightly less reliable than Claude's
- Context understanding not as nuanced

---

## Cost Optimization Summary

| Strategy | Savings |
|----------|---------|
| Use Kimi/DeepSeek for routine tasks | 50-90% |
| Enable provider caching | 75-90% |
| Reserve Claude/GPT for complex tasks | Variable |
| Use cheaper vision models | 50-70% |
| Short sessions (`--fresh`) | Reduces context buildup |
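The tiered selection strategy above can be sketched as a small routing function: route by task type, and fall back to the Tier 1 daily driver for anything unclassified. The task-type names and most model identifiers here are illustrative assumptions (only `moonshot/kimi-k2.5` and `google/gemini-2.5-flash` appear in the config example earlier):

```python
# Hypothetical three-tier router. Task types and tier assignments are
# illustrative assumptions, not official guidance from any provider.

ROUTES = {
    "routine": "moonshot/kimi-k2.5",       # Tier 1: daily driver
    "caching": "deepseek/deepseek-chat",   # Tier 1: max cache savings
    "complex": "anthropic/claude-sonnet",  # Tier 2: hard reasoning only
    "vision":  "google/gemini-2.5-flash",  # Tier 3: cheap vision
}

def pick_model(task_type: str) -> str:
    """Return the model ID for a task type, defaulting to the daily driver."""
    return ROUTES.get(task_type, ROUTES["routine"])

print(pick_model("complex"))   # Tier 2 model
print(pick_model("cron-job"))  # unknown type -> Tier 1 fallback
```

Defaulting to the cheapest capable tier, rather than the strongest model, is what makes the strategy pay off: the expensive models are only reached when a task is explicitly flagged as complex.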