# Local Model Setup Issues & Solutions

**Source reference:** GitHub issues, Reddit, official FAQ, blog posts

---

## Issue #523: Local Model Setup Skill Request

**Problem:** Users struggle with local model configuration.

> "No model recommendations: Users must know which models support tool calling. There's no guidance on model selection. No setup instructions: No docs or skills for installing/configuring Ollama, llama.cpp, or vLLM."

**Requested Solution:** A skill that guides users through:

1. Setting up local models with Hermes Agent
2. Model recommendations for different use cases
3. Configuration nuances that trip up new users

---

## Issue #1071: llama-server Compatibility (CRITICAL)

**Error:** `'dict' object has no attribute 'strip'`

**Impact:** Complete failure with llama-server/Ollama backends

**Fix Location:** `run_agent.py` line ~4280

**User Workaround** (a self-contained reproduction of the bug and fix appears in the appendix below):

```python
# Add before: if not args or not args.strip():
# Assumes `json` is already imported and that this runs inside the
# loop over tool calls, where `tc` is the current tool call and
# `args` is tc.function.arguments.
if isinstance(args, (dict, list)):
    tc.function.arguments = json.dumps(args)
    continue
```

**Related Issues:**

- llama.cpp #14697
- ollama-python #484
- litellm #8313

---

## Context Length Configuration Issues

**Common Error:** "Context exceeded your setting"

**Source:** https://www.reddit.com/r/LocalLLM/comments/1sc82o8/hermesagent_what_is_this_message_about/

> "Context exceeded your setting. Either your Hermes context or your llm server context setting for that particular model. By default context is usually set to something comically low."

**Solution** (a rough pre-flight check is sketched in the appendix):

```yaml
model:
  default: your-model-name
  context_length: 32768  # Match your server's num_ctx
```

---

## Issue #879: Local Model Routing for Auxiliary Tasks

**Feature Request:** Direct auxiliary tasks (vision, etc.) to a local endpoint independently of the main provider.

**Use Case:** Use a local model for fast tasks and a cloud model for complex reasoning. (The appendix sketches what this routing could look like.)

**Dependencies:** Multi-model hybrid setup support

---

## Windows/WSL2 Limitations

**Status:** Native Windows not supported

> "Native Windows support is extremely experimental and unsupported. Please install WSL2 and run Hermes Agent from there."

**Installation:**

```bash
# Inside WSL2
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
```

---

## Best Practices from Community

### Ollama Setup

1. Give the model adequate context, e.g. run `/set parameter num_ctx 16384` inside `ollama run <model>`, or add `PARAMETER num_ctx 16384` to the Modelfile
2. Match the context length in the Hermes config exactly
3. Use `hermes model` to select "Custom endpoint"
4. Base URL: `http://localhost:11434/v1`
5. Leave the API key blank for local endpoints

(A smoke-test script for verifying this endpoint appears in the appendix.)

### Recommended Local Models by Use Case

| Use Case | Model | VRAM Needed |
|----------|-------|-------------|
| General agent work | Qwen 3.5 27B | 24GB |
| Fast responses | Qwen 3.5 14B | 16GB |
| Limited VRAM | Qwen 3.5 8B | 8GB |
| Experimental | Gemma 4 27B | 24GB |

### Common Pitfalls

1. **Mismatched context lengths** between Ollama and Hermes
2. **Assuming all models support tool calling** equally well
3. **Not setting max iterations** appropriate for local model speed
4. **Expecting frontier-level reliability** from smaller models

---

## Community Feedback Summary

**Positive:**

- "Hermes agent already works way way better than Open Claw and it actually works pretty well locally"
- Better local model support than alternatives

**Challenges:**

- Tool calling reliability varies by model
- Configuration complexity for beginners
- Token overhead still applies (13.9K tokens per call)
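
---

## Appendix: Example Sketches

The snippets below are illustrative sketches, not Hermes Agent source code. Endpoints, model names, and class names are placeholders unless stated otherwise.

To make the Issue #1071 failure mode concrete, here is a minimal, self-contained reproduction. The `ToolCall`/`Function` dataclasses are stand-ins for Hermes Agent's internal types (the issue does not show them); only the dict-to-JSON normalization mirrors the workaround above.

```python
"""Minimal reproduction of the Issue #1071 bug and fix.

Some OpenAI-compatible servers (llama-server, Ollama) return tool-call
arguments as an already-parsed dict instead of a JSON string. Downstream
code that expects a string (e.g. `args.strip()`) then raises AttributeError.
"""
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class Function:
    name: str
    arguments: Any  # str from most providers; dict/list from llama-server/Ollama


@dataclass
class ToolCall:
    function: Function


def normalize_tool_calls(tool_calls: list[ToolCall]) -> None:
    """Coerce dict/list arguments back into the JSON strings downstream code expects."""
    for tc in tool_calls:
        args = tc.function.arguments
        if isinstance(args, (dict, list)):
            tc.function.arguments = json.dumps(args)


calls = [ToolCall(Function("read_file", {"path": "notes.md"}))]
# Without normalization, code like `args.strip()` raises:
#   AttributeError: 'dict' object has no attribute 'strip'
normalize_tool_calls(calls)
assert calls[0].function.arguments == '{"path": "notes.md"}'
```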
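A rough way to understand the "Context exceeded your setting" error: the prompt plus a budget for the reply must fit inside whichever limit is smaller, Hermes's `context_length` or the server's `num_ctx`. The sketch below uses a crude 4-characters-per-token estimate (not a real tokenizer) and hypothetical message contents:

```python
def estimate_tokens(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)


def fits_in_context(messages: list[str], context_length: int, reply_budget: int = 1024) -> bool:
    """True if the estimated prompt leaves room for `reply_budget` output tokens."""
    used = sum(estimate_tokens(m) for m in messages)
    return used + reply_budget <= context_length


# context_length should match BOTH the Hermes YAML value and the server's
# num_ctx; the effective limit is the smaller of the two.
history = ["You are a helpful agent. " * 800, "Summarize every file in this repo. " * 50]
if not fits_in_context(history, context_length=4096):
    print("Would trip 'Context exceeded your setting' -- raise num_ctx/context_length or trim history.")
```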
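Issue #879 is a feature request, so nothing below exists in Hermes Agent yet; this only sketches the requested behavior using the `openai` Python package, with placeholder endpoints, task names, and models:

```python
"""Sketch of the routing requested in Issue #879 (not implemented)."""
from openai import OpenAI

# Placeholder endpoints: a local OpenAI-compatible server and a cloud provider.
local = OpenAI(base_url="http://localhost:11434/v1", api_key="local")
cloud = OpenAI(api_key="YOUR_PROVIDER_KEY")

# Hypothetical task taxonomy: auxiliary tasks are cheap and latency-sensitive.
AUXILIARY_TASKS = {"vision_caption", "title_generation", "summarize_tool_output"}


def client_and_model_for(task: str) -> tuple[OpenAI, str]:
    """Route auxiliary tasks to the local endpoint, everything else to the main provider."""
    if task in AUXILIARY_TASKS:
        return local, "your-local-model"
    return cloud, "your-cloud-model"


client, model = client_and_model_for("summarize_tool_output")
resp = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "Summarize: build passed, 3 warnings."}],
)
print(resp.choices[0].message.content)
```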
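Before pointing Hermes at a local endpoint, it helps to verify that the endpoint speaks the OpenAI chat API and that the chosen model actually emits tool calls (pitfall #2 above). A minimal smoke test, assuming Ollama on its default port and a placeholder model name:

```python
"""Smoke-test a local OpenAI-compatible endpoint for tool calling.

Requires the `openai` Python package. Substitute whatever model name
`ollama list` reports for "your-model-name".
"""
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # same URL used for the Hermes "Custom endpoint"
    api_key="local",  # the OpenAI client requires a non-empty key; local servers ignore it
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_time",
        "description": "Return the current time.",
        "parameters": {"type": "object", "properties": {}},
    },
}]

resp = client.chat.completions.create(
    model="your-model-name",  # placeholder
    messages=[{"role": "user", "content": "What time is it? Use the tool."}],
    tools=tools,
)

msg = resp.choices[0].message
if msg.tool_calls:
    print("Tool calling works:", msg.tool_calls[0].function.name)
else:
    print("No tool call emitted -- this model may not support tool calling reliably.")
```

If the model answers in prose instead of emitting a tool call, pick a different model before blaming the Hermes configuration.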