# Local Model Setup Issues & Solutions

**Source reference:** GitHub issues, Reddit, official FAQ, blog posts

---

## Issue #523: Local Model Setup Skill Request

**Problem:** Users struggle with local model configuration.

> "No model recommendations: Users must know which models support tool calling. There's no guidance on model selection. No setup instructions: No docs or skills for installing/configuring Ollama, llama.cpp, or vLLM."

**Requested Solution:** A skill that guides users through:

1. Setting up local models with Hermes Agent
2. Model recommendations for different use cases
3. Configuration nuances that trip up new users

---

## Issue #1071: llama-server Compatibility (CRITICAL)

**Error:** `'dict' object has no attribute 'strip'`

**Impact:** Complete failure with llama-server/Ollama backends

**Fix Location:** `run_agent.py` line ~4280

**User Workaround** (a self-contained reproduction of the bug and fix appears in the appendix below):

```python
# Add before: if not args or not args.strip():
# Assumes `json` is already imported and that this runs inside the
# loop over tool calls, where `tc` is the current tool call and
# `args` is tc.function.arguments.
if isinstance(args, (dict, list)):
    tc.function.arguments = json.dumps(args)
    continue
```

**Related Issues:**

- llama.cpp #14697
- ollama-python #484
- litellm #8313

---

## Context Length Configuration Issues

**Common Error:** "Context exceeded your setting"

**Source:** https://www.reddit.com/r/LocalLLM/comments/1sc82o8/hermesagent_what_is_this_message_about/

> "Context exceeded your setting. Either your Hermes context or your llm server context setting for that particular model. By default context is usually set to something comically low."

**Solution** (a rough pre-flight check is sketched in the appendix):

```yaml
model:
  default: your-model-name
  context_length: 32768  # Match your server's num_ctx
```

---

## Issue #879: Local Model Routing for Auxiliary Tasks

**Feature Request:** Direct auxiliary tasks (vision, etc.) to a local endpoint independently of the main provider.

**Use Case:** Use a local model for fast tasks and a cloud model for complex reasoning. (The appendix sketches what this routing could look like.)

**Dependencies:** Multi-model hybrid setup support

---

## Windows/WSL2 Limitations

**Status:** Native Windows not supported

> "Native Windows support is extremely experimental and unsupported. Please install WSL2 and run Hermes Agent from there."

**Installation:**

```bash
# Inside WSL2
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
```

---

## Best Practices from Community

### Ollama Setup

1. Give the model adequate context, e.g. run `/set parameter num_ctx 16384` inside `ollama run <model>`, or add `PARAMETER num_ctx 16384` to the Modelfile
2. Match the context length in the Hermes config exactly
3. Use `hermes model` to select "Custom endpoint"
4. Base URL: `http://localhost:11434/v1`
5. Leave the API key blank for local endpoints

(A smoke-test script for verifying this endpoint appears in the appendix.)

### Recommended Local Models by Use Case

| Use Case | Model | VRAM Needed |
|----------|-------|-------------|
| General agent work | Qwen 3.5 27B | 24GB |
| Fast responses | Qwen 3.5 14B | 16GB |
| Limited VRAM | Qwen 3.5 8B | 8GB |
| Experimental | Gemma 4 27B | 24GB |

### Common Pitfalls

1. **Mismatched context lengths** between Ollama and Hermes
2. **Assuming all models support tool calling** equally well
3. **Not setting max iterations** appropriate for local model speed
4. **Expecting frontier-level reliability** from smaller models

---

## Community Feedback Summary

**Positive:**

- "Hermes agent already works way way better than Open Claw and it actually works pretty well locally"
- Better local model support than alternatives

**Challenges:**

- Tool calling reliability varies by model
- Configuration complexity for beginners
- Token overhead still applies (13.9K tokens per call)
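
---

## Appendix: Example Sketches

The snippets below are illustrative sketches, not Hermes Agent source code. Endpoints, model names, and class names are placeholders unless stated otherwise.

To make the Issue #1071 failure mode concrete, here is a minimal, self-contained reproduction. The `ToolCall`/`Function` dataclasses are stand-ins for Hermes Agent's internal types (the issue does not show them); only the dict-to-JSON normalization mirrors the workaround above.

```python
"""Minimal reproduction of the Issue #1071 bug and fix.

Some OpenAI-compatible servers (llama-server, Ollama) return tool-call
arguments as an already-parsed dict instead of a JSON string. Downstream
code that expects a string (e.g. `args.strip()`) then raises AttributeError.
"""
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class Function:
    name: str
    arguments: Any  # str from most providers; dict/list from llama-server/Ollama


@dataclass
class ToolCall:
    function: Function


def normalize_tool_calls(tool_calls: list[ToolCall]) -> None:
    """Coerce dict/list arguments back into the JSON strings downstream code expects."""
    for tc in tool_calls:
        args = tc.function.arguments
        if isinstance(args, (dict, list)):
            tc.function.arguments = json.dumps(args)


calls = [ToolCall(Function("read_file", {"path": "notes.md"}))]
# Without normalization, code like `args.strip()` raises:
#   AttributeError: 'dict' object has no attribute 'strip'
normalize_tool_calls(calls)
assert calls[0].function.arguments == '{"path": "notes.md"}'
```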
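A rough way to understand the "Context exceeded your setting" error: the prompt plus a budget for the reply must fit inside whichever limit is smaller, Hermes's `context_length` or the server's `num_ctx`. The sketch below uses a crude 4-characters-per-token estimate (not a real tokenizer) and hypothetical message contents:

```python
def estimate_tokens(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)


def fits_in_context(messages: list[str], context_length: int, reply_budget: int = 1024) -> bool:
    """True if the estimated prompt leaves room for `reply_budget` output tokens."""
    used = sum(estimate_tokens(m) for m in messages)
    return used + reply_budget <= context_length


# context_length should match BOTH the Hermes YAML value and the server's
# num_ctx; the effective limit is the smaller of the two.
history = ["You are a helpful agent. " * 800, "Summarize every file in this repo. " * 50]
if not fits_in_context(history, context_length=4096):
    print("Would trip 'Context exceeded your setting' -- raise num_ctx/context_length or trim history.")
```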
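Issue #879 is a feature request, so nothing below exists in Hermes Agent yet; this only sketches the requested behavior using the `openai` Python package, with placeholder endpoints, task names, and models:

```python
"""Sketch of the routing requested in Issue #879 (not implemented)."""
from openai import OpenAI

# Placeholder endpoints: a local OpenAI-compatible server and a cloud provider.
local = OpenAI(base_url="http://localhost:11434/v1", api_key="local")
cloud = OpenAI(api_key="YOUR_PROVIDER_KEY")

# Hypothetical task taxonomy: auxiliary tasks are cheap and latency-sensitive.
AUXILIARY_TASKS = {"vision_caption", "title_generation", "summarize_tool_output"}


def client_and_model_for(task: str) -> tuple[OpenAI, str]:
    """Route auxiliary tasks to the local endpoint, everything else to the main provider."""
    if task in AUXILIARY_TASKS:
        return local, "your-local-model"
    return cloud, "your-cloud-model"


client, model = client_and_model_for("summarize_tool_output")
resp = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "Summarize: build passed, 3 warnings."}],
)
print(resp.choices[0].message.content)
```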
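Before pointing Hermes at a local endpoint, it helps to verify that the endpoint speaks the OpenAI chat API and that the chosen model actually emits tool calls (pitfall #2 above). A minimal smoke test, assuming Ollama on its default port and a placeholder model name:

```python
"""Smoke-test a local OpenAI-compatible endpoint for tool calling.

Requires the `openai` Python package. Substitute whatever model name
`ollama list` reports for "your-model-name".
"""
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # same URL used for the Hermes "Custom endpoint"
    api_key="local",  # the OpenAI client requires a non-empty key; local servers ignore it
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_time",
        "description": "Return the current time.",
        "parameters": {"type": "object", "properties": {}},
    },
}]

resp = client.chat.completions.create(
    model="your-model-name",  # placeholder
    messages=[{"role": "user", "content": "What time is it? Use the tool."}],
    tools=tools,
)

msg = resp.choices[0].message
if msg.tool_calls:
    print("Tool calling works:", msg.tool_calls[0].function.name)
else:
    print("No tool call emitted -- this model may not support tool calling reliably.")
```

If the model answers in prose instead of emitting a tool call, pick a different model before blaming the Hermes configuration.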