
Bug Reports and Issues Collection

Collection Date: 2026-04-09
Source: GitHub Issues (NousResearch/hermes-agent)


Critical Issues

Issue #4146: Sandbox Code Execution Security Bypass (CRITICAL)

Status: Open
Severity: Critical

"Critical. Any LLM prompt injection or confused deputy scenario where the agent generates sandbox code could result in arbitrary command execution as the user."

Problem: execute_code sandbox bypasses dangerous command approval via terminal tool

Impact: Security vulnerability - sandboxed code can execute arbitrary commands

Recommended Fix: Remove terminal from SANDBOX_ALLOWED_TOOLS
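The recommended fix can be sketched as a simple allowlist filter. This is a minimal, hypothetical sketch: the tool names and the helper function are assumptions, and only the constant name SANDBOX_ALLOWED_TOOLS comes from the issue.

```python
# Hypothetical sketch: removing "terminal" from the sandbox allowlist
# means sandbox-generated code can no longer route shell commands around
# the dangerous-command approval flow. Tool names are assumed.
SANDBOX_ALLOWED_TOOLS = {"read_file", "write_file", "terminal"}

def sandbox_tools(allowed: set[str]) -> set[str]:
    """Return the sandbox allowlist with shell-capable tools removed."""
    return allowed - {"terminal"}
```

With this filter in place, `terminal` never reaches the sandboxed executor regardless of what the model asks for.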


Issue #1071: llama-server Compatibility (CRITICAL)

Status: Reported with fix
Error: 'dict' object has no attribute 'strip'

Environment: Windows 11 + Ubuntu/WSL2, llama-server with Qwen3.5-27B

Root Cause: llama-server returns function.arguments as dict instead of JSON string

Fix:

import json

# llama-server hands back already-parsed arguments; re-serialize so
# downstream code that calls .strip() on a JSON string keeps working.
if isinstance(args, (dict, list)):
    tc.function.arguments = json.dumps(args)

Gateway Issues

Issue #4469: Multiple Rapid Messages Only Last One Processed

Status: Open
Component: Gateway message queuing

Problem: When user sends multiple messages while agent is running, only the last message is processed

Root Cause: Two separate pending message storage locations:

  • GatewayRunner._pending_messages (written but never read)
  • adapter._pending_messages (read but never written during interrupts)

Impact: Orphaned message queue - user messages lost
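The fix implied by the root cause is to route every producer and consumer through one queue. The sketch below is hypothetical (class and method names are assumptions, not the hermes API); it only illustrates the single-queue discipline that prevents orphaned messages.

```python
from collections import deque

class PendingMessages:
    """Minimal single-queue sketch (hypothetical class name).

    The reported root cause is two disconnected queues: one written but
    never read, one read but never written during interrupts. Sharing
    one queue between the gateway runner and the adapter avoids orphans.
    """
    def __init__(self) -> None:
        self._pending: deque[str] = deque()

    def push(self, msg: str) -> None:
        # Called whenever a user message arrives mid-run.
        self._pending.append(msg)

    def drain(self) -> list[str]:
        # Called by the agent loop between turns: returns all queued
        # messages in arrival order instead of only the last one.
        msgs = list(self._pending)
        self._pending.clear()
        return msgs
```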

Issue #6212: Telegram Context Compaction Handoff Bug

Status: Open
Component: Telegram gateway

Problem: A fresh /start or "Hello?" message dumps the raw [CONTEXT COMPACTION] handoff instead of a normal greeting

Sessions Affected:

  • 20260408_111232_42b907
  • 20260408_113658_19c1fc

Expected: Short greeting or "Resuming prior task" message
Actual: Raw compaction summary dumped to user

Issue #5446: Discord Thread User Addition

Status: Open
Problem: User not added to private Discord thread when using /thread command


Authentication Issues

Issue #5807: Hermes Doctor Reports False "Not Logged In"

Status: Open
Component: Authentication status checking

Problem: hermes doctor reports "Nous Portal auth (not logged in)" even with valid credentials

Root Cause: get_nous_auth_status() only checks legacy providers section, not credential_pool

Workaround: Use hermes auth list for accurate status
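A fix would have the status check consult both credential locations. The sketch below is an assumption about shape: the config keys ("providers", "credential_pool", "api_key") and the function body are hypothetical; only the two-location root cause comes from the issue.

```python
# Hypothetical sketch of the fix: config structure and key names are
# assumptions, not the actual hermes schema.
def nous_auth_ok(config: dict) -> bool:
    """Logged in if credentials exist in EITHER the legacy providers
    section OR the newer credential_pool."""
    legacy = config.get("providers", {}).get("nous", {})
    pool = config.get("credential_pool", [])
    return bool(legacy.get("api_key")) or any(
        entry.get("provider") == "nous" for entry in pool
    )
```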


Migration Issues

Issue #5191: OpenClaw Migration Silent Failures

Status: Open
Component: Migration tool

Bug 1: Orphaned openclaw.json - migration renames directory but doesn't copy config

Bug 2: Missing Slack token migration - tokens not extracted to ~/.hermes/.env

Impact: Gateway starts in broken state with cryptic errors

Workaround:

# Restore the config the migration renamed away but never copied back:
cp ~/.openclaw.pre-migration-*/openclaw.json ~/.openclaw/openclaw.json
# Add to ~/.hermes/.env:
SLACK_BOT_TOKEN=xoxb-...
SLACK_APP_TOKEN=xapp-...

Configuration Issues

Issue #5528: Configurable Dangerous Command Patterns

Status: Feature Request
Type: Configuration enhancement

Problem: Dangerous-command approval patterns are hard-coded in tools/approval.py

Use Case: Users cannot mark installation-specific commands (e.g., systemctl restart hermes-gateway) as approval-required

Proposed Solution:

approvals:
  extra_dangerous_patterns:
    - pattern: "\\bsystemctl\\b.*\\brestart\\b.*hermes-gateway"
      description: "restart gateway service"
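On the matching side, the proposal amounts to merging user-configured patterns with the built-in list before the approval check. A minimal sketch, assuming a regex-based check: BUILTIN_PATTERNS and the function name are stand-ins, not the actual tools/approval.py code.

```python
import re

# Sketch of the proposed behavior; only the extra_dangerous_patterns
# idea comes from the issue. BUILTIN_PATTERNS stands in for the
# hard-coded list in tools/approval.py.
BUILTIN_PATTERNS = [r"\brm\b.*-rf"]

def needs_approval(command: str, extra_patterns: list[str]) -> bool:
    """True if the command matches a built-in or user-configured pattern."""
    return any(
        re.search(p, command) for p in BUILTIN_PATTERNS + extra_patterns
    )
```

With the example config loaded, `systemctl restart hermes-gateway` would trigger approval while `systemctl status hermes-gateway` would not.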

Performance Issues

Issue #4379: Token Overhead Analysis

Status: Documented/Under Discussion
Finding: 73% of every API call is fixed overhead (~13.9K tokens)

Breakdown:

  • Tool definitions: 8,759 tokens
  • System prompt: 5,176 tokens
  • Skills catalog: ~2,200 tokens (eagerly loaded)

Recommended Optimizations:

  1. Platform-aware tool filtering (messaging platforms don't need browser tools)
  2. Lazy skills loading (remove from system prompt)
  3. Compression tuning documentation
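The first optimization above can be sketched as a filter over the tool-definition map sent with each request. The tool names and platform labels below are assumptions for illustration; only the idea (messaging platforms don't need browser tools) comes from the issue.

```python
# Hypothetical sketch of optimization 1: tool names and platform labels
# are assumed, not taken from the hermes codebase.
BROWSER_TOOLS = {"browser_navigate", "browser_click", "browser_screenshot"}
MESSAGING_PLATFORMS = {"telegram", "discord", "slack"}

def tools_for_platform(all_tools: dict, platform: str) -> dict:
    """Drop browser-tool definitions on platforms that cannot use them,
    shaving their share of the ~8.8K tokens of tool definitions."""
    if platform in MESSAGING_PLATFORMS:
        return {k: v for k, v in all_tools.items() if k not in BROWSER_TOOLS}
    return dict(all_tools)
```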

Memory Issues

Issue #509: Cognitive Memory Operations

Status: Feature Request
Proposal: Add LLM-driven encoding, consolidation, adaptive recall & extraction

Goal: Self-maintaining knowledge base that compounds over time

Issue #3943: MemoryProvider Interface

Status: Feature Request
Proposal: Interface for long-term memory integrations


Summary Table

Issue  Severity  Status       Component
#4146  Critical  Open         Security
#1071  Critical  Fix Ready    Local Models
#4469  High      Open         Gateway
#6212  Medium    Open         Telegram
#5807  Medium    Open         Auth
#5191  Medium    Open         Migration
#4379  Medium    Documented   Performance
#5528  Low       Feature Req  Config
#509   Low       Feature Req  Memory
#3943  Low       Feature Req  Memory