2c46d48004
Streams the assistant's thinking and tool calls back to opencode immediately:

- Sends content chunks as they're generated
- Parses and sends tool_calls deltas incrementally
- Doesn't execute tools server-side
- Allows opencode to show progress during generation

Note: the real implementation requires fixing syntax errors in routes.py.
17 lines
638 B
Diff
# Patch to add real-time streaming for tools
#
# This patch adds real-time streaming of assistant content ("thinking") and
# tool calls when tools are used. Previously, all content was buffered until
# complete, causing opencode to wait with no feedback.
# Key changes:
# 1. Stream model output incrementally as it's generated
# 2. Parse for tool_calls and content in each chunk
# 3. Send content chunks immediately (the "thinking")
# 4. Send tool_calls deltas immediately when found
# 5. Don't execute tools server-side in streaming mode
# 6. Send DONE marker at end
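As an illustration of steps 2-4, one streamed delta might look like the following. This is a sketch assuming OpenAI-style chunk fields; the names `content_chunk` and `tool_chunk` and the exact JSON shape are assumptions, not taken from routes.py:

```python
import json

# A content delta (the "thinking" text) -- illustrative shape only
content_chunk = {"choices": [{"delta": {"content": "Checking the file..."}}]}

# A tool_calls delta; `arguments` may arrive split across several chunks
tool_chunk = {
    "choices": [{
        "delta": {
            "tool_calls": [{
                "index": 0,
                "function": {"name": "read_file",
                             "arguments": "{\"path\": \"a.py\"}"},
            }]
        }
    }]
}

# Each chunk is forwarded as one SSE data line as soon as it is parsed
line = f"data: {json.dumps(tool_chunk)}\n\n"
```

Because each chunk is sent as soon as it is parsed, opencode can render the partial tool call while the model is still generating arguments.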
# Apply this patch from the repository root with:
# patch -p1 < this_file
# (the diff targets src/api/routes.py)
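The overall loop described in this header could be sketched as below. This is a minimal, self-contained sketch: `model_stream`, `sse_event`, and `stream_response` are hypothetical names, and the real handler in src/api/routes.py will differ.

```python
import json

def sse_event(payload: dict) -> str:
    """Format one Server-Sent Events data line."""
    return f"data: {json.dumps(payload)}\n\n"

def stream_response(model_stream):
    """Forward content and tool_call deltas immediately; never run tools here."""
    for chunk in model_stream:
        delta = {}
        if "content" in chunk:        # step 3: the "thinking" text
            delta["content"] = chunk["content"]
        if "tool_calls" in chunk:     # step 4: incremental tool-call deltas
            delta["tool_calls"] = chunk["tool_calls"]
        if delta:
            yield sse_event({"choices": [{"delta": delta}]})
    yield "data: [DONE]\n\n"          # step 6: end-of-stream marker

# A fake model stream standing in for the real incremental generator
fake_stream = [
    {"content": "Let me check the file."},
    {"tool_calls": [{"index": 0,
                     "function": {"name": "read_file",
                                  "arguments": "{\"path\":"}}]},
    {"tool_calls": [{"index": 0,
                     "function": {"arguments": "\"a.py\"}"}}]},
]
events = list(stream_response(fake_stream))
```

Note that the generator only relays deltas; executing the tool (step 5) is left to opencode, which reassembles the split `arguments` fragments once the stream ends.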