2c46d48004
Streams the assistant's thinking and tool calls back to opencode immediately:

- Sends content chunks as they're generated
- Parses and sends tool_calls deltas incrementally
- Doesn't execute tools server-side
- Allows opencode to show progress during generation

Note: the real implementation requires fixing syntax errors in routes.py.
17 lines
638 B
Diff
# Patch to add real-time streaming for tools
#
# This patch adds real-time streaming of assistant content ("thinking") and
# tool calls when tools are used. Previously, all content was buffered until
# complete, causing opencode to wait with no feedback.
# Key changes:
# 1. Stream model output incrementally as it's generated
# 2. Parse for tool_calls and content in each chunk
# 3. Send content chunks immediately (the "thinking")
# 4. Send tool_calls deltas immediately when found
# 5. Don't execute tools server-side in streaming mode
# 6. Send DONE marker at end
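As an illustration of steps 2-4, one streamed delta might look like the following. This is a sketch assuming OpenAI-style chunk fields; the names `content_chunk` and `tool_chunk` and the exact JSON shape are assumptions, not taken from routes.py:

```python
import json

# A content delta (the "thinking" text) -- illustrative shape only
content_chunk = {"choices": [{"delta": {"content": "Checking the file..."}}]}

# A tool_calls delta; `arguments` may arrive split across several chunks
tool_chunk = {
    "choices": [{
        "delta": {
            "tool_calls": [{
                "index": 0,
                "function": {"name": "read_file",
                             "arguments": "{\"path\": \"a.py\"}"},
            }]
        }
    }]
}

# Each chunk is forwarded as one SSE data line as soon as it is parsed
line = f"data: {json.dumps(tool_chunk)}\n\n"
```

Because each chunk is sent as soon as it is parsed, opencode can render the partial tool call while the model is still generating arguments.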
# Apply this patch from the repository root with:
# patch -p1 < this_file
# (the diff targets src/api/routes.py)
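The overall loop described in this header could be sketched as below. This is a minimal, self-contained sketch: `model_stream`, `sse_event`, and `stream_response` are hypothetical names, and the real handler in src/api/routes.py will differ.

```python
import json

def sse_event(payload: dict) -> str:
    """Format one Server-Sent Events data line."""
    return f"data: {json.dumps(payload)}\n\n"

def stream_response(model_stream):
    """Forward content and tool_call deltas immediately; never run tools here."""
    for chunk in model_stream:
        delta = {}
        if "content" in chunk:        # step 3: the "thinking" text
            delta["content"] = chunk["content"]
        if "tool_calls" in chunk:     # step 4: incremental tool-call deltas
            delta["tool_calls"] = chunk["tool_calls"]
        if delta:
            yield sse_event({"choices": [{"delta": delta}]})
    yield "data: [DONE]\n\n"          # step 6: end-of-stream marker

# A fake model stream standing in for the real incremental generator
fake_stream = [
    {"content": "Let me check the file."},
    {"tool_calls": [{"index": 0,
                     "function": {"name": "read_file",
                                  "arguments": "{\"path\":"}}]},
    {"tool_calls": [{"index": 0,
                     "function": {"arguments": "\"a.py\"}"}}]},
]
events = list(stream_response(fake_stream))
```

Note that the generator only relays deltas; executing the tool (step 5) is left to opencode, which reassembles the split `arguments` fragments once the stream ends.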