[frontend/multimodal] Add video upload support to chat for direct video input to multimodal models (Gemma 4 12B) #826
Labels
No labels
area:chat
area:core
area:llm
area:routes
area:tools
bug
documentation
duplicate
enhancement
good first issue
help wanted
invalid
question
refactor
wontfix
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
sleepy/odysseus#826
Loading…
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Context
Gemma 4 12B supports native video input — video frames (at 1 FPS, up to 60 seconds) are passed directly to the model alongside any text prompt. The OpenAI-compatible API accepts video as a sequence of image frames or as a
video_urlcontent type:The frontend already has file attachment infrastructure (
attachmentsfield in chat submission). We need to extend it to support video files as inline multimodal content rather than just document attachments.Requirements
1. Video upload in chat bar
2. Video preview
3. Direct video passthrough to backend
video_data(base64) andvideo_format("mp4", "webm") in the FormData4. UI placement
video/mp4,video/webm5. Fallback for non-multimodal models
Files to modify
static/js/chat.js— video file handling in message submissionstatic/index.html— video upload button in chat bar, video preview containerstatic/css/*.css— video thumbnail styling, upload button stylingroutes/upload_routes.py— if video needs server-side processing before sending to modelAcceptance criteria