Web Gemini Multimodal Filter¶
Filter v0.3.2
A powerful filter that provides multimodal capabilities (PDF, Office, Images, Audio, Video) to any model in OpenWebUI.
Overview¶
This plugin enables multimodal processing for any model by leveraging Gemini as an analyzer. It supports direct file processing for Gemini models and "Analyzer Mode" for other models (like DeepSeek, Llama), where Gemini analyzes the file and injects the result as context.
Features¶
- Multimodal Support: Process PDF, Word, Excel, PowerPoint, EPUB, MP3, MP4, and Images.
- Smart Routing:
- Direct Mode: Files are passed directly to Gemini models.
- Analyzer Mode: Files are analyzed by Gemini, and results are injected into the context for other models.
- Persistent Context: Maintains session history across multiple turns using OpenWebUI Chat ID.
- Deduplication: Automatically tracks analyzed file hashes to prevent redundant processing.
- Subtitle Enhancement: Specialized mode for generating high-quality SRT subtitles from video/audio.
Installation¶
- Download the plugin file:
web_gemini_multimodel.py - Upload to OpenWebUI: Admin Panel → Settings → Functions
- Configure the Gemini Adapter URL and other settings.
- Enable the filter globally or per chat.
Configuration¶
| Option | Type | Default | Description |
|---|---|---|---|
gemini_adapter_url | string | http://... | URL of the Gemini Adapter service |
target_model_keyword | string | "webgemini" | Keyword to identify Gemini models |
mode | string | "auto" | auto, direct, or analyzer |
analyzer_base_model_id | string | "gemini-3.0-pro" | Model used for document analysis |
subtitle_keywords | string | "字幕,srt" | Keywords to trigger subtitle flow |
Usage¶
- Upload a file (PDF, Image, Video, etc.) in the chat.
- Ask a question about the file.
- The plugin will automatically process the file and provide context to your selected model.