LiveKit Agents · updated May 14, 2025 · other · support
Multimodal Voice and Vision Assistant for iOS
A voice AI assistant with realtime audio and video input capabilities. Built for iOS, it supports front and back camera switching, natural voice conversations, live screen sharing, and background operation. The assistant can observe and interact seamlessly while users work on other tasks, making it suitable for hands-free assistance scenarios.
telephony: Web Only
speech-to-text: Google Speech-to-Text
llm: Gemini 2.5 Pro
text-to-speech: Google Cloud TTS
No prompt published.
multimodal · vision · ios · mobile · screen-sharing · background-mode · gemini-live