Generative AI Multi-Modal Output Demo

This generative AI-powered tutor expands the way users interact with media: the AI not only retrieves information but also manipulates and highlights content in real time. Whether guiding a learner through a technical repair, explaining complex visual concepts, or navigating 3D models, the AI functions as an interactive assistant with direct control over video, images, and 3D objects.

Overview

  • AI-Driven Interactive Learning
    The AI can highlight, zoom, annotate, and manipulate media to make explanations more engaging and hands-on.

  • Multi-Modal Media Integration
    Supports video, images, and 3D models, allowing users to ask questions and see dynamic, real-time responses.

  • Embedded Intelligence
    Media files are pre-annotated with structured data that enables the AI to understand and interact with content effectively.

  • Built-In Knowledge Retrieval
    A custom CMS and vector search system surface the media most relevant to the context of the user’s conversation.
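
As a rough illustration of the knowledge-retrieval idea above, the sketch below ranks media items by cosine similarity between a query embedding and stored embeddings. It is a minimal stand-in, not the project's implementation: the names `MEDIA_INDEX` and `rank_media` and the three-dimensional toy vectors are hypothetical, and a real deployment would delegate this search to the vector database rather than compute it in application code.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def rank_media(query_vec, media_index, top_k=2):
    """Return the ids of the top_k media items most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), media_id)
        for media_id, vec in media_index.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [media_id for _, media_id in scored[:top_k]]

# Hypothetical toy embeddings; real embeddings have hundreds of dimensions.
MEDIA_INDEX = {
    "pump_repair_video": [0.9, 0.1, 0.0],
    "valve_diagram_image": [0.2, 0.8, 0.1],
    "engine_3d_model": [0.1, 0.2, 0.9],
}

if __name__ == "__main__":
    print(rank_media([1.0, 0.0, 0.0], MEDIA_INDEX))
```

In this sketch the query vector would come from embedding the current conversation, so the ranking shifts as the conversation moves on.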

Highlights

  1. Video Interaction & Highlighting

    • Users can ask the AI questions about a video, and it can find and play key moments based on transcription metadata (SRT files).
    • The AI can highlight important on-screen elements in real time.
  2. Image Annotation & Analysis

    • Images are stored with SVG overlays for AI-driven zooming, highlighting, and labeling.
    • Users can ask the AI to explain specific parts of an image, and it will dynamically respond.
  3. 3D Model Manipulation

    • The AI interacts with glTF 3D models, reading structured metadata to understand how the object functions in the real world.
    • Users can rotate, highlight, and manipulate parts of the model as they would with real equipment.
  4. Smart Content Retrieval

    • A Strapi-based CMS lets users upload, label, and annotate media while automatically generating vector embeddings.
    • The Weaviate vector database ensures relevant media is surfaced based on the AI’s understanding of the current conversation.
  5. Real-World Use Cases

    • A student can explore course materials dynamically, asking deeper questions and getting tailored, interactive responses.
    • A technician troubleshooting new equipment can see highlighted steps, manipulate digital twins, and retrieve just-in-time knowledge.
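
The video highlight above describes finding key moments from SRT transcription metadata. One way this could work is sketched below: parse the SRT cues, then return the start time of each cue whose text mentions a keyword, so a player can seek to it. The function names and the sample transcript are hypothetical, and plain keyword matching is a simplification; the actual system may well use semantic search instead.

```python
import re

# SRT timestamps look like 00:01:05,500 (hours:minutes:seconds,milliseconds).
SRT_TIME = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")

def to_seconds(stamp):
    """Convert an SRT timestamp to seconds as a float."""
    h, m, s, ms = map(int, SRT_TIME.fullmatch(stamp.strip()).groups())
    return h * 3600 + m * 60 + s + ms / 1000

def parse_srt(text):
    """Parse SRT text into a list of cues with start, end, and text."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3 or "-->" not in lines[1]:
            continue  # skip malformed blocks
        start, end = lines[1].split("-->")
        cues.append({
            "start": to_seconds(start),
            "end": to_seconds(end),
            "text": " ".join(lines[2:]),
        })
    return cues

def find_moments(cues, keyword):
    """Return start times (in seconds) of cues that mention the keyword."""
    needle = keyword.lower()
    return [c["start"] for c in cues if needle in c["text"].lower()]

# Hypothetical sample transcript.
SAMPLE_SRT = """1
00:00:01,000 --> 00:00:04,000
First, switch off the power supply.

2
00:01:05,500 --> 00:01:09,000
Next, remove the filter cover.
"""

if __name__ == "__main__":
    print(find_moments(parse_srt(SAMPLE_SRT), "filter"))
```

The returned start times could be passed straight to a video player's seek control, which is what would let the tutor jump to the relevant moment.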

Conclusion

This project reimagines AI-assisted learning by making media not just searchable, but fully interactive. By integrating annotated video, images, and 3D models with an AI-powered real-time tutor, this system provides engaging, on-demand support for learners and professionals alike.