Skip to content

Lstsk/blucks-clip

 
 

Repository files navigation

Inspiration

We noticed how time-consuming it can be to sift through lengthy videos just to find specific information. This sparked an idea: what if there was a way to make video content more accessible and interactive?

What we learned

Diving into this project taught us a lot, especially about the powerful tools available through Google Cloud:

  • AI Interaction with Gemini Flash Model: We integrated Google's Gemini Flash model, which provided fast responses and enhanced performance.
  • Efficient Storage with Google Cloud Storage Buckets: Utilizing Google Cloud Storage buckets, we were able to store and serve user-uploaded videos securely and efficiently.
  • Data Management with Google Firestore: Implementing Google Firestore allowed for real-time data synchronization and efficient querying, which was essential for managing user data and application state.
  • Video Processing with FFmpeg: We utilized FFmpeg, a powerful open-source multimedia framework, to handle various video processing tasks.

How We Built It

We selected a tech stack prioritizing performance and scalability:

  • Backend: Flask for lightweight and flexible API development.
  • Database: Google Firestore for efficient, real-time data management.
  • Storage: Google Cloud Storage buckets to securely store and serve user-uploaded videos.
  • AI Models: Integrated Google's Gemini Flash model for responsive AI interactions utilizing it's high token parameter.
  • Frontend: React with TypeScript to create a responsive and dynamic user-friendly chatbot interface.I

Core Features

  1. Video Upload & Processing: Users can upload videos to the chat-bot interface
  2. AI Chatbot: Users can ask questions, and the AI finds relevant video segments based on their queries and selected videos.
  3. Clip Retrieval: AI identifies timestamps of key moments and return the clips to the user.
  4. JWT Token Authentication: Implemented secured JWT authentication and token refresh mechanism.

Challenges Faced

  • None of us had significant experience with AI before this, and diving into technologies Gemini Flash truly opened our eyes. This project showed us the immense potential of AI in enhancing user experiences, and we’re excited to explore it further!

Future Improvements

  • Multimodal Search: enable image/voice queries (e.g., "Find scenes with this diagram" via screenshot upload)
  • Optimize backend infrastructure to handle real-time video processing and search across more efficiently.
  • Introduce features like AI-generated video chapters, smart annotations, or automatic clip generation for sharing.

Conclusion

This project was an exciting journey into AI-powered video interaction. It reinforced our understanding of real-world AI applications and scalability challenges. There’s still a lot of room for improvement, but the foundation is solid. We look forward to refining it and making video interaction smarter and more intuitive!

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • TypeScript 66.9%
  • Python 31.7%
  • Other 1.4%