GitHub - Lstsk/blucks-clip

Inspiration

We noticed how time-consuming it can be to sift through lengthy videos just to find specific information. This sparked an idea: what if there was a way to make video content more accessible and interactive?

What we learned

Diving into this project taught us a lot, especially about the powerful tools available through Google Cloud:

AI Interaction with Gemini Flash Model: We integrated Google's Gemini Flash model, which provided fast responses and enhanced performance.
Efficient Storage with Google Cloud Storage Buckets: Google Cloud Storage buckets let us store and serve user-uploaded videos securely and efficiently.
Data Management with Google Firestore: Implementing Google Firestore allowed for real-time data synchronization and efficient querying, which was essential for managing user data and application state.
Video Processing with FFmpeg: We utilized FFmpeg, a powerful open-source multimedia framework, to handle various video processing tasks.
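
As an illustration of the kind of FFmpeg task involved in returning a clip, here is a minimal Python sketch that builds an `ffmpeg` command to cut a segment between two timestamps. The function names, file paths, and the choice of stream copy are our assumptions for this example, not code taken from the repository.

```python
import subprocess
from typing import List


def build_clip_command(source: str, start: float, end: float, output: str) -> List[str]:
    """Build an ffmpeg argument list that cuts [start, end) seconds from source."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    duration = end - start
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it already exists
        "-ss", f"{start:.3f}",    # seek to the clip start before opening the input
        "-i", source,
        "-t", f"{duration:.3f}",  # keep this many seconds after the seek point
        "-c", "copy",             # copy streams without re-encoding
        output,
    ]


def cut_clip(source: str, start: float, end: float, output: str) -> None:
    """Run ffmpeg; requires the ffmpeg binary to be available on PATH."""
    subprocess.run(build_clip_command(source, start, end, output), check=True)
```

Stream copy is fast because nothing is re-encoded, but cuts land on keyframes, so a clip may start slightly off the requested timestamp. Re-encoding instead of `-c copy` trades speed for frame accuracy.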

How We Built It

We selected a tech stack prioritizing performance and scalability:

Backend: Flask for lightweight and flexible API development.
Database: Google Firestore for efficient, real-time data management.
Storage: Google Cloud Storage buckets to securely store and serve user-uploaded videos.
AI Models: Integrated Google's Gemini Flash model for responsive AI interactions, taking advantage of its high token limit.
Frontend: React with TypeScript to create a responsive, user-friendly chatbot interface.

Core Features

Video Upload & Processing: Users can upload videos through the chatbot interface.
AI Chatbot: Users can ask questions, and the AI finds relevant video segments based on their queries and selected videos.
Clip Retrieval: The AI identifies timestamps of key moments and returns the clips to the user.
JWT Token Authentication: Implemented secure JWT authentication with a token refresh mechanism.
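
The JWT authentication and refresh flow listed above can be sketched with the Python standard library alone. This is a minimal illustration, assuming HS256 signing, a custom `kind` claim to tell access tokens from refresh tokens, and a 15-minute access lifetime. The `SECRET` value and all function names are hypothetical, and the actual backend may well rely on a JWT library instead.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"change-me"  # hypothetical signing key; load it from configuration in practice


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str) -> bytes:
    return hmac.new(SECRET, signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(user_id: str, kind: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Create a signed token; kind is "access" or "refresh" (an assumed custom claim)."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": user_id, "kind": kind, "iat": issued, "exp": issued + ttl_seconds}
    parts = [_b64url(json.dumps(part, separators=(",", ":")).encode()) for part in (header, payload)]
    signing_input = ".".join(parts)
    return signing_input + "." + _b64url(_sign(signing_input))


def verify_token(token: str, expected_kind: str, now: Optional[float] = None) -> dict:
    """Return the payload if the signature, expiry, and kind all check out."""
    pieces = token.split(".")
    if len(pieces) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = pieces
    expected = _sign(header_b64 + "." + payload_b64)
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if payload["exp"] <= current:
        raise ValueError("token expired")
    if payload["kind"] != expected_kind:
        raise ValueError("wrong token kind")
    return payload


def refresh_access_token(refresh_token: str, now: Optional[float] = None) -> str:
    """Exchange a valid refresh token for a new short-lived access token."""
    payload = verify_token(refresh_token, "refresh", now)
    return issue_token(payload["sub"], "access", ttl_seconds=900, now=now)
```

Keeping the access token short-lived limits the damage if it leaks, while the longer-lived refresh token lets the client obtain a new one without asking the user to log in again.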

Challenges Faced

None of us had significant experience with AI before this, and diving into technologies like Gemini Flash truly opened our eyes. This project showed us the immense potential of AI in enhancing user experiences, and we’re excited to explore it further!

Future Improvements

Multimodal Search: Enable image and voice queries (e.g., "Find scenes with this diagram" via screenshot upload).
Optimize backend infrastructure to handle real-time video processing and search more efficiently.
Introduce features like AI-generated video chapters, smart annotations, or automatic clip generation for sharing.

Conclusion

This project was an exciting journey into AI-powered video interaction. It reinforced our understanding of real-world AI applications and scalability challenges. There’s still a lot of room for improvement, but the foundation is solid. We look forward to refining it and making video interaction smarter and more intuitive!

Folders and files (25 commits):
backend
frontend
nginx
.DS_Store
.gitignore
LICENSE
docker-compose.dev.yml
docker-compose.prod.yml
readme.md
