IMDb Movie Analysis Project

This project analyzes movie data scraped from the IMDb website and predicts movie ratings with a series of linear regression models.

Data Sources

  • IMDb: We will scrape movie data from the IMDb website, which is one of the most comprehensive sources for movie information.

Tools and Techniques

  • Web scraping with Beautiful Soup: We will use this Python library to parse IMDb pages and extract movie data from them.
  • Selenium: We will use this browser-automation library to drive page navigation and automate the scraping process.
  • EDA: We will perform exploratory data analysis to gain insights into the data and identify patterns.
  • Linear regression: We will fit linear regression models that predict movie ratings, which we can then use to recommend movies based on our viewing history.
  • Feature engineering: We will create new features from the existing data to improve the accuracy of our predictive model.
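
The feature engineering and linear regression steps listed above can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the records are invented toy values rather than scraped IMDb data, the field names (`votes`, `rating`, `log_votes`) and helper functions are hypothetical, and it fits a single-feature model using only the standard library, whereas the real models would likely use more features and a dedicated library.

```python
import math

# Toy records invented purely for illustration (NOT real IMDb data).
# The field names "votes" and "rating" are assumptions, not the project's schema.
movies = [
    {"title": "A", "votes": 1_000, "rating": 5.0},
    {"title": "B", "votes": 10_000, "rating": 6.0},
    {"title": "C", "votes": 100_000, "rating": 7.0},
    {"title": "D", "votes": 1_000_000, "rating": 8.0},
]


def add_log_votes(records):
    """Feature engineering: derive log10(votes) as a new feature."""
    return [{**r, "log_votes": math.log10(r["votes"])} for r in records]


def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Predict a rating from a single feature value."""
    return intercept + slope * x


featured = add_log_votes(movies)
xs = [m["log_votes"] for m in featured]
ys = [m["rating"] for m in featured]

intercept, slope = fit_simple_linear_regression(xs, ys)

# Predict the rating of a hypothetical movie with 50,000 votes.
predicted = predict(intercept, slope, math.log10(50_000))
print(f"intercept={intercept:.3f}, slope={slope:.3f}, predicted={predicted:.3f}")
```

Taking the logarithm of the vote count is one common way to tame a heavily skewed feature before fitting a linear model. Whether it helps on the scraped data is something the EDA step would need to confirm.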

Deliverables

  • Presentation File: We will create a visual and oral presentation to showcase our project and findings.
  • Project Repository: We will create a GitHub repository to share our code and project details.
  • Blog Post: We will publish a blog post (e.g., on Medium) to share our project and findings with the broader data science community.

Conclusion

This project aims to provide insights into which kinds of movies we, as a team, would enjoy watching, based on our Netflix viewing history and IMDb data. We will combine web scraping, exploratory data analysis, feature engineering, and linear regression to build a predictive model. The project will showcase our data science skills and give us valuable experience working with real-world datasets.

License

This project is licensed under the MIT License.

Contributors

Muhammed Maral
Halil Kolatan

Contact

Please feel free to contact the project team for any questions or feedback.