All source code for the Data Science Courses project on the Kaggle platform, including web crawlers, preprocessing pipelines, and EDA notebooks.
Online educational platforms now offer a vast number of courses, and self-learning beginners in Data Science often find it hard to choose one to start with. This data was collected to help answer common questions that come up when choosing a new course of study.
Data was collected via web scraping from popular online platforms: Coursera, Stepik, Udemy, edX, Pluralsight, Alison, FutureLearn, and Skillshare. Only courses related to the "Data Science" topic were queried from each platform. The original author of the image thumbnail is Ales Nesetril.
The primary intent behind collecting the course data is to discover which online platform provides the highest educational quality. Further analysis should also answer questions such as "Does a paid course provide higher quality than a free one?" or "Which platform is the most suitable for beginners?".
- Web crawlers
- On-demand dataframe
- EDA notebooks
- Data processing pipelines
- Test suites
- Common utility scripts
Before setting up the environment, ensure you have Make installed on your local machine. To install the packages required for the Streamlit dashboard, use the following command:
make min-dep
If you prefer a virtual environment setup, use requirements.txt. You can set up extra dependencies for development, testing, data collection, and processing with the command:
make all-dep
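For the virtual environment route mentioned above, a minimal sketch might look like the following. It assumes Python 3 with the built-in venv module is available and that requirements.txt sits in the repository root; the directory name .venv is an arbitrary choice, not something the project prescribes.

```shell
# Create an isolated environment in ./.venv (directory name is an assumption)
python3 -m venv .venv

# Activate it for the current shell session
# (on Windows, run .venv\Scripts\activate instead)
. .venv/bin/activate

# Install the packages listed in the project's requirements.txt
pip install -r requirements.txt
```

Once the environment is active, the Make targets in this README should pick up the interpreter and packages from .venv, provided the Makefile calls python and pip from the PATH.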
To run the Streamlit server, use the command:
make serve