Develop a model on the Expedia dataset
pip install numpy
pip install seaborn
pip install scikit-learn
pip install matplotlib
Personalize Expedia Hotel Searches
- Load the Dataset
- Find NaNs in the data and remove the columns that contain them
- The dataset is very large, so we work on a subset
- Find the most popular property, country and room
- Perform K-Means clustering
- Plot the graphs
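The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it builds a small synthetic DataFrame in place of the real Expedia file, it assumes pandas is installed, and the column names `prop_id`, `prop_country_id`, `srch_room_count` and `prop_review_score` are assumptions about the dataset's schema. The subset size of 500 rows and the choice of clustering features are also hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n = 1000

# Synthetic stand-in for the real data; replace with pd.read_csv(<your file>)
# when working on the actual dataset. Column names are assumptions.
df = pd.DataFrame({
    "prop_id": rng.integers(1, 50, n),
    "prop_country_id": rng.integers(1, 10, n),
    "srch_room_count": rng.integers(1, 4, n),
    "price_usd": rng.gamma(2.0, 80.0, n),
    "srch_booking_window": rng.integers(0, 200, n),
    "srch_saturday_night_bool": rng.integers(0, 2, n),
    "srch_length_of_stay": rng.integers(1, 10, n),
    "prop_review_score": rng.uniform(0, 5, n),
})
df.loc[::7, "prop_review_score"] = np.nan  # inject some missing values

# Find NaNs per column, then drop every column that contains at least one.
nan_counts = df.isna().sum()
df_clean = df.dropna(axis=1)

# Work on a random subset (size chosen arbitrarily for this sketch).
subset = df_clean.sample(n=500, random_state=0)

# Most frequent value of each column of interest.
most_popular = {
    col: subset[col].mode().iloc[0]
    for col in ["prop_id", "prop_country_id", "srch_room_count"]
}

# K-Means with 10 clusters on a hypothetical choice of features.
features = subset[[
    "price_usd",
    "srch_booking_window",
    "srch_saturday_night_bool",
    "srch_length_of_stay",
]]
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(features)
labels = kmeans.labels_
centers = kmeans.cluster_centers_

print(nan_counts[nan_counts > 0])
print(most_popular)
print(centers.shape)
```

The features are clustered unscaled here, so columns with a large numeric range such as `price_usd` dominate the distance computation. Standardizing the features first (for example with `sklearn.preprocessing.StandardScaler`) may be worth trying.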
Formed 10 Clusters
Cluster Centers
Elbow Curve
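An elbow curve is usually produced by fitting K-Means for a range of cluster counts and plotting the within-cluster sum of squares (scikit-learn's `inertia_`) against `k`. The sketch below shows one way to do that. It runs on synthetic data, and the range of 1 to 15 clusters and the output file name are assumptions made for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in for three of the dataset's columns.
X = np.column_stack([
    rng.gamma(2.0, 80.0, 500),   # price_usd
    rng.integers(0, 200, 500),   # srch_booking_window
    rng.integers(0, 2, 500),     # srch_saturday_night_bool
])

# Fit one model per candidate k and record its inertia.
ks = list(range(1, 16))
inertias = [
    KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    for k in ks
]

# Plot inertia against k; the "elbow" is where the curve flattens out.
fig, ax = plt.subplots()
ax.plot(ks, inertias, marker="o")
ax.set_xlabel("Number of clusters (k)")
ax.set_ylabel("Inertia (within-cluster sum of squares)")
ax.set_title("Elbow Curve")
fig.savefig("elbow_curve.png")
```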
5D K Means Graphs
x = price_usd
y = srch_booking_window
z = srch_saturday_night_bool
c = cluster labels (marker colour)
s = marker size, scaled by srch_length_of_stay
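The mapping above (three spatial axes plus colour and marker size) can be drawn with Matplotlib's 3D scatter. The following is a sketch on synthetic data. Which features were actually clustered is not stated here, so clustering on the three plotted axes is an assumption, as are the size scaling factor of 10, the colormap and the output file name.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n = 300

# Synthetic stand-ins for the four dataset columns used in the plot.
price_usd = rng.gamma(2.0, 80.0, n)
srch_booking_window = rng.integers(0, 200, n)
srch_saturday_night_bool = rng.integers(0, 2, n)
srch_length_of_stay = rng.integers(1, 10, n)

# Cluster on the three plotted axes (an assumption for this sketch).
X = np.column_stack([price_usd, srch_booking_window, srch_saturday_night_bool])
labels = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(X)

# Marker size grows with the length of stay; the factor 10 is arbitrary.
sizes = 10 * srch_length_of_stay

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(projection="3d")
sc = ax.scatter(
    price_usd,                 # x
    srch_booking_window,       # y
    srch_saturday_night_bool,  # z
    c=labels,                  # colour = cluster label
    s=sizes,                   # size = length of stay
    cmap="tab10",
    alpha=0.7,
)
ax.set_xlabel("price_usd")
ax.set_ylabel("srch_booking_window")
ax.set_zlabel("srch_saturday_night_bool")
fig.colorbar(sc, ax=ax, label="cluster label")
fig.savefig("kmeans_5d.png")
```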