Python Visualization exercise using Matplotlib (Module 5 UT Data Visualization Bootcamp)
For this challenge, the task was to use Matplotlib with Python to explore the charting and formatting options available for intrinsic data visualization.
The dataset used was a simulation of ride-sharing data:
- Ride data (2376 rows)
- City data (121 rows)
Using the visualizations from the module and challenge assignment, the goal is to determine what relationships (specifically by city type) can be discerned within the datasets.
Key charts from the module and challenge assignment used to determine patterns across the city types:
- Rural
- Suburban
- Urban
( Average Fare versus Total Number of Rides Per City Type )
( Percentage of Total Drivers by City Type )
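As a rough illustration of how a chart like Average Fare versus Total Number of Rides could be built, the sketch below aggregates rides per city and draws one scatter series per city type. The column names (`city`, `fare`, `type`, `driver_count`), the helper names, and the tiny sample tables are assumptions made for this example, not the actual files or code from the assignment.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample rows; the real ride and city tables are not reproduced here.
rides = pd.DataFrame({
    "city": ["A", "A", "A", "B", "B", "C"],
    "fare": [10.0, 14.0, 12.0, 30.0, 34.0, 40.0],
})
cities = pd.DataFrame({
    "city": ["A", "B", "C"],
    "type": ["Urban", "Suburban", "Rural"],
    "driver_count": [50, 20, 5],
})


def per_city_summary(rides, cities):
    """Return one row per city: ride count, average fare, city type, driver count."""
    stats = rides.groupby("city")["fare"].agg(ride_count="count", avg_fare="mean")
    return stats.reset_index().merge(cities, on="city", how="left")


def plot_bubble(summary):
    """Scatter average fare against ride count, one series per city type."""
    fig, ax = plt.subplots()
    for city_type, group in summary.groupby("type"):
        # Marker area scales with the (assumed) driver count of each city.
        ax.scatter(group["ride_count"], group["avg_fare"],
                   s=10 * group["driver_count"], alpha=0.7, label=city_type)
    ax.set_xlabel("Total Number of Rides (Per City)")
    ax.set_ylabel("Average Fare ($)")
    ax.set_title("Average Fare versus Total Number of Rides Per City Type")
    ax.legend(title="City Types")
    return fig


summary = per_city_summary(rides, cities)
fig = plot_bubble(summary)
```

In an interactive session, `fig.savefig(...)` or `plt.show()` (with an interactive backend) would display the result.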
Based on Total Fare by City Type, fare revenue ranks the same way month over month, from highest to lowest:
- Urban
- Suburban
- Rural
Urban cities achieve this through sheer volume. Average Fare versus Total Number of Rides Per City Type shows a descending relationship: as the number of rides rises, the average fare falls.
The higher number of rides in urban environments goes along with a lower average fare, likely due to increased competition among drivers.
Percentage of Total Drivers by City Type helps to confirm this hypothesis. The number of available drivers, from highest to lowest, is:
- Urban
- Suburban
- Rural
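The driver share behind a chart like Percentage of Total Drivers by City Type can be computed with a single group-by. The following is a minimal sketch under assumed column names (`type`, `driver_count`) and with made-up numbers; it is not the assignment's actual data or code.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up city table; values are illustrative only.
cities = pd.DataFrame({
    "city": ["A", "B", "C", "D"],
    "type": ["Urban", "Urban", "Suburban", "Rural"],
    "driver_count": [40, 30, 20, 10],
})


def driver_share_by_type(cities):
    """Percentage of all drivers per city type, sorted from highest to lowest."""
    totals = cities.groupby("type")["driver_count"].sum()
    return (100 * totals / totals.sum()).sort_values(ascending=False)


share = driver_share_by_type(cities)

# Pie chart of the shares, labelled with one decimal place.
fig, ax = plt.subplots()
ax.pie(share, labels=share.index, autopct="%1.1f%%", startangle=90)
ax.set_title("Percentage of Total Drivers by City Type")
```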
To address this disparity, the following is recommended:
- Increasing the number of available drivers in rural & suburban cities may help to reduce average fare price.
- Late February shows peak alignment in total fare revenue among city types. It is recommended to take advantage of this by increasing fare pricing.
- Early January shows alignment as well, but at the lowest fare revenue. This may be due to the holiday season. To confirm, it is recommended to extend the analysis to include December 2018.
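The weekly fare totals that the last two recommendations refer to could be derived roughly as sketched below. The column names (`date`, `type`, `fare`), the function name, and the sample rows are assumptions for illustration; the optional `start` and `end` arguments are a hypothetical way to widen the window (for example to December 2018) if that data were available.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Made-up rides already joined with their city type; values are illustrative only.
rides = pd.DataFrame({
    "date": pd.to_datetime(["2019-01-01", "2019-01-03", "2019-01-08",
                            "2019-01-09", "2019-01-10"]),
    "type": ["Urban", "Rural", "Urban", "Urban", "Suburban"],
    "fare": [10.0, 40.0, 12.0, 8.0, 30.0],
})


def weekly_fare_by_type(rides, start=None, end=None):
    """Sum fares per city type in weekly bins (weeks ending on Sunday)."""
    by_date = rides.pivot_table(index="date", columns="type",
                                values="fare", aggfunc="sum")
    weekly = by_date.resample("W").sum()  # missing type/week combinations sum to 0
    return weekly.loc[start:end]          # None on either side keeps the full range


weekly = weekly_fare_by_type(rides)

# One line per city type; periods where the lines converge are the "alignment" weeks.
ax = weekly.plot(title="Total Fare by City Type")
ax.set_ylabel("Fare ($USD)")
```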