20Bolin/Managing_Big_Data

Managing_Big_Data

This is the code repository for my project assignment in the master's course "Managing Big Data" at the University of Twente. Spark was used to analyse the Kaggle dataset "Newyork city Taxi Trip Records Dataset" and to derive business insights about the taxi industry in New York City. RQ stands for "Research Question", so RQ1 denotes the code for the first research question. The project addresses three research questions:

  1. How does usage differ between Green Taxi and HVFHV on weekdays and weekends between 2019 and 2022?
  2. Between Green Taxi and HVFHV, which type of taxi is preferred in different time slots of the day?
  3. Given the difference in travel distance between Green Taxi and HVFHV, do commuters prefer a certain type of taxi for long- or short-distance travel?
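
The three questions come down to counting trips per service type after bucketing each trip by day type, time slot, and distance band. The sketch below illustrates that bucketing with the Python standard library only, so it runs without a Spark cluster. It is not the repository's code: the sample records, the time-slot boundaries, the 5-mile threshold, and all function names are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample records: (service type, pickup timestamp, trip distance in miles).
TRIPS = [
    ("green", "2019-03-04 08:15:00", 1.2),  # Monday
    ("hvfhv", "2019-03-04 18:40:00", 6.5),  # Monday
    ("hvfhv", "2019-03-09 23:05:00", 3.1),  # Saturday
    ("green", "2019-03-10 11:30:00", 0.8),  # Sunday
]


def day_type(ts):
    """RQ1 bucket: Saturday (5) and Sunday (6) count as weekend."""
    return "weekend" if ts.weekday() >= 5 else "weekday"


def time_slot(ts):
    """RQ2 bucket: assumed six-hour slots, not taken from the project."""
    hour = ts.hour
    if hour < 6:
        return "night"
    if hour < 12:
        return "morning"
    if hour < 18:
        return "afternoon"
    return "evening"


def distance_band(miles, threshold=5.0):
    """RQ3 bucket: the 5-mile threshold is an assumption."""
    return "long" if miles >= threshold else "short"


def summarise(trips):
    """Return one Counter per research question, keyed by (service, bucket)."""
    by_day, by_slot, by_distance = Counter(), Counter(), Counter()
    for service, pickup, miles in trips:
        ts = datetime.strptime(pickup, "%Y-%m-%d %H:%M:%S")
        by_day[(service, day_type(ts))] += 1
        by_slot[(service, time_slot(ts))] += 1
        by_distance[(service, distance_band(miles))] += 1
    return by_day, by_slot, by_distance


if __name__ == "__main__":
    for name, counts in zip(("RQ1", "RQ2", "RQ3"), summarise(TRIPS)):
        print(name, dict(counts))
```

On the full dataset the same grouping would be expressed as Spark aggregations rather than a Python loop; the bucketing logic stays the same.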
