The Office - Words and Numbers

The data this week comes from the schrute R package for The Office transcripts and data.world for IMDB ratings of each episode.

If you'd like to use the schrute R package for ALL the lines/dialogue from the show - please install it from CRAN via install.packages("schrute"). A quick example from the vignette can be found here.

If you want to do text analysis - make sure to check out the tidytext package - a vignette can be found here and the Tidy Text Mining with R book can be found freely online here.

Lastly - the pudding analyzed The Office dialogue across a few charts - their article is here.

Get the data here

# Get the Data

office_ratings <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-17/office_ratings.csv')

# Or read in with tidytuesdayR package (https://github.com/thebioengineer/tidytuesdayR)
# PLEASE NOTE TO USE 2020 DATA YOU NEED TO USE tidytuesdayR version ? from GitHub

# Either ISO-8601 date or year/week works!

# Install via devtools::install_github("thebioengineer/tidytuesdayR")

tuesdata <- tidytuesdayR::tt_load('2020-03-17')
tuesdata <- tidytuesdayR::tt_load(2020, week = 12)


office_ratings <- tuesdata$office_ratings

Data Dictionary

`office_ratings.csv`

variable	class	description
season	double	Season number
episode	double	Episode number
title	character	Title of episode
imdb_rating	double	IMDB Rating (10 is best)
total_votes	double	Total votes by users
air_date	date	Original air date

`schrute` data

variable	class	description
index	integer	Index
season	character	Season Number
episode	character	Season episode
episode_name	character	Episode title
director	character	Episode Director
writer	character	Episode Writer
character	character	Episode Character
text	character	Dialogue as text
text_w_direction	character	Dialogue as text with direction

Cleaning Script

No cleaning this week!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

readme.md

readme.md

The Office - Words and Numbers

Get the data here

Data Dictionary

`office_ratings.csv`

`schrute` data

Cleaning Script

Files

readme.md

Latest commit

History

readme.md

File metadata and controls

The Office - Words and Numbers

Get the data here

Data Dictionary

office_ratings.csv

schrute data

Cleaning Script

`office_ratings.csv`

`schrute` data