diff --git a/_projects/2024/100528259/100528259.Rmd b/_projects/2024/100528259/100528259.Rmd new file mode 100644 index 00000000..f47f51a3 --- /dev/null +++ b/_projects/2024/100528259/100528259.Rmd @@ -0,0 +1,1005 @@ +--- +title: "An 8-Bit Look at the Evolution of BMI Values of Olympic Athletes" +description: | + This study displays the evolution of BMI values of Olympic athletes who + competed in the Summer Olympic Games of 1960 and 2016. +categories: "2024" +author: Gür Piren +date: "`r Sys.Date()`" +output: + distill::distill_article: + self_contained: false + toc: true +--- + +```{r setup, include=FALSE} +knitr::opts_chunk$set(out.width="100%", fig.align="center", echo=TRUE) +``` + +## Original Chart + +The Olympic Games have long been a platform to showcase the pinnacle of human performance and athleticism. Over the years, the physical features of athletes competing in these prestigious events have transformed significantly. [This chart](https://www.visualcapitalist.com/cp/charted-olympic-athletes-getting-bigger/) from Visual Capitalist displays the evolution of Olympic athletes' sizes, illustrating trends in height and weight across sports and genders. It utilises BMI categories to place various sports to capture changes over time. + +There are several reasons as to why I picked this plot. At first sight, it attracts attention with its coloured tiles, and it looks well-organised with the facets and labeled data points. Getting above such features with a brand new design seemed challenging - and I liked the idea. + +Second, after careful exploration, I noticed that there was something wrong with the representation of sports: some sports were not present in the 1960 Summer Olympic games, however they were displayed it the facets that demonstrate 2016 games, which was precisely the moment that gave me the idea that this work was in fact not perfect. + +To not spoil all the fun by revealing the ideas, let's proceed to data cleaning and adjusting section. + + +![Olympic Graph. Source: [Visual Capitalist](https://www.visualcapitalist.com/)](olympics.png){width="100%" .external} + + +## Packages and fonts + +```{r} + +# For replication graph + +library(tidyverse) +library(readr) +library(ggplot2) +library(dplyr) +library(ggrepel) +library(extrafont) +library(showtext) +library(sysfonts) + +# Extras - For alternative plot + +library(grid) +library(ggtext) +library(gridExtra) + + +# Fonts for the replication plot + +font_add_google("Crimson Pro", "crimson-pro") +font_add_google("Outfit", "outfit") +font_add_google("Cormorant Garamond", "cormorant-garamond") +font_add_google("Montserrat", "Montserrat") +font_add_google("Archivo Black", "ArchivoBlack") +font_add_google("Piazzolla", "PiazzollaExtraLight") + +# Fonts for the improved plot + +font_add_google("Press Start 2P", "press-start") +font_add_google("VT323", "vt323") +font_add_google("DotGothic16", "dotgothic16") + +showtext_auto() + +``` + +The data set displays personal records of Olympic athletes. + +The original plot focuses on athletes who are at least 20 years old and competed in 1960 and 2016 Summer Olympic games. Taking into account their height and weight values, it then calculates BMI values of athletes and categorise them. + +In so doing, it creates four different facets and coloured tiles in each one of them. Obviously, each colour tile represents a BMI category. + +The four facets are designed to represent 1960 and 2016 Summer Olympic games, and male and female athletes. + +## Replication + +### Working with the data + +Reading the dataset + +```{r} +data_main <- read_csv("athlete_events.csv") +``` + +As we will only need the sports branches, Olympic game's year information and the name, gender, and the BMI values of the athletes, I get rid of everything else. + +```{r} + +olympics_tidy <- data_main |> + filter(Age >= 20, Year %in% c(2016, 1960), Season == "Summer") |> + select(-c(Team, Games, NOC, City, Event, Medal)) |> + distinct(Name, .keep_all = TRUE) +``` + + +Some athletes have no height or weight (or either) info. This is just to have an idea of what's going on. + +```{r} + +missing_values <- olympics_tidy |> + summarize( + missing_height = sum(is.na(Height)), + missing_weight = sum(is.na(Weight)) + ) + +missing_values + +``` + +I want to create a BMI variable for future use, as BMIs will be the main concern of this study. + + +```{r} +olympics_tidy <- + olympics_tidy |> + mutate(BMI = Weight / (Height / 100)^2) +``` + + +Now let's acquire the means for height, weight, and BMI values for different branches, across genders and for the two Olympic games. + +```{r} + +olympics_tidy <- olympics_tidy |> + summarise( + height_mean = mean(Height, na.rm = TRUE), + weight_mean = mean(Weight, na.rm = TRUE), + BMI_mean = mean(BMI, na.rm = TRUE), + .by = c("Sport", "Sex", "Year")) +``` + + +Recoding 'Sex' for better readability. + + +```{r} + +olympics_tidy <- olympics_tidy |> + mutate(Sex = recode(Sex, `M` = "Men", `F` = "Women")) +``` + + +Although the values range from 30 to 170 for weight and from 143 to 218 for height, the graph seems to omit some observations and use different ranges. + +After a careful and (highly) manual measurement, I decided that for weight the range would be from 45 to 105, and for height it would be 155 and 205. + +This can be observed in the original graph. + +Since I'll be viusalising 'Gender' and changes in 'Height', 'Weight', I will be using facets. +Let's organise the data taking into account the new value ranges indicated above, and then proceed to labeling BMI ranges. + +To do that, I first reframe my data set indicating new ranges for Height and Weight values. + +Following this, I create a new variable named 'BMI Label', which will store BMI categories and their values. Then, this variable will help create tiles for each BMI category. + +Using 'ordered = TRUE', I ensure the logical order between 'Underweight', 'Normal', 'Overweight', and 'Obese' categories. + +```{r} + +olympics_reframed <- crossing( + Height = 155:205, + Weight = 45:105, + Sex = c("Men", "Women"), + Year = c(1960, 2016) +) |> + mutate( + BMI = Weight / (Height / 100)^2, + `BMI Label` = factor(case_when( + BMI <= 18.5 ~ "Underweight", + BMI > 18.5 & BMI <= 24.9 ~ "Normal", + BMI > 24.9 & BMI <= 29.9 ~ "Overweight", + BMI > 29.9 ~ "Obese" + ), + levels = c("Underweight", "Normal", "Overweight", "Obese"), + ordered = TRUE) + ) +``` + + +### Plotting + +Let's begin with stating x any y axes. I will use 'weight_mean' and 'height_mean' from 'olympics_tidy' data set. + +Next, 'geom_tile' will enable me to create tiles based on BMI categories that I have indicated in 'BMI Label' column in 'olympics_reframed'. + +I have used a separate data set called 'olympics_reframed' for my own ease. All these information could probably work out together in a single data set. + +The filling of the tiles will be automatic. I will be changing it in the next step. + +```{r fig.height=14, fig.width=13} + +final_graph <- ggplot(olympics_tidy) + + aes(weight_mean, height_mean) + + geom_tile(data = olympics_reframed, aes(x = Weight, y = Height, fill = `BMI Label`)) + +final_graph + +``` + +Here, I enhance the appearance of my existing graph by customizing the fill colours for different BMI categories. I use scale_fill_manual() to manually set specific colours for each category, such as "Underweight," "Normal," "Overweight," and "Obese," assigning colours as close as possible to the original plot. + +I also adjust the legend with setting 'guides()' to NULL in order to remove the title. + +```{r fig.height=14, fig.width=13} +final_graph <- + final_graph + + scale_fill_manual(values = c( + "Underweight" = "cadetblue2", + "Normal" = "darkolivegreen2", + "Overweight" = "lightgoldenrod1", + "Obese" = "lightcoral")) + + guides(fill = guide_legend(title = NULL)) + +final_graph + +``` + +It's time to set the title and subtitle for of the plot and add axes titles. Lastly, I add the caption at the bottom right of the original plot indicating the source of data. + +```{r fig.height=14, fig.width=13} +final_graph <- + final_graph + + labs( + x = "WEIGHT (KG)", + y = "HEIGHT (CM)", + title = "OLYMPIC ATHLETES ARE GETTING BIGGER", + subtitle = "Mean weight and height by sport at the 1960 and 2016 Summer Olympics. Values calculated \nfor athletes 20 years or older, compared to the BMI classification for the general population.", + caption = "Source: Kaggle • Graphic: Georgios Karamanis" + ) +final_graph +``` + +No party without our good old friends: data points! Using geom_point, I invite them to the party. + +```{r fig.height=14, fig.width=13} +final_graph <- + final_graph + + geom_point() + +final_graph +``` + +Since I will be comparing 1960 and 2016 Summer Olympics while displaying BMI values based on gender, I will need 4 different facets. + +The 'Sex' variable has two values (Men and Women), and the 'Year' variable only has 1960 and 2016. Therefore, four facets. + +Lastly, I tell R to not add extra space to x and y axes using coord_cartesian(expand = FALSE). + +```{r fig.height=14, fig.width=13} +final_graph <- + final_graph + + facet_grid(Sex ~ Year) + + coord_cartesian(expand = FALSE) + +final_graph +``` + +In the original plot, the legend is located at the top, right below the subtitle. First I relocate it on the top and ensure that it is centered. Then, I place it in a horizontal way. Lastly, I ensure that the legend key sizes somewhat match to that of original plot. + +```{r fig.height=14, fig.width=13} +final_graph <- + final_graph + + theme( + legend.position = "top", + legend.justification = "center", + legend.direction = "horizontal", + legend.key.size = unit(0.7, "cm") + ) + +final_graph +``` +Using strip.text, I ensure the colour for texts in the strips and match their size to the original plot. + +Using plot.title and plot.subtitle, I set sizes, alignments, and margins of the title and subtitle. For the title, I use 'ArchivoBlack', and for the subtitle, I use 'Montserrat' (both from Google fonts). + +After setting the margins of the plot and the caption that is at the bottom right, I move to indicating sizes for the legend, x and y axes (for both axis texts and axis titles). + +For axis titles, I use 'cormorant-garamond' from Google fonts. + +Lastly, I change the background colour for facet strips and set the colour for outer borders to = NA. This way, it will look as if I have no borders for the strips. + + +```{r fig.height=14, fig.width=13} +final_graph <- + final_graph + + theme( + strip.text = element_text(color = "black", size = 20), + plot.title = element_text(size = 36, hjust = 0.5, family = "ArchivoBlack", + margin = margin(t = 10, b = 5, l = 15, r = 15)), + plot.subtitle = element_text(size = 18, hjust = 0.5, family = "Montserrat", + face = "plain", lineheight = 1, margin = margin(t = 5, b = 10, l = 15, r = 15), color = "grey15"), + plot.margin = margin(20, 10, 10, 10), + plot.caption = element_text(size = 16, hjust = 1, lineheight = 1.2), + legend.text = element_text(size = 14), + axis.text.x = element_text(size = 14), + axis.text.y = element_text(size = 14), + axis.title.x = element_text(size = 16, family = "cormorant-garamond"), + axis.title.y = element_text(size = 16, family = "cormorant-garamond"), + strip.background = element_rect(fill = "gray95", color = NA), + plot.background = element_rect(fill = "gray95"), + legend.background = element_rect(fill = "gray95") +) + + + + +final_graph +``` + +The only thing that is missing right now (pretty much) is labeling data points. + +To make it look a bit more like the original plot, I begin with changing the colour, size, and the transparency level of my data points. (see geom_point(...)). + +Thanks to geom_text_repel, I will be able to put labels to the data points. Since each data point represents an Olympic sport, these texts will be name of Olympic sports. + +Firstly, I ensure all sports names are in capital letters, using 'label = toupper(Spor)'. + +Next, I set the font of data labels to 'Montserrat' and set their size to 4. + +To have a clean and readable plot, I had to ensure that text labels do not overlap. Using 'nudge_x' and 'nudge_y', I slightly moved text labels horizontally and vertically. 'box.padding', on the other hand, adds padding around each label’s box to create more space between the text and other elements, so it helps the cause dearly. Lastly, using 'max.overlaps', I ensured to limit the maximum number of overlapping labels to be shown; so that excess overlaps are suppressed. + +'min.segment.length' helps me to somewhat restrict as to which connecting lines will be displayed, so I set it to 1.2 and disable the ones that would be longer than that value. With 'segment.size', I make those connecting lines extremely thin and through 'segment.color', their colour is indicated as 'grey30'. + +I use the same colour for my text labels using "color = 'grey30'". Lastly, I hide a possible legend for the data labels through 'show.legend = FALSE'. + +```{r fig.height=14, fig.width=13, preview=TRUE} +final_graph <- + final_graph + + geom_point(color = "gray40", size = 1.5, alpha = 0.7) + + geom_text_repel(aes(label = toupper(Sport)), + family = "Montserrat", + size = 4, + nudge_y = 3, + nudge_x = 1.5, + box.padding = 0.75, + max.overlaps = 15, + min.segment.length = 1.2, + segment.size = 0.1, + segment.color = "grey30", + color = "grey30", + show.legend = FALSE) + +final_graph + +``` + +### Downsides of the original plot + +The initial plot displays some Olympic events that did not exist in 1960 but took place in the 2016 Olympic Games (e.g., Tennis). I don't need to show them as they do not contribute to the study and add noise to the data, particularly in women’s sports. This is due to the fact that women's sports were not as diverse in 1960 as they were in 2016. + +There are extremely few or no 'Underweight' and 'Obese' athletes in the initial study. Using tiles for nonexistent categories seems redundant. + +Another issue with the plot is that the data point trends are not easy to follow. For any sport, it is quite difficult to determine whether it has experienced an increase or decrease over time. + +## Alternative Plot + +### Possible improvements + +1. As the original plot's biggest weakness is the difficulty in tracking individual sports' trends over time, I try to address this issue by using a slope chart as my alternative plot. A slope chart is useful when comparing trends over time, using two data points for each observation and a line that connects those two data points. + +2. Most sports branches have somewhat similar BMI values, which eventually leads to a plot that has many overlappings. A slope chart is likely to reduce those overlappings while still emphasising trends. + +3. I can change the title, coming up with a more striking one. Also I can use a fancy font. The font and colouring has actually been my favourite part for this study. + +4. Who does not love 8-bit stuff? Well, not me. The inspiration of my alternative plot will be 8-bit aesthetics. I had to do it, because for my whole childhood I played 'Track & Field'! + +### Challenges + +The closely clustered nature of the data points and the lack of direct connections between 1960 and 2016 data for each sport make trend analysis challenging. I will try to address these issues in my slope chart, but highly close numbers of BMI values pose threats to clarity of my plot. + +In my first attempt, I completely excluded women's sports that were present in both Olympic games (which amounted to 6 branches in total). + +In the second attempt, I'll omit some of the men's sports branches and include women's sports into my plot. But for now, let's move onto the first one. + +### Working with the data + +Let's first read the data and start with the improvement. + +```{r} +athletes <- read_csv("athlete_events.csv") +``` + +As the original study suggests, I filter 1960 and 2016 Summer Olympic Games, and athletes that were at least 20 years old. +Then I move onto create a BMI variable for the athletes in my dataset. + +```{r} +athletes <- athletes |> + filter(Year %in% c(1960, 2016), Season == "Summer", Age >= 20) |> + mutate(BMI = Weight / (Height / 100)^2) +``` + +In this chunk of code, I create two subsets of my dataset that will display sports branches that male athletes competed in 1960 and 2016 Summer Olmypics. + +Lastly, I want to see which sports are common in both Olympic games to prevent data noise in the future. Using 'pull', I acquire those sports for each year and then store their intersection in 'common_sports_men'. + +```{r} +sports_1960_men <- athletes |> + filter(Year == 1960, Sex == "M") |> + pull(Sport) + +sports_2016_men <- athletes |> + filter(Year == 2016, Sex == "M") |> + pull(Sport) + + +common_sports_men <- intersect(sports_1960_men, sports_2016_men) + +``` + +To create a subset of data with only male athletes that competed in intersecting sports, I filter my dataset. Then, for a tidy table, I drop unnecessary columns. + +```{r} + +filtered_male_athletes <- athletes |> + filter(Sex == "M" & Sport %in% common_sports_men) |> + select(-Name, -Team, -NOC, -Games, -City, -Event, -Medal) +``` + +Now I am at the point where I need to create an average BMI value for each sports in my dataset. I group BMI values by 'Year' and 'Sports' as they will be displayed on the basis of competition year and event type. + +Then I sort them in alphabetical order for my own use. + +```{r} + +avg_bmi_men <- filtered_male_athletes |> + group_by(Sport, Year) |> + summarize(Avg_BMI = mean(BMI, na.rm = TRUE), .groups = 'drop') + +avg_bmi_men <- avg_bmi_men |> + arrange(Sport) + +``` + + +This is where I manually group sports for each facet, considering their BMI values in 1960 and 2016. This way, I am planning to reduce overlappings as efficiently as possible. + +```{r} +avg_bmi_men <- avg_bmi_men |> + mutate(Facet_Group = case_when( + Sport %in% c("Water Polo", "Sailing", "Hockey", "Diving", "Football") ~ "1", + Sport %in% c("Shooting", "Modern Pentathlon", "Rowing", "Cycling", "Wrestling") ~ "2", + Sport %in% c("Weightlifting", "Swimming", "Athletics", "Boxing", "Canoeing") ~ "3", + TRUE ~ "4" + )) + +``` + +This was one of my favourite parts doing my alternative plot. Retro work!! + +The aesthetics of my plot will be providing a retro look. + +So, I assign colours for each sports so that they will (try to) not confuse the reader. + +```{r} + +retro_colours <- c( + "Athletics" = "#00a046", + "Basketball" = "#FF6F20", + "Boxing" = "#fcab98", + "Canoeing" = "#7fcdbb", + "Cycling" = "#bcbddc", + "Diving" = "#000000", + "Equestrianism" = "#a65628", + "Fencing" = "#A3D900", + "Football" = "#ec213e", + "Gymnastics" = "#d8b365", + "Hockey" = "#F1C40F", + "Modern Pentathlon" = "#FF448C", + "Rowing" = "#10EA00", + "Sailing" = "#636363", + "Shooting" = "#D167BF", + "Swimming" = "#4400AA", + "Weightlifting" = "#7A9A01", + "Wrestling" = "#80BFFF", + "Water Polo" = "#0057A1" +) + +``` + +### Plotting + +From the avg_bmi_men table, I extract BMI values for both years and genders. Next, I specify the x and y axes for my plot. Lastly, I ensure that the grouping is based on the type of sport. + +The geom_line function helps me create lines that connect values from two Olympic years. Using the 'linetype' parameter, I apply dotted lines to give an 8-bit feel to the plot. Similarly, by using 'geom_point', I employ square-shaped data points and adjust their size. + +With 'facet_wrap', I arrange the sports in facets that I matched earlier. By using 'ncol = 2', I display all my facets in two columns. + + +```{r fig.height=5, fig.width = 8.5} + +improved_graph <- + ggplot(avg_bmi_men, aes(x = factor(Year), y = Avg_BMI, group = Sport, color = Sport)) + + geom_line(linetype = "dotted", linewidth = 1.5) + + geom_point(size = 2, shape = 15) + + facet_wrap(~ Facet_Group, ncol = 2) + +improved_graph + +``` + +Quickly changing the structure of the legend into a two-column one. I will be using the space below the legend. + +```{r fig.height=5, fig.width = 8.5} +improved_graph <- + improved_graph + + guides(color = guide_legend(ncol = 2)) + +improved_graph +``` + +It is time I name the title, subtitle, and caption, as well as the x and y axes. + +I also change the randomly assigned colors of data points and lines to the colors I manually picked and stored in 'retro_colours' earlier. + +As the BMI values range between 20 and 29, I customize the y-axis, ensuring that the values increment by 1 unit (scale_y_continuous). + +```{r fig.height=5, fig.width = 8.5} +improved_graph <- + improved_graph + + labs(title = "Avg. BMI of Male Olympic Athletes \nby Sport (1960 vs 2016)", + subtitle = "Have they lowkey gotten fatter?", + caption = "My first ever original plot © ", + x = "Year", + y = "Average BMI", + color = "Olympic sports") + + scale_color_manual(values = retro_colours) + + scale_y_continuous(breaks = seq(20, 29, by = 1)) + +improved_graph +``` + +I customise the plot title using 'plot.title', applying the "press-start" font for consistency, and set it to bold for emphasis. + +For the plot subtitle, I use 'plot.subtitle' with the "vt323" font from Google fonts, adjusting the alignment to ensure it appears centred below the title. + + +```{r fig.height=5, fig.width = 8.5} +improved_graph <- + improved_graph + + theme( + plot.title = element_text(family = "press-start", size = 10, face = "bold", + hjust = 0.2, color = "yellow2"), + plot.subtitle = element_text(family = "vt323", size = 14, hjust = 0.5, color = "#abdbe3") + ) + +improved_graph + +``` + +I customise the axis text using 'axis.text', setting the colour to light blue and applying the "press-start" font for a consistent style. + +The axis titles are styled with 'axis.title', making them bold and yellow to ensure they stand out. + +```{r fig.height=5, fig.width = 8.5} + +improved_graph <- + improved_graph + + theme( + axis.text = element_text(size = 5, color = "lightblue1", family = "press-start"), + axis.title = element_text(size = 8, face = "bold", family = "press-start", color = "yellow2"), + ) + +improved_graph + +``` + + +I set the plot background colour using 'plot.background', filling it with a dark grey. The panel background is customised with 'panel.background', applying a lighter grey and defining a subtle border for clear look of sections. For the grid lines, I adjust the major grid lines using 'panel.grid.major', setting their colour to a light grey. + +I also configure the minor grid lines with 'panel.grid.minor', ensuring they maintain a harmonious appearance with a very light grey. + +```{r fig.height=5, fig.width = 8.5} + +improved_graph <- + improved_graph + + theme( + plot.background = element_rect(fill = "grey15"), + panel.background = element_rect(fill = "grey95", color ="grey95", linewidth = 2), + panel.grid.major = element_line(color = "grey90"), + panel.grid.minor = element_line(color = "grey95") + ) + +improved_graph + +``` + +I position the legend on the right with 'legend.position', ensuring a clean layout. The legend title is styled with 'legend.title'. I also apply the "press-start" font here. I adjust the size of the legend keys with 'legend.key.size'. For the legend text, I use 'legend.text' to define the font properties and colour, enhancing readability. + +The background of the legend is refined using 'legend.background', to give it a light grey fill with a subtle border for better definition. + +I remove the facet strip titles by setting 'strip.text to element_blank()', because I do not want to see them. + +```{r fig.height=5, fig.width = 8.5} +improved_graph <- + improved_graph + + theme( + legend.position = "right", + legend.title = element_text(family = "press-start", + size = 10, + color = "blue"), + legend.key.size = unit(0.4, "cm"), + legend.text = element_text(size = 5, family = "press-start", color = "blue"), + legend.background = element_rect(fill = "grey95", color = "grey70", size = 2), + strip.text = element_blank() + ) + +improved_graph +``` + +For the honours, let's put the copyright. + +Using 'plot.caption' and 'plot.margin', I set the properties of my caption such as the font type, size colour, alignment, and margin. + +Lastly, I adjust the overall plot margins using 'plot.margin', defining the space around the entire plot. + +```{r fig.height=5, fig.width = 8.5} + +improved_graph <- + improved_graph + + labs( + caption = "My first ever original plot © ") + + theme( + plot.caption = element_text(size = 6, family = "press-start", color = "yellow2", + hjust = 2.2, margin = margin(b = 14)), + plot.margin = margin(7, 12, 1, 5) +) + +improved_graph +``` + + +I then create an annotation background using a light grey rectangle and add text to it, explaining the BMI categories: 'Underweight', 'Normal', 'Overweight', and 'Obese', all styled with a blue color. + +Finally, I arrange the main graph alongside the annotation so that the graph occupies 92% of the height and the annotation takes up 8%, ensuring a clear and informative presentation. + +```{r fig.height=5, fig.width = 8.5} +annotation_background <- rectGrob( + gp = gpar(fill = "grey90") +) + +# Creating the annotation text +annotation_grob <- textGrob( + "BMI ≤ 18.5 'Underweight' 18.5 < BMI < 25 'Normal' 25 ≤ BMI < 30 'Overweight' BMI ≥ 30 'Obese'", + gp = gpar(fontsize = 5, + fontfamily = "press-start", + col = "blue") +) + +combined_annotation <- grobTree(annotation_background, annotation_grob) + +grid.arrange(improved_graph, + combined_annotation, + heights = c(0.92, 0.08), + ncol = 1) +``` + +## Alternative Plot After Class Presentation + +During my presentation for my initial improved plot, I received recommendations on several aspects of my work. At the end of this document, I will put an evaluation for those recommendations and what I have done to address the issues that were raised. + +### Working with the data + +Let's start with reading the data. + +```{r} + +data <- read_csv("athlete_events.csv") + + +``` + +I erase the columns that I will not be using. + +```{r} +cleaned_data <- data %>% + select(-Team, -NOC, -Games, -City, -Event, -Medal) + +``` + + +Here, I filter 1960 and 2016 Summer Olympics and athletes >= 20 years old. + +```{r, echo=TRUE} + +filtered_data <- cleaned_data |> + filter(Year %in% c(1960, 2016), + Season == "Summer", + Age >= 20) + +``` + + I want to detect those sports that are common in both Olympic game years for men. + +```{r} +sports_1960_men <- filtered_data |> + filter(Year == 1960, Sex == "M") |> + pull(Sport) + +sports_2016_men <- filtered_data |> + filter(Year == 2016, Sex == "M") |> + pull(Sport) +``` + + +I find the intersection for men, this is essential for ensuring meaningful data representation. + +```{r} +common_sports_men <- intersect(sports_1960_men, sports_2016_men) + +common_sports_men +``` + + +Then, I do the same for women, because there is a lot of data noise at women's sports. Vast majority of women's sports in 2016 did not exist in 1960 Summer Olympics. + +```{r} +sports_1960_women <- filtered_data |> + filter(Year == 1960, Sex == "F") |> + pull(Sport) + +sports_2016_women <- filtered_data |> + filter(Year == 2016, Sex == "F") |> + pull(Sport) + +``` + + +Let's find the intersection for women as well. + +```{r} + +common_sports_women <- intersect(sports_1960_women, sports_2016_women) + +common_sports_women + +``` + + +Keeping only the intersecting sports will help me ensure there is no data noise. + +```{r} +olympics <- filtered_data |> + filter(Year %in% c(1960, 2016), + (Sport %in% common_sports_men & Sex == "M") | + (Sport %in% common_sports_women & Sex == "F")) +``` + +Now I will work on BMI values and categories. So let's create a BMI variable first. + +```{r} +filtered_data_BMI <- olympics |> + mutate(BMI = Weight / (Height / 100)^2) +``` + +I calculate average BMI values for each sport. + +```{r} +avg_BMI <- filtered_data_BMI |> + group_by(Sport, Year, Sex) |> + summarize(avg_BMI = mean(BMI, na.rm = TRUE), .groups = 'drop') +``` + + +For the three facets that I will create in my plot, I specify sports. + +```{r} +avg_BMI <- avg_BMI |> + mutate(Facet_Group = case_when( + Sport %in% c("Water Polo", "Wrestling", "Swimming", "Shooting", + "Weightlifting", "Gymnastics", "Equestrianism", "Canoeing", "Sailing", "Hockey") ~ "1", + Sport %in% c("Gymnastics", "Athletics", "Fencing", "Canoeing", + "Swimming", "Diving", "Equestrianism", "Sailing") ~ "2", + )) +``` + +Picking colours again! + +```{r} + +retro_colours_2 <- c( + "Athletics" = "#00a046", + "Basketball" = "#FF6F20", + "Boxing" = "#fcab98", + "Canoeing" = "#7fcdbb", + "Cycling" = "#bcbddc", + "Diving" = "#FF448C", + "Gymnastics" = "#d8b365", + "Hockey" = "#F1C40F", + "Rowing" = "#10EA00", + "Sailing" = "#4400AA", + "Shooting" = "#D167BF", + "Swimming" = "#0057A1", + "Weightlifting" = "#7A9A01", + "Wrestling" = "#80BFFF", + "Water Polo" = "#636363" +) + +``` + + +### Onto the plot + +In this section, I define two character vectors for male and female Olympic sports, categorizing them into two groups for men and one for women. These sports are chosen on the basis of their presence in both Olympic Games. In men's sports, I manually picked those that will display little or no overlap. + +```{r} +men_sports <- c("Athletics", "Canoeing", "Swimming", "Shooting", "Weightlifting", + "Gymnastics", "Basketball", "Boxing", "Sailing", "Hockey", "Wrestling") +women_sports <- c("Gymnastics", "Athletics", "Fencing", + "Canoeing", "Swimming", "Diving", "Sailing") +``` + + +Next, I enhance the avg_BMI dataset by creating a new column named Facet_Group, which assigns a numeric identifier based on the athlete's sex and sport. Male athletes participating in sports from men_sports receive a "1," and female athletes are tagged with a "2" for sports from women_sports. + +Afterwards, I filter the data to keep only the rows with defined Facet_Group values, resulting in the filtered_avg_BMI dataset containing relevant entries. + +```{r} +avg_BMI <- avg_BMI |> + mutate(Facet_Group = case_when( + Sport %in% men_sports & Sex == "M" ~ "1", + Sport %in% women_sports & Sex == "F" ~ "2" + )) + +filtered_avg_BMI <- avg_BMI |> + filter(!is.na(Facet_Group)) +``` + + +The next step involves visualizing the data with ggplot, where I create improved_graph_2. Here, I specify the x and y axes, grouping by sport and color coding them accordingly. I utilize the geom_line function for connecting lines and geom_point for square-shaped data points. This time, I apply facet_grid to arrange the facets flexibly, with free y-axis scales and labeled to distinguish male and female groups. + +```{r fig.height=5, fig.width = 8.5} + +improved_graph_2 <- ggplot(filtered_avg_BMI, aes(x = factor(Year), y = avg_BMI, + group = Sport, color = Sport)) + + geom_line(linewidth = 1.5, linetype = "dotted") + + geom_point(size = 2, shape = 15) + + facet_grid(~ Facet_Group, scales = "free", + labeller = as_labeller(c("1" = "Male", "2" = "Female"))) + +improved_graph_2 +``` + + +Once the basic structure is in place, I add the title, subtitle, and axis labels. I also customize the colors for the lines and points using retro_colours_2 and ensure the y-axis is properly segmented with scale_y_continuous. + +```{r fig.height=5, fig.width = 8.5} +improved_graph_2 <- + improved_graph_2 + + labs(title = "Avg. BMI of Olympic Athletes \nby Olympic Sports (1960 vs 2016)", + subtitle = "Have they lowkey gotten fatter?", + x = "Year", + y = "Average BMI", + color = "Olympic Sports") + + scale_color_manual(values = retro_colours_2) + # Custom colors for the sports + scale_y_continuous(breaks = seq(20, 29, by = 1)) + +improved_graph_2 +``` + +Using 'plot.title = element(...)' and 'plot.subtitle = element(...)', I set the type, size, alignment, and the colour of fonts for my title and subtitle. + +```{r fig.height=5, fig.width = 8.5} +improved_graph_2 <- + improved_graph_2 + + theme( + plot.title = element_text(family = "press-start", size = 10, face = "bold", + hjust = 0.5, color = "yellow2"), + plot.subtitle = element_text(family = "vt323", size = 14, hjust = 0.5, color = "#abdbe3") + ) + +improved_graph_2 +``` + +The code 'plot.background = element_rect(fill = "grey15")' sets the plot's background to dark grey. + +Next, through 'panel.spacing = unit(1, "lines")' I adjust the space between panels, making each section more distinct. + +By using panel.background = element_rect(fill = "grey95", color ="grey95", linewidth = 2), I give each panel a light grey background and a subtle border to clearly define the sections. I set major grid lines to light grey with 'panel.grid.major = element_line(color = "grey90")', ensuring they support readability without being distracting. For minor grid lines, 'panel.grid.minor = element_line(color = "grey95")' maintains a consistent layout. + + +```{r fig.height=5, fig.width = 8.5} +improved_graph_2 <- + improved_graph_2 + + theme( + plot.background = element_rect(fill = "grey15"), + panel.spacing = unit(1, "lines"), + panel.background = element_rect(fill = "grey95", color ="grey95", linewidth = 2), + panel.grid.major = element_line(color = "grey90"), + panel.grid.minor = element_line(color = "grey95") + ) + +improved_graph_2 +``` + +I adjust the axis text with 'axis.text' to enhance visibility and set the colour to light blue. Next, I rotate the y-axis text using 'axis.text.y' at an angle for better readability. For the x-axis, I apply 'axis.text.x' to angle the text, ensuring clarity for closely spaced labels. This is different from the initial improved plot. + +Lastly, I customise the axis titles with 'axis.title', making them bold and using a yellow colour to maintain a consistent style. + + +```{r fig.height=5, fig.width = 8.5} +improved_graph_2 <- + improved_graph_2 + + theme( + axis.text = element_text(size = 5, color = "lightblue1", family = "press-start"), + axis.text.y = element_text( angle = 45), + axis.text.x = element_text( angle = 25), + axis.title = element_text(size = 8, face = "bold", family = "press-start", + color = "yellow2") + ) +improved_graph_2 +``` + + +I position the legend on the right using 'legend.position'. The legend title is customised with 'legend.title', I use the font "press-start" font for consistency. Then, I adjust the size of legend keys with 'legend.key.size'.For the legend text, I use 'legend.text' to set the font properties and colour, enhancing its visibility. The background of the legend is changed with 'legend.background', providing a light grey fill and a subtle border. I modify the facet strip titles with 'strip.text', and it makes them bold and blue for emphasis. + +Finally, guides() helps format the legend layout, positioning it into two columns for better organisation. + +```{r fig.height=5, fig.width = 8.5} +improved_graph_2 <- + improved_graph_2 + + theme( + legend.position = "right", + legend.title = element_text(family = "press-start", size = 10, color = "blue"), + legend.key.size = unit(0.4, "cm"), + legend.text = element_text(size = 5, family = "press-start", color = "gray15"), + legend.background = element_rect(fill = "grey95", color = "grey70", size = 2), + strip.text = element_text(size = 7, face = "bold", family = "press-start", color = "blue") + ) + + guides(color = guide_legend(ncol = 2)) + + +improved_graph_2 + +``` + +For the honours, let's put the copyright. + +Using 'plot.caption' and 'plot.margin', I set the properties of my caption such as the font type, size colour, alignment, and margin. Don't worry about the alignment of the caption for now. The issue will resolve in the next (and last) step. + +Lastly, I adjust the overall plot margins using 'plot.margin', defining the space around the entire plot. + +```{r fig.height=5, fig.width = 8.5} + +improved_graph_2 <- + improved_graph_2 + + labs( + caption = "My first original plot © " + ) + + theme( + plot.caption = element_text(size = 6, family = "press-start", color = "yellow2", + hjust = 1.7, margin = margin(b = 10)), + plot.margin = margin(7, 12, 1, 5) + ) + +improved_graph_2 +``` + + +I conclude this section by organising the plot together with annotations. The 'annotation_background' is created using 'rectGrob', which provides a rectangular background with a fill colour of grey90. For the text annotation, I employ 'textGrob' to add BMI classification information, styling it with the specified size, font family, and blue colour. + +To combine both the plot and its annotations for final presentation, I utilise grid.arrange, ensuring that the main graph occupies 92% of the height while the annotation takes up the remaining 10%, aligning them in a single column (ncol = 1). + +This final improved_graph_2 now includes both the graphical data and the informative annotations integrated for better interpretation. + +```{r fig.height=5, fig.width = 8.5} + +annotation_background <- rectGrob(gp = gpar(fill = "grey90")) + +annotation_grob <- textGrob( + "BMI ≤ 18.5 'Underweight' 18.5 < BMI < 25 'Normal' 25 ≤ BMI < 30 'Overweight' BMI ≥ 30 'Obese'", + gp = gpar(fontsize = 5.5, fontfamily = "press-start", col = "blue") +) + +combined_annotation <- grobTree(annotation_background, annotation_grob) + +# To combine the plot and annotations +improved_graph_2 <- + grid.arrange(improved_graph_2, + combined_annotation, + heights = c(0.92, 0.08), + ncol = 1) + + +``` + + + +## Concluding Notes + +One of the main criticisms I received during my plot presentation was that women's sports were excluded from my initial work. To address this data loss, I considered recommendations from the class and created a fifth facet for women's sports present in both the 1960 and 2016 Summer Olympic Games. This was somewhat problematic because the BMI values for some sports overlapped, and adding an extra facet just made the plot look unbalanced. + +To remedy this, I set aside some of the men's sports and adjusted the number of facets to two. This eventually served to enable me to pick those sports which would overlap very little - or not at all. + +Another issue raised during the presentation was related to the annotation. When a classmate asked about what the BMI values represented, I realized that my annotation might not be as visible to the reader as I intended. Therefore, I slightly increased its size to enhance visibility. But really, it is not that bad! :) + +Lastly, I received criticism about my legend's colorfulness and the space it occupies in the plot. I decided not to address the colorfulness; because I like it that way. The space issue was resolved when I moved to two facets and reduced the number of sports displayed. Now, I believe my plot has a better balance between the sizes of the plot and legend. + +Time for self-criticism - What I do not like very much about this plot is the spaces in the upper area of women's sports facet. I was not able to find a way around it to make it appear more filled, since it is due to the nature of data. Women's BMI values start at a low point and do not show dramatic increases that would fill the upper side of the facet. diff --git a/_projects/2024/100528259/100528259.html b/_projects/2024/100528259/100528259.html new file mode 100644 index 00000000..fec3d309 --- /dev/null +++ b/_projects/2024/100528259/100528259.html @@ -0,0 +1,2470 @@ + + + + +
+ + + + + + + + + + + + + + + +This study displays the evolution of BMI values of Olympic athletes who +competed in the Summer Olympic Games of 1960 and 2016.
+The Olympic Games have long been a platform to showcase the pinnacle of human performance and athleticism. Over the years, the physical features of athletes competing in these prestigious events have transformed significantly. This chart from Visual Capitalist displays the evolution of Olympic athletes’ sizes, illustrating trends in height and weight across sports and genders. It utilises BMI categories to place various sports to capture changes over time.
+There are several reasons as to why I picked this plot. At first sight, it attracts attention with its coloured tiles, and it looks well-organised with the facets and labeled data points. Getting above such features with a brand new design seemed challenging - and I liked the idea.
+Second, after careful exploration, I noticed that there was something wrong with the representation of sports: some sports were not present in the 1960 Summer Olympic games, however they were displayed it the facets that demonstrate 2016 games, which was precisely the moment that gave me the idea that this work was in fact not perfect.
+To not spoil all the fun by revealing the ideas, let’s proceed to data cleaning and adjusting section.
+ +# For replication graph
+
+library(tidyverse)
+library(readr)
+library(ggplot2)
+library(dplyr)
+library(ggrepel)
+library(extrafont)
+library(showtext)
+library(sysfonts)
+
+# Extras - For alternative plot
+
+library(grid)
+library(ggtext)
+library(gridExtra)
+
+
+# Fonts for the replication plot
+
+font_add_google("Crimson Pro", "crimson-pro")
+font_add_google("Outfit", "outfit")
+font_add_google("Cormorant Garamond", "cormorant-garamond")
+font_add_google("Montserrat", "Montserrat")
+font_add_google("Archivo Black", "ArchivoBlack")
+font_add_google("Piazzolla", "PiazzollaExtraLight")
+
+# Fonts for the improved plot
+
+font_add_google("Press Start 2P", "press-start")
+font_add_google("VT323", "vt323")
+font_add_google("DotGothic16", "dotgothic16")
+
+showtext_auto()
+The data set displays personal records of Olympic athletes.
+The original plot focuses on athletes who are at least 20 years old and competed in 1960 and 2016 Summer Olympic games. Taking into account their height and weight values, it then calculates BMI values of athletes and categorise them.
+In so doing, it creates four different facets and coloured tiles in each one of them. Obviously, each colour tile represents a BMI category.
+The four facets are designed to represent 1960 and 2016 Summer Olympic games, and male and female athletes.
+Reading the dataset
+data_main <- read_csv("athlete_events.csv")
+As we will only need the sports branches, Olympic game’s year information and the name, gender, and the BMI values of the athletes, I get rid of everything else.
+Some athletes have no height or weight (or either) info. This is just to have an idea of what’s going on.
+missing_values <- olympics_tidy |>
+ summarize(
+ missing_height = sum(is.na(Height)),
+ missing_weight = sum(is.na(Weight))
+ )
+
+missing_values
+# A tibble: 1 × 2
+ missing_height missing_weight
+ <int> <int>
+1 255 326
+I want to create a BMI variable for future use, as BMIs will be the main concern of this study.
+olympics_tidy <-
+ olympics_tidy |>
+ mutate(BMI = Weight / (Height / 100)^2)
+Now let’s acquire the means for height, weight, and BMI values for different branches, across genders and for the two Olympic games.
+Recoding ‘Sex’ for better readability.
+ +Although the values range from 30 to 170 for weight and from 143 to 218 for height, the graph seems to omit some observations and use different ranges.
+After a careful and (highly) manual measurement, I decided that for weight the range would be from 45 to 105, and for height it would be 155 and 205.
+This can be observed in the original graph.
+Since I’ll be viusalising ‘Gender’ and changes in ‘Height’, ‘Weight’, I will be using facets. +Let’s organise the data taking into account the new value ranges indicated above, and then proceed to labeling BMI ranges.
+To do that, I first reframe my data set indicating new ranges for Height and Weight values.
+Following this, I create a new variable named ‘BMI Label’, which will store BMI categories and their values. Then, this variable will help create tiles for each BMI category.
+Using ‘ordered = TRUE’, I ensure the logical order between ‘Underweight’, ‘Normal’, ‘Overweight’, and ‘Obese’ categories.
+olympics_reframed <- crossing(
+ Height = 155:205,
+ Weight = 45:105,
+ Sex = c("Men", "Women"),
+ Year = c(1960, 2016)
+) |>
+ mutate(
+ BMI = Weight / (Height / 100)^2,
+ `BMI Label` = factor(case_when(
+ BMI <= 18.5 ~ "Underweight",
+ BMI > 18.5 & BMI <= 24.9 ~ "Normal",
+ BMI > 24.9 & BMI <= 29.9 ~ "Overweight",
+ BMI > 29.9 ~ "Obese"
+ ),
+ levels = c("Underweight", "Normal", "Overweight", "Obese"),
+ ordered = TRUE)
+ )
+Let’s begin with stating x any y axes. I will use ‘weight_mean’ and ‘height_mean’ from ‘olympics_tidy’ data set.
+Next, ‘geom_tile’ will enable me to create tiles based on BMI categories that I have indicated in ‘BMI Label’ column in ‘olympics_reframed’.
+I have used a separate data set called ‘olympics_reframed’ for my own ease. All these information could probably work out together in a single data set.
+The filling of the tiles will be automatic. I will be changing it in the next step.
+Here, I enhance the appearance of my existing graph by customizing the fill colours for different BMI categories. I use scale_fill_manual() to manually set specific colours for each category, such as “Underweight,” “Normal,” “Overweight,” and “Obese,” assigning colours as close as possible to the original plot.
+I also adjust the legend with setting ‘guides()’ to NULL in order to remove the title.
+final_graph <-
+ final_graph +
+ scale_fill_manual(values = c(
+ "Underweight" = "cadetblue2",
+ "Normal" = "darkolivegreen2",
+ "Overweight" = "lightgoldenrod1",
+ "Obese" = "lightcoral")) +
+ guides(fill = guide_legend(title = NULL))
+
+final_graph
+It’s time to set the title and subtitle for of the plot and add axes titles. Lastly, I add the caption at the bottom right of the original plot indicating the source of data.
+final_graph <-
+ final_graph +
+ labs(
+ x = "WEIGHT (KG)",
+ y = "HEIGHT (CM)",
+ title = "OLYMPIC ATHLETES ARE GETTING BIGGER",
+ subtitle = "Mean weight and height by sport at the 1960 and 2016 Summer Olympics. Values calculated \nfor athletes 20 years or older, compared to the BMI classification for the general population.",
+ caption = "Source: Kaggle • Graphic: Georgios Karamanis"
+ )
+final_graph
+No party without our good old friends: data points! Using geom_point, I invite them to the party.
+final_graph <-
+ final_graph +
+ geom_point()
+
+final_graph
+Since I will be comparing 1960 and 2016 Summer Olympics while displaying BMI values based on gender, I will need 4 different facets.
+The ‘Sex’ variable has two values (Men and Women), and the ‘Year’ variable only has 1960 and 2016. Therefore, four facets.
+Lastly, I tell R to not add extra space to x and y axes using coord_cartesian(expand = FALSE).
+final_graph <-
+ final_graph +
+ facet_grid(Sex ~ Year) +
+ coord_cartesian(expand = FALSE)
+
+final_graph
+In the original plot, the legend is located at the top, right below the subtitle. First I relocate it on the top and ensure that it is centered. Then, I place it in a horizontal way. Lastly, I ensure that the legend key sizes somewhat match to that of original plot.
+Using strip.text, I ensure the colour for texts in the strips and match their size to the original plot.
+Using plot.title and plot.subtitle, I set sizes, alignments, and margins of the title and subtitle. For the title, I use ‘ArchivoBlack’, and for the subtitle, I use ‘Montserrat’ (both from Google fonts).
+After setting the margins of the plot and the caption that is at the bottom right, I move to indicating sizes for the legend, x and y axes (for both axis texts and axis titles).
+For axis titles, I use ‘cormorant-garamond’ from Google fonts.
+Lastly, I change the background colour for facet strips and set the colour for outer borders to = NA. This way, it will look as if I have no borders for the strips.
+final_graph <-
+ final_graph +
+ theme(
+ strip.text = element_text(color = "black", size = 20),
+ plot.title = element_text(size = 36, hjust = 0.5, family = "ArchivoBlack",
+ margin = margin(t = 10, b = 5, l = 15, r = 15)),
+ plot.subtitle = element_text(size = 18, hjust = 0.5, family = "Montserrat",
+ face = "plain", lineheight = 1, margin = margin(t = 5, b = 10, l = 15, r = 15), color = "grey15"),
+ plot.margin = margin(20, 10, 10, 10),
+ plot.caption = element_text(size = 16, hjust = 1, lineheight = 1.2),
+ legend.text = element_text(size = 14),
+ axis.text.x = element_text(size = 14),
+ axis.text.y = element_text(size = 14),
+ axis.title.x = element_text(size = 16, family = "cormorant-garamond"),
+ axis.title.y = element_text(size = 16, family = "cormorant-garamond"),
+ strip.background = element_rect(fill = "gray95", color = NA),
+ plot.background = element_rect(fill = "gray95"),
+ legend.background = element_rect(fill = "gray95")
+)
+
+
+
+
+final_graph
+The only thing that is missing right now (pretty much) is labeling data points.
+To make it look a bit more like the original plot, I begin with changing the colour, size, and the transparency level of my data points. (see geom_point(…)).
+Thanks to geom_text_repel, I will be able to put labels to the data points. Since each data point represents an Olympic sport, these texts will be name of Olympic sports.
+Firstly, I ensure all sports names are in capital letters, using ‘label = toupper(Spor)’.
+Next, I set the font of data labels to ‘Montserrat’ and set their size to 4.
+To have a clean and readable plot, I had to ensure that text labels do not overlap. Using ‘nudge_x’ and ‘nudge_y’, I slightly moved text labels horizontally and vertically. ‘box.padding’, on the other hand, adds padding around each label’s box to create more space between the text and other elements, so it helps the cause dearly. Lastly, using ‘max.overlaps’, I ensured to limit the maximum number of overlapping labels to be shown; so that excess overlaps are suppressed.
+‘min.segment.length’ helps me to somewhat restrict as to which connecting lines will be displayed, so I set it to 1.2 and disable the ones that would be longer than that value. With ‘segment.size’, I make those connecting lines extremely thin and through ‘segment.color’, their colour is indicated as ‘grey30’.
+I use the same colour for my text labels using “color = ‘grey30’”. Lastly, I hide a possible legend for the data labels through ‘show.legend = FALSE’.
+final_graph <-
+ final_graph +
+ geom_point(color = "gray40", size = 1.5, alpha = 0.7) +
+ geom_text_repel(aes(label = toupper(Sport)),
+ family = "Montserrat",
+ size = 4,
+ nudge_y = 3,
+ nudge_x = 1.5,
+ box.padding = 0.75,
+ max.overlaps = 15,
+ min.segment.length = 1.2,
+ segment.size = 0.1,
+ segment.color = "grey30",
+ color = "grey30",
+ show.legend = FALSE)
+
+final_graph
+The initial plot displays some Olympic events that did not exist in 1960 but took place in the 2016 Olympic Games (e.g., Tennis). I don’t need to show them as they do not contribute to the study and add noise to the data, particularly in women’s sports. This is due to the fact that women’s sports were not as diverse in 1960 as they were in 2016.
+There are extremely few or no ‘Underweight’ and ‘Obese’ athletes in the initial study. Using tiles for nonexistent categories seems redundant.
+Another issue with the plot is that the data point trends are not easy to follow. For any sport, it is quite difficult to determine whether it has experienced an increase or decrease over time.
+As the original plot’s biggest weakness is the difficulty in tracking individual sports’ trends over time, I try to address this issue by using a slope chart as my alternative plot. A slope chart is useful when comparing trends over time, using two data points for each observation and a line that connects those two data points.
Most sports branches have somewhat similar BMI values, which eventually leads to a plot that has many overlappings. A slope chart is likely to reduce those overlappings while still emphasising trends.
I can change the title, coming up with a more striking one. Also I can use a fancy font. The font and colouring has actually been my favourite part for this study.
Who does not love 8-bit stuff? Well, not me. The inspiration of my alternative plot will be 8-bit aesthetics. I had to do it, because for my whole childhood I played ‘Track & Field’!
The closely clustered nature of the data points and the lack of direct connections between 1960 and 2016 data for each sport make trend analysis challenging. I will try to address these issues in my slope chart, but highly close numbers of BMI values pose threats to clarity of my plot.
+In my first attempt, I completely excluded women’s sports that were present in both Olympic games (which amounted to 6 branches in total).
+In the second attempt, I’ll omit some of the men’s sports branches and include women’s sports into my plot. But for now, let’s move onto the first one.
+Let’s first read the data and start with the improvement.
+athletes <- read_csv("athlete_events.csv")
+As the original study suggests, I filter 1960 and 2016 Summer Olympic Games, and athletes that were at least 20 years old. +Then I move onto create a BMI variable for the athletes in my dataset.
+In this chunk of code, I create two subsets of my dataset that will display sports branches that male athletes competed in 1960 and 2016 Summer Olmypics.
+Lastly, I want to see which sports are common in both Olympic games to prevent data noise in the future. Using ‘pull’, I acquire those sports for each year and then store their intersection in ‘common_sports_men’.
+To create a subset of data with only male athletes that competed in intersecting sports, I filter my dataset. Then, for a tidy table, I drop unnecessary columns.
+Now I am at the point where I need to create an average BMI value for each sports in my dataset. I group BMI values by ‘Year’ and ‘Sports’ as they will be displayed on the basis of competition year and event type.
+Then I sort them in alphabetical order for my own use.
+This is where I manually group sports for each facet, considering their BMI values in 1960 and 2016. This way, I am planning to reduce overlappings as efficiently as possible.
+avg_bmi_men <- avg_bmi_men |>
+ mutate(Facet_Group = case_when(
+ Sport %in% c("Water Polo", "Sailing", "Hockey", "Diving", "Football") ~ "1",
+ Sport %in% c("Shooting", "Modern Pentathlon", "Rowing", "Cycling", "Wrestling") ~ "2",
+ Sport %in% c("Weightlifting", "Swimming", "Athletics", "Boxing", "Canoeing") ~ "3",
+ TRUE ~ "4"
+ ))
+This was one of my favourite parts doing my alternative plot. Retro work!!
+The aesthetics of my plot will be providing a retro look.
+So, I assign colours for each sports so that they will (try to) not confuse the reader.
+retro_colours <- c(
+ "Athletics" = "#00a046",
+ "Basketball" = "#FF6F20",
+ "Boxing" = "#fcab98",
+ "Canoeing" = "#7fcdbb",
+ "Cycling" = "#bcbddc",
+ "Diving" = "#000000",
+ "Equestrianism" = "#a65628",
+ "Fencing" = "#A3D900",
+ "Football" = "#ec213e",
+ "Gymnastics" = "#d8b365",
+ "Hockey" = "#F1C40F",
+ "Modern Pentathlon" = "#FF448C",
+ "Rowing" = "#10EA00",
+ "Sailing" = "#636363",
+ "Shooting" = "#D167BF",
+ "Swimming" = "#4400AA",
+ "Weightlifting" = "#7A9A01",
+ "Wrestling" = "#80BFFF",
+ "Water Polo" = "#0057A1"
+)
+From the avg_bmi_men table, I extract BMI values for both years and genders. Next, I specify the x and y axes for my plot. Lastly, I ensure that the grouping is based on the type of sport.
+The geom_line function helps me create lines that connect values from two Olympic years. Using the ‘linetype’ parameter, I apply dotted lines to give an 8-bit feel to the plot. Similarly, by using ‘geom_point’, I employ square-shaped data points and adjust their size.
+With ‘facet_wrap’, I arrange the sports in facets that I matched earlier. By using ‘ncol = 2’, I display all my facets in two columns.
+improved_graph <-
+ ggplot(avg_bmi_men, aes(x = factor(Year), y = Avg_BMI, group = Sport, color = Sport)) +
+ geom_line(linetype = "dotted", linewidth = 1.5) +
+ geom_point(size = 2, shape = 15) +
+ facet_wrap(~ Facet_Group, ncol = 2)
+
+improved_graph
+Quickly changing the structure of the legend into a two-column one. I will be using the space below the legend.
+improved_graph <-
+ improved_graph +
+ guides(color = guide_legend(ncol = 2))
+
+improved_graph
+It is time I name the title, subtitle, and caption, as well as the x and y axes.
+I also change the randomly assigned colors of data points and lines to the colors I manually picked and stored in ‘retro_colours’ earlier.
+As the BMI values range between 20 and 29, I customize the y-axis, ensuring that the values increment by 1 unit (scale_y_continuous).
+improved_graph <-
+ improved_graph +
+ labs(title = "Avg. BMI of Male Olympic Athletes \nby Sport (1960 vs 2016)",
+ subtitle = "Have they lowkey gotten fatter?",
+ caption = "My first ever original plot © ",
+ x = "Year",
+ y = "Average BMI",
+ color = "Olympic sports") +
+ scale_color_manual(values = retro_colours) +
+ scale_y_continuous(breaks = seq(20, 29, by = 1))
+
+improved_graph
+I customise the plot title using ‘plot.title’, applying the “press-start” font for consistency, and set it to bold for emphasis.
+For the plot subtitle, I use ‘plot.subtitle’ with the “vt323” font from Google fonts, adjusting the alignment to ensure it appears centred below the title.
+improved_graph <-
+ improved_graph +
+ theme(
+ plot.title = element_text(family = "press-start", size = 10, face = "bold",
+ hjust = 0.2, color = "yellow2"),
+ plot.subtitle = element_text(family = "vt323", size = 14, hjust = 0.5, color = "#abdbe3")
+ )
+
+improved_graph
+I customise the axis text using ‘axis.text’, setting the colour to light blue and applying the “press-start” font for a consistent style.
+The axis titles are styled with ‘axis.title’, making them bold and yellow to ensure they stand out.
+improved_graph <-
+ improved_graph +
+ theme(
+ axis.text = element_text(size = 5, color = "lightblue1", family = "press-start"),
+ axis.title = element_text(size = 8, face = "bold", family = "press-start", color = "yellow2"),
+ )
+
+improved_graph
+I set the plot background colour using ‘plot.background’, filling it with a dark grey. The panel background is customised with ‘panel.background’, applying a lighter grey and defining a subtle border for clear look of sections. For the grid lines, I adjust the major grid lines using ‘panel.grid.major’, setting their colour to a light grey.
+I also configure the minor grid lines with ‘panel.grid.minor’, ensuring they maintain a harmonious appearance with a very light grey.
+improved_graph <-
+ improved_graph +
+ theme(
+ plot.background = element_rect(fill = "grey15"),
+ panel.background = element_rect(fill = "grey95", color ="grey95", linewidth = 2),
+ panel.grid.major = element_line(color = "grey90"),
+ panel.grid.minor = element_line(color = "grey95")
+ )
+
+improved_graph
+I position the legend on the right with ‘legend.position’, ensuring a clean layout. The legend title is styled with ‘legend.title’. I also apply the “press-start” font here. I adjust the size of the legend keys with ‘legend.key.size’. For the legend text, I use ‘legend.text’ to define the font properties and colour, enhancing readability.
+The background of the legend is refined using ‘legend.background’, to give it a light grey fill with a subtle border for better definition.
+I remove the facet strip titles by setting ‘strip.text to element_blank()’, because I do not want to see them.
+improved_graph <-
+ improved_graph +
+ theme(
+ legend.position = "right",
+ legend.title = element_text(family = "press-start",
+ size = 10,
+ color = "blue"),
+ legend.key.size = unit(0.4, "cm"),
+ legend.text = element_text(size = 5, family = "press-start", color = "blue"),
+ legend.background = element_rect(fill = "grey95", color = "grey70", size = 2),
+ strip.text = element_blank()
+ )
+
+improved_graph
+For the honours, let’s put the copyright.
+Using ‘plot.caption’ and ‘plot.margin’, I set the properties of my caption such as the font type, size colour, alignment, and margin.
+Lastly, I adjust the overall plot margins using ‘plot.margin’, defining the space around the entire plot.
+improved_graph <-
+ improved_graph +
+ labs(
+ caption = "My first ever original plot © ") +
+ theme(
+ plot.caption = element_text(size = 6, family = "press-start", color = "yellow2",
+ hjust = 2.2, margin = margin(b = 14)),
+ plot.margin = margin(7, 12, 1, 5)
+)
+
+improved_graph
+I then create an annotation background using a light grey rectangle and add text to it, explaining the BMI categories: ‘Underweight’, ‘Normal’, ‘Overweight’, and ‘Obese’, all styled with a blue color.
+Finally, I arrange the main graph alongside the annotation so that the graph occupies 92% of the height and the annotation takes up 8%, ensuring a clear and informative presentation.
+annotation_background <- rectGrob(
+ gp = gpar(fill = "grey90")
+)
+
+# Creating the annotation text
+annotation_grob <- textGrob(
+ "BMI ≤ 18.5 'Underweight' 18.5 < BMI < 25 'Normal' 25 ≤ BMI < 30 'Overweight' BMI ≥ 30 'Obese'",
+ gp = gpar(fontsize = 5,
+ fontfamily = "press-start",
+ col = "blue")
+)
+
+combined_annotation <- grobTree(annotation_background, annotation_grob)
+
+grid.arrange(improved_graph,
+ combined_annotation,
+ heights = c(0.92, 0.08),
+ ncol = 1)
+During my presentation for my initial improved plot, I received recommendations on several aspects of my work. At the end of this document, I will put an evaluation for those recommendations and what I have done to address the issues that were raised.
+Let’s start with reading the data.
+data <- read_csv("athlete_events.csv")
+I erase the columns that I will not be using.
+ +Here, I filter 1960 and 2016 Summer Olympics and athletes >= 20 years old.
+I want to detect those sports that are common in both Olympic game years for men.
+I find the intersection for men, this is essential for ensuring meaningful data representation.
+common_sports_men <- intersect(sports_1960_men, sports_2016_men)
+
+common_sports_men
+ [1] "Shooting" "Weightlifting" "Football"
+ [4] "Boxing" "Water Polo" "Hockey"
+ [7] "Wrestling" "Rowing" "Fencing"
+[10] "Athletics" "Basketball" "Cycling"
+[13] "Gymnastics" "Sailing" "Canoeing"
+[16] "Modern Pentathlon" "Swimming" "Equestrianism"
+[19] "Diving"
+Then, I do the same for women, because there is a lot of data noise at women’s sports. Vast majority of women’s sports in 2016 did not exist in 1960 Summer Olympics.
+Let’s find the intersection for women as well.
+common_sports_women <- intersect(sports_1960_women, sports_2016_women)
+
+common_sports_women
+[1] "Gymnastics" "Athletics" "Fencing" "Canoeing"
+[5] "Swimming" "Diving" "Equestrianism" "Sailing"
+Keeping only the intersecting sports will help me ensure there is no data noise.
+Now I will work on BMI values and categories. So let’s create a BMI variable first.
+filtered_data_BMI <- olympics |>
+ mutate(BMI = Weight / (Height / 100)^2)
+I calculate average BMI values for each sport.
+For the three facets that I will create in my plot, I specify sports.
+avg_BMI <- avg_BMI |>
+ mutate(Facet_Group = case_when(
+ Sport %in% c("Water Polo", "Wrestling", "Swimming", "Shooting",
+ "Weightlifting", "Gymnastics", "Equestrianism", "Canoeing", "Sailing", "Hockey") ~ "1",
+ Sport %in% c("Gymnastics", "Athletics", "Fencing", "Canoeing",
+ "Swimming", "Diving", "Equestrianism", "Sailing") ~ "2",
+ ))
+Picking colours again!
+retro_colours_2 <- c(
+ "Athletics" = "#00a046",
+ "Basketball" = "#FF6F20",
+ "Boxing" = "#fcab98",
+ "Canoeing" = "#7fcdbb",
+ "Cycling" = "#bcbddc",
+ "Diving" = "#FF448C",
+ "Gymnastics" = "#d8b365",
+ "Hockey" = "#F1C40F",
+ "Rowing" = "#10EA00",
+ "Sailing" = "#4400AA",
+ "Shooting" = "#D167BF",
+ "Swimming" = "#0057A1",
+ "Weightlifting" = "#7A9A01",
+ "Wrestling" = "#80BFFF",
+ "Water Polo" = "#636363"
+)
+In this section, I define two character vectors for male and female Olympic sports, categorizing them into two groups for men and one for women. These sports are chosen on the basis of their presence in both Olympic Games. In men’s sports, I manually picked those that will display little or no overlap.
+Next, I enhance the avg_BMI dataset by creating a new column named Facet_Group, which assigns a numeric identifier based on the athlete’s sex and sport. Male athletes participating in sports from men_sports receive a “1,” and female athletes are tagged with a “2” for sports from women_sports.
+Afterwards, I filter the data to keep only the rows with defined Facet_Group values, resulting in the filtered_avg_BMI dataset containing relevant entries.
+The next step involves visualizing the data with ggplot, where I create improved_graph_2. Here, I specify the x and y axes, grouping by sport and color coding them accordingly. I utilize the geom_line function for connecting lines and geom_point for square-shaped data points. This time, I apply facet_grid to arrange the facets flexibly, with free y-axis scales and labeled to distinguish male and female groups.
+improved_graph_2 <- ggplot(filtered_avg_BMI, aes(x = factor(Year), y = avg_BMI,
+ group = Sport, color = Sport)) +
+ geom_line(linewidth = 1.5, linetype = "dotted") +
+ geom_point(size = 2, shape = 15) +
+ facet_grid(~ Facet_Group, scales = "free",
+ labeller = as_labeller(c("1" = "Male", "2" = "Female")))
+
+improved_graph_2
+Once the basic structure is in place, I add the title, subtitle, and axis labels. I also customize the colors for the lines and points using retro_colours_2 and ensure the y-axis is properly segmented with scale_y_continuous.
+improved_graph_2 <-
+ improved_graph_2 +
+ labs(title = "Avg. BMI of Olympic Athletes \nby Olympic Sports (1960 vs 2016)",
+ subtitle = "Have they lowkey gotten fatter?",
+ x = "Year",
+ y = "Average BMI",
+ color = "Olympic Sports") +
+ scale_color_manual(values = retro_colours_2) + # Custom colors for the sports
+ scale_y_continuous(breaks = seq(20, 29, by = 1))
+
+improved_graph_2
+Using ‘plot.title = element(…)’ and ‘plot.subtitle = element(…)’, I set the type, size, alignment, and the colour of fonts for my title and subtitle.
+improved_graph_2 <-
+ improved_graph_2 +
+ theme(
+ plot.title = element_text(family = "press-start", size = 10, face = "bold",
+ hjust = 0.5, color = "yellow2"),
+ plot.subtitle = element_text(family = "vt323", size = 14, hjust = 0.5, color = "#abdbe3")
+ )
+
+improved_graph_2
+The code ‘plot.background = element_rect(fill = “grey15”)’ sets the plot’s background to dark grey.
+Next, through ‘panel.spacing = unit(1, “lines”)’ I adjust the space between panels, making each section more distinct.
+By using panel.background = element_rect(fill = “grey95”, color =“grey95”, linewidth = 2), I give each panel a light grey background and a subtle border to clearly define the sections. I set major grid lines to light grey with ‘panel.grid.major = element_line(color = “grey90”)’, ensuring they support readability without being distracting. For minor grid lines, ‘panel.grid.minor = element_line(color = “grey95”)’ maintains a consistent layout.
+improved_graph_2 <-
+ improved_graph_2 +
+ theme(
+ plot.background = element_rect(fill = "grey15"),
+ panel.spacing = unit(1, "lines"),
+ panel.background = element_rect(fill = "grey95", color ="grey95", linewidth = 2),
+ panel.grid.major = element_line(color = "grey90"),
+ panel.grid.minor = element_line(color = "grey95")
+ )
+
+improved_graph_2
+I adjust the axis text with ‘axis.text’ to enhance visibility and set the colour to light blue. Next, I rotate the y-axis text using ‘axis.text.y’ at an angle for better readability. For the x-axis, I apply ‘axis.text.x’ to angle the text, ensuring clarity for closely spaced labels. This is different from the initial improved plot.
+Lastly, I customise the axis titles with ‘axis.title’, making them bold and using a yellow colour to maintain a consistent style.
+improved_graph_2 <-
+ improved_graph_2 +
+ theme(
+ axis.text = element_text(size = 5, color = "lightblue1", family = "press-start"),
+ axis.text.y = element_text( angle = 45),
+ axis.text.x = element_text( angle = 25),
+ axis.title = element_text(size = 8, face = "bold", family = "press-start",
+ color = "yellow2")
+ )
+improved_graph_2
+I position the legend on the right using ‘legend.position’. The legend title is customised with ‘legend.title’, I use the font “press-start” font for consistency. Then, I adjust the size of legend keys with ‘legend.key.size’.For the legend text, I use ‘legend.text’ to set the font properties and colour, enhancing its visibility. The background of the legend is changed with ‘legend.background’, providing a light grey fill and a subtle border. I modify the facet strip titles with ‘strip.text’, and it makes them bold and blue for emphasis.
+Finally, guides() helps format the legend layout, positioning it into two columns for better organisation.
+improved_graph_2 <-
+ improved_graph_2 +
+ theme(
+ legend.position = "right",
+ legend.title = element_text(family = "press-start", size = 10, color = "blue"),
+ legend.key.size = unit(0.4, "cm"),
+ legend.text = element_text(size = 5, family = "press-start", color = "gray15"),
+ legend.background = element_rect(fill = "grey95", color = "grey70", size = 2),
+ strip.text = element_text(size = 7, face = "bold", family = "press-start", color = "blue")
+ ) +
+ guides(color = guide_legend(ncol = 2))
+
+
+improved_graph_2
+For the honours, let’s put the copyright.
+Using ‘plot.caption’ and ‘plot.margin’, I set the properties of my caption such as the font type, size colour, alignment, and margin. Don’t worry about the alignment of the caption for now. The issue will resolve in the next (and last) step.
+Lastly, I adjust the overall plot margins using ‘plot.margin’, defining the space around the entire plot.
+improved_graph_2 <-
+ improved_graph_2 +
+ labs(
+ caption = "My first original plot © "
+ ) +
+ theme(
+ plot.caption = element_text(size = 6, family = "press-start", color = "yellow2",
+ hjust = 1.7, margin = margin(b = 10)),
+ plot.margin = margin(7, 12, 1, 5)
+ )
+
+improved_graph_2
+I conclude this section by organising the plot together with annotations. The ‘annotation_background’ is created using ‘rectGrob’, which provides a rectangular background with a fill colour of grey90. For the text annotation, I employ ‘textGrob’ to add BMI classification information, styling it with the specified size, font family, and blue colour.
+To combine both the plot and its annotations for final presentation, I utilise grid.arrange, ensuring that the main graph occupies 92% of the height while the annotation takes up the remaining 10%, aligning them in a single column (ncol = 1).
+This final improved_graph_2 now includes both the graphical data and the informative annotations integrated for better interpretation.
+annotation_background <- rectGrob(gp = gpar(fill = "grey90"))
+
+annotation_grob <- textGrob(
+ "BMI ≤ 18.5 'Underweight' 18.5 < BMI < 25 'Normal' 25 ≤ BMI < 30 'Overweight' BMI ≥ 30 'Obese'",
+ gp = gpar(fontsize = 5.5, fontfamily = "press-start", col = "blue")
+)
+
+combined_annotation <- grobTree(annotation_background, annotation_grob)
+
+# To combine the plot and annotations
+improved_graph_2 <-
+ grid.arrange(improved_graph_2,
+ combined_annotation,
+ heights = c(0.92, 0.08),
+ ncol = 1)
+One of the main criticisms I received during my plot presentation was that women’s sports were excluded from my initial work. To address this data loss, I considered recommendations from the class and created a fifth facet for women’s sports present in both the 1960 and 2016 Summer Olympic Games. This was somewhat problematic because the BMI values for some sports overlapped, and adding an extra facet just made the plot look unbalanced.
+To remedy this, I set aside some of the men’s sports and adjusted the number of facets to two. This eventually served to enable me to pick those sports which would overlap very little - or not at all.
+Another issue raised during the presentation was related to the annotation. When a classmate asked about what the BMI values represented, I realized that my annotation might not be as visible to the reader as I intended. Therefore, I slightly increased its size to enhance visibility. But really, it is not that bad! :)
+Lastly, I received criticism about my legend’s colorfulness and the space it occupies in the plot. I decided not to address the colorfulness; because I like it that way. The space issue was resolved when I moved to two facets and reduced the number of sports displayed. Now, I believe my plot has a better balance between the sizes of the plot and legend.
+Time for self-criticism - What I do not like very much about this plot is the spaces in the upper area of women’s sports facet. I was not able to find a way around it to make it appear more filled, since it is due to the nature of data. Women’s BMI values start at a low point and do not show dramatic increases that would fill the upper side of the facet.
+
`,e.githubCompareUpdatesUrl&&(t+=`View all changes to this article since it was first published.`),t+=` + If you see mistakes or want to suggest changes, please create an issue on GitHub.
+ `);const n=e.journal;return'undefined'!=typeof n&&'Distill'===n.title&&(t+=` +Diagrams and text are licensed under Creative Commons Attribution CC-BY 4.0 with the source available on GitHub, unless noted otherwise. The figures that have been reused from other sources don’t fall under this license and can be recognized by a note in their caption: “Figure from …”.
+ `),'undefined'!=typeof e.publishedDate&&(t+=` +For attribution in academic contexts, please cite this work as
+${e.concatenatedAuthors}, "${e.title}", Distill, ${e.publishedYear}.+
BibTeX citation
+${m(e)}+ `),t}var An=Math.sqrt,En=Math.atan2,Dn=Math.sin,Mn=Math.cos,On=Math.PI,Un=Math.abs,In=Math.pow,Nn=Math.LN10,jn=Math.log,Rn=Math.max,qn=Math.ceil,Fn=Math.floor,Pn=Math.round,Hn=Math.min;const zn=['Sunday','Monday','Tuesday','Wednesday','Thursday','Friday','Saturday'],Bn=['Jan.','Feb.','March','April','May','June','July','Aug.','Sept.','Oct.','Nov.','Dec.'],Wn=(e)=>10>e?'0'+e:e,Vn=function(e){const t=zn[e.getDay()].substring(0,3),n=Wn(e.getDate()),i=Bn[e.getMonth()].substring(0,3),a=e.getFullYear().toString(),d=e.getUTCHours().toString(),r=e.getUTCMinutes().toString(),o=e.getUTCSeconds().toString();return`${t}, ${n} ${i} ${a} ${d}:${r}:${o} Z`},$n=function(e){const t=Array.from(e).reduce((e,[t,n])=>Object.assign(e,{[t]:n}),{});return t},Jn=function(e){const t=new Map;for(var n in e)e.hasOwnProperty(n)&&t.set(n,e[n]);return t};class Qn{constructor(e){this.name=e.author,this.personalURL=e.authorURL,this.affiliation=e.affiliation,this.affiliationURL=e.affiliationURL,this.affiliations=e.affiliations||[]}get firstName(){const e=this.name.split(' ');return e.slice(0,e.length-1).join(' ')}get lastName(){const e=this.name.split(' ');return e[e.length-1]}}class Gn{constructor(){this.title='unnamed article',this.description='',this.authors=[],this.bibliography=new Map,this.bibliographyParsed=!1,this.citations=[],this.citationsCollected=!1,this.journal={},this.katex={},this.publishedDate=void 0}set url(e){this._url=e}get url(){if(this._url)return this._url;return this.distillPath&&this.journal.url?this.journal.url+'/'+this.distillPath:this.journal.url?this.journal.url:void 0}get githubUrl(){return this.githubPath?'https://github.com/'+this.githubPath:void 0}set previewURL(e){this._previewURL=e}get previewURL(){return this._previewURL?this._previewURL:this.url+'/thumbnail.jpg'}get publishedDateRFC(){return Vn(this.publishedDate)}get updatedDateRFC(){return Vn(this.updatedDate)}get publishedYear(){return this.publishedDate.getFullYear()}get publishedMonth(){return Bn[this.publishedDate.getMonth()]}get publishedDay(){return this.publishedDate.getDate()}get publishedMonthPadded(){return Wn(this.publishedDate.getMonth()+1)}get publishedDayPadded(){return Wn(this.publishedDate.getDate())}get publishedISODateOnly(){return this.publishedDate.toISOString().split('T')[0]}get volume(){const e=this.publishedYear-2015;if(1>e)throw new Error('Invalid publish date detected during computing volume');return e}get issue(){return this.publishedDate.getMonth()+1}get concatenatedAuthors(){if(2
tag. We found the following text: '+t);const n=document.createElement('span');n.innerHTML=e.nodeValue,e.parentNode.insertBefore(n,e),e.parentNode.removeChild(e)}}}}).observe(this,{childList:!0})}}var Ti='undefined'==typeof window?'undefined'==typeof global?'undefined'==typeof self?{}:self:global:window,_i=f(function(e,t){(function(e){function t(){this.months=['jan','feb','mar','apr','may','jun','jul','aug','sep','oct','nov','dec'],this.notKey=[',','{','}',' ','='],this.pos=0,this.input='',this.entries=[],this.currentEntry='',this.setInput=function(e){this.input=e},this.getEntries=function(){return this.entries},this.isWhitespace=function(e){return' '==e||'\r'==e||'\t'==e||'\n'==e},this.match=function(e,t){if((void 0==t||null==t)&&(t=!0),this.skipWhitespace(t),this.input.substring(this.pos,this.pos+e.length)==e)this.pos+=e.length;else throw'Token mismatch, expected '+e+', found '+this.input.substring(this.pos);this.skipWhitespace(t)},this.tryMatch=function(e,t){return(void 0==t||null==t)&&(t=!0),this.skipWhitespace(t),this.input.substring(this.pos,this.pos+e.length)==e},this.matchAt=function(){for(;this.input.length>this.pos&&'@'!=this.input[this.pos];)this.pos++;return!('@'!=this.input[this.pos])},this.skipWhitespace=function(e){for(;this.isWhitespace(this.input[this.pos]);)this.pos++;if('%'==this.input[this.pos]&&!0==e){for(;'\n'!=this.input[this.pos];)this.pos++;this.skipWhitespace(e)}},this.value_braces=function(){var e=0;this.match('{',!1);for(var t=this.pos,n=!1;;){if(!n)if('}'==this.input[this.pos]){if(0 =k&&(++x,i=k);if(d[x]instanceof n||d[T-1].greedy)continue;w=T-x,y=e.slice(i,k),v.index-=i}if(v){g&&(h=v[1].length);var S=v.index+h,v=v[0].slice(h),C=S+v.length,_=y.slice(0,S),L=y.slice(C),A=[x,w];_&&A.push(_);var E=new n(o,u?a.tokenize(v,u):v,b,v,f);A.push(E),L&&A.push(L),Array.prototype.splice.apply(d,A)}}}}}return d},hooks:{all:{},add:function(e,t){var n=a.hooks.all;n[e]=n[e]||[],n[e].push(t)},run:function(e,t){var n=a.hooks.all[e];if(n&&n.length)for(var d,r=0;d=n[r++];)d(t)}}},i=a.Token=function(e,t,n,i,a){this.type=e,this.content=t,this.alias=n,this.length=0|(i||'').length,this.greedy=!!a};if(i.stringify=function(e,t,n){if('string'==typeof e)return e;if('Array'===a.util.type(e))return e.map(function(n){return i.stringify(n,t,e)}).join('');var d={type:e.type,content:i.stringify(e.content,t,n),tag:'span',classes:['token',e.type],attributes:{},language:t,parent:n};if('comment'==d.type&&(d.attributes.spellcheck='true'),e.alias){var r='Array'===a.util.type(e.alias)?e.alias:[e.alias];Array.prototype.push.apply(d.classes,r)}a.hooks.run('wrap',d);var l=Object.keys(d.attributes).map(function(e){return e+'="'+(d.attributes[e]||'').replace(/"/g,'"')+'"'}).join(' ');return'<'+d.tag+' class="'+d.classes.join(' ')+'"'+(l?' '+l:'')+'>'+d.content+''+d.tag+'>'},!t.document)return t.addEventListener?(t.addEventListener('message',function(e){var n=JSON.parse(e.data),i=n.language,d=n.code,r=n.immediateClose;t.postMessage(a.highlight(d,a.languages[i],i)),r&&t.close()},!1),t.Prism):t.Prism;var d=document.currentScript||[].slice.call(document.getElementsByTagName('script')).pop();return d&&(a.filename=d.src,document.addEventListener&&!d.hasAttribute('data-manual')&&('loading'===document.readyState?document.addEventListener('DOMContentLoaded',a.highlightAll):window.requestAnimationFrame?window.requestAnimationFrame(a.highlightAll):window.setTimeout(a.highlightAll,16))),t.Prism}();e.exports&&(e.exports=n),'undefined'!=typeof Ti&&(Ti.Prism=n),n.languages.markup={comment://,prolog:/<\?[\w\W]+?\?>/,doctype://i,cdata://i,tag:{pattern:/<\/?(?!\d)[^\s>\/=$<]+(?:\s+[^\s>\/=]+(?:=(?:("|')(?:\\\1|\\?(?!\1)[\w\W])*\1|[^\s'">=]+))?)*\s*\/?>/i,inside:{tag:{pattern:/^<\/?[^\s>\/]+/i,inside:{punctuation:/^<\/?/,namespace:/^[^\s>\/:]+:/}},"attr-value":{pattern:/=(?:('|")[\w\W]*?(\1)|[^\s>]+)/i,inside:{punctuation:/[=>"']/}},punctuation:/\/?>/,"attr-name":{pattern:/[^\s>\/]+/,inside:{namespace:/^[^\s>\/:]+:/}}}},entity:/?[\da-z]{1,8};/i},n.hooks.add('wrap',function(e){'entity'===e.type&&(e.attributes.title=e.content.replace(/&/,'&'))}),n.languages.xml=n.languages.markup,n.languages.html=n.languages.markup,n.languages.mathml=n.languages.markup,n.languages.svg=n.languages.markup,n.languages.css={comment:/\/\*[\w\W]*?\*\//,atrule:{pattern:/@[\w-]+?.*?(;|(?=\s*\{))/i,inside:{rule:/@[\w-]+/}},url:/url\((?:(["'])(\\(?:\r\n|[\w\W])|(?!\1)[^\\\r\n])*\1|.*?)\)/i,selector:/[^\{\}\s][^\{\};]*?(?=\s*\{)/,string:{pattern:/("|')(\\(?:\r\n|[\w\W])|(?!\1)[^\\\r\n])*\1/,greedy:!0},property:/(\b|\B)[\w-]+(?=\s*:)/i,important:/\B!important\b/i,function:/[-a-z0-9]+(?=\()/i,punctuation:/[(){};:]/},n.languages.css.atrule.inside.rest=n.util.clone(n.languages.css),n.languages.markup&&(n.languages.insertBefore('markup','tag',{style:{pattern:/(
+
+
+ ${e.map(l).map((e)=>`
`)}}const Mi=`
+d-citation-list {
+ contain: layout style;
+}
+
+d-citation-list .references {
+ grid-column: text;
+}
+
+d-citation-list .references .title {
+ font-weight: 500;
+}
+`;class Oi extends HTMLElement{static get is(){return'd-citation-list'}connectedCallback(){this.hasAttribute('distill-prerendered')||(this.style.display='none')}set citations(e){x(this,e)}}var Ui=f(function(e){var t='undefined'==typeof window?'undefined'!=typeof WorkerGlobalScope&&self instanceof WorkerGlobalScope?self:{}:window,n=function(){var e=/\blang(?:uage)?-(\w+)\b/i,n=0,a=t.Prism={util:{encode:function(e){return e instanceof i?new i(e.type,a.util.encode(e.content),e.alias):'Array'===a.util.type(e)?e.map(a.util.encode):e.replace(/&/g,'&').replace(/e.length)break tokenloop;if(!(y instanceof n)){c.lastIndex=0;var v=c.exec(y),w=1;if(!v&&f&&x!=d.length-1){if(c.lastIndex=i,v=c.exec(e),!v)break;for(var S=v.index+(g?v[1].length:0),C=v.index+v[0].length,T=x,k=i,p=d.length;T
+
+`);class Ni extends ei(Ii(HTMLElement)){renderContent(){if(this.languageName=this.getAttribute('language'),!this.languageName)return void console.warn('You need to provide a language attribute to your
Footnotes
+
+`,!1);class Fi extends qi(HTMLElement){connectedCallback(){super.connectedCallback(),this.list=this.root.querySelector('ol'),this.root.style.display='none'}set footnotes(e){if(this.list.innerHTML='',e.length){this.root.style.display='';for(const t of e){const e=document.createElement('li');e.id=t.id+'-listing',e.innerHTML=t.innerHTML;const n=document.createElement('a');n.setAttribute('class','footnote-backlink'),n.textContent='[\u21A9]',n.href='#'+t.id,e.appendChild(n),this.list.appendChild(e)}}else this.root.style.display='none'}}const Pi=ti('d-hover-box',`
+
+
+