-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
362 lines (275 loc) · 11.4 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
library(dplyr)
library(tablespan)
```
# tablespan
<!-- badges: start -->
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
[![CRAN status](https://www.r-pkg.org/badges/version/tablespan)](https://CRAN.R-project.org/package=tablespan)
[![Total Downloads](https://cranlogs.r-pkg.org/badges/grand-total/tablespan)](https://cranlogs.r-pkg.org/badges/grand-total/tablespan)
<!-- badges: end -->
> Create satisficing tables in R the formula way.
The objective of `tablespan` is to provide a "good enough" approach to creating tables by
leveraging R's formulas.
`tablespan` builds on the awesome packages [`openxlsx`](https://ycphs.github.io/openxlsx/) and [`gt`](https://gt.rstudio.com/), which allows tables created with `tablespan`
to be exported to the following formats:
1. **Excel** (using [`openxlsx`](https://ycphs.github.io/openxlsx/))
2. **HTML** (using [`gt`](https://gt.rstudio.com/))
3. **LaTeX** (using [`gt`](https://gt.rstudio.com/))
4. **RTF** (using [`gt`](https://gt.rstudio.com/))
## Installation
To install `tablespan` from CRAN use:
```{r, eval=FALSE}
install.packages("tablespan")
```
The development version of `tablespan` can be installed from GitHub with:
```{r, eval=FALSE}
library(remotes)
remotes::install_github("jhorzek/tablespan")
```
## Introduction
R has a large set of great packages that allow you to create and export tables
that look exactly like you envisioned. However, sometimes you may just need a
good-enough table that is easy to create and share with others. This is where
`tablespan` can be of help.
Let's assume that we want to share the following table:
```{r}
library(dplyr)
data("mtcars")
summarized_table <- mtcars |>
group_by(cyl, vs) |>
summarise(N = n(),
mean_hp = mean(hp),
sd_hp = sd(hp),
mean_wt = mean(wt),
sd_wt = sd(wt))
print(summarized_table)
```
We don't want to share the table as is - the variable names are all a bit technical
and the table could need some spanners summarizing columns. So, we want
to share a table that looks something like this:
```
| | Horse Power | Weight |
| Cylinder | Engine | Mean | SD | Mean | SD |
| -------- | ------ | ----- | --- | ---- | -- |
| | |
```
`tablespan` allows us to create this table with a single formula.
### Creating a Basic Table
In `tablespan`, the table headers are defined with a formula. For example,
`cyl ~ mean_hp + sd_hp` defines a table with `cyl` as the
row names and `mean_hp` and `sd_hp` as columns:
```{r}
library(tablespan)
tablespan(data = summarized_table,
formula = cyl ~ mean_hp + sd_hp)
```
Note that the row names (`cyl`) are in a separate block to the left.
### Adding Spanners
Spanners are defined using braces and spanner names. For example, the
following defines a spanner for `mean_hp` and `sd_hp` with the name `Horsepower`:
`cyl ~ (Horsepower = mean_hp + sd_hp)`:
```{r}
tablespan(data = summarized_table,
formula = cyl ~ (Horsepower = mean_hp + sd_hp))
```
Spanners can also be nested:
```{r}
tablespan(data = summarized_table,
formula = cyl ~ (Horsepower = (Mean = mean_hp) + (SD = sd_hp)))
```
### Renaming Columns
Variable names in an R `data.frame` are often very technical (e.g., `mean_hp` and `sd_hp`).
When sharing the table, we may want to replace those names. In the example above,
we may want to replace `mean_hp` and `sd_hp` with "Mean" and "SD". In
`tablespan` renaming variables is achieved with `new_name:old_name`.
For example, `cyl ~ (Horsepower = Mean:mean_hp + SD:sd_hp)` renames `mean_hp` to
`Mean` and `sd_hp` to `SD`:
```{r}
tablespan(data = summarized_table,
formula = cyl ~ (Horsepower = Mean:mean_hp + SD:sd_hp))
```
### Creating the Full Table
The combination of row names, spanners, and renaming of variables allows creating
the full table:
```{r}
library(dplyr)
library(tablespan)
data("mtcars")
summarized_table <- mtcars |>
group_by(cyl, vs) |>
summarise(N = n(),
mean_hp = mean(hp),
sd_hp = sd(hp),
mean_wt = mean(wt),
sd_wt = sd(wt))
tbl <- tablespan(data = summarized_table,
formula = Cylinder:cyl + Engine:vs ~
N +
(`Horse Power` = Mean:mean_hp + SD:sd_hp) +
(`Weight` = Mean:mean_wt + SD:sd_wt),
title = "Motor Trend Car Road Tests",
subtitle = "A table created with tablespan",
footnote = "Data from the infamous mtcars data set.")
tbl
```
## Exporting to Excel
Tables created with `tablespan` can now be translated to xlsx tables with [`openxlsx`](https://ycphs.github.io/openxlsx/) using the `as_excel` function:
```{r}
# as_excel creates an openxlsx workbook
wb <- as_excel(tbl = tbl)
# Save the workbook as an xlsx file:
# openxlsx::saveWorkbook(wb,
# file = "cars.xlsx",
# overwrite = TRUE)
```
![](man/figures/tablespan_example_cars.png)
### Styling
While `tablespan` provides limited styling options, some elements can be adjusted.
For example, we may want to print some elements in bold or format numbers differently.
In `tablespan`, styling happens when translating the table to an `openxlsx` workbook
with `as_excel`.
To this end, `tablespan` provides a `styles` argument.
#### Changing the Overall Look
The easiest way to customize tables is to change the default color scheme.
The function `tbl_styles` provides control over most elements in the table,
but in many cases `style_color` may be sufficient. The following creates a
table with teal-colored backgrounds for the title, header, and row names:
```{r}
wb <- as_excel(tbl = tbl,
styles = style_color(primary_color = "#008080"))
# Save the workbook as an xlsx file:
# openxlsx::saveWorkbook(wb,
# file = "cars.xlsx",
# overwrite = TRUE)
```
![](man/figures/tablespan_example_cars_color.png)
Similarly, a dark background can be defined as follows:
```{r}
wb <- as_excel(tbl = tbl,
styles = style_color(primary_color = "#000000"))
# Save the workbook as an xlsx file:
# openxlsx::saveWorkbook(wb,
# file = "cars.xlsx",
# overwrite = TRUE)
```
#### Formatting Cells
Let's assume we want all `mean_hp` values with a value $\geq 100$ to be printed
in bold. To this end, we first create a new style object using `openxlsx`:
```{r}
bold <- openxlsx::createStyle(textDecoration = "bold")
```
Next, we create a cell style with `tablespan`:
```{r}
hp_ge_100 <- cell_style(rows = which(summarized_table$mean_hp >= 100),
colnames = "mean_hp",
style = bold,
gridExpand = FALSE)
```
Note that we specify the indices of the rows that we want to be in bold and the
column name of the item.
Finally, we pass this style as part of a list to `as_excel`:
```{r}
# as_excel creates an openxlsx workbook
wb <- as_excel(tbl = tbl,
styles = tbl_styles(cell_styles = list(hp_ge_100)))
# Save the workbook as an xlsx file:
# openxlsx::saveWorkbook(wb,
# file = "cars.xlsx",
# overwrite = TRUE)
```
![](man/figures/tablespan_example_cars_styled.png)
#### Formatting Data Types
`tablespan` also allows formatting specific data types. Let's assume that we want
to round all doubles to 3 instead of the default 2 digits.
To this end, we use the `create_data_styles` function, where we specify (1) a
function that checks for the data type we want to style (here `is.double`) and
(2) a style for all columns that match that style:
```{r}
double_style <- create_data_styles(double = list(test = is.double,
style = openxlsx::createStyle(numFmt = "0.000")))
wb <- as_excel(tbl = tbl, styles = tbl_styles(data_styles = double_style))
# Save the workbook as an xlsx file:
# openxlsx::saveWorkbook(wb,
# file = "cars.xlsx",
# overwrite = TRUE)
```
![](man/figures/tablespan_example_cars_styled_data.png)
## Exporting to HTML, LaTeX, and RTF
Tables created with `tablespan` can also be exported to `gt` which allows saving as HTML, LaTeX, or RTF file. To this end, we simply have to call `as_gt` on our table:
```{r, include=FALSE}
# Translate to gt:
gt_tbl <- as_gt(tbl = tbl)
```
```{r, eval=FALSE}
# Translate to gt:
gt_tbl <- as_gt(tbl = tbl)
gt_tbl
```
<p align="center">
<img src="man/figures/tablespan_example_gt_cars.png" alt="Standard table" width="50%">
</p>
### Styling Great Tables
The `gt` package provides a wide range of functions to adapt the style of the
table created with `as_gt`. For instance, `opt_stylize` adds a pre-defined style
to the entire table:
```{r, eval=FALSE}
gt_tbl |>
gt::opt_stylize(style = 6,
color = 'gray')
```
<p align="center">
<img src="man/figures/tablespan_example_gt_cars_styled.png" alt="Styled table" width="50%">
</p>
When adapting the `gt` object, there is an important detail to keep in mind: To
ensure that each table spanner has a unique ID, `tablespan` will create IDs that
differ from the text shown in the spanner. To demonstrate this, Let's assume
that we want to add a spanner above `Horse Power` and `Weight`:
```{r, error=TRUE}
gt_tbl |>
gt::tab_spanner(label = "New Spanner",
spanners = c("Horse Power", "Weight"))
```
This will throw an error because the spanner IDs are different from the spanner labels.
To get the spanner IDs, use `gt::tab_info()`:
```{r, eval = FALSE}
gt_tbl |>
gt::tab_info()
```
<p align="center">
<img src="man/figures/tablespan_example_tab_info.png" alt="Table spanner IDs" width="50%">
</p>
The IDs for the spanners can be found at the very bottom. To add another spanner
above `Horse Power` and `Weight`, we have to use these IDs:
```{r, eval=FALSE}
gt_tbl |>
gt::tab_spanner(label = "New Spanner",
spanners = c("__BASE_LEVEL__Horse Power",
"__BASE_LEVEL__Weight"))
```
<p align="center">
<img src="man/figures/tablespan_example_new_spanner.png" alt="Table with additional spanner" width="50%">
</p>
## Tables without row names
Using `1` on the left hand side of the formula creates a table without row names.
For example, `1 ~ (Horsepower = Mean:mean_hp + SD:sd_hp)` defines
```{r}
tablespan(data = summarized_table,
formula = 1 ~ (Horsepower = Mean:mean_hp + SD:sd_hp))
```
## References
- gt: Iannone R, Cheng J, Schloerke B, Hughes E, Lauer A, Seo J, Brevoort K, Roy O (2024). gt: Easily Create Presentation-Ready Display Tables. R package version 0.11.1.9000, <https://github.com/rstudio/gt>, <https://gt.rstudio.com>.
- expss: Gregory D et al. (2024). expss: Tables with Labels in R. R package version 0.9.31, <https://gdemin.github.io/expss/>.
- tables: Murdoch D (2024). tables: Formula-Driven Table Generation. R package version 0.9.31, <https://dmurdoch.github.io/tables/>.
- openxlsx: Schauberger P, Walker A (2023). _openxlsx: Read, Write and Edit xlsx Files_. R package version 4.2.5.2,
<https://ycphs.github.io/openxlsx/>.