Use roxygen2 parser for roxy comments + lightparser for figure and tbl captions #29

Open
wants to merge 83 commits into
base: main
Changes from 10 commits
83 commits
8db3686
Awesome progress on roxy parsing!
olivroy May 17, 2024
cdd9c70
Speed things up!
olivroy May 17, 2024
d2578be
Tiny modifications to ensure comparison between the other outline files.
olivroy May 17, 2024
a887d76
Other fixes
olivroy May 17, 2024
0e4f9fc
Remove cli info
olivroy May 17, 2024
1ee6d49
Add support for outline-roxy
olivroy May 17, 2024
88bb444
Fix color in message
olivroy May 17, 2024
2f66540
Add tests to see if it works correctly.
olivroy May 17, 2024
401997c
Use `line` instead of line_id
olivroy May 17, 2024
0e74f37
Commit remaining breakage
olivroy May 17, 2024
314fa9b
Merge
olivroy May 18, 2024
de4fc3e
Merge branch
olivroy May 24, 2024
145809c
rename `o_is_object_title()` to `o_is_tab_plot_title()`
olivroy May 24, 2024
3a875df
Exclude `@keywords` and `@noRd` (will only need to exclude undocument…
olivroy May 24, 2024
f8fe284
Create `define_criteria_roxy()` to define criteria independently for …
olivroy May 24, 2024
92335a4
Update snapshots
olivroy May 24, 2024
38a71a0
Handle empty case better
olivroy May 24, 2024
97da597
Fix logic to include object titles in outline
olivroy May 24, 2024
ebc4355
Commit changes to README
olivroy May 24, 2024
d3bf724
Refine some criteria to exclude some contents or files.
olivroy May 24, 2024
557209a
Avoid index roxygen comments in tests
olivroy May 24, 2024
29375cf
Merged origin/main into roxy-parse
olivroy May 24, 2024
fa9d935
Improve table detection. Improve package version detection in news.
olivroy May 24, 2024
1db1418
Add markup for linking local issues
olivroy May 24, 2024
d172b30
Update R/outline.R
olivroy May 24, 2024
d3506b4
Last update
olivroy May 25, 2024
646fe4b
Add parent error for debugging.
olivroy May 25, 2024
e481ac3
I identified the issue. Will try to fix.
olivroy May 25, 2024
323a121
Fix `pos` and `objects` to make sure they have a common length.
olivroy May 30, 2024
910a202
Add `active_rs_doc_nav()` to navigate to Files Pane at location.
olivroy May 30, 2024
8cec6ce
Rename to `_outline` for consistency.
olivroy May 30, 2024
406a0c4
Better topic name detection
olivroy May 30, 2024
c0e9063
Improve regex to allow for title to be wrapped in function + family t…
olivroy May 30, 2024
8badd76
Improve `proj_file()` if exact match in `proj`.
olivroy May 30, 2024
83f0c4a
Make sure pos and objects have the same length.
olivroy May 30, 2024
594b627
fix regex for plot title.
olivroy May 30, 2024
35ba23d
Avoid uninteresting roxy headings.
olivroy May 30, 2024
15be047
Don't parse roxy comments in `proj_file()` + add `options("reuseme.ro…
olivroy May 30, 2024
a742214
Avoid recognizing test_that("a", expect_true(TRUE))
olivroy May 30, 2024
eeadbaa
Temporarily change directory when parsing roxy comments as it may help?
olivroy May 30, 2024
b5f1896
Fix mistake
olivroy May 30, 2024
decdd06
Use lightparser for caption parsing!
olivroy May 30, 2024
d5b6ad3
Some fixups for revdeps and plot titles to remove some false positive…
olivroy May 31, 2024
6af2548
Don't error if you couldn't find gh URL.
olivroy May 31, 2024
27cd5cb
Clean workaround
olivroy May 31, 2024
a497218
Add to NEWS + minor adjustments to make outline and usethis to keep w…
olivroy May 31, 2024
fc50e76
More robust `escape_markup()` (add `\\.` as an acceptable start of va…
olivroy May 31, 2024
49e443b
Add some workarounds to make cli parsing and escaping work a bit bett…
olivroy May 31, 2024
57547f1
mark todos as complete...
olivroy May 31, 2024
420fd50
Rename `print_todo` -> `exclude_todos`...
olivroy May 31, 2024
93b1805
Rm redundant heading
olivroy May 31, 2024
938193b
Avoid empty todos + html sourceCode (tidyselect integration)
olivroy May 31, 2024
11880b1
Lint + use base R replacements
olivroy May 31, 2024
51bde5c
Recognize the last way possible to specify chunk options.
olivroy May 31, 2024
5751577
Rename `is_chunk_cap` to `is_object_caption`
olivroy May 31, 2024
871dd05
Add the factored `exclude_example_files()`
olivroy May 31, 2024
21fcdc3
Improve knitr notebook support
olivroy May 31, 2024
33a4d33
Sort out uninteresting headings.
olivroy May 31, 2024
d5782e4
Experimental support for displaying topic.
olivroy May 31, 2024
7dac4e5
Fix problem of incorrect dir.
olivroy May 31, 2024
9d07e39
Use mocking for mocking RStudio + change dir
olivroy May 31, 2024
72c3d45
Fix snap
olivroy May 31, 2024
9e00edc
Tweaks based on integration testing,
olivroy May 31, 2024
e45b749
More integration adjustments and addition of examples.
olivroy May 31, 2024
1429c02
Address some comments.
olivroy May 31, 2024
cdd0cc9
Make sure is_doc_title doesn't interleave with , prefer is_object_tit…
olivroy Jun 2, 2024
434b92f
Add files from violetcereza as testing.
olivroy Jun 3, 2024
3cb9225
Test adding indent
olivroy Jun 3, 2024
d11f9d5
Merge main
olivroy Jun 3, 2024
d926d32
Fix conflict [ci skip]
olivroy Jun 3, 2024
1d84af2
Merge
olivroy Jun 3, 2024
8f9978c
merge [ci skip]
olivroy Jun 3, 2024
392591b
Merge main [ci skip]
olivroy Jun 3, 2024
f87c935
Merge
olivroy Jun 3, 2024
4d87700
Merge
olivroy Jun 7, 2024
56a4062
[ci skip] adjust news
olivroy Jun 7, 2024
0fb7264
merge
olivroy Jun 9, 2024
cce2805
notebook are already supported on main.
olivroy Jun 9, 2024
27bdec2
[ci skip]
olivroy Jun 9, 2024
f7105c2
Merge
olivroy Jun 13, 2024
5f665cb
fix merge [ci skip]
olivroy Jun 13, 2024
322e5d0
Merge
olivroy Jun 26, 2024
caa185c
Merge branch 'main' into roxy-parse
olivroy Aug 8, 2024
4 changes: 3 additions & 1 deletion DESCRIPTION
@@ -35,9 +35,11 @@ Suggests:
curl,
gert,
gt,
pillar,
magick,
pillar,
roxygen2,
testthat (>= 3.2.1),
tidyr,
withr
Config/testthat/edition: 3
Encoding: UTF-8
4 changes: 3 additions & 1 deletion R/escape-inline-markup.R
@@ -118,7 +118,9 @@ replace_r_var <- function(x) {
#'
#' @examples
#' replace_r_var("i{gt_var} in {{gt_var}} in gt_var in {.file {gt_var}}.")
#' # last instance taken care of with escape_markup with a different strategy
#' #> "{{gt_var}} in {{gt_var}} in gt_var in {.file {gt_var}}."
#' # last instance taken care of with escape_markup with a different strategy
#' escape_markup("{gt_var} in {{gt_var}} in gt_var in {.file {gt_var}}.")
#' #> "{{gt_var}} in {{gt_var}} in gt_var in {.file gt_var}."
#'
NULL
3 changes: 2 additions & 1 deletion R/open.R
@@ -255,7 +255,8 @@ active_rs_doc_delete <- function() {
if (isTRUE(will_delete_decision)) {
cli::cli_inform(c(
"v" = "Deleted the active document {.val {elems$rel_path}} because {reasons_deleting}.",
"i" = cli::col_grey("The deleted file {.path {elems$full_path}} contents are returned invisibly in case you need them.")
# FIXME (upstream) the color div doesn't go all the way r-lib/cli#694
"i" = paste(cli::col_grey("The deleted file"), "{.path {elems$full_path}}", cli::col_grey("contents are returned invisibly in case you need them."))

Adjust the CLI color issue as noted in the FIXME comment.

-      # FIXME (upstream) the color div doesn't go all the way r-lib/cli#694
+      # TODO: Monitor upstream issue r-lib/cli#694 for resolution and update accordingly

Change the FIXME comment to a TODO to monitor the upstream issue regarding the CLI color division. This keeps the codebase clean while acknowledging that the issue is recognized and being tracked.


Suggested change
-      # FIXME (upstream) the color div doesn't go all the way r-lib/cli#694
-      "i" = paste(cli::col_grey("The deleted file"), "{.path {elems$full_path}}", cli::col_grey("contents are returned invisibly in case you need them."))
+      # TODO: Monitor upstream issue r-lib/cli#694 for resolution and update accordingly
+      "i" = paste(cli::col_grey("The deleted file"), "{.path {elems$full_path}}", cli::col_grey("contents are returned invisibly in case you need them."))

))
contents <- readLines(elems$full_path, encoding = "UTF-8")
fs::file_delete(elems$full_path)
38 changes: 33 additions & 5 deletions R/outline-criteria.R
@@ -29,7 +29,7 @@ o_is_roxygen_comment <- function(x, file_ext = NULL) {
return(FALSE)
}

ifelse(rep(is_r_file, length.out = length(x)), stringr::str_starts(x, "#'\\s"), FALSE)
ifelse(rep(is_r_file, length.out = length(x)), stringr::str_detect(x, "^#'\\s|^#'$"), FALSE)
}

o_is_todo_fixme <- function(x) {
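
Note on the `o_is_roxygen_comment()` change in this hunk: the new pattern also accepts a bare `#'` line, which the previous `#'\\s` prefix check rejected. A minimal base R sketch of the difference, using `grepl()` as a stand-in for the stringr calls; the sample lines are made up for illustration and are not taken from the package.

```r
# Illustrative input lines (hypothetical).
x <- c("#' Title", "#'", "#'no-space", "# plain comment")

# Old check: "#'" followed by whitespace at the start of the line.
old <- grepl("^#'\\s", x)

# New check: same as above, or a line that is exactly "#'".
new <- grepl("^#'\\s|^#'$", x)

# Side-by-side view of which lines each check treats as roxygen comments.
data.frame(x, old, new)
```

Only the second line changes classification: empty roxygen lines (paragraph separators inside a block) now count as part of the roxygen comment.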
@@ -121,15 +121,35 @@ define_outline_criteria <- function(.data, print_todo) {
x$file_ext <- s_file_ext(x$file)
x$is_md <- x$file_ext %in% c("qmd", "md", "Rmd", "Rmarkdown")
x$is_news <- x$is_md & grepl("NEWS.md", x$file, fixed = TRUE)
x$is_md <- x$is_md & !x$is_news # treating news and other md files differently.
x$is_roxygen_comment <- o_is_roxygen_comment(x$content, x$file_ext)
x$is_md <- (x$is_md | x$is_roxygen_comment) & !x$is_news # treating news and other md files differently.
x$is_test_file <- grepl("tests/testthat", x$file, fixed = TRUE)
x$is_snap_file <- grepl("_snaps", x$file, fixed = TRUE)
if (any(x$is_roxygen_comment)) {
rlang::check_installed(c("roxygen2", "tidyr"), "to create roxygen2 comments outline.")
files_with_roxy_comments <- unique(x[x$is_roxygen_comment, "file", drop = TRUE])
files_with_roxy_comments <- rlang::set_names(files_with_roxy_comments, files_with_roxy_comments)
parsed_files <- suppressMessages( # roxygen2 messages
# TRICK purrr::safely creates an error object, while possible is better.
purrr::map(files_with_roxy_comments, purrr::possibly(roxygen2::parse_file))
)
# if roxygen2 cannot parse a file, let's just forget about it.
unparsed_files <- files_with_roxy_comments[which(is.null(parsed_files))]
# browser()
if (length(unparsed_files) > 0) {
cli::cli_inform("Could not parse roxygen comments in {.file {unparsed_files}}")
}
parsed_files <- purrr::compact(parsed_files)
outline_roxy <- join_roxy_fun(parsed_files)
} else {
outline_roxy <- NULL
}

x <- dplyr::mutate(
x,
x |> dplyr::filter(!is_roxygen_comment) |> dplyr::bind_rows(outline_roxy),
# Problematic when looking inside functions
# maybe force no leading space.
# TODO strip is_cli_info in Package? only valid for EDA
# TODO strip is_cli_info in Package? only valid for EDA (currently not showcased..)
is_cli_info = o_is_cli_info(content, is_snap_file, file),
is_doc_title = stringr::str_detect(content, "(?<![-(#\\s?)_])title\\:.{4,100}") & !stringr::str_detect(content, "Ttitle|Subtitle") &
!stringr::str_detect(dplyr::lag(content, default = "nothing to detect"), "```yaml"),
@@ -156,11 +176,19 @@ define_outline_criteria <- function(.data, print_todo) {
is_cross_ref = stringr::str_detect(content, "docs_links?\\(") & !stringr::str_detect(content, "@param|\\{\\."),
is_function_def = grepl("<- function(", content, fixed = TRUE) & !stringr::str_starts(content, "\\s*#")
)
if (!"before_and_after_empty" %in% names(x)) {
x$before_and_after_empty <- NA
}
x <- dplyr::mutate(
x,
before_and_after_empty = line_id == 1 | !nzchar(dplyr::lead(content, default = "")) & !nzchar(dplyr::lag(content)),
before_and_after_empty = ifelse(
!is.na(before_and_after_empty),
line == 1 | !nzchar(dplyr::lead(content, default = "")) & !nzchar(dplyr::lag(content)),
before_and_after_empty
),
.by = file
)
#browser()
x
}
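
One thing that may be worth double-checking in the `define_outline_criteria()` hunk: `is.null()` is not vectorised over list elements, so `which(is.null(parsed_files))` looks like it can never select a failed file. `purrr::possibly()` returns `NULL` by default when the wrapped function errors, so a per-element check seems to be what is intended. A base R sketch, with a made-up list standing in for `parsed_files`:

```r
# Hypothetical stand-in for purrr::map(files, purrr::possibly(roxygen2::parse_file)):
# one file parsed, one failed and came back as NULL.
parsed_files <- list("R/a.R" = list("block"), "R/b.R" = NULL)

# is.null() tests the list itself, so this is a single FALSE.
is.null(parsed_files)
which(is.null(parsed_files)) # integer(0): nothing is reported

# Element-wise check that flags the failed file.
failed <- vapply(parsed_files, is.null, logical(1))
names(parsed_files)[failed]
```

Since purrr is already used here, `purrr::map_lgl(parsed_files, is.null)` would be an equivalent way to build `failed`.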

234 changes: 234 additions & 0 deletions R/outline-roxy.R
@@ -0,0 +1,234 @@
#' Extract roxygen tag
#'
#' Tell me what this does
#'
#' # Section to extract
#'
#' Well this is a section
#'
#' @noRd
#' @param file A list of roxy blocks
#' @returns A named list with name = file:line, and element is the section title
#' @examples
#' extract_roxygen_tag_location(tag = "title")

extract_roxygen_tag_location <- function(file, tag) {
# suppressMessages(aa <- roxygen2::parse_file(file))
# browser()
aa <- file
pos <- purrr::map(aa, \(x) roxygen2::block_get_tags(x, tags = tag))
# browser()
if (all(lengths(pos) == 0L)) {
return(character(0L))
}
aa <- aa[lengths(pos) > 0L]
pos <- purrr::list_flatten(pos)
objects <- purrr::map(
aa,
\(x) x$object$topic
)
if (any(lengths(objects) == 0)) {
name_tag <- purrr::map(
aa,
\(x) roxygen2::block_get_tag_value(x, "name")
)
for (i in seq_along(objects)) {
if (is.null(objects[[i]])) {
if (!is.null(name_tag[[i]])) {
objects[[i]] <- name_tag[[i]]
} else {
objects[[i]] <- "no-topic"
}
}
}
if (any(lengths(objects) == 0)) {
# should not happen. I chose "no-topic" instead.
cli::cli_abort("Could not resolve object or topic names.")
}
}


# browser()
pos <- purrr::set_names(pos, pos$file)

# browser()
val <- withCallingHandlers(
purrr::map2(pos, objects, \(x, obj_name) {
el <- x$val
el_has_names <- !is.null(names(el))

if (length(el) == 1 && !el_has_names) {
el <- paste0(
el, "____", obj_name
)
names(el) <- x$line
return(el)
}
if (tag %in% c("description", "details") && !el_has_names) {
# TODO when stable delete
# print(x$val)
# print(el_has_names)
# cli::cli_inform("return early (no headings)")
return(NULL)
}
# use raw instead
lines <- stringr::str_split_1(x$raw, "\n")
# browser()
keep <- which(o_is_section_title(lines))

if (length(keep) == 0L) {
# TODO Delete when stable debugging
# cli::cli_inform(" No section title detected")
return(NULL)
}
# line position.
line_pos <- x$line + seq_along(lines) - 1L
final_lines_to_include <- lines[keep]
# Will not make this transformation and will consider roxygen comments to be
# final_lines_to_include <- stringr::str_remove(final_lines_to_include, "^#+\\s")

final_lines_to_include <- paste0(final_lines_to_include, "____", obj_name)
names(final_lines_to_include) <- line_pos[keep]
# TODO Delete when stable for debugging
# if (length(final_lines_to_include) != 1) {
# cli::cli_warn("el resulted to {.val {final_lines_to_include}}", "using first element for now")
# }
final_lines_to_include
}),
error = function(e) {
cli::cli_abort(
"For tag = {tag}, obj_name = {objects}, wrong size, should be {length(pos)}"
)

})

# rlang::set_names(val, nam)
# merge line number and file name
# I wonder if purrr make it easy to do tidyverse/purrr#1064
# list(x = c(el1 = 1), x = c(el2 = 2, el3 = 3))
#> list(x = c(el1 = 1, el2 = 2, el3 = 3))
val <- val |> purrr::compact()

if (FALSE) {
val <- unlist(val)
names(val) <- stringr::str_replace(names(val), "\\.(\\d+)$", ":\\1")
} else {
# purrr::list_flatten(
# name_spec = "{outer}:{inner}"
# )
val <- vctrs::list_unchop(
val,
name_spec = "{outer}:{inner}",
ptype = "character"
)
}


# hack to keep tag
if (length(val) > 0) {
names(val) <- paste0(names(val), "____", tag)
}
val
}

join_roxy_fun <- function(file) {
# don't parse noRd tags
parsed_files <- purrr::discard(file, \(x) roxygen2::block_has_tags(x, "noRd"))
# Return early if no roxy tags
if (length(parsed_files) == 0) {
return(character(0L))
}
if (is.null(names(parsed_files))) {
# browser()
parsed_files <- parsed_files |> purrr::set_names(purrr::map_chr(parsed_files, \(x) x$file))
# cli::cli_abort("parsed files must be named at this point.")
}
# parsed_files <- set_names(parsed_files, \(x))
titles_list <- purrr::map(parsed_files, \(x) extract_roxygen_tag_location(x, tag = "title"))

section_list <- purrr::map(parsed_files, \(x) extract_roxygen_tag_location(x, tag = "section"))
subsection_list <- purrr::map(parsed_files, \(x) extract_roxygen_tag_location(x, tag = "subsection"))

desc_list <- purrr::map(parsed_files, \(x) extract_roxygen_tag_location(x, tag = "description"))

details_list <- purrr::map(parsed_files, \(x) extract_roxygen_tag_location(x, tag = "details"))

family_list <- purrr::map(parsed_files, \(x) extract_roxygen_tag_location(x, tag = "family"))
concept_list <- purrr::map(parsed_files, \(x) extract_roxygen_tag_location(x, tag = "concept"))
roxy_parsed <- vctrs::vec_c(
titles_list,
section_list,
subsection_list,
desc_list,
details_list,
family_list,
concept_list#,
#.name_spec = "{outer}:::::{inner}",
) |>
vctrs::list_unchop(
name_spec = "{outer}.....{inner}"
) |>
tibble::enframe() |>
tidyr::separate_wider_delim(
cols = name,
names = c("file_line", "tag"),
delim = "____"
)

roxy_parsed <- roxy_parsed |>
tidyr::separate_wider_delim(
cols = value,
delim = "____",
names = c("content", "topic"),
)
if (!all(grepl("\\.{5}", roxy_parsed$file_line, fixed = F))) {
problems <- which(!grepl("\\.{5}", roxy_parsed$file_line, fixed = F))
#rowser()
# roxy_parsed
cli::cli_abort("Malformed file line at {problems}.")
}
roxy_parsed <- roxy_parsed |>
tidyr::separate_wider_delim(
file_line,
delim = ".....",
names = c("file", "line")
)
roxy_parsed |>
dplyr::mutate(
#file = fs::path_real(file) |> as.character(),
#file_line = paste0(file, ":", line)
) |>
dplyr::relocate(
file, topic, content, line, tag
) |>
dplyr::mutate(
is_md = tag %in% c("subsection", "details", "description", "section"),
# content = paste0("#' ", outline_el),
is_object_title = tag == "title",
line = as.integer(line),
file_ext = "R",
tile_el = NA_character_,
title_el_line = NA_integer_,
is_news = FALSE,
is_roxygen_comment = TRUE,
is_test_file = FALSE,
is_snap_file = FALSE,
before_and_after_empty = TRUE,
is_section_title = TRUE,
is_section_title_source = TRUE,
is_saved_doc = TRUE,
has_inline_markup = FALSE # let's not mess with inline markup
) |>
dplyr::filter(
content != "NULL"
)
}

# helper for interactive checking -----------


active_doc_parse <- function(doc = active_rs_doc()) {
doc <- purrr::set_names(doc)
parsed <- purrr::map(doc, roxygen2::parse_file)
parsed |> join_roxy_fun()
}
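
For readers following `extract_roxygen_tag_location()`, the line-number bookkeeping for multi-line tags can be sketched in base R. This is an illustration under assumptions, not the package code: the `tag` list is a hypothetical stand-in for a roxygen2 tag with `line` and `raw` fields, and the heading regex is a simplified substitute for `o_is_section_title()`.

```r
# Hypothetical tag: raw text of a multi-line roxygen tag that starts on line 10.
tag <- list(
  line = 10L,
  raw = "Some details.\n\n# Section to extract\n\nWell this is a section"
)

lines <- strsplit(tag$raw, "\n", fixed = TRUE)[[1]]

# Simplified stand-in for o_is_section_title(): heading marker(s) then a space.
keep <- which(grepl("^#+\\s", lines))

# Each line of the tag is offset from the tag's first line.
line_pos <- tag$line + seq_along(lines) - 1L

# Name each kept heading by its line number in the source file.
headings <- lines[keep]
names(headings) <- line_pos[keep]
headings
```

The heading sits on the third line of the tag, so it is named `"12"`; in the PR these names are later combined with the file name and tag through the `____` and `.....` delimiters.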