Commit

Disable caching in rmarkdown files
- Update R version in DESCRIPTION file.
- Caching needs to be disabled in README.rmd, features.Rmd and
  Overview.Rmd files.
- Print statement in env-manager.R needs to be commented out.
- Build website using pkgdown.
- Update NEWS.md and cran-comments.md files.
- Update link in codecov badge.
pakjiddat committed Jan 4, 2022
1 parent dfb6f96 commit a642b2e
Showing 65 changed files with 15,035 additions and 5,387 deletions.
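
The commit message calls for caching to be disabled in several R Markdown files, and the diff below does so by editing each chunk header. As a minimal sketch of an alternative, assuming the knitr package is installed and that no chunk sets `cache` itself, the same default could be declared once in a setup chunk. This is an illustration only, not something this commit does:

```r
# Hypothetical setup chunk, not part of this commit.
# Sets the default for every chunk that does not specify `cache` itself.
knitr::opts_chunk$set(cache = FALSE)

# Read the default back to confirm it took effect.
cache_default <- knitr::opts_chunk$get("cache")
```

A chunk that still declares `cache=TRUE` in its own header would override this default, so the per-chunk edits in the diff remain the more explicit choice.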
1 change: 1 addition & 0 deletions .Rbuildignore
@@ -13,3 +13,4 @@
^vignettes/overview_cache$
^CRAN-RELEASE$
^codecov\.yml$
^CRAN-SUBMISSION$
2 changes: 0 additions & 2 deletions CRAN-RELEASE

This file was deleted.

3 changes: 3 additions & 0 deletions CRAN-SUBMISSION
@@ -0,0 +1,3 @@
Version: 0.0.3
Date: 2022-01-03 10:44:59 UTC
SHA: e08f755a59570f644a442d01647e672e6a1a73f1
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,6 +1,6 @@
Package: wordpredictor
Title: Develop Text Prediction Models Based on N-Grams
Version: 0.0.2
Version: 0.0.3
URL: https://github.com/pakjiddat/word-predictor, https://pakjiddat.github.io/word-predictor/
BugReports: https://github.com/pakjiddat/word-predictor/issues
Authors@R:
@@ -20,7 +20,7 @@ Description: A framework for developing n-gram models for text prediction.
License: MIT + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.1.1
RoxygenNote: 7.1.2
Imports: digest, ggplot2, patchwork, stringr, dplyr, SnowballC
Suggests:
testthat,
6 changes: 6 additions & 0 deletions NEWS.md
@@ -1,3 +1,9 @@
# wordpredictor 0.0.3

## Bug fixes

* Disabled caching in R Markdown files, because it was causing problems with CRAN checks.

# wordpredictor 0.0.2

## Bug fixes
2 changes: 1 addition & 1 deletion R/env-manager.R
@@ -184,7 +184,7 @@ EnvManager <- R6::R6Class(
wp$ed <- ed
# The wordpredictor options are updated
options("wordpredictor" = wp)
print(list.files(system.file(package = "wordpredictor")))
# print(list.files(system.file(package = "wordpredictor")))
# Each file is copied from extdata to the given folder
for (fn in fns) {
# The source file path
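
The hunk above silences a leftover debugging `print()` by commenting it out. A minimal sketch of another option, assuming a hypothetical `verbose` argument (the helper name `list_pkg_files` is also invented here and is not part of wordpredictor), keeps the diagnostic available without printing during CRAN checks:

```r
# Hypothetical helper, not part of wordpredictor.
# Lists the top-level files of an installed package and reports them
# through message() only when verbose output is requested.
list_pkg_files <- function(pkg, verbose = FALSE) {
  fl <- list.files(system.file(package = pkg))
  if (verbose) {
    # message() writes to stderr and can be silenced with suppressMessages()
    message(paste(fl, collapse = "\n"))
  }
  invisible(fl)
}

# Quiet by default; the base package "stats" is used so the call is self-contained.
fl <- list_pkg_files("stats")
```

Using `message()` rather than `print()` lets callers suppress the output, which is generally what CRAN reviewers ask for.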
12 changes: 6 additions & 6 deletions README.Rmd
@@ -47,7 +47,7 @@ clean_up <- function(ve) {
[![R-CMD-check](https://github.com/pakjiddat/word-predictor/workflows/R-CMD-check/badge.svg)](https://github.com/pakjiddat/word-predictor/actions)
[![lint](https://github.com/pakjiddat/word-predictor/workflows/lint/badge.svg)](https://github.com/pakjiddat/word-predictor/actions)
[![test-coverage](https://github.com/pakjiddat/word-predictor/workflows/test-coverage/badge.svg)](https://github.com/pakjiddat/word-predictor/actions)
[![Codecov test coverage](https://codecov.io/gh/pakjiddat/word-predictor/branch/master/graph/badge.svg)](https://codecov.io/gh/pakjiddat/word-predictor?branch=master)
[![Codecov test coverage](https://codecov.io/gh/pakjiddat/word-predictor/branch/master/graph/badge.svg)](https://app.codecov.io/gh/pakjiddat/word-predictor?branch=master)
[![CRAN version](https://www.r-pkg.org/badges/version/wordpredictor)](https://cran.r-project.org/package=wordpredictor)
<!-- badges: end -->

@@ -91,7 +91,7 @@ Information about the package can be obtained using the command line or the pack

The following example shows how to generate a n-gram model.

```{r generate-model, cache=TRUE, results='hide'}
```{r generate-model, cache=FALSE, results='hide'}
# The required files
rf <- c("input.txt")
# The test environment is setup
@@ -131,7 +131,7 @@ The above code generates the file **def-model.RDS**. This file represents the n-

The following example shows how to predict the next word given a set of words:

```{r predict-word, cache=TRUE}
```{r predict-word, cache=FALSE}
# The required files
rf <- c("def-model.RDS")
# The test environment is setup
@@ -183,7 +183,7 @@ clean_up(ve)

The following example plots the n-gram frequency coverage. It shows the percentage of n-grams with frequency 1, 2 ... 10.

```{r analyze-ngrams-2, cache=TRUE, fig.width=6, fig.height=4}
```{r analyze-ngrams-2, cache=FALSE, fig.width=6, fig.height=4}
# The required files
rf <- c("n2.RDS")
# The test environment is setup
@@ -208,7 +208,7 @@ clean_up(ve)

The following example shows how to get the list of bi-grams starting with **"great_"** along with their frequencies. It also shows how to get the frequency of the bi-gram **"great_deal"**.

```{r analyze-n-grams-3, cache=TRUE}
```{r analyze-n-grams-3, cache=FALSE}
# The required files
rf <- c("n2.RDS")
# The test environment is setup
@@ -294,7 +294,7 @@ Extrinsic evaluation measures the accuracy score for the sentences in a validati

The following example shows how to evaluate the performance of a model:

```{r evaluate-performance-1, cache=TRUE, results='hide'}
```{r evaluate-performance-1, cache=FALSE, results='hide'}
# The required files
rf <- c("def-model.RDS", "validate.txt")
# The test environment is setup
26 changes: 11 additions & 15 deletions README.md
@@ -9,7 +9,7 @@
[![lint](https://github.com/pakjiddat/word-predictor/workflows/lint/badge.svg)](https://github.com/pakjiddat/word-predictor/actions)
[![test-coverage](https://github.com/pakjiddat/word-predictor/workflows/test-coverage/badge.svg)](https://github.com/pakjiddat/word-predictor/actions)
[![Codecov test
coverage](https://codecov.io/gh/pakjiddat/word-predictor/branch/master/graph/badge.svg)](https://codecov.io/gh/pakjiddat/word-predictor?branch=master)
coverage](https://codecov.io/gh/pakjiddat/word-predictor/branch/master/graph/badge.svg)](https://app.codecov.io/gh/pakjiddat/word-predictor?branch=master)
[![CRAN
version](https://www.r-pkg.org/badges/version/wordpredictor)](https://cran.r-project.org/package=wordpredictor)
<!-- badges: end -->
@@ -106,7 +106,7 @@ mg <- ModelGenerator$new(
# Generates n-gram model. The output is the file def-model.RDS
mg$generate_model()

# The test envionment is cleaned up
# The test environment is cleaned up
clean_up(ve)
```

@@ -133,7 +133,7 @@ mp <- ModelPredictor$new(mf = mfn)
# next words are returned along with their respective probabilities.
res <- mp$predict_word(words = "how are", 3)

# The test envionment is cleaned up
# The test environment is cleaned up
clean_up(ve)
```

@@ -176,8 +176,7 @@ df <- da$plot_n_gram_stats(opts = list(
![](man/figures/README-analyze-ngrams-1-1.png)<!-- -->

``` r

# The test envionment is cleaned up
# The test environment is cleaned up
clean_up(ve)
```

@@ -207,14 +206,13 @@ df <- da$plot_n_gram_stats(opts = list(
![](man/figures/README-analyze-ngrams-2-1.png)<!-- -->

``` r

# The test envionment is cleaned up
# The test environment is cleaned up
clean_up(ve)
```

The following example shows how to get the list of bi-grams starting
with **“great\_** along with their frequencies. It also shows how to
get the frequency of the bi-gram **great\_deal**.
get the frequency of the bi-gram **great_deal**.

``` r
# The required files
@@ -234,18 +232,17 @@ df <- df[order(df$freq, decreasing = T),]
# The frequency of the bi-gram "great_deal"
f <- as.numeric(df[df$pre == "great_deal", "freq"])

# The test envionment is cleaned up
# The test environment is cleaned up
clean_up(ve)
```

## Customizing the n-gram model

The **dc\_opts** parameter to the **ModelGenerator** class specifies the
The **dc_opts** parameter to the **ModelGenerator** class specifies the
data cleaning options. The following code shows the data cleaning
options and their default values:

``` r

# @field dc_opts The options for the data cleaner object.
# min_words -> The minimum number of words per sentence.
# line_count -> The number of lines to read and clean at a time.
@@ -280,12 +277,11 @@ dc_opts = list(
)
```

The **tg\_opts** parameter to the **ModelGenerator** class specifies the
The **tg_opts** parameter to the **ModelGenerator** class specifies the
token generation options. The following code shows the token generation
options and their default values:

``` r

# @field tg_opts The options for the token generator obj.
# min_freq -> All ngrams with frequency less than min_freq are
# ignored.
@@ -331,7 +327,7 @@ me <- ModelEvaluator$new(mf = mfn, ve = 2)
# a data frame and also saved within the model file itself.
stats <- me$evaluate_performance(lc = 20, fn = vfn)

# The test envionment is cleaned up
# The test environment is cleaned up
clean_up(ve)
```

@@ -388,7 +384,7 @@ biological sequence analysis, data compression and more. This will
require further performance optimization.

The source code is organized using R6 classes. It is easy to extend.
Contributions are welcome \!.
Contributions are welcome !.

## Acknowledgments

2 changes: 2 additions & 0 deletions _pkgdown.yml
@@ -2,3 +2,5 @@ url: https://pakjiddat.github.io/word-predictor
title: Word Predictor
development:
mode: release
template:
bootstrap: 5
10 changes: 2 additions & 8 deletions cran-comments.md
@@ -1,14 +1,8 @@
## Test environments

- local, ubuntu 20.04.2, R 3.6.3
- github actions, macos-latest, R release
- github actions, windows-latest, R release
- github actions, ubuntu 20.04, R devel
- github actions, ubuntu 20.04, R release
- rhub, debian-clang-devel, R devel
- local, ubuntu 20.04.2, R 4.1.2
- rhub, ubuntu 20.04.2, R release
- rhub, fedora-clang-devel, R devel
- rhub, solaris-x86-patched, R release
- rhub, macos-highsierra-release, R release
- winbuilder, windows, R devel

## R CMD check results