diff --git a/DESCRIPTION b/DESCRIPTION
index 1ed743a..a8b631a 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -2,8 +2,12 @@ Package: medrxivr
Title: Access and Search MedRxiv and BioRxiv Preprint Data
Version: 0.0.5.9000
Authors@R: c(
+ person("Yaoxiang", "Li",
+ role = c("aut", "cre"),
+ email = "liyaoxiang@outlook.com",
+ comment = c(ORCID="0000-0001-9200-1016")),
person("Luke", "McGuinness",
- role = c("aut", "cre"),
+ role = c("aut"),
email = "luke.mcguinness@bristol.ac.uk",
comment = c(ORCID = "0000-0001-8730-9761")),
person("Lena", "Schmidt",
diff --git a/LICENSE b/LICENSE
deleted file mode 100644
index 6d1d5c0..0000000
--- a/LICENSE
+++ /dev/null
@@ -1,2 +0,0 @@
-YEAR: 2020
-COPYRIGHT HOLDER: Luke McGuinness
diff --git a/R/mx_api.R b/R/mx_api.R
index d37b5ce..986193d 100644
--- a/R/mx_api.R
+++ b/R/mx_api.R
@@ -56,11 +56,8 @@ mx_api_content <- function(from_date = "2013-01-01",
details_link <- api_link(server, from_date, to_date, "0")
details <- api_to_df(details_link)
- # Ensure 'count' is numeric
count <- as.numeric(details$messages[1, 6])
- if (is.na(count)) {
- stop("Count value is not numeric.")
- }
+
pages <- floor(count / 100)
message("Estimated total number of records as per API metadata: ", count)
diff --git a/README.Rmd b/README.Rmd
index 0ab3dc7..07a80e0 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -18,7 +18,7 @@ knitr::opts_chunk$set(
library(medrxivr)
```
-# medrxivr
+# medrxivr
@@ -28,7 +28,6 @@ library(medrxivr)
[![CRAN Downloads.](https://cranlogs.r-pkg.org/badges/grand-total/medrxivr)](https://CRAN.R-project.org/package=medrxivr)
[![R build status](https://github.com/ropensci/medrxivr/workflows/R-CMD-check/badge.svg)](https://github.com/ropensci/medrxivr/actions)
-[![Travis build status](https://travis-ci.com/ropensci/medrxivr.svg?branch=master)](https://travis-ci.com/ropensci/medrxivr)
[![Codecov test coverage](https://codecov.io/gh/ropensci/medrxivr/branch/master/graph/badge.svg)](https://codecov.io/gh/ropensci/medrxivr?branch=master)
diff --git a/README.md b/README.md
index 553540f..e9fa850 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
-# medrxivr
+# medrxivr
@@ -15,8 +15,6 @@ Badge](https://badges.ropensci.org/380_status.svg)](https://github.com/ropensci/
Downloads.](https://cranlogs.r-pkg.org/badges/grand-total/medrxivr)](https://CRAN.R-project.org/package=medrxivr)
[![R build
status](https://github.com/ropensci/medrxivr/workflows/R-CMD-check/badge.svg)](https://github.com/ropensci/medrxivr/actions)
-[![Travis build
-status](https://travis-ci.com/ropensci/medrxivr.svg?branch=master)](https://travis-ci.com/ropensci/medrxivr)
[![Codecov test
coverage](https://codecov.io/gh/ropensci/medrxivr/branch/master/graph/badge.svg)](https://codecov.io/gh/ropensci/medrxivr?branch=master)
@@ -66,27 +64,23 @@ library(medrxivr)
`medrixvr` provides two ways to access medRxiv data:
- - `mx_api_content(server = "medrxiv")` creates a local copy of all
- data available from the medRxiv API at the time the function is run.
-
-
+- `mx_api_content(server = "medrxiv")` creates a local copy of all data
+ available from the medRxiv API at the time the function is run.
``` r
# Get a copy of the database from the live medRxiv API endpoint
preprint_data <- mx_api_content()
```
- - `mx_snapshot()` provides access to a static snapshot of the medRxiv
- database. The snapshot is created each morning at 6am using
- `mx_api_content()` and is stored as CSV file in the [medrxivr-data
- repository](https://github.com/mcguinlu/medrxivr-data). This method
- does not rely on the API (which can become unavailable during peak
- usage times) and is usually faster (as it reads data from a CSV
- rather than having to re-extract it from the API). Discrepancies
- between the most recent static snapshot and the live database can be
- assessed using `mx_crosscheck()`.
-
-
+- `mx_snapshot()` provides access to a static snapshot of the medRxiv
+ database. The snapshot is created each morning at 6am using
+ `mx_api_content()` and is stored as a CSV file in the [medrxivr-data
+ repository](https://github.com/mcguinlu/medrxivr-data). This method
+ does not rely on the API (which can become unavailable during peak
+ usage times) and is usually faster (as it reads data from a CSV rather
+ than having to re-extract it from the API). Discrepancies between the
+ most recent static snapshot and the live database can be assessed
+ using `mx_crosscheck()`.
``` r
# Get a copy of the database from the daily snapshot
@@ -102,13 +96,10 @@ summarised in the figure below:
Only one data source exists for the bioRxiv repository:
- - `mx_api_content(server = "biorxiv")` creates a local copy of all
- data available from the bioRxiv API endpoint at the time the
- function is run. **Note**: due to it’s size, downloading a complete
- copy of the bioRxiv repository in this manner takes a long time (\~
- 1 hour).
-
-
+- `mx_api_content(server = "biorxiv")` creates a local copy of all data
+ available from the bioRxiv API endpoint at the time the function is
+ run. **Note**: due to its size, downloading a complete copy of the
+ bioRxiv repository in this manner takes a long time (~ 1 hour).
``` r
# Get a copy of the database from the live bioRxiv API endpoint
@@ -125,12 +116,12 @@ advanced search strategy.
``` r
# Import the medrxiv database
preprint_data <- mx_snapshot()
-#> Using medRxiv snapshot - 2021-01-28 09:31
+#> Using medRxiv snapshot - 2022-07-06 01:09
# Perform a simple search
results <- mx_search(data = preprint_data,
query ="dementia")
-#> Found 192 record(s) matching your search.
+#> Found 427 record(s) matching your search.
# Perform an advanced search
topic1 <- c("dementia","vascular","alzheimer's") # Combined with Boolean OR
@@ -139,7 +130,7 @@ myquery <- list(topic1, topic2) # Combined with Boolean AND
results <- mx_search(data = preprint_data,
query = myquery)
-#> Found 70 record(s) matching your search.
+#> Found 143 record(s) matching your search.
```
You can also explore which search terms are contributing most to your
@@ -149,15 +140,15 @@ search by setting `report = TRUE`:
results <- mx_search(data = preprint_data,
query = myquery,
report = TRUE)
-#> Found 70 record(s) matching your search.
-#> Total topic 1 records: 1078
-#> dementia: 192
-#> vascular: 917
+#> Found 143 record(s) matching your search.
+#> Total topic 1 records: 2272
+#> dementia: 427
+#> vascular: 1918
#> alzheimer's: 0
-#> Total topic 2 records: 203
-#> lipids: 74
-#> statins: 25
-#> cholesterol: 136
+#> Total topic 2 records: 410
+#> lipids: 157
+#> statins: 61
+#> cholesterol: 255
```
## Further functionality
@@ -222,14 +213,14 @@ and then search medRxiv and bioRxiv data. Below are a list of
complementary packages that provide distinct but related functionality
when working with medRxiv and bioRxiv data:
- - [`rbiorxiv`](https://github.com/nicholasmfraser/rbiorxiv) by
- [Nicholas Fraser](https://github.com/nicholasmfraser) provides
- access to the same medRxiv and bioRxiv *content* data as `medrxivr`,
- but also provides access to the *usage* data (e.g. downloads per
- month) that the Cold Spring Harbour Laboratory API offers. This is
- useful if you wish to explore, for example, [how the number of PDF
- downloads from bioRxiv has grown over
- time.](https://github.com/nicholasmfraser/rbiorxiv#pdf-downloads-over-time)
+- [`rbiorxiv`](https://github.com/nicholasmfraser/rbiorxiv) by [Nicholas
+ Fraser](https://github.com/nicholasmfraser) provides access to the
+ same medRxiv and bioRxiv *content* data as `medrxivr`, but also
+ provides access to the *usage* data (e.g. downloads per month) that
+ the Cold Spring Harbor Laboratory API offers. This is useful if you
+ wish to explore, for example, [how the number of PDF downloads from
+ bioRxiv has grown over
+ time.](https://github.com/nicholasmfraser/rbiorxiv#pdf-downloads-over-time)
## Code of conduct
@@ -242,4 +233,4 @@ project, you agree to abide by its terms.
This package and the data it accesses/returns are provided “as is”, with
no guarantee of accuracy. Please be sure to check the accuracy of the
data yourself (and do let me know if you find an issue so I can fix it
-for everyone\!)
+for everyone!)
diff --git a/man/figures/logo.png b/man/figures/logo.png
new file mode 100644
index 0000000..0140713
Binary files /dev/null and b/man/figures/logo.png differ