---
title: "Covid-19 Cases vs Tests Analysis"
output: github_document
author: Fatime Selimi, Neel Phaterpekar, Nicholas Wu, Tanmay Sharma
bibliography: doc/covid19.bib
---
# About
The data set used in this project comes from the Our World in Data
COVID-19 database created by Hannah Ritchie et al. [@owidcoronavirus]. It
tracks the impact of COVID-19 around the world, with daily statistics for
over 200 countries recorded since December 31, 2019.
Each row in the data set represents one date in one country and records
measurements such as total cases, new daily cases, and hospital admission
rates. The data have been collected in conjunction with the World
Health Organization (WHO) and the European Centre for Disease Prevention
and Control (ECDC). They are available from [Our World in
Data](https://ourworldindata.org/coronavirus), and the raw data can be found
[here](https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/owid-covid-data.csv).
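
As a rough illustration of the one-row-per-country-per-date layout described above, the following Python sketch filters a table down to the two countries of interest. The column names (`location`, `date`, `new_cases`, `new_tests`) are assumptions based on the public CSV and should be checked against the file; the helper `select_countries` and the toy values are hypothetical and are not part of this repository.

```python
import pandas as pd

# Column names are assumed from the public OWID CSV; verify before relying on them.
COLUMNS = ["location", "date", "new_cases", "new_tests"]


def select_countries(df, countries=("Canada", "United States")):
    """Keep one row per (country, date) for the chosen countries."""
    subset = df.loc[df["location"].isin(countries), COLUMNS].copy()
    subset["date"] = pd.to_datetime(subset["date"])
    return subset.sort_values(["location", "date"]).reset_index(drop=True)


# Tiny hand-made frame standing in for the real CSV (all values are invented).
toy = pd.DataFrame({
    "location": ["Canada", "United States", "Canada", "France"],
    "date": ["2020-04-02", "2020-04-01", "2020-04-01", "2020-04-01"],
    "new_cases": [1100, 25000, 1000, 4000],
    "new_tests": [20000, 100000, 18000, 30000],
})
tidy = select_countries(toy)
print(tidy)
```

With the real data, the toy frame could be replaced by the result of `pd.read_csv()` called on the raw CSV URL linked above.
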

With this data, we ask whether the ratio of new daily tests performed to
new daily cases (the response ratio) differs between Canada and the
United States. Because the distributions were found to be left skewed and
of unequal variance (see EDA [here](https://github.com/UBC-MDS/covid-19-cases-vs-tests-analysis/tree/0.1.0/eda)),
we compared medians rather than means, using a two-tailed permutation test
for a difference in medians. At a significance level of
0.05, we found sufficient evidence to conclude that the median
response ratio differs between Canada and the United States
(p-value < 0.0001). This is one way to begin assessing the different responses
and outcomes that these two countries have faced during the pandemic. However,
further analysis is required to better understand the differences in
the Canadian and US responses to COVID-19.
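
The test described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's own implementation (the dependency list suggests the analysis itself uses R's `infer` package). The function names `response_ratio` and `permutation_test_median`, the number of permutations, and the simulated inputs are all hypothetical.

```python
import numpy as np


def response_ratio(new_tests, new_cases):
    """Daily tests per new case; days with zero or missing cases are dropped."""
    tests = np.asarray(new_tests, dtype=float)
    cases = np.asarray(new_cases, dtype=float)
    valid = np.isfinite(tests) & np.isfinite(cases) & (cases > 0)
    return tests[valid] / cases[valid]


def permutation_test_median(a, b, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference in medians."""
    rng = np.random.default_rng(seed)
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    observed = np.median(a) - np.median(b)
    pooled = np.concatenate([a, b])
    extreme = 0
    for _ in range(n_perm):
        # Shuffle the group labels by permuting the pooled observations.
        shuffled = rng.permutation(pooled)
        diff = np.median(shuffled[:a.size]) - np.median(shuffled[a.size:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value


# Invented numbers for illustration only; these are not the project's data.
rng = np.random.default_rng(1)
canada = response_ratio(rng.integers(20000, 40000, 60), rng.integers(500, 1500, 60))
usa = response_ratio(rng.integers(100000, 200000, 60), rng.integers(20000, 40000, 60))
observed, p_value = permutation_test_median(canada, usa)
print(f"observed difference in medians: {observed:.2f}, p-value: {p_value:.4f}")
```

Shuffling the pooled ratios simulates the null hypothesis that country labels are exchangeable, so the p-value is the share of shuffles whose median difference is at least as extreme as the observed one.
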
# Report
The final report can be found [here](https://github.com/UBC-MDS/covid-19-cases-vs-tests-analysis/blob/main/doc/covid-response.pdf).
# Usage
To replicate the analysis, clone this GitHub repository, install the
[dependencies](#dependencies) listed below, and run the following
commands at the command line/terminal from the root directory of this
project:

    make all

To reset the repo to a clean state, with no intermediate or results files, run the following command at the command line/terminal from the root directory of this project:

    make clean

# Dependencies
- Python 3.8.5 and Python packages:
    - docopt==0.6.2
    - pandas==1.1.4
    - numpy
    - altair
    - altair_saver
    - selenium
    - webdriver_manager.chrome
- R version 3.6.1 and R packages:
    - knitr==1.29
    - readr==1.3.1
    - tidyverse==1.3.0
    - docopt
    - broom==0.7.1
    - infer==0.5.3
    - cowplot==1.1.0
    - ggplot2
    - kableExtra
    - webshot
    - magick
    - ggthemes
- GNU make 4.2.1
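
Before running the analysis, it may help to confirm that the pinned Python packages match what is installed. The sketch below is one possible check, not something this repository ships. It covers only the Python packages listed with an explicit version, and the helper `find_mismatches` is a hypothetical name.

```python
from importlib import metadata

# Pins copied from the list above; only packages with an explicit version are checked.
PYTHON_PINS = {"docopt": "0.6.2", "pandas": "1.1.4"}


def find_mismatches(pins, get_version=metadata.version):
    """Return {package: (wanted, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PYTHON_PINS).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

The script prints nothing when every pinned version matches. The R packages would need a separate check, for example with `packageVersion()` in an R session.
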
# License
The materials for this analysis of the COVID-19 response ratio in Canada
and the USA are licensed under the MIT License (Copyright (c) 2020 Master of
Data Science at the University of British Columbia). If you
re-use or re-mix the analysis or the materials used in this project,
please provide attribution and a link to this repository.
# References