######################################################################################################################
### Author: Anni Norring ###
### Date: April 2018 ###
### Content: This script contains the R code for the 2nd week of Advanced R programming course ###
######################################################################################################################
# Access all the needed libraries:
######################################################################################################################
### ERROR HANDLING AND GENERATION
######################################################################################################################
######################################################################################################################
## WHAT IS AN ERROR?
######################################################################################################################
#Errors most often occur when code is used in a way that it is not intended to be used. For example adding two
# strings together produces the following error:
"hello" + "world"
#The + operator is essentially a function that takes two numbers as arguments and finds their sum. Since neither
# "hello" nor "world" is a number, the R interpreter produces an error. Errors will stop the execution of your
# program, and they will (hopefully) print an error message to the R console.
#There are two other constructs in R which are both related to errors: warnings and messages. Warnings are meant
# to indicate that something seems to have gone wrong in your program which should be inspected. Here's a simple
# example of a warning being generated:
as.numeric(c("5", "6", "seven"))
#The as.numeric() function attempts to convert each string in c("5", "6", "seven") into a number. However, it is
# impossible to convert "seven", so a warning is generated. Execution of the code is not halted, and an NA is
# produced in place of "seven" instead of a number.
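#As a minimal sketch of what this means in practice (the variable names converted and n_failed are only
# illustrative), the result can be stored and inspected with is.na() to find which elements failed to convert.
# suppressWarnings() is used here only to keep the console quiet while checking:
converted <- suppressWarnings(as.numeric(c("5", "6", "seven")))
converted
is.na(converted)
n_failed <- sum(is.na(converted))
n_failed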
#Messages simply print text to the R console, though they are generated by an underlying mechanism that is similar to
# how errors and warnings are generated. Here's a small function that will generate a message:
f <- function(){
  message("This is a message.")
}
f()
######################################################################################################################
## GENERATING ERRORS
######################################################################################################################
#There are a few essential functions for generating errors, warnings, and messages in R. The stop() function will
# generate an error. Let's generate an error:
stop("Something erroneous has occurred!")
#If an error occurs inside of a function then the name of that function will appear in the error message:
name_of_function <- function(){
  stop("Something bad happened.")
}
name_of_function()
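#To look at the pieces of an error without halting the script, the condition object can be captured. The following
# is only a sketch: failing_function and caught are hypothetical names, and tryCatch() is explained in more detail
# further below.
failing_function <- function(){
  stop("Something bad happened.")
}
caught <- tryCatch(failing_function(), error = function(e) e)
conditionMessage(caught)   # the text that was passed to stop()
conditionCall(caught)      # the call in which the error was raised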
#The stopifnot() function takes a series of logical expressions as arguments and if any of them are false an error
# is generated specifying which expression is false. Let's take a look at an example:
error_if_n_is_greater_than_zero <- function(n){
  stopifnot(n <= 0)
  n
}
error_if_n_is_greater_than_zero(5)
#The warning() function creates a warning, and the function itself is very similar to the stop() function. Remember
# that a warning does not stop the execution of a program (unlike an error).
warning("Consider yourself warned!")
#Just like errors, a warning generated inside of a function will include the name of the function it was generated
# in:
make_NA <- function(x){
  warning("Generating an NA.")
  NA
}
make_NA("Sodium")
#Messages are simpler than errors or warnings: they just print strings to the R console. You can issue a message with
# the message() function:
message("In a bottle.")
######################################################################################################################
## WHEN TO GENERATE ERRORS AND WARNINGS?
######################################################################################################################
#Stopping the execution of your program with stop() should only happen in the event of a catastrophe - meaning only
# if it is impossible for your program to continue. If there are conditions that you can anticipate that would
# cause your program to create an error then you should document those conditions so whoever uses your software is
# aware. Common failure conditions like providing invalid arguments to a function should be checked at the
# beginning of your program so that the user can quickly realize something has gone wrong. Checking function
# inputs in this way is a typical use of the stopifnot() function.
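#A minimal sketch of this kind of input checking, using a hypothetical function mean_of_positive() that is not part
# of the course material:
mean_of_positive <- function(x){
  # Check the arguments before doing any work.
  stopifnot(is.numeric(x), length(x) > 0, all(x > 0))
  mean(x)
}
mean_of_positive(c(2, 4, 6))
# An invalid argument fails right away, and the message names the check that failed. tryCatch() is used here only
# so that the script keeps running:
tryCatch(mean_of_positive("a"), error = function(e) conditionMessage(e))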
#You can think of a function as a kind of contract between you and the user: if the user provides specified arguments
# your program will provide predictable results. Of course it's impossible for you to anticipate all of the
# potential uses of your program, so the results of executing a function can only be predictable with regard to
# the type of the result. It's appropriate to create a warning when this contract between you and the user is
# violated. A perfect example of this situation is the result of as.numeric(c("5", "6", "seven")) which we saw
# before. The user expects a vector of numbers to be returned as the result of as.numeric() but "seven" is coerced
# into being NA, which is not completely intuitive.
#R has largely been developed according to the Unix Philosophy (which is further discussed in Chapter 3) which
# generally discourages printing text to the console unless something unexpected has occurred. Languages that
# commonly run on Unix systems, like C, C++, and Go, are rarely used interactively, meaning that they usually
# underpin computer infrastructure (computers "talking" to other computers). Messages printed to the console are
# therefore not very useful since nobody will ever read them and it's not straightforward for other programs to
# capture and interpret them. In contrast R code is frequently executed by human beings in the R console which
# serves as an interactive environment between the computer and person at the keyboard. If you think your program
# should produce a message, make sure that the output of the message is primarily meant for a human to read. You
# should avoid signaling a condition or the result of your program to another program by creating a message.
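#As a sketch of this advice (count_missing is a hypothetical helper), a function can return its result as a value
# for other code to use and keep the message purely informational for the person at the keyboard:
count_missing <- function(x){
  n <- sum(is.na(x))
  message("Found ", n, " missing value(s).")
  n
}
n_missing <- count_missing(c(1, NA, 3, NA))
# Code that calls the function can silence the message without losing the result:
n_quiet <- suppressMessages(count_missing(c(1, NA, 3, NA)))
n_quiet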
######################################################################################################################
## HOW SHOULD ERRORS BE HANDLED?
######################################################################################################################
#Imagine writing a program that will take a long time to complete because of a complex calculation or because you're
# handling a large amount of data. If an error occurs during this computation then you're liable to lose all of
# the results that were calculated before the error, or your program may not finish a critical task that a program
# further down your pipeline is depending on. If you anticipate the possibility of errors occurring during the
# execution of your program then you can design your program to handle them appropriately.
#The tryCatch() function is the workhorse of handling errors and warnings in R. The first argument of this function
# is any R expression, followed by handler functions which specify what to do when an error or a warning occurs.
# The last argument, finally, specifies an expression that will be evaluated after the main expression no matter
# what, even in the event of an error or a warning.
#Let's construct a simple function I'm going to call beera that catches errors and warnings gracefully.
beera <- function(expr){
  tryCatch(expr,
           error = function(e){
             message("An error occurred:\n", e)
           },
           warning = function(w){
             message("A warning occurred:\n", w)
           },
           finally = {
             message("Finally done!")
           })
}
#This function takes an expression as an argument and tries to evaluate it. If the expression can be evaluated
# without any errors or warnings then the result of the expression is returned and the message Finally done! is
# printed to the R console. If an error or warning is generated then the function that is provided to the error
# or warning argument is called, which here prints a message. Let's try this function out with a few examples.
beera({
  2 + 2
})
beera({
  "two" + 2
})
beera({
  as.numeric(c(1, "two", 3))
})
#Notice that we've effectively transformed errors and warnings into messages.
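#Returning to the long computation described at the start of this section, here is a minimal sketch of how
# tryCatch() can keep one failing step from discarding the rest of the results. The names inputs, safe_sqrt and
# results are hypothetical, and returning NA for a failed step is just one possible choice:
inputs <- list(4, "nine", 16)
safe_sqrt <- function(x){
  tryCatch(sqrt(x),
           error = function(e){
             # Report the problem to the user and return a placeholder value.
             message("Skipping an input: ", conditionMessage(e))
             NA_real_
           })
}
results <- sapply(inputs, safe_sqrt)
results
# The value returned by the handler takes the place of the failed step, so the other results are kept.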
#Now that you know the basics of generating and catching errors you'll need to decide when your program should
# generate an error. My advice to you is to limit the number of errors your program generates as much as possible.
# Even if you design your program so that it's able to catch and handle errors, the error handling process can slow
# down your program considerably. Imagine you wanted to write a simple function that checks if an
# argument is an even number. You might write the following:
is_even <- function(n){
  n %% 2 == 0
}
is_even(768)
is_even("two")
#You can see that providing a string causes this function to raise an error. You could imagine though that you want
# to use this function across a list of different data types, and you only want to know which elements of that
# list are even numbers. You might think to write the following:
is_even_error <- function(n){
  tryCatch(n %% 2 == 0,
           error = function(e){
             FALSE
           })
}
is_even_error(714)
is_even_error("eight")
#This appears to be working the way you intended, however when applied to more data this function will be seriously
# slow compared to alternatives. For example I could do a check that n is numeric before treating n like a number:
is_even_check <- function(n){
  is.numeric(n) && n %% 2 == 0
}
is_even_check(1876)
is_even_check("twelve")
#Notice that because `is.numeric()` comes before the "AND" operator (`&&`), the expression `n %% 2 == 0` is never
# evaluated when `n` is not numeric. This is a programming language design feature called "short circuiting." The
# expression can never evaluate to `TRUE` if the left hand side of `&&` evaluates to `FALSE`, so the right hand
# side is ignored.
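#A small sketch that makes the short circuit visible: the right hand side below would raise an error if it were
# ever evaluated, but with `&&` it never is.
short_circuit <- FALSE && stop("This is never evaluated.")
short_circuit
# The vectorized operator `&` does not short circuit, so both sides are evaluated and the error is raised.
# tryCatch() is used here only so that the script keeps running:
no_short_circuit <- tryCatch(is.numeric("two") & ("two" %% 2 == 0),
                             error = function(e) "error raised")
no_short_circuit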
#To demonstrate the difference in the speed of the code we'll use the microbenchmark package to measure how long it
# takes for each function to be applied to the same data.
library(microbenchmark)
microbenchmark(sapply(letters, is_even_check))
microbenchmark(sapply(letters, is_even_error))
#The error catching approach is nearly 15 times slower here, though exact timings will vary from machine to machine.
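#Both approaches can also be timed in a single call, which puts the results side by side in one table. This is only
# a sketch: check_fn and catch_fn are copies of the two functions above under hypothetical names, and the call is
# skipped if the microbenchmark package is not installed.
check_fn <- function(n) is.numeric(n) && n %% 2 == 0
catch_fn <- function(n) tryCatch(n %% 2 == 0, error = function(e) FALSE)
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  timings <- microbenchmark::microbenchmark(
    check = sapply(letters, check_fn),
    catch = sapply(letters, catch_fn),
    times = 100
  )
  print(timings)
}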
#Proper error handling is an essential tool for any software developer so that you can design programs that are
# error tolerant. Creating clear and informative error messages is essential for building quality software. One
# closing tip I recommend is to put documentation for your software online, including the meaning of the errors
# that your software can potentially throw. Often a user's first instinct when encountering an error is to search
# online for that error message, which should lead them to your documentation!