# Learning from Failure
Learning how data analyses succeed or fail (but more importantly, fail) is extremely challenging without actually going through the process yourself. I don’t think I ever learned about it except through firsthand experience, which took place over the course of years. There are a few reasons for this that I have observed over time:
* Success in scientific data analysis is usually judged by whether the claims made based on the results are *true* or not. If the results feel true and the analysis appears rigorous, that is usually the end of the discussion; attention turns to the result and what should come next. The underlying idea here is not necessarily misguided: progress in science depends on independent replication, and no single analysis can be assigned too much weight.
* When analyses fail, the results are usually vague and confusing. Furthermore, the public rarely finds out about them because they are not published. This is mostly due to human nature: it’s difficult to motivate oneself to write about an experience that was inconclusive and perhaps incoherent. It can also be embarrassing if honest mistakes were made. Publication of negative studies is a separate matter, because a truly negative study is, in fact, conclusive. But often, we don’t even have that much clarity.
* In the rare cases where we do find out about data analysis failures, the focus is often on who or what is to blame. In cases involving criminal activity, this is an important question. However, identifying who or what is to blame usually doesn’t provide generalizable knowledge that we can apply to our own data analyses. The underlying assumption of assigning blame is that the failure was a unique situation that could never have happened had the individual to blame not been involved. Occasionally, a clear bug in some software leads to erroneous results. Fixing the bug will "fix" the results, but even then it’s not always clear that the bug is the ultimate cause of the failure (although it is certainly the proximate cause).
The case study presented in the next chapter is useful for thinking about what kinds of generalizable knowledge we can obtain from data analysis failures. This case is special because it had serious implications and large parts of it played out in public. While we will likely never know *all* of the details, we know enough to have a meaningful discussion about the lessons learned.