From 3c34f35b68999524d4a82e154f9da7feb0ac5aa9 Mon Sep 17 00:00:00 2001
From: ivelasq
1.8 Colophon
bookdown using RStudio. The complete source is available on GitHub.
This version of the book was built with R version 4.4.0 (2024-04-24) and with the packages listed in Table 1.1.
-The set or group we want to survey is known as the population of interest or the target population. The population of interest could be broad, such as “all adults age 18+ living in the U.S.” or a specific population based on a particular characteristic or location. For example, we may want to know about “adults aged 18–24 who live in North Carolina” or “eligible voters living in Illinois.”
-However, a sampling frame with contact information is needed to survey individuals in these populations of interest. If we are looking at eligible voters, the sampling frame could be the voting registry for a given state or area. If we are looking at more board populations of interest, like all adults in the United States, the sampling frame is likely imperfect. In these cases, a full list of individuals in the United States is not available for a sampling frame. Instead, we may choose to use a sampling frame of mailing addresses and send the survey to households, or we may choose to use random digit dialing (RDD) and call random phone numbers (that may or may not be assigned, connected, and working).
+However, a sampling frame with contact information is needed to survey individuals in these populations of interest. If we are looking at eligible voters, the sampling frame could be the voting registry for a given state or area. If we are looking at more broad populations of interest, like all adults in the United States, the sampling frame is likely imperfect. In these cases, a full list of individuals in the United States is not available for a sampling frame. Instead, we may choose to use a sampling frame of mailing addresses and send the survey to households, or we may choose to use random digit dialing (RDD) and call random phone numbers (that may or may not be assigned, connected, and working).
These imperfect sampling frames can result in coverage error where there is a mismatch between the population of interest and the list of individuals we can select. For example, if we are looking to obtain estimates for “all adults aged 18+ living in the U.S.,” a sampling frame of mailing addresses will miss specific types of individuals, such as the homeless, transient populations, and incarcerated individuals. Additionally, many households have more than one adult resident, so we would need to consider how to get a specific individual to fill out the survey (called within household selection) or adjust the population of interest to report on “U.S. households” instead of “individuals.”
Once we have selected the sampling frame, the next step is determining how to select individuals for the survey. In rare cases, we may conduct a census and survey everyone on the sampling frame. However, the ability to implement a questionnaire at that scale is something only a few can do (e.g., government censuses). Instead, we typically choose to sample individuals and use weights to estimate numbers in the population of interest. We can use a variety of different sampling methods, and more information on these can be found in Chapter 10. This decision of which sampling method to use impacts sampling error and can be accounted for in weighting.
chi_ex2_obs_table
chi_ex3_obs_table
trust_gov_gt %>%
tab_caption("Example of {gt} table with trust in government estimate")
These errata are in the print version. They have been corrected in the online version.
+The word “broad,” not “board,” should be in the following sentence: “If we are looking at more broad populations of interest, like all adults in the United States, the sampling frame is likely imperfect.”
+Stata files have the extension of .dta. There are a few instances of using the function read_dat() instead of read_dta() on Page 315.
## # A tibble: 2,724 × 90
## SERIALNO SPORDER AGEP PUMA ST SEX HISP RAC1P WGTP PWGTP
## <chr> <dbl> <dbl> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>
-## 1 2022HU0937941 1 60 01302 37 2 01 1 132 132
-## 2 2022HU0937941 2 61 01302 37 1 01 1 132 107
-## 3 2022HU0938759 1 44 01301 37 1 01 1 60 61
-## 4 2022HU0938759 2 48 01301 37 2 01 1 60 63
-## 5 2022HU0938759 3 19 01301 37 1 01 1 60 107
-## 6 2022HU0938759 4 16 01301 37 2 01 1 60 50
-## 7 2022HU0938759 5 12 01301 37 2 01 1 60 84
-## 8 2022HU0939904 1 53 01302 37 1 01 1 104 104
-## 9 2022HU0939904 2 53 01302 37 1 01 1 104 101
-## 10 2022HU0941348 1 70 01301 37 1 01 1 77 77
+## 1 2022HU0427307 1 69 01301 37 2 01 1 104 104
+## 2 2022HU0431707 1 62 01301 37 2 01 2 145 145
+## 3 2022HU0432012 1 48 01301 37 1 01 6 63 63
+## 4 2022HU0432210 1 70 01301 37 2 01 1 102 101
+## 5 2022HU0432210 2 69 01301 37 1 01 1 102 74
+## 6 2022HU0432778 1 74 01302 37 2 01 1 123 124
+## 7 2022HU0432967 1 26 01301 37 2 01 6 22 23
+## 8 2022HU0432967 2 35 01301 37 2 01 2 22 65
+## 9 2022HU0432967 3 28 01301 37 2 01 6 22 25
+## 10 2022HU0432967 4 25 01301 37 2 01 1 22 48
## # ℹ 2,714 more rows
## # ℹ 80 more variables: PWGTP1 <dbl>, PWGTP2 <dbl>, PWGTP3 <dbl>,
## # PWGTP4 <dbl>, PWGTP5 <dbl>, PWGTP6 <dbl>, PWGTP7 <dbl>,
diff --git a/index.html b/index.html
index 7b87de5..dff007d 100644
--- a/index.html
+++ b/index.html
@@ -23,7 +23,7 @@
-
+
@@ -507,6 +507,7 @@
E Corrections & Remarks
References
@@ -538,7 +539,7 @@
Exploring Complex Survey Data Analysis Using R
A Tidy Introduction with {srvyr} and {survey}
-2024-11-11
+2024-12-31