Session 10: Standard error, Bootstrapping and Confidence intervals

This week we will continue the statistics training by studying three important concepts: standard error (Video 1), bootstrapping (Video 2), and confidence intervals (Video 3).

Start by watching Video 1 (The standard error, Clearly Explained!!!).

  1. In the video, Josh explains that the standard error is the standard deviation of sample means drawn from the same population. In other words, if we repeatedly take samples from a population, we can compute the standard error by calculating the standard deviation of those sample means. He illustrates this idea with an example starting at 2:44, in which samples of mouse weight measurements are taken (although this may not be the most well-considered example, since the normal curve is centered at zero, implying that 50% of the mice would have negative weights). In the example, the sampling is repeated only three times, which hardly qualifies as “a bunch of times.” Your task is to replicate this analysis, but instead of taking three samples, take 10,000. Use the visuals and table in the example to determine the population parameters (speaking of well-considered examples, would you say that the population normal curve in the video aligns with the characteristics of the drawn samples?). Calculate the standard error of the mean. If this exercise feels similar to something we did when learning about variance and standard deviation, you are on the right track.
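
A minimal sketch of the idea in Exercise 1. The population parameters here (mean 20, sd 10) and the per-sample size (5 mice) are placeholders; swap in the values you read off the video's table and curve:

```r
library(tidyverse)

# Draw 10,000 samples of 5 mice each from an assumed population
# (mean = 20, sd = 10) and record each sample's mean
sample_means <- map_dbl(1:10000, \(i) mean(rnorm(n = 5, mean = 20, sd = 10)))

# The standard deviation of the sample means is the standard error
sd(sample_means)
```

With 10,000 repetitions the result should land close to the theoretical value, sd / sqrt(n).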

  2. When taking samples to calculate the standard error (and in other situations when working with simulations or random sampling), why is it important to do so “a bunch of times”, and how many times are enough?

library(tidyverse)

# Numbers of samples to try: 10^1 up to 10^5 on a log10 grid
log10_seq <- as.integer(10 ^ (seq(1, 5, .1)))

log10_seq_names <- set_names(log10_seq)

# For each number of samples i, draw i samples of 20 values each,
# compute the mean of every sample, and then take the standard
# deviation of those means -- the standard error
map(log10_seq_names, \(i) {
  map(1:i, \(x) tibble(values = rnorm(n = 20, mean = 20, sd = 10))) |>  
    list_rbind(names_to = "sample")
}) |> 
  list_rbind(names_to = "n_samples") |>  
  group_by(n_samples, sample) |>  
  summarise(m = mean(values)) |>  
  group_by(n_samples) |> 
  summarise(sterr = sd(m)) |> 
  mutate(n_samples = parse_number(n_samples))

  1. Using logic similar to that of the two previous exercises, create a simple simulation study in which you investigate the relationship between the standard error of the mean and sample size, while keeping the population mean and standard deviation constant. Vary the sample size across a reasonable range and compute the standard error for each case. Visualize your results and explain what you observe. Hint: First, create a vector of sample sizes that you want to investigate. Then, use nested map() calls as in Exercise 2. In this case, set 1:10000 as the first argument in the inner map() function and use n = i in rnorm().
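
A sketch of one way to set this up, under the same assumed population as before (mean 20, sd 10); the particular sample sizes chosen here are arbitrary:

```r
library(tidyverse)

# Sample sizes to investigate (an arbitrary, illustrative range)
sample_sizes <- set_names(c(5, 10, 20, 50, 100, 200))

# For each sample size i, draw 10,000 samples and record their means,
# then summarise the spread of those means per sample size
se_by_n <- map(sample_sizes, \(i) {
  tibble(m = map_dbl(1:10000, \(x) mean(rnorm(n = i, mean = 20, sd = 10))))
}) |>
  list_rbind(names_to = "sample_size") |>
  group_by(sample_size) |>
  summarise(sterr = sd(m)) |>
  mutate(sample_size = parse_number(sample_size))

# The standard error should shrink as the sample size grows
ggplot(se_by_n, aes(sample_size, sterr)) +
  geom_point() +
  geom_line()
```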

Now watch Video 2 (Bootstrapping Main Ideas!!!)

  1. We will now use our R skills to calculate the standard error in three different ways: via simulation, via bootstrapping, and via the formula. To practice this we will use data from the palmerpenguins package, a dataset you are already familiar with. Once the package is loaded you will be able to work with the data by referencing the object penguins. Fill in the blanks in the template below.
boot <- function(data){
  data |>
    slice_sample(prop = 1, replace = _____)
}

Adelie_no_missing <- penguins |>
  filter(species == _____, 
         !is.na(_____)) 

Adelie_boot <- map(1:10000, \(i) boot(_____)) |>
  list_rbind(names_to = "iteration")

  1. Calculate the correlation between bill length and body mass in Adelie penguins (hint: use summarize() and cor()).
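
If you get stuck, one possible sketch is below. It assumes the missing values in both bill_length_mm and body_mass_g have been filtered out, which is one plausible way of completing the template above:

```r
library(tidyverse)
library(palmerpenguins)

# Correlation between bill length and body mass among Adelie penguins,
# after dropping rows with missing values in either variable
penguins |>
  filter(species == "Adelie",
         !is.na(bill_length_mm),
         !is.na(body_mass_g)) |>
  summarize(r = cor(bill_length_mm, body_mass_g))
```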

We are ready to start thinking about confidence intervals. Check out Video 3 (Confidence Intervals, Clearly Explained!!!)

Josh explains that a 95% confidence interval is just an interval that covers 95% of the (bootstrap) means, or whatever statistic we decide to focus on.

  1. Use bootstrapping to calculate the 95% confidence interval of the mean bill length in Adelie penguins. To find the interval that covers 95% of the bootstrap means, use the summarize() and quantile() functions.
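
A hedged sketch of the whole pipeline, with the blanks from the bootstrap template filled in one plausible way (resampling with replacement, Adelie penguins, missing bill lengths removed):

```r
library(tidyverse)
library(palmerpenguins)

# Adelie penguins with non-missing bill lengths
adelie <- penguins |>
  filter(species == "Adelie", !is.na(bill_length_mm))

# 10,000 bootstrap resamples; record the mean bill length of each
boot_means <- map(1:10000, \(i) {
  adelie |>
    slice_sample(prop = 1, replace = TRUE) |>
    summarize(m = mean(bill_length_mm))
}) |>
  list_rbind(names_to = "iteration")

# The 2.5% and 97.5% quantiles bound the interval that covers
# 95% of the bootstrap means
boot_means |>
  summarize(lower = quantile(m, 0.025),
            upper = quantile(m, 0.975))
```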
  1. We have defined a 95% confidence interval as an interval that contains 95% of the means calculated when bootstrapping the sample. While this is correct and provides an intuitive way to think about confidence intervals, the formal definition is somewhat different (and more general). We do not want you to get bogged down in the minutiae of definitions here, but it might be useful to look at a visualization that illustrates the more general way of defining confidence intervals. Check out the visualization found here: https://rpsychologist.com/d3/ci/. Can you infer the definition of confidence intervals from this visualization?