Bootstrapping is a method to construct empiric distributions of various statistics (mean, confidence intervals, deviation, etc) by using repreated sampling from an observed dataset.

A little magic is why exactly taking random samples like [1,1,2], [3,2,2], [2,1,3], etc is a good idea to approximate properties of a dataset [1,2,3].

Bootstrap originally proposed by Bradley Efron in 1979. For formal introduction see Horowitz chapter in Handbook of Econometrics (2001) and a usage overview by MacKinnon 2006.

Toy example

Bootstrap confidence intervals by Jeremy Orloff and Jonathan Bloom, pp. 4-6 provides the following basic code example for bootstrap. Their full code for this excercise is here.

# Bootstrap
# Adapted from
cat("Example. Empirical boostrap confidence interval for the mean.",'\n')
x = c(30,37,36,43,42,43,43,46,41,42)
n = length(x)
set.seed(1)  # for repeatability

# sample mean
xbar = mean(x)
cat("data mean = ",xbar,'\n')
nboot = 20
# Generate 20 bootstrap samples, i.e. an n x 20 array of
# random resamples from x.
tmpdata = sample(x,n*nboot, replace=TRUE)
bootstrapsample = matrix(tmpdata, nrow=n, ncol=nboot)

# Compute the means xbar*
xbarstar = colMeans(bootstrapsample)

# Compute delta* for each bootstrap sample
deltastar = xbarstar - xbar

# Find the 0.1 and 0.9 quantiles for deltastar
d = quantile(deltastar,c(0.1,0.9))

# Calculate the 80\% confidence interval for the mean.
ci = xbar - c(d[2],d[1])
cat('Bootstrap confidence interval: [',ci,']','\n')

Bootstrap do’s and don’ts by Anna Mikusheva

  • If you have a pivotal statistic, bootstrap can give a refinement. So, if you have choice of statistics, bootstrap a pivotal one.

  • Bootstrap may fix a finite-sample bias, but cannot help if you have inconsistent estimator.

  • In general, if something does not work with traditional asymptotics, the
    bootstrap cannot fix your problem. For example, if we have an inconsistent estimate, the bootstrap bias correction does not fix anything.

  • Bootstrap could not fix the following problems: weak instruments, parameter on a boundary, unit root, persistent regressors.

  • Bootstrap requires re-centering (the null hypothesis to be true).

Source: MIT lecture notes

Editor notes