
Regression to the mean

Introduction

Table of contents

  1. Section 1
  2. Section 2


1. Section 1

Statistical model:

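For individuals i = 1, …, n and runs t = 1, 2, let us assume the following statistical model for a hypothetical 1-mile run time:

\[\textrm{run}_{i,t} = \beta_0 + \eta_i + \epsilon_{i,t} \textrm{ where } \eta_i \sim N( 0, \sigma_{\eta} ) \textrm{ and } \epsilon_{i,t} \sim N( 0, \sigma_{\epsilon} ).\]

Here \(\beta_0\) is the mean run time, \(\eta_i\) is a subject-level deviation that stays constant across both runs, and \(\epsilon_{i,t}\) is residual variation specific to each run. The second argument of N( 0, · ) is a standard deviation, matching the rnorm calls in the code.
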
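
A short derivation may help show why this model produces regression to the mean even without any intervention effect. It is a sketch under the model above, writing \(\sigma_{\eta}\) and \(\sigma_{\epsilon}\) for the two standard deviations (sigma_eta and sigma_epsilon in the code); the symbol \(\rho\) is shorthand introduced here and does not appear in the original. Because \(\eta_i\) is shared across runs and the residuals are independent:

\[\textrm{Var}( \textrm{run}_{i,t} ) = \sigma_{\eta}^2 + \sigma_{\epsilon}^2, \quad \textrm{Cov}( \textrm{run}_{i,1}, \textrm{run}_{i,2} ) = \sigma_{\eta}^2, \quad \rho = \frac{ \sigma_{\eta}^2 }{ \sigma_{\eta}^2 + \sigma_{\epsilon}^2 }.\]

Since the two runs are jointly normal with the same mean and variance, the expected second run given the first is

\[E[ \textrm{run}_{i,2} \mid \textrm{run}_{i,1} ] = \beta_0 + \rho ( \textrm{run}_{i,1} - \beta_0 ).\]

Whenever \(\rho < 1\), the expected second run lies closer to \(\beta_0\) than the first run did. With the values used in this section's code (sigma_eta = .5 and a total standard deviation of 1), \(\rho = 0.25\), so on average only a quarter of a runner's first-run deviation from the mean carries over to the second run.
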
# install.packages( 'dplyr' )
library( dplyr )

# Fix RNG seed for reproducibility
set.seed( 100 )

B0 = 9 # Mean time to run a mile (minutes)
sigma = 1 # Total standard deviation of run times (minutes)

# Standard deviation for subject-level differences
sigma_eta = .5
# Standard deviation for differences between 
# runs (i.e., residual variation)
sigma_epsilon = sqrt( (sigma^2) - sigma_eta^2 )

# Sample size
n = 100

# Initialize data frame
dtf = data.frame(
  ID = 1:n,
  # Subject-level deviations
  eta = NA,
  # Residual deviations 
  epsilon_1 = NA,
  epsilon_2 = NA,
  # Intervention (based on performance on first run)
  Intervention = 'Criticize',
  x = NA, # Dummy code for intervention
  # Observed 1-mile run time (minutes)
  run_1 = NA,
  run_2 = NA,
  stringsAsFactors = FALSE
)

# Simulate subject-level deviations in run time 
# (i.e., individual differences in performance,
#  constant over run 1 and 2)
dtf$eta = rnorm( n, 0, sigma_eta )

# Simulate independent residual variation 
# between run 1 and 2
dtf$epsilon_1 = rnorm( n, 0, sigma_epsilon )
dtf$epsilon_2 = rnorm( n, 0, sigma_epsilon )

# Simulate run times (no effect of intervention)
dtf$run_1 = B0 + dtf$eta + dtf$epsilon_1
dtf$run_2 = B0 + dtf$eta + dtf$epsilon_2

# Specify intervention type based on 
# performance on run 1

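# A run time above B0 is slower than average, so this
# rule labels the slower first-run group 'Praise' and
# leaves the faster group as 'Criticize'
dtf$Intervention[ dtf$run_1 > B0 ] = 'Praise'
# Dummy coded version (1 = 'Praise', 0 = 'Criticize')
dtf$x = as.numeric( dtf$run_1 > B0 )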

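# Correlation between runs 1 and 2
# (i.e., how stable individual differences are
#  across runs; the model implies a value of
#  sigma_eta^2 / sigma^2 = 0.25 in expectation)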
cor( dtf$run_1, dtf$run_2 )

dtf %>% 
  group_by( 
    Intervention
  ) %>% 
  summarise( 
    Mean_change = round( mean( run_2 - run_1 ), 2 ), 
    .groups = 'drop'
  )

#   Intervention Mean_change
# 1 Criticize          0.53
# 2 Praise            -0.48
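
The group differences above arise with no intervention effect in the simulation, and a larger sample can make that easier to see. The following is a minimal sketch, not part of the original analysis: it reuses the same parameter values, but the seed, the sample size `n_big`, and the variable names `rho`, `expected_change`, `observed_change`, and `slope` are assumptions introduced for illustration. It compares the simulated mean change for the `run_1 > B0` group (labeled 'Praise' above) against the value the model implies, which is -(1 - rho) * sigma * sqrt(2 / pi), or about -0.60 for these parameters.

```r
# Sketch: large-sample check of regression to the mean.
# Seed and sample size are illustrative assumptions.
set.seed( 1 )

B0 = 9            # Mean time to run a mile (minutes)
sigma = 1         # Total standard deviation
sigma_eta = .5    # Subject-level standard deviation
sigma_epsilon = sqrt( sigma^2 - sigma_eta^2 )

# Correlation between runs implied by the model
rho = sigma_eta^2 / ( sigma_eta^2 + sigma_epsilon^2 )

# Simulate two runs with no intervention effect
n_big = 1e5
eta = rnorm( n_big, 0, sigma_eta )
run_1 = B0 + eta + rnorm( n_big, 0, sigma_epsilon )
run_2 = B0 + eta + rnorm( n_big, 0, sigma_epsilon )

# Model-implied mean change for runners with run_1 > B0:
# E[ run_1 - B0 | run_1 > B0 ] = sigma * sqrt( 2 / pi ),
# and only a fraction rho of that deviation persists
expected_change = -( 1 - rho ) * sigma * sqrt( 2 / pi )

# Simulated mean change for the same group
observed_change = mean( ( run_2 - run_1 )[ run_1 > B0 ] )

# Slope from regressing run 2 on run 1 (should be near rho)
slope = unname( coef( lm( run_2 ~ run_1 ) )[ 2 ] )

print( round( c(
  rho = rho,
  slope = slope,
  expected_change = expected_change,
  observed_change = observed_change
), 3 ) )
```

With n = 100, as in the simulation above, the group means are noisier, so an observed change of -0.48 is in line with the model-implied value of about -0.60.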


2. Section 2

Content.

# Example R code
