Markov-Chain Monte Carlo

If you’re in the business of sampling from a distribution that you only know up to a normalization constant, Markov-Chain Monte Carlo (and the Metropolis algorithm) is for you. The Metropolis algorithm is named after a scientist (Nicholas Metropolis), not the adopted hometown of a famous illegal alien, but it can still leap unruly distributions in a (burn-in + n) bound. In particular, Bayes’ Theorem gives us an unnormalized distribution we would like to sample from.
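To spell out why an unnormalized density suffices: write the target as \(p(x) = \tilde p(x)/Z\) with unknown normalization \(Z\). The Metropolis-Hastings acceptance probability is \[\alpha = \min\left(1, \frac{\tilde p(y)\, q(x|y)}{\tilde p(x)\, q(y|x)}\right),\] and \(Z\) cancels in the ratio, so the algorithm never needs it. This is exactly the situation of a Bayesian posterior, where \(\tilde p(\theta) = p(\mathrm{data}|\theta)\, p(\theta)\) is computable but the evidence \(Z = \int \tilde p(\theta)\, d\theta\) usually is not.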

MCMC
Our first very simple example: we will use Markov chain Monte Carlo to draw samples from a standard normal distribution using a proposal distribution \[q(\cdot | X) \sim N(X, \sigma^2).\]

library(ggplot2)

simpleMCMC <- function(st_dev, start, num_total){
  X <- numeric(num_total)   # preallocate the chain
  X[1] <- start
  for (t in 2:num_total){
    # propose from q(. | X[t-1]) ~ N(X[t-1], st_dev^2)
    y <- rnorm(1, mean = X[t-1], sd = st_dev)
    u <- runif(1)
    # Metropolis-Hastings acceptance ratio; the target is a standard normal
    numerator   <- dnorm(y) * dnorm(X[t-1], mean = y, sd = st_dev)
    denominator <- dnorm(X[t-1]) * dnorm(y, mean = X[t-1], sd = st_dev)
    alpha <- min(1, numerator/denominator)
    # accept the proposal with probability alpha, otherwise stay put
    X[t] <- y*(u <= alpha) + X[t-1]*(u > alpha)
  }
  X
}
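A quick diagnostic worth running on any chain is the acceptance rate. A minimal sketch (`acceptance_rate` is a hypothetical helper, not part of the algorithm above):

```r
# A rejected proposal repeats the previous state, and a continuous proposal
# essentially never repeats a value exactly, so consecutive duplicates in
# the chain mark rejections.
acceptance_rate <- function(chain) {
  mean(diff(chain) != 0)
}

acceptance_rate(c(1.0, 1.0, 1.7, 1.7, 2.1))  # 2 accepts out of 4 moves: 0.5
```

A very small proposal standard deviation gives a high acceptance rate but slow exploration; a very large one gives frequent rejection, which is visible in the trace plots below.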

Now, we will test this out with different values of the standard deviation.

\(\sigma=0.5\)

X = data.frame(x = simpleMCMC(0.5,-10,500), t = 1:500)
ggplot(X,aes(x=t, y=x)) + geom_point() + geom_line()

(Figure: trace plot of the chain with \(\sigma=0.5\), started at \(-10\).)

\(\sigma=0.1\)

X = data.frame(x = simpleMCMC(0.1,0,500), t = 1:500)
ggplot(X,aes(x=t, y=x)) + geom_point() + geom_line()

(Figure: trace plot of the chain with \(\sigma=0.1\).)

\(\sigma=10\)

X = data.frame(x = simpleMCMC(10,0,500), t = 1:500)
ggplot(X,aes(x=t, y=x)) + geom_point() + geom_line()

(Figure: trace plot of the chain with \(\sigma=10\).)

Let’s show that we get the correct answer.

X = data.frame(x = simpleMCMC(0.5,0,50000), t = 1:50000)
X <- X[10000:50000,]
mean(X$x)
## [1] -0.02637
sd(X$x)
## [1] 1.008
quantile(X$x)
##       0%      25%      50%      75%     100% 
## -4.37686 -0.70038 -0.02983  0.63757  3.81768
quantile(rnorm(500))
##       0%      25%      50%      75%     100% 
## -3.73404 -0.59400 -0.02245  0.58701  2.63172
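Beyond eyeballing the quantiles, one can run a distributional test on the retained samples. A sketch: the chain is autocorrelated and rejected proposals duplicate values (so `ks.test` will warn about ties), so treat the p-value as a rough diagnostic rather than an exact test.

```r
# Kolmogorov-Smirnov test of the retained samples (X$x from above) against
# the standard normal CDF; the statistic D should be small if mixing is good.
ks.test(X$x, "pnorm")
```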

Can we sample from a uniform distribution by using a normal proposal?

simpleMCMC_unif <- function(st_dev, start, num_total){
  X <- numeric(num_total)   # preallocate the chain
  X[1] <- start
  for (t in 2:num_total){
    y <- rnorm(1, mean = X[t-1], sd = st_dev)
    u <- runif(1)
    # target is Unif(0,1): dunif is zero outside [0,1], so any proposal
    # outside the support is rejected automatically
    numerator   <- dunif(y) * dnorm(X[t-1], mean = y, sd = st_dev)
    denominator <- dunif(X[t-1]) * dnorm(y, mean = X[t-1], sd = st_dev)
    alpha <- min(1, numerator/denominator)
    X[t] <- y*(u <= alpha) + X[t-1]*(u > alpha)
  }
  X
}
X = data.frame(x = simpleMCMC_unif(0.1,0,5000), t = 1:5000)
ggplot(X[1:500,],aes(x=t, y=x)) + geom_point() + geom_line()

(Figure: trace plot of the first 500 samples of the uniform-target chain.)

X = X[1000:5000,]
ggplot(X,aes(x=x)) + geom_histogram()
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.

(Figure: histogram of samples 1000 through 5000, approximately uniform on \([0,1]\).)
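The trace above also shows that successive MCMC samples are strongly correlated. The autocorrelation function makes this visible, and thinning (keeping every \(k\)-th sample) is a simple, if blunt, way to reduce it; a sketch using the chain `X` from above:

```r
# Autocorrelation of the chain; slow decay indicates slow mixing.
acf(X$x, lag.max = 100)

# Thinning: keep every 10th sample.
thinned <- X$x[seq(1, length(X$x), by = 10)]
```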

Calibration using MCMC

Let’s consider the simple simulation model \[v_\mathrm{sim} = g t,\] where \(g\) is the acceleration due to gravity and \(t\) is the time since the object was dropped from rest. The acceleration due to gravity is an uncertain parameter in our model.

Now let’s suppose that the “real” model that gives us our observations is \[ v_\mathrm{obs} = 9.81\,(t - 0.1 \sin(\pi t)).\]

Our simulation model does not capture all the physics of the observation, but will do a decent job, especially at late times.
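To quantify “decent, especially at late times”: the discrepancy between the two models is \(-0.981\sin(\pi t)\), which is bounded, so the relative error shrinks like \(1/t\). A quick check (using non-integer times, since the discrepancy vanishes exactly at integer \(t\)):

```r
t    <- c(0.5, 2.5, 10.5, 20.5)
vobs <- 9.81 * (t - 0.1 * sin(pi * t))
vsim <- 9.81 * t                # simulation with the nominal g = 9.81
rel  <- (vobs - vsim) / vobs    # relative discrepancy
rel                             # shrinks in magnitude as t grows
```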

We have a measurement device that allows us to measure with an accuracy having a known standard deviation of \(\sigma_\mathrm{obs}=10^{-1}\) m/s. Furthermore, we assume we have no prior knowledge about \(g\), other than that it lies between 9.79 m/s\(^2\) and 9.82 m/s\(^2\). Based on this, the likelihood of a given value of \(g\), given an observation at \(t\), is \[\hat{p}(g|t) = \exp\left(-\frac{(v_\mathrm{obs} - v_\mathrm{sim})^2}{2 \sigma_\mathrm{obs}^2}\right),\] for \(g \in [9.79,9.82]\), and 0 otherwise.

To use the Metropolis algorithm to sample from the posterior distribution for \(g\) given some observational data, we will need a proposal distribution. For this we will use \[ q(y|x) \sim N(x, 0.05^2),\] that is, a normal centered at the current sample with standard deviation 0.05.

The observational data we are given is at \(t = 0.1, 0.5, 1, 1.5, 2, 5, 10, 20\).

gDF <- data.frame(t = c(0.1, 0.5, 1, 1.5, 2, 5, 10, 20))
gDF$vobs= with(gDF,9.81*(t - 0.1*sin(pi*t)))
limits <- aes(ymax = gDF$vobs + .1, ymin=gDF$vobs - .1)
ggplot(gDF, aes(x=t, y = vobs)) + geom_point() + geom_errorbar(limits) + scale_x_continuous("time (s)") + scale_y_log10("Observed velocity (m/s)") + ggtitle("Observational Data")

(Figure: observed velocities with \(\pm 0.1\) m/s error bars, log-scaled y-axis.)

We will start our chain at \(g=9.81\) and use 1000 burn-in samples. The first thing we will do is define a function to be our \(\hat p\):

phat <- function(g){
  #compute exponent for each point
  exponent <- -(gDF$vobs - g*gDF$t)^2/(2 * 0.1^2)
  #sum of exponent is equal to logarithm of product
  phatS <- exp(sum(exponent))
  if ((g > 9.82) || (g < 9.79)){ phatS = 0}
  phatS
}
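One practical caveat: `phat` multiplies eight Gaussian factors inside a single `exp`, so for a poor value of \(g\) the exponent is a large negative number and \(\hat p\) can underflow to zero. A common safeguard is to work in the log domain and exponentiate only the difference of logs in the acceptance ratio. A sketch (`log_phat` is a hypothetical variant, not used in the code below):

```r
# Log-domain version of phat: returns log p-hat, with -Inf outside the
# prior support [9.79, 9.82]. The Metropolis ratio then becomes
# exp(log_phat(proposed, ...) - log_phat(current, ...)), avoiding underflow.
log_phat <- function(g, t, vobs, sigma = 0.1) {
  if (g < 9.79 || g > 9.82) return(-Inf)
  sum(-(vobs - g * t)^2 / (2 * sigma^2))
}
```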

Our samples are generated by

#do 10000 samples
Nsamps <- 10^4
gSamps <- data.frame(i = 1:Nsamps, g=0*(1:Nsamps)+9.81)
for (i in 1:(Nsamps-1)){
  proposed <- rnorm(1,mean=gSamps$g[i],sd = 0.05)
  u = runif(1)
  numerator = phat(proposed)*dnorm(gSamps$g[i],mean=proposed, sd = 0.05)
  denominator = phat(gSamps$g[i])*dnorm(proposed,mean=gSamps$g[i], sd = 0.05)
  alpha = min(c(1,numerator/denominator))
  gSamps$g[i+1] = proposed*(u<= alpha) + gSamps$g[i]*(u>alpha)
}
ggplot(gSamps,aes(x=i, y = g)) + geom_line() + scale_x_continuous("sample") + scale_y_continuous(expression(g~(m/s^2))) + geom_vline(xintercept = 1000, color = "red", size = 1)

(Figure: trace of the \(g\) samples, with the burn-in cutoff at sample 1000 marked in red.)
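The post-burn-in summary statistics can be computed directly from the retained samples, e.g.:

```r
post <- gSamps$g[1001:Nsamps]   # drop the 1000 burn-in samples
mean(post)
sd(post)
```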

Looking at our distribution, subtracting out the burn-in, we get a mean of \(g=9.8111\) with a standard deviation of 0.0041. The histogram looks like

ggplot(gSamps[1001:Nsamps,],aes(x=g)) + geom_histogram(fill="white",color="blue")
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
## Warning: position_stack requires constant width: output may be incorrect

(Figure: histogram of the post-burn-in posterior samples of \(g\).)


