Due Friday Nov. 4, 10 am
Consider the following estimates of the variance of a set of numbers. The results depend on whether the magnitude of the numbers is large or small. You can assume that for a vector w,
set.seed(1)
dg <- function(x) formatC(x, 20, format = 'g')
z <- rnorm(100, 0, 1)
x <- z + 1e12
## Calculate the empirical variances
dg(var(z))
[1] "0.80676208969370799551"
dg(var(x))
[1] "0.8067583587735590589"
Explain why these two estimates agree to only a small number of decimal places and which of the two is the more accurate answer, when mathematically the variance of z and the variance of x are exactly the same (since x is just the addition of a constant to z).
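As a hedged illustration of the floating point issue at play (a sketch, not the full answer), the following R code looks at how finely doubles are spaced near 1e12 and how much each element of z is perturbed when the constant is added. The names ulp and err are introduced here for illustration only, and the spacing calculation assumes standard IEEE 754 double precision.

```r
set.seed(1)
z <- rnorm(100, 0, 1)
x <- z + 1e12

# 1e12 lies between 2^39 and 2^40, so adjacent doubles there are 2^(39 - 52) apart
ulp <- 2^(39 - 52)

# Subtracting the constant back recovers the stored (rounded) values exactly;
# the difference from z is the rounding error introduced by the addition.
err <- (x - 1e12) - z

print(ulp)             # spacing between representable numbers near 1e12
print(max(abs(err)))   # largest perturbation of any element, at most ulp / 2
```

Comparing the size of these perturbations with the size of the deviations of z from its mean gives a sense of how many digits of var(x) can be trusted.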
Consider the following, in which we run into problems when trying to calculate on a computer. Suppose I want to calculate a predictive density for new data (e.g., in a model comparison in a Bayesian context):
If we have a set of samples for
Explain why I should calculate the product in the equation above on the log scale. What is likely to happen if I just try to calculate it directly?
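A small sketch of the kind of problem that can arise, under assumptions that are not in the problem statement: 1000 hypothetical new observations (y_new) and a standard normal density standing in for the likelihood term.

```r
set.seed(1)
y_new <- rnorm(1000)                      # hypothetical new observations

direct <- prod(dnorm(y_new))              # product of 1000 densities, each below 0.4
on_log <- sum(dnorm(y_new, log = TRUE))   # the same quantity on the log scale

print(direct)   # underflows to 0
print(on_log)   # a finite negative number
```

The log-scale sum stays well within the range of a double, while the direct product falls below the smallest representable positive number.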
Here’s a re-expression, using the log scale for the inner quantity,
Consider the log predictive density,
Hint: recall that with the logistic regression example in class, we scaled the problematic expression to remove the numerical problem. Here you can do something similar with the
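One common way to scale an expression of this kind is the log-sum-exp trick: subtract the largest exponent before exponentiating and add it back afterward. The sketch below assumes the quantity of interest has the form of the log of an average of exponentials; log_mean_exp and the vector l of per-sample log-likelihood sums are hypothetical names and values, not part of the assignment.

```r
# Stable log(mean(exp(l))): factor out the largest term before exponentiating
log_mean_exp <- function(l) {
  m <- max(l)
  m + log(mean(exp(l - m)))
}

l <- c(-1000, -1001, -1002)   # hypothetical log-likelihood sums, one per sample

naive  <- log(mean(exp(l)))   # every exp() underflows to 0, so this is -Inf
stable <- log_mean_exp(l)     # finite, close to the largest element of l

print(naive)
print(stable)
```

After the shift, the largest term is exp(0) = 1, so at least one term in the average cannot underflow.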
Experimenting with importance sampling.
Use importance sampling to estimate the mean (i.e.,
Now use importance sampling to estimate the mean of the same truncated t distribution with 3 degrees of freedom, truncated such that
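For orientation, here is a minimal sketch of self-normalized importance sampling for a truncated t distribution with 3 degrees of freedom. The truncation point (bound = -4), the heavy-tailed proposal (a half-Cauchy reflected to the left of the bound), and the sample size are all assumptions chosen for illustration, not the settings specified in the problem.

```r
set.seed(1)
n     <- 1e5
bound <- -4                            # hypothetical truncation point: X < bound

# Proposal g: bound minus a half-Cauchy draw, so every sample lies below bound
x <- bound - abs(rt(n, df = 1))
g <- 2 * dt(x - bound, df = 1)         # proposal density on x < bound

# Target f, known only up to its normalizing constant on x < bound
f <- dt(x, df = 3)

w   <- f / g                           # importance weights
est <- sum(w * x) / sum(w)             # self-normalized estimate of E(X | X < bound)

# Numerical check by direct integration
truth <- integrate(function(u) u * dt(u, df = 3), -Inf, bound)$value / pt(bound, df = 3)

print(c(estimate = est, truth = truth))
hist(w, main = "Importance weights")   # a few very large weights signal trouble
```

Because the proposal here has heavier tails than the target, the weights stay bounded; with a lighter-tailed proposal, it is worth checking the weight histogram for extreme values.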
Extra credit: This problem explores the smallest positive number that R can represent and how R represents numbers just larger than the smallest positive number that can be represented. (Note: if you did this in Python you’d get the same results.)
By experimentation in R, find the base 10 representation of the smallest positive number that can be represented in R. Hint: it’s rather smaller than
Explain how it can be that we can store a number smaller than
Hint: you’ll be working with numbers that are not normalized (i.e., denormalized; numbers that do not have 1 as the fixed number before the decimal point in the floating point representation we discussed in Unit 8).
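One possible way to start the experimentation, sketched under the assumption of standard IEEE 754 double precision: halve a number repeatedly until the next halving gives zero. The variable names x and k are illustrative.

```r
x <- 1
k <- 0
# Keep halving while the result is still distinguishable from zero
while (x / 2 > 0) {
  x <- x / 2
  k <- k + 1
}

print(k)                         # number of halvings performed
print(x)                         # last positive value reached
print(.Machine$double.xmin)      # smallest normalized positive double
print(x < .Machine$double.xmin)  # TRUE: x lies below the normalized range
```

The gap between x and .Machine$double.xmin is the range covered by denormalized numbers.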