Simulation studies
Overview
Today's in-class discussion and participation assignment focus on Section 4, "Numerical studies," of the paper by Cao et al., which you read for Problem Set 5.
Specifically, we’ll discuss the goals of the simulation study, the choices the authors made in designing their data generating mechanisms, how those choices may affect the results of the study, and interpretation of those results as presented in Table 1 on page 765.
Schedule
- First, to get us all refreshed on the context, I’ll give a brief overview of their method. ~5-10 minutes
- Please form groups of 2-3 students. Open a shared Google Doc, giving access to everyone in your group.
- Discuss how the authors generate their simulated datasets, as described in Section 4.1 of the paper. In your Google Doc, have one group member write down pseudo-code to generate a single dataset. ~5-10 minutes
- They use the multivariate normal (MVN) distribution for part of their simulation. Given a specific covariance matrix, can you show how to efficiently generate MVN data with that covariance structure using the Cholesky decomposition (described in Unit 10)?
- Next, discuss the specific choices the authors made in designing their simulated datasets. As you discuss, write down those choices in your Google Doc. ~10 minutes
- As a group, pick the two most important choices and justify why you picked them in the Google Doc. ~5 minutes
- With those choices in mind, brainstorm and write how you might simulate the data differently. ~5-10 minutes
- Finally, each group member should export the Google Doc to PDF and submit it for the “Lab 8: Simulation studies” Gradescope assignment.
- For the last 10–15 minutes, we’ll come back together and have a class-wide discussion.
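For the Cholesky question in the schedule above, here is one possible sketch to compare against your group's answer. The idea is that if Sigma = L L^T with L lower triangular and z is a vector of independent standard normals, then x = mean + L z has covariance L I L^T = Sigma. The function names `cholesky_lower` and `rmvnorm`, the seed, and the AR(1)-style covariance with rho = 0.5 are all assumptions made for illustration; they are not taken from the paper. The code uses only the Python standard library so that the factorization is visible. In practice you would more likely call a library routine such as `numpy.linalg.cholesky`.

```python
import math
import random


def cholesky_lower(sigma):
    """Return lower-triangular L (list of lists) with L L^T = sigma.

    Assumes sigma is symmetric positive definite.
    """
    p = len(sigma)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = sigma[i][i] - s
                if d <= 0:
                    raise ValueError("sigma is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]
    return L


def rmvnorm(n, mean, sigma, rng):
    """Draw n vectors from N(mean, sigma) via x = mean + L z, z ~ N(0, I)."""
    p = len(mean)
    L = cholesky_lower(sigma)  # factor once, reuse for every draw
    draws = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]
        # L is lower triangular, so row i only needs z[0..i]
        x = [mean[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(p)]
        draws.append(x)
    return draws


# Hypothetical covariance for illustration only (not the paper's setting):
# sigma[i][j] = rho ** |i - j| with rho = 0.5.
rho = 0.5
p = 3
sigma = [[rho ** abs(i - j) for j in range(p)] for i in range(p)]

rng = random.Random(2024)  # fixed seed so the draws are reproducible
X = rmvnorm(5000, [0.0] * p, sigma, rng)
```

The efficiency comes from factoring the covariance matrix once: the cubic-cost decomposition is paid a single time, and each additional draw needs only a triangular matrix-vector product. A quick sanity check is to compare the sample covariance of `X` with `sigma`.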