CS计算机代考程序代写 Hive Homework 3: STA465/ STA2016
Homework 3: STA465/ STA2016
Homework 3 is due on Friday, March 12th at 23:59 EST. The homework assignment is worth 20 points in total.
Question 1 (20 pts)
For this question, we will use the Pennsylvania data set we have already covered in class. More information on the data and an analysis can be found here: https://journal.r-project.org/archive/2018/RJ-2018-036/RJ- 2018-036.pdf
library(SpatialEpi) library(tidyverse) library(sf) library(spdep)
data(pennLC)
population <- pennLC$data$population
cases <- pennLC$data$cases
n.strata <- 16
E <- expected(population, cases, n.strata)
d <- aggregate(x = pennLC$data$cases, by = list(county = pennLC$data$county), FUN = sum)
# from spatial polygon to simple feature
pennLC.sf <- st_as_sf(pennLC$spatial.polygon)
pennLC.sf$county <- d$county
pennLC.sf$counts <- d$x
pennLC.sf$E <- E[match(pennLC.sf$county, unique(pennLC$data$county))]
pennLC.sf <- merge(pennLC.sf, pennLC$smoking, by.x = "county", by.y = "county") pennLC.sf <- pennLC.sf%>%
mutate(SIR = counts/E)
The BYM model for the data:
Yi|θi ∼ Po(Eiθi) log(θi)=β0 +β1 ×smokingi +ui +vi
j∼iuj 1 ui|uj∼i∼N d ,d τ
i,i
1 vi∼N 0,τv
i,i u
where β0 is the intercept, β1 relates to the effect of proportion of smokers on lung cancer, τu and τv are the precision of the spatial and unstructured random effect, respectively. The expected counts per county are denoted by Ei and total number of neighbors for location ni is denoted as di,i.
1
Question 1.1
What is the geometry type and CRS of the Pennsylvania lip cancer data set (pennLC.sf)? Display counts, E, smoking, and SIR in a four panel plot.
Question 1.2
We will fit a Besag-York-Mollié model to the data, but first do a few prior predictive checks. The ICAR structure is an improper and non-generative prior for the spatial random effect of the BYM model. However, we can impose a sum-to-zero constraint and use an eigenvalue decomposition to be able to generate data from the ICAR prior.
Use the following two sets of priors to simulate 100 data sets from the prior predictive distribution of the BYM model:
1. β0 ∼ N(0,1), β1 ∼ N(0,1), τv ∼ Gamma(1,1), τu ∼ Gamma(1,1)
2. β0 ∼ N (0, 100), β1 ∼ N (0, 100), τv ∼ Gamma(0.1, 0.1), τu ∼ Gamma(0.1, 0.1)
For counties centre, monroe, westmoreland, lebanon, beaver, and erie, make a histogram of the prior predictive draws and add a vertical line at the observed value.
Question 1.3
For prior sets 1 and 2 given in Question 1.2, create maps of 4 different prior predictive draws for lung cancer counts in Pennsylvania. You should have a total of 8 plots.
Question 1.4
Fit the model in INLA using the two different sets of priors in Question 2.2 and using the default priors specified in INLA (a total of 3 models). Organize the results of the estimated parameters (mean and 95% credible intervals) in a table. Include R code used.
## Example code for the BYM model — modify as needed
library(INLA)
## Values of E_i and neighborhood structure
E.penn <- pennLC.sf$E
neighbor.penn <- poly2nb(pennLC.sf)
nb2INLA("npenn.adj", neighbor.penn)
g <- inla.read.graph(filename = "npenn.adj")
pennLC.sf$re_u <- 1:nrow(pennLC.sf) pennLC.sf$re_v <- 1:nrow(pennLC.sf)
## priors specified through ’hyper’ and ’control.fixed’ arguments
formula <- counts ~ smoking +
f(re_u, model = "besag", graph = g,
hyper = list(prec = list(prior = "loggamma",param = c(0.01, 0.01)))) + f(re_v, model = "iid",
2
hyper = list(prec = list(prior = "loggamma",param = c(0.01, 0.01))))
res <- inla(formula, family = "poisson", data = pennLC.sf, E = E.penn, control.predictor = list(compute = TRUE),
control.fixed = list(mean.intercept = 0, prec.intercept = 1,
mean = 0,
prec = 1))
Question 1.5
Create two maps of the results for each fitted model.
• Map 1: mean expected count of lung cancer cases across Pennsylvania.
• Map 2: standard deviation of expected counts of lung cancer cases across Pennsylvania
Question 1.6
Simulate 100 data sets from the posterior predictive distribution per each of the three fitted models. For counties centre, monroe, westmoreland, lebanon, beaver, and erie, make a histogram of the posterior predictive draws and add a vertical line at the observed value.
Question 1.7
Per each of the three fitted models in Question 1.4, create maps of 4 different posterior predictive draws for lung cancer counts in Pennsylvania (a total of 12 maps).
Optional: Question 2 (bonus 1 pt)
Letτu =1,andshowthatfortheICARmodelforaspatialrandomeffect,u,wehavethat-1u⊤[D−W]u
## Example code for Map 1:
res$summary.fitted.values[,"mean"]
## Example code for Map 2:
res$summary.fitted.values[,"sd"]
is equal to -1 (ui − uj)2. 2 j∼i
2
3