hSDM.ZIB.iCAR.alteration.Rd
The hSDM.ZIB.iCAR.alteration
function can be
used to model species distribution including different processes in a
hierarchical Bayesian framework: (i) a
\(\mathcal{B}ernoulli\) suitability process (refering to
environmental suitability) which takes into account the spatial
dependence of the observations, (ii) an alteration process (refering
to anthropogenic disturbances), and (iii) a
\(\mathcal{B}inomial\) observability process (refering to
various ecological and methodological issues explaining the species
detection). The hSDM.ZIB.iCAR.alteration
function calls a
Gibbs sampler written in C code which uses an adaptive Metropolis
algorithm to estimate the conditional posterior distribution of
hierarchical model's parameters.
hSDM.ZIB.iCAR.alteration(presences, trials, suitability,
observability, spatial.entity, alteration, data, n.neighbors, neighbors,
suitability.pred=NULL, spatial.entity.pred=NULL, burnin = 5000, mcmc =
10000, thin = 10, beta.start, gamma.start, Vrho.start, mubeta = 0, Vbeta
= 1e+06, mugamma = 0, Vgamma = 1e+06, priorVrho = "1/Gamma", shape =
0.5, rate = 0.0005, Vrho.max=1000, seed = 1234, verbose = 1, save.rho =
0, save.p = 0)
A vector indicating the number of successes (or presences) for each observation.
A vector indicating the number of trials for each observation. \(t_i\) should be superior to zero and superior or equal to \(y_i\), the number of successes for observation \(i\).
A one-sided formula of the form \(\sim x_1+...+x_p\) with \(p\) terms specifying the explicative variables for the suitability process.
A one-sided formula of the form \(\sim w_1+...+w_q\) with \(q\) terms specifying the explicative variables for the observability process.
A vector indicating the spatial entity identifier (from one to the total number of entities) for each observation. Several observations can occur in one spatial entity. A spatial entity can be a raster cell for example.
A vector indicating the proportion of area in the spatial cell which is transformed (by anthropogenic activities for example) for each observation. Must be between 0 and 1.
A data frame containing the model's variables.
A vector of integers that indicates the number of
neighbors (adjacent entities) of each spatial
entity. length(n.neighbors)
indicates the total number of
spatial entities.
A vector of integers indicating the neighbors
(adjacent entities) of each spatial entity. Must be of the form
c(neighbors of entity 1, neighbors of entity 2, ... , neighbors of
the last entity). Length of the neighbors
vector should be
equal to sum(n.neighbors)
.
An optional data frame in which to look for variables with which to predict. If NULL, the observations are used.
An optional vector indicating the spatial
entity identifier (from one to the total number of entities) for
predictions. If NULL, the vector spatial.entity
for
observations is used.
The number of burnin iterations for the sampler.
The number of Gibbs iterations for the sampler. Total
number of Gibbs iterations is equal to
burnin+mcmc
. burnin+mcmc
must be divisible by 10 and
superior or equal to 100 so that the progress bar can be displayed.
The thinning interval used in the simulation. The number of mcmc iterations must be divisible by this value.
Starting values for \(\beta\) parameters of the suitability process. This can either be a scalar or a \(p\)-length vector.
Starting values for \(\beta\) parameters of the observability process. This can either be a scalar or a \(q\)-length vector.
Positive scalar indicating the starting value for the variance of the spatial random effects.
Means of the priors for the \(\beta\) parameters
of the suitability process. mubeta
must be either a scalar or a
p-length vector. If mubeta
takes a scalar value, then that value will
serve as the prior mean for all of the betas. The default value is set
to 0 for an uninformative prior.
Variances of the Normal priors for the \(\beta\)
parameters of the suitability process. Vbeta
must be either a
scalar or a p-length vector. If Vbeta
takes a scalar value,
then that value will serve as the prior variance for all of the
betas. The default variance is large and set to 1.0E6 for an
uninformative flat prior.
Means of the Normal priors for the \(\gamma\)
parameters of the observability process. mugamma
must be either
a scalar or a p-length vector. If mugamma
takes a scalar value,
then that value will serve as the prior mean for all of the
gammas. The default value is set to 0 for an uninformative prior.
Variances of the Normal priors for the
\(\gamma\) parameters of the observability
process. Vgamma
must be either a scalar or a p-length
vector. If Vgamma
takes a scalar value, then that value will
serve as the prior variance for all of the gammas. The default
variance is large and set to 1.0E6 for an uninformative flat prior.
Type of prior for the variance of the spatial random
effects. Can be set to a fixed positive scalar, or to an inverse-gamma
distribution ("1/Gamma") with parameters shape
and rate
,
or to a uniform distribution ("Uniform") on the interval
[0,Vrho.max
]. Default set to "1/Gamma".
The shape parameter for the Gamma prior on the precision
of the spatial random effects. Default value is shape=0.05
for
uninformative prior.
The rate (1/scale) parameter for the Gamma prior on the
precision of the spatial random effects. Default value is
rate=0.0005
for uninformative prior.
Upper bound for the uniform prior of the spatial random effect variance. Default set to 1000.
The seed for the random number generator. Default set to 1234.
A switch (0,1) which determines whether or not the progress of the sampler is printed to the screen. Default is 1: a progress bar is printed, indicating the step (in %) reached by the Gibbs sampler.
A switch (0,1) which determines whether or not the
sampled values for rhos are saved. Default is 0: the posterior mean
is computed and returned in the rho.pred
vector. Be careful,
setting save.rho
to 1 might require a large amount of
memory.
A switch (0,1) which determines whether or not the
sampled values for predictions are saved. Default is 0: the
posterior mean is computed and returned in the prob.p.pred
vector. Be careful, setting save.p
to 1 might require a large
amount of memory.
An mcmc object that contains the posterior sample. This object can be summarized by functions provided by the coda package. The posterior sample of the deviance \(D\), with \(D=-2\log(\prod_i P(y_i,z_i|...))\), is also provided.
If save.rho
is set to 0 (default),
rho.pred
is the predictive posterior mean of the spatial
random effect associated to each spatial entity. If save.rho
is
set to 1, rho.pred
is an mcmc
object with sampled
values for each spatial random effect associated to each spatial
entity.
If save.p
is set to 0 (default),
prob.p.pred
is the predictive posterior mean of the
probability associated to the suitability process for each
prediction. If save.p
is set to 1, prob.p.pred
is an
mcmc
object with sampled values of the probability associated
to the suitability process for each prediction.
Predictive posterior mean of the probability associated to the suitability process for each observation.
Predictive posterior mean of the probability associated to the observability process for each observation.
The model integrates two processes, an ecological process associated to the presence or absence of the species due to habitat suitability and an observation process that takes into account the fact that the probability of detection of the species is inferior to one. The ecological process includes an intrinsic conditional autoregressive model (iCAR) model for spatial autocorrelation between observations, assuming that the probability of presence of the species at one site depends on the probability of presence of the species on neighboring sites.
Ecological process: $$z_i \sim \mathcal{B}ernoulli(\theta_i)$$ $$logit(\theta_i) = X_i \beta + \rho_{j(i)}$$ \(\rho_j\): spatial random effect
\(j(i)\): index of the spatial entity for observation \(i\).
Spatial autocorrelation:
An intrinsic conditional autoregressive model (iCAR) is assumed: $$\rho_j \sim \mathcal{N}ormal(\mu_j,V_{\rho} / n_j)$$ \(\mu_j\): mean of \(\rho_{j'}\) in the neighborhood of \(j\).
\(V_{\rho}\): variance of the spatial random effects.
\(n_j\): number of neighbors for spatial entity \(j\).
Observation process: $$y_i \sim \mathcal{B}inomial(z_i * \delta_i, t_i)$$ $$logit(\delta_i) = W_i \gamma$$
Gelfand, A. E.; Schmidt, A. M.; Wu, S.; Silander, J. A.; Latimer, A. and Rebelo, A. G. (2005) Modelling species diversity through species level hierarchical modelling. Applied Statistics, 54, 1-20.
Latimer, A. M.; Wu, S. S.; Gelfand, A. E. and Silander, J. A. (2006) Building statistical models to analyze species distributions. Ecological Applications, 16, 33-50.
MacKenzie, D. I.; Nichols, J. D.; Lachman, G. B.; Droege, S.; Andrew Royle, J. and Langtimm, C. A. (2002) Estimating site occupancy rates when detection probabilities are less than one. Ecology, 83, 2248-2255.
if (FALSE) {
#==============================================
# hSDM.ZIB.iCAR.alteration()
# Example with simulated data
#==============================================
#============
#== Preambule
library(hSDM)
library(raster)
library(sp)
library(mvtnorm)
#==================
#== Data simulation
# Set seed for repeatability
seed <- 1234
# Target parameters
beta.target <- matrix(c(0.2,0.5,0.5),ncol=1)
gamma.target <- matrix(c(1),ncol=1)
## Uncomment if you want covariates on the observability process
## gamma.target <- matrix(c(0.2,0.5,0.5),ncol=1)
Vrho.target <- 1 # Spatial Variance
# Landscape
Landscape <- raster(ncol=20,nrow=20,crs='+proj=utm +zone=1')
ncell <- ncell(Landscape)
# Neighbors
neighbors.mat <- adjacent(Landscape, cells=c(1:ncell), directions=8, pairs=TRUE, sorted=TRUE)
n.neighbors <- as.data.frame(table(as.factor(neighbors.mat[,1])))[,2]
adj <- neighbors.mat[,2]
# Generate symmetric adjacency matrix, A
A <- matrix(0,ncell,ncell)
index.start <- 1
for (i in 1:ncell) {
index.end <- index.start+n.neighbors[i]-1
A[i,adj[c(index.start:index.end)]] <- 1
index.start <- index.end+1
}
# Spatial effects
d <- 1 # Spatial dependence parameter = 1 for intrinsic CAR
Q <- diag(n.neighbors)-d*A + diag(.0001,ncell) # Add small constant to make Q non-singular
covrho <- Vrho.target*solve(Q) # Covariance of rhos
set.seed(seed)
rho <- c(rmvnorm(1,sigma=covrho)) # Spatial Random Effects
rho <- rho-mean(rho) # Centering rhos on zero
# Visited cells
n.visited <- 150 # Compare with 400, 350 and 100 for example
set.seed(seed)
visited.cells <- sort(sample(1:ncell,n.visited,replace=FALSE)) # Draw visited cells at random
notvisited.cells <- c(1:ncell)[-visited.cells]
# Number of observations
nobs <- 300
# Cell vector
set.seed(seed)
cells <- c(visited.cells,sample(visited.cells,nobs-n.visited,replace=TRUE))
coords <- xyFromCell(Landscape,cells) # Get coordinates
# Covariates for "suitability" process
set.seed(seed)
X1.cell <- rnorm(n=ncell,0,1)
set.seed(2*seed)
X2.cell <- rnorm(n=ncell,0,1)
X1 <- X1.cell[cells]
X2 <- X2.cell[cells]
X <- cbind(rep(1,nobs),X1,X2)
# Alteration
U <- runif(n=nobs,min=0,max=1)
# Covariates for "observability" process
W <- cbind(rep(1,nobs))
## Uncomment if you want covariates on the observability process
## set.seed(3*seed)
## W1 <- rnorm(n=nobs,0,1)
## set.seed(4*seed)
## W2 <- rnorm(n=nobs,0,1)
## W <- cbind(rep(1,nobs),W1,W2)
#== Simulating latent variables
# Suitability
logit.theta.1 <- vector()
for (n in 1:nobs) {
logit.theta.1[n] <- X[n,]%*%beta.target+rho[cells[n]]
}
theta.1 <- inv.logit(logit.theta.1)
set.seed(seed)
y.1 <- rbinom(nobs,1,theta.1)
# Alteration
u <- rbinom(nobs,1,U)
# Observability
set.seed(seed)
trials <- rpois(nobs,5) # Number of trial associated to each observation
trials[trials==0] <- 1
logit.theta.2 <- W%*%gamma.target
theta.2 <- inv.logit(logit.theta.2)
set.seed(seed)
y.2 <- rbinom(nobs,trials,theta.2)
#== Simulating response variable
Y <- y.2*(1-u)*y.1
#== Data-set
Data <- data.frame(Y,trials,cells,X1,X2,U)
## Uncomment if you want covariates on the observability process
## Data <- data.frame(Y,trials,cells,X1,X2,W1,W2,U)
Data <- SpatialPointsDataFrame(coords=coords,data=Data)
plot(Data)
#== Data-set for predictions (suitability on each spatial cell)
Data.pred <- data.frame(X1=X1.cell,X2=X2.cell,cells=c(1:ncell))
#==================================
#== Site-occupancy model
mod.hSDM.ZIB.iCAR.alteration <- hSDM.ZIB.iCAR.alteration(presences=Data$Y,
trials=Data$trials,
suitability=~X1+X2,
observability=~1,
spatial.entity=Data$cells,
alteration=Data$U,
data=Data,
n.neighbors=n.neighbors,
neighbors=adj,
## suitability.pred=NULL,
## spatial.entity.pred=NULL,
suitability.pred=Data.pred,
spatial.entity.pred=Data.pred$cells,
burnin=5000, mcmc=5000, thin=5,
beta.start=0,
gamma.start=0,
Vrho.start=10,
priorVrho="1/Gamma",
#priorVrho="Uniform",
#priorVrho=10,
mubeta=0, Vbeta=1.0E6,
mugamma=0, Vgamma=1.0E6,
shape=0.5, rate=0.0005,
#Vrho.max=1000,
seed=1234, verbose=1,
save.rho=1, save.p=0)
#==========
#== Outputs
#= Parameter estimates
summary(mod.hSDM.ZIB.iCAR.alteration$mcmc)
#= MCMC and posteriors
pdf(file="Posteriors_hSDM.ZIB.iCAR.alteration.pdf")
plot(mod.hSDM.ZIB.iCAR.alteration$mcmc)
dev.off()
pdf(file="Posteriors.rho_hSDM.ZIB.iCAR.alteration.pdf")
plot(mod.hSDM.ZIB.iCAR.alteration$rho.pred)
dev.off()
#= Summary plots
# rho
r.rho <- r.rho.pred <- r.visited <- Landscape
r.rho[] <- rho
rho.pred <- apply(mod.hSDM.ZIB.iCAR.alteration$rho.pred,2,mean)
r.rho.pred[] <- rho.pred
r.visited[] <- 0
r.visited[visited.cells] <- 1
# prob.p
r.prob.p <- Landscape
r.prob.p[] <- mod.hSDM.ZIB.iCAR.alteration$prob.p.pred
pdf(file="Summary_hSDM.ZIB.iCAR.alteration.pdf")
par(mfrow=c(3,2))
plot(r.rho, main="rho target")
plot(r.visited,main="Visited cells and presences")
plot(Data[Y>0,],add=TRUE,pch=16,cex=0.5)
plot(r.rho.pred, main="rho estimated")
plot(rho[visited.cells],rho.pred[visited.cells],
xlab="rho target",
ylab="rho estimated")
points(rho[notvisited.cells],rho.pred[notvisited.cells],pch=16,col="blue")
legend(x=-4,y=3.5,legend="Unvisited cells",col="blue",pch=16,bty="n")
abline(a=0,b=1,col="red")
plot(r.prob.p,main="Proba of presence")
plot(Data[Y>0,],add=TRUE,pch=16,cex=0.5)
dev.off()
}