The jSDM_binomial_probit_sp_constrained function performs in parallel Binomial probit regressions in a Bayesian framework. The function calls a Gibbs sampler written in 'C++' code which uses conjugate priors to estimate the conditional posterior distribution of model's parameters. Then the function evaluates the convergence of MCMC \(\lambda\) chains using the Gelman-Rubin convergence diagnostic (\(\hat{R}\)). \(\hat{R}\) is computed using the gelman.diag function. We identify the species (\(\hat{j}_l\)) having the higher \(\hat{R}\) for \(\lambda_{\hat{j}_l}\). These species drive the structure of the latent axis \(l\). The \(\lambda\) corresponding to this species are constrained to be positive and placed on the diagonal of the \(\Lambda\) matrix for fitting a second model. This usually improves the convergence of the latent variables and factor loadings. The function returns the parameter posterior distributions for this second model.

Arguments

burnin

The number of burn-in iterations for the sampler.

mcmc

The number of Gibbs iterations for the sampler. Total number of Gibbs iterations is equal to burnin+mcmc for each chain. burnin+mcmc must be divisible by 10 and superior or equal to 100 so that the progress bar can be displayed.

thin

The thinning interval used in the simulation. The number of MCMC iterations must be divisible by this value.

nchains

The number of Monte Carlo Markov Chains (MCMCs) simulated in parallel. If not specified, the number of chains is set to 2.

ncores

The number of cores to use for parallel execution. If not specified, the number of cores is set to 2.

presence_data

A matrix \(n_{site} \times n_{species}\) indicating the presence by a 1 (or the absence by a 0) of each species on each site.

site_formula

A one-sided formula of the form '~x1+...+xp' specifying the explanatory variables for the suitability process of the model, used to form the design matrix \(X\) of size \(n_{site} \times np\).

site_data

A data frame containing the model's explanatory variables by site.

trait_data

A data frame containing the species traits which can be included as part of the model. Default to NULL to fit a model without species traits.

trait_formula

A one-sided formula of the form '~ t1 + ... + tk + x1:t1 + ... + xp:tk' specifying the interactions between the environmental variables and the species traits to be considered in the model, used to form the trait design matrix \(Tr\) of size \(n_{species} \times nt\) and to set to \(0\) the \(\gamma\) parameters corresponding to interactions not taken into account according to the formula. Default to NULL to fit a model with all possible interactions between species traits found in trait_data and environmental variables defined by site_formula.

n_latent

An integer which specifies the number of latent variables to generate. Defaults to \(0\).

site_effect

A string indicating whether row effects are included as fixed effects ("fixed"), as random effects ("random"), or not included ("none") in the model. If fixed effects, then for parameter identifiability the first row effect is set to zero, which analogous to acting as a reference level when dummy variables are used. If random effects, they are drawn from a normal distribution with mean zero and unknown variance, analogous to a random intercept in mixed models. Defaults to "none".

beta_start

Starting values for \(\beta\) parameters of the suitability process for each species must be either a scalar or a \(np \times n_{species}\) matrix. If beta_start takes a scalar value, then that value will serve for all of the \(\beta\) parameters. Different starting values for each chain can be specified by a list or a vector of length nchains, by default the same starting values are considered for all chains.

gamma_start

Starting values for \(\gamma\) parameters that represent the influence of species-specific traits on species' responses \(\beta\), gamma_start must be either a scalar, a vector of length \(nt\), \(np\) or \(nt.np\) or a \(nt \times np\) matrix. If gamma_start takes a scalar value, then that value will serve for all of the \(\gamma\) parameters. If gamma_start is a vector of length \(nt\) or \(nt.np\) the resulting \(nt \times np\) matrix is filled by column with specified values, if a \(np\)-length vector is given, the matrix is filled by row. Different starting values for each chain can be specified by a list or a vector of length nchains, by default the same starting values are considered for all chains.

lambda_start

Starting values for \(\lambda\) parameters corresponding to the latent variables for each species must be either a scalar or a \(n_{latent} \times n_{species}\) upper triangular matrix with strictly positive values on the diagonal, ignored if n_latent=0. If lambda_start takes a scalar value, then that value will serve for all of the \(\lambda\) parameters except those concerned by the constraints explained above. Different starting values for each chain can be specified by a list or a vector of length nchains, by default the same starting values are considered for all chains.

W_start

Starting values for latent variables must be either a scalar or a \(nsite \times n_latent\) matrix, ignored if n_latent=0. If W_start takes a scalar value, then that value will serve for all of the \(W_{il}\) with \(i=1,\ldots,n_{site}\) and \(l=1,\ldots,n_{latent}\). Different starting values for each chain can be specified by a list or a vector of length nchains, by default the same starting values are considered for all chains.

alpha_start

Starting values for random site effect parameters must be either a scalar or a \(n_{site}\)-length vector, ignored if site_effect="none". If alpha_start takes a scalar value, then that value will serve for all of the \(\alpha\) parameters. Different starting values for each chain can be specified by a list or a vector of length nchains, by default the same starting values are considered for all chains.

V_alpha

Starting value for variance of random site effect if site_effect="random" or constant variance of the Gaussian prior distribution for the fixed site effect if site_effect="fixed". Must be a strictly positive scalar, ignored if site_effect="none". Different starting values for each chain can be specified by a list or a vector of length nchains, by default the same starting values are considered for all chains.

shape_Valpha

Shape parameter of the Inverse-Gamma prior for the random site effect variance V_alpha, ignored if site_effect="none" or site_effect="fixed". Must be a strictly positive scalar. Default to 0.5 for weak informative prior.

rate_Valpha

Rate parameter of the Inverse-Gamma prior for the random site effect variance V_alpha, ignored if site_effect="none" or site_effect="fixed" Must be a strictly positive scalar. Default to 0.0005 for weak informative prior.

mu_beta

Means of the Normal priors for the \(\beta\) parameters of the suitability process. mu_beta must be either a scalar or a \(np\)-length vector. If mu_beta takes a scalar value, then that value will serve as the prior mean for all of the \(\beta\) parameters. The default value is set to 0 for an uninformative prior, ignored if trait_data is specified.

V_beta

Variances of the Normal priors for the \(\beta\) parameters of the suitability process. V_beta must be either a scalar or a \(np \times np\) symmetric positive semi-definite square matrix. If V_beta takes a scalar value, then that value will serve as the prior variance for all of the \(\beta\) parameters, so the variance covariance matrix used in this case is diagonal with the specified value on the diagonal. The default variance is large and set to 10 for an uninformative flat prior.

mu_gamma

Means of the Normal priors for the \(\gamma\) parameters. mu_gamma must be either a scalar, a vector of length \(nt\), \(np\) or \(nt.np\) or a \(nt \times np\) matrix. If mu_gamma takes a scalar value, then that value will serve as the prior mean for all of the \(\gamma\) parameters. If mu_gamma is a vector of length \(nt\) or \(nt.np\) the resulting \(nt \times np\) matrix is filled by column with specified values, if a \(np\)-length vector is given, the matrix is filled by row. The default value is set to 0 for an uninformative prior, ignored if trait_data=NULL.

V_gamma

Variances of the Normal priors for the \(\gamma\) parameters. V_gamma must be either a scalar, a vector of length \(nt\), \(np\) or \(nt.np\) or a \(nt \times np\) positive matrix. If V_gamma takes a scalar value, then that value will serve as the prior variance for all of the \(\gamma\) parameters. If V_gamma is a vector of length \(nt\) or \(nt.np\) the resulting \(nt \times np\) matrix is filled by column with specified values, if a \(np\)-length vector is given, the matrix is filled by row. The default variance is large and set to 10 for an uninformative flat prior, ignored if trait_data=NULL.

mu_lambda

Means of the Normal priors for the \(\lambda\) parameters corresponding to the latent variables. mu_lambda must be either a scalar or a \(n_{latent}\)-length vector. If mu_lambda takes a scalar value, then that value will serve as the prior mean for all of the \(\lambda\) parameters. The default value is set to 0 for an uninformative prior.

V_lambda

Variances of the Normal priors for the \(\lambda\) parameters corresponding to the latent variables. V_lambda must be either a scalar or a \(n_{latent} \times n_{latent}\) symmetric positive semi-definite square matrix. If V_lambda takes a scalar value, then that value will serve as the prior variance for all of \(\lambda\) parameters, so the variance covariance matrix used in this case is diagonal with the specified value on the diagonal. The default variance is large and set to 10 for an uninformative flat prior.

seed

The seed for the random number generator. Default to c(123, 1234) for two chains and for more chains different seeds are generated for each chain.

verbose

A switch (0,1) which determines whether or not the progress of the sampler is printed to the screen. Default is 1: a progress bar is printed, indicating the step (in %) reached by the Gibbs sampler.

Value

A list of length nchains where each element is an object of class "jSDM" acting like a list including :

mcmc.alpha

An mcmc object that contains the posterior samples for site effects \(\alpha\), not returned if site_effect="none".

mcmc.V_alpha

An mcmc object that contains the posterior samples for variance of random site effect, not returned if site_effect="none" or site_effect="fixed".

mcmc.latent

A list by latent variable of mcmc objects that contains the posterior samples for latent variables \(W_l\) with \(l=1,\ldots,n_{latent}\), not returned if n_latent=0.

mcmc.sp

A list by species of mcmc objects that contains the posterior samples for species effects \(\beta_j\) and \(\lambda_j\) if n_latent>0 with \(j=1,\ldots,n_{species}\).

mcmc.gamma

A list by covariates of mcmc objects that contains the posterior samples for \(\gamma_p\) parameters with \(p=1,\ldots,np\) if trait_data is specified.

mcmc.Deviance

The posterior sample of the deviance (\(D\)) is also provided, with \(D\) defined as : \(D=-2\log(\prod_{ij} P(y_{ij}|\beta_j,\lambda_j, \alpha_i, W_i))\).

Z_latent

Predictive posterior mean of the latent variable Z.

probit_theta_latent

Predictive posterior mean of the probability to each species to be present on each site, transformed by probit link function.

theta_latent

Predictive posterior mean of the probability to each species to be present on each site.

model_spec

Various attributes of the model fitted, including the response and model matrix used, distributional assumptions as link function, family and number of latent variables, hyperparameters used in the Bayesian estimation and mcmc, burnin and thin.

sp_constrained

Indicates the reference species (\(\hat{j}_l\)) considered that structures itself most clearly on each latent axis \(l\), chosen such as \(\lambda_{\hat{j}_l}\) maximize the \(\hat{R}\) computed on all chains. The \(lambda\) corresponding to this species are constrained to be positive by being placed on the diagonal of the \(\Lambda\) matrix.

The mcmc. objects can be summarized by functions provided by the coda package.

Details

We model an ecological process where the presence or absence of species \(j\) on site \(i\) is explained by habitat suitability.

Ecological process: $$y_{ij} \sim \mathcal{B}ernoulli(\theta_{ij})$$ where

if n_latent=0 and site_effect="none"probit\((\theta_{ij}) = X_i \beta_j\)
if n_latent>0 and site_effect="none"probit\((\theta_{ij}) = X_i \beta_j + W_i \lambda_j\)
if n_latent=0 and site_effect="random"probit\((\theta_{ij}) = X_i \beta_j + \alpha_i\) and \(\alpha_i \sim \mathcal{N}(0,V_\alpha)\)
if n_latent>0 and site_effect="fixed"probit\((\theta_{ij}) = X_i \beta_j + W_i \lambda_j + \alpha_i\)
if n_latent=0 and site_effect="fixed"probit\((\theta_{ij}) = X_i \beta_j + \alpha_i\)
if n_latent>0 and site_effect="random"probit\((\theta_{ij}) = X_i \beta_j + W_i \lambda_j + \alpha_i\) and \(\alpha_i \sim \mathcal{N}(0,V_\alpha)\)

In the absence of data on species traits (trait_data=NULL), the effect of species \(j\): \(\beta_j\); follows the same a priori Gaussian distribution such that \(\beta_j \sim \mathcal{N}_{np}(\mu_{\beta},V_{\beta})\), for each species.

If species traits data are provided, the effect of species \(j\): \(\beta_j\); follows an a priori Gaussian distribution such that \(\beta_j \sim \mathcal{N}_{np}(\mu_{\beta_j},V_{\beta})\), where \(\mu_{\beta_jp} = \sum_{k=1}^{nt} t_{jk}.\gamma_{kp}\), takes different values for each species.

We assume that \(\gamma_{kp} \sim \mathcal{N}(\mu_{\gamma_{kp}},V_{\gamma_{kp}})\) as prior distribution.

We define the matrix \(\gamma=(\gamma_{kp})_{k=1,...,nt}^{p=1,...,np}\) such as :

\(x_0\)\(x_1\)...\(x_p\)...\(x_{np}\)
__________________________________________________
\(t_0\) |\(\gamma_{0,0}\)\(\gamma_{0,1}\)...\(\gamma_{0,p}\)...\(\gamma_{0,np}\){ effect of
|interceptenvironmental
| variables
\(t_1\) |\(\gamma_{1,0}\)\(\gamma_{1,1}\)...\(\gamma_{1,p}\)...\(\gamma_{1,np}\)
... |..................
\(t_k\) |\(\gamma_{k,0}\)\(\gamma_{k,1}\)...\(\gamma_{k,p}\)...\(\gamma_{k,np}\)
... |..................
\(t_{nt}\) |\(\gamma_{nt,0}\)\(\gamma_{nt,1}\)...\(\gamma_{nt,p}\)...\(\gamma_{nt,np}\)
average
trait effectinteractiontraitsenvironment

References

Chib, S. and Greenberg, E. (1998) Analysis of multivariate probit models. Biometrika, 85, 347-361.

Warton, D. I.; Blanchet, F. G.; O'Hara, R. B.; O'Hara, R. B.; Ovaskainen, O.; Taskinen, S.; Walker, S. C. and Hui, F. K. C. (2015) So Many Variables: Joint Modeling in Community Ecology. Trends in Ecology & Evolution, 30, 766-779.

Ovaskainen, O., Tikhonov, G., Norberg, A., Blanchet, F. G., Duan, L., Dunson, D., Roslin, T. and Abrego, N. (2017) How to make more out of community data? A conceptual framework and its implementation as models and software. Ecology Letters, 20, 561-576.

Author

Ghislain Vieilledent <ghislain.vieilledent@cirad.fr>

Jeanne Clément <jeanne.clement16@laposte.net>

Examples

#======================================
# jSDM_binomial_probit_sp_constrained()
# Example with simulated data
#====================================

#=================
#== Load libraries
library(jSDM)

#==================
#== Data simulation

#= Number of sites
nsite <- 30

#= Set seed for repeatability
seed <- 1234
set.seed(seed)

#= Number of species
nsp <- 10

#= Number of latent variables
n_latent <- 2

#= Ecological process (suitability)
x1 <- rnorm(nsite,0,1)
x2 <- rnorm(nsite,0,1)
X <- cbind(rep(1,nsite),x1,x2)
np <- ncol(X)
#= Latent variables W
W <- matrix(rnorm(nsite*n_latent,0,1), nsite, n_latent)
#= Fixed species effect beta 
beta.target <- t(matrix(runif(nsp*np,-2,2),
                        byrow=TRUE, nrow=nsp))
#= Factor loading lambda  
lambda.target <- matrix(0, n_latent, nsp)
mat <- t(matrix(runif(nsp*n_latent, -2, 2), byrow=TRUE, nrow=nsp))
lambda.target[upper.tri(mat, diag=TRUE)] <- mat[upper.tri(mat, diag=TRUE)]
diag(lambda.target) <- runif(n_latent, 0, 2)
#= Variance of random site effect 
V_alpha.target <- 0.5
#= Random site effect alpha
alpha.target <- rnorm(nsite,0 , sqrt(V_alpha.target))
# Simulation of response data with probit link
probit_theta <- X%*%beta.target + W%*%lambda.target + alpha.target
theta <- pnorm(probit_theta)
e <- matrix(rnorm(nsp*nsite,0,1),nsite,nsp)
# Latent variable Z 
Z_true <- probit_theta + e
# Presence-absence matrix Y
Y <- matrix (NA, nsite,nsp)
for (i in 1:nsite){
  for (j in 1:nsp){
    if ( Z_true[i,j] > 0) {Y[i,j] <- 1}
    else {Y[i,j] <- 0}
  }
}

#==================================
#== Site-occupancy model

# Increase number of iterations (burnin and mcmc) to get convergence
mod <- jSDM_binomial_probit_sp_constrained(# Iteration
                                           burnin=100,
                                           mcmc=100,
                                           thin=1,
                                           # parallel MCMCs
                                           nchains=2, ncores=2,
                                           # Response variable
                                           presence_data=Y,
                                           # Explanatory variables
                                           site_formula=~x1+x2,
                                           site_data = X,
                                           n_latent=2,
                                           site_effect="random",
                                           # Starting values
                                           alpha_start=0,
                                           beta_start=0,
                                           lambda_start=0,
                                           W_start=0,
                                           V_alpha=1,
                                           # Priors
                                           shape_Valpha=0.5,
                                           rate_Valpha=0.0005,
                                           mu_beta=0, V_beta=1,
                                           mu_lambda=0, V_lambda=1,
                                           seed=c(123,1234), verbose=1)

# ===================================================
# Result analysis
# ===================================================

#==========
#== Outputs
oldpar <- par(no.readonly = TRUE) 
burnin <- mod[[1]]$model_spec$burnin
ngibbs <- burnin + mod[[1]]$model_spec$mcmc
thin <-  mod[[1]]$model_spec$thin
require(coda)
#> Loading required package: coda
arr2mcmc <- function(x) {
  return(mcmc(as.data.frame(x),
              start=burnin+1 , end=ngibbs, thin=thin))
}
mcmc_list_Deviance <- mcmc.list(lapply(lapply(mod,"[[","mcmc.Deviance"), arr2mcmc))
mcmc_list_alpha <- mcmc.list(lapply(lapply(mod,"[[","mcmc.alpha"), arr2mcmc))
mcmc_list_V_alpha <- mcmc.list(lapply(lapply(mod,"[[","mcmc.V_alpha"), arr2mcmc))
mcmc_list_param <- mcmc.list(lapply(lapply(mod,"[[","mcmc.sp"), arr2mcmc))
mcmc_list_lv <- mcmc.list(lapply(lapply(mod,"[[","mcmc.latent"), arr2mcmc))
mcmc_list_beta <- mcmc_list_param[,grep("beta",colnames(mcmc_list_param[[1]]))]
mcmc_list_beta0 <- mcmc_list_beta[,grep("Intercept", colnames(mcmc_list_beta[[1]]))]
mcmc_list_lambda <- mcmc.list(
  lapply(mcmc_list_param[, grep("lambda", colnames(mcmc_list_param[[1]]))],
         arr2mcmc))
# Compute Rhat
psrf_alpha <- mean(gelman.diag(mcmc_list_alpha, multivariate=FALSE)$psrf[,2])
psrf_V_alpha <- gelman.diag(mcmc_list_V_alpha)$psrf[,2]
psrf_beta0 <- mean(gelman.diag(mcmc_list_beta0)$psrf[,2])
psrf_beta <- mean(gelman.diag(mcmc_list_beta[,grep("Intercept",
                                                   colnames(mcmc_list_beta[[1]]),
                                                   invert=TRUE)])$psrf[,2])
psrf_lambda <- mean(gelman.diag(mcmc_list_lambda,
                                multivariate=FALSE)$psrf[,2],
                    na.rm=TRUE)
psrf_lv <- mean(gelman.diag(mcmc_list_lv, multivariate=FALSE)$psrf[,2])
Rhat <- data.frame(Rhat=c(psrf_alpha, psrf_V_alpha, psrf_beta0, psrf_beta,
                          psrf_lambda, psrf_lv),
                   Variable=c("alpha", "Valpha", "beta0", "beta",
                              "lambda", "W"))
# Barplot
library(ggplot2)
ggplot2::ggplot(Rhat, aes(x=Variable, y=Rhat)) + 
  ggtitle("Averages of Rhat obtained with jSDM for each type of parameter") +
  theme(plot.title = element_text(hjust = 0.5, size=15)) +
  geom_bar(fill="skyblue", stat = "identity") +
  geom_text(aes(label=round(Rhat,2)), vjust=0, hjust=-0.1) +
  geom_hline(yintercept=1, color='red') +
  coord_flip()


#= Parameter estimates
nchains <- length(mod)
## beta_j
pdf(file=file.path(tempdir(), "Posteriors_species_effect_jSDM_probit.pdf"))
plot(mcmc_list_param)
dev.off()
#> agg_png 
#>       2 

## Latent variables 
pdf(file=file.path(tempdir(), "Posteriors_latent_variables_jSDM_probit.pdf"))
plot(mcmc_list_lv)
dev.off()
#> agg_png 
#>       2 

## Random site effect and its variance
pdf(file=file.path(tempdir(), "Posteriors_site_effect_jSDM_probit.pdf"))
plot(mcmc_list_V_alpha)
plot(mcmc_list_alpha)
dev.off()
#> agg_png 
#>       2 

## Predictive posterior mean for each observation
# Species effects beta and factor loadings lambda
param <- list()
for (i in 1:nchains){
param[[i]] <- matrix(unlist(lapply(mod[[i]]$mcmc.sp,colMeans)),
                     nrow=nsp, byrow=TRUE)
}
par(mfrow=c(1,1))
for (i in 1:nchains){
  if(i==1){
    plot(t(beta.target), param[[i]][,1:np],
         main="species effect beta",
         xlab ="obs", ylab ="fitted")
    abline(a=0,b=1, col='red')
  }
  else{
    points(t(beta.target), param[[i]][,1:np], col=i)
  }
}

par(mfrow=c(1,2))
for(l in 1:n_latent){
  for (i in 1:nchains){
    if (i==1){
      plot(t(lambda.target)[,l],
           param[[i]][,np+l],
           main=paste0("factor loadings lambda_", l),
           xlab ="obs", ylab ="fitted")
      abline(a=0,b=1, col='red')
    } else {
      points(t(lambda.target)[,l],
             param[[i]][,np+l],
             col=i)
    }
  }
}

## W latent variables
mean_W <- list()
for (i in 1:nchains){
  mean_W[[i]] <- sapply(mod[[i]]$mcmc.latent,colMeans)
}
par(mfrow=c(1,2))
for (l in 1:n_latent) {
  for (i in 1:nchains){
    if (i==1){
      plot(W[,l], mean_W[[i]][,l],
           main = paste0("Latent variable W_", l),
           xlab ="obs", ylab ="fitted")
      abline(a=0,b=1, col='red')
    }
    else{
      points(W[,l], mean_W[[i]][,l], col=i)
    }
  }
}


#= W.lambda
par(mfrow=c(1,2))
for (i in 1:nchains){
  if (i==1){
    plot(W%*%lambda.target, 
         mean_W[[i]]%*%t(param[[i]][,(np+1):(np+n_latent)]),
         main = "W.lambda",
         xlab ="obs", ylab ="fitted")
    abline(a=0,b=1, col='red')
  }
  else{
    points(W%*%lambda.target, 
           mean_W[[i]]%*%t(param[[i]][,(np+1):(np+n_latent)]),
           col=i)
  }
}

# Site effect alpha et V_alpha
plot(alpha.target,colMeans(mod[[1]]$mcmc.alpha),
     xlab="obs", ylab="fitted",
     main="Random site effect alpha")
abline(a=0,b=1, col='red')
points(V_alpha.target, mean(mod[[1]]$mcmc.V_alpha),
       pch=18, cex=2)
legend("bottomright", legend=c("V_alpha"), pch =18, pt.cex=1.5)
for (i in 2:nchains){
  points(alpha.target, colMeans(mod[[i]]$mcmc.alpha), col=i)
  points(V_alpha.target, mean(mod[[i]]$mcmc.V_alpha),
         pch=18, col=i, cex=2)
}


#= Predictions 
## Occurence probabilities 
plot(pnorm(probit_theta), mod[[1]]$theta_latent,
     main="theta",xlab="obs",ylab="fitted")
for (i in 2:nchains){
  points(pnorm(probit_theta), mod[[i]]$theta_latent, col=i)
}
abline(a=0,b=1, col='red')

## probit(theta)
plot(probit_theta, mod[[1]]$probit_theta_latent,
     main="probit(theta)",xlab="obs",ylab="fitted")
for (i in 2:nchains){
  points(probit_theta, mod[[i]]$probit_theta_latent, col=i)
}
abline(a=0,b=1, col='red')


## Deviance
plot(mcmc_list_Deviance)


par(oldpar)