This Supplement presents the data, code, and detailed methodology for the analysis of counts of sharks collected during Diver-Operated Video (DOV) surveys conducted at Darwin and Wolf Islands in the Galápagos Marine Reserve. The data comprise counts from 64 sub-transects, each 50 m long and 5 m wide. Counts were taken using an instantaneous (insiders only) survey method and a non-instantaneous (total = insiders + outsiders) survey method.
In this Supplement (2), we present logistic regression models used to estimate the proportion of the total counts that were insiders, and thus the degree of bias associated with using a non-instantaneous survey method, as a function of diver swimming speed. The Supplement is organised as follows: in Section 2, we provide the data; in Section 3, we define the base model and provide the code used to fit it; in Section 4, we examine our prior distributions using prior predictive checks; in Section 5, we provide the code used to obtain estimates and produce the figures used for inference in the main article; in Section 6, we compare our base model to alternative models; in Section 7, we examine model diagnostics; and, in Section 8, we compare inferences from our base model with those from a model whose prior gives more conservative estimates of the non-instantaneous bias.
Another Supplement (3) presents models used to estimate mean densities of sharks for each method at each island, for exposed and non-exposed transects.
The following code inputs and tabulates (Table S2-1) the raw data.
dp <- structure(list(Insider = c(1, 4, 1, 3, 0, 0, 1, 2, 0, 2, 4, 2, 8, 4, 1, 5, 3, 6, 6, 1, 1, 5, 0, 24, 18, 1, 2, 1, 4, 7, 1, 6, 1, 1, 0, 0, 2, 1, 0, 3, 2, 1, 2, 12, 0, 1, 0, 3, 3, 0, 1, 6, 1, 1, 0, 0, 1, 3, 0, 4, 3, 3, 1, 0), Total = c(2, 5, 1, 12, 3, 1, 2, 9, 4, 2, 9, 2, 8, 8, 6, 6, 3, 10, 51, 5, 1, 6, 1, 27, 21, 1, 2, 3, 5, 18, 2, 8, 1, 2, 1, 1, 3, 1, 1, 3, 2, 1, 3, 13, 3, 3, 5, 3, 3, 1, 2, 6, 1, 1, 3, 1, 1, 3, 1, 4, 3, 3, 4, 1), Transect = c(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 4, 4, 4, 4, 4, 6, 9, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 13, 13, 16), Speed = c(28.57142857, 43.47826087, 44.7761194, 43.47826087, 46.15384615, 48.38709677, 37.97468354, 40.54054054, 40.54054054, 44.7761194, 43.47826087, 47.61904762, 51.72413793, 47.61904762, 50, 46.875, 32.96703297, 39.47368421,
29.12621359, 37.5, 38.96103896, 35.71428571, 32.96703297, 43.47826087, 37.03703704, 49.18032787, 44.11764706, 42.85714286, 38.96103896, 37.5, 40.54054054, 30.6122449, 42.25352113, 41.09589041, 22.05882353, 37.97468354, 36.14457831, 32.96703297, 38.96103896, 45.45454545, 50, 42.25352113, 37.03703704, 41.66666667, 41.09589041, 38.46153846, 33.70786517, 38.46153846, 46.15384615, 50.84745763, 48.38709677, 54.54545455, 49.18032787, 50, 40.54054054, 41.09589041, 42.25352113, 53.57142857, 49.18032787, 61.2244898, 65.2173913, 65.2173913, 58.82352941, 53.57142857), R = c(2, 1.25, 1, 4, Inf, Inf, 2, 4.5, Inf, 1, 2.25, 1, 1, 2, 6, 1.2, 1, 1.66666666666667, 8.5, 5, 1, 1.2, Inf, 1.125, 1.16666666666667, 1, 1, 3, 1.25, 2.57142857142857, 2, 1.33333333333333, 1, 2, Inf, Inf, 1.5, 1, Inf, 1, 1, 1, 1.5, 1.08333333333333, Inf, 3, Inf, 1, 1, Inf, 2, 1, 1, 1, Inf, Inf, 1, 1, Inf, 1, 1, 1, 4, Inf)), row.names = c(NA, -64L), class = c("tbl_df", "tbl", "data.frame"))
dp %>%
select(Transect, Speed, Insider, Total,
`Raw ratio` = R) %>%
kable(caption = tables("tdat")) %>%
  kable_styling()
Transect | Speed | Insider | Total | Raw ratio |
---|---|---|---|---|
3 | 28.57143 | 1 | 2 | 2.000000 |
3 | 43.47826 | 4 | 5 | 1.250000 |
3 | 44.77612 | 1 | 1 | 1.000000 |
3 | 43.47826 | 3 | 12 | 4.000000 |
3 | 46.15385 | 0 | 3 | Inf |
3 | 48.38710 | 0 | 1 | Inf |
3 | 37.97468 | 1 | 2 | 2.000000 |
3 | 40.54054 | 2 | 9 | 4.500000 |
3 | 40.54054 | 0 | 4 | Inf |
3 | 44.77612 | 2 | 2 | 1.000000 |
3 | 43.47826 | 4 | 9 | 2.250000 |
3 | 47.61905 | 2 | 2 | 1.000000 |
3 | 51.72414 | 8 | 8 | 1.000000 |
3 | 47.61905 | 4 | 8 | 2.000000 |
3 | 50.00000 | 1 | 6 | 6.000000 |
3 | 46.87500 | 5 | 6 | 1.200000 |
1 | 32.96703 | 3 | 3 | 1.000000 |
1 | 39.47368 | 6 | 10 | 1.666667 |
1 | 29.12621 | 6 | 51 | 8.500000 |
1 | 37.50000 | 1 | 5 | 5.000000 |
1 | 38.96104 | 1 | 1 | 1.000000 |
1 | 35.71429 | 5 | 6 | 1.200000 |
2 | 32.96703 | 0 | 1 | Inf |
2 | 43.47826 | 24 | 27 | 1.125000 |
2 | 37.03704 | 18 | 21 | 1.166667 |
2 | 49.18033 | 1 | 1 | 1.000000 |
2 | 44.11765 | 2 | 2 | 1.000000 |
2 | 42.85714 | 1 | 3 | 3.000000 |
2 | 38.96104 | 4 | 5 | 1.250000 |
2 | 37.50000 | 7 | 18 | 2.571429 |
2 | 40.54054 | 1 | 2 | 2.000000 |
2 | 30.61224 | 6 | 8 | 1.333333 |
2 | 42.25352 | 1 | 1 | 1.000000 |
2 | 41.09589 | 1 | 2 | 2.000000 |
4 | 22.05882 | 0 | 1 | Inf |
4 | 37.97468 | 0 | 1 | Inf |
4 | 36.14458 | 2 | 3 | 1.500000 |
4 | 32.96703 | 1 | 1 | 1.000000 |
4 | 38.96104 | 0 | 1 | Inf |
6 | 45.45455 | 3 | 3 | 1.000000 |
9 | 50.00000 | 2 | 2 | 1.000000 |
11 | 42.25352 | 1 | 1 | 1.000000 |
11 | 37.03704 | 2 | 3 | 1.500000 |
11 | 41.66667 | 12 | 13 | 1.083333 |
11 | 41.09589 | 0 | 3 | Inf |
11 | 38.46154 | 1 | 3 | 3.000000 |
11 | 33.70787 | 0 | 5 | Inf |
11 | 38.46154 | 3 | 3 | 1.000000 |
12 | 46.15385 | 3 | 3 | 1.000000 |
12 | 50.84746 | 0 | 1 | Inf |
12 | 48.38710 | 1 | 2 | 2.000000 |
12 | 54.54545 | 6 | 6 | 1.000000 |
12 | 49.18033 | 1 | 1 | 1.000000 |
12 | 50.00000 | 1 | 1 | 1.000000 |
12 | 40.54054 | 0 | 3 | Inf |
13 | 41.09589 | 0 | 1 | Inf |
13 | 42.25352 | 1 | 1 | 1.000000 |
13 | 53.57143 | 3 | 3 | 1.000000 |
13 | 49.18033 | 0 | 1 | Inf |
13 | 61.22449 | 4 | 4 | 1.000000 |
13 | 65.21739 | 3 | 3 | 1.000000 |
13 | 65.21739 | 3 | 3 | 1.000000 |
13 | 58.82353 | 1 | 4 | 4.000000 |
16 | 53.57143 | 0 | 1 | Inf |
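Note that the `Raw ratio` column (`R`) is simply `Total/Insider`, the raw per-sub-transect inflation factor; it is infinite where no insiders were counted. A minimal check (a sketch, not part of the original code):
# Verify that the Raw ratio column equals Total/Insider (Inf where Insider = 0)
dp %>% summarise(matches = isTRUE(all.equal(Total / Insider, R)))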
This code modifies the data for input into the models:
dp2 <- dp %>%
mutate(X = Speed - 43) %>% # X is speed centred on zero
select(-Speed) %>% # remove speed
mutate(Tr = as.numeric(as.factor(dp$Transect))) %>% # transect index
as.list() %>%
list_modify(N = nrow(dp)) %>%
list_modify(N_Tr = max(as.numeric(as.factor(dp$Transect)))) %>%
list_modify(X_pred = 20:65 - 43) %>% # X values for predictions
list_modify(N_pred = length(20:65) ) # length of predict x vector
Diver swimming speeds ranged from 22 to 65 m min-1, with an average of 43 m min-1 (Table S2-1). The raw proportion of insiders was 180/328 = 0.549.
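These summaries can be reproduced directly from the data (a quick check, assuming `dp` as defined above):
# Reproduce the summary statistics quoted above
range(dp$Speed)                  # approx. 22 to 65 m min-1
mean(dp$Speed)                   # approx. 43 m min-1
sum(dp$Insider) / sum(dp$Total)  # 180/328 = 0.549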
Here is a mathematical description of the base model used in the manuscript:
\[ \begin{aligned} y^*_i &\sim {\rm Binomial}(y_i, p_i) \\ {\rm logit}(p_i) &= \alpha + \beta ({\rm Speed}_i - 43) + z_{Tr[i]} \sigma_t\\ \alpha &\sim {\rm Normal}(0, 1.4)\\ \beta &\sim {\rm Normal}(0, 0.1)\\ z_{Tr[i]} &\sim {\rm Normal}(0, 1)\\ \sigma_t &\sim {\rm Exponential}(1)\\ f_i &= 1/p_i\\ \end{aligned} \] The parameters of Normal distributions here are means and standard deviations.
This is a logistic regression of the proportion of the total counts that were insiders as a function of diver swimming speed. The predictor variable \({\rm Speed}_i\), the diver swimming speed for sub-transect \(i\), was centred by subtracting the mean speed of 43 m min-1. This way, the intercept parameter \(\alpha\) gives the logit of the proportion of insiders at the average swim speed (rather than at a speed of zero).
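As a quick check of this interpretation (a sketch using base R's `plogis` as the inverse-logit function, with the posterior mean of \(\alpha\) reported in Table S2-2):
# inv_logit(alpha) is the estimated proportion of insiders at the mean speed
plogis(0.47)  # approx. 0.62, close to the estimate of p at 43 m min-1 in Table S2-3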
The variable \(Tr[i]\) is a vector of integers (between 1 and 10) indicating the transect to which each sub-transect \(i\) belongs. The transect effects are modelled using a non-centred parameterisation, which is equivalent to the more usual (centred) parameterisation described in the manuscript but avoids the poor sampling efficiency that can arise with the centred parameterisation when fitting the model with Hamiltonian Monte Carlo (McElreath 2015). With the non-centred approach, the transect effects are given by \(t_j = z_j \sigma_t\), where \(z_j\) are the transect effects as standardised z-scores and \(\sigma_t\) is the standard deviation of the transect effects.
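The sketch below (illustrative only, not part of the original analysis) demonstrates that the non-centred construction gives transect effects with the same distribution as the centred one:
# Non-centred: t_j = z_j * sigma_t with z_j ~ Normal(0, 1) is distributionally
# identical to the centred t_j ~ Normal(0, sigma_t)
set.seed(1)
sigma_t <- 0.8                          # illustrative value (posterior mean of sd_Tr is c. 0.84)
t_noncentred <- rnorm(1e5, 0, 1) * sigma_t
t_centred    <- rnorm(1e5, 0, sigma_t)
c(sd(t_noncentred), sd(t_centred))      # both approx. equal to sigma_t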
The Stan code for this model is as follows (saved as object `mod_propST`):
data{
int N; // sample size
int N_Tr; // number of transects
int Tr[N]; // vector of contiguous transect numbers
int N_pred; // number of prediction speeds
int Total[N]; // Total counts
int Insider[N]; // Insider counts
vector[N] X; // Speed - 43
vector[N_pred] X_pred; // Values of (speed - 43) for prediction
}
parameters{
real a;
real b;
vector[N_Tr] z_Tr;
real<lower=0> sd_Tr;
}
model{
vector[N] p;
b ~ normal( 0 , 0.1 );
a ~ normal( 0 , 1.4 );
z_Tr ~ normal( 0 , 1 );
sd_Tr ~ exponential( 1 );
for ( i in 1:N ) {
p[i] = inv_logit(a + b * X[i] + z_Tr[Tr[i]]*sd_Tr);
}
Insider ~ binomial( Total , p );
}
generated quantities{
vector[N] log_lik;
vector[N] p;
vector[N] f;
vector[N_pred] p_pred;
vector[N_pred] f_pred;
for ( i in 1:N ) {
p[i] = inv_logit(a + b * X[i] + z_Tr[Tr[i]]*sd_Tr);
f[i] = 1 / p[i];
}
for ( i in 1:N ) log_lik[i] = binomial_lpmf( Insider[i] | Total[i] , p[i] );
for ( i in 1:N_pred ) { // Predict p and f across the range of X (for a typical transect, z = 0)
p_pred[i] = inv_logit(a + b * X_pred[i]);
f_pred[i] = 1 / p_pred[i];
}
}
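For completeness, the Stan program above must be compiled into the `mod_propST` object before sampling. A minimal sketch, assuming the program text is stored in the string `stan_code_propST` (a hypothetical name):
library(rstan)
# Compile the Stan program shown above into the object used by sampling()
mod_propST <- stan_model(model_code = stan_code_propST)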
Fit model for proportion of insiders \(p\) as a function of diver swimming speed:
fit_propST <- sampling(mod_propST,
data=dp2,
iter=2000,
chains=4,
cores=4,
seed=111)
An important step in Bayesian modelling is to scrutinise the choice of prior distributions by simulating models and data from the priors, so-called prior predictive checks (Gelman et al. 2014, Gabry et al. 2019). Our logistic regression model fits a linear model to the logit-transformed proportion of insiders. When using a link function like the logit, seemingly uninformative priors on the logit scale (e.g., Normal(0, 100)) can imply very strong priors for the predictions on the original scale (Figure S2-1). In this section, we examine priors for the intercept (\(\alpha\), or `a` in the code) and slope (\(\beta\), or `b`) parameters.
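To illustrate this pitfall, the short simulation below (a sketch, not part of the original analysis; `plogis` is base R's inverse-logit function) computes how much prior mass Normal(0, s) priors on logit \(p\) place near the extremes of the probability scale:
# Proportion of implied prior mass on p outside (0.05, 0.95) for Normal(0, s)
# priors on logit(p); larger s pushes mass towards 0 and 1
set.seed(1)
sapply(c(0.5, 1.4, 2, 3, 100), function(s) {
  p <- plogis(rnorm(1e5, 0, s))
  mean(p < 0.05 | p > 0.95)
})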
Here, we examine a range of priors for the intercept parameter \(\alpha\), which represents the logit-transformed proportion of insiders.
We took random samples from a few normal distributions, applied the inverse-logit function to get values on the proportion scale (between 0 and 1), and then examined their distributions using density plots.
n=1e6
ld <- bind_rows(tibble(Prior="N(0,1)", x=rnorm(n,0,1)),
tibble(Prior="N(0,1.4)", x=rnorm(n,0,1.4)),
tibble(Prior="N(0,1.5)", x=rnorm(n,0,1.5)),
tibble(Prior="N(1,1.5)", x=rnorm(n,1,1.5)),
tibble(Prior="N(0,2)", x=rnorm(n,0,2)),
tibble(Prior="N(0,3)", x=rnorm(n,0,3)),
tibble(Prior="N(0,1.7)", x=rnorm(n,0,1.7))) %>%
mutate(x = inv_logit(x))
p1 <- ld %>% ggplot() + aes(x=x, col=Prior) +
geom_density() + xlab("") +
labs(col="Prior on\nalpha = logit(p)") +
ggtitle("Implied prior for\nproportion p")
p2 <- ld %>%
mutate(x = 1/x) %>%
ggplot() + aes(x=x, col=Prior) +
geom_density() + xlim(1,6) + ylab("") + xlab("") +
theme(legend.position = "none") +
ggtitle("Implied prior for\ninflation factor f = 1/p")
p1 + p2 + plot_layout(guides="collect")
Figure S2-1 shows that when logit \(p\) is assigned a normal prior with a large standard deviation (e.g., > 2), the prior is increasingly weighted towards the extreme values of 0 and 1 on the probability scale. On the other hand, when the standard deviation is as low as 0.5, extreme values of 0 and 1 are given little weight in favour of values around \(p = 0.5\). The prior N(0, 1.4) for logit \(p\) gives a relatively flat prior on the probability scale for most of the range between 0 and 1, so we used it as the prior for the intercept parameter in our base model. We also show a more conservative prior, N(1, 1.5), which puts greater weight on values of \(p\) closer to 1, thus giving a more sceptical view of the non-instantaneous bias. This conservative prior is used in a sensitivity analysis in Section 8.
With Bayesian logistic regression, the priors for the intercept and slope of the linear model on the logit scale imply a distribution of lines relating \(p\) to the predictor variable (here, diver swimming speed). This prior distribution of lines comprises the lines considered "plausible" before seeing the data; when the model is fit, we obtain the subset of lines from the prior that are consistent with the data. In this section, we examine the prior distribution of lines arising from the priors in our base model, and compare it with those of two models with alternative priors: one with a prior on the intercept that favours lower non-instantaneous bias (\(\alpha \sim N(1,1.5)\)), and one with a broader prior on the slope parameter (\(\beta \sim N(0,1)\)).
The priors in our base model produce a wide range of relationships between \(p\) and swimming speed, including no relationship (flat lines) and a mixture of weak and relatively strong positive and negative relationships (Figure S2-2A and B).
We also examined the prior distribution of lines arising from a 'weaker' prior on the slope parameter, with a standard deviation of 1 (\(N(0,1)\)) rather than 0.1 as in our base model. This 'weaker' prior actually yields stronger relationships between \(p\) and swimming speed (Figure S2-2C and D): very few of the lines show no relationship or a weak relationship between bias and swimming speed, and the majority show very strong positive or negative relationships. We did not use this prior because we wanted the model to be able to return no relationship, or a weak one, if that is what the data signal.
The prior distribution of lines for the model with a more conservative prior on the intercept (i.e., favouring a lesser degree of non-instantaneous bias overall) is shown in Figure S2-2E and F. These plots confirm that this prior does indeed represent a conservative model, with more lines near \(p = 1\) and fewer near \(p = 0\). Inferences from this model are compared with those of our base model as a sensitivity analysis in Section 8.
## Define function to plot logistic regression lines for different priors
plot_prior_logreg <- function(int_mean = 0, int_sd = 1.4,
slope_mean = 0, slope_sd = 0.1,
f = FALSE) {
pred_X = list(X = 23:65 - 43) # define X
draws <- replicate(100,
inv_logit( rnorm(1, int_mean, int_sd) +
rnorm(1, slope_mean, slope_sd) * pred_X$X )) %>% t # get draws of lines
if(f) {
draws <- 1/draws
yla <- expression(paste("Inflation factor (", italic("f"),")"))
yli <- c(1, 20)
} else {
yla <- expression(paste("Proportion of insiders (", italic("p"),")"))
yli <- c(0,1)
}
pred_X %>% add_draws( draws = draws , value = "f") %>% group_by(X, .draw) %>%
ggplot(aes(x = X + 43, y = f, group = .draw)) +
geom_line(alpha = .5) + coord_cartesian(ylim = yli) + ylab(yla) +
xlab(expression(paste('Swimming speed (m min' ^ -1, ")", sep=""))) +
theme_bw()
}
# Base priors
plot_prior_logreg() +
ggtitle("A. Proportion of insiders, base priors; intercept N(0,1.4), slope N(0,0.1)")
plot_prior_logreg(f = T) +
ggtitle("B. Inflation factor, base priors; intercept N(0,1.4), slope N(0,0.1)")
# Alternative prior for slope
plot_prior_logreg(slope_sd = 1) +
ggtitle("C. Proportion of insiders, intercept N(0,1.4), broader prior for slope N(0,1)")
plot_prior_logreg(slope_sd = 1, f = T) +
ggtitle("D. Inflation factor, intercept N(0,1.4), broader prior for slope N(0,1)")
# Alternative prior for intercept
plot_prior_logreg(int_mean = 1, int_sd = 1.5) +
ggtitle("E. Proportion of insiders, conservative intercept N(1,1.5), slope N(0,0.1)")
plot_prior_logreg(int_mean = 1, int_sd = 1.5, f = T) +
ggtitle("F. Inflation factor, conservative intercept N(1,1.5), slope N(0,0.1)")
In this section, we provide the code used to create Figure 4 in the main article (Figure S2-3 herein) and calculate estimates of \(\alpha\) and \(\beta\) (Table S2-2), and \(p\) and \(f\) for various swimming speeds (Table S2-3).
mypars <- c("a", "b", "sd_Tr")
summary(fit_propST)$summary %>%
as.data.frame %>%
rownames_to_column("Parameter") %>%
dplyr::filter(Parameter %in% mypars) %>%
kable(digits = 2, caption = tables("pars_tab")) %>%
kable_styling()
Parameter | mean | se_mean | sd | 2.5% | 25% | 50% | 75% | 97.5% | n_eff | Rhat |
---|---|---|---|---|---|---|---|---|---|---|
a | 0.47 | 0.01 | 0.35 | -0.26 | 0.26 | 0.48 | 0.68 | 1.17 | 1290.60 | 1 |
b | 0.13 | 0.00 | 0.03 | 0.08 | 0.11 | 0.13 | 0.14 | 0.18 | 2683.52 | 1 |
sd_Tr | 0.84 | 0.01 | 0.38 | 0.35 | 0.58 | 0.76 | 1.01 | 1.78 | 1212.13 | 1 |
post_propST <- extract(fit_propST)
postf <- list(Speed = dp2$X_pred + 43) %>%
add_draws(post_propST$f_pred, value = "f") %>%
group_by(Speed) %>%
mean_qi %>%
dplyr::filter(Speed %in% c(22,30,40,43,50,60,65)) %>%
select(2:4)
postp <- list(Speed = dp2$X_pred + 43) %>%
add_draws(post_propST$p_pred, value = "p") %>%
group_by(Speed) %>%
mean_qi %>%
dplyr::filter(Speed %in% c(22,30,40,43,50,60,65))%>%
select(1:4)
bind_cols(postp,postf) %>%
kable(digits = 3,
caption = tables("tpf"),
col.names = c("Speed", "Mean", "Lower", "Upper", "Mean", "Lower", "Upper")) %>%
add_header_above(c(" " = 1, "Proportion of insiders" = 3, "Inflation factor" = 3)) %>%
kable_styling()
Speed | Mean | Lower | Upper | Mean | Lower | Upper |
---|---|---|---|---|---|---|
22 | 0.114 | 0.028 | 0.260 | 12.365 | 3.851 | 35.956 |
30 | 0.244 | 0.098 | 0.416 | 4.713 | 2.401 | 10.157 |
40 | 0.522 | 0.339 | 0.685 | 1.980 | 1.461 | 2.951 |
43 | 0.612 | 0.436 | 0.763 | 1.667 | 1.311 | 2.293 |
50 | 0.788 | 0.644 | 0.898 | 1.278 | 1.113 | 1.552 |
60 | 0.924 | 0.827 | 0.978 | 1.084 | 1.023 | 1.210 |
65 | 0.955 | 0.881 | 0.991 | 1.048 | 1.009 | 1.134 |
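To illustrate how these estimates would be used, the sketch below (a hypothetical example, not part of the original analysis) corrects a non-instantaneous count made at 30 m min-1 back to its instantaneous equivalent by multiplying by the posterior draws of \(p\) (equivalently, dividing by the draws of \(f\)). Column 11 corresponds to a speed of 30 because `X_pred` spans speeds 20 to 65:
# Hypothetical non-instantaneous count of 10 sharks at 30 m min-1
count <- 10
corrected <- count * post_propST$p_pred[, 11]  # = count / post_propST$f_pred[, 11]
mean(corrected)                                # approx. 10 * 0.244 = 2.4
quantile(corrected, c(0.025, 0.975))           # 95% interval for the corrected count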
plot_pf <- function(post, dat = dp2, mean_X = 43) {
require(tidybayes)
require(patchwork)
mybreaks=c(1,5,10,20,30,50)
plot_f_by_speed <- list(X = dat$X_pred) %>%
add_draws(post$f_pred[1:200,], value = "f") %>%
group_by(X, .draw) %>%
ggplot(aes(x = X + mean_X, y = f, group = .draw)) +
geom_hline(aes(yintercept=1), col = "dark grey") +
geom_line(alpha = .4, col = "lightsteelblue") +
stat_summary(aes(group = NULL), fun = mean, geom = 'line', alpha=0.9) +
geom_point(data = data.frame(X = dat$X + mean_X, f = dat$R , Tr = dat$Transect),
aes(x = X, y = f, col = factor(Tr),
size=dat$Total), shape=1,
inherit.aes = F) +
ylab(expression(paste("Inflation factor (", italic("f"),")"))) +
xlab(expression(paste('Swimming speed (m min' ^ -1, ")", sep=""))) +
coord_cartesian(ylim=c(1,10)) +
scale_size_continuous("Total count", trans="identity",
range=c(1,10), breaks=mybreaks) +
theme_bw()+
theme(legend.position = "none")
plot_p_by_speed <- list(X = dat$X_pred) %>%
add_draws(post$p_pred[1:200,], value = "p") %>%
group_by(X, .draw) %>%
ggplot(aes(x = X + mean_X, y = p, group = .draw)) +
geom_hline(aes(yintercept=0), col = "dark grey") +
geom_hline(aes(yintercept=1), col = "dark grey") +
geom_line(alpha = .4, col = "lightsteelblue") +
stat_summary(aes(group = NULL), fun = mean, geom = 'line', alpha=0.9) +
geom_point(data = data.frame(X = dat$X + mean_X,
p = dat$Insider / dat$Total , Tr = dat$Transect),
aes(x = X, y = p, col = factor(Tr),
size=dat$Total), shape=1,
inherit.aes = F) +
ylab(expression(paste("Proportion of insiders (", italic("p"),")"))) +
xlab(expression(paste('Swimming speed (m min' ^ -1, ")", sep=""))) +
scale_colour_discrete("Transect no.") +
scale_size_continuous("Total count", trans="identity",
range=c(1,10), breaks=mybreaks) +
theme_bw()
plot_p_by_speed + plot_f_by_speed +
plot_layout( guides = "collect") +
plot_layout(ncol=1) +
plot_annotation(tag_levels = 'A')
}
plot_pf(post_propST)
Our base model `mod_propST` included a term for the relationship of the proportion of insiders with swimming speed, and a term allowing for differences among transects. We fitted alternative models with each of these terms removed in turn, and with both terms removed (i.e., an intercept-only model), and compared these with the base model using expected log predictive density (elpd), estimated with Pareto-smoothed importance sampling (Vehtari et al. 2017). A higher elpd score indicates that a model is expected to have lower error when making out-of-sample predictions.
Model with the speed term removed (transect effects only; saved as object `mod_propCT`):
\[ \begin{aligned} y^*_i &\sim {\rm Binomial}(y_i, p_i) \\ {\rm logit}(p_i) &= \alpha + z_{Tr[i]} \sigma_t\\ \alpha &\sim {\rm Normal}(0, 1.4)\\ z_{Tr[i]} &\sim {\rm Normal}(0, 1)\\ \sigma_t &\sim {\rm Exponential}(1)\\ \end{aligned} \]
data{
int N;
int N_Tr;
int Tr[N];
int N_pred;
int Total[N];
int Insider[N];
}
parameters{
real a;
vector[N_Tr] z_Tr;
real<lower=0> sd_Tr;
}
model{
vector[N] p;
a ~ normal( 0 , 1.4 );
z_Tr ~ normal( 0 , 1 );
sd_Tr ~ exponential( 1 );
for ( i in 1:N ) p[i] = inv_logit(a + z_Tr[Tr[i]]*sd_Tr);
Insider ~ binomial( Total , p );
}
generated quantities{
vector[N] log_lik;
vector[N] p;
for ( i in 1:N ) p[i] = inv_logit(a + z_Tr[Tr[i]]*sd_Tr);
for ( i in 1:N ) log_lik[i] = binomial_lpmf( Insider[i] | Total[i] , p[i] );
}
fit_propCT <- sampling(mod_propCT,
data=dp2,
iter=2000,
chains=4,
cores=4,
seed=111)
Model with the transect term removed (speed only; saved as object `mod_propS`):
\[ \begin{aligned} y^*_i &\sim {\rm Binomial}(y_i, p_i) \\ {\rm logit}(p_i) &= \alpha + \beta ({\rm Speed}_i - 43)\\ \alpha &\sim {\rm Normal}(0, 1.4)\\ \beta &\sim {\rm Normal}(0, 0.1)\\ \end{aligned} \]
data{
int N;
int Total[N];
int Insider[N];
vector[N] X;
}
parameters{
real a;
real b;
}
model{
vector[N] p;
b ~ normal( 0 , 0.1 );
a ~ normal( 0 , 1.4 );
for ( i in 1:N ) p[i] = inv_logit(a + b * X[i]);
Insider ~ binomial( Total , p );
}
generated quantities{
vector[N] log_lik;
vector[N] p;
for ( i in 1:N ) p[i] = inv_logit(a + b * X[i]);
for ( i in 1:N ) log_lik[i] = binomial_lpmf( Insider[i] | Total[i] , p[i] );
}
fit_propS <- sampling(mod_propS,
data=dp2,
iter=2000,
chains=4,
cores=4,
seed=111)
Intercept-only model, with both terms removed (saved as object `mod_propC`):
\[ \begin{aligned} y^*_i &\sim {\rm Binomial}(y_i, p) \\ {\rm logit}(p) &= \alpha\\ \alpha &\sim {\rm Normal}(0, 1.4)\\ \end{aligned} \]
data{
int N;
int Total[N];
int Insider[N];
}
parameters{
real a;
}
model{
real p;
a ~ normal( 0 , 1.4 );
p = inv_logit(a);
Insider ~ binomial( Total , p );
}
generated quantities{
vector[N] log_lik;
real p;
p = inv_logit(a);
for ( i in 1:N ) log_lik[i] = binomial_lpmf( Insider[i] | Total[i] , p );
}
fit_propC <- sampling(mod_propC,
data=dp2,
iter=2000,
chains=4,
cores=4,
seed=111)
loo_propST <- fit_propST %>% loo
loo_propS <- fit_propS %>% loo
loo_propC <- fit_propC %>% loo
loo_propCT <- fit_propCT %>% loo
loo_compare(loo_propST, loo_propS, loo_propCT, loo_propC)
## elpd_diff se_diff
## model1 0.0 0.0
## model2 -10.2 9.9
## model3 -20.6 11.0
## model4 -28.1 23.5
We can see from the output above that the base model achieves the highest elpd (the other models all have negative elpd_diff relative to it), so we use the base model for inferences.
Table S2-2 and the plots below show no issues with the model fitting: there were no divergent transitions; the effective sample sizes (`n_eff`) were all very high; \(\hat{R}\) values were all near 1; and the chains appear to be well mixed.
mypars = c("a","b","p_pred[1]","p_pred[4]","p_pred[11]","p_pred[21]")
my_diag_plots <- function(fit, mypars) {
pt <- mcmc_trace(as.array(fit), pars = mypars)
pn <- mcmc_neff(neff_ratio(fit, pars = mypars)) + yaxis_text(hjust = 1)
pr <- mcmc_rhat(rhat(fit, pars = mypars)) + yaxis_text(hjust = 1)
pt / (pn + pr) + plot_layout(heights=c(2,1)) +
plot_annotation(tag_levels = 'A')
}
my_diag_plots(fit_propST, mypars=mypars)
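The divergence count is not shown in the plots above; it can be confirmed directly with rstan's `get_num_divergent` helper:
# Confirm the absence of divergent transitions reported above
rstan::get_num_divergent(fit_propST)  # expect 0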
As a sensitivity analysis, we present the equivalent plots and results for the model with a more conservative prior for the intercept, which favours higher proportions of insiders (and is thus more sceptical of non-instantaneous bias).
The only difference is the prior for the intercept parameter: \(\alpha \sim {\rm Normal}(0, 1.4)\) in the base model versus \(\alpha \sim {\rm Normal}(1, 1.5)\) in the conservative model (Stan code below, saved as object `mod_propST_cons`).
data{
int N;
int N_Tr;
int Tr[N];
int N_pred;
int Total[N];
int Insider[N];
vector[N] X;
vector[N_pred] X_pred;
}
parameters{
real a;
real b;
vector[N_Tr] z_Tr;
real<lower=0> sd_Tr;
}
model{
vector[N] p;
b ~ normal( 0 , 0.1 );
a ~ normal( 1 , 1.5 );
z_Tr ~ normal( 0 , 1 );
sd_Tr ~ exponential( 1 );
for ( i in 1:N ) {
p[i] = inv_logit(a + b * X[i] + z_Tr[Tr[i]]*sd_Tr);
}
Insider ~ binomial( Total , p );
}
generated quantities{
vector[N] log_lik;
vector[N] p;
vector[N] f;
vector[N_pred] p_pred;
vector[N_pred] f_pred;
for ( i in 1:N ) {
p[i] = inv_logit(a + b * X[i] + z_Tr[Tr[i]]*sd_Tr);
f[i] = 1 / p[i];
}
for ( i in 1:N ) log_lik[i] = binomial_lpmf( Insider[i] | Total[i] , p[i] );
for ( i in 1:N_pred ) { // Predict p and f across the range of X (for a typical transect, z = 0)
p_pred[i] = inv_logit(a + b * X_pred[i]);
f_pred[i] = 1 / p_pred[i];
}
}
fit_propST_cons <- sampling(mod_propST_cons,
data=dp2,
iter=2000,
chains=4,
cores=4,
seed=111)
Table S2-4 and Figure S2-5 compare the results for the base and conservative models: the slope parameter \(\beta\) (`b`), and the predicted proportion of insiders (`p`) and inflation factor (`f`) at various diver swimming speeds. The plots and posterior distributions for the values of interest are very similar between the two models, so we can be confident that the base prior for the intercept is not driving our inferences.
post_propST_cons = extract(fit_propST_cons)
plot_pf(post_propST) + plot_annotation(title = "Base prior")
plot_pf(post_propST_cons) + plot_annotation(title = "Conservative prior")
## Summarise posterior draws as "mean (2.5%, 97.5%)" strings for the base (b)
## and conservative (c) models
sum_bc <- function(b,c) {
c(`Base model` = paste0(b %>% mean %>% round(.,2), " (",
b %>% quantile(., 0.025) %>% round(.,2), ",",
b %>% quantile(., 0.975) %>% round(.,2), ")") ,
`Conservative model` = paste0(c %>% mean %>% round(.,2), " (",
c %>% quantile(., 0.025) %>% round(.,2), ",",
c %>% quantile(., 0.975) %>% round(.,2), ")") ) }
rbind(
b = sum_bc(post_propST$b, post_propST_cons$b),
p_30 = sum_bc(post_propST$p_pred[,11],post_propST_cons$p_pred[,11]),
p_40 = sum_bc(post_propST$p_pred[,21],post_propST_cons$p_pred[,21]),
p_50 = sum_bc(post_propST$p_pred[,31],post_propST_cons$p_pred[,31]),
f_30 = sum_bc(post_propST$f_pred[,11],post_propST_cons$f_pred[,11]),
f_40 = sum_bc(post_propST$f_pred[,21],post_propST_cons$f_pred[,21]),
f_50 = sum_bc(post_propST$f_pred[,31],post_propST_cons$f_pred[,31])
) %>%
kable(caption = tables("t_dat")) %>%
kable_styling()
 | Base model | Conservative model |
---|---|---|
b | 0.13 (0.08,0.18) | 0.13 (0.08,0.18) |
p_30 | 0.24 (0.1,0.42) | 0.25 (0.11,0.44) |
p_40 | 0.52 (0.34,0.68) | 0.54 (0.36,0.7) |
p_50 | 0.79 (0.64,0.9) | 0.8 (0.66,0.91) |
f_30 | 4.71 (2.4,10.16) | 4.47 (2.29,9.42) |
f_40 | 1.98 (1.46,2.95) | 1.91 (1.43,2.74) |
f_50 | 1.28 (1.11,1.55) | 1.26 (1.1,1.51) |
Gabry J, Simpson D, Vehtari A, Betancourt M, Gelman A (2019) Visualization in Bayesian Workflow. Journal of the Royal Statistical Society: Series A (Statistics in Society) 182:389–402.
Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB (2014) Bayesian Data Analysis, 3rd ed. CRC press, Boca Raton.
McElreath R (2015) Statistical Rethinking: A Bayesian Course with Examples in R and Stan. CRC Press.
Vehtari A, Gelman A, Gabry J (2017) Practical Bayesian Model Evaluation Using Leave-One-Out Cross-Validation and WAIC. Statistics and Computing 27:1413–1432.