This vignette demonstrates how to access most of data stored in a stanfit object. A stanfit object (an object of class "stanfit"
) contains the output derived from fitting a Stan model using Markov chain Monte Carlo or one of Stan’s variational approximations (meanfield or full-rank). Throughout the document we’ll use the stanfit object obtained from fitting the Eight Schools example model:
library(rstan)
fit <- stan_demo("eight_schools", refresh = 0)
Warning: There were 11 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
Warning: Examine the pairs() plot to diagnose sampling problems
class(fit)
[1] "stanfit"
attr(,"package")
[1] "rstan"
There are several functions that can be used to access the draws from the posterior distribution stored in a stanfit object. These are extract
, as.matrix
, as.data.frame
, and as.array
, each of which returns the draws in a different format.
The extract
function (with its default arguments) function returns a list with named components corresponding to the model parameters.
list_of_draws <- extract(fit)
print(names(list_of_draws))
[1] "mu" "tau" "eta" "theta" "lp__"
In this model the parameters mu
and tau
are scalars and theta
is a vector with eight elements. This means that the draws for mu
and tau
will be vectors (with length equal to the number of post-warmup iterations times the number of chains) and the draws for theta
will be a matrix, with each column corresponding to one of the eight components:
head(list_of_draws$mu)
[1] 12.486875 5.076899 9.406396 8.214410 9.929440 14.989116
head(list_of_draws$tau)
[1] 6.7132847 0.2337698 8.1825272 1.3028420 14.3867909 2.4507864
head(list_of_draws$theta)
iterations [,1] [,2] [,3] [,4] [,5] [,6]
[1,] 9.461048 23.908189 16.826486 9.975644 14.560703 6.616533
[2,] 5.094057 4.990376 4.955423 5.057209 4.919296 5.049225
[3,] 14.899391 16.278589 16.229008 16.409987 -10.024558 5.523476
[4,] 10.628681 8.395092 8.203159 9.064492 7.313150 10.330872
[5,] 15.056341 2.778215 4.305036 9.798058 3.036702 2.766697
[6,] 9.229407 15.113267 18.209782 14.809398 16.568487 15.118246
iterations [,7] [,8]
[1,] 6.753117 2.048181
[2,] 5.037592 5.288633
[3,] 15.001986 12.521378
[4,] 9.422001 7.701444
[5,] 11.612728 17.645587
[6,] 14.912755 13.051368
The as.matrix
, as.data.frame
, and as.array
functions can also be used to retrieve the posterior draws from a stanfit object:
matrix_of_draws <- as.matrix(fit)
print(colnames(matrix_of_draws))
[1] "mu" "tau" "eta[1]" "eta[2]" "eta[3]" "eta[4]"
[7] "eta[5]" "eta[6]" "eta[7]" "eta[8]" "theta[1]" "theta[2]"
[13] "theta[3]" "theta[4]" "theta[5]" "theta[6]" "theta[7]" "theta[8]"
[19] "lp__"
df_of_draws <- as.data.frame(fit)
print(colnames(df_of_draws))
[1] "mu" "tau" "eta[1]" "eta[2]" "eta[3]" "eta[4]"
[7] "eta[5]" "eta[6]" "eta[7]" "eta[8]" "theta[1]" "theta[2]"
[13] "theta[3]" "theta[4]" "theta[5]" "theta[6]" "theta[7]" "theta[8]"
[19] "lp__"
array_of_draws <- as.array(fit)
print(dimnames(array_of_draws))
$iterations
NULL
$chains
[1] "chain:1" "chain:2" "chain:3" "chain:4"
$parameters
[1] "mu" "tau" "eta[1]" "eta[2]" "eta[3]" "eta[4]"
[7] "eta[5]" "eta[6]" "eta[7]" "eta[8]" "theta[1]" "theta[2]"
[13] "theta[3]" "theta[4]" "theta[5]" "theta[6]" "theta[7]" "theta[8]"
[19] "lp__"
The as.matrix
and as.data.frame
methods essentially return the same thing except in matrix and data frame form, respectively. The as.array
method returns the draws from each chain separately and so has an additional dimension:
print(dim(matrix_of_draws))
print(dim(df_of_draws))
print(dim(array_of_draws))
[1] 4000 19
[1] 4000 19
[1] 1000 4 19
By default all of the functions for retrieving the posterior draws return the draws for all parameters (and generated quantities). The optional argument pars
(a character vector) can be used if only a subset of the parameters is desired, for example:
mu_and_theta1 <- as.matrix(fit, pars = c("mu", "theta[1]"))
head(mu_and_theta1)
parameters
iterations mu theta[1]
[1,] -0.4539793 4.607627
[2,] -0.4539793 4.607627
[3,] 0.2655726 3.693130
[4,] 2.3952512 13.605032
[5,] 7.4405332 7.229162
[6,] 5.8874078 2.259399
Summary statistics are obtained using the summary
function. The object returned is a list with two components:
fit_summary <- summary(fit)
print(names(fit_summary))
[1] "summary" "c_summary"
In fit_summary$summary
all chains are merged whereas fit_summary$c_summary
contains summaries for each chain individually. Typically we want the summary for all chains merged, which is what we’ll focus on here.
The summary is a matrix with rows corresponding to parameters and columns to the various summary quantities. These include the posterior mean, the posterior standard deviation, and various quantiles computed from the draws. The probs
argument can be used to specify which quantiles to compute and pars
can be used to specify a subset of parameters to include in the summary.
For models fit using MCMC, also included in the summary are the Monte Carlo standard error (se_mean
), the effective sample size (n_eff
), and the R-hat statistic (Rhat
).
print(fit_summary$summary)
mean se_mean sd 2.5% 25%
mu 7.729488267 0.12450562 4.6831183 -1.7261778 4.5427491
tau 6.283030357 0.15340960 5.3516839 0.2703335 2.3083988
eta[1] 0.391475980 0.01760641 0.9139955 -1.4168183 -0.2200949
eta[2] 0.003125851 0.01684324 0.8645206 -1.6597507 -0.5748516
eta[3] -0.166296398 0.01683880 0.9159540 -1.9851784 -0.7853604
eta[4] -0.016487013 0.01580548 0.8783463 -1.7837110 -0.5982585
eta[5] -0.335417619 0.01718280 0.8778829 -2.0382564 -0.9156673
eta[6] -0.184483251 0.01735472 0.8774091 -1.8937820 -0.7810608
eta[7] 0.335008451 0.01660627 0.8934875 -1.5166426 -0.2302950
eta[8] 0.105440665 0.01672050 0.8861569 -1.6553779 -0.4666342
theta[1] 11.249019199 0.18364455 7.9845803 -1.7567678 6.0185741
theta[2] 7.712124731 0.10657145 6.1082910 -4.5840223 3.8721269
theta[3] 6.350647750 0.14996509 7.3762234 -10.0852382 2.5356931
theta[4] 7.661529731 0.11158905 6.3933654 -5.4326505 3.8688637
theta[5] 5.092344387 0.11908738 6.2357212 -8.5015380 1.4423010
theta[6] 6.265100788 0.11780219 6.4771635 -8.5515199 2.6335756
theta[7] 10.412976094 0.12001533 6.5841921 -1.1144428 5.9238673
theta[8] 8.519610689 0.15017011 7.3401927 -5.4383628 4.1354525
lp__ -39.417253126 0.07685835 2.5365246 -44.8643049 -41.0132422
50% 75% 97.5% n_eff Rhat
mu 7.65387977 10.8544748 17.174263 1414.791 1.0017393
tau 5.03708068 9.0438523 19.284862 1216.959 1.0002056
eta[1] 0.40600696 1.0278283 2.113588 2694.925 1.0010516
eta[2] 0.01079158 0.5595116 1.719588 2634.507 1.0009476
eta[3] -0.17901291 0.4409814 1.642419 2958.863 0.9996343
eta[4] -0.01922742 0.5782290 1.672891 3088.277 1.0008058
eta[5] -0.35165801 0.2260518 1.534272 2610.268 1.0002202
eta[6] -0.18976760 0.3893553 1.559546 2556.048 0.9998816
eta[7] 0.34196484 0.9261350 2.035984 2894.897 1.0006285
eta[8] 0.11614203 0.6774195 1.837521 2808.811 1.0008682
theta[1] 9.97486797 15.2540485 31.549272 1890.375 1.0003312
theta[2] 7.70356438 11.4975334 20.220256 3285.169 1.0005471
theta[3] 6.61458873 10.6455958 20.266667 2419.289 1.0023597
theta[4] 7.66481619 11.5583562 20.594161 3282.585 1.0002896
theta[5] 5.59835370 9.1837565 16.285756 2741.839 0.9995162
theta[6] 6.64319811 10.3713008 18.227114 3023.173 1.0001929
theta[7] 9.82056744 14.2065331 24.862031 3009.758 1.0006305
theta[8] 8.09365175 12.5358561 24.651408 2389.175 1.0028142
lp__ -39.23143185 -37.6057979 -35.011607 1089.171 1.0012956
If, for example, we wanted the only quantiles included to be 10% and 90%, and for only the parameters included to be mu
and tau
, we would specify that like this:
mu_tau_summary <- summary(fit, pars = c("mu", "tau"), probs = c(0.1, 0.9))$summary
print(mu_tau_summary)
mean se_mean sd 10% 90% n_eff Rhat
mu 7.729488 0.1245056 4.683118 1.9000236 13.57571 1414.791 1.001739
tau 6.283030 0.1534096 5.351684 0.9346014 12.84918 1216.959 1.000206
Since mu_tau_summary
is a matrix we can pull out columns using their names:
mu_tau_80pct <- mu_tau_summary[, c("10%", "90%")]
print(mu_tau_80pct)
10% 90%
mu 1.9000236 13.57571
tau 0.9346014 12.84918
For models fit using MCMC the stanfit object will also contain the values of parameters used for the sampler. The get_sampler_params
function can be used to access this information.
The object returned by get_sampler_params
is a list with one component (a matrix) per chain. Each of the matrices has number of columns corresponding to the number of sampler parameters and the column names provide the parameter names. The optional argument inc_warmup (defaulting to TRUE
) indicates whether to include the warmup period.
sampler_params <- get_sampler_params(fit, inc_warmup = FALSE)
sampler_params_chain1 <- sampler_params[[1]]
colnames(sampler_params_chain1)
[1] "accept_stat__" "stepsize__" "treedepth__" "n_leapfrog__"
[5] "divergent__" "energy__"
To do things like calculate the average value of accept_stat__
for each chain (or the maximum value of treedepth__
for each chain if using the NUTS algorithm, etc.) the sapply
function is useful as it will apply the same function to each component of sampler_params
:
mean_accept_stat_by_chain <- sapply(sampler_params, function(x) mean(x[, "accept_stat__"]))
print(mean_accept_stat_by_chain)
[1] 0.8420007 0.8495299 0.7739685 0.7506460
max_treedepth_by_chain <- sapply(sampler_params, function(x) max(x[, "treedepth__"]))
print(max_treedepth_by_chain)
[1] 3 4 3 5
The Stan program itself is also stored in the stanfit object and can be accessed using get_stancode
:
code <- get_stancode(fit)
The object code
is a single string and is not very intelligible when printed:
print(code)
[1] "data {\n int<lower=0> J; // number of schools \n real y[J]; // estimated treatment effects\n real<lower=0> sigma[J]; // s.e. of effect estimates \n}\nparameters {\n real mu; \n real<lower=0> tau;\n vector[J] eta;\n}\ntransformed parameters {\n vector[J] theta;\n theta = mu + tau * eta;\n}\nmodel {\n target += normal_lpdf(eta | 0, 1);\n target += normal_lpdf(y | theta, sigma);\n}"
attr(,"model_name2")
[1] "schools"
A readable version can be printed using cat
:
cat(code)
data {
int<lower=0> J; // number of schools
real y[J]; // estimated treatment effects
real<lower=0> sigma[J]; // s.e. of effect estimates
}
parameters {
real mu;
real<lower=0> tau;
vector[J] eta;
}
transformed parameters {
vector[J] theta;
theta = mu + tau * eta;
}
model {
target += normal_lpdf(eta | 0, 1);
target += normal_lpdf(y | theta, sigma);
}
The get_inits
function returns initial values as a list with one component per chain. Each component is itself a (named) list containing the initial values for each parameter for the corresponding chain:
inits <- get_inits(fit)
inits_chain1 <- inits[[1]]
print(inits_chain1)
$mu
[1] 0.1088753
$tau
[1] 1.548756
$eta
[1] -0.2636909 1.8027576 1.0231196 -1.7044300 -1.9552741 0.8916144
[7] -0.4864400 -1.0945375
$theta
[1] -0.2995176 2.9009068 1.6934378 -2.5308708 -2.9193672 1.4897684
[7] -0.6445016 -1.5862962
The get_seed
function returns the (P)RNG seed as an integer:
print(get_seed(fit))
[1] 878857678
The get_elapsed_time
function returns a matrix with the warmup and sampling times for each chain:
print(get_elapsed_time(fit))
warmup sample
chain:1 0.028499 0.022623
chain:2 0.027020 0.026945
chain:3 0.025327 0.019895
chain:4 0.027118 0.021236