Posterior predictive p-values are not uniformly distributed
between 0 and 1; instead, they tend to be concentrated around 0.5 (Robins
et al., 2000; Gelman, 2013). Interpreting them as standard p-values
would therefore make the model checking procedure too liberal. A second
difficulty arises here because five p-values are combined in pmin. The
model checking statistics are therefore not interpreted directly. Instead,
each posterior predictive p-value is compared to those obtained on
data sets generated under the model used for the fit. If the posterior
predictive p-value obtained on the real data is lower than those
obtained on the model-generated data sets, this indicates that the
model fits the real data poorly.
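The calibration idea above can be sketched in code: compute the posterior predictive p-value on the real data, then recompute it on data sets simulated under the fitted model itself, and locate the real-data value within that null distribution. The sketch below assumes a simple Normal model and replaces full MCMC with a crude bootstrap approximation of the posterior; all function names, the test statistic, and the simulation sizes are illustrative, not the original authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def approx_posterior(data, n_draws=100):
    # Crude stand-in for posterior sampling: bootstrap draws of (mu, sigma).
    # In a real analysis these would come from an MCMC fit of the model.
    draws = []
    for _ in range(n_draws):
        boot = rng.choice(data, size=len(data), replace=True)
        draws.append((boot.mean(), boot.std(ddof=1)))
    return draws

def ppp_value(data, draws, stat):
    # Posterior predictive p-value: fraction of replicated data sets
    # whose statistic is at least as extreme as the observed one.
    t_obs = stat(data)
    hits = sum(stat(rng.normal(mu, sd, size=len(data))) >= t_obs
               for mu, sd in draws)
    return hits / len(draws)

def calibrated_check(real_data, stat, n_null=20):
    # Compare the real-data p-value to p-values obtained on data sets
    # generated under the fitted model itself (the calibration step
    # described in the text).
    draws = approx_posterior(real_data)
    p_real = ppp_value(real_data, draws, stat)
    p_null = []
    for _ in range(n_null):
        mu, sd = draws[rng.integers(len(draws))]
        sim = rng.normal(mu, sd, size=len(real_data))
        p_null.append(ppp_value(sim, approx_posterior(sim), stat))
    # Fraction of null p-values exceeding the real-data one: values
    # near 1 flag a real-data p-value unusually low under the model.
    frac_above = float(np.mean([p > p_real for p in p_null]))
    return p_real, frac_above
```

Because the null p-values are themselves peaked around 0.5 rather than uniform, it is their empirical distribution, not the nominal 0-to-1 scale, that gives the real-data p-value its meaning.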