The relationship between truck accidents and geometric design of road sections: Poisson versus negative binomial regressions
This paper evaluates the performance of Poisson and negative binomial (NB) regression models in establishing the relationship between truck accidents and geometric design of road sections. Three types of models are considered: Poisson regression, zero-inflated Poisson (ZIP) regression, and NB regression. The maximum likelihood (ML) method is used to estimate the unknown parameters of these models. Two other feasible estimators of the dispersion parameter in the NB regression model are also examined: a moment estimator and a regression-based estimator. These models and estimators are evaluated based on their (i) estimated regression parameters, (ii) overall goodness-of-fit, (iii) estimated relative frequency of truck accident involvements across road sections, (iv) sensitivity to the inclusion of short road sections, and (v) estimated total number of truck accident involvements. Data from the Highway Safety Information System are employed to examine the performance of these models in developing such relationships. The evaluation results suggest that the NB regression model estimated using the moment and regression-based methods should be used with caution. Also, under the ML method, the estimated regression parameters from all three models are quite consistent, and no particular model outperforms the other two in terms of the estimated relative frequencies of truck accident involvements across road sections. It is recommended that the Poisson regression model be used as an initial model for developing the relationship. If the overdispersion of accident data is found to be moderate or high, both the NB and ZIP regression models could be explored. Overall, the ZIP regression model appears to be a serious candidate model when data exhibit excess zeros, e.g., due to underreporting. However, the interpretation of the ZIP model can be difficult.
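As a rough illustration of the three count models compared above, the following Python sketch writes out their probability mass functions together with a simple moment-type estimator of the NB dispersion parameter. The function names, the NB parameterization (variance = mu + alpha * mu^2), the form of the moment estimator, and the toy counts are all assumptions made for illustration; they are not taken from the paper, and the regression-based estimator is not shown.

```python
import math


def poisson_pmf(y, mu):
    """Poisson probability of observing y events when the mean is mu."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))


def nb_pmf(y, mu, alpha):
    """Negative binomial probability with mean mu and dispersion alpha.

    Assumed parameterization: variance = mu + alpha * mu**2, so the
    model approaches the Poisson as alpha goes to 0.
    """
    r = 1.0 / alpha
    log_p = (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def zip_pmf(y, mu, pi):
    """Zero-inflated Poisson: with probability pi the count is a
    structural zero, otherwise it is drawn from Poisson(mu)."""
    base = (1.0 - pi) * poisson_pmf(y, mu)
    return pi + base if y == 0 else base


def moment_dispersion(counts, means):
    """Moment-type estimate of alpha (an assumed, commonly used form):
    sum((y - mu)^2 - mu) / sum(mu^2), truncated at zero."""
    num = sum((y - m) ** 2 - m for y, m in zip(counts, means))
    den = sum(m ** 2 for m in means)
    return max(num / den, 0.0)


# Hypothetical accident counts on ten road sections (not real data).
counts = [0, 0, 0, 1, 0, 2, 0, 0, 5, 0]
mean = sum(counts) / len(counts)
alpha_hat = moment_dispersion(counts, [mean] * len(counts))

print("estimated dispersion:", alpha_hat)
print("P(0) Poisson:", poisson_pmf(0, mean))
print("P(0) NB:     ", nb_pmf(0, mean, alpha_hat))
print("P(0) ZIP:    ", zip_pmf(0, mean, 0.3))
```

With the same mean, both the NB and ZIP forms place more probability on zero counts than the Poisson does, which is why they become candidates when the data are overdispersed or contain excess zeros. In a regression setting, the fitted mean for each road section would replace the common mean used here.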