Extended poisson–tweedie: properties and regression models for count data
Bonat, Wagner H.
Kokonendji, Célestin C.
Demétrio, Clarice G. B.
MetadataShow full item record
This item's downloads: 0 (view details)
Bonat, Wagner H. Jørgensen, Bent; Kokonendji, Célestin C.; Hinde, John; Demétrio, Clarice G. B. (2017). Extended poisson–tweedie: properties and regression models for count data. Statistical Modelling: An International Journal 18 (1), 24-49
We propose a new class of discrete generalized linear models based on the class of Poisson-Tweedie factorial dispersion models with variance of the form mu + phi mu(p), where mu is the mean and phi and p are the dispersion and Tweedie power parameters, respectively. The models are fitted by using an estimating function approach obtained by combining the quasi-score and Pearson estimating functions for the estimation of the regression and dispersion parameters, respectively. This provides a flexible and efficient regression methodology for a comprehensive family of count models including Hermite, Neyman Type A, Polya-Aeppli, negative binomial and Poisson-inverse Gaussian. The estimating function approach allows us to extend the Poisson-Tweedie distributions to deal with underdispersed count data by allowing negative values for the dispersion parameter phi. Furthermore, the Poisson-Tweedie family can automatically adapt to highly skewed count data with excessive zeros, without the need to introduce zero-inflated or hurdle components, by the simple estimation of the power parameter. Thus, the proposed models offer a unified framework to deal with under-, equi-, overdispersed, zero-inflated and heavy-tailed count data. The computational implementation of the proposed models is fast, relying only on a simple Newton scoring algorithm. Simulation studies showed that the estimating function approach provides unbiased and consistent estimators for both regression and dispersion parameters. We highlight the ability of the Poisson-Tweedie distributions to deal with count data through a consideration of dispersion, zero-inflated and heavy tail indices, and illustrate its application with four data analyses. We provide an R implementation and the datasets as supplementary materials.