A Weibull-count approach for handling under- and overdispersed longitudinal/clustered data structures

Date
2018-07-30
Author
Luyts, Martial
Molenberghs, Geert
Verbeke, Geert
Matthijs, Koen
Ribeiro Jr, Eduardo E.
Demétrio, Clarice G. B.
Hinde, John
Recommended Citation
Luyts, Martial, Molenberghs, Geert, Verbeke, Geert, Matthijs, Koen, Ribeiro Jr, Eduardo E., Demétrio, Clarice G. B., & Hinde, John. (2018). A Weibull-count approach for handling under- and overdispersed longitudinal/clustered data structures. Statistical Modelling, 19(5), 569-589. doi: 10.1177/1471082X18789992
Abstract
A Weibull-model-based approach is examined to handle under- and overdispersed discrete data in a hierarchical framework. This methodology was first introduced by Nakagawa and Osaki (1975, IEEE Transactions on Reliability, 24, 300–301), and later examined for under- and overdispersion by Klakattawi et al. (2018, Entropy, 20, 142) in the univariate case. Extensions to hierarchical approaches with under- and overdispersion have not been noted so far, even though they can be obtained in a simple manner. This is of particular interest when analysing clustered/longitudinal data structures, where the underlying correlation structure is often more complex than in cross-sectional studies. In this article, a random-effects extension of the Weibull-count model is proposed and applied to two motivating case studies, originating from the clinical and sociological research fields. A goodness-of-fit evaluation of the model is provided through a comparison with some well-known count models, namely the negative binomial, Conway–Maxwell–Poisson and double Poisson models. Empirical results show that the proposed extension fits the data flexibly, in particular for heavy-tailed, zero-inflated, overdispersed and correlated count data. Discrete left-skewed time-to-event data structures are also flexibly modelled using the approach, with the ability to derive direct interpretations on the median scale, provided the complementary log–log link is used. Finally, a large simulated set of data is created to examine other characteristics such as computational ease and orthogonality properties of the model, with the conclusion that the approach behaves best for highly overdispersed cases.
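As an illustration of how a single Weibull-type count distribution can cover both under- and overdispersion, the following minimal sketch uses the discrete Weibull form commonly attributed to Nakagawa and Osaki, P(Y = y) = q^(y^beta) - q^((y+1)^beta) for y = 0, 1, 2, ... The parameter names q and beta, the function names, and the choice of Python are assumptions made for this example; the article's own notation, link functions and random-effects structure are not reproduced here.

```python
def dw_pmf(y, q, beta):
    """P(Y = y) for the discrete Weibull: q**(y**beta) - q**((y + 1)**beta).

    Assumed parameterization: 0 < q < 1 and beta > 0 (illustrative names).
    """
    return q ** (y ** beta) - q ** ((y + 1) ** beta)


def dw_mean_var(q, beta, tol=1e-15, max_terms=1_000_000):
    """Mean and variance via survival sums.

    Uses E[Y] = sum_{y>=1} P(Y >= y) and E[Y^2] = sum_{y>=1} (2y - 1) P(Y >= y),
    with P(Y >= y) = q**(y**beta). Summation stops once a term falls below tol.
    """
    mean = 0.0
    second = 0.0
    for y in range(1, max_terms):
        s = q ** (y ** beta)  # P(Y >= y)
        if s < tol:
            break
        mean += s
        second += (2 * y - 1) * s
    return mean, second - mean ** 2


def dw_dispersion_index(q, beta):
    """Variance-to-mean ratio: above 1 is overdispersed, below 1 underdispersed."""
    mean, var = dw_mean_var(q, beta)
    return var / mean


def dw_median(q, beta, max_terms=1_000_000):
    """Smallest y with P(Y <= y) >= 0.5, i.e. q**((y + 1)**beta) <= 0.5."""
    for y in range(max_terms):
        if q ** ((y + 1) ** beta) <= 0.5:
            return y
    raise ValueError("median not reached within max_terms")


if __name__ == "__main__":
    # Same q, different shape values: compare the variance-to-mean ratios.
    for beta in (0.5, 1.0, 3.0):
        print(beta, dw_dispersion_index(0.5, beta), dw_median(0.5, beta))
```

With beta = 1 the distribution reduces to the geometric distribution, which is overdispersed; in this sketch, smaller beta gives a heavier tail and larger beta pushes the variance-to-mean ratio below 1. The median has a closed-form threshold in q and beta, which is what makes median-scale interpretation convenient for this family.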