Borislava Mihaylova, Andrew Briggs, Anthony O’Hagan and Simon G. Thompson
Health Economics Research Centre, University of Oxford; Public Health and Health Policy, University of Glasgow; Department of Probability and Statistics, University of Sheffield
Publication details: Health Economics 20, 897-916, 2011.
We review published statistical methods for analysing healthcare resource use and cost data, their ability to address skewness, multimodality and heavy right tails, and their suitability for general use. The aim is to provide guidance on the appropriate strategy for analysing resource use and costs in clinical trials, where sample sizes are often limited. The review identified eleven broad categories of methods: (1) methods based on the normal distribution, (2) models based on normality following a transformation of the data, (3) single-distribution generalized linear models (GLMs), (4) parametric models based on skewed distributions outside the GLM family, (5) models based on mixtures of parametric distributions, (6) two-part and hurdle models, (7) semi-parametric and non-parametric methods, (8) methods based on truncation or trimming of data, (9) data components models, (10) methods based on averaging across a number of models, and (11) Markov chain methods. Based on this review, our recommendations are that, firstly, simple methods are preferred for large sample sizes (in the thousands), where the near-normality of sample means is assured. Secondly, for somewhat smaller sample sizes (in the hundreds), relatively simple methods, able to deal with one or two of the data characteristics studied, may be preferable, but checking the sensitivity of results to assumptions is necessary. Finally, some more complex methods hold promise for the future but are relatively untried in practice; their implementation requires substantial expertise, and thus they are not currently recommended for wider applied work.
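To make two of the method categories above concrete (a single-distribution GLM and a two-part model), the following is a minimal illustrative sketch, not taken from the paper, using simulated data and the statsmodels library. The covariate name `treatment`, the sample size, and the simulated cost distribution are all hypothetical choices for illustration only.

```python
# Illustrative sketch (assumed example, not the paper's analysis): a two-part model
# for skewed cost data with a point mass at zero, where the positive costs are
# modelled with a gamma GLM and log link.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 500
treatment = rng.integers(0, 2, size=n)            # hypothetical binary covariate
X = sm.add_constant(treatment.astype(float))

# Simulate costs: many zeros plus a heavily right-skewed positive component.
any_cost = rng.random(n) < (0.6 + 0.1 * treatment)
positive_cost = rng.gamma(shape=1.5, scale=800 + 400 * treatment)
cost = np.where(any_cost, positive_cost, 0.0)

# Part 1: logistic regression for the probability of incurring any cost.
part1 = sm.GLM(any_cost.astype(float), X, family=sm.families.Binomial()).fit()

# Part 2: gamma GLM with log link for the level of cost among those with cost > 0.
mask = cost > 0
part2 = sm.GLM(cost[mask], X[mask],
               family=sm.families.Gamma(link=sm.families.links.Log())).fit()

# Expected cost combines the parts: E[cost] = Pr(cost > 0) * E[cost | cost > 0].
expected = part1.predict(X) * part2.predict(X)
print("Mean predicted cost by arm:",
      expected[treatment == 0].mean(), expected[treatment == 1].mean())
```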