Tony O'Hagan's Research
Since 1970, I have been committed to the Bayesian paradigm for statistical inference,
and almost all of my research has been in the methodology and applications of Bayesian statistics.
Although I am now retired and not so productively employed in research, I am still
actively engaged with the Bayesian statistics research community.
Please note that I no longer have a formal university appointment, so I cannot take on PhD students,
interns or post-doctoral research assistants.
My CV (last updated 25 September 2022; PDF format, file size 110KB; see my
publications page if you do not have the Adobe Acrobat PDF viewer) includes details of all my
published books and papers, as well as my research grants.
My research interests are in Bayesian Statistics. For those unfamiliar with the Bayesian
approach to statistics, I have prepared a reading list.
My publications page contains abstracts of all
my papers since 1997, and you can download copies of those that
are not yet in print.
The three research topics I have been most active in are as follows. (Letter codes EL, CC
and HE identify relevant papers on my publications page.) Full citations for papers and books referred
to below can be found in my CV, and on my publications page for all papers since 1997.
- Elicitation of expert knowledge. (EL)
A key task in applications of Bayesian methods
is the expression of expert knowledge and opinions in the form of
probability distributions, and I have therefore been keenly interested in elicitation
throughout my research career. Although originally seen within the field of Bayesian statistics as
a tool for constructing prior distributions (representations of knowledge existing prior
to obtaining some statistical data), it is now used much more widely. Expert knowledge is
of vital importance in many areas of scientific research, public policy and decision making.
Wherever the available data or scientific knowledge are inadequate to pin down the values of quantities
accurately, elicitation is used to identify the more likely values and to quantify the
degree of uncertainty around them; a small sketch of fitting a distribution to elicited judgements
is given at the end of this item.
I have been involved in several significant developments in elicitation, and I am one of the most experienced
researchers, teachers and practitioners in the field.
- I led a research project known as BEEP (Bayesian Elicitation of Experts' Probabilities) from 2003 to 2006,
funded by the National Health Service's Research Methodology Programme. The most significant output of this
project was the book O'Hagan, Buck et al (2006) which provided a substantial review of the literature
in the field at the time. It is published by
Wiley
and is still in print.
- In 2008, Jeremy Oakley and I announced the first version of
SHELF, the Sheffield Elicitation Framework.
The original impetus for this was the BEEP project and a realisation that there was almost no training material
available for anyone wanting to learn how to conduct elicitation. So we created SHELF as a package
of documents, templates and simple software as an aid to conducting elicitation. SHELF has been updated
several times since, and is now in version 3.0.
- To provide further support to those who wish to implement the SHELF approach to elicitation, I have
developed training courses,
including a one-day introduction and an intensive three-day training for facilitators.
My principal papers in this field are as follows:
O'Hagan (1998) sets out my early work in the field;
Oakley and O'Hagan (2007) introduces a novel approach to quantifying uncertainty in elicitation, which
was extended by Gosling, Oakley and O'Hagan (2007) and Moala and O'Hagan (2010);
O'Hagan (2012) introduced the idea of elaboration to address complex elicitation challenges.
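To make the basic task concrete, here is a minimal sketch in Python (not the SHELF software itself) of fitting a
probability distribution to an expert's elicited quantiles; the elicited values and the choice of a Beta family
are invented purely for illustration.

    # A minimal sketch (not the SHELF software): fit a Beta distribution to an
    # expert's elicited quartiles by least squares on the quantile scale.
    # The elicited values below are invented purely for illustration.
    import numpy as np
    from scipy import optimize, stats

    probs = np.array([0.25, 0.50, 0.75])      # quartile levels asked of the expert
    elicited = np.array([0.30, 0.40, 0.55])   # expert's judgements about a proportion

    def loss(log_params):
        a, b = np.exp(log_params)             # keep the shape parameters positive
        return np.sum((stats.beta.ppf(probs, a, b) - elicited) ** 2)

    fit = optimize.minimize(loss, x0=np.log([2.0, 2.0]), method="Nelder-Mead")
    a_hat, b_hat = np.exp(fit.x)
    print(f"fitted Beta({a_hat:.2f}, {b_hat:.2f})")
    print("implied 90% interval:", stats.beta.ppf([0.05, 0.95], a_hat, b_hat))

In a real elicitation the fitted distribution is shown back to the expert, who can revise their judgements
if its implications look wrong; that feedback step is an important part of the SHELF process.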
- Bayesian analysis of computer code outputs. (CC) This topic is
concerned with a variety of problems arising in the use of computer models. Mathematical models are used
in almost every area of science, technology, economics, engineering, etc., and are increasingly
used for making policy and decisions. Models typically incorporate the best scientific understanding
of the processes involved, with a view to understanding complex interactions, predicting and controlling future
behaviour, and even learning about the parameters of the underlying equations.
There is inevitably uncertainty around the predictions of such models, arising from a number of sources,
including uncertainty about the correct values of parameters for the equations, as well as uncertainty
about the validity of the equations themselves. In principle, uncertainty about parameter settings
can be explored by making many runs of the model with the parameters varied between runs. However,
the computer programs to run these models often take hours or even days for a single run.
The Bayesian approach that I have been heavily involved in developing is designed to make maximum use of a
very limited number of model runs, typically by building a fast statistical emulator of the model;
a minimal sketch of this idea is given at the end of this item.
The field has become a major area of research in statistics, applied mathematics and engineering.
The applied maths and engineering community coined the term "Uncertainty Quantification" (UQ), which has
superseded the name "Bayesian analysis of computer code outputs" that I and other statisticians had used.
Several journals and many large research groups are now devoted to UQ.
I have made some significant contributions to this field.
- The seminal paper Kennedy and O'Hagan (2001) (more than 1,700 citations to date according to Google Scholar)
introduced the concept of model discrepancy (also known as model error
or structural error), which is crucial when observational data are used to calibrate the model;
the basic formulation is sketched after this list.
- I initiated and led the MUCM (Managing Uncertainty in Complex Models) project from 2006 to 2012
(including the extension, MUCM2). This was funded to the tune of more than three million pounds and
was a collaboration between Sheffield University and four other UK research institutions. The project
extended the work of Kennedy and O'Hagan and of others, and gave great impetus to the field.
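In simplified form (omitting the scaling factor used in the original paper), the calibration model of
Kennedy and O'Hagan (2001) links a field observation z_i made at inputs x_i to the simulator output \eta via

    z_i = \eta(x_i, \theta) + \delta(x_i) + \varepsilon_i ,

where \theta is the vector of unknown calibration parameters, \delta(\cdot) is the model discrepancy
(itself given a Gaussian process prior) and \varepsilon_i is observation error. Ignoring \delta tends to give
over-confident and biased estimates of \theta, a point developed further in Brynjarsdottir and O'Hagan (2014),
listed below.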
My principal papers in this field are as follows:
Haylock and O'Hagan (1996) laid the foundation for the Bayesian analysis by proposing a Gaussian process
prior distribution for the model output and deriving the uncertainty in outputs due to input uncertainty;
Kennedy and O'Hagan (2000) gave a Bayesian approach for integrating data from a series of models with increasing degrees
of accuracy; Kennedy and O'Hagan (2001) introduced model discrepancy and presented the resulting theory
of model calibration;
Kennedy, O'Hagan et al (2008) presented a major practical application in the study of climate;
Conti and O'Hagan (2010) described extensions to multi-output and dynamic models;
Brynjarsdottir and O'Hagan (2014) showed the importance of using genuine prior information
about the model discrepancy.
See also the GEM software and the short course
on uncertainty in computer codes that is based on GEM.
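As noted above, the key computational idea is to build a fast statistical emulator from a very limited
number of model runs. The Python sketch below is a minimal illustration of that idea, not the GEM or MUCM
implementation: the toy simulator, kernel settings and input distribution are all invented for the example,
and for simplicity uncertainty is propagated through the emulator's posterior mean only, ignoring the
emulator's own uncertainty about the model.

    # A minimal Gaussian-process emulation sketch: fit a GP to a handful of runs
    # of an "expensive" simulator, then propagate input uncertainty through the
    # cheap emulator instead of the simulator itself.
    import numpy as np

    def simulator(x):
        # Stand-in for an expensive computer model (one input, one output).
        return np.sin(3 * x) + 0.5 * x ** 2

    def kernel(a, b, variance=1.0, length=0.4):
        # Squared-exponential covariance between the points in a and b.
        return variance * np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

    # A very limited number of model runs (the design).
    x_design = np.linspace(0.0, 2.0, 8)
    y_design = simulator(x_design)

    # Condition the zero-mean GP on the design runs (small jitter for stability).
    K = kernel(x_design, x_design) + 1e-8 * np.eye(len(x_design))
    alpha = np.linalg.solve(K, y_design)

    def emulator_mean(x_new):
        return kernel(x_new, x_design) @ alpha

    # Propagate input uncertainty, here X ~ N(1.0, 0.3^2), by Monte Carlo on the emulator.
    rng = np.random.default_rng(0)
    x_samples = rng.normal(1.0, 0.3, size=100_000)
    y_samples = emulator_mean(x_samples)
    print("output mean:", y_samples.mean(), " output sd:", y_samples.std())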
- Bayesian methods in health economics. (HE)
This topic is concerned with evaluating the cost-effectiveness of medical treatments. Cost-effectiveness
has become an important issue over the last 20 years, with health care budgets strained worldwide by
increasingly sophisticated, and expensive, drugs and procedures. In the UK, NICE (the National Institute for
Health and Care Excellence) advises the National Health Service and helps to determine which treatments
are available under the NHS.
Statistical methods have been firmly established for much longer in the analysis of
clinical trial data, a field dominated by frequentist inference that has been very slow to
accept Bayesian approaches. In contrast, health economics is a much younger field which has been
open to Bayesian analysis from the start. Indeed, the ability of Bayesian methods to seamlessly
integrate evidence from all sources, not just well-controlled clinical trials but also
observational studies and even expert judgement, has made the Bayesian approach the dominant one
in this field; a minimal sketch of a Bayesian cost-effectiveness comparison is given at the end of this item.
Through the initiative of John Stevens, who was then working for AstraZeneca, I became involved
in health economics at an early stage of its development, and our joint papers were influential
in establishing the soundness and value of Bayesian methods.
My key papers in this field are as follows:
O'Hagan and Stevens (2002) is a review of work on the assessment of cost-effectiveness
using data from clinical trials;
Stevens, O'Hagan and Miller (2003) is a tutorial paper based on an application of those methods,
which was awarded a prize for the best paper in the journal Pharmaceutical Statistics in 2003;
Kharroubi, O'Hagan and Brazier (2005) introduces a Bayesian
method for estimating health-related quality of life, with a significant application published
in Kharroubi, Brazier and O'Hagan (2010);
O'Hagan, Stevenson and Madan (2007) provides efficient
ways to estimate cost-effectiveness using patient-level simulation models;
Pickin, Cooper et al (2009) is an application to a NICE cost-effectiveness assessment.
See also the course of lectures available on Bayesian
methods in health economics.
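The Python sketch below illustrates the kind of calculation involved: given posterior samples of mean costs
and mean effects for two treatments, it estimates the probability that the new treatment is cost-effective at
several willingness-to-pay values (points on a cost-effectiveness acceptability curve). The posterior samples
are simulated placeholders, not results from any of the papers above.

    # A minimal Bayesian cost-effectiveness sketch using incremental net benefit.
    # The "posterior samples" are simulated placeholders; in practice they would
    # come from a model fitted to trial or observational data.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 50_000
    effect_new, cost_new = rng.normal(1.10, 0.10, n), rng.normal(12_000, 1_500, n)
    effect_old, cost_old = rng.normal(1.00, 0.10, n), rng.normal(10_000, 1_200, n)

    for wtp in (10_000, 20_000, 30_000):   # willingness to pay per unit of effect
        # Incremental net monetary benefit of the new treatment over the old.
        inb = wtp * (effect_new - effect_old) - (cost_new - cost_old)
        print(f"P(cost-effective at WTP {wtp}): {np.mean(inb > 0):.2f}")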
I have also done research in various other areas in the general field of Bayesian statistics,
including the following.
- Bayesian inference for unknown functions.
There are many situations where we wish to estimate or make other inference
about an unknown function. The topic of Bayesian analysis of computer code
outputs is actually one example of this, where the function of interest
is the computer code itself. My interest in the area began right at the start of
my career with the paper O'Hagan (1978), in which I first used a Gaussian process to model
prior opinion about an unknown function - in this case a regression function. I first used this
approach for a mathematical function in O'Hagan (1991), where I considered inference about an
integral; a minimal sketch of that idea is given at the end of this item.
This led directly to the work on computer codes. Other examples that I have worked on include
inference for the radiocarbon calibration curve, in Gomez Portugal Aguilar, Litton and O'Hagan (2002),
and interpolation of pollution monitoring stations, in Schmidt and O'Hagan (2003).
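The sketch below illustrates the simplest version of the O'Hagan (1991) idea (often called Bayesian
quadrature): place a Gaussian process prior on the unknown function and read off the implied posterior mean
of its integral over [0, 1]. The test function and kernel settings are invented for the example, and only the
posterior mean is computed, not the full posterior distribution derived in the paper.

    # A minimal Bayesian quadrature sketch: GP prior on f, posterior mean of its
    # integral over [0, 1] from a few function evaluations.
    import numpy as np
    from scipy.integrate import quad
    from scipy.stats import norm

    def f(x):
        return np.exp(-x) * np.sin(5 * x)   # the "unknown" function, kept for checking

    variance, length = 1.0, 0.25            # squared-exponential kernel settings

    def kernel(a, b):
        return variance * np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

    x = np.linspace(0.0, 1.0, 10)           # a few evaluations of f
    y = f(x)
    K = kernel(x, x) + 1e-10 * np.eye(len(x))

    # z_i = integral over [0, 1] of k(s, x_i) ds, available in closed form
    # for the squared-exponential kernel.
    z = variance * length * np.sqrt(2 * np.pi) * (
        norm.cdf((1.0 - x) / length) - norm.cdf((0.0 - x) / length))

    print("posterior mean of the integral:", z @ np.linalg.solve(K, y))
    print("numerical check:               ", quad(f, 0.0, 1.0)[0])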
- Robust Bayesian analysis.
I have worked for some time with
modelling using heavy-tailed distributions, which produce a kind of
automatic robustness to outlying observations or parameters.
The theory of these models concentrates on asymptotic
properties as the conflict between data sources (e.g. conflict between
outliers and other observations) becomes extreme. My first paper in this area was also
very early in my career, O'Hagan (1979), with a small follow-up in O'Hagan (1981).
In O'Hagan (1988) and O'Hagan (1990), I explored how conflicts are resolved with many heavy-tailed
information sources. Generalisations to bivariate heavy-tailed distributions, in O'Hagan and Le (1994),
and to scale parameter inference, in Andrade and O'Hagan (2011), followed.
O'Hagan and Pericchi (2012) is a recent review. A toy illustration of this automatic robustness is sketched below.
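The toy Python example below (an illustration only, not the asymptotic theory in the papers above) shows the
basic phenomenon: with a Student-t likelihood the posterior mean of a location parameter largely ignores a
gross outlier, whereas with a normal likelihood it is dragged towards it.

    # Toy illustration of automatic robustness from a heavy-tailed likelihood.
    # Posterior for a location parameter on a grid, under a flat prior.
    import numpy as np
    from scipy import stats

    data = np.array([9.8, 10.2, 10.1, 9.9, 10.0, 25.0])   # last value is an outlier
    mu_grid = np.linspace(5.0, 30.0, 5001)

    def posterior_mean(logpdf):
        loglik = np.array([logpdf(data, loc=m, scale=1.0).sum() for m in mu_grid])
        weights = np.exp(loglik - loglik.max())
        weights /= weights.sum()
        return float((mu_grid * weights).sum())

    print("normal likelihood: ", posterior_mean(stats.norm.logpdf))
    print("Student-t (4 d.f.):", posterior_mean(
        lambda x, loc, scale: stats.t.logpdf(x, df=4, loc=loc, scale=scale)))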
- Bayesian methods in auditing.
In a research contract from the
National Audit Office
(NAO), David Laws and I developed Bayesian methods for auditing
complex organisations; see Laws and O'Hagan (2000, 2002).
Heiner, Kennedy and O'Hagan (2010) presents a further generalisation,
with an application to monitoring food stamp distribution in New York State.
- Fractional Bayes factors. The Fractional Bayes Factor is a
device I developed, in O'Hagan (1995), for Bayesian comparison of models when the prior
information about parameters within the models is weak; its basic form is sketched below.
The FBF is compared with an alternative
device called the intrinsic Bayes factor in O'Hagan (1997). See also Conigliani and O'Hagan (2000)
and Conigliani, Castro and O'Hagan (2000).
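In outline, for two models with likelihoods f_i(x \mid \theta_i) and (possibly improper) priors \pi_i(\theta_i),
the FBF uses a fraction b of the likelihood to play the role that would otherwise require a proper prior:

    q_i(b, x) = \frac{\int \pi_i(\theta_i)\, f_i(x \mid \theta_i)\, d\theta_i}
                     {\int \pi_i(\theta_i)\, f_i(x \mid \theta_i)^{b}\, d\theta_i} ,
    \qquad
    B^{F}_{12}(b) = \frac{q_1(b, x)}{q_2(b, x)} ,

where b is typically small, for example m/n with m the size of a minimal training sample and n the full
sample size.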
Updated: 11 April 2022
Maintained by: Tony O'Hagan