Glossary
Terms used throughout this documentation. History matching borrows vocabulary from Bayesian calibration, design of experiments, and the emulation literature; the definitions below assume general familiarity with calibration but not with history matching itself.
ARD — Automatic Relevance Determination
A Gaussian Process kernel parameterization that fits a separate lengthscale to
each input parameter. A short lengthscale means the output varies quickly with
that parameter (it is "relevant"); a long one means the output barely depends on
it. Used by the GPR emulator. (ARD reflects emulator sensitivity to the
current wave's target features; the pairplot.png diagnostic ranks parameters
differently — by overall constraint across all waves — so it does not use ARD.)
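As a rough illustration of what a per-parameter lengthscale does, the sketch below evaluates a squared-exponential ARD kernel by hand. The kernel family, the function name `ard_sq_exp`, and the lengthscale values are assumptions made for the example; they are not taken from the GPR emulator itself.

```python
import numpy as np

def ard_sq_exp(x1, x2, lengthscales, signal_var=1.0):
    """Squared-exponential kernel with one lengthscale per input (ARD)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    scaled = (x1 - x2) / np.asarray(lengthscales, dtype=float)
    return signal_var * np.exp(-0.5 * np.sum(scaled ** 2))

# Hypothetical fitted lengthscales: parameter 0 is short (relevant),
# parameter 1 is long (nearly irrelevant).
lengthscales = [0.1, 10.0]
base = [0.5, 0.5]

# Move each parameter by the same amount and compare the correlation
# with the base point.
corr_move_p0 = ard_sq_exp(base, [0.7, 0.5], lengthscales)
corr_move_p1 = ard_sq_exp(base, [0.5, 0.7], lengthscales)
print(corr_move_p0, corr_move_p1)
```

Moving the short-lengthscale parameter decorrelates the output sharply, while the same move in the long-lengthscale parameter leaves the correlation close to 1.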
Bayes linear emulator
The default emulator (emulator_type='bayes_linear'). It combines a linear
regression trend with a correlated-residual process (a squared-exponential
kernel whose correlation lengths are fit by maximum likelihood) — effectively a
Gaussian process with a linear mean. In the Bayes linear tradition it tracks the
full predictive uncertainty (an adjusted expectation and variance) rather than
just a point estimate, giving calibrated error bars at a fraction of the cost of
GPR and with no TensorFlow dependency. Implemented in pure NumPy/SciPy.
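The underlying model is
\[
f(x) = \beta_0 + \sum_j \beta_j x_j + u(x), \qquad \operatorname{Cov}[u(x), u(x')] = \sigma^2 \exp\Big(-\sum_j \frac{(x_j - x'_j)^2}{\theta_j^2}\Big),
\]
with trend coefficients \(\beta\) (fit by OLS), residual variance \(\sigma^2\), and a per-parameter correlation length \(\theta_j\) fit by maximum likelihood.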
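The combination of a linear trend and a correlated residual can be sketched in a few lines of NumPy. This is a minimal illustration, not the package's implementation: the correlation length `theta` is fixed by hand rather than fit by maximum likelihood, and the function names, the nugget, and the toy data are all assumptions.

```python
import numpy as np

def sq_exp_corr(A, B, theta):
    """Squared-exponential correlation between the rows of A and B."""
    d = (A[:, None, :] - B[None, :, :]) / theta
    return np.exp(-np.sum(d ** 2, axis=2))

def fit_predict(X, y, X_new, theta, nugget=1e-8):
    """Linear trend by OLS plus a correlated residual; returns mean and variance."""
    H = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    resid = y - H @ beta
    sigma2 = resid.var()                      # residual variance
    C = sigma2 * (sq_exp_corr(X, X, theta) + nugget * np.eye(len(X)))
    c = sigma2 * sq_exp_corr(X_new, X, theta)
    H_new = np.column_stack([np.ones(len(X_new)), X_new])
    # Adjusted expectation: trend plus correlated correction from the residuals.
    mean = H_new @ beta + c @ np.linalg.solve(C, resid)
    # Adjusted variance: prior variance minus what the training runs resolve.
    var = sigma2 - np.einsum("ij,ji->i", c, np.linalg.solve(C, c.T))
    return mean, np.maximum(var, 0.0)

# Toy one-parameter "simulator" evaluated on a small design.
X = np.linspace(0.0, 1.0, 8)[:, None]
y = np.sin(2 * np.pi * X[:, 0]) + X[:, 0]
X_mid = (X[:-1] + X[1:]) / 2                  # points between the training runs

mean_train, var_train = fit_predict(X, y, X, theta=0.15)
mean_mid, var_mid = fit_predict(X, y, X_mid, theta=0.15)
```

At the training points the emulator reproduces the runs with essentially zero variance; between them the variance grows, which is the uncertainty that feeds into the implausibility denominator.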
Emulator (surrogate)
A fast statistical approximation of the (expensive) simulator, trained on a modest number of simulation runs. The emulator predicts simulator output — with an uncertainty estimate — at new parameter values for a tiny fraction of the cost. History matching uses emulator predictions, not the simulator itself, to rule out parameter space.
Fano factor
The variance-to-mean ratio of an output across the sampled points. The automatic
feature-selection method {'method': 'fano'} uses it to pick the most
informative outputs to emulate: outputs whose value changes a lot relative to
their mean carry more signal for constraining parameters. See also
mean_sq_z and feature/output.
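A minimal sketch of ranking outputs by Fano factor follows. It assumes the population variance (`ddof=0`) and strictly positive output means; the package may differ on both points, and the function name and data are illustrative only.

```python
import numpy as np

def fano_factor(outputs):
    """Variance-to-mean ratio of each output column across sampled points."""
    outputs = np.asarray(outputs, dtype=float)
    return outputs.var(axis=0) / outputs.mean(axis=0)

# Rows are sampled points, columns are three hypothetical outputs.
outputs = np.array([
    [10.0, 2.0,  0.0],
    [10.0, 4.0,  0.0],
    [10.0, 6.0, 20.0],
    [10.0, 8.0, 20.0],
])

fano = fano_factor(outputs)
ranking = np.argsort(fano)[::-1]   # most variable output first
print(fano, ranking)
```

The constant first output scores 0 and would be selected last. Outputs whose mean is near zero make the ratio unstable, so a real implementation would need a guard for that case.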
Feature / output
A scalar summary of a simulation run that you have a corresponding observation for (e.g. peak incidence, attack rate, cases in week 1). The two words are used interchangeably: "feature selection" chooses which model outputs to emulate in a given wave. Each selected feature gets its own emulator.
GLM — Generalized Linear Model
An emulator (emulator_type='glm') that fits a linear predictor through a link
function \(g\):
\[
g\big(\operatorname{E}[y \mid x]\big) = \beta_0 + \sum_j \beta_j x_j
\]
With the default Gaussian family (\(g\) = identity) it reduces to the
linear emulator; with the Poisson family (\(g = \log\),
link='poisson') it models non-negative count outputs.
GPR — Gaussian Process Regression
A flexible, nonlinear emulator (emulator_type='gpr') built on
GPflow/TensorFlow with ARD kernels. Best suited to
nonlinear response surfaces, where it gives excellent uncertainty estimates, at a higher
computational cost than the linear emulators.
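The model is
\[
y = f(x) + \varepsilon, \qquad f \sim \mathcal{GP}\big(m,\ \sigma_f^2\, \kappa(r)\big), \qquad r^2 = \sum_j \frac{(x_j - x'_j)^2}{\ell_j^2}, \qquad \varepsilon \sim \mathcal{N}(0, \sigma_n^2),
\]
with a constant mean \(m\), signal variance \(\sigma_f^2\), observation-noise variance \(\sigma_n^2\), and a separate lengthscale \(\ell_j\) per parameter (ARD); \(\kappa\) denotes the correlation function of the chosen stationary kernel.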
Implausibility
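A score measuring how inconsistent a parameter value is with one observation, expressed in standard deviations. For output \(f\) at parameter \(x\):
\[
I(x) = \frac{\big|z - \operatorname{E}[f(x)]\big|}{\sqrt{\operatorname{Var}_{\text{em}}[f(x)] + \operatorname{Var}_{\text{obs}} + \operatorname{Var}_{\text{disc}}}}
\]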
where \(z\) is the observed target and the denominator combines the emulator's
predictive variance, the observation's variance, and an optional model
discrepancy variance accounting for structural mismatch between the simulator
and reality. (The discrepancy is supplied through model_discrepancy as a standard
deviation \(\sigma_{\text{disc}}\), default 0, and enters as its square, \(\operatorname{Var}_{\text{disc}} = \sigma_{\text{disc}}^2\).)
With several outputs, the implausibility of a point is the
maximum over all outputs, so a point must be consistent with every
constraint to survive. A point is ruled out when its implausibility exceeds
the threshold.
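The calculation described above can be sketched as follows. The function name and the numbers are hypothetical; only the structure (absolute difference over the combined standard deviation, then a maximum over outputs) comes from the definition.

```python
import numpy as np

def implausibility(pred_mean, pred_var, target, obs_var, model_discrepancy=0.0):
    """Maximum implausibility over outputs for each candidate point.

    pred_mean, pred_var: arrays of shape (n_points, n_outputs) from the emulators.
    target, obs_var: arrays of shape (n_outputs,).
    model_discrepancy: a standard deviation; it enters the denominator squared.
    """
    pred_mean = np.asarray(pred_mean, dtype=float)
    pred_var = np.asarray(pred_var, dtype=float)
    total_sd = np.sqrt(pred_var + np.asarray(obs_var, dtype=float)
                       + model_discrepancy ** 2)
    per_output = np.abs(np.asarray(target, dtype=float) - pred_mean) / total_sd
    return per_output.max(axis=-1)

# Two candidate points, two outputs (illustrative numbers).
pred_mean = [[10.0, 5.0], [20.0, 5.0]]
pred_var = [[1.0, 0.25], [1.0, 0.25]]
target = [10.5, 5.0]
obs_var = [3.0, 0.75]

scores = implausibility(pred_mean, pred_var, target, obs_var)
keep = scores <= 3.0   # a point is ruled out when it exceeds the threshold
print(scores, keep)
```

The second point matches the second output exactly but is still ruled out, because the maximum is driven by its mismatch on the first output.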
Implausibility threshold
The implausibility cutoff above which a parameter value is discarded
(implausibility_threshold, default 3.0). The default of 3 follows the
Vysochanskij–Petunin inequality (popularized as Pukelsheim's "three-sigma
rule"): for any unimodal distribution with finite variance — not just a normal
one — at least ~95% of the probability mass lies within 3 standard deviations of
the mean. So an implausibility above 3 is strong, distribution-free evidence
that the point is inconsistent with the observation.
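The 95% figure can be checked directly from the inequality, which bounds the tail probability of a unimodal distribution by \(4 / (9k^2)\) for \(k > \sqrt{8/3}\). The helper name below is made up for the example.

```python
def vp_tail_bound(k):
    """Vysochanskij-Petunin upper bound on P(|X - mean| >= k * sd)."""
    if k <= (8.0 / 3.0) ** 0.5:
        raise ValueError("the bound in this form requires k > sqrt(8/3)")
    return 4.0 / (9.0 * k ** 2)

tail = vp_tail_bound(3.0)   # 4/81
inside = 1.0 - tail         # mass guaranteed within 3 standard deviations
print(tail, inside)
```

At \(k = 3\) the tail bound is 4/81, just under 5%, so at least about 95% of the mass lies within three standard deviations.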
LHS — Latin Hypercube Sampling
A space-filling experimental design that spreads sample points evenly across
every parameter's range. The default sampling strategy ('lhs'); better coverage
than uniform random sampling for the same number of points.
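To show what "spreads sample points evenly" means in practice, here is a hand-rolled Latin hypercube in NumPy. It is a sketch of the basic construction only, with a hypothetical function name and bounds; SciPy's `scipy.stats.qmc.LatinHypercube` provides a maintained implementation.

```python
import numpy as np

def latin_hypercube_unit(n_samples, n_params, rng):
    """Latin hypercube sample on the unit cube, shape (n_samples, n_params)."""
    # Split [0, 1) into n_samples equal strata and draw one point inside each.
    u = (np.arange(n_samples)[:, None]
         + rng.random((n_samples, n_params))) / n_samples
    # Shuffle the strata independently for every parameter.
    for j in range(n_params):
        u[:, j] = rng.permutation(u[:, j])
    return u

rng = np.random.default_rng(0)
unit = latin_hypercube_unit(10, 2, rng)

# Rescale to hypothetical (low, high) bounds for each parameter.
bounds = np.array([[0.1, 0.9], [1.0, 5.0]])
samples = bounds[:, 0] + unit * (bounds[:, 1] - bounds[:, 0])
```

Each parameter's range is cut into ten equal slices and every slice holds exactly one sample, which is the property that plain uniform random sampling lacks.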
Linear emulator
The simplest emulator (emulator_type='linear'): an ordinary least-squares fit
with predictive uncertainty taken from the OLS standard errors. Fast and interpretable, but assumes the response is linear in the parameters — see GLM for non-Gaussian outputs, or Bayes linear and GPR for nonlinear response surfaces.
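A minimal OLS emulator with predictive variance might look like the sketch below. It assumes the predictive variance is the coefficient uncertainty plus the residual variance (a prediction-interval convention); the package may use a different convention, and the function names and toy data are illustrative.

```python
import numpy as np

def fit_ols(X, y):
    """Fit intercept + linear terms; return coefficients, their covariance, residual variance."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = len(y) - A.shape[1]
    s2 = float(resid @ resid) / dof
    cov_beta = s2 * np.linalg.inv(A.T @ A)
    return beta, cov_beta, s2

def predict_ols(beta, cov_beta, s2, X_new):
    """Predictive mean and variance at new parameter values."""
    A = np.column_stack([np.ones(len(X_new)), X_new])
    mean = A @ beta
    # Coefficient uncertainty at each new point plus residual noise.
    var = np.einsum("ij,jk,ik->i", A, cov_beta, A) + s2
    return mean, var

# Toy data: a linear response in two parameters with small noise.
rng = np.random.default_rng(1)
X = rng.uniform(size=(50, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + 0.1 * rng.normal(size=50)

beta, cov_beta, s2 = fit_ols(X, y)
mean, var = predict_ols(beta, cov_beta, s2, np.array([[0.5, 0.5]]))
```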
Maximin
A space-filling criterion that maximizes the minimum distance between sample
points, pushing them apart for even coverage. Used as an LHS criterion
({'type': 'lhs', 'criterion': 'maximin'}) and as the final "thinning" stage of
the ray_resample pipeline, where it selects a well-spread
subset from a larger candidate pool.
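One simple way to approximate maximin thinning is a greedy farthest-point selection, sketched below. This is a stand-in heuristic chosen for brevity, not necessarily the algorithm used in ray_resample, and the function name is hypothetical.

```python
import numpy as np

def greedy_maximin_subset(candidates, n_keep, start=0):
    """Pick n_keep well-spread rows by repeatedly taking the farthest remaining point."""
    candidates = np.asarray(candidates, dtype=float)
    chosen = [start]
    # Distance from every candidate to its nearest already-chosen point.
    min_dist = np.linalg.norm(candidates - candidates[start], axis=1)
    while len(chosen) < n_keep:
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        dist_to_new = np.linalg.norm(candidates - candidates[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return np.array(chosen)

# Five candidate points on a line; three are clustered near the origin.
pool = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.0], [1.0, 0.0], [0.5, 0.0]]
chosen = greedy_maximin_subset(pool, n_keep=3)
print(chosen)
```

Starting from the first point, the selection jumps to the far end and then to the middle, skipping the two near-duplicates of the starting point.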
mean_sq_z
The default automatic feature-selection method ({'method': 'mean_sq_z'}): mean
squared z-score, i.e. how far each output sits from its target in
standard-deviation units, averaged across samples. Outputs that are far from
their target carry the most information for ruling out parameter space. See also
Fano factor.
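The score can be sketched as below, assuming the standard deviation used for the z-score is the observation's standard deviation; the package may also fold in other variance terms. The function name and numbers are illustrative.

```python
import numpy as np

def mean_sq_z(outputs, targets, sds):
    """Mean squared z-score of each output column relative to its target."""
    z = (np.asarray(outputs, dtype=float) - np.asarray(targets, dtype=float)) \
        / np.asarray(sds, dtype=float)
    return np.mean(z ** 2, axis=0)

# Rows are sampled points, columns are two hypothetical outputs.
outputs = [[12.0, 5.0], [8.0, 5.0], [10.0, 7.0]]
targets = [10.0, 5.0]
sds = [1.0, 2.0]

scores = mean_sq_z(outputs, targets, sds)
ranking = np.argsort(scores)[::-1]   # output furthest from its target first
print(scores, ranking)
```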
NROY — Not Ruled Out Yet
The region of parameter space that history matching has not yet shown to be
implausible — the central output of the method. Instead of a
single best-fit point, history matching returns this set of all parameter
values that could plausibly have produced the observed data. The NROY region
shrinks wave by wave. NROY samples are points drawn from it (via
get_nroy_samples()), and the NROY fraction is the share of fresh prior
samples that fall inside it — a convergence diagnostic that falls toward zero as
the calibration tightens.
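Given implausibility scores for a batch of fresh prior samples, the NROY fraction is a one-line calculation. The sketch below uses made-up scores and a hypothetical helper name; it does not call the package's own API.

```python
import numpy as np

def nroy_fraction(implausibility, threshold=3.0):
    """Share of samples whose implausibility does not exceed the threshold."""
    implausibility = np.asarray(implausibility, dtype=float)
    return float(np.mean(implausibility <= threshold))

# Illustrative implausibility scores for five fresh prior samples.
scores = [0.5, 2.9, 3.1, 7.0, 1.2]
fraction = nroy_fraction(scores)
print(fraction)
```

Tracking this number across waves gives a quick view of how fast the plausible region is shrinking.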
Prior
The initial parameter space — the bounds you start from, before any wave has ruled anything out. History matching progressively carves the NROY region out of the prior.
Ray sampling / ray_resample
A method for drawing NROY samples efficiently when the NROY region is a
tiny fraction of the prior and plain rejection sampling becomes slow. The
ray_resample pipeline (inspired by the
hmer R package) has four stages:
LHS → ray sampling (drawing points along lines connecting known NROY points) →
importance sampling → maximin thinning. It is faster than rejection sampling from a plain LHS design at low
acceptance rates, but can over-represent region boundaries; see the
NROY sampling methods tutorial.
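The ray-sampling stage alone can be illustrated with a toy example. Everything here is an assumption made for the sketch: the function names, the `overshoot` parameter, and the disk-shaped stand-in for the emulator-based NROY check. It shows only the idea of proposing points along lines through known NROY points and is not the package's pipeline.

```python
import numpy as np

def ray_candidates(nroy_points, n_rays, n_per_ray, rng, overshoot=0.5):
    """Propose points along lines through random pairs of known NROY points."""
    nroy_points = np.asarray(nroy_points, dtype=float)
    proposals = []
    for _ in range(n_rays):
        i, j = rng.choice(len(nroy_points), size=2, replace=False)
        a, b = nroy_points[i], nroy_points[j]
        # t in [0, 1] lies between the two points; overshoot extends past both ends.
        t = rng.uniform(-overshoot, 1.0 + overshoot, size=n_per_ray)
        proposals.append(a + t[:, None] * (b - a))
    return np.vstack(proposals)

def toy_is_nroy(points):
    """Stand-in for an implausibility check: a small disk inside the unit square."""
    return np.linalg.norm(points - np.array([0.5, 0.5]), axis=1) <= 0.05

rng = np.random.default_rng(0)
seeds = np.array([[0.50, 0.50], [0.53, 0.50], [0.50, 0.46], [0.48, 0.52]])

candidates = ray_candidates(seeds, n_rays=20, n_per_ray=10, rng=rng)
accepted = candidates[toy_is_nroy(candidates)]
print(len(accepted), "of", len(candidates), "proposals accepted")
```

The toy disk covers under 1% of the unit square, so uniform proposals over the prior would almost all be rejected, whereas proposals along rays start from where the region is already known to be.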
Trajectory selection
A post-calibration step for stochastic models: having found which parameters
are plausible, select specific (parameter set, random seed) pairs whose
simulated trajectories match the observed data, using importance resampling
weighted by a pseudo-likelihood. See the
trajectory selection tutorial.
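Importance resampling of trajectories can be sketched as follows. The Gaussian pseudo-likelihood, the function name, and the toy trajectories are assumptions for illustration; the package's actual weighting may differ.

```python
import numpy as np

def importance_resample(trajectories, observed, obs_sd, n_draws, rng):
    """Resample trajectory indices with weights from a Gaussian pseudo-likelihood."""
    trajectories = np.asarray(trajectories, dtype=float)
    resid = (trajectories - np.asarray(observed, dtype=float)) / obs_sd
    log_w = -0.5 * np.sum(resid ** 2, axis=1)
    log_w -= log_w.max()            # stabilise before exponentiating
    weights = np.exp(log_w)
    weights /= weights.sum()
    idx = rng.choice(len(weights), size=n_draws, replace=True, p=weights)
    return idx, weights

# Three simulated trajectories, each tied to a hypothetical (parameter set, seed) pair.
pairs = [("params_a", 11), ("params_a", 12), ("params_b", 11)]
trajectories = [[1.0, 2.0, 3.0], [1.0, 2.0, 4.0], [5.0, 5.0, 5.0]]
observed = [1.0, 2.0, 3.0]

rng = np.random.default_rng(0)
idx, weights = importance_resample(trajectories, observed, obs_sd=1.0,
                                   n_draws=100, rng=rng)
selected = [pairs[i] for i in idx]
```

The trajectory that matches the data exactly gets the largest weight and the distant one is effectively never drawn.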
Wave
One iteration of the history matching loop — sample, simulate, select features,
train emulators, filter. "Wave" and "iteration" are used interchangeably; the
on-disk output is organized into wave1/, wave2/, etc. Each wave starts from
the NROY region left by the previous one, so the searched space
contracts with every wave.