Basic SIR Model History Matching¶
This notebook demonstrates how to use the history matching library with a stochastic SIR epidemiological model using the fully automated workflow.
Overview¶
History matching is a Bayesian method for model calibration that:
- Uses statistical emulators to approximate expensive simulations
- Iteratively reduces the parameter space to "plausible" regions
- Avoids regions where the model cannot match observed data
We will calibrate a SIR model to synthetic outbreak data to recover the "true" transmission
parameters. For a manual step-by-step workflow, see 02_manual_workflow.ipynb.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import historymatching as hm
from model import SIR, generate_observed_data
%matplotlib inline
np.random.seed(42)
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR I0000 00:00:1780533687.274131 1035040 port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`. I0000 00:00:1780533687.335197 1035040 cpu_feature_guard.cc:227] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR I0000 00:00:1780533688.530416 1035040 port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
/software/conda/envs/hm/lib/python3.13/site-packages/gpflow/versions.py:1: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. import pkg_resources
Generate Synthetic "Observed" Data¶
We create synthetic outbreak data with known parameters that we will try to recover through history matching.
# "True" parameters that generated our synthetic data
beta_true = 1.3
gamma_true = 0.5
population_size = 10_000
n_seed_infections = 100
print(f"True parameters we want to recover:")
print(f" beta (transmission rate): {beta_true}")
print(f" gamma (recovery rate): {gamma_true}")
print(f" R0 (basic reproduction number): {beta_true/gamma_true:.2f}")
incidence_obs, true_model = generate_observed_data(
beta_true=beta_true,
gamma_true=gamma_true,
population_size=population_size,
n_seed_infections=n_seed_infections,
)
true_model.plot("True Outbreak (Synthetic Observed Data)")
plt.figure(figsize=(8, 4))
incidence_obs.plot(style="o-", markersize=4)
plt.xlabel("Days")
plt.ylabel("Daily Incidence")
plt.title("Observed Daily Incidence Data")
plt.grid(True, alpha=0.3)
plt.show()
print(f"Peak incidence: {incidence_obs.max():.0f} cases/day")
print(f"Total cases: {incidence_obs.sum():.0f}")
print(f"Attack rate: {incidence_obs.sum()/population_size:.1%}")
True parameters we want to recover: beta (transmission rate): 1.3 gamma (recovery rate): 0.5 R0 (basic reproduction number): 2.60
Peak incidence: 1576 cases/day Total cases: 9091 Attack rate: 90.9%
Define Simulation Function¶
History matching requires a function that takes a DataFrame of parameter samples and returns a DataFrame of simulation outputs.
def sir_simulation_function(samples: pd.DataFrame) -> pd.DataFrame:
"""Run SIR model for each row of parameter samples.
Injects rand_seed for deterministic outputs — the engine filters it
out before emulation but stores it on result.samples.
"""
df = samples.copy()
if 'rand_seed' not in df.columns:
df['rand_seed'] = np.random.default_rng(42).integers(0, 2**31, size=len(df))
results = []
for _, row in df.iterrows():
model = SIR(
beta=row["beta"],
gamma=row["gamma"],
s0=population_size - n_seed_infections,
i0=n_seed_infections,
seed=int(row["rand_seed"]),
)
incidence = model.get_incidence()
result = {f"incidence_{i}": incidence[i] for i in range(len(incidence))}
result["peak_incidence"] = incidence.max()
result["total_cases"] = incidence.sum()
result["attack_rate"] = incidence.sum() / population_size
results.append(result)
return pd.DataFrame(results)
# Quick sanity check
test = sir_simulation_function(pd.DataFrame({"beta": [1.0, 1.5], "gamma": [0.3, 0.5]}))
print(f"Simulation output shape: {test.shape}")
print(f"Output columns (first 5): {list(test.columns[:5])}...")
Simulation output shape: (2, 23) Output columns (first 5): ['incidence_0', 'incidence_1', 'incidence_2', 'incidence_3', 'incidence_4']...
Configure History Matching¶
Configure the run in a single hm.HistoryMatching(...) call. We define:
- bounds: the search space for calibration
- observations: the target data as (mean, std) pairs
- function: the simulator to run at each sampled point
parameter_bounds = {
"beta": (0.5, 3.0),
"gamma": (0.1, 1.0),
}
observations = {
"peak_incidence": (incidence_obs.max(), 50),
"total_cases": (incidence_obs.sum(), 200),
"incidence_5": (incidence_obs.iloc[5], 30),
"incidence_10": (incidence_obs.iloc[10], 40),
"incidence_15": (incidence_obs.iloc[15], 20),
}
engine = hm.HistoryMatching(
bounds=parameter_bounds,
observations=observations,
function=sir_simulation_function,
sampling_strategy="lhs",
emulator_type="gpr",
n_samples=500,
max_iterations=4,
implausibility_threshold=3.0,
random_seed=123,
)
print(f"Engine ready. Parameters: {engine.parameter_space.get_parameter_names()}")
print(f"Observations: {list(observations.keys())}")
print(f"Samples per iteration: {engine.n_samples}")
print(f"Max iterations: {engine.max_iterations}")
Engine ready. Parameters: ['beta', 'gamma'] Observations: ['peak_incidence', 'total_cases', 'incidence_5', 'incidence_10', 'incidence_15'] Samples per iteration: 500 Max iterations: 4
Run Automated History Matching¶
print("Running automated history matching...")
results = engine.run()
print(f"\nHistory matching completed!")
print(f" Iterations run: {len(results)}")
print(f" Final acceptance rate: {engine.acceptance_rate:.3f}")
print(f" Total samples generated: {engine.samples_generated}")
print(f" Total samples accepted: {engine.samples_accepted}")
print(f" Emulators trained: {engine.emulators_trained}")
print("\nIteration Summary:")
for i, result in enumerate(results, 1):
s = result.samples
print(
f" Iteration {i}: {len(s)} samples, "
f"NROY fraction {result.nroy_fraction:.1%}, "
f"outputs {result.emulated_outputs} "
f"beta=[{s['beta'].min():.2f}, {s['beta'].max():.2f}] "
f"gamma=[{s['gamma'].min():.2f}, {s['gamma'].max():.2f}]"
)
Running automated history matching...
I0000 00:00:1780533693.973728 1035040 gpu_device.cc:2043] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 2175 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3050 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6
I0000 00:00:1780533698.675350 1035204 cuda_solvers.cc:175] Creating GpuSolver handles for stream 0x617f0acd6360
INFO:historymatching.engine:Checkpoint saved to hm_output/run_20260603_204130/checkpoint.pkl
INFO:historymatching.engine:Wave 1 output saved to hm_output/run_20260603_204130/wave1
INFO:historymatching.engine:[Wave 1] COMMITTED — diagnostics and checkpoint saved to hm_output/run_20260603_204130
INFO:historymatching.engine:============================================================
INFO:historymatching.engine:============================================================
INFO:historymatching.engine:WAVE 2 STARTING
INFO:historymatching.engine:============================================================
INFO:historymatching.engine:[Wave 2] Phase 1/5 SAMPLING: using 500 pre-computed samples from wave 1 [0.0s]
INFO:historymatching.engine:[Wave 2] Phase 2/5 SIMULATION: running 500 simulations...
INFO:historymatching.engine:[Wave 2] Phase 2/5 SIMULATION: complete — 23 outputs [0.4s]
INFO:historymatching.feature_selection: Feature ranking (mean_sq_z): incidence_5=155.07, incidence_10=42.55, peak_incidence=17.51, total_cases=13.77, incidence_15=4.66
INFO:historymatching.feature_selection: Cooldown (skip): ['peak_incidence']
INFO:historymatching.feature_selection: SELECTED: incidence_5 (score=155.07, mean_z=-5.80, std_z=11.03)
INFO:historymatching.engine:[Wave 2] Phase 3/5 FEATURE SELECTION: ['incidence_5'] [0.0s]
INFO:historymatching.engine:[Wave 2] Phase 4/5 EMULATOR TRAINING: training 1 emulators...
INFO:historymatching.emulators.factory: Emulator 1/1 [incidence_5]: creating (500 samples, 2 params, type=gpr)...
DEBUG:historymatching.emulators.factory:Created gpr emulator for feature: incidence_5
INFO:historymatching.emulators.factory: Emulator 1/1 [incidence_5]: training...
INFO:historymatching.emulators.gpr: Training GPR: 375 points, 2 dims
INFO:historymatching.emulators.gpr: Building GPflow model and optimizing hyperparameters...
INFO:historymatching.emulators.gpr: GPR training complete — noise=0.0730, lengthscale range=[0.701, 1.623] [1.0s]
INFO:historymatching.emulators.factory: Emulator 1/1 [incidence_5]: trained [1.0s]
INFO:historymatching.engine:[Wave 2] Phase 4/5 EMULATOR TRAINING: complete [1.0s]
INFO:historymatching.engine:[Wave 2] Phase 5/5 NROY SAMPLING: finding 500 plausible candidates for next wave...
INFO:historymatching.nroy_sampling: LHS rejection: 0/500 (0%) — starting
INFO:historymatching.nroy_sampling: LHS rejection: 44/500 (8%) | 550 tested | rate=8.0000% [0s]
INFO:historymatching.nroy_sampling: LHS rejection: 559/500 (111%) | 6,820 tested | rate=8.1965% [0s]
INFO:historymatching.nroy_sampling: LHS rejection: 500/500 [0.5s]
INFO:historymatching.nroy_sampling: NROY SUMMARY: 500/500 [OK] — sources: LHS=500 [0.5s]
DEBUG:historymatching.engine:NROY fraction: 0.073314
INFO:historymatching.engine:[Wave 2] Phase 5/5 NROY SAMPLING: found 500 candidates [0.5s]
INFO:historymatching.engine:[Wave 2] ALL PHASES COMPLETE [1.9s total]. Committing...
INFO:historymatching.engine:Checkpoint saved to hm_output/run_20260603_204130/checkpoint.pkl
INFO:historymatching.engine:Wave 2 output saved to hm_output/run_20260603_204130/wave2
INFO:historymatching.engine:[Wave 2] COMMITTED — diagnostics and checkpoint saved to hm_output/run_20260603_204130
INFO:historymatching.engine:============================================================
INFO:historymatching.engine:============================================================
INFO:historymatching.engine:WAVE 3 STARTING
INFO:historymatching.engine:============================================================
INFO:historymatching.engine:[Wave 3] Phase 1/5 SAMPLING: using 500 pre-computed samples from wave 2 [0.0s]
INFO:historymatching.engine:[Wave 3] Phase 2/5 SIMULATION: running 500 simulations...
INFO:historymatching.engine:[Wave 3] Phase 2/5 SIMULATION: complete — 23 outputs [0.4s]
INFO:historymatching.feature_selection: Feature ranking (mean_sq_z): incidence_5=31.34, total_cases=16.09, peak_incidence=13.45, incidence_10=9.09, incidence_15=1.25
INFO:historymatching.feature_selection: Cooldown (skip): ['incidence_5']
INFO:historymatching.feature_selection: SELECTED: total_cases (score=16.09, mean_z=-2.24, std_z=3.33)
INFO:historymatching.engine:[Wave 3] Phase 3/5 FEATURE SELECTION: ['total_cases'] [0.0s]
INFO:historymatching.engine:[Wave 3] Phase 4/5 EMULATOR TRAINING: training 1 emulators...
INFO:historymatching.emulators.factory: Emulator 1/1 [total_cases]: creating (500 samples, 2 params, type=gpr)...
DEBUG:historymatching.emulators.factory:Created gpr emulator for feature: total_cases
INFO:historymatching.emulators.factory: Emulator 1/1 [total_cases]: training...
INFO:historymatching.emulators.gpr: Training GPR: 375 points, 2 dims
INFO:historymatching.emulators.gpr: Building GPflow model and optimizing hyperparameters...
INFO:historymatching.emulators.gpr: GPR training complete — noise=0.0114, lengthscale range=[0.856, 1.333] [1.1s]
INFO:historymatching.emulators.factory: Emulator 1/1 [total_cases]: trained [1.1s]
INFO:historymatching.engine:[Wave 3] Phase 4/5 EMULATOR TRAINING: complete [1.1s]
INFO:historymatching.engine:[Wave 3] Phase 5/5 NROY SAMPLING: finding 500 plausible candidates for next wave...
INFO:historymatching.nroy_sampling: LHS rejection: 0/500 (0%) — starting
INFO:historymatching.nroy_sampling: LHS rejection: 22/500 (4%) | 550 tested | rate=4.0000% [0s]
INFO:historymatching.nroy_sampling: LHS rejection: 503/500 (100%) | 13,695 tested | rate=3.6729% [1s]
INFO:historymatching.nroy_sampling: LHS rejection: 500/500 [0.9s]
INFO:historymatching.nroy_sampling: NROY SUMMARY: 500/500 [OK] — sources: LHS=500 [0.9s]
DEBUG:historymatching.engine:NROY fraction: 0.036510
INFO:historymatching.engine:[Wave 3] Phase 5/5 NROY SAMPLING: found 500 candidates [0.9s]
INFO:historymatching.engine:[Wave 3] ALL PHASES COMPLETE [2.4s total]. Committing...
INFO:historymatching.engine:Checkpoint saved to hm_output/run_20260603_204130/checkpoint.pkl
INFO:historymatching.engine:Wave 3 output saved to hm_output/run_20260603_204130/wave3
INFO:historymatching.engine:[Wave 3] COMMITTED — diagnostics and checkpoint saved to hm_output/run_20260603_204130
INFO:historymatching.engine:============================================================
INFO:historymatching.engine:============================================================
INFO:historymatching.engine:WAVE 4 STARTING
INFO:historymatching.engine:============================================================
INFO:historymatching.engine:[Wave 4] Phase 1/5 SAMPLING: using 500 pre-computed samples from wave 3 [0.0s]
INFO:historymatching.engine:[Wave 4] Phase 2/5 SIMULATION: running 500 simulations...
INFO:historymatching.engine:[Wave 4] Phase 2/5 SIMULATION: complete — 23 outputs [0.4s]
INFO:historymatching.feature_selection: Feature ranking (mean_sq_z): incidence_5=41.43, peak_incidence=11.91, incidence_10=7.38, total_cases=3.78, incidence_15=0.77
INFO:historymatching.feature_selection: Cooldown (skip): ['total_cases']
INFO:historymatching.feature_selection: SELECTED: incidence_5 (score=41.43, mean_z=1.10, std_z=6.35)
INFO:historymatching.engine:[Wave 4] Phase 3/5 FEATURE SELECTION: ['incidence_5'] [0.0s]
INFO:historymatching.engine:[Wave 4] Phase 4/5 EMULATOR TRAINING: training 1 emulators...
INFO:historymatching.emulators.factory: Emulator 1/1 [incidence_5]: creating (500 samples, 2 params, type=gpr)...
DEBUG:historymatching.emulators.factory:Created gpr emulator for feature: incidence_5
INFO:historymatching.emulators.factory: Emulator 1/1 [incidence_5]: training...
INFO:historymatching.emulators.gpr: Training GPR: 375 points, 2 dims
INFO:historymatching.emulators.gpr: Building GPflow model and optimizing hyperparameters...
INFO:historymatching.emulators.gpr: GPR training complete — noise=0.2149, lengthscale range=[2.258, 4.068] [1.2s]
INFO:historymatching.emulators.factory: Emulator 1/1 [incidence_5]: trained [1.2s]
INFO:historymatching.engine:[Wave 4] Phase 4/5 EMULATOR TRAINING: complete [1.2s]
INFO:historymatching.engine:[Wave 4] Phase 5/5 NROY SAMPLING: finding 500 plausible candidates for next wave...
INFO:historymatching.nroy_sampling: LHS rejection: 0/500 (0%) — starting
INFO:historymatching.nroy_sampling: LHS rejection: 21/500 (4%) | 550 tested | rate=3.8182% [0s]
INFO:historymatching.nroy_sampling: LHS rejection: 484/500 (96%) | 14,349 tested | rate=3.3731% [1s]
INFO:historymatching.nroy_sampling: LHS rejection: 506/500 (101%) | 14,870 tested | rate=3.4028% [1s]
INFO:historymatching.nroy_sampling: LHS rejection: 500/500 [1.3s]
INFO:historymatching.nroy_sampling: NROY SUMMARY: 500/500 [OK] — sources: LHS=500 [1.3s]
DEBUG:historymatching.engine:NROY fraction: 0.033625
INFO:historymatching.engine:[Wave 4] Phase 5/5 NROY SAMPLING: found 500 candidates [1.3s]
INFO:historymatching.engine:[Wave 4] ALL PHASES COMPLETE [2.9s total]. Committing...
INFO:historymatching.engine:Checkpoint saved to hm_output/run_20260603_204130/checkpoint.pkl
INFO:historymatching.engine:Wave 4 output saved to hm_output/run_20260603_204130/wave4
INFO:historymatching.engine:[Wave 4] COMMITTED — diagnostics and checkpoint saved to hm_output/run_20260603_204130
INFO:historymatching.engine:============================================================
INFO:historymatching.engine:Automated run completed. 4 iterations executed.
History matching completed! Iterations run: 4 Final acceptance rate: 0.087 Total samples generated: 40342 Total samples accepted: 4000 Emulators trained: 4 Iteration Summary: Iteration 1: 500 samples, NROY fraction 11.2%, outputs ['peak_incidence'] beta=[0.50, 3.00] gamma=[0.10, 1.00] Iteration 2: 500 samples, NROY fraction 7.3%, outputs ['incidence_5'] beta=[0.64, 2.01] gamma=[0.10, 1.00] Iteration 3: 500 samples, NROY fraction 3.7%, outputs ['total_cases'] beta=[0.89, 2.02] gamma=[0.10, 1.00] Iteration 4: 500 samples, NROY fraction 3.4%, outputs ['incidence_5'] beta=[0.98, 1.85] gamma=[0.23, 0.83]
Analyze Results¶
final_samples = engine.get_nroy_samples()
print(f"Final plausible samples: {len(final_samples)}")
print(f" beta range: [{final_samples['beta'].min():.3f}, {final_samples['beta'].max():.3f}] (true: {beta_true})")
print(f" gamma range: [{final_samples['gamma'].min():.3f}, {final_samples['gamma'].max():.3f}] (true: {gamma_true})")
print(f" beta median: {final_samples['beta'].median():.3f}")
print(f" gamma median: {final_samples['gamma'].median():.3f}")
beta_ok = final_samples["beta"].min() <= beta_true <= final_samples["beta"].max()
gamma_ok = final_samples["gamma"].min() <= gamma_true <= final_samples["gamma"].max()
print(f"\nTrue beta in plausible region: {'Yes' if beta_ok else 'No'}")
print(f"True gamma in plausible region: {'Yes' if gamma_ok else 'No'}")
Final plausible samples: 500 beta range: [0.962, 1.839] (true: 1.3) gamma range: [0.215, 0.822] (true: 0.5) beta median: 1.328 gamma median: 0.517 True beta in plausible region: Yes True gamma in plausible region: Yes
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
# Scatter: final plausible parameter space
ax = axes[0, 0]
ax.scatter(final_samples["beta"], final_samples["gamma"], alpha=0.6, s=20, color="green")
ax.axvline(beta_true, color="red", linestyle="--", linewidth=2, label=f"True beta = {beta_true}")
ax.axhline(gamma_true, color="red", linestyle="--", linewidth=2, label=f"True gamma = {gamma_true}")
ax.set_xlabel("beta (transmission rate)")
ax.set_ylabel("gamma (recovery rate)")
ax.set_title("Final Plausible Parameter Space")
ax.legend()
ax.grid(True, alpha=0.3)
# Beta distribution
ax = axes[0, 1]
ax.hist(final_samples["beta"], bins=20, alpha=0.7, color="blue", density=True)
ax.axvline(beta_true, color="red", linestyle="--", linewidth=2, label=f"True beta = {beta_true}")
ax.axvline(final_samples["beta"].median(), color="green", linestyle="-", linewidth=2,
label=f"Estimated = {final_samples['beta'].median():.2f}")
ax.set_xlabel("beta")
ax.set_ylabel("Density")
ax.set_title("beta Distribution")
ax.legend()
ax.grid(True, alpha=0.3)
# Gamma distribution
ax = axes[1, 0]
ax.hist(final_samples["gamma"], bins=20, alpha=0.7, color="orange", density=True)
ax.axvline(gamma_true, color="red", linestyle="--", linewidth=2, label=f"True gamma = {gamma_true}")
ax.axvline(final_samples["gamma"].median(), color="green", linestyle="-", linewidth=2,
label=f"Estimated = {final_samples['gamma'].median():.2f}")
ax.set_xlabel("gamma")
ax.set_ylabel("Density")
ax.set_title("gamma Distribution")
ax.legend()
ax.grid(True, alpha=0.3)
# R0 distribution
ax = axes[1, 1]
R0_samples = final_samples["beta"] / final_samples["gamma"]
R0_true = beta_true / gamma_true
ax.hist(R0_samples, bins=20, alpha=0.7, color="purple", density=True)
ax.axvline(R0_true, color="red", linestyle="--", linewidth=2, label=f"True R0 = {R0_true:.2f}")
ax.axvline(R0_samples.median(), color="green", linestyle="-", linewidth=2,
label=f"Estimated R0 = {R0_samples.median():.2f}")
ax.set_xlabel("R0 (basic reproduction number)")
ax.set_ylabel("Density")
ax.set_title("R0 Distribution")
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
Validate Results with Forward Simulation¶
A quick sanity check: run the model at a handful of NROY parameter sets and overlay the trajectories on the observed data.
For a more rigorous approach — weighting trajectories by pseudo-likelihood and
resampling to get a calibrated ensemble — see 05_trajectory_selection.ipynb.
n_runs = 20
idx = np.random.choice(len(final_samples), size=n_runs, replace=False)
validation_samples = final_samples.iloc[idx]
validation_incidences = []
for _, row in validation_samples.iterrows():
model = SIR(
beta=row["beta"],
gamma=row["gamma"],
s0=population_size - n_seed_infections,
i0=n_seed_infections,
)
validation_incidences.append(model.get_incidence())
days = range(len(incidence_obs))
arr = np.array(validation_incidences)
mean_traj = arr.mean(axis=0)
std_traj = arr.std(axis=0)
plt.figure(figsize=(10, 5))
for k, inc in enumerate(validation_incidences):
plt.plot(days, inc, color="gray", alpha=0.3, linewidth=1,
label="Plausible simulations" if k == 0 else None)
plt.plot(days, incidence_obs.values, "ro-", linewidth=2, markersize=5, label="Observed data")
plt.plot(days, mean_traj, "b-", linewidth=2, label="Plausible mean")
plt.fill_between(days, mean_traj - 2 * std_traj, mean_traj + 2 * std_traj,
alpha=0.2, color="blue", label="95% prediction interval")
plt.xlabel("Days")
plt.ylabel("Daily Incidence")
plt.title("Model Validation: Plausible Trajectories vs Observed Data")
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
rmse_values = [np.sqrt(np.mean((inc - incidence_obs.values) ** 2)) for inc in validation_incidences]
print(f"Mean RMSE: {np.mean(rmse_values):.1f} +/- {np.std(rmse_values):.1f}")
print(f"Best RMSE: {np.min(rmse_values):.1f}")
Mean RMSE: 150.2 +/- 72.8 Best RMSE: 36.2
Multiple Emulators Per Wave¶
By default feature selection (feature_selection) uses automatic Fano-factor
selection capped at one feature per wave. You can raise that cap — or specify
features explicitly — to train several emulators in a single wave. Each emulator
targets one observable, and implausibility is computed per-emulator then maximised
across all of them, so a sample must pass every constraint simultaneously.
Automated mode: raise the cap¶
engine = hm.HistoryMatching(
bounds=parameter_bounds,
observations=observations,
function=sir_simulation_function,
sampling_strategy="lhs",
emulator_type="gpr",
feature_selection={"method": "fano", "max_features": 3}, # ← up to 3 per wave
n_samples=500,
max_iterations=4,
implausibility_threshold=3.0,
random_seed=123,
)
results = engine.run()
Inspecting per-feature quality after engine.run()¶
IterationResult.get_emulator_quality_metrics() returns one entry per emulated
feature. Use it to check whether each emulator was worth including:
print("Per-feature emulator quality across all waves:")
print(f"{'Wave':<6} {'Feature':<20} {'R²':>6} {'MSE':>10} {'n_train':>8}")
print("-" * 56)
for result in results:
metrics = result.get_emulator_quality_metrics()
for feature, m in metrics.items():
r2 = m.get("r2", float("nan"))
mse = m.get("mse", float("nan"))
n = m.get("n_train", "?")
print(f"{result.iteration:<6} {feature:<20} {r2:>6.3f} {mse:>10.4g} {n:>8}")
Per-feature emulator quality across all waves: Wave Feature R² MSE n_train -------------------------------------------------------- 1 peak_incidence 0.995 1.103e+04 375 2 incidence_5 0.936 7989 375 3 total_cases 0.988 5552 375 4 incidence_5 0.798 7263 375
In the automated workflow there is no opportunity to drop a poor emulator mid-wave —
use the manual workflow (02_manual_workflow.ipynb) if you need to inspect
diagnostics and selectively drop emulators before committing each wave.
Saving diagnostics to disk¶
Each IterationResult has a save(fig_dir, all_results) method that writes
per-feature predicted-vs-actual plots, ARD lengthscale charts (GPR only), a convergence
figure, and a metrics.json file:
import tempfile, os
diag_dir = tempfile.mkdtemp(prefix="hm_diagnostics_")
for result in results:
result.save(diag_dir, all_results=results)
print(f"Diagnostics saved to {diag_dir}/")
for f in sorted(os.listdir(diag_dir)):
print(f" {f}")
Diagnostics saved to /tmp/hm_diagnostics_8slb72go/ wave1 wave2 wave3 wave4
Summary¶
This notebook demonstrated the automated history matching workflow:
- Model setup: imported
SIRandgenerate_observed_datafrommodel.py - Configuration: used
hm.HistoryMatchingto define parameter bounds, observations, and emulator settings in a single constructor call - Execution: called
engine.run()to run all iterations automatically - Analysis: inspected the final plausible parameter space and validated against observed data
Next steps¶
| Goal | Tutorial |
|---|---|
| Manual step-by-step control with emulator inspection | 02_manual_workflow.ipynb |
| Advanced configuration options, callbacks, checkpointing | 03_advanced_configuration.ipynb |
Select specific (parameter, seed) trajectories from the NROY ensemble |
05_trajectory_selection.ipynb |