API reference
Documentation is generated from the historymatching package using mkdocstrings.
History matching
The single public entry point. Configure and run an entire workflow through one HistoryMatching(...) constructor call.
Bayesian history matching — configure everything in one constructor call.
Pass your parameter bounds, observations, and simulator function as plain
arguments. Friendly values (strings, dicts, lists) are accepted for the
strategy options and turned into the underlying objects for you; sensible
defaults are used for anything you omit.
Examples:

    import historymatching as hm

    # The simulator takes a DataFrame of samples and returns one row of
    # outputs per sample. Each output name must match an observation key.
    def run_sir(samples):
        rows = []
        for _, row in samples.iterrows():
            rows.append({'peak_incidence': simulate_peak(row['beta'], row['gamma'])})
        return rows  # a DataFrame is also accepted

    engine = hm.HistoryMatching(
        function=run_sir,
        bounds={'beta': (0.5, 3.0), 'gamma': (0.1, 1.0)},
        observations={'peak_incidence': (120.0, 50.0)},  # (mean, std)
        emulator_type='gpr',
        n_samples=500,
        max_iterations=4,
    )

    # Automated execution
    results = engine.run()
    plausible = engine.get_nroy_samples()  # NROY = "Not Ruled Out Yet"

    # Interactive, step-by-step execution
    result = engine.step()    # run one wave
    engine.commit_step()      # accept it (or engine.revert_step())
    engine.feature_selection = ['different_output']  # reconfigure on the fly
    result = engine.step()

The simulator function receives a pandas DataFrame of parameter samples
(one row per sample) and may return either a DataFrame or a list of dicts
(one dict per sample) mapping output names to values.
Configure a history matching run.
An "output" is a named scalar your simulator produces that you have an
observed target for (e.g. 'peak_infected'). Each wave trains an
emulator for one or more outputs and rules out parameter regions whose
emulated outputs are implausibly far from the observations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| function | Optional[Callable] | The simulator. A callable taking a DataFrame of parameter samples and returning a DataFrame (or list of dicts) of outputs, whose column/key names match the observation names. May also be set later with | None |
| bounds | Union[dict, DataFrame, ParameterSpace, None] | The parameter space to search. A dict mapping | None |
| observations | Union[dict, DataFrame, ObservationData, None] | The target data. A dict mapping | None |
| sampling_strategy | Union[str, dict, SamplingStrategy] | | 'lhs' |
| feature_selection | Union[str, list, dict, FeatureSelectionStrategy, None] | which outputs to emulate each wave. A name or list of names (emulate exactly these), a config dict (e.g. | None |
| emulator_type | str | | 'gpr' |
| emulator_factory | Optional[EmulatorFactory] | a pre-built :class: | None |
| emulator_bank | Optional[EmulatorBank] | a pre-populated :class: | None |
| n_samples | int | parameter samples generated per wave. | 1000 |
| implausibility_threshold | float | implausibility cutoff, typically 2.5-4.0. | 3.0 |
| max_iterations | int | maximum number of waves to run. | 10 |
| random_seed | Optional[int] | seed for reproducibility. | None |
| auto_reduce_space | bool | enable automatic parameter-space reduction. | False |
| oversample_factor | float | oversampling factor for rejection filtering (>= 1.0). | 1.1 |
| max_batch_size | int | max candidates per NROY sampling batch (>= 100). | 10000 |
| output_dir | Optional[str] | where to auto-save waves, diagnostics, and checkpoints. Nothing is written until the first :meth: | './hm_output' |
| run_name | Optional[str] | subdirectory under | None |
| convergence_threshold | float | stop early once the plausible (NROY) fraction falls below this; | 0.0 |
| nroy_method | str | NROY sampler: | 'auto' |
| nroy_options | Optional[dict] | dict of advanced options forwarded to the NROY sampler. | None |
| max_candidate_factor | int | cap on candidates per wave as a multiple of | 1000 |
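The wave loop that these options steer (draw n_samples, score each sample against the observation, keep what is not ruled out, stop after max_iterations or once the plausible fraction drops below convergence_threshold) can be illustrated with a standard-library sketch. It does not call historymatching: the name toy_history_match, the one-parameter simulator, and the use of the simulator itself as a zero-variance stand-in for an emulator are all assumptions made for illustration, and the package's own loop may differ.

```python
import math
import random


def simulator(x):
    """Hypothetical one-parameter simulator with a single scalar output."""
    return x * x


def implausibility(pred_mean, pred_var, obs_mean, obs_std):
    """Standardised distance between a prediction and the observation."""
    return abs(pred_mean - obs_mean) / math.sqrt(pred_var + obs_std ** 2)


def toy_history_match(bounds, obs_mean, obs_std, n_samples=200,
                      implausibility_threshold=3.0, max_iterations=4,
                      convergence_threshold=0.0, seed=0):
    """Shrink a 1-D interval wave by wave; returns (final_bounds, history)."""
    rng = random.Random(seed)
    lo, hi = bounds
    original_width = hi - lo
    history = []  # (wave number, fraction of the original interval kept)
    for wave in range(1, max_iterations + 1):
        samples = [rng.uniform(lo, hi) for _ in range(n_samples)]
        # Assumption: the simulator stands in for an emulator with zero
        # predictive variance; a real run would train an emulator here.
        nroy = [x for x in samples
                if implausibility(simulator(x), 0.0, obs_mean, obs_std)
                <= implausibility_threshold]
        if not nroy:
            break  # everything was ruled out this wave
        lo, hi = min(nroy), max(nroy)
        fraction = (hi - lo) / original_width
        history.append((wave, fraction))
        if fraction < convergence_threshold:
            break
    return (lo, hi), history


final_bounds, history = toy_history_match((0.0, 10.0), obs_mean=25.0, obs_std=1.0)
print(final_bounds, history)
```

With an observation of 25 ± 1 and a threshold of 3, only inputs whose squared value lies between 22 and 28 survive, so the interval contracts towards roughly 4.7 to 5.3.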
historymatching.HistoryMatching.emulator_factory
property
writable
The emulator factory. Assign an EmulatorFactory for full control.
historymatching.HistoryMatching.emulator_type
property
writable
Emulator type as a string. Assign 'gpr'/'glm'/'linear' to change it.
historymatching.HistoryMatching.feature_selection
property
writable
Which outputs to emulate each wave. Assign a name/list/dict/strategy (e.g. engine.feature_selection = ['peak']).
historymatching.HistoryMatching.max_iterations
property
writable
Maximum number of waves to run (you may raise it mid-run).
historymatching.HistoryMatching.outputs
property
Names of the observed outputs being matched.
historymatching.HistoryMatching.parameters
property
Names of the parameters being calibrated.
historymatching.HistoryMatching.results
property
All committed IterationResult objects, in order.
historymatching.HistoryMatching.sampling_strategy
property
writable
Sampling strategy. Assign a name/dict/strategy to change it (e.g. engine.sampling_strategy = 'grid').
historymatching.HistoryMatching.__len__()
Number of committed waves.
historymatching.HistoryMatching.__repr__()
Config-revealing representation (leads with what is being calibrated).
historymatching.HistoryMatching.add_iteration_callback(callback)
Add a callback to be called after each iteration.
historymatching.HistoryMatching.add_progress_callback(callback)
Add a callback to be called on progress updates.
historymatching.HistoryMatching.commit_step()
Commit the pending iteration results.
This makes the changes from the last step() permanent and advances the iteration counter.
Raises:
| Type | Description |
|---|---|
| RuntimeError | If no pending iteration to commit |
historymatching.HistoryMatching.drop_emulator_from_pending(feature)
Remove a specific emulator from the pending iteration before committing.
Call this after step() but before commit_step() to exclude an emulator whose diagnostics indicate a poor fit. The emulator will not be stored in the bank and will not contribute to implausibility filtering in future waves.
The simulation data for this wave is unaffected — only the emulator is dropped. The remaining emulators are committed as normal.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| feature | str | Name of the feature whose emulator should be dropped. | required |
Raises:
| Type | Description |
|---|---|
| RuntimeError | If called when no step is pending. |
| KeyError | If the feature was not emulated in the pending step. |
Example

    result = engine.step()
    metrics = result.get_emulator_quality_metrics()
    for output in result.emulated_outputs:
        print(f"{output}: R²={metrics[output]['r2']:.3f}")

    # Drop any emulator with a poor fit before committing
    engine.drop_emulator_from_pending('output_c')
    engine.commit_step()

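Choosing which emulators to drop can be automated from the metrics dict. The sketch below is a hypothetical helper, not part of the package: the name outputs_to_drop and the 0.8 cutoff are assumptions, and the metrics are made-up values shaped like the dict indexed in the example above (output name mapped to a dict with an 'r2' entry).

```python
def outputs_to_drop(metrics, min_r2=0.8):
    """Return output names whose R² is below min_r2 or was not computed."""
    return sorted(
        name for name, values in metrics.items()
        # Assumption: a missing 'r2' entry is treated as a poor fit.
        if values.get('r2', float('-inf')) < min_r2
    )


# Hypothetical metrics for three emulated outputs.
metrics = {
    'output_a': {'r2': 0.97},
    'output_b': {'r2': 0.91},
    'output_c': {'r2': 0.42},
}
to_drop = outputs_to_drop(metrics)
print(to_drop)  # ['output_c']
```

Each returned name could then be passed to engine.drop_emulator_from_pending(...) before calling engine.commit_step().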
historymatching.HistoryMatching.enumerate()
Iterate over committed waves as (iteration, result, samples) tuples.
Example

    for i, result, samples in engine.enumerate():
        print(i, result.nroy_fraction, len(samples))

historymatching.HistoryMatching.get_all_results()
Get all committed iteration results.
historymatching.HistoryMatching.get_iteration_result(iteration)
Get result for a specific iteration.
historymatching.HistoryMatching.get_nroy_samples(n=None, method=None, **kwargs)
Get plausible parameter samples — the calibration result.
Returns the NROY ("Not Ruled Out Yet") samples: parameter sets that pass
ALL committed emulators' implausibility checks. By default returns the
pre-computed set from the last wave (n_samples of them). Pass n
to draw a fresh, larger set filtered through the current emulator bank.
No new simulations are run — only fast emulator predictions are used.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| n | Optional[int] | Number of NROY samples to return. If None, returns the pre-computed set from the last committed wave. | None |
| method | Optional[str] | NROY sampling method: | None |
| **kwargs | | Extra options passed to | {} |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame of NROY samples, or empty DataFrame if no iterations committed. |
Example

    results = engine.run()
    nroy = engine.get_nroy_samples()                    # cached from last wave
    nroy = engine.get_nroy_samples(10000)               # larger draw (default method)
    nroy = engine.get_nroy_samples(5000, method='lhs')  # unbiased for posterior
    nroy = engine.get_nroy_samples(5000, method='auto', n_lines=40, points_per_line=100)

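Drawing a fresh NROY set by rejection filtering can be sketched without the package. The code below is an illustration under stated assumptions, not the library's sampler: draw_nroy and max_implausibility are hypothetical names, emulators are modelled as plain callables returning (predicted_mean, predicted_variance), and the batch sizing (needed samples divided by the observed acceptance rate, scaled by oversample_factor and capped at max_batch_size) is one plausible reading of those two options.

```python
import math
import random


def max_implausibility(point, emulators, observations):
    """Largest implausibility of a point over all emulated outputs."""
    worst = 0.0
    for name, emulate in emulators.items():
        mean, var = emulate(point)
        obs_mean, obs_std = observations[name]
        worst = max(worst, abs(mean - obs_mean) / math.sqrt(var + obs_std ** 2))
    return worst


def draw_nroy(n, bounds, emulators, observations, threshold=3.0,
              oversample_factor=1.1, max_batch_size=10000,
              max_batches=100, seed=None):
    """Rejection-sample up to n points that pass every emulator's check."""
    rng = random.Random(seed)
    accepted = []
    acceptance = 1.0  # optimistic initial guess, updated after each batch
    for _ in range(max_batches):
        needed = n - len(accepted)
        if needed <= 0:
            break
        batch = min(max_batch_size,
                    math.ceil(needed / acceptance * oversample_factor))
        candidates = [{p: rng.uniform(lo, hi) for p, (lo, hi) in bounds.items()}
                      for _ in range(batch)]
        kept = [c for c in candidates
                if max_implausibility(c, emulators, observations) <= threshold]
        accepted.extend(kept)
        # Floor the estimate so the next batch size stays finite.
        acceptance = max(len(kept) / batch, 1.0 / max_batch_size)
    return accepted[:n]


bounds = {'beta': (0.5, 3.0), 'gamma': (0.1, 1.0)}
observations = {'peak_incidence': (120.0, 50.0)}  # (mean, std)
# Hypothetical emulator: returns (predicted_mean, predicted_variance).
emulators = {'peak_incidence': lambda p: (100.0 * p['beta'], 25.0)}

nroy = draw_nroy(50, bounds, emulators, observations, seed=1)
print(len(nroy))
```

Only cheap emulator evaluations occur inside the loop, which is why a large draw of this kind needs no new simulator runs.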
historymatching.HistoryMatching.get_pending_next_samples()
Get the proposed samples for the next iteration, if available.
This allows inspection of pre-computed samples after step() but before commit_step(). The samples are computed during step() execution and will be used for the next iteration if the current step is committed.
Returns:
| Type | Description |
|---|---|
| Optional[DataFrame] | DataFrame of proposed samples for next iteration, or None if no step is pending |
Example

    result = engine.step()
    next_samples = engine.get_pending_next_samples()
    if next_samples is not None:
        print(f"Proposed {len(next_samples)} samples for next iteration")
        # Inspect the samples before deciding to commit
    engine.commit_step()  # or engine.revert_step()

historymatching.HistoryMatching.get_status_summary()
Get a human-readable, multi-line summary of the current status.
(To reconfigure mid-run, just assign to the matching attribute, e.g.
engine.feature_selection = ['peak'] or engine.max_iterations = 20.)
historymatching.HistoryMatching.load_checkpoint(filepath, sampling_strategy, feature_selection, emulator_factory)
classmethod
Load engine state from checkpoint file.
historymatching.HistoryMatching.plot_ensemble_fan(trajectories, observed=None, x=None, xlabel=None, ylabel=None, title=None, ax=None, show=False)
staticmethod
Fan plot of an ensemble of trajectories, optionally vs observed data.
Draws the ensemble median, a shaded 5-95th and 25-75th percentile band, the faint individual trajectories, and (if given) the observed series on top. Handy for eyeballing how well a set of plausible (NROY) parameter sets reproduces the data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| trajectories | | 2-D array-like, shape | required |
| observed | | Optional observed series of length | None |
| x | | Optional x-axis values (defaults to | None |
| xlabel, ylabel, title | | Optional axis labels / title. | None |
| ax | | Optional matplotlib Axes to draw into (a new figure is made if None). | None |
| show | bool | If True, call | False |
Returns:
| Type | Description |
|---|---|
|
|
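The bands drawn by the fan plot (median, 5-95th and 25-75th percentiles at each time step) can be computed independently of any plotting code. The following standard-library sketch is an illustration only: ensemble_bands is a hypothetical helper, and the linear interpolation used by statistics.quantiles with method='inclusive' is an assumption about how percentiles are defined, which may differ from what the method uses internally.

```python
import statistics


def ensemble_bands(trajectories):
    """Per-time-step median and 5-95 / 25-75 percentile bands.

    trajectories is a list of equal-length sequences, one per ensemble
    member (at least two members are needed).
    """
    bands = {'p05': [], 'p25': [], 'median': [], 'p75': [], 'p95': []}
    for values in zip(*trajectories):  # iterate over time steps
        # 99 cut points; index k - 1 holds the k-th percentile.
        cuts = statistics.quantiles(values, n=100, method='inclusive')
        bands['p05'].append(cuts[4])
        bands['p25'].append(cuts[24])
        bands['median'].append(cuts[49])
        bands['p75'].append(cuts[74])
        bands['p95'].append(cuts[94])
    return bands


# Five hypothetical trajectories, two time steps each.
trajectories = [[0, 10], [1, 11], [2, 12], [3, 13], [4, 14]]
bands = ensemble_bands(trajectories)
print(bands['median'])  # [2.0, 12.0]
```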
historymatching.HistoryMatching.plot_nroy_parameters(samples=None, derived=None, true_parameters=None, bins=25, fig_kwargs=None, show=False)
Corner plot of the non-implausible (NROY) parameter samples.
Diagonal panels show marginal histograms; lower-triangle panels show pairwise scatter. Optionally overlay derived quantities and the known "true" values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| samples | Optional[DataFrame] | NROY samples to plot. Defaults to :meth: | None |
| derived | Optional[dict] | Dict mapping a derived-quantity name to a callable that takes the samples DataFrame and returns a Series/array (e.g. | None |
| true_parameters | Optional[dict] | Dict mapping a column name to its true value; drawn as reference lines/markers. | None |
| bins | int | Number of histogram bins for the diagonal panels. | 25 |
| fig_kwargs | Optional[dict] | Extra keyword arguments passed to | None |
| show | bool | If True, call | False |
Returns:
| Type | Description |
|---|---|
|
|
historymatching.HistoryMatching.print_emulator_quality_metrics(iteration=None)
Print and return per-feature emulator quality metrics for a wave.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| iteration | Optional[int] | Wave number to report (1-based). Defaults to the most recently committed wave. | None |
Returns:
| Type | Description |
|---|---|
| dict | Dict mapping feature name -> metrics dict (R², MSE, training size, ...). |
historymatching.HistoryMatching.revert_step()
Revert the pending iteration results.
This discards the changes from the last step() and returns to the previous state.
Raises:
| Type | Description |
|---|---|
| RuntimeError | If no pending iteration to revert |
historymatching.HistoryMatching.run(auto_commit=True, resume=False)
Run automated history matching workflow.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| auto_commit | bool | Whether to automatically commit each iteration | True |
| resume | bool | If True, load checkpoint from output_dir and continue. If False (default), start fresh. Raises if a checkpoint exists and resume is False (to prevent accidental overwrites). | False |
Returns:
| Type | Description |
|---|---|
| list[IterationResult] | List of IterationResult objects for all iterations |
Raises:
| Type | Description |
|---|---|
| ValueError | If simulation function is not set or configuration is invalid |
historymatching.HistoryMatching.save_checkpoint(filepath)
Save engine state to checkpoint file.
historymatching.HistoryMatching.save_diagnostics(directory, verbose=False)
Save every committed wave's artifacts under directory.
Writes one wave{N}/ subdirectory per wave (samples, simulator
outputs, pickled emulators, predicted-vs-actual + ARD diagnostic plots,
a convergence figure, and metrics.json) by calling
IterationResult.save for each wave. This is the manual
equivalent of the per-wave output written automatically when
output_dir is set.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| directory | str | Directory to write into (created if needed). | required |
| verbose | bool | If True, print the path written for each wave. | False |
historymatching.HistoryMatching.step(features=None)
Execute a single history matching iteration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| features | Optional[list[str]] | Optional list of features to emulate (overrides strategy) | None |
Returns:
| Type | Description |
|---|---|
| IterationResult | IterationResult for this iteration |
Raises:
| Type | Description |
|---|---|
| RuntimeError | If engine is not in a valid state for stepping |
| ValueError | If simulation function is not set or configuration is invalid |
historymatching.HistoryMatching.validate()
Validate the engine's configuration.
Checks that the required components are present and that all numeric and
enumerated options are within their valid ranges. Called automatically at
the start of run() and step() (configuration attributes are
public and may be changed after construction); may also be called directly.
Raises:
| Type | Description |
|---|---|
| ValueError | If any required component is missing or an option is invalid. |
Domain objects
Parameter space
Encapsulates parameter space information and operations.
Represents the bounds and constraints for parameters in a history matching analysis. Supports operations like constraining the space based on samples and calculating volume reductions.
Initialize parameter space.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| parameters | Union[DataFrame, dict] | DataFrame with columns ['parameter', 'minimum', 'maximum'] or dict mapping parameter names to (min, max) tuples | required |
historymatching.ParameterSpace.__eq__(other)
Check equality with another ParameterSpace.
historymatching.ParameterSpace.__len__()
Return number of parameters.
historymatching.ParameterSpace.__repr__()
String representation showing each parameter's bounds.
historymatching.ParameterSpace.constrain_parameter(param_name, new_min, new_max)
Create new ParameterSpace with updated bounds for one parameter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| param_name | str | Name of parameter to constrain | required |
| new_min | float | New minimum value | required |
| new_max | float | New maximum value | required |
Returns:
| Type | Description |
|---|---|
| ParameterSpace | New ParameterSpace instance with updated bounds |
historymatching.ParameterSpace.constrain_to_samples(samples_df, percentile=95)
Create new ParameterSpace constrained to sample bounds.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| samples_df | DataFrame | DataFrame containing parameter samples | required |
| percentile | float | Percentile to use for bounds (e.g., 95 means 2.5-97.5 range) | 95 |
Returns:
| Type | Description |
|---|---|
| ParameterSpace | New ParameterSpace constrained to sample bounds |
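The percentile argument describes a central interval: 95 keeps the range between the 2.5th and 97.5th percentiles of each parameter. A minimal sketch of that arithmetic follows, using plain dicts of lists in place of a DataFrame. The helper names percentile and central_bounds are hypothetical, and the linear interpolation between order statistics is an assumption; the package may interpolate differently.

```python
def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between ranks."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(data) - 1)
    return data[lower] + (data[upper] - data[lower]) * (pos - lower)


def central_bounds(samples, width=95):
    """Map each parameter to the bounds of its central `width` percent."""
    tail = (100 - width) / 2.0  # 2.5 when width is 95
    return {
        name: (percentile(values, tail), percentile(values, 100 - tail))
        for name, values in samples.items()
    }


# Hypothetical samples: the integers 0..100 for a single parameter.
new_bounds = central_bounds({'beta': list(range(101))}, width=95)
print(new_bounds)  # {'beta': (2.5, 97.5)}
```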
historymatching.ParameterSpace.get_bounds(parameter_name)
Get bounds for a specific parameter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| parameter_name | str | Name of the parameter | required |
Returns:
| Type | Description |
|---|---|
| Tuple[float, float] | Tuple of (minimum, maximum) values |
historymatching.ParameterSpace.get_parameter_names()
Get list of all parameter names.
historymatching.ParameterSpace.sample_uniformly(n_samples, seed=None)
Generate uniform random samples within parameter bounds.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| n_samples | int | Number of samples to generate | required |
| seed | Optional[int] | Random seed for reproducibility | None |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with columns for each parameter |
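As an illustration of seeded uniform sampling within bounds, here is a standard-library sketch. It is a stand-in rather than the method's implementation: the free function below shares only its name with the method, takes a plain dict of (min, max) tuples, and returns a list of dicts instead of a DataFrame.

```python
import random


def sample_uniformly(bounds, n_samples, seed=None):
    """Draw n_samples points, each coordinate uniform within its bounds."""
    rng = random.Random(seed)  # a private generator keeps runs reproducible
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        for _ in range(n_samples)
    ]


bounds = {'beta': (0.5, 3.0), 'gamma': (0.1, 1.0)}
first = sample_uniformly(bounds, 5, seed=123)
second = sample_uniformly(bounds, 5, seed=123)
print(first == second)  # True: the same seed gives the same samples
```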
historymatching.ParameterSpace.to_dataframe()
Return the underlying parameters DataFrame.
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with parameter space definition |
historymatching.ParameterSpace.validate_samples(samples_df)
Check if all samples fall within parameter bounds.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| samples_df | DataFrame | DataFrame containing parameter samples | required |
Returns:
| Type | Description |
|---|---|
| bool | True if all samples are valid, False otherwise |
historymatching.ParameterSpace.volume_fraction_remaining(original_space)
Calculate remaining parameter space volume as fraction of original.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| original_space | ParameterSpace | Original ParameterSpace to compare against | required |
Returns:
| Type | Description |
|---|---|
| float | Fraction of volume remaining (0.0 to 1.0) |
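For box-shaped bounds, a natural reading of "volume fraction" is the product over parameters of the current width divided by the original width. The sketch below works on plain dicts of (min, max) tuples; treating the space as a hyper-rectangle is an assumption, and the free function here is only a stand-in for the method.

```python
import math


def volume_fraction_remaining(current, original):
    """Product of per-parameter width ratios (current / original)."""
    return math.prod(
        (current[name][1] - current[name][0])
        / (original[name][1] - original[name][0])
        for name in original
    )


original = {'beta': (0.0, 4.0), 'gamma': (0.0, 1.0)}
current = {'beta': (1.0, 3.0), 'gamma': (0.25, 0.75)}
fraction = volume_fraction_remaining(current, original)
print(fraction)  # 0.25: each parameter's range was halved
```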
Observation data
Encapsulates observational data and implausibility calculations.
Represents the target observations that the model should match, including their means and variances. Provides methods for calculating implausibility metrics.
Initialize observation data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| observations | Union[DataFrame, dict] | DataFrame with columns ['feature', 'mean', 'std'] or dict mapping each observed output name to a | required |
historymatching.ObservationData.__eq__(other)
Check equality with another ObservationData.
historymatching.ObservationData.__len__()
Return number of observed features.
historymatching.ObservationData.__repr__()
String representation showing each observed (mean, std).
historymatching.ObservationData.calculate_implausibilities(predictions, model_discrepancy=0.0)
Calculate implausibilities for multiple features.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | Dict[str, Tuple[float, float]] | Dict mapping feature names to (predicted_mean, predicted_variance) tuples | required |
| model_discrepancy | float | Additional model uncertainty | 0.0 |
Returns:
| Type | Description |
|---|---|
| Dict[str, float] | Dict mapping feature names to implausibility values |
historymatching.ObservationData.calculate_implausibility(feature_name, predicted_mean, predicted_variance, model_discrepancy=0.0)
Calculate implausibility metric for a single feature.
The implausibility is calculated as: I = |predicted_mean - observed_mean| / sqrt(predicted_variance + observed_variance + model_discrepancy^2)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| feature_name | str | Name of the feature | required |
| predicted_mean | Union[float, Series] | Predicted mean from emulator (scalar or Series) | required |
| predicted_variance | Union[float, Series] | Predicted variance from emulator (scalar or Series) | required |
| model_discrepancy | float | Additional model uncertainty | 0.0 |
Returns:
| Type | Description |
|---|---|
| Union[float, Series] | Implausibility value(s) (lower is more plausible) - scalar if inputs are scalar, Series if inputs are Series |
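The implausibility formula given for this method can be written as a few lines of standalone Python. This is a scalar-only sketch, not the method itself: the free function takes the observed mean and standard deviation directly, and it assumes that observed_variance in the formula is the square of the observation's std.

```python
import math


def implausibility(predicted_mean, predicted_variance,
                   observed_mean, observed_std, model_discrepancy=0.0):
    """I = |pred - obs| / sqrt(pred_var + obs_var + discrepancy^2)."""
    observed_variance = observed_std ** 2  # assumption: variance = std squared
    denominator = math.sqrt(
        predicted_variance + observed_variance + model_discrepancy ** 2
    )
    return abs(predicted_mean - observed_mean) / denominator


# |130 - 120| / sqrt(9 + 16 + 0) = 10 / 5
value = implausibility(130.0, 9.0, 120.0, 4.0)
print(value)  # 2.0
```

A value of 2.0 would pass the default implausibility_threshold of 3.0. Adding model discrepancy enlarges the denominator and therefore makes a point harder to rule out.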
historymatching.ObservationData.calculate_maximum_implausibility(predictions, model_discrepancy=0.0)
Calculate maximum implausibility across all features.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | Dict[str, Tuple[float, float]] | Dict mapping feature names to (predicted_mean, predicted_variance) tuples | required |
| model_discrepancy | float | Additional model uncertainty | 0.0 |
Returns:
| Type | Description |
|---|---|
| float | Maximum implausibility value across all features |
historymatching.ObservationData.filter_features(feature_names)
Create new ObservationData with only specified features.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| feature_names | List[str] | List of feature names to keep | required |
Returns:
| Type | Description |
|---|---|
| ObservationData | New ObservationData instance with filtered features |
historymatching.ObservationData.get_all_targets()
Get all feature targets as a dictionary.
Returns:
| Type | Description |
|---|---|
| Dict[str, Tuple[float, float]] | Dict mapping feature names to (mean, std) tuples |
historymatching.ObservationData.get_feature_names()
Get list of all observed feature names.
historymatching.ObservationData.get_target_for_feature(feature_name)
Get target mean and std for a specific feature.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| feature_name | str | Name of the feature | required |
Returns:
| Type | Description |
|---|---|
| Tuple[float, float] | Tuple of (mean, std) values |
historymatching.ObservationData.has_feature(feature_name)
Check if a feature exists in observations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| feature_name | str | Name of the feature to check | required |
Returns:
| Type | Description |
|---|---|
| bool | True if feature exists, False otherwise |
historymatching.ObservationData.to_dataframe()
Return the underlying observations DataFrame.
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with observations data |
Emulator bank
Manages storage and retrieval of emulators across iterations.
The EmulatorBank stores emulators organized by iteration number and feature name, providing methods for adding, retrieving, and managing the emulator collection.
Initialize empty emulator bank.
historymatching.EmulatorBank.__contains__(key)
Check if iteration or (iteration, feature) exists.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| key | | Either iteration number or (iteration, feature) tuple | required |
Returns:
| Type | Description |
|---|---|
| bool | True if key exists, False otherwise |
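A container that accepts either an iteration number or an (iteration, feature) tuple in membership tests can be built on a nested dict. The class below is a hypothetical miniature written for illustration (TinyBank is not part of the package, and the real EmulatorBank storage layout is not shown in this reference).

```python
class TinyBank:
    """Minimal store keyed by iteration, then by feature name."""

    def __init__(self):
        self._store = {}  # iteration -> {feature: emulator}

    def add_emulator(self, iteration, feature, emulator):
        self._store.setdefault(iteration, {})[feature] = emulator

    def __contains__(self, key):
        # A tuple is read as (iteration, feature); anything else as iteration.
        if isinstance(key, tuple):
            iteration, feature = key
            return feature in self._store.get(iteration, {})
        return key in self._store

    def __len__(self):
        # Total number of emulators across all iterations.
        return sum(len(features) for features in self._store.values())


bank = TinyBank()
bank.add_emulator(1, 'peak_incidence', object())  # placeholder emulator
print(1 in bank, (1, 'peak_incidence') in bank, (1, 'other') in bank)
# True True False
```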
historymatching.EmulatorBank.__len__()
Return total number of emulators across all iterations.
historymatching.EmulatorBank.__repr__()
String representation.
historymatching.EmulatorBank.add_emulator(iteration, feature, emulator)
Add an emulator for a specific iteration and feature.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| iteration | int | Iteration number (1-based) | required |
| feature | str | Feature name | required |
| emulator | BaseEmulator | Trained emulator instance | required |
historymatching.EmulatorBank.clear()
Remove all emulators.
historymatching.EmulatorBank.copy()
Create a deep copy of the emulator bank.
Returns:
| Type | Description |
|---|---|
| EmulatorBank | New EmulatorBank instance with copied emulators |
historymatching.EmulatorBank.get_all_emulators()
Get all emulators in the bank.
historymatching.EmulatorBank.get_all_features()
Get all unique feature names across all iterations.
Returns:
| Type | Description |
|---|---|
| List[str] | List of unique feature names |
historymatching.EmulatorBank.get_all_iterations()
Get list of iteration numbers with emulators.
Returns:
| Type | Description |
|---|---|
| List[int] | Sorted list of iteration numbers |
historymatching.EmulatorBank.get_emulator(iteration, feature)
Retrieve a specific emulator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| iteration | int | Iteration number | required |
| feature | str | Feature name | required |
Returns:
| Type | Description |
|---|---|
| Optional[BaseEmulator] | Emulator instance or None if not found |
historymatching.EmulatorBank.get_emulators_for_iteration(iteration)
Get all emulators for a specific iteration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| iteration | int | Iteration number | required |
Returns:
| Type | Description |
|---|---|
| Dict[str, BaseEmulator] | Dict mapping feature names to emulators |
historymatching.EmulatorBank.get_features_for_iteration(iteration)
Get feature names for a specific iteration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| iteration | int | Iteration number | required |
Returns:
| Type | Description |
|---|---|
| List[str] | List of feature names |
historymatching.EmulatorBank.get_latest_emulators()
Get emulators from the most recent iteration.
Returns:
| Type | Description |
|---|---|
| Dict[str, BaseEmulator] | Dict mapping feature names to emulators from latest iteration |
historymatching.EmulatorBank.get_summary_statistics()
Get summary statistics about the emulator bank.
Returns:
| Type | Description |
|---|---|
| Dict | Dict with summary information |
historymatching.EmulatorBank.has_emulator(iteration, feature)
Check if an emulator exists for specific iteration and feature.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| iteration | int | Iteration number | required |
| feature | str | Feature name | required |
Returns:
| Type | Description |
|---|---|
| bool | True if emulator exists, False otherwise |
historymatching.EmulatorBank.has_emulators()
Check if any emulators exist in the bank.
Returns:
| Type | Description |
|---|---|
| bool | True if any emulators exist, False otherwise |
historymatching.EmulatorBank.load_from_directory(directory_path)
Load emulators from disk.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| directory_path | str | Directory containing saved emulators | required |
historymatching.EmulatorBank.remove_emulator(iteration, feature)
Remove a specific emulator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| iteration | int | Iteration number | required |
| feature | str | Feature name | required |
Returns:
| Type | Description |
|---|---|
| bool | True if emulator was removed, False if not found |
historymatching.EmulatorBank.remove_iteration(iteration)
Remove all emulators for a specific iteration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| iteration | int | Iteration number to remove | required |
historymatching.EmulatorBank.save_to_directory(directory_path)
Save all emulators to disk.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| directory_path | str | Directory to save emulators in | required |
Iteration result
Immutable result from one history matching wave (iteration).
Holds the parameter samples run this wave, the simulator outputs, the
outputs that were emulated, the trained emulators, and the fraction of
parameter space still plausible (nroy_fraction).
The NROY (Not Ruled Out Yet) set itself is not stored here — it is defined
implicitly by the emulator bank. Use HistoryMatching.get_nroy_samples
to draw plausible samples. The final wave's samples +
simulation_results can be fed directly into trajectory selection.
historymatching.IterationResult.__post_init__()
Validate the result after creation.
historymatching.IterationResult.get_emulator(output)
Get the emulator trained for a specific output this wave.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| output | str | Name of the emulated output. | required |
Returns:
| Type | Description |
|---|---|
| BaseEmulator | Emulator instance. |
Raises:
| Type | Description |
|---|---|
| KeyError | If no emulator was trained for that output this wave. |
historymatching.IterationResult.get_emulator_quality_metrics()
Quality metrics for each emulated output.
Returns:
| Type | Description |
|---|---|
| Dict[str, Dict[str, float]] | Dict mapping each output name to a metrics dict with keys |
| Dict[str, Dict[str, float]] | |
| Dict[str, Dict[str, float]] | A key is absent if that metric could not be computed. |
historymatching.IterationResult.plot_emulator_diagnostics(output, **kwargs)
Plot diagnostics for a specific output's emulator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| output | str | Name of the emulated output. | required |
| **kwargs | | Additional arguments passed to the emulator's plot method. | {} |
historymatching.IterationResult.save(directory, all_results=None)
Save everything about this wave to {directory}/wave{N}/.
Writes the parameter samples.csv and simulation_results.csv, a
pickle of each emulator under emulators/, per-output diagnostic
figures (predicted-vs-actual, and ARD lengthscales for GPR), a
metrics.json, and — when all_results is supplied — a
convergence.png showing the plausible fraction across waves.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| directory | str | Parent directory; a | required |
| all_results | Optional[list] | Optional list of all waves so far, used for the convergence plot. | None |
Returns:
| Type | Description |
|---|---|
| str | The path to the |
historymatching.IterationResult.summary()
A summary of this wave as a plain dict (handy for logging/inspection).
Strategies
Sampling
Bases: ABC
Abstract base class for parameter space sampling strategies.
Sampling strategies generate parameter samples within a given parameter space for use in history matching iterations.
historymatching.sampling.SamplingStrategy.generate_samples(parameter_space, n_samples, seed=None)
abstractmethod
Generate parameter samples within the given parameter space.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| parameter_space | ParameterSpace | ParameterSpace defining the bounds | required |
| n_samples | int | Number of samples to generate | required |
| seed | Optional[int] | Random seed for reproducibility (optional) | None |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with columns for each parameter and rows for each sample |
historymatching.sampling.SamplingStrategy.get_strategy_name()
abstractmethod
Return human-readable name for this strategy.
historymatching.sampling.SamplingStrategy.validate_parameters(**kwargs)
Validate strategy-specific parameters.
Override in subclasses to add parameter validation.
Raises:
| Type | Description |
|---|---|
| ValueError | If parameters are invalid |
Bases: SamplingStrategy
Latin Hypercube Sampling strategy.
Generates samples using Latin Hypercube Sampling, which ensures good space-filling properties by dividing each parameter dimension into equally-sized intervals.
Initialize Latin Hypercube sampling strategy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| criterion | str | Optimization criterion ('center', 'maximin', 'centermaximin', 'correlation') | 'maximin' |
| iterations | int | Number of optimization iterations | 5 |
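The stratification idea behind Latin Hypercube Sampling (each parameter's range is split into n_samples equal-width intervals and every interval is used exactly once) can be shown with a short standard-library sketch. This is the basic, unoptimised construction only: latin_hypercube is a hypothetical function, and it implements none of the criterion options listed above (such as 'maximin'), so it is not a substitute for the class.

```python
import random


def latin_hypercube(bounds, n_samples, seed=None):
    """Basic LHS: one point per equal-width stratum in every dimension."""
    rng = random.Random(seed)
    columns = {}
    for name, (lo, hi) in bounds.items():
        # One uniform draw inside each of n_samples strata of [0, 1).
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(unit)  # decouple the stratum order across parameters
        columns[name] = [lo + u * (hi - lo) for u in unit]
    return [{name: columns[name][i] for name in bounds}
            for i in range(n_samples)]


samples = latin_hypercube({'beta': (0.0, 1.0), 'gamma': (0.0, 1.0)},
                          n_samples=10, seed=42)
# Every stratum index 0..9 appears exactly once per parameter.
print(sorted(int(s['beta'] * 10) for s in samples))
```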
historymatching.sampling.LatinHypercubeSampling.generate_samples(parameter_space, n_samples, seed=None)
Generate Latin Hypercube samples.
historymatching.sampling.LatinHypercubeSampling.validate_parameters(**kwargs)
Validate LHS parameters.
Bases: SamplingStrategy
Grid sampling strategy.
Generates samples on a regular grid in parameter space, providing systematic coverage of the space.
Initialize grid sampling strategy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| samples_per_dimension | Optional[int] | Number of samples per dimension (optional). If None, calculated from total n_samples | None |
historymatching.sampling.GridSampling.generate_samples(parameter_space, n_samples, seed=None)
Generate grid samples.
historymatching.sampling.GridSampling.validate_parameters(**kwargs)
Validate grid sampling parameters.
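A regular grid can be sketched as the Cartesian product of evenly spaced points along each parameter. The helper below is illustrative only: `grid_sketch` is a hypothetical name, and the rule used to derive the per-dimension count from `n_samples` (the largest k with k ** d <= n_samples) is an assumption, since the reference does not state how the package rounds.

```python
import itertools


def grid_sketch(bounds, n_samples, samples_per_dimension=None):
    """Build a regular grid over the bounds (hypothetical helper)."""
    n_dims = len(bounds)
    if samples_per_dimension is None:
        # Assumed rule: largest k such that k ** n_dims <= n_samples.
        k = 1
        while (k + 1) ** n_dims <= n_samples:
            k += 1
        samples_per_dimension = k
    axes = []
    for low, high in bounds.values():
        if samples_per_dimension == 1:
            # A single point per dimension sits at the midpoint.
            axes.append([(low + high) / 2])
        else:
            step = (high - low) / (samples_per_dimension - 1)
            axes.append([low + i * step for i in range(samples_per_dimension)])
    names = list(bounds)
    # Cartesian product of the per-parameter axes gives every grid node.
    return [dict(zip(names, point)) for point in itertools.product(*axes)]


grid = grid_sketch({'beta': (0.5, 3.0), 'gamma': (0.1, 1.0)}, n_samples=30)
```

Under this rule a request for 30 samples in two dimensions yields a 5 x 5 grid, so the number of points returned can be smaller than `n_samples`, and it grows as k ** d with the number of parameters.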
Bases: SamplingStrategy
Random sampling strategy.
Generates uniformly random samples within the parameter space bounds.
Initialize random sampling strategy.
historymatching.sampling.RandomSampling.generate_samples(parameter_space, n_samples, seed=None)
Generate random samples.
Factory for creating sampling strategy instances.
Provides a registry-based approach for creating sampling strategies by name, with support for custom strategy registration.
historymatching.sampling.SamplingStrategyFactory.available_strategies()
classmethod
Get list of available strategy names.
Returns:
| Type | Description |
|---|---|
| list[str] | List of registered strategy names |
historymatching.sampling.SamplingStrategyFactory.create(strategy_name, **kwargs)
classmethod
Create a sampling strategy by name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| strategy_name | str | Name of the strategy to create | required |
| **kwargs | | Strategy-specific parameters | {} |
Returns:
| Type | Description |
|---|---|
| SamplingStrategy | Configured SamplingStrategy instance |
Raises:
| Type | Description |
|---|---|
| ValueError | If strategy name is unknown |
historymatching.sampling.SamplingStrategyFactory.get_strategy_info(strategy_name)
classmethod
Get information about a strategy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| strategy_name | str | Name of the strategy | required |
Returns:
| Type | Description |
|---|---|
| Dict[str, str] | Dict with strategy information |
Raises:
| Type | Description |
|---|---|
| ValueError | If strategy name is unknown |
historymatching.sampling.SamplingStrategyFactory.register_strategy(name, strategy_class)
classmethod
Register a custom sampling strategy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name to register the strategy under | required |
| strategy_class | Type[SamplingStrategy] | SamplingStrategy subclass to register | required |
Raises:
| Type | Description |
|---|---|
| TypeError | If strategy_class is not a SamplingStrategy subclass |
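The registry-based pattern described for the factory (register by name, create by name, list available names) can be illustrated with a small stand-alone sketch. Everything here is hypothetical: `Strategy`, `Registry`, and `Halton` are stand-ins written for this example and are not classes from the package; only the method names and the documented error types mirror the reference above.

```python
from abc import ABC, abstractmethod


class Strategy(ABC):
    """Stand-in for SamplingStrategy (hypothetical, for illustration only)."""

    @abstractmethod
    def get_strategy_name(self):
        ...


class Registry:
    """Minimal name -> class registry mirroring the factory's documented methods."""

    _strategies = {}

    @classmethod
    def register_strategy(cls, name, strategy_class):
        # Reject anything that is not a Strategy subclass, as the reference documents.
        if not (isinstance(strategy_class, type) and issubclass(strategy_class, Strategy)):
            raise TypeError(f"{strategy_class!r} is not a Strategy subclass")
        cls._strategies[name] = strategy_class

    @classmethod
    def available_strategies(cls):
        return sorted(cls._strategies)

    @classmethod
    def create(cls, strategy_name, **kwargs):
        if strategy_name not in cls._strategies:
            raise ValueError(f"Unknown strategy {strategy_name!r}")
        # Strategy-specific keyword arguments are forwarded to the constructor.
        return cls._strategies[strategy_name](**kwargs)


class Halton(Strategy):
    """Toy custom strategy used only to exercise the registry."""

    def __init__(self, scramble=True):
        self.scramble = scramble

    def get_strategy_name(self):
        return "Halton (sketch)"


Registry.register_strategy('halton', Halton)
strategy = Registry.create('halton', scramble=False)
```

Keeping the registry on the class rather than on an instance is what lets the methods be classmethods, so a custom strategy registered once is visible everywhere the factory is used.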
Feature selection
Bases: ABC
Abstract base class for feature selection strategies.
Feature selection strategies determine which model outputs (features) should be emulated in each history matching iteration.
historymatching.feature_selection.FeatureSelectionStrategy.get_strategy_name()
abstractmethod
Return human-readable name for this strategy.
historymatching.feature_selection.FeatureSelectionStrategy.select_features(simulation_results, observations, iteration=1)
abstractmethod
Select features to emulate for this iteration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| simulation_results | DataFrame | DataFrame with simulation outputs | required |
| observations | ObservationData | ObservationData containing target observations | required |
| iteration | int | Current iteration number (1-based) | 1 |
Returns:
| Type | Description |
|---|---|
| List[str] | List of feature names to emulate |
historymatching.feature_selection.FeatureSelectionStrategy.validate_features(features, simulation_results, observations)
Validate selected features exist in both simulation results and observations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| features | List[str] | List of feature names to validate | required |
| simulation_results | DataFrame | DataFrame with simulation outputs | required |
| observations | ObservationData | ObservationData containing target observations | required |
Returns:
| Type | Description |
|---|---|
| List[str] | List of valid feature names |
Raises:
| Type | Description |
|---|---|
| ValueError | If no valid features remain |
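The validation rule stated for validate_features (a feature must exist in both the simulation results and the observations, and an error is raised if nothing survives) can be sketched as below. The function name and the use of plain lists instead of a DataFrame and ObservationData are assumptions made to keep the example self-contained.

```python
def validate_features_sketch(features, simulation_columns, observation_names):
    """Keep only features present in both the simulation outputs and the observations."""
    valid = [
        f for f in features
        if f in simulation_columns and f in observation_names
    ]
    if not valid:
        # Mirrors the documented ValueError when no valid features remain.
        raise ValueError(f"No valid features remain from {features!r}")
    return valid


valid = validate_features_sketch(
    features=['peak_incidence', 'final_size'],
    simulation_columns=['peak_incidence', 'final_size'],
    observation_names=['peak_incidence'],
)
```

Here 'final_size' is dropped because it has no observed target, leaving only 'peak_incidence'.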
Bases: FeatureSelectionStrategy
Automatic feature selection strategy.
Uses statistical metrics to automatically select the most informative features for emulation. Based on the existing feature_selection function.
Initialize automatic feature selection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| method | str | Statistical method for ranking features ('fano', 'var', 'mean', etc.) | 'mean_sq_z' |
| threshold | Optional[float] | Minimum threshold value for the metric (optional) | None |
| cooldown_period | int | Number of recent selections to track for avoiding repetition | 1 |
| correlation_threshold | float | Maximum correlation allowed with recent selections | 0.8 |
| max_features | int | Maximum number of features to select per iteration | 1 |
historymatching.feature_selection.AutoFeatureSelection.reset_history()
Reset the selection history (useful for testing or restarting).
historymatching.feature_selection.AutoFeatureSelection.select_features(simulation_results, observations, iteration=1)
Select features automatically using statistical metrics.
Bases: FeatureSelectionStrategy
Manual feature selection strategy.
Returns a predefined list of features for each iteration. Useful when you know exactly which features to emulate.
Initialize manual feature selection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| selected_features | Union[str, List[str]] | Feature name or list of feature names to select | required |
historymatching.feature_selection.ManualFeatureSelection.select_features(simulation_results, observations, iteration=1)
Select predefined features.
Emulator factory
Factory for creating and managing emulator instances.
Provides a registry-based approach for creating emulators by type, with support for custom emulator registration and default parameters.
Initialize emulator factory.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| default_type | str | Default emulator type to use | 'gpr' |
| **default_kwargs | | Default parameters to pass to emulators | {} |
historymatching.emulators.factory.EmulatorFactory.__repr__()
String representation of the factory.
historymatching.emulators.factory.EmulatorFactory.available_emulators()
classmethod
Get list of available emulator types.
Returns:
| Type | Description |
|---|---|
| List[str] | List of registered emulator type names |
historymatching.emulators.factory.EmulatorFactory.create_and_train_emulator(X, y, emulator_type=None, **kwargs)
Create and immediately train an emulator.
Convenience method that combines create_emulator() and train().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | DataFrame | Input data (parameter samples) | required |
| y | DataFrame | Output data (single feature/column) | required |
| emulator_type | Optional[str] | Type of emulator to create | None |
| **kwargs | | Additional parameters for emulator constructor | {} |
Returns:
| Type | Description |
|---|---|
| BaseEmulator | Trained emulator instance |
historymatching.emulators.factory.EmulatorFactory.create_emulator(X, y, emulator_type=None, **kwargs)
Create a single emulator instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | DataFrame | Input data (parameter samples) | required |
| y | DataFrame | Output data (single feature/column) | required |
| emulator_type | Optional[str] | Type of emulator to create (uses default if None) | None |
| **kwargs | | Additional parameters for emulator constructor | {} |
Returns:
| Type | Description |
|---|---|
| BaseEmulator | Configured emulator instance (not yet trained) |
Raises:
| Type | Description |
|---|---|
| ValueError | If emulator type is unknown |
| TypeError | If y has multiple columns |
historymatching.emulators.factory.EmulatorFactory.create_emulators_for_features(samples, simulation_results, features, emulator_type=None, **kwargs)
Create emulators for multiple features.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| samples | DataFrame | Parameter samples DataFrame | required |
| simulation_results | DataFrame | Simulation results DataFrame | required |
| features | List[str] | List of feature names to create emulators for | required |
| emulator_type | Optional[str] | Type of emulator to create (uses default if None) | None |
| **kwargs | | Additional parameters for emulator constructors | {} |
Returns:
| Type | Description |
|---|---|
| Dict[str, BaseEmulator] | Dict mapping feature names to trained emulator instances |
Raises:
| Type | Description |
|---|---|
| ValueError | If any feature is not found in simulation_results |
historymatching.emulators.factory.EmulatorFactory.get_default_kwargs()
Get the default parameters.
historymatching.emulators.factory.EmulatorFactory.get_default_type()
Get the default emulator type.
historymatching.emulators.factory.EmulatorFactory.get_emulator_info(emulator_type)
classmethod
Get information about an emulator type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| emulator_type | str | Name of the emulator type | required |
Returns:
| Type | Description |
|---|---|
| Dict[str, Any] | Dict with emulator information |
Raises:
| Type | Description |
|---|---|
| ValueError | If emulator type is unknown |
historymatching.emulators.factory.EmulatorFactory.register_emulator(name, emulator_class)
classmethod
Register a custom emulator type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Name to register the emulator under | required |
| emulator_class | Type[BaseEmulator] | BaseEmulator subclass to register | required |
Raises:
| Type | Description |
|---|---|
| TypeError | If emulator_class is not a BaseEmulator subclass |
historymatching.emulators.factory.EmulatorFactory.set_default_type(emulator_type)
Create new factory with different default emulator type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| emulator_type | str | New default emulator type | required |
Returns:
| Type | Description |
|---|---|
| EmulatorFactory | New EmulatorFactory instance with updated default type |
historymatching.emulators.factory.EmulatorFactory.with_defaults(**kwargs)
Create new factory with updated default parameters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| **kwargs | | Parameters to update in defaults | {} |
Returns:
| Type | Description |
|---|---|
| EmulatorFactory | New EmulatorFactory instance with updated defaults |
historymatching.emulators.factory.EmulatorFactory.with_defaults_class(default_type='gpr', **default_kwargs)
classmethod
Class method to create factory with specific defaults.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| default_type | str | Default emulator type | 'gpr' |
| **default_kwargs | | Default parameters | {} |
Returns:
| Type | Description |
|---|---|
| EmulatorFactory | EmulatorFactory instance with specified defaults |
Emulators
Base class for emulators.
Initialize the emulator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | Optional[DataFrame] | Input data. Pandas dataframe with columns representing parameter values. | None |
| y | Optional[DataFrame] | Output data. Pandas dataframe with columns representing observations and rows representing samples. Each row in this dataframe must match the corresponding row in | None |
| test_fraction | | Fraction of | 0.25 |
Returns:
| Type | Description |
|---|---|
| | None |
historymatching.emulators.base.BaseEmulator.get_hyperparameters()
abstractmethod
Return emulator hyperparameters as a JSON-serializable dict.
Subclasses should include all fitted hyperparameters relevant to understanding the emulator (e.g. lengthscales, noise variance, regression coefficients). Parameter names in the input space should use the original column names where possible.
historymatching.emulators.base.BaseEmulator.get_implausibility(x, target, target_var, model_discrepancy=0)
Get implausibility for a given set of parameters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | DataFrame | Input data. Pandas dataframe with columns representing parameter values where the implausibility metric will be evaluated. | required |
| target | | Scalar indicating the value to use as reference for the implausibility computation. This is typically extracted from observed data. | required |
| target_var | | Variance of the target point. | required |
| model_discrepancy | | Model discrepancy or variance. This parameter quantifies the discrepancy between the model output and real life data. | 0 |
Returns: Numpy array with implausibility values for each of the data points in x.
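For orientation, the implausibility measure conventionally used in history matching divides the distance between the emulator mean and the target by the square root of the summed variances. The sketch below shows that standard form for a single prediction; it assumes model_discrepancy enters as an additive variance term, as the parameter description suggests, and the function name is hypothetical rather than taken from the package.

```python
import math


def implausibility_sketch(pred_mean, pred_var, target, target_var, model_discrepancy=0.0):
    """Standard history-matching implausibility for one emulator prediction.

    I = |E[f(x)] - z| / sqrt(Var_emulator + Var_target + Var_discrepancy)
    """
    total_var = pred_var + target_var + model_discrepancy
    return abs(pred_mean - target) / math.sqrt(total_var)


# Target of 120 with standard deviation 50 (variance 2500); emulator predicts 150 +/- 10.
value = implausibility_sketch(pred_mean=150.0, pred_var=100.0, target=120.0, target_var=2500.0)
```

A parameter set is usually ruled out when this value exceeds a threshold such as 3, which matches the default threshold of the plotting methods in this class.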
historymatching.emulators.base.BaseEmulator.info()
Prints report about the emulator and its performance.
historymatching.emulators.base.BaseEmulator.plot_diagnostics()
Diagnostic plots for the trained emulator.
historymatching.emulators.base.BaseEmulator.plot_implausibility(x=None, target=0, target_var=0, model_discrepancy=0, threshold=3)
Plot implausibility for a given set of parameters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | | Input data. Pandas dataframe with columns representing parameter values where the implausibility metric will be evaluated. If not provided, both testing and training data will be used for generating the plots. | None |
| target | | Scalar indicating the value to use as reference for the implausibility computation. This is typically extracted from observed data. | 0 |
| target_var | | Variance of the target point. | 0 |
| model_discrepancy | | Model discrepancy or variance. This parameter quantifies the discrepancy between the model output and real life data. | 0 |
| threshold | | Implausibility threshold. Sets of parameters within this threshold are deemed non-implausible. | 3 |
historymatching.emulators.base.BaseEmulator.plot_predictions()
Plot the predicted and true testing values.
historymatching.emulators.base.BaseEmulator.plot_residuals()
Plot residuals of predicted vs. true testing values.
historymatching.emulators.base.BaseEmulator.plot_zscore(target=0, target_var=0, model_var=0, threshold=3)
Plot Z-score diagnostics for testing and training data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| target | | Scalar indicating the value to use as reference for the implausibility computation. This is typically extracted from observed data. | 0 |
| target_var | | Variance of the target point. | 0 |
| model_var | | Model discrepancy or variance. This parameter quantifies the discrepancy between the model output and real life data. | 0 |
| threshold | | Implausibility threshold. Sets of parameters within this threshold are deemed non-implausible. | 3 |
historymatching.emulators.base.BaseEmulator.predict(x)
abstractmethod
Predict an output using the trained emulator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | DataFrame | Input data. Pandas dataframe with columns representing parameter values. | required |
Returns:
| Type | Description |
|---|---|
| EmulationResults | EmulationResults with predicted values and uncertainty information. |
historymatching.emulators.base.BaseEmulator.print_emulator_description()
abstractmethod
Display detailed specifications (for example, emulator coefficients) for the trained emulator.
historymatching.emulators.base.BaseEmulator.test()
Tests and runs diagnostics on the trained emulator.
historymatching.emulators.base.BaseEmulator.train()
abstractmethod
Trains the emulator.
Bases: BaseEmulator
Emulator based on an ordinary least squares linear regression. The emulator fits a linear regression model to minimize the residual sum of squares between observed targets in the training data and the targets predicted by the linear approximation.
Initialize the emulator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | Optional[DataFrame] | Input data. Pandas dataframe with columns representing parameter values. | None |
| y | Optional[DataFrame] | Output data. Pandas dataframe with columns representing observations and rows representing samples. Each row in this dataframe must match the corresponding row in | None |
| test_fraction | float | Fraction of | 0.25 |
Returns:
| Type | Description |
|---|---|
| None | None |
historymatching.emulators.linear.LinearModel.get_hyperparameters()
Return linear model hyperparameters as a JSON-serializable dict.
historymatching.emulators.linear.LinearModel.predict(x)
Predict an output using the trained emulator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | DataFrame | Input data. Pandas dataframe with columns representing parameter values. | required |
Returns:
| Type | Description |
|---|---|
| EmulationResults | EmulationResults with predicted values and uncertainty intervals. |
historymatching.emulators.linear.LinearModel.print_emulator_description()
Display detailed specifications (for example, emulator coefficients) for the trained emulator.
historymatching.emulators.linear.LinearModel.train()
Fits a linear regression model to minimize the residual sum of squares between observed targets in the training data and the targets predicted by the linear approximation.
Bases: BaseEmulator
Generalized Linear Model (GLM) emulator.
Initialize the emulator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | Optional[DataFrame] | Input data. Pandas dataframe with columns representing parameter values. | None |
| y | Optional[DataFrame] | Output data. Pandas dataframe with columns representing observations and rows representing samples. Each row in this dataframe must match the corresponding row in | None |
| test_fraction | float | Fraction of | 0.25 |
| link | | Link function for the GLM model. It can be either 'linear' or 'poisson'. | 'linear' |
Returns:
| Type | Description |
|---|---|
| None | None |
historymatching.emulators.glm.GLM.get_hyperparameters()
Return GLM hyperparameters as a JSON-serializable dict.
historymatching.emulators.glm.GLM.predict(x)
Predict an output using the trained emulator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | DataFrame | Input data. Pandas dataframe with columns representing parameter values. | required |
Returns:
| Type | Description |
|---|---|
| EmulationResults | EmulationResults with predicted values and uncertainty intervals. |
historymatching.emulators.glm.GLM.print_emulator_description()
Display detailed specifications (for example, emulator coefficients) for the trained emulator.
historymatching.emulators.glm.GLM.train()
Fits a Generalised Linear Model.
Bases: BaseEmulator
Gaussian Process Regression emulator implemented in GPFlow.
Initialise the Gaussian Process Regression (GPR) emulator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| x | Optional[DataFrame] | Input data. Pandas dataframe with columns representing parameter values. | None |
| y | Optional[DataFrame] | Output data. Pandas dataframe with columns representing observations and rows representing samples. Each row in this dataframe must match the corresponding row in | None |
| test_fraction | | Fraction of | 0.25 |
Returns:
| Type | Description |
|---|---|
| | None |
historymatching.emulators.gpr.GPR.get_hyperparameters()
Return GPR hyperparameters as a JSON-serializable dict.
historymatching.emulators.gpr.GPR.predict(x)
Predict an output using the trained emulator.
historymatching.emulators.gpr.GPR.print_emulator_description()
Display detailed specifications (for example, emulator coefficients) for the trained emulator.
historymatching.emulators.gpr.GPR.train()
Fits a GPR model.
Inputs are normalized to [0, 1] and outputs are standardized (zero mean, unit variance) before training. This ensures:
- Kernel lengthscales are comparable across parameters with very different physical scales (e.g. 0.0005–0.006 vs 1.0–3.0).
- The optimizer isn't confused by large output offsets (e.g. birth weight ~ 3000 g with a signal of a few grams).
Predictions are un-standardized automatically in predict().
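The scaling round trip described above can be sketched in a few lines. This is an illustration under stated assumptions, not the package's code: the helper names are hypothetical, the population standard deviation is used, and inputs are scaled against explicit bounds, whereas the package may use a different convention for either.

```python
import statistics


def minmax_scale(values, low, high):
    """Map inputs from [low, high] onto [0, 1]."""
    return [(v - low) / (high - low) for v in values]


def standardize(values):
    """Shift and scale outputs to zero mean and unit variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values], mean, std


def unstandardize(z_mean, z_std, mean, std):
    """Map a prediction made in standardized units back to the original units."""
    return z_mean * std + mean, z_std * std


# A parameter on a tiny physical scale and an output with a large offset.
beta_scaled = minmax_scale([0.0005, 0.003, 0.006], low=0.0005, high=0.006)
y_scaled, y_mean, y_std = standardize([2998.0, 3000.0, 3002.0])
pred_mean, pred_std = unstandardize(0.5, 0.1, y_mean, y_std)
```

Note that the predictive standard deviation is multiplied by the output scale only; the mean offset applies to the predicted mean alone.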
Bases: BaseEmulator
Bayes Linear emulator with OLS trend and squared-exponential residual correlation.
historymatching.emulators.bayes_linear.BayesLinear.get_hyperparameters()
Return Bayes Linear hyperparameters as a JSON-serializable dict.
historymatching.emulators.bayes_linear.BayesLinear.predict(x)
Predict using the trained Bayes Linear emulator.
historymatching.emulators.bayes_linear.BayesLinear.print_emulator_description()
Display Bayes Linear emulator specifications.
historymatching.emulators.bayes_linear.BayesLinear.train()
Train the Bayes Linear emulator.
Steps:
1. Normalize inputs to [0, 1], standardize outputs.
2. OLS regression for the trend.
3. Optimize correlation lengths theta via concentrated log-likelihood.
4. Pre-compute Cholesky factor and weight vector for fast prediction.
Standardized container for emulator prediction results.
Provides clean access to mean and standard deviation predictions, with optional additional data for emulator-specific outputs.
Initialize emulation results.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mean | Union[ndarray, Series] | Predicted means (required) | required |
| std | Union[ndarray, Series] | Predicted standard deviations (required) | required |
| additional_data | Optional[DataFrame] | Optional DataFrame with other results (CI, etc.) | None |
historymatching.emulators.results.EmulationResults.__len__()
Number of predictions.
historymatching.emulators.results.EmulationResults.__repr__()
String representation.
historymatching.emulators.results.EmulationResults.get_additional_data()
Get additional emulator-specific data.
historymatching.emulators.results.EmulationResults.get_ci(confidence_level=0.95)
Get confidence intervals assuming normal distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| confidence_level | float | Confidence level (0.95 = 95%) | 0.95 |
Returns:
| Type | Description |
|---|---|
| Tuple[Series, Series] | Tuple of (lower_bound, upper_bound) Series |
historymatching.emulators.results.EmulationResults.get_mean()
Get predicted means.
historymatching.emulators.results.EmulationResults.get_std()
Get predicted standard deviations.
historymatching.emulators.results.EmulationResults.get_variance()
Get predicted variances (computed from std).
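Under the normality assumption stated for get_ci, a two-sided interval is the mean plus or minus a standard-normal quantile times the standard deviation. The sketch below shows that calculation with the standard library; `normal_ci` is a hypothetical helper operating on plain lists, not the method itself.

```python
from statistics import NormalDist


def normal_ci(means, stds, confidence_level=0.95):
    """Two-sided normal interval: mean +/- z * std for each prediction."""
    # For 0.95 the quantile is taken at 0.975, giving z of roughly 1.96.
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    lower = [m - z * s for m, s in zip(means, stds)]
    upper = [m + z * s for m, s in zip(means, stds)]
    return lower, upper


lower, upper = normal_ci([10.0, 20.0], [1.0, 2.0])
variances = [s ** 2 for s in [1.0, 2.0]]  # variance is the squared standard deviation
```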
NROY sampling
Generate NROY parameter samples filtered through all emulators.
Parameters:
| Name | Type | Description |
|---|---|---|
| n_points | int | Target number of NROY samples to return. |
| parameter_space | ParameterSpace | Bounds for each parameter. |
| emulator_bank | EmulatorBank | All trained emulators (across waves) used for filtering. |
| observations | ObservationData | Target values and uncertainties. |
| threshold | float | Implausibility threshold (default 3.5). |
| sampling_strategy | SamplingStrategy, optional | Strategy for generating initial LHS candidates. Defaults to LHS maximin. |
| method | str | 'auto' (default) runs LHS rejection and escalates to the ray + importance-sampling pipeline only if LHS underfills; 'lhs' does pure rejection sampling; 'ray' seeds with a small LHS draw then goes straight to ray + importance sampling. |
| seed | int, optional | Random seed for reproducibility. |
Returns:
| Type | Description |
|---|---|
| pd.DataFrame | Up to n_points rows of NROY parameter samples. |
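The pure rejection step can be sketched as follows. This is a simplified stand-in, not the package's routine: it draws plain uniform candidates in place of LHS candidates, omits the ray + importance-sampling escalation entirely, and represents each emulator as a hypothetical function that returns an implausibility for a parameter dict.

```python
import random


def rejection_nroy_sketch(bounds, implausibility_fns, n_points, threshold=3.5,
                          batch_size=1000, max_batches=50, seed=None):
    """Keep uniformly drawn candidates that every emulator finds non-implausible."""
    rng = random.Random(seed)
    kept = []
    for _ in range(max_batches):
        for _ in range(batch_size):
            point = {name: rng.uniform(low, high) for name, (low, high) in bounds.items()}
            # A point is NROY only if its worst-case implausibility is under the threshold.
            if max(fn(point) for fn in implausibility_fns) < threshold:
                kept.append(point)
                if len(kept) == n_points:
                    return kept
    return kept  # may hold fewer than n_points when the NROY region is very small


def toy_implausibility(point):
    """Toy stand-in for an emulator: the target is beta + gamma = 2 with std 0.1."""
    return abs(point['beta'] + point['gamma'] - 2.0) / 0.1


nroy = rejection_nroy_sketch(
    bounds={'beta': (0.5, 3.0), 'gamma': (0.1, 1.0)},
    implausibility_fns=[toy_implausibility],
    n_points=100,
    seed=0,
)
```

Rejection sampling becomes wasteful once the NROY region is a tiny fraction of the prior box, which is the situation the 'auto' and 'ray' methods are described as addressing.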
Plotting
The historymatching.plotting module provides composable plotting functions (each
returns Matplotlib axes and accepts an ax=/axes= argument). The most commonly
used are re-exported at the top level, e.g. historymatching.plot_pairplot.
Reusable plotting helpers for history matching.
This module collects the figures that recur in every history-matching analysis — NROY parameter clouds, convergence curves, posterior marginals, emulator-quality summaries, z-scores against targets, and ensemble fan plots — into a single set of composable functions.
Design conventions (so the figures behave predictably in notebooks, scripts, and the engine's on-disk output alike):
- Every function takes primitive data (DataFrames, dicts, arrays) rather than history-matching objects, so there are no import cycles and the functions can be reused anywhere.
- Every function accepts an ax (or axes) argument. When omitted a new figure is created; when supplied the function draws into the caller's axes so plots can be composed into larger grids.
- Every function returns the Matplotlib Axes (or array of axes) it drew into. Nothing calls plt.show() or savefig; the caller decides whether to display, save, or further customise the result.
The thin plot_* methods on HistoryMatchingEngine, IterationResult, and the domain objects all delegate here.
historymatching.plotting.marginal_variance_reduction(samples, bounds)
Per-parameter marginal variance reduction vs a uniform prior.
Simpler than the PCA version: compares each parameter's marginal variance in the NROY cloud to the prior variance. Useful for ranking which parameters to show in a pairplot.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| samples | DataFrame | NROY parameter samples. | required |
| bounds | Dict[str, Tuple[float, float]] | | required |
Returns:
| Type | Description |
|---|---|
| Dict[str, float] | |
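The comparison against a uniform prior can be sketched using the fact that a Uniform(low, high) variable has variance (high - low)² / 12. The helper below is hypothetical and takes plain lists rather than a DataFrame; reporting the result as one minus the variance ratio is an assumption about how the reduction is expressed.

```python
import statistics


def marginal_variance_reduction_sketch(samples, bounds):
    """Compare each parameter's sample variance with its uniform-prior variance.

    samples maps name -> list of values; bounds maps name -> (low, high).
    """
    reduction = {}
    for name, (low, high) in bounds.items():
        prior_var = (high - low) ** 2 / 12  # variance of Uniform(low, high)
        sample_var = statistics.pvariance(samples[name])
        # Assumed convention: 0 means no tightening, values near 1 mean strongly constrained.
        reduction[name] = 1.0 - sample_var / prior_var
    return reduction


result = marginal_variance_reduction_sketch(
    samples={'beta': [1.0, 1.1, 1.2, 0.9, 0.8]},
    bounds={'beta': (0.5, 3.0)},
)
```

Sorting parameters by this value gives a simple ranking of which ones are most worth showing in a pairplot.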
historymatching.plotting.plot_constrained_dims(samples, bounds, *, n_top=5, title='Constrained directions', axes=None)
Plot the directions in parameter space that history matching constrained.
The top panel shows the variance-reduction spectrum (most-constrained
principal components first). Each following panel shows the loadings of one
top component — which parameters combine to form that constrained direction
(bar height = |loading|, red = positive, blue = negative).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| samples | DataFrame | NROY parameter samples. | required |
| bounds | Dict[str, Tuple[float, float]] | | required |
| n_top | int | Number of most-constrained components to detail. | 5 |
| title | str | Figure suptitle. | 'Constrained directions' |
| axes | Optional[ndarray] | Existing axes array (length | None |
Returns:
| Type | Description |
|---|---|
| ndarray | The array of |
historymatching.plotting.plot_convergence(iterations, fractions, *, ax=None, log=True, title='NROY convergence')
Plot the non-implausible (NROY) fraction at each wave.
The NROY fraction is the share of fresh prior samples that survive all emulator constraints accumulated so far. It should fall monotonically as waves add constraints; a plateau signals convergence (or an over-constrained model if it collapses toward zero).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| iterations | Sequence[int] | Wave numbers (x-axis). | required |
| fractions | Sequence[float] | NROY fraction for each wave, in | required |
| ax | Optional[Axes] | Existing axes to draw into; a new figure is created if omitted. | None |
| log | bool | Use a logarithmic y-axis (recommended, since fractions span orders of magnitude as constraints tighten). | True |
| title | str | Axes title. | 'NROY convergence' |
Returns:
| Type | Description |
|---|---|
| Axes | The Matplotlib |
historymatching.plotting.plot_emulator_quality(quality, *, ax=None)
Bar chart of per-feature emulator R² (a quick fit-quality overview).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| quality | Dict[str, Dict[str, float]] | | required |
| ax | Optional[Axes] | Existing axes to draw into. | None |
Returns:
| Type | Description |
|---|---|
| Axes | The Matplotlib |
historymatching.plotting.plot_ensemble_fan(trajectories, *, observed=None, x=None, ax=None, ci=0.95, member_color='#888888', mean_color=NROY_COLOR, obs_color=TRUTH_COLOR, show_members=True, show_mean=True, show_band=True, xlabel='Index', ylabel='Value', title='Ensemble vs observed')
Fan / spaghetti plot of an ensemble of trajectories against observed data.
A posterior-predictive check: re-run the simulator at NROY parameter sets, pass the resulting trajectories here, and compare their spread to the observed series. Model-agnostic — works for any ensemble of equal-length vectors.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| trajectories | Union[ndarray, Sequence[Sequence[float]], DataFrame] | 2-D array/DataFrame/list of shape | required |
| observed | Optional[Sequence[float]] | Optional observed series (length | None |
| x | Optional[Sequence[float]] | Optional x-axis values (defaults to | None |
| ax | Optional[Axes] | Existing axes to draw into. | None |
| ci | float | Central probability mass for the shaded band (e.g. | 0.95 |
| member_color | str | Colour of individual trajectory lines. | '#888888' |
| mean_color | str | Colour of the ensemble mean and band. | NROY_COLOR |
| obs_color | str | Colour of the observed series. | TRUTH_COLOR |
| show_members | bool | Draw each member as a faint line. | True |
| show_mean | bool | Draw the ensemble mean. | True |
| show_band | bool | Shade the central | True |
| xlabel | str | X-axis label. | 'Index' |
| ylabel | str | Y-axis label. | 'Value' |
| title | str | Axes title. | 'Ensemble vs observed' |
Returns:
| Type | Description |
|---|---|
| Axes | The Matplotlib |
historymatching.plotting.plot_marginals(samples, params=None, *, truth=None, bounds=None, prior=None, show_median=True, bins=25, ncols=None, axes=None, color=NROY_COLOR)
Plot a marginal histogram for each parameter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| samples | DataFrame | DataFrame of parameter samples (e.g. an NROY cloud). | required |
| params | Optional[Sequence[str]] | Which columns to plot; defaults to all numeric columns. | None |
| truth | Optional[Dict[str, float]] | Optional | None |
| bounds | Optional[Dict[str, Tuple[float, float]]] | Optional | None |
| prior | Optional[DataFrame] | Optional second sample set drawn faintly behind (e.g. the prior or first wave) to show how the marginal tightened. | None |
| show_median | bool | Draw a solid line at each parameter's sample median. | True |
| bins | int | Histogram bin count. | 25 |
| ncols | Optional[int] | Columns in the subplot grid; defaults to | None |
| axes | Optional[ndarray] | Existing axes array to draw into; a new figure is created if omitted. | None |
| color | str | Histogram colour. | NROY_COLOR |
Returns:
| Type | Description |
|---|---|
| ndarray | A flat NumPy array of the |
historymatching.plotting.plot_pairplot(samples, params=None, *, truth=None, prior=None, bounds=None, max_params=8, bins=25, s=8, color=NROY_COLOR, axes=None, title=None)
Corner plot of a parameter cloud: marginals on the diagonal, pairwise scatter below it.
This is the canonical history-matching output — the shape of the
non-implausible (NROY) region. Pass truth to overlay known values and
prior to show the cloud you started from.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| samples | DataFrame | DataFrame of parameter samples (foreground cloud). | required |
| params | Optional[Sequence[str]] | Columns to show; defaults to all numeric columns, capped at | None |
| truth | Optional[Dict[str, float]] | | None |
| prior | Optional[DataFrame] | Optional background cloud (e.g. prior or first-wave samples). | None |
| bounds | Optional[Dict[str, Tuple[float, float]]] | | None |
| max_params | int | Cap on the number of parameters shown (keeps the grid readable for high-dimensional problems). | 8 |
| bins | int | Diagonal histogram bins. | 25 |
| s | float | Scatter marker size. | 8 |
| color | str | Foreground cloud colour. | NROY_COLOR |
| axes | Optional[ndarray] | Existing | None |
| title | Optional[str] | Optional figure suptitle. | None |
Returns:
| Type | Description |
|---|---|
| ndarray | The 2-D NumPy array of |
historymatching.plotting.plot_parameter_bounds(bounds, *, reference=None, ax=None)
Plot parameter bounds as horizontal ranges, optionally vs a reference.
Bounds are normalised to each reference range so shrinkage is visible on
one axis. Without a reference, raw widths are shown (each on its own scale
label).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| bounds | Dict[str, Tuple[float, float]] | | required |
| reference | Optional[Dict[str, Tuple[float, float]]] | Optional | None |
| ax | Optional[Axes] | Existing axes to draw into. | None |
Returns:
| Type | Description |
|---|---|
| Axes | The Matplotlib Axes. |
historymatching.plotting.plot_predicted_vs_actual(y_true, y_pred, *, ax=None, r2=None, mse=None, n_train=None, title='Predicted vs actual')
Scatter of emulator predictions against true simulator outputs.
Points hugging the dashed 1:1 line indicate a good fit.
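The r2 and mse arguments are annotations supplied by the caller. A minimal sketch of computing them with NumPy follows; the helper fit_metrics and the numbers are made up for illustration, and the standard coefficient-of-determination formula is an assumption about which definition of R² you want to report.

```python
import numpy as np


def fit_metrics(y_true, y_pred):
    """Return (R^2, MSE) for emulator predictions against held-out outputs."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    mse = float(np.mean(residuals ** 2))
    ss_res = float(np.sum(residuals ** 2))                  # unexplained variation
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))   # total variation
    r2 = 1.0 - ss_res / ss_tot
    return r2, mse


# Made-up held-out simulator outputs and emulator predictions.
y_true = [100.0, 120.0, 140.0, 160.0]
y_pred = [102.0, 118.0, 143.0, 157.0]

r2, mse = fit_metrics(y_true, y_pred)
# The two values can then be passed as the r2= and mse= keyword arguments.
```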
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| y_true | Sequence[float] | True (held-out) simulator outputs. | required |
| y_pred | Sequence[float] | Emulator predictions for the same points. | required |
| ax | Optional[Axes] | Existing axes to draw into. | None |
| r2 | Optional[float] | Optional R² to annotate in the title. | None |
| mse | Optional[float] | Optional MSE to annotate in the title. | None |
| n_train | Optional[int] | Optional training-set size to annotate. | None |
| title | str | Base title. | 'Predicted vs actual' |
Returns:
| Type | Description |
|---|---|
| Axes | The Matplotlib Axes. |
historymatching.plotting.plot_targets(targets, *, ax=None)
Plot observation targets as means with ±1σ error bars.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| targets | Dict[str, Tuple[float, float]] | | required |
| ax | Optional[Axes] | Existing axes to draw into. | None |
Returns:
| Type | Description |
|---|---|
| Axes | The Matplotlib Axes. |
historymatching.plotting.plot_zscores_vs_targets(waves, targets, *, ax=None, threshold=3.5)
Plot standardised simulation outputs against every observation target.
For each target the band shows (simulated - target_mean) / target_std
across the wave's samples: a thick bar for the inter-quartile range, a thin
line for the 5th–95th percentile, and a dot at the median. Outputs inside
the green ±threshold band are consistent with the target; bands drifting
toward zero across waves show the calibration converging.
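The summary statistics behind each band can be sketched with NumPy. This is a hedged illustration rather than the function's actual code: zscore_summary is a hypothetical helper, the simulated values are made up, and the (120.0, 50.0) target reuses the peak_incidence observation from the constructor example.

```python
import numpy as np


def zscore_summary(simulated, target_mean, target_std):
    """Standardise outputs against one target and summarise their spread."""
    z = (np.asarray(simulated, dtype=float) - target_mean) / target_std
    p5, q1, median, q3, p95 = np.percentile(z, [5, 25, 50, 75, 95])
    return {"p5": float(p5), "q1": float(q1), "median": float(median),
            "q3": float(q3), "p95": float(p95)}


# Made-up simulator outputs for one wave, spread evenly around the target.
simulated_peaks = np.linspace(70.0, 170.0, 101)

summary = zscore_summary(simulated_peaks, target_mean=120.0, target_std=50.0)

# True when the 5th-95th percentile line sits inside the +/-3.5 sigma band.
within_band = bool(abs(summary["p5"]) <= 3.5 and abs(summary["p95"]) <= 3.5)
```

Here q1 and q3 bound the thick inter-quartile bar, p5 and p95 the thin line, and median the dot.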
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| waves | List[dict] | List of | required |
| targets | Dict[str, Tuple[float, float]] | | required |
| ax | Optional[Axes] | Existing axes to draw into. | None |
| threshold | float | Half-width of the shaded acceptance band (in sigma). | 3.5 |
Returns:
| Type | Description |
|---|---|
| Axes | The Matplotlib Axes. |
historymatching.plotting.variance_reduction(samples, bounds)
PCA-based variance reduction of an NROY cloud vs a uniform prior.
Samples are normalised to [0, 1]^d using the prior bounds and PCA is
fit. Per principal component, reduction = 1 - NROY_var / prior_var
where the prior (uniform) variance is 1/12: 0 means as wide as the
prior, 1 means fully collapsed.
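The calculation above can be sketched with NumPy alone. This is an assumption-laden illustration, not the package's code: pca_variance_reduction is a hypothetical helper, samples are passed as a dict of arrays rather than a DataFrame, PCA is done by eigendecomposition of the sample covariance, and only the per-component reductions are returned.

```python
import numpy as np


def pca_variance_reduction(samples, bounds):
    """Per-principal-component reduction = 1 - NROY_var / (1/12)."""
    names = list(bounds)
    low = np.array([bounds[n][0] for n in names], dtype=float)
    high = np.array([bounds[n][1] for n in names], dtype=float)
    raw = np.column_stack([np.asarray(samples[n], dtype=float) for n in names])
    unit = (raw - low) / (high - low)                  # map onto [0, 1]^d
    cov = np.atleast_2d(np.cov(unit, rowvar=False))
    component_var = np.linalg.eigvalsh(cov)[::-1]      # largest variance first
    return 1.0 - component_var / (1.0 / 12.0)


rng = np.random.default_rng(0)
bounds = {"beta": (0.5, 3.0), "gamma": (0.1, 1.0)}

# A cloud that still fills the prior box, and one squeezed to 10% of each range.
prior_like = {"beta": rng.uniform(0.5, 3.0, 20000),
              "gamma": rng.uniform(0.1, 1.0, 20000)}
narrow = {"beta": rng.uniform(1.0, 1.25, 20000),
          "gamma": rng.uniform(0.4, 0.49, 20000)}

reduction_prior = pca_variance_reduction(prior_like, bounds)
reduction_narrow = pca_variance_reduction(narrow, bounds)
```

The prior-filling cloud gives reductions close to 0 on every component, while the narrow cloud gives values close to 1. Because sample variances fluctuate, a component can come out slightly negative when the cloud is as wide as the prior.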
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| samples | DataFrame | NROY parameter samples. | required |
| bounds | Dict[str, Tuple[float, float]] | | required |
Returns:
| Type | Description |
|---|---|
| Tuple[ndarray, ndarray, List[str]] | |