fpsim.experiment module
Define classes and functions for the Experiment class (running sims and comparing them to data)
- class fpsim.experiment.Experiment(pars=None, flags=None, label=None)
Bases: sciris.sc_utils.prettyobj
Class for running calibration to data
- init_dhs_data()
Assign the data points of interest in the DHS dictionary for Senegal data. All data are from 2018 unless otherwise indicated. Adjust the data for a different year or country.
- extract_dhs_data()
- pop_growth_rate(years, population)
- initialize()
- post_process_sim()
- run_model(pars=None, mother_ids=False)
Create the sim and run the model
- extract_model()
- model_pop_size()
- model_mcpr()
- model_mmr()
Calculate maternal mortality in the model over the most recent 3 years
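As a rough illustration of the calculation described above, the sketch below computes a maternal mortality ratio (maternal deaths per 100,000 live births) over the last three years of results. The function name, the monthly timestep, and the input arrays are assumptions made for this example; they are not taken from the FPsim source.

```python
import numpy as np

def mmr_last_3_years(maternal_deaths, births, steps_per_year=12):
    """Maternal deaths per 100,000 live births over the final 3 years (hypothetical helper)."""
    window = 3 * steps_per_year  # Number of timesteps in the most recent 3 years
    deaths = np.sum(np.asarray(maternal_deaths)[-window:])
    live_births = np.sum(np.asarray(births)[-window:])
    if live_births == 0:
        return float('nan')  # The ratio is undefined without any births
    return float(deaths) / float(live_births) * 100_000

# Illustrative inputs: 5 years of monthly counts (made-up numbers)
births = np.full(60, 200)
maternal_deaths = np.zeros(60)
maternal_deaths[-36:] = 1  # One maternal death per month in the last 3 years
print(mmr_last_3_years(maternal_deaths, births))
```

Restricting both sums to the same trailing window keeps the numerator and denominator comparable, so earlier years do not dilute the estimate.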
- model_infant_mortality_rate()
- model_crude_death_rate()
- model_crude_birth_rate()
- model_data_tfr()
- extract_skyscrapers()
- extract_birth_spacing()
- extract_methods()
- extract_age_pregnancy()
- compute_fit(*args, **kwargs)
Compute how good the fit is
- post_process_results(keep_people=False, compute_fit=True, **kwargs)
Compare the model and the data
- run(pars=None, keep_people=False, compute_fit=True, **kwargs)
Run the model and post-process the results
- compare()
Create and print a comparison between model and data
- summarize(as_df=False)
Convert results to a one-number-per-key summary format. Returns summary, also saves to self.summary.
- Parameters
as_df (bool) – if True, return a dataframe instead of a dict.
- to_json(filename=None, tostring=False, indent=2, verbose=False, **kwargs)
Export results as JSON.
- Parameters
filename (str) – if None, return the output (see tostring); else, write to file
tostring (bool) – if not writing to file, whether to return a string (the alternative is a sanitized dictionary)
indent (int) – if writing to file, how many indents to use per nested level
verbose (bool) – detail to print
kwargs (dict) – passed to savejson()
- Returns
A Unicode string containing a JSON representation of the results; if a filename is given, the JSON is written to disk instead
Examples:
    json = calib.to_json()
    calib.to_json('results.json')
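The interaction between filename and tostring can be sketched with the standard library alone. The helper below, export_json, is a hypothetical stand-in written for this example; it is not the FPsim method and does not call savejson().

```python
import json
import os
import tempfile

def export_json(results, filename=None, tostring=False, indent=2):
    """Hypothetical stand-in for the export behaviour described above."""
    if filename is not None:
        with open(filename, 'w') as f:
            json.dump(results, f, indent=indent)  # Write the JSON to disk
        return None
    if tostring:
        return json.dumps(results, indent=indent)  # Return a JSON string
    return results  # Return the (already JSON-compatible) dictionary

summary = {'pop_size': 1000, 'mcpr': 0.25}  # Made-up values
as_dict = export_json(summary)
as_string = export_json(summary, tostring=True)

# Write to a temporary file and read it back
path = os.path.join(tempfile.mkdtemp(), 'results.json')
export_json(summary, filename=path)
with open(path) as f:
    reloaded = json.load(f)
```

In this sketch a filename always takes precedence, so tostring only matters when nothing is written to disk.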
- plot(axes_args=None, do_maximize=True, do_show=True)
Plot the model against the data
- class fpsim.experiment.Fit(data, sim, weights=None, keys=None, custom=None, compute=True, verbose=False, **kwargs)
Bases: sciris.sc_utils.prettyobj
A class for calculating the fit between the model and the data. Note the following terminology is used here:
fit: nonspecific term for how well the model matches the data
difference: the absolute numerical differences between the model and the data (one time series per result)
goodness-of-fit: the result of passing the difference through a statistical function, such as mean squared error
loss: the goodness-of-fit for each result multiplied by user-specified weights (one time series per result)
mismatches: the sum of all the losses (a single scalar value per time series)
mismatch: the sum of the mismatches – this is the value to be minimized during calibration
- Parameters
sim (Sim) – the sim object
weights (dict) – the relative weight to place on each result (by default: 10 for deaths, 5 for diagnoses, 1 for everything else)
keys (list) – the keys to use in the calculation
custom (dict) – a custom dictionary of additional data to fit; format is e.g. {'my_output':{'data':[1,2,3], 'sim':[1,2,4], 'weights':2.0}}
compute (bool) – whether to compute the mismatch immediately
verbose (bool) – detail to print
kwargs (dict) – passed to compute_gof() – see this function for more detail on goodness-of-fit calculation options
Example:
    sim = cv.Sim()
    sim.run()
    fit = sim.compute_fit()
    fit.plot()
- compute()
Perform all required computations
- reconcile_inputs(verbose=False)
Find matching keys and indices between the model and the data
- compute_diffs(absolute=False)
Find the differences between the sim and the data
- compute_gofs(**kwargs)
Compute the goodness-of-fit
- compute_losses()
Compute the weighted goodness-of-fit
- compute_mismatch(use_median=False)
Compute the final mismatch
- plot(keys=None, width=0.8, font_size=18, fig_args=None, axis_args=None, plot_args=None, do_show=True)
Plot the fit of the model to the data. For each result, plot the data and the model; the difference; and the loss (weighted difference). Also plots the loss as a function of time.
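The chain from difference to goodness-of-fit to loss to mismatch, as defined in the Fit terminology, can be sketched in plain NumPy. This is a minimal illustration under assumed inputs: the result keys, values, and weights are made up, and normalized absolute error stands in for the goodness-of-fit function. It is not the Fit class itself.

```python
import numpy as np

# Made-up data and model outputs for two results
data = {'pop_size': np.array([100.0, 110.0, 120.0]), 'mcpr': np.array([0.20, 0.22, 0.25])}
sim = {'pop_size': np.array([100.0, 115.0, 126.0]), 'mcpr': np.array([0.20, 0.20, 0.30])}
weights = {'pop_size': 2.0, 'mcpr': 1.0}

diffs, gofs, losses, mismatches = {}, {}, {}, {}
for key in data:
    diffs[key] = data[key] - sim[key]  # Difference: one series per result
    scale = max(data[key].max(), sim[key].max())
    gofs[key] = np.abs(diffs[key]) / scale  # Goodness-of-fit: normalized absolute error
    losses[key] = gofs[key] * weights[key]  # Loss: weighted goodness-of-fit
    mismatches[key] = losses[key].sum()  # Mismatches: one scalar per result

mismatch = sum(mismatches.values())  # Mismatch: the single value to minimize
print(mismatches, mismatch)
```

Keeping each stage in its own dictionary mirrors the separate compute_diffs, compute_gofs, compute_losses, and compute_mismatch steps, which makes it easy to inspect where a poor fit comes from.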
- fpsim.experiment.compute_gof(actual, predicted, normalize=True, use_frac=False, use_squared=False, as_scalar='none', eps=1e-09, skestimator=None, **kwargs)
Calculate the goodness of fit. By default use normalized absolute error, but highly customizable. For example, mean squared error is equivalent to setting normalize=False, use_squared=True, as_scalar='mean'.
- Parameters
actual (arr) – array of actual (data) points
predicted (arr) – corresponding array of predicted (model) points
normalize (bool) – whether to divide the values by the largest value in either series
use_frac (bool) – convert to fractional mismatches rather than absolute
use_squared (bool) – square the mismatches
as_scalar (str) – return as a scalar instead of a time series: choices are sum, mean, median (the default, 'none', returns the full series)
eps (float) – to avoid divide-by-zero
skestimator (str) – if provided, use this scikit-learn estimator instead
kwargs (dict) – passed to the scikit-learn estimator
- Returns
array of goodness-of-fit values, or a single value if as_scalar is sum, mean, or median
- Return type
gofs (arr)
Examples:
    import numpy as np
    from fpsim.experiment import compute_gof
    x1 = np.cumsum(np.random.random(100))
    x2 = np.cumsum(np.random.random(100))
    e1 = compute_gof(x1, x2) # Default, normalized absolute error
    e2 = compute_gof(x1, x2, normalize=False, use_frac=True) # Fractional error
    e3 = compute_gof(x1, x2, normalize=False, use_squared=True, as_scalar='mean') # Mean squared error
    e4 = compute_gof(x1, x2, skestimator='mean_squared_error') # Scikit-learn's MSE method
    e5 = compute_gof(x1, x2, as_scalar='median') # Normalized median absolute error -- highly robust
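To make the options concrete, here is a simplified NumPy sketch that follows the parameter descriptions above. The function gof_sketch is written for this example only: how normalize and use_frac interact, and the exact use of eps, are assumptions, and the scikit-learn path is omitted. It should not be read as the library's implementation.

```python
import numpy as np

def gof_sketch(actual, predicted, normalize=True, use_frac=False,
               use_squared=False, as_scalar='none', eps=1e-9):
    """Simplified goodness-of-fit following the documented options (not the library code)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    gofs = np.abs(actual - predicted)  # Absolute mismatch at each point

    if normalize and not use_frac:
        largest = max(actual.max(), predicted.max())
        if largest > 0:
            gofs = gofs / largest  # Scale by the largest value in either series
    if use_frac:
        # Fractional mismatch at each point; eps avoids division by zero
        gofs = gofs / (np.maximum(actual, predicted) + eps)
    if use_squared:
        gofs = gofs ** 2

    reducers = {'sum': np.sum, 'mean': np.mean, 'median': np.median}
    return reducers[as_scalar](gofs) if as_scalar in reducers else gofs

actual = np.array([1.0, 2.0, 4.0])
predicted = np.array([1.0, 3.0, 2.0])
default = gof_sketch(actual, predicted)  # Normalized absolute error, one value per point
mse = gof_sketch(actual, predicted, normalize=False, use_squared=True, as_scalar='mean')
```

With normalize=False, use_squared=True, and as_scalar='mean', the sketch reduces to the ordinary mean squared error, matching the equivalence stated in the description.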
- fpsim.experiment.datapath(path)
Return the path of the parent folder
- fpsim.experiment.diff_summaries(sim1, sim2, skip_key_diffs=False, output=False, die=False)
Compute the difference of the summaries of two FPsim calibration objects, and print any values which differ.
- Parameters
sim1 (sim/dict) – the calib.summary dictionary, representing a single sim
sim2 (sim/dict) – ditto
skip_key_diffs (bool) – whether to skip keys that don’t match between sims
output (bool) – whether to return the output as a string (otherwise print)
die (bool) – whether to raise an exception if the sims don’t match
require_run (bool) – require that the simulations have been run
Example:
    import fpsim as fp
    c1 = fp.Calibration()
    c2 = fp.Calibration()
    c1.run()
    c2.run()
    fp.diff_summaries(c1.summarize(), c2.summarize())
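The comparison described above can be approximated with ordinary dictionaries. The helper diff_dicts below is hypothetical and written only for this example; its report format and its handling of skip_key_diffs and die are assumptions, not the behaviour of diff_summaries.

```python
def diff_dicts(s1, s2, skip_key_diffs=False, die=False):
    """Hypothetical helper: report values that differ between two summary dicts."""
    lines = []
    if not skip_key_diffs:
        for key in sorted(set(s1) ^ set(s2)):  # Keys present in only one summary
            lines.append(f'Key "{key}" is missing from one summary')
    for key in sorted(set(s1) & set(s2)):  # Keys present in both summaries
        if s1[key] != s2[key]:
            lines.append(f'{key}: {s1[key]} vs {s2[key]}')
    report = '\n'.join(lines)
    if report and die:
        raise ValueError(report)  # Optionally treat any mismatch as an error
    return report

a = {'pop_size': 1000, 'mcpr': 0.25, 'tfr': 4.5}  # Made-up summaries
b = {'pop_size': 1000, 'mcpr': 0.27}
report = diff_dicts(a, b)
print(report)
```

Returning the report as a string, rather than only printing it, makes the same function usable both interactively and inside regression tests.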