fpsim.experiment module

Define classes and functions for the Experiment class (running sims and comparing them to data)

class fpsim.experiment.Experiment(pars=None, flags=None, label=None)

Bases: sciris.sc_utils.prettyobj

Class for running calibration to data

init_dhs_data()

Assign data points of interest in DHS dictionary for Senegal data. All data are from 2018 unless otherwise indicated. Adjust the data for a different year or country.

extract_dhs_data()
pop_growth_rate(years, population)
initialize()
post_process_sim()
run_model(pars=None, mother_ids=False)

Create the sim and run the model

extract_model()
model_pop_size()
model_mcpr()
model_mmr()

Calculate maternal mortality in model over most recent 3 years

model_infant_mortality_rate()
model_crude_death_rate()
model_crude_birth_rate()
model_data_tfr()
extract_skyscrapers()
extract_birth_spacing()
extract_methods()
extract_age_pregnancy()
compute_fit(*args, **kwargs)

Compute how good the fit is

post_process_results(keep_people=False, compute_fit=True, **kwargs)

Compare the model and the data

run(pars=None, keep_people=False, compute_fit=True, **kwargs)

Run the model and post-process the results

compare()

Create and print a comparison between model and data

summarize(as_df=False)

Convert results to a one-number-per-key summary format. Returns summary, also saves to self.summary.

Parameters

as_df (bool) – if True, return a dataframe instead of a dict.

to_json(filename=None, tostring=False, indent=2, verbose=False, **kwargs)

Export results as JSON.

Parameters
  • filename (str) – if None, return the results instead of writing to file (see tostring); else, write to this file

  • tostring (bool) – if not writing to file, whether to write to string (alternative is sanitized dictionary)

  • indent (int) – if writing to file, how many indents to use per nested level

  • verbose (bool) – detail to print

  • kwargs (dict) – passed to savejson()

Returns

A unicode string containing a JSON representation of the results, or writes the JSON file to disk

Examples:

json = calib.to_json()
calib.to_json('results.json')

plot(axes_args=None, do_maximize=True, do_show=True)

Plot the model against the data

class fpsim.experiment.Fit(data, sim, weights=None, keys=None, custom=None, compute=True, verbose=False, **kwargs)

Bases: sciris.sc_utils.prettyobj

A class for calculating the fit between the model and the data. Note the following terminology is used here:

  • fit: nonspecific term for how well the model matches the data

  • difference: the absolute numerical differences between the model and the data (one time series per result)

  • goodness-of-fit: the result of passing the difference through a statistical function, such as mean squared error

  • loss: the goodness-of-fit for each result multiplied by user-specified weights (one time series per result)

  • mismatches: the sum of all the losses (a single scalar value per time series)

  • mismatch: the sum of the mismatches – this is the value to be minimized during calibration
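The chain of terms above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Fit class's implementation: the result keys (mcpr, pop_size), the values, and the weights are made up, the sign convention (model minus data) is assumed, and unnormalized absolute error stands in for the goodness-of-fit function.

```python
# Hypothetical data and model output, one short time series per result key
data = {'mcpr': [10.0, 12.0, 14.0], 'pop_size': [100.0, 110.0, 120.0]}
sim = {'mcpr': [11.0, 12.0, 13.0], 'pop_size': [100.0, 105.0, 125.0]}
weights = {'mcpr': 2.0, 'pop_size': 1.0}  # hypothetical user-specified weights

# difference: model minus data (assumed sign), one time series per result
diffs = {k: [s - d for s, d in zip(sim[k], data[k])] for k in data}

# goodness-of-fit: here simply the absolute error (an assumption)
gofs = {k: [abs(x) for x in diffs[k]] for k in diffs}

# loss: goodness-of-fit multiplied by the weight for that result
losses = {k: [weights[k] * g for g in gofs[k]] for k in gofs}

# mismatches: one scalar per result, the sum of its losses
mismatches = {k: sum(losses[k]) for k in losses}

# mismatch: the single value to be minimized during calibration
mismatch = sum(mismatches.values())
print(mismatches, mismatch)
```

In this sketch the weight of 2.0 doubles the contribution of mcpr, so a result with small numerical errors can still influence the final mismatch.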

Parameters
  • sim (Sim) – the sim object

  • weights (dict) – the relative weight to place on each result (by default: 10 for deaths, 5 for diagnoses, 1 for everything else)

  • keys (list) – the keys to use in the calculation

  • custom (dict) – a custom dictionary of additional data to fit; format is e.g. {'my_output':{'data':[1,2,3], 'sim':[1,2,4], 'weights':2.0}}

  • compute (bool) – whether to compute the mismatch immediately

  • verbose (bool) – detail to print

  • kwargs (dict) – passed to compute_gof() – see this function for more detail on goodness-of-fit calculation options

Example:

sim = cv.Sim()
sim.run()
fit = sim.compute_fit()
fit.plot()

compute()

Perform all required computations

reconcile_inputs(verbose=False)

Find matching keys and indices between the model and the data

compute_diffs(absolute=False)

Find the differences between the sim and the data

compute_gofs(**kwargs)

Compute the goodness-of-fit

compute_losses()

Compute the weighted goodness-of-fit

compute_mismatch(use_median=False)

Compute the final mismatch

plot(keys=None, width=0.8, font_size=18, fig_args=None, axis_args=None, plot_args=None, do_show=True)

Plot the fit of the model to the data. For each result, plot the data and the model; the difference; and the loss (weighted difference). Also plots the loss as a function of time.

Parameters
  • keys (list) – which keys to plot (default, all)

  • width (float) – bar width

  • font_size (float) – size of font

  • fig_args (dict) – passed to pl.figure()

  • axis_args (dict) – passed to pl.subplots_adjust()

  • plot_args (dict) – passed to pl.plot()

  • do_show (bool) – whether to show the plot

fpsim.experiment.compute_gof(actual, predicted, normalize=True, use_frac=False, use_squared=False, as_scalar='none', eps=1e-09, skestimator=None, **kwargs)

Calculate the goodness of fit. By default use normalized absolute error, but highly customizable. For example, mean squared error is equivalent to setting normalize=False, use_squared=True, as_scalar='mean'.

Parameters
  • actual (arr) – array of actual (data) points

  • predicted (arr) – corresponding array of predicted (model) points

  • normalize (bool) – whether to divide the values by the largest value in either series

  • use_frac (bool) – convert to fractional mismatches rather than absolute

  • use_squared (bool) – square the mismatches

  • as_scalar (str) – return as a scalar instead of a time series: choices are sum, mean, median

  • eps (float) – to avoid divide-by-zero

  • skestimator (str) – if provided, use this scikit-learn estimator instead

  • kwargs (dict) – passed to the scikit-learn estimator

Returns

array of goodness-of-fit values, or a single value if as_scalar is 'sum', 'mean', or 'median'

Return type

gofs (arr)
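To make the options above concrete, here is a minimal pure-Python sketch of what the normalize, use_squared, and as_scalar settings describe. The function sketch_gof is a hypothetical stand-in written for illustration, not the library function: it omits use_frac, eps, and skestimator, and the order in which normalization and squaring are applied is an assumption.

```python
import statistics

def sketch_gof(actual, predicted, normalize=True, use_squared=False, as_scalar='none'):
    """Illustrative stand-in for the options described above (not compute_gof itself)."""
    gofs = [abs(a - p) for a, p in zip(actual, predicted)]  # absolute error per point
    if normalize:
        # Divide by the largest value found in either series
        largest = max(max(actual), max(predicted))
        if largest > 0:
            gofs = [g / largest for g in gofs]
    if use_squared:
        gofs = [g ** 2 for g in gofs]
    if as_scalar == 'sum':
        return sum(gofs)
    if as_scalar == 'mean':
        return sum(gofs) / len(gofs)
    if as_scalar == 'median':
        return statistics.median(gofs)
    return gofs  # as_scalar='none': keep the full series

actual = [1.0, 2.0, 4.0]
predicted = [1.0, 3.0, 2.0]
nae = sketch_gof(actual, predicted)  # normalized absolute error per point
mse = sketch_gof(actual, predicted, normalize=False, use_squared=True, as_scalar='mean')
print(nae, mse)
```

With normalize=False, use_squared=True, and as_scalar='mean', the sketch reduces to the ordinary mean squared error, which matches the equivalence stated in the description.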

Examples:

import numpy as np
from fpsim.experiment import compute_gof

x1 = np.cumsum(np.random.random(100))
x2 = np.cumsum(np.random.random(100))

e1 = compute_gof(x1, x2) # Default, normalized absolute error
e2 = compute_gof(x1, x2, normalize=False, use_frac=True) # Fractional error
e3 = compute_gof(x1, x2, normalize=False, use_squared=True, as_scalar='mean') # Mean squared error
e4 = compute_gof(x1, x2, skestimator='mean_squared_error') # Scikit-learn's MSE method
e5 = compute_gof(x1, x2, as_scalar='median') # Normalized median absolute error -- highly robust

fpsim.experiment.datapath(path)

Return the path of the parent folder

fpsim.experiment.diff_summaries(sim1, sim2, skip_key_diffs=False, output=False, die=False)

Compute the difference of the summaries of two FPsim calibration objects, and print any values which differ.

Parameters
  • sim1 (sim/dict) – the calib.summary dictionary, representing a single sim

  • sim2 (sim/dict) – ditto

  • skip_key_diffs (bool) – whether to skip keys that don’t match between sims

  • output (bool) – whether to return the output as a string (otherwise print)

  • die (bool) – whether to raise an exception if the sims don’t match

  • require_run (bool) – require that the simulations have been run
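The comparison described above can be sketched with plain dictionaries. This is an illustration only, not the library's implementation: sketch_diff is a hypothetical helper, the summary keys and values are made up, and the report format is an assumption.

```python
def sketch_diff(summary1, summary2, skip_key_diffs=False):
    """Return a text report of keys and values that differ between two summary dicts."""
    lines = []
    if not skip_key_diffs:
        # Keys present in only one of the two summaries
        for key in sorted(set(summary1) ^ set(summary2)):
            lines.append(f'key only in one summary: {key}')
    # Keys present in both summaries but with different values
    for key in sorted(set(summary1) & set(summary2)):
        if summary1[key] != summary2[key]:
            lines.append(f'{key}: {summary1[key]} vs {summary2[key]}')
    return '\n'.join(lines)

# Made-up summaries, one number per key
s1 = {'mcpr': 17.5, 'pop_size': 1000, 'tfr': 4.6}
s2 = {'mcpr': 17.5, 'pop_size': 1020, 'mmr': 300}
report = sketch_diff(s1, s2)
print(report)
```

An empty report means the two summaries match; a caller wanting the die behavior could raise an exception whenever the report is non-empty.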

Example:

c1 = fp.Calibration()
c2 = fp.Calibration()
c1.run()
c2.run()
fp.diff_summaries(c1.summarize(), c2.summarize())