Developer tutorial: Analyzers#

Reporting results#

Each Starsim module can have its own results, which get added to the full list of results in the Sim object. For example, the ss.Pregnancy module adds results like sim.results.pregnancy.pregnant, and the ss.HIV module adds results like sim.results.hiv.new_infections. If you are writing your own module, you can add whatever custom results you want. Another option is to create an Analyzer to store results that you need for one particular analysis but won’t need all the time. An Analyzer is structured much like other Starsim modules, but the general idea is that it gets called at the end of each timestep and reports on the state of things after everything else has been updated, without changing any of the module states itself.
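To make the calling order concrete, here is a minimal sketch of the pattern in plain Python. It does not use Starsim: ToySim, step_modules, and record_population are hypothetical names invented for this illustration, and the real Sim loop is considerably more involved. The only point is that the analyzer runs after the state has been updated and only reads from it.

```python
# Toy stand-in for a simulation loop; none of these names are Starsim APIs.
class ToySim:
    def __init__(self, n_steps, analyzers):
        self.n_steps = n_steps
        self.analyzers = analyzers
        self.ti = 0            # Current timestep index
        self.population = 100  # State that the "modules" update

    def step_modules(self):
        """ Stand-in for demographics, diseases, etc. updating the state """
        self.population += 5
        return

    def run(self):
        for ti in range(self.n_steps):
            self.ti = ti
            self.step_modules()           # 1. Everything else updates first
            for analyzer in self.analyzers:
                analyzer(self)            # 2. Analyzers observe the updated state last
        return self

population_log = []

def record_population(sim):
    """ Read-only analyzer: records the state but never modifies the sim """
    population_log.append((sim.ti, sim.population))
    return

toy = ToySim(n_steps=3, analyzers=[record_population]).run()
print(population_log)
```

Because record_population only appends to an external list, removing it from the analyzers list would leave the toy simulation's trajectory unchanged, which is the property that separates an analyzer from an intervention.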

Simple usage#

For simple reporting, it’s possible to use a single function as an analyzer. In this case, the function receives a single argument, sim, which it has full access to. For example, if you wanted to know the number of connections in the network on each timestep, you could write a small analyzer as follows:

[1]:
import starsim as ss
import matplotlib.pyplot as plt

# Store the number of edges
n_edges = []

def count_edges(sim):
    """ Record and print the number of edges in the network on each timestep """
    network = sim.networks[0] # Get the first network
    n = len(network)
    n_edges.append(n)
    print(f'Number of edges for network {network.name} on step {sim.ti}: {n}')
    return

# Create the sim
pars = dict(
    diseases='sis',
    networks = 'mf',
    analyzers = count_edges,
    demographics = True,
)

# Run the sim
sim = ss.Sim(pars).run()
sim.plot()

# Plot the number of edges
plt.figure()
plt.plot(sim.timevec, n_edges)
plt.title('Number of edges over time')
Starsim 2.0.0 (2024-10-01) — © 2023-2024 by IDM
Initializing sim with 10000 agents
  Running 2000.0 ( 0/51) (0.00 s)  ———————————————————— 2%
Number of edges for network mfnet on step 0: 3706
Number of edges for network mfnet on step 1: 3657
Number of edges for network mfnet on step 2: 3620
Number of edges for network mfnet on step 3: 3578
Number of edges for network mfnet on step 4: 3545
Number of edges for network mfnet on step 5: 3538
Number of edges for network mfnet on step 6: 3513
Number of edges for network mfnet on step 7: 3490
Number of edges for network mfnet on step 8: 3434
Number of edges for network mfnet on step 9: 3407
  Running 2010.0 (10/51) (0.98 s)  ••••———————————————— 22%
Number of edges for network mfnet on step 10: 3372
Number of edges for network mfnet on step 11: 3344
Number of edges for network mfnet on step 12: 3308
Number of edges for network mfnet on step 13: 3278
Number of edges for network mfnet on step 14: 3247
Number of edges for network mfnet on step 15: 3223
Number of edges for network mfnet on step 16: 3239
Number of edges for network mfnet on step 17: 3276
Number of edges for network mfnet on step 18: 3304
Number of edges for network mfnet on step 19: 3332
  Running 2020.0 (20/51) (1.07 s)  ••••••••———————————— 41%
Number of edges for network mfnet on step 20: 3348
Number of edges for network mfnet on step 21: 3360
Number of edges for network mfnet on step 22: 3386
Number of edges for network mfnet on step 23: 3450
Number of edges for network mfnet on step 24: 3498
Number of edges for network mfnet on step 25: 3539
Number of edges for network mfnet on step 26: 3571
Number of edges for network mfnet on step 27: 3610
Number of edges for network mfnet on step 28: 3654
Number of edges for network mfnet on step 29: 3691
  Running 2030.0 (30/51) (1.17 s)  ••••••••••••———————— 61%
Number of edges for network mfnet on step 30: 3728
Number of edges for network mfnet on step 31: 3758
Number of edges for network mfnet on step 32: 3812
Number of edges for network mfnet on step 33: 3859
Number of edges for network mfnet on step 34: 3890
Number of edges for network mfnet on step 35: 3920
Number of edges for network mfnet on step 36: 3965
Number of edges for network mfnet on step 37: 3994
Number of edges for network mfnet on step 38: 4042
Number of edges for network mfnet on step 39: 4072
  Running 2040.0 (40/51) (1.28 s)  ••••••••••••••••———— 80%
Number of edges for network mfnet on step 40: 4083
Number of edges for network mfnet on step 41: 4120
Number of edges for network mfnet on step 42: 4170
Number of edges for network mfnet on step 43: 4215
Number of edges for network mfnet on step 44: 4272
Number of edges for network mfnet on step 45: 4300
Number of edges for network mfnet on step 46: 4356
Number of edges for network mfnet on step 47: 4386
Number of edges for network mfnet on step 48: 4453
Number of edges for network mfnet on step 49: 4465
  Running 2050.0 (50/51) (1.40 s)  •••••••••••••••••••• 100%

Number of edges for network mfnet on step 50: 4498
[1]:
Text(0.5, 1.0, 'Number of edges over time')
../_images/tutorials_dev_tut_analyzers_3_2.png
../_images/tutorials_dev_tut_analyzers_3_3.png

Is that what you expected it to look like? Initially, agents die (either from aging or from disease), which reduces the number of edges. New agents are being born, but they don’t participate in male-female networks until the age of debut, which is 15 years by default. That is why the trend reverses (and starts to track population size) after 2015, which is 15 years after the simulation starts in 2000. This illustrates the importance of model burn-in!
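The turning point can be checked directly from the numbers printed above. The sketch below copies the 51 edge counts from that output and locates the minimum. The one-year timestep is inferred from the progress lines (51 steps spanning 2000 to 2050), and treating the minimum as a burn-in cutoff is only one simple, hypothetical rule, not something Starsim does for you.

```python
# Edge counts for steps 0-50, copied from the printed output above
edge_counts = [
    3706, 3657, 3620, 3578, 3545, 3538, 3513, 3490, 3434, 3407,
    3372, 3344, 3308, 3278, 3247, 3223, 3239, 3276, 3304, 3332,
    3348, 3360, 3386, 3450, 3498, 3539, 3571, 3610, 3654, 3691,
    3728, 3758, 3812, 3859, 3890, 3920, 3965, 3994, 4042, 4072,
    4083, 4120, 4170, 4215, 4272, 4300, 4356, 4386, 4453, 4465,
    4498,
]
start_year = 2000  # First year shown in the progress output
dt = 1             # Assumed years per timestep (inferred, see above)

# Find the step with the fewest edges, i.e. where the trend reverses
turning_step = min(range(len(edge_counts)), key=edge_counts.__getitem__)
turning_year = start_year + turning_step * dt

# Hypothetical burn-in rule: discard everything before the turning point
post_burn_in = edge_counts[turning_step:]

print(f'Minimum of {edge_counts[turning_step]} edges at step {turning_step} ({turning_year})')
```

For this run the minimum falls at step 15, i.e. 2015, which matches the default debut age of 15: from that point on, agents born during the simulation start entering the network.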

Advanced usage#

Suppose we wanted to create an analyzer that would report on the number of new HIV infections in pregnant women:

[2]:
import starsim as ss
import pandas as pd

class HIV_in_pregnancy(ss.Analyzer):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.requires = [ss.HIV, ss.Pregnancy]
        self.name = 'hiv_in_pregnancy'
        return

    def init_results(self):
        super().init_results()
        self.define_results(
            ss.Result('new_infections_pregnancy'),
        )
        return

    def step(self):
        sim = self.sim # Modules access the sim via self.sim
        ti = sim.ti
        hiv = sim.diseases.hiv
        pregnant = sim.demographics.pregnancy.pregnant
        newly_infected = hiv.ti_infected == ti
        self.results['new_infections_pregnancy'][ti] = len((newly_infected & pregnant).uids)
        return

pregnancy = ss.Pregnancy(pars=dict(fertility_rate=pd.read_csv('test_data/nigeria_asfr.csv')))
hiv = ss.HIV(beta={'mfnet':[0.5,0.25]})
sim = ss.Sim(diseases=hiv, networks='mfnet', demographics=pregnancy, analyzers=HIV_in_pregnancy())
sim.run()
print(f'Total infections among pregnant women: {sim.results.hiv_in_pregnancy.new_infections_pregnancy.sum()}')

Initializing sim with 10000 agents
  Running 2000.0 ( 0/51) (0.00 s)  ———————————————————— 2%
  Running 2010.0 (10/51) (0.15 s)  ••••———————————————— 22%
  Running 2020.0 (20/51) (0.33 s)  ••••••••———————————— 41%
  Running 2030.0 (30/51) (0.55 s)  ••••••••••••———————— 61%
  Running 2040.0 (40/51) (0.79 s)  ••••••••••••••••———— 80%
  Running 2050.0 (50/51) (1.05 s)  •••••••••••••••••••• 100%

Total infections among pregnant women: 0.0
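The counting logic inside the analyzer is an intersection of two per-agent boolean states: "infected on this timestep" and "currently pregnant". Here is a minimal sketch of that logic in plain Python, using made-up values for six agents. Starsim stores these as array-like states rather than lists, and None is used here only as a stand-in for "never infected".

```python
# Hypothetical per-agent states for six agents (not Starsim objects)
ti = 3                                               # Current timestep
ti_infected = [3, None, 1, 3, None, 3]               # Timestep of infection, None if never infected
pregnant    = [True, False, True, False, True, True] # Whether each agent is currently pregnant

# Agents infected on exactly this timestep
newly_infected = [t == ti for t in ti_infected]

# Indices of agents who are both newly infected and pregnant
uids = [i for i, (new, preg) in enumerate(zip(newly_infected, pregnant)) if new and preg]
n_new_in_pregnancy = len(uids)

print(f'Agents {uids} were infected while pregnant on step {ti}: {n_new_in_pregnancy} total')
```

Agent 2 is pregnant and infected, but was infected on an earlier timestep, so it is not counted. Agent 3 is newly infected but not pregnant, so it is excluded as well.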

Analyzers are ideal for adding custom results. Because these results get added to the sim in the same way as any other result, they also get exported automatically in the same format, e.g. using sim.export_df().

Here’s a plot of the results from our HIV in pregnancy analyzer:

[3]:
import matplotlib.pyplot as plt

res = sim.results

plt.figure()
plt.plot(res.timevec, res.hiv_in_pregnancy.new_infections_pregnancy)
plt.title('HIV infections acquired during pregnancy')
plt.show();
../_images/tutorials_dev_tut_analyzers_8_0.png