T6 - Using analyzers

Analyzers are objects that do not change the behavior of a simulation; they only report on its internal state, almost always something to do with sim.people. This tutorial takes you through some of the built-in analyzers and gives a brief example of how to build your own.

Results by age

By far the most common reason to use an analyzer is to report results by age. The results in sim.results are aggregated over all ages, whereas data on cervical cancers are generally reported by age. Age-specific outputs can be customized using an analyzer to match the age bins of the data. The following example shows how to set this up:

[1]:
import numpy as np
import sciris as sc
import hpvsim as hpv

# Create some parameters, setting beta (per-contact transmission probability) higher
# to create more cancers for illustration
pars = dict(beta=0.5, n_agents=50e3, start=1970, n_years=50, dt=1., location='tanzania')

# Also set initial HPV prevalence to be high, again to generate more cancers
pars['init_hpv_prev'] = {
    'age_brackets'  : np.array([  12,   17,   24,   34,  44,   64,    80, 150]),
    'm'             : np.array([ 0.0, 0.75, 0.9, 0.45, 0.1, 0.05, 0.005, 0]),
    'f'             : np.array([ 0.0, 0.75, 0.9, 0.45, 0.1, 0.05, 0.005, 0]),
}

# Create the age analyzer
az1 = hpv.age_results(
    result_keys=sc.objdict(
        hpv_prevalence=sc.objdict( # The keys of this dictionary are any results you want by age, and can be any key of sim.results
            timepoints=['2019'], # List the years that you want to generate results for
            edges=np.array([0., 15., 20., 25., 30., 40., 45., 50., 55., 65., 100.]),
        ),
        hpv_incidence=sc.objdict(
            timepoints=['2019'],
            edges=np.array([0., 15., 20., 25., 30., 40., 45., 50., 55., 65., 100.]),
        ),
        cancer_incidence=sc.objdict(
            timepoints=['2019'],
            edges=np.array([0., 20., 25., 30., 40., 45., 50., 55., 65., 100.]),
        ),
        cancer_mortality=sc.objdict(
            timepoints=['2019'],
            edges=np.array([0., 20., 25., 30., 40., 45., 50., 55., 65., 100.]),
        )
    )
)

sim = hpv.Sim(pars, genotypes=[16, 18], analyzers=[az1])
sim.run()
a = sim.get_analyzer()
a.plot();
HPVsim 1.1.1 (2023-03-01) — © 2023 by IDM
Loading location-specific demographic data for "nigeria"
Loading location-specific demographic data for "tanzania"
Initializing sim with 50000 agents
Loading location-specific data for "tanzania"
  Running 1970.0 ( 0/51) (1.09 s)  ———————————————————— 2%
  Running 1980.0 (10/51) (2.28 s)  ••••———————————————— 22%
  Running 1990.0 (20/51) (3.85 s)  ••••••••———————————— 41%
  Running 2000.0 (30/51) (6.02 s)  ••••••••••••———————— 61%
  Running 2010.0 (40/51) (8.90 s)  ••••••••••••••••———— 80%
  Running 2020.0 (50/51) (12.69 s)  •••••••••••••••••••• 100%

Simulation summary:
       10,174 infections
            0 dysplasias
            0 pre-cins
        1,071 cin1s
        2,436 cin2s
          803 cin3s
        7,015 cins
           80 cancers
            0 cancer detections
           80 cancer deaths
            0 detected cancer deaths
        8,032 reinfections
            0 reactivations
   121,515,664 number susceptible
       37,270 number infectious
          562 number with inactive infection
   60,773,276 number with no cellular changes
       37,270 number with episomal infection
          375 number with transformation
          562 number with cancer
       37,832 number infected
       37,832 number with abnormal cells
            0 number with latent infection
      132,588 number with precin
       51,782 number with cin1
       12,263 number with cin2
       18,073 number with cin3
       29,425 number with carcinoma in situ
      110,981 number with detectable dysplasia
            0 number with detected cancer
            0 number screened
            0 number treated for precancerous lesions
            0 number treated for cancer
            0 number vaccinated
            0 number given therapeutic vaccine
         0.00 hpv incidence (/100)
            0 cin1 incidence (/100,000)
            0 cin2 incidence (/100,000)
            0 cin3 incidence (/100,000)
            0 dysplasia incidence (/100,000)
            0 cancer incidence (/100,000)
    2,132,060 births
      302,312 other deaths
      -32,933 migration
            0 age-adjusted cervical cancer incidence (/100,000)
            0 age-adjusted cervical cancer mortality
            0 newly vaccinated
            0 cumulative number vaccinated
            0 new doses
            0 cumulative doses
            0 new therapeutic vaccine doses
            0 newly received therapeutic vaccine
            0 cumulative therapeutic vaccine doses
            0 total received therapeutic vaccine
            0 new screens
            0 newly screened
            0 new cin treatments
            0 newly treated for cins
            0 new cancer treatments
            0 newly treated for cancer
            0 cumulative screens
            0 cumulative number screened
            0 cumulative cin treatments
            0 cumulative number treated for cins
            0 cumulative cancer treatments
            0 cumulative number treated for cancer
            0 detected cancer incidence (/100,000)
            0 cancer mortality
   60,773,276 number alive
            0 crude death rate
            0 crude birth rate
         0.03 hpv prevalence (/100)
            0 pre-cin prevalence (/100,000)
            0 cin1 prevalence (/100,000)
            0 cin2 prevalence (/100,000)
            0 cin3 prevalence (/100,000)

../_images/tutorials_tut_analyzers_3_1.svg

It’s also possible to plot these results alongside data.

[2]:
az2 = hpv.age_results(
    result_keys=sc.objdict(
        cancers=sc.objdict(
            datafile='example_cancer_cases.csv',
        ),
    )
)
sim = hpv.Sim(pars, genotypes=[16, 18], analyzers=[az2])
sim.run()
a = sim.get_analyzer()
a.plot();
Loading location-specific demographic data for "nigeria"
Loading location-specific demographic data for "tanzania"
Initializing sim with 50000 agents
Loading location-specific data for "tanzania"
  Running 1970.0 ( 0/51) (0.10 s)  ———————————————————— 2%
  Running 1980.0 (10/51) (1.29 s)  ••••———————————————— 22%
  Running 1990.0 (20/51) (2.88 s)  ••••••••———————————— 41%
  Running 2000.0 (30/51) (5.07 s)  ••••••••••••———————— 61%
  Running 2010.0 (40/51) (7.93 s)  ••••••••••••••••———— 80%
  Running 2020.0 (50/51) (11.70 s)  •••••••••••••••••••• 100%

Simulation summary:
       10,174 infections
            0 dysplasias
            0 pre-cins
        1,071 cin1s
        2,436 cin2s
          803 cin3s
        7,015 cins
           80 cancers
            0 cancer detections
           80 cancer deaths
            0 detected cancer deaths
        8,032 reinfections
            0 reactivations
   121,515,664 number susceptible
       37,270 number infectious
          562 number with inactive infection
   60,773,276 number with no cellular changes
       37,270 number with episomal infection
          375 number with transformation
          562 number with cancer
       37,832 number infected
       37,832 number with abnormal cells
            0 number with latent infection
      132,588 number with precin
       51,782 number with cin1
       12,263 number with cin2
       18,073 number with cin3
       29,425 number with carcinoma in situ
      110,981 number with detectable dysplasia
            0 number with detected cancer
            0 number screened
            0 number treated for precancerous lesions
            0 number treated for cancer
            0 number vaccinated
            0 number given therapeutic vaccine
         0.00 hpv incidence (/100)
            0 cin1 incidence (/100,000)
            0 cin2 incidence (/100,000)
            0 cin3 incidence (/100,000)
            0 dysplasia incidence (/100,000)
            0 cancer incidence (/100,000)
    2,132,060 births
      302,312 other deaths
      -32,933 migration
            0 age-adjusted cervical cancer incidence (/100,000)
            0 age-adjusted cervical cancer mortality
            0 newly vaccinated
            0 cumulative number vaccinated
            0 new doses
            0 cumulative doses
            0 new therapeutic vaccine doses
            0 newly received therapeutic vaccine
            0 cumulative therapeutic vaccine doses
            0 total received therapeutic vaccine
            0 new screens
            0 newly screened
            0 new cin treatments
            0 newly treated for cins
            0 new cancer treatments
            0 newly treated for cancer
            0 cumulative screens
            0 cumulative number screened
            0 cumulative cin treatments
            0 cumulative number treated for cins
            0 cumulative cancer treatments
            0 cumulative number treated for cancer
            0 detected cancer incidence (/100,000)
            0 cancer mortality
   60,773,276 number alive
            0 crude death rate
            0 crude birth rate
         0.03 hpv prevalence (/100)
            0 pre-cin prevalence (/100,000)
            0 cin1 prevalence (/100,000)
            0 cin2 prevalence (/100,000)
            0 cin3 prevalence (/100,000)

../_images/tutorials_tut_analyzers_5_1.svg

These results are not particularly well matched to the data, but we will deal with this in the calibration tutorial later.
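
As a rough illustration of what the edges arrays in the analyzer definitions mean, the sketch below bins a small, made-up population by age using plain NumPy and computes a prevalence per bin. The ages, infected flags, and shortened edge list are hypothetical, and this is not necessarily how HPVsim computes its age results internally; it only shows the idea of reporting a quantity per age bin rather than aggregated over all ages.

```python
import numpy as np

# Hypothetical data: ages of ten agents and a flag for who is infected
ages = np.array([12., 17., 22., 23., 28., 33., 41., 47., 52., 70.])
infected = np.array([0, 1, 1, 0, 1, 0, 1, 0, 0, 0], dtype=bool)

# Bin edges in the same style as the analyzer example (shortened for brevity)
edges = np.array([0., 15., 20., 25., 30., 40., 100.])

# Count everyone, and only the infected, in each age bin
n_total, _ = np.histogram(ages, bins=edges)
n_infected, _ = np.histogram(ages[infected], bins=edges)

# Prevalence per age bin; bins with nobody in them are reported as zero
prevalence = np.divide(
    n_infected, n_total,
    out=np.zeros(len(n_total)),
    where=n_total > 0,
)

for lo, hi, prev in zip(edges[:-1], edges[1:], prevalence):
    print(f'{lo:5.0f}-{hi:<5.0f} prevalence = {prev:.2f}')
```

Note that N+1 edges define N bins, so the output has one value fewer than the edges array.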

Snapshots

Snapshots take “pictures” of the sim.people object at specified points in time. Although most of the information in sim.people can be retrieved at the end of the sim from the stored events, it is much easier to inspect the state as it was at the time of interest. The following example uses a snapshot to create a figure showing age mixing patterns among sexual contacts:

[3]:
snap = hpv.snapshot(timepoints=['2020'])
sim = hpv.Sim(pars, analyzers=snap)
sim.run()

a = sim.get_analyzer()
people = a.snapshots[0]

# Plot age mixing
import pylab as pl
import matplotlib as mpl
fig, ax = pl.subplots(nrows=1, ncols=1, figsize=(5, 4))

fc = people.contacts['m']['age_f'] # Get the age of female contacts in marital partnership
mc = people.contacts['m']['age_m'] # Get the age of male contacts in marital partnership
h = ax.hist2d(fc, mc, bins=np.linspace(0, 75, 16), density=True, norm=mpl.colors.LogNorm())
ax.set_xlabel('Age of female partner')
ax.set_ylabel('Age of male partner')
fig.colorbar(h[3], ax=ax)
ax.set_title('Marital age mixing')
pl.show();
Loading location-specific demographic data for "nigeria"
Loading location-specific demographic data for "tanzania"
Initializing sim with 50000 agents
Loading location-specific data for "tanzania"
  Running 1970.0 ( 0/51) (0.10 s)  ———————————————————— 2%
  Running 1980.0 (10/51) (1.61 s)  ••••———————————————— 22%
  Running 1990.0 (20/51) (3.76 s)  ••••••••———————————— 41%
  Running 2000.0 (30/51) (6.98 s)  ••••••••••••———————— 61%
  Running 2010.0 (40/51) (11.14 s)  ••••••••••••••••———— 80%
  Running 2020.0 (50/51) (16.07 s)  •••••••••••••••••••• 100%

Simulation summary:
       14,726 infections
            0 dysplasias
            0 pre-cins
        1,339 cin1s
        2,410 cin2s
        1,339 cin3s
        9,907 cins
           80 cancers
            0 cancer detections
           54 cancer deaths
            0 detected cancer deaths
       12,316 reinfections
            0 reactivations
   182,280,368 number susceptible
       51,809 number infectious
          643 number with inactive infection
   60,772,376 number with no cellular changes
       51,809 number with episomal infection
          214 number with transformation
          643 number with cancer
       52,451 number infected
       52,451 number with abnormal cells
            0 number with latent infection
      169,563 number with precin
       69,453 number with cin1
       17,805 number with cin2
       19,385 number with cin3
       34,352 number with carcinoma in situ
      140,138 number with detectable dysplasia
            0 number with detected cancer
            0 number screened
            0 number treated for precancerous lesions
            0 number treated for cancer
            0 number vaccinated
            0 number given therapeutic vaccine
         0.00 hpv incidence (/100)
            0 cin1 incidence (/100,000)
            0 cin2 incidence (/100,000)
            0 cin3 incidence (/100,000)
            0 dysplasia incidence (/100,000)
            0 cancer incidence (/100,000)
    2,131,792 births
      312,566 other deaths
      -24,633 migration
            1 age-adjusted cervical cancer incidence (/100,000)
            0 age-adjusted cervical cancer mortality
            0 newly vaccinated
            0 cumulative number vaccinated
            0 new doses
            0 cumulative doses
            0 new therapeutic vaccine doses
            0 newly received therapeutic vaccine
            0 cumulative therapeutic vaccine doses
            0 total received therapeutic vaccine
            0 new screens
            0 newly screened
            0 new cin treatments
            0 newly treated for cins
            0 new cancer treatments
            0 newly treated for cancer
            0 cumulative screens
            0 cumulative number screened
            0 cumulative cin treatments
            0 cumulative number treated for cins
            0 cumulative cancer treatments
            0 cumulative number treated for cancer
            0 detected cancer incidence (/100,000)
            0 cancer mortality
   60,772,376 number alive
            0 crude death rate
            0 crude birth rate
         0.03 hpv prevalence (/100)
            0 pre-cin prevalence (/100,000)
            0 cin1 prevalence (/100,000)
            0 cin2 prevalence (/100,000)
            0 cin3 prevalence (/100,000)

../_images/tutorials_tut_analyzers_8_1.svg
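
The idea behind a snapshot can be sketched without HPVsim at all: copy the population object at the requested time points so that later time steps cannot alter the stored picture. The ToyPeople class and run_toy_sim function below are hypothetical stand-ins written for this illustration, not part of the HPVsim API, and the real analyzer may store its copies differently.

```python
import copy

class ToyPeople:
    """Hypothetical stand-in for a population object (not HPVsim's People class)."""
    def __init__(self, ages):
        self.ages = list(ages)

def run_toy_sim(start, n_years, snapshot_years):
    """Age a toy population by one year per step, copying it at the requested years."""
    people = ToyPeople(ages=[20, 35, 50])
    snapshots = {}
    for year in range(start, start + n_years + 1):
        if year in snapshot_years:
            # Deep copy, so that later steps do not change the stored state
            snapshots[year] = copy.deepcopy(people)
        people.ages = [age + 1 for age in people.ages]
    return people, snapshots

people, snapshots = run_toy_sim(start=2015, n_years=5, snapshot_years=[2017])
print('Ages in the 2017 snapshot:', snapshots[2017].ages)
print('Ages at the end of the run:', people.ages)
```

The stored copy keeps the ages as they were in 2017, while the live object continues to change until the end of the run.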

Age pyramids

Age pyramids, like snapshots, take a picture of the people at a given point in time, and then bin them into age groups by sex. These can also be plotted alongside data:

[4]:
# Create some parameters
pars = dict(n_agents=50e3, start=2000, n_years=30, dt=0.5)

# Make the age pyramid analyzer
age_pyr = hpv.age_pyramid(
    timepoints=['2010', '2020'],
    datafile='south_africa_age_pyramid.csv',
    edges=np.linspace(0, 100, 21))

# Make the sim, run, get the analyzer, and plot
sim = hpv.Sim(pars, location='south africa', analyzers=age_pyr)
sim.run()
a = sim.get_analyzer()
fig = a.plot(percentages=True);
Loading location-specific demographic data for "nigeria"
Loading location-specific demographic data for "south africa"
Initializing sim with 50000 agents
Loading location-specific data for "south africa"
Dates provided in the age pyramid datafile ({'2020.0', '2000.0', '2010.0', '1990.0'}) are not the same as the age pyramid dates that were requested (['2010.0' '2020.0']).
Plots will only show requested dates, not all dates in the datafile.
  Running 2000.0 ( 0/62) (0.10 s)  ———————————————————— 2%
  Running 2005.0 (10/62) (1.08 s)  •••————————————————— 18%
  Running 2010.0 (20/62) (2.07 s)  ••••••—————————————— 34%
  Running 2015.0 (30/62) (3.13 s)  ••••••••••—————————— 50%
  Running 2020.0 (40/62) (4.30 s)  •••••••••••••——————— 66%
  Running 2025.0 (50/62) (5.47 s)  ••••••••••••••••———— 82%
  Running 2030.0 (60/62) (6.66 s)  •••••••••••••••••••— 98%
Simulation summary:
    4,413,014 infections
            0 dysplasias
            0 pre-cins
      201,248 cin1s
       53,971 cin2s
       39,616 cin3s
      564,222 cins
        1,491 cancers
            0 cancer detections
          839 cancer deaths
            0 detected cancer deaths
    3,561,693 reinfections
            0 reactivations
   184,985,312 number susceptible
    8,107,164 number infectious
       18,643 number with inactive infection
   64,366,264 number with no cellular changes
    8,107,164 number with episomal infection
       10,160 number with transformation
       18,643 number with cancer
    8,125,807 number infected
    8,125,807 number with abnormal cells
            0 number with latent infection
    3,989,916 number with precin
    1,630,027 number with cin1
      350,483 number with cin2
      347,687 number with cin3
      215,230 number with carcinoma in situ
    2,493,092 number with detectable dysplasia
            0 number with detected cancer
            0 number screened
            0 number treated for precancerous lesions
            0 number treated for cancer
            0 number vaccinated
            0 number given therapeutic vaccine
         0.80 hpv incidence (/100)
            0 cin1 incidence (/100,000)
            0 cin2 incidence (/100,000)
            0 cin3 incidence (/100,000)
            0 dysplasia incidence (/100,000)
            5 cancer incidence (/100,000)
    1,273,298 births
      552,384 other deaths
     -146,345 migration
            4 age-adjusted cervical cancer incidence (/100,000)
            0 age-adjusted cervical cancer mortality
            0 newly vaccinated
            0 cumulative number vaccinated
            0 new doses
            0 cumulative doses
            0 new therapeutic vaccine doses
            0 newly received therapeutic vaccine
            0 cumulative therapeutic vaccine doses
            0 total received therapeutic vaccine
            0 new screens
            0 newly screened
            0 new cin treatments
            0 newly treated for cins
            0 new cancer treatments
            0 newly treated for cancer
            0 cumulative screens
            0 cumulative number screened
            0 cumulative cin treatments
            0 cumulative number treated for cins
            0 cumulative cancer treatments
            0 cumulative number treated for cancer
            0 detected cancer incidence (/100,000)
            3 cancer mortality
   64,366,264 number alive
            0 crude death rate
            0 crude birth rate
         4.20 hpv prevalence (/100)
            0 pre-cin prevalence (/100,000)
            0 cin1 prevalence (/100,000)
            0 cin2 prevalence (/100,000)
            0 cin3 prevalence (/100,000)

../_images/tutorials_tut_analyzers_10_1.svg
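
To make the binning step concrete, here is a minimal NumPy sketch of how an age pyramid could be built from a list of ages and sexes, using the same five-year edges as the example above. The population is made up, the 0/1 coding for sex is an assumption for this illustration, and the calculation is not taken from HPVsim's own implementation.

```python
import numpy as np

# Hypothetical population: ages and sex (assumed coding: 0 = female, 1 = male)
ages = np.array([3., 8., 14., 21., 27., 36., 44., 58., 63., 91.])
sex = np.array([0, 1, 0, 1, 0, 0, 1, 1, 0, 0])

# Five-year bins from 0 to 100, matching edges=np.linspace(0, 100, 21)
edges = np.linspace(0, 100, 21)

# Count females and males separately in each age bin
f_counts, _ = np.histogram(ages[sex == 0], bins=edges)
m_counts, _ = np.histogram(ages[sex == 1], bins=edges)

# Express each bin as a percentage of the whole population
total = f_counts.sum() + m_counts.sum()
f_pct = 100 * f_counts / total
m_pct = 100 * m_counts / total

for lo, hi, f, m in zip(edges[:-1], edges[1:], f_pct, m_pct):
    if f or m:  # Skip empty bins to keep the printout short
        print(f'{lo:3.0f}-{hi:<3.0f} female {f:4.1f}%  male {m:4.1f}%')
```

Working in percentages rather than raw counts makes a simulated population of 50,000 agents directly comparable to census data for a whole country.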