T6 - Using analyzers
Analyzers are objects that do not change the behavior of a simulation, but just report on its internal state, almost always something to do with sim.people. This tutorial takes you through some of the built-in analyzers and gives a brief example of how to build your own.
Results by age
By far the most common reason to use an analyzer is to report results by age. The results in sim.results already include results disaggregated by age, e.g. sim.results['cancers_by_age'], but these use standardized age bins, which may not match the age bins of the available data on cervical cancers. Age-specific outputs can be customized using an analyzer to match the age bins of the data. The following example shows how to set this up:
[1]:
import numpy as np
import sciris as sc
import hpvsim as hpv
# Create some parameters, setting beta (per-contact transmission probability) higher
# to create more cancers for illustration
pars = dict(beta=0.5, n_agents=50e3, start=1970, n_years=50, dt=1., location='tanzania')
# Also set initial HPV prevalence to be high, again to generate more cancers
pars['init_hpv_prev'] = {
    'age_brackets': np.array([12, 17, 24, 34, 44, 64, 80, 150]),
    'm': np.array([0.0, 0.75, 0.9, 0.45, 0.1, 0.05, 0.005, 0]),
    'f': np.array([0.0, 0.75, 0.9, 0.45, 0.1, 0.05, 0.005, 0]),
}
# Create the age analyzer. The keys of result_args are the results you want
# by age, and can be any key of sim.results
az1 = hpv.age_results(
    result_args=sc.objdict(
        hpv_prevalence=sc.objdict(
            years=2019, # List the years that you want to generate results for
            edges=np.array([0., 15., 20., 25., 30., 40., 45., 50., 55., 65., 100.]),
        ),
        hpv_incidence=sc.objdict(
            years=2019,
            edges=np.array([0., 15., 20., 25., 30., 40., 45., 50., 55., 65., 100.]),
        ),
        cancer_incidence=sc.objdict(
            years=2019,
            edges=np.array([0., 20., 25., 30., 40., 45., 50., 55., 65., 100.]),
        ),
        cancer_mortality=sc.objdict(
            years=2019,
            edges=np.array([0., 20., 25., 30., 40., 45., 50., 55., 65., 100.]),
        ),
    )
)
sim = hpv.Sim(pars, genotypes=[16, 18], analyzers=[az1])
sim.run()
a = sim.get_analyzer()
a.plot();
HPVsim 1.2.4 (2023-09-19) — © 2023 by IDM
HPVsim data: at least one file missing: {'metadata': False, 'age_dist': False, 'birth': False, 'death': False, 'life_expectancy': False}
————————————————————————————————————
Downloading preprocessed HPVsim data
————————————————————————————————————
Note: this automatic download only happens once, when HPVsim is first run.
Downloading 1 URL(s)...
Downloading https://github.com/amath-idm/hpvsim_data/blob/main/hpvsim_data_v1.3.zip?raw=true...
Saving to /home/docs/checkouts/readthedocs.org/user_builds/institute-for-disease-modeling-hpvsim/checkouts/v1.2.4/docs/tutorials/files/tmp_hpvsim_data_v1.3.zip.zip...
Time to download https://github.com/amath-idm/hpvsim_data/blob/main/hpvsim_data_v1.3.zip?raw=true: 0.630 s
Time to download 1 URLs: 0.630 s
Removed "/home/docs/checkouts/readthedocs.org/user_builds/institute-for-disease-modeling-hpvsim/checkouts/v1.2.4/docs/tutorials/files/tmp_hpvsim_data_v1.3.zip.zip"
Data downloaded.
Loading location-specific demographic data for "tanzania"
Initializing sim with 50000 agents
Loading location-specific data for "tanzania"
Running 1970.0 ( 0/51) (1.03 s) ———————————————————— 2%
Running 1980.0 (10/51) (1.94 s) ••••———————————————— 22%
Running 1990.0 (20/51) (3.12 s) ••••••••———————————— 41%
Running 2000.0 (30/51) (4.66 s) ••••••••••••———————— 61%
Running 2010.0 (40/51) (6.59 s) ••••••••••••••••———— 80%
Running 2020.0 (50/51) (9.39 s) •••••••••••••••••••• 100%
Simulation summary:
1,071 infections
0 dysplasias
0 pre-cins
0 cin1s
2,356 cin2s
54 cin3s
3,133 cins
107 cancers
0 cancer detections
80 cancer deaths
0 detected cancer deaths
1,071 reinfections
0 reactivations
121,558,768 number susceptible
4,686 number infectious
669 number with inactive infection
60,774,536 number with no cellular changes
423,440 number with episomal infection
0 number with transformation
669 number with cancer
5,355 number infected
424,109 number with abnormal cells
0 number with latent infection
268 number with precin
22,143 number with cin1
10,040 number with cin2
19,251 number with cin3
51,407 number with detectable dysplasia
0 number with detected cancer
0 number screened
0 number treated for precancerous lesions
0 number treated for cancer
0 number vaccinated
0 number given therapeutic vaccine
0.00 hpv incidence (/100)
0 cin1 incidence (/100,000)
0 cin2 incidence (/100,000)
0 cin3 incidence (/100,000)
0 dysplasia incidence (/100,000)
0 cancer incidence (/100,000)
2,177,577 births
303,677 other deaths
-78,450 migration
1 age-adjusted cervical cancer incidence (/100,000)
0 age-adjusted cervical cancer mortality
0 newly vaccinated
0 cumulative number vaccinated
0 new doses
0 cumulative doses
0 new therapeutic vaccine doses
0 newly received therapeutic vaccine
0 cumulative therapeutic vaccine doses
0 total received therapeutic vaccine
0 new screens
0 newly screened
0 new cin treatments
0 newly treated for cins
0 new cancer treatments
0 newly treated for cancer
0 cumulative screens
0 cumulative number screened
0 cumulative cin treatments
0 cumulative number treated for cins
0 cumulative cancer treatments
0 cumulative number treated for cancer
0 detected cancer incidence (/100,000)
0 cancer mortality
60,774,536 number alive
0 crude death rate
0 crude birth rate
0.00 hpv prevalence (/100)
0 pre-cin prevalence (/100,000)
0 cin1 prevalence (/100,000)
0 cin2 prevalence (/100,000)
0 cin3 prevalence (/100,000)
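To make the role of the edges argument more concrete, the following is a minimal standalone sketch of how the same set of ages lands in different bins depending on the edges chosen. It does not use HPVsim: the helper count_by_age_bin, the list of ages, and the coarse edges are all hypothetical, and the half-open bin convention [lower, upper) is an assumption rather than a statement about how the age_results analyzer bins people internally.

```python
from bisect import bisect_right

def count_by_age_bin(ages, edges):
    """Count how many ages fall in each half-open bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for age in ages:
        if edges[0] <= age < edges[-1]:
            counts[bisect_right(edges, age) - 1] += 1
    return counts

# Hypothetical ages at diagnosis (illustrative values, not model output)
ages = [23, 31, 38, 42, 44, 47, 52, 58, 61, 70]

# Coarse, evenly spaced bins (hypothetical) versus the custom edges
# passed for cancer_incidence in the example above
coarse_edges = [0, 25, 50, 75, 100]
custom_edges = [0, 20, 25, 30, 40, 45, 50, 55, 65, 100]

coarse_counts = count_by_age_bin(ages, coarse_edges)
custom_counts = count_by_age_bin(ages, custom_edges)

print(coarse_counts)
print(custom_counts)
```

The total count is the same in both cases; only the grouping changes, which is why the edges should be chosen to match the age groups reported in the data.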
It’s also possible to plot these results alongside data.
[2]:
az2 = hpv.age_results(
    result_args=sc.objdict(
        cancers=sc.objdict(
            datafile='example_cancer_cases.csv',
        ),
    )
)
sim = hpv.Sim(pars, genotypes=[16, 18], analyzers=[az2])
sim.run()
a = sim.get_analyzer()
a.plot();
Loading location-specific demographic data for "tanzania"
Initializing sim with 50000 agents
Loading location-specific data for "tanzania"
Running 1970.0 ( 0/51) (0.09 s) ———————————————————— 2%
Running 1980.0 (10/51) (0.99 s) ••••———————————————— 22%
Running 1990.0 (20/51) (2.15 s) ••••••••———————————— 41%
Running 2000.0 (30/51) (3.69 s) ••••••••••••———————— 61%
Running 2010.0 (40/51) (5.61 s) ••••••••••••••••———— 80%
Running 2020.0 (50/51) (8.35 s) •••••••••••••••••••• 100%
Simulation summary:
1,071 infections
0 dysplasias
0 pre-cins
0 cin1s
2,356 cin2s
54 cin3s
3,133 cins
107 cancers
0 cancer detections
80 cancer deaths
0 detected cancer deaths
1,071 reinfections
0 reactivations
121,558,768 number susceptible
4,686 number infectious
669 number with inactive infection
60,774,536 number with no cellular changes
423,440 number with episomal infection
0 number with transformation
669 number with cancer
5,355 number infected
424,109 number with abnormal cells
0 number with latent infection
268 number with precin
22,143 number with cin1
10,040 number with cin2
19,251 number with cin3
51,407 number with detectable dysplasia
0 number with detected cancer
0 number screened
0 number treated for precancerous lesions
0 number treated for cancer
0 number vaccinated
0 number given therapeutic vaccine
0.00 hpv incidence (/100)
0 cin1 incidence (/100,000)
0 cin2 incidence (/100,000)
0 cin3 incidence (/100,000)
0 dysplasia incidence (/100,000)
0 cancer incidence (/100,000)
2,177,577 births
303,677 other deaths
-78,450 migration
1 age-adjusted cervical cancer incidence (/100,000)
0 age-adjusted cervical cancer mortality
0 newly vaccinated
0 cumulative number vaccinated
0 new doses
0 cumulative doses
0 new therapeutic vaccine doses
0 newly received therapeutic vaccine
0 cumulative therapeutic vaccine doses
0 total received therapeutic vaccine
0 new screens
0 newly screened
0 new cin treatments
0 newly treated for cins
0 new cancer treatments
0 newly treated for cancer
0 cumulative screens
0 cumulative number screened
0 cumulative cin treatments
0 cumulative number treated for cins
0 cumulative cancer treatments
0 cumulative number treated for cancer
0 detected cancer incidence (/100,000)
0 cancer mortality
60,774,536 number alive
0 crude death rate
0 crude birth rate
0.00 hpv prevalence (/100)
0 pre-cin prevalence (/100,000)
0 cin1 prevalence (/100,000)
0 cin2 prevalence (/100,000)
0 cin3 prevalence (/100,000)
These results are not particularly well matched to the data, but we will deal with this in the calibration tutorial later.
Snapshots
Snapshots take “pictures” of the sim.people object at specified points in time. They are useful because, while most of the information in sim.people is retrievable at the end of the sim from the stored events, it’s much easier to see what’s going on at the time. The following example uses a snapshot to create a figure showing age mixing patterns among sexual contacts:
[3]:
snap = hpv.snapshot(timepoints=['2020'])
sim = hpv.Sim(pars, analyzers=snap)
sim.run()
a = sim.get_analyzer()
people = a.snapshots[0]
# Plot age mixing
import pylab as pl
import matplotlib as mpl
fig, ax = pl.subplots(nrows=1, ncols=1, figsize=(5, 4))
fc = people.contacts['m']['age_f'] # Get the age of female contacts in marital partnership
mc = people.contacts['m']['age_m'] # Get the age of male contacts in marital partnership
h = ax.hist2d(fc, mc, bins=np.linspace(0, 75, 16), density=True, norm=mpl.colors.LogNorm())
ax.set_xlabel('Age of female partner')
ax.set_ylabel('Age of male partner')
fig.colorbar(h[3], ax=ax)
ax.set_title('Marital age mixing')
pl.show();
Loading location-specific demographic data for "tanzania"
Initializing sim with 50000 agents
Loading location-specific data for "tanzania"
Running 1970.0 ( 0/51) (0.10 s) ———————————————————— 2%
Running 1980.0 (10/51) (1.20 s) ••••———————————————— 22%
Running 1990.0 (20/51) (2.71 s) ••••••••———————————— 41%
Running 2000.0 (30/51) (5.00 s) ••••••••••••———————— 61%
Running 2010.0 (40/51) (7.90 s) ••••••••••••••••———— 80%
Running 2020.0 (50/51) (11.43 s) •••••••••••••••••••• 100%
Simulation summary:
2,142 infections
0 dysplasias
0 pre-cins
0 cin1s
1,928 cin2s
1,125 cin3s
6,426 cins
80 cancers
0 cancer detections
187 cancer deaths
0 detected cancer deaths
1,339 reinfections
0 reactivations
182,343,568 number susceptible
7,256 number infectious
696 number with inactive infection
60,774,460 number with no cellular changes
448,501 number with episomal infection
0 number with transformation
696 number with cancer
7,952 number infected
449,197 number with abnormal cells
0 number with latent infection
535 number with precin
17,216 number with cin1
10,656 number with cin2
17,725 number with cin3
45,597 number with detectable dysplasia
0 number with detected cancer
0 number screened
0 number treated for precancerous lesions
0 number treated for cancer
0 number vaccinated
0 number given therapeutic vaccine
0.00 hpv incidence (/100)
0 cin1 incidence (/100,000)
0 cin2 incidence (/100,000)
0 cin3 incidence (/100,000)
0 dysplasia incidence (/100,000)
0 cancer incidence (/100,000)
2,177,309 births
312,218 other deaths
-69,078 migration
1 age-adjusted cervical cancer incidence (/100,000)
0 age-adjusted cervical cancer mortality
0 newly vaccinated
0 cumulative number vaccinated
0 new doses
0 cumulative doses
0 new therapeutic vaccine doses
0 newly received therapeutic vaccine
0 cumulative therapeutic vaccine doses
0 total received therapeutic vaccine
0 new screens
0 newly screened
0 new cin treatments
0 newly treated for cins
0 new cancer treatments
0 newly treated for cancer
0 cumulative screens
0 cumulative number screened
0 cumulative cin treatments
0 cumulative number treated for cins
0 cumulative cancer treatments
0 cumulative number treated for cancer
0 detected cancer incidence (/100,000)
1 cancer mortality
60,774,460 number alive
0 crude death rate
0 crude birth rate
0.00 hpv prevalence (/100)
0 pre-cin prevalence (/100,000)
0 cin1 prevalence (/100,000)
0 cin2 prevalence (/100,000)
0 cin3 prevalence (/100,000)
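The idea behind a snapshot can be illustrated with a toy model that is independent of HPVsim. In this sketch, ToyPeople and run_with_snapshots are hypothetical names, and the "simulation" simply ages everyone by one year per step. It is meant only to show why a copy has to be taken at the requested time, not how hpv.snapshot is implemented.

```python
import copy

class ToyPeople:
    """Hypothetical stand-in for sim.people: just a list of ages."""
    def __init__(self, ages):
        self.ages = list(ages)

def run_with_snapshots(people, start, n_years, snapshot_years):
    """Age everyone by one year per step, deep-copying the state at requested years."""
    snapshots = {}
    for step in range(n_years + 1):
        year = start + step
        if year in snapshot_years:
            # Copy the state now, because it keeps changing on later steps
            snapshots[year] = copy.deepcopy(people)
        people.ages = [age + 1 for age in people.ages]
    return snapshots

people = ToyPeople([20, 35, 50])
snaps = run_with_snapshots(people, start=2018, n_years=4, snapshot_years={2020})

print(snaps[2020].ages)  # State as it was in 2020
print(people.ages)       # State at the end of the run
```

Without the deep copy, the stored object would be a reference to the live state and would show end-of-run values rather than the values at the requested year.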
Age pyramids
Age pyramids, like snapshots, take a picture of the people at a given point in time, and then bin them into age groups by sex. These can also be plotted alongside data:
[4]:
# Create some parameters
pars = dict(n_agents=50e3, start=2000, n_years=30, dt=0.5)
# Make the age pyramid analyzer
age_pyr = hpv.age_pyramid(
    timepoints=['2010', '2020'],
    datafile='south_africa_age_pyramid.csv',
    edges=np.linspace(0, 100, 21),
)
# Make the sim, run, get the analyzer, and plot
sim = hpv.Sim(pars, location='south africa', analyzers=age_pyr)
sim.run()
a = sim.get_analyzer()
fig = a.plot(percentages=True);
Loading location-specific demographic data for "south africa"
Initializing sim with 50000 agents
Loading location-specific data for "south africa"
Dates provided in the age pyramid datafile ({'1990.0', '2000.0', '2020.0', '2010.0'}) are not the same as the age pyramid dates that were requested (['2010.0' '2020.0']).
Plots will only show requested dates, not all dates in the datafile.
Running 2000.0 ( 0/62) (0.10 s) ———————————————————— 2%
Running 2005.0 (10/62) (0.89 s) •••————————————————— 18%
Running 2010.0 (20/62) (1.69 s) ••••••—————————————— 34%
Running 2015.0 (30/62) (2.53 s) ••••••••••—————————— 50%
Running 2020.0 (40/62) (3.42 s) •••••••••••••——————— 66%
Running 2025.0 (50/62) (4.31 s) ••••••••••••••••———— 82%
Running 2030.0 (60/62) (5.25 s) •••••••••••••••••••— 98%
Simulation summary:
250,745 infections
0 dysplasias
0 pre-cins
44,276 cin1s
6,525 cin2s
5,500 cin3s
105,891 cins
2,703 cancers
0 cancer detections
2,517 cancer deaths
0 detected cancer deaths
188,292 reinfections
0 reactivations
193,679,344 number susceptible
493,473 number infectious
21,812 number with inactive infection
64,437,476 number with no cellular changes
4,971,457 number with episomal infection
186 number with transformation
21,812 number with cancer
515,285 number infected
4,993,269 number with abnormal cells
0 number with latent infection
85,384 number with precin
258,948 number with cin1
91,349 number with cin2
145,413 number with cin3
493,753 number with detectable dysplasia
0 number with detected cancer
0 number screened
0 number treated for precancerous lesions
0 number treated for cancer
0 number vaccinated
0 number given therapeutic vaccine
0.04 hpv incidence (/100)
0 cin1 incidence (/100,000)
0 cin2 incidence (/100,000)
0 cin3 incidence (/100,000)
0 dysplasia incidence (/100,000)
8 cancer incidence (/100,000)
1,345,073 births
606,728 other deaths
-160,327 migration
8 age-adjusted cervical cancer incidence (/100,000)
0 age-adjusted cervical cancer mortality
0 newly vaccinated
0 cumulative number vaccinated
0 new doses
0 cumulative doses
0 new therapeutic vaccine doses
0 newly received therapeutic vaccine
0 cumulative therapeutic vaccine doses
0 total received therapeutic vaccine
0 new screens
0 newly screened
0 new cin treatments
0 newly treated for cins
0 new cancer treatments
0 newly treated for cancer
0 cumulative screens
0 cumulative number screened
0 cumulative cin treatments
0 cumulative number treated for cins
0 cumulative cancer treatments
0 cumulative number treated for cancer
0 detected cancer incidence (/100,000)
8 cancer mortality
64,437,476 number alive
0 crude death rate
0 crude birth rate
0.26 hpv prevalence (/100)
0 pre-cin prevalence (/100,000)
0 cin1 prevalence (/100,000)
0 cin2 prevalence (/100,000)
0 cin3 prevalence (/100,000)
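As a rough illustration of what an age pyramid contains, the sketch below bins a small hypothetical population by age group and sex using plain Python. The function toy_age_pyramid and the input data are made up for this example, the edges mirror np.linspace(0, 100, 21) (five-year bins), and expressing each bar as a percentage of the whole population is an assumption about what a percentage view shows, not a description of the age_pyramid analyzer's internals.

```python
from bisect import bisect_right

def toy_age_pyramid(ages, sexes, edges):
    """Return {sex: [percent of all people in each half-open age bin]}."""
    n_bins = len(edges) - 1
    counts = {'f': [0] * n_bins, 'm': [0] * n_bins}
    for age, sex in zip(ages, sexes):
        if edges[0] <= age < edges[-1]:
            counts[sex][bisect_right(edges, age) - 1] += 1
    total = len(ages)
    return {sex: [100 * c / total for c in bins] for sex, bins in counts.items()}

# Five-year bins from 0 to 100, equivalent to np.linspace(0, 100, 21)
edges = [5 * i for i in range(21)]

# Hypothetical population (illustrative values only)
ages = [3, 7, 12, 12, 28, 33, 64, 81]
sexes = ['f', 'm', 'f', 'm', 'f', 'm', 'f', 'f']

pyramid = toy_age_pyramid(ages, sexes, edges)
for sex, percentages in pyramid.items():
    print(sex, percentages)
```

Each sex gets one value per age bin, which is the layout a pyramid plot draws as mirrored horizontal bars.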