T1 - Getting started
Installing and getting started with HPVsim is simple.
HPVsim is a Python package that can be pip-installed by typing pip install hpvsim into a terminal. You can then check that the installation was successful by importing HPVsim with import hpvsim as hpv.
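As a small sketch of the installation check described above, the snippet below tests whether the package can be found before importing it. The helper name is_installed is hypothetical (it is not part of HPVsim), and the snippet assumes the package exposes a __version__ attribute, as the version banner printed at startup suggests.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None


if is_installed('hpvsim'):
    import hpvsim as hpv
    # Assumption: hpvsim exposes __version__, like most Python packages
    print('HPVsim version:', hpv.__version__)
else:
    print('HPVsim not found; try: pip install hpvsim')
```

If the package is missing, the message points back to the pip command rather than raising an ImportError.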
The basic design philosophy of HPVsim is: common tasks should be simple. For example:
Defining parameters
Running a simulation
Plotting results
This tutorial walks you through how to do these things.
Hello world
Creating, running, and plotting a sim with default options takes just a few lines:
[1]:
import hpvsim as hpv
sim = hpv.Sim()
sim.run()
fig = sim.plot()
HPVsim 1.2.0 (2023-05-31) — © 2023 by IDM
Loading location-specific demographic data for "nigeria"
Initializing sim with 20000 agents
Loading location-specific data for "nigeria"
Running 1995.0 ( 0/144) (1.03 s) ———————————————————— 1%
Running 1997.5 (10/144) (1.40 s) •——————————————————— 8%
Running 2000.0 (20/144) (1.79 s) ••—————————————————— 15%
Running 2002.5 (30/144) (2.18 s) ••••———————————————— 22%
Running 2005.0 (40/144) (2.61 s) •••••——————————————— 28%
Running 2007.5 (50/144) (3.03 s) •••••••————————————— 35%
Running 2010.0 (60/144) (3.47 s) ••••••••———————————— 42%
Running 2012.5 (70/144) (3.94 s) •••••••••——————————— 49%
Running 2015.0 (80/144) (4.43 s) •••••••••••————————— 56%
Running 2017.5 (90/144) (4.94 s) ••••••••••••———————— 63%
Running 2020.0 (100/144) (5.47 s) ••••••••••••••—————— 70%
Running 2022.5 (110/144) (6.00 s) •••••••••••••••————— 77%
Running 2025.0 (120/144) (6.55 s) ••••••••••••••••———— 84%
Running 2027.5 (130/144) (7.13 s) ••••••••••••••••••—— 91%
Running 2030.0 (140/144) (7.76 s) •••••••••••••••••••— 98%
Simulation summary:
13,395,204 infections
0 dysplasias
0 pre-cins
3,230,231 cin1s
173,048 cin2s
81,717 cin3s
5,793,372 cins
20,296 cancers
0 cancer detections
17,625 cancer deaths
0 detected cancer deaths
9,683,216 reinfections
0 reactivations
774,427,776 number susceptible
16,136,736 number infectious
120,172 number with inactive infection
260,436,336 number with no cellular changes
39,733,240 number with episomal infection
2,670 number with transformation
120,172 number with cancer
16,256,907 number infected
39,853,408 number with abnormal cells
0 number with latent infection
3,388,858 number with precin
5,666,257 number with cin1
1,453,284 number with cin2
1,126,949 number with cin3
8,122,044 number with detectable dysplasia
0 number with detected cancer
0 number screened
0 number treated for precancerous lesions
0 number treated for cancer
0 number vaccinated
0 number given therapeutic vaccine
0.58 hpv incidence (/100)
0 cin1 incidence (/100,000)
0 cin2 incidence (/100,000)
0 cin3 incidence (/100,000)
0 dysplasia incidence (/100,000)
16 cancer incidence (/100,000)
9,416,166 births
2,604,267 other deaths
-1,089,562 migration
24 age-adjusted cervical cancer incidence (/100,000)
0 age-adjusted cervical cancer mortality
0 newly vaccinated
0 cumulative number vaccinated
0 new doses
0 cumulative doses
0 new therapeutic vaccine doses
0 newly received therapeutic vaccine
0 cumulative therapeutic vaccine doses
0 total received therapeutic vaccine
0 new screens
0 newly screened
0 new cin treatments
0 newly treated for cins
0 new cancer treatments
0 newly treated for cancer
0 cumulative screens
0 cumulative number screened
0 cumulative cin treatments
0 cumulative number treated for cins
0 cumulative cancer treatments
0 cumulative number treated for cancer
0 detected cancer incidence (/100,000)
14 cancer mortality
260,436,336 number alive
0 crude death rate
0 crude birth rate
2.07 hpv prevalence (/100)
0 pre-cin prevalence (/100,000)
0 cin1 prevalence (/100,000)
0 cin2 prevalence (/100,000)
0 cin3 prevalence (/100,000)
Defining parameters and genotypes, and running simulations
Parameters are defined as a dictionary. Some common parameters to modify are the number of agents in the simulation, the genotypes to simulate, and the start and end dates of the simulation. We can define those as:
[2]:
pars = dict(
    n_agents = 10e3,
    genotypes = [16, 18, 'hr'], # Simulate genotypes 16 and 18, plus all other high-risk HPV genotypes pooled together
    start = 1980,
    end = 2030,
)
Running a sim with the parameters we defined above takes just two lines:
[3]:
sim = hpv.Sim(pars)
sim.run()
Loading location-specific demographic data for "nigeria"
Initializing sim with 10000 agents
Loading location-specific data for "nigeria"
Running 1980.0 ( 0/204) (0.03 s) ———————————————————— 0%
Running 1982.5 (10/204) (0.31 s) •——————————————————— 5%
Running 1985.0 (20/204) (0.60 s) ••—————————————————— 10%
Running 1987.5 (30/204) (0.88 s) •••————————————————— 15%
Running 1990.0 (40/204) (1.17 s) ••••———————————————— 20%
Running 1992.5 (50/204) (1.46 s) •••••——————————————— 25%
Running 1995.0 (60/204) (1.76 s) •••••——————————————— 30%
Running 1997.5 (70/204) (2.06 s) ••••••—————————————— 35%
Running 2000.0 (80/204) (2.38 s) •••••••————————————— 40%
Running 2002.5 (90/204) (2.69 s) ••••••••———————————— 45%
Running 2005.0 (100/204) (3.04 s) •••••••••——————————— 50%
Running 2007.5 (110/204) (3.39 s) ••••••••••—————————— 54%
Running 2010.0 (120/204) (3.75 s) •••••••••••————————— 59%
Running 2012.5 (130/204) (4.14 s) ••••••••••••———————— 64%
Running 2015.0 (140/204) (4.52 s) •••••••••••••——————— 69%
Running 2017.5 (150/204) (4.92 s) ••••••••••••••—————— 74%
Running 2020.0 (160/204) (5.35 s) •••••••••••••••————— 79%
Running 2022.5 (170/204) (5.77 s) ••••••••••••••••———— 84%
Running 2025.0 (180/204) (6.23 s) •••••••••••••••••——— 89%
Running 2027.5 (190/204) (6.72 s) ••••••••••••••••••—— 94%
Running 2030.0 (200/204) (7.21 s) •••••••••••••••••••— 99%
Simulation summary:
10,199,556 infections
0 dysplasias
0 pre-cins
2,185,722 cin1s
147,965 cin2s
45,970 cin3s
5,104,806 cins
9,338 cancers
0 cancer detections
5,746 cancer deaths
0 detected cancer deaths
6,981,668 reinfections
0 reactivations
775,935,104 number susceptible
12,546,890 number infectious
80,447 number with inactive infection
260,318,592 number with no cellular changes
31,940,412 number with episomal infection
1,437 number with transformation
80,447 number with cancer
12,627,338 number infected
32,020,860 number with abnormal cells
0 number with latent infection
2,570,001 number with precin
4,253,646 number with cin1
1,213,891 number with cin2
820,274 number with cin3
6,234,658 number with detectable dysplasia
0 number with detected cancer
0 number screened
0 number treated for precancerous lesions
0 number treated for cancer
0 number vaccinated
0 number given therapeutic vaccine
0.44 hpv incidence (/100)
0 cin1 incidence (/100,000)
0 cin2 incidence (/100,000)
0 cin3 incidence (/100,000)
0 dysplasia incidence (/100,000)
7 cancer incidence (/100,000)
9,423,815 births
2,334,406 other deaths
-1,321,633 migration
10 age-adjusted cervical cancer incidence (/100,000)
0 age-adjusted cervical cancer mortality
0 newly vaccinated
0 cumulative number vaccinated
0 new doses
0 cumulative doses
0 new therapeutic vaccine doses
0 newly received therapeutic vaccine
0 cumulative therapeutic vaccine doses
0 total received therapeutic vaccine
0 new screens
0 newly screened
0 new cin treatments
0 newly treated for cins
0 new cancer treatments
0 newly treated for cancer
0 cumulative screens
0 cumulative number screened
0 cumulative cin treatments
0 cumulative number treated for cins
0 cumulative cancer treatments
0 cumulative number treated for cancer
0 detected cancer incidence (/100,000)
4 cancer mortality
260,318,592 number alive
0 crude death rate
0 crude birth rate
1.61 hpv prevalence (/100)
0 pre-cin prevalence (/100,000)
0 cin1 prevalence (/100,000)
0 cin2 prevalence (/100,000)
0 cin3 prevalence (/100,000)
[3]:
Sim(<no label>; 1980 to 2030; pop: 10000 default; epi: 3.72463e+08⚙, 334000♋︎)
This will generate a results dictionary, sim.results. Results by genotype have names like sim.results['infections_by_genotype'] and are stored as arrays where each row corresponds to a genotype, while totals across all genotypes have names like sim.results['infections'] or sim.results['cancers'].
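To picture the "each row corresponds to a genotype" layout described above, here is a minimal sketch that uses plain nested lists as stand-in data. The numbers and the variable name by_genotype are invented for illustration; real results come from sim.run() and are stored as arrays, not lists.

```python
# Stand-in data only: these values are made up, not HPVsim output
genotypes = [16, 18, 'hr']   # Same order as the genotypes parameter above

# One row per genotype, one column per timepoint
by_genotype = [
    [5, 7, 6, 4],    # Row 0: genotype 16
    [2, 3, 3, 1],    # Row 1: genotype 18
    [9, 8, 10, 7],   # Row 2: other high-risk genotypes, pooled
]

# Look up the row for one genotype, then read its time series
row = genotypes.index(18)
series_18 = by_genotype[row]
print(series_18)

# The value at the final timepoint for every genotype
final_values = {g: by_genotype[i][-1] for i, g in enumerate(genotypes)}
print(final_values)
```

The same row-then-column indexing applies to a two-dimensional array, which is why knowing the order of the genotypes list matters when reading by-genotype results.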
Rather than creating a parameter dictionary, you can also pass any valid parameter to the sim directly. For example, the following matches the sim above except that it omits the genotypes parameter, so the default genotypes are used:
[4]:
sim = hpv.Sim(n_agents=10e3, start=1980, end=2030)
sim.run()
Loading location-specific demographic data for "nigeria"
Initializing sim with 10000 agents
Loading location-specific data for "nigeria"
Running 1980.0 ( 0/204) (0.04 s) ———————————————————— 0%
Running 1982.5 (10/204) (0.31 s) •——————————————————— 5%
Running 1985.0 (20/204) (0.59 s) ••—————————————————— 10%
Running 1987.5 (30/204) (0.87 s) •••————————————————— 15%
Running 1990.0 (40/204) (1.17 s) ••••———————————————— 20%
Running 1992.5 (50/204) (1.47 s) •••••——————————————— 25%
Running 1995.0 (60/204) (1.79 s) •••••——————————————— 30%
Running 1997.5 (70/204) (2.10 s) ••••••—————————————— 35%
Running 2000.0 (80/204) (2.45 s) •••••••————————————— 40%
Running 2002.5 (90/204) (2.79 s) ••••••••———————————— 45%
Running 2005.0 (100/204) (3.15 s) •••••••••——————————— 50%
Running 2007.5 (110/204) (3.51 s) ••••••••••—————————— 54%
Running 2010.0 (120/204) (3.89 s) •••••••••••————————— 59%
Running 2012.5 (130/204) (4.30 s) ••••••••••••———————— 64%
Running 2015.0 (140/204) (4.71 s) •••••••••••••——————— 69%
Running 2017.5 (150/204) (5.13 s) ••••••••••••••—————— 74%
Running 2020.0 (160/204) (5.57 s) •••••••••••••••————— 79%
Running 2022.5 (170/204) (6.01 s) ••••••••••••••••———— 84%
Running 2025.0 (180/204) (6.48 s) •••••••••••••••••——— 89%
Running 2027.5 (190/204) (6.99 s) ••••••••••••••••••—— 94%
Running 2030.0 (200/204) (7.50 s) •••••••••••••••••••— 99%
Simulation summary:
19,149,308 infections
0 dysplasias
0 pre-cins
4,364,261 cin1s
280,129 cin2s
96,968 cin3s
8,138,096 cins
12,929 cancers
0 cancer detections
6,465 cancer deaths
0 detected cancer deaths
13,812,498 reinfections
0 reactivations
770,468,992 number susceptible
21,432,716 number infectious
100,559 number with inactive infection
260,562,080 number with no cellular changes
45,796,728 number with episomal infection
6,465 number with transformation
100,559 number with cancer
21,533,276 number infected
45,897,288 number with abnormal cells
0 number with latent infection
4,971,924 number with precin
7,750,226 number with cin1
1,947,253 number with cin2
1,291,465 number with cin3
10,820,148 number with detectable dysplasia
0 number with detected cancer
0 number screened
0 number treated for precancerous lesions
0 number treated for cancer
0 number vaccinated
0 number given therapeutic vaccine
0.83 hpv incidence (/100)
0 cin1 incidence (/100,000)
0 cin2 incidence (/100,000)
0 cin3 incidence (/100,000)
0 dysplasia incidence (/100,000)
10 cancer incidence (/100,000)
9,409,450 births
2,617,408 other deaths
-1,070,235 migration
15 age-adjusted cervical cancer incidence (/100,000)
0 age-adjusted cervical cancer mortality
0 newly vaccinated
0 cumulative number vaccinated
0 new doses
0 cumulative doses
0 new therapeutic vaccine doses
0 newly received therapeutic vaccine
0 cumulative therapeutic vaccine doses
0 total received therapeutic vaccine
0 new screens
0 newly screened
0 new cin treatments
0 newly treated for cins
0 new cancer treatments
0 newly treated for cancer
0 cumulative screens
0 cumulative number screened
0 cumulative cin treatments
0 cumulative number treated for cins
0 cumulative cancer treatments
0 cumulative number treated for cancer
0 detected cancer incidence (/100,000)
5 cancer mortality
260,562,080 number alive
0 crude death rate
0 crude birth rate
2.74 hpv prevalence (/100)
0 pre-cin prevalence (/100,000)
0 cin1 prevalence (/100,000)
0 cin2 prevalence (/100,000)
0 cin3 prevalence (/100,000)
[4]:
Sim(<no label>; 1980 to 2030; pop: 10000 default; epi: 6.09129e+08⚙, 410855♋︎)
You can mix and match too – pass in a parameter dictionary with default options, and then include other parameters as keywords (including overrides; keyword arguments take precedence). For example:
[5]:
sim = hpv.Sim(pars, end=2050) # Use parameters defined above, except set the end date to 2050
sim.run()
Loading location-specific demographic data for "nigeria"
Initializing sim with 10000 agents
Loading location-specific data for "nigeria"
Running 1980.0 ( 0/284) (0.04 s) ———————————————————— 0%
Running 1982.5 (10/284) (0.32 s) ———————————————————— 4%
Running 1985.0 (20/284) (0.60 s) •——————————————————— 7%
Running 1987.5 (30/284) (0.88 s) ••—————————————————— 11%
Running 1990.0 (40/284) (1.17 s) ••—————————————————— 14%
Running 1992.5 (50/284) (1.46 s) •••————————————————— 18%
Running 1995.0 (60/284) (1.76 s) ••••———————————————— 21%
Running 1997.5 (70/284) (2.06 s) •••••——————————————— 25%
Running 2000.0 (80/284) (2.39 s) •••••——————————————— 29%
Running 2002.5 (90/284) (2.70 s) ••••••—————————————— 32%
Running 2005.0 (100/284) (3.05 s) •••••••————————————— 36%
Running 2007.5 (110/284) (3.40 s) •••••••————————————— 39%
Running 2010.0 (120/284) (3.77 s) ••••••••———————————— 43%
Running 2012.5 (130/284) (4.16 s) •••••••••——————————— 46%
Running 2015.0 (140/284) (4.56 s) •••••••••——————————— 50%
Running 2017.5 (150/284) (4.96 s) ••••••••••—————————— 53%
Running 2020.0 (160/284) (5.39 s) •••••••••••————————— 57%
Running 2022.5 (170/284) (5.83 s) ••••••••••••———————— 60%
Running 2025.0 (180/284) (6.29 s) ••••••••••••———————— 64%
Running 2027.5 (190/284) (6.78 s) •••••••••••••——————— 67%
Running 2030.0 (200/284) (7.29 s) ••••••••••••••—————— 71%
Running 2032.5 (210/284) (7.79 s) ••••••••••••••—————— 74%
Running 2035.0 (220/284) (8.34 s) •••••••••••••••————— 78%
Running 2037.5 (230/284) (8.89 s) ••••••••••••••••———— 81%
Running 2040.0 (240/284) (9.48 s) ••••••••••••••••———— 85%
Running 2042.5 (250/284) (10.07 s) •••••••••••••••••——— 88%
Running 2045.0 (260/284) (10.75 s) ••••••••••••••••••—— 92%
Running 2047.5 (270/284) (11.40 s) •••••••••••••••••••— 95%
Running 2050.0 (280/284) (12.09 s) •••••••••••••••••••— 99%
Simulation summary:
13,187,595 infections
0 dysplasias
0 pre-cins
3,104,400 cin1s
247,806 cin2s
53,153 cin3s
7,186,377 cins
15,802 cancers
0 cancer detections
7,901 cancer deaths
0 detected cancer deaths
8,985,665 reinfections
0 reactivations
1,120,278,016 number susceptible
16,069,329 number infectious
128,572 number with inactive infection
375,473,728 number with no cellular changes
46,545,896 number with episomal infection
3,591 number with transformation
128,572 number with cancer
16,197,901 number infected
46,674,464 number with abnormal cells
0 number with latent infection
3,232,972 number with precin
5,606,165 number with cin1
1,619,718 number with cin2
1,329,534 number with cin3
8,518,066 number with detectable dysplasia
0 number with detected cancer
0 number screened
0 number treated for precancerous lesions
0 number treated for cancer
0 number vaccinated
0 number given therapeutic vaccine
0.39 hpv incidence (/100)
0 cin1 incidence (/100,000)
0 cin2 incidence (/100,000)
0 cin3 incidence (/100,000)
0 dysplasia incidence (/100,000)
8 cancer incidence (/100,000)
13,704,756 births
3,246,620 other deaths
-4,905,843 migration
10 age-adjusted cervical cancer incidence (/100,000)
0 age-adjusted cervical cancer mortality
0 newly vaccinated
0 cumulative number vaccinated
0 new doses
0 cumulative doses
0 new therapeutic vaccine doses
0 newly received therapeutic vaccine
0 cumulative therapeutic vaccine doses
0 total received therapeutic vaccine
0 new screens
0 newly screened
0 new cin treatments
0 newly treated for cins
0 new cancer treatments
0 newly treated for cancer
0 cumulative screens
0 cumulative number screened
0 cumulative cin treatments
0 cumulative number treated for cins
0 cumulative cancer treatments
0 cumulative number treated for cancer
0 detected cancer incidence (/100,000)
4 cancer mortality
375,473,728 number alive
0 crude death rate
0 crude birth rate
1.43 hpv prevalence (/100)
0 pre-cin prevalence (/100,000)
0 cin1 prevalence (/100,000)
0 cin2 prevalence (/100,000)
0 cin3 prevalence (/100,000)
[5]:
Sim(<no label>; 1980 to 2050; pop: 10000 default; epi: 6.02183e+08⚙, 617001♋︎)
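The precedence rule used above (keyword arguments override entries in the parameter dictionary) can be pictured with plain dictionaries. This is only a sketch of the behaviour, not HPVsim's internal implementation, and merge_pars is a hypothetical helper name.

```python
def merge_pars(pars=None, **kwargs):
    """Combine a parameter dictionary with keyword overrides (keywords win)."""
    merged = dict(pars or {})   # Copy, so the caller's dictionary is not modified
    merged.update(kwargs)       # Keyword arguments replace matching keys
    return merged


pars = dict(n_agents=10e3, genotypes=[16, 18, 'hr'], start=1980, end=2030)
merged = merge_pars(pars, end=2050)

print(merged['end'])    # 2050: the keyword argument took precedence
print(pars['end'])      # 2030: the original dictionary is unchanged
print(merged['start'])  # 1980: values not overridden are carried over
```

Copying before updating means the same pars dictionary can be reused for several sims, each with its own overrides.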
Plotting results
As you saw above, plotting the results of a simulation is rather easy too:
[6]:
fig = sim.plot()
Full usage example
Many of the details of this example will be explained in later tutorials, but to give you a taste, here’s an example of how you would run two simulations to determine the impact of a custom vaccination intervention targeting agents aged between 9 and 14.
[7]:
import hpvsim as hpv

# Custom vaccination intervention
def custom_vx(sim):
    # In the year 2000, set peak immunity (row 0) to 1 for agents aged between 9 and 14
    if sim.yearvec[sim.t] == 2000:
        target_group = (sim.people.age>9) * (sim.people.age<14)
        sim.people.peak_imm[0, target_group] = 1

pars = dict(
    location = 'tanzania', # Use population characteristics for Tanzania
    n_agents = 10e3,       # Have 10,000 agents total in the population
    start = 1980,          # Start the simulation in 1980
    n_years = 50,          # Run the simulation for 50 years
    burnin = 10,           # Discard the first 10 years as burnin period
    verbose = 0,           # Do not print any output
)

# Running with multisims -- see Tutorial 3
s1 = hpv.Sim(pars, label='Default')
s2 = hpv.Sim(pars, interventions=custom_vx, label='Custom vaccination')
msim = hpv.MultiSim([s1, s2])
msim.run()
fig = msim.plot(['cancers', 'dysplasias'])
Loading location-specific demographic data for "tanzania"
Loading location-specific demographic data for "tanzania"
<Figure size 640x480 with 0 Axes>
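The custom intervention above is a function that receives the sim and acts only when the current year matches. The sketch below mimics that callback pattern with a toy stand-in object so it runs without HPVsim. Everything here (make_toy_sim, toy_vx, run, and the immune list) is hypothetical; only the attribute names t, yearvec, and people.age echo the example above, and the quarter-year timestep is an assumption based on the progress output shown earlier.

```python
from types import SimpleNamespace


def make_toy_sim(start=1998, n_years=4, dt=0.25, ages=(8, 10, 12, 13.5, 15, 40)):
    """Build a toy stand-in for a sim: a time index, a year vector, and people."""
    n_steps = int(n_years / dt)
    return SimpleNamespace(
        t=0,
        yearvec=[start + i * dt for i in range(n_steps)],
        people=SimpleNamespace(age=list(ages), immune=[0.0] * len(ages)),
    )


def toy_vx(sim):
    """Toy intervention: in 2000, mark agents aged between 9 and 14 as immune."""
    if sim.yearvec[sim.t] == 2000:
        for i, age in enumerate(sim.people.age):
            if 9 < age < 14:
                sim.people.immune[i] = 1.0


def run(sim, intervention):
    """Toy driver loop: call the intervention once per timestep (ages do not advance)."""
    for t in range(len(sim.yearvec)):
        sim.t = t
        intervention(sim)
    return sim


toy = run(make_toy_sim(), toy_vx)
print(toy.people.immune)  # [0.0, 1.0, 1.0, 1.0, 0.0, 0.0]
```

Because the function is called on every timestep, the year check is what restricts the effect to a single point in time; without it, the target group would be updated at every step.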