T1 - Getting started
Installing and getting started with HPVsim is quite simple.
HPVsim is a Python package that can be installed by typing pip install hpvsim into a terminal. You can then check that the installation was successful by importing HPVsim with import hpvsim as hpv.
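If you would rather confirm the installation without risking an import error, a minimal sketch using only the standard library is shown below. The helper name is_installed is an assumption made for illustration; it is not part of HPVsim.

```python
import importlib.util

def is_installed(package_name):
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None

# Reports whether HPVsim is importable in the current environment
print("hpvsim installed:", is_installed("hpvsim"))
```

If this prints False, rerun pip install hpvsim in the same environment that runs your Python interpreter.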
The basic design philosophy of HPVsim is: common tasks should be simple. For example:
Defining parameters
Running a simulation
Plotting results
This tutorial walks you through how to do these things.
Hello world
Creating, running, and plotting a sim with default options takes just four lines:
[1]:
import hpvsim as hpv
sim = hpv.Sim()
sim.run()
fig = sim.plot()
HPVsim 1.2.4 (2023-09-19) — © 2023 by IDM
Loading location-specific demographic data for "nigeria"
Initializing sim with 20000 agents
Loading location-specific data for "nigeria"
Running 1995.0 ( 0/144) (0.96 s) ———————————————————— 1%
Running 1997.5 (10/144) (1.34 s) •——————————————————— 8%
Running 2000.0 (20/144) (1.71 s) ••—————————————————— 15%
Running 2002.5 (30/144) (2.10 s) ••••———————————————— 22%
Running 2005.0 (40/144) (2.51 s) •••••——————————————— 28%
Running 2007.5 (50/144) (2.92 s) •••••••————————————— 35%
Running 2010.0 (60/144) (3.35 s) ••••••••———————————— 42%
Running 2012.5 (70/144) (3.79 s) •••••••••——————————— 49%
Running 2015.0 (80/144) (4.26 s) •••••••••••————————— 56%
Running 2017.5 (90/144) (4.74 s) ••••••••••••———————— 63%
Running 2020.0 (100/144) (5.24 s) ••••••••••••••—————— 70%
Running 2022.5 (110/144) (5.76 s) •••••••••••••••————— 77%
Running 2025.0 (120/144) (6.30 s) ••••••••••••••••———— 84%
Running 2027.5 (130/144) (6.85 s) ••••••••••••••••••—— 91%
Running 2030.0 (140/144) (7.48 s) •••••••••••••••••••— 98%
Simulation summary:
12,396,439 infections
0 dysplasias
0 pre-cins
2,843,543 cin1s
251,561 cin2s
136,729 cin3s
5,780,554 cins
18,159 cancers
0 cancer detections
14,955 cancer deaths
0 detected cancer deaths
8,663,087 reinfections
0 reactivations
776,222,336 number susceptible
14,938,752 number infectious
137,263 number with inactive infection
260,489,216 number with no cellular changes
39,731,632 number with episomal infection
2,670 number with transformation
137,263 number with cancer
15,076,015 number infected
39,868,896 number with abnormal cells
0 number with latent infection
2,955,704 number with precin
5,675,871 number with cin1
1,481,591 number with cin2
1,207,598 number with cin3
8,268,922 number with detectable dysplasia
0 number with detected cancer
0 number screened
0 number treated for precancerous lesions
0 number treated for cancer
0 number vaccinated
0 number given therapeutic vaccine
0.53 hpv incidence (/100)
0 cin1 incidence (/100,000)
0 cin2 incidence (/100,000)
0 cin3 incidence (/100,000)
0 dysplasia incidence (/100,000)
14 cancer incidence (/100,000)
9,517,645 births
2,496,913 other deaths
-1,324,566 migration
21 age-adjusted cervical cancer incidence (/100,000)
0 age-adjusted cervical cancer mortality
0 newly vaccinated
0 cumulative number vaccinated
0 new doses
0 cumulative doses
0 new therapeutic vaccine doses
0 newly received therapeutic vaccine
0 cumulative therapeutic vaccine doses
0 total received therapeutic vaccine
0 new screens
0 newly screened
0 new cin treatments
0 newly treated for cins
0 new cancer treatments
0 newly treated for cancer
0 cumulative screens
0 cumulative number screened
0 cumulative cin treatments
0 cumulative number treated for cins
0 cumulative cancer treatments
0 cumulative number treated for cancer
0 detected cancer incidence (/100,000)
12 cancer mortality
260,489,216 number alive
0 crude death rate
0 crude birth rate
1.91 hpv prevalence (/100)
0 pre-cin prevalence (/100,000)
0 cin1 prevalence (/100,000)
0 cin2 prevalence (/100,000)
0 cin3 prevalence (/100,000)
Defining parameters and genotypes, and running simulations
Parameters are defined as a dictionary. Some common parameters to modify are the number of agents in the simulation, the genotypes to simulate, and the start and end dates of the simulation. We can define those as:
[2]:
pars = dict(
n_agents = 10e3,
genotypes = [16, 18, 'hr'], # Simulate genotypes 16 and 18, plus all other high-risk HPV genotypes pooled together
start = 1980,
end = 2030,
)
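Note that 10e3 is Python scientific notation for 10 × 10^3, that is, the float 10000.0 rather than an integer; the log output of the next cell accordingly reports 10000 agents. A quick check in plain Python, independent of HPVsim:

```python
n_agents = 10e3  # Scientific notation: 10 * 10**3

print(n_agents)        # 10000.0
print(type(n_agents))  # <class 'float'>
print(int(n_agents))   # 10000
```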
Running a simulation is pretty easy. In fact, running a sim with the parameters we defined above is just:
[3]:
sim = hpv.Sim(pars)
sim.run()
Loading location-specific demographic data for "nigeria"
Initializing sim with 10000 agents
Loading location-specific data for "nigeria"
Running 1980.0 ( 0/204) (0.03 s) ———————————————————— 0%
Running 1982.5 (10/204) (0.31 s) •——————————————————— 5%
Running 1985.0 (20/204) (0.58 s) ••—————————————————— 10%
Running 1987.5 (30/204) (0.86 s) •••————————————————— 15%
Running 1990.0 (40/204) (1.15 s) ••••———————————————— 20%
Running 1992.5 (50/204) (1.44 s) •••••——————————————— 25%
Running 1995.0 (60/204) (1.74 s) •••••——————————————— 30%
Running 1997.5 (70/204) (2.04 s) ••••••—————————————— 35%
Running 2000.0 (80/204) (2.36 s) •••••••————————————— 40%
Running 2002.5 (90/204) (2.67 s) ••••••••———————————— 45%
Running 2005.0 (100/204) (3.01 s) •••••••••——————————— 50%
Running 2007.5 (110/204) (3.34 s) ••••••••••—————————— 54%
Running 2010.0 (120/204) (3.68 s) •••••••••••————————— 59%
Running 2012.5 (130/204) (4.05 s) ••••••••••••———————— 64%
Running 2015.0 (140/204) (4.41 s) •••••••••••••——————— 69%
Running 2017.5 (150/204) (4.79 s) ••••••••••••••—————— 74%
Running 2020.0 (160/204) (5.19 s) •••••••••••••••————— 79%
Running 2022.5 (170/204) (5.60 s) ••••••••••••••••———— 84%
Running 2025.0 (180/204) (6.02 s) •••••••••••••••••——— 89%
Running 2027.5 (190/204) (6.48 s) ••••••••••••••••••—— 94%
Running 2030.0 (200/204) (6.94 s) •••••••••••••••••••— 99%
Simulation summary:
7,778,957 infections
0 dysplasias
0 pre-cins
1,723,150 cin1s
136,473 cin2s
15,802 cin3s
4,317,573 cins
5,746 cancers
0 cancer detections
7,901 cancer deaths
0 detected cancer deaths
5,408,638 reinfections
0 reactivations
777,292,608 number susceptible
9,445,364 number infectious
63,927 number with inactive infection
260,174,192 number with no cellular changes
25,786,204 number with episomal infection
2,155 number with transformation
63,927 number with cancer
9,509,292 number infected
25,850,132 number with abnormal cells
0 number with latent infection
1,995,378 number with precin
3,498,735 number with cin1
920,115 number with cin2
813,810 number with cin3
5,195,309 number with detectable dysplasia
0 number with detected cancer
0 number screened
0 number treated for precancerous lesions
0 number treated for cancer
0 number vaccinated
0 number given therapeutic vaccine
0.33 hpv incidence (/100)
0 cin1 incidence (/100,000)
0 cin2 incidence (/100,000)
0 cin3 incidence (/100,000)
0 dysplasia incidence (/100,000)
4 cancer incidence (/100,000)
9,517,192 births
2,585,803 other deaths
-1,206,708 migration
6 age-adjusted cervical cancer incidence (/100,000)
0 age-adjusted cervical cancer mortality
0 newly vaccinated
0 cumulative number vaccinated
0 new doses
0 cumulative doses
0 new therapeutic vaccine doses
0 newly received therapeutic vaccine
0 cumulative therapeutic vaccine doses
0 total received therapeutic vaccine
0 new screens
0 newly screened
0 new cin treatments
0 newly treated for cins
0 new cancer treatments
0 newly treated for cancer
0 cumulative screens
0 cumulative number screened
0 cumulative cin treatments
0 cumulative number treated for cins
0 cumulative cancer treatments
0 cumulative number treated for cancer
0 detected cancer incidence (/100,000)
6 cancer mortality
260,174,192 number alive
0 crude death rate
0 crude birth rate
1.21 hpv prevalence (/100)
0 pre-cin prevalence (/100,000)
0 cin1 prevalence (/100,000)
0 cin2 prevalence (/100,000)
0 cin3 prevalence (/100,000)
[3]:
Sim(<no label>; 1980 to 2030; pop: 10000 default; epi: 3.19541e+08⚙, 316043♋︎)
This will generate a results dictionary, sim.results. Results by genotype are named things like sim.results['infections_by_genotype'] and stored as arrays where each row corresponds to a genotype, while totals across all genotypes have names like sim.results['infections'] or sim.results['cancers'].
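Rather than creating a parameter dictionary, any valid parameter can also be passed to the sim directly as a keyword argument. For example, the following sets the same population size and dates as above; because it does not pass genotypes, the sim uses the default genotypes and its results differ from the previous run: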
[4]:
sim = hpv.Sim(n_agents=10e3, start=1980, end=2030)
sim.run()
Loading location-specific demographic data for "nigeria"
Initializing sim with 10000 agents
Loading location-specific data for "nigeria"
Running 1980.0 ( 0/204) (0.04 s) ———————————————————— 0%
Running 1982.5 (10/204) (0.31 s) •——————————————————— 5%
Running 1985.0 (20/204) (0.59 s) ••—————————————————— 10%
Running 1987.5 (30/204) (0.87 s) •••————————————————— 15%
Running 1990.0 (40/204) (1.16 s) ••••———————————————— 20%
Running 1992.5 (50/204) (1.46 s) •••••——————————————— 25%
Running 1995.0 (60/204) (1.77 s) •••••——————————————— 30%
Running 1997.5 (70/204) (2.08 s) ••••••—————————————— 35%
Running 2000.0 (80/204) (2.41 s) •••••••————————————— 40%
Running 2002.5 (90/204) (2.75 s) ••••••••———————————— 45%
Running 2005.0 (100/204) (3.09 s) •••••••••——————————— 50%
Running 2007.5 (110/204) (3.44 s) ••••••••••—————————— 54%
Running 2010.0 (120/204) (3.80 s) •••••••••••————————— 59%
Running 2012.5 (130/204) (4.19 s) ••••••••••••———————— 64%
Running 2015.0 (140/204) (4.59 s) •••••••••••••——————— 69%
Running 2017.5 (150/204) (4.99 s) ••••••••••••••—————— 74%
Running 2020.0 (160/204) (5.41 s) •••••••••••••••————— 79%
Running 2022.5 (170/204) (5.84 s) ••••••••••••••••———— 84%
Running 2025.0 (180/204) (6.29 s) •••••••••••••••••——— 89%
Running 2027.5 (190/204) (6.77 s) ••••••••••••••••••—— 94%
Running 2030.0 (200/204) (7.26 s) •••••••••••••••••••— 99%
Simulation summary:
14,344,024 infections
0 dysplasias
0 pre-cins
3,273,196 cin1s
300,240 cin2s
116,361 cin3s
6,111,114 cins
22,267 cancers
0 cancer detections
8,619 cancer deaths
0 detected cancer deaths
10,307,299 reinfections
0 reactivations
773,658,112 number susceptible
17,499,422 number infectious
101,277 number with inactive infection
260,479,472 number with no cellular changes
41,073,328 number with episomal infection
2,873 number with transformation
101,277 number with cancer
17,600,700 number infected
41,174,608 number with abnormal cells
0 number with latent infection
3,758,034 number with precin
6,394,116 number with cin1
1,505,512 number with cin2
1,303,676 number with cin3
9,068,985 number with detectable dysplasia
0 number with detected cancer
0 number screened
0 number treated for precancerous lesions
0 number treated for cancer
0 number vaccinated
0 number given therapeutic vaccine
0.62 hpv incidence (/100)
0 cin1 incidence (/100,000)
0 cin2 incidence (/100,000)
0 cin3 incidence (/100,000)
0 dysplasia incidence (/100,000)
17 cancer incidence (/100,000)
9,517,192 births
2,705,756 other deaths
-1,120,515 migration
26 age-adjusted cervical cancer incidence (/100,000)
0 age-adjusted cervical cancer mortality
0 newly vaccinated
0 cumulative number vaccinated
0 new doses
0 cumulative doses
0 new therapeutic vaccine doses
0 newly received therapeutic vaccine
0 cumulative therapeutic vaccine doses
0 total received therapeutic vaccine
0 new screens
0 newly screened
0 new cin treatments
0 newly treated for cins
0 new cancer treatments
0 newly treated for cancer
0 cumulative screens
0 cumulative number screened
0 cumulative cin treatments
0 cumulative number treated for cins
0 cumulative cancer treatments
0 cumulative number treated for cancer
0 detected cancer incidence (/100,000)
7 cancer mortality
260,479,472 number alive
0 crude death rate
0 crude birth rate
2.24 hpv prevalence (/100)
0 pre-cin prevalence (/100,000)
0 cin1 prevalence (/100,000)
0 cin2 prevalence (/100,000)
0 cin3 prevalence (/100,000)
[4]:
Sim(<no label>; 1980 to 2030; pop: 10000 default; epi: 5.66686e+08⚙, 416602♋︎)
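The step counts in the progress logs can be reproduced with simple arithmetic. The logs advance 2.5 years every 10 steps, which suggests a timestep of 0.25 years, and the totals (144 steps for 1995 to 2030, 204 steps for 1980 to 2030) are consistent with the final year being simulated in full. The sketch below encodes that inference; the function name n_steps and the formula are assumptions read off the output above, not taken from the HPVsim source.

```python
def n_steps(start, end, dt=0.25):
    """Number of timesteps from `start` through the end of year `end`."""
    return round((end - start + 1) / dt)

print(n_steps(1995, 2030))  # 144, matching the default sim's log
print(n_steps(1980, 2030))  # 204, matching the 1980-2030 runs
```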
You can mix and match too – pass in a parameter dictionary with default options, and then include other parameters as keywords (including overrides; keyword arguments take precedence). For example:
[5]:
sim = hpv.Sim(pars, end=2050) # Use parameters defined above, except set the end date to 2050
sim.run()
Loading location-specific demographic data for "nigeria"
Initializing sim with 10000 agents
Loading location-specific data for "nigeria"
Running 1980.0 ( 0/284) (0.03 s) ———————————————————— 0%
Running 1982.5 (10/284) (0.30 s) ———————————————————— 4%
Running 1985.0 (20/284) (0.58 s) •——————————————————— 7%
Running 1987.5 (30/284) (0.86 s) ••—————————————————— 11%
Running 1990.0 (40/284) (1.15 s) ••—————————————————— 14%
Running 1992.5 (50/284) (1.43 s) •••————————————————— 18%
Running 1995.0 (60/284) (1.72 s) ••••———————————————— 21%
Running 1997.5 (70/284) (2.02 s) •••••——————————————— 25%
Running 2000.0 (80/284) (2.34 s) •••••——————————————— 29%
Running 2002.5 (90/284) (2.65 s) ••••••—————————————— 32%
Running 2005.0 (100/284) (2.99 s) •••••••————————————— 36%
Running 2007.5 (110/284) (3.32 s) •••••••————————————— 39%
Running 2010.0 (120/284) (3.66 s) ••••••••———————————— 43%
Running 2012.5 (130/284) (4.03 s) •••••••••——————————— 46%
Running 2015.0 (140/284) (4.39 s) •••••••••——————————— 50%
Running 2017.5 (150/284) (4.78 s) ••••••••••—————————— 53%
Running 2020.0 (160/284) (5.17 s) •••••••••••————————— 57%
Running 2022.5 (170/284) (5.57 s) ••••••••••••———————— 60%
Running 2025.0 (180/284) (6.00 s) ••••••••••••———————— 64%
Running 2027.5 (190/284) (6.45 s) •••••••••••••——————— 67%
Running 2030.0 (200/284) (6.91 s) ••••••••••••••—————— 71%
Running 2032.5 (210/284) (7.39 s) ••••••••••••••—————— 74%
Running 2035.0 (220/284) (7.91 s) •••••••••••••••————— 78%
Running 2037.5 (230/284) (8.42 s) ••••••••••••••••———— 81%
Running 2040.0 (240/284) (8.97 s) ••••••••••••••••———— 85%
Running 2042.5 (250/284) (9.52 s) •••••••••••••••••——— 88%
Running 2045.0 (260/284) (10.14 s) ••••••••••••••••••—— 92%
Running 2047.5 (270/284) (10.74 s) •••••••••••••••••••— 95%
Running 2050.0 (280/284) (11.40 s) •••••••••••••••••••— 99%
Simulation summary:
11,406,265 infections
0 dysplasias
0 pre-cins
2,368,165 cin1s
163,049 cin2s
55,307 cin3s
5,667,218 cins
7,901 cancers
0 cancer detections
11,492 cancer deaths
0 detected cancer deaths
8,023,172 reinfections
0 reactivations
1,120,881,408 number susceptible
13,954,716 number infectious
75,419 number with inactive infection
375,317,152 number with no cellular changes
38,483,932 number with episomal infection
4,310 number with transformation
75,419 number with cancer
14,030,136 number infected
38,559,352 number with abnormal cells
0 number with latent infection
2,748,852 number with precin
5,289,404 number with cin1
1,446,613 number with cin2
1,153,556 number with cin3
7,845,039 number with detectable dysplasia
0 number with detected cancer
0 number screened
0 number treated for precancerous lesions
0 number treated for cancer
0 number vaccinated
0 number given therapeutic vaccine
0.34 hpv incidence (/100)
0 cin1 incidence (/100,000)
0 cin2 incidence (/100,000)
0 cin3 incidence (/100,000)
0 dysplasia incidence (/100,000)
4 cancer incidence (/100,000)
13,848,412 births
3,176,947 other deaths
-5,171,606 migration
5 age-adjusted cervical cancer incidence (/100,000)
0 age-adjusted cervical cancer mortality
0 newly vaccinated
0 cumulative number vaccinated
0 new doses
0 cumulative doses
0 new therapeutic vaccine doses
0 newly received therapeutic vaccine
0 cumulative therapeutic vaccine doses
0 total received therapeutic vaccine
0 new screens
0 newly screened
0 new cin treatments
0 newly treated for cins
0 new cancer treatments
0 newly treated for cancer
0 cumulative screens
0 cumulative number screened
0 cumulative cin treatments
0 cumulative number treated for cins
0 cumulative cancer treatments
0 cumulative number treated for cancer
0 detected cancer incidence (/100,000)
6 cancer mortality
375,317,152 number alive
0 crude death rate
0 crude birth rate
1.24 hpv prevalence (/100)
0 pre-cin prevalence (/100,000)
0 cin1 prevalence (/100,000)
0 cin2 prevalence (/100,000)
0 cin3 prevalence (/100,000)
[5]:
Sim(<no label>; 1980 to 2050; pop: 10000 default; epi: 5.11407e+08⚙, 525780♋︎)
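The precedence rule used in this call, where keyword arguments override entries in the parameter dictionary, behaves like a plain dictionary merge. The sketch below illustrates the idea in standard Python; merge_pars is a hypothetical helper written for this explanation, not HPVsim's actual implementation.

```python
def merge_pars(pars=None, **kwargs):
    """Combine a parameter dict with keyword overrides; keywords win."""
    merged = dict(pars or {})  # Copy so the caller's dict is not modified
    merged.update(kwargs)      # Later values replace earlier ones
    return merged

pars = dict(n_agents=10e3, genotypes=[16, 18, 'hr'], start=1980, end=2030)
merged = merge_pars(pars, end=2050)

print(merged['end'])  # 2050 (keyword override)
print(pars['end'])    # 2030 (original dict unchanged)
```

Copying before updating keeps the original dictionary reusable for other sims.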
Plotting results
As you saw above, plotting the results of a simulation is rather easy too:
[6]:
fig = sim.plot()
Full usage example
Many of the details of this example will be explained in later tutorials, but to give you a taste, here’s an example of how you would run two simulations to determine the impact of a custom vaccination intervention that targets children between the ages of 9 and 14.
[7]:
import hpvsim as hpv
# Custom vaccination intervention: in the year 2000, boost immunity for ages 9 to 14
def custom_vx(sim):
    if sim.yearvec[sim.t] == 2000:
        target_group = (sim.people.age>9) * (sim.people.age<14)
        sim.people.peak_imm[0, target_group] = 1 # Set peak immunity in row 0 to 1 for the target group
pars = dict(
    location = 'tanzania', # Use population characteristics for Tanzania
    n_agents = 10e3, # Have 10,000 agents in the population
    start = 1980, # Start the simulation in 1980
    n_years = 50, # Run the simulation for 50 years
    burnin = 10, # Discard the first 10 years as burn-in period
    verbose = 0, # Do not print any output
)
# Running with multisims -- see Tutorial 3
s1 = hpv.Sim(pars, label='Default')
s2 = hpv.Sim(pars, interventions=custom_vx, label='Custom vaccination')
msim = hpv.MultiSim([s1, s2])
msim.run()
fig = msim.plot(['cancers', 'dysplasias'])
Loading location-specific demographic data for "tanzania"
Loading location-specific demographic data for "tanzania"
<Figure size 640x480 with 0 Axes>
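The target_group expression in custom_vx multiplies two boolean conditions, which acts as a logical AND: the product is truthy only when both conditions hold. The sketch below shows the same logic with plain Python values rather than the arrays stored in sim.people; the ages are made-up illustrative values, not simulation output.

```python
ages = [5.0, 9.0, 9.5, 12.0, 13.9, 14.0, 30.0]  # Illustrative ages only

# True * True == 1, and any product involving False == 0
target_group = [bool((age > 9) * (age < 14)) for age in ages]

print(target_group)
# [False, False, True, True, True, False, False]
```

Both comparisons are strict, so agents aged exactly 9 or exactly 14 fall outside the target group.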