T1 - Getting started

Installing and getting started with HPVsim is quite simple.

HPVsim is a Python package that can be pip-installed by typing pip install hpvsim into a terminal. You can then check that the installation was successful by importing HPVsim with import hpvsim as hpv.
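If you would rather check for the package without triggering an import error, the sketch below uses only the standard library. The helper name is_installed is hypothetical and is not part of HPVsim; it only reports whether Python can locate a top-level package.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be located, without importing it."""
    return importlib.util.find_spec(package_name) is not None


# Report whether HPVsim is available in the current environment
if is_installed("hpvsim"):
    print("HPVsim is installed")
else:
    print("HPVsim was not found; try: pip install hpvsim")
```

This only confirms that the package can be found; importing it with import hpvsim as hpv remains the definitive check.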

The basic design philosophy of HPVsim is: common tasks should be simple. For example:

  • Defining parameters

  • Running a simulation

  • Plotting results

This tutorial walks you through how to do these things.


Hello world

Creating, running, and plotting a sim with default options takes just a few lines:

[1]:
import hpvsim as hpv

sim = hpv.Sim()
sim.run()
fig = sim.plot()
HPVsim 1.2.2 (2023-08-11) — © 2023 by IDM
Loading location-specific demographic data for "nigeria"
Initializing sim with 20000 agents
Loading location-specific data for "nigeria"
  Running 1995.0 ( 0/144) (0.96 s)  ———————————————————— 1%
  Running 1997.5 (10/144) (1.33 s)  •——————————————————— 8%
  Running 2000.0 (20/144) (1.71 s)  ••—————————————————— 15%
  Running 2002.5 (30/144) (2.10 s)  ••••———————————————— 22%
  Running 2005.0 (40/144) (2.52 s)  •••••——————————————— 28%
  Running 2007.5 (50/144) (2.92 s)  •••••••————————————— 35%
  Running 2010.0 (60/144) (3.36 s)  ••••••••———————————— 42%
  Running 2012.5 (70/144) (3.80 s)  •••••••••——————————— 49%
  Running 2015.0 (80/144) (4.27 s)  •••••••••••————————— 56%
  Running 2017.5 (90/144) (4.76 s)  ••••••••••••———————— 63%
  Running 2020.0 (100/144) (5.26 s)  ••••••••••••••—————— 70%
  Running 2022.5 (110/144) (5.77 s)  •••••••••••••••————— 77%
  Running 2025.0 (120/144) (6.32 s)  ••••••••••••••••———— 84%
  Running 2027.5 (130/144) (6.89 s)  ••••••••••••••••••—— 91%
  Running 2030.0 (140/144) (7.52 s)  •••••••••••••••••••— 98%
Simulation summary:
   12,396,439 infections
            0 dysplasias
            0 pre-cins
    2,843,543 cin1s
      251,561 cin2s
      136,729 cin3s
    5,780,554 cins
       18,159 cancers
            0 cancer detections
       14,955 cancer deaths
            0 detected cancer deaths
    8,663,087 reinfections
            0 reactivations
   776,222,336 number susceptible
   14,938,752 number infectious
      137,263 number with inactive infection
   260,489,216 number with no cellular changes
   39,731,632 number with episomal infection
        2,670 number with transformation
      137,263 number with cancer
   15,076,015 number infected
   39,868,896 number with abnormal cells
            0 number with latent infection
    2,955,704 number with precin
    5,675,871 number with cin1
    1,481,591 number with cin2
    1,207,598 number with cin3
    8,268,922 number with detectable dysplasia
            0 number with detected cancer
            0 number screened
            0 number treated for precancerous lesions
            0 number treated for cancer
            0 number vaccinated
            0 number given therapeutic vaccine
         0.53 hpv incidence (/100)
            0 cin1 incidence (/100,000)
            0 cin2 incidence (/100,000)
            0 cin3 incidence (/100,000)
            0 dysplasia incidence (/100,000)
           14 cancer incidence (/100,000)
    9,517,645 births
    2,496,913 other deaths
   -1,324,566 migration
           21 age-adjusted cervical cancer incidence (/100,000)
            0 age-adjusted cervical cancer mortality
            0 newly vaccinated
            0 cumulative number vaccinated
            0 new doses
            0 cumulative doses
            0 new therapeutic vaccine doses
            0 newly received therapeutic vaccine
            0 cumulative therapeutic vaccine doses
            0 total received therapeutic vaccine
            0 new screens
            0 newly screened
            0 new cin treatments
            0 newly treated for cins
            0 new cancer treatments
            0 newly treated for cancer
            0 cumulative screens
            0 cumulative number screened
            0 cumulative cin treatments
            0 cumulative number treated for cins
            0 cumulative cancer treatments
            0 cumulative number treated for cancer
            0 detected cancer incidence (/100,000)
           12 cancer mortality
   260,489,216 number alive
            0 crude death rate
            0 crude birth rate
         1.91 hpv prevalence (/100)
            0 pre-cin prevalence (/100,000)
            0 cin1 prevalence (/100,000)
            0 cin2 prevalence (/100,000)
            0 cin3 prevalence (/100,000)

../_images/tutorials_tut_intro_3_1.svg

Defining parameters and genotypes, and running simulations

Parameters are defined as a dictionary. Some common parameters to modify are the number of agents in the simulation, the genotypes to simulate, and the start and end dates of the simulation. We can define those as:

[2]:
pars = dict(
    n_agents = 10e3,
    genotypes = [16, 18, 'hr'], # Simulate genotypes 16 and 18, plus all other high-risk HPV genotypes pooled together
    start = 1980,
    end = 2030,
)

Running a sim with the parameters defined above takes just two lines:

[3]:
sim = hpv.Sim(pars)
sim.run()
Loading location-specific demographic data for "nigeria"
Initializing sim with 10000 agents
Loading location-specific data for "nigeria"
  Running 1980.0 ( 0/204) (0.03 s)  ———————————————————— 0%
  Running 1982.5 (10/204) (0.31 s)  •——————————————————— 5%
  Running 1985.0 (20/204) (0.59 s)  ••—————————————————— 10%
  Running 1987.5 (30/204) (0.87 s)  •••————————————————— 15%
  Running 1990.0 (40/204) (1.16 s)  ••••———————————————— 20%
  Running 1992.5 (50/204) (1.45 s)  •••••——————————————— 25%
  Running 1995.0 (60/204) (1.75 s)  •••••——————————————— 30%
  Running 1997.5 (70/204) (2.05 s)  ••••••—————————————— 35%
  Running 2000.0 (80/204) (2.39 s)  •••••••————————————— 40%
  Running 2002.5 (90/204) (2.71 s)  ••••••••———————————— 45%
  Running 2005.0 (100/204) (3.05 s)  •••••••••——————————— 50%
  Running 2007.5 (110/204) (3.39 s)  ••••••••••—————————— 54%
  Running 2010.0 (120/204) (3.74 s)  •••••••••••————————— 59%
  Running 2012.5 (130/204) (4.11 s)  ••••••••••••———————— 64%
  Running 2015.0 (140/204) (4.48 s)  •••••••••••••——————— 69%
  Running 2017.5 (150/204) (4.87 s)  ••••••••••••••—————— 74%
  Running 2020.0 (160/204) (5.27 s)  •••••••••••••••————— 79%
  Running 2022.5 (170/204) (5.68 s)  ••••••••••••••••———— 84%
  Running 2025.0 (180/204) (6.12 s)  •••••••••••••••••——— 89%
  Running 2027.5 (190/204) (6.58 s)  ••••••••••••••••••—— 94%
  Running 2030.0 (200/204) (7.05 s)  •••••••••••••••••••— 99%
Simulation summary:
    7,778,957 infections
            0 dysplasias
            0 pre-cins
    1,723,150 cin1s
      136,473 cin2s
       15,802 cin3s
    4,317,573 cins
        5,746 cancers
            0 cancer detections
        7,901 cancer deaths
            0 detected cancer deaths
    5,408,638 reinfections
            0 reactivations
   777,292,608 number susceptible
    9,445,364 number infectious
       63,927 number with inactive infection
   260,174,192 number with no cellular changes
   25,786,204 number with episomal infection
        2,155 number with transformation
       63,927 number with cancer
    9,509,292 number infected
   25,850,132 number with abnormal cells
            0 number with latent infection
    1,995,378 number with precin
    3,498,735 number with cin1
      920,115 number with cin2
      813,810 number with cin3
    5,195,309 number with detectable dysplasia
            0 number with detected cancer
            0 number screened
            0 number treated for precancerous lesions
            0 number treated for cancer
            0 number vaccinated
            0 number given therapeutic vaccine
         0.33 hpv incidence (/100)
            0 cin1 incidence (/100,000)
            0 cin2 incidence (/100,000)
            0 cin3 incidence (/100,000)
            0 dysplasia incidence (/100,000)
            4 cancer incidence (/100,000)
    9,517,192 births
    2,585,803 other deaths
   -1,206,708 migration
            6 age-adjusted cervical cancer incidence (/100,000)
            0 age-adjusted cervical cancer mortality
            0 newly vaccinated
            0 cumulative number vaccinated
            0 new doses
            0 cumulative doses
            0 new therapeutic vaccine doses
            0 newly received therapeutic vaccine
            0 cumulative therapeutic vaccine doses
            0 total received therapeutic vaccine
            0 new screens
            0 newly screened
            0 new cin treatments
            0 newly treated for cins
            0 new cancer treatments
            0 newly treated for cancer
            0 cumulative screens
            0 cumulative number screened
            0 cumulative cin treatments
            0 cumulative number treated for cins
            0 cumulative cancer treatments
            0 cumulative number treated for cancer
            0 detected cancer incidence (/100,000)
            6 cancer mortality
   260,174,192 number alive
            0 crude death rate
            0 crude birth rate
         1.21 hpv prevalence (/100)
            0 pre-cin prevalence (/100,000)
            0 cin1 prevalence (/100,000)
            0 cin2 prevalence (/100,000)
            0 cin3 prevalence (/100,000)

[3]:
Sim(<no label>; 1980 to 2030; pop: 10000 default; epi: 3.19541e+08⚙, 316043♋︎)

This will generate a results dictionary sim.results. Results by genotype are named things like sim.results['infections_by_genotype'] and stored as arrays where each row corresponds to a genotype, while totals across all genotypes have names like sim.results['infections'] or sim.results['cancers'].
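To picture the row-per-genotype layout, here is a toy sketch in plain Python. The numbers and variable names are made up for illustration and are not HPVsim output; it also assumes that a total across genotypes is the column-wise sum of the per-genotype rows, which is natural for counts but has not been checked against the HPVsim source.

```python
# Toy numbers for illustration only; these are not HPVsim outputs.
genotypes = ["16", "18", "hr"]

# One row per genotype, one column per timestep
infections_by_genotype = [
    [5, 7, 6],   # genotype 16
    [2, 3, 4],   # genotype 18
    [9, 8, 10],  # other high-risk genotypes, pooled
]

# Total across genotypes at each timestep: sum down each column
infections_total = [sum(column) for column in zip(*infections_by_genotype)]

for name, row in zip(genotypes, infections_by_genotype):
    print(f"genotype {name}: {row}")
print(f"all genotypes: {infections_total}")
```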

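Rather than creating a parameter dictionary, you can also pass any valid parameter to the sim directly as a keyword argument. For example, the call below sets the same population size and dates as above. Because it does not set genotypes, the sim uses the default genotypes, so its results differ from the previous run: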

[4]:
sim = hpv.Sim(n_agents=10e3, start=1980, end=2030)
sim.run()
Loading location-specific demographic data for "nigeria"
Initializing sim with 10000 agents
Loading location-specific data for "nigeria"
  Running 1980.0 ( 0/204) (0.03 s)  ———————————————————— 0%
  Running 1982.5 (10/204) (0.31 s)  •——————————————————— 5%
  Running 1985.0 (20/204) (0.59 s)  ••—————————————————— 10%
  Running 1987.5 (30/204) (0.88 s)  •••————————————————— 15%
  Running 1990.0 (40/204) (1.18 s)  ••••———————————————— 20%
  Running 1992.5 (50/204) (1.48 s)  •••••——————————————— 25%
  Running 1995.0 (60/204) (1.79 s)  •••••——————————————— 30%
  Running 1997.5 (70/204) (2.11 s)  ••••••—————————————— 35%
  Running 2000.0 (80/204) (2.45 s)  •••••••————————————— 40%
  Running 2002.5 (90/204) (2.79 s)  ••••••••———————————— 45%
  Running 2005.0 (100/204) (3.16 s)  •••••••••——————————— 50%
  Running 2007.5 (110/204) (3.52 s)  ••••••••••—————————— 54%
  Running 2010.0 (120/204) (3.91 s)  •••••••••••————————— 59%
  Running 2012.5 (130/204) (4.32 s)  ••••••••••••———————— 64%
  Running 2015.0 (140/204) (4.73 s)  •••••••••••••——————— 69%
  Running 2017.5 (150/204) (5.15 s)  ••••••••••••••—————— 74%
  Running 2020.0 (160/204) (5.59 s)  •••••••••••••••————— 79%
  Running 2022.5 (170/204) (6.04 s)  ••••••••••••••••———— 84%
  Running 2025.0 (180/204) (6.52 s)  •••••••••••••••••——— 89%
  Running 2027.5 (190/204) (7.02 s)  ••••••••••••••••••—— 94%
  Running 2030.0 (200/204) (7.54 s)  •••••••••••••••••••— 99%
Simulation summary:
   14,344,024 infections
            0 dysplasias
            0 pre-cins
    3,273,196 cin1s
      300,240 cin2s
      116,361 cin3s
    6,111,114 cins
       22,267 cancers
            0 cancer detections
        8,619 cancer deaths
            0 detected cancer deaths
   10,307,299 reinfections
            0 reactivations
   773,658,112 number susceptible
   17,499,422 number infectious
      101,277 number with inactive infection
   260,479,472 number with no cellular changes
   41,073,328 number with episomal infection
        2,873 number with transformation
      101,277 number with cancer
   17,600,700 number infected
   41,174,608 number with abnormal cells
            0 number with latent infection
    3,758,034 number with precin
    6,394,116 number with cin1
    1,505,512 number with cin2
    1,303,676 number with cin3
    9,068,985 number with detectable dysplasia
            0 number with detected cancer
            0 number screened
            0 number treated for precancerous lesions
            0 number treated for cancer
            0 number vaccinated
            0 number given therapeutic vaccine
         0.62 hpv incidence (/100)
            0 cin1 incidence (/100,000)
            0 cin2 incidence (/100,000)
            0 cin3 incidence (/100,000)
            0 dysplasia incidence (/100,000)
           17 cancer incidence (/100,000)
    9,517,192 births
    2,705,756 other deaths
   -1,120,515 migration
           26 age-adjusted cervical cancer incidence (/100,000)
            0 age-adjusted cervical cancer mortality
            0 newly vaccinated
            0 cumulative number vaccinated
            0 new doses
            0 cumulative doses
            0 new therapeutic vaccine doses
            0 newly received therapeutic vaccine
            0 cumulative therapeutic vaccine doses
            0 total received therapeutic vaccine
            0 new screens
            0 newly screened
            0 new cin treatments
            0 newly treated for cins
            0 new cancer treatments
            0 newly treated for cancer
            0 cumulative screens
            0 cumulative number screened
            0 cumulative cin treatments
            0 cumulative number treated for cins
            0 cumulative cancer treatments
            0 cumulative number treated for cancer
            0 detected cancer incidence (/100,000)
            7 cancer mortality
   260,479,472 number alive
            0 crude death rate
            0 crude birth rate
         2.24 hpv prevalence (/100)
            0 pre-cin prevalence (/100,000)
            0 cin1 prevalence (/100,000)
            0 cin2 prevalence (/100,000)
            0 cin3 prevalence (/100,000)

[4]:
Sim(<no label>; 1980 to 2030; pop: 10000 default; epi: 5.66686e+08⚙, 416602♋︎)
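The step counts in the progress logs can be reproduced with a little arithmetic. In the logs, every 10 steps advance the year by 2.5, which suggests a timestep of 0.25 years, and the totals (144 for 1995 to 2030, 204 for 1980 to 2030) match (end - start + 1) / dt. This formula is inferred from the logs shown here, not taken from the HPVsim source, and the helper name n_timesteps is hypothetical.

```python
def n_timesteps(start, end, dt=0.25):
    """Number of timesteps, assuming the end year is simulated in full."""
    return int(round((end - start + 1) / dt))


print(n_timesteps(1995, 2030))  # 144, matching the default run
print(n_timesteps(1980, 2030))  # 204, matching the runs with start=1980
```

Halving n_agents does not change these counts; only the start year, end year, and timestep do.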

You can mix and match too – pass in a parameter dictionary with default options, and then include other parameters as keywords (including overrides; keyword arguments take precedence). For example:

[5]:
sim = hpv.Sim(pars, end=2050) # Use parameters defined above, except set the end date to 2050
sim.run()
Loading location-specific demographic data for "nigeria"
Initializing sim with 10000 agents
Loading location-specific data for "nigeria"
  Running 1980.0 ( 0/284) (0.03 s)  ———————————————————— 0%
  Running 1982.5 (10/284) (0.31 s)  ———————————————————— 4%
  Running 1985.0 (20/284) (0.60 s)  •——————————————————— 7%
  Running 1987.5 (30/284) (0.88 s)  ••—————————————————— 11%
  Running 1990.0 (40/284) (1.18 s)  ••—————————————————— 14%
  Running 1992.5 (50/284) (1.47 s)  •••————————————————— 18%
  Running 1995.0 (60/284) (1.78 s)  ••••———————————————— 21%
  Running 1997.5 (70/284) (2.09 s)  •••••——————————————— 25%
  Running 2000.0 (80/284) (2.42 s)  •••••——————————————— 29%
  Running 2002.5 (90/284) (2.75 s)  ••••••—————————————— 32%
  Running 2005.0 (100/284) (3.10 s)  •••••••————————————— 36%
  Running 2007.5 (110/284) (3.44 s)  •••••••————————————— 39%
  Running 2010.0 (120/284) (3.80 s)  ••••••••———————————— 43%
  Running 2012.5 (130/284) (4.18 s)  •••••••••——————————— 46%
  Running 2015.0 (140/284) (4.56 s)  •••••••••——————————— 50%
  Running 2017.5 (150/284) (4.94 s)  ••••••••••—————————— 53%
  Running 2020.0 (160/284) (5.36 s)  •••••••••••————————— 57%
  Running 2022.5 (170/284) (5.77 s)  ••••••••••••———————— 60%
  Running 2025.0 (180/284) (6.22 s)  ••••••••••••———————— 64%
  Running 2027.5 (190/284) (6.68 s)  •••••••••••••——————— 67%
  Running 2030.0 (200/284) (7.16 s)  ••••••••••••••—————— 71%
  Running 2032.5 (210/284) (7.65 s)  ••••••••••••••—————— 74%
  Running 2035.0 (220/284) (8.17 s)  •••••••••••••••————— 78%
  Running 2037.5 (230/284) (8.70 s)  ••••••••••••••••———— 81%
  Running 2040.0 (240/284) (9.27 s)  ••••••••••••••••———— 85%
  Running 2042.5 (250/284) (9.85 s)  •••••••••••••••••——— 88%
  Running 2045.0 (260/284) (10.49 s)  ••••••••••••••••••—— 92%
  Running 2047.5 (270/284) (11.11 s)  •••••••••••••••••••— 95%
  Running 2050.0 (280/284) (11.80 s)  •••••••••••••••••••— 99%
Simulation summary:
   11,406,265 infections
            0 dysplasias
            0 pre-cins
    2,368,165 cin1s
      163,049 cin2s
       55,307 cin3s
    5,667,218 cins
        7,901 cancers
            0 cancer detections
       11,492 cancer deaths
            0 detected cancer deaths
    8,023,172 reinfections
            0 reactivations
   1,120,881,408 number susceptible
   13,954,716 number infectious
       75,419 number with inactive infection
   375,317,152 number with no cellular changes
   38,483,932 number with episomal infection
        4,310 number with transformation
       75,419 number with cancer
   14,030,136 number infected
   38,559,352 number with abnormal cells
            0 number with latent infection
    2,748,852 number with precin
    5,289,404 number with cin1
    1,446,613 number with cin2
    1,153,556 number with cin3
    7,845,039 number with detectable dysplasia
            0 number with detected cancer
            0 number screened
            0 number treated for precancerous lesions
            0 number treated for cancer
            0 number vaccinated
            0 number given therapeutic vaccine
         0.34 hpv incidence (/100)
            0 cin1 incidence (/100,000)
            0 cin2 incidence (/100,000)
            0 cin3 incidence (/100,000)
            0 dysplasia incidence (/100,000)
            4 cancer incidence (/100,000)
   13,848,412 births
    3,176,947 other deaths
   -5,171,606 migration
            5 age-adjusted cervical cancer incidence (/100,000)
            0 age-adjusted cervical cancer mortality
            0 newly vaccinated
            0 cumulative number vaccinated
            0 new doses
            0 cumulative doses
            0 new therapeutic vaccine doses
            0 newly received therapeutic vaccine
            0 cumulative therapeutic vaccine doses
            0 total received therapeutic vaccine
            0 new screens
            0 newly screened
            0 new cin treatments
            0 newly treated for cins
            0 new cancer treatments
            0 newly treated for cancer
            0 cumulative screens
            0 cumulative number screened
            0 cumulative cin treatments
            0 cumulative number treated for cins
            0 cumulative cancer treatments
            0 cumulative number treated for cancer
            0 detected cancer incidence (/100,000)
            6 cancer mortality
   375,317,152 number alive
            0 crude death rate
            0 crude birth rate
         1.24 hpv prevalence (/100)
            0 pre-cin prevalence (/100,000)
            0 cin1 prevalence (/100,000)
            0 cin2 prevalence (/100,000)
            0 cin3 prevalence (/100,000)

[5]:
Sim(<no label>; 1980 to 2050; pop: 10000 default; epi: 5.11407e+08⚙, 525780♋︎)
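The precedence rule can be pictured as a dictionary merge in which keywords are applied last. The sketch below is plain Python and only illustrates the idea; merge_pars is a hypothetical helper, not a function from HPVsim, and HPVsim's own handling may differ in detail.

```python
def merge_pars(pars=None, **kwargs):
    """Combine a parameter dict with keyword overrides; keywords win."""
    merged = dict(pars or {})  # copy so the caller's dict is not modified
    merged.update(kwargs)      # keyword arguments overwrite dict entries
    return merged


pars = dict(n_agents=10e3, genotypes=[16, 18, 'hr'], start=1980, end=2030)
merged = merge_pars(pars, end=2050)

print(merged['end'])  # 2050: the keyword override
print(pars['end'])    # 2030: the original dictionary is unchanged
```

Copying before updating means the same pars dictionary can be reused for several sims with different overrides.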

Plotting results

As you saw above, plotting the results of a simulation is rather easy too:

[6]:
fig = sim.plot()
../_images/tutorials_tut_intro_13_0.svg

Full usage example

Many of the details of this example will be explained in later tutorials, but to give you a taste, here’s an example of how you would run two simulations to determine the impact of a custom vaccination intervention targeting people aged between 9 and 14.

[7]:
import hpvsim as hpv

# Custom vaccination intervention
def custom_vx(sim):
    if sim.yearvec[sim.t] == 2000:
        target_group = (sim.people.age>9) * (sim.people.age<14)
        sim.people.peak_imm[0, target_group] = 1

pars = dict(
    location = 'tanzania', # Use population characteristics for Tanzania
    n_agents = 10e3, # Have 10,000 agents total in the population
    start = 1980, # Start the simulation in 1980
    n_years = 50, # Run the simulation for 50 years
    burnin = 10, # Discard the first 10 years as burnin period
    verbose = 0, # Do not print any output
)

# Running with multisims -- see Tutorial 3
s1 = hpv.Sim(pars, label='Default')
s2 = hpv.Sim(pars, interventions=custom_vx, label='Custom vaccination')
msim = hpv.MultiSim([s1, s2])
msim.run()
fig = msim.plot(['cancers', 'dysplasias'])
Loading location-specific demographic data for "tanzania"
Loading location-specific demographic data for "tanzania"
../_images/tutorials_tut_intro_15_1.svg
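The two conditions inside custom_vx can be explored without running a sim. The sketch below uses plain Python lists as stand-ins for sim.yearvec and sim.people.age. The timestep of 0.25 years is an assumption carried over from the earlier progress logs, and the ages are made-up values.

```python
start, n_years, dt = 1980, 50, 0.25  # dt is assumed, inferred from earlier logs

# Hypothetical stand-in for sim.yearvec: one entry per timestep
yearvec = [start + i * dt for i in range(int(n_years / dt))]

# Timesteps at which the condition `yearvec[t] == 2000` holds
trigger_steps = [t for t, year in enumerate(yearvec) if year == 2000]
print(trigger_steps)  # [80]: the intervention fires on exactly one timestep

# Target-group mask: multiplying two booleans acts as a logical AND
ages = [5.0, 9.0, 9.5, 13.9, 14.0, 30.0]  # made-up ages
target_group = [bool((age > 9) * (age < 14)) for age in ages]
print(target_group)  # [False, False, True, True, False, False]
```

The exact equality test works here because multiples of 0.25 are represented exactly in floating point; with a timestep such as 0.1, a tolerance-based comparison would be safer. Both age bounds are exclusive, so agents aged exactly 9 or 14 are not selected.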