Welcome to EMOD-Generic-Scripts¶
EMOD-Generic-Scripts is a collection of Python scripts and demonstration models for use with the generic simtype of EMOD (current with the Generic-Ongoing branch), emod-api, and idmtools. All of these models are for use on COmputational Modeling Platform Service (COMPS) at IDM but can be easily adapted for use in other Slurm systems with support for SIF containers.
Each model is independent and implemented using the generic simtype of EMOD. For each model, one or more simulation ensembles (called experiments on COMPS) are provided that highlight interesting features and/or phenomena. EMOD simulates stochastic processes; individual outcomes may not be representative, so an ensemble of simulations is the natural basis for consideration.
Input files for each simulation are constructed using emod-api as a Python pre-processing step during execution on COMPS. Values to be varied between simulations or between experiments are considered simulation parameters and are conceptually distinct from software parameters. Simulation parameters are specified as arguments to the Python scripts that construct input files; those scripts are responsible for setting all necessary software parameters. Simulation parameters are typically a (small) subset of the software parameters required and there may not be a one-to-one mapping between simulation parameters and software parameters (e.g., a single simulation parameter named R0 may set several software parameters in order to ensure the model produces the desired R0 value). Model construction is the process of determining which software parameters are static and which software parameters are to be varied between simulations. The software parameter “Run_Number” is used to set the random number seed for a simulation and should always be included as a simulation parameter.
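As a minimal sketch of that distinction, the function below maps two simulation parameters onto a handful of software parameters. The function name, the dictionary keys `R0` and `run_number`, the fixed 8-day infectious period, and every software parameter name other than `Run_Number` are assumptions chosen for illustration; the scripts in each model directory define the actual mapping.

```python
def software_params_from_sim_params(sim_params, infectious_period_days=8.0):
    """Translate simulation parameters into software parameters (illustrative)."""
    r0 = sim_params["R0"]
    return {
        # Random number seed; varied so that each simulation is a distinct replicate.
        "Run_Number": sim_params["run_number"],
        # Assumed names: a single simulation parameter (R0) sets two software parameters.
        "Base_Infectivity_Constant": r0 / infectious_period_days,
        "Infectious_Period_Constant": infectious_period_days,
    }


example = software_params_from_sim_params({"R0": 12.0, "run_number": 3})
```

In this sketch the product of the two assumed infectivity parameters recovers the requested R0, which is the kind of consistency the construction scripts are responsible for.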
Additional information about how to use idmtools can be found in the documentation. Additional information about software parameters for the generic simtype of EMOD can be found in the parameter overview.
Click here for a diagram showing how idmtools and each of the related packages are used in an end-to-end workflow using EMOD as the disease transmission model.
Get started¶
Follow the steps below to get started:
Set up a virtual Python environment (e.g., conda).
Install requirements via pip using the IDM artifactory:
pip install -r requirements.txt --index-url=https://packages.idmod.org/api/pypi/pypi-production/simple
Run an experiment:
cd EMOD-Generic-Scripts/model_covariance01/experiment_covariance01
python make01_param_dict.py
python make02_launch_sims.py
python make03_pool_brick.py
Generate figures:
cd EMOD-Generic-Scripts/model_covariance01/figure_attackfrac01
python make_fig_attackrate.py
Client- and server-side workflow¶
This workflow separates defining and uploading (client-side operations) from writing inputs and running sims (server-side).
Client-side¶
The client-side workflow uses idmtools and emodpy to communicate with COMPS.
Create a parameter dictionary that defines the experiment by specifying values for all simulation parameters:
OUTPUT = param_dict.json
Upload the parameter dictionary along with the Python files that construct the model inputs. Run the simulations:
OUTPUT = COMPS_ID.id
Collect results from the server:
OUTPUT = data_brick.json
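The first client-side step can be sketched as follows. The layout of the dictionary (the keys `exp_name`, `num_sims`, `variable`, and `constant`), the function name, and the parameter values are hypothetical and chosen for illustration; only the output file name `param_dict.json` comes from the steps listed here.

```python
import json
import random


def build_param_dict(num_sims, seed=0):
    """Build one value per simulation for every varied simulation parameter."""
    rng = random.Random(seed)
    return {
        "exp_name": "example_experiment",  # hypothetical experiment label
        "num_sims": num_sims,
        "variable": {
            # One entry per simulation; list position identifies the simulation.
            "run_number": list(range(num_sims)),
            "R0": [round(rng.uniform(0.5, 3.0), 3) for _ in range(num_sims)],
        },
        "constant": {
            # Values shared by every simulation in the experiment.
            "num_tsteps": 1000,
        },
    }


param_dict = build_param_dict(num_sims=4)
param_json = json.dumps(param_dict, indent=2)
# Writing the file is a single extra step:
#     with open("param_dict.json", "w") as fid:
#         fid.write(param_json)
```

Keeping every varied value in one JSON file means the whole experiment definition travels to the server as a single asset.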
Server-side¶
The server-side workflow uses emod-api for file creation.
Each EMOD simulation automatically runs three Python functions: dtk_pre_process, dtk_in_process, and dtk_post_process. All file creation work is done within the dtk_pre_process function. That function finds the ID assigned to the simulation, opens the parameter dictionary, retrieves the correct values for the simulation, and then uses emod-api to create the input files.
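The lookup step described above can be sketched as a small helper. The dictionary layout, the zero-based simulation index, and the function name are assumptions made for illustration; how a simulation discovers its own index on COMPS is not shown here.

```python
import json


def params_for_simulation(param_dict, sim_index):
    """Return the simulation parameters that apply to one simulation."""
    values = dict(param_dict["constant"])           # shared by all simulations
    for name, per_sim in param_dict["variable"].items():
        values[name] = per_sim[sim_index]           # value for this simulation
    return values


# Stand-in for the contents of param_dict.json (hypothetical layout).
param_dict = json.loads("""
{
  "variable": {"run_number": [0, 1, 2], "R0": [1.2, 1.8, 2.4]},
  "constant": {"num_tsteps": 1000}
}
""")

sim_params = params_for_simulation(param_dict, sim_index=1)
```

The resulting flat dictionary is what a file-construction script would then pass on to emod-api calls.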
The easiest way to trace the workflow is to start in an experiment directory and examine the three Python scripts (corresponding to the three client-side steps above) that need to be run in order.
Template for workflow¶
You can easily start an EMOD-Generic-Scripts type workflow by utilizing cookiecutter-EMOD-Generic.
Follow the instructions below:
Install cookiecutter
pip install -U cookiecutter
Generate the project. You will be prompted for some choices and defaults.
cookiecutter git@github.com:InstituteforDiseaseModeling/cookiecutter-EMOD-Generic.git
Examples¶
EMOD-Generic-Scripts provides a set of example models. Please consider contributing your own analyses.
model_covariance01¶
This example demonstrates the impact of heterogeneity on epidemic attack rate. These simulations sweep over R0 with fixed levels for variance in an individual’s acquisition risk, transmission rate, and covariance between acquisition risk and transmission rate.
Each simulation uses a single node of fully susceptible individuals and an outbreak is initialized by a constant importation pressure of infected individuals. There is no age structure, vital dynamics, or waning.
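For comparison with these simulations, the classical final-size relation for a homogeneous, well-mixed population gives the attack fraction z as the non-zero root of z = 1 - exp(-R0 * z) when R0 exceeds 1. The sketch below solves that relation by bisection. It is a reference calculation only: it ignores the importation pressure and the heterogeneity included in the simulations, and the function name is hypothetical.

```python
import math


def homogeneous_attack_fraction(r0, tol=1e-10):
    """Solve z = 1 - exp(-r0 * z) for the epidemic attack fraction z."""
    if r0 <= 1.0:
        return 0.0  # no large outbreak at or below the epidemic threshold

    def residual(z):
        return z - 1.0 + math.exp(-r0 * z)

    # For r0 comfortably above 1, residual(lo) < 0 < residual(hi).
    lo, hi = 1e-9, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


attack_fractions = {r0: homogeneous_attack_fraction(r0) for r0 in (0.8, 1.5, 2.0, 3.0)}
```

Departures of the simulated attack rate from this curve at a given R0 are one way to read the effect of the variance and covariance settings.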
model_covid01¶
Demonstration simulations for SARS-CoV-2 transmission. The model structure was used to examine the impact of Supplementary Immunization Activities (SIAs) for non-COVID-19 diseases on SARS-CoV-2 transmission.
Important features include:
Multi-node network used to represent an urban center with peri-urban and rural locations.
Population migration between the nodes.
Health care workers (HCW) and non-HCW moving between locations.
Personal protective equipment (PPE) for HCW.
Age-dependent susceptibility.
Self-isolation behavior.
Age-structured contact matrices.
Overdispersion of the infection rate.
This model does not include:
SARS-CoV-2 variants of concern.
Waning immunity, re-infection, or secondary vaccine failure.
Vital dynamics (births, deaths, or aging).
model_demographics01¶
This example demonstrates the implementation of vital dynamics (births, deaths, and aging) to reproduce the population pyramid of the United Kingdom over the period 1950 to 1980. Simulations include no infections or contagion, and have only one node.
Sweep outcomes are replicates of four model configurations: initialization of ages at steady-state with steady-state approximation for birth and mortality rates, initialization of ages at steady-state with time-varying inputs for birth rate, initialization of ages using historical data with time-varying inputs for birth rate and steady-state approximation for mortality, and initialization of ages using historical data with time-varying inputs for birth rate and mortality rates adjusted to best fit total population and age pyramid data.
Calibration outcomes demonstrate the adjustment of three age-structured mortality multipliers to best fit total population and age pyramid data.
model_demographics02¶
This example demonstrates the implementation of vital dynamics (births, deaths, and aging) to reproduce data from UN World Population Prospects between the years 1950 and 2090. Simulations include no infections or contagion, and have only one node.
Sweep outcomes are replicates of two time periods: estimates (1950 - 2020) and median projections (2020 - 2090). Simulations do not currently include any immigration, so it is not possible to match WPP data for countries where total population would be declining except for an exogenous source of adult individuals.
model_measles_cod01¶
Documentation.
Important features include:
Multi-node network used to represent regions of the Democratic Republic of the Congo at the 10km scale.
Network infectivity contagion transfer between nodes.
Spatially and temporally varying rates of routine immunization.
Acquisition-transmission covariance.
Pre-specified calendar of supplemental immunization activities (SIAs).
Maternally derived immunity.
Overdispersion of the infection rate.
Maximum simulation duration based on elapsed time.
Weighted agents to represent multiple individuals per agent.
Age-based immunity initialization to approximate endemic transmission.
Infectivity reservoir to represent persistent exogenous importation.
model_measles_gha01¶
Simulations examining RDT impact using Ghana as an example context. Work presented as part of Feb 2022 co-chair.
Parameters in the baseline model were adjusted to fit observed time series of measles incidence. A Poisson-based likelihood function is maximized over one free parameter that scales total incidence and is interpreted as a reporting rate.
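A minimal sketch of such a one-parameter fit is shown below, under the assumption that the observed count at each time step is Poisson distributed with mean equal to the scale factor times the simulated incidence. Under that assumption the maximum has a closed form: the ratio of total observed cases to total simulated cases. The function names and all numbers are made up for illustration and are not taken from the model.

```python
import math


def poisson_log_likelihood(scale, observed, simulated):
    """Log-likelihood of observed counts given Poisson means scale * simulated."""
    total = 0.0
    for obs, sim in zip(observed, simulated):
        mean = scale * sim
        total += obs * math.log(mean) - mean - math.lgamma(obs + 1)
    return total


def best_scale(observed, simulated):
    """Closed-form maximizer of the Poisson likelihood over the scale factor."""
    return sum(observed) / sum(simulated)


simulated_cases = [120.0, 340.0, 560.0, 210.0]   # made-up simulated incidence
observed_cases = [10, 36, 52, 25]                # made-up reported incidence

reporting_rate = best_scale(observed_cases, simulated_cases)
```

Evaluating `poisson_log_likelihood` on either side of `reporting_rate` is a quick check that the closed-form value is the maximum.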
Important features include:
Multi-node network used to represent the country of Ghana at the 10km scale.
Network infectivity contagion transfer between nodes.
Spatially varying rates of routine immunization.
Acquisition-transmission covariance.
Pre-specified calendar of supplemental immunization activities (SIAs).
Maternally derived immunity.
Overdispersion of the infection rate.
Maximum simulation duration based on elapsed time.
Weighted agents to represent multiple individuals per agent.
Age-based immunity initialization to approximate endemic transmission.
Infectivity reservoir to represent persistent exogenous importation.
Test scenario projects incidence forward and implements outbreak response SIAs based on observed incidence.
Important features include:
SQL-based event reporting of symptomatic incidence.
Spatially varying reporting rates sampled from a beta distribution.
Response interventions created dynamically using Python in-process.
model_measles_nga01¶
Documentation.
Important features include:
Multi-node network used to represent regions of Nigeria at the 10km scale.
Network infectivity contagion transfer between nodes.
Spatially varying rates of routine immunization.
Acquisition-transmission covariance.
Pre-specified calendar of supplemental immunization activities (SIAs).
Maternally derived immunity.
Overdispersion of the infection rate.
Maximum simulation duration based on elapsed time.
Weighted agents to represent multiple individuals per agent.
Age-based immunity initialization to approximate endemic transmission.
Infectivity reservoir to represent persistent exogenous importation.
model_measles_nga02¶
Simulations of measles transmission in Nigeria. The model is structured so that each state is implemented separately. Age-structured mortality from UN WPP reports for national estimates have been used for all states; crude birth rates are initialized using values from the 2013 Nigeria DHS and scaled in time using relative trends from the UN WPP.
Important features include:
Multi-node network used to represent regions of Nigeria at the adm02 scale.
Network infectivity contagion transfer between nodes.
Annual seasonality in measles infectivity.
Spatially varying rates of routine immunization.
Pre-specified calendar of supplemental immunization activities (SIAs).
Maternally derived immunity.
Maximum simulation duration based on elapsed time.
Weighted agents to represent multiple individuals per agent.
Age-based immunity initialization to approximate endemic transmission.
Infectivity reservoir to represent persistent exogenous importation.
Event-based notification of vaccine doses delivered by routine immunization.
Future SIA vaccines as non-contagious infections for labeling and tracking.
model_network01¶
This example demonstrates the implementation of the spread of infectivity on a network. Each simulation has a default network of 625 nodes (25-by-25 grid) that has 1 of 4 preset levels of network infectivity. Simulations are implemented as multi-core (four cores per simulation) to demonstrate logging and transmission between nodes hosted on different cores (note that cores may or may not be co-located on a common machine).
An outbreak is initialized by a constant importation pressure of infected individuals. There is no age structure, vital dynamics, or waning. All simulations run for 1000 time steps or until total infectivity falls to zero.
Infected individuals are introduced in one node only (lower-left on the grid). Inter-node populations are not well mixed; the network infectivity feature implements a gravity-type expression for contagion transmission. All nodes have equal populations and the coefficient on the distance exponent in the gravity expression is fixed. Each simulation is assigned 1 of 4 coefficients in the gravity expression.
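A generic gravity-type coupling on a square grid can be sketched as below. This is not the expression used inside EMOD, which is not given here; the functional form (a coefficient times the product of the two node populations, divided by distance raised to an exponent), the function name, and all numerical values are assumptions for illustration.

```python
import math


def gravity_matrix(grid_size, population, coefficient, distance_exponent=2.0):
    """Pairwise coupling k * (N_i * N_j) / d_ij**c for nodes on a square grid."""
    nodes = [(x, y) for x in range(grid_size) for y in range(grid_size)]
    matrix = []
    for (xi, yi) in nodes:
        row = []
        for (xj, yj) in nodes:
            dist = math.hypot(xi - xj, yi - yj)
            if dist == 0.0:
                row.append(0.0)  # no self-coupling through the network
            else:
                row.append(coefficient * population * population / dist ** distance_exponent)
        matrix.append(row)
    return matrix


# Small 5-by-5 grid for illustration; the example model uses 25-by-25.
coupling = gravity_matrix(grid_size=5, population=1000, coefficient=1e-9)
```

With equal populations and a fixed exponent, changing `coefficient` rescales every off-diagonal entry by the same factor, which mirrors the single quantity varied between simulations.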
model_polio01¶
Demonstration simulations for poliovirus type-2 transmission in Nigeria. The model structure was used to examine the impact of Sabin vaccine reversion dynamics in contrast to nOPV vaccines.
Important features include:
Multi-node network used to represent the country of Nigeria at either the admin-2 level or the 10km length scale.
Network infectivity contagion transfer between nodes.
Acquisition-transmission covariance.
HINT structure to create non-participatory populations.
Genome mutation dynamics and mutation labeling.
Genome-based variable infectivity.
Clade labeling to create variable genome reversion trajectories.
Immunity initialization from external data (IDM’s Polio Immunity Mapper).
Pre-specified calendar of supplemental immunization activities (SIAs).
Maternally derived immunity.
Node-level variance in R0 value.
Overdispersion of the infection rate.
Maximum simulation duration based on elapsed time.
Weighted agents to represent multiple individuals per agent.
Population serosurveys implemented using custom reporters.
model_rubella01¶
Simulations that produce Figure 5 in the manuscript “Examination of scenarios introducing rubella vaccine in the Democratic Republic of the Congo.” These simulations are used to demonstrate the potential for increased congenital rubella syndrome (CRS) after vaccine introduction.
See https://doi.org/10.1016/j.jvacx.2021.100127
Extends the cited work to examine the impact of non-steady-state demographics, with vital dynamics and fertility from WPP data. Applies agent weights during simulation so that outputs are correctly scaled to total population.
Important features include:
Maternally derived immunity.
Optional steady-state population pyramid.
Very low rates of exogenous importation.
Posterior distribution of R0 values used as input.
model_transtree01¶
This example demonstrates the implementation of superspreading behavior and its visualization using transmission trees. These simulations use a constant R0 value of 1.5 and one of two variance levels for the secondary transmission rate: constant or exponential. A constant rate has all infected individuals transmit with a Poisson rate equal to the R0 value; an exponential rate has each infected individual transmit with a Poisson rate that is drawn from an exponential distribution with mean equal to the R0 value.
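The two rate options can be compared outside EMOD with a short standard-library sketch. It draws the number of secondary infections per infected individual under each option; the function names, the sample size, and the seed are illustrative assumptions, and the Poisson sampler is Knuth's multiplication method rather than anything taken from EMOD.

```python
import math
import random


def poisson_draw(rate, rng):
    """Draw a Poisson random variate using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def secondary_infections(r0, n_agents, exponential_rate, seed=0):
    """Sample secondary infection counts with a constant or exponential rate."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_agents):
        # expovariate takes the inverse of the mean, so the mean rate is r0.
        rate = rng.expovariate(1.0 / r0) if exponential_rate else r0
        counts.append(poisson_draw(rate, rng))
    return counts


def mean_and_variance(values):
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


constant_stats = mean_and_variance(secondary_infections(1.5, 20000, False))
exponential_stats = mean_and_variance(secondary_infections(1.5, 20000, True))
```

Both options give a mean number of secondary infections near R0, but the exponential option has the larger variance (R0 + R0^2 in theory, compared with R0 for the constant option), which is the signature of superspreading.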
Each simulation has a default network of 16 nodes (4-by-4 grid) that are well mixed using the network infectivity feature. Simulations are implemented as multi-core (two cores per simulation) to demonstrate logging and transmission between nodes hosted on different cores (note that cores may or may not be co-located on a common machine).
An outbreak is initialized by a constant importation pressure of infected individuals. There is no age structure, vital dynamics, or waning. All simulations run for 1000 time steps or until total infectivity falls to zero.
Data for the explicit transmission tree is generated by applying infection labels based on the unique ID of the infecting agent. Logging is provided by the event database (SQLite format) that records these infection labels. Each record provides a unique ID for the infected agent and the unique ID for the infecting agent. Infected individuals imported to initialize the outbreak have their infecting agent’s ID label set to zero.
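Given records of that form, the tree can be assembled with a few lines of Python. The sketch below works on an in-memory list of (infected ID, infecting ID) pairs; the records are made up, the function name is hypothetical, and reading the pairs out of the SQLite event database is not shown because the table and column names are not given here.

```python
from collections import defaultdict


def build_transmission_tree(records):
    """Map each infecting agent ID to the list of agent IDs it infected."""
    children = defaultdict(list)
    for infected_id, infector_id in records:
        children[infector_id].append(infected_id)
    return dict(children)


# Made-up (infected ID, infecting ID) pairs; 0 marks an imported infection.
records = [(11, 0), (12, 11), (13, 11), (14, 12), (15, 0), (16, 11)]

tree = build_transmission_tree(records)
imported = tree.get(0, [])
offspring_counts = {agent: len(kids) for agent, kids in tree.items() if agent != 0}
```

The entries under ID 0 are the roots of the separate trees, and `offspring_counts` gives the number of secondary infections for each agent that transmitted at least once.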