hpvsim.base module¶
Base classes for HPVsim. These classes handle much of the boilerplate of the People and Sim classes (e.g. loading, saving, and key lookups), so that those classes can focus on the disease-specific functionality.
- class ParsObj(pars)[source]¶
Bases:
FlexPretty
A class based around performing operations on a self.pars dict.
- class Result(name=None, npts=None, scale=True, color=None, n_rows=0, n_copies=0)[source]¶
Bases:
object
Stores a single result – by default, acts like an array.
- Parameters:
name (str) – name of this result, e.g. new_infections
npts (int) – if values is None, precreate it to be of this length
scale (bool) – whether or not the value scales by population scale factor
color (str/arr) – default color for plotting (hex or RGB notation)
Example:
    import hpvsim as hpv
    r1 = hpv.Result(name='test1', npts=10)
    r1[:5] = 20
    print(r1.values)
- property npts¶
- property shape¶
- class BaseSim(*args, **kwargs)[source]¶
Bases:
ParsObj
The BaseSim class stores various methods useful for the Sim that are not directly related to simulating the epidemic. It is not used outside of the Sim object, so the separation of methods into the BaseSim and Sim classes is purely to keep each one of manageable size.
- update_pars(pars=None, create=False, **kwargs)[source]¶
Ensure that metaparameters get used properly before being updated
- property n¶
Count the number of people – if it fails, assume none
- get_t(dates, exact_match=False, return_date_format=None)[source]¶
Convert a string, date/datetime object, or int to a timepoint (int).
- Parameters:
dates (str, date, int, or list) – convert any of these objects to a timepoint relative to the simulation’s start day
exact_match (bool) – whether or not to demand an exact match to the requested date
return_date_format (None, str) – if None, do not return dates; otherwise return them as strings or floats as requested
- Returns:
the time point in the simulation closest to the requested date
- Return type:
t (int or str)
Examples:
    sim.get_t('2015-03-01') # Get the closest timepoint to the specified date
    sim.get_t(3) # Will return 3
    sim.get_t('2015') # Can use strings
    sim.get_t(['2015.5', '2016.5']) # List of strings, will match as close as possible
    sim.get_t(['2015.5', '2016.5'], exact_match=True) # Raises an error since these dates aren't directly simulated
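The closest-match and exact-match behaviours can be sketched in plain Python. This is a minimal illustration, not HPVsim's implementation: the helper name `nearest_timepoint`, the tolerance `tol`, and the `yearvec` list of simulated time points (in years) are all assumptions made for the example.

```python
def nearest_timepoint(year, yearvec, exact_match=False, tol=1e-9):
    """Return the index of the entry in yearvec closest to year (hypothetical helper)."""
    diffs = [abs(y - year) for y in yearvec]
    t = diffs.index(min(diffs))  # index of the closest simulated time point
    if exact_match and diffs[t] > tol:
        raise ValueError(f'{year} is not a simulated time point')
    return t

# Assumed time vector: quarterly steps from 2015.0 to 2017.0
yearvec = [2015 + 0.25 * i for i in range(9)]

t_close = nearest_timepoint(2016.6, yearvec)                    # closest point is 2016.5
t_exact = nearest_timepoint(2015.5, yearvec, exact_match=True)  # 2015.5 is simulated
```

With `exact_match=True`, a request such as 2016.6 would raise an error in this sketch, since it falls between two simulated time points.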
- result_keys(which='all')[source]¶
Get the actual results objects, not other things stored in sim.results.
If which is ‘total’, return only the main results keys. If ‘genotype’, return only genotype keys. If ‘all’, return all keys.
- result_types(reskeys)[source]¶
Figure out what kind of result it is, which determines what plotting style to use
- export_results(for_json=True, filename=None, indent=2, *args, **kwargs)[source]¶
Convert results to dict – see also to_json().
The results written to Excel must have a regular table shape, whereas for the JSON output, arbitrary data shapes are supported.
- Parameters:
for_json (bool) – if False, only data associated with Result objects will be included in the converted output
filename (str) – filename to save to; if None, do not save
indent (int) – if writing to file, how many indents to use per nested level
args (list) – passed to savejson()
kwargs (dict) – passed to savejson()
- Returns:
dictionary representation of the results
- Return type:
resdict (dict)
- export_pars(filename=None, indent=2, *args, **kwargs)[source]¶
Return parameters for JSON export – see also to_json().
This method is required so that interventions can specify their JSON-friendly representation.
- Parameters:
filename (str) – filename to save to; if None, do not save
indent (int) – if writing to file, how many indents to use per nested level
args (list) – passed to savejson()
kwargs (dict) – passed to savejson()
- Returns:
a dictionary containing all the parameter values
- Return type:
pardict (dict)
- to_json(filename=None, keys=None, tostring=False, indent=2, verbose=False, *args, **kwargs)[source]¶
Export results and parameters as JSON.
- Parameters:
filename (str) – if None, return string; else, write to file
keys (str or list) – attributes to write to json (default: results, parameters, and summary)
tostring (bool) – if not writing to file, whether to write to string (alternative is sanitized dictionary)
indent (int) – if writing to file, how many indents to use per nested level
verbose (bool) – detail to print
args (list) – passed to savejson()
kwargs (dict) – passed to savejson()
- Returns:
A unicode string containing a JSON representation of the results, or writes the JSON file to disk
Examples:
    json = sim.to_json()
    sim.to_json('results.json')
    sim.to_json('summary.json', keys='summary')
- to_df(date_index=False)[source]¶
Export results to a pandas dataframe
- Parameters:
date_index (bool) – if True, use the date as the index
- to_excel(filename=None, skip_pars=None)[source]¶
Export parameters and results as Excel format
- Parameters:
filename (str) – if None, return the spreadsheet object; else, write to file
skip_pars (list) – if provided, a custom list of parameters to exclude
- Returns:
An sc.Spreadsheet with an Excel file, or writes the file to disk
- shrink(skip_attrs=None, in_place=True)[source]¶
“Shrinks” the simulation by removing the people and other memory-intensive attributes (e.g., some interventions and analyzers), either in place (the default) or by returning a “shrunken” copy. Used to reduce the memory footprint, both in RAM and in saved files.
- Parameters:
skip_attrs (list) – a list of attributes to skip (remove) in order to perform the shrinking; default “people”
in_place (bool) – whether to perform the shrinking in place (default), or return a shrunken copy instead
- Returns:
a Sim object with the listed attributes removed
- Return type:
shrunken (Sim)
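The difference between shrinking in place and returning a shrunken copy can be illustrated with a toy class. This is a sketch under stated assumptions, not HPVsim's code: `ToySim` and its attributes are hypothetical, and returning `None` when `in_place=True` is an assumption of the sketch.

```python
import copy

class ToySim:
    """Hypothetical stand-in for a simulation object."""

    def __init__(self):
        self.people = list(range(1000))            # memory-intensive attribute
        self.results = {'infections': [1, 2, 3]}   # lightweight attribute to keep

    def shrink(self, skip_attrs=None, in_place=True):
        skip_attrs = ['people'] if skip_attrs is None else skip_attrs
        # A shallow copy shares the retained data but has its own attribute table
        target = self if in_place else copy.copy(self)
        for attr in skip_attrs:
            setattr(target, attr, None)  # drop the heavy attribute
        return None if in_place else target

sim = ToySim()
small = sim.shrink(in_place=False)     # returns a shrunken copy
kept_people = sim.people is not None   # the original is untouched so far
sim.shrink()                           # now the original is shrunk in place
```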
- save(filename=None, keep_people=None, skip_attrs=None, **kwargs)[source]¶
Save to disk as a gzipped pickle.
- Parameters:
filename (str or None) – the name or path of the file to save to; if None, uses stored
kwargs – passed to sc.makefilepath()
- Returns:
the validated absolute path to the saved file
- Return type:
filename (str)
Example:
sim.save() # Saves to a .sim file
- static load(filename, *args, **kwargs)[source]¶
Load from disk from a gzipped pickle.
- Parameters:
filename (str) – the name or path of the file to load from
kwargs – passed to hpv.load()
- Returns:
the loaded simulation object
- Return type:
sim (Sim)
Example:
sim = hpv.Sim.load('my-simulation.sim')
- get_interventions(label=None, partial=False, as_inds=False)[source]¶
Find the matching intervention(s) by label, index, or type. If None, return all interventions. If the label provided is “summary”, then print a summary of the interventions (index, label, type).
- Parameters:
label (str, int, Intervention, list) – the label, index, or type of intervention to get; if a list, iterate over one of those types
partial (bool) – if true, return partial matches (e.g. ‘beta’ will match all beta interventions)
as_inds (bool) – if true, return matching indices instead of the actual interventions
Examples:
    tp = hpv.test_prob(symp_prob=0.1)
    cb1 = hpv.change_beta(days=5, changes=0.3, label='NPI')
    cb2 = hpv.change_beta(days=10, changes=0.3, label='Masks')
    sim = hpv.Sim(interventions=[tp, cb1, cb2])
    cb1, cb2 = sim.get_interventions(hpv.change_beta)
    tp, cb2 = sim.get_interventions([0,2])
    ind = sim.get_interventions(hpv.change_beta, as_inds=True) # Returns [1,2]
    sim.get_interventions('summary') # Prints a summary
- get_intervention(label=None, partial=False, first=False, die=True)[source]¶
Like get_interventions(), find the matching intervention(s) by label, index, or type. If more than one intervention matches, return the last by default. If no label is provided, return the last intervention in the list.
- Parameters:
label (str, int, Intervention, list) – the label, index, or type of intervention to get; if a list, iterate over one of those types
partial (bool) – if true, return partial matches (e.g. ‘beta’ will match all beta interventions)
first (bool) – if true, return first matching intervention (otherwise, return last)
die (bool) – whether to raise an exception if no intervention is found
Examples:
    tp = hpv.test_prob(symp_prob=0.1)
    cb = hpv.change_beta(days=5, changes=0.3, label='NPI')
    sim = hpv.Sim(interventions=[tp, cb])
    cb = sim.get_intervention('NPI')
    cb = sim.get_intervention('NP', partial=True)
    cb = sim.get_intervention(hpv.change_beta)
    cb = sim.get_intervention(1)
    cb = sim.get_intervention()
    tp = sim.get_intervention(first=True)
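The selection rules for a single intervention (last match by default, first match on request, optional partial matching, and an optional error when nothing matches) can be sketched without HPVsim. The helper `find_by_label` and the dictionaries standing in for interventions are hypothetical; matching by index or by type is omitted for brevity.

```python
def find_by_label(items, label=None, partial=False, first=False, die=True):
    """Return one item matched by its 'label' entry (hypothetical helper)."""
    if label is None:
        matches = list(items)                                    # no label: everything matches
    elif partial:
        matches = [it for it in items if label in it['label']]  # substring match
    else:
        matches = [it for it in items if it['label'] == label]  # exact match
    if not matches:
        if die:
            raise ValueError(f'No item matching "{label}" was found')
        return None
    return matches[0] if first else matches[-1]

# Stand-ins for interventions; only the label matters in this sketch
interventions = [dict(label='screening'), dict(label='vaccine A'), dict(label='vaccine B')]

last_vx = find_by_label(interventions, 'vaccine', partial=True)
first_vx = find_by_label(interventions, 'vaccine', partial=True, first=True)
missing = find_by_label(interventions, 'treatment', die=False)
```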
- class BasePeople(pars)[source]¶
Bases:
FlexPretty
A class to handle all the boilerplate for people – note that as with the BaseSim vs Sim classes, everything interesting happens in the People class, whereas this class exists to handle the less interesting implementation details.
Initialize essential attributes used for filtering
- set_pars(pars=None)[source]¶
Re-link the parameters stored in the people object to the sim containing it, and perform some basic validation.
- validate(sim_pars=None, verbose=False)[source]¶
Perform validation on the People object.
- Parameters:
sim_pars (dict) – dictionary of parameters from the sim to ensure they match the current People object
verbose (bool) – detail to print
- filter_inds(inds)[source]¶
Store indices to allow for easy filtering of the People object.
- Parameters:
inds (array) – filter by these indices
- Returns:
A filtered People object, which works just like a normal People object except only operates on a subset of indices.
- filter(criteria)[source]¶
Store indices to allow for easy filtering of the People object.
- Parameters:
criteria (array) – a boolean array for the filtering criteria
- Returns:
A filtered People object, which works just like a normal People object except only operates on a subset of indices.
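The relationship between boolean criteria and stored indices can be sketched with plain lists. This is only an illustration of the idea; the helper name and the example data are hypothetical, and HPVsim itself operates on arrays inside the People object.

```python
def inds_from_criteria(criteria):
    """Convert a boolean sequence into the list of positions where it is True."""
    return [i for i, keep in enumerate(criteria) if keep]

# Hypothetical per-person attributes
ages = [12.0, 34.5, 51.2, 8.3, 27.9]
female = [True, False, True, True, False]

# Boolean criteria: female and at least 10 years old
criteria = [f and age >= 10 for f, age in zip(female, ages)]

inds = inds_from_criteria(criteria)       # indices that pass the filter
filtered_ages = [ages[i] for i in inds]   # any attribute can be read on that subset
```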
- unfilter()[source]¶
Set main simulation attributes to be views of the underlying data
This method should be called whenever the number of agents required changes (regardless of whether or not the underlying arrays have been resized)
- property is_female¶
Boolean array of everyone female
- property is_female_alive¶
Boolean array of everyone female and alive
- property is_male¶
Boolean array of everyone male
- property is_male_alive¶
Boolean array of everyone male and alive
- property f_inds¶
Indices of everyone female
- property m_inds¶
Indices of everyone male
- property int_age¶
Return ages as an integer
- property round_age¶
Rounds age up to the next highest integer
- property dt_age¶
Return ages rounded to the nearest whole timestep
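The three age views can be contrasted on a single value. The formulas below are assumptions chosen to match the one-line descriptions (truncation for `int_age`, ceiling for `round_age`, rounding to a multiple of the timestep `dt` for `dt_age`); HPVsim's own implementations may differ.

```python
import math

def int_age(age):
    return int(age)              # assumed: truncate to whole years

def round_age(age):
    return math.ceil(age)        # round up to the next integer

def dt_age(age, dt):
    return round(age / dt) * dt  # assumed: nearest multiple of the timestep

age = 23.6
dt = 0.25  # hypothetical timestep of a quarter year

ages = (int_age(age), round_age(age), dt_age(age, dt))
```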
- property is_active¶
Boolean array of everyone sexually active, i.e. past debut
- property is_female_adult¶
Boolean array of everyone eligible for screening
- property is_virgin¶
Boolean array of everyone not yet sexually active, i.e. pre-debut
- property alive_inds¶
Indices of everyone alive
- property alive_level0¶
Indices of everyone alive who is a level 0 agent
- property alive_level0_inds¶
Indices of everyone alive who is a level 0 agent
- property n_alive¶
Number of people alive
- property n_alive_level0¶
Number of people alive who are level 0 agents
- property infected¶
Boolean array of everyone infected. Union of infectious and inactive. Includes people with cancer, people with latent infections, and people with active infections
- property abnormal¶
Boolean array of everyone with abnormal cells. Union of episomal, transformed, and cancerous
- property latent¶
Boolean array of everyone with latent infection. By definition, these people have inactive infection and no cancer.
- property precin¶
Boolean array of females with HPV whose disease severity level does not meet the threshold for detectable cell changes
- count_any(key, weighted=True)[source]¶
Count the number of people for a given key for a 2D array if any value matches
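One reading of "count if any value matches" is sketched below: a person is counted once if the state is true in at least one row of the 2D array. The layout (rows as genotypes, columns as people) and the per-person `weights` list are assumptions of this sketch; in HPVsim the `weighted` argument is a boolean.

```python
def count_any(rows, weights=None):
    """Count columns (people) with a truthy value in any row (hypothetical helper)."""
    n_people = len(rows[0])
    if weights is None:
        weights = [1] * n_people  # unweighted: each person counts once
    return sum(w for j, w in enumerate(weights) if any(row[j] for row in rows))

# Assumed layout: one row per genotype, one column per person
infectious = [
    [True, False, False, True],
    [True, False, True, False],
]

n_unweighted = count_any(infectious)                         # people 0, 2 and 3
n_weighted = count_any(infectious, weights=[10, 10, 10, 5])  # 10 + 10 + 5
```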
- date_keys()[source]¶
Returns keys for different event dates (e.g., date a person became symptomatic)
- dur_keys()[source]¶
Returns keys for different durations (e.g., the duration from exposed to infectious)
- to_graph()[source]¶
Convert all people to a networkx MultiDiGraph, including all properties of the people (nodes) and contacts (edges).
Example:
    import hpvsim as hpv
    import networkx as nx
    sim = hpv.Sim(n_agents=50, pop_type='hybrid', contacts=dict(h=3, s=10, w=10, c=5)).run()
    G = sim.people.to_graph()
    nodes = G.nodes(data=True)
    edges = G.edges(keys=True)
    node_colors = [n['age'] for i,n in nodes]
    layer_map = dict(h='#37b', s='#e11', w='#4a4', c='#a49')
    edge_colors = [layer_map[G[i][j][k]['layer']] for i,j,k in edges]
    edge_weights = [G[i][j][k]['beta']*5 for i,j,k in edges]
    nx.draw(G, node_color=node_colors, edge_color=edge_colors, width=edge_weights, alpha=0.5)
- save(filename=None, force=False, **kwargs)[source]¶
Save to disk as a gzipped pickle.
Note: by default this function raises an exception if trying to save a run or partially run People object, since the changes that happen during a run are usually irreversible.
- Parameters:
filename (str or None) – the name or path of the file to save to; if None, uses stored
force (bool) – whether to allow saving even of a run or partially-run People object
kwargs – passed to sc.makefilepath()
- Returns:
the validated absolute path to the saved file
- Return type:
filename (str)
Example:
    sim = hpv.Sim()
    sim.initialize()
    sim.people.save() # Saves to a .ppl file
- static load(filename, *args, **kwargs)[source]¶
Load from disk from a gzipped pickle.
- Parameters:
filename (str) – the name or path of the file to load from
args (list) – passed to hpv.load()
kwargs (dict) – passed to hpv.load()
- Returns:
the loaded people object
- Return type:
people (People)
Example:
    people = hpv.People.load('my-people.ppl')
- init_contacts(reset=False)[source]¶
Initialize the contacts dataframe with the correct columns and data types
- add_contacts(contacts, lkey=None, beta=None)[source]¶
Add new contacts to the array. See also contacts.add_layer().
- class Person(pars=None, uid=None, age=-1, sex=-1, debut=-1, rel_sev=-1, partners=None, current_partners=None, rship_start_dates=None, rship_end_dates=None, n_rships=None)[source]¶
Bases:
prettyobj
Class for a single person. Note: this is largely deprecated since sim.people is now based on arrays rather than being a list of people.
- class FlexDict[source]¶
Bases:
dict
A dict that allows more flexible element access: in addition to obj[‘a’], also allow obj[0]. Lightweight implementation of the Sciris odict class.
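The dual key/position access can be sketched as a small `dict` subclass. This is an illustrative re-implementation of the idea, not HPVsim's source: the class name `FlexDictSketch` is hypothetical, and the positional lookup simply follows insertion order.

```python
class FlexDictSketch(dict):
    """A dict that also accepts integer positions as keys (illustrative only)."""

    def __getitem__(self, key):
        try:
            return super().__getitem__(key)  # normal key lookup first
        except KeyError:
            if isinstance(key, int):
                # Fall back to positional lookup in insertion order
                return super().__getitem__(list(self.keys())[key])
            raise

d = FlexDictSketch(a=[1, 2], b=[3])
by_key = d['a']  # ordinary access
by_pos = d[0]    # same entry, reached by position
```

The positional path rebuilds the list of keys on every call, so it is convenient rather than fast; that trade-off suits a "lightweight" container holding only a handful of layers.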
- class Contacts(data=None, layer_keys=None, **kwargs)[source]¶
Bases:
FlexDict
A simple (for now) class for storing different contact layers.
- Parameters:
data (dict) – a dictionary that looks like a Contacts object
layer_keys (list) – if provided, create an empty Contacts object with these layers
kwargs (dict) – additional layer(s), merged with data
- add_layer(**kwargs)[source]¶
Small method to add one or more layers to the contacts. Layers should be provided as keyword arguments.
Example:
    hospitals_layer = hpv.Layer(label='hosp')
    sim.people.contacts.add_layer(hospitals=hospitals_layer)
- class Layer(*args, label=None, **kwargs)[source]¶
Bases:
FlexDict
A small class holding a single layer of contact edges (connections) between people.
The input is typically arrays including: person 1 of the connection, person 2 of the connection, the weight of the connection, the duration and start/end times of the connection. Connections are undirected; each person is both a source and sink.
This class is usually not invoked directly by the user, but instead is called as part of the population creation.
- Parameters:
f (array) – an array of N connections, representing people on one side of the connection
m (array) – an array of people on the other side of the connection
acts (array) – an array of number of acts per timestep for each connection
dur (array) – duration of the connection
start (array) – start time of the connection
end (array) – end time of the connection
label (str) – the name of the layer (optional)
kwargs (dict) – other keys copied directly into the layer
Note that all arguments (except for label) must be arrays of the same length, although not all have to be supplied at the time of creation (they must all be the same at the time of initialization, though, or else validation will fail).
Examples:
    import numpy as np
    import hpvsim as hpv

    # Generate an average of 10 contacts for 1000 people
    n = 10_000
    n_people = 1000
    p1 = np.random.randint(n_people, size=n)
    p2 = np.random.randint(n_people, size=n)
    beta = np.ones(n)
    layer = hpv.Layer(p1=p1, p2=p2, beta=beta, label='rand')
    layer = hpv.Layer(dict(p1=p1, p2=p2, beta=beta), label='rand') # Alternate method

    # Convert one layer to another with extra columns
    index = np.arange(n)
    self_conn = p1 == p2
    layer2 = hpv.Layer(**layer, index=index, self_conn=self_conn, label=layer.label)
- property members¶
Return sorted array of all members
- meta_keys()[source]¶
Return the keys for the layer’s meta information – i.e., f, m, beta, any others
- validate(force=True)[source]¶
Check the integrity of the layer: right types, right lengths.
If dtype is incorrect, try to convert automatically; if length is incorrect, do not.
- get_inds(inds, remove=False)[source]¶
Get the specified indices from the edgelist and return them as a dict.
- Parameters:
inds (int, array, slice) – the indices to be removed
- pop_inds(inds)[source]¶
“Pop” the specified indices from the edgelist and return them as a dict. Returns in the right format to be used with layer.append().
- Parameters:
inds (int, array, slice) – the indices to be removed
- append(contacts)[source]¶
Append contacts to the current layer.
- Parameters:
contacts (dict) – a dictionary of arrays with keys f,m,beta, as returned from layer.pop_inds()
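The round trip between popping edges and appending them again can be sketched on a plain dict of lists. The functions below are hypothetical stand-ins written for this illustration; the real Layer stores arrays and carries more columns than the three shown.

```python
def pop_inds(layer, inds):
    """Remove the edges at the given positions and return them as a dict of lists."""
    ordered = sorted(set(inds))
    drop = set(ordered)
    popped = {key: [vals[i] for i in ordered] for key, vals in layer.items()}
    for key in list(layer):
        layer[key] = [v for i, v in enumerate(layer[key]) if i not in drop]
    return popped

def append(layer, contacts):
    """Append edges (same keys as the layer) to the end of each column."""
    for key in list(layer):
        layer[key] = layer[key] + list(contacts[key])

# A toy edge list with four connections
layer = dict(f=[0, 1, 2, 3], m=[5, 6, 7, 8], beta=[1.0, 1.0, 0.5, 0.5])

popped = pop_inds(layer, [1, 3])   # take out the second and fourth edges
remaining_f = list(layer['f'])     # what is left after popping
append(layer, popped)              # put the popped edges back at the end
```

Because the popped edges come back as a dict with the same keys as the layer, they can be passed straight to `append`, either on the same layer or on a different one.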
- to_graph()[source]¶
Convert to a networkx DiGraph
Example:
    import hpvsim as hpv
    import networkx as nx
    sim = hpv.Sim(n_agents=20, pop_type='hybrid').run()
    G = sim.people.contacts['h'].to_graph()
    nx.draw(G)
- find_contacts(inds, as_array=True)[source]¶
Find all contacts of the specified people
For some purposes (e.g. contact tracing) it’s necessary to find all of the contacts associated with a subset of the people in this layer. Since contacts are bidirectional it’s necessary to check both P1 and P2 for the target indices. The return type is a Set so that there is no duplication of indices (otherwise if the Layer has explicit symmetric interactions, they could appear multiple times). This is also for performance so that the calling code doesn’t need to perform its own unique() operation. Note that this cannot be used for cases where multiple connections count differently than a single infection, e.g. exposure risk.
- Parameters:
inds (array) – indices of people whose contacts to return
as_array (bool) – if true, return as sorted array (otherwise, return as unsorted set)
- Returns:
a set of indices for pairing partners
- Return type:
contact_inds (array)
Example: If there were a layer with
- P1 = [1,2,3,4]
- P2 = [2,3,1,4]
Then find_contacts([1,3]) would return {1,2,3}
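The two-sided lookup can be reproduced in a few lines of plain Python. This is a sketch of the logic only: the standalone function takes the two columns explicitly, and it returns a sorted list where the real method would return an array.

```python
def find_contacts(p1, p2, inds, as_array=True):
    """Return everyone connected to any person in inds, checking both sides of each edge."""
    targets = set(inds)
    contacts = set()  # a set avoids duplicates from symmetric edges
    for a, b in zip(p1, p2):
        if a in targets:
            contacts.add(b)
        if b in targets:
            contacts.add(a)
    return sorted(contacts) if as_array else contacts

p1 = [1, 2, 3, 4]
p2 = [2, 3, 1, 4]
partners = find_contacts(p1, p2, [1, 3])  # [1, 2, 3]
```

Person 4 is connected only to themselves, so they do not appear unless 4 is among the requested indices.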
- update(people, frac=1.0)[source]¶
Regenerate contacts on each timestep.
This method gets called if the layer appears in sim.pars['dynam_layer']. The Layer implements the update procedure so that derived classes can customize the update, e.g. implementing over-dispersion/other distributions, random clusters, etc. Typically, this method also takes in the people object so that the update can depend on person attributes that may change over time (e.g. changing contacts for people that are severe/critical).
- Parameters:
people (People) – the HPVsim People object, which is usually used to make new contacts
frac (float) – the fraction of contacts to update on each timestep