synthpops.pop module

This module provides the main class for interacting with SynthPops, the Pop class.

class Pop(n=None, max_contacts=None, ltcf_pars=None, school_pars=None, with_industry_code=False, with_facilities=False, use_default=False, use_two_group_reduction=True, average_LTCF_degree=20, ltcf_staff_age_min=20, ltcf_staff_age_max=60, with_school_types=False, school_mixing_type='random', average_class_size=20, inter_grade_mixing=0.1, average_student_teacher_ratio=20, average_teacher_teacher_degree=3, teacher_age_min=25, teacher_age_max=75, with_non_teaching_staff=False, average_student_all_staff_ratio=15, average_additional_staff_degree=20, staff_age_min=20, staff_age_max=75, rand_seed=None, country_location=None, state_location=None, location=None, sheet_name=None, household_method='infer_ages', smooth_ages=False, window_length=7, do_make=True)[source]

Bases: sciris.sc_utils.prettyobj

Make a full population network including both people (ages, sexes) and contacts. By default uses Seattle, Washington data. Two household generation methods are available: ‘infer_ages’ and ‘fixed_ages’.

If using ‘infer_ages’, then the ages of individuals in the population are generated by first placing individuals into households using the age of the head of households or reference individuals (always an adult), household age mixing patterns, household sizes, and the age distribution from data (census or other sources).

If using ‘fixed_ages’, then individuals are pre-assigned ages according to the age distribution and placed into households using the age of the head of households or reference individuals, household age mixing patterns, and household sizes.

Parameters:
  • n (int) – The number of people to create.
  • max_contacts (dict) – A dictionary of the maximum number of contacts per layer; the only supported key is “W” (work).
  • ltcf_pars (dict) – If supplied, replaces the default LTCF parameters.
  • school_pars (dict) – If supplied, replaces the default school parameters.
  • with_industry_code (bool) – If True, assign industry codes for workplaces, currently only possible for cached files of populations in the US.
  • with_facilities (bool) – If True, create long term care facilities, currently only available for locations in the US.
  • use_default (bool) – If True, use default data from settings.location, settings.state, settings.country.
  • use_two_group_reduction (bool) – If True, create long term care facilities with reduced contacts across both groups.
  • average_LTCF_degree (float) – default average degree in long term care facilities.
  • ltcf_staff_age_min (int) – Long term care facility staff minimum age.
  • ltcf_staff_age_max (int) – Long term care facility staff maximum age.
  • with_school_types (bool) – If True, creates explicit school types.
  • school_mixing_type (str or dict) – The mixing type for schools, ‘random’, ‘age_clustered’, or ‘age_and_class_clustered’ if string, and a dictionary of these by school type otherwise.
  • average_class_size (float) – The average classroom size.
  • inter_grade_mixing (float) – The average fraction of edges rewired to create edges between grades in the same school when school_mixing_type is ‘age_clustered’
  • average_student_teacher_ratio (float) – The average number of students per teacher.
  • average_teacher_teacher_degree (float) – The average number of contacts per teacher with other teachers.
  • teacher_age_min (int) – The minimum age for teachers.
  • teacher_age_max (int) – The maximum age for teachers.
  • with_non_teaching_staff (bool) – If True, includes non teaching staff.
  • average_student_all_staff_ratio (float) – The average number of students per staff members at school (including both teachers and non teachers).
  • average_additional_staff_degree (float) – The average number of contacts per additional non teaching staff in schools.
  • staff_age_min (int) – The minimum age for non teaching staff.
  • staff_age_max (int) – The maximum age for non teaching staff.
  • rand_seed (int) – Seed from which the random number sequence is generated.
  • country_location (string) – name of the country the location is in
  • state_location (string) – name of the state the location is in
  • location (string) – name of the location
  • sheet_name (string) – sheet name where data is located
  • household_method (string) – name of household generation method used; for details see above.
  • smooth_ages (bool) – If True, use smoothed out age distribution.
  • window_length (int) – length of window over which to average or smooth out age distribution
  • do_make (bool) – whether to make the population
Returns:

A dictionary of the full population with ages, connections, and other attributes.

Return type:

network (dict)
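The smooth_ages and window_length parameters describe averaging the age distribution over a window. As a rough illustration only, a centered moving average over a toy age distribution might look like the sketch below. This is not SynthPops' own smoothing routine, which may use a different kernel and different edge handling; smooth_distribution and the toy data are hypothetical.

```python
def smooth_distribution(dist, window_length=7):
    """Centered moving average of a list of probabilities, renormalized to sum to 1.

    Hypothetical helper for illustration; near the edges the window is truncated.
    """
    half = window_length // 2
    smoothed = []
    for i in range(len(dist)):
        lo = max(0, i - half)
        hi = min(len(dist), i + half + 1)
        window = dist[lo:hi]
        smoothed.append(sum(window) / len(window))
    total = sum(smoothed)
    return [value / total for value in smoothed]


# Toy single-year age distribution with an artificial spike at age 5
raw = [0.05] * 10
raw[5] = 0.55
smooth = smooth_distribution(raw, window_length=7)
```

A larger window_length spreads each spike over more neighboring ages, at the cost of blurring real features of the distribution.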

generate()[source]

Actually generate the network.

Returns:A dictionary of the full population with ages, connections, and other attributes.
Return type:network (dict)
set_layer_classes()[source]

Add layer classes.

clean_up_layer_info()[source]

Clean up temporary data from the pop object after storing them in specific layer classes.

pop_item(key)[source]

Pop key from self.

to_dict()[source]

Export to a dictionary – official way to get the popdict.

Example:

popdict = pop.to_dict()
to_json(filename, indent=2, **kwargs)[source]

Export to a JSON file.

Example:

pop.to_json('my-pop.json')
save(filename, **kwargs)[source]

Save population to a binary, gzipped object file.

Example:

pop.save('my-pop.pop')
static load(filename, *args, **kwargs)[source]

Load a population from a gzipped pickle file on disk.

Parameters:
  • filename (str) – the name or path of the file to load from
  • kwargs – passed to sc.loadobj()

Example:

pop = sp.Pop.load('my-pop.pop')
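save() and load() are described as writing and reading a gzipped object file. The sketch below shows the general gzip plus pickle round trip using only the standard library. It is not the code that SynthPops or Sciris runs; the toy population and the temporary file location are made up for illustration.

```python
import gzip
import os
import pickle
import tempfile

# Toy stand-in for a population object (hypothetical contents)
toy_pop = {
    0: {'age': 34, 'contacts': {'H': {1}}},
    1: {'age': 36, 'contacts': {'H': {0}}},
}

path = os.path.join(tempfile.mkdtemp(), 'my-pop.pop')

# Save: pickle the object and compress it with gzip
with gzip.open(path, 'wb') as f:
    pickle.dump(toy_pop, f)

# Load: decompress and unpickle
with gzip.open(path, 'rb') as f:
    restored = pickle.load(f)
```

Because unpickling can execute arbitrary code, files in this format should only be loaded from trusted sources.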
initialize_households_list()[source]

Initialize a new households list.

initialize_empty_households(n_households=None)[source]

Create a list of empty households.

Parameters:n_households (int) – the number of households to initialize
populate_households(households, age_by_uid)[source]

Populate all of the households. Store each household at the index corresponding to its hhid.

Parameters:
  • households (list) – list of lists where each sublist represents a household and contains the ids of the household members
  • age_by_uid (dict) – dictionary mapping each person’s id to their age
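To make the input shapes concrete, the sketch below builds one record per household from a list of lists and an age lookup, storing each record at the index equal to its hhid. It uses plain dictionaries rather than sp.Household objects. The function name is hypothetical, and treating the first listed member as the reference person is an assumption made for this example.

```python
def populate_households_sketch(households, age_by_uid):
    """Build one record per household, stored at the index equal to its hhid."""
    records = []
    for hhid, member_uids in enumerate(households):
        records.append({
            'hhid': hhid,
            'member_uids': list(member_uids),
            'member_ages': [age_by_uid[uid] for uid in member_uids],
            # Assumption: the first listed member is the reference person
            'reference_uid': member_uids[0],
            'reference_age': age_by_uid[member_uids[0]],
        })
    return records


households = [[0, 1, 2], [3], [4, 5]]
age_by_uid = {0: 41, 1: 39, 2: 7, 3: 68, 4: 29, 5: 31}
records = populate_households_sketch(households, age_by_uid)
```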
get_household(hhid)[source]

Return household with id: hhid.

Parameters:hhid (int) – household id number
Returns:A populated household.
Return type:sp.Household
add_household(household)[source]

Add a household to the list of households.

Parameters:household (sp.Household) – household with at minimum the hhid, member_uids, member_ages, reference_uid, and reference_age.
initialize_workplaces_list()[source]

Initialize a new workplaces list.

initialize_empty_workplaces(n_workplaces=None)[source]

Create a list of empty workplaces.

Parameters:n_workplaces (int) – the number of workplaces to initialize
populate_workplaces(workplaces)[source]

Populate all of the workplaces. Store each workplace at the index corresponding to its wpid.

Parameters:
  • workplaces (list) – list of lists where each sublist represents a workplace and contains the ids of the workplace members
  • age_by_uid (dict) – dictionary mapping each person’s id to their age
get_workplace(wpid)[source]

Return workplace with id: wpid.

Parameters:wpid (int) – workplace id number
Returns:A populated workplace.
Return type:sp.Workplace
add_workplace(workplace)[source]

Add a workplace to the list of workplaces.

Parameters:workplace (sp.Workplace) – workplace with at minimum the wpid, member_uids, member_ages, reference_uid, and reference_age.
initialize_ltcfs_list()[source]

Initialize a new ltcfs list.

initialize_empty_ltcfs(n_ltcfs=None)[source]

Create a list of empty ltcfs.

Parameters:n_ltcfs (int) – the number of ltcfs to initialize
populate_ltcfs(resident_lists, staff_lists)[source]

Populate all of the ltcfs. Store each ltcf at the index corresponding to its ltcfid.

Parameters:
  • resident_lists (list) – list of lists where each sublist represents a ltcf and contains the ids of the residents
  • staff_lists (list) – list of lists where each sublist represents a ltcf and contains the ids of the staff
get_ltcf(ltcfid)[source]

Return ltcf with id: ltcfid.

Parameters:ltcfid (int) – ltcf id number
Returns:A populated ltcf.
Return type:sp.LongTermCareFacility
add_ltcf(ltcf)[source]

Add a ltcf to the list of ltcfs.

Parameters:ltcf (sp.LongTermCareFacility) – ltcf with at minimum the ltcfid, resident_uids, staff_uids, resident_ages, staff_ages, reference_uid, and reference_age.
initialize_schools_list()[source]

Initialize a new schools list.

initialize_empty_schools(n_schools=None)[source]

Create a list of empty schools.

Parameters:n_schools (int) – the number of schools to initialize
populate_schools(student_lists, teacher_lists, non_teaching_staff_lists, age_by_uid, school_types=None, school_mixing_types=None)[source]

Populate all of the schools. Store each school at the index corresponding to its scid.

Parameters:
  • student_lists (list) – list of lists where each sublist represents a school and contains the ids of the students
  • teacher_lists (list) – list of lists where each sublist represents a school and contains the ids of the teachers
  • non_teaching_staff_lists (list) – list of lists where each sublist represents a school and contains the ids of the non teaching staff
  • age_by_uid (dict) – dictionary mapping each person’s id to their age
  • school_types (list) – list of the school types
  • school_mixing_types (list) – list of the school mixing types
get_school(scid)[source]

Return school with id: scid.

Parameters:scid (int) – school id number
Returns:A populated school.
Return type:sp.School
add_school(school)[source]

Add a school to the list of schools.

Parameters:school (sp.School) – school
populate_all_classrooms(schools_in_groups)[source]

Populate all of the classrooms in schools for each school that has school_mixing_type equal to ‘age_and_class_clustered’. Each classroom will be indexed at id clid.

Parameters:schools_in_groups (dict) – a dictionary representing each school in terms of student_groups and teacher_groups corresponding to classrooms
get_classroom(scid, clid)[source]

Return classroom with id: clid from school with id: scid.

Parameters:
  • scid (int) – school id number
  • clid (int) – classroom id number
Returns:

A populated classroom.

Return type:

sp.Classroom

compute_information()[source]

Compute an advanced description of the population.

compute_summary()[source]

Compute summaries and add to pop post generation.

summarize(return_msg=False)[source]

Print and optionally return a brief summary string of the pop.

count_pop_ages()[source]

Create an age count of the generated population post generation.

Returns:Dictionary of the age count of the generated population.
Return type:dict
get_household_sizes()[source]

Create household sizes in the generated population post generation.

Returns:Dictionary of household size by household id (hhid).
Return type:dict
count_household_sizes()[source]

Count of household sizes in the generated population.

Returns:Dictionary of the count of household sizes.
Return type:dict
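The difference between get_household_sizes() and count_household_sizes() is the shape of the result: the first maps each hhid to a size, the second maps each size to the number of households of that size. A minimal sketch with toy data (the households dictionary below is made up and is not a SynthPops structure):

```python
from collections import Counter

# Toy data: household id -> member ids
households = {0: [0, 1, 2], 1: [3], 2: [4, 5], 3: [6]}

# Size of each household, keyed by hhid
household_sizes = {hhid: len(members) for hhid, members in households.items()}

# Number of households of each size
household_size_count = dict(Counter(household_sizes.values()))
```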
get_household_heads()[source]

Get the ids of the head of households in the generated population post generation.

get_household_head_ages()[source]

Get the age of the head of each household in the generated population post generation.

count_household_head_ages(bins=None)[source]

Count of household head ages in the generated population.

Parameters:bins (array) – If supplied, use this to create a binned count of the household head ages. Otherwise, count discrete household head ages.
Returns:Dictionary of the count of household head ages.
Return type:dict
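As an illustration of discrete versus binned counting, the sketch below counts a toy list of household head ages both ways. How SynthPops interprets bins is not stated here; the sketch assumes they are bin edges, with bin i covering ages from edges[i] up to but not including edges[i + 1].

```python
from bisect import bisect_right
from collections import Counter

head_ages = [23, 35, 35, 47, 62, 80]  # toy data

# Discrete count: one entry per distinct age
discrete_count = dict(Counter(head_ages))

# Binned count. Assumption: bin i covers [edges[i], edges[i + 1])
edges = [18, 30, 45, 65, 101]
raw_binned = Counter(bisect_right(edges, age) - 1 for age in head_ages)
binned_count = {i: raw_binned.get(i, 0) for i in range(len(edges) - 1)}
```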
get_household_head_ages_by_size()[source]

Get the count of households by size and by the age of the head of the household, assuming that the smallest member id in each household is the id of the head of the household.

Returns:An array with rows as household sizes and columns as household head age brackets.
Return type:np.ndarray
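The tabulation can be sketched with nested lists in place of a NumPy array. The age brackets, the toy data, and the choice of row index (household size minus one) are assumptions for this example; only the rule that the smallest member id identifies the head comes from the description above.

```python
def bracket_index(age, brackets):
    """Return the index of the (low, high) bracket containing age, bounds inclusive."""
    for i, (low, high) in enumerate(brackets):
        if low <= age <= high:
            return i
    raise ValueError(f'age {age} is outside all brackets')


households = [[4, 2, 9], [1], [7, 3], [5, 6]]  # toy member ids
age_by_uid = {1: 72, 2: 45, 3: 30, 4: 43, 5: 66, 6: 64, 7: 28, 9: 12}
brackets = [(0, 39), (40, 69), (70, 100)]  # hypothetical head age brackets

max_size = max(len(members) for members in households)
counts = [[0] * len(brackets) for _ in range(max_size)]

for members in households:
    head_uid = min(members)  # smallest id is taken to be the household head
    row = len(members) - 1   # assumption: row 0 holds households of size 1
    counts[row][bracket_index(age_by_uid[head_uid], brackets)] += 1
```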
get_ltcf_sizes(keys_to_exclude=[])[source]

Create long term care facility sizes in the generated population post generation.

Parameters:keys_to_exclude (list) – possible keys to exclude for roles in long term care facilities. See notes.
Returns:Dictionary of the size for each long term care facility generated.
Return type:dict

Notes

keys_to_exclude is an empty list by default, but can contain the different long term care facility roles: ‘snf_res’ for residents and ‘snf_staff’ for staff. If either role is included in the parameter keys_to_exclude, then individuals with that value equal to 1 will not be counted.
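The exclusion rule in the note can be sketched as follows. The role keys ‘snf_res’ and ‘snf_staff’ come from the note; the per-person ‘ltcfid’ key, the toy population, and the function name are hypothetical and are not taken from the SynthPops source.

```python
def get_ltcf_sizes_sketch(popdict, keys_to_exclude=()):
    """Count people per facility, skipping anyone whose excluded role flag equals 1."""
    sizes = {}
    for person in popdict.values():
        ltcfid = person.get('ltcfid')  # hypothetical key for the facility id
        if ltcfid is None:
            continue  # not in any facility
        if any(person.get(key) == 1 for key in keys_to_exclude):
            continue  # person holds an excluded role
        sizes[ltcfid] = sizes.get(ltcfid, 0) + 1
    return sizes


toy_pop = {
    0: {'ltcfid': 0, 'snf_res': 1, 'snf_staff': 0},
    1: {'ltcfid': 0, 'snf_res': 1, 'snf_staff': 0},
    2: {'ltcfid': 0, 'snf_res': 0, 'snf_staff': 1},
    3: {'ltcfid': 1, 'snf_res': 1, 'snf_staff': 0},
    4: {'ltcfid': None, 'snf_res': 0, 'snf_staff': 0},
}

all_sizes = get_ltcf_sizes_sketch(toy_pop)
resident_sizes = get_ltcf_sizes_sketch(toy_pop, keys_to_exclude=['snf_staff'])
staff_sizes = get_ltcf_sizes_sketch(toy_pop, keys_to_exclude=['snf_res'])
```

The sketch uses an empty tuple as the default for keys_to_exclude to avoid sharing a mutable default between calls.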

count_ltcf_sizes(keys_to_exclude=[])[source]

Count of long term care facility sizes in the generated population.

Parameters:keys_to_exclude (list) – possible keys to exclude for roles in long term care facilities. See notes.
Returns:Dictionary of the count of long term care facility sizes.
Return type:dict

Notes

keys_to_exclude is an empty list by default, but can contain the different long term care facility roles: ‘snf_res’ for residents and ‘snf_staff’ for staff. If either role is included in the parameter keys_to_exclude, then individuals with that value equal to 1 will not be counted.

count_enrollment_by_age()[source]

Create enrollment count by age for students in the generated population post generation.

Returns:Dictionary of the count of enrolled students by age in the generated population.
Return type:dict
enrollment_rates_by_age

Enrollment rates by age for students in the generated population.

Returns:Dictionary of the enrollment rates by age for students in the generated population.
Return type:dict
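An enrollment rate by age is naturally read as the enrollment count for that age divided by the total number of people of that age. The sketch below computes it from two toy dictionaries shaped like the results of count_enrollment_by_age() and count_pop_ages(); whether SynthPops computes the property exactly this way is an assumption. The same pattern would apply to employment rates.

```python
# Toy counts keyed by age (made-up numbers)
age_count = {5: 4, 6: 5, 7: 2, 30: 10}
enrollment_count = {5: 3, 6: 5, 7: 1, 30: 0}

# Rate = enrolled / total for each age; ages with no people get a rate of 0
enrollment_rates = {
    age: (enrollment_count.get(age, 0) / total if total > 0 else 0.0)
    for age, total in age_count.items()
}
```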
count_enrollment_by_school_type(*args, **kwargs)[source]

Create enrollment sizes by school types in the generated population post generation.

Returns:List of generated enrollment sizes by school type.
Return type:list
count_employment_by_age()[source]

Create employment count by age for workers in the generated population post generation.

Returns:Dictionary of the count of employed workers by age in the generated population.
Return type:dict
employment_rates_by_age

Employment rates by age for workers in the generated population.

Returns:Dictionary of the employment rates by age for workers in the generated population.
Return type:dict
get_workplace_sizes()[source]

Create workplace sizes in the generated population post generation.

Returns:Dictionary of workplace size by workplace id (wpid).
Return type:dict
count_workplace_sizes()[source]

Count of workplace sizes in the generated population.

Returns:Dictionary of the count of workplace sizes.
Return type:dict
get_contact_counts_by_layer(layer='S', **kwargs)[source]

Get the number of contacts by layer.

Returns:Dictionary of the count of contacts in the layer for the different people types in the layer. See sp.contact_networks.get_contact_counts_by_layer() for method details.
Return type:dict
to_people()[source]

Convert to the alternative People representation of a population.

plot_people(*args, **kwargs)[source]

Placeholder example of plotting the people in a population.

plot_contacts(*args, **kwargs)[source]

Plot matrices of the contacts for a given layer or layers.

plot_contact_counts(contact_counter, **kwargs)[source]

Plot the number of contacts by contact types as a histogram.

Parameters:
  • contact_counter (dict) – A dictionary with people_types as keys and, as values, lists of counts for each type of contact
  • **title_prefix (str) – optional title prefix for the figure
  • **figname (str) – name to save figure to disk
  • **fontsize (float) – Matplotlib.figure.fontsize
Returns:

Matplotlib figure and axes of the histograms of contact distributions for the corresponding contact_counter.

Examples:

pars = {'n': 10e3, 'location': 'seattle_metro', 'state_location': 'Washington', 'country_location': 'usa'}
pop = sp.Pop(**pars)
layer = 'S'
contact_counter = pop.get_contact_counts_by_layer(layer=layer)
fig, ax = pop.plot_contact_counts(contact_counter)
plot_ages(**kwargs)[source]

Plot a comparison of the expected and generated age distribution.

Example:

pars = {'n': 10e3, 'location':'seattle_metro', 'state_location':'Washington', 'country_location':'usa'}
pop = sp.Pop(**pars)
fig, ax = pop.plot_ages()
plot_household_sizes(**kwargs)[source]

Plot a comparison of the expected and generated household size distribution.

Example:

pars = {'n': 10e3, 'location':'seattle_metro', 'state_location':'Washington', 'country_location':'usa'}
pop = sp.Pop(**pars)
fig, ax = pop.plot_household_sizes()
plot_household_head_ages_by_size(**kwargs)[source]

Plot a comparison of the expected and generated age distribution of the household heads by the household size.

Examples:

pars = {'n': 10e3, 'location':'seattle_metro', 'state_location':'Washington', 'country_location':'usa'}
pop = sp.Pop(**pars)
fig, ax = pop.plot_household_head_ages_by_size()

kwargs = pars.copy()
fig, ax = pop.plot_household_head_ages_by_size(**kwargs)
plot_ltcf_resident_sizes(**kwargs)[source]

Plot a comparison of the expected and generated ltcf resident sizes.

Examples:

pars = {'n': 10e3, 'location':'seattle_metro', 'state_location':'Washington', 'country_location':'usa'}
pop = sp.Pop(**pars)
fig, ax = pop.plot_ltcf_resident_sizes()
plot_enrollment_rates_by_age(**kwargs)[source]

Plot a comparison of the expected and generated enrollment rates by age.

Example:

pars = {'n': 10e3, 'location':'seattle_metro', 'state_location':'Washington', 'country_location':'usa'}
pop = sp.Pop(**pars)
fig, ax = pop.plot_enrollment_rates_by_age()
plot_employment_rates_by_age(**kwargs)[source]

Plot a comparison of the expected and generated employment rates by age.

Example:

pars = {'n': 10e3, 'location':'seattle_metro', 'state_location':'Washington', 'country_location':'usa'}
pop = sp.Pop(**pars)
fig, ax = pop.plot_employment_rates_by_age()
plot_school_sizes(*args, **kwargs)[source]

Plot a comparison of the expected and generated school size distributions by school type.

Example:

pars = {'n': 10e3, 'location':'seattle_metro', 'state_location':'Washington', 'country_location':'usa'}
pop = sp.Pop(**pars)
fig, ax = pop.plot_school_sizes()
plot_workplace_sizes(**kwargs)[source]

Plot a comparison of the expected and generated workplace sizes for workplaces that are not schools or long term care facilities.

Examples:

pars = {'n': 10e3, 'location':'seattle_metro', 'state_location':'Washington', 'country_location':'usa'}
pop = sp.Pop(**pars)
fig, ax = pop.plot_workplace_sizes()
make_population(*args, **kwargs)[source]

Interface to sp.Pop().to_dict(). Included for backwards compatibility.

generate_synthetic_population(*args, **kwargs)[source]

For backwards compatibility only.