synthpops.pop module
This module provides the main class for interacting with SynthPops, the Pop class.
-
class
Pop
(n=None, max_contacts=None, ltcf_pars=None, school_pars=None, with_industry_code=False, with_facilities=False, use_default=False, use_two_group_reduction=True, average_LTCF_degree=20, ltcf_staff_age_min=20, ltcf_staff_age_max=60, with_school_types=False, school_mixing_type='random', average_class_size=20, inter_grade_mixing=0.1, average_student_teacher_ratio=20, average_teacher_teacher_degree=3, teacher_age_min=25, teacher_age_max=75, with_non_teaching_staff=False, average_student_all_staff_ratio=15, average_additional_staff_degree=20, staff_age_min=20, staff_age_max=75, rand_seed=None, country_location=None, state_location=None, location=None, sheet_name=None, household_method='infer_ages', smooth_ages=False, window_length=7, do_make=True)[source]¶ Bases:
sciris.sc_utils.prettyobj
Make a full population network including both people (ages, sexes) and contacts. By default uses Seattle, Washington data. Two household methods are available: ‘infer_ages’ and ‘fixed_ages’.
If using ‘infer_ages’, then the ages of individuals in the population are generated by first placing individuals into households using the age of the head of households or reference individuals (always an adult), household age mixing patterns, household sizes, and the age distribution from data (census or other sources).
If using ‘fixed_ages’, then individuals are pre-assigned ages according to the age distribution and placed into households using the age of the head of households or reference individuals, household age mixing patterns, and household sizes.
Parameters: - n (int) – The number of people to create.
- max_contacts (dict) – A dictionary for maximum number of contacts per layer: keys must be “W” (work).
- ltcf_pars (dict) – If supplied, replace default LTCF parameters
- school_pars (dict) – if supplied, replace default school parameters
- with_industry_code (bool) – If True, assign industry codes for workplaces, currently only possible for cached files of populations in the US.
- with_facilities (bool) – If True, create long term care facilities, currently only available for locations in the US.
- use_default (bool) – If True, use default data from settings.location, settings.state, settings.country.
- use_two_group_reduction (bool) – If True, create long term care facilities with reduced contacts across both groups.
- average_LTCF_degree (float) – default average degree in long term care facilities.
- ltcf_staff_age_min (int) – Long term care facility staff minimum age.
- ltcf_staff_age_max (int) – Long term care facility staff maximum age.
- with_school_types (bool) – If True, creates explicit school types.
- school_mixing_type (str or dict) – The mixing type for schools, ‘random’, ‘age_clustered’, or ‘age_and_class_clustered’ if string, and a dictionary of these by school type otherwise.
- average_class_size (float) – The average classroom size.
- inter_grade_mixing (float) – The average fraction of edges rewired to create edges between grades in the same school when school_mixing_type is ‘age_clustered’
- average_student_teacher_ratio (float) – The average number of students per teacher.
- average_teacher_teacher_degree (float) – The average number of contacts per teacher with other teachers.
- teacher_age_min (int) – The minimum age for teachers.
- teacher_age_max (int) – The maximum age for teachers.
- with_non_teaching_staff (bool) – If True, includes non teaching staff.
- average_student_all_staff_ratio (float) – The average number of students per staff members at school (including both teachers and non teachers).
- average_additional_staff_degree (float) – The average number of contacts per additional non teaching staff in schools.
- staff_age_min (int) – The minimum age for non teaching staff.
- staff_age_max (int) – The maximum age for non teaching staff.
- rand_seed (int) – Seed from which the random number sequence is generated.
- country_location (string) – name of the country the location is in
- state_location (string) – name of the state the location is in
- location (string) – name of the location
- sheet_name (string) – sheet name where data is located
- household_method (string) – name of household generation method used; for details see above.
- smooth_ages (bool) – If True, use smoothed out age distribution.
- window_length (int) – length of window over which to average or smooth out age distribution
- do_make (bool) – whether to make the population
Returns: A dictionary of the full population with ages, connections, and other attributes.
Return type: network (dict)
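The smooth_ages and window_length parameters describe averaging the age distribution over a window. The sketch below shows one plausible reading of that idea, a centered moving average that is renormalized afterwards. It is an illustration only: the function name smooth_age_distribution and the edge handling are assumptions, and SynthPops may use a different smoothing kernel internally.

```python
def smooth_age_distribution(age_probs, window_length=7):
    """Centered moving average over per-age probabilities, renormalized to sum to 1.

    Hypothetical helper for illustration; not part of the SynthPops API.
    """
    n = len(age_probs)
    half = window_length // 2
    smoothed = []
    for i in range(n):
        # Shrink the window at the edges instead of padding with zeros (an assumption).
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        window = age_probs[lo:hi]
        smoothed.append(sum(window) / len(window))
    total = sum(smoothed)
    return [p / total for p in smoothed]


# A deliberately spiky toy distribution over 7 single-year ages.
raw = [0.0, 0.3, 0.0, 0.4, 0.0, 0.3, 0.0]
smoothed = smooth_age_distribution(raw, window_length=3)
```

A larger window_length flattens the distribution more; a window of 1 returns the input unchanged.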
-
generate
()[source]¶ Actually generate the network.
Returns: A dictionary of the full population with ages, connections, and other attributes. Return type: network (dict)
-
clean_up_layer_info
()[source]¶ Clean up temporary data from the pop object after storing them in specific layer classes.
-
to_dict
()[source]¶ Export to a dictionary – official way to get the popdict.
Example:
popdict = pop.to_dict()
-
to_json
(filename, indent=2, **kwargs)[source]¶ Export to a JSON file.
Example:
pop.to_json('my-pop.json')
-
save
(filename, **kwargs)[source]¶ Save population to a binary, gzipped object file.
Example:
pop.save('my-pop.pop')
-
static
load
(filename, *args, **kwargs)[source]¶ Load a population from a gzipped pickle on disk.
Parameters: - filename (str) – the name or path of the file to load from
- kwargs – passed to sc.loadobj()
Example:
pop = sp.Pop.load('my-pop.pop')
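To make the "gzipped pickle" format concrete, here is a minimal round trip using only the Python standard library. The helpers save_obj and load_obj are hypothetical names for this sketch; they mirror the idea behind save() and load() but are not guaranteed to be byte-compatible with files written by pop.save() or read by sc.loadobj().

```python
import gzip
import os
import pickle
import tempfile


def save_obj(filename, obj):
    """Pickle obj and write it through a gzip stream (illustrative helper)."""
    with gzip.open(filename, "wb") as f:
        pickle.dump(obj, f)


def load_obj(filename):
    """Read a gzip-compressed pickle back into a Python object (illustrative helper)."""
    with gzip.open(filename, "rb") as f:
        return pickle.load(f)


# Toy stand-in for a population dictionary; the real popdict has more fields.
toy_pop = {
    0: {"age": 34, "contacts": {"H": [1]}},
    1: {"age": 36, "contacts": {"H": [0]}},
}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "my-pop.pop")
    save_obj(path, toy_pop)
    restored = load_obj(path)
```

As with any pickle-based format, only load files from sources you trust.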
-
initialize_empty_households
(n_households=None)[source]¶ Create a list of empty households.
Parameters: n_households (int) – the number of households to initialize
-
populate_households
(households, age_by_uid)[source]¶ Populate all of the households. Store each household at the index corresponding to its hhid.
Parameters: - households (list) – list of lists where each sublist represents a household and contains the ids of the household members
- age_by_uid (dict) – dictionary mapping each person’s id to their age
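The input shapes described above can be sketched with plain Python data. The example builds one record per household, stored at the list index equal to its hhid, using the field names listed for add_household. The function build_households is hypothetical, and picking the smallest member id as the reference person is an assumption borrowed from get_household_head_ages_by_size; the real method creates sp.Household objects instead of dictionaries.

```python
def build_households(households, age_by_uid):
    """Turn a list of member-id lists into household records indexed by hhid.

    Illustrative only; assumes every household has at least one member.
    """
    records = []
    for hhid, member_uids in enumerate(households):
        # Assumption: the smallest member id identifies the reference person.
        reference_uid = min(member_uids)
        records.append({
            "hhid": hhid,
            "member_uids": list(member_uids),
            "member_ages": [age_by_uid[uid] for uid in member_uids],
            "reference_uid": reference_uid,
            "reference_age": age_by_uid[reference_uid],
        })
    return records


households = [[0, 1, 2], [3], [4, 5]]
age_by_uid = {0: 41, 1: 39, 2: 7, 3: 68, 4: 29, 5: 31}
records = build_households(households, age_by_uid)
```

Because each record sits at index hhid, looking up a household by id is a direct list access, for example records[1].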
-
get_household
(hhid)[source]¶ Return household with id: hhid.
Parameters: hhid (int) – household id number Returns: A populated household. Return type: sp.Household
-
add_household
(household)[source]¶ Add a household to the list of households.
Parameters: household (sp.Household) – household with at minimum the hhid, member_uids, member_ages, reference_uid, and reference_age.
-
initialize_empty_workplaces
(n_workplaces=None)[source]¶ Create a list of empty workplaces.
Parameters: n_workplaces (int) – the number of workplaces to initialize
-
populate_workplaces
(workplaces)[source]¶ Populate all of the workplaces. Store each workplace at the index corresponding to its wpid.
Parameters: - workplaces (list) – list of lists where each sublist represents a workplace and contains the ids of the workplace members
- age_by_uid (dict) – dictionary mapping each person’s id to their age
-
get_workplace
(wpid)[source]¶ Return workplace with id: wpid.
Parameters: wpid (int) – workplace id number Returns: A populated workplace. Return type: sp.Workplace
-
add_workplace
(workplace)[source]¶ Add a workplace to the list of workplaces.
Parameters: workplace (sp.Workplace) – workplace with at minimum the wpid, member_uids, member_ages, reference_uid, and reference_age.
-
initialize_empty_ltcfs
(n_ltcfs=None)[source]¶ Create a list of empty ltcfs.
Parameters: n_ltcfs (int) – the number of ltcfs to initialize
-
populate_ltcfs
(resident_lists, staff_lists)[source]¶ Populate all of the ltcfs. Store each ltcf at the index corresponding to its ltcfid.
Parameters: - resident_lists (list) – list of lists where each sublist represents an ltcf and contains the ids of the residents
- staff_lists (list) – list of lists where each sublist represents an ltcf and contains the ids of the staff
-
get_ltcf
(ltcfid)[source]¶ Return ltcf with id: ltcfid.
Parameters: ltcfid (int) – ltcf id number Returns: A populated ltcf. Return type: sp.LongTermCareFacility
-
add_ltcf
(ltcf)[source]¶ Add a ltcf to the list of ltcfs.
Parameters: ltcf (sp.LongTermCareFacility) – ltcf with at minimum the ltcfid, resident_uids, staff_uids, resident_ages, staff_ages, reference_uid, and reference_age.
-
initialize_empty_schools
(n_schools=None)[source]¶ Create a list of empty schools.
Parameters: n_schools (int) – the number of schools to initialize
-
populate_schools
(student_lists, teacher_lists, non_teaching_staff_lists, age_by_uid, school_types=None, school_mixing_types=None)[source]¶ Populate all of the schools. Store each school at the index corresponding to its scid.
Parameters: - student_lists (list) – list of lists where each sublist represents a school and contains the ids of the students
- teacher_lists (list) – list of lists where each sublist represents a school and contains the ids of the teachers
- non_teaching_staff_lists (list) – list of lists where each sublist represents a school and contains the ids of the non teaching staff
- age_by_uid (dict) – dictionary mapping each person’s id to their age
- school_types (list) – list of the school types
- school_mixing_types (list) – list of the school mixing types
-
get_school
(scid)[source]¶ Return school with id: scid.
Parameters: scid (int) – school id number Returns: A populated school. Return type: sp.School
-
add_school
(school)[source]¶ Add a school to the list of schools.
Parameters: school (sp.School) – school
-
populate_all_classrooms
(schools_in_groups)[source]¶ Populate all of the classrooms in schools for each school that has school_mixing_type equal to ‘age_and_class_clustered’. Each classroom will be indexed at id clid.
Parameters: schools_in_groups (dict) – a dictionary representing each school in terms of student_groups and teacher_groups corresponding to classrooms
-
get_classroom
(scid, clid)[source]¶ Return classroom with id: clid from school with id: scid.
Parameters: - scid (int) – school id number
- clid (int) – classroom id number
Returns: A populated classroom.
Return type: sp.Classroom
-
count_pop_ages
()[source]¶ Create an age count of the generated population post generation.
Returns: Dictionary of the age count of the generated population. Return type: dict
-
get_household_sizes
()[source]¶ Create household sizes in the generated population post generation.
Returns: Dictionary of household size by household id (hhid). Return type: dict
-
count_household_sizes
()[source]¶ Count of household sizes in the generated population.
Returns: Dictionary of the count of household sizes. Return type: dict
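The relationship between get_household_sizes() and count_household_sizes() can be sketched as follows: the first maps each hhid to a size, the second tallies how many households have each size. The numbers are made up for illustration, and the real methods may return differently ordered or typed dictionaries.

```python
from collections import Counter

# Shape of a get_household_sizes() result: household id -> household size (toy values).
household_sizes = {0: 3, 1: 1, 2: 2, 3: 3, 4: 1, 5: 1}

# Shape of a count_household_sizes() result: household size -> number of households.
size_counts = dict(sorted(Counter(household_sizes.values()).items()))
```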
-
get_household_heads
()[source]¶ Get the ids of the head of households in the generated population post generation.
-
get_household_head_ages
()[source]¶ Get the age of the head of each household in the generated population post generation.
-
count_household_head_ages
(bins=None)[source]¶ Count of household head ages in the generated population.
Parameters: bins (array) – If supplied, use this to create a binned count of the household head ages. Otherwise, count discrete household head ages. Returns: Dictionary of the count of household head ages. Return type: dict
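The difference between the binned and discrete counts can be sketched as below. The helper count_head_ages is hypothetical, and the bin convention used here (half-open intervals [lower, upper), keyed by bin index, with out-of-range ages dropped) is an assumption; check the SynthPops source for the convention it actually applies.

```python
from bisect import bisect_right
from collections import Counter


def count_head_ages(head_ages, bins=None):
    """Count household head ages, either per discrete age or per bin.

    Illustrative helper; bins is a sorted list of bin edges.
    """
    if bins is None:
        # No bins supplied: count each discrete age.
        return dict(sorted(Counter(head_ages).items()))
    counts = {i: 0 for i in range(len(bins) - 1)}
    for age in head_ages:
        idx = bisect_right(bins, age) - 1  # index of the bin whose lower edge is <= age
        if 0 <= idx < len(bins) - 1:       # ignore ages outside the bin range
            counts[idx] += 1
    return counts


head_ages = [25, 34, 34, 51, 67, 72]
discrete = count_head_ages(head_ages)
binned = count_head_ages(head_ages, bins=[20, 40, 60, 80])
```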
-
get_household_head_ages_by_size
()[source]¶ Get the count of households by size and the age of the head of the household, assuming the smallest member id in each household is the id of the head of the household.
Returns: An array with rows as household sizes and columns as household head age brackets. Return type: np.ndarray
-
get_ltcf_sizes
(keys_to_exclude=[])[source]¶ Create long term care facility sizes in the generated population post generation.
Parameters: keys_to_exclude (list) – possible keys to exclude for roles in long term care facilities. See notes. Returns: Dictionary of the size for each long term care facility generated. Return type: dict
Notes: keys_to_exclude is an empty list by default, but can contain the different long term care facility roles: ‘snf_res’ for residents and ‘snf_staff’ for staff. If either role is included in the parameter keys_to_exclude, then individuals with that value equal to 1 will not be counted.
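The effect of keys_to_exclude can be illustrated on a toy population dictionary. In this sketch the role flags 'snf_res' and 'snf_staff' come from the note above, while the function ltcf_sizes and the 'facility_id' key are hypothetical names chosen for the example; the real popdict may store the facility id under a different key.

```python
from collections import Counter


def ltcf_sizes(popdict, keys_to_exclude=None, id_key="facility_id"):
    """Count people per facility, skipping anyone whose excluded role flag equals 1.

    Illustrative helper; id_key is an assumed field name.
    """
    keys_to_exclude = keys_to_exclude or []
    sizes = Counter()
    for person in popdict.values():
        facility = person.get(id_key)
        if facility is None:
            continue  # person is not in any facility
        if any(person.get(key) == 1 for key in keys_to_exclude):
            continue  # person holds an excluded role
        sizes[facility] += 1
    return dict(sizes)


popdict = {
    0: {"facility_id": 0, "snf_res": 1, "snf_staff": 0},
    1: {"facility_id": 0, "snf_res": 1, "snf_staff": 0},
    2: {"facility_id": 0, "snf_res": 0, "snf_staff": 1},
    3: {"facility_id": 1, "snf_res": 1, "snf_staff": 0},
    4: {"facility_id": None, "snf_res": 0, "snf_staff": 0},
}

all_sizes = ltcf_sizes(popdict)
residents_only = ltcf_sizes(popdict, keys_to_exclude=["snf_staff"])
```

Passing None instead of a mutable default list avoids sharing state between calls, which is why the sketch differs from the keys_to_exclude=[] signature shown above.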
-
count_ltcf_sizes
(keys_to_exclude=[])[source]¶ Count of long term care facility sizes in the generated population.
Parameters: keys_to_exclude (list) – possible keys to exclude for roles in long term care facilities. See notes. Returns: Dictionary of the count of long term care facility sizes. Return type: dict
Notes: keys_to_exclude is an empty list by default, but can contain the different long term care facility roles: ‘snf_res’ for residents and ‘snf_staff’ for staff. If either role is included in the parameter keys_to_exclude, then individuals with that value equal to 1 will not be counted.
-
count_enrollment_by_age
()[source]¶ Create enrollment count by age for students in the generated population post generation.
Returns: Dictionary of the count of enrolled students by age in the generated population. Return type: dict
-
enrollment_rates_by_age
¶ Enrollment rates by age for students in the generated population.
Returns: Dictionary of the enrollment rates by age for students in the generated population. Return type: dict
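An enrollment rate by age is naturally the enrolled count divided by the population count at that age. The sketch below assumes that definition and combines dictionaries shaped like the outputs of count_enrollment_by_age() and count_pop_ages(); the helper rates_by_age and the choice to report 0.0 for ages with no people are assumptions for illustration.

```python
def rates_by_age(count_by_age, pop_by_age):
    """Divide a per-age count by the per-age population size.

    Illustrative helper; ages with zero population get a rate of 0.0.
    """
    return {
        age: (count_by_age.get(age, 0) / n if n > 0 else 0.0)
        for age, n in pop_by_age.items()
    }


age_count = {5: 100, 6: 120, 30: 200, 90: 0}   # toy count_pop_ages() output
enrollment_count = {5: 90, 6: 120, 30: 10}     # toy count_enrollment_by_age() output
rates = rates_by_age(enrollment_count, age_count)
```

The same division applies to employment rates, using employment counts by age in place of enrollment counts.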
-
count_enrollment_by_school_type
(*args, **kwargs)[source]¶ Create enrollment sizes by school types in the generated population post generation.
Returns: List of generated enrollment sizes by school type. Return type: list
-
count_employment_by_age
()[source]¶ Create employment count by age for workers in the generated population post generation.
Returns: Dictionary of the count of employed workers by age in the generated population. Return type: dict
-
employment_rates_by_age
¶ Employment rates by age for workers in the generated population.
Returns: Dictionary of the employment rates by age for workers in the generated population. Return type: dict
-
get_workplace_sizes
()[source]¶ Create workplace sizes in the generated population post generation.
Returns: Dictionary of workplace size by workplace id (wpid). Return type: dict
-
count_workplace_sizes
()[source]¶ Count of workplace sizes in the generated population.
Returns: Dictionary of the count of workplace sizes. Return type: dict
-
get_contact_counts_by_layer
(layer='S', **kwargs)[source]¶ Get the number of contacts by layer.
Returns: Dictionary of the count of contacts in the layer for the different people types in the layer. See sp.contact_networks.get_contact_counts_by_layer() for method details. Return type: dict
-
plot_contact_counts
(contact_counter, **kwargs)[source]¶ Plot the number of contacts by contact types as a histogram.
Parameters: - contact_counter (dict) – A dictionary with people_types as keys and value as list of counts for each type of contacts
- **title_prefix (str) – optional title prefix for the figure
- **figname (str) – name to save figure to disk
- **fontsize (float) – Matplotlib.figure.fontsize
Returns: Matplotlib figure and axes of the histograms of contact distributions for the corresponding contact_counter.
Examples:
import synthpops as sp
pars = {'n': 10e3, 'location': 'seattle_metro', 'state_location': 'Washington', 'country_location': 'usa'}
pop = sp.Pop(**pars)
layer = 'S'
contact_counter = pop.get_contact_counts_by_layer(layer=layer)
fig, ax = pop.plot_contact_counts(contact_counter)
-
plot_ages
(**kwargs)[source]¶ Plot a comparison of the expected and generated age distribution.
Example:
import synthpops as sp
pars = {'n': 10e3, 'location': 'seattle_metro', 'state_location': 'Washington', 'country_location': 'usa'}
pop = sp.Pop(**pars)
fig, ax = pop.plot_ages()
-
plot_household_sizes
(**kwargs)[source]¶ Plot a comparison of the expected and generated household size distribution.
Example:
import synthpops as sp
pars = {'n': 10e3, 'location': 'seattle_metro', 'state_location': 'Washington', 'country_location': 'usa'}
pop = sp.Pop(**pars)
fig, ax = pop.plot_household_sizes()
-
plot_household_head_ages_by_size
(**kwargs)[source]¶ Plot a comparison of the expected and generated age distribution of the household heads by the household size.
Examples:
import synthpops as sp
pars = {'n': 10e3, 'location': 'seattle_metro', 'state_location': 'Washington', 'country_location': 'usa'}
pop = sp.Pop(**pars)
fig, ax = pop.plot_household_head_ages_by_size()
kwargs = pars.copy()
fig, ax = pop.plot_household_head_ages_by_size(**kwargs)
-
plot_ltcf_resident_sizes
(**kwargs)[source]¶ Plot a comparison of the expected and generated ltcf resident sizes.
Examples:
import synthpops as sp
pars = {'n': 10e3, 'location': 'seattle_metro', 'state_location': 'Washington', 'country_location': 'usa'}
pop = sp.Pop(**pars)
fig, ax = pop.plot_ltcf_resident_sizes()
-
plot_enrollment_rates_by_age
(**kwargs)[source]¶ Plot a comparison of the expected and generated enrollment rates by age.
Example:
import synthpops as sp
pars = {'n': 10e3, 'location': 'seattle_metro', 'state_location': 'Washington', 'country_location': 'usa'}
pop = sp.Pop(**pars)
fig, ax = pop.plot_enrollment_rates_by_age()
-
plot_employment_rates_by_age
(**kwargs)[source]¶ Plot a comparison of the expected and generated employment rates by age.
Example:
import synthpops as sp
pars = {'n': 10e3, 'location': 'seattle_metro', 'state_location': 'Washington', 'country_location': 'usa'}
pop = sp.Pop(**pars)
fig, ax = pop.plot_employment_rates_by_age()
-
plot_school_sizes
(*args, **kwargs)[source]¶ Plot a comparison of the expected and generated school size distributions by school type.
Example:
import synthpops as sp
pars = {'n': 10e3, 'location': 'seattle_metro', 'state_location': 'Washington', 'country_location': 'usa'}
pop = sp.Pop(**pars)
fig, ax = pop.plot_school_sizes()
-
plot_workplace_sizes
(**kwargs)[source]¶ Plot a comparison of the expected and generated workplace sizes for workplaces that are not schools or long term care facilities.
Examples:
import synthpops as sp
pars = {'n': 10e3, 'location': 'seattle_metro', 'state_location': 'Washington', 'country_location': 'usa'}
pop = sp.Pop(**pars)
fig, ax = pop.plot_workplace_sizes()