synthpops.contact_networks module

This module generates the household, school, and workplace contact networks.

make_contacts(pop, age_by_uid, homes_by_uids, students_by_uid_lists=None, teachers_by_uid_lists=None, non_teaching_staff_uid_lists=None, workplace_by_uid_lists=None, facilities_by_uid_lists=None, facilities_staff_uid_lists=None, use_two_group_reduction=False, average_LTCF_degree=20, with_school_types=False, school_mixing_type='random', average_class_size=20, inter_grade_mixing=0.1, average_student_teacher_ratio=20, average_teacher_teacher_degree=3, average_student_all_staff_ratio=15, average_additional_staff_degree=20, school_type_by_age=None, workplaces_by_industry_codes=None, max_contacts=None)

From microstructure objects (dictionary mapping ID to age, lists of lists in different settings, etc.), create a dictionary of individuals. Each key is the ID of an individual which maps to a dictionary for that individual with attributes such as their age, household ID (hhid), school ID (scid), workplace ID (wpid), workplace industry code (wpindcode) if available, and contacts in different layers.

Parameters:
  • age_by_uid (dict) – dictionary mapping id to age for all individuals in the population
  • homes_by_uids (list) – A list of lists, where each sublist represents a household and contains the ids of its members
  • students_by_uid_lists (list) – A list of lists, where each sublist represents a school and contains the ids of the students within it
  • teachers_by_uid_lists (list) – A list of lists, where each sublist represents a school and contains the ids of the teachers within it
  • workplace_by_uid_lists (list) – A list of lists, where each sublist represents a workplace and contains the ids of the workers within it
  • facilities_by_uid_lists (list) – A list of lists, where each sublist represents a skilled nursing or long term care facility and contains the ids of the residents living within it
  • facilities_staff_uid_lists (list) – A list of lists, where each sublist represents a skilled nursing or long term care facility and contains the ids of the staff working within it
  • non_teaching_staff_uid_lists (list) – None or a list of lists, where each sublist represents a school and contains the ids of the non teaching staff within it
  • use_two_group_reduction (bool) – If True, create long term care facilities with reduced contacts across both groups
  • average_LTCF_degree (int) – default average degree in long term care facilities
  • with_school_types (bool) – If True, creates explicit school types.
  • school_mixing_type (str or dict) – The mixing type for schools: ‘random’, ‘age_clustered’, or ‘age_and_class_clustered’ if a string, or a dictionary of these by school type otherwise. ‘random’ means random graphs for each school; ‘age_clustered’ means random graphs but with students mostly mixing within their age/grade (inter_grade_mixing controls mixing between grades); ‘age_and_class_clustered’ means students are cohorted into classes with their own teachers.
  • average_class_size (float) – The average classroom size.
  • inter_grade_mixing (float) – The average fraction of mixing between grades in the same school for clustered school mixing types.
  • average_student_teacher_ratio (float) – The average number of students per teacher.
  • average_teacher_teacher_degree (float) – The average number of contacts per teacher with other teachers.
  • average_student_all_staff_ratio (float) – The average number of students per staff member at school (including both teachers and non teachers).
  • average_additional_staff_degree (float) – The average number of contacts per additional non teaching staff in schools.
  • school_type_by_age (dict) – A dictionary of probabilities for the school type likely for each age.
  • workplaces_by_industry_codes (np.ndarray or None) – array with workplace industry code for each workplace
  • trimmed_size_dic (dict) – If supplied, trim contacts on creation rather than post hoc.
Returns:

A popdict of people with attributes. Dictionary keys are the IDs of individuals in the population and the values are a dictionary for each individual with their attributes, such as age, household ID (hhid), school ID (scid), workplace ID (wpid), workplace industry code (wpindcode) if available, and the IDs of their contacts in different layers. The available layers are households (‘H’), schools (‘S’), workplaces (‘W’), and long term care facilities (‘LTCF’). Contacts in these layers are clustered and thus form a network composed of groups of people interacting with each other. For example, all household members are contacts of each other, and everyone in the same school is considered a contact of each other. If use_two_group_reduction is True, then contacts within ‘LTCF’ are reduced from fully connected.

Notes

Methods to trim large groups of contacts down to better approximate a sense of close contacts (such as classroom sizes or smaller work groups) are available via sp.trim_contacts() or sp.create_reduced_contacts_with_group_types(); see these methods for more details.

If with_school_types==False, completely random schools will be generated with respect to the average_class_size, but other parameters such as average_additional_staff_degree will not be used.
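As a rough illustration of the popdict structure described above, the sketch below builds a tiny popdict by hand with only a household layer. It is not the synthpops implementation: the helper name build_household_popdict is hypothetical, and only the age, hhid, and contacts keys named in the description are used.

```python
def build_household_popdict(age_by_uid, homes_by_uids):
    """Hypothetical helper: minimal popdict with a fully connected 'H' layer."""
    popdict = {
        uid: {"age": age, "hhid": None, "contacts": {"H": set()}}
        for uid, age in age_by_uid.items()
    }
    for hhid, members in enumerate(homes_by_uids):
        for uid in members:
            popdict[uid]["hhid"] = hhid
            # Everyone else in the household is a contact of this person.
            popdict[uid]["contacts"]["H"] = set(members) - {uid}
    return popdict


age_by_uid = {0: 34, 1: 36, 2: 5, 3: 71}
homes_by_uids = [[0, 1, 2], [3]]
popdict = build_household_popdict(age_by_uid, homes_by_uids)
```

Here person 0 ends up with household contacts {1, 2}, while person 3, who lives alone, has an empty contact set.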

create_reduced_contacts_with_group_types(popdict, group_1, group_2, setting, average_degree=20, p_matrix=None, force_cross_edges=True)

Create contacts between members of group 1 and group 2 with a fixed average degree. If p_matrix is provided, it controls the probability of an edge between any two groups. If force_cross_edges is True, each individual in group 1 is forced to have at least one inter-group edge; this does not guarantee that everyone in group 2 has a contact in group 1.

Parameters:
  • group_1 (list) – list of ids for group 1
  • group_2 (list) – list of ids for group 2
  • average_degree (int) – average degree across group 1 and 2
  • p_matrix (np.ndarray) – probability matrix for edges between any two groups
  • force_cross_edges (bool) – If True, force each individual to have at least one contact with a member from the other group
Returns:

Popdict with edges added for nodes in the two groups.

Notes

This method uses the Stochastic Block Model algorithm to generate contacts both between nodes in different groups and for nodes within the same group. In the current version, fixing both the average degree and p_matrix (the matrix of probabilities for edges between any two groups) at the same time is not supported. Future versions may add support for this.
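To make the two-group idea concrete, here is a minimal standard-library sketch of a two-block stochastic block model with forced cross-group edges for group 1. It is an illustration only, not the synthpops code: the function name two_group_sbm_edges is hypothetical, the groups are assumed to be disjoint, and the sketch samples edges directly from p_matrix without fixing the average degree.

```python
import random


def two_group_sbm_edges(group_1, group_2, p_matrix, force_cross_edges=True, seed=None):
    """Illustrative two-block stochastic block model (hypothetical helper)."""
    rng = random.Random(seed)
    block = {uid: 0 for uid in group_1}
    block.update({uid: 1 for uid in group_2})
    nodes = list(group_1) + list(group_2)
    edges = set()
    # Sample each unordered pair once, with a probability set by the two blocks.
    for idx, i in enumerate(nodes):
        for j in nodes[idx + 1:]:
            if rng.random() < p_matrix[block[i]][block[j]]:
                edges.add((i, j))
    # Optionally guarantee every group 1 member at least one group 2 contact.
    if force_cross_edges and group_2:
        for i in group_1:
            if not any((i, j) in edges for j in group_2):
                edges.add((i, rng.choice(list(group_2))))
    return edges


residents = [0, 1, 2, 3]
staff = [10, 11]
p_matrix = [[0.5, 0.2], [0.2, 0.5]]
edges = two_group_sbm_edges(residents, staff, p_matrix, seed=1)
```

After this runs, every resident has at least one staff contact, but a staff member may still have no resident contact, which matches the caveat in the description.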

get_contact_counts_by_layer(popdict, layer='S', with_layer_ids=False)

Method to count the number of contacts for individuals in the population based on their role in a layer and the role of their contacts. For example, in schools this method can distinguish the number of contacts between students, teachers, and non teaching staff in the population, as well as return the number of contacts between all individuals present in a school. In a population with a school layer and roles defined as students, teachers, and non teaching staff, this method will return the number of contacts or edges from sc_student, sc_teacher, and sc_staff to sc_student, sc_teacher, sc_staff, all_staff, and all. Here all_staff is the combination of sc_teacher and sc_staff, and all is all kinds of people in schools.

Parameters:
  • popdict (dict) – popdict of a Pop object. Dictionary keys are the IDs of individuals in the population and the values are a dictionary for each individual with their attributes
  • layer (str) – name of the physical contact layer: H for households, S for schools, W for workplaces, C for community, etc.
  • with_layer_ids (bool) – If True, return additional dictionary on contacts by layer group id
Returns:

If with_layer_ids is False: a dictionary with keys = people_types (default [‘sc_student’, ‘sc_teacher’, ‘sc_staff’]), where each value is a dictionary that stores the list of counts for each type of contact (default [‘sc_student’, ‘sc_teacher’, ‘sc_staff’, ‘all_staff’, ‘all’]). For example, contact_counter[‘sc_teacher’][‘sc_teacher’] stores the counts of each teacher’s contacts or edges to other teachers. If with_layer_ids is True: additionally return a dictionary with keys = layer_id (for example: scid, wpid…) and values that are lists of contact counts.
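The shape of the returned counter can be sketched with plain dictionaries. The example below is a simplified stand-in rather than the library's code: the function name count_school_contacts_by_role is hypothetical, and it assumes each person's role is stored as a truthy flag under the keys sc_student, sc_teacher, or sc_staff.

```python
def count_school_contacts_by_role(popdict, layer="S"):
    """Hypothetical sketch: per-person contact counts grouped by role."""
    roles = ["sc_student", "sc_teacher", "sc_staff"]
    contact_types = roles + ["all_staff", "all"]
    counter = {role: {ctype: [] for ctype in contact_types} for role in roles}

    def role_of(person):
        # Assumption: the role is flagged by a truthy value under its key.
        for role in roles:
            if person.get(role):
                return role
        return None

    for person in popdict.values():
        role = role_of(person)
        if role is None:
            continue
        counts = dict.fromkeys(contact_types, 0)
        for cid in person["contacts"][layer]:
            contact_role = role_of(popdict[cid])
            if contact_role is None:
                continue
            counts[contact_role] += 1
            counts["all"] += 1
            if contact_role in ("sc_teacher", "sc_staff"):
                counts["all_staff"] += 1
        # One entry per person of this role, for every contact type.
        for ctype in contact_types:
            counter[role][ctype].append(counts[ctype])
    return counter


school = {
    0: {"sc_student": 1, "contacts": {"S": {1, 2}}},
    1: {"sc_student": 1, "contacts": {"S": {0, 2}}},
    2: {"sc_teacher": 1, "contacts": {"S": {0, 1, 3}}},
    3: {"sc_staff": 1, "contacts": {"S": {2}}},
}
contact_counter = count_school_contacts_by_role(school)
```

With this toy school, contact_counter['sc_teacher']['sc_student'] holds one entry (the single teacher's two student contacts), and contact_counter['sc_student']['sc_teacher'] holds one entry per student.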

filter_people(pop, ages=None, uids=None)

Helper function to filter people based on their uid and age.

Parameters:
  • pop (sp.Pop) – population
  • ages (list or array) – ages of people to include
  • uids (list or array) – ids of people to include
Returns:

An array of the ids of people to include for further analysis.

Return type:

array

count_layer_degree(pop, layer='H', ages=None, uids=None, uids_included=None)

Create a dataframe from the population of people in the layer, including their uid, age, degree, and the ages of contacts in the layer.

Parameters:
  • pop (sp.Pop) – population
  • layer (str) – name of the physical contact layer: H for households, S for schools, W for workplaces, C for community or other
  • ages (list or array) – ages of people to include
  • uids (list or array) – ids of people to include
  • uids_included (list or None) – pre-calculated mask of people to include
Returns:

A pandas DataFrame of people in the layer including uid, age, degree, and the ages of contacts in the layer.

Return type:

pandas.DataFrame
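The per-person record described above (uid, age, degree, and ages of contacts) can be sketched without the library. The helper below is hypothetical and works on a plain popdict rather than an sp.Pop object; it returns a list of row dictionaries, which could be handed to pandas.DataFrame to obtain a table of the same shape.

```python
def layer_degree_rows(popdict, layer="H"):
    """Hypothetical sketch: one row per person with degree and contact ages."""
    rows = []
    for uid, person in popdict.items():
        contacts = person["contacts"].get(layer, set())
        rows.append({
            "uid": uid,
            "age": person["age"],
            # Degree is the number of contacts the person has in this layer.
            "degree": len(contacts),
            "contact_ages": sorted(popdict[cid]["age"] for cid in contacts),
        })
    return rows


people = {
    0: {"age": 40, "contacts": {"H": {1, 2}}},
    1: {"age": 38, "contacts": {"H": {0, 2}}},
    2: {"age": 7, "contacts": {"H": {0, 1}}},
    3: {"age": 80, "contacts": {"H": set()}},
}
rows = layer_degree_rows(people, layer="H")
```

In this sketch a person living alone appears with degree 0 and an empty list of contact ages; whether the library keeps or drops such rows is not stated in the description.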

compute_layer_degree_description(pop, layer='H', ages=None, uids=None, uids_included=None, degree_df=None, percentiles=None)

Compute a description of the statistics for the degree distribution by age for a layer in the population contact network. See pandas.DataFrame.describe() for more details on all of the statistics included by default.

Parameters:
  • pop (sp.Pop) – population
  • layer (str) – name of the physical contact layer: H for households, S for schools, W for workplaces, C for community or other
  • ages (list or array) – ages of people to include
  • uids (list or array) – ids of people to include
  • uids_included (list or None) – pre-calculated mask of people to include
  • degree_df (dataframe) – pandas dataframe of people in the layer and their uid, age, degree, and ages of their contacts in the layer
  • percentiles (list) – list of the percentiles to include as statistics
Returns:

A pandas DataFrame of the statistics for the layer degree distribution by age.

Return type:

pandas.DataFrame

random_graph_model(uids, average_degree, seed=None)

Generate edges for a group of individuals, identified by their ids, from an Erdos-Renyi random graph model with the given expected average degree.

Parameters:
  • uids (list, np.ndarray) – a list or array of the ids of people in the graph
  • average_degree (float) – the average degree in the generated graph
Returns:

A graph generated with a fast implementation of the Erdos-Renyi random graph model.

Return type:

nx.Graph
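For readers unfamiliar with the model, the following standard-library sketch shows the basic Erdos-Renyi construction: each pair of individuals is connected independently with probability p, chosen so that the expected degree matches average_degree. It is a plain quadratic-time illustration, not the fast implementation the library refers to; the function name erdos_renyi_edges is hypothetical, and it returns an edge list rather than an nx.Graph.

```python
import itertools
import random


def erdos_renyi_edges(uids, average_degree, seed=None):
    """Hypothetical sketch of G(n, p) with p = average_degree / (n - 1)."""
    rng = random.Random(seed)
    n = len(uids)
    if n < 2:
        return []
    # Each node has n - 1 possible neighbours, so the expected degree is p * (n - 1).
    p = min(1.0, average_degree / (n - 1))
    return [
        (i, j)
        for i, j in itertools.combinations(uids, 2)
        if rng.random() < p
    ]


edges = erdos_renyi_edges(list(range(200)), average_degree=6, seed=0)
mean_degree = 2 * len(edges) / 200
```

For a group of this size, mean_degree should land close to the requested average, with the usual sampling noise.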

get_expected_density(average_degree, n_nodes)

Calculate the expected density of an undirected graph with no self-loops given graph properties. The expected density of an undirected graph with no self-loops is defined as the number of edges as a fraction of the number of maximal edges possible.

Reference: Newman, M. E. J. (2010). Networks: An Introduction (pp 134-135). Oxford University Press.

Parameters:
  • average_degree (float) – average expected degree
  • n_nodes (int) – number of nodes in the graph
Returns:

The expected graph density.

Return type:

float
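The definition above leads to a short formula. An undirected graph with n nodes and no self-loops has at most n(n - 1)/2 edges, and a graph with m edges has average degree c = 2m/n, so the density m / (n(n - 1)/2) simplifies to c / (n - 1). The sketch below applies that relation; the function name expected_density is hypothetical, and returning 0.0 for graphs with fewer than two nodes is an assumption made here to avoid dividing by zero.

```python
def expected_density(average_degree, n_nodes):
    """Hypothetical sketch: density = average_degree / (n_nodes - 1)."""
    if n_nodes < 2:
        # Assumption: a graph with fewer than two nodes has no possible edges.
        return 0.0
    return average_degree / (n_nodes - 1)


complete_graph_density = expected_density(average_degree=4, n_nodes=5)
sparse_graph_density = expected_density(average_degree=2, n_nodes=11)
```

A graph on 5 nodes where everyone has degree 4 is complete, so its density is 1.0; an average degree of 2 across 11 nodes gives a density of 0.2.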