synthpops.data module
-
class PopulationAgeDistribution
Bases: jsonobject.api.JsonObject
Class for population age distribution with a specified number of bins.
- num_bins
- distribution
-
class SchoolSizeDistributionByType
Bases: jsonobject.api.JsonObject
Class for the school size distribution by school type.
- school_type
- size_distribution
-
class SchoolTypeByAge
Bases: jsonobject.api.JsonObject
Class for the school type by age range.
- school_type
- age_range
-
class Location
Bases: jsonobject.api.JsonObject
Class for the json object of a location, containing the population data used to generate representative contact networks.
The parent field is typically populated with a file path, and the parent data is parsed from the file at that path. The DefaultProperty type handles either a scalar or a json object. A json object is allowed mainly for testing inheritance from a parent specified directly in the json.
Most users will want to populate this field with a relative or absolute file path.
Note
The structures for the population age distribution will be updated to be more flexible, taking a parameter for the number of age brackets used to generate the population age distribution structure.
- location_name
- data_provenance_notices
- reference_links
- citations
- notes
- parent
- population_age_distributions
- employment_rates_by_age
- enrollment_rates_by_age
- household_head_age_brackets
- household_head_age_distribution_by_family_size
- household_size_distribution
- ltcf_resident_to_staff_ratio_distribution
- ltcf_num_residents_distribution
- ltcf_num_staff_distribution
- ltcf_use_rate_distribution
- school_size_brackets
- school_size_distribution
- school_size_distribution_by_type
- school_types_by_age
- workplace_size_counts_by_num_personnel
-
get_list_properties()
Get the properties of the location data object as a list.
Returns: A list of the properties of the location json object with data about the location.
Return type: list
-
get_population_age_distribution(nbrackets)
Get the age distribution of the population aggregated to nbrackets age brackets. If the data doesn’t contain a distribution with the requested number of brackets, an exception is raised.
Parameters: nbrackets (int) – the number of age brackets the age distribution is aggregated to
Returns: A list of the probability age distribution values indexed by the bracket number.
Return type: list
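The lookup described above can be sketched as follows. This is a minimal illustration, not the library's implementation: it assumes the distributions are stored as a list of entries with num_bins and distribution fields (mirroring PopulationAgeDistribution). The helper name find_age_distribution, the sample data, the row layout, and the exception type are all hypothetical.

```python
def find_age_distribution(age_distributions, nbrackets):
    """Return the distribution whose num_bins equals nbrackets, or raise."""
    for entry in age_distributions:
        if entry["num_bins"] == nbrackets:
            return entry["distribution"]
    # The exception type is an assumption; the docs only say an exception is raised.
    raise RuntimeError(f"No age distribution with {nbrackets} brackets found.")


# Hypothetical data: each row is assumed to be [age_min, age_max, probability].
sample = [
    {"num_bins": 2, "distribution": [[0, 49, 0.6], [50, 100, 0.4]]},
]
two_brackets = find_age_distribution(sample, 2)
```

Requesting a bracket count that is not present in the data (for example 16 with the sample above) raises instead of silently returning an empty result.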
-
populate_parent_data_from_file_path(location, parent_file_path)
Load a location json object with necessary data fields filled from the parent location, using the parent location file path.
Parameters:
- location (json) – json object for the location data
- parent_file_path (str) – file path to the parent location
Returns: The location json object with necessary data fields filled from the parent location.
Return type: json
-
populate_parent_data_from_json_obj(location, parent)
Load a location json object with necessary data fields filled from the parent location json.
Parameters:
- location (json) – json object for the location data
- parent (json) – json object for the parent location
Returns: The location json object with necessary data fields filled from the parent location.
Return type: json
-
populate_parent_data(location)
Populate location json object with fields from the parent location if available.
Parameters: location (json) – json data object for the location
Returns: The location json data object with data fields filled from the parent location.
Return type: json
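The idea of filling a location from its parent can be sketched with plain dictionaries. This is only an illustration under stated assumptions: the helper name fill_from_parent is hypothetical, and the rule used here for deciding that a field is missing (None or an empty list) is an assumption, not something the documentation specifies.

```python
import copy


def fill_from_parent(location, parent):
    """Copy fields from parent into location where location has no data."""
    filled = copy.deepcopy(location)
    for key, value in parent.items():
        # Treating None and empty lists as "missing" is an assumption.
        if filled.get(key) in (None, []):
            filled[key] = copy.deepcopy(value)
    return filled


# Hypothetical child and parent location data.
child = {"location_name": "child", "household_size_distribution": []}
parent = {
    "location_name": "parent",
    "household_size_distribution": [[1, 0.3], [2, 0.7]],
}
merged = fill_from_parent(child, parent)
```

In this sketch the child keeps its own location_name and inherits only the empty household_size_distribution; the input dictionaries are left unmodified.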
-
load_location_from_json(json_obj, check_constraints=None)
Load location data from json object with some checks made.
Parameters: json_obj (json) – json object containing location data
Returns: The json object with location data.
Return type: json
-
load_location_from_json_str(json_str, check_constraints=None)
Load location data from json str with some checks made.
Parameters: json_str (str) – string version of the json object
Returns: The json object with location data.
Return type: json
-
get_relative_path(datadir)
Get the relative path for the data folder.
Parameters: datadir (str) – data folder path
Returns: Relative path for the data folder.
Return type: str
Notes
This method may not be necessary anymore…
-
get_location_attr(location, property_name)
Get the attribute from the json object containing location data given the associated property name.
Parameters:
- location (json) – the json object with location data
- property_name (str) – the property name
Returns: If property_name exists in the location json object, return [True, attribute]. Else, return [False, None].
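The [True, attribute] / [False, None] return convention can be illustrated with a small stand-in. The helper name get_attr_with_flag and the use of SimpleNamespace in place of a Location object are assumptions made for this sketch.

```python
from types import SimpleNamespace


def get_attr_with_flag(obj, property_name):
    """Return [True, value] if obj has the attribute, else [False, None]."""
    if hasattr(obj, property_name):
        return [True, getattr(obj, property_name)]
    return [False, None]


# SimpleNamespace stands in for a location object here.
loc = SimpleNamespace(location_name="example")
found = get_attr_with_flag(loc, "location_name")
missing = get_attr_with_flag(loc, "not_a_field")
```

Returning a flag alongside the value lets callers distinguish a missing property from a property whose value is None.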
-
load_location_from_filepath(rel_filepath, check_constraints=None)
Load a location data object from the provided relative file path, where the path is relative to defaults.settings.datadir.
Parameters: rel_filepath (str) – relative file path for the location data
Returns: The json object with location data.
Return type: json
-
save_location_to_filepath(location, abs_filepath)
Saves json object with location data to provided absolute filepath.
Parameters:
- location (json) – the json object with location data
- abs_filepath (str) – absolute file path to where the json is saved
Returns: None.
-
check_location_constraints_satisfied(location)
Checks a number of constraints that need to be satisfied for the schema.
Parameters: location (json) – the json object with location data
Returns: None.
Raises: RuntimeError with a description if one of the constraints is not satisfied.
-
are_location_constraints_satisfied(location)
Checks a number of constraints that need to be satisfied for the schema.
Parameters: location (json) – the json object with location data
Returns: [True, None] if all constraints are satisfied. [False, str] if a constraint is violated. The returned str is one of the error messages.
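The two constraint functions differ only in how they report a failure: one returns a [bool, message] pair, the other raises RuntimeError. A minimal sketch of that relationship follows. The function names, the explicit checks argument, and the has_name check are hypothetical, and returning the first failing message is an assumption consistent with "one of the error messages".

```python
def are_constraints_satisfied(location, checks):
    """Run each check; return [True, None] or [False, first error message]."""
    for check in checks:
        ok, message = check(location)
        if not ok:
            return [False, message]
    return [True, None]


def check_constraints_satisfied(location, checks):
    """Raise RuntimeError with the error message if any check fails."""
    ok, message = are_constraints_satisfied(location, checks)
    if not ok:
        raise RuntimeError(message)


def has_name(location):
    """Hypothetical check: location_name must be a string."""
    if isinstance(location.get("location_name"), str):
        return [True, None]
    return [False, "location_name must be a string"]


good = are_constraints_satisfied({"location_name": "example"}, [has_name])
bad = are_constraints_satisfied({"location_name": 3}, [has_name])
```

The pair-returning form suits callers that want to collect or log failures, while the raising form suits loading code that should stop on invalid data.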
-
check_array_of_arrays_entry_lens(location, expected_len, property_name)
Check that each array in an array of arrays has the expected length.
Parameters:
- location (json) – the json object with location data
- expected_len (int) – the expected length of each sub array
- property_name (str) – the property name
Returns: [True, None] if sub array length checks pass. [False, str] if sub array length checks fail. The returned str is the error message.
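A sub array length check of this kind can be sketched as below. For brevity the hypothetical helper check_entry_lens takes the array of arrays directly instead of looking it up on a location object by property name; the error message wording is also an assumption.

```python
def check_entry_lens(arrays, expected_len, property_name):
    """Return [True, None] if every sub array has expected_len entries."""
    for i, entry in enumerate(arrays):
        if len(entry) != expected_len:
            message = (
                f"{property_name}: entry {i} has length {len(entry)}, "
                f"expected {expected_len}"
            )
            return [False, message]
    return [True, None]


# Hypothetical data for a property whose sub arrays should have length 2.
passing = check_entry_lens([[1, 0.3], [2, 0.7]], 2, "household_size_distribution")
failing = check_entry_lens([[1, 0.3], [2]], 2, "household_size_distribution")
```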
-
check_valid_probability_distributions(property_name, valid_properties=None)
Check that the property_name is a valid probability distribution.
Parameters:
- property_name (str) – the property name
- valid_properties (str or list) – a list of the valid probability distributions
Returns: None.
-
check_probability_distribution_sum_age_distributions(location, arr, tolerance=0.01, **kwargs)
Check that each population age distribution has a sum equal to 1 within some tolerance.
Parameters:
- location (json) – the json object with location data
- arr (list) – the list of population age distributions
- tolerance (float) – difference from the sum of 1 tolerated
- kwargs (dict) – dictionary of values passed to np.isclose()
Returns: [True, None] if the sum of the probability distribution is equal to 1 within the tolerance level. [False, str] else. The returned str is the error message with some information about the check.
-
check_probability_distribution_nonnegative_age_distributions(location, arr)
Check that each population age distribution has all non negative values.
Parameters:
- location (json) – the json object with location data
- arr (list) – the list of population age distributions
Returns: [True, None] if the values of each population age distribution are all non negative. [False, str] else. The returned str is the error message with some information about the check.
-
check_probability_distribution_sum(location, property_name, tolerance=0.01, valid_properties=None, **kwargs)
Check that fields representing probability distributions have sums equal to 1 within some tolerance.
Parameters:
- location (json) – the json object with location data
- property_name (str) – the property name
- tolerance (float) – difference from the sum of 1 tolerated
- valid_properties (str or list) – a list of the valid probability distributions
- kwargs (dict) – dictionary of values passed to np.isclose()
Returns: [True, None] if the sum of the probability distribution is equal to 1 within the tolerance level. [False, str] else. The returned str is the error message with some information about the check.
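A tolerance-based sum check can be sketched as follows. To stay self-contained, this sketch uses math.isclose from the standard library as a stand-in for np.isclose, and it assumes the probabilities have already been extracted into a flat list. The helper name check_sum_close_to_one and the message wording are hypothetical.

```python
import math


def check_sum_close_to_one(probabilities, property_name, tolerance=0.01):
    """Return [True, None] if the values sum to 1 within tolerance."""
    total = sum(probabilities)
    # abs_tol plays the role of the tolerance; rel_tol=0.0 makes the
    # comparison purely absolute.
    if math.isclose(total, 1.0, rel_tol=0.0, abs_tol=tolerance):
        return [True, None]
    message = f"{property_name} sums to {total:.4f}, expected 1 within {tolerance}"
    return [False, message]


ok_result = check_sum_close_to_one([0.25, 0.25, 0.5], "household_size_distribution")
bad_result = check_sum_close_to_one([0.6, 0.3], "household_size_distribution")
```

An absolute tolerance is the natural choice here because the target value is fixed at 1, so a relative tolerance would add nothing.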
-
check_probability_distribution_nonnegative(location, property_name, valid_properties=None)
Check that fields representing probability distributions have all non negative values.
Parameters:
- location (json) – the json object with location data
- property_name (str) – the property name
- valid_properties (str or list) – a list of the valid probability distributions
Returns: [True, None] if the values of the probability distribution are all non negative. [False, str] else. The returned str is the error message with some information about the check.
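The non negative check follows the same return convention. A minimal sketch, again assuming a flat list of probabilities and using a hypothetical helper name and message:

```python
def check_nonnegative(probabilities, property_name):
    """Return [True, None] if no value is negative, else [False, message]."""
    negatives = [p for p in probabilities if p < 0]
    if negatives:
        return [False, f"{property_name} has negative values: {negatives}"]
    return [True, None]


all_ok = check_nonnegative([0.0, 0.4, 0.6], "ltcf_use_rate_distribution")
has_negative = check_nonnegative([1.1, -0.1], "ltcf_use_rate_distribution")
```

Note that the second list sums to 1, so a sum check alone would not catch it; the two checks are complementary.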
-
check_all_probability_distribution_sums(location, tolerance=0.01, die=False, verbose=False, **kwargs)
Checks that each probability distribution available to a location has a sum close to 1.
Parameters:
- location (json) – the json object with location data
- tolerance (float) – difference from the sum of 1 tolerated
- die (bool) – raise an exception if the check fails
- verbose (bool) – print a warning if the check fails
- kwargs (dict) – dictionary of values passed to np.isclose()
Returns: List of checks and a list of associated error messages.
Return type: list, list
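The die and verbose flags describe a common aggregation pattern: collect every result, and optionally raise or warn on failure. A sketch of that pattern under stated assumptions: the helper name run_all_checks, the dictionary of precomputed results, the use of ValueError, and the use of warnings.warn are all choices made for this illustration, not the library's behaviour.

```python
import warnings


def run_all_checks(named_results, die=False, verbose=False):
    """Collect [ok, message] results; optionally raise or warn on failure."""
    checks, messages = [], []
    for name, (ok, message) in named_results.items():
        checks.append(ok)
        messages.append(message)
        if not ok:
            if die:
                # The exception type is an assumption.
                raise ValueError(message)
            if verbose:
                warnings.warn(message)
    return checks, messages


# Hypothetical precomputed results keyed by property name.
results = {
    "household_size_distribution": [True, None],
    "ltcf_use_rate_distribution": [False, "ltcf_use_rate_distribution sums to 0.9"],
}
checks, messages = run_all_checks(results)
```

With die=False the caller receives both lists and can decide what to do; with die=True the first failure stops processing.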
-
check_all_probability_distribution_nonnegative(location, die=False, verbose=True)
Run checks that fields representing probability distributions have all non negative values.
Parameters:
- location (json) – json object with the location data
- die (bool) – raise an exception if the check fails
- verbose (bool) – print a warning if the check fails
Returns: List of checks and a list of associated error messages.
Return type: list, list
-
check_location_name(location)
Check that the location json data object has a string value in the location_name field.
Parameters: location (json) – the json object with location data
Returns: [True, str] if the location json has a str value in the location_name field. Returned str specifies the location_name. [False, str] if the location json does not have a str value in the location_name field.
-
check_population_age_distributions(location)
Check that the population age distributions are self-consistent in the number of brackets, and each sub array has length 3.
Parameters: location (json) – the json object with location data
Returns: [True, None] if checks pass. [False, str] if checks fail.
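The two conditions named above (bracket count consistency and sub arrays of length 3) can be sketched as follows. This assumes each entry has num_bins and distribution fields as in PopulationAgeDistribution, and that self-consistency means num_bins equals the number of rows; the helper name, messages, and the meaning of the three row values are assumptions.

```python
def check_age_distributions(age_distributions):
    """Check num_bins matches the row count and every row has 3 entries."""
    for entry in age_distributions:
        rows = entry["distribution"]
        if len(rows) != entry["num_bins"]:
            return [False, f"num_bins is {entry['num_bins']} but found {len(rows)} rows"]
        for row in rows:
            if len(row) != 3:
                return [False, f"row {row} does not have length 3"]
    return [True, None]


# Hypothetical rows, assumed to be [age_min, age_max, probability].
consistent = check_age_distributions(
    [{"num_bins": 2, "distribution": [[0, 49, 0.6], [50, 100, 0.4]]}]
)
wrong_count = check_age_distributions(
    [{"num_bins": 3, "distribution": [[0, 49, 0.6], [50, 100, 0.4]]}]
)
wrong_row = check_age_distributions(
    [{"num_bins": 1, "distribution": [[0, 1.0]]}]
)
```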
-
check_employment_rates_by_age(location)
Check that the employment rates by age is an array of arrays, where each sub array has length 2.
Parameters: location (json) – the json object with location data
Returns: [True, None] if checks pass. [False, str] if checks fail.
-
check_enrollment_rates_by_age(location)
Check that the enrollment rates by age is an array of arrays, where each sub array has length 2.
Parameters: location (json) – the json object with location data
Returns: [True, None] if checks pass. [False, str] if checks fail.
-
check_household_head_age_brackets(location)
Check that the household head age brackets is an array of arrays, where each sub array has length 2.
Parameters: location (json) – the json object with location data
Returns: [True, None] if checks pass. [False, str] if checks fail.
-
check_household_head_age_distributions_by_family_size(location)
Check that the conditional household head age distribution by household size is an array with length equal to the number of household head age brackets.
Parameters: location (json) – the json object with location data
Returns: [True, None] if checks pass. [False, str] if checks fail.
-
check_household_size_distribution(location)
Check that the household size distribution is an array of arrays, where each sub array has length 2.
Parameters: location (json) – the json object with location data
Returns: [True, None] if checks pass. [False, str] if checks fail.
-
check_ltcf_resident_to_staff_ratio_distribution(location)
Check that the long term care facility resident to staff ratio distribution is an array of arrays, where each sub array has length 3.
Parameters: location (json) – the json object with location data
Returns: [True, None] if checks pass. [False, str] if checks fail.
-
check_ltcf_num_residents_distribution(location)
Check that the long term care facility resident size distribution is an array of arrays, where each sub array has length 3.
Parameters: location (json) – the json object with location data
Returns: [True, None] if checks pass. [False, str] if checks fail.
-
check_ltcf_num_staff_distribution(location)
Check that the long term care facility staff size distribution is an array of arrays, where each sub array has length 3.
Parameters: location (json) – the json object with location data
Returns: [True, None] if checks pass. [False, str] if checks fail.
-
check_school_size_brackets(location)
Check that the school size distribution brackets is an array of arrays, where each sub array has length 2.
Parameters: location (json) – the json object with location data
Returns: [True, None] if checks pass. [False, str] if checks fail.
-
check_school_size_distribution_by_type(location)
Check that the school size distribution by school type is an array of arrays, where each sub array has length 3.
Parameters: location (json) – the json object with location data
Returns: [True, None] if checks pass. [False, str] if checks fail.
-
check_school_types_by_age(location)
Check that the school types by age range is an array of arrays, where each sub array has length 2.
Parameters: location (json) – the json object with location data
Returns: [True, None] if checks pass. [False, str] if checks fail.
-
check_workplace_size_counts_by_num_personnel(location)
Check that the workplace size count is an array of arrays, where each sub array has length 3.
Parameters: location (json) – the json object with location data
Returns: [True, None] if checks pass. [False, str] if checks fail.
-
convert_df_to_json_array(df, cols, int_cols=None)
Convert desired data from a pandas dataframe into a json array.
Parameters:
- df (pandas dataframe) – the dataframe with data
- cols (list) – list of the columns to convert to the json array format
- int_cols (str or list) – a str or list of columns to convert to integer values
Returns: An array version of the pandas dataframe to be added to synthpops json data objects.
Return type: array
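The column selection and integer conversion described above can be sketched without pandas. This illustration uses a list of row dictionaries as a stand-in for the dataframe (the shape that DataFrame.to_dict("records") yields). The helper name rows_to_json_array, the sample columns, and the array-of-arrays output shape are assumptions.

```python
def rows_to_json_array(rows, cols, int_cols=None):
    """Build an array of arrays from the selected columns of each row."""
    if int_cols is None:
        int_cols = []
    elif isinstance(int_cols, str):
        # Accept a single column name as well as a list of names.
        int_cols = [int_cols]
    return [
        [int(row[c]) if c in int_cols else row[c] for c in cols]
        for row in rows
    ]


# Hypothetical rows; the "note" column is deliberately left out of cols.
rows = [
    {"age_min": 0.0, "age_max": 4.0, "percent": 0.06, "note": "x"},
    {"age_min": 5.0, "age_max": 9.0, "percent": 0.07, "note": "y"},
]
arr = rows_to_json_array(
    rows, ["age_min", "age_max", "percent"], int_cols=["age_min", "age_max"]
)
```

Converting bracket bounds to integers keeps values such as 0.0 from being serialized as floats in the resulting json.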