synthpops.process_census module

This module provides functions that process data tables from the US Census Bureau into simple distribution tables that SynthPops functions can read.

The module also includes functions that process data tables from the National survey on Long Term Care Providers in the US and convert them into rates by age for each US state using SynthPops functions.

synthpops.process_census.process_us_census_age_counts(datadir, location, state_location, country_location, year, acs_period)

Process American Community Survey data for a given year to get age counts for the location, binned into 18 age brackets.

Parameters
  • datadir (str) – file path to the data directory

  • location (str) – name of the location

  • state_location (str) – name of the state the location is in

  • country_location (str) – name of the country the location is in

  • year (int) – the year for the American Community Survey

  • acs_period (int) – the number of years covered by the American Community Survey estimates

Returns

A dictionary with the binned age count and a dictionary with the age bracket ranges.
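
As a rough illustration of the two return values, the sketch below builds a hypothetical pair of dictionaries: one mapping a bracket index to a count, and one mapping the same index to the ages in that bracket. The helper names, the five-year bracket width, the top age of 100, and the counts are all assumptions made for the example; they are not taken from SynthPops or from the census tables.

```python
# Hypothetical illustration of 18 age brackets; widths and top age are assumed.
def make_age_brackets(n_brackets=18, width=5, max_age=100):
    """Return {bracket index: list of ages}; the last bracket runs up to max_age."""
    brackets = {}
    for b in range(n_brackets - 1):
        brackets[b] = list(range(b * width, (b + 1) * width))
    brackets[n_brackets - 1] = list(range((n_brackets - 1) * width, max_age + 1))
    return brackets


def age_to_bracket(age_brackets):
    """Invert the bracket dictionary into {age: bracket index}."""
    return {age: b for b, ages in age_brackets.items() for age in ages}


age_brackets = make_age_brackets()
# Made-up counts, one per bracket, purely for demonstration.
age_bracket_count = {b: 1000 - 40 * b for b in age_brackets}

lookup = age_to_bracket(age_brackets)
print(len(age_brackets), age_brackets[0], lookup[87])
```

Keeping the counts and the bracket ranges in separate dictionaries keyed by the same index means either one can be changed (for example, re-binned) without touching the other.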

synthpops.process_census.process_us_census_age_counts_by_gender(datadir, location, state_location, country_location, year, acs_period)

Process American Community Survey data for a given year to get age counts by gender for the location, binned into 18 age brackets.

Parameters
  • datadir (str) – file path to the data directory

  • location (str) – name of the location

  • state_location (str) – name of the state the location is in

  • country_location (str) – name of the country the location is in

  • year (int) – the year for the American Community Survey

  • acs_period (int) – the number of years covered by the American Community Survey estimates

Returns

A dictionary with the binned age count by gender and a dictionary with the age bracket ranges.

synthpops.process_census.process_us_census_population_size(datadir, location, state_location, country_location, year, acs_period)

Process American Community Survey data for a given year to get the population size for the location.

Parameters
  • datadir (str) – file path to the data directory

  • location (str) – name of the location

  • state_location (str) – name of the state the location is in

  • country_location (str) – name of the country the location is in

  • year (int) – the year for the American Community Survey

  • acs_period (int) – the number of years covered by the American Community Survey estimates

Returns

The population size of the location for a given year estimated from the American Community Survey.

Return type

int

synthpops.process_census.process_us_census_household_size_count(datadir, location, state_location, country_location, year, acs_period)

Process American Community Survey data for a given year to get a household size count for the location. The last bin represents households of size 7 or higher.

Parameters
  • datadir (str) – file path to the data directory

  • location (str) – name of the location

  • state_location (str) – name of the state the location is in

  • country_location (str) – name of the country the location is in

  • year (int) – the year for the American Community Survey

  • acs_period (int) – the number of years covered by the American Community Survey estimates

Returns

A dictionary with the household size count.
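
The open-ended last bin can be pictured with a small sketch. It top-codes a list of made-up household sizes at 7, so every household of size 7 or more lands in the final bin. The function below is hypothetical and only mirrors the description above; the SynthPops function reads its counts from the survey tables rather than from a list of sizes.

```python
from collections import Counter


def household_size_count(sizes, max_size=7):
    """Count households by size, folding sizes >= max_size into the last bin."""
    counts = Counter(min(s, max_size) for s in sizes)
    # Include empty bins so every size from 1 to max_size has an entry.
    return {k: counts.get(k, 0) for k in range(1, max_size + 1)}


# Made-up household sizes for demonstration only.
sizes = [1, 2, 2, 3, 4, 4, 4, 7, 8, 10]
count = household_size_count(sizes)
print(count)
```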

synthpops.process_census.process_us_census_employment_rates(datadir, location, state_location, country_location, year, acs_period)

Process American Community Survey data for a given year to get employment rates by age as a fraction.

Parameters
  • datadir (str) – file path to the data directory

  • location (str) – name of the location

  • state_location (str) – name of the state the location is in

  • country_location (str) – name of the country the location is in

  • year (int) – the year for the American Community Survey

  • acs_period (int) – the number of years covered by the American Community Survey estimates

Returns

A dictionary with the employment rates by age as a fraction.
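
"Rates by age as a fraction" means one value between 0 and 1 per age. A minimal sketch of that shape follows, assuming the input is a table of percentages by age range; the input layout, the function name, and the numbers are hypothetical and chosen only to show the conversion.

```python
def rates_by_age(bracket_percent):
    """Expand {(min_age, max_age): percent} into {age: fraction between 0 and 1}."""
    rates = {}
    for (lo, hi), pct in bracket_percent.items():
        for age in range(lo, hi + 1):
            rates[age] = pct / 100.0
    return rates


# Made-up percentages by age range, for demonstration only.
employment_percent = {(16, 19): 30.0, (20, 24): 65.0, (25, 29): 80.0}
employment_rates = rates_by_age(employment_percent)
print(employment_rates[18], employment_rates[22])
```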

synthpops.process_census.process_us_census_enrollment_rates(datadir, location, state_location, country_location, year, acs_period)

Process American Community Survey data for a given year to get enrollment rates by age as a fraction.

Parameters
  • datadir (str) – file path to the data directory

  • location (str) – name of the location

  • state_location (str) – name of the state the location is in

  • country_location (str) – name of the country the location is in

  • year (int) – the year for the American Community Survey

  • acs_period (int) – the number of years covered by the American Community Survey estimates

Returns

A dictionary with the enrollment rates by age as a fraction.

synthpops.process_census.process_us_census_workplace_sizes(datadir, location, state_location, country_location, year)

Process American Community Survey data for a given year to get a count of workplace sizes as the number of employees per establishment.

Parameters
  • datadir (str) – file path to the data directory

  • location (str) – name of the location

  • state_location (str) – name of the state the location is in

  • country_location (str) – name of the country the location is in

  • year (int) – the year for the American Community Survey

Returns

A dictionary with the workplace or establishment size distribution as a count.

synthpops.process_census.process_long_term_care_facility_rates_by_age(datadir, state_location, country_location)

Process the National Long Term Care Providers state data tables from 2016 to get the estimated user rates by age.

Parameters
  • datadir (str) – file path to the data directory

  • state_location (str) – name of the state

  • country_location (str) – name of the country the state is in

Returns

A dictionary with the estimated rates of Long Term Care Facility usage by age for the state in 2016.

Return type

dict

synthpops.process_census.process_usa_ltcf_resident_to_staff_ratios(datadir, country_location, state_location, location_alias, location_list=[''], save=False)

Process the Kaiser Health News (KHN) dashboard data on resident to staff ratios by facility to estimate the ratios for all facilities in the area. If save is True, write the result to file.

Parameters
  • datadir (str) – file path to the data directory

  • country_location (str) – name of the country

  • state_location (str) – name of the state

  • location_alias (str) – more commonly known name of the location

  • location_list (list) – list of locations to include

  • save (bool) – If True, save to file.

Returns

A dictionary with the probability of resident to staff ratios and the bins.

Return type

dict
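
To make the returned "probability of ratios and the bins" concrete, here is a minimal sketch that bins a list of per-facility ratios and normalizes the bin counts. The function name, the bin edges, the half-open bin convention, and the ratios are assumptions for illustration; the actual binning used by SynthPops is not described here.

```python
def ratio_distribution(ratios, bin_edges):
    """Return ({bin index: probability}, {bin index: (lower, upper)}).

    Bins are half-open: lower <= ratio < upper.
    """
    bins = {i: (bin_edges[i], bin_edges[i + 1]) for i in range(len(bin_edges) - 1)}
    counts = {i: 0 for i in bins}
    for r in ratios:
        for i, (lo, hi) in bins.items():
            if lo <= r < hi:
                counts[i] += 1
                break
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no ratio falls inside the given bins")
    distr = {i: c / total for i, c in counts.items()}
    return distr, bins


# Made-up resident to staff ratios for six facilities.
ratios = [0.8, 1.2, 1.5, 2.4, 2.6, 3.9]
bin_edges = [0, 1, 2, 3, 4]
distr, bins = ratio_distribution(ratios, bin_edges)
print(distr)
print(bins)
```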

synthpops.process_census.write_age_bracket_distr_18(datadir, location_alias, state_location, country_location, age_bracket_count, age_brackets)

Write age bracket distribution binned to 18 age brackets.

Parameters
  • datadir (str) – file path to the data directory

  • location_alias (str) – more commonly known name of the location

  • state_location (str) – name of the state the location is in

  • country_location (str) – name of the country the location is in

  • age_bracket_count (dict) – dictionary of the age count given by 18 brackets

  • age_brackets (dict) – dictionary of the age range for each bracket

Returns

None.

synthpops.process_census.write_age_bracket_distr_16(datadir, location_alias, state_location, country_location, age_bracket_count, age_brackets)

Write age bracket distribution binned to 16 age brackets.

Parameters
  • datadir (str) – file path to the data directory

  • location_alias (str) – more commonly known name of the location

  • state_location (str) – name of the state the location is in

  • country_location (str) – name of the country the location is in

  • age_bracket_count (dict) – dictionary of the age count given by 18 brackets

  • age_brackets (dict) – dictionary of the age range for each bracket

Returns

None.
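
The input count uses 18 brackets while the written distribution has 16. One way such a reduction could work is to merge the oldest brackets into a single final bracket and then normalize. The sketch below does that with made-up data; the helper names are hypothetical, and the merging rule is an assumption about the aggregation, not a description of what SynthPops does internally.

```python
def collapse_brackets(age_bracket_count, age_brackets, n_out=16):
    """Merge every bracket from index n_out - 1 upward into one final bracket."""
    last = n_out - 1
    new_count = {b: age_bracket_count[b] for b in range(last)}
    new_brackets = {b: list(age_brackets[b]) for b in range(last)}
    new_count[last] = sum(c for b, c in age_bracket_count.items() if b >= last)
    new_brackets[last] = [
        age for b in sorted(age_brackets) if b >= last for age in age_brackets[b]
    ]
    return new_count, new_brackets


def normalize(count):
    """Turn counts into fractions that sum to 1."""
    total = sum(count.values())
    return {k: v / total for k, v in count.items()}


# Hypothetical 18 brackets: five-year bins, then 85 to 100.
age_brackets = {b: list(range(5 * b, 5 * b + 5)) for b in range(17)}
age_brackets[17] = list(range(85, 101))
age_bracket_count = {b: 100 for b in age_brackets}  # made-up counts

count_16, brackets_16 = collapse_brackets(age_bracket_count, age_brackets)
distr_16 = normalize(count_16)
print(len(distr_16), brackets_16[15][0], brackets_16[15][-1])
```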

synthpops.process_census.write_gender_age_bracket_distr_18(datadir, location_alias, state_location, country_location, age_bracket_count_by_gender, age_brackets)

Write age bracket by gender distribution data binned to 18 age brackets.

Parameters
  • datadir (str) – file path to the data directory

  • location_alias (str) – more commonly known name of the location

  • state_location (str) – name of the state the location is in

  • country_location (str) – name of the country the location is in

  • age_bracket_count_by_gender (dict) – dictionary of the age count by gender given by 18 brackets

  • age_brackets (dict) – dictionary of the age range for each bracket

Returns

None.

synthpops.process_census.write_gender_age_bracket_distr_16(datadir, location_alias, state_location, country_location, age_bracket_count_by_gender, age_brackets)

Write age bracket by gender distribution binned to 16 age brackets.

Parameters
  • datadir (str) – file path to the data directory

  • location_alias (str) – more commonly known name of the location

  • state_location (str) – name of the state the location is in

  • country_location (str) – name of the country the location is in

  • age_bracket_count_by_gender (dict) – dictionary of the age count by gender given by 18 brackets

  • age_brackets (dict) – dictionary of the age range for each bracket

Returns

None.

synthpops.process_census.read_household_size_count(datadir, location_alias, state_location, country_location)

Get household size count dictionary.

Parameters
  • datadir (str) – file path to the data directory

  • location_alias (str) – more commonly known name of the location

  • state_location (str) – name of the state the location is in

  • country_location (str) – name of the country the location is in

Returns

A dictionary of the household size count.

Return type

dict

synthpops.process_census.write_household_size_count(datadir, location_alias, state_location, country_location, household_size_count)

Write household size count.

Parameters
  • datadir (str) – file path to the data directory

  • location_alias (str) – more commonly known name of the location

  • state_location (str) – name of the state the location is in

  • country_location (str) – name of the country the location is in

  • household_size_count (dict) – dictionary of the household size count.

Returns

None.

synthpops.process_census.write_household_size_distr(datadir, location_alias, state_location, country_location, household_size_count)

Write household size distribution.

Parameters
  • datadir (str) – file path to the data directory

  • location_alias (str) – more commonly known name of the location

  • state_location (str) – name of the state the location is in

  • country_location (str) – name of the country the location is in

  • household_size_count (dict) – dictionary of the household size count.

Returns

None.
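
Since this function takes a count but writes a distribution, a normalization step is implied. A minimal sketch of that step, writing to an in-memory buffer instead of a file, could look like the following. The function name, the column header, and the CSV layout are hypothetical; the file format SynthPops actually writes is not specified here.

```python
import csv
import io


def write_distr(household_size_count, handle):
    """Normalize counts to fractions and write one CSV row per household size."""
    total = sum(household_size_count.values())
    if total == 0:
        raise ValueError("household size count is empty")
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["household_size", "fraction"])  # hypothetical header
    for size in sorted(household_size_count):
        writer.writerow([size, household_size_count[size] / total])


# Made-up counts for demonstration only.
household_size_count = {1: 25, 2: 50, 3: 25}
buffer = io.StringIO()
write_distr(household_size_count, buffer)
text = buffer.getvalue()
print(text)
```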

synthpops.process_census.write_employment_rates(datadir, location_alias, state_location, country_location, employment_rates)

Write employment rates by age as a fraction.

Parameters
  • datadir (str) – file path to the data directory

  • location_alias (str) – more commonly known name of the location

  • state_location (str) – name of the state the location is in

  • country_location (str) – name of the country the location is in

  • employment_rates (dict) – dictionary of the employment rates by age as a fraction.

Returns

None.

synthpops.process_census.write_enrollment_rates(datadir, location_alias, state_location, country_location, enrollment_rates)

Write enrollment rates by age as a fraction.

Parameters
  • datadir (str) – file path to the data directory

  • location_alias (str) – more commonly known name of the location

  • state_location (str) – name of the state the location is in

  • country_location (str) – name of the country the location is in

  • enrollment_rates (dict) – dictionary of the enrollment rates by age as a fraction.

Returns

None.

synthpops.process_census.write_long_term_care_facility_use_rates(datadir, state_location, country_location, ltcf_rates_by_age)

Write Long Term Care Facility usage rates by age as a fraction for a state in the United States.

Parameters
  • datadir (str) – file path to the data directory

  • state_location (str) – name of the state

  • country_location (str) – name of the country the state is in

  • ltcf_rates_by_age (dict) – dictionary of the long term care facility use rates by age as a fraction.

Returns

None.

synthpops.process_census.write_workplace_size_counts(datadir, location_alias, state_location, country_location, size_label_mappings, establishment_size_counts)

Write workplace or establishment size count distribution.

Parameters
  • datadir (str) – file path to the data directory

  • location_alias (str) – more commonly known name of the location

  • state_location (str) – name of the state the location is in

  • country_location (str) – name of the country the location is in

  • size_label_mappings (dict) – dictionary of the size labels mapping to the size bin

  • establishment_size_counts (dict) – dictionary of the count of workplaces by size label

Returns

None.
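
The two dictionary arguments are linked by the size label. The sketch below shows one way that pairing could be resolved into ordered rows before writing. The function name, the label names, the bin bounds, and the counts are all hypothetical; the real labels come from the source data tables.

```python
def size_count_rows(size_label_mappings, establishment_size_counts):
    """Return (lower, upper, count) rows sorted by the lower bound of each size bin."""
    rows = []
    for label, (lo, hi) in size_label_mappings.items():
        # A label with no recorded establishments gets a count of 0.
        rows.append((lo, hi, establishment_size_counts.get(label, 0)))
    return sorted(rows)


# Hypothetical labels, bins (employees per establishment), and counts.
size_label_mappings = {"small": (1, 4), "medium": (5, 19), "large": (20, 99)}
establishment_size_counts = {"large": 12, "small": 340, "medium": 95}

rows = size_count_rows(size_label_mappings, establishment_size_counts)
for lo, hi, n in rows:
    print(f"{lo}-{hi},{n}")
```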