laser_core.demographics package
Submodules
laser_core.demographics.kmestimator module
This module provides the KaplanMeierEstimator class for predicting the year and age at death based on given ages and cumulative death data.
Classes:
KaplanMeierEstimator: A class to perform Kaplan-Meier estimation for predicting the year and age at death.
Functions:
_pyod(ages_years: np.ndarray, cumulative_deaths: np.ndarray, max_year: np.uint32 = 100): Calculate the predicted year of death based on the given ages in years.
_pdod(age_in_days: np.ndarray, year_of_death: np.ndarray, day_of_death: np.ndarray): Calculate the predicted day of death based on the given ages in days and predicted years of death.
Usage example:
estimator = KaplanMeierEstimator(np.array([...]))
year_of_death = estimator.predict_year_of_death(np.array([40, 50, 60]), max_year=80)
age_at_death = estimator.predict_age_at_death(np.array([40*365, 50*365, 60*365]), max_year=80)
- class laser_core.demographics.kmestimator.KaplanMeierEstimator(source: ndarray | list | Path | str)
Bases:
object
- property cumulative_deaths: ndarray
Returns the original source data.
- predict_age_at_death(ages_days: ndarray[Any, dtype[integer]], max_year: uint32 = 100) ndarray
Calculate the predicted age at death (in days) based on the given ages in days.
- Parameters:
ages_days (np.ndarray) – The ages of the individuals in days.
max_year (int) – The maximum year to consider for calculating the predicted year of death. Default is 100.
- Returns:
age_at_death (np.ndarray) – The predicted ages at death, in days.
Example:
predict_age_at_death(np.array([40*365, 50*365, 60*365]), max_year=80) # returns something like array([22732, 26297, 29862])
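The step from a predicted year of death to an age at death in days can be illustrated with a small standalone sketch. It is not the library's implementation: the function name sketch_age_at_death, the 365-day year, and the rule that the day of death may not precede the individual's current age are all assumptions made for illustration.

```python
import random

DAYS_PER_YEAR = 365  # assumption: the documented examples use 365-day years


def sketch_age_at_death(age_days, year_of_death, rng):
    """Pick a day inside `year_of_death` that is not earlier than `age_days`.

    Hypothetical helper for illustration only.
    """
    current_year, current_day = divmod(age_days, DAYS_PER_YEAR)
    if year_of_death < current_year:
        raise ValueError("year of death precedes the current age")
    # If death falls in the current year of life, only the remaining days
    # of that year qualify; otherwise any day of the year is possible.
    first_day = current_day if year_of_death == current_year else 0
    day_of_year = rng.randint(first_day, DAYS_PER_YEAR - 1)  # inclusive bounds
    return year_of_death * DAYS_PER_YEAR + day_of_year


rng = random.Random(0)
later = sketch_age_at_death(40 * DAYS_PER_YEAR, 62, rng)
same_year = sketch_age_at_death(40 * DAYS_PER_YEAR + 200, 40, rng)
print(later, same_year)
```

Under these assumptions the result always lies within the chosen year of death and never before the individual's current age.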
- predict_year_of_death(ages_years: ndarray[Any, dtype[integer]], max_year: uint32 = 100) ndarray
Calculate the predicted year of death based on the given ages in years.
- Parameters:
ages_years (np.ndarray) – The ages of the individuals in years.
max_year (int) – The maximum year to consider for calculating the predicted year of death. Default is 100.
- Returns:
year_of_death (np.ndarray) – The predicted years of death.
Example:
predict_year_of_death(np.array([40, 50, 60]), max_year=80) # returns something like array([62, 72, 82])
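One way to picture how a year of death can be drawn from cumulative death counts is sketched below. This is a standalone illustration, not the library's code: the function name sketch_year_of_death is hypothetical, and it assumes cumulative_deaths[y] holds the number of deaths through the end of year y of life, as a non-decreasing sequence.

```python
import bisect
import random


def sketch_year_of_death(age_years, cumulative_deaths, max_year, rng):
    """Draw a year of death in [age_years, max_year], given survival to age_years.

    Hypothetical helper for illustration only.
    """
    # Deaths that occurred before the individual's current year of life.
    already_dead = cumulative_deaths[age_years - 1] if age_years > 0 else 0
    total = cumulative_deaths[max_year]
    if total <= already_dead:
        return max_year  # no recorded deaths remain; clamp to the last year
    # Pick one of the remaining deaths uniformly, then find the first year
    # whose cumulative count reaches it.
    draw = rng.randint(already_dead + 1, total)
    return bisect.bisect_left(cumulative_deaths, draw, age_years, max_year + 1)


rng = random.Random(0)
# All five deaths happen in year 2, so the draw is deterministic.
print(sketch_year_of_death(0, [0, 0, 5, 5], 3, rng))  # → 2
```

Conditioning on survival simply means ignoring the deaths recorded before the current age, which is why the draw starts at already_dead + 1.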
laser_core.demographics.pyramid module
A class for generating samples from a distribution using the Vose alias method.
- class laser_core.demographics.pyramid.AliasedDistribution(counts)
Bases:
object
A class to generate samples from a distribution using the Vose alias method.
- property alias: ndarray
- property probs: ndarray
- sample(count=1, dtype=numpy.int32) int
Generate samples from the distribution.
- Parameters:
count (int) – The number of samples to generate. Default is 1.
dtype (numpy dtype) – The integer type of the generated samples. Default is numpy.int32.
- Returns:
int or numpy.ndarray – A single integer if count is 1, otherwise an array of integers representing the generated samples.
- property total: int
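For readers unfamiliar with the Vose alias method, the following pure-Python sketch shows the general technique: build two tables once, then draw each sample in constant time. It is an illustration under stated assumptions, not the internals of AliasedDistribution: the function names build_alias_table and sample_alias are hypothetical, and the tables here hold floating-point probabilities, which may differ from how the class stores probs and alias.

```python
import random


def build_alias_table(counts):
    """Return (probs, alias) tables for non-negative counts (Vose's method)."""
    n = len(counts)
    total = sum(counts)
    # Scale so the average bin weight is exactly 1.0.
    scaled = [c * n / total for c in counts]
    probs = [1.0] * n
    alias = list(range(n))
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        sm = small.pop()
        lg = large.pop()
        probs[sm] = scaled[sm]  # chance of keeping bin `sm` itself
        alias[sm] = lg          # otherwise fall through to bin `lg`
        # The large bin donated (1 - scaled[sm]) of its weight.
        scaled[lg] = scaled[lg] + scaled[sm] - 1.0
        (small if scaled[lg] < 1.0 else large).append(lg)
    # Any bins left over keep probs == 1.0 (they differ only by round-off).
    return probs, alias


def sample_alias(probs, alias, rng):
    """Draw one bin index: pick a column uniformly, then keep it or take its alias."""
    i = rng.randrange(len(probs))
    return i if rng.random() < probs[i] else alias[i]


probs, alias = build_alias_table([1, 2, 3, 4])
rng = random.Random(0)
draws = [sample_alias(probs, alias, rng) for _ in range(10_000)]
print([draws.count(i) for i in range(4)])  # roughly proportional to 1:2:3:4
```

Table construction is linear in the number of bins, and each draw costs one uniform integer and one uniform float regardless of the number of bins.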
- laser_core.demographics.pyramid.load_pyramid_csv(file: Path, verbose=False) ndarray
Load a CSV file with population pyramid data and return it as a NumPy array.
The CSV file is expected to have the following schema:
The first line is a header: “Age,M,F”
Subsequent lines contain age ranges and population counts for males and females:
"low-high,#males,#females" ... "max+,#males,#females"
Where low, high, males, females, and max are integer values >= 0.
The function processes the CSV file to create a NumPy array with the following columns:
Start age of the range
End age of the range
Number of males
Number of females
- Parameters:
file (Path) – The path to the CSV file.
verbose (bool) – If True, prints the file reading status. Default is False.
- Returns:
np.ndarray – A NumPy array with the processed population pyramid data.
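The schema above can be made concrete with a small standalone parser. This sketch uses only the standard library and returns plain lists rather than a NumPy array. The function name parse_pyramid_rows is hypothetical, and the end age assigned to the open-ended "max+" bracket (the open_ended_max parameter) is an assumption, since the documentation does not state which value load_pyramid_csv uses.

```python
import csv
import io


def parse_pyramid_rows(text, open_ended_max=100):
    """Parse "Age,M,F" pyramid text into [start, end, males, females] rows.

    Hypothetical helper; `open_ended_max` is an assumed end age for "max+".
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    if header != ["Age", "M", "F"]:
        raise ValueError(f"unexpected header: {header}")
    rows = []
    for record in reader:
        if not record:
            continue  # skip blank lines
        age_range, males, females = record
        if age_range.endswith("+"):
            # Open-ended final bracket, e.g. "10+".
            start, end = int(age_range[:-1]), open_ended_max
        else:
            low, high = age_range.split("-")
            start, end = int(low), int(high)
        rows.append([start, end, int(males), int(females)])
    return rows


sample = "Age,M,F\n0-4,100,95\n5-9,90,88\n10+,50,60\n"
print(parse_pyramid_rows(sample))
# → [[0, 4, 100, 95], [5, 9, 90, 88], [10, 100, 50, 60]]
```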
laser_core.demographics.spatialpops module
- laser_core.demographics.spatialpops.distribute_population_skewed(tot_pop, num_nodes, frac_rural=0.3)
Calculate the population distribution across a number of nodes based on a total population, the number of nodes, and the fraction of the population assigned to rural nodes.
The function generates a list of node populations distributed according to a simple exponential random distribution, with adjustments to ensure the sum matches the total population and the specified fraction of rural population is respected.
- Parameters:
tot_pop (int) – The total population to be distributed across the nodes.
num_nodes (int) – The total number of nodes among which the population will be distributed.
frac_rural (float) – The fraction of the total population to be assigned to rural nodes (a value between 0 and 1). Defaults to 0.3. Node 0 is the single urban node and receives the remaining (1 - frac_rural) of the population.
- Returns:
list of int – A list of integers representing the population at each node. The sum of the list equals tot_pop.
Notes
The population distribution is weighted using an exponential random distribution to create heterogeneity among node populations.
Adjustments are made to ensure the total fraction assigned to rural nodes adheres to frac_rural.
Examples
>>> import numpy as np
>>> from laser_core.demographics.spatialpops import distribute_population_skewed
>>> np.random.seed(42)  # For reproducibility
>>> tot_pop = 1000
>>> num_nodes = 5
>>> frac_rural = 0.3
>>> distribute_population_skewed(tot_pop, num_nodes, frac_rural)
[700, 154, 64, 54, 28]
>>> tot_pop = 500
>>> num_nodes = 3
>>> frac_rural = 0.4
>>> distribute_population_skewed(tot_pop, num_nodes, frac_rural)
[300, 136, 64]
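The general idea of one urban node plus exponentially weighted rural nodes can be sketched without the library. The code below is an illustration only: the function name sketch_distribute_skewed is hypothetical, it uses the standard library's random module rather than NumPy, and its rounding strategy (floor each rural share, then give the remainder to the largest rural node) is an assumption, so its output will not match the documented examples.

```python
import random


def sketch_distribute_skewed(tot_pop, num_nodes, frac_rural=0.3, rng=None):
    """Split tot_pop into one urban node and exponentially weighted rural nodes.

    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()
    if num_nodes == 1:
        return [tot_pop]  # no rural nodes exist to receive a share
    urban = round(tot_pop * (1 - frac_rural))
    rural_total = tot_pop - urban
    # Exponential weights create heterogeneity among the rural nodes.
    weights = [rng.expovariate(1.0) for _ in range(num_nodes - 1)]
    scale = rural_total / sum(weights)
    rural = [int(w * scale) for w in weights]  # floor each share
    # Give the rounding remainder to the largest rural node so totals match.
    rural[rural.index(max(rural))] += rural_total - sum(rural)
    return [urban] + rural


pops = sketch_distribute_skewed(1000, 5, 0.3, random.Random(42))
print(pops, sum(pops))
```

In this sketch node 0 always receives round(tot_pop * (1 - frac_rural)), and the list sums to tot_pop by construction.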
- laser_core.demographics.spatialpops.distribute_population_tapered(tot_pop, num_nodes)
Distribute a total population heterogeneously across a given number of nodes.
The distribution follows a logarithmic-like decay pattern where the first node (Node 0) receives the largest share of the population, approximately half the total population. Subsequent nodes receive progressively smaller populations, ensuring that even the smallest node has a non-negligible share.
The function ensures the sum of the distributed populations matches the tot_pop exactly by adjusting the largest node if rounding introduces discrepancies.
- Parameters:
tot_pop (int) – The total population to distribute. Must be a positive integer.
num_nodes (int) – The number of nodes to distribute the population across. Must be a positive integer.
- Returns:
numpy.ndarray – A 1D array of integers where each element represents the population assigned to a specific node. The length of the array is equal to num_nodes.
- Raises:
ValueError – If tot_pop or num_nodes is not greater than 0.
Notes
The logarithmic-like distribution ensures that Node 0 has the highest population, and subsequent nodes receive progressively smaller proportions.
The function guarantees that the sum of the returned array equals tot_pop.
Examples
Distribute a total population of 1000 across 5 nodes:
>>> from laser_core.demographics.spatialpops import distribute_population_tapered
>>> distribute_population_tapered(1000, 5)
array([500, 250, 125, 75, 50])
Distribute a total population of 1200 across 3 nodes:
>>> distribute_population_tapered(1200, 3)
array([600, 400, 200])
Handling a small total population with more nodes:
>>> distribute_population_tapered(10, 4)
array([5, 3, 2, 0])
Ensuring the distribution adds up to the total population:
>>> pop = distribute_population_tapered(1000, 5)
>>> pop.sum()
1000
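The "round, then correct the largest node" step described above can be sketched in a few lines. The weighting used here, 1/(i + 1) for node i, is an assumption chosen for illustration, and the function name sketch_distribute_tapered is hypothetical. The library's actual weights may differ, so this sketch is not expected to reproduce the example outputs shown above.

```python
def sketch_distribute_tapered(tot_pop, num_nodes):
    """Distribute tot_pop over num_nodes with decaying weights.

    Hypothetical helper; the 1/(i + 1) weights are an assumed decay pattern.
    """
    if tot_pop <= 0 or num_nodes <= 0:
        raise ValueError("tot_pop and num_nodes must both be greater than 0")
    weights = [1.0 / (i + 1) for i in range(num_nodes)]
    total_weight = sum(weights)
    pops = [round(tot_pop * w / total_weight) for w in weights]
    # Rounding can leave the total slightly off; absorb the difference
    # in node 0, which holds the largest share.
    pops[0] += tot_pop - sum(pops)
    return pops


pops = sketch_distribute_tapered(1000, 5)
print(pops, sum(pops))
```

Applying the correction to node 0 keeps the adjustment small relative to the node it lands on.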
Module contents
- class laser_core.demographics.AliasedDistribution(counts)
Bases:
object
A class to generate samples from a distribution using the Vose alias method.
- property alias: ndarray
- property probs: ndarray
- sample(count=1, dtype=numpy.int32) int
Generate samples from the distribution.
- Parameters:
count (int) – The number of samples to generate. Default is 1.
dtype (numpy dtype) – The integer type of the generated samples. Default is numpy.int32.
- Returns:
int or numpy.ndarray – A single integer if count is 1, otherwise an array of integers representing the generated samples.
- property total: int
- class laser_core.demographics.KaplanMeierEstimator(source: ndarray | list | Path | str)
Bases:
object
- property cumulative_deaths: ndarray
Returns the original source data.
- predict_age_at_death(ages_days: ndarray[Any, dtype[integer]], max_year: uint32 = 100) ndarray
Calculate the predicted age at death (in days) based on the given ages in days.
- Parameters:
ages_days (np.ndarray) – The ages of the individuals in days.
max_year (int) – The maximum year to consider for calculating the predicted year of death. Default is 100.
- Returns:
age_at_death (np.ndarray) – The predicted ages at death, in days.
Example:
predict_age_at_death(np.array([40*365, 50*365, 60*365]), max_year=80) # returns something like array([22732, 26297, 29862])
- predict_year_of_death(ages_years: ndarray[Any, dtype[integer]], max_year: uint32 = 100) ndarray
Calculate the predicted year of death based on the given ages in years.
- Parameters:
ages_years (np.ndarray) – The ages of the individuals in years.
max_year (int) – The maximum year to consider for calculating the predicted year of death. Default is 100.
- Returns:
year_of_death (np.ndarray) – The predicted years of death.
Example:
predict_year_of_death(np.array([40, 50, 60]), max_year=80) # returns something like array([62, 72, 82])
- laser_core.demographics.load_pyramid_csv(file: Path, verbose=False) ndarray
Load a CSV file with population pyramid data and return it as a NumPy array.
The CSV file is expected to have the following schema:
The first line is a header: “Age,M,F”
Subsequent lines contain age ranges and population counts for males and females:
"low-high,#males,#females" ... "max+,#males,#females"
Where low, high, males, females, and max are integer values >= 0.
The function processes the CSV file to create a NumPy array with the following columns:
Start age of the range
End age of the range
Number of males
Number of females
- Parameters:
file (Path) – The path to the CSV file.
verbose (bool) – If True, prints the file reading status. Default is False.
- Returns:
np.ndarray – A NumPy array with the processed population pyramid data.