emodpy_hiv.plotting.plot_hiv_by_age_and_gender module#

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.create_title(base_title: str = '', node_id: int | None = None, gender: str | None = None, show_avg_per_run: bool = False, show_fraction: bool = False, show_fraction_of: bool = False, fraction_of_str: str = '', hiv_negative: bool = False, has_age_bins: bool = False)[source]#

Use the input arguments to create a title for the plot (and filename).

Parameters:

base_title (str, optional) – This is the core string to place in the title. It describes the specific data being plotted.
node_id (int, optional) – The ID of the node that the data is filtered by.
gender (str, optional) – The gender (Male or Female) that the data is filtered by.
show_avg_per_run (bool, optional) – True indicates that the data is an average over multiple runs.
show_fraction (bool, optional) – True indicates that the data is not true counts but a fraction (i.e. a count divided by another count).
show_fraction_of (bool, optional) – When the denominator of the fraction can be different things, say population or infected, this allows you to specify the option that is not population.
fraction_of_str (str, optional) – If ‘show_fraction_of’ is True, then this argument can be used to include in the plot what the denominator is.
hiv_negative (bool, optional) – If True, then the title will indicate that the data is for people without HIV.
has_age_bins (bool, optional) – If True, then ‘by Age’ is added to the title.

Returns:

A string to be used as the top-line title of the plot.

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.create_y_axis_name(base_title: str = '', node_id: int | None = None, gender: str | None = None, show_avg_per_run: bool = False, show_fraction: bool = False, show_fraction_of: bool = False, fraction_of_str: str = '', has_age_bins: bool = False)[source]#

Given the arguments, create a label for the y-axis that describes the data being plotted.

Parameters:

base_title (str, optional) – This is the core string to place in the y-label. It describes what is being plotted.
node_id (int, optional) – The ID of the node that the data is filtered by.
gender (str, optional) – The gender (Male or Female) that the data is filtered by.
show_avg_per_run (bool, optional) – True indicates that the data is an average over multiple runs.
show_fraction (bool, optional) – True indicates that the data is not true counts but a fraction (i.e. a count divided by another count).
show_fraction_of (bool, optional) – When the denominator of the fraction can be different things, say population or infected, this allows you to specify the option that is not population.
fraction_of_str (str, optional) – If ‘show_fraction_of’ is True, then this argument can be used to include in the plot what the denominator is.
has_age_bins (bool, optional) – TBD

Returns:

A string to be used as the y-axis label of the plot.

Extract population data for a specific node, gender and age.

It is assumed that the file has ages every 5 years from 0 to 100.

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.extract_population_data_multiple_ages(filename: str, node_id: int | None = None, gender: str | None = None, age_bin_list: list[float] | None = None, filter_by_hiv_negative: bool = False, other_strat_column_name: str | None = None, other_strat_value: int | float | str | None = None, other_data_column_names: list[str] | None = None)[source]#

Extract population data for multiple ages for a specific node and gender.

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.extract_population_data_by_stratification(filename: str, node_id: int | None = None, gender: str | None = None, age_bin_list: list[float] | None = None, start_column_name: str | None = None, strat_values: list[str] | None = None)[source]#

Extract population data such that you get the population for each stratification.

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.extract_population_data_by_stratification_for_dir(dir_or_filename: str, node_id: int | None = None, gender: str | None = None, age_bin_list: list[float] | None = None, start_column_name: str | None = None, strat_values: list[str] | None = None, show_avg_per_run: bool = False)[source]#

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.create_df_for_plot_by_stratification(combined_df: DataFrame, col_name_prefixs: list[str], age_bin_list: list[float] | None = None, show_fraction: bool = False)[source]#

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.plot_population_for_dir(dir_or_filename: str, unworld_pop_filename: str, country: str, version: str, x_base_population: float = 1.0, show_avg_per_run: bool = False, gender: str | None = None, age_bin: float | None = None, img_dir: str | None = None)[source]#

Plot the population for the given age bin against the data in the UN World Population spreadsheet.

Parameters:

dir_or_filename (str, required) – The directory or filename containing the ReportHIVByAgeAndGender.csv files.
unworld_pop_filename (str, required) – The name and path to a UN World Pop Excel spreadsheet where the ‘country’ parameter specifies a country name found in the spreadsheet and the ‘version’ specifies the year of the data. These values are needed to know how to read the data in the spreadsheet.
country (str, required) – The name of the country found in the spreadsheet to extract the data for.
version (str, required) – The year associated with when the UN World Pop file was created. PLEASE NOTE: This year is a string.
x_base_population (float, optional) – The population numbers in the CSV file are divided by this ‘x_Base_Population’ value (found in the config) so that you get numbers that match the true population.
show_avg_per_run (bool, optional) – If ‘dir_or_filename’ is a directory, the population at each time step is averaged across the files. Default is False.
gender (str, optional) – The gender (Male or Female) that the data is filtered by.
age_bin (float, optional) – If provided, the data for this specific age stratification will be plotted. Both the data in the report file and the UN World Pop file must have this stratification. If you do not provide a value, then the population is not broken up by age (i.e. total population).
img_dir (str, optional) – Directory to save the images. If None, the images will not be saved and a window will be opened.

Returns:

None - but image will be saved or window opened.
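
The scaling described for ‘x_base_population’ can be pictured with a minimal sketch. The helper name `scale_to_true_population` and the numbers are hypothetical; the sketch shows only the division, not how the module reads the report or the spreadsheet.

```python
def scale_to_true_population(simulated_counts, x_base_population=1.0):
    """Divide simulated counts by x_Base_Population to get real-world scale."""
    if x_base_population <= 0:
        raise ValueError("x_base_population must be positive")
    return [count / x_base_population for count in simulated_counts]


# Assumed example: the simulation was run with x_Base_Population = 0.25,
# so each simulated person stands for four people in the true population.
simulated = [500.0, 525.0, 550.0]
scaled = scale_to_true_population(simulated, x_base_population=0.25)
print(scaled)
```

With the default of 1.0 the counts are returned unchanged, which matches the default in the signature above.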

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.plot_population_by_gender(filename: str, img_dir: str | None = None)[source]#

For the given file, plot the population for each gender over time.

Parameters:

filename (str, required) – The name and path of the ReportHIVByAgeAndGender.csv file to extract the data from.
img_dir (str, optional) – Directory to save the images. If None, the images will not be saved and a window will be opened.

Returns:

None - but image will be saved or window opened.

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.plot_population_by_ip(dir_or_filename: str, exp_dir_or_filename: str | None = None, node_id: int | None = None, gender: str | None = None, age_bin_list=None, ip_key: str | None = None, ip_values: list[str] | None = None, show_avg_per_run: bool = False, show_fraction: bool = False, expected_values: dict | None = None, img_dir: str | None = None)[source]#

For the indicated files, create a plot showing who has what value of the given IP key over time.

Parameters:

dir_or_filename (str, required) – The directory or filename containing the ReportHIVByAgeAndGender.csv files.
exp_dir_or_filename (str, optional) – The expected or alternate directory or filename containing the ReportHIVByAgeAndGender.csv files.
node_id (int, optional) – The ID of the node that the data is filtered by.
gender (str, optional) – The gender (Male or Female) that the data is filtered by.
age_bin_list (list[float], optional) – A list of ages, in years, that defines the bins in which the population with each IP value is counted. For example, if you enter [10, 25, 30, 55], there will be three age bins with the following ranges: [10-25), [25-30), [30-55)
ip_key (str, required) – Extract the data from the files based on this IP stratification column. If the files have not been stratified by this IP, you will need to re-run the simulations with stratification turned on.
ip_values (list[str], optional) – By default, this plotting tool uses all of the values for the IP. This parameter lets you include only the ones you are interested in (i.e. a subset of the possible values for the IP).
show_avg_per_run (bool, optional) – If ‘dir_or_filename’ is a directory, this will calculate the average number of people with the given IP value at each time step across the files. Default is False.
show_fraction (bool, optional) – True indicates that the number of people with the given IP value will be divided by the sum of all the people with any of the selected IP values. For example, if the IP key were Risk and you selected to only plot LOW and MEDIUM, then the total number of people with LOW will be divided by the total number of people with either LOW or MEDIUM.
expected_values (dict, optional) – If the user provides this dictionary, the constant expected value of each IP value will be plotted. This should be a dictionary with an IP value as the key and its constant expected value as the value. There should be an entry for each IP value plotted.
img_dir (str, optional) – Directory to save the images. If None, the images will not be saved and a window will be opened.

Returns:

None - but image will be saved or window opened.
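
The ‘show_fraction’ behaviour described above, where the denominator covers only the selected IP values, can be sketched as follows. The function `fraction_by_ip_value` and the counts are hypothetical and only illustrate the arithmetic.

```python
def fraction_by_ip_value(counts, ip_values=None):
    """Return each selected IP value's share of the selected total."""
    # With no subset given, use every IP value present in the counts.
    selected = list(counts) if ip_values is None else ip_values
    total = sum(counts[value] for value in selected)
    if total == 0:
        return {value: 0.0 for value in selected}
    return {value: counts[value] / total for value in selected}


# Assumed counts for the Risk IP at one time step.
counts = {"LOW": 800, "MEDIUM": 150, "HIGH": 50}
fractions = fraction_by_ip_value(counts, ip_values=["LOW", "MEDIUM"])
print(fractions)
```

Because HIGH is not selected, it is left out of both the result and the denominator, so the LOW and MEDIUM fractions sum to 1.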

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.plot_columns(filename: str, title: str, y_axis_name: str, column_names: list[str], fraction_of_population: bool = False, img_dir: str | None = None)[source]#

For a given file, plot the indicated columns versus time (i.e. Year).

Parameters:

filename (str, required) – The name and path of the ReportHIVByAgeAndGender.csv file to extract the data from.
title (str, required) – The title to put at the top of the plot.
y_axis_name (str, required) – The name to label the y-axis on the plot.
column_names (list[str], required) – The list of column names to plot the data for. The report has a space before each column name. Please be sure to include it.
fraction_of_population (bool, optional) – If True, divide the count in each column by the population for the same stratification.
img_dir (str, optional) – Directory to save the images. If None, the images will not be saved and a window will be opened.

Returns:

None - but image will be saved or window opened.
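
The note about the leading space in column names is easy to trip over. The sketch below uses Python's standard csv module on a short, hypothetical excerpt (the real report has more columns) to show that the header names keep the space that follows each comma.

```python
import csv
import io

# Hypothetical excerpt of a report; note the space after each comma.
report_text = (
    "Year, Population, Infected\n"
    "2000.5, 1000, 50\n"
    "2001.0, 1010, 55\n"
)

reader = csv.DictReader(io.StringIO(report_text))
rows = list(reader)

# The header names keep their leading space, except for the first column.
print(reader.fieldnames)

# So the names passed as column_names need the leading space too.
column_names = [" Population", " Infected"]
population = [float(row[" Population"]) for row in rows]
print(population)
```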

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.plot_circumcision_by_age(filename: str, age_bin_list: list[float], fraction_of_total: bool = False, img_dir: str | None = None)[source]#

For a single file, plot the number of men who are circumcised by age.

Parameters:

filename (str, required) – The name and path of the ReportHIVByAgeAndGender.csv to extract the data from.
age_bin_list (list[float], required) – A list of ages, in years, that defines the bins in which the circumcised men are counted. For example, if you enter [10, 25, 30, 55], there will be three age bins with the following ranges: [10-25), [25-30), [30-55)
fraction_of_total (bool, optional) – If True, the number of men who are circumcised will be divided by the total number of men with that stratification.
img_dir (str, optional) – Directory to save the images. If None, the images will not be saved and a window will be opened.

Returns:

None - but image will be saved or window opened.
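
The age bin convention used by ‘age_bin_list’ throughout this module (N edges give N-1 bins, each including its lower edge and excluding its upper edge) can be sketched as below. The helpers `age_bin_ranges` and `find_bin` are hypothetical and are not part of the module.

```python
def age_bin_ranges(age_bin_list):
    """Pair consecutive edges into (lower, upper) bins: lower <= age < upper."""
    return list(zip(age_bin_list[:-1], age_bin_list[1:]))


def find_bin(age, age_bin_list):
    """Return the bin containing age, or None if it falls outside all bins."""
    for lower, upper in age_bin_ranges(age_bin_list):
        if lower <= age < upper:
            return (lower, upper)
    return None


bins = age_bin_ranges([10, 25, 30, 55])
print(bins)  # [(10, 25), (25, 30), (30, 55)]
print(find_bin(25, [10, 25, 30, 55]))  # (25, 30)
print(find_bin(55, [10, 25, 30, 55]))  # None
```

An age equal to an edge belongs to the bin that starts at that edge, and ages at or beyond the last edge fall outside every bin.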

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.extract_population_data_multiple_ages_for_dir(dir_or_filename: str, node_id: int | None = None, gender: str | None = None, age_bin_list: list[float] | None = None, show_avg_per_run: bool = False, filter_by_hiv_negative: bool = False, other_strat_column_name: str | None = None, other_strat_value_a: int | float | str | None = None, other_strat_value_b: int | float | str | None = None, other_data_column_names: list[str] | None = None)[source]#

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.create_df_for_plot_by_age(combined_df: DataFrame, col_name_prefixs: list[str], main_column_name: str, gender: str, age_bins: list[float], show_fraction: bool, fraction_of: bool, fraction_of_column_name: str)[source]#

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.base_plot_by_age(base_title: str, main_column_name: str, dir_or_filename: str, exp_dir_or_filename: str | None = None, node_id: int | None = None, age_bins: list[float] | None = None, gender: str | None = None, filter_by_hiv_negative: bool = False, other_strat_column_name: str | None = None, other_strat_value_a: int | float | str | None = None, other_strat_value_b: int | float | str | None = None, show_avg_per_run: bool = False, show_fraction: bool = False, fraction_of: bool = False, fraction_of_column_name: str | None = None, fraction_of_str: str | None = None, img_dir: str | None = None)[source]#

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.plot_onART_by_age(dir_or_filename: str, exp_dir_or_filename: str | None = None, node_id: int | None = None, gender: str | None = None, age_bin_list: list[float] | None = None, show_avg_per_run: bool = False, show_fraction: bool = False, fraction_of_infected: bool = False, img_dir: str | None = None)[source]#

Create a plot showing information about the people on ART. You can show the fraction of the population or the fraction of the infected population.

Parameters:

dir_or_filename (str, required) – The directory or filename containing the ReportHIVByAgeAndGender.csv files.
exp_dir_or_filename (str, optional) – The expected or alternate directory or filename containing the ReportHIVByAgeAndGender.csv files.
node_id (int, optional) – The ID of the node that the data is filtered by.
gender (str, optional) – The gender (Male or Female) that the data is filtered by.
age_bin_list (list[float], optional) – A list of ages, in years, that defines the bins in which the people on ART are counted. For example, if you enter [10, 25, 30, 55], there will be three age bins with the following ranges: [10-25), [25-30), [30-55)
show_avg_per_run (bool, optional) – If ‘dir_or_filename’ is a directory, this will calculate the average number of people on ART at each time step across the files. Default is False.
show_fraction (bool, optional) – True indicates that the number of people on ART will be divided by either the number of people in the population or the number of infected people. It depends on the ‘fraction_of_infected’ parameter.
fraction_of_infected (bool, optional) – If ‘show_fraction’ is True, then this parameter determines what the divisor is when creating the fraction. If it is True, it will divide the number of people on ART by the number of infected people. If it is False, the divisor will be the total population with that stratification.
img_dir (str, optional) – Directory to save the images. If None, the images will not be saved and a window will be opened.

Returns:

None - but image will be saved or window opened.
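
The interaction between ‘show_fraction’ and ‘fraction_of_infected’ comes down to the choice of denominator. A minimal sketch with hypothetical counts and a hypothetical helper name:

```python
def on_art_fraction(on_art, infected, population, fraction_of_infected=False):
    """Divide the ART count by infected people or by the whole population."""
    denominator = infected if fraction_of_infected else population
    # Guard against empty stratifications.
    return on_art / denominator if denominator else 0.0


# Assumed counts for one stratification at one time step.
of_population = on_art_fraction(on_art=40, infected=100, population=1000)
of_infected = on_art_fraction(
    on_art=40, infected=100, population=1000, fraction_of_infected=True
)
print(of_population, of_infected)
```

The same ART count gives a much larger fraction when divided by the infected population, so it is worth checking which denominator a plot uses before comparing plots.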

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.plot_population_by_age(dir_or_filename: str, exp_dir_or_filename: str | None = None, node_id: int | None = None, gender: str | None = None, age_bin_list: list[float] | None = None, show_avg_per_run: bool = False, img_dir: str | None = None)[source]#

Create a plot showing the population over time.

Parameters:

dir_or_filename (str, required) – The directory or filename containing the ReportHIVByAgeAndGender.csv files.
exp_dir_or_filename (str, optional) – The expected or alternate directory or filename containing the ReportHIVByAgeAndGender.csv files.
node_id (int, optional) – The ID of the node that the data is filtered by.
gender (str, optional) – The gender (Male or Female) that the data is filtered by.
age_bin_list (list[float], optional) – A list of ages, in years, that defines the bins in which the population is counted. For example, if you enter [10, 25, 30, 55], there will be three age bins with the following ranges: [10-25), [25-30), [30-55)
show_avg_per_run (bool, optional) – If ‘dir_or_filename’ is a directory, this will calculate the average population at each time step across the files. Default is False.
img_dir (str, optional) – Directory to save the images. If None, the images will not be saved and a window will be opened.

Returns:

None - but image will be saved or window opened.
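
The averaging that ‘show_avg_per_run’ describes, one value per time step averaged across several report files, can be sketched with plain lists. The function name and the data are hypothetical, and the sketch assumes every run covers the same time steps.

```python
def average_per_time_step(runs):
    """Average equal-length series element-wise across runs."""
    if not runs:
        return []
    n_steps = len(runs[0])
    if any(len(run) != n_steps for run in runs):
        raise ValueError("all runs must cover the same time steps")
    # zip(*runs) groups the values that share a time step.
    return [sum(values) / len(runs) for values in zip(*runs)]


# Assumed population counts from two runs at three time steps.
run_a = [100, 110, 120]
run_b = [104, 108, 124]
averaged = average_per_time_step([run_a, run_b])
print(averaged)  # [102.0, 109.0, 122.0]
```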

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.plot_vmmc_by_age(dir_or_filename: str, exp_dir_or_filename: str | None = None, node_id: int | None = None, age_bin_list: list[float] | None = None, show_avg_per_run: bool = False, img_dir: str | None = None)[source]#

Create a plot showing information about the men who are circumcised.

Parameters:

dir_or_filename (str, required) – The directory or filename containing the ReportHIVByAgeAndGender.csv files.
exp_dir_or_filename (str, optional) – The expected or alternate directory or filename containing the ReportHIVByAgeAndGender.csv files.
node_id (int, optional) – The ID of the node that the data is filtered by.
age_bin_list (list[float], optional) – A list of ages, in years, that defines the bins in which the circumcised men are counted. For example, if you enter [10, 25, 30, 55], there will be three age bins with the following ranges: [10-25), [25-30), [30-55)
show_avg_per_run (bool, optional) – If ‘dir_or_filename’ is a directory, this will calculate the average number of circumcised men at each time step across the files. Default is False.
img_dir (str, optional) – Directory to save the images. If None, the images will not be saved and a window will be opened.

Returns:

None - but image will be saved or window opened.

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.plot_population_by_age_vs_unworld_pop(filename: str, unworld_pop_filename: str, age_bin_list: list[float], country: str, version: str, x_base_population: float = 1.0, img_dir=None)[source]#

For a single file, plot the actual population for a given age bin versus what was expected in the given UN World Population file. If you have defined three age bins, there will be six curves: an actual and an expected curve for each bin.

Parameters:

filename (str, required) – The name and path of the ReportHIVByAgeAndGender.csv file to plot the data from.
unworld_pop_filename (str, required) – The name and path to a UN World Pop Excel spreadsheet where the ‘country’ parameter specifies a country name found in the spreadsheet and the ‘version’ specifies the year of the data. These values are needed to know how to read the data in the spreadsheet.
age_bin_list (list[float], required) – A list of ages, in years, that defines the bins in which the population is counted. For example, if you enter [10, 25, 30, 55], there will be three age bins with the following ranges: [10-25), [25-30), [30-55)
country (str, required) – The name of the country found in the spreadsheet to extract the data for.
version (str, required) – The year associated with when the UN World Pop file was created. PLEASE NOTE: This year is a string.
x_base_population (float, optional) – The population numbers in the CSV file are divided by this ‘x_Base_Population’ value (found in the config) so that you get numbers that match the true population.
img_dir (str, optional) – Directory to save the images. If None, the images will not be saved and a window will be opened.

Returns:

None - but image will be saved or window opened.

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.plot_risk(dir_or_filename: str, starting_expected_values: dict | None = None, expected_value_for_high_per_node: list[float] | None = None, gender: str | None = None, age_bin_list: list[float] | None = None, show_avg_per_run: bool = False, show_fraction: bool = False, img_dir: str | None = None)[source]#

Create one plot for each node in ‘expected_value_for_high_per_node’ showing the risk values for the population versus what is expected.

Parameters:

dir_or_filename (str, required) – The directory or filename containing the ReportHIVByAgeAndGender.csv files.
starting_expected_values (dict, optional) – The three starting values of how risk is distributed in the population. These should be the same values that you have in the demographics. The order is LOW, MEDIUM, and HIGH. Typically, one might have 0.85 for LOW, 0.15 for MEDIUM, and 0.0 for HIGH because people get set to HIGH in the campaign’s CSW logic.
expected_value_for_high_per_node (list[float], optional) – A list of the expected fraction of the population with Risk = HIGH for each node. The node ID of each value is expected to be its index in the list plus 1. The starting values for LOW and MEDIUM are adjusted for this HIGH value.
gender (str, optional) – The gender (Male or Female) that the data is filtered by.
age_bin_list (list[float], optional) – A list of ages in years where the population with risk value will be counted for each bin. For example, if you enter [10, 25, 30, 55], there will be three age bins with the following ranges: [10-25), [25-30), [30-55)
show_avg_per_run (bool, optional) – If ‘dir_or_filename’ is a directory, this will calculate the average number of people with the given risk type at a given time step between the files. Default is False.
show_fraction (bool, optional) – True indicates that for each stratification the number of people with a given risk value is divided by the total number of people in that stratification.
img_dir (str, optional) – Directory to save the images. If None, the images will not be saved and a window will be opened.

Returns:

None - but image will be saved or window opened.
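
The sketch below illustrates how ‘expected_value_for_high_per_node’ maps list positions to node IDs (index plus 1). How the module adjusts LOW and MEDIUM for the HIGH value is not stated here, so the proportional rescaling is only one plausible reading and is an assumption, as are the dictionary keys and the helper name.

```python
def expected_risk_per_node(starting_expected_values, expected_value_for_high_per_node):
    """Build per-node expected risk fractions, keyed by node ID."""
    low = starting_expected_values["LOW"]
    medium = starting_expected_values["MEDIUM"]
    per_node = {}
    for index, high in enumerate(expected_value_for_high_per_node):
        node_id = index + 1  # node IDs start at 1, list indices at 0
        # ASSUMPTION: shrink LOW and MEDIUM proportionally so that the
        # three fractions still sum to 1 once HIGH is set.
        scale = (1.0 - high) / (low + medium)
        per_node[node_id] = {
            "LOW": low * scale,
            "MEDIUM": medium * scale,
            "HIGH": high,
        }
    return per_node


# Assumed starting values (as in the example above) and HIGH fractions.
starting = {"LOW": 0.85, "MEDIUM": 0.15, "HIGH": 0.0}
per_node = expected_risk_per_node(starting, [0.02, 0.05, 0.01])
print(per_node[2])
```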

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.plot_vmmc_for_dir(dir_or_filename: str, node_id: int | None = None, age_bin_list: list[float] | None = None, show_expected: bool = False, show_avg_per_run: bool = False, img_dir: str | None = None)[source]#

Create a plot showing what fraction of men are circumcised over time versus the expected number of circumcised men.

Parameters:

dir_or_filename (str, required) – The directory or filename containing the ReportHIVByAgeAndGender.csv files.
node_id (int, optional) – The ID of the node that the data is filtered by.
age_bin_list (list[float], optional) – A list of ages, in years, that defines the bins in which the circumcised men are counted. For example, if you enter [10, 25, 30, 55], there will be three age bins with the following ranges: [10-25), [25-30), [30-55)
show_expected (bool, optional) – If True, plot the expected fraction of circumcisions.
show_avg_per_run (bool, optional) – If ‘dir_or_filename’ is a directory, this will calculate the average number of circumcised men at each time step across the files. Default is False.
img_dir (str, optional) – Directory to save the images. If None, the images will not be saved and a window will be opened.

Returns:

None - but image will be saved or window opened.

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.plot_prevalence_for_dir(dir_or_filename: str, exp_dir_or_filename: str | None = None, node_id: int | None = None, gender: str | None = None, age_bin_list: list[float] | None = None, show_avg_per_run: bool = False, show_fraction: bool = False, img_dir: str | None = None)[source]#

Create a plot showing who is infected with HIV.

Parameters:

dir_or_filename (str, required) – The directory or filename containing the ReportHIVByAgeAndGender.csv files.
exp_dir_or_filename (str, optional) – The expected or alternate directory or filename containing the ReportHIVByAgeAndGender.csv files.
node_id (int, optional) – The ID of the node that the data is filtered by.
gender (str, optional) – The gender (Male or Female) that the data is filtered by.
age_bin_list (list[float], optional) – A list of ages, in years, that defines the bins in which the infected people are counted. For example, if you enter [10, 25, 30, 55], there will be three age bins with the following ranges: [10-25), [25-30), [30-55)
show_avg_per_run (bool, optional) – If ‘dir_or_filename’ is a directory, this will calculate the average number of infected people at each time step across the files. Default is False.
show_fraction (bool, optional) – True indicates that the number of infected people should be divided by the total number of people in that stratification.
img_dir (str, optional) – Directory to save the images. If None, the images will not be saved and a window will be opened.

Returns:

None - but image will be saved or window opened.
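
A call to this function might look like the sketch below. The keyword names come from the signature above. The paths and the age bins are hypothetical placeholders, and whether the chosen bins must line up with the ages present in your report is not covered here.

```python
# Hypothetical arguments; replace the paths with your own output and image folders.
prevalence_kwargs = {
    "dir_or_filename": "output/ReportHIVByAgeAndGender.csv",
    "gender": "Female",
    "age_bin_list": [15, 25, 35, 50],
    "show_fraction": True,  # infected divided by population per stratification
    "img_dir": "plots",     # save images instead of opening a window
}


def plot_female_prevalence():
    """Plot HIV prevalence for women using the arguments above."""
    # Imported inside the function so the arguments can be inspected
    # even where emodpy-hiv is not installed.
    from emodpy_hiv.plotting.plot_hiv_by_age_and_gender import plot_prevalence_for_dir

    plot_prevalence_for_dir(**prevalence_kwargs)
```

Calling `plot_female_prevalence()` requires the package and a real report file. With ‘img_dir’ set to None, a window would be opened instead of saving the image.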

emodpy_hiv.plotting.plot_hiv_by_age_and_gender.plot_risk_zambia(dir_or_filename: str, age_bin_list: list[float] | None = None, show_avg_per_run: bool = False, show_fraction: bool = False, show_expected: bool = False, img_dir: str | None = None)[source]#

Create multiple risk value plots where each plot is for a specific node and gender. The plot can show the count or fraction of the group that has one of the three risk values. The plot can also show the expected values for specific node and gender for Zambia.

Parameters:

dir_or_filename (str, required) – The directory or filename containing the ReportHIVByAgeAndGender.csv files.
age_bin_list (list[float], optional) – A list of ages in years where the population with risk value will be counted for each bin. For example, if you enter [10, 25, 30, 55], there will be three age bins with the following ranges: [10-25), [25-30), [30-55)
show_avg_per_run (bool, optional) – If ‘dir_or_filename’ is a directory, this will calculate the average number of people with the given risk type at a given time step between the files. Default is False.
show_fraction (bool, optional) – True indicates that the data is not true counts but a fraction (i.e. a count divided by another count).
show_expected (bool, optional) – True indicates that you want to see the expected fractions of the population that have the different risk values PER NODE. False will be data for all nodes.
img_dir (str, optional) – Directory to save the images. If None, the images will not be saved and a window will be opened.

Returns:

None - but image will be saved or window opened.