hiv_workflow.lib.analysis.gaussian_distribution module

class hiv_workflow.lib.analysis.gaussian_distribution.GaussianDistribution

Bases: hiv_workflow.lib.analysis.base_distribution.BaseDistribution

exception InvalidUncertaintyException

Bases: Exception

UNCERTAINTY_CHANNEL = 'two_sigma'

prepare(dfw, channel, weight_channel=None, additional_keep=None)

Prepare a DataFrameWrapper and this distribution object for a later compare() call. Preparation includes verifying and checking the dataframe, adding distribution-specific channels/columns, and trimming the data columns to the minimum needed. Depending on the distribution type, additional attributes may also be set on self (e.g. for BetaDistribution, self.alpha_channel and self.beta_channel are derived from the channel argument).

:param dfw: DataFrameWrapper containing data that will be used in a future compare() call
:param channel: data channel/column in dfw that the future compare() call will evaluate
:param weight_channel: an analyzer weighting channel that must be kept, if specified
:param additional_keep: additional columns in the DataFrameWrapper to preserve

Returns: a modified copy of the input DataFrameWrapper
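
The verify, add, and trim steps described for prepare() can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: a plain dict of columns stands in for the DataFrameWrapper, and the function name prepare_columns, the class InvalidUncertaintyError, and the specific checks (the uncertainty column must exist and be non-negative) are all assumptions made for illustration.

```python
UNCERTAINTY_CHANNEL = "two_sigma"  # same value as the documented class attribute


class InvalidUncertaintyError(Exception):
    """Hypothetical stand-in for InvalidUncertaintyException."""


def prepare_columns(columns, channel, weight_channel=None, additional_keep=None):
    """Return a trimmed copy of `columns` (dict: column name -> list of values).

    Assumed behavior, not confirmed by the documentation: the data channel
    and the uncertainty channel must be present, and uncertainties must be
    non-negative numbers.
    """
    if channel not in columns:
        raise KeyError(f"missing data channel: {channel!r}")
    if UNCERTAINTY_CHANNEL not in columns:
        raise InvalidUncertaintyError(f"missing column: {UNCERTAINTY_CHANNEL!r}")
    if any(u is None or u < 0 for u in columns[UNCERTAINTY_CHANNEL]):
        raise InvalidUncertaintyError("uncertainties must be non-negative")

    # Collect the minimum set of columns, plus anything explicitly requested.
    keep = [channel, UNCERTAINTY_CHANNEL]
    if weight_channel is not None:
        keep.append(weight_channel)
    keep.extend(additional_keep or [])

    missing = [name for name in keep if name not in columns]
    if missing:
        raise KeyError(f"missing columns: {missing}")

    # dict.fromkeys removes duplicates while preserving order; list() copies
    # each column so the input is left unmodified.
    return {name: list(columns[name]) for name in dict.fromkeys(keep)}


raw = {
    "year": [2010, 2011],
    "prevalence": [0.10, 0.12],
    "two_sigma": [0.02, 0.03],
    "weight": [1.0, 1.0],
    "notes": ["a", "b"],
}
prepared = prepare_columns(raw, "prevalence", weight_channel="weight",
                           additional_keep=["year"])
```

In this sketch the unused "notes" column is dropped and the input dict is left untouched, mirroring the documented "modified copy" return value.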

static construct_gaussian_channel(channel, type)

add_percentile_values(dfw, channel, p)

Adds a new data channel to a DataFrameWrapper object holding the value of a specified channel at a requested probability level. Useful for creating uncertainty envelopes in plots.

:param dfw: DataFrameWrapper with the data to construct percentiles from and to add percentiles to
:param channel: the column in dfw that percentiles will be constructed from/for
:param p: the percentile level to add for the given channel, expressed as a fraction between 0 and 1

Returns: a list containing the new channel name in dfw
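
As a rough illustration of what a Gaussian percentile value means, the sketch below computes the value at probability level p for a normal distribution with a given mean and two-sigma uncertainty. It assumes that the channel value is the mean and that the 'two_sigma' column holds twice the standard deviation; neither is confirmed by this page, and gaussian_percentile is a hypothetical helper, not part of the library.

```python
from statistics import NormalDist


def gaussian_percentile(mean, two_sigma, p):
    """Value below which a fraction p of a Gaussian's probability mass lies.

    Assumption: `two_sigma` is twice the standard deviation.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if two_sigma <= 0:
        raise ValueError("two_sigma must be positive")
    sigma = two_sigma / 2.0
    return NormalDist(mu=mean, sigma=sigma).inv_cdf(p)


# Build a 95% envelope (2.5th and 97.5th percentiles) for two data points.
means = [10.0, 12.0]
two_sigmas = [2.0, 4.0]
lower = [gaussian_percentile(m, s, 0.025) for m, s in zip(means, two_sigmas)]
upper = [gaussian_percentile(m, s, 0.975) for m, s in zip(means, two_sigmas)]
```

Calling the helper once with a low p and once with a high p gives the lower and upper bounds of an uncertainty envelope, which is the plotting use case mentioned above.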

compare(df, reference_channel, data_channel)

Returns a score between -708.3964 (worst) and 100 (best) for how well the simulation data column (data_channel) of the dataframe (df) matches the reference data column (reference_channel).

:param df: pandas DataFrame with columns of data to compare
:param reference_channel: reference data channel in the dataframe
:param data_channel: simulation data channel to compare to the reference data channel

Returns: a floating point score measuring the degree of data/reference fit, also known colloquially as ‘likelihood’
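
The lower bound of -708.3964 matches the natural logarithm of the smallest positive normalized double-precision number, which suggests that the score is a log-likelihood with a floor. That is an inference rather than something this page states. Under that assumption, a bounded Gaussian log-likelihood score could be sketched as follows; the function name, the averaging over rows, and the clipping are all illustrative choices, not the library's documented behavior.

```python
import math
import sys

SCORE_MIN = math.log(sys.float_info.min)  # approximately -708.3964
SCORE_MAX = 100.0                          # documented upper bound


def gaussian_score(reference, two_sigma, simulated):
    """Mean Gaussian log-density of simulated values around reference values.

    Assumptions: `two_sigma` is twice the standard deviation of the
    reference data, rows are averaged, and the result is clipped to
    [SCORE_MIN, SCORE_MAX].
    """
    if not reference or not (len(reference) == len(two_sigma) == len(simulated)):
        raise ValueError("inputs must be non-empty and of equal length")

    total = 0.0
    for ref, unc, sim in zip(reference, two_sigma, simulated):
        sigma = unc / 2.0
        if sigma <= 0:
            raise ValueError("two_sigma values must be positive")
        z = (sim - ref) / sigma
        # Log of the normal probability density at `sim`.
        total += -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))

    score = total / len(reference)
    return max(SCORE_MIN, min(SCORE_MAX, score))


good = gaussian_score([10.0], [2.0], [10.0])    # simulation equals reference
poor = gaussian_score([10.0], [2.0], [15.0])    # five standard deviations away
floor = gaussian_score([10.0], [2.0], [1000.0]) # clipped at the lower bound
```

A closer match yields a higher score, and a very poor match saturates at the floor instead of underflowing.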