hiv_workflow.lib.analysis.data_frame_wrapper module
Possible future work: add xlsx reading, using a defined format similar to the csv one.
Currently, all files read via .from_directory() are merged into ONE dataframe.
- class hiv_workflow.lib.analysis.data_frame_wrapper.DataFrameWrapper(filename=None, dataframe=None, stratifiers=None)
Bases: object
- CSV = 'csv'
- property channels
Channels are the non-stratifier columns.
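The relationship between stratifiers and channels can be sketched with plain pandas. This is only an illustration of the definition above, not the property's actual implementation; the column names and the `stratifiers` list are hypothetical.

```python
import pandas as pd

# Hypothetical data: two stratifier columns and two data channels.
df = pd.DataFrame({
    "year": [2020, 2021],
    "gender": ["F", "M"],
    "prevalence": [0.12, 0.10],
    "incidence": [0.010, 0.008],
})
stratifiers = ["year", "gender"]

# Channels are whatever columns are not stratifiers (kept in column order here).
channels = [col for col in df.columns if col not in stratifiers]
print(channels)
```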
- filter(conditions=None, keep_only=None)
Selects rows from the internal dataframe that satisfy all provided conditions. The stratifiers of the result exclude any current-object stratifiers that contain NaN in the resulting rows.
This method should rarely, if ever, be called with neither conditions nor keep_only specified.
The result is always the minimal row/column set satisfying the inputs, with no remaining NaN values in the dataframe.
- Parameters
conditions – an iterable (e.g. a list) of triplets specifying, in order, stratifier, operator, value, e.g. ('min_age', operator.ge, 25) to select rows where 'min_age' is >= 25.
keep_only – if not None, a list of data channels to keep (in addition to stratifiers) after filtering. Rows with any NaN values are dropped after trimming to these channels.
- Returns
an object of the same type as the object this method is called on, with only the selected rows remaining.
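The conditions format described above can be illustrated with a small pandas stand-in. This is a minimal sketch under assumptions, not the method's real code: `filter_rows` is a hypothetical helper, the data is invented, and the step that removes NaN-containing stratifiers from the result's stratifier list is not modelled.

```python
import operator

import pandas as pd


def filter_rows(df, stratifiers, conditions=None, keep_only=None):
    """Illustrative stand-in for the documented behaviour, not the real method."""
    result = df
    # Each condition is (stratifier, operator, value); every one must hold.
    for column, op, value in (conditions or []):
        result = result[op(result[column], value)]
    if keep_only is not None:
        # Keep stratifiers plus the requested channels, then drop rows with NaN.
        result = result[list(stratifiers) + list(keep_only)].dropna()
    return result.reset_index(drop=True)


# Hypothetical data with gaps in both channels.
df = pd.DataFrame({
    "min_age": [15, 25, 35, 45],
    "year": [2020, 2020, 2020, 2020],
    "prevalence": [0.05, 0.12, None, 0.09],
    "incidence": [0.01, None, 0.02, 0.01],
})

selected = filter_rows(
    df,
    stratifiers=["min_age", "year"],
    conditions=[("min_age", operator.ge, 25)],
    keep_only=["prevalence"],
)
print(selected)
```

Passing functions from the standard `operator` module keeps each condition a plain data triplet, so conditions can be built or combined programmatically before the call.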
- merge(other_dfw, index, keep_only=None)
Attempts to merge two DataFrameWrapper objects into one, using the provided index list as a multi-index.
- Parameters
other_dfw – the DataFrameWrapper object to merge with.
index – a list of columns to merge on. All must be present in both DataFrameWrapper objects.
keep_only – a list of columns. Result rows where NaN appears in any of these columns are removed. The result will contain these columns AND those from the provided index.
- Returns
A newly created, merged object of the exact type of self, with stratifiers equal to the provided index.
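The effect of `index` and `keep_only` can be sketched on bare pandas dataframes. The join type used internally is not stated in this documentation, so the outer join below is an assumption, and the data and column names are hypothetical.

```python
import pandas as pd

# Two hypothetical frames sharing the same stratifier columns.
left = pd.DataFrame({
    "year": [2020, 2021, 2022],
    "gender": ["F", "F", "F"],
    "prevalence": [0.12, 0.11, 0.10],
})
right = pd.DataFrame({
    "year": [2020, 2021, 2022],
    "gender": ["F", "F", "F"],
    "on_art": [1000.0, None, 1400.0],
})

index = ["year", "gender"]
keep_only = ["prevalence", "on_art"]

# Assumption: an outer join on the index columns.
merged = left.merge(right, on=index, how="outer")
# Keep the index columns plus keep_only, dropping rows with NaN in keep_only.
merged = merged[index + keep_only].dropna(subset=keep_only)
merged = merged.reset_index(drop=True)
print(merged)
```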
- verify_required_items(needed, available=None)
Standard method for checking whether necessary items/channels are available, printing a meaningful error if not.
- Parameters
needed – channels to look for.
available – channel list to look in.
- Returns
Nothing.
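The kind of check described here might look like the following. This is a hedged sketch: `missing_items` is a hypothetical helper, the message wording is invented, and the behaviour of the real method when `available` is None is not documented here and is not modelled.

```python
def missing_items(needed, available):
    """Return the needed items that are absent from available, in order."""
    available = set(available)
    return [item for item in needed if item not in available]


# Hypothetical channel lists.
available = ["year", "gender", "prevalence"]
missing = missing_items(["prevalence", "incidence"], available)
if missing:
    # Name both what is missing and what exists, so the error is actionable.
    print(f"Missing required items: {missing}; available: {available}")
```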
- equals(other_dfw)
- classmethod from_directory(directory, file_type=None, stratifiers=None)
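The module notes say that all files read via .from_directory() end up in ONE dataframe. How the files are combined is not documented here; the sketch below assumes simple row-wise concatenation of CSV files, and `read_directory` is a hypothetical stand-in rather than the classmethod itself.

```python
import tempfile
from pathlib import Path

import pandas as pd


def read_directory(directory, file_type="csv"):
    """Read every matching file in a directory into one combined dataframe."""
    paths = sorted(Path(directory).glob(f"*.{file_type}"))
    frames = [pd.read_csv(path) for path in paths]
    # Assumption: files share columns and are stacked row-wise.
    return pd.concat(frames, ignore_index=True)


# Build a throwaway directory with two small CSV files to read back.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.csv").write_text("year,prevalence\n2020,0.12\n")
    Path(tmp, "b.csv").write_text("year,prevalence\n2021,0.11\n")
    combined = read_directory(tmp)

print(combined)
```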