idmtools.analysis.csv_analyzer module
idmtools CSVAnalyzer.
Example of a CSV analyzer that concatenates the CSV results from your experiment’s simulations into a single CSV file.
Copyright 2021, Bill & Melinda Gates Foundation. All rights reserved.
- class idmtools.analysis.csv_analyzer.CSVAnalyzer(filenames, output_path='output_csv')
Bases: idmtools.entities.ianalyzer.IAnalyzer
Provides an analyzer for CSV output.
Examples
- Simple Example
This example covers the basic usage of the CSVAnalyzer.

    # Example CSVAnalyzer for any experiment
    # In this example, we will demonstrate how to use a CSVAnalyzer to analyze csv files for experiments

    # First, import some necessary system and idmtools packages.
    from logging import getLogger
    from idmtools.analysis.analyze_manager import AnalyzeManager
    from idmtools.analysis.csv_analyzer import CSVAnalyzer
    from idmtools.core import ItemType
    from idmtools.core.platform_factory import Platform

    if __name__ == '__main__':

        # Set the platform where you want to run your analysis
        # In this case we are running in BELEGOST since the Work Item we are analyzing was run on COMPS
        logger = getLogger()
        with Platform('BELEGOST') as platform:

            # Arg option for analyzer init are uid, working_dir, data in the method map (aka select_simulation_data),
            # and filenames
            # In this case, we want to provide a filename to analyze
            filenames = ['output/c.csv']
            # Initialize the analyser class with the path of the output csv file
            analyzers = [CSVAnalyzer(filenames=filenames, output_path="output_csv")]

            # Set the experiment id you want to analyze
            experiment_id = '1038ebdb-0904-eb11-a2c7-c4346bcb1553'  # comps exp id simple sim and csv example

            # Specify the id Type, in this case an Experiment on COMPS
            manager = AnalyzeManager(partial_analyze_ok=True, ids=[(experiment_id, ItemType.EXPERIMENT)],
                                     analyzers=analyzers)
            manager.analyze()
- Multiple CSVs
This example covers analyzing multiple CSVs.

    # Example CSVAnalyzer for any experiment with multiple csv outputs
    # In this example, we will demonstrate how to use a CSVAnalyzer to analyze csv files for experiments

    # First, import some necessary system and idmtools packages.
    from idmtools.analysis.analyze_manager import AnalyzeManager
    from idmtools.analysis.csv_analyzer import CSVAnalyzer
    from idmtools.core import ItemType
    from idmtools.core.platform_factory import Platform

    if __name__ == '__main__':

        # Set the platform where you want to run your analysis
        # In this case we are running in BELEGOST since the Work Item we are analyzing was run on COMPS
        platform = Platform('BELEGOST')

        # Arg option for analyzer init are uid, working_dir, data in the method map (aka select_simulation_data),
        # and filenames
        # In this case, we have multiple csv files to analyze
        filenames = ['output/a.csv', 'output/b.csv']
        # Initialize the analyser class with the path of the output csv file
        analyzers = [CSVAnalyzer(filenames=filenames, output_path="output_csv")]

        # Set the experiment id you want to analyze
        experiment_id = '1038ebdb-0904-eb11-a2c7-c4346bcb1553'  # comps exp id

        # Specify the id Type, in this case an Experiment on COMPS
        manager = AnalyzeManager(partial_analyze_ok=True, ids=[(experiment_id, ItemType.EXPERIMENT)],
                                 analyzers=analyzers)
        manager.analyze()
- __init__(filenames, output_path='output_csv')
Initialize our analyzer.
- Parameters:
filenames – List of CSV filenames to pull from each item (for example, ['output/a.csv', 'output/b.csv'])
output_path – Output directory in which to write the combined CSV
- initialize()
Initialize on run. Create an output directory.
- Returns:
None
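As a rough illustration of the directory-creation step, the following standard-library sketch creates an output folder idempotently. The helper name `ensure_output_dir` is hypothetical, and resolving `output_path` relative to a working directory is an assumption, not something this page states about CSVAnalyzer.

```python
import os
import tempfile


def ensure_output_dir(working_dir: str, output_path: str = "output_csv") -> str:
    """Create the output directory if it is missing and return its full path.

    Hypothetical helper: joining output_path onto working_dir is an assumption.
    """
    full_path = os.path.join(working_dir, output_path)
    # exist_ok=True makes repeated calls safe when the directory already exists
    os.makedirs(full_path, exist_ok=True)
    return full_path


# Use a throwaway directory so the sketch leaves nothing behind
with tempfile.TemporaryDirectory() as tmp:
    out_dir = ensure_output_dir(tmp)
    created = os.path.isdir(out_dir)
    # Calling it a second time must not raise and must give the same path
    same_dir = ensure_output_dir(tmp)
```

Using `exist_ok=True` keeps the step safe to repeat, which matters if several analyzers share one output location.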
- map(data: Dict[str, Any], simulation: Union[IWorkflowItem, Simulation]) → pandas.core.frame.DataFrame
Map each simulation/workitem data here.
The data is a mapping of filename -> content (in this case DataFrames, since the files are parsed CSVs).
- Parameters:
data – Data mapping of files -> content
simulation – Simulation/Workitem we are mapping
- Returns:
Items joined together into a dataframe.
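To make the shape of the input and the result concrete, here is a minimal pandas sketch of a map step, not the library's own code. It assumes the join is a row-wise concatenation; the function name `map_sketch` and the sample data are hypothetical, and the `simulation` argument is omitted because the sketch does not use it.

```python
from typing import Any, Dict

import pandas as pd


def map_sketch(data: Dict[str, Any]) -> pd.DataFrame:
    """Stack the parsed CSVs of one simulation into a single DataFrame.

    Assumption: the files are joined row-wise (axis=0) with a fresh index.
    """
    return pd.concat(list(data.values()), axis=0, ignore_index=True, sort=False)


# Hypothetical parsed content, keyed by filename as described above
data = {
    "output/a.csv": pd.DataFrame({"step": [0, 1], "value": [10, 11]}),
    "output/b.csv": pd.DataFrame({"step": [0, 1], "value": [20, 21]}),
}
combined = map_sketch(data)
```

With `ignore_index=True` the result gets a fresh 0..n-1 index, so rows from different files do not share index labels.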
- reduce(all_data: Dict[Union[IWorkflowItem, Simulation], pandas.core.frame.DataFrame])
Reduce (combine) all the data from our mapping.
- Parameters:
all_data – Mapping of our data in form Item(Simulation/Workitem) -> Mapped dataframe
- Returns:
None
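One way to picture a reduce step is sketched below with pandas. This is a sketch under stated assumptions, not the library's own code: the keys are plain string ids standing in for Simulation/Workitem objects, and the function name `reduce_sketch`, the index label `SimId`, and the output filename `combined.csv` are all hypothetical.

```python
import os
import tempfile
from typing import Dict

import pandas as pd


def reduce_sketch(all_data: Dict[str, pd.DataFrame], output_dir: str) -> str:
    """Combine the per-item DataFrames and write them to one CSV file."""
    # keys= adds an outer index level that records which item each row came from
    results = pd.concat(
        list(all_data.values()),
        axis=0,
        keys=list(all_data.keys()),
        names=["SimId"],
    )
    # Drop the inner per-frame row index so only the item id remains
    results.index = results.index.droplevel(1)
    out_file = os.path.join(output_dir, "combined.csv")  # hypothetical filename
    results.to_csv(out_file)
    return out_file


# Hypothetical mapped data: string ids stand in for Simulation objects
all_data = {
    "sim-1": pd.DataFrame({"step": [0, 1], "value": [10, 11]}),
    "sim-2": pd.DataFrame({"step": [0, 1], "value": [20, 21]}),
}

with tempfile.TemporaryDirectory() as tmp:
    out_file = reduce_sketch(all_data, tmp)
    # Read the file back while the temporary directory still exists
    written = pd.read_csv(out_file)
```

Keeping the item id as the index means each row of the written CSV can be traced back to the simulation that produced it.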