idmtools.analysis.csv_analyzer module

idmtools CSVAnalyzer.

Example of a CSV analyzer that concatenates the CSV results from your experiment’s simulations into a single CSV file.

Copyright 2021, Bill & Melinda Gates Foundation. All rights reserved.

class idmtools.analysis.csv_analyzer.CSVAnalyzer(filenames, output_path='output_csv')

Bases: idmtools.entities.ianalyzer.IAnalyzer

Provides an analyzer for CSV output.

Examples

Simple Example

This example covers the basic usage of the CSVAnalyzer.

# Example CSVAnalyzer for any experiment
# In this example, we will demonstrate how to use a CSVAnalyzer to analyze csv files for experiments

# First, import some necessary system and idmtools packages.
from logging import getLogger

from idmtools.analysis.analyze_manager import AnalyzeManager
from idmtools.analysis.csv_analyzer import CSVAnalyzer
from idmtools.core import ItemType
from idmtools.core.platform_factory import Platform

if __name__ == '__main__':

    # Set the platform where you want to run your analysis
    # In this case we use BELEGOST because the experiment we are analyzing was run on COMPS
    logger = getLogger()
    with Platform('BELEGOST') as platform:

        # CSVAnalyzer takes the filenames to pull from each simulation and an
        # output_path for the combined csv
        # In this case, we want to provide a single filename to analyze
        filenames = ['output/c.csv']
        # Initialize the analyzer with the filenames to read and the output path for the combined csv
        analyzers = [CSVAnalyzer(filenames=filenames, output_path="output_csv")]

        # Set the experiment id you want to analyze
        experiment_id = '1038ebdb-0904-eb11-a2c7-c4346bcb1553'  # comps exp id simple sim and csv example

        # Specify the id Type, in this case an Experiment on COMPS
        manager = AnalyzeManager(partial_analyze_ok=True, ids=[(experiment_id, ItemType.EXPERIMENT)],
                                 analyzers=analyzers)
        manager.analyze()

Multiple CSVs

This example covers analyzing multiple CSVs.

# Example CSVAnalyzer for any experiment with multiple csv outputs
# In this example, we will demonstrate how to use a CSVAnalyzer to analyze csv files for experiments

# First, import some necessary system and idmtools packages.
from idmtools.analysis.analyze_manager import AnalyzeManager
from idmtools.analysis.csv_analyzer import CSVAnalyzer
from idmtools.core import ItemType
from idmtools.core.platform_factory import Platform


if __name__ == '__main__':

    # Set the platform where you want to run your analysis
    # In this case we use BELEGOST because the experiment we are analyzing was run on COMPS
    platform = Platform('BELEGOST')

    # CSVAnalyzer takes the filenames to pull from each simulation and an
    # output_path for the combined csv
    # In this case, we have multiple csv files to analyze
    filenames = ['output/a.csv', 'output/b.csv']
    # Initialize the analyzer with the filenames to read and the output path for the combined csv
    analyzers = [CSVAnalyzer(filenames=filenames, output_path="output_csv")]

    # Set the experiment id you want to analyze
    experiment_id = '1038ebdb-0904-eb11-a2c7-c4346bcb1553'  # comps exp id

    # Specify the id Type, in this case an Experiment on COMPS
    manager = AnalyzeManager(partial_analyze_ok=True, ids=[(experiment_id, ItemType.EXPERIMENT)],
                             analyzers=analyzers)
    manager.analyze()

__init__(filenames, output_path='output_csv')

Initialize our analyzer.

Parameters:
  • filenames – Filenames we want to pull

  • output_path – Output path to write the csv

initialize()

Initialize on run. Create an output directory.

Returns:

None
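
A minimal sketch of the directory-creation step described above, using only the standard library. The helper name ensure_output_dir is hypothetical and is not part of idmtools; it only illustrates creating the output directory so that repeated runs do not fail.

```python
import os
import tempfile


def ensure_output_dir(output_path: str) -> str:
    """Create output_path if it is missing and return it (hypothetical helper)."""
    # exist_ok=True makes the call safe when the directory already exists
    os.makedirs(output_path, exist_ok=True)
    return output_path


# Usage: create the directory twice; the second call is a no-op
target = os.path.join(tempfile.mkdtemp(), "output_csv")
ensure_output_dir(target)
ensure_output_dir(target)
```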

map(data: Dict[str, Any], simulation: Union[IWorkflowItem, Simulation]) → pandas.core.frame.DataFrame

Map the data of each simulation/workitem.

The data is a mapping of files -> content (in this case, dataframes, since the CSV files are parsed).

Parameters:
  • data – Data mapping of files -> content

  • simulation – Simulation/Workitem we are mapping

Returns:

Items joined together into a dataframe.
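
The shape of the map step can be illustrated without idmtools or pandas. The sketch below is a plain-Python stand-in: lists of row dicts take the place of dataframes, and the names parse_csv and map_rows are hypothetical. It assumes the files are joined by stacking their rows, which the description above does not spell out.

```python
import csv
import io
from typing import Dict, List

Row = Dict[str, str]


def parse_csv(text: str) -> List[Row]:
    """Parse CSV text into a list of row dicts (stand-in for a parsed dataframe)."""
    return list(csv.DictReader(io.StringIO(text)))


def map_rows(data: Dict[str, List[Row]]) -> List[Row]:
    """Stack the rows of every file into one list, in file order (assumed join)."""
    combined: List[Row] = []
    for rows in data.values():
        combined.extend(rows)
    return combined


# Usage: two files from one simulation, keyed by filename as in the data mapping
data = {
    "output/a.csv": parse_csv("a,b\n1,2\n3,4\n"),
    "output/b.csv": parse_csv("a,b\n5,6\n"),
}
combined = map_rows(data)
```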

reduce(all_data: Dict[Union[IWorkflowItem, Simulation], pandas.core.frame.DataFrame])

Reduce (combine) all the data from our mapping.

Parameters:

all_data – Mapping of our data in the form Item (Simulation/Workitem) -> mapped dataframe

Returns:

None
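
A plain-Python sketch of the reduce idea: combine the mapped rows of every item into one CSV file. This is an illustration under assumptions, not the idmtools implementation. String ids stand in for Simulation/Workitem keys, lists of row dicts stand in for dataframes, and the function name reduce_rows and the sim_id column are hypothetical.

```python
import csv
import os
import tempfile
from typing import Dict, List

Row = Dict[str, str]


def reduce_rows(all_data: Dict[str, List[Row]], output_file: str) -> None:
    """Write every item's rows to one CSV, tagging each row with its item id."""
    # Collect column names in first-seen order so differing files still line up
    columns: List[str] = []
    for rows in all_data.values():
        for row in rows:
            for name in row:
                if name not in columns:
                    columns.append(name)

    with open(output_file, "w", newline="") as handle:
        # restval fills columns that a given row does not have
        writer = csv.DictWriter(handle, fieldnames=["sim_id"] + columns, restval="")
        writer.writeheader()
        for sim_id, rows in all_data.items():
            for row in rows:
                writer.writerow({"sim_id": sim_id, **row})


# Usage: two items with slightly different columns
out = os.path.join(tempfile.mkdtemp(), "combined.csv")
reduce_rows({"sim-1": [{"a": "1"}], "sim-2": [{"a": "2", "b": "3"}]}, out)
```

Like reduce(), the sketch returns None; its only result is the file written to disk.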