Data Analysis
idmtools provides a map-reduce framework for analyzing simulation outputs after your experiment has finished running.
Overview
The analysis pipeline consists of three components:
| Component | Description |
|---|---|
| IAnalyzer | Base class you implement to define how to process and aggregate simulation outputs |
| AnalyzeManager | Runs analyzers locally against one or more experiments, suites, or simulations |
| PlatformAnalysis | Runs analyzers remotely as an SSMT work item on COMPS — keeps data on the cluster and avoids large transfers |
How It Works
Analysis follows a map-reduce pattern:
```text
Simulations ──► map() ──► per-simulation result
                                   │
                                   ▼
                              reduce() ──► aggregate output (CSV, plots, etc.)
```
- `map`: called once per simulation; receives that simulation's output files and returns any Python object
- `reduce`: called once after all simulations are mapped; receives `{simulation: map_result}` and produces the final output
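The flow above can be sketched in plain Python, without idmtools. This only illustrates the pattern: `FakeSimulation`, `map_one`, `reduce_all`, `run_map_reduce`, and the sample data are hypothetical stand-ins, not part of the idmtools API, and the sketch does not retrieve any files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeSimulation:
    """Hypothetical stand-in for a simulation; not an idmtools class."""
    id: str


def map_one(data, simulation):
    # Called once per simulation: condense that simulation's files to one value.
    return sum(data["output/result.json"]["cases"])


def reduce_all(all_data):
    # Called once with {simulation: map_result}: build the aggregate output.
    return {sim.id: total for sim, total in all_data.items()}


def run_map_reduce(outputs):
    # outputs: {simulation: {filename: parsed file content}}
    mapped = {sim: map_one(data, sim) for sim, data in outputs.items()}
    return reduce_all(mapped)


# Made-up sample data for two simulations.
outputs = {
    FakeSimulation("sim-a"): {"output/result.json": {"cases": [1, 2, 3]}},
    FakeSimulation("sim-b"): {"output/result.json": {"cases": [10, 20]}},
}
summary = run_map_reduce(outputs)
print(summary)
```

The map step sees one simulation at a time, so it can run independently for each one; only the reduce step needs every result at once.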
Quick Example
```python
from idmtools.analysis.analyze_manager import AnalyzeManager
from idmtools.core import ItemType
from idmtools.core.platform_factory import Platform
from idmtools.entities import IAnalyzer


class MyAnalyzer(IAnalyzer):
    def __init__(self):
        # Files to retrieve from each simulation before map() is called.
        super().__init__(filenames=["output/result.json"])

    def map(self, data, simulation):
        # data maps each requested filename to that file's content.
        return data[self.filenames[0]]

    def reduce(self, all_data):
        # all_data maps each simulation to the value returned by map().
        for sim, result in all_data.items():
            print(sim.id, result)


with Platform('CALCULON') as platform:
    manager = AnalyzeManager(
        ids=[('your-experiment-id', ItemType.EXPERIMENT)],
        analyzers=[MyAnalyzer()]
    )
    manager.analyze()
```
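A `reduce` step can also write its aggregate to a CSV file instead of printing. The following is a minimal sketch of that step using only the standard library. The helpers `rows_from_results` and `write_csv` and the `Sim` stand-in are hypothetical, and it assumes each map result is a flat dict of metrics, which may not match your output files.

```python
import csv
import io
from collections import namedtuple

# Hypothetical stand-in for a simulation object; only `.id` is used here.
Sim = namedtuple("Sim", ["id"])


def rows_from_results(all_data):
    # Flatten {simulation: {metric: value}} into one row per simulation.
    rows = [{"sim_id": sim.id, **result} for sim, result in all_data.items()]
    return sorted(rows, key=lambda row: row["sim_id"])


def write_csv(rows, handle):
    # Put sim_id first, then every metric seen in any row, alphabetically.
    metrics = sorted({key for row in rows for key in row} - {"sim_id"})
    writer = csv.DictWriter(
        handle,
        fieldnames=["sim_id", *metrics],
        restval="",            # leave a blank cell when a metric is missing
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)


# Made-up map results for two simulations.
all_data = {
    Sim("sim-b"): {"cases": 30},
    Sim("sim-a"): {"cases": 6, "deaths": 1},
}
buffer = io.StringIO()
write_csv(rows_from_results(all_data), buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Inside an analyzer, the same helpers could be called from `reduce(self, all_data)` with a real file opened via `open(path, "w", newline="")` in place of the in-memory buffer.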
Choosing Between AnalyzeManager and PlatformAnalysis
| | AnalyzeManager | PlatformAnalysis |
|---|---|---|
| Where it runs | Your local machine | Remote SSMT worker on COMPS |
| Data transfer | Downloads output files locally | Files stay on the cluster |
| Best for | Development, small datasets | Large datasets, production workflows |
| Platform required | Any idmtools platform | COMPS only |
In This Section
- Analyzers (IAnalyzer) — Define custom analysis logic and use built-in analyzers
- AnalyzeManager — Run analysis locally with full parameter reference
- PlatformAnalysis — Run analysis remotely on COMPS via SSMT