idmtools.analysis.analyze_manager module
idmtools Analyzer manager.
AnalyzeManager is the “driver” of analysis. Analysis is mostly a map-reduce operation.
Copyright 2021, Bill & Melinda Gates Foundation. All rights reserved.
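The map-reduce shape mentioned above can be illustrated with a minimal, self-contained sketch. It does not use idmtools: `MeanAnalyzer`, `run_map_reduce`, and `ITEMS` are hypothetical names invented for illustration, and the `map`/`reduce` method signatures are an assumption rather than the library's interface. The idea is that each item is mapped independently (and so can run in a worker pool), and the per-item results are then reduced into one value.

```python
from concurrent.futures import ThreadPoolExecutor


class MeanAnalyzer:
    """Toy analyzer (hypothetical; not the idmtools IAnalyzer interface)."""

    def map(self, item_id, data):
        # Map step: reduce one item's raw data to a per-item summary.
        return sum(data) / len(data)

    def reduce(self, mapped):
        # Reduce step: combine the per-item summaries into one result.
        return sum(mapped.values()) / len(mapped)


def run_map_reduce(analyzer, items, max_workers=2):
    """Run the map step for every item in a pool, then reduce once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            item_id: pool.submit(analyzer.map, item_id, data)
            for item_id, data in items.items()
        }
        mapped = {item_id: fut.result() for item_id, fut in futures.items()}
    return analyzer.reduce(mapped)


# Made-up input standing in for per-simulation output data.
ITEMS = {"sim-1": [1.0, 3.0], "sim-2": [4.0, 6.0]}
result = run_map_reduce(MeanAnalyzer(), ITEMS)
```

Because the map calls share no state, they can run in any order; only the reduce step needs all results at once.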
- idmtools.analysis.analyze_manager.pool_worker_initializer(func, analyzers, platform: IPlatform) -> NoReturn
Initialize the pool worker, which allows the process pool to associate the analyzers, cache, and path mapping to the function executed to retrieve data.
Using an initializer improves performance.
- Parameters:
func – The function that the pool will call.
analyzers – The list of all analyzers to run.
platform – The platform to communicate with to retrieve files from.
- Returns:
None
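The initializer pattern described here can be sketched with the standard library alone. This is an assumption-laden illustration, not the idmtools implementation: `attach_context`, `map_item`, and `run_demo` are hypothetical names, and a thread pool is used to keep the sketch easy to run (`ProcessPoolExecutor` accepts the same `initializer` and `initargs` arguments). The point is that shared context is attached to the worker function once per worker, instead of being passed with every task.

```python
from concurrent.futures import ThreadPoolExecutor


def attach_context(func, analyzers, platform):
    """Hypothetical initializer: runs once per worker before any task."""
    func.analyzers = analyzers
    func.platform = platform


def map_item(item_id):
    """Worker function: reads the context attached by the initializer."""
    return {name: f"{map_item.platform}:{item_id}" for name in map_item.analyzers}


def run_demo():
    # initargs are handed to the initializer in every worker of the pool.
    with ThreadPoolExecutor(
        max_workers=2,
        initializer=attach_context,
        initargs=(map_item, ["a1", "a2"], "demo-platform"),
    ) as pool:
        return list(pool.map(map_item, ["sim-1", "sim-2"]))


results = run_demo()
```

With a process pool, this avoids re-sending the analyzers and platform with each submitted task, which is one plausible reading of the performance note above.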
- class idmtools.analysis.analyze_manager.AnalyzeManager(platform: IPlatform = None, configuration: dict = None, ids: List[Tuple[str, ItemType]] = None, analyzers: List[IAnalyzer] = None, working_dir: str = None, partial_analyze_ok: bool = False, max_items: int | None = None, verbose: bool = True, force_manager_working_directory: bool = False, exclude_ids: List[str] = None, analyze_failed_items: bool = False, max_workers: int | None = None, executor_type: str = 'process')
Bases:
object
AnalyzeManager class. This is the main driver of analysis.
- ANALYZE_TIMEOUT = 28800
- WAIT_TIME = 1.15
- EXCEPTION_KEY = '__EXCEPTION__'
- exception TimeOutException
Bases:
Exception
TimeOutException is raised when the analysis times out.
- exception ItemsNotReady
Bases:
Exception
ItemsNotReady is raised when items to be analyzed are still running.
Notes
TODO - Add doc_link
- __init__(platform: IPlatform = None, configuration: dict = None, ids: List[Tuple[str, ItemType]] = None, analyzers: List[IAnalyzer] = None, working_dir: str = None, partial_analyze_ok: bool = False, max_items: int | None = None, verbose: bool = True, force_manager_working_directory: bool = False, exclude_ids: List[str] = None, analyze_failed_items: bool = False, max_workers: int | None = None, executor_type: str = 'process')
Initialize the AnalyzeManager.
- Parameters:
platform (IPlatform, optional) – The platform to retrieve items and files from. Defaults to None.
configuration (dict, optional) – Initial configuration. Defaults to None.
ids (List[Tuple[str, ItemType]], optional) – List of items to analyze, each given as an (id, ItemType) tuple. Defaults to None.
analyzers (List[IAnalyzer], optional) – List of Analyzers. Defaults to None.
working_dir (str, optional) – The working directory. Defaults to os.getcwd().
partial_analyze_ok (bool, optional) – Whether partial analysis is acceptable. When True, experiments that are still in progress or have failed can be analyzed. Defaults to False.
max_items (int, optional) – Maximum number of items to analyze. Useful when developing and testing an analyzer. Defaults to None.
verbose (bool, optional) – Print extra information about analysis. Defaults to True.
force_manager_working_directory (bool, optional) – [description]. Defaults to False.
exclude_ids (List[str], optional) – IDs of items to exclude from analysis. Defaults to None.
analyze_failed_items (bool, optional) – Allows analyzing of failed items. Useful when you are trying to aggregate items that have failed. Defaults to False.
max_workers (int, optional) – The maximum number of workers. If not provided, falls back to the configuration item max_threads. If neither is set, defaults to the CPU count.
executor_type (str, optional) – Whether to use process or thread pooling. Process pooling is more efficient, but threading might be required in some environments. Defaults to 'process'.
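The fallback order described for max_workers can be sketched as a small standalone function. This is only an illustration under stated assumptions: `resolve_max_workers` is a hypothetical helper, and the configuration is modeled as a flat dict with a `max_threads` key, which may not match how the library actually reads its configuration.

```python
import os


def resolve_max_workers(max_workers=None, configuration=None):
    """Hypothetical helper: explicit value, then config, then CPU count."""
    if max_workers is not None:
        return max_workers
    # Assumption: the configuration exposes max_threads as a plain key.
    configured = (configuration or {}).get("max_threads")
    if configured is not None:
        return int(configured)
    # os.cpu_count() may return None on some systems, so fall back to 1.
    return os.cpu_count() or 1


explicit = resolve_max_workers(4)
from_config = resolve_max_workers(None, {"max_threads": "8"})
from_cpu = resolve_max_workers()
```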
- add_item(item: IEntity) -> NoReturn
Add an additional item for analysis.
- Parameters:
item – The new item to add for analysis.
- Returns:
None
- add_analyzer(analyzer: IAnalyzer) -> NoReturn
Add another analyzer to use on the items to be analyzed.
- Parameters:
analyzer – An analyzer object (IAnalyzer).
- Returns:
None
- analyze() -> bool
Process the provided items with the provided analyzers. This is the main driver method of AnalyzeManager.
- Parameters:
kwargs – extra parameters
- Returns:
True on success; False on failure/exception.
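A typical call sequence, using only the constructor arguments and methods documented on this page, might look like the sketch below. `run_analysis` is a hypothetical wrapper; the `platform`, `experiment_id`, and `analyzers` arguments are placeholders the caller must supply. It also assumes that `ItemType` is importable from `idmtools.core` and has an `EXPERIMENT` member, which this page does not state. The imports are deferred so the function can be defined without idmtools installed.

```python
def run_analysis(platform, experiment_id, analyzers, extra_analyzer=None):
    """Hypothetical wrapper: build an AnalyzeManager and run it."""
    # Deferred imports; the ItemType location is an assumption.
    from idmtools.analysis.analyze_manager import AnalyzeManager
    from idmtools.core import ItemType

    manager = AnalyzeManager(
        platform=platform,
        ids=[(experiment_id, ItemType.EXPERIMENT)],
        analyzers=analyzers,
        max_items=10,            # limit items while developing an analyzer
        executor_type="thread",  # 'process' is the documented default
    )
    if extra_analyzer is not None:
        manager.add_analyzer(extra_analyzer)

    # analyze() is documented to return True on success, False otherwise.
    return manager.analyze()
```

Checking the boolean return value lets a calling script exit with a non-zero status when analysis fails.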