idmtools.analysis.analyze_manager module

idmtools Analyzer manager.

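AnalyzeManager is the “driver” of analysis. Analysis is mostly a map-reduce operation: each analyzer maps over the individual items, and the per-item results are then reduced into a final output.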
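The map-reduce shape can be illustrated without idmtools. The sketch below is a simplified, single-process stand-in, not the library's implementation; `CountLinesAnalyzer`, `run_analysis`, and the in-memory `items` dictionary are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class CountLinesAnalyzer:
    """Hypothetical analyzer: map extracts a value per item, reduce aggregates."""

    def map(self, item_id, files):
        # Per-item step: count the lines in one output file.
        return len(files["output.txt"].splitlines())

    def reduce(self, mapped):
        # Aggregate step: total line count over all items.
        return sum(mapped.values())


def run_analysis(items, analyzers):
    """Apply every analyzer's map to every item, then reduce once per analyzer."""
    mapped = defaultdict(dict)
    for item_id, files in items.items():
        for analyzer in analyzers:
            mapped[analyzer][item_id] = analyzer.map(item_id, files)
    return [analyzer.reduce(mapped[analyzer]) for analyzer in analyzers]


# Two fake "simulations", each with one in-memory output file.
items = {
    "sim-1": {"output.txt": "a\nb\nc"},
    "sim-2": {"output.txt": "d\ne"},
}
totals = run_analysis(items, [CountLinesAnalyzer()])
```

In a real run the map step is the part that is distributed over a worker pool, while reduce runs once per analyzer after all map results are collected.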

idmtools.analysis.analyze_manager.pool_worker_initializer(func, analyzers, platform: IPlatform) → NoReturn

Initialize the pool worker, which allows the process pool to associate the analyzers, cache, and path mapping to the function executed to retrieve data.

Using an initializer improves performance: the shared state is set up once per worker instead of being sent along with every task.

Parameters:

func – The function that the pool will call.
analyzers – The list of all analyzers to run.
platform – The platform to communicate with to retrieve files from.

Returns:

None

class idmtools.analysis.analyze_manager.AnalyzeManager(platform: IPlatform = None, configuration: dict = None, ids: List[Tuple[str, ItemType]] = None, analyzers: List[IAnalyzer] = None, working_dir: str = None, partial_analyze_ok: bool = False, max_items: int | None = None, verbose: bool = True, force_manager_working_directory: bool = False, exclude_ids: List[str] = None, analyze_failed_items: bool = False, max_workers: int | None = None, executor_type: str = 'process')

Bases: object

Analyzer Manager Class. This is the main driver of analysis.

ANALYZE_TIMEOUT = 28800

WAIT_TIME = 1.15

EXCEPTION_KEY = '__EXCEPTION__'

exception TimeOutException

Bases: Exception

TimeOutException is raised when the analysis times out.
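As an illustration of how a timeout of this kind can be enforced, the sketch below polls a set of futures until a deadline passes. It assumes that ANALYZE_TIMEOUT (28800 seconds, i.e. 8 hours) is an overall deadline and WAIT_TIME a polling interval; this page does not confirm how the constants are used, and `wait_for_all` and the local `TimeOutException` class are hypothetical stand-ins.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class TimeOutException(Exception):
    """Local stand-in for AnalyzeManager.TimeOutException."""


def wait_for_all(futures, timeout, wait_time):
    """Poll until every future is done, or raise once `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while not all(f.done() for f in futures):
        if time.monotonic() >= deadline:
            raise TimeOutException(f"analysis not finished after {timeout} s")
        time.sleep(wait_time)
    return [f.result() for f in futures]


# Small values keep the example fast; the class constants are far larger.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(pow, 2, n) for n in range(4)]
    results = wait_for_all(futures, timeout=5.0, wait_time=0.01)
```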

exception ItemsNotReady

Bases: Exception

ItemsNotReady is raised when items to be analyzed are still running.

Notes

TODO - Add doc_link

__init__(platform: IPlatform = None, configuration: dict = None, ids: List[Tuple[str, ItemType]] = None, analyzers: List[IAnalyzer] = None, working_dir: str = None, partial_analyze_ok: bool = False, max_items: int | None = None, verbose: bool = True, force_manager_working_directory: bool = False, exclude_ids: List[str] = None, analyze_failed_items: bool = False, max_workers: int | None = None, executor_type: str = 'process')

Initialize the AnalyzeManager.

Parameters:

platform (IPlatform) – Platform
configuration (dict, optional) – Initial Configuration. Defaults to None.
ids (List[Tuple[str, ItemType]], optional) – List of items to analyze, each given as an (id, ItemType) tuple. Defaults to None.
analyzers (List[IAnalyzer], optional) – List of Analyzers. Defaults to None.
working_dir (str, optional) – The working directory. Defaults to os.getcwd().
partial_analyze_ok (bool, optional) – Whether partial analysis is ok. When this is True, Experiments in progress or Failed can be analyzed. Defaults to False.
max_items (int, optional) – Max Items to analyze. Useful when developing and testing an Analyzer. Defaults to None.
verbose (bool, optional) – Print extra information about analysis. Defaults to True.
force_manager_working_directory (bool, optional) – [description]. Defaults to False.
exclude_ids (List[str], optional) – IDs of items to exclude from analysis. Defaults to None.
analyze_failed_items (bool, optional) – Allows analyzing of failed items. Useful when you are trying to aggregate items that have failed. Defaults to False.
max_workers (int, optional) – Maximum number of workers. If not provided, falls back to the configuration item max_threads. If neither is set, defaults to the CPU count.
executor_type (str, optional) – Whether to use process or thread pooling. Process pooling is more efficient, but threading might be required in some environments. Defaults to 'process'.
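The fallback order for max_workers and the choice made by executor_type can be sketched as follows. This is an illustration of the documented behavior, not the library's code: `resolve_max_workers` and `executor_class` are hypothetical helpers, the configuration key `max_threads` is taken from the description above, and the string `'thread'` is an assumed counterpart to the documented default `'process'`.

```python
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def resolve_max_workers(max_workers=None, configuration=None, config_key="max_threads"):
    """Explicit argument first, then the configuration item, then the CPU count."""
    if max_workers is not None:
        return max_workers
    configured = (configuration or {}).get(config_key)
    if configured is not None:
        return int(configured)
    # os.cpu_count() can return None on unusual systems; fall back to 1.
    return os.cpu_count() or 1


def executor_class(executor_type="process"):
    """Map the executor_type string to a concurrent.futures pool class."""
    pools = {"process": ProcessPoolExecutor, "thread": ThreadPoolExecutor}
    try:
        return pools[executor_type]
    except KeyError:
        raise ValueError(f"executor_type must be one of {sorted(pools)}") from None


workers = resolve_max_workers(None, {"max_threads": 4})
pool_cls = executor_class("thread")
```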

add_item(item: IEntity) → NoReturn

Add an additional item for analysis.

Parameters: item – The new item to add for analysis.
Returns: None

add_analyzer(analyzer: IAnalyzer) → NoReturn

Add another analyzer to use on the items to be analyzed.

Parameters: analyzer – An analyzer object (IAnalyzer).
Returns: None

analyze() → bool

Process the provided items with the provided analyzers. This is the main driver method of AnalyzeManager.

Parameters: kwargs – extra parameters
Returns: True on success; False on failure/exception.
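One way a driver can turn per-item outcomes into such a boolean is sketched below. Using EXCEPTION_KEY as a marker inside per-item result dictionaries is an assumption based on the constant's name and is not confirmed by this page; `safe_map` and the standalone `analyze` function are hypothetical and unrelated to the method's real body.

```python
EXCEPTION_KEY = "__EXCEPTION__"  # same value as the class constant


def safe_map(func, item):
    """Run func on one item; on error, return a dict carrying the exception text."""
    try:
        return {"result": func(item)}
    except Exception as exc:
        return {EXCEPTION_KEY: repr(exc)}


def analyze(func, items):
    """Return True only if no item's outcome carries EXCEPTION_KEY."""
    outcomes = {item: safe_map(func, item) for item in items}
    failed = [item for item, out in outcomes.items() if EXCEPTION_KEY in out]
    for item in failed:
        print(f"item {item!r} failed: {outcomes[item][EXCEPTION_KEY]}")
    return not failed


ok = analyze(lambda x: 10 / x, [1, 2, 5])   # every item succeeds
bad = analyze(lambda x: 10 / x, [1, 0])     # division by zero on one item
```

Capturing the exception as data rather than letting it propagate lets the remaining items finish and keeps the failure report in one place.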