rsvsim.misc module
Miscellaneous functions that do not belong anywhere else
-
rsvsim.misc.date(obj, *args, start_date=None, readformat=None, outformat=None, as_date=True, **kwargs)
Convert any reasonable object – a string, integer, or datetime object, or list/array of any of those – to a date object. To convert an integer to a date, you must supply a start date.
Caution: while this function and readdate() are similar, and indeed this function calls readdate() if the input is a string, in this function an integer is treated as a number of days from start_date, while for readdate() it is treated as a timestamp in seconds. To change
- Parameters
obj (str, int, date, datetime, list, array) – the object to convert
args (str, int, date, datetime) – additional objects to convert
start_date (str, date, datetime) – the starting date, if an integer is supplied
readformat (str/list) – the format to read the date in; passed to sc.readdate()
outformat (str) – the format to output the date in, if returning a string
as_date (bool) – whether to return as a datetime date instead of a string
- Returns
either a single date object, or a list of them (matching input data type where possible)
- Return type
dates (date or list)
Examples:
sc.date('2020-04-05') # Returns datetime.date(2020, 4, 5)
sc.date([35,36,37], start_date='2020-01-01', as_date=False) # Returns ['2020-02-05', '2020-02-06', '2020-02-07']
sc.date(1923288822, readformat='posix') # Interpret as a POSIX timestamp
New in version 1.0.0. New in version 1.2.2: “readformat” argument; renamed “dateformat” to “outformat”
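The caution above (an integer is a day offset here, but a timestamp in seconds for readdate()) can be illustrated with the standard library alone. This is a minimal sketch, not the library's implementation; the helper names days_to_date and posix_to_date are hypothetical.

```python
import datetime as dt

def days_to_date(n_days, start_date):
    """Hypothetical helper: treat an integer as days after start_date (ISO string)."""
    start = dt.date.fromisoformat(start_date)
    return start + dt.timedelta(days=n_days)

def posix_to_date(timestamp):
    """Hypothetical helper: treat an integer as seconds since the Unix epoch (UTC)."""
    return dt.datetime.fromtimestamp(timestamp, tz=dt.timezone.utc).date()

# The same kind of input gives very different dates under the two readings
print(days_to_date(35, '2020-01-01'))  # 35 days after the start date
print(posix_to_date(35 * 86400))       # 35 days' worth of seconds after 1970-01-01
```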
-
rsvsim.misc.day(obj, *args, start_date=None, **kwargs)
Convert a string, date/datetime object, or int to a day (int), the number of days since the start day. See also sc.date() and sc.daydiff(). If a start day is not supplied, it returns the number of days into the current year.
- Parameters
obj (str, date, int, list, array) – convert any of these objects to a day relative to the start day
args (list) – additional days
start_date (str or date) – the start day; if none is supplied, return days since (supplied year)-01-01.
- Returns
the day(s) in simulation time (matching input data type where possible)
- Return type
days (int or list)
Examples:
sc.day(sc.now()) # Returns how many days into the year we are
sc.day(['2021-01-21', '2024-04-04'], start_date='2022-02-22') # Days can be positive or negative
New in version 1.0.0. New in version 1.2.2: renamed “start_day” to “start_date”
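The day calculation described above amounts to a date subtraction. The following is a rough standard-library sketch under that assumption; day_index is a hypothetical name and handles only ISO-format strings, unlike the function documented here.

```python
import datetime as dt

def day_index(date_str, start_date=None):
    """Hypothetical helper: days from start_date to date_str (both ISO strings).

    With no start_date, count from January 1 of the date's own year.
    """
    d = dt.date.fromisoformat(date_str)
    if start_date is None:
        start = dt.date(d.year, 1, 1)
    else:
        start = dt.date.fromisoformat(start_date)
    return (d - start).days

print(day_index('2021-01-21', start_date='2022-02-22'))  # -397 (before the start date)
print(day_index('2024-04-04', start_date='2022-02-22'))  # 772
print(day_index('2020-03-01'))                           # 60 (days into 2020)
```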
-
rsvsim.misc.daydiff(*args)
Convenience function to find the difference between two or more days. With only one argument, calculate days since 2020-01-01.
Examples:
diff = sc.daydiff('2020-03-20', '2020-04-05') # Returns 16
diffs = sc.daydiff('2020-03-20', '2020-04-05', '2020-05-01') # Returns [16, 26]
New in version 1.0.0.
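As the examples show, with more than two dates the differences are taken between consecutive dates rather than from the first date. A minimal sketch of that behavior using only the standard library follows; consecutive_day_diffs is a hypothetical name, and the single-argument case is deliberately left out.

```python
import datetime as dt

def consecutive_day_diffs(*dates):
    """Hypothetical helper: day differences between consecutive ISO date strings."""
    if len(dates) < 2:
        raise ValueError('This sketch needs at least two dates')
    parsed = [dt.date.fromisoformat(d) for d in dates]
    diffs = [(later - earlier).days for earlier, later in zip(parsed[:-1], parsed[1:])]
    # Two dates give a single number; more give a list
    return diffs[0] if len(diffs) == 1 else diffs

print(consecutive_day_diffs('2020-03-20', '2020-04-05'))                # 16
print(consecutive_day_diffs('2020-03-20', '2020-04-05', '2020-05-01'))  # [16, 26]
```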
-
rsvsim.misc.date_range(start_date, end_date, inclusive=True, as_date=False, dateformat=None)
Return a list of dates from the start date to the end date. To convert a list of days (as integers) to dates, use sc.date() instead.
- Parameters
start_date (int/str/date) – the starting date, in any format
end_date (int/str/date) – the end date, in any format
inclusive (bool) – if True (default), return to end_date inclusive; otherwise, stop the day before
as_date (bool) – if True, return a list of datetime.date objects instead of strings
dateformat (str) – passed to date()
Example:
dates = sc.daterange('2020-03-01', '2020-04-04')
New in version 1.0.0.
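The effect of the inclusive and as_date arguments can be sketched with the standard library. This is an illustration only; iso_date_range is a hypothetical name that accepts ISO-format strings rather than "any format".

```python
import datetime as dt

def iso_date_range(start_date, end_date, inclusive=True, as_date=False):
    """Hypothetical helper: list every date from start_date to end_date."""
    start = dt.date.fromisoformat(start_date)
    end = dt.date.fromisoformat(end_date)
    n_days = (end - start).days + (1 if inclusive else 0)
    dates = [start + dt.timedelta(days=i) for i in range(n_days)]
    return dates if as_date else [d.isoformat() for d in dates]

dates = iso_date_range('2020-03-01', '2020-04-04')
print(len(dates), dates[0], dates[-1])  # 35 2020-03-01 2020-04-04
```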
-
rsvsim.misc.load_data(datafile, columns=None, calculate=True, check_date=True, verbose=True, start_day=None, **kwargs)
Load data for comparing to the model output, either from file or from a dataframe.
- Parameters
datafile (str or df) – if a string, the name of the file to load (either Excel or CSV); if a dataframe, use directly
columns (list) – list of column names (otherwise, load all)
calculate (bool) – whether to calculate cumulative values from daily counts
check_date (bool) – whether to check that a ‘date’ column is present
start_day (date) – if the ‘date’ column is provided as integer number of days, consider them relative to this
kwargs (dict) – passed to pd.read_excel()
- Returns
pandas dataframe of the loaded data
- Return type
data (dataframe)
-
rsvsim.misc.load(*args, do_migrate=True, update=True, verbose=True, **kwargs)
Convenience method for sc.loadobj() and equivalent to cv.Sim.load() or cv.Scenarios.load().
- Parameters
filename (str) – file to load
do_migrate (bool) – whether to migrate if loading an old object
update (bool) – whether to modify the object to reflect the new version
verbose (bool) – whether to print migration information
args (list) – passed to sc.loadobj()
kwargs (dict) – passed to sc.loadobj()
- Returns
Loaded object
Examples:
sim = cv.load('calib.sim') # Equivalent to cv.Sim.load('calib.sim')
scens = cv.load(filename='school-closures.scens', folder='schools')
-
rsvsim.misc.save(*args, **kwargs)
Convenience method for sc.saveobj() and equivalent to cv.Sim.save() or cv.Scenarios.save().
- Parameters
filename (str) – file to save to
obj (object) – object to save
args (list) – passed to sc.saveobj()
kwargs (dict) – passed to sc.saveobj()
- Returns
Filename the object is saved to
Examples:
cv.save('calib.sim', sim) # Equivalent to sim.save('calib.sim')
cv.save(filename='school-closures.scens', folder='schools', obj=scens)
-
rsvsim.misc.savefig(filename=None, comments=None, **kwargs)
Wrapper for Matplotlib's savefig() function which automatically stores rsvsim metadata in the figure. By default, saves (git) information from both the rsvsim version and the calling function. Additional comments can be added to the saved file as well. These can be retrieved via cv.get_png_metadata(). Metadata can also be stored for SVG and PDF formats, but cannot be automatically retrieved.
- Parameters
filename (str) – name of the file to save to (default, timestamp)
comments (str) – additional metadata to save to the figure
kwargs (dict) – passed to savefig()
Example:
cv.Sim().run(do_plot=True)
filename = cv.savefig()
-
rsvsim.misc.migrate(obj, update=True, verbose=True, die=False)
Define migrations allowing compatibility between different versions of saved files. Usually invoked automatically upon load, but can be called directly by the user to load custom objects, e.g. lists of sims.
Currently supported objects are sims, multisims, scenarios, and people.
- Parameters
obj (any) – the object to migrate
update (bool) – whether to update version information to current version after successful migration
verbose (bool) – whether to print warnings if something goes wrong
die (bool) – whether to raise an exception if something goes wrong
- Returns
The migrated object
Example:
sims = cv.load('my-list-of-sims.obj')
sims = [cv.migrate(sim) for sim in sims]
-
rsvsim.misc.git_info(filename=None, check=False, comments=None, old_info=None, die=False, indent=2, verbose=True, frame=2, **kwargs)
Get current git information and optionally write it to disk. Simplest usage is cv.git_info(__file__)
- Parameters
filename (str) – name of the file to write to or read from
check (bool) – whether or not to compare two git versions
comments (dict) – additional comments to include in the file
old_info (dict) – dictionary of information to check against
die (bool) – whether or not to raise an exception if the check fails
indent (int) – how many indents to use when writing the file to disk
verbose (bool) – detail to print
frame (int) – how many frames back to look for caller info
kwargs (dict) – passed to sc.loadjson() (if check=True) or sc.savejson() (if check=False)
Examples:
cv.git_info() # Return information
cv.git_info(__file__) # Writes to disk
cv.git_info('rsvsim_version.gitinfo') # Writes to disk
cv.git_info('rsvsim_version.gitinfo', check=True) # Checks that current version matches saved file
-
rsvsim.misc.check_version(expected, die=False, verbose=True)
Check the current rsvsim version against the expected version. The expected version string may optionally start with '>=' or '<=' (== is implied otherwise), but other operators (e.g. ~=) are not supported. Note that e.g. '>' is interpreted to mean '>='.
- Parameters
expected (str) – expected version information
die (bool) – whether or not to raise an exception if the check fails
Example:
cv.check_version('>=1.7.0', die=True) # Will raise an exception if an older version is used
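The operator rules described above can be sketched as a small comparison on integer tuples. This is an illustration of the stated rules, not the library's code; parse_version and version_ok are hypothetical names, and padding short versions with zeros is an assumption of this sketch.

```python
def parse_version(text, width=3):
    """Hypothetical helper: turn '1.7' into (1, 7, 0) for tuple comparison."""
    parts = [int(p) for p in text.strip().split('.')]
    return tuple(parts + [0] * (width - len(parts)))

def version_ok(expected, current):
    """Hypothetical helper: True if `current` satisfies the `expected` string."""
    if expected.startswith(('>=', '<=', '==')):
        op, target = expected[:2], expected[2:]
    elif expected.startswith(('>', '<')):
        op, target = expected[0] + '=', expected[1:]  # '>' is read as '>='
    else:
        op, target = '==', expected  # no operator: equality is implied
    cur, exp = parse_version(current), parse_version(target)
    if op == '>=':
        return cur >= exp
    if op == '<=':
        return cur <= exp
    return cur == exp

print(version_ok('>=1.7.0', '1.7.2'))  # True
print(version_ok('>1.7.0', '1.7.0'))   # True, because '>' is treated as '>='
print(version_ok('1.3.2', '1.4.0'))    # False
```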
-
rsvsim.misc.check_save_version(expected=None, filename=None, die=False, verbose=True, **kwargs)
A convenience function that bundles check_version with git_info and saves automatically to disk from the calling file. The idea is to put this at the top of an analysis script, and commit the resulting file, to keep track of which version of rsvsim was used.
- Parameters
expected (str) – expected version information
filename (str) – file to save to; if None, guess based on current file name
kwargs (dict) – passed to git_info(), and thence to sc.savejson()
Examples:
cv.check_save_version()
cv.check_save_version('1.3.2', filename='script.gitinfo', comments='This is the main analysis script')
cv.check_save_version('1.7.2', folder='gitinfo', comments={'SynthPops':sc.gitinfo(sp.__file__)})
-
rsvsim.misc.get_version_pars(version, verbose=True)
Function for loading parameters from the specified version.
Parameters will be loaded for rsvsim 'as at' the requested version, i.e. the most recent set of parameters that is <= the requested version. Available parameter values are stored in the regression folder. If parameters are available for versions 1.3 and 1.4, then this function behaves as follows:
If parameters for version ‘1.3’ are requested, parameters will be returned from ‘1.3’
If parameters for version ‘1.3.5’ are requested, parameters will be returned from ‘1.3’, since rsvsim at version 1.3.5 would have been using the parameters defined at version 1.3.
If parameters for version ‘1.4’ are requested, parameters will be returned from ‘1.4’
- Parameters
version (str) – the version to load parameters from
- Returns
Dictionary of parameters from that version
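The 'as at' selection rule above (the most recent available version that is <= the requested one) can be sketched with tuple comparison. The names below are hypothetical and the list of available versions is illustrative; how the library actually discovers stored parameter files is not shown here.

```python
def parse_version(text):
    """Hypothetical helper: '1.3.5' -> (1, 3, 5)."""
    return tuple(int(p) for p in text.split('.'))

def pick_parameter_version(requested, available):
    """Hypothetical helper: most recent available version <= the requested one."""
    req = parse_version(requested)
    candidates = [v for v in available if parse_version(v) <= req]
    if not candidates:
        raise ValueError(f'No parameters available at or before version {requested}')
    return max(candidates, key=parse_version)

available = ['1.3', '1.4']  # illustrative, matching the example above
for requested in ['1.3', '1.3.5', '1.4']:
    print(requested, '->', pick_parameter_version(requested, available))
```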
-
rsvsim.misc.get_png_metadata(filename, output=False)
Read metadata from a PNG file. For use with images saved with cv.savefig(). Requires pillow, an optional dependency. Metadata retrieval for PDF and SVG is not currently supported.
- Parameters
filename (str) – the name of the file to load the data from
Example:
cv.Sim().run(do_plot=True)
cv.savefig('rsvsim.png')
cv.get_png_metadata('rsvsim.png')
-
rsvsim.misc.get_doubling_time(sim, series=None, interval=None, start_day=None, end_day=None, moving_window=None, exp_approx=False, max_doubling_time=100, eps=0.001, verbose=None)
Alternate method to calculate doubling time (one is already implemented in the sim object).
Examples:
cv.get_doubling_time(sim, interval=[3,30]) # returns the doubling time over the given interval (single float)
cv.get_doubling_time(sim, interval=[3,30], moving_window=3) # returns doubling times calculated over moving windows (array)
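For orientation, a doubling time over an interval is commonly computed from the ratio of the end and start values, assuming exponential growth between them. The sketch below shows that standard formula only; it is not necessarily how this function computes its result, and doubling_time is a hypothetical name operating on a plain list rather than a sim.

```python
import math

def doubling_time(series, start_day, end_day):
    """Hypothetical helper: doubling time assuming exponential growth.

    Uses T = n_days * ln(2) / ln(series[end] / series[start]).
    """
    if series[start_day] <= 0:
        raise ValueError('The starting value must be positive')
    n_days = end_day - start_day
    ratio = series[end_day] / series[start_day]
    if ratio <= 1:
        return math.inf  # no growth over the interval, so it never doubles
    return n_days * math.log(2) / math.log(ratio)

# Synthetic series that doubles every 5 days
series = [100 * 2 ** (t / 5) for t in range(31)]
print(doubling_time(series, start_day=3, end_day=30))
```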
-
rsvsim.misc.poisson_test(count1, count2, exposure1=1, exposure2=1, ratio_null=1, method='score', alternative='two-sided')
Test for ratio of two sample Poisson intensities
If the two Poisson rates are g1 and g2, then the Null hypothesis is
H0: g1 / g2 = ratio_null
against one of the following alternatives
H1_2-sided: g1 / g2 != ratio_null
H1_larger: g1 / g2 > ratio_null
H1_smaller: g1 / g2 < ratio_null
- Parameters
count1 (int) – number of events in the first sample
exposure1 (float) – total exposure (time * subjects) in the first sample
count2 (int) – number of events in the second sample
exposure2 (float) – total exposure (time * subjects) in the second sample
ratio_null (float) – ratio of the two Poisson rates under the Null hypothesis; default is 1
method (str) – method for the test statistic and the p-value; defaults to 'score'. Current methods are based on Gu et al. 2008. Implemented are 'wald', 'score' and 'sqrt' based asymptotic normal distribution, and the exact conditional test 'exact-cond', and its mid-point version 'cond-midp'; see Notes
alternative (str) – the alternative hypothesis, H1; has to be one of the following:
'two-sided': H1: ratio of rates is not equal to ratio_null (default)
'larger': H1: ratio of rates is larger than ratio_null
'smaller': H1: ratio of rates is smaller than ratio_null
- Returns
pvalue (float) – the p-value of the test
Notes
'wald': method W1A, Wald test, variance based on separate estimates
'score': method W2A, score test, variance based on estimate under Null
'wald-log': W3A
'score-log': W4A
'sqrt': W5A, based on variance stabilizing square root transformation
'exact-cond': exact conditional test based on binomial distribution
'cond-midp': midpoint-pvalue of exact conditional test
The latter two are only verified for one-sided example.
References
Gu, Ng, Tang, Schucany (2008). Testing the Ratio of Two Poisson Rates. Biometrical Journal 50(2).
Author: Josef Perktold License: BSD-3
destination statsmodels
From: https://stackoverflow.com/questions/33944914/implementation-of-e-test-for-poisson-in-python
Date: 2020feb24
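To make the default 'score' method concrete, here is a minimal standard-library sketch of a score test for the ratio of two Poisson rates, with the variance estimated under the Null. It covers only this one method with a normal approximation, is not the function documented above, and unlike it returns the test statistic as well; poisson_score_test and normal_sf are hypothetical names.

```python
import math

def normal_sf(x):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def poisson_score_test(count1, count2, exposure1=1, exposure2=1,
                       ratio_null=1, alternative='two-sided'):
    """Hypothetical helper: score test of H0: g1 / g2 = ratio_null."""
    # Under H0, E[count1] = r_d * E[count2], where r_d folds in the exposures
    r_d = ratio_null * exposure1 / exposure2
    # Variance of (count1 - r_d * count2), using the rate estimated under H0
    stat = (count1 - count2 * r_d) / math.sqrt((count1 + count2) * r_d)
    if alternative == 'two-sided':
        pvalue = 2 * normal_sf(abs(stat))
    elif alternative == 'larger':
        pvalue = normal_sf(stat)
    elif alternative == 'smaller':
        pvalue = normal_sf(-stat)
    else:
        raise ValueError(f'Unknown alternative: {alternative}')
    return stat, pvalue

stat, pvalue = poisson_score_test(60, 30)  # equal exposures, twice the events
print(stat, pvalue)
```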
-
rsvsim.misc.compute_gof(actual, predicted, normalize=True, use_frac=False, use_squared=False, as_scalar='none', eps=1e-09, skestimator=None, estimator=None, **kwargs)
Calculate the goodness of fit. By default use normalized absolute error, but highly customizable. For example, mean squared error is equivalent to setting normalize=False, use_squared=True, as_scalar='mean'.
- Parameters
actual (arr) – array of actual (data) points
predicted (arr) – corresponding array of predicted (model) points
normalize (bool) – whether to divide the values by the largest value in either series
use_frac (bool) – convert to fractional mismatches rather than absolute
use_squared (bool) – square the mismatches
as_scalar (str) – return as a scalar instead of a time series: choices are sum, mean, median
eps (float) – to avoid divide-by-zero
skestimator (str) – if provided, use this scikit-learn estimator instead
estimator (func) – if provided, use this custom estimator instead
kwargs (dict) – passed to the scikit-learn or custom estimator
- Returns
array of goodness-of-fit values, or a single value if as_scalar is 'sum', 'mean', or 'median'
- Return type
gofs (arr)
Examples:
import numpy as np
x1 = np.cumsum(np.random.random(100))
x2 = np.cumsum(np.random.random(100))
e1 = compute_gof(x1, x2) # Default, normalized absolute error
e2 = compute_gof(x1, x2, normalize=False, use_frac=True) # Fractional error
e3 = compute_gof(x1, x2, normalize=False, use_squared=True, as_scalar='mean') # Mean squared error
e4 = compute_gof(x1, x2, skestimator='mean_squared_error') # Scikit-learn's MSE method
e5 = compute_gof(x1, x2, as_scalar='median') # Normalized median absolute error -- highly robust
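How the normalize, use_frac, use_squared, and as_scalar options combine can be sketched in plain Python, following the parameter descriptions above. This is an illustration under those stated descriptions, not the library's implementation; simple_gof is a hypothetical name, and the estimator options are omitted.

```python
from statistics import mean, median

def simple_gof(actual, predicted, normalize=True, use_frac=False,
               use_squared=False, as_scalar='none', eps=1e-9):
    """Hypothetical helper: pointwise mismatch between two equal-length series."""
    gofs = [abs(a - p) for a, p in zip(actual, predicted)]
    if use_frac:
        # Fractional mismatch: divide each point by the larger of the two values
        gofs = [g / (max(a, p) + eps) for g, a, p in zip(gofs, actual, predicted)]
    elif normalize:
        # Divide by the largest value found in either series
        largest = max(max(abs(v) for v in actual), max(abs(v) for v in predicted))
        if largest > 0:
            gofs = [g / largest for g in gofs]
    if use_squared:
        gofs = [g ** 2 for g in gofs]
    reducers = {'sum': sum, 'mean': mean, 'median': median}
    return reducers[as_scalar](gofs) if as_scalar in reducers else gofs

actual, predicted = [1, 2, 4], [1, 3, 2]
print(simple_gof(actual, predicted))  # [0.0, 0.25, 0.5]
print(simple_gof(actual, predicted, normalize=False, use_squared=True, as_scalar='mean'))
```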