Class SimHandler reference

High-level API classes for working with epidemic simulation data

class epivislab.simhandler.SimHandler(simulation, state_coord, within_sim_coord, between_sim_coord, measured_coord, time_coord)

Organizes xarray simulation data coordinates and manages aggregation and summary statistic calculations.

Class instantiation automatically calls several data validation, cleaning, and organizing methods. See validate(), make_lists(), and make_chunks() for more details on these methods.

simulation

simulation data

Type

xarray

state_coord

coordinate(s) for simulation state data (e.g. disease compartment)

Type

str, list

within_sim

coordinate(s) for within-simulation data

Type

str, list

between_sim

coordinate(s) for between-simulation data

Type

str, list

time_coord

coordinate for timestep data

Type

str

validate()

Check that all coords are identified as between, within, or measured
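The check described here can be illustrated with a small standalone sketch. It does not reproduce the library's implementation: the function name check_coords_classified and the coordinate names are hypothetical, and it only assumes that every coordinate must appear in one of the three groups named above.

```python
def check_coords_classified(all_coords, between, within, measured):
    """Raise ValueError if a coordinate is not classified as between, within, or measured."""
    classified = set(between) | set(within) | set(measured)
    missing = sorted(set(all_coords) - classified)
    if missing:
        raise ValueError(f"Unclassified coordinates: {missing}")
    return True


# Hypothetical coordinate names, for illustration only.
coords = ["replicate", "age_group", "cases"]
ok = check_coords_classified(
    coords,
    between=["replicate"],
    within=["age_group"],
    measured=["cases"],
)
```

Raising an exception, rather than returning False, is a design choice of this sketch; how validate() itself reports a failure is not stated in this reference.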

make_lists()

Convert state_coord, within_sim, between_sim, and measured coordinates provided as strings to single-item lists.
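A minimal sketch of this kind of normalization, assuming only that a bare string should become a one-element list and that list inputs pass through unchanged. The helper name as_list and the coordinate names are hypothetical.

```python
def as_list(value):
    """Wrap a single string in a list; copy any other iterable into a new list."""
    if isinstance(value, str):
        return [value]
    return list(value)


# Hypothetical coordinate names, for illustration only.
state_coord = as_list("compartment")
within_sim = as_list(["age_group", "region"])
```

Checking for str explicitly matters because a string is itself iterable: calling list("compartment") would split it into single characters.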

make_chunks()

Convert the xarray simulation to a dask.DataFrame.

The resulting dask.DataFrame will have a chunk size equal to the number of values in each simulation, inferred from the combined lengths of all simulation coordinates. The dimensions will be ordered as follows:

  • self.within_sim

  • self.state_coord

  • self.time_coord

  • self.between_sim

Per the Dask recommendation, the last dimension will be contiguous in the resulting dask.DataFrame. This ordering ensures that replicate simulation measures are stored next to each other, for faster slicing and computation of between-simulation statistics.

Returns

None; assigns chunked dask.DataFrame to chunk_sim
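The effect of the dimension ordering can be illustrated with plain NumPy. This is a sketch of the memory-layout idea only, not of make_chunks() itself: it does not use Dask or xarray, and the dimension names and sizes are hypothetical.

```python
import numpy as np

# Hypothetical dimension names and sizes, for illustration only.
dims = ("replicate", "time", "compartment", "age_group")
data = np.arange(3 * 4 * 2 * 5).reshape(3, 4, 2, 5)

# Target order from the list above: within_sim, state_coord, time_coord, between_sim.
target = ("age_group", "compartment", "time", "replicate")
axes = [dims.index(name) for name in target]

# Transpose, then copy into C order so the last axis (replicates) is contiguous.
ordered = np.ascontiguousarray(data.transpose(axes))

# All replicate values for one (age_group, compartment, time) cell are now adjacent.
cell = ordered[0, 1, 2, :]
```

With the between-simulation axis last, reading every replicate for a single cell touches one contiguous run of memory, which is the access pattern that between-simulation statistics need.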

class epivislab.simhandler.EpiSummary(simulation, state_coord, within_sim_coord, between_sim_coord, measured_coord, time_coord)

Extends SimHandler to implement aggregations.

sum_over_groups(groupers, aggcol)

Sum column aggcol within simulations, maintaining groups named in groupers

This method checks that groupers are not between simulation coordinates (which cannot be validly summed), and checks that the aggcol is a measured coordinate.

The time and state coordinates should be explicitly listed in groupers.

Parameters
  • groupers (list of str) – names of coordinates to maintain in aggregated data

  • aggcol (str) – name of measured coordinate

Returns

simulation data summed across within-simulation variables not included in groupers.

Return type

dask.DataFrame
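The within-simulation sum can be sketched with pandas on a small in-memory table. This is an assumption-laden illustration, not the library's code: it uses pandas rather than Dask, every column name is hypothetical, and it assumes that "within simulations" means the between-simulation coordinate is kept in the grouping.

```python
import pandas as pd

# Hypothetical long-format simulation table; all column names are illustrative.
df = pd.DataFrame({
    "replicate":   [0, 0, 0, 0, 1, 1, 1, 1],
    "time":        [0, 0, 1, 1, 0, 0, 1, 1],
    "compartment": ["I"] * 8,
    "age_group":   ["young", "old"] * 4,
    "cases":       [1, 2, 3, 4, 5, 6, 7, 8],
})

between_sim = ["replicate"]
groupers = ["time", "compartment"]
aggcol = "cases"

# Keep the between-simulation coordinate in the grouping so that replicates
# are never added together; age_group is left out and is therefore summed over.
summed = df.groupby(between_sim + groupers, as_index=False)[aggcol].sum()
```

The result has one row per replicate, time, and compartment, with cases summed across age groups.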

quantile_between_sims(groupers, aggcol, quantile)

Calculate quantiles of column aggcol between simulations, maintaining groups named in groupers

This method checks that groupers are not between simulation coordinates (which cannot be validly summed), and checks that the aggcol is a measured coordinate.

The time and state coordinates should be explicitly listed in groupers. If any within-simulation coordinates are excluded from groupers, this method will first call sum_over_groups() to sum data by the desired grouping. After summation is complete, valid quantiles are calculated.

Parameters
  • groupers (list of str) – names of coordinates to maintain in aggregated data

  • aggcol (str) – name of measured coordinate

  • quantile (float) – quantile value in the (0, 1) interval; passed to Quantile for calculation.

Returns

simulation data quantiles by groupers calculated across all simulations.

Return type

dask.DataFrame
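A between-simulation quantile can be sketched with pandas, assuming the data have already been summed within each simulation. This is not the library's implementation: it uses pandas rather than Dask, and the column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical per-replicate totals, already summed within each simulation.
summed = pd.DataFrame({
    "replicate":   [0, 1, 2, 3, 4] * 2,
    "time":        [0] * 5 + [1] * 5,
    "compartment": ["I"] * 10,
    "cases":       [10, 20, 30, 40, 50, 1, 2, 3, 4, 5],
})

groupers = ["time", "compartment"]
aggcol = "cases"

# Group by the retained coordinates only, so the quantile is taken
# across replicates within each (time, compartment) cell.
median = summed.groupby(groupers)[aggcol].quantile(0.5).reset_index()
```

Leaving the replicate column out of the grouping is what makes this a between-simulation statistic: each group contains one value per replicate.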

prediction_interval(groupers, aggcol, upper, lower)

Wrapper to quantile_between_sims() to calculate the upper, lower, and 50% (median) quantiles.

Parameters
  • groupers (list of str) – names of coordinates to maintain in aggregated data

  • aggcol (str) – name of measured coordinate

  • upper (float) – quantile value in the (0, 1) interval; passed to quantile_between_sims()

  • lower (float) – quantile value in the (0, 1) interval; passed to quantile_between_sims()

Returns

quantile data for the grouped simulation

Return type

xarray
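The three-quantile summary can be sketched as follows. The function interval_table is hypothetical and returns a pandas DataFrame, whereas the method documented here returns an xarray object; the column names and data are also illustrative.

```python
import pandas as pd


def interval_table(df, groupers, aggcol, upper, lower):
    """Return lower, median, and upper quantiles of aggcol for each group."""
    grouped = df.groupby(groupers)[aggcol]
    return pd.DataFrame({
        "lower": grouped.quantile(lower),
        "median": grouped.quantile(0.5),
        "upper": grouped.quantile(upper),
    }).reset_index()


# Hypothetical per-replicate totals, already summed within each simulation.
summed = pd.DataFrame({
    "replicate":   [0, 1, 2, 3, 4] * 2,
    "time":        [0] * 5 + [1] * 5,
    "compartment": ["I"] * 10,
    "cases":       [10, 20, 30, 40, 50, 1, 2, 3, 4, 5],
})

band = interval_table(summed, ["time", "compartment"], "cases", upper=0.75, lower=0.25)
```

Each row of band describes one time and compartment, with the spread across replicates given by the lower and upper columns.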

interval_plot(groupers, aggcol, upper, lower)

Wrapper to prediction_interval() to calculate the interval and generate a plot.

Parameters
  • groupers (list of str) – names of coordinates to maintain in aggregated data

  • aggcol (str) – name of measured coordinate

  • upper (float) – quantile value in the (0, 1) interval

  • lower (float) – quantile value in the (0, 1) interval

Returns

None; outputs plotly graph using plotly display method.

spaghetti_plot(**kwargs)

Generate spaghetti plots directly from simulation

Optionally, data can be grouped by passing groupers and aggcol arguments, which are passed on to timeseries.spaghetti_timeseries().

Parameters

**kwargs (optional) – optional keyword arguments passed to epivislab.timeseries.spaghetti_timeseries()

Returns

None; outputs plotly graph using plotly display method.