dispel.stats.core module

Core functions to calculate statistics.

dispel.stats.core.freq_nan(data)

Get the frequency of null values.

Parameters:

data (Series) –

Return type:

float
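
To illustrate what a null-value frequency means, the sketch below computes the share of missing entries in a series with plain pandas. It is an assumption about the behaviour, not the dispel implementation, and the helper name null_fraction is hypothetical.

```python
import numpy as np
import pandas as pd


def null_fraction(data: pd.Series) -> float:
    """Return the fraction of null entries in ``data`` (hypothetical helper)."""
    # isna() yields a boolean mask; its mean is the share of True values.
    return float(data.isna().mean())


x = pd.Series([1.0, np.nan, 3.0, np.nan])
print(null_fraction(x))  # 0.5
```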

dispel.stats.core.iqr(data)

Compute the inter-quartile range.

Parameters:

data (Series) –

Return type:

float
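
The inter-quartile range is the third quartile minus the first quartile. A minimal sketch with pandas follows; the helper name iqr_sketch is hypothetical and linear interpolation (the pandas default) is assumed.

```python
import pandas as pd


def iqr_sketch(data: pd.Series) -> float:
    """Return Q3 - Q1 of ``data`` (hypothetical helper)."""
    # quantile() interpolates linearly between data points by default.
    return float(data.quantile(0.75) - data.quantile(0.25))


x = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
print(iqr_sketch(x))  # 2.0
```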

dispel.stats.core.mad(data, axis=None)

Compute mean absolute deviation.

Parameters:
  • data (ndarray) – The data from which to calculate the mean absolute deviation.

  • axis – The axis along which to calculate the mean absolute deviation.

Returns:

The mean absolute deviation.

Return type:

numpy.ndarray
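
The mean absolute deviation is the average absolute distance of each value from the mean. The sketch below shows one way to compute it with NumPy along an optional axis; the helper name mean_abs_dev is hypothetical and this is not necessarily how dispel implements it.

```python
import numpy as np


def mean_abs_dev(data: np.ndarray, axis=None):
    """Return the mean absolute deviation along ``axis`` (hypothetical helper)."""
    # keepdims=True keeps the reduced axis so the mean broadcasts against data.
    center = np.mean(data, axis=axis, keepdims=True)
    return np.mean(np.abs(data - center), axis=axis)


x = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
print(mean_abs_dev(x))  # mean is 4, deviations are 3, 2, 1, 0, 6

m = np.array([[1.0, 3.0], [2.0, 6.0]])
print(mean_abs_dev(m, axis=0))  # one value per column
```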

dispel.stats.core.npcv(data)

Compute the non-parametric coefficient of variation of a series.

Parameters:

data (Series) –

Return type:

float

dispel.stats.core.percentile_05(self, *, q=0.05, interpolation='linear')

Percentile 0.05 aggregation

dispel.stats.core.percentile_95(self, *, q=0.95, interpolation='linear')

Percentile 0.95 aggregation

dispel.stats.core.q1(self, *, q=0.25, interpolation='linear')

First quartile (Q1) aggregation

dispel.stats.core.q3(self, *, q=0.75, interpolation='linear')

Third quartile (Q3) aggregation

dispel.stats.core.q_factory(percentile, name)

Create percentile aggregation method.

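Parameters:
  • percentile – The percentile that the returned aggregation method computes.

  • name – The name of the returned aggregation method.
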
Returns:

A callable aggregation method that returns the percentile specified by percentile.

Return type:

Callable[[pandas.Series], float]
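
As an illustration of the factory pattern described here, the sketch below builds a named percentile aggregation on top of pandas.Series.quantile. The helper name make_percentile_agg and the way the name is attached are assumptions for illustration, not the dispel implementation.

```python
from typing import Callable

import pandas as pd


def make_percentile_agg(percentile: float, name: str) -> Callable[[pd.Series], float]:
    """Build a named aggregation returning ``percentile`` (hypothetical helper)."""

    def agg(data: pd.Series) -> float:
        return float(data.quantile(percentile))

    # Setting __name__ gives the callable a readable label, for example
    # when it is passed to a pandas aggregation.
    agg.__name__ = name
    return agg


median_sketch = make_percentile_agg(0.5, "median_sketch")
x = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
print(median_sketch.__name__, median_sketch(x))  # median_sketch 3.0
```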

dispel.stats.core.variation(data, error='coerce')

Compute the coefficient of variation of a series.

Parameters:
  • data (Series) – A pandas series for which to compute the coefficient of variation.

  • error (Literal['raise', 'coerce', 'omit']) –

    Defines how to handle the case where the data mean is null (zero). The following options are available (default is coerce):

    • raise: ZeroDivisionError will be raised.

    • coerce: variation will be set as 0.

    • omit: variation will return nan.

Returns:

The coefficient of variation of the data using an unbiased standard deviation computation.

Return type:

float

Raises:
  • ZeroDivisionError – If the data mean is null and the argument error is set to raise.

  • ValueError – If the argument error is given an unsupported value.

Examples

Here are a few usage examples:

variation

>>> import pandas as pd
>>> from dispel.stats.core import variation
>>> x = pd.Series([3.2, 4.1, 0., 1., -6.])
>>> variation(x)
8.626902135395195

When the data mean is null (zero), the error argument controls the output.

variation

>>> x = pd.Series([3., -4., 0., 2., -1.])
>>> variation(x)
0.0

>>> x = pd.Series([3., -4., 0., 2., -1.])
>>> variation(x, error='raise')
Traceback (most recent call last):
...
ZeroDivisionError: Cannot divide by null mean.

>>> x = pd.Series([3., -4., 0., 2., -1.])
>>> variation(x, error='omit')
nan
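
The behaviour documented above can be sketched as the sample standard deviation (ddof=1) divided by the mean, with the three error modes handled explicitly. The helper name cv_sketch is hypothetical and this is an illustration under those assumptions, not the dispel source; it does reproduce the first documented output for the example series.

```python
import math

import pandas as pd


def cv_sketch(data: pd.Series, error: str = "coerce") -> float:
    """Coefficient of variation: std (ddof=1) over mean (hypothetical helper)."""
    if error not in ("raise", "coerce", "omit"):
        raise ValueError(f"Unsupported error value: {error!r}")
    mean = data.mean()
    if mean == 0:
        if error == "raise":
            raise ZeroDivisionError("Cannot divide by null mean.")
        # coerce maps the undefined ratio to 0, omit maps it to nan.
        return 0.0 if error == "coerce" else math.nan
    # pandas' std() uses ddof=1 by default (sample standard deviation).
    return float(data.std() / mean)


x = pd.Series([3.2, 4.1, 0.0, 1.0, -6.0])
print(round(cv_sketch(x), 4))  # 8.6269
```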

dispel.stats.core.variation_increase(data, error='coerce')

Compute the coefficient of variation increase for a series.

The coefficient of variation increase corresponds to the CV of the second half of the data minus that of the first half.

Parameters:
  • data (Series) – A pandas series for which to compute the coefficient of variation increase.

  • error (Literal['raise', 'coerce', 'omit']) –

    Defines how to handle the case where the data mean is null (zero). The following options are available (default is coerce):

    • raise: ZeroDivisionError will be raised.

    • coerce: variation will be set as 0.

    • omit: variation will return nan.

Returns:

The coefficient of variation increase of the data using an unbiased standard deviation computation.

Return type:

float

Examples

Here are a few usage examples:

variation_increase

>>> import pandas as pd
>>> from dispel.stats.core import variation_increase
>>> x = pd.Series([3.2, 4.1, 0., 1., -6.])
>>> variation_increase(x)
-2.4459184350510386

When the mean of either half is null (zero), the error argument controls the output.

variation_increase

>>> x = pd.Series([3., -4., 0., 1., -1.])
>>> variation_increase(x, error='raise')
Traceback (most recent call last):
...
ZeroDivisionError: Cannot divide by null mean.

>>> x = pd.Series([3., -3., 0., 2., -1.])
>>> variation_increase(x, error='omit')
nan
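
The "second half minus first half" definition can be sketched as follows. The split point len(data) // 2 is an assumption (it places the extra element of an odd-length series in the second half); it reproduces the documented output for the first example series, but the actual split rule in dispel is not stated here. The helper name cv_increase_sketch is hypothetical, and null-mean handling is left out for brevity.

```python
import pandas as pd


def cv_increase_sketch(data: pd.Series) -> float:
    """CV of the second half minus CV of the first half (hypothetical helper)."""
    # Assumed split: the first len // 2 values form the first half.
    half = len(data) // 2
    first, second = data.iloc[:half], data.iloc[half:]
    # std() uses ddof=1 by default (sample standard deviation).
    cv_first = first.std() / first.mean()
    cv_second = second.std() / second.mean()
    return float(cv_second - cv_first)


x = pd.Series([3.2, 4.1, 0.0, 1.0, -6.0])
print(round(cv_increase_sketch(x), 4))  # -2.4459
```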