dispel.stats.learning module

Inter-session learning analysis.

This module provides classes and functions to compute and extract learning parameters from measure collections containing processed measures. The parameters are extracted from a model fitted by curve fitting, and the module also computes related learning measures.

class dispel.stats.learning.DelayParameters[source]#

Bases: object

Class ensemble of delay parameters.

__init__(mean, median, max)#

Parameters:

mean (float | None) –
median (float | None) –
max (float | None) –

Return type:

None

classmethod empty()[source]#

Return empty delay parameters.

Return type:: DelayParameters

max: float | None#: The maximum delay between sessions.

mean: float | None#: The mean delay between sessions.

median: float | None#: The median delay between sessions.

to_dict()[source]#

Convert delay parameters to dictionary format.

Return type:: Dict[str, float | None]
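
The container pattern documented above (three optional fields, an `empty()` constructor and a `to_dict()` export) can be sketched as follows. This is a stand-in, not the library's code: the class name `DelayParametersSketch` is hypothetical, and both the choice of `None` for "empty" fields and the dictionary key names are assumptions, since the documentation does not state them.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class DelayParametersSketch:
    """Hypothetical stand-in mirroring the documented fields."""

    mean: Optional[float]
    median: Optional[float]
    max: Optional[float]

    @classmethod
    def empty(cls) -> "DelayParametersSketch":
        # Assumption: "empty" means every field is None.
        return cls(mean=None, median=None, max=None)

    def to_dict(self) -> Dict[str, Optional[float]]:
        # Assumption: keys are the plain field names; the real keys may differ.
        return asdict(self)


placeholder = DelayParametersSketch.empty()
filled = DelayParametersSketch(mean=1.5, median=1.0, max=3.0)
```

An empty instance is convenient when a subject has too few sessions for a delay to be defined, because downstream code can still call `to_dict()` without special-casing.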

class dispel.stats.learning.LearningCurve[source]#

Bases: object

Class ensemble of learning curve parameters.

__init__(asymptote, slope)#

Parameters:

asymptote (float) –
slope (float) –

Return type:

None

asymptote: float#: The asymptote of the fitted learning curve.

static compute_learning(x, a, b)[source]#

Compute learning curve function.

Parameters:

x (float | int | ndarray | Series) –
a (float | int | ndarray | Series) –
b (float | int | ndarray | Series) –

Return type:

float | int | ndarray | Series

classmethod empty()[source]#: Return empty learning curve.

classmethod fit(x, y)[source]#

Fit learning curve using scipy.optimize.curve_fit().

See dispel.stats.learning.LearningCurve.compute_learning().

Parameters:

x (ndarray) – The trial numbers associated with data points.
y (ndarray) – The measure data points.

Returns:

The fitted learning curve.

Return type:

LearningCurve
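
The fitting step can be illustrated with a small sketch. The documentation does not show the body of `compute_learning(x, a, b)`, so the hyperbolic form `a - b / x` used here is an assumption, as is the mapping of `a` to the asymptote and `b` to the slope. The library fits with `scipy.optimize.curve_fit()`; this sketch swaps in closed-form ordinary least squares, which applies because the assumed model is linear in `1 / x`. The function names are hypothetical.

```python
def learning_curve(x, a, b):
    """Assumed hyperbolic form: the value approaches a as x grows."""
    return a - b / x


def fit_learning_curve(trials, values):
    """Least-squares fit of values = a - b / trials; returns (a, b)."""
    # Regress the values on u = 1 / trial: values = a + c * u with c = -b.
    u = [1.0 / trial for trial in trials]
    n = len(u)
    mean_u = sum(u) / n
    mean_y = sum(values) / n
    s_uu = sum((ui - mean_u) ** 2 for ui in u)
    s_uy = sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, values))
    c = s_uy / s_uu
    a = mean_y - c * mean_u
    return a, -c


# Noise-free data generated from the assumed model.
trials = [1, 2, 3, 4, 5, 6]
values = [learning_curve(trial, 10.0, 4.0) for trial in trials]
asymptote, slope = fit_learning_curve(trials, values)
```

On noise-free data the fit recovers the generating parameters up to floating-point error. With real measure values, a nonlinear fitter such as `curve_fit` is the more general choice because it also handles model forms that are not linear in their parameters.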

get_warm_up(data)[source]#

Compute the warm-up index (argmax) for measure values and fitted parameters.

The warm-up is the minimum number of trials the user has to perform to reach 90% of the optimal performance (the asymptote value) given by the model.

Parameters:: data (Series | ndarray) – A pandas series or numpy array containing ordered measure values.
Returns:: The index (argmax) of the first measure value that reaches 90% of the optimal performance given by the model.
Return type:: int
Raises:: TypeError – If the given data is neither a pandas series nor a numpy array.
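
The warm-up rule described above can be sketched in plain Python. Two assumptions are made here: higher measure values mean better performance, and `None` is returned when the threshold is never reached (the documentation does not say what the library returns in that case). The function name `warm_up_index` and its `fraction` parameter are hypothetical.

```python
def warm_up_index(values, asymptote, fraction=0.9):
    """Return the position of the first value reaching fraction * asymptote."""
    threshold = fraction * asymptote
    for position, value in enumerate(values):
        if value >= threshold:
            return position
    # Assumption: signal "never reached" with None.
    return None


# Ordered measure values for one subject, with a fitted asymptote of 10.
values = [4.0, 7.5, 8.8, 9.1, 9.4, 9.0]
first_reached = warm_up_index(values, asymptote=10.0)
```

Note that only the first crossing counts: the later dip to 9.0 in the sample data does not change the result.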

property learning_rate: float#: Get the learning rate related to curve.

slope: float#: The slope of the fitted learning curve.

to_dict()[source]#: Convert learning curve information to dictionary format.

class dispel.stats.learning.LearningModel[source]#

Bases: object

Class ensemble of learning model.

__init__(curve, new_data, r2_score, nb_outliers=0)#

Parameters:

curve (LearningCurve) –
new_data (Series) –
r2_score (float | None) –
nb_outliers (int | None) –

Return type:

None

curve: LearningCurve#: The fitted learning curve.

classmethod empty()[source]#

Return empty learning model.

Return type:: LearningModel

nb_outliers: int | None = 0#: The number of outliers rejected during the model fitting.

new_data: Series#: The data points without outliers.

r2_score: float | None#

to_dict()[source]#: Convert learning model information to dictionary format.

class dispel.stats.learning.LearningParameters[source]#

Bases: object

Class ensemble of learning parameters.

__init__(subject_id, measure_id, model, delay_parameters)#

Parameters:

subject_id (str) –
measure_id (str) –
model (LearningModel) –
delay_parameters (DelayParameters) –

Return type:

None

delay_parameters: DelayParameters#: The delay parameters in days.

measure_id: str#: The measure id.

model: LearningModel#: The learning model.

subject_id: str#: The subject’s id.

to_dict()[source]#

Convert learning parameters to dictionary format.

Return type:: Dict[str, float | int | str | None]

class dispel.stats.learning.LearningResult[source]#

Bases: object

The learning results for one measure and one or multiple subjects.

__init__()[source]#

append(learning_parameters)[source]#

Append new learning results for one subject to learning results.

Parameters:: learning_parameters (LearningParameters) – The learning parameters for the measure and subject in question.
Raises:: ValueError – If the learning parameters are for a different measure than the one concerning the learning result.

classmethod from_parameters(learning_parameters)[source]#

Initialize learning result from parameters.

Parameters:: learning_parameters (LearningParameters) – The learning parameters for the measure and subject in question.
Returns:: The learning result regrouping the given information.
Return type:: LearningResult

get_new_data(subject_id)[source]#

Get the new data points without outliers.

Parameters:: subject_id (str) – The identifier of the subject for which the new data is to be retrieved.
Returns:: A pandas series containing the new data points for the measure in question (without outliers).
Return type:: pandas.Series
Raises:: ValueError – If the subject identifier is not found in the learning analysis results.

get_parameters(subject_id=None)[source]#

Get learning results for one or all subjects.

Parameters:: subject_id (str | None) – The subject identifier for which the learning is to be retrieved. If None is provided all learning results will be given.
Returns:: If a valid subject id is given, the output is a pandas series summarizing learning results. If None is given the output will be a pandas data frame summarizing all learning results.
Return type:: Union[pandas.Series, pandas.DataFrame]
Raises:: ValueError – If the subject identifier is not found in the learning analysis results.

dispel.stats.learning.compute_delay(data)[source]#

Extract mean, median and maximum delay between consecutive sessions.

Parameters:: data (Series) – A pandas series containing timestamps.
Returns:: A dispel.stats.learning.DelayParameters holding the mean, median and maximum delay, in days, between consecutive sessions for a given measure and subject.
Return type:: DelayParameters
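
The delay summary can be sketched with the standard library alone. This is an illustration rather than the library's implementation: it takes a list of `datetime` objects instead of a pandas series, and both sorting the timestamps first and returning `None` values when fewer than two sessions exist are assumptions. The name `delay_summary` is hypothetical.

```python
from datetime import datetime
from statistics import mean, median

SECONDS_PER_DAY = 86_400


def delay_summary(timestamps):
    """Return mean, median and max gap in days between consecutive timestamps."""
    ordered = sorted(timestamps)
    gaps = [
        (later - earlier).total_seconds() / SECONDS_PER_DAY
        for earlier, later in zip(ordered, ordered[1:])
    ]
    if not gaps:
        # Fewer than two sessions: no delay can be computed.
        return {"mean": None, "median": None, "max": None}
    return {"mean": mean(gaps), "median": median(gaps), "max": max(gaps)}


# Four sessions with gaps of 1, 3 and 1 days.
sessions = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 2, 9, 0),
    datetime(2024, 1, 5, 9, 0),
    datetime(2024, 1, 6, 9, 0),
]
summary = delay_summary(sessions)
```

Reporting the median and maximum alongside the mean helps separate a subject with evenly spaced sessions from one with a single long break.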

dispel.stats.learning.compute_learning_model(data, tolerance=0.99, reset_trials=True)[source]#

Compute the learning model.

Parameters:

data (Series) – A pandas series containing the values of a single measure for a single user, indexed by trial number.
tolerance (float) – The tolerance threshold above which the data points are to be considered outliers and therefore rejected. Should be between 0 and 1.
reset_trials (bool) – True if the trial numbers are to be reset for the new data (without outliers). False otherwise.

Returns:

The output contains the following information:

The fitted learning model.

The delay parameters.

Return type:

Tuple[LearningModel, DelayParameters]

Raises:

ValueError – If the tolerance threshold is outside the legal bounds, i.e. [0, 1].

dispel.stats.learning.extract_learning_for_all_subjects(measure_collection, measure_id, tolerance=0.99, reset_trials=True)[source]#

Compute learning parameters for all subjects in a measure collection.

Parameters:

measure_collection (MeasureCollection) – A measure collection containing any measures and any subjects.
measure_id (str) – The measure id on which the learning parameters are to be computed.
tolerance (float) – The tolerance threshold above which the data points are to be considered outliers and therefore rejected. Should be between 0 and 1.
reset_trials (bool) – True if the trial numbers are to be reset for the new data (without outliers). False otherwise.

Returns:

The learning result for all subjects of the measure in question. See: dispel.stats.learning.LearningResult.

Return type:

LearningResult

dispel.stats.learning.extract_learning_for_one_subject(measure_collection, subject_id, measure_id, tolerance=0.99, reset_trials=True)[source]#

Compute learning for a unique subject and a unique measure.

Parameters:

measure_collection (MeasureCollection) – A measure collection containing any measures and any subjects.
subject_id (str) – The identifier of the subject for which the learning is to be computed.
measure_id (str) – The identifier of the measure for which the learning is to be computed.
tolerance (float) – The tolerance threshold above which the data points are to be considered outliers and therefore rejected. Should be between 0 and 1.
reset_trials (bool) – True if the trial numbers are to be reset for the new data (without outliers). False otherwise.

Returns:

The learning result for one subject of the measure in question. See: dispel.stats.learning.LearningResult.

Return type:

LearningResult

dispel.stats.learning.reject_outliers(data, sigma)[source]#

Reject outliers with Z-score outside the tolerated bounds.

Parameters:

data (Series) – A pandas series containing the values of a single measure for a single user, indexed by trial number.
sigma (float) – The standard deviation threshold above which data points are considered outliers and therefore rejected.

Returns:

The data without the detected outliers (if any), with the same structure as the input.

Return type:

pandas.Series
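
Z-score rejection as described above can be sketched without pandas. This is an illustration under stated assumptions: the data is a plain dictionary mapping trial number to value rather than a pandas series, the population standard deviation is used (the library may use the sample standard deviation), and a value is kept when its absolute Z-score is at most `sigma`. The name `reject_outliers_sketch` is hypothetical.

```python
from statistics import mean, pstdev


def reject_outliers_sketch(data, sigma):
    """Keep entries whose absolute Z-score is at most sigma."""
    values = list(data.values())
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        # All values are identical, so nothing can be an outlier.
        return dict(data)
    return {
        trial: value
        for trial, value in data.items()
        if abs(value - center) / spread <= sigma
    }


# Trial 4 lies far from the other values.
data = {1: 10.0, 2: 11.0, 3: 10.5, 4: 30.0, 5: 10.8, 6: 11.2}
cleaned = reject_outliers_sketch(data, sigma=2.0)
```

The surviving entries keep their original trial numbers, which corresponds to leaving trial numbers unreset; renumbering them consecutively would be the counterpart of `reset_trials=True`.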