dispel.stats.reliability module#

Reliability analyses module.

Intraclass correlation coefficients#

This module contains functions to compute intraclass correlation coefficient (ICC) models[1][2].

These functions compute single-score or average-score ICCs as an index of inter-rater reliability for quantitative data. They also compute an F-test and a confidence interval.

Implemented ICCs#
ICC	`Model`	`Description`	`Unit`	Function
ICC(1,1)	`ONE_WAY`	`AGREEMENT`	`SINGLE`	`icc_oneway_random_absolute_single()`
ICC(2,1)	`TWO_WAY`	`AGREEMENT`	`SINGLE`	`icc_two_way_random_absolute_single()`
ICC(3,1)	`TWO_WAY`	`CONSISTENCY`	`SINGLE`	`icc_two_way_mixed_consistency_single()`
ICC(1,k)	`ONE_WAY`	`AGREEMENT`	`AVERAGE`	`icc_oneway_random_absolute_average()`
ICC(2,k)	`TWO_WAY`	`AGREEMENT`	`AVERAGE`	`icc_two_way_random_absolute_average()`
ICC(3,k)	`TWO_WAY`	`CONSISTENCY`	`AVERAGE`	`icc_two_way_mixed_consistency_average()`

When considering which form of ICC is appropriate for an actual set of data, one has to make several decisions (Shrout & Fleiss, 1979)[3]:

1. Should only the subjects be considered as random effects (ICCModel.ONE_WAY), or are both subjects and raters randomly chosen from a larger pool of persons (ICCModel.TWO_WAY)?

2. If differences in judges’ mean ratings are of interest, inter-rater agreement (ICCDesc.AGREEMENT) should be computed instead of consistency (ICCDesc.CONSISTENCY).

3. If the unit of analysis is a mean of several ratings, the unit should be changed to ICCUnit.AVERAGE. In most cases, however, single values (ICCUnit.SINGLE) are analyzed.
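The three decisions map onto the table of implemented ICCs. The lookup below is a minimal sketch of that mapping: the enums are local stand-ins that mirror the documented values rather than imports from dispel, and select_icc_function is a hypothetical helper used only for illustration.

```python
from enum import Enum


# Local stand-ins mirroring the documented enum values (not imported from dispel).
class ICCModel(str, Enum):
    ONE_WAY = "oneway"
    TWO_WAY = "two-way"


class ICCDesc(str, Enum):
    AGREEMENT = "agreement"
    CONSISTENCY = "consistency"


class ICCUnit(str, Enum):
    SINGLE = "single"
    AVERAGE = "average"


# (model, description, unit) -> function name, copied from the table of implemented ICCs.
ICC_FUNCTIONS = {
    (ICCModel.ONE_WAY, ICCDesc.AGREEMENT, ICCUnit.SINGLE): "icc_oneway_random_absolute_single",
    (ICCModel.TWO_WAY, ICCDesc.AGREEMENT, ICCUnit.SINGLE): "icc_two_way_random_absolute_single",
    (ICCModel.TWO_WAY, ICCDesc.CONSISTENCY, ICCUnit.SINGLE): "icc_two_way_mixed_consistency_single",
    (ICCModel.ONE_WAY, ICCDesc.AGREEMENT, ICCUnit.AVERAGE): "icc_oneway_random_absolute_average",
    (ICCModel.TWO_WAY, ICCDesc.AGREEMENT, ICCUnit.AVERAGE): "icc_two_way_random_absolute_average",
    (ICCModel.TWO_WAY, ICCDesc.CONSISTENCY, ICCUnit.AVERAGE): "icc_two_way_mixed_consistency_average",
}


def select_icc_function(model, desc, unit):
    """Return the name of the documented function for the three choices (hypothetical helper)."""
    try:
        return ICC_FUNCTIONS[(model, desc, unit)]
    except KeyError:
        raise ValueError(
            f"No ICC implemented for {model.value}/{desc.value}/{unit.value}"
        ) from None


# Raters are a random sample, rater offsets are not of interest, single ratings are analyzed.
print(select_icc_function(ICCModel.TWO_WAY, ICCDesc.CONSISTENCY, ICCUnit.SINGLE))
```

Combinations that do not appear in the table, such as a one-way consistency ICC, raise a ValueError in this sketch.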

The implementations of the ICCs are based on:

References

class dispel.stats.reliability.ICCDesc[source]#

Bases: StringEnum

The ICC description.

AGREEMENT = 'agreement'#: Agreement between raters

CONSISTENCY = 'consistency'#: Consistency between raters

class dispel.stats.reliability.ICCKind[source]#

Bases: StringEnum

The kind of ICC analysis.

OTHER = 'other'#

PARALLEL_FORM = 'parallel form'#

TEST_RETEST = 'test retest'#

class dispel.stats.reliability.ICCModel[source]#

Bases: StringEnum

The model type used to perform the ICC analysis.

ONE_WAY = 'oneway'#: Random subject effects

TWO_WAY = 'two-way'#: Random subject and repetition effects

class dispel.stats.reliability.ICCResult[source]#

Bases: object

ICC reliability analysis results.

__init__(model, desc, unit, kind, value, l_bound, u_bound, p_value, sample_size, sessions, power=0.8)#

Parameters:

model (ICCModel) –
desc (ICCDesc) –
unit (ICCUnit) –
kind (ICCKind) –
value (float) –
l_bound (float) –
u_bound (float) –
p_value (float) –
sample_size (int) –
sessions (int) –
power (float) –

Return type:

None

desc: ICCDesc#: The description of ICC

kind: ICCKind#: The kind of the ICC

l_bound: float#: The lower bound of the 95% confidence interval

model: ICCModel#: The model of ICC

p_value: float#: The p-value of the test, which is expected to be below 0.05

power: float = 0.8#: The power of the study regarding the sample size and ICC

sample_size: int#: The subject sample size associated with the current study and ICC

sessions: int#: The number of sessions considered in the ICC analysis

u_bound: float#: The upper bound of the 95% confidence interval

unit: ICCUnit#: The unit of ICC

value: float#: The ICC value

class dispel.stats.reliability.ICCResultSet[source]#

Bases: object

A set of ICC scores for multiple measures.

__init__(study, p0_icc, iccs=<factory>)#

Parameters:

study (ICCResultSetStudy) –
p0_icc (float) –
iccs (Dict[str, ICCResult]) –

Return type:

None

iccs: Dict[str, ICCResult]#: The ICC score of each measure, keyed by its measure_id

p0_icc: float#: The null hypothesis reference for measure ICC scores

study: ICCResultSetStudy#: The type of study concerned, either control or clinical

to_data_frame()[source]#

Export ICC result set to a pandas data frame format.

Return type:

DataFrame

class dispel.stats.reliability.ICCResultSetStudy[source]#

Bases: str, Enum

The type of the study.

STUDY_CLINICAL = 'clinical'#

STUDY_CONTROL = 'control'#

class dispel.stats.reliability.ICCUnit[source]#

Bases: StringEnum

The ICC unit.

AVERAGE = 'average'#: Values analyzed are aggregates of multiple values

SINGLE = 'single'#: Values analyzed are single values

class dispel.stats.reliability.StringEnum[source]#

Bases: Enum

String enumerator.

dispel.stats.reliability.ensure_session_standards(data, session_min=8, null_ratio=0.1)[source]#

Ensure sessions have sufficient support.

This transformation ensures that sessions have sufficient support, i.e. a subject is only considered if it has contributed at least session_min sessions. A session is dropped if it has a higher null ratio than null_ratio.

Parameters:

data (DataFrame) – A data frame with subjects as rows, sessions as columns, and cells containing the measure values.
session_min (int) – The minimum number of required sessions for each subject to be considered in the analysis.
null_ratio (float) – The ratio of null values across subjects below which a session is taken into account.

Returns:

The filtered data containing only subjects that have contributed at least session_min sessions, sessions that have a lower ratio of null values than null_ratio, and subjects that have no null value for the latter sessions.

Return type:

pandas.DataFrame
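The filtering described above can be sketched in plain Python. This is an illustration only, not dispel's implementation: the table is a list of rows with None for missing values instead of a DataFrame, filter_sessions is a hypothetical name, and both the order of the three steps and the strict comparison against null_ratio are assumptions read from the Returns description.

```python
def filter_sessions(rows, session_min=8, null_ratio=0.1):
    """Filter a subjects-by-sessions table given as a list of rows (None = missing)."""
    # 1. Keep subjects that contributed at least `session_min` non-null sessions.
    subjects = [r for r in rows if sum(v is not None for v in r) >= session_min]
    if not subjects:
        return []
    # 2. Keep sessions whose share of missing values is below `null_ratio`.
    kept = [
        j
        for j in range(len(subjects[0]))
        if sum(r[j] is None for r in subjects) / len(subjects) < null_ratio
    ]
    # 3. Keep subjects without any missing value in the remaining sessions.
    return [
        [r[j] for j in kept]
        for r in subjects
        if all(r[j] is not None for j in kept)
    ]


rows = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, None, 4.1],
    [1.2, 2.2, None, 4.2],
    [None, None, None, 4.3],  # too few sessions, dropped in step 1
    [1.4, 2.4, 3.4, None],    # missing a kept session, dropped in step 3
]
# The third session is missing for half of the remaining subjects, so step 2 drops it.
print(filter_sessions(rows, session_min=3, null_ratio=0.5))
```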

dispel.stats.reliability.icc_oneway_random_absolute_average(ratings, confidence_level=0.95)[source]#

Compute the ICC(1,k) score.

ICC(1,k): One-way random effects, absolute agreement, multiple raters/measurements.

Parameters:

ratings (DataFrame) – Matrix of n subjects by m raters, i.e. array-like of shape (n_subjects, n_raters).
confidence_level (float) – Confidence level of the interval.

Returns:

The ICC with its related test information.

Return type:

ICCResult

dispel.stats.reliability.icc_oneway_random_absolute_single(ratings, confidence_level=0.95)[source]#

Compute the ICC(1,1) score.

ICC(1,1): One-way random effects, absolute agreement, single rater/measurement.

Parameters:

ratings (DataFrame) – Matrix of n subjects by m raters, i.e. array-like of shape (n_subjects, n_raters).
confidence_level (float) – Confidence level of the interval.

Returns:

The ICC with its related test information.

Return type:

ICCResult
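For orientation, the point estimates of the two one-way forms can be written in terms of the between-subjects and within-subjects mean squares of a one-way ANOVA, following Shrout & Fleiss (1979). The sketch below is a plain-Python illustration, not dispel's implementation; it omits the F-test and the confidence interval, and the helper names are hypothetical. The data are the six-subject, four-judge example from Shrout & Fleiss (1979).

```python
def oneway_mean_squares(ratings):
    """Return (BMS, WMS) for a list of per-subject rating lists."""
    n = len(ratings)      # number of subjects
    k = len(ratings[0])   # number of ratings per subject
    grand_mean = sum(map(sum, ratings)) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    # Between-subjects mean square, n - 1 degrees of freedom.
    bms = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subjects mean square, n * (k - 1) degrees of freedom.
    wms = sum(
        (x - m) ** 2 for row, m in zip(ratings, subject_means) for x in row
    ) / (n * (k - 1))
    return bms, wms


def icc_1_1(ratings):
    """ICC(1,1): reliability of a single rating."""
    k = len(ratings[0])
    bms, wms = oneway_mean_squares(ratings)
    return (bms - wms) / (bms + (k - 1) * wms)


def icc_1_k(ratings):
    """ICC(1,k): reliability of the mean of k ratings."""
    bms, wms = oneway_mean_squares(ratings)
    return (bms - wms) / bms


RATINGS = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_1_1(RATINGS), 2), round(icc_1_k(RATINGS), 2))
```

On this data the sketch gives about 0.17 for ICC(1,1) and 0.44 for ICC(1,k).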

dispel.stats.reliability.icc_parallel_form(form1, form2)[source]#

Compute the ICC score from parallel-form measures.

This score allows the comparison of two measures of the same nature and definition but obtained under different conditions (for example, CPS mean RT on predefinedKey1 compared to predefinedKey2).

Parameters:

form1 (pandas.DataFrame) – A data frame containing the M measure form 1 values for all users
form2 (pandas.DataFrame) – A data frame containing the M’ measure form 2 values for all users

Returns:

The parallel form ICC score containing the value with its definition and its 95% confidence interval

Return type:

ICCResult

dispel.stats.reliability.icc_power(icc, p0_icc, n_ratings, n_subjects, p_value=0.05, tails=2)[source]#

Compute the power of an ICC score obtained during a study.

Parameters:

icc (float) – The ICC obtained during the study
p0_icc (float) – The null hypothesis value of the obtained ICC
n_ratings (int) – The number of ratings for each subject
n_subjects (int) – Number of subjects during the study
p_value (float) – The p-value threshold of the study, conventionally 0.05.
tails (Literal[1, 2]) – One-tailed (1) or two-tailed (2) test.

Returns:

The power associated with the ICC study.

Return type:

float

dispel.stats.reliability.icc_sample_size(icc, p0_icc, n_ratings, p_value=0.05, tails=2, power=0.8)[source]#

Compute the sample size for an ICC reliability scoring. See [4].

Parameters:

icc (float) – The ICC score expected during the study
p0_icc (float) – The null hypothesis value of the expected ICC
n_ratings (int) – The number of ratings for each subject
p_value (float) – The desired p-value threshold, conventionally 0.05 for statistical tests.
tails (Literal[1, 2]) – One-tailed (1) or two-tailed (2) test.
power (float) – The statistical power of the test, conventionally set to 0.8 for clinical studies.

Returns:

The sample size of the study

Return type:

int
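To make the roles of the parameters concrete, the sketch below uses one widely cited closed-form approximation for ICC sample sizes, due to Walter, Eliasziw & Donner (1998). The reference behind [4] is not reproduced on this page, so treating that approximation as the basis is an assumption, and approx_icc_sample_size is a hypothetical name; the result may differ from what icc_sample_size() returns.

```python
import math
from statistics import NormalDist


def approx_icc_sample_size(icc, p0_icc, n_ratings, p_value=0.05, tails=2, power=0.8):
    """Approximate number of subjects needed to detect `icc` against `p0_icc`.

    Assumed formula (Walter, Eliasziw & Donner, 1998):
        n = 1 + 2 (z_alpha + z_beta)^2 k / ((ln C0)^2 (k - 1))
    with C0 = (1 + k * theta0) / (1 + k * theta1) and theta = rho / (1 - rho).
    """
    if not 0 <= p0_icc < icc < 1:
        raise ValueError("expected 0 <= p0_icc < icc < 1")
    if n_ratings < 2:
        raise ValueError("at least two ratings per subject are required")
    z_alpha = NormalDist().inv_cdf(1 - p_value / tails)
    z_beta = NormalDist().inv_cdf(power)
    theta0 = p0_icc / (1 - p0_icc)
    theta1 = icc / (1 - icc)
    c0 = (1 + n_ratings * theta0) / (1 + n_ratings * theta1)
    n = 1 + 2 * (z_alpha + z_beta) ** 2 * n_ratings / (
        math.log(c0) ** 2 * (n_ratings - 1)
    )
    # Round up, since a fraction of a subject cannot be recruited.
    return math.ceil(n)


# Expected ICC of 0.8 against a null value of 0.6, two ratings per subject.
print(approx_icc_sample_size(0.8, 0.6, 2, tails=1))
print(approx_icc_sample_size(0.8, 0.6, 2, tails=2))
```

Under this approximation, a two-tailed test requires more subjects than a one-tailed test at the same p-value threshold and power.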

References

dispel.stats.reliability.icc_set_test_retest(measure_collection, study=ICCResultSetStudy.STUDY_CONTROL, session_min=8, null_ratio=0.1, errors='raise')[source]#

Compute the ICC test-retest score for all measures.

It takes into consideration the study type, which could be either “control” or “clinical”. It also ensures sessions have sufficient support. See icc_test_retest_session_safe() for details.

Parameters:

measure_collection (MeasureCollection) – A collection of measures
study (ICCResultSetStudy) – Type of the study used to determine the \(p_0^{icc}\). 0 is used for control studies and 0.6 for clinical ones.
session_min (int) – See ensure_session_standards().
null_ratio (float) – See ensure_session_standards().
errors (Literal['raise', 'ignore']) – How to handle errors occurring during the computation of ICC scores. If ‘raise’, errors are raised. If ‘ignore’, the affected measure is skipped.

Returns:

The set of ICC results, one per measure.

Return type:

ICCResultSet

Raises:

ValueError – If errors is set to ‘raise’, any ValueError raised in icc_test_retest_session_safe() is re-raised.
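The documented handling of study and errors can be sketched as a loop over measure ids. This is an illustration of the described behavior only: compute_all and the compute_one callback are hypothetical stand-ins for the per-measure computation, and the dictionary of null hypothesis values simply restates the parameter description.

```python
# Null hypothesis ICC per study type, as stated in the parameter description.
P0_ICC = {"control": 0.0, "clinical": 0.6}


def compute_all(measure_ids, compute_one, study="control", errors="raise"):
    """Apply `compute_one(measure_id, p0_icc)` to every measure id (hypothetical helper)."""
    if errors not in ("raise", "ignore"):
        raise ValueError(f"Unsupported errors value: {errors!r}")
    p0_icc = P0_ICC[study]
    results = {}
    for measure_id in measure_ids:
        try:
            results[measure_id] = compute_one(measure_id, p0_icc)
        except ValueError:
            if errors == "raise":
                raise
            # errors == "ignore": skip this measure and continue with the next one.
    return results


def fake_compute(measure_id, p0_icc):
    """Stand-in computation that fails for one measure id."""
    if measure_id == "bad":
        raise ValueError("not enough sessions")
    return (measure_id, p0_icc)


print(compute_all(["ok", "bad"], fake_compute, study="clinical", errors="ignore"))
```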

dispel.stats.reliability.icc_test_retest(data, study=ICCResultSetStudy.STUDY_CONTROL)[source]#

Compute the test-retest ICC of a data frame.

Based on the implementation in Mind-the-Pineapple/ICC, which is licensed under the MIT License. See [5] and [6].

Parameters:

data (DataFrame) – An N×M pandas DataFrame containing N subjects and M ratings
study (ICCResultSetStudy) – Status of the study

Returns:

The test-retest ICC score containing the value with its definition and its 95% confidence interval

Return type:

ICCResult

References

dispel.stats.reliability.icc_test_retest_session_safe(measure_collection, measure_id, study=ICCResultSetStudy.STUDY_CONTROL, session_min=8, null_ratio=0.1)[source]#

Compute the ICC test-retest score for one measure.

Parameters:

measure_collection (MeasureCollection) – A collection of measures.
measure_id (str) – The measure id to be used for the computation of the ICC scores
study (ICCResultSetStudy) – Type of the study used to determine the \(p_0^{icc}\). 0 is used for control studies and 0.6 for clinical ones.
session_min (int) – See ensure_session_standards().
null_ratio (float) – See ensure_session_standards().

Returns:

The ICC score results for the provided measure.

Return type:

ICCResult

dispel.stats.reliability.icc_two_way_mixed_consistency_average(ratings, confidence_level=0.95)[source]#

Compute the ICC(3,k) score.

ICC(3,k): Two-way mixed effects, consistency, average raters/measurements.

Parameters:

ratings (DataFrame) – Matrix of n subjects by m raters, i.e. array-like of shape (n_subjects, n_raters).
confidence_level (float) – Confidence level of the interval.

Returns:

The ICC with its related test information.

Return type:

ICCResult

dispel.stats.reliability.icc_two_way_mixed_consistency_single(ratings, confidence_level=0.95)[source]#

Compute the ICC(3,1) score.

ICC(3,1): Two-way mixed effects, consistency, single rater/measurement.

Parameters:

ratings (DataFrame) – Matrix of n subjects by m raters, i.e. array-like of shape (n_subjects, n_raters).
confidence_level (float) – Confidence level of the interval.

Returns:

The ICC with its related test information.

Return type:

ICCResult

dispel.stats.reliability.icc_two_way_random_absolute_average(ratings, confidence_level=0.95)[source]#

Compute the ICC(2,k) score.

ICC(2,k): Two-way random effects, absolute agreement, average raters/measurements.

Parameters:

ratings (DataFrame) – Matrix of n subjects by m raters, i.e. array-like of shape (n_subjects, n_raters).
confidence_level (float) – Confidence level of the interval.

Returns:

The ICC with its related test information.

Return type:

ICCResult

dispel.stats.reliability.icc_two_way_random_absolute_single(ratings, confidence_level=0.95)[source]#

Compute the ICC(2,1) score.

ICC(2,1): Two-way random effects, absolute agreement, single rater/measurement.

Parameters:

ratings (DataFrame) – Matrix of n subjects by m raters, i.e. array-like of shape (n_subjects, n_raters).
confidence_level (float) – Confidence level of the interval.

Returns:

The ICC with its related test information.

Return type:

ICCResult
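The four two-way forms differ only in how the rater and residual mean squares enter the denominator. The sketch below computes the point estimates from a two-way ANOVA without replication, following the formulas in Shrout & Fleiss (1979). It is a plain-Python illustration with hypothetical helper names, not dispel's implementation, and it omits the F-test and the confidence interval. The data are the six-subject, four-judge example from Shrout & Fleiss (1979).

```python
def two_way_mean_squares(ratings):
    """Return (BMS, JMS, EMS) for a complete subjects-by-raters table."""
    n, k = len(ratings), len(ratings[0])
    grand_mean = sum(map(sum, ratings)) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    rater_means = [sum(col) / n for col in zip(*ratings)]
    # Sums of squares for subjects, raters, and the total.
    bss = k * sum((m - grand_mean) ** 2 for m in subject_means)
    jss = n * sum((m - grand_mean) ** 2 for m in rater_means)
    tss = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    # The residual is what remains after removing subject and rater effects.
    ess = tss - bss - jss
    return bss / (n - 1), jss / (k - 1), ess / ((n - 1) * (k - 1))


def two_way_iccs(ratings):
    """Point estimates of the four two-way ICC forms."""
    n, k = len(ratings), len(ratings[0])
    bms, jms, ems = two_way_mean_squares(ratings)
    return {
        # Agreement: rater variance counts against reliability.
        "ICC(2,1)": (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n),
        "ICC(2,k)": (bms - ems) / (bms + (jms - ems) / n),
        # Consistency: systematic rater offsets are ignored.
        "ICC(3,1)": (bms - ems) / (bms + (k - 1) * ems),
        "ICC(3,k)": (bms - ems) / bms,
    }


RATINGS = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
for name, value in two_way_iccs(RATINGS).items():
    print(name, round(value, 2))
```

On this data the consistency estimates (about 0.71 and 0.91) are clearly higher than the agreement estimates (about 0.29 and 0.62), because the judges differ strongly in their mean ratings.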