Data model#
Reading#
The basic container for all processed data is a Reading.
Providers extend the library with reading functionality for specific file formats.
The function read() automatically parses registered formats into
the reading data model. Providers also come with functions that read
files directly if the format is known.
As an example, we will read a file using the parser for Ad Scientiam’s file format. First, import the reader and call it with the path to the file:
from dispel.providers.ads.io import read_ads
reading = read_ads('path/to/file.json')
Level#
Level definition#
Tasks used to assess the performance of a subject can be composed of multiple sub-tasks that assess multiple aspects within a domain. To provide a logical structure for data capture, these sub-tasks are structured in so-called levels. From a data analysis perspective, a level is a logical unit of analysis. For example, the DRAW task asks the subject to draw four different shapes twice with both their left and right hand. Each attempt to draw a shape with one hand is considered a level, as one would derive measures for analysis for each and every one of them.
Identification of levels#
Levels are identified by their id, which uniquely selects one scenario
out of the available modalities. The ids are derived from the lower-case
names of the modalities. Multiple modalities are separated with a dash (-).
CPS#
Symbol-to-digit/digit-to-digit
- Symbol-to-digit (symbol_to_digit)
- Digit-to-digit (digit_to_digit)
PINCH / DRAW / GRIP#
Hand
- Left (left)
- Right (right)
Shape
- Square counter-clock (square_counter_clock)
- Square (square)
- Figure 8 (figure_eight)
- Spiral (spiral)
Attempt
- 1st attempt (1)
- 2nd attempt (2)
An example for DRAW would be left_square_1.
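To make the combination of modalities concrete, the following minimal sketch enumerates one candidate id per hand, shape and attempt for DRAW. The helper `draw_level_ids` is hypothetical and not part of the library. Both the hand/shape/attempt ordering and the default separator are assumptions chosen to reproduce `left_square_1`; pass `sep="-"` if the ids in your data use a dash.

```python
from itertools import product

# Modality values as listed for PINCH / DRAW / GRIP.
HANDS = ["left", "right"]
SHAPES = ["square_counter_clock", "square", "figure_eight", "spiral"]
ATTEMPTS = ["1", "2"]


def draw_level_ids(sep="_"):
    """Return one candidate level id per hand/shape/attempt combination.

    Hypothetical helper: the separator and the ordering are assumptions.
    """
    return [sep.join(parts) for parts in product(HANDS, SHAPES, ATTEMPTS)]


level_ids = draw_level_ids()
print(len(level_ids))  # 16 (2 hands x 4 shapes x 2 attempts)
print(level_ids[2])    # left_square_1
```

The count of 16 matches the description of DRAW: four shapes, drawn twice, with each hand.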
Relationships#
The following diagram illustrates the class relationships between Reading, Level, RawDataSet, LevelEpoch and MeasureSet.
Extracting a Level#
To access a dispel.data.levels.Level, use the method get_level():
level = reading.get_level('<level_id>')
where <level_id> has to be replaced with the desired level id. For example,
if you have a reading of the CPS test, you can use level_id = 'digit_to_symbol'.
One can also extract all levels from a Reading using levels:
levels = reading.levels
LevelEpoch#
LevelEpochs describe specific time periods
in levels with measures. They can be used to both process and extract data
and measures for those specific epochs in time.
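The idea of restricting data to an epoch can be sketched without the library. The `Epoch` class and `samples_in_epoch` function below are hypothetical stand-ins, not the library's LevelEpoch API. They assume an epoch is fully described by a start and an end timestamp, with both bounds inclusive.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Epoch:
    """Hypothetical time period; not the library's LevelEpoch."""

    start: datetime
    end: datetime

    def contains(self, ts: datetime) -> bool:
        # Both bounds are treated as inclusive (an assumption).
        return self.start <= ts <= self.end


def samples_in_epoch(samples, epoch):
    """Keep (timestamp, value) pairs whose timestamp falls inside the epoch."""
    return [(ts, value) for ts, value in samples if epoch.contains(ts)]


# Six samples, one per second, with values 0, 10, ..., 50.
t0 = datetime(2024, 1, 1, 12, 0, 0)
samples = [(t0 + timedelta(seconds=i), i * 10) for i in range(6)]

# Select the samples recorded between second 2 and second 4.
epoch = Epoch(start=t0 + timedelta(seconds=2), end=t0 + timedelta(seconds=4))
selected = samples_in_epoch(samples, epoch)
print([value for _, value in selected])  # [20, 30, 40]
```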
RawDataSet#
A RawDataSet is a data structure that encapsulates a pandas data frame
together with a RawDataSetDefinition, which holds information about the
data source and a short description of the values in the RawDataSet
(see RawDataValueDefinition).
Extracting a RawDataSet#
To access one of the RawDataSets contained in the Level, simply call:
data = level.get_raw_data_set('<id>')
where <id> has to be replaced with the data set id. For the CPS example
with level_id = 'digit_to_symbol', one may use id = 'userInput'.
Minimum working example#
We illustrate how to get a pandas.DataFrame with formatted data from a
JSON file with an example: reading data from a CPS experiment at the level
symbol-to-digit with key level_id = 'digit_to_symbol' and the
dispel.data.raw.RawDataSet with id = 'userInput'.
from dispel.providers.ads.io import read_ads
# path to json
path = "./tests/io/_resources/ads/CPS/example.json"
# read ads
reading = read_ads(path)
# extract dataset
data_set = reading.get_level('digit_to_symbol').get_raw_data_set('userInput')
# get dataframe from data_set
df = data_set.data
Flag#
Flags provide a structured way to mark the validity of entities. Flags can
originate from technical issues (TECHNICAL) with the underlying data capture
or from behavioural aspects (BEHAVIORAL) of subjects not performing tests
according to their respective protocol.
Flags are supported for Readings, Levels, RawDataSets and MeasureValues.
is_valid() indicates whether a particular entity is valid, and get_flags()
retrieves all reasons why a particular entity was flagged.
Flags contain both an identifier and a reason. The id contains
four pieces of information:
- task_name: e.g. CPS etc.
- flag_type: e.g. technical, behavioral etc.
- flag_severity: e.g. deviation or invalidation
- flag_name: e.g. tilt_angle, pressure_range etc.
The reason contains a more descriptive message about the flag.
Flags can be defined as follows:
>>> from dispel.data.values import AbbreviatedValue as AV
>>> from dispel.data.flags import Flag, FlagId, FlagSeverity, FlagType
>>> flag_id = FlagId(
... task_name=AV('Pinch test', 'pinch'),
... flag_name=AV('Tilt angle', 'ta'),
... flag_type=FlagType.BEHAVIORAL,
... flag_severity=FlagSeverity.DEVIATION,
... )
>>> flag_id
pinch-behavioral-deviation-ta
>>> flag = Flag(
... id_=flag_id,
... reason='The tilt angle of the phone is too flat during the Pinch '
... 'test.'
... )
>>> flag
Flag(id=pinch-behavioral-deviation-ta, reason='The tilt angle of the phone is too flat during the ...
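The printed id above joins task name, flag type, flag severity and flag name with dashes. As a minimal sketch, a string of that shape can be split back into its parts. The helper `split_flag_id` and the `FlagIdParts` tuple are hypothetical and not part of the library. They assume that none of the four parts itself contains a dash.

```python
from typing import NamedTuple


class FlagIdParts(NamedTuple):
    """Hypothetical container for the four parts of a printed flag id."""

    task_name: str
    flag_type: str
    flag_severity: str
    flag_name: str


def split_flag_id(flag_id: str) -> FlagIdParts:
    """Split a dash-separated flag id string into its four parts.

    Assumes no part contains a dash; raises ValueError otherwise.
    """
    parts = flag_id.split("-")
    if len(parts) != 4:
        raise ValueError(
            f"expected 4 dash-separated parts, got {len(parts)}: {flag_id!r}"
        )
    return FlagIdParts(*parts)


parts = split_flag_id("pinch-behavioral-deviation-ta")
print(parts.task_name)      # pinch
print(parts.flag_type)      # behavioral
print(parts.flag_severity)  # deviation
print(parts.flag_name)      # ta
```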