Aggregate emotion analysis


This document covers the technical details of the analysis process and its outputs. It does not describe the psychological interpretation of those outputs.

You may want to start with a background on Stim Engine technology.


Emotions are recorded as a set of 6 “facial coding action units” (AUs), currently sadness, fear, disgust, puzzlement, surprise and happiness, sampled approximately 16-20 times per second. For example, we might have a video of 30 seconds measured every 200 milliseconds (after resampling, see below). This gives 30*(1000/200) = 150 samples per AU, so the data for each person form a matrix of 6 x 150 values.
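
The arithmetic in the example above can be checked with a short sketch. The variable names are illustrative only and are not part of the Stim Engine output format.

```python
# Worked example of the per-person data size described above.
# All names here are hypothetical, chosen for illustration.
NUM_AUS = 6            # sadness, fear, disgust, puzzlement, surprise, happiness
video_seconds = 30     # length of the example video
interval_ms = 200      # fixed interval after resampling

samples_per_au = video_seconds * 1000 // interval_ms
shape = (NUM_AUS, samples_per_au)
print(shape)  # (6, 150)
```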

Post-Processing Pipeline

The pipeline is the sequence of processing steps needed to complete the analysis. It runs on a segment of the participants selected according to user-defined criteria.

Individual Pipeline

Each individual timeseries is subject to the following processing steps:

  1. Windowing: We exclude data outside the bounds of stim presentation.
  2. Fixed interval resampling: Since the camera provides measurements at a variable rate, we resample all results to a fixed rate so that all individuals contribute equally to the final analysis at all times. We average the samples within each interval to produce the resampled values.
  3. Normalization, binarization & aggregation: We compute statistics about each timeseries, which lets us express each sample as a standard score. Based on that score, we reduce each sample to a binary (yes/no) decision as to whether each AU is being expressed at that moment. Binarization is an important precursor to aggregation. For more recent data, where we use the Convolutional Neural Network (CNN) algorithm, the output is already normalized with a known distribution on the training data, so we can apply a fixed threshold to binarize the output at a particular sensitivity/specificity point. We can optionally use a specific time window for the normalization, such as exposure to stimuli of known emotional content (e.g. a relaxing beach scene).
  4. Smoothing: We also smooth the standard scores. Both the original and the smoothed data are available.
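
The individual steps above can be sketched as follows for a single AU of a single participant. This is a minimal illustration, not the production implementation: the function names, the use of the population standard deviation, the threshold of 1.0, the 3-sample moving average and the sample data are all assumptions made for the example.

```python
from statistics import mean, pstdev

def window(samples, start_ms, end_ms):
    """Keep only (timestamp_ms, value) samples inside the stim presentation."""
    return [(t, v) for t, v in samples if start_ms <= t < end_ms]

def resample(samples, start_ms, end_ms, interval_ms=200):
    """Average the samples in each fixed interval; None marks an empty interval."""
    n_bins = (end_ms - start_ms) // interval_ms
    bins = [[] for _ in range(n_bins)]
    for t, v in samples:
        idx = int((t - start_ms) // interval_ms)
        if 0 <= idx < n_bins:
            bins[idx].append(v)
    return [mean(b) if b else None for b in bins]

def standard_scores(values, baseline=None):
    """Z-score each value against `baseline` (default: the series itself)."""
    ref = [v for v in (values if baseline is None else baseline) if v is not None]
    mu, sigma = mean(ref), pstdev(ref)
    if sigma == 0:
        return [None if v is None else 0.0 for v in values]
    return [None if v is None else (v - mu) / sigma for v in values]

def binarize(scores, threshold=1.0):
    """1 where the score exceeds the (assumed) threshold, else 0."""
    return [1 if s is not None and s > threshold else 0 for s in scores]

def smooth(values, width=3):
    """Centered moving average that skips missing values."""
    half = width // 2
    out = []
    for i in range(len(values)):
        near = [v for v in values[max(0, i - half): i + half + 1] if v is not None]
        out.append(mean(near) if near else None)
    return out

# Made-up camera samples (timestamp in ms, AU intensity) at an irregular rate.
raw = [(-50, 0.4), (0, 0.1), (90, 0.3), (210, 0.2), (390, 0.2),
       (450, 0.9), (520, 0.8), (700, 0.1), (830, 0.5)]

windowed = window(raw, 0, 800)            # step 1: drops the first and last sample
resampled = resample(windowed, 0, 800)    # step 2: four 200 ms intervals
scores = standard_scores(resampled)       # step 3: normalization
expressed = binarize(scores)              # step 3: binarization
smoothed = smooth(scores)                 # step 4: smoothing
print(expressed)  # [0, 0, 1, 0]
```

To normalize against a specific time window, such as a relaxing baseline stimulus, the resampled values from that window could be passed as the `baseline` argument. For CNN output that is already normalized, `binarize` could be applied directly with a fixed threshold.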

Aggregate Pipeline

The aggregate pipeline converts the binary classification of AU expression into a population frequency over time (i.e. what fraction of the population expressed a response at each moment). Importantly, we then look at changes in population frequency for each AU over time.

The following steps are performed:

  • Summation: We sum the binary AU classification for all participants at each time.
  • Frequency: We calculate the fraction of the population with nonzero AU expression for each AU at each time.
  • Smoothing: We smooth the frequency measurement.
  • Relative: We measure the frequency of expression relative to a moving baseline. This highlights changes over time.
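
The aggregate steps above can be sketched as follows for a single AU. The source does not specify the smoothing kernel or how the moving baseline is computed, so the centered moving average and the trailing-mean baseline used here are assumptions, as are all function names and the sample data.

```python
def population_frequency(binary_by_person):
    """Sum the binary classifications at each time, then divide by population size."""
    n_people = len(binary_by_person)
    counts = [sum(column) for column in zip(*binary_by_person)]  # summation
    return [c / n_people for c in counts]                        # frequency

def moving_average(values, width=3):
    """Centered moving average (assumed smoothing kernel)."""
    half = width // 2
    out = []
    for i in range(len(values)):
        near = values[max(0, i - half): i + half + 1]
        out.append(sum(near) / len(near))
    return out

def relative_to_baseline(freq, baseline_width=2):
    """Frequency minus the mean of the preceding samples (assumed moving baseline)."""
    out = []
    for i, f in enumerate(freq):
        past = freq[max(0, i - baseline_width): i]
        out.append((f - sum(past) / len(past)) if past else 0.0)
    return out

# Made-up binary expression of one AU for three participants over five time steps.
population = [
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

frequency = population_frequency(population)    # fractions: 0, 1/3, 1, 1/3, 0
smoothed_frequency = moving_average(frequency)
relative = relative_to_baseline(frequency)      # largest where expression rises
```

In this example every participant expresses the AU at the third time step, so the frequency there is 1.0 and the relative measure peaks at the same point, since the preceding baseline is low.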


We summarize the entire result for a single stim as an “E-Score” (or Emotion-Score). The score has 3 components:

  • Eyes
  • Emotions
  • Effects

See the E-Score page for more information.

Interactive dataset results

All the outputs produced here, plus heatmaps, video and AOI statistics, will be used in our interactive results interface. Once you have created a Dataset, you can explore the results using the Analysis section of the Stim Admin interface.