Record
HMM representation of scan paths reveals consistently more and less efficient visual strategies for task recognition
Description
Understanding how people look at videos of humans performing a task lies at the crossroads of several fields, including psychology, neurology, and imitation learning. Machine learning models that predict visual attention are often black boxes that are difficult to translate into a usable understanding of human behaviour. Hidden Markov models (HMMs) are statistical yet interpretable models, capable of capturing the tendencies of a large number of scan paths, clustering them for similar stimuli, co-clustering them across different stimuli for the same observers, and using the Viterbi algorithm to output the most probable hidden state sequences. In this work, we adapted the approach for analysing eye movements with Hidden Markov Models (EMHMM) to static eye tracking of videos that show humans performing simple tasks on a conveyor belt. This adaptation sets the hidden states as regions of interest in the underlying scene, which makes the outputs of the Viterbi algorithm and the properties of the learned HMMs human-interpretable. We uncover efficient and inefficient visual strategies for task recognition among observers. Groups with a similar strategy proved to be consistently more efficient across different stimuli. We use an entropy measure to classify strategies as efficient (active) or inefficient (passive). By applying HMMs to eye-tracking data, we reveal distinct patterns of visual exploration that correlate with task efficiency. These results can help in understanding cognitive behaviour and in applying eye-movement datasets to improve the performance and/or learning speed of machine learning models for task recognition. Because HMMs are generative, they also allow targeted data augmentation, for example by increasing the weight of efficient, active scan paths in the hope of improving results in subsequent applications.
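To illustrate the two ideas the description relies on, the following is a minimal sketch, not the authors' implementation. It decodes a scan path into a sequence of regions of interest with the Viterbi algorithm and then computes a Shannon entropy over the decoded sequence. Everything specific here is an assumption made for illustration: the three region centres, the isotropic Gaussian emission model with a shared `sigma`, the transition matrix, and the choice of occupancy entropy (the description does not say which entropy measure is used).

```python
import math
from collections import Counter


def viterbi(fixations, centres, sigma, start, trans):
    """Return the most probable sequence of ROI indices for (x, y) fixations.

    Assumes strictly positive start and transition probabilities, since
    log(0) is undefined.
    """
    n = len(centres)

    def log_emit(state, point):
        # Isotropic Gaussian log-density, up to a constant shared by all states.
        dx = point[0] - centres[state][0]
        dy = point[1] - centres[state][1]
        return -(dx * dx + dy * dy) / (2.0 * sigma * sigma)

    # score[s] is the best log-probability of any path ending in state s.
    score = [math.log(start[s]) + log_emit(s, fixations[0]) for s in range(n)]
    back = []
    for point in fixations[1:]:
        prev, score, ptr = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + math.log(trans[r][s]))
            ptr.append(best)
            score.append(prev[best] + math.log(trans[best][s]) + log_emit(s, point))
        back.append(ptr)

    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def occupancy_entropy(path):
    """Shannon entropy (bits) of how often each ROI appears in the path."""
    total = len(path)
    return -sum((c / total) * math.log2(c / total) for c in Counter(path).values())


# Hypothetical ROIs (pixel coordinates), e.g. hands, object, conveyor belt.
centres = [(100, 100), (300, 100), (200, 300)]
start = [1 / 3, 1 / 3, 1 / 3]
trans = [[0.6, 0.2, 0.2], [0.2, 0.6, 0.2], [0.2, 0.2, 0.6]]

# Made-up scan path that visits all three regions.
fixations = [(105, 98), (110, 95), (290, 110), (305, 102), (198, 295), (102, 105)]
path = viterbi(fixations, centres, sigma=40.0, start=start, trans=trans)
entropy = occupancy_entropy(path)
print(path, round(entropy, 3))
```

Under this assumed measure, a scan path that stays on a single region has zero entropy, while one that moves between regions scores higher. Whether higher entropy corresponds to the "active" strategies described above depends on the measure actually used in the work.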