
A computer-vision based approach to co-register frames from egocentric video recordings

Journal of Vision (2022)

Abstract
Wearable eye-trackers enable us to record eye movement dynamics from an egocentric viewpoint. Although the data collected from wearable eye-trackers can index our looking patterns in naturalistic settings, current data analyses focus on the correspondence between fixation locations and individual frames of the egocentric video recording. This approach separates continuous eye movement dynamics into isolated frames, thereby hindering the study of the temporal dynamics of eye movement patterns, such as building computational models to predict eye movements during interpersonal interactions. The challenge arises largely because the recorded data combine eye movement with head/body motion, and there is no reliable method to isolate the head/body motion. Separating eye movement from head/body motion may therefore help computational models focus on features relevant for predicting fixation dynamics. To this end, we adopt methods from computer vision to correct for observers' head/body motion by co-registering frames of videos taken by head-mounted cameras. The end result is a series of images that appear as if taken from a static camera. Toward this goal, we first used deep-learning-based semantic segmentation algorithms to identify stationary objects (e.g., a table and a wall) in each frame. Next, we calculated the dense optic flow between every pair of consecutive frames, using only pixels automatically selected from the stationary objects. The global frame-by-frame movement is then estimated as a series of affine transformations and used to warp and align consecutive frames. We tested our method on eye-tracking data and egocentric videos simultaneously recorded by head-mounted cameras from infants aged 9-18 months exploring a lab environment. Ongoing work is testing existing models of saliency prediction on the aligned videos.
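The abstract does not name the segmentation network used. As one plausible instantiation of the first step, the sketch below uses a pre-trained DeepLabV3 model from torchvision (requires torchvision 0.13+) to produce a boolean mask of pixels belonging to static scene objects; the choice of model and the set of "stationary" Pascal VOC classes are assumptions for illustration only.

import numpy as np
import torch
from torchvision.models.segmentation import deeplabv3_resnet50

# Pre-trained semantic segmentation model (a stand-in; the abstract does
# not specify which network the authors used).
model = deeplabv3_resnet50(weights="DEFAULT").eval()

def stationary_mask(frame_rgb):
    """Return a boolean (H, W) mask of pixels on static scene objects."""
    # Normalize with ImageNet statistics, as the pre-trained model expects.
    x = torch.from_numpy(frame_rgb).permute(2, 0, 1).float() / 255.0
    mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
    std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
    x = ((x - mean) / std).unsqueeze(0)
    with torch.no_grad():
        labels = model(x)["out"].argmax(dim=1)[0].numpy()
    # Hypothetical choice of "stationary" VOC classes: 0 background,
    # 9 chair, 11 dining table, 18 sofa. People, animals, and hand-held
    # objects are excluded from the motion estimate.
    static_classes = [0, 9, 11, 18]
    return np.isin(labels, static_classes)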
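Given such a mask, the remaining steps described in the abstract (dense optic flow between consecutive frames, a single robust affine fit, and warping) map naturally onto standard OpenCV routines. The sketch below is a minimal reading of that pipeline under those assumptions, not the authors' implementation; the function name and parameter settings are hypothetical.

import cv2
import numpy as np

def coregister(prev_frame, next_frame, mask):
    """Estimate global affine motion between two consecutive egocentric
    frames from stationary-object pixels, then warp next_frame back onto
    prev_frame's coordinates."""
    prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    next_gray = cv2.cvtColor(next_frame, cv2.COLOR_BGR2GRAY)

    # Dense optic flow between the two frames (Farneback's method).
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, next_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

    # Keep only flow vectors at pixels labeled stationary, so the fit
    # reflects head/body motion rather than moving people or objects.
    ys, xs = np.nonzero(mask)
    src = np.stack([xs, ys], axis=1).astype(np.float32)
    dst = src + flow[ys, xs]

    # Robustly fit one affine transform mapping next-frame coordinates
    # back to prev-frame coordinates.
    A, _ = cv2.estimateAffine2D(dst, src, method=cv2.RANSAC,
                                ransacReprojThreshold=3.0)

    # Warp so the scene appears as if shot by a static camera.
    h, w = prev_gray.shape
    aligned = cv2.warpAffine(next_frame, A, (w, h))
    return aligned, A

For a full video, the per-pair affine transforms can be composed so that every frame is warped back into the coordinate frame of the first one, yielding the static-camera image series the abstract describes.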
Key words
egocentric video recordings, computer vision, frames, co-register