
A computer-vision based approach to co-register frames from egocentric video recordings

Journal of Vision (2022)

Abstract
Wearable eye-trackers enable us to record eye movement dynamics from an egocentric viewpoint. Although the data collected from wearable eye-trackers can index our looking patterns in naturalistic settings, current data analyses focus on the correspondence between fixation locations and individual frames of the egocentric video recording. This approach separates continuous eye movement dynamics into isolated frames, thereby hindering the study of the temporal dynamics of eye movement patterns, such as building computational models to predict eye movements during interpersonal interactions. The challenge arises largely because the recorded data combine eye movement with head/body motion, and there is no reliable method to isolate the head/body motion. Separating eye movement from head/body motion may therefore help computational models focus on features relevant for predicting fixation dynamics. To this end, we adopt methods from computer vision to correct for observers' head/body motion by co-registering frames of videos taken by head-mounted cameras. The end result is a series of images that appear as if taken from a static camera. Toward this goal, we first used deep-learning-based semantic segmentation algorithms to identify stationary objects (e.g., a table and a wall) in each frame. Next, we calculated the dense optic flow between every pair of consecutive frames, using only pixels automatically selected from the stationary objects. The global frame-by-frame movement is then estimated as a series of affine transformations and used to warp and align consecutive frames. We tested our method on eye-tracking data and egocentric videos simultaneously recorded by head-mounted cameras from infants aged 9-18 months exploring a lab environment. Ongoing work is testing existing models of saliency prediction on the aligned videos.
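The abstract does not name the segmentation network used. As one plausible instantiation of the first step, the sketch below uses a pre-trained DeepLabV3 model from torchvision (requires torchvision 0.13+) to produce a boolean mask of pixels belonging to static scene objects; the choice of model and the set of "stationary" Pascal VOC classes are assumptions for illustration only.

import numpy as np
import torch
from torchvision.models.segmentation import deeplabv3_resnet50

# Pre-trained semantic segmentation model (a stand-in; the abstract does
# not specify which network the authors used).
model = deeplabv3_resnet50(weights="DEFAULT").eval()

def stationary_mask(frame_rgb):
    """Return a boolean (H, W) mask of pixels on static scene objects."""
    # Normalize with ImageNet statistics, as the pre-trained model expects.
    x = torch.from_numpy(frame_rgb).permute(2, 0, 1).float() / 255.0
    mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
    std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
    x = ((x - mean) / std).unsqueeze(0)
    with torch.no_grad():
        labels = model(x)["out"].argmax(dim=1)[0].numpy()
    # Hypothetical choice of "stationary" VOC classes: 0 background,
    # 9 chair, 11 dining table, 18 sofa. People, animals, and hand-held
    # objects are excluded from the motion estimate.
    static_classes = [0, 9, 11, 18]
    return np.isin(labels, static_classes)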
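Given such a mask, the remaining steps described in the abstract (dense optic flow between consecutive frames, a single robust affine fit, and warping) map naturally onto standard OpenCV routines. The sketch below is a minimal reading of that pipeline under those assumptions, not the authors' implementation; the function name and parameter settings are hypothetical.

import cv2
import numpy as np

def coregister(prev_frame, next_frame, mask):
    """Estimate global affine motion between two consecutive egocentric
    frames from stationary-object pixels, then warp next_frame back onto
    prev_frame's coordinates."""
    prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    next_gray = cv2.cvtColor(next_frame, cv2.COLOR_BGR2GRAY)

    # Dense optic flow between the two frames (Farneback's method).
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, next_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

    # Keep only flow vectors at pixels labeled stationary, so the fit
    # reflects head/body motion rather than moving people or objects.
    ys, xs = np.nonzero(mask)
    src = np.stack([xs, ys], axis=1).astype(np.float32)
    dst = src + flow[ys, xs]

    # Robustly fit one affine transform mapping next-frame coordinates
    # back to prev-frame coordinates.
    A, _ = cv2.estimateAffine2D(dst, src, method=cv2.RANSAC,
                                ransacReprojThreshold=3.0)

    # Warp so the scene appears as if shot by a static camera.
    h, w = prev_gray.shape
    aligned = cv2.warpAffine(next_frame, A, (w, h))
    return aligned, A

For a full video, the per-pair affine transforms can be composed so that every frame is warped back into the coordinate frame of the first one, yielding the static-camera image series the abstract describes.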
Key words
egocentric video recordings, computer vision, frames, co-register