Modeling visual search in a thousand scenes: The roles of saliency, target features, and scene context

Krista A. Ehinger, Barbara Hidalgo-Sotelo, Antonio Torralba, Aude Oliva

Journal of Vision (2010)

Abstract

Three sources of guidance have been proposed to explain the deployment of attention during visual search tasks. (1) Saliency reflects the capture of attention by regions of an image that differ from their surroundings in low-level features (e.g., Itti & Koch, 2000). (2) Attention may also be guided towards image regions that look like the search target (Wolfe, 2007); for example, attention may be directed towards red objects when searching for a red-colored target. (3) The context of a scene is also likely to guide attention: in the real world, objects are constrained to appear in particular locations (for example, cars appear on streets), so attention may be guided to these locations during search (Torralba et al., 2007). We attempted to predict human search fixations using computational models of the three sources of guidance (saliency, target features, and scene context) in a large database of human fixation data (14 observers searching for pedestrians in 912 outdoor scenes). When tested individually, each model performed above chance, but scene context provided the best prediction of human fixation locations. A combined model incorporating all three sources of guidance outperformed each of the single-source models, with performance driven predominantly by the context model. The combined model performed at 94% of the level of human agreement in the search task, as measured by the area under the ROC curve. We compared performance of the three-source model of search guidance to an empirically derived model of scene context. For this comparison, a “context oracle” was created by asking human observers to specify the scene region where a target was most likely to appear. This context oracle predicted human fixations as well as the three-source computational model. We discuss the implications of these results for future models of visual search.
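
As an illustration of the evaluation described in the abstract, the following is a minimal sketch of how three guidance maps might be combined and scored against fixations with the area under the ROC curve. The abstract does not state the combination rule, so the weighted product used here is an assumption, and the names `combine_maps` and `fixation_auc` are hypothetical. Fixated pixels are treated as positives and all other pixels as negatives, which is one common convention and not necessarily the authors' exact procedure.

```python
import numpy as np


def combine_maps(saliency, target, context, weights=(1.0, 1.0, 1.0)):
    """Combine three guidance maps into one prediction map.

    Assumption: a weighted product of min-max normalized maps. The
    abstract does not specify how the sources were actually combined.
    """
    combined = np.ones(np.asarray(saliency).shape, dtype=float)
    for m, w in zip((saliency, target, context), weights):
        m = np.asarray(m, dtype=float)
        m = (m - m.min()) / (m.max() - m.min() + 1e-12)  # rescale to [0, 1]
        combined *= (m + 1e-6) ** w  # small offset avoids zeroing out a pixel
    return combined


def fixation_auc(pred_map, fixations):
    """Area under the ROC curve for a prediction map.

    `fixations` is a list of (row, col) pixel coordinates. Fixated pixels
    are positives; every other pixel is a negative. The AUC is computed
    from the Mann-Whitney U statistic, using average ranks for ties.
    """
    pred_map = np.asarray(pred_map, dtype=float)
    labels = np.zeros(pred_map.shape, dtype=bool)
    rows, cols = zip(*fixations)
    labels[list(rows), list(cols)] = True

    scores = pred_map.ravel()
    labels = labels.ravel()

    # Average 1-based rank of each score, with ties sharing a rank.
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)
    ranks = (last_rank - (counts - 1) / 2.0)[inverse]

    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)
```

An AUC of 0.5 corresponds to chance and 1.0 to a map that ranks every fixated pixel above every non-fixated one. A figure such as "94% of human agreement" would additionally require an inter-observer AUC as the reference level, which this sketch does not compute.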

Key words

visual search
