
Are These the Same Apple? Comparing Images Based on Object Intrinsics

NeurIPS(2023)

Abstract
The human visual system can effortlessly recognize an object under different extrinsic factors such as lighting, object poses, and background, yet current computer vision systems often struggle with these variations. An important step to understanding and improving artificial vision systems is to measure image similarity purely based on intrinsic object properties that define object identity. This problem has been studied in the computer vision literature as re-identification, though mostly restricted to specific object categories such as people and cars. We propose to extend it to general object categories, exploring an image similarity metric based on object intrinsics. To benchmark such measurements, we collect the Common paired objects Under differenT Extrinsics (CUTE) dataset of 18,000 images of 180 objects under different extrinsic factors such as lighting, poses, and imaging conditions. While existing methods such as LPIPS and CLIP scores do not measure object intrinsics well, we find that combining deep features learned from contrastive self-supervised learning with foreground filtering is a simple yet effective approach to approximating the similarity. We conduct an extensive survey of pre-trained features and foreground extraction methods to arrive at a strong baseline that best measures intrinsic object-centric image similarity among current methods. Finally, we demonstrate that our approach can aid in downstream applications such as acting as an analog for human subjects and improving generalizable re-identification. Please see our project website at https://s-tian.github.io/projects/cute/ for visualizations of the data and demos of our metric.
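The general idea of pairing deep features with foreground filtering can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the feature extractor, the segmentation method, or the pooling scheme, so the code below assumes per-patch feature vectors and a binary foreground mask are already available, and it assumes mean pooling followed by cosine similarity. All function names (`masked_mean`, `cosine`, `intrinsic_similarity`) are hypothetical.

```python
import math


def masked_mean(features, mask):
    """Average only the patch feature vectors flagged as foreground."""
    kept = [f for f, keep in zip(features, mask) if keep]
    if not kept:
        raise ValueError("mask selects no foreground patches")
    dim = len(kept[0])
    return [sum(f[i] for f in kept) / len(kept) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        raise ValueError("cannot compare a zero vector")
    return dot / norm


def intrinsic_similarity(feats_a, mask_a, feats_b, mask_b):
    """Assumed scheme: foreground-only mean pooling, then cosine similarity."""
    return cosine(masked_mean(feats_a, mask_a), masked_mean(feats_b, mask_b))


# Toy data: the same "object" direction [x, 0] on very different backgrounds.
feats_a = [[1.0, 0.0], [1.0, 0.0], [0.0, 5.0]]
mask_a = [True, True, False]   # last patch is background
feats_b = [[2.0, 0.0], [0.0, -5.0], [2.0, 0.0]]
mask_b = [True, False, True]   # middle patch is background

with_filter = intrinsic_similarity(feats_a, mask_a, feats_b, mask_b)
without_filter = intrinsic_similarity(
    feats_a, [True] * 3, feats_b, [True] * 3
)
print(with_filter, without_filter)
```

In this toy data the background patches point in opposite directions, so pooling over all patches makes the two images look dissimilar, while pooling over foreground patches only recovers the shared object direction. In practice the features would come from a pre-trained network and the masks from a foreground extraction method, both of which the paper surveys.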
Keywords
same apple, images