Investigating CLIP Performance for Meta-data Generation in AD Datasets.

Sujan Sai Gannamaneni, Arwin Sadaghiani, Rohil Prakash Rao, Michael Mock, Maram Akila

CVPR Workshops (2023)

Using Machine Learning (ML) models for safety-critical perception tasks in Autonomous Driving (AD) or other domains requires a thorough evaluation of model performance and data coverage with respect to the intended Operational Design Domain (ODD). However, obtaining the required per-image semantic meta-data along the relevant dimensions of the ODD is non-trivial for real-world image datasets. Recent advances in self-supervised foundation models, specifically CLIP, suggest that such meta-data could be obtained for real-world images automatically using zero-shot classification. While CLIP has already been reported to achieve promising performance on tasks such as recognizing gender or age in facial images, we investigate to what extent less prominent and more fine-grained observables, e.g., the presence of accessories such as spectacles, or shirt or hair color, can be determined. We provide an analysis of CLIP for generating fine-grained meta-data on three datasets from the AD domain: one of synthetic origin that includes ground truth, plus Cityscapes and Railsem19. We also compare against a standard facial dataset for which more elaborate attribute annotations are available. To improve the quality of the generated meta-data, we additionally extend the ensemble approach of CLIP with a simple noise-suppressing technique.
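As a rough illustration of zero-shot classification with a prompt ensemble, the following minimal sketch assumes that image and text embeddings have already been computed by a CLIP-like encoder. The function names, the toy 3-dimensional vectors, the attribute labels, and the temperature of 100 are illustrative assumptions and are not taken from the paper. The noise-suppressing extension mentioned in the abstract is not specified there and is not reproduced here.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def ensemble_embedding(prompt_embeddings):
    """Average the unit-normalized embeddings of several prompts
    describing the same class, then re-normalize the mean."""
    unit = [normalize(e) for e in prompt_embeddings]
    dim = len(unit[0])
    mean = [sum(e[i] for e in unit) / len(unit) for i in range(dim)]
    return normalize(mean)

def zero_shot_classify(image_embedding, class_prompt_embeddings, temperature=100.0):
    """Pick the class whose ensembled text embedding has the highest
    cosine similarity to the image embedding.

    class_prompt_embeddings maps a label to a list of prompt embeddings.
    temperature is an assumed logit scale, not a value from the paper.
    """
    img = normalize(image_embedding)
    labels = list(class_prompt_embeddings)
    logits = []
    for label in labels:
        text = ensemble_embedding(class_prompt_embeddings[label])
        cosine = sum(a * b for a, b in zip(img, text))
        logits.append(temperature * cosine)
    # Numerically stable softmax over the class logits.
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    probs = {label: e / total for label, e in zip(labels, exps)}
    return max(probs, key=probs.get), probs

# Toy embeddings standing in for real encoder outputs (hypothetical values).
toy_prompts = {
    "wearing spectacles": [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]],
    "not wearing spectacles": [[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]],
}
label, probs = zero_shot_classify([0.8, 0.2, 0.0], toy_prompts)
print(label, probs)
```

In practice the embeddings would come from a pretrained CLIP image and text encoder, with one set of prompts per attribute value of interest.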
Key words
AD domain, automated fashion, data coverage w.r.t, facial images, fine-grained meta-data, generated meta-data, intended Operational Design Domain, investigating CLIP performance, Machine Learning models, meta-data generation, ODD, other domains, per-image semantic meta-data, real-world image datasets, real-world images, relevant dimensions, safety-critical perception tasks, self-supervised foundation models, standard facial dataset, synthetic origin including ground truth, zero-shot classification