An Investigation of Deep Visual Architectures Based on Preprocess Using the Retinal Transform

European Conference on Computer Vision (2020)

Abstract
This work investigates the utility of a biologically motivated software retina model to pre-process and compress visual information prior to training and classification by deep convolutional neural networks (CNNs), in the context of object recognition for robotics and egocentric perception. We captured a dataset of video clips in a standard office environment using a hand-held high-resolution digital camera under uncontrolled illumination. Individual video sequences for each of 20 objects were captured over the observable view hemisphere of each object, and several sequences were captured per object to serve training and validation within an object recognition task. A key objective of this project is to investigate appropriate network architectures for processing retina-transformed input images, and in particular to determine the utility of spatio-temporal CNNs versus simple feed-forward CNNs. A number of different CNN architectures were devised and compared on their classification performance accordingly. The project demonstrated that the image classification task could be conducted with an accuracy exceeding 98% under varying lighting conditions, when the object was viewed from distances similar to those used during training.
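The retinal pre-processing described above can be illustrated with a minimal log-polar resampling sketch: sampling density is high at the image centre (fovea) and falls off logarithmically towards the periphery, which both compresses the input and mimics the retinal photoreceptor layout. This is only an assumed, simplified stand-in for the paper's actual software retina model; the function name, ring/wedge parameters, and nearest-neighbour sampling are illustrative choices, not the authors' implementation.

```python
import numpy as np

def retina_sample(image, n_rings=32, n_wedges=64):
    """Foveated log-polar resampling of an image (hypothetical sketch).

    Rings are spaced logarithmically from the centre outwards, so the
    fovea is sampled densely and the periphery sparsely, compressing an
    HxW image down to an (n_rings, n_wedges) "retinal" image.
    """
    h, w = image.shape[:2]
    cy, cx = h / 2.0, w / 2.0
    r_max = min(cy, cx)
    # Logarithmically spaced ring radii: dense near the fovea.
    radii = np.exp(np.linspace(0.0, np.log(r_max), n_rings))
    thetas = np.linspace(0.0, 2.0 * np.pi, n_wedges, endpoint=False)
    rr, tt = np.meshgrid(radii, thetas, indexing="ij")
    # Nearest-neighbour sample at each (ring, wedge) location.
    ys = np.clip((cy + rr * np.sin(tt)).astype(int), 0, h - 1)
    xs = np.clip((cx + rr * np.cos(tt)).astype(int), 0, w - 1)
    return image[ys, xs]  # shape: (n_rings, n_wedges[, channels])
```

A retinal image of this form could then be fed to a small feed-forward CNN in place of the full-resolution frame; the compression ratio here is (n_rings * n_wedges) / (h * w).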
Key words
Deep learning, Retina, Visual cortex, CNN, Retinal transform, Image classification