Universal representations:The missing link between faces, text, planktons, and cat breeds
arXiv: Computer Vision and Pattern Recognition(2017)
摘要
With the advent of large labelled datasets and high-capacity models, the
performance of machine vision systems has been improving rapidly. However, the
technology has still major limitations, starting from the fact that different
vision problems are still solved by different models, trained from scratch or
fine-tuned on the target data. The human visual system, in stark contrast,
learns a universal representation for vision in the early life of an
individual. This representation works well for an enormous variety of vision
problems, with little or no change, with the major advantage of requiring
little training data to solve any of them. In this paper we investigate whether
neural networks may work as universal representations by studying their
capacity in relation to the âÂÂsizeâ of a large combination of vision problems.
We do so by showing that a single neural network can learn simultaneously
several very different visual domains (from sketches to planktons and MNIST
digits) as well as, or better than, a number of specialized networks. However,
we also show that this requires to carefully normalize the information in the
network, by using domain-specific scaling factors or, more generically, by
using an instance normalization layer.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要