Second Language Transfer Learning In Humans And Machines Using Image Supervision

2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2019)

Abstract
In the task of language learning, humans exhibit a remarkable ability to learn new words of a foreign language from very few instances of image supervision. The question, therefore, is whether such transfer learning efficiency can be simulated in machines. In this paper, we propose a deep semantic model for transfer learning of words from a foreign language (Japanese) using image supervision. The proposed model is a deep audio-visual correspondence network that uses a proxy-based triplet loss. The model is trained on a large dataset of multi-modal speech/image input in the native language (English). A subset of the parameters of the audio network is then transfer-learned to the foreign-language words using proxy vectors from the image modality. Using this proxy-based learning approach, we show that the proposed machine model achieves transfer learning performance on an image retrieval task that is comparable to human performance. We also present an analysis contrasting the errors made by humans and machines in this task.
Keywords
Multimodal learning, transfer learning, document retrieval, human-machine comparison, distance metric learning
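The abstract describes an audio-visual correspondence model trained with a proxy-based triplet loss, where proxy vectors derived from the image modality supervise the audio embeddings. A minimal sketch of such a loss is shown below, assuming a PyTorch setup; the `ProxyTripletLoss` name, the margin value, and the hardest-negative mining strategy are illustrative assumptions and not details taken from the paper.

```python
# Minimal sketch of a proxy-based triplet loss with image-derived proxies,
# assuming one proxy vector per word class. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProxyTripletLoss(nn.Module):
    def __init__(self, margin: float = 0.2):
        super().__init__()
        self.margin = margin

    def forward(self, audio_emb, proxies, labels):
        # audio_emb: (B, D) speech embeddings from the audio network
        # proxies:   (C, D) image-derived proxy vectors, one per word class
        # labels:    (B,)   class index of each audio example
        audio_emb = F.normalize(audio_emb, dim=1)
        proxies = F.normalize(proxies, dim=1)
        sims = audio_emb @ proxies.t()                 # (B, C) cosine similarities
        pos = sims.gather(1, labels.unsqueeze(1))      # similarity to the correct proxy
        neg_mask = torch.ones_like(sims, dtype=torch.bool)
        neg_mask.scatter_(1, labels.unsqueeze(1), False)
        hardest_neg = sims.masked_fill(~neg_mask, float('-inf')).max(dim=1, keepdim=True).values
        # Triplet hinge: the correct proxy should beat the hardest wrong proxy by a margin.
        return F.relu(self.margin - pos + hardest_neg).mean()
```

In a transfer-learning setup of the kind the abstract outlines, only a subset of the audio network's parameters would be updated with this loss on the foreign-language words, with the remaining layers kept frozen from native-language training; which layers are updated is a design choice not specified here.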