Multistage Deep Transfer Learning for EmIoT-Enabled Human–Computer Interaction

IEEE Internet of Things Journal (2022)

Abstract
The Emotional Internet of Things (EmIoT), which provides Internet of Things (IoT) devices with cognitive and socialization capabilities, has been regarded as a future direction for improving users’ experiences. With the development of intelligent techniques, EmIoT is required not only to sense users’ emotional states but also to provide emotional feedback. Human–computer interaction has been studied to achieve speech interaction with IoT devices. Recent advances in neural text-to-speech (TTS) have made “human parity” synthesized speech possible for IoT-enabled human–computer interaction. Furthermore, emotion control can be achieved by using emotional codes in a unified model, referred to as emotional TTS (ETTS). Such ETTS models have achieved promising emotional expressiveness using large-scale emotion-annotated English data sets; however, they are not practical in IoT environments that use other mainstream languages, especially Chinese. In fact, the scarcity of large-scale emotion-annotated data sets hinders the development of Chinese ETTS. To address this, we propose a multistage deep transfer learning scheme for designing a high-quality Chinese ETTS system from a small-scale training corpus, in order to achieve EmIoT in Mandarin environments. In this scheme, the knowledge pretrained in the earlier stages, which correspond to a large-scale neutral English corpus and a medium-scale emotional English corpus, is transferred to a Mandarin ETTS model. The trained model can thereby produce high-quality emotional speech from a limited emotional corpus and can serve various EmIoT-oriented applications. Experiments demonstrate the effectiveness and superiority of the proposed model over its counterparts in terms of naturalness and emotional expressiveness. Readers can visit our demo webpage to listen to the synthesized speech samples.
Keywords
Emotional expressiveness, emotional Internet of Things (EmIoT), human–computer interaction (HCI), transfer learning