Limitations of Audiovisual Speech on Robots for Second Language Pronunciation Learning

HRI '23: Proceedings of the 2023 ACM/IEEE International Conference on Human-Robot Interaction (2023)

Abstract
The perception of audiovisual speech plays an important role in infants' first language acquisition and continues to be important for language understanding beyond infancy. In adults, too, perceiving speech together with congruent lip motion supports language understanding, and it has been suggested that second language learning benefits from audiovisual speech because it helps learners distinguish speech sounds in the target language. In this paper, we study whether congruent audiovisual speech on a robot facilitates the learning of Japanese pronunciation. Twenty-seven native Dutch-speaking participants were trained in Japanese pronunciation by a social robot. The robot demonstrated 30 Japanese words of varying complexity using either congruent audiovisual speech, incongruent visual speech, or computer-generated audiovisual speech. Participants were asked to imitate the robot's pronunciation, and recordings of their imitations were rated by native Japanese speakers. Contrary to expectation, congruent audiovisual speech resulted in lower pronunciation performance than low-fidelity or incongruent speech. We show that our learners, being native Dutch speakers, are only very weakly sensitive to audiovisual Japanese speech, which possibly explains why learning performance does not seem to benefit from audiovisual speech.