Gesture Motion Graphs for Few-Shot Speech-Driven Gesture Reenactment

PROCEEDINGS OF THE 25TH INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, ICMI 2023(2023)

引用 1|浏览6
暂无评分
摘要
This paper presents the CASIA-GO entry to the Generation and Evaluation of Non-verbal Behaviour for Embedded Agents (GENEA) Challenge 2023. The system is originally designed for few-shot scenarios such as generating gestures with the style of any in-the-wild target speaker from short speech samples. Given a group of reference speech data including gesture sequences, audio, and text, it frst constructs a gesture motion graph that describes the soft gesture units and interframe continuity inside the speech, which is ready to be used for new rhythmic and semantic gesture reenactment by pathfnding when test audio and text are provided. We randomly choose one clip from the training data for one test clip to simulate a few-shot scenario and provide compatible results for subjective evaluations. Despite the 0.25% average utilization of the whole training set for each clip in the test set and the 17.5% total utilization of the training set for the whole test set, the system succeeds in providing valid results and ranks in the top 1/3 in the appropriateness for agent speech evaluation.
更多
查看译文
关键词
speech-driven gesture generation,motion graph,few-shot
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要