M3TCM: Multi-modal Multi-task Context Model for Utterance Classification in Motivational Interviews
International Conference on Language Resources and Evaluation (2024)
Abstract
Accurate utterance classification in motivational interviews is crucial to
automatically understand the quality and dynamics of client-therapist
interaction, and it can serve as a key input for systems mediating such
interactions. Motivational interviews exhibit three important characteristics.
First, there are two distinct roles, namely client and therapist. Second, they
are often highly emotionally charged, which can be expressed both in text and
in prosody. Finally, context is of central importance for classifying any given
utterance. Previous works did not adequately incorporate all of these
characteristics into utterance classification approaches for mental health
dialogues. In contrast, we present M3TCM, a Multi-modal, Multi-task Context
Model for utterance classification. Our approach is the first to employ
multi-task learning to effectively model both joint and individual components
of therapist and client behaviour. Furthermore, M3TCM integrates information
from the text and speech modality as well as the conversation context. With our
novel approach, we outperform the state of the art for utterance classification
on the recently introduced AnnoMI dataset with a relative improvement of 20%
for client and 15% for therapist utterance classification. In
ablation studies, we quantify the improvement resulting from each contribution.