Cultural Understanding Using In-context Learning and Masked Language Modeling

HCI INTERNATIONAL 2021 - LATE BREAKING PAPERS: MULTIMODALITY, EXTENDED REALITY, AND ARTIFICIAL INTELLIGENCE(2021)

Abstract
With the rapid advancement of natural language processing (NLP) as a sub-field of artificial intelligence (AI), a number of unsupervised pre-trained language models trained on large corpora have become available (e.g., BERT and GPT-3). While these models have tremendous linguistic knowledge, many other types of knowledge are embedded in them as well. We perform cross-cultural analysis experiments using AI-based Masked Language Modeling (MLM) and GPT-based Generative Language Modeling (in-context learning). The designed approach sets up a cultural context in sentences with masked words (for MLM) or in a human-prompted text segment (for GPT-based NLG). Consequently, the predicted masked words or the machine-generated stories reflect measurable intercultural differences, because language models are trained on different corpora in different languages, and on English corpora containing a significant amount of knowledge about foreign cultures. We show a variety of examples: geopolitical knowledge, holidays, gestures, customs, social norms, emotion schema, role schema, procedure schema, and emotion change detection based on a diplomatic speech. The deep learning model encodes its knowledge in the weights of a neural network rather than as organized semantic concepts. The model can reflect biases introduced by the training data and can give us inaccurate or faulty answers. Overall, with the rapid advancement of language technology, pre-trained language models have grown more powerful and have great potential to serve as a culturalization tool.
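The abstract describes probing a pre-trained model by embedding a cultural context in a sentence with a masked slot and inspecting the predicted fill-ins. Below is a minimal sketch of that kind of probe, assuming the Hugging Face transformers library and a multilingual BERT checkpoint; the specific models, prompts, and scoring used by the authors are not given here, so these choices are illustrative only.

```python
# Illustrative MLM-based cultural probe (sketch, not the authors' exact setup).
# Assumes the Hugging Face `transformers` library is installed.
from transformers import pipeline

# Load a fill-mask pipeline; the checkpoint is an assumed, publicly available model.
unmasker = pipeline("fill-mask", model="bert-base-multilingual-cased")

# A masked sentence that embeds a cultural context. The model's top predictions
# for the [MASK] slot reflect knowledge absorbed from its training corpus.
prompt = "In Japan, people celebrate the new year by eating [MASK]."

for candidate in unmasker(prompt, top_k=5):
    print(f"{candidate['token_str']:>12}  (score={candidate['score']:.3f})")
```

Running the same prompt against models trained on corpora in different languages, or rephrasing the cultural context (country, holiday, custom), is how measurable intercultural differences in the predictions would surface; the GPT-based in-context learning probe in the abstract works analogously, with a human-written prompt in place of the masked sentence.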
Keywords
Cultural understanding, Language modeling, Masked language modeling, Generative language modeling, Cross-lingual writing assistance, Culturalization, Localization, Cross-lingual, Inter-culture