Many Heads but One Brain: Fusion Brain -- a Competition and a Single Multimodal Multitask Architecture

Daria Bakshandaeva, Denis Dimitrov, Alex Shonenkov, Mark Potanin, Vladimir Arkhipkin, Denis Karachev, Vera Davydova, Anton Voronov, Михаил Мартынов, Н. А. Семенова, Mikhail Stepnov, Elena Tutubalina, Andrey Chertok, Aleksandr Petiushko

arXiv (Cornell University) (2021)

Abstract
In line with the current trend in the AI community, we present the AI Journey 2021 Challenge called Fusion Brain, the first competition aimed at building a universal architecture that can process different modalities (in this case, images, texts, and code) and solve multiple vision and language tasks. The Fusion Brain Challenge combines the following specific tasks: Code2code Translation, Handwritten Text Recognition, Zero-shot Object Detection, and Visual Question Answering. We have created datasets for each task to test the participants' submissions. Moreover, we have collected and made publicly available a new handwritten dataset in both English and Russian, which consists of 94,128 pairs of images and texts. We also propose a multimodal, multitask architecture as a baseline solution: it is built around a frozen foundation model and has been trained in Fusion mode as well as in Single-task mode. The proposed Fusion approach proves to be competitive with, and more energy-efficient than, the task-specific one.
Keywords
fusion brain,multimodal multitask architecture,many heads