MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders
arXiv (2024)
Abstract
Multi-task dense scene understanding, which learns a model for multiple dense
prediction tasks, has a wide range of application scenarios. Modeling
long-range dependency and enhancing cross-task interactions are crucial to
multi-task dense prediction. In this paper, we propose MTMamba, a novel
Mamba-based architecture for multi-task scene understanding. It contains two
types of core blocks: self-task Mamba (STM) block and cross-task Mamba (CTM)
block. STM handles long-range dependency by leveraging Mamba, while CTM
explicitly models task interactions to facilitate information exchange across
tasks. Experiments on NYUDv2 and PASCAL-Context datasets demonstrate the
superior performance of MTMamba over Transformer-based and CNN-based methods.
Notably, on the PASCAL-Context dataset, MTMamba achieves improvements of +2.08,
+5.01, and +4.90 over the previous best method in the tasks of semantic
segmentation, human parsing, and object boundary detection, respectively. The
code is available at .
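The decoder design described above can be sketched schematically. The following is a minimal stand-in, not the paper's actual method: the STM block is approximated by a simple fixed-decay linear recurrence (real Mamba uses input-dependent, learned state-space parameters and a selective scan), and the CTM block by averaging task features into a shared representation that is mixed back into each task.

```python
def stm_block(seq, decay=0.9):
    """Self-task stand-in: linear recurrence h_t = decay*h_{t-1} + x_t.

    This mimics how an SSM scan carries information across long ranges;
    the fixed `decay` replaces Mamba's learned, input-dependent dynamics.
    """
    h = 0.0
    out = []
    for x in seq:
        h = decay * h + x
        out.append(h)
    return out


def ctm_block(task_feats):
    """Cross-task stand-in: fuse a per-position mean across all tasks
    back into each task's features, so information is exchanged."""
    n = len(task_feats)
    length = len(next(iter(task_feats.values())))
    shared = [sum(f[i] for f in task_feats.values()) / n for i in range(length)]
    return {
        name: [0.5 * f[i] + 0.5 * shared[i] for i in range(length)]
        for name, f in task_feats.items()
    }


# Toy pipeline over three dense-prediction tasks (names illustrative only):
feats = {t: [1.0] * 8 for t in ("semseg", "parsing", "boundary")}
feats = {t: stm_block(f) for t, f in feats.items()}  # per-task long-range modeling
feats = ctm_block(feats)                             # cross-task interaction
```

This only illustrates the dataflow (per-task sequence modeling followed by explicit cross-task fusion); the paper's blocks operate on 2D feature maps with learned parameters.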