Feature boosting with efficient attention for scene parsing
CoRR (2024)
Abstract
The complexity of scene parsing grows with the number of object and scene
classes, and that number is higher in unrestricted open scenes. The biggest
challenge is to model the spatial relations between scene elements while still
identifying objects at smaller scales. This paper presents a novel
feature-boosting network that gathers spatial context from multiple levels of
feature extraction and computes the attention weights for each level of
representation to generate the final class labels. A novel "channel attention
module" is designed to compute the attention weights, ensuring that features
from the relevant extraction stages are boosted while the others are
attenuated. The model also learns spatial context information at low resolution
to preserve the abstract spatial relationships among scene elements and reduce
computation cost. Spatial attention is subsequently concatenated into a final
feature set before applying feature boosting. Low-resolution spatial attention
features are trained using an auxiliary task that helps the model learn a coarse
global scene structure. The proposed model outperforms all state-of-the-art
models on both the ADE20K and the Cityscapes datasets.
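
The boosting step described above can be illustrated with a minimal NumPy sketch. This is not the paper's module: it assumes a squeeze-and-excitation style gate (global average pooling, a small two-layer projection, a sigmoid) as a stand-in for the channel attention module, and the names `channel_attention`, `upsample_nearest`, `w1`, and `w2`, along with all shapes and random weights, are hypothetical. It only shows the order of operations the abstract states: low-resolution spatial features are concatenated with the multi-level features, and per-channel weights then boost or attenuate the result.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(features, w1, w2):
    """Reweight channels of a (C, H, W) feature map.

    Assumed SE-style gate, not the paper's exact design:
    w1 has shape (C_reduced, C) and w2 has shape (C, C_reduced).
    """
    squeezed = features.mean(axis=(1, 2))      # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)    # ReLU bottleneck -> (C_reduced,)
    weights = sigmoid(w2 @ hidden)             # one weight in (0, 1) per channel
    return features * weights[:, None, None], weights


def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) map by an integer factor."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


rng = np.random.default_rng(0)

# Hypothetical inputs: three extraction stages already resized to 8x8,
# plus a low-resolution (4x4) spatial-context map.
levels = [rng.standard_normal((4, 8, 8)) for _ in range(3)]
spatial_lowres = rng.standard_normal((2, 4, 4))

# Concatenate spatial context with the multi-level features first ...
stacked = np.concatenate(levels + [upsample_nearest(spatial_lowres, 2)], axis=0)

# ... then apply the per-channel boosting weights.
num_channels = stacked.shape[0]
w1 = rng.standard_normal((num_channels // 2, num_channels)) * 0.1
w2 = rng.standard_normal((num_channels, num_channels // 2)) * 0.1
boosted, weights = channel_attention(stacked, w1, w2)
```

In a trained network the projection weights would be learned, so channels from the relevant extraction stages receive weights near 1 while the others are pushed toward 0; here the weights are random and serve only to show the data flow.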