
TransIFC: Invariant Cues-aware Feature Concentration Learning for Efficient Fine-grained Bird Image Classification

IEEE Transactions on Multimedia (2023)

Abstract
Fine-grained bird image classification (FBIC) is not only meaningful for endangered bird observation and protection but also a prevalent image classification task in multimedia processing and computer vision. However, FBIC faces several challenges, such as bird molting, complex backgrounds, and arbitrary bird postures. To tackle these challenges effectively, we present a novel invariant cues-aware feature concentration Transformer (TransIFC), which learns invariant and core information in bird images. To this end, two novel modules are proposed to leverage the characteristics of bird images: the hierarchy stage feature aggregation (HSFA) module and the feature in feature abstraction (FFA) module. The HSFA module aggregates the multiscale information of bird images by concatenating multilayer features. The FFA module extracts the invariant cues of birds through feature selection based on discrimination scores. A Transformer is employed as the backbone to capture long-range semantic dependencies in bird images. Moreover, abundant visualizations are provided to illustrate the interpretability of the HSFA and FFA modules in TransIFC. Comprehensive experiments demonstrate that TransIFC achieves state-of-the-art performance on the CUB-200-2011 dataset (91.0%) and the NABirds dataset (90.9%). Finally, extended experiments on the Stanford Cars dataset suggest that the method has the potential to generalize to other fine-grained visual classification tasks.
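
The two operations named in the abstract, concatenating multilayer features and selecting features by score, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the discrimination scores are computed or how many features are kept, so the scores and `k` are plain inputs here, and the function names `aggregate_stages` and `select_top_k` are hypothetical.

```python
from typing import List, Sequence

Token = List[float]


def aggregate_stages(stages: Sequence[Sequence[Token]]) -> List[Token]:
    """Concatenate per-token features from several stages along the channel axis.

    Assumption: every stage yields the same number of tokens.
    """
    n_tokens = len(stages[0])
    if any(len(stage) != n_tokens for stage in stages):
        raise ValueError("all stages must have the same number of tokens")
    return [
        [value for stage in stages for value in stage[i]]
        for i in range(n_tokens)
    ]


def select_top_k(tokens: Sequence[Token], scores: Sequence[float], k: int) -> List[Token]:
    """Keep the k highest-scoring tokens, preserving their original order.

    Assumption: the scores are supplied by the caller (one per token).
    """
    if len(tokens) != len(scores):
        raise ValueError("need exactly one score per token")
    top = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:k]
    return [tokens[i] for i in sorted(top)]


# Toy example: two stages, three tokens each.
stage_a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
stage_b = [[0.1], [0.2], [0.3]]
fused = aggregate_stages([stage_a, stage_b])
kept = select_top_k(fused, scores=[0.2, 0.9, 0.5], k=2)
```

In this toy run, each fused token has three channels (two from the first stage, one from the second), and the two tokens with the highest scores are retained.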
Keywords
Feature extraction, Image classification, Invariant cues, Transformer, Deep learning