Prompt Learning for Oriented Power Transmission Tower Detection in High-Resolution SAR Images
arXiv (2024)
Abstract
Detecting power transmission towers in synthetic aperture radar (SAR) images
remains challenging because of the towers' comparatively small size and the
side-looking imaging geometry, and background clutter frequently hinders tower
identification: a large number of interfering signals are superimposed on the
return signal from the tower. We found that localizing or prompting the positions
of power transmission towers helps to overcome this obstacle. Based on this
observation, this paper introduces prompt learning into an oriented object
detector (P2Det) for multimodal information learning. P2Det combines sparse
prompt encoding with cross-attention between the multimodal data. Specifically,
the sparse prompt encoder (SPE) is proposed to represent point locations,
converting prompts into sparse embeddings. The image embeddings are generated
through the Transformer layers. Then a two-way fusion module (TWFM) is proposed
to calculate the cross-attention of the two different embeddings. The
interaction of image-level and prompt-level features is utilized to address the
clutter interference. A shape-adaptive refinement module (SARM) is proposed to
reduce the effect of aspect ratio. Extensive experiments on high-resolution SAR
images demonstrate the effectiveness of the proposed model. With its competitive
performance, P2Det offers a new perspective on multimodal object detection.
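
The interaction between prompt embeddings and image embeddings described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how the SPE encodes points or how the TWFM orders its attention steps, so the random Fourier point encoding, the single-head attention without learned projections, the residual connections, and all function names (`encode_points`, `cross_attention`, `two_way_fusion`) are assumptions made purely for illustration.

```python
import numpy as np


def encode_points(points, image_size, dim, rng):
    """Turn (x, y) prompt points into sparse embeddings.

    Assumption: random Fourier features of normalized coordinates;
    the abstract does not state the actual encoding used by the SPE.
    """
    height, width = image_size
    coords = np.asarray(points, dtype=np.float64) / np.array([width, height], dtype=np.float64)
    projection = rng.normal(size=(2, dim // 2))       # hypothetical fixed projection
    angles = 2.0 * np.pi * coords @ projection        # (n_points, dim // 2)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention(queries, keys_values):
    """Single-head scaled dot-product attention without learned weights."""
    scale = np.sqrt(queries.shape[-1])
    weights = softmax(queries @ keys_values.T / scale)
    return weights @ keys_values


def two_way_fusion(prompt_emb, image_emb):
    """Assumed two-way scheme: prompts attend to the image, then the image attends to the prompts."""
    prompt_out = prompt_emb + cross_attention(prompt_emb, image_emb)
    image_out = image_emb + cross_attention(image_emb, prompt_out)
    return prompt_out, image_out


rng = np.random.default_rng(0)
# Two hypothetical tower locations in a 1024 x 1024 image.
prompt_emb = encode_points([(120, 340), (800, 96)], image_size=(1024, 1024), dim=64, rng=rng)
# Stand-in for Transformer image embeddings: a 16 x 16 grid of 64-dimensional tokens.
image_emb = rng.normal(size=(16 * 16, 64))
prompt_out, image_out = two_way_fusion(prompt_emb, image_emb)
print(prompt_out.shape, image_out.shape)  # (2, 64) (256, 64)
```

In this sketch each direction of attention lets one modality reweight the other: the prompt tokens gather image context around the indicated positions, and the image tokens are then updated with prompt information, which is the kind of image-level and prompt-level interaction the abstract credits with suppressing clutter.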