DMAT: A Dynamic Mask-Aware Transformer for Human De-occlusion
CoRR(2024)
Abstract
Human de-occlusion, which aims to infer the appearance of invisible human
parts from an occluded image, has great value in many human-related tasks, such
as person re-identification and intention inference. To address this task, this
paper proposes a dynamic mask-aware transformer (DMAT), which dynamically
strengthens information from human regions and suppresses information from
occluders. First, to enhance token representation, we design an expanded
convolution head with enlarged kernels, which captures more valid local context
and mitigates the influence of surrounding occlusion. Second, to concentrate on
the visible human parts, we propose a novel dynamic multi-head human-mask guided
attention mechanism that integrates multiple masks and prevents the de-occluded
regions from blending into the background. In addition, a region upsampling
strategy is used to alleviate the impact of occlusion on interpolated images.
During model learning, an amodal loss is introduced to further emphasize the
recovery of human regions, which also improves the model's convergence.
Extensive experiments on the AHP dataset demonstrate that DMAT outperforms
recent state-of-the-art methods.
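
The abstract does not give the formulation of the human-mask guided attention, so the following is only a minimal single-head sketch of the general idea of steering attention toward visible human tokens. It assumes a binary per-token visibility mask and uses a large additive penalty on the attention logits of masked-out keys; the function name `mask_guided_attention`, the `penalty` parameter, and the NumPy implementation are illustrative assumptions, not the paper's method.

```python
import numpy as np

def mask_guided_attention(q, k, v, visible_mask, penalty=1e9):
    """Single-head attention biased toward visible human tokens (illustrative).

    q, k, v:       arrays of shape (n_tokens, dim)
    visible_mask:  array of shape (n_tokens,), 1.0 for visible human tokens,
                   0.0 for occluded or background tokens (assumed binary)
    penalty:       hypothetical constant subtracted from masked-out keys
    """
    dim = q.shape[-1]
    # Scaled dot-product logits, shape (n_tokens, n_tokens)
    logits = q @ k.T / np.sqrt(dim)
    # Push logits of keys outside the visible human region toward -infinity
    logits = logits - penalty * (1.0 - visible_mask)[None, :]
    # Numerically stable softmax over keys
    logits = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Usage: four tokens, of which only the first two are visible human tokens
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(4, 8)) for _ in range(3))
mask = np.array([1.0, 1.0, 0.0, 0.0])
out, weights = mask_guided_attention(q, k, v, mask)
```

In this sketch every query draws its output only from the visible human tokens, which is one simple way to keep occluder and background features out of the recovered regions. A dynamic, multi-head, multi-mask variant as named in the abstract would need details that are not stated here.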