Video-Based Multi-Camera Vehicle Tracking via Appearance-Parsing Spatio-Temporal Trajectory Matching Network

IEEE Transactions on Circuits and Systems for Video Technology (2024)

Multi-camera vehicle tracking is a fundamental task in city traffic management, used to count traffic flow and monitor roads. This paper focuses on multi-camera tracking on the highway, which is more challenging than tracking on city streets because of fast-moving vehicles, small vehicles with similar appearance, longer tracking distances, and lighting changes in dark tunnels. We propose a practical Appearance-Parsing Spatio-Temporal Trajectory Matching Network (ASTM-Net), based on global appearance matching of local trajectories, to address cross-camera tracking on the highway. Specifically, to cope with environmental disturbances and the similar appearance of small vehicles, we propose a multiple appearance-attribute parsing (MAP) module, consisting of a Bi-propagation top-down (Bi-TD) block and an appearance re-identification (ARe-ID) block, to obtain salient global appearance-attribute features from a given video sequence. To address discrete tracking fragments caused by occlusion, we develop an appearance-joint-tracking (AJT) mechanism that merges isolated tracklets through target interaction and occlusion handling. We then exploit an appearance-informed spatio-temporal matching (ASTM) module to achieve multi-camera tracklet-to-target assignment, which employs spatio-temporal consistency relations for intra-camera trajectory correction and coarse inter-camera tracklet correlation, and an aggregated appearance matrix of local trajectories for assigning global trajectory IDs. Finally, to evaluate the proposed ASTM-Net, we establish a new dataset, named HST, collected on the highway. We verify ASTM-Net on HST and three public datasets, i.e., CityFlow, UA-DETRAC, and Synthehicle; the experimental results demonstrate the effectiveness and robustness of the proposed method.
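The general idea of combining a coarse spatio-temporal constraint with appearance matching for inter-camera tracklet association can be illustrated with a minimal sketch. This is not the ASTM-Net implementation: the function name `match_tracklets`, the dictionary keys `t_exit`, `t_enter`, and `feature`, the travel-time window, the similarity threshold, and the greedy assignment strategy are all assumptions made here for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_tracklets(cam_a, cam_b, min_travel, max_travel, min_similarity=0.5):
    """Greedy one-to-one matching of tracklets between two cameras.

    Hypothetical input format (an assumption of this sketch):
      cam_a: list of dicts with "t_exit" (seconds) and "feature" (list of floats)
      cam_b: list of dicts with "t_enter" (seconds) and "feature" (list of floats)

    A pair is a candidate only if the travel time from camera A to camera B
    lies in [min_travel, max_travel] (the coarse spatio-temporal gate) and the
    appearance similarity is at least min_similarity.
    Returns a dict mapping index in cam_a -> index in cam_b.
    """
    candidates = []
    for i, ta in enumerate(cam_a):
        for j, tb in enumerate(cam_b):
            travel = tb["t_enter"] - ta["t_exit"]
            if not (min_travel <= travel <= max_travel):
                continue  # implausible travel time between the two cameras
            sim = cosine_similarity(ta["feature"], tb["feature"])
            if sim >= min_similarity:
                candidates.append((sim, i, j))

    # Assign the most similar pairs first; each tracklet is used at most once.
    candidates.sort(reverse=True)
    used_a, used_b, matches = set(), set(), {}
    for sim, i, j in candidates:
        if i in used_a or j in used_b:
            continue
        used_a.add(i)
        used_b.add(j)
        matches[i] = j
    return matches
```

Matched pairs would then share one global trajectory ID, while unmatched tracklets would keep their own. A greedy pass is used here only for brevity; an optimal assignment solver could replace it without changing the gating logic.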
Multi-camera tracking, appearance-attribute, spatio-temporal consistency, tracklet-to-target assignment