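Lu Guozhi, Peng Dongliang, Gu Yu (Fundamental Science on Communication Information Transmission and Fusion Technology Laboratory, Hangzhou Dianzi University, Hangzhou 310018)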
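Objective To improve the robustness of target tracking, a robust correlation filtering-based tracking algorithm using multifeature hierarchical fusion is proposed to address the multifeature fusion problem in correlation filter tracking. Method When the multichannel correlation filtering algorithm is used for tracking, three features, namely HOG (histogram of oriented gradient), CN (color names), and the color histogram, are extracted from the target and the surrounding background region. The proposed hierarchical fusion algorithm first fuses the response maps of the HOG and CN features with an adaptive weighted fusion strategy, in which the fusion weights are obtained from two indicators of the response maps: the smooth constraint and the peak-to-sidelobe ratio. At the second layer, the result of the first layer is fused with the response map obtained from the color histogram feature using a fixed-coefficient fusion strategy. Finally, the target position is estimated from the fused response map, and a scale estimation algorithm is used to obtain a more accurate bounding box of the target. Result The performance of the proposed tracking algorithm is verified on the public OTB-2013 (object tracking benchmark 2013) and VOT-2014 (visual object tracking 2014) test sets. After the parameters of the multifeature hierarchical fusion are analyzed, the algorithm is compared with five mainstream correlation filtering-based tracking algorithms. The experimental results show that the tracking precision of the proposed algorithm is improved: its typical precision value is higher than that of the Staple algorithm by 5.9 percentage points (0.840 vs 0.781). Because the three features are fused effectively, its tracking robustness is also better than that of the other algorithms in a variety of scenarios. Conclusion The proposed multifeature hierarchical fusion tracking algorithm is more robust than the other algorithms while maintaining tracking accuracy. The proposed hierarchical fusion strategy can serve as a reference when a correlation filtering-based tracking algorithm uses several features of different types.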
Robust correlation filtering-based tracking by multifeature hierarchical fusion
Lu Guozhi,Peng Dongliang,Gu Yu(Fundamental Science on Communication Information Transmission and Fusion Technology Laboratory, Hangzhou Dianzi University, Hangzhou 310018, China)
Objective A robust correlation filtering-based visual tracking algorithm using multifeature hierarchical fusion is proposed to improve the robustness of target tracking. The algorithm addresses the multifeature fusion problem in correlation filtering-based tracking and builds on a summary of the main multifeature fusion strategies. Method When the multichannel correlation filtering algorithm is used to track the target, three features, namely the histogram of oriented gradient (HOG), color names (CN), and the color histogram, are extracted from the target area and its surroundings to describe the appearance of the target and the background. The proposed hierarchical fusion scheme uses two fusion layers to combine the response maps of the three features. The HOG and CN features describe the gradient and color information of the target, respectively; both are strongly discriminative, and they complement each other. Because the saliency of the HOG and CN features differs across tracking scenarios, their response maps are combined at the first fusion layer with an adaptive weighted fusion strategy, which adjusts the fusion weights as the scene changes. The fusion weights are computed from two indicators of each feature response map: the smooth constraint and the peak-to-sidelobe ratio. The color histogram is a global statistical feature that handles deformation well because position information is discarded when the histogram is computed. However, a tracker that relies on the color histogram alone has low accuracy because it is susceptible to interference from backgrounds of similar color. The color histogram is therefore used as an additional feature in the proposed algorithm.
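The first-layer fusion described above can be illustrated with a minimal sketch. It assumes the common definition of the peak-to-sidelobe ratio (peak minus sidelobe mean, divided by sidelobe standard deviation) and uses weights proportional to that ratio alone. The smooth-constraint indicator is omitted, and the function names, the exclusion window size, and the weighting rule are illustrative assumptions, not the exact formulation of the proposed algorithm.

```python
import numpy as np


def peak_to_sidelobe_ratio(response, exclude=2):
    """PSR = (peak - mean(sidelobe)) / std(sidelobe).

    The sidelobe is every cell outside a (2*exclude+1)^2 window
    centred on the peak (window size is an assumption).
    """
    response = np.asarray(response, dtype=float)
    row, col = np.unravel_index(np.argmax(response), response.shape)
    mask = np.ones(response.shape, dtype=bool)
    mask[max(row - exclude, 0):row + exclude + 1,
         max(col - exclude, 0):col + exclude + 1] = False
    sidelobe = response[mask]
    # Small epsilon avoids division by zero on a perfectly flat sidelobe.
    return (response[row, col] - sidelobe.mean()) / (sidelobe.std() + 1e-12)


def adaptive_fusion(resp_hog, resp_cn):
    """Fuse two response maps with weights proportional to their PSR."""
    resp_hog = np.asarray(resp_hog, dtype=float)
    resp_cn = np.asarray(resp_cn, dtype=float)
    psr_hog = peak_to_sidelobe_ratio(resp_hog)
    psr_cn = peak_to_sidelobe_ratio(resp_cn)
    total = psr_hog + psr_cn
    # Fall back to equal weights when neither map has a distinct peak.
    w_hog = psr_hog / total if total > 0 else 0.5
    fused = w_hog * resp_hog + (1.0 - w_hog) * resp_cn
    return fused, w_hog
```

In this sketch a response map with a sharp, isolated peak receives a larger weight than a map with a broad, ambiguous peak, which is the intuition behind adjusting the weights per frame.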
At the second fusion layer, a fixed-coefficient fusion strategy combines the fused response map of the first layer with the response map obtained from the color histogram. Finally, the target position is estimated as the location of the maximum of the final response map. A scale estimation algorithm, which uses a 1D scale-dependent filter to estimate the target scale rapidly, is adopted to obtain an accurate bounding box of the target. The model is updated at each frame with a fixed learning factor to adapt to appearance changes. Result The performance of the proposed tracking algorithm is verified on two public datasets for evaluating visual tracking algorithms, OTB-2013 and VOT-2014. The OTB-2013 dataset contains 51 test sequences, of which 35 are color video sequences. Distance precision and success rate curves are used as the performance metrics on OTB-2013 and are computed with the one-pass evaluation method. The VOT-2014 dataset contains 25 color test sequences, on which the accuracy and robustness metrics are used. The experiments are divided into two parts: an analysis of how different parameters affect the proposed algorithm, and a comparison with five mainstream correlation filtering-based tracking algorithms. First, the parameters of the proposed multifeature hierarchical fusion scheme, including fusion methods, target features, and fusion parameters, are analyzed on the 35 color sequences of the OTB-2013 dataset. The results indicate that the proposed adaptive weighted fusion strategy outperforms the multiplicative fusion strategy and that combining the HOG, CN, and color histogram features improves tracking performance.
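The second fusion layer, the peak localization, and the fixed-rate model update mentioned in the Method description can be sketched as follows. The coefficient `gamma`, the learning rate, and all function names are illustrative assumptions rather than values or interfaces taken from the proposed algorithm.

```python
import numpy as np


def fuse_fixed(resp_first_layer, resp_hist, gamma=0.3):
    """Second-layer fusion: convex combination with a fixed coefficient.

    gamma is the (assumed) weight of the colour-histogram response.
    """
    resp_first_layer = np.asarray(resp_first_layer, dtype=float)
    resp_hist = np.asarray(resp_hist, dtype=float)
    return (1.0 - gamma) * resp_first_layer + gamma * resp_hist


def locate_peak(response):
    """Return the (row, col) index of the maximum of a response map."""
    idx = np.unravel_index(np.argmax(response), np.shape(response))
    return tuple(int(i) for i in idx)


def update_model(model, new_estimate, learning_rate=0.01):
    """Linear-interpolation update with a fixed learning factor."""
    model = np.asarray(model, dtype=float)
    new_estimate = np.asarray(new_estimate, dtype=float)
    return (1.0 - learning_rate) * model + learning_rate * new_estimate
```

A fixed coefficient keeps the second layer simple: the color histogram acts as a supplementary cue whose influence does not change from frame to frame, unlike the adaptive weights of the first layer.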
Second, the performance of the proposed algorithm is compared with that of five mainstream tracking algorithms. The six tracking algorithms are first tested on all sequences and then on the sequences grouped by 10 individual attributes. The experimental results indicate that tracking performance is improved: the precision score of the proposed algorithm exceeds that of the Staple algorithm by 5.9 percentage points (0.840 vs 0.781). Moreover, because the CN, HOG, and color histogram features are integrated effectively, the proposed algorithm is more robust than the other algorithms in most scenarios and achieves the highest success rate on sequences with out-of-plane rotation, occlusion, and fast motion. Conclusion The proposed multifeature hierarchical fusion tracking algorithm is more robust than the other correlation filtering-based algorithms while maintaining tracking accuracy. The proposed hierarchical fusion strategy can be adopted and extended when different types of features are used in a correlation filtering-based tracking algorithm.
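For readers unfamiliar with the OTB-style metrics quoted above, the following sketch shows how distance precision and success rate are conventionally computed for one sequence. The 20-pixel and 0.5-overlap thresholds are the customary defaults and are assumptions here, as are the function names and the (x, y, w, h) box layout; the abstract does not state which thresholds were used.

```python
import numpy as np


def distance_precision(pred_centers, gt_centers, threshold=20.0):
    """Fraction of frames whose centre location error is within threshold (pixels)."""
    pred = np.asarray(pred_centers, dtype=float)
    gt = np.asarray(gt_centers, dtype=float)
    errors = np.linalg.norm(pred - gt, axis=1)
    return float(np.mean(errors <= threshold))


def iou(box_a, box_b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_rate(pred_boxes, gt_boxes, threshold=0.5):
    """Fraction of frames whose overlap with the ground truth exceeds threshold."""
    overlaps = [iou(p, g) for p, g in zip(pred_boxes, gt_boxes)]
    return float(np.mean([o > threshold for o in overlaps]))
```

Under this convention, a precision score of 0.840 means that the predicted center lies within the distance threshold of the ground-truth center in 84.0% of the evaluated frames.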