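Zhang Jing, Zhao Xu (Department of Automation, Shanghai Jiao Tong University, Key Laboratory of System Control and Information Processing, Ministry of Education, Shanghai 200240, China)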

Abstract
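Objective The task of person re-identification is to match persons captured by different cameras at different times and places. Affected by illumination, background, occlusion, viewpoint, and pose, the same person can differ considerably in appearance across cameras. Current research focuses mainly on feature representation and metric learning. Many metric learning methods have achieved good results on person re-identification, but for diverse data sets a single global metric adapts poorly to heterogeneous features. Some researchers have therefore proposed local metric learning, but these methods usually require solving complicated convex optimization problems and are computationally cumbersome. Method Drawing on the idea of local metric learning and combining it with recently proposed global metric learning methods such as XQDA (cross-view quadratic discriminant analysis) and MLAPG (metric learning by accelerated proximal gradient), we propose a framework that integrates global and local metric learning. A Gaussian mixture model is used to cluster the training samples, and local metric learning is performed within each cluster; at the same time, global metric learning is performed on the full training set. For a test sample, the local and global metric matrices are combined with weights given by the sample's posterior probabilities under the components of the Gaussian mixture model, and the result is used to measure similarity. In particular, for the MLAPG algorithm, the posterior probabilities of the samples under each Gaussian component are used to adjust the loss weights of different samples in the objective loss function, which further improves the performance of the method. Result Experiments on the VIPeR, PRID 450S, and QMUL GRID data sets verify the effectiveness of the proposed integrated global-local metric learning method. Compared with global methods such as XQDA and MLAPG, matching accuracy on the VIPeR data set improves by about 2.0%, and performance on the other data sets also improves to varying degrees. In addition, the proposed method is validated with different feature representations; compared with the global methods, matching accuracy improves by about 1.3% to 3.4%. Conclusion The proposed framework effectively integrates global and local metric learning: it improves the performance of multiple global metric learning algorithms while avoiding the complicated computation of local metric learning algorithms. The experimental results show that the proposed integrated global-local metric learning framework improves on global metric learning methods for each of the feature representations used.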
Global-local metric learning for person re-identification

Zhang Jing, Zhao Xu (Department of Automation, Shanghai Jiao Tong University, Key Laboratory of System Control and Information Processing, Ministry of Education, Shanghai 200240, China)

Objective Person re-identification aims to match images of the same person captured by non-overlapping cameras at different times and places. Images of the same person from different cameras vary considerably in appearance because of differences in illumination, background, occlusion, viewpoint, and pose. Feature representation and metric learning are the two major research directions in person re-identification. On the one hand, some studies focus on designing feature descriptors that are discriminative between classes and robust against intra-class variations. On the other hand, numerous metric learning algorithms have achieved good performance in person re-identification. However, a single global metric applied to all samples adapts poorly to heterogeneous data. Several researchers have therefore proposed local metric learning, but these methods generally require solving complicated convex optimization problems, which is computationally cumbersome. Method To improve the performance of metric learning algorithms while avoiding complex computation, this study applies the concept of local metric learning in combination with recent global metric learning algorithms, such as cross-view quadratic discriminant analysis (XQDA) and metric learning by accelerated proximal gradient (MLAPG). In the training stage, all samples are softly partitioned into several clusters using a Gaussian mixture model (GMM). A local metric is learned on each cluster using a metric learning method such as XQDA or MLAPG. Meanwhile, a global metric is learned on the entire training set. In the testing stage, the posterior probability of each test sample under each GMM component is computed. For each pair of samples, the local metrics, weighted by the posterior probabilities of the corresponding GMM components, and the global metric, weighted by a cross-validated parameter, are combined into the final metric for similarity evaluation.
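The test-stage combination described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not give the exact combination rule, so the following are assumptions made only for illustration: the GMM has diagonal covariances, the weight of local metric k for a pair is the product of the two samples' posteriors for component k, and the global metric is scaled by a parameter named lam. The function names gmm_posteriors and integrated_distance are hypothetical.

```python
import numpy as np


def gmm_posteriors(X, weights, means, variances):
    """Posterior p(k | x) for each row of X under a diagonal-covariance GMM.

    X: (n, d) samples; weights: (K,); means, variances: (K, d).
    The diagonal covariance is an assumption of this sketch.
    """
    X = np.atleast_2d(X)
    diff = X[:, None, :] - means[None, :, :]                  # (n, K, d)
    # Log density of every sample under every Gaussian component.
    log_pdf = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )
    log_joint = np.log(weights)[None, :] + log_pdf            # (n, K)
    log_joint -= log_joint.max(axis=1, keepdims=True)         # numerical stability
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)


def integrated_distance(x, y, post_x, post_y, M_global, M_locals, lam=1.0):
    """Squared distance between x and y under a combined global-local metric.

    M_global: (d, d); M_locals: (K, d, d); post_x, post_y: (K,).
    Assumption: the pair weight of component k is post_x[k] * post_y[k],
    and lam stands in for the cross-validated global weight.
    """
    pair_w = post_x * post_y                                   # (K,)
    M = lam * M_global + np.tensordot(pair_w, M_locals, axes=1)
    d = x - y
    return float(d @ M @ d)
```

In this sketch the local metrics would come from running a global learner such as XQDA or MLAPG on each cluster, and lam would be chosen by cross-validation. A pair whose samples both fall clearly into one component is measured mostly by that component's local metric plus the global term, while a pair split across components falls back toward the global metric alone.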
In this manner, different metrics are used to measure different pairs of samples, which is better suited to heterogeneous data sets. In particular, we also propose an effective local metric learning strategy for MLAPG: the weights of the sample-pair losses in the loss function are modified using the posterior probabilities of the samples under each GMM component. Result We conduct experiments on three challenging person re-identification data sets (VIPeR, PRID 450S, and QMUL GRID). The experimental results show that the proposed approach outperforms traditional global metric learning methods. The improvement is most pronounced on the VIPeR data set, which exhibits more complex variations in background and clothing than the other data sets; there, matching accuracy improves by approximately 2.0%. In addition, we conduct experiments with different feature representations to verify the generality of the proposed method. Matching accuracy improves by approximately 1.3% to 3.4% across the feature descriptors tested, which indicates that the improvement does not depend on a particular feature descriptor. Conclusion We propose a novel framework that integrates global and local metric learning and takes advantage of both approaches. Numerous recent global metric learning approaches can be integrated into the proposed framework to improve their performance on person re-identification. Unlike other local metric learning approaches, the proposed framework integrates global metric learning methods flexibly and effectively and does not require complicated computation. Moreover, the proposed framework can be applied with many feature representations.
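The posterior-weighted loss mentioned in the Method part for MLAPG could look roughly like the sketch below. The abstract does not state the loss or the weighting rule, so this is only an illustration under stated assumptions: a log-logistic pair loss with a distance threshold mu, labels of +1 for same-person pairs and -1 otherwise, a pair weight equal to the product of the two samples' posteriors for the component being trained, and normalization by the total weight. The function name local_weighted_loss is hypothetical.

```python
import numpy as np


def local_weighted_loss(M, X, Z, labels, post_x, post_z, mu=1.0):
    """Posterior-weighted log-logistic loss for one GMM component.

    M: (d, d) metric matrix being learned for this component.
    X: (n, d) features from one camera; Z: (m, d) features from the other.
    labels: (n, m), +1 for same-person pairs and -1 otherwise (assumed convention).
    post_x: (n,), post_z: (m,) posteriors of the samples for this component.
    """
    diff = X[:, None, :] - Z[None, :, :]                       # (n, m, d)
    # Squared Mahalanobis-style distance for every cross-camera pair.
    dist = np.einsum("nmd,de,nme->nm", diff, M, diff)
    # log(1 + exp(t)) evaluated stably via logaddexp.
    loss = np.logaddexp(0.0, labels * (dist - mu))
    # Assumption: a pair counts in proportion to both samples' posteriors.
    pair_w = np.outer(post_x, post_z)                          # (n, m)
    return float(np.sum(pair_w * loss) / np.sum(pair_w))
```

With all posteriors equal to one, this reduces to the plain average pair loss, so the global case is recovered. Pairs whose samples have low posterior for the component contribute little, which is one way to let each local metric concentrate on its own cluster. The sketch only evaluates the loss; the optimization itself (accelerated proximal gradient in MLAPG) is not shown, and the function assumes at least one pair has a nonzero weight.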