何霞,汤一平,陈朋,王丽冉,袁公萍(浙江工业大学信息工程学院, 杭州 310023;浙江工业大学信息工程学院, 杭州 310023;浙江银江研究院有限公司, 杭州 310000)
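Objective With the growth of large-scale image surveillance and video data in the field of public safety and the development of intelligent transportation, vehicle retrieval has extremely important application value. To address the low level of automation and intelligence in existing vehicle retrieval and the difficulty of obtaining accurate retrieval results, a multi-task segmented compact feature vehicle retrieval method is proposed, which effectively exploits the diversity and correlation of basic vehicle information to achieve real-time retrieval. Method First, to use the connections between related tasks to improve retrieval accuracy and refine image features, a multi-task deep convolutional network is constructed to learn hash codes for different vehicle attributes segment by segment; it combines image semantics with image representation and adopts minimal image coding so that the learned features of different vehicle attributes are more robust. Then, a feature pyramid network is used to extract instance features of the vehicle image, and a locality-sensitive hashing re-ranking method is used to retrieve with the extracted features. Finally, for the special case in which an image of the query vehicle cannot be obtained, a cross-modal-assisted retrieval method is used. Result The proposed retrieval method outperforms current mainstream retrieval methods on three public datasets; the retrieval accuracy reaches 0.966 on the CompCars dataset and rises to 0.862 on the VehicleID dataset. Conclusion The proposed multi-task segmented compact feature vehicle retrieval method obtains both minimal image codes and image instance features and can also perform cross-modal retrieval when the target image information cannot be obtained; comparative experiments verify its effectiveness.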
Fast hash vehicle retrieval method based on multitasking
He Xia, Tang Yiping, Chen Peng, Wang Liran, Yuan Gongping (School of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China; School of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China; Zhejiang Enjoyor Research Institute Co., Ltd, Hangzhou 310000, China)
Objective Large-scale image surveillance and video data continue to grow in the field of public safety, and intelligent transportation continues to evolve, so vehicle retrieval has important application value. Existing vehicle retrieval techniques have a low level of automation and intelligence, rarely return accurate results, and consume a large amount of storage space. To solve these problems, this study proposes a multi-task segmented compact feature vehicle retrieval method. The method exploits the correlation between detection and identification tasks and makes full use of the diversity of vehicle attribute information to achieve real-time retrieval. Vehicle retrieval based on appearance features can overcome the limitations of traditional license plate recognition methods and has broad application prospects in inspecting traffic violations and in searching for and seizing suspected criminal vehicles.
Method This study constructs a multi-task deep convolutional network that learns hash codes for different vehicle attributes segment by segment. This learning scheme combines image semantics with image representation and uses the connections between related tasks to improve retrieval accuracy and refine the image features. The hash codes are learned as minimal image codes so that the learned vehicle attribute features are more robust. Then, a feature pyramid network extracts the instance features of the vehicle image, and in the retrieval process the extracted features are re-ranked with a locality-sensitive hashing re-ranking method. In some searches, an image of the query vehicle cannot be obtained, for example when the night-time footage of a camera is blurred. For such cases, this study proposes cross-modal-assisted retrieval to meet the practical requirements of different environments.
Result Two datasets containing large-scale images of different vehicles are used to verify the recognition performance of the multi-task network. The BIT-Vehicle database is commonly used for vehicle identification and contains 9 850 vehicle images captured at traffic checkpoints. These images are divided into 12 categories, which are organized into two tasks, namely, color and model. To verify the accuracy of fine-grained vehicle classification and multi-task network identification, we also use the CompCars dataset, which is more finely subdivided than the BIT-Vehicle dataset. The CompCars dataset contains two parts, namely, images collected from the web and images captured at checkpoints. We select and organize the checkpoint part of the dataset, which includes 30 000 frontal checkpoint images. These images are labeled with 11 body colors, 69 vehicle brands, 281 vehicle models, and 3 vehicle types. Therefore, this dataset is suitable for verifying the recognition performance of the multi-task convolutional neural network. In addition, to verify the general applicability of the proposed vehicle retrieval method, retrieval experiments are conducted on the VehicleID dataset, which contains approximately 200 000 images of 26 000 vehicles captured by surveillance cameras in real-world scenarios under different environments and covers 250 models and 7 colors. The proposed retrieval method outperforms the current mainstream retrieval methods on all three public datasets. The retrieval accuracy reaches 0.966 on the CompCars dataset and increases to 0.862 on the VehicleID dataset. Compared with the existing methods, the retrieval accuracy of the proposed method is markedly improved.
Conclusion This study focuses on real public safety scenarios and on improving retrieval accuracy for massive video data. We design a multi-task neural network learning method suitable for both identification and retrieval. The method unifies the extraction of multiple features in one model and is trained end to end. The proposed multi-task segmented compact feature vehicle retrieval method obtains both minimal image codes and image instance features, and it can also perform cross-modal retrieval when an image of the target vehicle cannot be obtained. Comparative experiments verify the effectiveness of the method.
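The coarse-to-fine search described in the Method part can be illustrated with a minimal sketch. It is not the authors' implementation: the segment layout `SEGMENTS`, the bit lengths, the 0.5 threshold, and the function names are all hypothetical, and plain cosine similarity on the instance features is swapped in for the locality-sensitive hashing re-ranking step, which the abstract does not specify in detail. The sketch assumes the network outputs one activation vector in [0, 1] per attribute segment.

```python
import numpy as np

# Hypothetical segment layout: attribute name -> number of bits in its segment.
SEGMENTS = {"color": 4, "brand": 8, "model": 12}


def binarize(activations):
    """Threshold real-valued outputs in [0, 1] at 0.5 to obtain 0/1 bits."""
    return (np.asarray(activations) >= 0.5).astype(np.uint8)


def concat_segments(segment_activations):
    """Concatenate the per-attribute segments into one compact binary code."""
    parts = []
    for name, n_bits in SEGMENTS.items():
        bits = binarize(segment_activations[name])
        if bits.shape[-1] != n_bits:
            raise ValueError(f"segment {name!r} must have {n_bits} bits")
        parts.append(bits)
    return np.concatenate(parts, axis=-1)


def hamming(query_code, db_codes):
    """Hamming distance between one code and every row of db_codes."""
    return np.count_nonzero(db_codes != query_code, axis=1)


def retrieve(query_code, query_feat, db_codes, db_feats, shortlist=50, top_k=10):
    """Shortlist by Hamming distance, then re-rank by cosine similarity."""
    dist = hamming(query_code, db_codes)
    # Stable sort so that ties keep their database order.
    candidates = np.argsort(dist, kind="stable")[:shortlist]
    feats = db_feats[candidates]
    # Cosine similarity stands in for the LSH re-ranking step (assumption).
    sims = feats @ query_feat / (
        np.linalg.norm(feats, axis=1) * np.linalg.norm(query_feat) + 1e-12
    )
    order = np.argsort(-sims, kind="stable")[:top_k]
    return candidates[order]
```

In this sketch the compact code only prunes the database, so the more expensive comparison of real-valued instance features runs on a short candidate list. Because each attribute occupies its own segment, a single segment such as color could also be compared on its own, which is one plausible way to support queries that supply attributes instead of an image.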