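Abstract: Objective Hand-crafted finger vein quality features are complex to compute, lack robustness, and represent quality poorly. To address these problems, this paper proposes a finger vein quality assessment method that fuses multiple features using a cascaded fine-tuning CNN (convolutional neural network). Method The public finger vein database MMCBNU_6000 is labeled for quality in a semi-automated manner, and the R-SMOTE (Random-Synthetic Minority Over-sampling Technique) algorithm is used to balance the classes. The CNN structure from deep learning is applied to finger vein quality assessment, and the effect of network depth on the representation of finger vein quality is studied. Inspired by traditional methods that combine binary and grayscale images for quality assessment, two models that fuse the quality features of grayscale and binary images are designed: Multi-Column CNN (MC-CNN) and Cascaded fine-tuning CNN (CF-CNN). MC-CNN requires both the binary image and the grayscale image as input during training and testing; CF-CNN takes the binary image and the grayscale image in separate stages during training and requires only the grayscale image during testing. Result On the MMCBNU_6000 test set, the three simple CNN structures designed in this paper (CNN-K, K=3, 4, 5) achieve classification accuracies of 93.31%, 93.94%, and 85.63%, respectively; CNN-4 with grayscale images and with binary images as input achieves 93.94% and 91.92%, respectively; and MC-CNN and CF-CNN achieve 91.44% and 94.62%, respectively. In addition, compared with other existing algorithms, CF-CNN achieves the highest classification accuracy on MMCBNU_6000 for high-quality test images, low-quality test images, and the overall test set. Conclusion The experimental results show that the fused quality features learned by CF-CNN represent quality better than existing hand-crafted features and features learned from a single vein form, and can effectively distinguish high-quality from low-quality finger vein images.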
Finger vein image quality assessment based on cascaded fine-tuning convolutional neural network
Junying Zeng, Yao Chen, Chuanbo Qin, Junying Gan, Yikui Zhai, Wulin Feng
Abstract: Objective In recent years, finger vein recognition, one of the emerging biometric identification technologies, has attracted growing research attention. However, the quality of some collected finger vein images is poor because of individual differences, changes in the collection environment, and differences in the performance of acquisition equipment. In a finger vein recognition system, low-quality images seriously affect feature extraction and matching, which degrades the identification performance of the system. In applications that require a standard template library of personal finger vein information, registering low-quality images seriously impairs the usefulness of that library. Therefore, to filter out low-quality images and select high-quality images for input to a finger vein recognition system or for registration in a standard template library, quality must be assessed correctly after the finger vein images are collected. Hand-crafted finger vein quality features are sensitive to many factors and suffer from high computational complexity, weak robustness, and unsatisfactory expressiveness. To address these problems, a finger vein quality assessment method based on multi-feature fusion with a cascaded fine-tuning convolutional neural network (CNN) is proposed. Method Finger vein image quality assessment methods based on deep learning require a large number of labeled finger vein images. However, existing public finger vein databases provide only the images and no quality labels, so the first step is to label the images. In this paper, the public finger vein database MMCBNU_6000 is first labeled for quality in a semi-automated manner.
The labeling first computes the number of veins in each finger vein image and then applies manual correction. This annotation method is more accurate, faster, and cheaper than purely manual annotation. In practice, far fewer low-quality than high-quality finger vein images are collected, so the R-SMOTE algorithm is employed to balance the classes. In recent years, deep neural networks have proven highly capable in the image and speech fields. For finger vein quality assessment, however, most existing methods rely on hand-crafted features, and few learn quality features automatically. In this paper, the CNN structure from deep learning is therefore applied to finger vein quality assessment, and the contribution of network depth to the quality representation is investigated. A deeper network does not necessarily represent the quality characteristics of finger vein images better. The best network depth is determined experimentally and used as the basis for the subsequent work. Meanwhile, inspired by the combination of binary and grayscale images in traditional quality evaluation, two models, Multi-Column CNN (MC-CNN) and Cascaded fine-tuning CNN (CF-CNN), are designed to fuse the quality features of the grayscale image and the binary image. MC-CNN requires the binary image and the grayscale image to be input together during both training and testing. For CF-CNN, the binary image and the grayscale image are input in separate stages during training, and only the grayscale image is input during testing. Notably, the binary finger vein image was also input to the network on its own, which verified that the quality characteristics of the binary finger vein image do help to distinguish high-quality from low-quality finger vein images.
This verification gives grounds to expect that combining binary and grayscale images in a CNN will improve quality assessment. Result On the test set of the MMCBNU_6000 database, the classification accuracies of CNN-K (K=3, 4, 5) designed in this paper are 93.31%, 93.94%, and 85.63%, respectively; the classification accuracies of CNN-4 with grayscale images and with binary images as input are 93.94% and 91.92%, respectively; and the classification accuracies of MC-CNN and CF-CNN are 91.44% and 94.62%, respectively. The results for the simple CNN structures show that CNN-3 has the highest classification accuracy for high-quality images, CNN-5 has the highest classification accuracy for low-quality images, and CNN-4 has the highest classification accuracy for the whole test set. The results for CNN-4 show that the grayscale vein form performs better than the binary vein form. The results for the fusion structures show that CF-CNN performs better than MC-CNN. In addition, compared with other existing algorithms, CF-CNN has the highest classification accuracy for high-quality test images, low-quality test images, and the overall test set on the MMCBNU_6000 database. Conclusion First, three simple CNN structures were designed and used for finger vein quality assessment. The overall performance of CNN-4 is better than that of CNN-3 and CNN-5, which shows that a deeper network is not always better and that the network structure should suit the research problem. Second, this paper compares the performance of the same network with grayscale images and with binary images as input, which indicates that the grayscale image and the binary image characterize vein quality to different degrees. Finally, to fuse the quality features of grayscale images and binary images, two fusion models are proposed: MC-CNN and CF-CNN.
CF-CNN performs better than MC-CNN and has a simpler structure, and it is an end-to-end quality assessment model for finger vein images. In summary, the proposed method achieves state-of-the-art performance on MMCBNU_6000, and the features it learns are validated to be better than existing hand-crafted features and features learned from a single vein form.
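The class-balancing step mentioned in the Method part can be illustrated with a minimal sketch. It shows plain SMOTE-style interpolation between a minority sample and one of its nearest minority neighbours; it is not the R-SMOTE variant used in the paper, whose details are not given in the abstract. The function name, the parameter k, and the toy feature vectors are all hypothetical.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic samples interpolated between minority samples.

    Plain SMOTE idea (an assumption, not the paper's R-SMOTE): pick a
    minority sample, pick one of its k nearest minority neighbours, and
    take a random point on the segment between them.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of base among the other minority samples
        others = [m for j, m in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda m: sq_dist(base, m))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy flattened "low-quality" feature vectors (hypothetical values).
low_quality = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3], [0.3, 0.25]]
new_samples = smote_like_oversample(low_quality, n_new=6)
balanced_low = low_quality + new_samples
print(len(balanced_low))
```

Because every synthetic sample lies on a segment between two real minority samples, the new points stay inside the region already occupied by the low-quality class, which is why this family of methods is preferred over simply duplicating images.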