Single-image super-resolution reconstruction with a class-info generative adversarial network

Yang Yun, Zhang Haiyu (College of Electrical & Information Engineering, Shaanxi University of Science & Technology)

Abstract
Objective The super-resolution model based on a generative adversarial network (SRGAN) takes a perceptual loss function as its optimization objective, which effectively solves the blurring of reconstructed images caused by the traditional loss function based on mean squared error (MSE). However, SRGAN's perceptual loss contains no explicit marker information that instructs the model to generate the corresponding features, so the model cannot accurately associate specific data dimensions with semantic features. Limited by this, the model under-represents the feature information of the generated images; the reconstructed results therefore have indistinct features, which complicates subsequent recognition and processing. To address these problems, a super-resolution model based on a class-information generative adversarial network (Class-info SRGAN) is proposed on top of SRGAN: an additional information variable restricts the solution space of super-resolution reconstruction and helps the model complete the reconstruction task with more accurate reference to the semantic features of the data. Method A class classifier is added to the SRGAN model and a class-loss term is added to the generator loss; back-propagation then updates the network weights, supplying the model with feature class information so that it finally generates reconstructed images with recognizable features. The innovation and advantage lie in introducing feature class information into the loss function and improving the optimization objective of the super-resolution model, which makes the feature representation of the reconstruction results more prominent. Result Tests on the CelebA dataset show that, with a gender classifier added, the gender-recognition rate of images generated by Class-info SRGAN is higher overall (ranging from 58% to 97%), and with a glasses classifier added, the glasses frames in the generated images are clearer. Results on the Fashion-MNIST and CIFAR-10 datasets likewise show better reconstruction quality than SRGAN. Conclusion The experimental results verify the advantages and effectiveness of the method in super-resolution reconstruction tasks. They also show that, although Class-info SRGAN is better suited to images with simple, concrete attribute features, it remains an effective super-resolution model overall.
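The abstract does not state the exact form of the modified objective. A minimal sketch, assuming SRGAN-style content and adversarial terms plus a cross-entropy class term with hypothetical weighting coefficients (not reported here), could be written as:

```latex
% Hypothetical formulation: \lambda_{\mathrm{adv}} and \lambda_{\mathrm{cls}} are assumed
% weighting coefficients, not values taken from the paper.
\ell_{G} = \ell_{\mathrm{content}}
         + \lambda_{\mathrm{adv}}\,\ell_{\mathrm{adv}}
         + \lambda_{\mathrm{cls}}\,\ell_{\mathrm{cls}},
\qquad
\ell_{\mathrm{cls}} = -\sum_{c} y_{c}\,\log p_{c}\!\left(G(I^{\mathrm{LR}})\right)
```

Here y is the attribute label of the ground-truth high-resolution image and p_c is the softmax output of the added classifier evaluated on the reconstruction G(I^LR).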
Keywords
Class-info generative adversarial network for single image super-resolution

Yang Yun, Zhang Haiyu (College of Electrical & Information Engineering, Shaanxi University of Science & Technology)

Abstract
Objective Image super-resolution reconstruction produces a high-quality, high-resolution image from one or more low-quality, low-resolution images (or a motion sequence). It has a wide range of applications in fields such as the military, medicine, public safety, and computer vision. In computer vision, super-resolution reconstruction can raise an image from the detection level to the recognition level, and even further to the identification level; in other words, it can enhance image recognition capability and identification accuracy. It also enables dedicated analysis of a target, so that a higher-spatial-resolution image of the region of interest can be obtained without directly acquiring a full high-spatial-resolution image with its large data volume and computational cost. Conventional approaches to super-resolution reconstruction include example-based models, bicubic interpolation, and sparse-coding methods. In recent years, with the advent of the era of artificial intelligence (AI), deep learning (DL) has established close connections with many related disciplines and produced numerous research achievements, including in the field of super-resolution reconstruction. Convolutional neural networks (CNNs) and generative adversarial networks (GANs) have brought many breakthroughs to image super-resolution, for example SRCNN (based on a convolutional neural network), VDSR (based on very deep convolutional networks), and SRGAN (based on a generative adversarial network). In particular, the appearance of SRGAN shows that single-image super-resolution (SISR) technology has made remarkable progress. SRGAN uses a perceptual loss function as its optimization goal instead of the traditional loss function based on mean squared error (MSE), effectively solving the problem that models optimizing the original loss purely for a higher peak signal-to-noise ratio (PSNR) produce blurry reconstructions. Although this method significantly improves the quality of super-resolution reconstruction, how to better highlight the feature representation of reconstructed images, and thereby comprehensively improve the quality of generated images, remains a difficult issue. Fundamentally, super-resolution reconstruction is an ill-posed problem: images lose a certain amount of information during down-sampling, so reconstructing the corresponding high-resolution image from a low-resolution image that has lost part of its characteristics inevitably introduces a generative deviation. In addition, SRGAN does not add to its loss function any auxiliary marker information that explicitly instructs the model to generate the corresponding features, so the model fails to accurately match specific data dimensions with semantic features. This limited controllability restricts its ability to represent the feature information of generated images sufficiently, which in turn limits further improvement of reconstruction quality and poses difficulties for subsequent identification and processing of the image.
To address these problems, a super-resolution model based on a class-information generative adversarial network (Class-info SRGAN) is proposed on the basis of SRGAN. It is designed to use additional information variables to restrict the solution space of super-resolution reconstruction and to help the model complete the reconstruction task with more accurate reference to the semantic features of the data. Method A class classifier is added to the original SRGAN model and a class-loss term is added to the generator loss; back-propagation during training then updates the network weights, so that feature class information is supplied to the model and reconstructed images possessing the corresponding features are finally produced. The innovation and advantage of the proposed model lie in introducing feature class information into the original objective function and improving the optimization objective of the super-resolution model; this in turn optimizes the network training process and makes the feature representation of the reconstruction results more prominent. Result The CelebA experiments indicate that the class-loss term helps the SRGAN model make small changes toward better output. Specifically, with gender-class information the differences from SRGAN were subtle, so it is hard to conclude that the model has a significant effect, although there were slight improvements: the overall gender-recognition rate of images generated by Class-info SRGAN, ranging from 58% to 97%, was higher than that of SRGAN (from 8% to 98%). With glasses-class information, the model more clearly learned to form better-shaped glasses. The results on the Fashion-MNIST and CIFAR-10 datasets also indicate that the model is effective, although the final results on CIFAR-10 were less prominent than in the earlier experiments. In short, the results show that the reconstruction quality of images generated by the Class-info SRGAN model is better than that of the original SRGAN. Conclusion Class information works in cases where there are clear-cut attributes that the model can learn as fully as possible. The experimental results verify the superiority and effectiveness of the proposed model in super-resolution reconstruction tasks. In sum, for concrete and simple feature attributes, Class-info SRGAN appears to be a promising super-resolution model, but there is clearly room for further advancement: for example, how to develop a general Class-info SRGAN that can be used for a variety of super-resolution reconstruction tasks, how to train a Class-info SRGAN with multiple attributes simultaneously, and how to insert the auxiliary class information into the Class-info SRGAN architecture more efficiently and conveniently. These open questions point toward better super-resolution reconstruction performance in the future.
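Implementation details are not given in the abstract; the following PyTorch-style sketch only illustrates how a class-loss term from an auxiliary classifier might be added to the generator update. The network stubs, the loss weights w_adv and w_cls, and the use of pixel-wise MSE in place of the VGG-based content loss are all assumptions, not the authors' configuration.

```python
# Minimal PyTorch sketch of a Class-info SRGAN generator update (illustrative only).
# TinyGenerator / TinyDiscriminator / TinyClassifier are placeholder stand-ins, and
# pixel-wise MSE replaces the VGG content loss used by SRGAN.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyGenerator(nn.Module):
    """Placeholder 4x upsampling generator (stand-in for the SRGAN generator)."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.PReLU(),
            nn.Conv2d(64, 3 * 16, 3, padding=1), nn.PixelShuffle(4),
        )

    def forward(self, lr):
        return torch.sigmoid(self.body(lr))    # keep outputs in [0, 1]


class TinyDiscriminator(nn.Module):
    """Placeholder real/fake discriminator."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1),
        )

    def forward(self, img):
        return self.body(img)                   # raw real/fake logit


class TinyClassifier(nn.Module):
    """Placeholder attribute classifier (e.g. gender or glasses)."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes),
        )

    def forward(self, img):
        return self.body(img)                   # class logits


def generator_step(G, D, C, opt_G, lr_img, hr_img, labels, w_adv=1e-3, w_cls=1e-2):
    """One generator update: content loss + adversarial loss + class loss.

    w_adv and w_cls are assumed weighting coefficients, not values from the paper.
    """
    sr_img = G(lr_img)
    # Content loss (MSE here; SRGAN uses a VGG feature-space loss).
    content_loss = F.mse_loss(sr_img, hr_img)
    # Adversarial loss: encourage the discriminator to label the SR image as real.
    real_target = torch.ones(sr_img.size(0), 1, device=sr_img.device)
    adv_loss = F.binary_cross_entropy_with_logits(D(sr_img), real_target)
    # Class loss: the auxiliary classifier should recover the attribute label
    # (e.g. glasses / no glasses) from the reconstructed image.
    cls_loss = F.cross_entropy(C(sr_img), labels)
    loss = content_loss + w_adv * adv_loss + w_cls * cls_loss
    opt_G.zero_grad()
    loss.backward()
    opt_G.step()
    return loss.item()


if __name__ == "__main__":
    G, D, C = TinyGenerator(), TinyDiscriminator(), TinyClassifier()
    opt_G = torch.optim.Adam(G.parameters(), lr=1e-4)
    lr_img = torch.rand(4, 3, 24, 24)        # low-resolution input batch
    hr_img = torch.rand(4, 3, 96, 96)        # matching high-resolution targets
    labels = torch.randint(0, 2, (4,))       # attribute labels for the batch
    print(generator_step(G, D, C, opt_G, lr_img, hr_img, labels))
```

In this sketch the classifier is assumed to be trained on attribute labels such as gender or glasses; during the generator step its cross-entropy on the reconstructed image acts as the class-loss term, nudging the generator to preserve the labeled attribute.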
Keywords