  • 2017 | Volume  | Number 7


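Abstract (Chinese version, translated)
Objective: The human visual system far outperforms current machine vision, and simulating human visual mechanisms is an effective way to improve existing algorithms. This paper proposes a positive feedback model of visual perception that generates saliency maps closer to human perception by iterating in a loop and repeatedly accumulating visual stimuli. Method: First, several conventional methods are used to detect image saliency, simulating the multi-channel nature of human vision, and the resulting saliency maps are combined into an integrated saliency map; pixels with high saliency are used to build the initial fixation area. Second, an ensemble of RVFL (random vector functional link) networks simulates the neural network of the human brain to produce visual stimuli: pixels in the fixation and non-fixation areas are randomly sampled and used to train a model online, and the model classifies the image pixels to obtain a new fixation area. For the new fixation and non-fixation areas, the cycle of random sampling, learning (modeling) and pixel classification can be repeated; if the fixation area is the same in consecutive iterations, perception is saturated and the iteration terminates. If each pixel classification result is regarded as a visual stimulus, the outputs of multiple stimuli can be accumulated to generate a new image saliency map. The final pixel classification result is the segmented target. Result: The proposed algorithm was compared with existing methods on standard image databases, including a quantitative analysis of six algorithms on three databases in terms of P-R curves, F-measure and MAE, and a qualitative comparison of the saliency maps generated by the six models. The data show that the proposed algorithm performs best on SED2 and MSRA10K and slightly below the BL and RBD algorithms on ECSSD. Visual inspection shows that its saliency maps are closer to human visual perception. The positive feedback iteration generally saturates quickly and does not significantly increase the computational burden. The experiments show that the proposed method can serve as an effective post-processing step that significantly improves the performance of conventional saliency detection algorithms. Conclusion: A data-driven saliency detection algorithm that simulates human visual mechanisms is proposed; it requires neither prior knowledge of the image nor previously labeled samples. For multiple targets, complex backgrounds and similar cases, the method is comparatively robust and widely applicable, and it helps address the generality, reliability and accuracy of image processing algorithms in real-world environments.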

Abstract
Objective: The performance of current machine vision falls far short of human vision, and simulating human visual mechanisms is an effective way to improve existing algorithms. The human visual system can detect objects with high acuity and focus attention on the regions relevant to the current visual task, advantages that both stem from the visual attention mechanism. Humans direct attention through a series of eye movements, which take two forms: saccades and microsaccades. 1) In the saccade stage, the eyes search for candidate objects by shifting sharply across the whole field of view. 2) Once a candidate is identified as a target, the eyes make a series of dense, tiny movements around it, called microsaccades, to intensify the object and inhibit noise. Continuous microsaccades lead to visual fading, and eye movement then switches back to the saccade stage to find new objects. The combination of saccades and microsaccades contributes to the speed and efficiency of the human visual system. Motivated by these facts, this paper presents a novel saliency detection framework that simulates microsaccades and visual fading. A positive feedback loop is constructed that focuses on the fixation area and intensifies objects until visual perception saturates, which leads to visual fading. Within this loop, multiple rounds of random sampling of the fixation area simulate the behavior of microsaccades, and RVFL (Random Vector Functional Link) networks simulate the human neural system to produce binary visual stimuli. The proposed framework is entirely data-driven and requires neither prior knowledge nor labeled samples. Method: First, conventional saliency detection methods are used to produce a variety of saliency maps. These maps are combined into an integrated saliency map to simulate multi-channel visual perception. The integrated saliency map is then thresholded to form an initial fixation area.
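As a rough illustration of this first Method step, the sketch below combines several saliency maps and thresholds the result. The abstract does not state the combination rule or the threshold, so the per-map min-max normalization, the plain average, the `ratio` times mean threshold, and the function names `integrate_saliency` and `initial_fixation` are all assumptions made for illustration.

```python
import numpy as np


def integrate_saliency(maps):
    """Combine saliency maps (assumed rule: min-max normalize each, then average)."""
    normalized = []
    for m in maps:
        m = np.asarray(m, dtype=float)
        span = m.max() - m.min()
        # A constant map carries no saliency information; treat it as all zeros.
        normalized.append((m - m.min()) / span if span > 0 else np.zeros_like(m))
    return np.mean(normalized, axis=0)


def initial_fixation(integrated, ratio=2.0):
    """Binary fixation area: pixels above `ratio` times the mean saliency (assumed threshold)."""
    return integrated > ratio * integrated.mean()


# Toy example: two hypothetical detectors that agree on a 2x2 salient block.
a = np.zeros((4, 4))
a[1:3, 1:3] = 1.0
b = 2.0 * a + 1.0  # same structure on a different scale
integrated = integrate_saliency([a, b])
fixation = initial_fixation(integrated)
```

Any other fusion rule (weighted sum, per-pixel maximum) or threshold could be substituted without changing the rest of the pipeline.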
Next, multiple rounds of random sampling are drawn from the pixels in the fixation and non-fixation areas, and an ensemble of RVFL networks is trained online on these pixel samples. The trained model then classifies the image pixels to obtain a new (binary) fixation area. For the new fixation and non-fixation areas, the cycle of sampling, learning (modeling) and pixel classification can be repeated online. If the fixation area remains unchanged between iterations, perception is saturated and the iteration terminates. If each binary pixel classification result is regarded as a visual stimulus, the outputs of multiple stimuli can be accumulated to generate a new image saliency map, and the final binary classification result in the positive feedback loop can be regarded as the segmentation foreground. Result: Three popular image databases, SED2, MSRA10K and ECSSD, were chosen to evaluate the algorithm. Together they contain 11,100 natural images with different salient objects and scenes, and every image was finely labeled by hand for saliency detection and image segmentation. Five other models, either state-of-the-art or closely related to our approach, were compared: BL, RBD, SF, GS and MR. P-R curves, F-measure and MAE were used to assess the six algorithms on the three databases. Experimental results show that our method performs best on SED2 (two objects) and MSRA10K (single object). On the ECSSD database (complex scenes and multiple objects), our method is inferior to BL and very close to RBD, while outperforming the remaining algorithms. The results also show that, on the SED2 database, the performance of BL, RBD, SF, GS and MR can be effectively improved by adding learning-based positive feedback.
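The sampling, learning and classification loop described in the Method can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-pixel features, the ensemble size, the sample counts, the tanh hidden layer, the ridge term, the majority vote, and the averaging of accumulated stimuli are all choices made here, and the names `train_rvfl`, `predict_rvfl` and `positive_feedback` are hypothetical. The RVFL itself follows the usual definition: fixed random hidden weights, direct input-to-output links, and output weights fitted by least squares.

```python
import numpy as np


def _design(X, W, b):
    # RVFL design matrix: direct links (X), random hidden features, and a bias column.
    return np.hstack([X, np.tanh(X @ W + b), np.ones((len(X), 1))])


def train_rvfl(X, y, n_hidden, rng, ridge=1e-3):
    """Fit one RVFL: hidden weights stay random, only output weights are solved for."""
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    D = _design(X, W, b)
    beta = np.linalg.solve(D.T @ D + ridge * np.eye(D.shape[1]), D.T @ y)
    return W, b, beta


def predict_rvfl(model, X):
    W, b, beta = model
    return _design(X, W, b) @ beta


def positive_feedback(features, fixation, n_models=5, n_samples=200,
                      n_hidden=20, max_iter=20, seed=0):
    """Iterate sampling -> learning -> classification until the fixation area repeats."""
    rng = np.random.default_rng(seed)
    X = features.reshape(-1, features.shape[-1])  # one feature row per pixel
    mask = fixation.ravel().astype(bool)
    accumulated = np.zeros(mask.size)
    n_stimuli = 0
    for _ in range(max_iter):
        fg, bg = np.flatnonzero(mask), np.flatnonzero(~mask)
        if fg.size == 0 or bg.size == 0:
            break  # nothing to sample from one of the two areas
        votes = np.zeros(mask.size)
        for _ in range(n_models):
            # Each ensemble member gets its own random sample (with replacement).
            idx = np.concatenate([rng.choice(fg, n_samples), rng.choice(bg, n_samples)])
            model = train_rvfl(X[idx], mask[idx].astype(float), n_hidden, rng)
            votes += predict_rvfl(model, X) > 0.5
        new_mask = votes > n_models / 2  # majority vote = one binary "stimulus"
        accumulated += new_mask
        n_stimuli += 1
        saturated = np.array_equal(new_mask, mask)
        mask = new_mask
        if saturated:
            break  # fixation area unchanged: perception is saturated
    saliency = accumulated / max(n_stimuli, 1)
    return saliency.reshape(fixation.shape), mask.reshape(fixation.shape)


# Toy example: a bright square, with an initial fixation area covering only its centre.
image = np.zeros((20, 20))
image[5:15, 5:15] = 1.0
start = np.zeros((20, 20), dtype=bool)
start[7:13, 7:13] = True
saliency_map, foreground = positive_feedback(image[..., None], start)
```

In the toy example the only feature is pixel intensity; a practical feature vector would presumably include colour and position, which the abstract does not specify.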
The experimental images illustrate that, through positive feedback and accumulated visual stimulation, the saliency maps produced by the new method are more consistent with human perception. In qualitative evaluation, the binary result detected by our method is clearly closer to the ground truth than those of the other methods. The positive feedback iteration can saturate quickly, so the running time of the algorithm is not significantly increased. The method can be treated as an effective post-processing module that improves the performance of conventional saliency detection algorithms. Conclusion: This paper proposes a novel salient region detection method based on machine learning and positive feedback of perception. Motivated by the human visual system, we construct a framework that uses RVFL to process visual information from coarse to fine, forming a saliency map and extracting salient objects. Unlike existing algorithms, our algorithm is entirely data-driven and requires no prior knowledge. Experiments on several standard image databases show that our method not only improves the performance of conventional saliency detection algorithms but also segments objects successfully in different scenes.
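For readers unfamiliar with the two scalar metrics used in the evaluation, the sketch below shows how MAE and F-measure are commonly computed for a saliency map against a binary ground-truth mask. The abstract does not give the exact formulas or parameters, so the weighting `beta2=0.3` (a frequent choice in saliency benchmarks) and the function names `mae` and `f_measure` are assumptions.

```python
import numpy as np


def mae(saliency, ground_truth):
    """Mean absolute error between a saliency map in [0, 1] and a binary mask."""
    diff = np.asarray(saliency, dtype=float) - np.asarray(ground_truth, dtype=float)
    return float(np.mean(np.abs(diff)))


def f_measure(binary_map, ground_truth, beta2=0.3):
    """Weighted harmonic mean of precision and recall (beta2 is an assumed weight)."""
    pred = np.asarray(binary_map).astype(bool)
    gt = np.asarray(ground_truth).astype(bool)
    tp = np.logical_and(pred, gt).sum()
    precision = tp / pred.sum() if pred.sum() else 0.0
    recall = tp / gt.sum() if gt.sum() else 0.0
    denom = beta2 * precision + recall
    return float((1 + beta2) * precision * recall / denom) if denom else 0.0


# Toy example: the prediction covers the 2x2 object plus two extra pixels.
gt = np.zeros((4, 4))
gt[1:3, 1:3] = 1
pred = np.zeros((4, 4))
pred[1:3, 1:4] = 1
score_f = f_measure(pred, gt)   # precision 4/6, recall 1
score_mae = mae(pred, gt)       # 2 wrong pixels out of 16
```

A P-R curve is obtained by repeating the precision and recall computation while sweeping the binarization threshold over the saliency map.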