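Result The stabilization performance was tested on a large number of videos of different types and compared with typical feature trajectory stabilization algorithms and commercial software, namely a stabilization algorithm based on trajectory augmentation, a stabilization algorithm based on epipolar point transfer, and the commercial software Warp Stabilizer. The experimental results show that the proposed algorithm requires shorter trajectories, has a higher trajectory utilization rate, and is robust. For 92% of the severely shaky videos, it stabilizes better than the algorithm based on trajectory augmentation; for 93% of the videos lacking long trajectories and 71.4% of the videos with rolling shutter distortion, it stabilizes better than Warp Stabilizer; and compared with the algorithm based on epipolar point transfer, it has fewer degenerate cases and avoids the failures caused by situations such as a temporarily stationary camera or pure camera rotation.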
A video stabilization algorithm using trifocal tensor reprojection
Wang Jingdong, Xue Chongfei, Wei Xueying, Liu Yunxiao
Objective Video stabilization is a key research area in computer vision. Current video stabilization algorithms fall into three major categories: 2D global motion stabilization, 2D local motion stabilization, and feature trajectory stabilization. The 2D global and 2D local motion stabilization algorithms usually cannot achieve satisfactory results in scenes with non-planar depth variations. In contrast, feature trajectory stabilization handles such scenes well and outperforms the other two categories. However, it often suffers from distorted output and locally unstable results because of its limitations in trajectory length, robustness, and trajectory utilization rate. To address this problem, this paper proposes a feature trajectory stabilization algorithm using trifocal tensor reprojection.
Method The algorithm extracts real feature point trajectories from the video scene with the KLT algorithm and uses the RANSAC algorithm to eliminate mismatches during feature point tracking. It then adaptively chooses a segment of the real trajectories, based on their length, to initialize the virtual trajectories. A long virtual trajectory is constructed by extending the initial virtual trajectory through trifocal tensor transfer. The extension stops when either the virtual trajectory exceeds half of the frame width or height, or the difference between the mean and the median of the transferred points is larger than 5 pixels. When fewer than 300 virtual trajectories pass through a frame, new initial virtual trajectories are added using the real trajectories through the same frame. Once the long trajectories are acquired, the algorithm odd-extends the beginning of each virtual trajectory about the first frame and the end of each virtual trajectory about the last frame. The stabilized views are defined by the smoothed virtual trajectories output by an FIR filter. To smooth the real trajectories, the algorithm reprojects the real feature points to the stabilized views by trifocal tensor transfer. It then divides each original frame into a uniform 16 × 32 mesh grid. The final stabilized frames are rendered by mesh grid warping of the original frames, where the inputs to the warping are the smoothing vectors between the real feature points and the smoothed real feature points. To protect the warping output, smoothing vectors with non-negligible error are removed by discarding smoothed trajectories that are at most 5 frames long and by applying the RANSAC algorithm based on an affine model.
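The odd extension and FIR smoothing steps described above can be illustrated with a minimal sketch. The abstract does not specify the FIR filter, so the truncated Gaussian kernel, its radius and sigma, and the function names `odd_extend`, `gaussian_kernel` and `smooth_trajectory` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def odd_extend(values, pad):
    """Odd (point-symmetric) extension about the first and last samples."""
    n = len(values)
    if not 0 <= pad < n:
        raise ValueError("pad must be smaller than the trajectory length")
    # Sample at position -k is 2*x[0] - x[k]; sample at n-1+k is 2*x[n-1] - x[n-1-k].
    head = [2 * values[0] - values[k] for k in range(pad, 0, -1)]
    tail = [2 * values[-1] - values[n - 1 - k] for k in range(1, pad + 1)]
    return head + list(values) + tail


def gaussian_kernel(radius, sigma):
    """Symmetric FIR taps normalized to sum to 1 (assumed filter shape)."""
    weights = [math.exp(-0.5 * (j / sigma) ** 2) for j in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_trajectory(points, radius=5, sigma=2.0):
    """Smooth a list of (x, y) positions, one per frame; output has the same length."""
    kernel = gaussian_kernel(radius, sigma)
    smoothed_axes = []
    for axis in range(2):
        padded = odd_extend([p[axis] for p in points], radius)
        smoothed_axes.append([
            sum(w * padded[t + j] for j, w in enumerate(kernel))
            for t in range(len(points))
        ])
    return list(zip(*smoothed_axes))


# Hypothetical shaky trajectory: steady horizontal motion with vertical jitter.
shaky = [(float(t), 100.0 + (3.0 if t % 2 else -3.0)) for t in range(20)]
stable = smooth_trajectory(shaky)
```

One reason odd extension is a natural choice here: it continues a linear motion past the first and last frames, so a symmetric filter leaves a constant-velocity trajectory unchanged at the boundaries instead of pulling the end points toward the interior.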
Because degeneration of the trifocal tensor transfer degrades the precision of virtual trajectory construction and real feature point reprojection, the algorithm adaptively changes the size of the transfer window according to the severity of the degeneration. This ensures that enough transferred points are acquired, which in turn preserves the precision of virtual trajectory construction and real feature point reprojection. During virtual trajectory construction, whenever the number of virtual trajectories through a frame is detected to be 25% lower than in the previous frame, the algorithm marks the previous frame as a breakpoint and processes the resulting video segments separately. As a result, the proposed algorithm achieves a better stabilization result.
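The breakpoint rule can be sketched as follows. This is only an illustration under stated assumptions: a drop of at least 25% relative to the previous frame triggers a breakpoint, the breakpoint frame closes its segment, and the names `find_breakpoints` and `split_segments` as well as the example counts are hypothetical.

```python
def find_breakpoints(counts, drop_ratio=0.25):
    """Return indices of frames that precede a sharp drop in trajectory count."""
    breakpoints = []
    for i in range(1, len(counts)):
        prev, cur = counts[i - 1], counts[i]
        # Assumed reading: the count fell by at least drop_ratio of the previous count.
        if prev > 0 and cur <= (1.0 - drop_ratio) * prev:
            breakpoints.append(i - 1)
    return breakpoints


def split_segments(n_frames, breakpoints):
    """Split frames 0..n_frames-1 into inclusive (start, end) segments.

    Each breakpoint frame is taken as the last frame of its segment (an assumption).
    """
    segments = []
    start = 0
    for b in breakpoints:
        segments.append((start, b))
        start = b + 1
    if start < n_frames:
        segments.append((start, n_frames - 1))
    return segments


# Hypothetical per-frame counts of virtual trajectories.
counts = [400, 390, 380, 250, 260, 255]
breaks = find_breakpoints(counts)
segments = split_segments(len(counts), breaks)
```

Each resulting segment would then be stabilized on its own, so a sudden loss of trajectories (for example at a scene change) does not contaminate the frames before it.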
Result Experiments on a number of videos of different types show that the proposed algorithm has advantages in stabilization quality over the traditional feature trajectory stabilization algorithms based on feature trajectory augmentation or epipolar point transfer, and over the commercial software Warp Stabilizer. For the comparison with the algorithm based on feature trajectory augmentation, the test videos are classified into the categories “simple”, “running”, “rolling shutter”, “depth”, and “driving”. The “simple” videos have relatively slow camera motion and smooth depth variations. The “running” videos are captured while the user is running, so they are challenging because of excessive wobbling. The “rolling shutter” videos suffer from noticeable rolling shutter distortion. The “depth” videos contain significant abrupt depth changes. The “driving” videos are captured from moving vehicles. For the comparison with Warp Stabilizer, the classification is changed slightly to “simple”, “lack of long trajectory”, “rolling shutter”, “depth”, and “driving”. The “lack of long trajectory” videos lack long trajectories for reasons such as camera panning, motion blur, or excessive jitter. To compare the stabilization results of the three algorithms, a scoring system is used to evaluate their outputs, and the scores in each category are then analyzed statistically to demonstrate the performance of the proposed algorithm. The results show that the proposed algorithm requires shorter trajectories while achieving a higher trajectory utilization rate and good robustness.
Compared with the algorithm based on feature trajectory augmentation, for 92% of the “running” videos the stabilization results of the proposed algorithm have fewer distortions and better stability; for around 50% of the “rolling shutter” videos, both algorithms produce similar stabilization results, and for 38% of the videos in this category the proposed algorithm has fewer distortions. For 55% of the “simple” videos, both algorithms have similar stability and no distinct distortion; for the remaining 45% of the videos in this category, the proposed algorithm achieves better stability. For most of the “depth” and “driving” videos, both algorithms have similar stability and a similar extent of distortion; for a few “depth” videos, the proposed algorithm has slightly better stability. Compared with Warp Stabilizer, for 93% of the “lack of long trajectory” videos and 71.4% of the “rolling shutter” videos, the proposed algorithm has fewer distortions and a better overall effect. For the “simple” and “driving” videos, both algorithms achieve good stabilization results. For 75% of the “depth” videos, both algorithms achieve similar results; for the remaining 25% of the videos in this category, the proposed algorithm has fewer distortions. Compared with the stabilization algorithm based on epipolar point transfer, the proposed algorithm has fewer degenerate cases and can therefore avoid the distortion introduced by a temporarily stationary camera or pure camera rotation.
Conclusion The proposed algorithm places fewer restrictions on the camera motion pattern and scene depth. It is suitable for common video stabilization situations, including those that lack parallax, have non-planar structure, or exhibit rolling shutter distortion. It can still achieve satisfactory stabilization results in scenarios that lack long trajectories because of camera panning, motion blur, or excessive jitter. The running time of the algorithm needs improvement, as it requires about 3 to 5 seconds per frame on a machine with a 2.1 GHz Intel Core i3 CPU and 3 GB of memory. In the future, parallel computing may be a potential way to increase the speed.