Objective: Distributed video coding (DVC) has attracted considerable attention from international standardization committees and researchers since the emergence of distributed source coding (DSC), a class of source coding approaches based on the Slepian-Wolf and Wyner-Ziv (WZ) theorems. Because of its lightweight encoding and high error robustness, DVC suits emerging video applications that require low power consumption and low encoder complexity, such as video chat, unmanned aerial vehicles, and wireless monitoring. However, the bit error rate of a wireless channel is higher than that of a wired channel because of channel attenuation, multipath interference, inter-band interference, and similar effects. In a DVC system, the video source is divided into interleaved key frames and WZ frames. The side information, which is regarded as a noisy version of the current WZ frame, is generated by motion estimation and compensation from the adjacent key frames, so whether the key frames are transmitted and decoded correctly affects the compression efficiency and rate-distortion performance of the whole system. Yet the key frames, which use conventional intra-frame coding, are far less robust than the WZ frames, which are based on channel coding. To address the robustness and transmission of key frames in heterogeneous networks, this paper presents a quality-scalable protection scheme for key frames in wavelet-domain DVC. Method: At the encoder, each key frame is encoded simultaneously by conventional HEVC/H.265 (High Efficiency Video Coding) intra-frame coding and by wavelet-domain Wyner-Ziv coding. The HEVC bitstream is transmitted over the wireless channel. For the WZ bitstream, the information bits are discarded and the generated parity bits are stored in a buffer.
To let the system bit rate adapt to different network conditions, the low-frequency and high-frequency bands at different levels of the wavelet decomposition are grouped into different enhancement layers. The decoder first determines whether the HEVC bitstream of a key frame has been lost. If there is no error, the HEVC bitstream is decoded directly to reconstruct the frame, and the WZ parity bits in the buffer are deleted. Otherwise, error concealment is applied to the received HEVC bitstream to reconstruct the frame, and the reconstructed frame is used as the side information for the current key frame. The decoder then requests the WZ data of the appropriate enhancement layers according to the channel conditions. In a DVC system, the difference between the original frame and its corresponding side information approximately follows a Laplacian distribution. Because the decoder has no access to the original frame, in practice the forward reference frame and the side information are used to estimate the virtual noise model of the current frame. However, if channel capacity is limited and the key frames also contain errors, the parity data of all enhancement layers cannot be sent. The reconstructed forward reference frame may then be of relatively poor quality, and the estimated virtual noise model may deviate considerably from the actual one. This paper therefore improves the virtual noise model for erroneous key frames. Because the virtual noise models of bands within the same level of the wavelet decomposition are similar, the decoded bands of the first enhancement layer and their corresponding side information are used to obtain a more accurate virtual noise model for the second and third enhancement layers.
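The noise-model refinement described in the Method can be illustrated with a minimal sketch. It assumes the common zero-mean Laplacian model f(x) = (alpha / 2) * exp(-alpha * |x|), whose variance is 2 / alpha^2, so alpha can be estimated as sqrt(2 / sigma^2) from the residual between already decoded first-enhancement-layer coefficients and the matching side-information coefficients. The function name `estimate_laplacian_alpha`, the flat-list representation of a band, and the reuse of the estimate for the remaining bands are illustrative assumptions, not the paper's actual implementation.

```python
import math


def estimate_laplacian_alpha(decoded_band, side_info_band):
    """Estimate the Laplacian parameter alpha from a decoded band.

    Assumes a zero-mean Laplacian residual, so sigma^2 is taken as the
    mean squared difference and alpha = sqrt(2 / sigma^2).
    Both arguments are flat sequences of wavelet coefficients
    (a hypothetical representation chosen for this sketch).
    """
    if len(decoded_band) != len(side_info_band) or not decoded_band:
        raise ValueError("bands must be non-empty and of equal length")

    # Residual between decoded coefficients and the side information.
    residual = [d - s for d, s in zip(decoded_band, side_info_band)]

    # Second moment of the residual (zero-mean assumption).
    sigma_sq = sum(r * r for r in residual) / len(residual)

    if sigma_sq == 0.0:
        # Side information matches the decoded band exactly.
        return float("inf")
    return math.sqrt(2.0 / sigma_sq)


# Hypothetical usage: the alpha estimated from a first-enhancement-layer
# band is reused as the noise parameter for bands decoded later.
decoded = [4.0, -2.0, 1.0, -3.0]
side_info = [3.0, -1.0, 2.0, -4.0]
alpha_first_layer = estimate_laplacian_alpha(decoded, side_info)
```

A smaller alpha corresponds to a wider residual distribution, that is, less reliable side information, which in turn implies that more parity bits would be needed to decode the band.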
Result: To validate the proposed scheme, the luminance components of three video sequences with different motion characteristics (Foreman, Bus, and Coastguard) are simulated. Rate-distortion performance is evaluated over packet-loss channels with random packet loss ratios (PLR) of 1%, 5%, 10%, and 20%. Experimental results show that, compared with the conventional error concealment method, the proposed scheme effectively improves the rate-distortion performance of the reconstructed video under different channel conditions. Specifically, when the key-frame loss rate is 5%, transmitting only the parity data of the first enhancement layer improves the PSNR of the reconstructed video by about 2 to 5 dB, and additionally transmitting the parity data of the second enhancement layer raises the PSNR by a further 0.5 to 1.6 dB. If the parity data of all three enhancement layers are transmitted, the decoded video achieves essentially the same quality as error-free key frames. When the loss ratio is relatively high, such as 20%, the video reconstructed by the typical error concealment method can hardly meet basic quality requirements. With the proposed scheme, transmitting the parity data of the first enhancement layer improves the PSNR by about 4.5 to 8.3 dB; additionally transmitting the parity data of the second enhancement layer raises it by a further 2.7 to 4.1 dB; and transmitting the parity data of all three enhancement layers raises it by a further 3.7 to 4.6 dB. In general, transmitting different numbers of enhancement layers yields different reconstructed video quality. Conclusion: Experimental results indicate that the proposed error protection scheme for key frames in wavelet-domain DVC improves the robustness of key frames.
The proposed framework can also improve rate-distortion performance under different channel conditions and requirements. However, the scheme relies on a feedback channel, which introduces decoding delay, so rate estimation at the encoder side is a direction for future research.
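For readers who want to reproduce the kind of quality figures quoted in the Result, the following is a minimal sketch of the standard PSNR computation for luminance samples. It assumes 8-bit samples (peak value 255) stored as flat sequences; the function name `psnr` and the data layout are illustrative choices and are not taken from the paper.

```python
import math


def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two sample sequences.

    PSNR = 10 * log10(peak^2 / MSE), where MSE is the mean squared
    error between the reference and reconstructed samples.
    """
    if len(reference) != len(reconstructed) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")

    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)

    if mse == 0.0:
        # Identical frames: PSNR is unbounded.
        return float("inf")
    return 10.0 * math.log10(peak * peak / mse)
```

The PSNR gain of one reconstruction over another is then simply the difference of the two values in dB, measured against the same original frame.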