6.5  Conclusions and perspectives

This chapter has presented a new coding algorithm for depth images that explicitly exploits the smooth properties of depth signals. Regions are modeled by piecewise-linear functions and are separated by straight lines along their boundaries. The algorithm employs a quadtree decomposition, so that small details can be coded with small nodes while a large homogeneous region is coded with a single node. The performance of the full coding algorithm is controlled by three coding and modeling aspects: the level of the quadtree segmentation, the overall coefficient-quantization setting for the image, and the choice of the modeling function. All three aspects are governed by a Lagrangian cost function, in order to impose a global rate-distortion constraint on the coding of the complete image.

For typical bit rates (i.e., between 0.01 bit/pixel and 0.25 bit/pixel), experiments have revealed that the coder outperforms a JPEG-2000 encoder by 0.6 to 3.0 dB. Furthermore, we have shown that the proposed depth coder yields a rendering-quality improvement of up to 0.4 dB when compared to a JPEG-2000 encoder. Farin et al. [69] have reported that the coding system discussed in this chapter can be improved further by exploiting the inter-block redundancy within the quadtree; decorrelating the remaining dependencies between neighboring blocks yields additional gains of 0.3 to 1.0 dB. In addition, Merkle et al. have compared the proposed depth coder with an H.264/MPEG-4 AVC encoder (intra-coding). Their experimental results show that the H.264/MPEG-4 AVC coder slightly outperforms the proposed coder in bit rate. However, because the proposed coding system is specialized to the characteristics of depth images, it outperforms H.264/MPEG-4 AVC (intra-coding) in rendering quality.
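To illustrate the Lagrangian control of the quadtree, the following minimal sketch selects, per node, between a single piecewise-linear (planar) leaf and a four-way split, keeping whichever alternative minimizes J = D + λR. The planar least-squares fit, the rate constants, and the value of λ are illustrative assumptions only; the actual coder additionally optimizes the choice of modeling function and supports straight boundary lines within a block.

```python
import numpy as np

LAMBDA = 0.1  # Lagrange multiplier trading distortion against rate (illustrative)

def fit_linear(block):
    """Least-squares fit of a planar model z = a*x + b*y + c to a depth block."""
    h, w = block.shape
    ys, xs = np.mgrid[0:h, 0:w]
    A = np.column_stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    coeffs, *_ = np.linalg.lstsq(A, block.ravel(), rcond=None)
    return (A @ coeffs).reshape(h, w)

def cost(block, rate):
    """Lagrangian cost J = D + lambda * R, with D the SSE of the planar fit."""
    d = float(np.sum((block - fit_linear(block)) ** 2))
    return d + LAMBDA * rate

def encode(block, min_size=4, leaf_rate=32.0, split_rate=1.0):
    """Recursively decide leaf vs. split; return (cost, number of leaves)."""
    leaf_cost = cost(block, leaf_rate)
    h, w = block.shape
    if h <= min_size or w <= min_size:
        return leaf_cost, 1
    quads = [block[:h // 2, :w // 2], block[:h // 2, w // 2:],
             block[h // 2:, :w // 2], block[h // 2:, w // 2:]]
    split_cost, leaves = split_rate, 0
    for sub in quads:
        c, n = encode(sub, min_size, leaf_rate, split_rate)
        split_cost += c
        leaves += n
    # Keep the split only if it lowers the Lagrangian cost.
    if split_cost < leaf_cost:
        return split_cost, leaves
    return leaf_cost, 1
```

With this cost trade-off, a perfectly planar depth block is coded as a single node, whereas a block containing a depth discontinuity is split until the sub-blocks are well approximated by planes.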

Although the presented coding scheme is still at an early stage of development, it has the potential to significantly improve not only depth-compression performance, but also the rendering quality of synthetic images in a multi-view system. Given these promising results, an interesting extension would be to let the encoder also exploit temporal redundancy and spatial inter-view redundancy.

Finally, we briefly comment on the complexity of the proposed coding framework. It should be noted that the search for the best modeling function is computationally expensive, but it is carried out only at the encoder. The corresponding decoder is very simple and can be executed on a regular platform. The second computationally important function is the rate-distortion optimization. Given the consistency of depth images over time, the R-D optimized settings of the previous frame could be reused and refined for the current frame. This would significantly reduce the amount of computation without sacrificing much performance.
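The proposed refinement could be sketched as follows: instead of evaluating the Lagrangian cost J = D + λR for every candidate quantizer setting, the encoder evaluates only a small neighborhood around the setting chosen for the previous frame. The cost tables, the Lagrange multiplier, and the one-step search window below are illustrative assumptions, not values from this chapter.

```python
LAMBDA = 0.85  # Lagrange multiplier (illustrative)

def j_cost(distortions, rates, q):
    """Lagrangian cost J = D(q) + lambda * R(q) for one quantizer index q."""
    return distortions[q] + LAMBDA * rates[q]

def full_search(distortions, rates):
    """Exhaustive R-D optimization over all quantizer settings."""
    return min(range(len(rates)), key=lambda q: j_cost(distortions, rates, q))

def refine_search(distortions, rates, q_prev):
    """Warm-started R-D optimization: evaluate only the previous frame's
    setting and its immediate neighbors (at most 3 candidates)."""
    candidates = [q for q in (q_prev - 1, q_prev, q_prev + 1)
                  if 0 <= q < len(rates)]
    return min(candidates, key=lambda q: j_cost(distortions, rates, q))
```

Because depth images vary slowly over time, the optimum for the current frame usually lies within one step of the previous frame's choice, so the refinement search typically finds the same setting as the exhaustive search while evaluating only three candidates.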