Chapter 5
H.264-Based Depth and Texture Multi-View Coding

“Prediction is very difficult, especially about the future.”

— Niels Bohr, Danish physicist

This chapter concentrates on the compression of multi-view depth and multi-view texture video, based on predictive coding. To exploit the inter-view correlation, two view-prediction tools have been implemented and used in parallel: a block-based disparity-compensated prediction and a View Synthesis Prediction (VSP) scheme. Whereas VSP relies on an accurate depth image, the block-based disparity-compensated prediction scheme can be performed without any geometry information. Our encoder employs both strategies and adaptively selects the more appropriate prediction scheme, using a rate-distortion criterion for an optimal prediction-mode selection. The attractiveness of the encoding algorithm is that the compression is robust against inaccurately estimated depth images and requires only two reference cameras for fast random access to different views. We present experimental results for several texture and depth multi-view sequences, yielding a quality improvement of up to 0.6 dB for the texture and 3.2 dB for the depth, when compared to solely performing H.264/MPEG-4 AVC disparity-compensated prediction.
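The rate-distortion criterion mentioned above can be sketched as the standard Lagrangian cost J = D + λ·R used in H.264-style mode decision: each candidate prediction scheme is evaluated per block, and the one with the lowest cost is selected. The following is a minimal illustrative sketch; the function names, distortion/rate figures, and λ value are hypothetical and not taken from the chapter.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian cost J = D + lambda * R (H.264-style RD optimization)."""
    return distortion + lam * rate_bits

def select_prediction_mode(candidates, lam):
    """Pick the (mode, D, R) candidate with the minimum RD cost."""
    return min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))

# Illustrative per-block candidates: disparity-compensated prediction (DCP)
# versus view-synthesis prediction (VSP). Distortion (e.g. SSD) and rate
# values are made up for this example.
candidates = [
    ("DCP", 1200.0, 96),
    ("VSP", 900.0, 140),
]

best = select_prediction_mode(candidates, lam=5.0)
print(best[0])  # prints "VSP": J_DCP = 1680.0, J_VSP = 1600.0
```

In a real encoder the distortion is measured against the reconstructed block and the rate comes from actually (or approximately) entropy-coding the mode, motion/disparity vectors, and residual, but the selection principle is the same.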

 5.1 Introduction
 5.2 Multi-view video coding tools
  5.2.1 Disparity-compensated prediction
  5.2.2 Prediction structures
  5.2.3 Performance bounds
  5.2.4 Coding efficiency versus random access and decoding complexity
 5.3 View Synthesis Prediction (VSP) for N-depth/N-texture coding
  5.3.1 Predictive coding of views
  5.3.2 Incorporating VSP into H.264/MPEG-4 AVC
  5.3.3 Multi-view depth coding aspects when using VSP
 5.4 Experimental results
  5.4.1 Conditions
  5.4.2 Experimental results for multi-view texture coding
  5.4.3 Experimental results for multi-view depth coding
 5.5 Conclusions