7.1  Introduction

Chapter 4 showed that one possible 3D video representation format associates one depth image with each texture image. To enable the 3D-TV application, a simplified version of this representation format, i.e., the 1-depth/1-texture format, was recently adopted and standardized in Part 3 of the MPEG-C video specifications [2]. Following this initiative, a more advanced N-depth/N-texture format is currently being investigated by the Ad Hoc Group on Free Viewpoint Television (FTV) [101]. The FTV framework is based on a representation that combines a reference texture image with the corresponding depth image, which describes the depth of the visible surfaces in the scene. Using such a depth-image-based representation, novel views can subsequently be rendered with image warping algorithms. Transmitting a depth-image-based representation therefore requires compressing multiple texture views together with their associated depth images.

Previous work on the compression of such a data set (texture and corresponding depth images) has addressed the problem by coding each of the two signals individually. For example, several approaches have employed a modified H.264/MPEG-4 AVC encoder to compress either texture [51] or depth [6472] data. Such independent coding yields high compression ratios for texture and depth data individually. However, the influence of texture and depth compression on 3D rendering is not incorporated in these experiments, so that the rendering-quality trade-offs are not considered. Furthermore, recent literature [4354] confirms that the rendering-quality trade-off is sometimes not well understood.

To illustrate the problem of joint compression of texture and depth, let us consider the following two cases. First, assume that the texture and depth images are compressed at very high and very low quality, respectively. In this case, detailed texture is mapped onto a coarse approximation of the object surfaces, which yields rendering artifacts. Alternatively, when the texture and depth images are compressed at low and high quality, respectively, a high-quality depth image is employed to warp a coarsely quantized texture image, which also yields low-quality rendering. These two simple but extreme cases illustrate that a clear dependency exists between the texture- and depth-quality settings, and this dependency holds in the general case as well. Consequently, the quantization settings for both the depth and the texture images should be carefully selected. For this reason, this chapter addresses the following problem statement:
given a maximum bit-rate budget to represent the 3D scene, what is the optimal distribution of the bit rate over the texture and the depth image, such that the 3D rendering distortion is minimized?
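
One compact way to write this problem statement is as a constrained minimization. The symbols below are assumptions introduced here for illustration only: R_t and R_d denote the texture and depth bit rates, R_max the total bit-rate budget, and D_render the distortion of the rendered view. The precise framework is formulated in Section 7.2.

```latex
\begin{equation*}
  (R_t^{\mathrm{opt}},\, R_d^{\mathrm{opt}})
  \;=\; \operatorname*{arg\,min}_{(R_t,\, R_d)} \; D_{\mathrm{render}}(R_t, R_d)
  \quad \text{subject to} \quad R_t + R_d \le R_{\max}.
\end{equation*}
```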

To answer this question, we propose a new compression algorithm with a bit-rate control that unifies the texture and depth Rate-Distortion (R-D) functions. The appeal of the algorithm is that texture and depth data are combined into a joint R-D surface model, from which the optimal bit allocation between texture and depth is calculated. We discuss the optimization of the performance of the joint coding algorithm using a slightly extended H.264/MPEG-4 AVC encoder, where the extension consists of a joint bit-allocation algorithm. Note, however, that the proposed extension can be added to any encoder, e.g., H.264/MPEG-4 AVC, JPEG-2000 or even a platelet-based encoder. Additionally, we have found that our joint model can be readily integrated as a practical sub-system, because it influences the settings of the compression system rather than the actual coding algorithm. As a bonus, this optimal setting is obtained with limited computational effort.
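
To make the idea of a joint bit allocation concrete, the following minimal sketch performs a brute-force search over pairs of texture and depth quantizer settings and keeps the pair with the lowest rendering distortion that still fits the bit-rate budget. It is only an illustration under stated assumptions, not the algorithm proposed in this chapter: the function names, the toy rate models and the toy distortion model are all hypothetical, and a real system would obtain rates and rendering distortion by actually encoding and rendering the views.

```python
from itertools import product


def toy_rate_texture(q):
    # Hypothetical rate model: texture bits fall as the quantizer step q grows.
    return 2000.0 / q


def toy_rate_depth(q):
    # Hypothetical rate model: depth is assumed cheaper to code than texture.
    return 500.0 / q


def toy_render_distortion(q_t, q_d):
    # Hypothetical joint distortion: rendering quality degrades when either
    # the texture or the depth image is coarsely quantized.
    return 0.5 * q_t ** 2 + 0.2 * q_d ** 2


def allocate_bits(q_values, rate_t, rate_d, distortion, budget):
    """Exhaustively search all (q_t, q_d) pairs within the rate budget.

    Returns (distortion, q_t, q_d, total_rate) for the best feasible pair,
    or None when no pair satisfies the budget.
    """
    best = None
    for q_t, q_d in product(q_values, repeat=2):
        total_rate = rate_t(q_t) + rate_d(q_d)
        if total_rate > budget:
            continue  # this pair exceeds the bit-rate budget
        d = distortion(q_t, q_d)
        if best is None or d < best[0]:
            best = (d, q_t, q_d, total_rate)
    return best


if __name__ == "__main__":
    result = allocate_bits(range(1, 52), toy_rate_texture, toy_rate_depth,
                           toy_render_distortion, budget=300.0)
    print(result)
```

The exhaustive search evaluates every quantizer pair, which is exactly the cost that a faster optimization strategy is meant to avoid.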

The remainder of this chapter is structured as follows. Section 7.2 formulates the framework for the joint bit allocation of texture and depth. Section 7.3 describes a fast hierarchical optimization algorithm. Section 7.4 discusses the relationship between the depth and texture bit rates, and Section 7.5 presents possible applications of the proposed algorithm. Experimental results are provided in Section 7.6, and the chapter concludes with Section 7.7.