## 7.2 Joint depth/texture bit allocation

In this section, we first present a joint bit-allocation analysis of depth and texture. Afterwards, we provide an experimental analysis of the two-dimensional R-D model, which enables a fast search algorithm for estimating the optimal quantization parameters.

### 7.2.1 Formulation of the joint bit-allocation problem

Let us consider the problem of jointly coding a texture and depth image at a maximum rate R_{max} with minimum rendering distortion D_{render}. The rate R_{max} and distortion D_{render} functions can be defined as follows. First, the maximum rate value R_{max} is decomposed into the sum of the individual rates required for texture and depth coding. Because the texture and depth images are coded with two different quantizer settings (denoted q_{t} and q_{d}, respectively), the texture and depth rate functions can be written as R_{t}(q_{t}) and R_{d}(q_{d}), respectively. The joint rate function can therefore be specified as

R(q_{t},q_{d}) = R_{t}(q_{t}) + R_{d}(q_{d}). | (7.1) |

Second, the rendering distortion function D_{render} depends on the image rendering
algorithm. The rendering algorithm relies on the quality of the input texture and
depth images and therefore on the quantization parameters q_{t} and q_{d}.
Consequently, since both quantizers jointly determine a single rendering quality, we
define a joint rendering distortion D_{render}(q_{t},q_{d}).

The goal of the joint bit-allocation optimization is to determine the optimal
quantization parameters (q_{t}^{opt},q_{d}^{opt}) for coding the depth and texture images,
such that the rendering distortion is minimized. The optimization problem can
now be formulated as finding the minimum of the rendering distortion,
hence

(q_{t}^{opt},q_{d}^{opt}) = arg min_{(q_{t},q_{d})∈Q} D_{render}(q_{t},q_{d}), | (7.2) |

under the constraint that the joint bit rate is bounded by R_{max}, so that

R_{t}(q_{t}) + R_{d}(q_{d}) ≤ R_{max}, | (7.3) |

where Q denotes the set of all possible quantizer settings. Without prior assumptions, solving Equation (7.2) involves an exhaustive search over Q to find the quantizer setting with minimum distortion. Fortunately, a more efficient search can be performed by exploiting special properties of the R-D function. For example, assuming a smooth, monotonic R-D surface model, hierarchical optimization techniques can be employed to find the best setting. Therefore, prior to investigating fast search algorithms, we provide a performance-point analysis of the R-D function to validate the smoothness of the surface.
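As a baseline, the exhaustive search over Q can be sketched as follows. The closed-form R-D curves and the quantizer range are illustrative stand-ins (real values would come from measured encoder rates and rendering distortions), and the function name `exhaustive_search` is ours, not taken from the text:

```python
import itertools

def exhaustive_search(D_render, R_t, R_d, q_range, R_max):
    """Brute-force solution of (7.2) under constraint (7.3): visit every
    (q_t, q_d) pair in Q and keep the feasible pair with minimum distortion."""
    best, best_d = None, float("inf")
    for q_t, q_d in itertools.product(q_range, repeat=2):
        if R_t(q_t) + R_d(q_d) > R_max:   # rate constraint (7.3) violated
            continue
        d = D_render(q_t, q_d)            # rendering distortion (7.2)
        if d < best_d:
            best, best_d = (q_t, q_d), d
    return best, best_d

# Toy convex R-D models, only to make the sketch runnable:
R_t = lambda q: 100.0 / q
R_d = lambda q: 40.0 / q
D_render = lambda q_t, q_d: q_t ** 2 + 0.5 * q_d ** 2

(q_t_opt, q_d_opt), d_min = exhaustive_search(
    D_render, R_t, R_d, range(1, 52), R_max=30.0)
```

The cost is quadratic in the number of quantizer levels, which is exactly what the monotonicity assumption and the fast search of the following sections avoid.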

### 7.2.2 R-D surface analysis

To analyze the R-D function, we construct a surface using an input data set composed of multi-view images and their corresponding depth images. The rendering algorithm is based on relief texture mapping (see Chapter 4). We generate the R-D surface by measuring the rendering distortion for all quantizers (q_{t},q_{d}) within a search range q_{min} ≤ q_{t},q_{d} ≤ q_{max}. In total, k = q_{max} - q_{min} + 1 compression iterations of the depth and texture images are carried out, which yields k × k R-D performance points. In our specific case, we employ an H.264/MPEG-4 AVC encoder to compress the reference texture and depth images. However, since the proposed joint bit-allocation method is generic, any depth and texture encoder can be employed, as long as it has a controllable quantizer.

To measure the rendering distortion, one solution is to warp a coded reference image using the corresponding depth image. The rendering distortion is evaluated by calculating the Mean Squared Error (MSE) between the rendered image and the corresponding image captured at the same location and orientation (see Figure 7.1).
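This per-view error measure can be sketched as follows with NumPy; the 4×4 images are toy stand-ins for a rendered view and the co-located captured view:

```python
import numpy as np

def mse(rendered, captured):
    """Mean Squared Error between a rendered view and the image actually
    captured at the same camera position and orientation."""
    diff = rendered.astype(np.float64) - captured.astype(np.float64)
    return float(np.mean(diff ** 2))

# Illustrative 8-bit images: identical except for one 4-level rendering error.
captured = np.full((4, 4), 100, dtype=np.uint8)
rendered = captured.copy()
rendered[0, 0] = 104
err = mse(rendered, captured)   # 4**2 spread over 16 pixels -> 1.0
```

Casting to float before subtracting avoids the unsigned-integer wrap-around that would otherwise corrupt negative differences.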

Therefore, considering an N-view data set and a selected quantizer setting
(q_{t},q_{d}), N − r distortion measures can be obtained (excluding the r reference
views). To obtain a single rendering-distortion measurement, these N − r
measures are averaged. The pseudo-code of the R-D surface construction
algorithm is summarized in Algorithm 5.
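A minimal runnable sketch of this construction loop is given below. The `encode` and `render` callables are toy stand-ins for the actual chain (an H.264/MPEG-4 AVC intra encoder and the relief-texture renderer of Chapter 4), and the function names are ours; any codec with a controllable quantizer could be substituted:

```python
import numpy as np

def build_rd_surface(encode, render, views, q_range, ref=0):
    """Sketch of the k x k R-D surface construction (cf. Algorithm 5).

    encode(img, q) -> (decoded image, rate); render(tex, depth, v)
    synthesizes view v from the decoded reference texture/depth pair."""
    textures, depths = views["texture"], views["depth"]
    N = len(textures)
    surface = {}
    for q_t in q_range:                        # k texture compressions
        tex_hat, r_t = encode(textures[ref], q_t)
        for q_d in q_range:                    # k depth compressions
            dep_hat, r_d = encode(depths[ref], q_d)
            # Average the N - r per-view MSEs (here r = 1 reference view).
            errs = [np.mean((render(tex_hat, dep_hat, v).astype(float)
                             - textures[v].astype(float)) ** 2)
                    for v in range(N) if v != ref]
            surface[(q_t, q_d)] = (r_t + r_d, float(np.mean(errs)))
    return surface

# Toy stand-ins so the sketch runs end-to-end:
def toy_encode(img, q):
    rec = (img // q) * q          # uniform quantizer as a stand-in codec
    return rec, img.size / q      # crude rate model: rate falls with q

def toy_render(tex, depth, v):    # identity "warp", for illustration only
    return tex

views = {"texture": [np.arange(16, dtype=np.uint8).reshape(4, 4)] * 3,
         "depth":   [np.ones((4, 4), dtype=np.uint8)] * 3}
surface = build_rd_surface(toy_encode, toy_render, views, range(1, 4))
```

Each entry of `surface` is one (rate, distortion) performance point, so plotting the dictionary over the (q_{t},q_{d}) grid yields surfaces of the kind shown in Figure 7.2.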

Figure 7.2 shows the R-D surfaces of the two multi-view sequences “Ballet” and “Breakdancers”. To generate the presented surfaces, the first depth and texture images of the reference views are encoded with an H.264/MPEG-4 AVC encoder in intra-mode.

Considering Figure 7.2, it is readily observed that both R-D surfaces show smooth, monotonic properties. Thus far, however, we have only established an empirical validation of the monotonic nature of the R-D surface. Assuming that the rendering-distortion function is indeed monotonic, a fast quantizer-setting search algorithm can be employed.