To evaluate the performance of the search algorithm for the quantizer settings, experiments are carried out using the “Ballet” and “Breakdancers” sequences. The impact of a variable number of reference views is measured by using two different so-called rendering structures. As portrayed by Figure 7.4 (a) and (b), the investigated rendering structures include one and two reference views, respectively.
To generate the R-D surfaces for both sequences, we set qtmin = qdmin = 27 and qtmax = qdmax = 51, so that 25 × 25 R-D points are obtained. For the coding experiments, we have employed the open-source H.264/MPEG-4 AVC encoder x264, and we have encoded the first depth and texture frames of the reference view(s) in intra-mode. The presented experiments quantify the rendering quality obtained in three ways, by using:
- a pre-defined depth bit rate (reserve 10% of the texture bit rate), or
- the quantizers qd,qt determined by performing a full search, or
- the quantizers qd,qt determined by employing a hierarchical search.
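The difference between the full search and a hierarchical search over the 25 × 25 quantizer grid can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function toy_rd_point is a hypothetical stand-in for the actual "encode, render, measure" step, its rate and distortion formulas are arbitrary, and the coarse-to-fine refinement shown here is one plausible form of hierarchical search, not necessarily the exact procedure used in the experiments.

```python
from itertools import product

Q_MIN, Q_MAX = 27, 51  # quantizer range from the text: 25 values per axis


def toy_rd_point(qt, qd):
    """Hypothetical stand-in for encoding, rendering and measuring.

    Returns (total rate in kbit/frame, rendering distortion).
    The formulas are arbitrary and only serve the illustration.
    """
    rate_t = 400.0 * 2 ** (-(qt - Q_MIN) / 6.0)
    rate_d = 150.0 * 2 ** (-(qd - Q_MIN) / 6.0)
    dist = 2.0 * 2 ** ((qt - Q_MIN) / 3.0) + 1.0 * 2 ** ((qd - Q_MIN) / 3.0)
    return rate_t + rate_d, dist


def best_feasible(candidates, r_max, measure):
    """Return (distortion, qt, qd) of the best candidate with rate <= r_max."""
    best = None
    for qt, qd in candidates:
        rate, dist = measure(qt, qd)
        if rate <= r_max and (best is None or dist < best[0]):
            best = (dist, qt, qd)
    return best


def full_search(r_max, measure=toy_rd_point):
    """Evaluate all 25 x 25 quantizer pairs."""
    grid = product(range(Q_MIN, Q_MAX + 1), repeat=2)
    return best_feasible(grid, r_max, measure)


def hierarchical_search(r_max, measure=toy_rd_point, step=8):
    """Assumed coarse-to-fine search: coarse grid first, then refine
    around the current best point while halving the step size."""
    coarse = range(Q_MIN, Q_MAX + 1, step)
    best = best_feasible(product(coarse, coarse), r_max, measure)
    while best is not None and step > 1:
        step //= 2
        _, qt0, qd0 = best
        neighbours = [
            (qt, qd)
            for qt in (qt0 - step, qt0, qt0 + step)
            for qd in (qd0 - step, qd0, qd0 + step)
            if Q_MIN <= qt <= Q_MAX and Q_MIN <= qd <= Q_MAX
        ]
        # The centre point is always included, so the result never gets worse.
        best = best_feasible(neighbours, r_max, measure)
    return best
```

With the default step of 8, this sketch evaluates at most 16 + 3 × 9 = 43 quantizer pairs instead of 625. It is not guaranteed to reach the full-search optimum in general, which is why such a search is termed sub-optimal.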
The pre-defined depth bit rate of 10% is based on experimental results published in the framework of the ATTEST project jointly carried out by Op De Beeck et al. and Smolic et al. [18, 93].
First, considering the experimental setup that employs one reference view for rendering, it can be observed in Figure 7.5 that the proposed joint bit-allocation framework consistently outperforms the pre-defined depth bit-rate coding scheme. For example, Figure 7.5(a) and Figure 7.5(b) show that the joint bit-allocation framework yields a quality improvement of 0.8 dB and 1.0 dB, respectively, at a bit rate of 75 kbit per frame. Additionally, employing the sub-optimal search does not sacrifice rendering performance compared to the full search.
Next, let us consider the second experimental setup, which uses two reference views for rendering (see Figure 7.6). It can first be noted that the proposed bit-allocation framework again consistently outperforms the pre-defined depth bit-rate coding scheme. Specifically, a rendering-quality improvement of up to 0.9 dB and 0.4 dB is obtained for the sequences “Breakdancers” and “Ballet”, respectively. Note that the depth of the “Ballet” sequence is encoded at 20% of the texture bit rate, because a ratio of 10% is not sufficient and even produces annoying noise in the rendered images. Again, as previously observed, a fast hierarchical search can be employed without loss of rendering quality. Thus, the sub-optimal hierarchical search provides a fast and accurate estimation of the optimal R-D point of operation.
Additionally, as expected, the rendering quality depends not only on the sequence characteristics, but also on the rendering structure. For example, a rendering quality of 31.2 dB and 32.5 dB can be obtained at a total bit rate of 150 kbit per frame for the depth and texture images when using one and two reference views, respectively. This result highlights that the rendering quality, and thus the bit-rate distribution, is influenced not only by the texture and depth characteristics of the sequence, but also by the number of reference views employed for rendering. Finally, it can be noted that the R-D curve obtained by optimizing the quantization parameters shows a smooth, logarithmically increasing rendering quality. In contrast, not optimizing the quantization parameters yields an R-D curve with non-smooth and even erratic rendering quality.
Figure 7.5 (a) and (b) Obtained rendering quality for the sequences “Breakdancers” and “Ballet”, respectively, when performing joint bit allocation using one reference view.
Figure 7.6 (a) and (b) Obtained rendering quality for the sequences “Breakdancers” and “Ballet”, respectively, when performing joint bit allocation using two reference views.
Finally, Table 7.1 summarizes the measured rendering distortions for the “Breakdancers” and “Ballet” multi-view sequences and the two rendering structures depicted in Figure 7.4. As expected, it can be noted that the ratio between the depth and texture bit rate depends not only on the multi-view sequence, but also on the number of reference views employed for rendering. Specifically, let us select four R-D points corresponding to the maximal rendering quality in the sub-tables denoted (a), (b), (c) and (d) (see Table 7.1). These four points (underlined in the table) are selected such that the sum of the texture and depth bit rates is less than or equal to 100 kbit/frame, i.e., Rmax ≤ 100. For the “Breakdancers” sequence, it can be seen that the depth bit rate corresponds to 40% and 29% of the total bit rate when using one and two reference views, respectively. Next, for the “Ballet” sequence, a bit-rate ratio of 50% and 38% is measured when using one and two reference views, respectively. Therefore, significant variations in bit-rate distribution are observed for the two multi-view sequences.
The obtained bit-rate ratios can be explained by examining the properties of the multi-view sequences. Specifically, the “Ballet” sequence shows richer 3D content when compared to the “Breakdancers” sequence. As a result, rendering high-quality virtual images of the “Ballet” sequence requires a higher depth bit rate when compared to the “Breakdancers” sequence.
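The selection of such a point and the resulting depth share can be illustrated with a short sketch. The function name and the numeric values below are hypothetical and chosen for illustration only; they are not taken from Table 7.1.

```python
def depth_share_at_best_point(points, r_max):
    """Pick the point with the highest rendering quality whose total rate
    fits the budget, and return (depth share of total rate, quality in dB).

    points: iterable of (texture_rate, depth_rate, psnr_db), rates in kbit/frame.
    Returns None if no point satisfies the rate budget.
    """
    feasible = [p for p in points if p[0] + p[1] <= r_max]
    if not feasible:
        return None
    rate_t, rate_d, psnr_db = max(feasible, key=lambda p: p[2])
    return rate_d / (rate_t + rate_d), psnr_db


# Hypothetical R-D points for illustration only (not measured data).
example_points = [
    (80.0, 10.0, 30.1),
    (60.0, 40.0, 31.0),
    (70.0, 40.0, 31.4),  # highest quality, but exceeds the 100 kbit/frame budget
    (50.0, 30.0, 30.6),
]

share, psnr_db = depth_share_at_best_point(example_points, r_max=100.0)
```

Note that the share computed here is relative to the total bit rate, as in the percentages reported above, whereas the pre-defined 10% rule is stated relative to the texture bit rate.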