|
Scalable error-resilient coding of video signals
Presenter: Fabio Verdicchio - ETRO-VUB
Abstract
Modern multimedia applications require coding techniques that can adapt the transmitted stream to the needs of a variety of end-users with different bandwidth provisions and playback capabilities (e.g. display resolution, refresh rate and processing power), and that at the same time provide a controllable degree of resilience against erasures, as retransmission of lost data is often impractical. Assuming reliable transmission, scalable video coding (SVC) enables the extraction of several subsets from a single layered stream, each producing a different visual quality, frame rate and resolution. From a complementary perspective, multiple description coding (MDC) mitigates the effect of channel erasures by generating complementary representations of the input, called descriptions, which are individually decodable and which, upon reception of any additional description, mutually refine the accuracy of the reconstructed signal.
Driven by the above considerations, this thesis investigates a novel design for scalable erasure-resilient video coding that aims to combine (i) the efficiency of the open-loop SVC architecture, based on a spatiotemporal multi-resolution decomposition of the input, with (ii) the error resilience and scalability features of embedded multiple description scalar quantization (EMDSQ), which extends the MDC paradigm to the progressive transmission of each description.
Our first contribution is a video coding architecture that generates M fully scalable representations of the sequence. Targeting packet-switched networks, we extend the concept of EMDSQ decoding by accounting for the fact that descriptions may not be received at the same quantization accuracy; this corresponds to the realistic case of descriptions that span several packets and are only partially received. A crucial consequence of this extension is the ability to control the redundancy within the MDC representation of the source after compression has been performed, by varying the number of transmitted descriptions. As a result, different representations of each spatiotemporal subband, featuring different amounts of redundancy, can be produced without additional coding stages.
Our second original contribution is a novel formulation of the MDC rate-allocation problem, which allows the redundancy to be tuned across the spatiotemporal decomposition of the sequence. To this end, we first establish the link between the representation accuracy of each spatiotemporal subband or motion field and the distortion in the reconstructed sequence. We then derive the expected distortion at the decoder within a stochastic framework that accounts for both source quantization and transmission errors.
Given the available bit-rate and the packet-loss rate, our codec selects for each subband (i) the optimal subset of coded descriptions and (ii) the relative quantization accuracy, in order to minimize the expected distortion in the decoded video. Numerical results clearly demonstrate the advantages of the proposed approach over equivalent codec instantiations employing EMDSQ with non-scalable redundancy or other MDC principles such as data-partitioning.
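The selection step described above can be illustrated with a small sketch. It is not the thesis's allocation algorithm: the distortion model (`var * 2**(-2 * rate)`), the fixed redundancy factor `rho`, the assumption of independent description losses, the exhaustive search, and all function and parameter names are hypothetical choices made only to show the shape of the problem, namely choosing, per subband, a number of descriptions and a quantization accuracy that minimize the expected distortion under a rate budget.

```python
from itertools import product
from math import comb


def expected_distortion(m, r, var, p, rho=0.5):
    """Expected distortion of one subband sent as m descriptions of r bits each.

    Assumptions (illustrative only): each description is lost independently
    with probability p; the first received description contributes r bits of
    effective rate and every further one contributes (1 - rho) * r bits;
    distortion follows the toy model var * 2^(-2 * effective_rate).
    """
    total = 0.0
    for k in range(m + 1):  # k = number of descriptions actually received
        prob = comb(m, k) * (1 - p) ** k * p ** (m - k)
        eff_rate = 0.0 if k == 0 else r * (1 + (k - 1) * (1 - rho))
        total += prob * var * 2.0 ** (-2.0 * eff_rate)
    return total


def allocate(variances, budget, p, max_desc=3, rates=(1, 2, 3, 4)):
    """Brute-force search for the per-subband (descriptions, rate) choice.

    Returns (expected distortion, choice), where choice holds one (m, r)
    pair per subband and the total cost sum(m * r) stays within the budget.
    The pair (0, 0) means the subband is not transmitted at all.
    """
    options = [(0, 0)] + [(m, r) for m in range(1, max_desc + 1) for r in rates]
    best = None
    for choice in product(options, repeat=len(variances)):
        cost = sum(m * r for m, r in choice)
        if cost > budget:
            continue
        dist = sum(
            expected_distortion(m, r, v, p) for (m, r), v in zip(choice, variances)
        )
        if best is None or dist < best[0]:
            best = (dist, choice)
    return best


if __name__ == "__main__":
    # Two hypothetical subbands with different variances, 10% packet loss.
    print(allocate([4.0, 1.0], budget=8, p=0.1))
```

In this toy setting a lossless channel (`p = 0`) never rewards sending more than one description per subband, while increasing `p` makes redundant descriptions worthwhile for the high-variance subbands. This is the trade-off that the rate allocation described above is meant to balance.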
|