
MultiMedia Coding (UniMI)

The course started in 2007. This page is updated frequently during the teaching period (March-June).

The course aims to give basic notions of digital processing of multimedia signals (pictures, video and audio).

Lecture notes

    • 1 Introduction (pdf 1.6Mb)
    • 2 Sampling and Quantization (pdf 6.3Mb)
    • 3 Filters, Up/Down/Re-sampling (pdf 4.4Mb)
    • 4 Entropy coding (pdf 0.7Mb)
    • 5 Video coding: JPEG and MPEG-2 (pdf 2.6 Mb)
    • 5 Video coding: H.264 (pdf 1 Mb)
    • 5 Video coding: SVC (pdf 1.6 Mb)
    • 5 Video coding: MVC (pdf 5.3 Mb)
    • 6 Audio coding (pdf 2 Mb)

1 Introduction (pdf 1.6Mb)

Course outline. From analog to digital signal representation, compression issues, transmission issues.

Examples for transmission: Multiple Description, Adaptive Playout.

Examples for compression of video: JPEG, MPEG-2, H.264, SVC; and for audio: MP3, vocoder.

2 Sampling and Quantization (pdf 6.3Mb)

Sampling theorem, filters to reject interfering signals and avoid aliasing. Reconstruction by sinc function (cardinal sine), zero-order hold, filters to reject harmonics.
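As a small illustration of aliasing, the sketch below computes the apparent frequency of a sinusoid after sampling. The function name `alias_frequency` is chosen here for illustration and is not taken from the lecture notes.

```python
def alias_frequency(f, fs):
    """Apparent frequency (0..fs/2) of a sinusoid at f when sampled at rate fs."""
    # Sampling cannot distinguish f from f + k*fs, so fold f onto the
    # nearest multiple of fs and take the distance from it.
    return abs(f - fs * round(f / fs))

print(alias_frequency(7, 10))   # a 7 Hz tone sampled at 10 Hz
print(alias_frequency(3, 10))   # a 3 Hz tone is below fs/2 and is unchanged
```

A tone above half the sampling rate shows up at a lower frequency, which is why the anti-aliasing filter has to remove it before sampling.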

Quantizers with uniform step with/without dead-band, dithering, delta-sigma and noise-shaping. Vector quantization. Non-uniform / non-linear quantization, gamma correction.
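A minimal sketch of the two uniform quantizers, assuming a scalar input and a fixed step. The function names are illustrative only.

```python
import math

def quantize_uniform(x, step):
    """Uniform quantizer without dead-band: round to the nearest multiple of step."""
    return math.floor(x / step + 0.5)

def quantize_deadband(x, step):
    """Uniform quantizer with dead-band: truncate toward zero,
    so every input in (-step, step) maps to index 0."""
    return int(x / step)  # int() truncates toward zero

def dequantize(index, step):
    """Reconstruction at index * step."""
    return index * step

print(quantize_uniform(0.6, 1.0), quantize_deadband(0.6, 1.0))
```

With this simple reconstruction, `dequantize` returns the bin centre for the first quantizer and the bin edge nearest zero for the dead-band one; practical decoders typically add an offset in the second case.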

Spatial sampling of Red/Green/Blue, color spaces (YCbCr), under-sampling of luma/chroma (4:2:2 and 4:2:0 formats). Aspect ratio: 4:3, 16:9 (wide-screen). Temporal sampling to get smooth motion (24 Hz), to display (50-60 Hz), to avoid large-area flicker (100-120 Hz). Spatial/temporal sampling, interlaced video. Conversions: progressive-interlaced, 50-100 Hz, 4:3-16:9.
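The color conversion and chroma under-sampling can be sketched as follows. This assumes the full-range BT.601 matrix as used in JFIF/JPEG files and a simple 2x2 average for 4:2:0; the lecture notes may use other coefficients or filters.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 conversion (JFIF convention), inputs in 0..255."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def subsample_420(plane):
    """Average each 2x2 block of a chroma plane (even dimensions assumed)."""
    return [[(plane[y][x] + plane[y][x + 1]
              + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
             for x in range(0, len(plane[0]), 2)]
            for y in range(0, len(plane), 2)]

print(rgb_to_ycbcr(128, 128, 128))      # grey: chroma sits at mid-scale
print(subsample_420([[1, 3], [5, 7]]))  # one chroma sample per 2x2 block
```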

Analog TV. Quadrature amplitude modulation of color, luma/chroma separation. Frequency modulation of audio; stereo - dual channel audio. Spectrogram.

Quality, objective assessment (PSNR, peak signal-to-noise ratio). Examples of noisy, blurred and blocky images. Enhancement filters: lowpass filters, median filters.
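For reference, PSNR can be computed as in the sketch below, assuming 8-bit images given as flat lists of pixel values (the `peak` default of 255 reflects that assumption).

```python
import math

def psnr(original, degraded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, degraded)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)

# An error of one grey level on every pixel gives an MSE of 1.
print(round(psnr([10, 20, 30, 40], [11, 21, 31, 41]), 2))
```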

3 Filters, Up/Down/Re-sampling (pdf 4.4Mb)

Frequency domain processing. Examples of 2D spectrum of images.

Filter. Frequency response, phase and group delay. Symmetric coefficients (linear phase) and symmetric frequency response (half-band). Quadrature mirror filters. Examples: moving averages and comb filters with zero / unit coefficients. Concatenated filters. Sharpening.
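As an example, the frequency response of a moving average can be evaluated directly from its coefficients. The sketch assumes a causal FIR filter and angular frequency in radians per sample; the function name is illustrative.

```python
import cmath
import math

def frequency_response(h, omega):
    """H(omega) = sum over n of h[n] * exp(-j * omega * n) for a causal FIR filter."""
    return sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(h))

moving_average = [0.25] * 4
print(abs(frequency_response(moving_average, 0.0)))          # unit gain at DC
print(abs(frequency_response(moving_average, math.pi / 2)))  # a zero of the comb
print(cmath.phase(frequency_response(moving_average, 0.1)))  # linear phase
```

Because the coefficients are symmetric, the phase is linear: it equals -omega * (N - 1) / 2, that is, a constant group delay of 1.5 samples for N = 4.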

Filter implementation. FIR/IIR parts. Direct form I/II and transposed forms. Second order sections. Filter design: sampled sinc function (cardinal sine) as lowpass prototype. Modulation of coefficients.
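A minimal design sketch along these lines: a sampled sinc is used as lowpass prototype, then its coefficients are modulated to obtain a highpass. The Hamming window and the normalization to unit DC gain are assumptions added here, not necessarily the choices made in the lecture notes.

```python
import math

def lowpass_prototype(num_taps, cutoff):
    """Windowed sampled sinc; cutoff in cycles/sample (0 < cutoff < 0.5)."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        if t == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))  # Hamming
        taps.append(ideal * window)
    gain = sum(taps)
    return [c / gain for c in taps]  # normalize to unit gain at DC

def modulate_to_highpass(taps):
    """Multiply by (-1)^n: shifts the response by half the sampling rate."""
    return [c * (-1) ** n for n, c in enumerate(taps)]

lowpass = lowpass_prototype(21, 0.25)
highpass = modulate_to_highpass(lowpass)
```

With a cutoff of 0.25 the prototype is a half-band filter: every second coefficient away from the centre is (numerically) zero.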

Interpolation by zero insertion and filtering. Filtering and decimation. Examples for 1:2, 1:3, 2:1 and 3:1 up/downsamplers. Sample rate converters (SRC): synchronous; asynchronous with stored coefficients (polyphase, two branch polyphase with linear refinement) and computed coefficients (Farrow, modified Farrow).
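The 1:2 and 2:1 cases can be sketched as below. The short kernels (linear interpolation for upsampling, a 3-tap average before decimation) are assumptions chosen to keep the example small; longer filters give better rejection of images and aliases.

```python
def fir_same(x, h):
    """Centered FIR filtering with zero padding; output has len(x) samples."""
    half = len(h) // 2
    out = []
    for n in range(len(x)):
        acc = 0.0
        for k, coeff in enumerate(h):
            i = n + half - k
            if 0 <= i < len(x):
                acc += coeff * x[i]
        out.append(acc)
    return out

def upsample2(x):
    """1:2 interpolation: insert a zero after each sample, then lowpass."""
    stuffed = []
    for s in x:
        stuffed.extend([s, 0])
    return fir_same(stuffed, [0.5, 1.0, 0.5])

def downsample2(x):
    """2:1 decimation: lowpass first, then keep every second sample."""
    return fir_same(x, [0.25, 0.5, 0.25])[::2]

print(upsample2([2, 4, 6]))
```

The last output sample is pulled toward zero because the signal is zero-padded at the border; real implementations usually extend the signal by mirroring or repetition instead.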

Examples of interpolation/decimation applied to images and video. Conversion among QCIF, CIF and 4CIF formats. Conversion between interlaced 4:2:0 and 4:2:2 (field dependent processing).

Filterbanks: 2D Discrete Cosine Transform. Application to compression of images, DPCM compression (applied to DC coefficient of 2D DCT), quadtree compression. MP3 filter-bank. Application to music compression (masking effects).
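A direct (non-fast) sketch of the 2D DCT, computed separably on rows and then columns. The orthonormal scaling of the DCT-II is assumed here; other conventions differ only by constant factors.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)))
    return out

def dct_2d(block):
    """Separable 2D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]

flat_block = [[10] * 8 for _ in range(8)]
coeffs = dct_2d(flat_block)
print(round(coeffs[0][0], 6))  # all the energy of a flat block is in the DC term
```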

Filters study using MATLAB. Length estimation, short versus long filters. Test images. Examples of filtered images.

4 Entropy coding (pdf 0.7Mb)

Entropy, entropy of DPCM, entropy of groups of symbols. Entropy of variable length codes (VLC), instantaneous decodability.
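The effect of DPCM on entropy can be seen with a small sketch that estimates the first-order entropy from symbol counts. The ramp signal is an illustrative choice, and the predictor is assumed to be simply the previous sample.

```python
import math
from collections import Counter

def entropy(symbols):
    """First-order entropy in bits/symbol, estimated from relative frequencies."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def dpcm(samples):
    """Differences between each sample and the previous one (first predictor is 0)."""
    prev, out = 0, []
    for s in samples:
        out.append(s - prev)
        prev = s
    return out

ramp = list(range(8))
print(entropy(ramp))        # 8 equally likely values
print(entropy(dpcm(ramp)))  # the differences are almost always the same value
```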

Huffman codes, Huffman codes for groups of symbols, non-binary Huffman codes. Other VLC codes: unary codes, Golomb codes, Rice codes. Fixed length code for variable number of symbols: Tunstall codes. Arithmetic coding (with some implementation detail).
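A compact sketch of binary Huffman code construction using a priority queue. Symbol weights can be counts or probabilities; the tie-breaking counter is an implementation detail added here so that equal weights are merged in a deterministic order.

```python
import heapq
from itertools import count

def huffman_code(weights):
    """Return {symbol: bit string} for a dict of symbol weights."""
    tiebreak = count()
    heap = [(w, next(tiebreak), {sym: ""}) for sym, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least likely subtrees, prefixing their codewords.
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, next(tiebreak), merged))
    return heap[0][2]

code = huffman_code({"a": 4, "b": 2, "c": 1, "d": 1})
print(code)
```

For these dyadic weights the codeword lengths are 1, 2, 3 and 3 bits, so the average length equals the entropy (1.75 bits/symbol). A single-symbol alphabet yields an empty codeword in this sketch and would need special handling.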

Dictionary techniques: Lempel-Ziv (LZ77, LZ78), Lempel-Ziv Welch (LZW). Other techniques: move-to-front; invertible block sorting: Burrows-Wheeler transform (BWT).
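Block sorting followed by move-to-front can be sketched as below. This is the naive version that sorts all rotations; the end marker `$` is an assumption of the sketch and must not occur in the input.

```python
def bwt(text, end="$"):
    """Burrows-Wheeler transform: last column of the sorted rotations."""
    s = text + end
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(row[-1] for row in rotations)

def inverse_bwt(last, end="$"):
    """Invert by repeatedly prepending the last column and re-sorting."""
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(c + row for c, row in zip(last, table))
    original = next(row for row in table if row.endswith(end))
    return original[:-1]

def move_to_front(text, alphabet):
    """Replace each symbol by its position in a list updated after every symbol."""
    symbols = list(alphabet)
    out = []
    for ch in text:
        i = symbols.index(ch)
        out.append(i)
        symbols.insert(0, symbols.pop(i))
    return out

transformed = bwt("banana")
print(transformed)
print(move_to_front(transformed, "$abn"))
```

The transform groups equal symbols together, so move-to-front tends to produce small indices that a subsequent entropy coder compresses well; the effect is clearer on longer inputs than this one.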

5 Video coding: JPEG and MPEG-2 (pdf 2.6 Mb)

JPEG image coding: block discrete cosine transform (DCT), quantization, DPCM, zig-zag scan, run-level coding, variable length coding.
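The zig-zag scan and run-level coding steps can be sketched as follows. The `"EOB"` marker and the (run, level) tuple format are illustrative; actual JPEG maps these pairs to variable length codes, and the DC coefficient is coded separately by DPCM.

```python
def zigzag_order(n):
    """(row, column) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):
        # All positions on the anti-diagonal row + column = s, row increasing.
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diag.reverse()  # even diagonals are scanned bottom-left to top-right
        order.extend(diag)
    return order

def run_level(coeffs):
    """Encode a coefficient list as (zero run, non-zero level) pairs plus end-of-block."""
    pairs, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    pairs.append("EOB")  # trailing zeros are implied
    return pairs

block = [[9, 2, 0],
         [0, 0, 0],
         [1, 0, 0]]
scanned = [block[r][c] for r, c in zigzag_order(3)]
print(run_level(scanned[1:]))  # AC coefficients only
```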

MPEG-1 and MPEG-2 video coding: profiles and levels; hierarchy from groups of pictures (GOP), to slices, macroblocks and blocks of pixels; DCT transform and quantization; temporal prediction: motion estimation, motion compensation (ME/MC). Data partitioning, SNR scalability, spatial scalability.
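Motion estimation is not specified by the standards; a common textbook approach is full-search block matching with the sum of absolute differences (SAD), sketched below. Frames are assumed to be lists of rows of luma samples, and all names are illustrative.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and its displaced reference."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total

def full_search(cur, ref, bx, by, size, radius):
    """Return (best SAD, dx, dy) over all displacements within +/- radius."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            inside = (0 <= by + dy and by + dy + size <= height
                      and 0 <= bx + dx and bx + dx + size <= width)
            if not inside:
                continue  # candidate block falls outside the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

Motion compensation then copies the reference block at the chosen displacement as the prediction, and only the motion vector and the prediction residual are coded.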

5 Video coding: H.264 (pdf 1 Mb)

H.264 video coding: intra spatial prediction; inter temporal prediction with variable block size, multiple reference frames, generalized B pictures, reference B pictures, weighted prediction. Integer pseudo-DCT, Hadamard transform. Non-linear extended-range quantization. Deblocking loop filter. Context adaptive variable length coding (CAVLC) and binary arithmetic coding (CABAC). Profiles.
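The integer pseudo-DCT can be illustrated with the 4x4 forward core transform of H.264, which uses only small integers. The sketch omits the scaling factors that the standard folds into quantization, so the outputs are not normalized.

```python
# 4x4 forward core transform matrix of H.264.
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]

def matmul(a, b):
    """Product of two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def forward_core_transform(block):
    """Y = CF * X * CF^T, computed exactly in integer arithmetic."""
    return matmul(matmul(CF, block), transpose(CF))

flat = [[3] * 4 for _ in range(4)]
print(forward_core_transform(flat)[0])  # only the DC coefficient is non-zero
```

Since everything stays in integers, encoder and decoder compute bit-identical results, which avoids the drift that floating-point DCT mismatches can cause.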

5 Video coding: SVC (pdf 1.6 Mb)

Scalable Video Coding (SVC): temporal scalability by using hierarchy of temporal prediction, spatial scalability by down/up sampling, SNR scalability by re-quantization. Adaptive GOP structure. Extended spatial scalability. Fine grain SNR scalability. Motion compensated temporal filtering (MCTF), temporal/spatial wavelet transform. Access units, packetization and layer dependency.
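Temporal scalability through hierarchical prediction can be illustrated by assigning each picture to a temporal layer. The sketch assumes a dyadic hierarchy with a power-of-two GOP size; the function name is hypothetical.

```python
def temporal_layer(index, gop_size):
    """Temporal layer of a picture in a dyadic hierarchy (gop_size a power of two)."""
    levels = gop_size.bit_length() - 1  # log2 of the GOP size
    pos = index % gop_size
    if pos == 0:
        return 0  # key pictures form the base layer
    trailing_zeros = (pos & -pos).bit_length() - 1
    return levels - trailing_zeros

print([temporal_layer(i, 8) for i in range(9)])
```

Discarding the highest layer removes every second picture and halves the frame rate, while the remaining pictures stay decodable because they are predicted only from lower layers.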

5 Video coding: MVC (pdf 5.3 Mb)

Multiview Video Coding (MVC): 3DTV, free viewpoint television; multi-view/3D video capture, 3D video display; 3D picture/video format, depth map extraction, rendering and synthesis.

6 Audio coding (pdf 2 Mb)

Human auditory system (HAS), masking effects in time and frequency, de-masking. Filterbanks. MPEG-1 layer I, II and III (mp3), MPEG-2 advanced audio coding (AAC).

Other resources

UniMI Andrea Vitali

Program of MultiMedia Coding

Suggestions

Received so far:

    • page numbers
    • split into smaller units
    • summaries

Created: 2nd April 2007. Updated: 31st July 2008.