Tutorial 1 – Into digits

Digital technology is sweeping our industry and affects many parts of our lives. Yet we live in an analog world. Light and sound naturally exist in analog forms and our senses of sight and hearing are matched to that. The first machines to capture, record and manipulate pictures and sound were analog but today it is far easier to do the jobs in the digital domain. Not only does this allow the use of the highly advanced digital components available from the computer industry but it also leads to many new capabilities that were impractical or simply impossible with analog.

The techniques used to move between the analog and digital worlds of television pictures are outlined here. Some of the pitfalls are shown as well as describing why the digital coding standards for standard definition and high definition television (ITU-R BT.601 and ITU-R BT.709) are the way they are.

Why digits?
The digital machines used in television are generally highly complex and many represent the state-of-the-art of digital technology. The initial reason for the popularity of digital techniques was that the scale of the computer industry ensured that the necessary electronic components were both relatively easily available and continued to develop. But the preference for digits is also because of their fidelity and the power they give to handle and manipulate images. Rather than having to accurately handle every aspect of analog signals, all digital circuits have to do is differentiate between, or generate, two electrical states – on and off, high and low, 1 and 0. To read, or correctly interpret, this information accurately requires only recognizing a 1 or 0 state, rather than the value of continuously varying analog signals.This is relatively easy and so leads to superb fidelity in multi-generation recordings, no losses in passing the signal from place to place, plus the potential of processing to produce effects, large-scale storage and many other techniques far beyond those available in analog.

Forty-plus years ago, the technology simply did not exist to convert television pictures into digits. Even if it could have been done there were no systems able to process the resulting data stream at anything like realtime. Today digital machines have successfully reached every aspect of television production – from scene to screen. At the same time costs have tumbled so that today all new equipment, from broadcast professional to consumer level, is digital.

From analog to digital
Initially, digitization involved working with television’s composite signals (PAL and NTSC) but this is now rare. Today it is the component signals (meaning separate signals that together make-up the full colorsignal), not composite, which are digitized according to the ITU-R BT.601 and ITU-R BT.709 digital sampling specifications for SD and HD respectively (film applications uses different ranges of sampling to these TV and video requirements).

‘601’ describes sampling at standard definition and is widely used in TV operations. Sampling for high definition, according to ITU-R BT.709, broadly follows the same principles, but works faster. Both standards define systems for 8-bit and 10-bit sampling accuracy – providing 28 (= 256) and 210 (= 1024) discrete levels with which to describe the analog signals.

There are two types of component signals; the Red, Green and Blue (RGB) and Y, R-Y, B-Y but it is the latter which is by far the most widely used in digital television and is included in the ITU-R BT.601 and 709 specifications. The R-Y and B-Y, referred to as color difference signals, carry the color information while Y represents the luminance. Cameras, telecines, etc., generally produce RGB signals from their image sensors. These are easily converted to Y, R-Y, B-Y using a resistive matrix and filters. This is established analog technology used to prepare video for PAL or NTSC coding.

Analog to digital conversion occurs in three parts: signal preparation, sampling and digitization.

Signal preparation
The analog to digital converter (ADC) only operates correctly if the signals applied to it are correctly conditioned. There are two major elements to this. The first involves an amplifier to ensure the correct voltage and amplitude ranges for the signal are given to the ADC. For example, luminance amplitude between black and white must be set so that it does not exceed the range that the ADC will accept. The ADC has only a finite set of numbers (an 8-bit ADC can output 256 unique numbers – but no more, a 10-bit ADC has 1024 – but no more) with which to describe the signal. The importance of this is such that the ITU-R BT.601 and 709 standards specify this set-up quite precisely saying that, for 8-bit sampling, black should correspond to level 16 and white to level 235, and at 10-bit sampling 64 and 940 respectively. This leaves headroom for errors, noise and spikes to avoid overflow or underflow on the ADC. Similarly for the color difference signals, zero signal corresponds to level 128 (512 for 10-bit) and full amplitude covers only 225 (897) levels.

Tutorial 1 - Into digits

For the second major element the signals must be low-pass filtered to prevent the passage of information beyond the luminance band limit of 5.75 MHz and the color difference band limit of 2.75 MHz, from reaching their respective ADCs. If they did, aliasing artifacts would result and be visible in the picture (more later). For this reason low pass (anti-aliasing) filters sharply cut off any frequencies beyond the band limit. For HD, the principle remains the same but the frequencies are all 5.5 times higher, generally, depending on the HD standard being used.

Sampling and digitization
The low-pass filtered signals of the correct amplitudes are then passed to the ADCs where they are sampled and digitized. Normally two ADCs are used, one for the luminance Y, and the other for both color difference signals, R-Y and B-Y. Within the active picture the ADCs take a sample of the analog signals (to create pixels) each time they receive a clock pulse (generated from the sync signal). For Y the clock frequency in SD is 13.5 MHz and for each color difference channel half that – 6.75 MHz – making a total sampling rate of 27 MHz (74.25 MHz, 37.125 MHz and 148.5 MHz respectively for HD). It is vital that the pattern of sampling is rigidly adhered to, otherwise onward systems, and eventual conversion back to analog, will not know where each sample fits into the picture – hence the need for standards! Co-sited sampling is used, alternately making samples of Y, R-Y, and B-Y on one clock pulse and then on the next, Y only (i.e. there are half the color samples compared with the luminance). This sampling format used in 601 is generally referred to as 4:2:2 and is designed to minimize chrominance/luminance delay – any timing offset between the color and luminance information. Other sampling formats are used in other applications – for example 4:2:0 for MPEG-2 compression used for transmission.

The amplitude of each sample is held and precisely measured in the ADC. Its value is then expressed and output as a binary number and the analog to digital conversion is complete. Note that the digitized forms of R-Y and B-Y are referred as Cr and Cb.

Sampling (clock) frequency
The (clock) frequency at which the picture signal is sampled is crucial to the accuracy of analog to digital conversion. The object is to be able, at some later stage, to faithfully reconstruct the original analog signal from the digits. Clearly using too high a frequency is wasteful whereas too low a frequency will result in aliasing – so generating artifacts. Nyquist stated that for a conversion process to be able to re-create the original analog signal, the conversion (clock) frequency must be at least twice the highest input frequency being sampled (see diagram below) – in this case, for luminance, 2 x 5.5 MHz =11.0 MHz. 13.5 MHz is chosen for luminance to take account of both the filter characteristics and the differences between the 625/50 and 525/60 television standards. It is a multiple of both their line frequencies, 15,625 Hz and 15,734.265 Hz respectively, and therefore compatible with both (see 13.5 MHz). Since each of the color difference channels will contain less information than the Y channel (an effective economy since our eyes can resolve luminance better than chrominance) their sampling frequency is set at 6.75 MHz – half that of the Y channel.

Tutorial 2

From digital to analog
Today, it is increasingly common for the digital signal to be carried right through to the viewer,  so the signal would not require digital to analog conversion at all. Where D to A conversion is  required, the digital information is fed to three digital to analog converters (DACs), one each for Y, Cr and Cb (digitized R-Y and B-Y), which are clocked in the same way and with the same frequencies as was the case with the ADCs. The output is a stream of analog voltage samples creating a ‘staircase’ or ‘flat top’ representation similar to the original analog signal (see figure below). The use of a sampling system imposes some frequency-dependent loss of amplitude which follows a Sinx/x slope. This means that the output amplitude curves down to zero at half the frequency of the sampling frequency, known as the Nyquist frequency. For example sampling at 13.5 MHz could resolve frequencies up to 6.75 MHz. Although the ITU-R BT.601 set-up is way off that zero point, the curved response is still there. This curve is corrected in the Sinx/x low-pass filters which, by losing the unwanted high frequencies, smoothes the output signal so it now looks the same as the original Y, R-Y, B-Y analog inputs. For those needing RGB, this can be simply produced by a resistive matrix.

Turorial 3

Perfect results?
Today the whole analog to digital and digital to analog process is usually reliable and accurate. However there are inherent inaccuracies in the process. The accuracy of the clock timing is important and it should not vary in time (jitter). Also the accuracy of the ADCs in measuring the samples, though within the specification of the chip, may not be exact. This is a specialized task as each sample must be measured and output in just 74 nanoseconds, or 13.5 nanoseconds for HD. Equally the DACs may only be expected to be accurate to within their specification, and so they too will impose some degree of non-linearity into the signal. Even with perfect components and operation the process of sampling and reconstituting a signal is not absolutely accurate. The output is never precisely the same as the original signal. For this reason, plus cost considerations, system workflows are designed so that repeated digitization processes are, as far as possible, avoided. Today it is increasingly common for pictures to be digitized at, or soon after, the camera and not put back to analog, except for monitoring, until the station output, or, with DTV, until arriving at viewers’ TV sets or set-top boxes; indeed in many cases the signal now remains digital throughout the entire production, distribution and viewing chain.

See also: 13.5 MHz