Table 3 of the ATSC DTV Standard, Annex A, summarizes the picture formats originally allowable for DTV transmission in the USA. Any one of these may be compressed using MPEG-2 and transmitted. An ATSC receiver must be able to display pictures from any of these formats.
The range of formats in Table 3 caters for schemes familiar to both television and computers, and takes account of different resolutions, scanning techniques and refresh rates. For each frame size, frame rates are available to provide compatibility with existing transmission systems and receivers. 29.97 Hz is needed to keep step with NTSC simulcasts. Technically this is not required once NTSC is no longer transmitted! 30 Hz is easier to use, and does not involve considerations such as drop-frame timecode. 24 Hz progressive (24P) offers compatibility with film material. A choice of progressive or interlaced scanning is also available for most frame sizes (see Progressive and Interlaced).
Table 3 is concerned with video formats to be handled in the ATSC system rather than defining standards for video production. ATSC’s Table 1 of annex A refers to the standardized video production formats likely to be used as inputs to the compression table.
|Vertical size value||Horizontal size value||Aspect ratio information||Frame rate code||Progressive sequence|
ATSC Annex A, Table 1: Standardized video input formats
|Video standard||Active lines||Active samples/line|
|SMPTE 274M||1080||1920|
|SMPTE S17.392||720||1280|
|ITU-R BT.601-4||483||720|
An image file format widely used in computers. It was developed by Truevision Inc. and there are many variations of the format.
Timebase Corrector. This was often included as a part of a VTR to correct the timing inaccuracies of the pictures read from tape. Early models were limited by their dependence on analog storage devices, such as ultrasonic glass delay lines. This meant that VTRs, such as the original quadruplex machines, had to be mechanically highly accurate and stable to keep the replayed signal within the correction range (window) of the TBC; just a few microseconds.
The introduction of digital processing techniques made much larger stores, with analog inputs and outputs, economic, so widening the correction window and reducing the need for especially accurate, expensive mechanical engineering. Digital TBC has had a profound effect on VTR design – and price. Quantel’s first product was a digital TBC for use with IVC VTRs.
When VTRs went digital (DVTR) the TBC function became easier and was always a part of the DVTR. With the wide use of disk or solid-state video stores the TBC function has disappeared as data is accurately timed by buffers and clocks built into the equipment.
Transmission Control Protocol/Internet Protocol. A set of standards that enables the transfer of data between computers. Besides its application directly to the Internet it is also widely used throughout the computer industry. It was designed for transferring data files rather than large files of television or film pictures. Thus, although TCP/IP has the advantage of being widely compatible, it is a relatively inefficient way of moving picture files.
Device for converting film images into video in realtime. The main operational activity here is color grading which is executed on a shot-by-shot basis and absorbs considerable telecine time. This includes the time needed for making grading decisions and involves significant handling of the film, spooling and cueing which risks film wear and damage, besides the actual transfer time. The output of a telecine is digital video (rather than data files).
Digital technology has moved the transfer process on. Now, adding a disk store or server can create a virtual telecine enabling the film-to-digital media transfer to run as one continuous operation. Whole film spools can be scanned in one pass, with useful footage selected by an EDL. In this case the telecine may be termed a Film Scanner – creating image files (rather than digital video) that contain sufficient latitude for downstream grading.
A technique that allows a (computing) process to be shared among several processors with the objective of completing the process more quickly than would be possible when using a single processor. Where multiple tasks must be handled, multiprocessing is used: the programs (threads) are divided to run simultaneously on different cores, so speeding up the overall processing.
Tagged Image File Format. A bit-mapped file format for scanned images. Widely used in the computer industry. There are many variations of this format.
A graphical representation of events, such as those that occur in editing, compositing, grading or other processes, usually as a horizontal line. This works well with disk-based and solid-state based operations providing instant access to any part of the process and, hopefully, to all the footage, decisions, associated tools and their settings.
Three Letter Acronym. There is a huge collection of examples in these pages, from AAF to WAV. Somehow two letters seems abrupt and four or more becomes a bit clumsy. Three is perfect!
Following a defined point, or points, in the pictures of a clip. Initially this was performed by hand, using a DVE, but was laborious, difficult and limited to pixel accuracy. Now image tracking is widely used, thanks to the availability of automatic point tracking operating to sub-pixel accuracy. The tracking data can be applied to control DVE picture moving for such applications as the removal of film weave, replacing 3D objects in moving video, wire removal, etc.
Advanced multiple point tracking is sometimes used to analyze images in 3D, so allowing a whole raft of computer-generated material to be move-matched for compositing into live scenes, blurring the boundaries of live and synthetic imagery.
A timing reference signal developed for HD. Typically this is originated by a sync-pulse generator and is distributed to most of the technical equipment, including cameras, mixer/switcher, and video processing equipment in a studio or truck so they all operate in sync.
With the opportunity to devise a new timing signal, the TLS signal combines a negative and positive pulse that is symmetrically above and below a baseline – nominally at zero volts. This means that accurate timing can be extracted from it even when the baseline voltage drifts.
In the early days of HD, the new equipment offered both TLS and black and burst inputs for synchronization, but it was soon found that the old analog black and burst was preferred: it was already available and actually offered more accurate timing for 25p, 30p, 50i and 60i (though it is not suitable for other frame rates).
See: Digitizing Time
This has no strict technical meaning; it is marketing hype. The ATSC says that all HD formats (720P, 1080I and 1080P) are true HD. In advertising the term has tended to be associated with 1080P, but this is nothing official. Not to be confused with… TrueHD.
Quantel term for the ability to continuously read any frame, or sequence of frames, in any order at or above video (or realtime) rate, a part of FrameMagic. A true random access video store (usually comprising disks) can allow editing which offers rather more than just quick access to material. Cuts are simply instructions for the order of replay and involve no copying, so adjustments can be made as often and whenever required. This results in instant and very flexible operation. At the same time technical concerns associated with fragmentation do not arise as the store can operate to specification when fully fragmented, ensuring full operation at all times. This aspect is particularly important for server stores with many users.
Dolby’s lossless audio technology developed for high-definition disk-based media including Blu-ray Disc. It includes ‘bit-for-bit’ lossless coding up to 18 Mb/s and support for up to eight channels of 24-bit, 96 kHz audio. It is supported by HDMI.
The TrueType vector font format was originally developed by Apple Computer, Inc. The specification was later released to Microsoft. TrueType fonts are therefore supported on most operating systems. Most major type libraries are available in TrueType format. There are also many type design tools available to develop custom TrueType fonts. Quantel equipment supports the import of these and other commercially available fonts.
Removal of the least significant bits (LSBs) of a digital word – as could be necessary when connecting a 10-bit video source into 8-bit video equipment, or handling the 16-bit result of digital video mixing on an 8-bit system. Just dropping the lowest two bits is not the right answer. If not carefully handled truncation can lead to unpleasant artifacts on video signals – such as ‘contouring’. Quantel invented Dynamic Rounding as a way to handle the truncation of digital image data so that the values of the dropped lower bits are contained in the remaining bits.
See also: Dynamic Rounding
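The difference between simply dropping bits and a more careful reduction can be illustrated with a minimal sketch. It is not Quantel's Dynamic Rounding, whose details are not given here; it uses plain random dither as a stand-in, and the function names and the test level of 514 are assumptions chosen for illustration.

```python
import random

def truncate_10_to_8(value: int) -> int:
    """Drop the two least significant bits of a 10-bit value."""
    return value >> 2

def dither_10_to_8(value: int, rng: random.Random) -> int:
    """Add a random 0..3 before dropping two bits (generic dither, an
    illustrative stand-in, not Quantel's Dynamic Rounding)."""
    return min((value + rng.randrange(4)) >> 2, 255)

rng = random.Random(0)
source = 514  # 10-bit level that falls halfway between 8-bit levels 128 and 129

truncated = [truncate_10_to_8(source) for _ in range(10_000)]
dithered = [dither_10_to_8(source, rng) for _ in range(10_000)]

mean_truncated = sum(truncated) / len(truncated)
mean_dithered = sum(dithered) / len(dithered)
print(mean_truncated, mean_dithered)
```

Truncation always lands on the lower level, so the half-level of detail is lost and flat areas can show contouring. With dither the average over many pixels stays close to the original 10-bit value, at the cost of a small amount of added noise.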
Digital technology is sweeping our industry and affects many parts of our lives. Yet we live in an analog world. Light and sound naturally exist in analog forms and our senses of sight and hearing are matched to that. The first machines to capture, record and manipulate pictures and sound were analog but today it is far easier to do the jobs in the digital domain. Not only does this allow the use of the highly advanced digital components available from the computer industry but it also leads to many new capabilities that were impractical or simply impossible with analog.
The techniques used to move between the analog and digital worlds of television pictures are outlined here. Some of the pitfalls are shown as well as describing why the digital coding standards for standard definition and high definition television (ITU-R BT.601 and ITU-R BT.709) are the way they are.
The digital machines used in television are generally highly complex and many represent the state-of-the-art of digital technology. The initial reason for the popularity of digital techniques was that the scale of the computer industry ensured that the necessary electronic components were both relatively easily available and continued to develop. But the preference for digits is also because of their fidelity and the power they give to handle and manipulate images. Rather than having to accurately handle every aspect of analog signals, all digital circuits have to do is differentiate between, or generate, two electrical states – on and off, high and low, 1 and 0. To read, or correctly interpret, this information accurately requires only recognizing a 1 or 0 state, rather than the value of continuously varying analog signals. This is relatively easy and so leads to superb fidelity in multi-generation recordings, no losses in passing the signal from place to place, plus the potential of processing to produce effects, large-scale storage and many other techniques far beyond those available in analog.
Forty-plus years ago, the technology simply did not exist to convert television pictures into digits. Even if it could have been done there were no systems able to process the resulting data stream at anything like realtime. Today digital machines have successfully reached every aspect of television production – from scene to screen. At the same time costs have tumbled so that today all new equipment, from broadcast professional to consumer level, is digital.
From analog to digital
Initially, digitization involved working with television’s composite signals (PAL and NTSC) but this is now rare. Today it is the component signals (meaning separate signals that together make up the full color signal), not composite, which are digitized according to the ITU-R BT.601 and ITU-R BT.709 digital sampling specifications for SD and HD respectively (film applications use different ranges of sampling from these TV and video requirements).
‘601’ describes sampling at standard definition and is widely used in TV operations. Sampling for high definition, according to ITU-R BT.709, broadly follows the same principles, but works faster. Both standards define systems for 8-bit and 10-bit sampling accuracy – providing 2^8 (= 256) and 2^10 (= 1024) discrete levels with which to describe the analog signals.
There are two types of component signals; the Red, Green and Blue (RGB) and Y, R-Y, B-Y but it is the latter which is by far the most widely used in digital television and is included in the ITU-R BT.601 and 709 specifications. The R-Y and B-Y, referred to as color difference signals, carry the color information while Y represents the luminance. Cameras, telecines, etc., generally produce RGB signals from their image sensors. These are easily converted to Y, R-Y, B-Y using a resistive matrix and filters. This is established analog technology used to prepare video for PAL or NTSC coding.
Analog to digital conversion occurs in three parts: signal preparation, sampling and digitization.
The analog to digital converter (ADC) only operates correctly if the signals applied to it are correctly conditioned. There are two major elements to this. The first involves an amplifier to ensure the correct voltage and amplitude ranges for the signal are given to the ADC. For example, luminance amplitude between black and white must be set so that it does not exceed the range that the ADC will accept. The ADC has only a finite set of numbers (an 8-bit ADC can output 256 unique numbers – but no more, a 10-bit ADC has 1024 – but no more) with which to describe the signal. The importance of this is such that the ITU-R BT.601 and 709 standards specify this set-up quite precisely saying that, for 8-bit sampling, black should correspond to level 16 and white to level 235, and at 10-bit sampling 64 and 940 respectively. This leaves headroom for errors, noise and spikes to avoid overflow or underflow on the ADC. Similarly for the color difference signals, zero signal corresponds to level 128 (512 for 10-bit) and full amplitude covers only 225 (897) levels.
For the second major element the signals must be low-pass filtered to prevent the passage of information beyond the luminance band limit of 5.75 MHz and the color difference band limit of 2.75 MHz, from reaching their respective ADCs. If they did, aliasing artifacts would result and be visible in the picture (more later). For this reason low pass (anti-aliasing) filters sharply cut off any frequencies beyond the band limit. For HD, the principle remains the same but the frequencies are all 5.5 times higher, generally, depending on the HD standard being used.
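The level assignments quoted above for 8-bit and 10-bit sampling can be sketched as a simple mapping. This is an illustration only: it assumes luminance normalized to 0..1 (black to white) and color difference normalized to -0.5..+0.5, and the function names are hypothetical.

```python
def luma_code(y: float, bits: int = 8) -> int:
    """Map normalized luminance (0 = black, 1 = white) to a code value.

    8-bit: black -> 16, white -> 235; 10-bit values are four times larger.
    """
    scale = 1 << (bits - 8)
    return round((16 + 219 * y) * scale)

def color_difference_code(c: float, bits: int = 8) -> int:
    """Map a normalized color difference (-0.5..+0.5) to a code value.

    8-bit: zero signal -> 128, full amplitude spans codes 16..240.
    """
    scale = 1 << (bits - 8)
    return round((128 + 224 * c) * scale)

for bits in (8, 10):
    black, white = luma_code(0.0, bits), luma_code(1.0, bits)
    low, zero, high = (color_difference_code(c, bits) for c in (-0.5, 0.0, 0.5))
    print(bits, black, white, zero, high - low + 1)
```

The last printed figure is the number of levels covered by a full-amplitude color difference signal, matching the 225 (897) levels mentioned in the text; the codes outside these ranges form the headroom.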
Sampling and digitization
The low-pass filtered signals of the correct amplitudes are then passed to the ADCs where they are sampled and digitized. Normally two ADCs are used, one for the luminance Y, and the other for both color difference signals, R-Y and B-Y. Within the active picture the ADCs take a sample of the analog signals (to create pixels) each time they receive a clock pulse (generated from the sync signal). For Y the clock frequency in SD is 13.5 MHz and for each color difference channel half that – 6.75 MHz – making a total sampling rate of 27 MHz (74.25 MHz, 37.125 MHz and 148.5 MHz respectively for HD). It is vital that the pattern of sampling is rigidly adhered to, otherwise onward systems, and eventual conversion back to analog, will not know where each sample fits into the picture – hence the need for standards! Co-sited sampling is used, alternately making samples of Y, R-Y, and B-Y on one clock pulse and then on the next, Y only (i.e. there are half the color samples compared with the luminance). This sampling format used in 601 is generally referred to as 4:2:2 and is designed to minimize chrominance/luminance delay – any timing offset between the color and luminance information. Other sampling formats are used in other applications – for example 4:2:0 for MPEG-2 compression used for transmission.
The amplitude of each sample is held and precisely measured in the ADC. Its value is then expressed and output as a binary number and the analog to digital conversion is complete. Note that the digitized forms of R-Y and B-Y are referred to as Cr and Cb.
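The 4:2:2 sampling rates quoted in this section follow from simple arithmetic, sketched below. The function name and the dictionary keys are illustrative assumptions; the only inputs are the luminance clock frequencies given in the text.

```python
def sampling_422(luma_mhz: float) -> dict:
    """Derive 4:2:2 sampling figures from the luminance clock frequency."""
    chroma_mhz = luma_mhz / 2               # each color difference channel
    total_mhz = luma_mhz + 2 * chroma_mhz   # Y plus two color difference channels
    return {
        "luma_mhz": luma_mhz,
        "chroma_mhz": chroma_mhz,
        "total_mhz": total_mhz,
        "luma_sample_period_ns": 1000.0 / luma_mhz,
    }

sd = sampling_422(13.5)    # standard definition
hd = sampling_422(74.25)   # high definition

print(sd)
print(hd)
```

The sample period shows how little time the ADC has to measure each luminance sample: roughly 74 ns in SD and about 13.5 ns in HD.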
Sampling (clock) frequency
The (clock) frequency at which the picture signal is sampled is crucial to the accuracy of analog to digital conversion. The object is to be able, at some later stage, to faithfully reconstruct the original analog signal from the digits. Clearly using too high a frequency is wasteful whereas too low a frequency will result in aliasing – so generating artifacts. Nyquist stated that for a conversion process to be able to re-create the original analog signal, the conversion (clock) frequency must be at least twice the highest input frequency being sampled (see diagram below) – in this case, for luminance, 2 x 5.5 MHz = 11.0 MHz. 13.5 MHz is chosen for luminance to take account of both the filter characteristics and the differences between the 625/50 and 525/60 television standards. It is a whole-number multiple of both their line frequencies, 15,625 Hz and 15,734.265 Hz respectively, and therefore compatible with both (see 13.5 MHz). Since each of the color difference channels will contain less information than the Y channel (an effective economy since our eyes can resolve luminance better than chrominance) their sampling frequency is set at 6.75 MHz – half that of the Y channel.
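The claim that 13.5 MHz suits both line standards can be checked with exact arithmetic. The sketch below assumes the 525/60 line frequency is exactly 4.5 MHz / 286 (which rounds to the 15,734.265 Hz quoted above); the variable names are illustrative.

```python
from fractions import Fraction

SAMPLING_HZ = Fraction(13_500_000)
LINE_HZ_625 = Fraction(15_625)
LINE_HZ_525 = Fraction(4_500_000, 286)  # assumed exact form of 15,734.265 Hz

# Total samples per line (active picture plus blanking) in each standard.
samples_per_line_625 = SAMPLING_HZ / LINE_HZ_625
samples_per_line_525 = SAMPLING_HZ / LINE_HZ_525

# Nyquist minimum for the 5.5 MHz luminance figure used in the text.
nyquist_minimum_hz = 2 * Fraction(5_500_000)

print(samples_per_line_625, samples_per_line_525)
print(SAMPLING_HZ >= nyquist_minimum_hz)
```

Both divisions give whole numbers, so every line in either standard holds an exact number of samples, and the sampling frequency sits comfortably above the Nyquist minimum.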
From digital to analog
Today, it is increasingly common for the digital signal to be carried right through to the viewer, so the signal would not require digital to analog conversion at all. Where D to A conversion is required, the digital information is fed to three digital to analog converters (DACs), one each for Y, Cr and Cb (digitized R-Y and B-Y), which are clocked in the same way and with the same frequencies as was the case with the ADCs. The output is a stream of analog voltage samples creating a ‘staircase’ or ‘flat top’ representation similar to the original analog signal (see figure below). The use of a sampling system imposes some frequency-dependent loss of amplitude which follows a sin(x)/x slope. For a ‘flat top’ output the amplitude curves down with rising frequency, reaching zero at the sampling frequency itself; at half the sampling frequency, known as the Nyquist frequency, it has already fallen to about 64 percent. For example, sampling at 13.5 MHz could resolve frequencies up to 6.75 MHz. Although the ITU-R BT.601 set-up is well away from that zero point, the curved response is still there. This curve is corrected in the sin(x)/x low-pass filters which, by removing the unwanted high frequencies, also smooth the output signal so it now looks the same as the original Y, R-Y, B-Y analog inputs. For those needing RGB, this can be simply produced by a resistive matrix.
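The size of the sin(x)/x loss can be estimated with a minimal sketch. It assumes an ideal zero-order hold (‘flat top’) DAC output, for which x = π × frequency / sampling frequency; real converters and filters will differ, and the function name is hypothetical. The 5.75 MHz figure is the luminance band limit quoted earlier.

```python
import math

def zero_order_hold_gain(frequency_hz: float, sampling_hz: float) -> float:
    """Amplitude response sin(x)/x of an ideal 'flat top' DAC output."""
    x = math.pi * frequency_hz / sampling_hz
    return 1.0 if x == 0 else math.sin(x) / x

# Loss at the luminance band limit when sampling at 13.5 MHz.
gain = zero_order_hold_gain(5.75e6, 13.5e6)
loss_db = 20 * math.log10(gain)
correction = 1 / gain  # boost the reconstruction filter would need at this frequency

print(f"gain {gain:.3f}, loss {loss_db:.2f} dB, correction x{correction:.3f}")
```

Under these assumptions the loss at the top of the luminance band is a few decibels, which is why the output filters include sin(x)/x correction as well as removing the unwanted high frequencies.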
Today the whole analog to digital and digital to analog process is usually reliable and accurate. However there are inherent inaccuracies in the process. The accuracy of the clock timing is important and it should not vary in time (jitter). Also the accuracy of the ADCs in measuring the samples, though within the specification of the chip, may not be exact. This is a specialized task as each sample must be measured and output in just 74 nanoseconds, or 13.5 nanoseconds for HD. Equally the DACs may only be expected to be accurate to within their specification, and so they too will impose some degree of non-linearity into the signal. Even with perfect components and operation the process of sampling and reconstituting a signal is not absolutely accurate. The output is never precisely the same as the original signal. For this reason, plus cost considerations, system workflows are designed so that repeated digitization processes are, as far as possible, avoided. Today it is increasingly common for pictures to be digitized at, or soon after, the camera and not put back to analog, except for monitoring, until the station output, or, with DTV, until arriving at viewers’ TV sets or set-top boxes; indeed in many cases the signal now remains digital throughout the entire production, distribution and viewing chain.
See also: 13.5 MHz
The film process is designed for capturing images from scenes that will be edited and copied for eventual projection onto hundreds or thousands of big cinema screens. This has been in operation for over 100 years and so has been developed and refined to a very high degree to precisely meet these objectives. The film stocks used in cameras have a characteristic that allows them to capture a very wide range of scene brightness with good color saturation to provide wide latitude for color correction after processing. Intermediate stocks used to copy the one original negative are designed to be as faithful to the original as possible. Print stocks provide the high contrast needed to produce a bright and good contrast image on the projector screen to overcome the background illumination in the theatre.
Television is different in many ways. For example, the results are always instantly viewable and are delivered, sometimes live, to millions of smaller screens. The sensors used in video cameras do not presently have the wide dynamic range of film and so shooting with them has to be more carefully controlled as the ability to correct exposure faults later is more restricted. Also the viewing conditions for video are different from cinema. Not many of us sit in a darkened room to watch television, so the images need to be brighter and more contrasted than for film.
The three different basic types of film stock used – camera negative, intermediate and print – each have very specific jobs. Camera negative records as much detail as possible from the original scene, both spatially and in range of light to make that original detail eventually available on a multitude of internegatives from which are produced thousands of release prints for projection.
The Film Lab
Between the camera negative and the print there are normally two intermediate stages: Interpositive and Internegative. At each point more copies are made so that there is a large number of internegatives from which to make a much larger number of release prints. The object of these intermediate stages is purely to increase the number of negatives to print. This is because the precious and unique camera negative would be effectively destroyed with so much handling. The intermediate materials, interpositive and internegative, are exactly the same and designed to make, as nearly as possible, exact copies for each stage (with each being the negative of the previous stage). For this requirement the material has a gamma of 1.
But the release print is not just a film representation of the shot scenes: editing, visual effects, and grading – not to mention audio work – must take place in between. This mainly works in parallel with the film processing path – partly to reduce handling the negative.
The camera negative is printed to make the rush prints which provide the first viewing of the shot material. Note that this will be at least several hours after the shoot so hopefully all the good takes came out well! The first edit decisions about what footage is actually required are made from the rush prints with the aid of offline editing.
The negative cutter has the responsibility for cutting the unique footage according to the scene list. Initial grading is applied as the cut negative is transferred to interpositive. Should there be any further need of grading, instructions for this are sent with the internegatives to the print production labs. Any need of dissolves rather than cuts, or more complex visual effects, will require work from the optical printer or, these days, a digital film effects workstation.
Grading or Timing
Grading is the process of applying a primary color correction to the film copying process. The original camera negative may contain lighting changes which will mean that scenes shot on different days or times during the day need to look the same but simply do not. By effectively controlling the color of the light used to copy the negative to one of the intermediate stages these errors can be much reduced to produce a scene-to-scene match. Grading is carried out on a special system equipped with a video monitor displaying the current frame from the negative loaded onto it. Three controls provide settings of the red, green and blue ‘printer’ light values that adjust the amount of each of the three lights used to image the frame. These adjustments allow the operator to balance the color and brightness of the scenes in the movie.
This results in a table of corrections linked to the edge code of the original negative. This table is used to control the optical printer making the copy. Most processing laboratories subscribe to a standard definition of the settings but this does not mean that settings defined at one processing lab can be used at another. The photochemical process is very complex and individual labs will vary; however, they all aim toward a standard. The ‘neutral’ value for RGB printer lights is represented typically as between 25, 25, 25 and 27, 27, 27 – depending on which lab is used. To print an overexposed negative will require higher values, and an underexposed negative lower values. A change of one in the value represents one twelfth of a stop adjustment in exposure. Differential adjustment of the values provides basic color correction.
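The relationship between printer light values and exposure can be sketched as below. This is an illustration only: it assumes a lab whose neutral setting is 25, 25, 25 (labs differ, as noted above), and the function names are hypothetical.

```python
POINTS_PER_STOP = 12  # one printer point is one twelfth of a stop

def printer_light_offset_stops(value: int, neutral: int = 25) -> float:
    """Exposure change, in stops, of one printer light relative to neutral."""
    return (value - neutral) / POINTS_PER_STOP

def rgb_offsets(lights: tuple, neutral: int = 25) -> tuple:
    """Per-channel exposure offsets for a red, green, blue printer light setting."""
    return tuple(printer_light_offset_stops(v, neutral) for v in lights)

# Equal changes on all three lights alter overall exposure;
# differential changes alter color balance.
print(rgb_offsets((31, 31, 31)))  # half a stop more light on every channel
print(rgb_offsets((25, 28, 22)))  # green up, blue down by a quarter stop
```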
This analog process is now often replaced by a digital process known as Digital Intermediate (DI).
See also: Digital Intermediate