This is the act of finding film frame boundaries. For “perfect” pSF or 2:3 sequences this produces a regular pattern of frames; for “non-perfect” sequences the pattern is irregular and may have discontinuities, for example at edit points.
Temporal conversion without the use of interpolation. If input and output field or frame rates are not identical then field or frame drops or repeats must occur.
A Progressive Segmented Frame (pSF) format splits a progressive image into two sequential fields. It is identical to 2:2 in terms of motion profile.
A film frame being transported as 2:2 (pSF) is placed into two consecutive video fields. F2/F1 denotes that the film frame is carried in field two and the following field one. This is commonly referred to as “reverse dominance” or “reverse cadence”.
See also pSF
A film frame being transported as 2:2 (pSF) is placed into two consecutive video fields. F1/F2 denotes that the film frame is carried in field one and the following field two. This is commonly referred to as “normal dominance” or “perfect cadence”.
See also pSF
It is common for electronic editing to be performed after telecine. When editing is performed on 2:3 material there is the potential for disruptions to the 2:3 sequence. These can be 3-field sequences adjacent to other 3-field sequences, and 2-field sequences adjacent to other 2-field sequences. There are also cases where single fields are present that are not part of any sequence (orphan fields). These disruptions caused by editing create a “broken 2:3 sequence”.
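A cadence checker for such edited material can be sketched by labeling each field with its source film frame and looking for the disruptions described above. This is an illustrative simplification (it ignores field parity, and the frame labels are hypothetical), not a description of any real detector:

```python
def cadence_runs(fields):
    """Collapse a field sequence into [frame_id, run_length] pairs."""
    runs = []
    for f in fields:
        if runs and runs[-1][0] == f:
            runs[-1][1] += 1
        else:
            runs.append([f, 1])
    return runs

def find_breaks(fields):
    """Flag cadence disruptions: orphan single fields, and adjacent
    runs of equal length (3-field next to 3-field, or 2 next to 2)."""
    runs = cadence_runs(fields)
    breaks = []
    for i, (frame, length) in enumerate(runs):
        if length == 1:
            breaks.append(("orphan field", frame))
        elif i > 0 and runs[i - 1][1] == length:
            breaks.append(("adjacent %d-field runs" % length, frame))
    return breaks
```

A perfect 2:3 sequence such as A,A,B,B,B,C,C,D,D,D reports no breaks; an edit that leaves two 3-field runs back to back, or a stray single field, is flagged.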
Motion compensated processing is responsive to the output of a motion estimator. A motion estimator usually works on a local basis in the picture and measures not only the existence of motion but also its speed and direction. Motion compensated processing typically controls spatiotemporal filters that track the motion of each part of the picture.
Motion adaptive processing is responsive to the output of a motion detector. The motion detector may work on a global or a local basis in the picture, and may be binary or may measure on a continuous scale the amount of motion, or the confidence that an object or region is moving. Motion adaptive processing controls a mix between processing that is optimized for static detail and processing that is optimized for moving areas.
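The mix that motion adaptive processing controls can be illustrated with a minimal per-pixel sketch, assuming the two processing paths and the motion detector output already exist (all names here are hypothetical):

```python
def motion_adaptive_mix(static_out, moving_out, motion):
    """Blend two processing paths per pixel. 'motion' runs from
    0.0 (confidently static) to 1.0 (confidently moving), as a
    continuous-scale motion detector might report it."""
    return [m * mv + (1.0 - m) * st
            for st, mv, m in zip(static_out, moving_out, motion)]
```

A binary motion detector is just the special case where every value of `motion` is exactly 0.0 or 1.0.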
A video compression system developed by Sony and released in 2012. Designed for professional use in acquisition and post production, XAVC uses the highest level (5.2) of H.264/MPEG-4 AVC and offers capabilities that plain vanilla H.264 does not, such as encoding 1080/50P and 1080/60P. At the high performance end it supports 4K at both 4096 x 2160 (cinema) and 3840 x 2160 (video) resolutions at up to 60 f/s. With available bit depths of 8, 10 and 12, 4:2:0, 4:2:2 and 4:4:4 sampling and bit rates ranging from 15 to 960 Mb/s, XAVC can be used for a wide range of applications.
As H.264 is an advanced compression system, XAVC can offer high quality performance at relatively low bit rates (and smaller / longer storage) with a choice of either intra-frame or long-GOP codecs. The compressed video can be wrapped in an MXF OP1a container which is widely used in broadcast.
XAVC-S is the 8-bit consumer version of XAVC. It offers lower bit rates than XAVC and is wrapped in an MPEG-4 container. It is designed for the shorter, less complex workflows typical of consumer production.
Most Significant Bit. In a binary number, this is the first digit – 0 or 1 – and is the most significant because changing it has the largest effect on the whole number represented compared with changing any of the other digits in the binary number.
Least Significant Bit. In a binary number, this is the final digit – 0 or 1 – and is the least significant because changing it has the smallest effect on the whole number represented compared with changing any of the other digits in the binary number. However, even with their relative insignificance, LSBs can have a significant impact if not cared for properly, particularly when multiplying two binary numbers together – which is why Quantel invented Dynamic Rounding.
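The relative weight of the MSB and LSB, and why discarded LSBs matter in multiplication, can be shown with a short sketch. (Dynamic Rounding itself is Quantel’s proprietary technique and is not shown here; the example only contrasts plain truncation with simple rounding.)

```python
def bit_weight(value, bit):
    """How much the represented number changes when one bit is flipped."""
    return abs(value - (value ^ (1 << bit)))

v = 0b10110101                  # 181
msb_effect = bit_weight(v, 7)   # flipping the MSB changes the value by 128
lsb_effect = bit_weight(v, 0)   # flipping the LSB changes it by only 1

# Multiplying two 8-bit values produces up to 16 bits. Keeping only the
# top 8 bits discards the LSBs; plain truncation biases the result
# downwards, which is why careful treatment of LSBs matters.
a, b = 200, 130
product = a * b                  # 26000
truncated = product >> 8         # LSBs simply dropped
rounded = (product + 128) >> 8   # round before dropping
```

Here truncation gives 101 where rounding gives 102; repeated over many processing passes, that one-code-value bias is what becomes visible as artifacts.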
This is a common frequency that can be derived from 625/50 PAL (and 525/60 NTSC) that runs through SD, HD and UHD digital television standards using 4:4:4 or 4:2:2 sampling, including 1080-line HD and UHD at 25, 30, 60 Hz frame rates. It can be created from the old analog (PAL or NTSC) black-and-burst signal. Because 1.125 MHz is a common frequency, black and burst was widely used as a timing reference signal in the early days of HD.
Apt name for a world leader in broadcast technology with more than 40 years’ experience in providing digital technology to the media and broadcast industry. Today Snell offers a comprehensive range of software and hardware solutions for the creation, management and distribution of content to today’s multi-screen world of tablets, televisions, mobiles and PCs. Snell is a Quantel company.
A television set that includes an internet connection, some computing power and a UI to access the ‘Smart’ functions – including internet browsing. Popular uses include accessing ‘catch-up TV’ services such as Comcast’s Xfinity and the BBC’s iPlayer. Smart TVs now make up nearly 50 percent of the set population in the USA. However, the use of the internet beyond the catchup services is reported to be in decline, with the UI cited as a major cause.
Using a tablet or smart phone to access more background information, or to add comments, about what you are watching on TV is said to be using a ‘second screen’. Some productions offer an app and access to more relevant material to add to the experience – beyond just the passive watching of TV.
Quantum dot is a relatively new display technology for television screens. QDs are microscopic particles that emit light when electricity is applied, or when light is shone on them. The light’s color is very stable and relates to the QD’s size. Of late, they have been applied to screen technology in two ways.
One is to use them as a backlight for LCD screens. Currently LED screens use LEDs as the backlight, and an LCD filter to control the brightness of each screen ‘dot’ (not quantum!). A QD backlight uses just a blue LED light which is converted by QDs to reasonably pure red and green – very close to the required primaries – so far less light is absorbed in the RGB filters behind the LCD screen, providing more useful light (brighter pictures) and more accurate colors that do not drift.
The other type of QD screen uses QD-based LEDs to directly emit the screen light – as OLED screens do now. However OLED screens remain expensive and there is some fear of quality degradation over time. QD-LED color can be accurately tuned in manufacture and has a narrow spectrum. It does not drift and is not known to degrade. Amazon’s Kindle Fire HDX tablet uses QD technology, as did Sony in 2013 with its Triluminos screens.
1) Intellectual Property – this can be very valuable and there are regular court cases where owners of this type of IP are trying to sue other people who they think have stolen their IP.
2) Internet Protocol – is the de facto standard for networking and is the most widely used of the network protocols that carry data and lie on top of physical networks and connections. Besides its Internet use it is also the main open network protocol that is supported by all major computer operating systems. IP, or specifically IPv4, describes the packet format for sending data using a 32-bit address to identify each of nearly 4.3 billion devices on a network with four eight-bit numbers separated by dots e.g. 188.8.131.52. Each IP data packet contains a source and destination address as well as a payload of data. There is now IPv6 which brings, among many other enhancements, 128-bit addressing – allowing 2¹²⁸ addresses, plenty for all the connected devices on planet Earth, and thus relieving IPv4’s address shortage.
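The 32-bit IPv4 address and its dotted-quad notation map to each other directly; a minimal sketch using Python’s standard library:

```python
import socket
import struct

def quad_to_int(addr):
    """Dotted-quad string -> 32-bit integer (network byte order)."""
    return struct.unpack("!I", socket.inet_aton(addr))[0]

def int_to_quad(n):
    """32-bit integer -> dotted-quad string."""
    return socket.inet_ntoa(struct.pack("!I", n))
```

The 32-bit space gives 2³² = 4,294,967,296 addresses – the “nearly 4.3 billion” figure above.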
Above IP are two transport layers. TCP (Transmission Control Protocol) provides reliable data delivery, efficient flow control, full duplex operation and multiplexing – simultaneous operation with many sources and destinations. It establishes a connection and detects corrupt or lost packets at the receiver and re-sends them. Thus TCP/IP, the most common form of IP, is used for general data transport but is relatively slow and not ideal for video.
The other transport layer is UDP (User Datagram Protocol) which uses a series of ‘ports’ to connect data to an application. Unlike TCP, it adds no reliability, flow-control or error-recovery functions but it can detect and discard corrupt packets by using checksums. This simplicity means its headers contain fewer bytes and consume less network overhead than TCP, making it useful for streaming video and audio where continuous flow is more important than replacing corrupt packets.
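UDP’s low overhead is visible in its header: just four 16-bit fields totalling 8 bytes, against TCP’s minimum of 20. A sketch packing one with Python’s struct module (the port numbers are arbitrary examples):

```python
import struct

def udp_header(src_port, dst_port, payload_len, checksum=0):
    """Pack the 8-byte UDP header: source port, destination port,
    length (header + payload) and checksum - four 16-bit fields
    in network byte order."""
    return struct.pack("!HHHH", src_port, dst_port, 8 + payload_len, checksum)
```

There are no sequence numbers, acknowledgements or window fields – the machinery TCP carries in its larger header to guarantee delivery.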
There are other IP applications that live above these protocols such as File Transfer Protocol (FTP), Telnet for terminal sessions, Network File System (NFS), Simple Mail Transfer Protocol (SMTP) and many more.
Video over IP – Watching video over the internet is commonplace. It represents a very large, and growing, part of internet traffic, and fits well with the rising population of Smart TVs. There are several suitable streaming protocols in use, including those offering variable bit rates such as HTTP Live Streaming from Apple, and HTTP Smooth Streaming from Microsoft. These offer a good chance of providing uninterrupted viewing, even when the internet connection gets a bit slow.
This is an HTTP-based video streaming system from Microsoft and runs on Windows platforms. It uses H.264 video and AAC audio coding and fragmented MPEG-4 files for transport that can easily accommodate many kinds of data. Its video chunks can be quite large, about 10 seconds.
This HTTP-based media streaming protocol enables live streaming of audio or video over the internet for appropriate Apple products. It is a part of iOS, OS X, QuickTime and Safari and works by dividing the required source media into small chunks of around two seconds, then offering media files in several levels of H.264 video and MP3 or HE-AAC audio compression, providing from low to high bit-rate (and quality) delivered in an MPEG-2 Transport Stream. The data delivery system is adaptive to allow for variations of available data speeds, with the receiving end able to choose the highest bit-rate files it can receive fast enough to maintain live operation.
High Frame Rate – a frame rate higher than normal. For instance, movies (films) are normally shot at 24 f/s but some have been shot at 48 f/s – HFR. Some audiences say they do not like it as it’s too real and does not look like film.
It has been observed that when viewing UHD, motion judder is often very apparent and so a higher frame rate (say 48 f/s) is recommended by some. When shooting fast-action sports, such as football, then the UHD result would look better using, say, 50 or 60 f/s. In fact the UHD standard Rec 2020 includes frame rates up to 120 f/s.
A whole television picture. A frame has shape: its aspect ratio. Today all new TV frames have a 16:9 aspect ratio. Some motion pictures are presented on TV with a wider aspect ratio, typically with a black border above and below. A frame has a specific resolution and uses either interlaced (I) or progressive (P) scanning. Most productions now originate in an HD format of either 1280 x 720P or 1920 x 1080 (I or P) pixels. Some still use SD with 720 x 576I or 720 x 480I frames. These two SD standards do not have square pixels; all other DTV frames do. In UHD a frame could have 3840 x 2160 (4K) or 7680 x 4320 (8K) pixels. UHD only uses progressive scans. Interlace makes a relatively low frame rate of 25 or 30 f/s (shown as 50 or 60 fields/s) suitable for portraying motion quite well but, without further processing, stop-motion freezes can look poor.
Another property of a frame is its color gamut, as defined in its standard. As TV video standards have progressed, so the associated color gamut has expanded. Some say this is the most striking change from HD to UHD. UHD frames may also have a higher dynamic range (HDR) – again enhancing the look of the pictures. A frame has a specific time, usually 1/25 or 1/30 of a second. Larger frame formats, especially 4K and 8K, require faster frame rates to reasonably portray smooth movement on a big screen. See ‘Frame rate’ below.
See also: Interlace
Digital Leader and the Digital PROjection VErifier (DPROVE) are two products that are based on SMPTE RP 428-6-2009. The Digital Leader is aimed at digital movie post production and cinemas. In post it can be added as a leader and/or footer (end) of Digital Cinema Distribution Master (DCDM) ‘reels’ so allowing a quick quality check.
DPROVE is a set of Digital Cinema Packages (DCPs) that help check projector performance and alignment as well as the sound’s synchronization with the pictures.
See also: DCI
Broadcast eXchange Format standardizes interfaces among systems that deal with content metadata, content movement, schedules and as-run information. This is standardized in SMPTE RP 2021 and simplifies interoperability between applications. BXF provides a standard where there was previously a large number of diverse ‘pet’ or legacy file and data formats used in areas including schedules, playlists, record lists, as-run lists, content metadata and content movement instructions.
BXF makes it possible for all its users to have a common standard for the efficient exchange of data between all components of broadcast automation and business systems.
Informal word used to describe when streaming media suddenly ‘hits the buffers’ – stops. This is usually due to a lack of bandwidth when viewing video over the internet. In recent years the implementation of adaptive streaming schemes, as well as faster internet connections, has greatly reduced the occurrence of buffering, making the viewing of video delivered via the internet a largely buffer-free experience.
The streaming of media often occupies a large part of a network’s available capacity, especially if the media is video and the network is the Internet – where video streaming is growing fast. The available data speed for users on the Internet varies all the time but video and audio are constant flows, with the video requiring substantial bandwidth to deliver good pictures. Relying on a constant bit rate may well run out of bandwidth, then the video freezes while waiting for the next frames – a process known as buffering.
A way around this is to vary the bit rate according to the available capacity of the network connection. This is adaptive bit-rate streaming, and there are several versions in use. Generally these involve the sender system detecting the receiver’s available bit rate and CPU power, and then adjusting the sending bit rate accordingly by varying the amount of compression applied to the media. In practice this requires the sender’s coder to simultaneously create a set of streams, typically three, each with a different bit rate. These are made available as a series of files containing short sections of video, typically between 2 and 10 seconds. This way those with a fast connection see good quality video, and those with a slow one should still see satisfactory results. In practice, streaming starts with sending a manifest of the files, and then low bit-rate video files. Then, if the receiver sees there is room for a better quality level, it will ask for it. If bandwidth is getting too tight, it will switch down.
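The receiver-side switching amounts to picking the highest rung of the bit-rate ladder that fits the measured throughput, with a safety margin. A minimal sketch – the three-level ladder and the headroom figure are illustrative assumptions, not taken from any particular protocol:

```python
def choose_rendition(measured_bps, ladder, headroom=0.8):
    """Pick the highest bit rate that fits within the measured
    throughput, keeping a safety margin so a small dip in network
    speed does not immediately stall playback. Falls back to the
    lowest rung when even that does not fit."""
    budget = measured_bps * headroom
    usable = [rate for rate in sorted(ladder) if rate <= budget]
    return usable[-1] if usable else min(ladder)

# A hypothetical three-level ladder, in bits per second.
ladder = [700_000, 1_500_000, 3_000_000]
```

Re-running this choice for every short chunk is what lets the stream step up when bandwidth improves and step down before the buffer empties.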
Quantel has developed this concept further in its QTube technology using virtualization to create variable bit rate data on the fly to suit the speed of the connection as it changes over time.
A non-technical term commonly used to describe the quality of video after it has been through a process or processes, often compression/decompression. Strictly speaking, it can only mean it looks the same; that does not mean it is the same. So, if the input quality was good, then the output should look the same and so be fine for an audience watching a similar screen. However, if the material is required for a video process, such as pulling a key or color correction, visually lossless may not be good enough, as such processes can be more demanding – creating a key signal, for example.
A timing reference signal developed for HD. Typically this is originated by a sync-pulse generator and is distributed to most of the technical equipment, including cameras, mixer/switcher and video processing equipment, in a studio or truck so they all operate in sync.
With the opportunity to devise a new timing signal, the TLS signal combines a negative and positive pulse that is symmetrically above and below a baseline – nominally at zero volts. This means that accurate timing can be extracted from it even when the baseline voltage drifts.
In the early days of HD, the new equipment offered TLS and black and burst inputs for synchronization, but it was soon found that the old analog black and burst was preferred, already available and actually offered more accurate timing for 25p, 30p, 50i and 60i (but not suitable for other frame rates).
This describes a Precision Time Protocol (PTP) that enables synchronizing distributed clocks to within 1 microsecond via Ethernet networks with relatively low demands on local clocks, the network and computing capacity. There are many applications, for example in automation, to synchronize elements of a production line (without timing belts).
PTP runs on IP networks, transferring precision time to slave devices via a 1 GHz virtual clock (timebase). Independent masters can be locked to one master clock, creating wide, or even global locking. SMPTE has been assessing the possibilities of using PTP as a synchronizing source for television applications.
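PTP’s core arithmetic is a two-way timestamp exchange: the master timestamps a Sync message (t1), the slave timestamps its arrival (t2), the slave sends a Delay_Req (t3) and the master timestamps its arrival (t4). Assuming a symmetric path delay, offset and delay fall out directly:

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Standard PTP two-way exchange:
    t1: master sends Sync      t2: slave receives it
    t3: slave sends Delay_Req  t4: master receives it
    Assumes the network delay is the same in both directions."""
    offset = ((t2 - t1) - (t4 - t3)) / 2.0
    delay = ((t2 - t1) + (t4 - t3)) / 2.0
    return offset, delay
```

The slave steers its clock until the computed offset reaches zero; asymmetry in the path is the main practical source of residual error.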
The Digital Production Partnership was formed by leading UK public service broadcasters to help the television industry maximize the opportunities, and be aware of the challenges, of digital television production. It works in two areas: technology, and shared thinking, information and best practice.
In 2011 it created common technical standards for the delivery of SD and HD video to the major broadcasters. In 2012 a common format, structure and wrapper for the delivery of programs by digital file, including metadata, was agreed. The DPP file-based delivery standard became the UK’s standard in late 2014. For this the DPP turned to AMWA to create a subset of its AS-11 file format that can be edited for breaks, re-timed, and may have additional language tracks, captions and subtitles wrapped into the file container.
A PAL or NTSC video signal with no picture (black). The signal comprises line and field sync pulses as well as the color ‘burst’ before the start of each active TV line. It was widely used as an accurate timing reference for analog 625 and 525-line color equipment. You might have thought it has passed into history, but you would be wrong. Black and burst is still widely used as a sync reference timing signal for digital formats including SD, HD and UHD.
Founded in 2005, three years after the introduction of the Blu-ray Disc system, the BDA is a voluntary membership group for those interested in creating, manufacturing, or promoting the BD formats and products, as well as those seeking more information about the format as it evolves.
The BDA aims to develop BD specifications, ensure products are correctly implemented, promote wide adoption of the formats and provide useful information to those interested in supporting those formats.
See also: Blu-ray Disc
This is a common frequency that can be derived from 625/50 PAL (and 525/60 NTSC) and that runs through SD, HD and UHD digital television standards. It can be created from the old analog black and burst signal. 2.25 MHz, or multiples thereof, runs through all the major digital television standards, which can therefore be locked to black and burst. These include 1080-line HD at 25 and 30 fps.
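That 2.25 MHz divides both old line rates exactly can be checked with exact rational arithmetic (the 525/59.94 line rate is precisely 4.5 MHz/286 ≈ 15,734.266 Hz):

```python
from fractions import Fraction

line_625 = Fraction(15625)            # 625/50 line frequency, Hz
line_525 = Fraction(4_500_000, 286)   # 525/59.94 line frequency, Hz

ratio_625 = Fraction(2_250_000) / line_625   # an exact integer: 144
ratio_525 = Fraction(2_250_000) / line_525   # an exact integer: 143
```

Because both ratios are whole numbers, a 2.25 MHz clock (or any multiple of it) can be phase-locked to the line structure of either analog standard – which is what makes black and burst usable as a digital reference.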
A high-speed digital subscriber line (DSL) standard for short local loops (the connection between the customer’s premises and the telecom’s network) that is expected to deliver from 150 Mb/s up to 1 Gb/s data rates over copper, and is slated as matching fiber at distances up to 400 metres. This means that many consumers can receive fast internet without running fiber to the house. The protocol is defined in Recommendation ITU-T G.9701. Such data rates are ample to support live 4K and even 8K UHD streaming to viewers’ homes. The first consumer installations of this technology are expected in late 2015.
The film process is designed for capturing images from scenes that will be edited and copied for eventual projection onto hundreds or thousands of big cinema screens. This has been in operation for over 100 years and so has been developed and refined to a very high degree to precisely meet these objectives. The film stocks used in cameras have a characteristic that allows them to capture a very wide range of scene brightness with good color saturation to provide wide latitude for color correction after processing. Intermediate stocks used to copy the one original negative are designed to be as faithful to the original as possible. Print stocks provide the high contrast needed to produce a bright and good contrast image on the projector screen to overcome the background illumination in the theatre.
Television is different in many ways. For example, the results are always instantly viewable and are delivered, sometimes live, to millions of smaller screens. The sensors used in video cameras do not presently have the wide dynamic range of film and so shooting with them has to be more carefully controlled as the ability to correct exposure faults later is more restricted. Also the viewing conditions for video are different from cinema. Not many of us sit in a darkened room to watch television, so the images need to be brighter and more contrasted than for film.
The three different basic types of film stock used – camera negative, intermediate and print – each have very specific jobs. Camera negative records as much detail as possible from the original scene, both spatially and in range of light to make that original detail eventually available on a multitude of internegatives from which are produced thousands of release prints for projection.
The Film Lab
Between the camera negative and the print there are normally two intermediate stages: interpositive and internegative. At each point more copies are made so that there is a large number of internegatives from which to make a much larger number of release prints. The object of these intermediate stages is purely to increase the number of negatives to print, because the precious and unique camera negative would be effectively destroyed with so much handling. The intermediate materials, interpositive and internegative, are exactly the same stock, designed to make, as nearly as possible, exact copies at each stage (with each being the negative of the previous stage). For this requirement the material has a gamma of 1.
But the release print is not just a film representation of the shot scenes: editing, visual effects, and grading – not to mention audio work – must take place in between. This mainly works in parallel with the film processing path – partly to reduce handling the negative.
The camera negative is printed to make the rush prints which provide the first viewing of the shot material. Note that this will be at least several hours after the shoot so hopefully all the good takes came out well! The first edit decisions about what footage is actually required are made from the rush prints with the aid of offline editing.
The negative cutter has the responsibility for cutting the unique footage according to the scene list. Initial grading is applied as the cut negative is transferred to interpositive. Should there be any further need of grading, instructions for this are sent with the internegatives to the print production labs. Any need of dissolves rather than cuts, or more complex visual effects, will require work from the optical printer or, these days, a digital film effects workstation.
Grading or Timing
Grading is the process of applying a primary color correction to the film copying process. The original camera negative may contain lighting changes which will mean that scenes shot on different days or times during the day need to look the same but simply do not. By effectively controlling the color of the light used to copy the negative to one of the intermediate stages these errors can be much reduced to produce a scene-to-scene match. Grading is carried out on a special system equipped with a video monitor displaying the current frame from the negative loaded onto it. Three controls provide settings of the red, green and blue ‘printer’ light values that adjust the amount of each of the three lights used to image the frame. These adjustments allow the operator to balance the color and brightness of the scenes in the movie.
This results in a table of corrections linked to the edge code of the original negative. This table is used to control the optical printer making the copy. Most processing laboratories subscribe to a standard definition of the settings but this does not mean that settings defined at one processing lab can be used at another. The photochemical process is very complex and individual labs will vary, however they all aim toward a standard. The ‘neutral’ value for RGB printer lights is represented typically as between 25, 25, 25 and 27, 27, 27 – depending on which lab is used. To print an overexposed negative will require higher values, and an underexposed negative lower values. A change of one in the value represents one 12th of a stop adjustment in exposure. Differential adjustment of the values provides basic color correction.
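Working in the units described above, printer-point changes convert to stops of exposure by a simple division; a small sketch (the helper name is ours, and the 12-points-per-stop figure is the convention quoted above):

```python
def exposure_stops(point_change, points_per_stop=12):
    """Convert a printer-light point change to stops of exposure,
    using the one-twelfth-of-a-stop-per-point convention."""
    return point_change / points_per_stop

# e.g. raising all three printer lights from 25 to 31 (six points)
# opens the exposure by half a stop.
half_stop = exposure_stops(31 - 25)
```

Raising only one of the R, G or B values by the same amount instead shifts the color balance rather than the overall exposure.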
This analog process is now often replaced by a digital process known as Digital Intermediate (DI).
See also: Digital Intermediate
Digital technology is sweeping our industry and affects many parts of our lives. Yet we live in an analog world. Light and sound naturally exist in analog forms and our senses of sight and hearing are matched to that. The first machines to capture, record and manipulate pictures and sound were analog but today it is far easier to do the jobs in the digital domain. Not only does this allow the use of the highly advanced digital components available from the computer industry but it also leads to many new capabilities that were impractical or simply impossible with analog.
The techniques used to move between the analog and digital worlds of television pictures are outlined here. Some of the pitfalls are shown as well as describing why the digital coding standards for standard definition and high definition television (ITU-R BT.601 and ITU-R BT.709) are the way they are.
The digital machines used in television are generally highly complex and many represent the state-of-the-art of digital technology. The initial reason for the popularity of digital techniques was that the scale of the computer industry ensured that the necessary electronic components were both relatively easily available and continued to develop. But the preference for digits is also because of their fidelity and the power they give to handle and manipulate images. Rather than having to accurately handle every aspect of analog signals, all digital circuits have to do is differentiate between, or generate, two electrical states – on and off, high and low, 1 and 0. To read, or correctly interpret, this information accurately requires only recognizing a 1 or 0 state, rather than the value of continuously varying analog signals. This is relatively easy and so leads to superb fidelity in multi-generation recordings, no losses in passing the signal from place to place, plus the potential of processing to produce effects, large-scale storage and many other techniques far beyond those available in analog.
Forty-plus years ago, the technology simply did not exist to convert television pictures into digits. Even if it could have been done there were no systems able to process the resulting data stream at anything like realtime. Today digital machines have successfully reached every aspect of television production – from scene to screen. At the same time costs have tumbled so that today all new equipment, from broadcast professional to consumer level, is digital.
From analog to digital
Initially, digitization involved working with television’s composite signals (PAL and NTSC) but this is now rare. Today it is the component signals (meaning separate signals that together make up the full color signal), not composite, which are digitized according to the ITU-R BT.601 and ITU-R BT.709 digital sampling specifications for SD and HD respectively (film applications use different ranges of sampling to these TV and video requirements).
‘601’ describes sampling at standard definition and is widely used in TV operations. Sampling for high definition, according to ITU-R BT.709, broadly follows the same principles, but works faster. Both standards define systems for 8-bit and 10-bit sampling accuracy – providing 2⁸ (= 256) and 2¹⁰ (= 1024) discrete levels with which to describe the analog signals.
There are two types of component signals: Red, Green and Blue (RGB), and Y, R-Y, B-Y. It is the latter which is by far the most widely used in digital television and is included in the ITU-R BT.601 and 709 specifications. The R-Y and B-Y, referred to as color difference signals, carry the color information while Y represents the luminance. Cameras, telecines, etc., generally produce RGB signals from their image sensors. These are easily converted to Y, R-Y, B-Y using a resistive matrix and filters. This is established analog technology used to prepare video for PAL or NTSC coding.
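The matrixing itself is simple arithmetic; with normalized RGB values (0.0–1.0) and the Rec.601 luminance weights it can be sketched as:

```python
def rgb_to_ydiff(r, g, b):
    """Rec.601 weighting: Y carries the luminance, and the two
    color difference signals R-Y and B-Y carry the color."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, r - y, b - y
```

On neutral greys R, G and B are equal, so both color difference signals are zero and only Y remains – which is exactly why the color channels can afford less bandwidth.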
Analog to digital conversion occurs in three parts: signal preparation, sampling and digitization.
The analog to digital converter (ADC) only operates correctly if the signals applied to it are correctly conditioned. There are two major elements to this. The first involves an amplifier to ensure the correct voltage and amplitude ranges for the signal are given to the ADC. For example, luminance amplitude between black and white must be set so that it does not exceed the range that the ADC will accept. The ADC has only a finite set of numbers (an 8-bit ADC can output 256 unique numbers – but no more, a 10-bit ADC has 1024 – but no more) with which to describe the signal. The importance of this is such that the ITU-R BT.601 and 709 standards specify this set-up quite precisely saying that, for 8-bit sampling, black should correspond to level 16 and white to level 235, and at 10-bit sampling 64 and 940 respectively. This leaves headroom for errors, noise and spikes to avoid overflow or underflow on the ADC. Similarly for the color difference signals, zero signal corresponds to level 128 (512 for 10-bit) and full amplitude covers only 225 (897) levels.
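The headroom scheme for luminance can be sketched as a mapping from normalized level to code value (the helper name is ours; the levels are those from the standards as quoted above):

```python
def to_code_level(luma, bits=8):
    """Map normalized luminance (0.0 = black, 1.0 = white) to the
    601/709 code levels: 16-235 at 8 bits, 64-940 at 10 bits,
    leaving headroom below black and above white for overshoots,
    noise and spikes."""
    black, white = (16, 235) if bits == 8 else (64, 940)
    return round(black + luma * (white - black))
```

Signals that overshoot slightly above white or below black still land on valid code values instead of clipping at the ADC’s limits.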
For the second major element the signals must be low-pass filtered to prevent the passage of information beyond the luminance band limit of 5.75 MHz and the color difference band limit of 2.75 MHz, from reaching their respective ADCs. If they did, aliasing artifacts would result and be visible in the picture (more later). For this reason low pass (anti-aliasing) filters sharply cut off any frequencies beyond the band limit. For HD, the principle remains the same but the frequencies are all 5.5 times higher, generally, depending on the HD standard being used.
Sampling and digitization
The low-pass filtered signals of the correct amplitudes are then passed to the ADCs where they are sampled and digitized. Normally two ADCs are used, one for the luminance Y, and the other for both color difference signals, R-Y and B-Y. Within the active picture the ADCs take a sample of the analog signals (to create pixels) each time they receive a clock pulse (generated from the sync signal). For Y the clock frequency in SD is 13.5 MHz and for each color difference channel half that – 6.75 MHz – making a total sampling rate of 27 MHz (74.25 MHz, 37.125 MHz and 148.5 MHz respectively for HD). It is vital that the pattern of sampling is rigidly adhered to, otherwise onward systems, and eventual conversion back to analog, will not know where each sample fits into the picture – hence the need for standards! Co-sited sampling is used, alternately making samples of Y, R-Y, and B-Y on one clock pulse and then on the next, Y only (i.e. there are half the color samples compared with the luminance). This sampling format used in 601 is generally referred to as 4:2:2 and is designed to minimize chrominance/luminance delay – any timing offset between the color and luminance information. Other sampling formats are used in other applications – for example 4:2:0 for MPEG-2 compression used for transmission.
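Co-sited 4:2:2 sampling amounts to keeping every luminance sample but only the chroma samples taken at the same instants as the even-numbered Y samples; a toy sketch on one scan line (plain lists stand in for the sampled waveforms):

```python
def subsample_422(y, cr, cb):
    """Co-sited 4:2:2: keep every luminance sample, but only every
    other chroma sample - the ones coincident with even-numbered Y
    samples - giving half the chroma rate, as in ITU-R BT.601."""
    return y, cr[::2], cb[::2]
```

A 4:2:0 scheme, as used for MPEG-2 transmission, goes further and also halves the chroma resolution vertically.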
The amplitude of each sample is held and precisely measured in the ADC. Its value is then expressed and output as a binary number, and the analog to digital conversion is complete. Note that the digitized forms of R-Y and B-Y are referred to as Cr and Cb.
Sampling (clock) frequency
The (clock) frequency at which the picture signal is sampled is crucial to the accuracy of analog to digital conversion. The object is to be able, at some later stage, to faithfully reconstruct the original analog signal from the digits. Clearly using too high a frequency is wasteful whereas too low a frequency will result in aliasing – so generating artifacts. Nyquist stated that for a conversion process to be able to re-create the original analog signal, the conversion (clock) frequency must be at least twice the highest input frequency being sampled (see diagram below) – in this case, for luminance, 2 x 5.75 MHz = 11.5 MHz. 13.5 MHz is chosen for luminance to take account of both the filter characteristics and the differences between the 625/50 and 525/60 television standards. It is a multiple of both their line frequencies, 15,625 Hz and 15,734.265 Hz respectively, and therefore compatible with both (see 13.5 MHz). Since each of the color difference channels will contain less information than the Y channel (an effective economy since our eyes can resolve luminance better than chrominance) their sampling frequency is set at 6.75 MHz – half that of the Y channel.
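The choice of 13.5 MHz can be checked arithmetically – it divides into a whole number of samples per line for both SD standards. A small Python check:

```python
# 13.5 MHz is a whole-number multiple of both SD line frequencies,
# which is why it was chosen as the common luminance sampling rate.

fs = 13.5e6
line_625 = 15625.0        # 625/50 line frequency, Hz
line_525 = 4.5e6 / 286    # 525/60 line frequency (~15734.266 Hz)

print(fs / line_625)      # 864 samples per total line (625/50)
print(fs / line_525)      # 858 samples per total line (525/60)
```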
From digital to analog
Today, it is increasingly common for the digital signal to be carried right through to the viewer, so the signal would not require digital to analog conversion at all. Where D to A conversion is required, the digital information is fed to three digital to analog converters (DACs), one each for Y, Cr and Cb (digitized R-Y and B-Y), which are clocked in the same way and with the same frequencies as was the case with the ADCs. The output is a stream of analog voltage samples creating a ‘staircase’ or ‘flat top’ representation similar to the original analog signal (see figure below). The use of a sampling system imposes some frequency-dependent loss of amplitude which follows a sin x/x curve, falling progressively toward the Nyquist frequency – half the sampling frequency, the highest frequency the system can represent. For example, sampling at 13.5 MHz can resolve frequencies only up to 6.75 MHz. Although the ITU-R BT.601 set-up keeps the wanted signals well inside that limit, the curved response is still there. This curve is corrected in the sin x/x low-pass filters which, by losing the unwanted high frequencies, smooth the output signal so it now looks the same as the original Y, R-Y, B-Y analog inputs. For those needing RGB, this can be simply produced by a resistive matrix.
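The sin x/x roll-off can be illustrated with a short calculation (a sketch of the idealized flat-top response; the exact behavior of real DACs and their filters varies):

```python
import math

# Sketch of the sin(x)/x amplitude loss imposed by flat-top ('staircase')
# DAC outputs. The response falls steadily with frequency; at the Nyquist
# frequency (half the sampling rate) a substantial droop is already present.

def sinc_response(f, fs):
    """Relative amplitude of a flat-top DAC output at frequency f."""
    if f == 0:
        return 1.0
    x = math.pi * f / fs
    return math.sin(x) / x

fs = 13.5e6
print(sinc_response(5.75e6, fs))  # ~0.73 at the BT.601 luma band edge
print(sinc_response(6.75e6, fs))  # ~0.64 at the Nyquist frequency
```

This droop is what the correcting sin x/x low-pass filters compensate for.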
Today the whole analog to digital and digital to analog process is usually reliable and accurate. However there are inherent inaccuracies in the process. The accuracy of the clock timing is important and it should not vary in time (jitter). Also the accuracy of the ADCs in measuring the samples, though within the specification of the chip, may not be exact. This is a specialized task as each sample must be measured and output in just 74 nanoseconds, or 13.5 nanoseconds for HD. Equally the DACs may only be expected to be accurate to within their specification, and so they too will impose some degree of non-linearity into the signal. Even with perfect components and operation the process of sampling and reconstituting a signal is not absolutely accurate. The output is never precisely the same as the original signal. For this reason, plus cost considerations, system workflows are designed so that repeated digitization processes are, as far as possible, avoided. Today it is increasingly common for pictures to be digitized at, or soon after, the camera and not put back to analog, except for monitoring, until the station output, or, with DTV, until arriving at viewers’ TV sets or set-top boxes; indeed in many cases the signal now remains digital throughout the entire production, distribution and viewing chain.
See also: 13.5 MHz
Film stock designed specifically for distribution and exhibition at cinemas. Unlike negative film, it is high contrast and low on latitude. This is designed to give the best performance when viewed at cinemas. Obviously a release print has to be clear of the orange base so this is bleached out during processing.
See also: Film basics (Tutorial 2)
Spots that occasionally appeared on the faces of…TV screens showing digital pictures when the technology was in its youth. These were caused by technical limitations but now that designs have matured, zits only appear during fault conditions.
Convenient shorthand commonly, but incorrectly, used to describe the analog luminance and color difference signals in component video systems. Y is correct for luminance but U and V are, in fact, the two subcarrier modulation axes used in the PAL color coding system. Scaled and filtered versions of the B-Y and R-Y color difference signals are used to modulate the PAL subcarrier in the U and V axes respectively. The confusion arises because U and V are associated with the color difference signals but clearly they are not themselves color difference signals. Or could it just be because YUV trips off the tongue much more easily than Y, R-Y, B-Y?
See also: PAL
These are the analog luminance, Y, and color difference signals (R-Y) and (B-Y) of component video. Y is pure luminance information whilst the two color difference signals together provide the color information. The latter are the differences between a color and luminance: red minus luminance and blue minus luminance. The signals are derived from the original RGB source, usually a camera.
The Y, (R-Y), (B-Y) signals are fundamental to much of television. For example in ITU-R BT.601 it is these signals that are digitized to make 4:2:2 component digital video, in the PAL and NTSC TV systems they are used to generate the final composite coded signal, and in DTV they are sampled to create the MPEG-2 video bitstream.
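The derivation of these signals can be sketched with the standard BT.601 luma weights (a minimal illustration; real equipment also applies scaling and filtering to the color difference signals):

```python
# Sketch of deriving Y, R-Y, B-Y from RGB using the BT.601 luma weights.
# Inputs are normalized 0.0-1.0 values.

def rgb_to_ydiff(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    return y, r - y, b - y                   # Y, R-Y, B-Y

y, ry, by = rgb_to_ydiff(1.0, 1.0, 1.0)      # white
print(y, ry, by)   # on neutral white both color differences are (near) zero
```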
The digital luminance and color difference signals in component video, including ITU-R BT.601 (SD), ITU-R BT.709 (HD) and ITU-R BT.2020 (UHD) coding. For SD with 4:2:2 sampling, the Y luminance signal is sampled at 13.5 MHz and the two color difference signals are sampled at 6.75 MHz, co-sited with one of the luminance samples. Cr is the digitized version of the analog component (R-Y); likewise Cb is the digitized version of (B-Y). For BT.709 HD, sampling rates are 5.5 times greater – 74.25 MHz for Y and 37.125 MHz for Cr and Cb. Finally, for BT.2020 4K UHD the frequencies are four times greater again, at 297 MHz for the luminance and 148.5 MHz for the chrominance.
A mathematically defined absolute color space, CIE X´Y´Z´, also known as CIE 1931 color space, was created by the International Commission on Illumination (CIE) in 1931. It was little heard of in the digital media industry until X´Y´Z´ was selected by DCI as the color space for digital cinema.
Launched in 2003, Sony’s XDCAM professional camcorder products have evolved with technology. The first model was for SD television and made use of its Professional Disc (PD), an application of Blu-ray Disc, as the on-board recording medium. The product range included camcorders, mobile and studio decks which were designed to take advantage of the size, weight, data speed and re-record features of the PD technology. It used the DVCAM codec and recorded SD 4:1:1 (480-line) and 4:2:0 (576-line) video at 25 Mb/s onto the PD.
XDCAM HD camcorder images were native 1440 x 1080 and recorded as HDV: 1080/59.94I, 50I, 29.97P, 25P, and native 23.98P video using MPEG-2 MP@HL compression with 4:2:0 sampling. Users could select 35 (HQ), 25 (SP), or 18 (LP) Mb/s bit rates according to picture quality and recording length requirements, ranging from 60 to 120 minutes. There were four channels of 16-bit, 48 kHz audio.
XDCAM EX takes the same ideas but records to solid-state storage in place of Blu-ray disc.
XDCAM HD422 is a family that includes a selection of cameras, recorders again including solid-state, and accessories.
See also: Professional Disc
What You See Is What You Get. Usually, but not always, referring to the accuracy of a screen display in showing how the final result will look. For example, a word processor screen showing the final layout and typeface that will appear from the printer. Or in an edit suite; does the monitor show exactly what will be placed on the master recording? This subject requires more attention as edited masters are now commonly output to a wide variety of ‘deliverables’ such as SD, HD, and UHD video, DVD, cinemas, the internet and social media… Issues such as color, gamma and display aspect ratio may need consideration.
See also: Color Management
Write Once/Read Many – describes storage devices on which data, once written, cannot be erased or re-written. This applies to some optical disks that are removable, making them useful for archiving. CD-R and DVD-R are examples.
See also: Optical disks
Clock information associated with AES/EBU digital audio channels. Synchronous audio sampled at 48 kHz is most commonly used in TV. The clock is needed to synchronize the audio data so it can be read.
See also: AES/EBU audio
A formal special interest group comprising consumer electronics and other technology companies (LG Electronics, Matsushita (Panasonic), NEC, Samsung, SiBEAM, Sony and Toshiba), formed to promote and enable the rapid adoption, standardization and multi-vendor interoperability of WirelessHD technology worldwide. Products provide a wireless digital connection of up to 10 Gb/s at ten meters, combining uncompressed high-definition video, multi-channel audio, intelligent format and control data, and Hollywood-approved content protection. It means the elimination of audio and video cables and short distance limitations. First-generation implementation high-speed rates range from 2-5 Gb/s for CE, PC, and portable devices, with up to 20 Gb/s eventually possible.
The latest Windows Media Player 12 (WM12) is supplied with Windows 7 and 8 and includes a range of video and audio codecs, including Microsoft’s own designs. File handling includes a wide range of MPEG-4 and WMV video decoders to work with most popular formats.
Worldwide Interoperability for Microwave Access (IEEE 802-16 and ETSI HiperMAN) uses OFDMA modulation over its radio links. The WiMAX Forum describes it as “a standards-based technology enabling the delivery of last-mile wireless broadband access as an alternative to cable and DSL”. Unlike Wi-Fi, this offers symmetrical bandwidth (equal upload and download speeds) over longer ranges of some kilometers with strong encryption (3DES or AES). It connects between network endpoints without line-of-sight of the base station for fixed, portable and mobile wireless broadband. A typical cell radius is 3-10 km, offering up to 40 Mb/s per channel for fixed and portable access, and up to 1 Gb/s for fixed only.
Mobile networks are expected to provide up to 15 Mb/s capacity within a 3 km cell radius. WiMAX technology is incorporated in some mobile devices, allowing urban areas and cities to become MetroZones for portable outdoor broadband wireless access.
See also: OFDMA
A TV picture that has an aspect ratio wider than 4:3 – usually 16:9 – while still using normal 525/60 or 625/50 SD video. 16:9 is also the aspect ratio used for HDTV. There is an intermediate scheme using 14:9 which is found to be more acceptable for those still using 4:3 displays. Widescreen is used on some analog transmissions as well as many digital transmissions. The mixture of 4:3 and 16:9 programming and screens has greatly complicated the issue of safe areas.
A compression technique in which the image signal is broken down into a series of frequency bands. This is very efficient, but the processing is more complex than for DCT-based compression (which uses a Fourier-related transform). Although some manufacturers once used their own wavelet-based compression, now virtually all wavelet compression used in the media industry is JPEG 2000. It is prevalent in DCI digital cinema, is used in some newer camcorders and is increasingly used in contribution and distribution circuits.
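The band-splitting principle can be illustrated with the simplest possible wavelet, a one-level Haar decomposition (an illustrative sketch, far simpler than the filters JPEG 2000 actually uses):

```python
# A one-level Haar wavelet decomposition - the simplest example of splitting
# a signal into a low-frequency band (averages) and a high-frequency band
# (details), the principle behind wavelet compression.

def haar_1d(signal):
    """Split a signal of even length into low-band and high-band halves."""
    lo = [(a + b) / 2 for a, b in zip(signal[::2], signal[1::2])]
    hi = [(a - b) / 2 for a, b in zip(signal[::2], signal[1::2])]
    return lo, hi

def haar_inverse(lo, hi):
    """Perfectly reconstruct the original signal from the two bands."""
    out = []
    for a, d in zip(lo, hi):
        out += [a + d, a - d]
    return out

lo, hi = haar_1d([8, 8, 6, 2])
print(lo, hi)                 # [8.0, 4.0] [0.0, 2.0]
print(haar_inverse(lo, hi))   # [8.0, 8.0, 6.0, 2.0] - perfectly reversible
```

Compression comes from the high-band details being mostly small and therefore cheap to code; real codecs repeat the split over several levels and in two dimensions.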
See also: JPEG 2000
An audio file format developed by Microsoft that carries audio that can be coded in many different formats. Metadata in WAV files describes the coding used. To play a WAV file requires the appropriate decoder to be supported by the playing device.
Vestigial Sideband modulation – an established modulation technique which is used in the RF (radio frequency) transmission subsystem of the ATSC(1) Digital Television Standard. The 8-VSB system has eight discrete amplitude levels supporting a payload data rate of 19.28 Mb/s in a 6 MHz TV channel. There is also a high data rate mode – 16 VSB – designed for CATV (cable television) and supporting a payload of 38.57 Mb/s.
Things move on; E-VSB, Enhanced-VSB, was approved by ATSC in 2004 as an amendment to the A/53C DTV Standard as an optional transmission mode with additional forward error correction coding layers to help reception under weaker signal conditions. This responded to the wishes of broadcasters for more flexibility in DTV. E-VSB allows broadcasters to trade data rate for a lower carrier-to-noise threshold for some services, e.g. “fall back” audio, targeted at receivers with indoor antennas, non-realtime transmissions of file-based information, and more.
Vertical Interval Timecode (pronounced ‘vitsy’). Timecode information in digital form, added into the vertical blanking of a TV signal. This can be read by the video heads from videotape at any time pictures are displayed, even during jogging and freeze but not during spooling. This effectively complements LTC ensuring timecode can be read at any time.
See also: LTC
Distance of a viewer from a screen or picture. If viewers are too far away they will not be able to see the full detail of the images; if too near they will see the limitations of the detail in the images and possibly not see the picture’s edges. The ideal viewing distance, where you clearly see all the available detail, depends on the size of the video image, SD, HD, etc, and the size of the screen.
There is a rule-of-thumb to work out the ideal viewing distance, where you can see all the detail – provided you have a matching HD or 4K (or maybe 8K) capable screen. This is calculated by taking into account the acuity of the human eye, which is about one minute of arc (1/60 of a degree). This means to see all the detail of SD television pictures on a 16 x 9 screen, we need to be no more than 6 screen heights (H) away. 1080-line HD slightly more than doubles the horizontal and vertical detail so we need to halve the distance to 3H to see all that HD is offering. With 4K (twice the HD vertical and horizontal lines and pixels) we need to halve the distance again – to 1.5H. And if you are looking at 8K, it’s just 0.75H. If you don’t want to get so close to the screen the other way around is to get a much bigger one that is able to display all the pixels of your biggest chosen format.
Fortunately TV screen manufacturers have risen to the challenge… or have they? If viewing 1920 x 1080 HD on a 40-inch screen a viewer would have to sit no more than 60 inches (1.5m) away to see all the detail. For 4K UHD that should be reduced to 30 inches (0.76m). Then for 8K the distance would be just 15 inches (38cm). Perhaps that feels too close, so the alternative is to have a bigger screen. Assuming viewing at 1.5m is comfortable, then 4K viewing would require an 80-inch screen and for 8K, the rarely observed 160-inch screen!
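The rule-of-thumb above can be sketched as a short calculation (assuming acuity of one minute of arc per pixel; the function name is illustrative):

```python
import math

# Rule-of-thumb sketch: the distance, in screen heights, at which one pixel
# subtends one minute of arc (roughly the acuity of the human eye).

def ideal_distance_in_heights(vertical_pixels):
    one_minute = math.radians(1 / 60)   # visual acuity as an angle
    return 1 / (vertical_pixels * math.tan(one_minute))

for lines in (576, 1080, 2160, 4320):   # SD, HD, 4K UHD, 8K UHD
    print(lines, round(ideal_distance_in_heights(lines), 2))
```

The printed values land close to the 6H, 3H, 1.5H and 0.75H figures quoted above.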
The same principles apply in other digital viewing environments, such as cinema. Typically, to appreciate all the detail of the DCI 4K (4096 x 2160) pictures, you would probably have to be in the front third of the cinema. For 8K (not a DCI standard), keep to the front few rows.
With this in mind, the design and shooting of scenes should allow the audience to see a great deal of detail, and have most, if not all, of their vision filled with the screen’s images. This has implications for scenery, make-up and wardrobe as well as lighting.
Variable Frame Rate shooting used to be only possible with film cameras; all electronic cameras worked at fixed frame rates. Panasonic’s HD Varicam was the first to offer variable speeds, originally with frame rates from 4 to 60 f/s in one-frame increments. Sony’s XDCAM HD offered the same range. There are also specialized digital cameras and solid-state recorders able to capture video at frame rates up to 1000 f/s, or more. Instant replay shows an otherwise unseen world of extreme slow motion.
In today’s multimedia world there is much demand for many versions of a finished production, and this business has ballooned. Historically, versioning involved making copies from the edited and graded master to various videotape formats and, via a standards converter, to other video standards (e.g. NTSC to PAL). Now the technical variations involve many more formats being supplied, including Web, mobile, HD and SD TV, DVD and cinema, as well as a variety of display systems including LED, LCD, plasma and digital cinema. Aside from the technical needs, commercial, language and religious influences are among the many factors that can call for yet more versions.
Versioning is big business, as the number of versions can run to many tens and involve much more than simply making copies of the master. For example, work may involve re-grading to suit different viewing conditions, re-insertion of text or images to suit different regions or countries, pricing (for commercials), adding or removing shots or scenes for censoring, etc. Generally, for this to be done efficiently and effectively requires nonlinear editing in an uncommitted environment, where original footage and all the post processes that produced the master are available for recall and allow direct access to further adjustments, to re-make the result in a short time.
Fonts that are stored as vector information – sets of lengths and angles to describe each character. This offers the benefits of using relatively little storage and the type can be cleanly displayed at virtually any size. However it does require that the type is RIPped before it can be used – requiring processing for interactive use when sizing and composing type into a graphic. Quantel’s range of graphics and editing equipment uses vector fonts.
See also: TrueType
Avid’s DNxHD codec has been approved as compliant with the SMPTE VC-3 standard.
See also: DNxHD
Standardized by SMPTE, VC-2 (also known as Dirac Pro) is a video codec technology developed by the BBC. VC-2 is open source and royalty-free for all to use. It is an intra-frame compression scheme aimed at professional production and post production. Compression ratios are in the range 2:1 to 16:1, and typical VC-2 applications are seen to include desktop production over IP networks, reducing disk storage bandwidth in D-Cinema production and moving HD video over legacy infrastructure. A current application provides near-lossless compression to enable the use of HD-SDI to carry 1080/50P and 1080/60P, which would otherwise require new 3G SDI infrastructure.
VC-1 is a video codec specification (SMPTE 421M-2006) implemented by Microsoft as Windows Media Video (WMV) 9, and specified in Blu-ray Disc, and many others. It is designed to achieve state-of-the-art compressed video quality at bit rates ranging from very low to very high with low computational complexity for it to run well on PC platforms. The codec can handle 1920 x 1080 at 6 to 30 Mb/s for high-definition video and is capable of higher resolutions such as 2K for digital cinema, and of a maximum bit rate of 135 Mb/s. An example of very low bit rate video would be 160 x 120 pixel at 10 kb/s.
VC-1 uses some transforms similar to those of H.261 (1990, the first practical digital coding standard) but is much closer to H.264/AVC. It includes some distinctive innovations and optimizations. These include 16-bit transforms to help minimize decoder complexity, and interlace coding using data from both fields to predict motion compensation. Also, fading compensation improves compression efficiency for fades to/from black, and a modified de-blocking filter helps handle areas of high detail.
Individual opinions differ but broadly speaking VC-1 offers at least similar performance and efficiency to H.264/AVC; some say it looks better. VC-1 offers a number of profiles for coding features, and levels of quality combinations defining maximum bit rates. These have a wide range from 176 x 144/15P which may be used for mobile phones, to 2K (2048 x 1536/24P) for movie production.
| Profile  | Level  | Max Bit Rate | Resolutions and Frame Rate |
|----------|--------|--------------|----------------------------|
| Simple   | Low    | 96 Kb/s      | 176 x 144 @ 15 Hz (QCIF) |
| Simple   | Medium | 384 Kb/s     | 240 x 176 @ 30 Hz; 352 x 288 @ 15 Hz (CIF) |
| Main     | Low    | 2 Mb/s       | 320 x 240 @ 24 Hz (QVGA) |
| Main     | Medium | 10 Mb/s      | 720 x 480 @ 30 Hz (480p); 720 x 576 @ 25 Hz (576p) |
| Main     | High   | 20 Mb/s      | 1920 x 1080 @ 30 Hz (1080p) |
| Advanced | L0     | 2 Mb/s       | 352 x 288 @ 30 Hz (CIF) |
| Advanced | L1     | 10 Mb/s      | 720 x 480 @ 30 Hz (NTSC-SD); 720 x 576 @ 25 Hz (PAL-SD) |
| Advanced | L2     | 20 Mb/s      | 720 x 480 @ 60 Hz (480p); 1280 x 720 @ 30 Hz (720p) |
| Advanced | L3     | 45 Mb/s      | 1920 x 1080 @ 24 Hz (1080p); 1920 x 1080 @ 30 Hz (1080i); 1280 x 720 @ 60 Hz (720p) |
| Advanced | L4     | 135 Mb/s     | 1920 x 1080 @ 60 Hz (1080p); 2048 x 1536 @ 24 Hz |
See also: MPEG-4
Panasonic range of cameras that offer variable frame rates, typically from 4-60 f/s, for video and cinematographic projects. So, if working at a nominal 24 f/s, the system offers x6 speed-up (undercranking) to x2.5 slow-down (overcranking). The system works by continuously recording 60 f/s to tape while the images are captured at the appropriate rate; the relevant useful frames are then flagged. Editing equipment with a VariCam interface can use the flag to record the right frames and so replay them at the right speed (e.g. with Panasonic and Quantel editing systems).
The range covers SD, HD and 4K UHD models. There is also P2, a solid-state camera recording system that offers long-form recording and high-speed transfers to editing equipment. P2 cards offer up to 64 GB storage and can operate in an editing environment as well as on the camera.
While many video compression schemes are ‘constant bit rate’ – designed to produce fixed data rates irrespective of the complexity of the video – VBR offers the possibility of a constant picture quality by varying the bit rate of, typically, MPEG-2 or MPEG-4 compressed video according to the needs of the pictures. Still or slow-moving sequences use less data, leaving more for those with greater detail and/or movement, so maintaining a constant quality. This reduces the storage needed on DVDs while delivering better overall quality, or allows more efficient allocation of the total available bit rate in a multi-channel broadcast multiplex.
Software or hardware that is promised or talked about but is not yet completed – and may never be released.
See also: RSN
Universal Serial Bus – has been evolving. It is common to have four or six USB connectors on a PC or laptop computer. These are usually USB 2.0 (introduced in 2000), identifiable by the ports which are generally black. The maximum transfer rate is 480 Mb/s which offers potentially useful connectivity for media applications on PCs and Macs. It is very cheap and widely used for connecting PC peripherals. It is a PAN, and so the service provided to any one device depends on their specification and what other connected devices are doing. Actual speeds achieved for bulk data transfers are about 300 Mb/s – but this is likely to rise.
The newer (2008) USB 3 ports are generally blue. They add a new transfer mode called SuperSpeed working at up to 5 Gb/s, more than ten times faster than USB 2. It is also full duplex (USB 2 is half duplex) meaning it can both simultaneously transmit and receive at full speed – making a total of 10 Gb/s I/O. USB 3.1 was released in 2013 which doubled the top speed to 10 Gb/s, with full duplex.
See also: IEEE 1394
The process which increases the size, or number of pixels used to represent an image, by interpolating between existing pixels to create the same image on a larger format. There is no implied change of vertical scan rate. Despite its name, the process does not increase the resolution of the image; it just spreads the same information over a larger canvas. The quality of the result depends on that of the original image as well as the accuracy of the interpolation process that creates the new larger image. Speed is an issue for realtime work, as good quality requires a large amount of processing, which also increases with the picture area.
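The interpolation step can be sketched in one dimension (a minimal linear example; production up-res uses far more sophisticated multi-tap filters):

```python
# Minimal sketch of up-res by linear interpolation along one scan line;
# new samples are created between existing ones, but no new detail appears.

def upres_line(line, factor):
    """Resample a list of pixel values to factor x its length."""
    n = len(line)
    out = []
    for i in range(n * factor):
        pos = i / factor
        left = min(int(pos), n - 1)
        right = min(left + 1, n - 1)
        frac = pos - left
        out.append(line[left] * (1 - frac) + line[right] * frac)
    return out

print(upres_line([10, 20, 30], 2))   # [10.0, 15.0, 20.0, 25.0, 30.0, 30.0]
```

The in-between values (15, 25) are invented by the interpolator; they smooth the larger image but carry no information beyond the original samples.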
Unicode allows computers to consistently represent and manipulate text in most of the world’s writing systems – 30 are currently implemented – describing about 100,000 characters. Before Unicode there were hundreds of different encoding systems for assigning numbers to characters, and no single one contained enough characters to cover all languages – even within Europe alone. Unicode provides a unique number for every character, regardless of platform, program or language. The Unicode Standard is widely adopted and supported by the computer industry.
Editing where the decisions are made and the edits completed but any can still easily be changed. This is possible in an edit suite with FrameMagic that includes true random access editing – where the edits need only comprise the original footage and the edit instructions. Nothing is re-recorded so nothing is committed. This way, decisions about any aspect of the edit can be changed at any point during the session, regardless of where the changes are required. Where new frames are generated, such as in mixes, dissolves and compositing, all the tools and their settings are available – preferably on the edit timeline.
See also: True random access
Ultra High Definition Television has two picture sizes, correctly referred to as UHD-1 (4K) and UHD-2 (8K), though more commonly known as 4K UHD and 8K UHD respectively. Both are standardized in ITU-R BT.2020.
The 4K UHD image is four times the area of 1920 HD at 3840 x 2160. There is now a choice of equipment for 4K UHD production and a number of programs have been produced in the format. 4K UHD TV screens are widely available; transmission trials have proved the delivery system using HEVC video compression via DVB.
What is now known as 8K UHD started life as Super Hi-Vision in the laboratory at NHK in about 2001. The picture size is 7680 x 4320 – 16 times the area of 1920 HD. There are plans for transmission in Japan.
ITU-R BT.2020 describes frame rates from 23.98 to 120 Hz, with only progressive scans, and a wider color gamut which can reproduce richer colors than HD. The 4K format has already been used for a number of broadcasts. For sport the 50 or 60 Hz frame rate is popular. Even at lower rates the format produces very large quantities of data that require faster video links and more storage than HD. The requirement is four times larger again with 8K.
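The data-quantity arithmetic can be sketched as follows (assuming, purely for illustration, uncompressed 10-bit 4:2:2 sampling at 50 Hz):

```python
# Rough uncompressed data-rate arithmetic for HD, 4K UHD and 8K UHD,
# assuming 10-bit 4:2:2 sampling (two 10-bit samples per pixel on average)
# at a 50 Hz frame rate. Illustrative only - real links add blanking, audio
# and ancillary data.

def gbits_per_sec(width, height, fps, bits=10, samples_per_pixel=2):
    return width * height * fps * bits * samples_per_pixel / 1e9

for name, w, h in (("HD", 1920, 1080), ("4K UHD", 3840, 2160), ("8K UHD", 7680, 4320)):
    print(name, round(gbits_per_sec(w, h, 50), 2), "Gb/s")
```

Each step up quadruples the raw data rate, which is why UHD pushes so hard on links and storage.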
Removal of the least significant bits (LSBs) of a digital word – as could be necessary when connecting a 10-bit video source into 8-bit video equipment, or handling the 16-bit result of digital video mixing on an 8-bit system. Just dropping the lowest two bits is not the right answer. If not carefully handled truncation can lead to unpleasant artifacts on video signals – such as ‘contouring’. Quantel invented Dynamic Rounding as a way to handle the truncation of digital image data so that the values of the dropped lower bits are contained in the remaining bits.
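The effect can be sketched numerically. Note that the random dither below is only a crude stand-in for proprietary techniques such as Dynamic Rounding, which is considerably more sophisticated:

```python
import random

# Sketch of 10-bit to 8-bit word-length reduction. Plain truncation just
# drops the two LSBs and biases values downward, which causes contouring.
# Adding a small dither before truncating (a crude stand-in for techniques
# like Dynamic Rounding) preserves the sub-LSB information on average.

def truncate(v10):
    return v10 >> 2

def dithered(v10, rng=random.Random(0)):
    return min(255, (v10 + rng.randrange(4)) >> 2)

v = 514                               # a 10-bit value midway between 8-bit codes
trials = [dithered(v) for _ in range(10000)]
print(truncate(v))                    # always 128 - the half-LSB is lost
print(sum(trials) / len(trials))      # ~128.5 on average with dither
```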
See also: Dynamic Rounding
The TrueType vector font format was originally developed by Apple Computer, Inc. The specification was later released to Microsoft. TrueType fonts are therefore supported on most operating systems. Most major type libraries are available in TrueType format. There are also many type design tools available to develop custom TrueType fonts. Quantel equipment supports the import of these and other commercially available fonts.
Quantel term for the ability to continuously read any frame, or sequence of frames, in any order at or above video (or realtime) rate, a part of FrameMagic. A true random access video store (usually comprising disks) can allow editing which offers rather more than just quick access to material. Cuts are simply instructions for the order of replay and involve no copying, so adjustments can be made as often and whenever required. This results in instant and very flexible operation. At the same time technical concerns associated with fragmentation do not arise as the store can operate to specification when fully fragmented, ensuring full operation at all times. This aspect is particularly important for server stores with many users.
Dolby’s lossless audio technology developed for high-definition disk-based media including Blu-ray Disc. It includes ‘bit-for-bit’ lossless coding up to 18 Mb/s and support for up to eight channels of 24-bit, 96 kHz audio. It is supported by HDMI.
This has no strict technical meaning but is marketing hype. The ATSC says that 720P, 1080I and 1080P are all true HD, but the term has tended to be associated with 1080P, often in advertising, though this is nothing official. Not to be confused with… TrueHD.
See: Digitizing Time
Following a defined point, or points, in the pictures of a clip. Initially this was performed by hand using a DVE, but it was laborious, difficult and limited to pixel accuracy. Now image tracking is widely used, thanks to the availability of automatic point tracking operating to sub-pixel accuracy. The tracking data can be applied to control DVE picture moves for such applications as the removal of film weave, replacing 3D objects in moving video, wire removal, etc.
Advanced multiple point tracking is sometimes used to analyze images in 3D, so allowing a whole raft of computer-generated material to be move-matched for compositing into live scenes, blurring the boundaries of live and synthetic imagery.
Three Letter Acronym. There is a huge collection of examples in these pages, from AAF to WAV. Somehow two letters seems abrupt and four or more becomes a bit clumsy. Three is perfect!
A technique that allows a (computing) process to be shared among several processors, with the objective of completing the process more quickly than would be possible with a single processor. In multiprocessing, programs are divided into threads that run simultaneously on different cores to provide faster processing for multiple tasks.
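The idea can be sketched in Python (a minimal illustration of dividing work into threads; note that in CPython threads share one interpreter lock, so this shows the structure of the technique rather than a guaranteed speed-up):

```python
from concurrent.futures import ThreadPoolExecutor

# Minimal sketch of splitting one job across threads: summing chunks of a
# list concurrently, then combining the partial results.

def chunk_sum(chunk):
    return sum(chunk)

data = list(range(1000))
chunks = [data[i:i + 250] for i in range(0, 1000, 250)]

with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(chunk_sum, chunks))

print(sum(partials))   # 499500 - same result as a single-threaded sum
```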
Device for converting film images into video in realtime. The main operational activity here is color grading which is executed on a shot-by-shot basis and absorbs considerable telecine time. This includes the time needed for making grading decisions and involves significant handling of the film, spooling and cueing which risks film wear and damage, besides the actual transfer time. The output of a telecine is digital video (rather than data files).
Digital technology has moved the transfer process on. Now, adding a disk store or server can create a virtual telecine enabling the film-to-digital media transfer to run as one continuous operation. Whole film spools can be scanned in one pass, with useful footage selected by an EDL. In this case the telecine may be termed a Film Scanner – creating image files (rather than digital video) that contain sufficient latitude for downstream grading.
Tagged Image File Format. A bit-mapped file format for scanned images. Widely used in the computer industry. There are many variations of this format.
A graphical representation of events, such as those that occur in editing, compositing, grading or other processes, usually as a horizontal line. This works well with disk-based and solid-state based operations providing instant access to any part of the process and, hopefully, to all the footage, decisions, associated tools and their settings.
Transmission Control Protocol/Internet Protocol. A set of standards that enables the transfer of data between computers. Besides its application directly to the Internet it is also widely used throughout the computer industry. It was designed for transferring data files rather than large files of television or film pictures. Thus, although TCP/IP has the advantage of being widely compatible, it is a relatively inefficient way of moving picture files.
Timebase Corrector. This was often included as a part of a VTR to correct the timing inaccuracies of the pictures read from tape. Early models were limited by their dependence on analog storage devices, such as ultrasonic glass delay lines. This meant that VTRs, such as the original quadruplex machines, had to be mechanically highly accurate and stable to keep the replayed signal within the correction range (window) of the TBC; just a few microseconds.
The introduction of digital processing techniques made much larger stores, with analog inputs and outputs, economic, so widening the correction window and reducing the need for especially accurate, expensive mechanical engineering. Digital TBC has had a profound effect on VTR design – and price. Quantel’s first product was a digital TBC for use with IVC VTRs.
When VTRs went digital (DVTR) the TBC function became easier and was always a part of the DVTR. With the wide use of disk or solid-state video stores the TBC function has disappeared as data is accurately timed by buffers and clocks built into the equipment.
An image file format widely used in computers. It was developed by Truevision Inc. and there are many variations of the format.
Table 3 of the ATSC DTV Standard, Annex A, summarizes the picture formats originally allowable for DTV transmission in the USA. Any one of these may be compressed using MPEG-2 and transmitted. An ATSC receiver must be able to display pictures from any of these formats.
The range of formats in Table 3 caters for schemes familiar to both television and computers, and takes account of different resolutions, scanning techniques and refresh rates. For each frame size, frame rates are available to provide compatibility with existing transmission systems and receivers. 29.97 Hz is needed to keep step with NTSC simulcasts. Technically this is not required once NTSC is no longer transmitted! 30 Hz is easier to use, and does not involve considerations such as drop-frame timecode. 24 Hz progressive (24P) offers compatibility with film material. A choice of progressive or interlaced scanning is also available for most frame sizes (see Progressive and Interlaced).
Table 3 is concerned with video formats to be handled in the ATSC system rather than defining standards for video production. ATSC’s Table 1 of Annex A lists the standardized video production formats likely to be used as inputs to the compression system.
|Vertical size value||Horizontal size value||Aspect ratio information||Frame rate code||Progressive sequence|

ATSC Annex A, Table 1: Standardized video input formats
|Video standard||Active lines||Active samples/line|
|SMPTE 274M||1080||1920|
|SMPTE S17.392||720||1280|
|ITU-R BT.601-4||483||720|
This carries separate timing information (clock data) for keeping send and receive operations in step. The data bits are sent at a fixed rate so transfer times are guaranteed but transfers use more resources (than asynchronous) as they cannot be shared. Applications include native television connections, live video streaming and SDI. Operation depends on initial negotiation at send and receive ends but transfer is relatively fast.
Silicon X-tal (crystal) Reflective Display, a reflective liquid crystal micro-display from Sony used in the first commercially available 4K-sized projectors. The display chip has 4096 x 2160 pixels on one-and-a-half inches (diagonal) of a silicon chip. The design maintains a uniform, ultra-thin liquid crystal cell gap without any spacers in the image area, contributing to contrast performance, claimed as 4000:1. Its Vertically Aligned Nematic (VAN) liquid crystal changes state fast enabling speeds up to 200 f/s while minimizing image smear. HDTV-sized SXRD chips have been used in Sony consumer products, including a front projector and rear projection televisions up to 70 inches.
See also: Projectors (digital)
Connecting network users via a switch means that each can be sending or receiving data at the same time with the full wire-speed of the network available. This is made possible by the aggregate capacity of the switch. So, for example, an eight-port Gigabit Ethernet switch will have an aggregate capacity of 8 Gb/s. This means many simultaneous high-speed transactions taking place without interruption from other users. The Internet is connected by thousands of very high speed network switches.
Also known as 8K UHD and pioneered by the Japanese broadcaster NHK, this is a very large format television system with a picture size of 7680 x 4320 pixels. It is proposed to run at frame rates from 23.98 to 120 Hz and to start broadcasting by 2020. SHV can also support a 22.2 sound system with 22 speakers and two woofers.
A spatial resolution smaller than that described by one pixel. Although digital images are composed of a matrix of pixels it can be very useful to resolve image detail to smaller than pixel size or position, i.e. sub-pixel. For example, the data for generating a smooth curve on the screen needs to be created to a finer accuracy than the pixel grid itself, otherwise the curve will look jagged. Similarly, when tracking an object in a scene, executing a DVE move, or calculating how a macroblock in MPEG-4 AVC coding moves from one picture to another, the size and position of the manipulated picture or element must be calculated, and the object resolved, to a far finer accuracy than that of whole pixels, otherwise the move will appear jerky or wrong.
Moving an image with sub-pixel accuracy requires picture interpolation as its detail, that was originally placed on lines and pixels, now has to appear to be where none may have existed, e.g. between lines. The original picture has to be effectively rendered onto an intermediate pixel/line position. The example of moving a picture down a whole line is achieved relatively easily by re-addressing the lines of the output. But to move it by half a line requires both an address change and interpolation of the picture to take information from the adjacent lines and calculate new pixel values. Good DVEs and standards converters work to a grid many times finer than the line/pixel structure.
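The half-line case above can be sketched in one dimension: a hypothetical linear-interpolation resampler that produces new pixel values lying between the original ones. Real DVEs and standards converters use higher-order, multi-tap filters rather than this two-tap example.

```python
# Minimal sketch of a sub-pixel move: shifting a 1D row of pixel values by a
# fractional amount using linear interpolation between the two nearest pixels.

def shift_row(row, offset):
    """Resample 'row' shifted by 'offset' pixels (0 <= offset < 1)."""
    out = []
    for i in range(len(row) - 1):
        # Each new value lies between pixel i and pixel i+1
        out.append(row[i] * (1 - offset) + row[i + 1] * offset)
    return out

print(shift_row([0, 100, 200, 100], 0.5))  # [50.0, 150.0, 150.0]
```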
A popular language for computer database management. It is very widely used in client/server networks for PCs to access central databases and can be used with a variety of database management packages. It is data-independent and device-independent so users are not concerned with how the data is accessed. As increasing volumes of stored media content are accessible over networks, SQL is able to play a vital role in finding any required items.
Refers to supplying a constant realtime media service. Although broadcast TV has done this from the beginning, and SDI streams data, the term is more usually associated with delivery by networks, usually the Internet, where it accounts for a large majority of the traffic. The transmission comprises a stream of data packets which can be viewed/heard as they arrive, though they are often buffered, stored slightly in advance of viewing/hearing, to compensate for any short interruptions of delivery. For the Internet, media is compressed and generally offers acceptable results for audio and video. There are three predominant video streaming solutions, each with its particular advantages: RealNetworks with RealVideo, RealAudio and RealPlayer; Microsoft Windows Media; and Apple QuickTime. As Internet transfers are not deterministic, pictures and sound may not always be continuously delivered.
Many popular sites, such as YouTube and the BBC iPlayer, offer both SD and HD services. A few are working with 4K UHD. Most TV and radio stations offer live streaming services.
This is just arithmetic. You can work all these figures out yourself but it’s really useful having some of the key numbers already to hand. Using the ITU-R BT.601 4:2:2 digital coding standard for SD, each picture occupies a large amount of storage space – especially when related to computer storage devices such as DRAM and disks. So much so that the numbers can become confusing unless a few benchmark statistics are remembered. Fortunately the units of mega, giga and tera make it easy to express the vast numbers involved; ‘one gig’ trips off the tongue far more easily than ‘one thousand million’ and sounds much less intimidating.
Storage capacities for SD video can all be worked out directly from the 601 standard. Bearing in mind that sync words and blanking can be re-generated and added at the output, only the active picture area need be stored on disks. In line with the modern trend of many disk drive manufacturers, kilobyte, megabyte and gigabyte are taken here to represent 10³, 10⁶ and 10⁹ bytes respectively.
Every line of a 625/50 or 525/60 TV picture has 720 luminance (Y) samples and 360 each of two chrominance samples (Cr and Cb), making a total of 1,440 samples per line.
There are 576 active lines per picture creating 1440 x 576 = 829,440 pixels per picture.
Sampled at 8 bits per pixel (10 bits can also be used) a picture is made up of 6,635,520 bits or 829,440 8-bit bytes – generally written as 830 kB.
With 25 pictures a second there are 830 x 25 = 20,750 kbytes or 21 Mbytes per second.
There are 480 active lines and so 1,440 x 480 = 691,200 pixels per picture.
With each pixel sampled at 8-bit resolution this format creates 5,529,600 bits, or 691.2 kbytes per frame. At 30 frames per second this creates a total of 20,736 kbytes, or 20.7 Mbytes per second.
Note that both 625 and 525 line systems require approximately the same amount of storage for a given time – 21 Mbytes for every second. To store one hour takes 76 Gbytes. Looked at another way each gigabyte (GB) of storage will hold 47 seconds of non-compressed video. 10-bit sampling uses 25% more storage.
If compression is used, and assuming the sampling structure remains the same, simply divide the numbers by the compression ratio. For example, with 5:1 compression 1 GB will hold 47 x 5 = 235 seconds, and 1 hour takes 76/5 = 18 GB (approx). The storage requirement for VBR compression cannot be precisely calculated but there is usually some target average compression ratio or data rate figure quoted.
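As a worked check of the figures above, here is a small sketch of the 601 storage arithmetic: decimal units, active picture only, with an optional compression ratio applied by simple division as described.

```python
# Rough SD (ITU-R BT.601 4:2:2) storage arithmetic, as described above.
# Decimal units: 1 kB = 1e3 bytes, 1 MB = 1e6 bytes, 1 GB = 1e9 bytes.

def sd_storage(active_lines, fps, bits=8, compression=1.0):
    """Return (bytes per frame, MB/s, GB per hour) for 601 4:2:2 video."""
    samples_per_line = 720 + 360 + 360                 # Y + Cr + Cb
    bytes_per_frame = samples_per_line * active_lines * bits / 8 / compression
    mb_per_sec = bytes_per_frame * fps / 1e6
    gb_per_hour = mb_per_sec * 3600 / 1e3
    return bytes_per_frame, mb_per_sec, gb_per_hour

frame, rate, hour = sd_storage(576, 25)                # 625/50 system
print(round(frame), round(rate, 1), round(hour))       # 829440 20.7 75
```

The 525/60 case, `sd_storage(480, 30)`, gives the same ~20.7 MB/s, matching the observation that both systems need roughly equal storage per second.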
All media are limited by the bandwidth available in the transmission/delivery channel. There is a wide choice of services and screens. In the most restricted cases some wireless and mobile applications are supported with a variety of small screens, shapes and resolutions ranging from VGA (480×640) and some 3 or 4G phones with up to 320×240, or 176×144 pixels and frame rates down to 15Hz. Many modern smart phones boast 1920 x 1080 HD screens.
There are many video formats for HD but the 1920 x 1080 format is popular. Using 4:2:2 sampling, each line has 1920 Y samples and 960 each of Cr and Cb = 3840 samples per line. So each picture has 3840 x 1080 = 4.147 M samples. For 10-bit sampling each picture has the equivalent data of 5.18 M (8-bit) bytes. Assuming 30 pictures (60 fields) per second these produce 155 M bytes/s – 7.4 times that of SD. An hour of storage now needs to accommodate 560 GB.
Ultra High Definition has two sizes of picture – 4K and 8K. 4K is 3840 x 2160, twice the width and height of 1080 HD. Using 4:2:2 10-bit sampling, each picture is 16.588 M samples, equivalent to 20.735 MB of data. At 30 f/s that amounts to 622.05 MB/s, or 2.24 TB/h.
8K at 7680 x 4320 is twice the width and height, and four times the area, of 4K. One frame is 66.355 M samples, or 82.94 MB. At 30 f/s this produces 2.488 GB/s, making an hour nearly 9 TB of data.
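The HD and UHD figures above all follow the same sum. A sketch of it, where chroma_factor is 2.0 for 4:2:2 (Y plus half-rate Cr and Cb) and 3.0 for 4:4:4:

```python
# Sketch of the uncompressed HD/UHD data-rate arithmetic (decimal units).
# chroma_factor: 2.0 for 4:2:2 sampling, 3.0 for 4:4:4.

def video_rate(width, height, fps, bits, chroma_factor):
    """Return (MB per frame, MB/s, TB per hour)."""
    samples = width * chroma_factor * height   # samples per frame
    mb_frame = samples * bits / 8 / 1e6        # MB per frame
    mb_sec = mb_frame * fps
    tb_hour = mb_sec * 3600 / 1e6
    return mb_frame, mb_sec, tb_hour

# 1080-line HD, 4:2:2 10-bit, 30 frames/s: ~5.18 MB/frame, ~155 MB/s
print(video_rate(1920, 1080, 30, 10, 2.0))
# 4K UHD, 4:2:2 10-bit, 30 frames/s: ~20.7 MB/frame, ~622 MB/s, ~2.24 TB/h
print(video_rate(3840, 2160, 30, 10, 2.0))
```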
2K and 4K DCI
2K is a format used in digital movie production with 4:4:4 10-bit sampling, RGB colorspace, an image size of 2048 x 1536 and 24 frames per second. This makes one frame 11.80 MB, and an hour of storage 1.04 TB. Note that, applied to digital cinema exhibition, the 2K pixel size is 2048 x 1080, with X´Y´Z´ color space and 12-bit 4:4:4 sampling, as defined by the DCI. The 4K image size is increasingly used; it is a 2 x 2 version of 2K, with four times the number of pixels.
Here are some popular TV and digital film formats showing the volume of their uncompressed data. Compression of up to 100:1 is applied to MPEG-2 TV transmissions – over 100:1 may be used with more advanced codecs such as MPEG-4 and VC-1. DCI has set the maximum data rate for replay in digital cinemas at 250 Mb/s. Here JPEG 2000 compression is used and there is no inter-frame compression; this works out at a compression of about 6.4:1 for 2K and 25.5:1 for 4K.
|Format||Sampling||Image size (H x V)||One frame (MB)||Data rate (Mb/s)||One hour (GB)|
|320/15P||4:1:1 8-bit||320 x 240||0.12||14.4||6.5 (3G phone)|
|525/60I||4:2:2 8-bit||720 x 480||0.69||166||76|
|625/50I||4:2:2 8-bit||720 x 576||0.83||166||76|
|720/60P||4:2:2 10-bit||1280 x 720||2.3||1104||500|
|1080/60I||4:2:2 10-bit||1920 x 1080||5.2||1248||560|
|1080/25P||4:4:4 10-bit||1920 x 1080||7.8||1560||700 (RGB)|
|1080/60P||4:4:4 10-bit||1920 x 1080||7.8||3744||1680 (RGB)|
|2K/24P||4:4:4 12-bit||2048 x 1080||10||1913||860 (DCI cinema)|
|2K/24P||4:4:4 10-bit||2048 x 1536||12||2304||1036 (cine production)|
|4K/30P||4:2:2 10-bit||3840 x 2160||20.7||4976||2240 (4K UHD)|
|4K/24P||4:4:4 12-bit||4096 x 2160||39.8||7650||3442 (DCI cinema)|
|4K/24P||4:4:4 10-bit||4096 x 3072||48||9216||4144 (cine production)|
|8K/30P||4:2:2 10-bit||7680 x 4320||82.9||19904||8960 (8K UHD)|