Nicam Stereo - Overview
Modern sound systems are based on the use of digital technology. The first commercial digital sound medium was the compact disc (CD), jointly launched by Philips and Sony in 1982. Pulse-code modulated sound was first introduced in the video field with the Video-8 format. More recently, digital audio tape (DAT) and the soon-to-be-released digital compact cassette are frequently referred to.
Audio signals must start and end in analogue form, i.e. what goes into the microphone and comes out of the loudspeaker is a sound pressure wave. Conversion to digital form, by quantisation (sampling the audio signal at regular intervals and assigning a digital value to each sample), is done to overcome deficiencies in the intervening systems and equipment: digital signals are less affected by noise, distortion and so on. When reproduced, the quality of digital sound, and particularly its signal to noise ratio, largely depends on the number of quantisation levels used. The ultimate, as used in the CD and R-DAT (R standing for rotary head), is 16 bits, giving 65,536 values, each representing a signal amplitude level. A 16-bit signal would, however, require a wideband channel for transmission. As there is no room to accommodate such a signal in the broadcast TV bands, the Nicam system uses 10-bit words. Ordinarily these would provide a signal to noise ratio of just 60dB. Complex digital processing is used to raise the signal to noise ratio to the 84dB provided by a 14-bit system.
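These figures follow the familiar rule of thumb that each bit of resolution is worth roughly 6dB of signal to noise ratio. The short Python sketch below is purely illustrative (it is not part of any Nicam specification) and simply reproduces the numbers quoted above.

```python
# Rough rule of thumb: each extra bit of resolution buys about 6 dB of
# signal-to-noise ratio in an ideal linear quantiser (illustrative only).
def quantisation_levels(bits: int) -> int:
    return 2 ** bits

def approx_snr_db(bits: int) -> int:
    return 6 * bits

for bits in (10, 14, 16):
    print(f"{bits} bits -> {quantisation_levels(bits):,} levels, "
          f"roughly {approx_snr_db(bits)} dB SNR")
# 16 bits gives the 65,536 levels used by CD; 10 and 14 bits correspond to
# the 60 dB and 84 dB figures quoted above.
```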
The 32KHz sampling rate used by the Nicam system provides a frequency response of 0-15KHz in two channels for stereo. Separation between the two channels is total, so that dual-language sound tracks can be broadcast without crosstalk. The Nicam system can carry sound or data in one or both channels and is completely separate from and independent of the existing FM sound channel at 6MHz. Before it was adopted the Nicam system had to be tested under many different sets of conditions. It proved to be extremely rugged, providing high-quality stereo sound even when the signal is so weak that the picture can hardly be locked.
Technical Overview
Many of the processes and techniques that the Nicam system uses are common to those employed by the compact disc system. A block diagram of the processes is shown below-
The studio sound feeds for the left and right channels are first pre-emphasised and passed to an analogue-to-digital converter which produces 14-bit audio samples (words) at a sampling rate of 32KHz. These samples are fed to a compressor which reduces each word from 14 bits to 10, effectively keeping the ten most useful bits and discarding the rest. At this stage a parity bit is added to each word for error checking and correction at the receiver. In the following interleaver stage the bits are shuffled like a pack of cards. This is done to distribute the effects of propagation errors or interference when the original order is restored at the receiving end. Scrambling is then done by adding to the interleaved data an effectively random data stream, so that the sidebands of the transmitted Nicam signal have no strong, fixed frequency components that could cause interference.
Modulation comes next. For the UK system, the Nicam carrier is at 6.552MHz. It is phase modulated by the data pulses, using a DQPSK system. With digital signals phase modulation is economical of bandwidth, a factor that is important when trying to squeeze another signal into an already tightly packed 8MHz slot in the TV band. The Nicam carrier, at a level of -20dB relative to the peak vision carrier, is added to the 6MHz FM sound carrier fed to the transmitter, where it joins the vision carrier on its way to the transmitting aerial.
At the receiver the tuner converts the Nicam carrier to an IF of 32.95MHz. This is selected by a filter and applied to the first section of the Nicam decoder, the DQPSK demodulator. The Nicam signal emerges from this as a data stream. It then undergoes a series of processes that reverse those carried out at the broadcasting end. First comes a descrambler, which subtracts from the data stream the random data added at the transmitter to smooth out the sidebands. De-interleaving, in accordance with a simple codebook held in memory, then takes place. As a result the order of the original 10 + 1 bit words is restored. The words are next expanded to 14-bit form in accordance with a scale factor transmitted with the data. Error correction on the basis of the parity bits is then undertaken. At the end of this decoding process we have a real-time stream of 14-bit words containing information for both the left and right sound channels. It then remains to convert the data back to analogue form, using a D-A converter, then to filter the result to obtain the original sound signals. User controls are incorporated in the audio amplifiers, whose design is critical if the low distortion and high signal to noise ratio are to be maintained. The performance, rating and positioning of the loudspeakers are also crucial to the realisation of the full potential of Nicam sound.
System Detail
Pre-emphasis
No sound or vision transmission or recording system is complete without a pre-emphasis system, which gives a boost to the high-frequency components of the signal. During reproduction the balance is restored by reducing the amplitude of the higher frequencies: noise is reduced in proportion. The pre-emphasis/de-emphasis characteristics follow the CCITT J17 recommendations, with a boost of 6.5dB at 800Hz. The curves are shown below-
Sampling and Quantisation
The sampling rate for A-D conversion is 32KHz, which means that digitised samples of the original analogue sound waveform occur at 31.25uS intervals. To prevent aliasing (this occurs when lower sidebands of the sampling frequency overlap with the upper end of the audio frequency spectrum), and the distortion to which this gives rise, the upper frequency limit of the audio signal must be held at or below half the sampling rate, i.e. at or below 16KHz. For this purpose a filter with a very sharp cut-off at 15KHz is inserted in the L and R signal paths. L and R signal sampling is carried out simultaneously.
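The timing figures follow directly from the sampling rate. A trivial check in Python (illustrative only, not Nicam-specific code):

```python
sampling_rate_hz = 32_000                       # Nicam sampling rate
sample_interval_us = 1e6 / sampling_rate_hz     # time between samples
nyquist_limit_hz = sampling_rate_hz / 2         # highest representable frequency

print(f"Sample interval: {sample_interval_us} us")   # 31.25 us
print(f"Nyquist limit:   {nyquist_limit_hz} Hz")     # 16000 Hz; the audio is
                                                     # filtered at 15KHz to be safe
```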
Digital Compression
The word length is reduced from 14 bits to 10 to conserve bandwidth. The significance of each bit in a word depends very much on the sound level represented by that word. The least significant bit in a word can influence the final signal by only one part in around 16,000. While this may be important in a delicate, quiet passage, it matters not one jot when the sample represents a very loud sound or a rapid transition between two widely-different levels. Refer to the figure below, where the possible combinations that could go to make up a 14-bit word are laid out-
The shaded areas show the coding scheme. The most significant bit (MSB), the 14th (on the left), passes through the compression system unchanged. The 13th bit is discarded if it is the same as the 14th; the 12th bit is likewise discarded if it is the same as the 13th and 14th; and so on with the 11th and 10th bits. When this has been done we are left with words that have between 10 and 14 bits. If a particular word has more than 10 bits, a sufficient number of bits is removed at the least significant bit (LSB) end to reduce the word to 10 bits. The digital signal has thus been compressed, and as long as the decoder knows what the compressor is up to it can expand the signal back to 14-bit length at the receiver.
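A minimal sketch of this bit-reduction idea is given below in Python. It is illustrative rather than a bit-exact model of the broadcast coder: it treats each sample as a 14-bit two's-complement number, counts how many of the bits below the MSB merely repeat it, and keeps ten bits accordingly.

```python
def compress_14_to_10(sample: int) -> tuple[int, int]:
    """Illustrative Nicam-style reduction of one 14-bit two's-complement
    sample to 10 bits (not a bit-exact broadcast coder).
    Returns (10-bit word, number of MSB-side bits discarded)."""
    assert -8192 <= sample <= 8191
    bits = [(sample >> i) & 1 for i in range(13, -1, -1)]   # MSB first
    sign = bits[0]                          # the 14th bit, always kept
    # The 13th, 12th, 11th and 10th bits can be discarded while they simply
    # repeat the sign bit - they carry no information.
    discard = 0
    for b in bits[1:5]:
        if b != sign:
            break
        discard += 1
    # Keep the sign bit plus the next nine bits; anything beyond that is
    # dropped from the LSB end, as described above.
    kept = [sign] + bits[1 + discard : 10 + discard]
    word = 0
    for b in kept:
        word = (word << 1) | b
    return word, discard

print(compress_14_to_10(3))      # quiet sample: four sign-copies discarded, LSBs kept
print(compress_14_to_10(6000))   # loud sample: nothing discarded at the top, LSBs lost
```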
Protection and Scale Factor
It is necessary to add to each digital word some protection against the possibility of transmission/reception errors. For this purpose a parity bit is added to the end of each word. It is used in the decoder as a truth check on the word's six most significant bits. The system used is referred to as even parity. We now have a protected 11-bit word. We must next send the decoder a scale factor to indicate which bits have been deleted during the compression process. If the missing bits are on the left-hand side in the above diagram, the scale factor will enable them to be re-created; if they are on the right-hand side they are lost forever, but this is of no importance since the words they are part of are not crucial ones from the noise point of view.
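The parity arithmetic is simple. The sketch below is illustrative Python; the bit layout, with the parity bit as the least significant bit of the 11-bit word, is an assumption made for the example. It adds an even-parity bit over the six most significant data bits and shows the corresponding truth check at the receiver.

```python
def add_parity(word10: int) -> int:
    """Append an even-parity bit covering the six MSBs of a 10-bit word,
    giving the protected 11-bit word (parity in the LSB - assumed layout)."""
    six_msbs = (word10 >> 4) & 0x3F
    parity = bin(six_msbs).count("1") & 1      # even parity over the six MSBs
    return (word10 << 1) | parity

def parity_ok(word11: int) -> bool:
    """Receiver-side truth check on the word's six most significant bits."""
    data = word11 >> 1
    six_msbs = (data >> 4) & 0x3F
    return (bin(six_msbs).count("1") & 1) == (word11 & 1)

protected = add_parity(0b1011001110)
print(parity_ok(protected))                 # True - word arrived intact
print(parity_ok(protected ^ (1 << 10)))     # False - a protected MSB was corrupted
```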
Another look at the above diagram shows that 10-bit resolution is given to large signals (coding range 1), 11-bit resolution applies in coding range 2, 12-bit in range 3 and so on through to the 14-bit range 5, which represents the quietest and most vulnerable sounds. This 'progressive coding' system is very effective, giving bandwidth economy while losing little in comparison with a linear coding system. For scale factor signalling purposes the 11-bit (10 + 1 parity bit) words are grouped together in blocks of 32, each block lasting for 1msec. With each block a 3-bit code word is sent to indicate the scale factor. There is of course a degree of inaccuracy in this. Since only one scale-factor message is sent for 32 consecutive words, some words will not receive optimum expansion. In fact what the scale factor tells the decoder is the magnitude of the largest sample in each block. This approximation increases the quantisation noise in the final audio signal, but this noise always occurs where the busyness of the signal disguises or masks it. The scale factor data occurs sufficiently frequently for the fastest perceptible loudness transitions to be tracked - this is the key to subjective noise reduction. It is this 'running approximation' characteristic of the scale factor signalling that gives the system the first two words of its name, near instantaneous. The average error is in practice very small and is imperceptible to the listener.
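Choosing the scale factor therefore amounts to finding the largest magnitude in each block of 32 samples and mapping it onto a coding range. The sketch below is illustrative Python; the threshold values follow from the bit-significance argument above rather than from the published specification.

```python
def coding_range(block: list[int]) -> int:
    """Pick a coding range (1 = loudest ... 5 = quietest) for a block of 32
    samples from the largest magnitude present (illustrative thresholds)."""
    assert len(block) == 32
    peak = max(abs(s) for s in block)
    if peak >= 4096:
        return 1        # all 13 magnitude bits needed - 10-bit resolution
    if peak >= 2048:
        return 2
    if peak >= 1024:
        return 3
    if peak >= 512:
        return 4
    return 5            # quietest sounds - full 14-bit resolution retained

quiet = [7, -3, 12, 0] * 8
loud = [5000, -4100, 3000, 100] * 8
print(coding_range(quiet), coding_range(loud))   # 5 1
```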
So far we have discussed two of the 'fiddle-factors' involved in Nicam coding, bit reduction for compression and the 1-in-32 scale factor signalling. Now we come to the biggest fiddle-factor, the way in which the scale factor is sent. In effect it is included for nothing, using the parity bits in the data words. The scale factor is sent by modifying the parity bits in accordance with a complex table that is held in memory in the transmitter and receiver. The receiver's decoder extracts the 3-bit scale factor words by using what's called majority-decision logic - this process also restores the original parity pattern. The scale factor code words are shown to the right in the above diagram. There are five different compression patterns, each being assigned a range figure from 1 to 5. The table below shows how the three-bit code words convey scale factor and protection data.
Scale Factor Signalling

  Coding range   Protection range   Scale factor code
       1                1                 111
       2                2                 110
       3                3                 101
       4                4                 011
       5                5                 100
       5                6                 010
       5                7                 001
       5                7                 000
The three scale factor bits can give eight combinations. As shown, some of the 'spares' are used to signal to the decoder the amount of error-protection required by each sample. This can be used in the receiver to give extra protection to the most significant bits of the samples.
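The way the scale factor rides on the parity bits can be modelled as follows. This is an illustrative Python sketch, not the broadcast algorithm: the grouping of nine words per scale-factor bit is an assumption made for the example. The coder inverts the parity bit of every word in a group when the corresponding scale-factor bit is 1; the decoder recomputes each word's parity and takes a majority decision, which survives a few corrupted words and also tells it how to restore the original parity pattern.

```python
import random

GROUP = 9   # assumed number of 11-bit words carrying each scale-factor bit

def data_parity(word11: int) -> int:
    """Even parity over the six most significant data bits (parity bit = LSB)."""
    return bin((word11 >> 5) & 0x3F).count("1") & 1

def signal_scale_bit(words: list[int], scale_bit: int) -> list[int]:
    """Coder: invert each word's parity bit when the scale-factor bit is 1."""
    return [w ^ scale_bit for w in words]

def recover_scale_bit(words: list[int]) -> int:
    """Decoder: recompute parity and take a majority decision over the group."""
    votes = sum((w & 1) ^ data_parity(w) for w in words)
    return 1 if votes > len(words) // 2 else 0

# Demo: nine well-formed words carrying scale-factor bit 1, two corrupted in transit.
words = [random.getrandbits(10) << 1 for _ in range(GROUP)]
words = [w | data_parity(w) for w in words]       # attach correct even parity
sent = signal_scale_bit(words, 1)
sent[2] ^= 1 << 7                                 # corrupt a protected data bit
sent[6] ^= 1                                      # corrupt a parity bit
print(recover_scale_bit(sent))                    # still 1, thanks to the majority vote
```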
Interleaving
The parity system gives sufficient protection to the data words. But impulsive interference and similar corruptive influences could blow large holes in the data stream, so further protective measures are required. The degree of protection called for depends on the likelihood and probable pattern of data corruption. Nicam data, coming off air, does not require the massive error protection used with the CD, V8-PCM and DAT systems. Far from having the intricacies of CIRC, the interleaving system used with the Nicam system is simple. It is achieved by writing the data into memory and then reading it out non-sequentially, as shown in the diagram below-
The readout order, which is held in a ROM address-sequencer, ensures that bits which were originally adjacent are separated by at least fifteen other bits. An error burst in the transmission path can corrupt several consecutive bits in the broadcast stream. But as the error is distributed amongst several words the damage to each is minor. It can be repaired by parity correction and/or error concealment as necessary, techniques that are universally used with PCM sound systems.
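A simple way to picture the interleaver is as a read-out with a fixed stride across the 704 sound bits, so that bits transmitted next to each other were originally at least sixteen positions apart. The sketch below is illustrative Python; the stride-of-16 read-out is an assumption chosen to give the separation quoted above, not a copy of the broadcast codebook.

```python
STRIDE = 16          # assumed spread; gives the separation quoted in the text
BLOCK = 704          # sound bits per frame

def interleave(bits: list[int]) -> list[int]:
    """Read the block out with a fixed stride: bits 0, 16, 32, ..., 1, 17, 33, ..."""
    assert len(bits) == BLOCK
    order = [col + row * STRIDE
             for col in range(STRIDE)
             for row in range(BLOCK // STRIDE)]
    return [bits[i] for i in order]

def deinterleave(bits: list[int]) -> list[int]:
    """Restore the original order at the receiver (the 'codebook' held in ROM)."""
    order = [col + row * STRIDE
             for col in range(STRIDE)
             for row in range(BLOCK // STRIDE)]
    out = [0] * BLOCK
    for tx_pos, original_pos in enumerate(order):
        out[original_pos] = bits[tx_pos]
    return out

frame = list(range(BLOCK))                 # use indices as stand-in 'bits'
assert deinterleave(interleave(frame)) == frame
# A burst hitting, say, ten consecutive transmitted bits lands on bits that were
# originally at least 16 apart, so each 11-bit word suffers at most one error.
```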
Housekeeping
The digital signal now has enough protection to enable it to withstand the rigours of transmission. Housekeeping data has to be added next so that the sections of data can be identified and synchronised and the decoding process can be controlled. The diagram below shows the composition of a broadcast data frame.
The transmitted data is arranged in 728-bit frames. The housekeeping data comes first, rather like the field sync pulse in the video waveform. As the duration of a frame is 1msec, the data rate is 728Kbits/sec. There are no gaps between frames. The frame alignment word (FAW) synchronises and sets up the decoding process in the receiver. It consists of eight bits and is always 01001110. Next come the five bits C0 to C4, which are used for decoder control and switching. Their functions are as follows. The frame flag bit C0 is set to 1 for eight successive frames and to 0 for the next eight frames. This continuously repeated pattern defines a 16-frame sequence and is used to synchronise changes in the type of information being carried. Bits C1, C2 and C3 provide application control information according to the following table-

  C1 C2 C3   Contents of the 704-bit sound/data block
  0  0  0    Stereo signal with alternate L and R samples
  0  1  0    Two independent mono signals in alternate frames, e.g. dual-language sound
  1  0  0    One mono sound signal and one data channel sent in alternate frames
  1  1  0    One data channel

Note that C3 remains at 0 throughout these options. The remaining bit, C4, is set to 1 when the Nicam channel carries the same sound as the main FM sound carrier and to 0 otherwise. This permits fall-back switching should the Nicam carrier fail, or shut-down of the Nicam sound path when a data transmission is in use. The next data block, AD0 to AD10, is at present unused - it is reserved for 'additional data' and it remains to be seen what form this will take.
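The frame layout is easy to express in code. The sketch below is illustrative Python (the function and argument names are mine); it simply concatenates the frame alignment word, the five control bits, the eleven additional-data bits and the 704 sound/data bits, and checks that the totals quoted above add up.

```python
FAW = [0, 1, 0, 0, 1, 1, 1, 0]        # frame alignment word, always 01001110

def build_frame(control: list[int], additional: list[int],
                sound: list[int]) -> list[int]:
    """Assemble one 728-bit Nicam frame (illustrative layout per the text)."""
    assert len(control) == 5          # C0..C4
    assert len(additional) == 11      # AD0..AD10, currently unused
    assert len(sound) == 704          # 64 x 11-bit sound/data words
    frame = FAW + control + additional + sound
    assert len(frame) == 728          # 728 bits per 1 ms frame -> 728 kbit/s
    return frame

frame = build_frame([1, 0, 0, 0, 1], [0] * 11, [0] * 704)
print(len(frame), "bits per frame,", len(frame) * 1000, "bits per second")
```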
The Sound/Data Block
The initial section of the frame accounts for 24 bits. The remaining 704 convey stereo/dual-channel sound data. There are sixty-four 11-bit words, the A channel (stereo left) and B channel (stereo right) samples being sent alternately throughout the period, 32 of each. Since two sound channels, plus the initial data, are being transmitted as a single serial data stream, the bit rate of each channel is approximately doubled. This form of time-compression is a common technique known as time-division multiplex (TDM). It is easy to put into effect by writing the data into a RAM in real time and reading it out at suitable intervals. TDM is used in many data storage and transmission systems to match signal density to channel bandwidth. The order of A and B samples shown in the above diagram represents the sequence before bit interleaving. Only the 704 bits of sound data are interleaved. With monaural sound the data block is arranged slightly differently, according to the diagram below-
In the case of monaural transmission, two 32-word blocks (n and n+1) are placed end-to-end in a single frame, carrying the complete mono signal. Since the data rate is unchanged, the next frame is available for other forms of data if required. Odd-numbered frames carry the mono sound data and even-numbered ones the 'information' data. To indicate this condition control bits C1-C3 change to 100, switching the decoder to mono mode. If two independent mono signals M1 and M2 (e.g. dual-language sound) are being transmitted, M1 is carried in the odd-numbered frames as just described while the M2 signal is carried in even-numbered frames. The control bits C1-C3 change to 010 to switch the decoder sound routing. Data information can take many forms; computer programs or software are perhaps the most likely. If conditional or restricted access is required by the broadcaster or his agent, perhaps the spare bits C3 and AD0-AD10 will be used to provide this feature. No data format for this purpose has yet been published.
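Summing up the last two sections in code form, the sketch below (illustrative Python) shows how the 64 words of the sound/data block are arranged: alternating A and B words for stereo, and two 32-word blocks end-to-end for mono.

```python
def stereo_sound_block(a_words: list[int], b_words: list[int]) -> list[int]:
    """Stereo: 32 A-channel and 32 B-channel 11-bit words sent alternately."""
    assert len(a_words) == len(b_words) == 32
    block = []
    for a, b in zip(a_words, b_words):
        block.extend([a, b])
    return block                       # 64 words = 704 bits, before interleaving

def mono_sound_block(block_n: list[int], block_n1: list[int]) -> list[int]:
    """Mono: two consecutive 32-word blocks placed end-to-end in one frame;
    the alternate frames are then free for 'information' data."""
    assert len(block_n) == len(block_n1) == 32
    return block_n + block_n1

print(len(stereo_sound_block([0] * 32, [0] * 32)))   # 64 words per frame
```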
Scrambling
Now that we have considered the composition of the Nicam data stream in detail it should be obvious that it contains certain fixed patterns. Since the data stream modulates a carrier whose sideband structure depends upon the characteristics of the modulation, the energy in the sidebands can be evened out by scrambling the data in a random fashion. This is a form of energy dispersal. Totally random scrambling of the bits is not possible, because there would be no way of descrambling them at the receiver. What is used instead is a pseudo-random sequence generator (PRSG), a simple circuit that generates binary digits in what appears to be a random way but whose output is predictable and repeatable. It consists of a clocked shift-register with one or more taps that feed back to its input. The figure below shows the one used for the Nicam system. It consists of nine stages, with a tap between the fifth and sixth feeding one input of an exclusive-or gate whose other input comes from the final stage. If you work it out, the sequence starts 0000 0111 1011 etc. Once it gets going the output sequence lasts for 511 (2^9 - 1) bits and thus runs through almost one and a half times during the frame period. One of these PRSGs is used at each end of the system, i.e. at the transmitter and the receiver. Each is reset to 111111111 at the end of the last bit of the frame alignment word, which is not to be scrambled, so that the PRSGs run in synchronism throughout the rest of the frame. The PRSG sequence is added 'modulo-two' to the data bit stream on its way to the modulator - modulo-two addition is the action of an exclusive-or gate.
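The PRSG is easy to model in software. The sketch below (illustrative Python) implements the nine-stage register described above, reset to all ones, with the output of the fifth stage exclusive-ORed with that of the ninth and fed back to the input. The first twelve bits it produces match the 0000 0111 1011 sequence quoted above, and applying the modulo-two addition twice restores the original data.

```python
def prsg_bits(count: int):
    """Nine-stage pseudo-random sequence generator as described in the text:
    reset to all ones, output = stage 5 XOR stage 9, fed back to the input."""
    reg = [1] * 9                      # reset state after the FAW
    for _ in range(count):
        out = reg[4] ^ reg[8]          # tap after stage 5, XOR with stage 9
        yield out
        reg = [out] + reg[:-1]         # shift along, feedback into stage 1

def scramble(data_bits: list[int]) -> list[int]:
    """Modulo-two addition of the PRSG sequence - the action of an XOR gate.
    Applying it twice, with the PRSG resynchronised, restores the original data."""
    return [d ^ p for d, p in zip(data_bits, prsg_bits(len(data_bits)))]

first12 = list(prsg_bits(12))
print("".join(map(str, first12)))               # 000001111011
payload = [1, 0, 1, 1, 0, 0, 1, 0]
assert scramble(scramble(payload)) == payload   # descrambling = re-scrambling
```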
Modulation and Transmission
There are various ways of modulating a carrier wave with information. The best known are amplitude modulation, as used for terrestrial vision broadcasting, radio etc., and frequency modulation, as used for satellite vision, TV sound and FM radio. A digital signal has only two states, 1 and 0. In terms of bandwidth the most economical method of modulation for use with such a signal is phase modulation: the carrier frequency remains constant but its phase is altered by the modulating data. The simplest phase modulation system is a two-phase one, with, for example, 1s setting the phase to 0 degrees and 0s setting the phase to 180 degrees. This is easy to detect at the receiver. Bandwidth saving is possible by using four-phase modulation, in which the carrier has four possible states - 0, 90, 180 and 270 degrees. Even more complex phase-shift keying systems have been devised, with eight phases, but the smaller the phase shift the more difficult reliable detection becomes. The quadrature phase-shift keying system used for Nicam has four carrier phase states. At the transmitter a serial to two-bit parallel converter changes the serial data into a series of two-bit pairs. These pairs can take one of four forms - 00, 01, 10 or 11. Each of these moves the phase by a different amount, as shown below-

  Input data   Phase change of Nicam carrier
      00             0 degrees
      01           -90 degrees
      10          -270 degrees
      11          -180 degrees

Thus the carrier has four possible states and is moved from one to another depending upon the composition of each two-bit pair. Note that with a 00 bit pair the phase change is 0 degrees, i.e. the carrier stays at its rest phase.
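A minimal model of this differential mapping is sketched below (illustrative Python): the serial data is taken two bits at a time, each pair selects a phase change from the table above, and the changes accumulate to give the carrier's phase state.

```python
# Phase change, in degrees, for each two-bit pair (from the table above).
PHASE_CHANGE = {(0, 0): 0, (0, 1): -90, (1, 0): -270, (1, 1): -180}

def dqpsk_phases(bits: list[int], start_phase: int = 0) -> list[int]:
    """Differential QPSK: each bit pair shifts the carrier phase by a fixed
    amount relative to its previous value (illustrative model)."""
    assert len(bits) % 2 == 0
    phase = start_phase
    phases = []
    for i in range(0, len(bits), 2):
        pair = (bits[i], bits[i + 1])
        phase = (phase + PHASE_CHANGE[pair]) % 360
        phases.append(phase)
    return phases

print(dqpsk_phases([0, 0, 0, 1, 1, 1, 1, 0]))   # [0, 270, 90, 180]
```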
Since each phase change occurs at two-bit intervals instead of once per bit, and the maximum shift is half a cycle of carrier (either way, in effect), the use of DQPSK greatly reduces the sideband spread. The sudden phase jumps envisaged when changing states can be smoothed out to reduce the sidebands further. With the data sent as two-bit pairs at a rate of 364K pairs per second and a carrier frequency of 6.552MHz, there's a minimum of eighteen carrier cycles between each phase change. By feeding the data to the DQPSK modulator via a low-pass filter we can change it from a square wave to a triangular wave. This has the effect of 'smearing out' the carrier phase changes in time so that they are more evenly distributed along the eighteen-cycle periods available.
With the broadcast System I used in the UK the normal FM sound carrier is at 6MHz above the vision carrier at a relative level of -7dB. For use with Nicam transmissions it is reduced to -10dB. The Nicam carrier is at 6.552MHz (exactly nine times the bit rate) and is transmitted at a level of -20dB with respect to the vision carrier. It is important to appreciate that these levels are quoted as voltages, so that the power ratio of the peak vision carrier to the Nicam carrier is 100:1. The diagram below shows the spectrum of a System I TV channel with Nicam sound.
At a main transmitter the Nicam carrier is fed to the klystron power amplifier used for the FM sound signal. For optimum efficiency this normally operates in a very non-linear mode and is very sharply tuned to the relevant channel frequency. For correct operation with Nicam signals its conditions have to be re-adjusted for more linear operation and for greater bandwidth to allow for the extra phase-modulated carrier. A great deal of trouble is taken to keep the Nicam carrier's sidebands low and quiet, so that there is no interference to the other carriers in the channel, to the adjacent sound and vision channels or, under co-channel reception conditions, to French System L transmissions, which have an amplitude-modulated sound carrier at 6.5MHz.
The Nicam system has been adopted by some European countries and many others have expressed interest. For the Continental System B/G, with the FM sound signal at 5.5MHz, the Nicam carrier is set at 5.85MHz. This is rather close to the FM sound carrier, but can be accommodated with careful modification of the data-shaping sideband filter. The Nicam signal bandwidth is reduced from 700 to 500KHz with System B/G. The Nicam format is compatible with the sound and data transmission arrangements used with the MAC/packet system that has been adopted for BSB satellite TV transmissions, if it ever gets off the ground again. Thus the chip sets developed for one can be used for the other, which is useful for both set designers and chip manufacturers. These chip sets have to sort out the rather complex Nicam transmissions at minimal cost, which is no mean task, and hence the chip sets are still relatively expensive, at around £30 to £40 per set.