Past columns have touched on some examples of bad engineering that make modern recordings less than they could be. Some readers have requested a more in-depth look at some of the issues I presented in those columns. The most obvious starting point for this elucidation would be the use of compression in the recording process. Before we can fully understand what compression actually does, we need to take a look at the variables involved in recorded sound that are affected by compression.
Noise floor. The noise floor of a playback or recording medium is the level at which noise begins to intrude on the signal. Cassette tape, for example, has a relatively high noise floor: recorded sound below a certain level gets lost in the noise of the medium. To avoid this, compression of the signal may be used to keep the lowest sounds from getting lost in the noise. Digital audio has, for all intents and purposes, no noise floor of its own. The quietest sounds that can be recorded on a standard CD (16-bit, 44.1 kHz medium) are limited only by the actual noise of the electronics used to make the recording; the medium itself does not impose any noise floor on the signal.
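To put rough numbers on that, here is a back-of-envelope sketch (illustrative Python, not part of the original column): the theoretical dynamic range of an n-bit digital medium follows from the ratio between the largest representable amplitude and the smallest step.

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of an n-bit PCM medium, in decibels:
    20 * log10 of the ratio between the largest representable
    amplitude and the smallest step (2**bits to 1)."""
    return 20 * math.log10(2 ** bits)

print(round(dynamic_range_db(16), 1))  # 16-bit CD audio: about 96.3 dB
print(round(dynamic_range_db(8), 1))   # 8-bit audio: about 48.2 dB
```

Each added bit buys roughly 6 dB; real-world figures land a little lower once the noise of the recording chain's own electronics is counted, as noted above.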
Dynamic range. This is the difference between the loudest portion that can be (or has been) recorded and the quietest portion of that same recording, before the signal falls below the noise. Cassette tape, with its much higher noise floor, cannot capture as great a dynamic range as a CD. In fact, the dynamic range of the CD is roughly twice that of a cassette made without some form of noise reduction.
Peak level. This is simply the loudest portion that can be recorded on a medium before distortion takes over. How loud something sounds generally does not depend on this peak: a compressed signal with a lower peak can have a greater perceived volume.
Resolution. This is how fine the data points on the recorded medium are, and it is greatly affected by the medium's noise floor. A standard CD stores 44,100 samples of sound for every second of material, each sample taking one of 65,536 discrete values (16 bits). A standard analog tape or LP has, by comparison, an infinite number of 'samples' for every second. This is the distinguishing factor between an analog signal and a digital signal.
To help demonstrate how all these points come together to change the way a recording sounds, it is useful to have a visual analogy:
Think of a square divided in half. One side is white, the other is black. The white portion is silence and the black portion is the loudest signal that can be recorded. As drawn, this box represents perfect (for the sake of this discussion) dynamic range. In other words, the white is pure white and the black is pure black. No white is lost in any 'noise' (noise floor) (other shadings, paper bleed, etc.) and the black, without any distortion, is visually perfect (peak level). However, this box, while representing great dynamic range, also represents very poor resolution of the grayscale that occurs between white and black.

Now divide the box into 16 vertical pieces, with white on the leftmost section and black on the rightmost. The sections in the middle are filled with increasingly darker shades of gray as you move from white to black. This is a grayscale. If you try to go from white to black in only 16 discrete steps, you will find noticeable shifts in the darkness of the gray as you move from left to right. These discrete points represent better resolution than the abrupt shift from white to black, but they still make for a low-resolution picture. The highest resolution using discrete steps occurs when, without the use of magnification, your eye can no longer see the 'steps' between each shade of gray as you move from left to right. Perhaps 1,000 discrete steps would make this happen. By enlarging (making louder) this grayscale you may once again be able to see the discrete steps behind the illusion of a perfect scale. This is what a digital audio signal is like.
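The grayscale analogy can be put in numbers. The sketch below (illustrative Python, my own construction) quantizes a smooth sweep from white (0.0) to black (1.0): 16 levels produce visible 'steps', while a thousand-odd levels are effectively seamless.

```python
def quantize(value, steps):
    """Snap a continuous value in [0.0, 1.0] (white -> black)
    to the nearest of `steps` discrete levels."""
    level = round(value * (steps - 1))
    return level / (steps - 1)

# A smooth left-to-right sweep from white (0.0) to black (1.0) ...
sweep = [i / 1000 for i in range(1001)]
# ... rendered with only 16 shades shows abrupt shifts (banding):
shades_16 = sorted(set(quantize(v, 16) for v in sweep))
print(len(shades_16))     # only 16 distinct shades survive
# With about 1,000 levels the steps become too fine to notice:
shades_fine = sorted(set(quantize(v, 1001) for v in sweep))
print(len(shades_fine))
```

Enlarging the fine-grained version (in audio terms, playing it louder) is just quantizing a narrower slice of the sweep: eventually the discrete steps show through again.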
If you were to take a good felt-tip marker and, starting with almost no pressure at all and ending with maximum pressure, draw a three-foot line from left to right, you would have a good example of an analog signal representing the grayscale. With an analog scale it is never possible to see a 'step' or 'shift' from one shade to another, no matter how much it is enlarged, since there are no discrete steps. Upon great enlargement, however, the analog signal will begin to get lost in the noise of the medium on which it has been recorded; in this case the paper itself may start to have an impact on the scale.
If, in these examples, we were to start off with something grayer than white and finish with something slightly lighter than complete black, we would have a good example of compressed dynamic range. In other words, the grayscale now covers less range. For those of you into black and white movies, this is the reason why celluloid is preferred over digital video. There is greater variation of grayscale, more subtle shifts from white to black, and no banding effects (visible steps between shades) as the digital signal is compressed to fit more information in less space.
Dynamics in recorded sound are often broken down into macro and micro dynamics. Macro dynamics are the great shifts in volume level between loud and soft. Going back to our first example, the box divided into two black and white vertical sections has great macro-dynamic variation, but really lousy micro-dynamic variation. In fact, there is none at all. When people discuss micro dynamics they are really discussing resolution. Micro dynamics are important because a drum beat is neither completely silent (pardon the philosophical conundrum of a 'beat' which is silent) nor always at its fullest volume. There are micro-dynamic cues during the actual hitting of the skin that give the sound its character. Think of an electric keyboard that has no variation in note volume as you press the key. Middle C is either on or off, with no volume level in between. A piece played this way would have a great dynamic range but no resolution of micro dynamics, and much of the musical expressiveness of the piece would be completely lost. An acoustic piano offers a wide variety of volume for any given note depending on how hard you push the key. A good digital keyboard might allow for some sort of 'touch' playing by having volume samples that relate to the force with which you press the key. However, the resolution of those changes is still dictated by how many discrete steps were used to get from silence to full signal. Better keyboards have finer steps.
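The keyboard's 'touch' resolution can be sketched the same way. This is a hypothetical illustration in Python; the `velocity_steps` function and the step counts are my own, not taken from any actual instrument.

```python
def velocity_steps(force, steps):
    """Map a continuous key force in [0.0, 1.0] to one of `steps`
    discrete playback volumes. steps=2 is the on/off keyboard
    described above; more steps mean finer 'touch' resolution."""
    return round(force * (steps - 1)) / (steps - 1)

# A medium press and a slightly firmer one:
print(velocity_steps(0.55, 2), velocity_steps(0.6, 2))      # both collapse to full volume
print(velocity_steps(0.55, 128), velocity_steps(0.6, 128))  # now distinguishable
```

With only two steps the two touches are indistinguishable; with 128 steps the firmer press plays audibly louder, which is the micro-dynamic cue the on/off keyboard throws away.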
So how does all this relate to how music is recorded, why is it important, and, most importantly, why is compression used? Compression of the audio signal does two things that are inextricably linked. A signal compressor takes the audio signal and, at the macro level, 'squeezes' its dynamic range so that the difference between loud and soft is not so great; in doing so, it also reduces resolution at the micro level. In our visual representation of 16 steps from white to black, a compressor would change the scale so that it did not start with white and might not end up at full black. (In practice, engineers usually do end up with full black, because that is louder. They simply start further away from white, or, after the two extremes are brought closer together, they raise the entire new signal to the loudest level the medium can hold.) In audio this means that the quietest signals recorded have their levels raised while the loudest signals have their levels reduced a bit. Basically, it 'scrunches' the signal into a smaller space.
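The squeeze-then-raise process can be sketched as follows (again in illustrative Python; the threshold and ratio values are arbitrary, and a real compressor tracks a signal envelope with attack and release times rather than working sample by sample).

```python
def compress(samples, threshold=0.5, ratio=4.0):
    """Downward compression: any level above `threshold` is reduced
    by `ratio`, shrinking the distance between loud and soft."""
    out = []
    for s in samples:
        mag = abs(s)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio
        out.append(mag if s >= 0 else -mag)
    return out

def with_makeup_gain(samples, peak=1.0):
    """Raise the compressed signal so its loudest sample hits `peak` --
    the 'raise the entire new signal' step described above."""
    top = max(abs(s) for s in samples)
    return [s * peak / top for s in samples]

quiet, loud = 0.1, 1.0
squeezed = with_makeup_gain(compress([quiet, loud]))
print(squeezed)  # quiet sample rises to ~0.16, loud sample sits at 1.0
```

After makeup gain the quiet sample has come up while the loud one still sits at the peak: the distance between loud and soft has shrunk, which is exactly the 'scrunching' described above.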
Compression can serve several functions. For instance, compression is used in radio broadcasting to raise the level of the music above the noise floor of the broadcast medium, as well as to make radio listenable in places with high noise floors, such as cars. By compressing the overall signal and then raising the new, dynamically limited signal to its maximum level, an engineer can lift the quiet parts out of the noise floor of the recorded medium.
An extreme example of compression is called peak limiting. Let's say that you are recording a person reading at the level of a normal speaking voice with little variation in volume (limited dynamic range), but three-quarters of the way through the piece a rim shot occurs as a point of emphasis (an extreme dynamic change relative to the voice). This rim shot is very loud and, in order to contain its maximum uncompressed volume, it is necessary to record the speaking part at a lower level than you would if there were no rim shot. By peak limiting (compressing) the rim shot so that it is not so loud at its peak, you can raise the overall level of the recording without distorting the peaks. This has the added benefit of lifting the quietest parts of the recording out of the noise floor of the recording medium or the playback environment. If poorly done, though, this new compressed signal will no longer represent the intent of the performance, since the rim shot loses its desired effect: the difference (dynamic range) between its volume and that of the speaker has shrunk. Compression offers one further attraction: it makes the overall recording sound louder, because there is little variation in the dynamic range, and – as we all know – louder sells.
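The rim-shot scenario can be sketched in the same illustrative Python style (the 0.4 ceiling and the sample values are made-up numbers, and real limiters are considerably more sophisticated).

```python
def limit_and_normalize(samples, ceiling=0.4, peak=1.0):
    """Peak limiting taken to its extreme: clamp anything above
    `ceiling`, then raise the whole recording so its loudest
    sample hits `peak`."""
    limited = [max(-ceiling, min(ceiling, s)) for s in samples]
    top = max(abs(s) for s in limited)
    return [s * peak / top for s in limited]

# Speech at a steady 0.2, with one rim shot at full level:
take = [0.2, 0.2, 0.2, 1.0, 0.2]
louder = limit_and_normalize(take)
print(louder)
# The speech now sits at 0.5 instead of 0.2, but the rim shot is
# only 2x the speech level instead of 5x -- its impact is blunted.
```

The whole take gets louder, and the quiet parts clear the noise floor, but the emphasis the performer intended has been traded away.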
In real numbers, a CD has about 96 decibels (dB) of theoretical dynamic range (16 bits at roughly 6 dB per bit), and the best LPs had about 60 to 70 dB. Yet, due to the overuse of compression, the average CD made today has less dynamic range than the average LP of 30 or 40 years ago. I have heard CDs with less than 5 dB of dynamic range! That means the quietest portion (a very loose definition of the term quiet) is only 5 decibels quieter than the loudest portion. In our grayscale example, the loudest signal would be full black and the quietest signal would be better than 90% gray. So much for subtlety!
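To see just how little 5 dB is, it helps to convert level differences back into amplitude ratios. A quick check (illustrative Python; the formula is the standard decibel-to-amplitude conversion):

```python
def db_to_amplitude_ratio(db):
    """Amplitude ratio corresponding to a level difference in dB."""
    return 10 ** (db / 20)

# A 5 dB span: the 'quiet' parts are only ~1.78x softer than the peaks:
print(round(db_to_amplitude_ratio(5), 2))
# A full 16-bit span (96 dB) is a ratio of over 60,000 to 1:
print(round(db_to_amplitude_ratio(96)))
```

A less-than-2:1 spread between the softest and loudest moments of a recording leaves essentially no room for the micro dynamics discussed earlier.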
Compression can have the greatest overall impact on music reproduction, changing or creating the effect that a musical group desires. Compressors may also be the single greatest reason that recorded music does not sound live.
© Cadence Magazine 2001