What is Audio Compression?
Audio compression, in the sense used here, is usually known as dynamic range compression or dynamic audio compression. A compressor is a type of amplifier used in all professional recordings today and most of those from the last 40 years. Compression evens out the dynamic range, that is, the span between the loudest and softest parts of a recording. For example, compression is used to smooth out a vocal track that swings from very loud to incomprehensibly soft. Think of it as raising the soft signals and reducing the loud signals to even out the overall volume.
Compression also keeps an instrument’s level within the range of your recording equipment, enabling you to record a clearer, cleaner sound. Technically speaking, if instruments become too loud or too soft during a recording, the sound levels can swing above or below the range your equipment can capture. In recordings without compression, those extremes become muddied or distorted because the level is either too strong or too weak for the equipment’s range. It’s similar to how human ears have a smaller range of high and low sounds than many animals.
Compressors and limiters are specialized amplifiers used to reduce dynamic range, the span between the softest and loudest sounds. Compressors can make recordings and live mixes sound more polished by controlling maximum levels and maintaining higher average loudness. Additionally, many compressors, both hardware and software, have a signature sound that can be used to inject coloration and tone into otherwise lifeless tracks. Compression can also be used to subtly shape a track so that it sounds more natural and intelligible without adding distortion, resulting in a song that’s more “comfortable” to listen to. On the other hand, over-compressing your music can squeeze the life out of it. For those who are unfamiliar with compressors, a good grasp of the basics will go a long way toward understanding how compression works and using it confidently to your advantage.
Common Compressor Controls & Parameters
Whichever compressor you’re using, and whether it’s a hardware unit or a plug-in, there are some common parameters and controls that dictate the behavior of the compression effect. Below are some of the basic elements of compression. Your compressor may or may not include all of them, but understanding what each one does will allow you to work comfortably with a wide range of compressors.
The threshold control sets the level at which the compression effect is engaged. A signal is compressed only when its level passes above the threshold. If the threshold is set at, say, -10 dB, only signal peaks that extend above that level will be compressed. The rest of the time, no compression takes place.
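The threshold test can be sketched in a few lines. This is only an illustration: it assumes samples normalized so that full scale is 1.0 and levels measured in dB relative to full scale, and the function names are hypothetical rather than taken from any particular compressor.

```python
import math

def level_db(sample, floor_db=-120.0):
    """Convert a linear sample (assumed full scale = 1.0) to a level in dB."""
    magnitude = abs(sample)
    if magnitude == 0.0:
        return floor_db  # avoid log10(0) for digital silence
    return max(floor_db, 20.0 * math.log10(magnitude))

def above_threshold(samples, threshold_db=-10.0):
    """Flag each sample whose level exceeds the threshold (hypothetical helper)."""
    return [level_db(s) > threshold_db for s in samples]

# -10 dB corresponds to a linear magnitude of about 0.316.
samples = [0.05, 0.2, 0.5, 0.9, -0.7, 0.1]
print(above_threshold(samples))
```

Only the flagged samples would be candidates for gain reduction; everything at or below the threshold passes through unchanged.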
The “knee” refers to how the compressor transitions between the non-compressed and compressed states of an audio signal running through it. Compressors typically offer either a “soft knee” or a “hard knee” setting, and some provide a switch to choose between the two. Some compressors even let you select any position between the two types of knee. As you can see in the diagram, a “soft knee” allows for a smoother and more gradual onset of compression than a “hard knee.”
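One way to picture the difference between the two knees is a static curve that maps input level to output level. The sketch below uses a quadratic blend inside the knee region, which is a commonly used digital formulation and not necessarily what any given hardware unit does. The `knee_width_db` parameter and the function name are assumptions made for illustration.

```python
def soft_knee_db(input_db, threshold_db, ratio, knee_width_db):
    """Output level in dB for a given input level in dB.

    knee_width_db = 0 gives a hard knee; larger values widen the
    transition region centred on the threshold (assumed formulation).
    """
    over = input_db - threshold_db
    if knee_width_db <= 0 or 2 * abs(over) > knee_width_db:
        # Outside the knee: plain hard-knee behaviour.
        if over <= 0:
            return input_db
        return threshold_db + over / ratio
    # Inside the knee: blend gradually from 1:1 toward the full ratio.
    return input_db + (1.0 / ratio - 1.0) * (over + knee_width_db / 2) ** 2 / (2 * knee_width_db)

# Compare hard and soft knees for inputs around a -10 dB threshold at 4:1.
for level in (-13.0, -10.0, -7.0):
    hard = soft_knee_db(level, -10.0, 4.0, 0.0)
    soft = soft_knee_db(level, -10.0, 4.0, 6.0)
    print(level, hard, soft)
```

With a 6 dB knee, the soft curve already applies a little gain reduction exactly at the threshold and meets the hard-knee curve again at the edges of the knee region.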
Attack time is the time it takes for the signal to become fully compressed after exceeding the threshold level. Faster attack times are usually between 20 and 800 µs (microseconds) depending on the type and brand of unit, while slower times generally range from 10 to 100 ms (milliseconds). Some compressors express this as a slope in dB per second rather than as a time. Fast attack times may create distortion by modifying inherently slow-moving low-frequency waveforms. For example, a cycle at 100 Hz lasts 10 ms, so a 1 ms attack is short enough to alter the waveform within a single cycle, which generates distortion.
Release time is the opposite of attack time. More specifically, it is the time it takes for the signal to go from the compressed (attenuated) state back to the original non-compressed signal. Release times are considerably longer than attack times, generally ranging anywhere from 40-60 ms to 2-5 seconds, depending on which unit you’re working with. These can also be specified as slopes in dB per second instead of times. The usual practice is to set the release time as short as possible without producing a “pumping” effect, which is caused by cyclic activation and deactivation of compression. For example, if the release time is set too short and the compressor is cycling between active and non-active, your dominant signal (usually the bass guitar and bass drum) will also modulate your noise floor, resulting in a distinct “breathing” effect.
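Attack and release can be sketched as a smoother applied to the gain reduction the compressor wants at each instant. The example below is a minimal illustration using a one-pole filter with separate attack and release coefficients; the exponential time-constant convention, the function name, and the sample values are assumptions, and real units define their times in different ways.

```python
import math

def smooth_gain_reduction(target_gr_db, sample_rate, attack_ms, release_ms):
    """Smooth per-sample target gain reduction (dB, >= 0).

    Uses the attack coefficient while gain reduction is increasing and
    the release coefficient while it is falling (assumed convention).
    """
    attack_coeff = math.exp(-1.0 / (attack_ms * 0.001 * sample_rate))
    release_coeff = math.exp(-1.0 / (release_ms * 0.001 * sample_rate))
    smoothed = []
    current = 0.0
    for target in target_gr_db:
        coeff = attack_coeff if target > current else release_coeff
        current = coeff * current + (1.0 - coeff) * target
        smoothed.append(current)
    return smoothed

# A burst that calls for 6 dB of reduction, followed by a quiet passage.
target = [6.0] * 100 + [0.0] * 100
curve = smooth_gain_reduction(target, sample_rate=1000, attack_ms=5.0, release_ms=50.0)
print(round(curve[99], 3), round(curve[-1], 3))
```

With the short attack, the gain reduction reaches nearly the full 6 dB during the burst, while the longer release lets it decay gradually afterward instead of snapping back.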
The ratio is often misunderstood, but it simply specifies how much attenuation is applied to the part of the signal that exceeds the threshold. You will find a wide range of ratios available depending on the type and manufacturer of the compressor you are using. A ratio of 1:1 (one to one) is the lowest and represents “unity gain”, or in other words, no attenuation. Compression ratios are expressed in decibels above the threshold: a ratio of 2:1 indicates that a signal exceeding the threshold by 2 dB will be attenuated down to 1 dB above the threshold, and a signal exceeding the threshold by 8 dB will be attenuated down to 4 dB above it. A ratio of around 3:1 can be considered moderate compression, 5:1 medium compression, and 8:1 starts getting into strong compression. Ratios from 20:1 (twenty to one) through ∞:1 (infinity to one) are considered “limiting” by most and can be used to ensure that a signal does not exceed the threshold level. The diagram below shows compression ratios as they relate to the input and output signals and illustrates how setting your compression ratio will affect the overall signal.
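The ratio arithmetic above can be checked with a short hard-knee calculation. This is a sketch with a hypothetical function name, working purely in dB and ignoring attack, release, and knee.

```python
def compress_db(input_db, threshold_db, ratio):
    """Hard-knee static curve: output level (dB) for an input level (dB)."""
    if input_db <= threshold_db:
        return input_db  # below the threshold nothing changes
    # Above the threshold, the overshoot is divided by the ratio.
    return threshold_db + (input_db - threshold_db) / ratio

threshold = -10.0
# 2 dB over the threshold at 2:1 comes out 1 dB over.
print(compress_db(threshold + 2.0, threshold, 2.0))
# 8 dB over the threshold at 2:1 comes out 4 dB over.
print(compress_db(threshold + 8.0, threshold, 2.0))
# An infinite ratio holds the output at the threshold (limiting).
print(compress_db(0.0, threshold, float("inf")))
```

A ratio of 1 returns the input unchanged, which matches the “unity gain” case.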
Although compression is generally perceived to make a signal louder, the attenuation it applies actually lowers the output level. This is where “output gain”, or “make-up gain”, comes into play. You can use the output gain to “make up” for the attenuation done by the compressor. Some compressors provide meters that can be put into “GR” or “gain reduction” mode to visually indicate the total attenuation in dB, allowing you to apply the correct amount of output gain accurately.
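Make-up gain is a fixed boost applied after compression. As a rough sketch, assuming linear samples and a gain value read off a gain-reduction meter, the conversion from dB to a linear multiplier looks like this (the function names are hypothetical):

```python
def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)

def apply_makeup_gain(samples, makeup_db):
    """Scale already-compressed samples by a fixed make-up gain."""
    factor = db_to_linear(makeup_db)
    return [s * factor for s in samples]

# Assumed scenario: the meter shows about 6 dB of gain reduction on peaks,
# so 6 dB of make-up gain roughly restores the original peak level.
compressed = [0.1, -0.25, 0.3]
print(apply_makeup_gain(compressed, 6.0))
```

Matching the make-up gain to the peak gain reduction is only one possible choice; in practice the amount is usually set by ear or by comparing levels with the compressor bypassed.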
Common Compression Types
The type of compressor you choose will also play a large role in the overall sound of the effect. Some compressor types have faster “attack” and “release” times than others, and some have more “coloration” or “vintage” vibe because of their internal components. The four best-known compressor types are listed below, with a brief description of how they differ.
1. Tube Compression
Probably the oldest type of compression is tube compression. Tube compressors tend to have a slower response (slower attack and release) than other forms of compression. Because of this, tube compressors exhibit a distinct coloration or “vintage” sound that is nearly impossible to achieve with other compressor types.
2. Optical Compression
Optical compressors affect the dynamics of an audio signal via a light element and an optical cell. As the amplitude of an audio signal increases, the light element emits more light, which causes the optical cell to attenuate the amplitude of the output signal. (Ex. LA-2A Classic Leveling Amplifier, which also uses tubes for its make-up gain)
3. FET Compression
FET or “Field Effect Transistor” compressors emulate the tube sound with transistor circuits. They are fast, clean, and reliable. (Ex. 1176LN Classic Limiting Amplifier)
4. VCA Compression
VCA or “Voltage Controlled Amplifier” compressors use solid-state or integrated circuits. They are usually cheaper than tube or optical compressors. VCAs also tend to have less “coloration” than optical or tube compressors, somewhat like the comparison between digital and analog tape in recording. (Ex. dbx 160 Compressor/Limiter)
Simple Audio Compression Methods
Traditional lossless compression methods (Huffman, LZW, etc.) usually don’t work well on audio, for the same reason they don’t work well on images: sampled data rarely contains the exact repeated patterns these methods rely on.
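A quick experiment can illustrate this. The sketch below synthesizes a noisy 440 Hz tone as 16-bit PCM and runs it through zlib (DEFLATE, an LZ77 and Huffman scheme, standing in here for the general-purpose methods mentioned above). The signal parameters are arbitrary assumptions, and a synthetic tone is only a rough proxy for real recordings.

```python
import math
import random
import struct
import zlib

def synth_pcm16(num_samples=20000, sample_rate=44100, seed=0):
    """Build 16-bit little-endian PCM for a 440 Hz tone with added noise."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    frames = []
    for n in range(num_samples):
        tone = 10000 * math.sin(2 * math.pi * 440 * n / sample_rate)
        noise = rng.uniform(-200, 200)  # low-level noise, as in real recordings
        frames.append(int(tone + noise))
    return struct.pack("<%dh" % num_samples, *frames)

pcm = synth_pcm16()
compressed = zlib.compress(pcm, 9)
audio_ratio = len(compressed) / len(pcm)
# A ratio close to 1.0 means the general-purpose compressor saved very little.
print(round(audio_ratio, 2))
```

The noise in the low-order bits leaves few exact byte repetitions for the compressor to exploit, which is why audio-specific methods are used instead.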
The following are some of the lossy methods applied to audio compression:
- Silence Compression – detect the “silence”, similar to run-length coding
- Adaptive Differential Pulse Code Modulation (ADPCM)
- Linear Predictive Coding (LPC) – fits the signal to a speech model and then transmits the parameters of the model. Sounds like a computer talking; 2.4 kbits/sec.
- Code-Excited Linear Prediction (CELP) – does LPC but also transmits an error term; audio-conferencing quality at 4.8 kbits/sec.
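Of the methods listed, silence compression is the simplest to sketch. The example below replaces runs of near-silent samples with a run-length marker. The tuple-based encoding and the `silence_level` cutoff are assumptions chosen for illustration, not a standard format.

```python
def encode_silence(samples, silence_level=2):
    """Replace runs of near-silent samples with ("silence", run_length)."""
    encoded = []
    run = 0
    for s in samples:
        if abs(s) <= silence_level:
            run += 1  # extend the current silent run
        else:
            if run:
                encoded.append(("silence", run))
                run = 0
            encoded.append(("sample", s))
    if run:
        encoded.append(("silence", run))  # flush a trailing silent run
    return encoded

def decode_silence(encoded):
    """Rebuild the signal; silent runs come back as exact zeros."""
    decoded = []
    for kind, value in encoded:
        if kind == "silence":
            decoded.extend([0] * value)
        else:
            decoded.append(value)
    return decoded

signal = [0, 1, -1, 0, 120, -90, 0, 0, 2, 300]
packed = encode_silence(signal)
print(packed)
print(decode_silence(packed))
```

The scheme is lossy because low-level samples inside a silent run are decoded as zeros rather than as their original values.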
It is important to distinguish between the audio coding format, the container that holds the encoded audio data, and the audio codec. A codec performs the encoding and decoding of the raw audio data, while the encoded data is (usually) stored in a container file. Although most audio file formats support only one type of audio coding data (created with an audio coder), a multimedia container format (such as Matroska or AVI) may support multiple types of audio and video data.
There are three major groups of audio file formats:
- Uncompressed audio formats, such as WAV, AIFF, AU or raw header-less PCM.
- Formats with lossless compression, such as FLAC, Monkey’s Audio (filename extension .ape), WavPack (filename extension .wv), TTA, ATRAC Advanced Lossless, ALAC (filename extension .m4a), MPEG-4 SLS, MPEG-4 ALS, MPEG-4 DST, Windows Media Audio Lossless (WMA Lossless), and Shorten (SHN).
- Formats with lossy compression, such as Opus, MP3, Vorbis, Musepack, AAC, ATRAC and Windows Media Audio Lossy (WMA lossy).