This article is now maintained as a Wiki page at http://wiki.multimedia.cx/index.php?title=Quantization.
I have never quite understood what is so hard about quantization. Maybe I am missing something but it seems to be primarily a matter of division (for quantization) and multiplication (for dequantization, a.k.a. requantization).
This type of quantization is also referred to as scalar quantization, as opposed to vector quantization which is a topic for another article. Most references I have ever read describe quantization in an impossibly abstract manner. Implementation-wise, quantizing a number usually boils down to this:
original_number / quantization_factor = quantized_number
Dequantization entails the following operation:
quantized_number * quantization_factor = original_number (or, hopefully, some close approximation of original_number)
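Written out as C (just a minimal sketch; the function names are only for illustration), each operation is a single integer operation per value:

/* quantize: integer division discards the remainder, which is where
 * precision is lost */
int quantize(int original_number, int quantization_factor)
{
    return original_number / quantization_factor;
}

/* dequantize: multiply back; the result is only an approximation of
 * the original number */
int dequantize(int quantized_number, int quantization_factor)
{
    return quantized_number * quantization_factor;
}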
Basic division and multiplication. In doing so, the numerical precision is effectively reduced. If a particular number has a range of 0..499, there are 500 possible numbers. But if the number is quantized by (divided by) 5, the quantized numbers will only range from 0..99, for a total of 100 possible numbers. Dequantizing (multiplying) will also mean that there are still only 100 possible numbers:
0 5 10 15 20 25 30 … 485 490 495
Let’s look at a string of numbers that might be quantized in practice:
83 13 21 5 4 2 1 5 6 2 1 2 2 1 3 1
Let’s apply a quantization factor of 5. Thus, we divide each number by 5 and throw away the remainder:
16 2 4 1 0 0 0 1 1 0 0 0 0 0 0 0
Let’s examine what just happened. First, the numbers all got a lot smaller. Remember from the differential coding article that smaller numbers require less information to code. Not only that, a bunch of the numbers turned into zeros. Recall the RLE article’s mention of run-level coding. This is precisely where it comes into play.
During dequantization, the quantization factor is multiplied by each number and we get:
80 10 20 5 0 0 0 5 5 0 0 0 0 0 0 0
Hmm… the original string was this:
83 13 21 5 4 2 1 5 6 2 1 2 2 1 3 1
Something happened. Namely, we lost information during the coding process. Multimedia compression algorithms are often lossy in order to achieve greater compression than they could with a lossless algorithm. Guess what? Quantization is where a lot of that fabled information loss occurs. In fact, have you ever seen multimedia compressors which allow you to configure “how much compression” you want? This is where that configurability usually matters. By making the quantization factor (i.e., divisor) larger, the quantized numbers get smaller, more numbers turn into zeros, and more overall information is lost (and more compression is achieved).
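To make this concrete, here is a minimal C sketch (using the example string from above and a factor of 5) that performs the quantize/dequantize round trip and prints each original number next to its quantized and recovered values:

#include <stdio.h>

int main(void)
{
    int numbers[16] = { 83, 13, 21, 5, 4, 2, 1, 5, 6, 2, 1, 2, 2, 1, 3, 1 };
    int quantization_factor = 5;
    int i;

    for (i = 0; i < 16; i++) {
        int quantized = numbers[i] / quantization_factor;  /* quantize */
        int recovered = quantized * quantization_factor;   /* dequantize */
        printf("%3d -> %2d -> %3d\n", numbers[i], quantized, recovered);
    }
    return 0;
}

Raising quantization_factor makes more of the quantized values collapse to 0, which compresses better but recovers less of the original string.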
While it is possible to quantize a string of numbers using just one quantization factor (and certain codecs do this), it is much more common for a codec to use a string of quantization factors, one for each of the numbers in the string to be quantized or dequantized. These strings are typically referred to as either vectors or matrices. Using an entire quantizer vector or matrix, it is possible to throw away more information from the numbers that do not have as much effect. In our example string (let’s call it a vector now):
83 13 21 5 4 2 1 5 6 2 1 2 2 1 3 1
Note that the beginning numbers are much larger than the numbers at the end. This is often the case in situations where quantization is to be applied. The beginning numbers are more important and the numbers decrease in significance through the vector. Thus, smaller quantizers can be applied at the start of the vector, and larger quantizers can be applied at the end. For example:
2 2 2 2 3 3 3 4 4 4 5 6 7 8 9 10
In this example, the first 3 numbers will be quantized to:
41 6 10
and dequantized to:
82 12 20
which is a little more accurate than:
80 10 20
which we got when we used the uniform quantizer of 5.
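Applying a per-element quantizer vector looks much the same in code; here is a minimal sketch using the numbers from this example (the array names are just illustrative):

#include <stdio.h>

#define VECTOR_SIZE 16

int main(void)
{
    int numbers[VECTOR_SIZE]    = { 83, 13, 21, 5, 4, 2, 1, 5, 6, 2, 1, 2, 2, 1, 3, 1 };
    int quantizers[VECTOR_SIZE] = {  2,  2,  2, 2, 3, 3, 3, 4, 4, 4, 5, 6, 7, 8, 9, 10 };
    int i;

    for (i = 0; i < VECTOR_SIZE; i++) {
        /* each element gets its own quantization factor */
        int quantized = numbers[i] / quantizers[i];
        int recovered = quantized * quantizers[i];
        printf("%3d -> %2d -> %3d\n", numbers[i], quantized, recovered);
    }
    return 0;
}

The small quantizers at the front keep the large, important numbers fairly accurate, while the large quantizers at the end squash the small trailing numbers to zero.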
Much multimedia compression research is focused on quantization patterns and how to apply them in such a way as to throw away just the right amount of information but not too much. As mentioned before, some codecs apply a uniform quantizer everywhere and some apply quantization vectors or matrices. Many codecs define different quantization matrices depending on which video plane is being operated upon. Some codecs encode information in a compressed video frame’s header to adjust quantizer matrices on each frame. Some codecs go even further to be able to adjust the quantization scheme on particular areas of the video frame.