James Yu

Homework #2



We need to measure the X, Y, Z tristimulus values of a white surface under the same viewing conditions. This is because our eyes adapt to the surrounding ambient light and will perceive a white surface in that environment as white. In essence, this is similar to color balancing.
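The idea can be sketched as a von Kries-style diagonal adaptation: divide each channel by the measured white, so that the white surface itself maps to neutral. (The white-point numbers below are illustrative, roughly a D65-like white, not measured values.)

```python
import numpy as np

# Measured XYZ of the white surface under the viewing illuminant
# (illustrative values, roughly a D65-like white).
white_xyz = np.array([95.0, 100.0, 108.0])

def adapt_to_white(xyz, white):
    """Von Kries-style diagonal adaptation: scale each channel by the
    corresponding white-point value, so the measured white maps to
    (1, 1, 1) -- i.e., it is seen as white in that environment."""
    return np.asarray(xyz, dtype=float) / white

# A stimulus with the same chromaticity as the white appears neutral:
print(adapt_to_white([47.5, 50.0, 54.0], white_xyz))  # -> [0.5 0.5 0.5]
```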


CIELAB was designed using

• large 2 cpd targets
• static display
• simple (uniform) backgrounds

These conditions fail to account for factors such as non-static (moving) images, targets with components above 2 cpd, and complex backgrounds. For general images, we cannot guarantee the viewing conditions for which CIELAB was designed, which makes it difficult to apply in the general case.

Human Pattern Vision


We know from the contrast-sensitivity function that we are not sensitive to low-contrast content at very high and very low spatial frequencies. However, when we use the mean squared error as a metric, we count differences in these low-sensitivity regions. For example, a uniformly gray area A of an image and an area with low-contrast, high-frequency content at the same mean level as A will look about the same to a viewer, even though the MSE between them is nonzero.

Thus, one way to build a more accurate similarity algorithm is to pass the image through a filter that attenuates the low-contrast, very high and very low frequency content, removing the frequencies the eye does not perceive well. Taking the MSE of the filtered images should then give a reasonable similarity measure.
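The revised measure can be sketched as follows. The band-pass curve here is a crude stand-in for the contrast-sensitivity function, and its parameters are made-up illustrative choices, not a calibrated model:

```python
import numpy as np

def csf_weight(shape, peak=0.1, lo=0.02):
    """Crude band-pass weight standing in for the contrast-sensitivity
    function: attenuates very low and very high spatial frequencies.
    `peak` and `lo` are in cycles/pixel and are illustrative choices."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    f = np.hypot(fy, fx)
    # Rises from zero at DC, peaks near `peak`, falls off at high f.
    w = (f / peak) * np.exp(1.0 - f / peak)
    w[f < lo] *= f[f < lo] / lo  # extra roll-off at very low frequencies
    return w

def csf_mse(a, b):
    """MSE computed after weighting both spectra by the CSF weight."""
    w = csf_weight(a.shape)
    fa = np.fft.ifft2(np.fft.fft2(a) * w).real
    fb = np.fft.ifft2(np.fft.fft2(b) * w).real
    return np.mean((fa - fb) ** 2)

# A flat gray patch vs. one with low-contrast high-frequency detail:
rng = np.random.default_rng(0)
gray = np.full((64, 64), 128.0)
textured = gray + 2.0 * rng.standard_normal((64, 64))
print(np.mean((gray - textured) ** 2))  # plain MSE, about 4
print(csf_mse(gray, textured))          # smaller after CSF weighting
```

As in the example above, the two patches look nearly identical to a viewer, and the CSF-weighted error reflects that better than the plain MSE does.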

A useful experiment would be to compare the plain MSE with this revised measure and see which comes closer to the difference viewers subjectively perceive.

The blue (S) receptors have low spatial resolution, while the yellow "receptors" (the combined red and green signals) have higher spatial resolution. Since we have only a few blue receptors for every yellow one, it follows that the blue-yellow pathway does not need to carry a high-resolution component.

General Review of JPEG-DCT


Pixels that are closer together spatially are more highly correlated. Thus, even though the dynamic range of the whole image can be large, the differences between adjacent pixels are small. When we apply lossless compression, we are essentially decorrelating the data so that we can use bits more efficiently.
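A quick illustration of that decorrelation (a sketch using simple pixel differencing, not any particular codec's predictor): the raw values of a smooth signal span a large range, but the adjacent-pixel differences concentrate near zero, so their empirical entropy is much lower.

```python
import numpy as np

def entropy_bits(values):
    """Empirical entropy (bits per sample) of an integer array."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

# A smooth ramp with small noise: large overall dynamic range,
# but adjacent pixels differ only slightly.
rng = np.random.default_rng(1)
row = (np.linspace(0, 255, 1024) + rng.integers(-2, 3, 1024)).astype(int)

diffs = np.diff(row)  # predictive step: code differences, not raw values
print(entropy_bits(row))    # high: values spread over roughly 0..255
print(entropy_bits(diffs))  # low: differences concentrated near zero
```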


JPEG is a block-based transform: the image is divided into blocks (usually 8x8) and a DCT is applied to each block separately. This can be thought of as a sampling problem, since the corresponding pixels across all the blocks, taken together, make up a downsampled version of the original image. Each block is transformed into a set of DCT coefficients; computing a coefficient can also be thought of as a convolution with a special kind of kernel. We may obtain a matrix representation of this operation by taking the outer product of the basis functions, which in this case are cosines.
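The outer-product construction can be written out directly. Using the orthonormal DCT-II basis, each 2-D basis image is the outer product of two 1-D cosine vectors, and the block transform is the matrix product C X C^T:

```python
import numpy as np

N = 8

def dct_matrix(n=N):
    """Orthonormal DCT-II basis: row k is the k-th cosine basis vector."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    c = np.cos(np.pi * (2 * x + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    c[0] /= np.sqrt(2.0)  # DC row scaled so the matrix is orthonormal
    return c

C = dct_matrix()

# The 2-D basis image (u, v) is the outer product of two 1-D vectors:
u, v = 1, 2
basis_uv = np.outer(C[u], C[v])

# Transforming a block is the separable product C X C^T, and each
# coefficient is the correlation of X with one basis image:
X = np.arange(64, dtype=float).reshape(8, 8)
coeffs = C @ X @ C.T
assert np.isclose(coeffs[u, v], np.sum(X * basis_uv))

# C is orthonormal, so the transform itself is invertible:
assert np.allclose(C.T @ coeffs @ C, X)
```

Note that the DCT itself loses nothing; the losses come from the quantization step described next.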


Information is lost during the quantization phase. After we obtain the DCT coefficients, we specify how many quantization levels are used for each coefficient. Some coefficients are quantized finely, while others are quantized very coarsely. In any case, quantization is a non-linear, non-invertible process, and information is lost.
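A sketch of that step (a single illustrative step size, not an actual JPEG table): divide by the step, round, and multiply back. The rounding is where distinct inputs collapse to the same output, which is exactly the non-invertibility.

```python
import numpy as np

def quantize(coeffs, step):
    """Divide by the step and round -- the rounding destroys information."""
    return np.round(coeffs / step).astype(int)

def dequantize(q, step):
    return q * step

step = 16  # illustrative step size
coeffs = np.array([-23.0, -7.0, 3.0, 39.0, 41.0])
q = quantize(coeffs, step)
restored = dequantize(q, step)

print(q)         # the transmitted integer levels
print(restored)  # -7 and 3 both map to 0: distinct inputs collapse
```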

Also, JPEG downsamples the chroma planes Cb and Cr, so information is lost there as well.

In the pixel domain, all pixels are equally important. It makes no sense to give more
weight to certain pixels and less to others: we perceive each pixel equally. However, in
the DCT domain not all coefficients are equally important. The human visual system has
a strong frequency dependent sensitivity function. We are not so sensitive to higher
frequency content, and thus, we can quantize those more heavily.
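One way to express that frequency dependence (the table below is a made-up illustration, not the standard's luminance table): let the step size grow with the sum of the frequency indices, so high-frequency coefficients are quantized far more coarsely than the DC term.

```python
import numpy as np

# Hypothetical quantization table: step size grows with frequency.
u = np.arange(8)[:, None]
v = np.arange(8)[None, :]
qtable = 10 + 6 * (u + v)   # DC step 10 ... highest-frequency step 94

# Moderate coefficients of equal magnitude survive at low frequency
# but round away to zero at high frequency:
coeffs = np.full((8, 8), 20.0)
q = np.round(coeffs / qtable)
print(q[0, 0], q[7, 7])     # 2.0 (kept) vs 0.0 (discarded)
```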


B will always look the same as or worse than A. Lossy compression can only lose information, never recover it. Thus, depending on the compression settings, B may actually look worse than A. The best we can hope for is that no further harm is done in compressing to B, which is not always the case.


The main visual artifact is the blocking artifact: edge artifacts caused by the block-based algorithm JPEG uses. At low bitrates, discontinuities become visible at the block boundaries. JPEG also tends to smooth away fine detail at low bitrates, since it quantizes the high-frequency information heavily.