Abstract— Recently, image pairs, such as noisy and blurred images or infrared and noisy images, have been considered as a solution to provide high-quality photographs under low lighting conditions. In this paper, a new method for decomposing the image pairs into two layers, i.e., the base layer and the detail layer, is proposed for image pair fusion. In the case of infrared and noisy images, simple naive fusion leads to unsatisfactory results due to the discrepancies in brightness and image structures between the image pair. To address this problem, a local contrast-preserving conversion method is first proposed to create a new base layer of the infrared image, which can have a visual appearance similar to another base layer, such as the denoised noisy image. Then, a new way of designing three types of detail layers from the given noisy and infrared images is presented. To estimate the noise-free and unknown detail layer from the three designed detail layers, the optimization framework is modeled with residual-based sparsity and patch redundancy priors. To better suppress the noise, an iterative approach that updates the detail layer of the noisy image is adopted via a feedback loop. This proposed layer-based method can also be applied to fuse another noisy and blurred image pair. The experimental results show that the proposed method is effective for solving the image pair fusion problem.

Index Terms— Near-infrared imaging, multi-exposure imaging, image fusion, layer decomposition, sparse regularization.

I. INTRODUCTION

amount of exposure; in other words, by using relatively short and long shutter speeds during shooting. In this way, an image captured with a short exposure will contain some noise but can avoid blurring artifacts, while an image captured with a long exposure will have blurring artifacts due to camera shake but will have clean color tones. Even though these two images are both degraded, the use of the image pair can improve the restored image quality via accurate kernel estimation (i.e., camera motion) and removal of ringing artifacts compared to the single-image-based deblurring and denoising methods [1], [2].

The second image pair [4] consists of noisy and flash-produced images. A flash is often utilized in low lighting conditions to add artificial light to the scene, thereby producing a sharp and noise-free image. However, the color tones of such an image are different from those of the no-flash image captured under ambient light due to the color temperature difference between the ambient light and the flash. Thus, the color tones in a flash image need to be adjusted to be similar to those of the no-flash image via the color transferring method [6] or illumination removal technique [7]. The simple naive fusion method can also provide good results in such circumstances. The use of a flash has disadvantages in that its use is
TABLE I
IMAGE PAIR FUSION MODELS
A. Fusion Model for Noisy and Blurred Image Pairs

For noisy and blurred image pairs, models based on residual deconvolution (RD) [3], multi-resolution (MR) [15], gradient difference constraint (GDC) [16], and artificial blurred image generation (ABIG) [17], [11] have been used. The RD model [3] identifies the unknown original image (x) by adding the lost detail layer to the base layer, i.e., the denoised noisy image (xn,d). This model is effective in restoring the fine details; however, it requires single-image-based deconvolution to identify the detail layer, and thus it is susceptible to ringing artifacts [11]. In contrast, the GDC [16] and ABIG [17] models utilize the noisy image as the regularization and data fidelity terms, respectively, and then combine them in the transform or spatial domain. Thus, the ringing artifacts that result from the deconvolution with the captured blurred image (xb) and the inaccurate kernel (k) can be mitigated by fusing the sharp noisy image. The difference between GDC and ABIG is that the captured noisy image is used as the prior in the GDC model, whereas the captured image is used in the data fidelity term in the ABIG model. In other words, the GDC model makes the gradients of the unknown original image similar to those of the noisy image. However, the ABIG model produces two types of blurred images: one is the captured blurred image, and the other is the artificially created blurred image, which is obtained by denoising the noisy image (xn,d). The artificially created blurred image is updated with the corresponding estimated kernel (kg) via a feedback loop. This is based on a related work [14], which showed that two blurred images yield a better restored image compared to the use of only one blurred image. In the GDC and ABIG models, the alpha value (α) indicating the prior can be fixed at 0.5 or 1.0, or it can be spatially varied [11]. The MR method [15] combines the wavelet coefficients of the noisy image (xn) with those of the blurred image (xb) based on the assumption that the absolute differences between the two images are typically larger closer to the edges of the images. This indicates that the value of λ is inversely proportional to the absolute value of (d). In Table I, λ can be a scalar value or a function depending on the model, and the value of λ can be constant or spatially varied. However, this assumption fails in the presence of severe noise in the captured noisy image. Thus, this MR model is more appropriate for high dynamic range (HDR) fusion [18], infrared coloring [8], or dehazing [19], where the input images contain little noise.

B. Fusion Model for Noisy and Infrared Image Pairs

For noisy and infrared images, the weighted least square (WLS) [5] and GDC [20] models have been used. The WLS model utilizes the infrared image to determine the weights (λ) of the regularization term; in other words, the infrared image is regarded as a guidance image used in the adaptive filtering [21]. In Table I, the GDC model needs to be modified for noisy and infrared image fusion. In other words, k should be the delta kernel, and xb and xn should be replaced by xn and xir, respectively. This illustrates that the GDC model produces an unknown image x that is similar to the noisy image xn; however, it has a constraint that the gradients of x are similar to those of the clean infrared image, xir, in order to enable restoration of the details. Recently, a more advanced scale map (SM) approach [9] has been introduced. In the last row of Table I, the scale map (s) is additionally used to predict the gradients of the unknown original image (x) from the given guidance image (y). In other words, different from the GDC model, the SM model can handle gradient direction and magnitude discrepancies by using the scale map, although it needs to estimate the unknown scale map and the original image at the same time. The ABIG model is not appropriate for noisy and infrared image pair fusion due to the discrepancies in the brightness and gradients between the noisy and infrared images. For this reason, the GDC model is preferred over the ABIG model. If there are slight discrepancies between the two images, however, the naive fusion method [10], which just combines the infrared gray image with the denoised chrominance components of the noisy image, can be a good choice.

C. Fusion Model for Noisy and Flash Image Pairs

Unlike the noisy and infrared image pair, there are no discrepancies in the gradations or image structures between noisy and flash images. However, the color appearance differs between these images [4] due to the color temperature difference between the ambient light and the flash. Therefore, the color appearance of the flash image needs to be adjusted to be close to that of the noisy image. Color transfer [6] or illumination removal [7] can be used to adjust the color tones
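Since Table I itself is not reproduced in this excerpt, the GDC idea for this pair (keep x close to the noisy image while tying its gradients to the infrared image) can only be sketched under assumptions. In the minimal sketch below, the quadratic objective, the forward-difference gradients, the periodic boundary handling, and the name `gdc_fuse` are all illustrative choices, not the paper's exact formulation:

```python
import numpy as np

def gdc_fuse(x_n, x_ir, lam=2.0):
    """Sketch of a GDC-style fusion:
    min_x ||x - x_n||^2 + lam * ||grad(x) - grad(x_ir)||^2,
    solved in closed form with FFTs (periodic boundaries assumed)."""
    h, w = x_n.shape
    # Frequency responses of forward-difference gradient filters.
    dx = np.zeros((h, w)); dx[0, 0] = -1; dx[0, 1] = 1
    dy = np.zeros((h, w)); dy[0, 0] = -1; dy[1, 0] = 1
    Dx, Dy = np.fft.fft2(dx), np.fft.fft2(dy)
    grad2 = np.abs(Dx) ** 2 + np.abs(Dy) ** 2
    # Normal equations in the Fourier domain:
    # (1 + lam*|D|^2) X = X_n + lam*|D|^2 X_ir
    numer = np.fft.fft2(x_n) + lam * grad2 * np.fft.fft2(x_ir)
    return np.real(np.fft.ifft2(numer / (1.0 + lam * grad2)))
```

Note how the mean (DC component) always comes from the noisy image, while, as λ grows, the gradients are pulled toward those of the infrared image, which matches the constraint described above.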
SON AND ZHANG: LAYER-BASED APPROACH FOR IMAGE PAIR FUSION 2869
$$x_{base} = w\,x_{base}^{n} + (1-w)\,x_{base}^{ir} \tag{9}$$

where $w$ indicates the weighting value for the base layer fusion. In this paper, the weighting value, $w$, is determined as the edge strength ratio between the two base layers at each pixel position. As mentioned in subsection III.A.1, the use of simple fusion of the two base layers in (9) is made possible by the proposed image conversion method.

Second, the unknown detail layer is estimated by minimizing the following cost function:

$$\min_{\Delta x,\,\alpha_i}\ \frac{\lambda_1}{2}\big\|\Delta x-\Delta x^{n}\big\|_2^2+\frac{\lambda_2}{2}\big\|\Delta x-\big(\gamma_1\Delta x^{ir}+\gamma_2\Delta x_{base}^{ir,s}\big)\big\|_2^2+\frac{\lambda_3}{2}\sum_{i=1}^{N}\big\|R_i\,\Delta x-D\alpha_i\big\|_2^2+\sum_{i=1}^{N}\big\|\alpha_i\big\|_0 \tag{10}$$

where $\lambda$ determines the weight ratio between the data fidelity and regularization terms, and $\gamma$ determines the fusing ratio for the two detail layers, $\Delta x^{ir}$ and $\Delta x_{base}^{ir,s}$. In (10), $\Delta$ is used to indicate detail layers. The first two terms indicate the data fidelity terms. In other words, the original detail layer to be estimated here should not deviate from the designed three detail layers. The cost function in (10) can be minimized by alternating between the following two subproblems:

$$\min_{\alpha_i}\ \big\|R_i\,\Delta x-D\alpha_i\big\|_2^2 \quad \text{subject to} \quad \big\|\alpha_i\big\|_0\le TH \tag{11}$$

$$\min_{\Delta x}\ \frac{\lambda_1}{2}\big\|\Delta x-\Delta x^{n}\big\|_2^2+\frac{\lambda_2}{2}\big\|\Delta x-\big(\gamma_1\Delta x^{ir}+\gamma_2\Delta x_{base}^{ir,s}\big)\big\|_2^2+\frac{\lambda_3}{2}\sum_{i=1}^{N}\big\|R_i\,\Delta x-D\alpha_i\big\|_2^2 \tag{12}$$

Equation (11) is an $\ell_0$-norm problem with respect to the coefficient vector, and thus one of the matching pursuit algorithms [1], [12], [29] can be used. In (11), $TH$ is a scalar controlling the sparsity. In this paper, $TH$ is set to 3. More details about parameter setting can be found in [1] and [12]. On the other hand, (12) is an $\ell_2$-norm problem, so it can have a closed-form solution obtained by differentiating (12) with respect to $\Delta x$ as follows:

$$\bigg(\lambda_1 I+\lambda_2 I+\lambda_3\sum_{i=1}^{N}R_i^{T}R_i\bigg)\Delta x=\lambda_1\Delta x^{n}+\lambda_2\big(\gamma_1\Delta x^{ir}+\gamma_2\Delta x_{base}^{ir,s}\big)+\lambda_3\sum_{i=1}^{N}R_i^{T}D\alpha_i \tag{13}$$
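The $\ell_0$-constrained sparse coding step (11) with $TH = 3$ is typically handled by a greedy pursuit. The following is a minimal numpy sketch, not the exact algorithm of [1], [12], [29]; the function name `omp` and the assumption that the dictionary columns are unit-norm are illustrative:

```python
import numpy as np

def omp(D, y, th=3):
    """Greedy orthogonal matching pursuit for (11):
    select at most `th` atoms of D (columns assumed unit-norm)
    that best approximate the patch y, with least-squares refitting."""
    residual, support = y.copy(), []
    for _ in range(th):
        # Atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # Orthogonal projection of y onto the selected atoms.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    alpha = np.zeros(D.shape[1])
    alpha[support] = coef
    return alpha
```

In the method described above, this coding step would be applied to every patch $R_i\,\Delta x$ before the $\Delta x$ update of (12).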
2872 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 25, NO. 6, JUNE 2016
$$\Delta x=\mathcal{F}^{-1}\!\left(\frac{\lambda_1\,\mathcal{F}\big(\Delta x^{n}\big)+\lambda_2\,\mathcal{F}\big(\gamma_1\Delta x^{ir}+\gamma_2\Delta x_{base}^{ir,s}\big)+\lambda_3\,\mathcal{F}\Big(\sum_{i=1}^{N}R_i^{T}D\alpha_i\Big)}{\lambda_1+\lambda_2+\lambda_3\beta}\right) \tag{14}$$

where $\mathcal{F}$ and $\mathcal{F}^{-1}$ indicate the forward and backward fast Fourier transforms, respectively. In (13), $I$ is the identity matrix, and the term $\sum_{i=1}^{N}R_i^{T}R_i$ is a diagonal matrix, the diagonal entries of which are equal to the number of overlapping patches [1], [28]. Thus, the term $\sum_{i=1}^{N}R_i^{T}R_i$ can be replaced with $\beta I$, where $\beta$ is the number of overlapping patches. By inserting $\beta I$ into the position of $\sum_{i=1}^{N}R_i^{T}R_i$ and then applying the fast Fourier transform, the closed-form solution can be obtained, as shown in (14). The parameters are set to $\lambda_1 = 1$, $\lambda_2 = 10$, $\lambda_3 = \beta^{-1}$, $\gamma_1 = 1.3$ and $\gamma_2 = 0.7$. In (14), $\gamma$ controls which detail layers will be more emphasized, resulting in sharpening. The detail layer of the noisy image is iteratively replaced by the estimated detail layer through a feedback loop, as shown in Fig. 2(a), thus producing high-quality restored images.

4) Residual Dictionary Generation: To learn the residual dictionary in (10), a training image set including 200 natural images is first collected from Internet websites, and then the nonlocal means image denoising method [24] is applied to the training image set to provide the corresponding base layer set. Next, the detail layer set is generated by subtracting the base layer set from the paired training image set. It can be assumed that the detail layers of the infrared gray images can be sparsely represented using the residual dictionary learned from the natural gray images. Given the detail layer set, the residual dictionary can be obtained by solving the following cost function:

$$\min_{D,A}\ \|X-DA\|_2^2 \quad \text{subject to} \quad \forall j,\ \|A(:,j)\|_0\le TH \tag{15}$$

where the matrix $X$ contains the patches extracted from the detail layers as its columns. $D$ and $A$ indicate the residual dictionary and the representation coefficient matrix, respectively. The constraint term, $\|A(:,j)\|_0\le TH$, indicates that each column vector $A(:,j)$ of the matrix $A$ should be sparse. The minimization over the two unknowns $D$ and $A$ in (15) can be alternately solved with the K-SVD learning algorithm [12].

Fig. 5. Comparison of the learned dictionaries from the natural images and the detail layers (left to right).

The conventional dictionary is more appropriate for the sparse representation of natural images [1], [27]. Conversely, the residual dictionary provides the best sparse representation for the detail layers.

B. Noisy and Blurred Image Pair Fusion

The proposed layer-based approach described in Section III.A can be applied to a second noisy and blurred image pair. Unlike the noisy and infrared image pair, the noisy and blurred image pair has one base layer and two detail layers, which are defined as follows:

$$x_{base}^{n}=x_{n,d} \tag{16}$$

$$\Delta x^{n}=x^{n}-x_{base}^{n} \tag{17}$$

$$\Delta x^{b}=x^{b}-\big(x_{base}^{n,b}=x_{base}^{n}\otimes k\big) \tag{18}$$

where $x_{base}^{n}$, $\Delta x^{n}$, and $\Delta x^{b}$ indicate the base layer and two types of detail layers, respectively. $x_{base}^{n,b}$ is defined as $x_{base}^{n}\otimes k$, where $k$ indicates the kernel corresponding to the blurred image $x^{b}$, and the kernel can be accurately estimated based on the patch pair [17]. Equation (16) indicates that the base layer is the same as the denoised noisy image obtained with the nonlocal means denoising method. The detail layer of the noisy image is the difference between the noisy image and the base layer. On the other hand, the detail layer of the blurred image is the difference between the blurred image and the convolved base layer. Given the detail layers, the unknown and noise-free detail layer can be found by minimizing the following cost function:

$$\min_{\Delta x,\,\alpha_i}\ \frac{\lambda_1}{2}\big\|\Delta x-\Delta x^{n}\big\|_2^2+\frac{\lambda_2}{2}\big\|\Delta x\otimes k-\Delta x^{b}\big\|_2^2+\frac{\lambda_3}{2}\sum_{i=1}^{N}\big\|R_i\,\Delta x-D\alpha_i\big\|_2^2+\sum_{i=1}^{N}\big\|\alpha_i\big\|_0$$
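The closed-form detail-layer update in (14) can be sketched as follows, using the parameter settings reported in the paper. The precomputed `patch_sum` argument stands in for $\sum_i R_i^T D\alpha_i$, and since $\lambda_1+\lambda_2+\lambda_3\beta$ is a scalar, the FFT pair shown here is mathematically optional; it is kept only to mirror (14):

```python
import numpy as np

def detail_update(dx_n, dx_ir, dx_ir_s, patch_sum, beta,
                  lam1=1.0, lam2=10.0, gamma1=1.3, gamma2=0.7):
    """Closed-form update of the detail layer, following (14):
    combine the designed detail layers and the sparse-coding
    reconstruction, then divide by (lam1 + lam2 + lam3*beta).
    `patch_sum` stands for sum_i R_i^T D alpha_i (precomputed);
    `beta` is the number of overlapping patches per pixel."""
    lam3 = 1.0 / beta  # paper's setting: lam3 = beta^{-1}
    numer = (lam1 * np.fft.fft2(dx_n)
             + lam2 * np.fft.fft2(gamma1 * dx_ir + gamma2 * dx_ir_s)
             + lam3 * np.fft.fft2(patch_sum))
    return np.real(np.fft.ifft2(numer / (lam1 + lam2 + lam3 * beta)))
```

Because replacing $\sum_i R_i^T R_i$ with $\beta I$ makes the system diagonal, each pixel of the update is just a weighted average of the three inputs, which is why the division by a scalar suffices.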
Algorithm 1 Image Fusion for Noisy and Infrared Image Pairs
Algorithm 2 Image Fusion for Noisy and Blurred Image Pairs
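The alternating minimization behind the dictionary learning of (15), sparse coding followed by per-atom updates in the spirit of K-SVD [12], might be sketched as follows. The function `ksvd`, the greedy coding loop, and all sizes are illustrative simplifications, not the reference implementation:

```python
import numpy as np

def ksvd(X, n_atoms=64, th=3, n_iter=10, seed=0):
    """Toy K-SVD for (15): alternate greedy sparse coding of the
    detail-layer patches X (one patch per column) with rank-1 SVD
    updates of each dictionary atom."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)  # unit-norm atoms
    A = np.zeros((n_atoms, X.shape[1]))
    for _ in range(n_iter):
        # Sparse coding: at most `th` atoms per patch (greedy pursuit).
        A[:] = 0.0
        for j in range(X.shape[1]):
            y = X[:, j]
            r, support = y.copy(), []
            for _ in range(th):
                support.append(int(np.argmax(np.abs(D.T @ r))))
                support = sorted(set(support))
                c, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
                r = y - D[:, support] @ c
            A[support, j] = c
        # Atom update: refit each used atom against its residual matrix.
        for k in range(n_atoms):
            used = np.nonzero(A[k])[0]
            if used.size == 0:
                continue
            E = X[:, used] - D @ A[:, used] + np.outer(D[:, k], A[k, used])
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, k], A[k, used] = U[:, 0], s[0] * Vt[0]
    return D, A
```

In the paper's setting, the columns of X would be patches from the detail-layer training set described in subsection III.A.4.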
Fig. 6. Experimental results: (a) visible color image, (b) created noisy image, (c) gray infrared image, (d) restored image with the naive fusion, (e) restored
image with the nonlocal means denoising, (f) restored image with the BM3D method, (g) restored image with the WLS model, (h) restored image with the
GDC model, (i) restored image with the GIF, (j) restored image with the proposed method, (k) denoised infrared image, (l) new base layer of the infrared
image, (m) base layer of the noisy image, (n) detail layer of the noisy image, (o) detail layer of the infrared image, (p) detail layer of the newly created base
layer, (q) fused base layer, (r) restored detail layer, and (s) close-up comparison (nonlocal means denoising, BM3D, WLS, GDC, GIF, and proposed methods:
left to right).
the grayscale images. To address this problem, the noisy color image is first separated into Y, Cb, and Cr images [26], and then only the Y image is fused with the gray infrared image via the proposed method. The remaining Cb and Cr images are denoised with the nonlocal means denoising method. Moreover, the real captured images [5], [9] are downloaded and then tested. To analyze the performance of the proposed method, the conventional methods [5], [10], [20], [21], [24] are compared. The testing environment is a Windows system with an Intel Core i7-4702, 2.2 GHz CPU, and the testing image size is 512 × 512.

Figure 6 shows the visible color image, the created noisy image, the gray infrared image, the restored images, the base layers, and the detail layers. In the visible color image of Fig. 6(a), pixel saturation occurred around the glass window due to the limited camera dynamic range. See the red box. Thus, some information has been lost compared to that in the gray infrared image of Fig. 6(c). However, the gray infrared image has been darkened in contrast to the visible color image. The captured visible color image and the infrared gray image have different edges and brightness. This reveals that the original visible color image without the pixel saturation is unknown, and thus full-reference image quality evaluation cannot be conducted [31]. However, for noisy and blurred image pairs, a quantitative evaluation is possible. For the noisy visible color and infrared gray images without discrepancy, simple naive fusion can be a good choice because the luminance channel of the noisy image is directly replaced by the noise-free infrared image. This is demonstrated in Fig. 6(d), where there is little noise and superior sharpness compared to those of the other restored images. However, the overall brightness is decreased, which indicates that the naive fusion method cannot be applied to image pairs suffering from the discrepancy problem. Fig. 6(e) shows the denoised image with the nonlocal means denoising method [24]. As shown in Fig. 6(e), undesirable noise still remains in the flat regions and some designs on the wall are seriously blurred. However, BM3D [32], one of the state-of-the-art denoising methods, can improve the visual quality significantly, as shown in Fig. 6(f). However, the lost visual information in the saturation region cannot be restored. The WLS fusion model [5] uses the infrared image as the guidance image to control the weights of the regularization term, as shown in Table I. As a result, the designs on the wall are more sharpened and the noise in the flat regions is highly suppressed, which can be checked by comparing Figs. 6(e) and 6(g). Note that the nonlocal means denoising method is used to provide xn and xb in Table I. This means that the WLS, GDC, and the proposed
methods are influenced by the restoration quality of xn and xb. In general, it is known that the performance of BM3D is better than that of the nonlocal means denoising method, and thus the visual quality of BM3D can be better than that of the WLS method; see Figs. 6(f) and 6(g). However, the WLS method can be better than BM3D for real-life noisy images because BM3D is weak to non-Gaussian noise, which will be checked later. In addition, if BM3D is used to provide xn and xb in Table I, the performance of the WLS can be better than that of BM3D. Meanwhile, the GDC model [4], [16], [20] includes the infrared image in the regularization term. In other words, it produces gradients of the restored image that are close to those of the infrared image, and thus the lost information in the saturation region can be recovered, as shown in Fig. 6(h). Fig. 6(i) shows the image restored with the GIF [21]. The visual quality is comparable to that of the proposed method, as shown in Fig. 6(j). However, the information in the saturation region is not restored. In addition, the sharpness is not better than that of the proposed method. Furthermore, the GIF is a good solution only if the discrepancy between input image pairs is small, which can be checked for the real-life noisy images. For detailed comparison, close-up images are provided in Fig. 6(s).

The image restored with the proposed method in Fig. 6(j) is a combination of the fused base layer and the estimated detail layer, as shown in Figs. 6(q) and (r), respectively. Figures 6(l) and (m) show the created base layers of the infrared and noisy images, respectively. The denoised infrared image in Fig. 6(k) is changed to the new base layer in Fig. 6(l). As shown in Fig. 6(l), its appearance becomes closer to that of the base layer in Fig. 6(m). Figures 6(n)-(p) show the three types of detail layers. The detail layer of the noisy image includes both noise and image structures, as shown in Fig. 6(n). On the other hand, the detail layer of the infrared image contains fine details, as shown in Fig. 6(o); however, some information, such as the bars on the window, has been lost. The detail layer of the newly created base layer, shown in Fig. 6(p), has strong edges including highlights. Also, the bars on the window that were removed in Fig. 6(o) can be detected. As mentioned in subsection III.A.2, the detail layers are separately enhanced with contrast stretching [26], and thus it is meaningless to compare the absolute pixel values. Even though there is brightness discrepancy between the denoised noisy and infrared images, as shown in Figs. 6(k) and (m), the proposed image conversion method overcomes this problem, thereby producing the fused base layer, as shown in Fig. 6(q). In addition, the clean detail layer, shown in Fig. 6(r), can be estimated via the proposed residual dictionary and sparse regularization. The updating of the detail layer of the noisy image can improve the restoration quality; however, the improvement is small for the noisy and infrared image pair. Thus, the updated detail layers are not provided in this paper. However, the effectiveness of the iteration will be assessed for noisy and blurred image pairs. As shown in Fig. 6(j), the noise can be effectively removed, and the fine details can be preserved with the proposed layer-based approach. Also, the saturated regions in the red box of Fig. 6(a) can be nicely recovered.

Fig. 7 shows additional experimental results for the created noisy and infrared images. Slight blurring occurred in the leftmost side of the visible color image, as shown in the red box of Fig. 7(a). Naturally, the created noisy image accompanies the blurring. Moreover, the fine details on the wall have been removed due to noise, as shown in Fig. 7(b). However, the infrared image has no blurring in that region and captures the fine details on the wall. The naive fusion method provides the cleanest image, as shown in Fig. 7(d); however, the restored colors are unnatural due to the brightness discrepancy between the noisy and infrared images. The restoration quality of the denoised noisy image in Fig. 7(e) is poor due to the severe noise. BM3D can improve the restoration quality significantly, as shown in Fig. 7(f). However, the representation of the designs of the wood furniture is still poor. The use of the infrared image for guidance with the WLS model leads to improvements in the noise removal and edge representation, especially for the chairs and the wall, which can be checked by comparing Figs. 7(e) and (g). However, since the gradients of the infrared image are not combined with those of the noisy image in the WLS model, the blurred region remains. In contrast to the WLS model, the GDC model enables the fusion of the infrared image with the noisy image according to the gradient difference constraint, so that clear lines are recovered on the designs of the wood furniture and chairs. In addition, the blurred region is recovered, as shown in the leftmost side of Fig. 7(h). Similar restoration quality can be obtained with the GIF, as shown in Fig. 7(i). In contrast, more fine details on the wall are restored with the proposed method, as shown in Fig. 7(j), due to the layer-based approach. As shown in Figs. 7(k)-(m), the denoised infrared image can be combined with the denoised noisy image via the proposed image conversion method, thereby avoiding the brightness discrepancy problem and producing the fused base layer, as shown in Fig. 7(q). Also, the clean detail layer of Fig. 7(r) is obtained by applying residual sparse regularization to the designed detail layers, as shown in Figs. 7(n)-(p).

Figures 8 and 9 show the experimental results for the captured noisy and infrared images, respectively. By comparing the noisy images with the infrared images, as shown in Figs. 8(a) and (b) and Figs. 9(a) and (b), structure discrepancies are noted, especially for the brim of the bowl and the "Lipt" text. Therefore, the naive fusion method naturally produces unrealistic images, as shown in Figs. 8(c) and 9(c). BM3D generates visual artifacts, as shown in the green box of Fig. 8(e). Also, the visual quality is poor. Different from the synthetic noisy images of Figs. 6 and 7, the visual quality of the WLS method is better than that of BM3D, as shown in Figs. 8(f) and 9(f). The GIF cannot overcome the discrepancy problem, as shown in Figs. 8(h) and 9(h), especially for the brim of the bowl and "Lipt" text regions. The restoration qualities of the nonlocal means denoising, BM3D, WLS, and GIF methods are not better than those of the GDC and proposed methods. In particular, the details in the background of Figs. 8(d), (e), (f) and (h) and the text on the paper box of Figs. 9(d), (e), (f) and (h) are lost. In contrast, the GDC and proposed methods can recover
Fig. 7. Experimental results: (a) visible color image, (b) created noisy image, (c) gray infrared image, (d) restored image with naive fusion, (e) restored
image with nonlocal means denoising, (f) restored image with the BM3D method, (g) restored image with the WLS model, (h) restored image with the GDC
model, (i) restored image with the GIF, (j) restored image with the proposed method, (k) denoised infrared image, (l) new base layer of the infrared image,
(m) base layer of the noisy image, (n) detail layer of the noisy image, (o) detail layer of the infrared image, (p) detail layer of the newly created base layer,
(q) fused base layer, and (r) restored detail layer.
Fig. 8. Experimental results: (a) captured noisy image, (b) captured infrared image, (c) restored image with naive fusion, (d) restored image with nonlocal
means denoising, (e) restored image with the BM3D method, (f) restored image with the WLS model, (g) restored image with the GDC model, (h) restored
image with the GIF, and (i) restored image with the proposed method.
the edges. As shown in Fig. 8(g), the GDC model restores the edges of the flowers and branches on the bowl; however, the fine details on the background are lost. On the other hand, the edges of the flowers and branches that are restored with the proposed method are not better than those obtained with the GDC method; however, the fine details in the background are restored, as shown in Fig. 8(i). Also, the text visibility with the proposed method is better than that with the GDC method, as shown in Figs. 9(g) and (i). Since the GDC model can produce gradients of the restored image that are similar to those of the infrared image, the fine details in the background of Fig. 8(g) can be recovered by assigning a larger value to the regularization parameter, λ, as shown in Table I. However, the edge strength of the brim of the bowl will be lost due to the edge gradient discrepancies between the noisy and infrared images. This problem can be solved by the proposed layer-based approach. In other words, the detail layer of the infrared image can preserve the textures of
Fig. 9. Experimental results: (a) captured noisy image, (b) captured infrared image, (c) restored image with naive fusion, (d) restored image with nonlocal
means denoising, (e) restored image with the BM3D method, (f) restored image with the WLS model, (g) restored image with the GDC model, (h) restored
image with the GIF, and (i) restored image with the proposed method.
the background, whereas the detail layer of the newly created base layer can preserve thick edges. Moreover, the proposed method controls the edge strength of the detail layers by adjusting the gain parameters (γ1 and γ2) in (10). For the real captured images, the SM method in Table I was compared to the proposed method. We also observe that the visual qualities of the proposed method are comparable to those of the SM method (see the supplementary material at the website: https://sites.google.com/site/changhwan76son/).

The performance of the proposed method depends on the restoration quality of the base and detail layers. The restoration quality of the base layer depends on the used denoising method. Since BM3D is a state-of-the-art denoising method, it is expected that the restoration quality would be improved if the nonlocal means denoising method were replaced by BM3D for the base layer generation.

There may be a question whether the image-pair-based denoising methods, including the WLS, GDC, and proposed methods, are truly better than single-image-based denoising methods, especially BM3D. However, since the image-pair-based denoising methods use additional information (e.g., edge and texture) contained in the captured infrared images, their restoration quality can be better than that of the single-image-based denoising methods [1], [24], [32].

TABLE II
BIQE COMPARISON

In Table II, the scores of the blind image quality evaluation (BIQE) for different methods are provided for comparison. Original versions of the captured infrared and noisy visible color images cannot be defined, and thus BIQE was conducted. The recently introduced state-of-the-art BIQE method that does not require any human subjective scores for training, i.e., the opinion-unaware method, is used [33]. This method models the natural statistics of the local structures, contrast, multiscale decomposition, and colors, and then it measures the deviation of the distorted images from the reference statistics. It can be noticed that the naive fusion method has good BIQE scores. This is because the infrared images with little noise and blurring are directly used for the luminance channels of the fused images. In other words, the resulting images with the naive fusion contain little noise and blurring. For this reason, the naive fusion method can obtain lower BIQE scores. However, the BIQE method cannot find the discrepancies in the brightness and image structures between the infrared and visible color images. It can reflect only the statistics of natural colors. Until now, no BIQE methods have focused on measuring these kinds of discrepancies. Table II shows that the proposed method obtains the smallest average BIQE score, which indicates the best visual quality among those methods. For Fig. 6, the GIF obtains the best BIQE score except for the naive fusion method. However, the BIQE score of the GIF is almost the same as that of the proposed method. Actually, the detail representation of the proposed method is better than that of the GIF. Moreover, some information around the saturation regions is lost with the GIF, as shown in Fig. 6(s). Due to some noise in the flat region, the BIQE score of the proposed method becomes slightly higher than that of the GIF. However, from this result, it can be concluded that the GIF can be a good solution only if the discrepancy between input image pairs is small. For other real test images with significant discrepancies, the BIQE scores of the GIF are relatively high, indicating that the GIF performance is inferior to the performances of the GDC and the proposed methods. This shows that the GIF is closer to edge-preserving smoothing filters and it is not designed to overcome the discrepancy problem. Additional computation of the BIQE scores of the SM method [9] indicates that the BIQE performance of the SM method is not better than those of the GDC and the proposed methods (see supplementary material). However, the visual quality of the SM method is
Fig. 10. Experimental results for the 'Text' image: (a) original image, (b) created noisy image, (c) created blurred image, (d) denoised image [24], (e) restored image with the ABIG model, (f) restored image with the RD model, and (g) restored image with the proposed method.
Fig. 11. Experimental results for the real captured image: (a) captured noisy image, (b) captured blurred image, (c) denoised image [24], (d) restored image
with the ABIG model, (e) restored image with the RD model, and (f) restored image with the proposed method.
comparable to those of the GDC and the proposed methods. The use of the scale map can solve the discrepancy problem, which leads to nicely recovered image structures.

B. Noisy and Blurred Image Pairs

In this subsection, the viability of applying the proposed layer-based approach to noisy and blurred image pairs is investigated. In particular, the proposed method is compared to the conventional RD method [3] in terms of ringing artifacts. Test images from [11] are used for visual comparison. Figures 10 and 11 show the experimental results for the synthesized and real captured image pairs. The right side of Fig. 10(f) shows the ringing artifacts generated around the text. This is due to the single-image deconvolution, which is required to identify the unknown detail layer, as shown in Table I. For the deconvolution, the kernel estimated using a patch pair [17] is used. On the bottom left of the blurred image in Fig. 10(c), the original and estimated kernels are arranged from left to right. One way of reducing the ringing artifacts with the RD model is to assign a larger value to the regularization parameter; however, this will reduce the overall sharpness. In contrast, the ringing artifacts can be removed with the proposed method, as shown in Fig. 10(g), through the use of the detail layer of the noisy image, x_n, as shown in the first and last terms of (19). In other words, the noisy image, which is free of ringing artifacts, can be combined with the deconvolved version of the blurred image, thus reducing the ringing artifacts. Also, the residual-based sparse and nonlocal redundancy regularizations can effectively remove the noise in the detail layer of the noisy image. Similar results are shown in Figs. 11(e) and (f).

For further comparison, the restored images obtained using the RD-based and proposed methods are provided in Fig. 12. In the top row, ringing artifacts are observed, especially on the left boundary of the butterfly and the side of the car. However, the ringing artifacts are removed in the bottom row. Moreover, the text in the ‘Garage’ image is clearly recovered with the proposed method. For reference, the images restored with the nonlocal means denoising method and the ABIG-based method [11], [17] are shown in Figs. 10 and 11, respectively. The single-image deblurring method [34] and the MR-based method [15] are excluded in this paper: as discussed in [11], the MR-based method is more appropriate for HDR fusion, and single-image deblurring is much more prone to ringing artifacts. The ABIG model uses the denoised noisy image in the data fidelity term, as shown in Table I, and thus ringing artifacts can be reduced. However, since the denoised noisy image is mixed with the deconvolved version of the blurred image, it is not easy to recover the fine textures and details that have already been removed by denoising. For example, the grid texture patterns are degraded, as shown on the left side of Fig. 10(e). Also, the text sharpness is decreased, as shown in Fig. 11(d). The nonlocal means denoising method uses only the noisy image, so the color tones of the denoised image can differ from those of the blurred image, as shown in Figs. 11(b) and (c). In Fig. 12, the images obtained using the ABIG and the nonlocal means denoising methods are also provided.
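The fusion rule discussed above — take the low-frequency content from the deconvolved blurred image and the ringing-free high frequencies from the noisy image — can be sketched with a crude base/detail split. This is only a toy illustration under a 3 × 3 box filter, not the paper's model (which uses nonlocal means denoising, residual sparse coding, and the optimization in (19)); `box_blur` and `fuse_pair` are hypothetical names introduced here.

```python
import numpy as np

def box_blur(img):
    """3x3 box filter with circular boundaries; a crude base-layer extractor."""
    acc = np.zeros_like(img, dtype=np.float64)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            acc += np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    return acc / 9.0

def fuse_pair(deblurred, noisy):
    """Toy layer-based fusion: base layer from the deblurred image (reliable
    color tones; smoothing also damps residual ringing), detail layer from
    the noisy image (high frequencies free of ringing artifacts)."""
    base = box_blur(deblurred)               # base layer of the deblurred image
    detail = noisy - box_blur(noisy)         # detail layer of the noisy image
    return base + detail
```

In the full method, the detail layer of the noisy image is additionally denoised by the residual-based sparse and nonlocal redundancy priors rather than being passed through unchanged.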
SON AND ZHANG: LAYER-BASED APPROACH FOR IMAGE PAIR FUSION 2879
TABLE III
PSNR COMPARISON
The WLS and GDC models used the nonlocal means denoising method to provide x_n and x_b in Table I. Therefore, in Table IV, the computation times of the WLS and GDC methods include that of the nonlocal means denoising. Also, the computation times of the nonlocal means denoising and the WLS can vary depending on the size of the search window (e.g., 17 × 17) and the optimization technique (e.g., the steepest descent method), respectively. As shown in Table IV, the computational speed of the proposed method is relatively slow. This is mainly due to the residual sparse coding and the nonlocal means denoising, which take about 99% of the total computation time. In this paper, the sparse coding (i.e., matching pursuit) and nonlocal means denoising algorithms are required to solve (11) and to generate the base layers. The complexity of the sparse coding and nonlocal means denoising algorithms was already discussed in [24] and [35]. The other processes, for solving (3), (8), and (14), take a minor portion (about 1%) of the total computation time. It is known that the complexity of the fast Fourier transform used to solve (8) and (14) is O(N log N), and the complexity of the matrix inversion in (3) depends on the algorithm used, e.g., O(n^3) for Gauss–Jordan elimination. Here, N is the total number of pixels of the input image and n is the number of elements of the input matrix. In Table IV, the computation time for noisy and blurred image pairs is not provided. In the case of the RD model in Table I, the base layer is given via the nonlocal means denoising, and the detail layer is estimated using the GDC model. Therefore, the computation time of the RD model is almost the same as that of the GDC method in Table IV. For the proposed method, the computation time is increased because more iterations are required to estimate the original detail layer. The computation time of the ABIG model in Table I was provided in [17].

As mentioned in the Introduction, the contribution of this paper lies in the creation of the base and detail layers for image pair fusion. The main focus is not to improve the computational speed of the sparse coding and the nonlocal means denoising. Nevertheless, we note that there are a number of ways to reduce the computation time of the proposed layer-based method. For example, the first is to replace the residual sparse regularization in (10) with other regularizations. In [11], it is shown that using a spatially varying norm is effective for removing noise and preserving textures. This type of regularization can be used for the estimation of the unknown detail layer. Moreover, it can be implemented via the fast Fourier transform. Therefore, the computation time for estimating the unknown detail layer can be reduced. Note that in this paper, sparse coding and a learned dictionary are used to estimate the unknown detail layer.

VI. CONCLUSION

A new layer-based image pair fusion method was proposed. To solve the discrepancy problem between noisy and infrared images, a local contrast-preserving conversion method was developed, thereby enabling the appearance of the base layer of the infrared image to be close to that of the noisy image. In addition, a new way of designing three types of detail layers that contain different information, e.g., highlights, edges, or textures, was introduced. This detail layer design leads to improvement in the edge and texture representations. Also, the proposed residual-based sparse coding effectively suppresses the noise in the detail layer of the noisy image through a feedback loop. Furthermore, the experimental results showed that the proposed layer-based method can also be applied to noisy and blurred image pairs.

REFERENCES

[1] M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” IEEE Trans. Image Process., vol. 15, no. 12, pp. 3736–3745, Dec. 2006.
[2] S. Cho and S. Lee, “Fast motion deblurring,” ACM Trans. Graph., vol. 28, no. 5, Dec. 2009, Art. no. 145.
[3] L. Yuan, J. Sun, L. Quan, and H.-Y. Shum, “Image deblurring with blurred/noisy image pairs,” ACM Trans. Graph., vol. 26, no. 3, Jul. 2007, Art. no. 1.
[4] S. Zhuo, D. Guo, and T. Sim, “Robust flash deblurring,” in Proc. IEEE Comput. Vis. Pattern Recognit., Jun. 2010, pp. 2440–2447.
[5] S. Zhuo, X. Zhang, X. Miao, and T. Sim, “Enhancing low light images using near infrared flash images,” in Proc. Int. Conf. Image Process., Sep. 2010, pp. 2537–2540.
[6] Y.-W. Tai, J. Jia, and C.-K. Tang, “Local color transfer via probabilistic segmentation by expectation-maximization,” in Proc. IEEE Comput. Vis. Pattern Recognit., Jun. 2005, pp. 747–754.
[7] J. van de Weijer, T. Gevers, and A. Gijsenij, “Edge-based color constancy,” IEEE Trans. Image Process., vol. 16, no. 9, pp. 2207–2214, Sep. 2007.
[8] C. Fredembach and S. Süsstrunk, “Colouring the near infrared,” in Proc. 16th Color Imag. Conf., 2008, pp. 176–182.
[9] Q. Yan et al., “Cross-field joint image restoration via scale map,” in Proc. IEEE Int. Conf. Comput. Vis., Oct. 2013, pp. 1537–1544.
[10] R. Kakarala and R. Hebbalaguppe, “A method for fusing a pair of images in the JPEG domain,” J. Real-Time Image Process., vol. 9, no. 2, pp. 347–357, Jun. 2014.
[11] C.-H. Son, H. Choo, and H.-M. Park, “Image-pair-based deblurring with spatially varying norms and noisy image updating,” J. Vis. Commun. Image Represent., vol. 24, no. 8, pp. 1303–1315, Nov. 2013.
[12] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Trans. Signal Process., vol. 54, no. 11, pp. 4311–4322, Nov. 2006.
[13] M. Protter, M. Elad, H. Takeda, and P. Milanfar, “Generalizing the nonlocal-means to super-resolution reconstruction,” IEEE Trans. Image Process., vol. 18, no. 1, pp. 36–51, Jan. 2009.
[14] A. Rav-Acha and S. Peleg, “Two motion-blurred images are better than one,” Pattern Recogn. Lett., vol. 26, no. 3, pp. 311–317, Feb. 2005.
[15] M. Tico and K. Pulli, “Image enhancement method via blur and noisy image fusion,” in Proc. Int. Conf. Image Process., 2009, pp. 1521–1524.
[16] S.-H. Lee, H.-M. Park, and S.-Y. Hwang, “Motion deblurring using edge map with blurred/noisy image pairs,” Opt. Commun., vol. 285, no. 7, pp. 1777–1786, Apr. 2012.
[17] C.-H. Son and H.-M. Park, “A pair of noisy/blurry patches-based PSF estimation and channel-dependent deblurring,” IEEE Trans. Consum. Electron., vol. 57, no. 4, pp. 1791–1799, Nov. 2011.
[18] M. Tico, N. Gelfand, and K. Pulli, “Motion-blur-free exposure fusion,” in Proc. Int. Conf. Image Process., Sep. 2010, pp. 3321–3324.
[19] L. Schaul, C. Fredembach, and S. Süsstrunk, “Color image dehazing using the near-infrared,” in Proc. Int. Conf. Image Process., 2009, pp. 1629–1632.
[20] W. Li, J. Zhang, and Q.-H. Dai, “Robust blind motion deblurring using near-infrared flash image,” J. Vis. Commun. Image Represent., vol. 24, no. 8, pp. 1394–1413, Nov. 2013.
[21] K. He, J. Sun, and X. Tang, “Guided image filtering,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 6, pp. 1397–1409, Jun. 2013.
[22] G. Petschnigg, R. Szeliski, M. Agrawala, M. Cohen, H. Hoppe, and K. Toyama, “Digital photography with flash and no-flash image pairs,” ACM Trans. Graph., vol. 23, no. 3, pp. 664–672, Aug. 2004.
[23] S. Li, X. Kang, and J. Hu, “Image fusion with guided filtering,” IEEE Trans. Image Process., vol. 22, no. 7, pp. 2864–2875, Jul. 2013.
[24] A. Buades, B. Coll, and J.-M. Morel, “A non-local algorithm for image denoising,” in Proc. IEEE Comput. Vis. Pattern Recognit., Jun. 2005, pp. 60–65.
[25] I.-S. Jang, K.-H. Park, and Y.-H. Ha, “Color correction by estimation of dominant chromaticity in multi-scaled retinex,” J. Imag. Sci. Technol., vol. 53, no. 5, pp. 050502-1–050502-11, Sep./Oct. 2009.
[26] R. Gonzalez and R. E. Woods, Digital Image Processing. Upper Saddle River, NJ, USA: Prentice-Hall, 2002.
[27] J. Yang, J. Wright, T. S. Huang, and Y. Ma, “Image super-resolution via sparse representation,” IEEE Trans. Image Process., vol. 19, no. 11, pp. 2861–2873, Nov. 2010.
[28] C.-H. Son and H. Choo, “Local learned dictionaries optimized to edge orientation for inverse halftoning,” IEEE Trans. Image Process., vol. 23, no. 6, pp. 2542–2556, Jun. 2014.
[29] T. Blumensath and M. E. Davies, “Gradient pursuits,” IEEE Trans. Signal Process., vol. 56, no. 6, pp. 2370–2382, Jun. 2008.
[30] A. De Stefano, P. R. White, and W. B. Collis, “Training methods for image noise level estimation on wavelet components,” EURASIP J. Appl. Signal Process., vol. 2004, no. 16, pp. 2400–2407, 2004.
[31] A. K. Moorthy and A. C. Bovik, “Blind image quality assessment: From natural scene statistics to perceptual quality,” IEEE Trans. Image Process., vol. 20, no. 12, pp. 3350–3364, Dec. 2011.
[32] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising by sparse 3-D transform-domain collaborative filtering,” IEEE Trans. Image Process., vol. 16, no. 8, pp. 2080–2095, Aug. 2007.
[33] L. Zhang, L. Zhang, and A. C. Bovik, “A feature-enriched completely blind image quality evaluator,” IEEE Trans. Image Process., vol. 24, no. 8, pp. 2579–2591, Aug. 2015.
[34] L. Xu and J. Jia, “Two-phase kernel estimation for robust motion deblurring,” in Proc. 11th Eur. Conf. Comput. Vis., 2010, pp. 157–170.
[35] T. Peel, V. Emiya, L. Ralaivola, and S. Anthoine, “Matching pursuit with stochastic selection,” in Proc. 20th Eur. Signal Process. Conf., 2012, pp. 879–883.
[36] J. Wang, Y. Guo, Y. Ying, Y. Liu, and Q. Peng, “Fast non-local algorithm for image denoising,” in Proc. Int. Conf. Image Process., 2006, pp. 1429–1432.

Chang-Hwan Son received the B.S., M.S., and Ph.D. degrees from Kyungpook National University, Daegu, South Korea, in 2002, 2004, and 2008, respectively, all in electronics engineering. He was a National Scholarship Student in 2006. From 2008 to 2009, he was with the Advanced Research and Development Group 2, Digital Printing Division, Samsung Electronics Company, Ltd., Suwon, South Korea. As a Senior Research Engineer, he was involved in the development of color reproduction and image processing algorithms in digital printers and copiers. From 2010 to 2012, he was with the Department of Electronic Engineering, Sogang University, Seoul, South Korea. He is currently a Research Professor with the College of Information and Communication Engineering, Sungkyunkwan University, Suwon. His main research interests include color imaging, image restoration, sparse representation, printing, hardcopy watermarking, and learning representation. He was a Reviewer of the IEEE TRANSACTIONS ON IMAGE PROCESSING, the IEEE TRANSACTIONS ON MULTIMEDIA, the IEEE SIGNAL PROCESSING LETTERS, and the Journal of Imaging Science and Technology, and served as an Associate Editor of the ACM International Conference on Ubiquitous Information Management and Communication.

Xiao-Ping Zhang (M’97–SM’02) received the B.S. and Ph.D. degrees in electronic engineering from Tsinghua University, in 1992 and 1996, respectively, and the M.B.A. (Hons.) degree in finance, economics, and entrepreneurship from the University of Chicago Booth School of Business, Chicago, IL. Since Fall 2000, he has been with the Department of Electrical and Computer Engineering, Ryerson University, where he is currently a Professor and the Director of the Communication and Signal Processing Applications Laboratory. He has served as Program Director of Graduate Studies. He is cross-appointed to the Finance Department at the Ted Rogers School of Management, Ryerson University. His research interests include image and multimedia content analysis, statistical signal processing, sensor networks and electronic systems, computational intelligence, and applications in big data, finance, and marketing. He is a Frequent Consultant for biotech companies and investment firms. He is the Co-Founder and CEO of EidoSearch, an Ontario-based company offering a content-based search and analysis engine for financial big data.

Dr. Zhang is a Registered Professional Engineer in Ontario, Canada, and a member of the Beta Gamma Sigma Honor Society. He is the General Co-Chair for ICASSP2021 and the General Chair for MMSP2015. He is the Publicity Chair for ICME’06 and the Program Chair for ICIC’05 and ICIC’10. He served as a Guest Editor for Multimedia Tools and Applications, and the International Journal of Semantic Computing. He is a Tutorial Speaker in ACMMM2011, ISCAS2013, ICIP2013, and ICASSP2014. He is an Associate Editor of the IEEE TRANSACTIONS ON SIGNAL PROCESSING, the IEEE TRANSACTIONS ON IMAGE PROCESSING, the IEEE TRANSACTIONS ON MULTIMEDIA, the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, the IEEE SIGNAL PROCESSING LETTERS, and the Journal of Multimedia.