
Comparative Image Fusion Analysis

Firooz Sadjadi
Lockheed Martin Corporation
firooz.sadjadi@ieee.org

Abstract

Image fusion is, and will remain, an integral part of many existing and future surveillance systems. However, little systematic effort has so far been devoted to studying the relative merits of the various fusion techniques and their effectiveness on real multi-sensor imagery. In this paper we provide a method for evaluating the performance of image fusion algorithms. We define a set of measures of effectiveness for comparative performance analysis and then apply them to the outputs of a number of fusion algorithms that have been run on a set of real passive infrared (IR) and visible band imagery.

1. Introduction

Image fusion is the process of combining images, obtained by sensors of different wavelengths simultaneously viewing the same scene, into a composite image. The composite image is formed to improve image content, to make it easier for the user to detect, recognize, and identify targets, and to increase situational awareness. Research activities are mainly in the area of developing fusion algorithms that improve the information content of the composite imagery and make the system robust to variations in the scene, such as dust or smoke, and in environmental conditions, such as day or night [1-31]. This paper is structured in the following way: Section 2 provides details on several fusion algorithms, including the pyramid-based algorithms that account for a large number of image fusion techniques, biologically inspired fusion approaches, and the total probability of error technique. Section 3 defines a set of image fusion measures of effectiveness. Section 4 provides a comparative performance evaluation of the fusion techniques and the experimental fusion results, using real passive and active infrared and visible band imagery, for the selected approaches. Finally, Section 5 summarizes the paper and its main conclusions.

2. Image Fusion Algorithms

Image Pyramid Approaches- An image pyramid consists of a set of lowpass or bandpass copies of an image, each copy representing pattern information at a different scale [4-6]. Typically, every level of an image pyramid is a factor of two smaller than its predecessor, and the higher levels concentrate on the lower spatial frequencies. An image pyramid contains all the information needed to reconstruct the original image. The Gaussian pyramid is a sequence of images in which each member of the sequence is a lowpass filtered version of its predecessor [9]. The Laplacian pyramid of an image is a set of bandpass images, each a bandpass filtered copy of its predecessor. The bandpass copies can be obtained by calculating the difference between lowpass images at successive levels of a Gaussian pyramid [5]. The Ratio of Low Pass pyramid is another pyramid in which every level is the ratio of two successive levels of the Gaussian pyramid [7]. The Contrast pyramid is similar to the Ratio of Low Pass pyramid. Contrast itself is defined as the ratio of the difference between the luminance at a certain location in the image plane and the local background luminance to the local background luminance. Luminance is the quantitative measure of brightness: the amount of visible light energy leaving a point on a surface in a given direction. The Filter-Subtract-Decimate (FSD) pyramid is a more computationally efficient variation of the Gaussian pyramid [27].
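To make the pyramid constructions above concrete, the sketch below builds Gaussian and Laplacian pyramids. This is a minimal illustration in NumPy/SciPy, not the paper's implementation: the Gaussian smoothing width, the nearest-neighbor expansion, and the function names are our own assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_pyramid(image, levels=4):
    """Each level is a lowpass-filtered copy of its predecessor,
    decimated by a factor of two."""
    pyramid = [image.astype(np.float64)]
    for _ in range(levels - 1):
        smoothed = gaussian_filter(pyramid[-1], sigma=1.0)
        pyramid.append(smoothed[::2, ::2])  # keep every second row and column
    return pyramid

def laplacian_pyramid(image, levels=4):
    """Bandpass levels: the difference between successive Gaussian levels,
    with the coarser level expanded back to the finer grid."""
    gauss = gaussian_pyramid(image, levels)
    bands = []
    for fine, coarse in zip(gauss[:-1], gauss[1:]):
        expanded = np.kron(coarse, np.ones((2, 2)))      # 2x upsample
        expanded = gaussian_filter(expanded, sigma=1.0)  # smooth the blockiness
        bands.append(fine - expanded[:fine.shape[0], :fine.shape[1]])
    bands.append(gauss[-1])  # the coarsest lowpass residual closes the pyramid
    return bands
```

A ratio-of-lowpass level would instead divide the fine level by the expanded coarse level (plus a small constant), in line with the ratio pyramid definition above.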

Morphological Pyramid- The multi-resolution techniques introduced by Burt, Adelson, and others typically use lowpass or bandpass filters as part of the process. These filtering operations usually alter the details of shape and the exact location of the objects in the image. This problem has been addressed by using morphological filters, which remove image details without these adverse effects [24]. Morphological filters, introduced by Serra, are composed of a number of elementary transformations: the closing and opening transformations. The opening operator can be expressed as a composition of two other operators, erosion followed by dilation, both by the same input structuring element. The main mechanism underlying the erosion operator is the local comparison of the image with a shape called the structuring element: a matrix, consisting of only 0's and 1's and of arbitrary shape and size, that defines a neighborhood for morphological operations such as dilation and erosion.

Gradient Pyramid- The gradient pyramid of an image is obtained by applying gradient operators to every level of its Gaussian pyramid G. The gradient operators are applied in the horizontal, vertical, and two diagonal directions.

Laplacian Pyramid Fusion- In this approach the Laplacian pyramids of each image component (IR and visible) are used. A strength measure decides which source contributes the pixels at each specific sample location; for example, one can use the local area sum as the measure of strength. Figs. 1 and 2 show the visible and IR images of a scene containing a truck and a helicopter hidden in smoke and dust. Fig. 3 shows the result of the Laplacian pyramid fusion of these visible and IR images. Both the truck and the helicopter can be observed only in the IR and the fused images.

Ratio of Low Pass Fusion- In this case the fusion rule is to select, at each pixel location (i, j) of pyramid level L, the pixel value with the largest deviation from unity from image source A or B. Figs. 4 and 5 show the results of the Ratio of Low Pass and the Contrast pyramid fusions, respectively, of the images shown in Figs. 1 and 2.

FSD Fusion- This is similar to Laplacian pyramid fusion, the difference being the use of the FSD pyramid instead of the Laplacian pyramid. Fig. 6 shows the result of the FSD pyramid fusion of the images shown in Figs. 1 and 2.

Gradient Pyramid Fusion- This method is again similar, with gradient pyramids used instead of Laplacian pyramids. Figs. 7 and 19 show the results of the gradient pyramid fusion of the images shown in Figs. 1 and 2 and in Figs. 16 and 17, respectively.
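Given the pyramid helpers sketched earlier, the Laplacian pyramid fusion rule described above reduces to a per-sample selection. Using the local area sum of absolute coefficients as the strength measure follows the example in the text; the 3x3 window size and the reconstruction details are illustrative assumptions.

```python
from scipy.ndimage import uniform_filter

def fuse_laplacian(img_a, img_b, levels=4):
    """Select, at each pyramid sample, the coefficient from the source
    with the larger local strength, then collapse the fused pyramid."""
    lap_a = laplacian_pyramid(img_a, levels)
    lap_b = laplacian_pyramid(img_b, levels)
    fused_bands = []
    for la, lb in zip(lap_a, lap_b):
        # local area average of |coefficient|, proportional to the local sum
        strength_a = uniform_filter(np.abs(la), size=3)
        strength_b = uniform_filter(np.abs(lb), size=3)
        fused_bands.append(np.where(strength_a >= strength_b, la, lb))
    # collapse: expand the coarsest level and add each bandpass level back
    fused = fused_bands[-1]
    for band in reversed(fused_bands[:-1]):
        fused = np.kron(fused, np.ones((2, 2)))
        fused = gaussian_filter(fused, sigma=1.0)[:band.shape[0], :band.shape[1]]
        fused = fused + band
    return fused
```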

Principal Component Fusion- This is a statistical method for transforming a multivariate data set with correlated variables into a data set with new, uncorrelated variables. A search is made for an orthogonal linear transformation of the original N-dimensional variables such that, in the new coordinate system, the new variables are uncorrelated and of a much smaller dimensionality M. In Principal Component Analysis (PCA) the sought-after transformation parameters are obtained by minimizing the covariance of the error introduced by neglecting N-M of the transformed components. Fig. 8 shows the result of the PCA fusion of the images shown in Figs. 1 and 2.

Morphological Pyramid Fusion- This method uses morphological pyramids instead of Laplacian or contrast pyramids. Fig. 9 shows the result of the morphological pyramid fusion of the images shown in Figs. 1 and 2.

Wavelet Based Methods- Wavelet methods are another way to decompose an image into localized, scale-specific signals [8, 9, 13, 24]. Wavelet transforms are linear, square-integrable transforms whose basis functions are called wavelets.

Discrete Wavelet Transform- In traditional wavelet based fusion, once the imagery is decomposed via the wavelet transform, a composite multi-scale representation is built by selecting the salient wavelet coefficients. The selection can be based on choosing the maximum of the absolute coefficient values or on an area-based maximum energy. The final stage is an inverse discrete wavelet transform of the composite wavelet representation. Fig. 10 shows the result of the DWT fusion of the images shown in Figs. 1 and 2.

Shift Invariant Discrete Wavelet Transform Fusion- Traditional discrete wavelet transform (DWT) fusion encounters a shortcoming when sequences of imagery are fused: the DWT is not shift invariant, and consequently fusion methods using it lead to unstable and flickering results. For image sequences the fusion process should not depend on the location of an object in the image, and the fusion output should be stable and consistent with the original input sequence. A number of approaches have been suggested to make the DWT shift invariant [21]. In [10] an efficient method for computing the wavelet transforms of all necessary circular shifts of the input images is derived, and the concept of a discrete wavelet frame (DWF) is introduced; at each stage of the DWF the input sequence is divided into two parts, the wavelet frame sequence and the scale frame sequence. Figs. 11 and 18 show the results of the SIDWT fusion of the images shown in Figs. 1 and 2 and in Figs. 16 and 17, respectively.
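The maximum-absolute-value selection rule for DWT fusion can be sketched with the PyWavelets package; the choice of wavelet ("db2") and the decomposition depth are illustrative assumptions, not those of the paper. For the shift-invariant variant, pywt.swt2 provides an undecimated transform in the same spirit as the SIDWT.

```python
import numpy as np
import pywt  # PyWavelets

def fuse_dwt(img_a, img_b, wavelet="db2", level=3):
    """Decompose both images, keep the larger-magnitude coefficient in each
    detail subband, average the coarse approximations, then invert."""
    coeffs_a = pywt.wavedec2(img_a.astype(float), wavelet, level=level)
    coeffs_b = pywt.wavedec2(img_b.astype(float), wavelet, level=level)
    fused = [0.5 * (coeffs_a[0] + coeffs_b[0])]  # coarse approximation
    for det_a, det_b in zip(coeffs_a[1:], coeffs_b[1:]):
        # det_* is a (horizontal, vertical, diagonal) tuple of subbands
        fused.append(tuple(np.where(np.abs(a) >= np.abs(b), a, b)
                           for a, b in zip(det_a, det_b)))
    return pywt.waverec2(fused, wavelet)
```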

Total Probability Density Fusion- When the sensor outputs are uncorrelated, each image can be represented by a conditional density function p(f(x, y) | \alpha_i), where \alpha_i denotes a particular imaging sensor. The effect of using all of the sensor outputs is then equivalent to using the total probability density function, obtained as

p(f(x, y)) = \sum_i p(f(x, y) \mid \alpha_i) \, p(\alpha_i)   (1)

The fused invariant expressions can then be extracted from this total probability function. Fig. 20 shows the result of applying this technique to the images shown in Figs. 16 and 17.
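As a numerical illustration of Eq. (1), the sketch below forms the total probability density of gray levels from per-sensor histograms. The equal sensor priors and the 8-bit, 256-bin histogram estimate of p(f | \alpha_i) are our own assumptions; the extraction of fused invariant expressions from this density is not reproduced here.

```python
import numpy as np

def total_probability_density(images, priors=None, bins=256):
    """Eq. (1): p(f) = sum_i p(f | alpha_i) p(alpha_i), with each conditional
    density estimated as a normalized gray-level histogram."""
    if priors is None:
        priors = [1.0 / len(images)] * len(images)  # equal sensor priors
    total = np.zeros(bins)
    for img, prior in zip(images, priors):
        hist, _ = np.histogram(img, bins=bins, range=(0, bins))
        total += prior * hist / hist.sum()  # p(f | sensor) * p(sensor)
    return total  # sums to one over the gray levels
```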

Biologically-Inspired Fusion- The bases of these approaches [2, 18, 19, 20, 22, 23] are biological models of color and visible/infrared vision. For humans and many other higher organisms, fusion and some higher-level processing take place in the retina, the visual cortex, and the brain. These approaches are based on two biological models. The first is based on color vision in primates; the second is based on the fusion of thermal infrared and visible imagery as observed in a number of neuron classes in the optic tectum of two groups of snakes, namely rattlesnakes and pythons. In the case of color vision, the model mimics both the structure and the function of the layers of the retina, from the rod and cone photodetectors through the single-opponent color ganglion cells. The images are contrast enhanced by spatial opponent processing by means of cone-horizontal-bipolar cell interactions. These interactions create both ON and OFF center-surround response channels. The signals are then enhanced within the retina. Further processing is performed, in the form of double-opponent color cells, within the visual cortex of primates and the retina of some fish. A technique developed from this model is used here for the fusion of infrared and visible imagery (Fig. 14). Fig. 12 shows the result of the biologically-inspired fusion approach of Fig. 14 applied to the images shown in Figs. 1 and 2.

3. Image Fusion Evaluation

The evaluation of the effectiveness of image fusion is not a trivial task. Visual search tasks have been used to evaluate the gain over single-sensor usage. Toet et al. [30] performed search-task experiments on close-range, early-morning scenarios and found the best results for the biologically-inspired color algorithm [2]. A study by Krebs et al. [11] found, for airborne scenarios, less favorable results for the fusion algorithms considered, including the biologically-inspired color algorithm. Recently, efforts have been published to generate objective image fusion metrics. Ulug and McCullough [14] construct an overall fusion rating based on information content, vision retention, and thermal retention. Their information content rating is based on the sum of the absolute differences per line divided by the sum of intensities. Vision retention is defined by the number of faint lights missed by the fusion algorithm. Thermal retention is related to the difference between the fusion result and the IR image. This fusion system is specially adapted to the combination of thermal IR and visual imagery in which most of the information is in the IR image. Xydeas and Petrovic [31] describe a method based on the preservation of perceptual edge information. Edge information before and after image fusion is described with a Sobel edge operator, and a sigmoid is used to model the relative loss of perceptual information. The results are compared to subjective test procedures, and the relative performances of the different image fusion methods show a similar trend for the objective and the subjective tests.

Measures of Effectiveness- In our study the set of measures of effectiveness (MOEs) was chosen to have the following properties: 1) correlate well with the contrast in the image, both locally and globally, 2) convey the information content of the image, and 3) measure the separation between a target region and its immediate surrounding background clutter. The following MOEs were used in our study:

Fechner-Weber contrast measure:

F_W = \log \frac{m_T}{m_B}   (2)

Target-to-background interference ratio (TBIR), computed with a double-gated window over the entire image:

\mathrm{TBIR} = \frac{|m_T - m_B|}{\sigma_B}   (3)

Target interference ratio (TIR):

\mathrm{TIR} = \frac{(m_T - m_B)^2}{\sigma_T \, \sigma_B}   (4)

Fisher distance:

\mathrm{Fisher} = \frac{(m_T - m_B)^2}{\sigma_T^2 + \sigma_B^2}   (5)

In the above metrics, m_T, m_B, \sigma_T, and \sigma_B refer to the mean value of the target, the mean value of the background, the standard deviation of the target, and the standard deviation of the background, respectively.

Image Entropy: This metric measures image complexity, where entropy is defined as

\mathrm{Entropy} = -\sum_p p \log(p)   (6)

and p is the estimated probability density function (the normalized pixel intensity histogram) of the selected image region.
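The five MOEs translate directly into code. The sketch below assumes 8-bit imagery and adds a small epsilon to guard the divisions and the logarithm; the TIR and Fisher forms are the reconstructions given in Eqs. (4) and (5).

```python
import numpy as np

def compute_moes(target, background, eps=1e-12):
    """Measures of effectiveness, Eqs. (2)-(6), for one target/background pair."""
    m_t, m_b = target.mean(), background.mean()
    s_t, s_b = target.std(), background.std()
    fw     = np.log((m_t + eps) / (m_b + eps))               # (2) Fechner-Weber
    tbir   = abs(m_t - m_b) / (s_b + eps)                    # (3) TBIR
    tir    = (m_t - m_b) ** 2 / (s_t * s_b + eps)            # (4) TIR
    fisher = (m_t - m_b) ** 2 / (s_t ** 2 + s_b ** 2 + eps)  # (5) Fisher
    hist, _ = np.histogram(target, bins=256, range=(0, 256))
    p = hist[hist > 0] / hist.sum()
    entropy = -np.sum(p * np.log(p))                         # (6) image entropy
    return {"FW": fw, "TBIR": tbir, "TIR": tir,
            "Fisher": fisher, "Entropy": entropy}
```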

For any MOE that requires a region of interest and its immediate background, a double-gated box was used. The inside box was the minimum bounding rectangle enclosing the object of interest; the outside box was twice as large as the inside box.

Image Fusion Evaluation Tool Box- To facilitate this analysis we have developed a Fusion Analysis Tool Box. Fig. 15 shows a screen capture of this toolbox. Using the toolbox, one can display multiple input images, select various fusion algorithms from a set of menus, select a set of parameters for each chosen algorithm, perform the fusion, and display the resulting outputs both as images and as plots of quantitative MOE values. The toolbox also allows the user to select and display a region of interest in the image and to display the MOE results for that region. This last capability provides the means for assessing the performance of the fusion algorithms both globally and locally. As can be seen in Fig. 15, a region of interest (in this case a battle tank hauler) can be selected on the imagery, and its fusion results are shown in the lower right section of the form. Next to this image is the display of the chosen MOE values for this region of interest, for all of the selected fusion algorithms.
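A sketch of the double-gated box just described: the inner box is the target's minimum bounding rectangle and the outer box is twice its size, with the background taken as the outer box minus the inner box. The border clipping and the bbox convention are our own assumptions; the result feeds the compute_moes sketch given earlier.

```python
import numpy as np

def double_gated_regions(image, bbox):
    """bbox = (row0, col0, row1, col1): minimum bounding rectangle of the
    target. Returns the target pixels and the surrounding background pixels."""
    r0, c0, r1, c1 = bbox
    h, w = r1 - r0, c1 - c0
    # outer box: twice the inner box, centered on it, clipped to the image
    R0, C0 = max(r0 - h // 2, 0), max(c0 - w // 2, 0)
    R1 = min(r1 + h // 2, image.shape[0])
    C1 = min(c1 + w // 2, image.shape[1])
    inner = np.zeros(image.shape, dtype=bool)
    inner[r0:r1, c0:c1] = True
    target = image[r0:r1, c0:c1]
    background = image[R0:R1, C0:C1][~inner[R0:R1, C0:C1]]
    return target, background

# Example (hypothetical coordinates):
# moes = compute_moes(*double_gated_regions(fused, (120, 200, 160, 260)))
```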

4. Experimental Results

Two sets of registered visible and infrared images of a scene containing several targets of interest (Figs. 1 and 2, and Figs. 16 and 17) were used in the evaluation experiment. The MOEs computed for one of the vehicles, located in the middle of the scene, are shown in this paper. The algorithms used in the evaluation are listed in Table 1.

For the first set, Fig. 13 shows a composite graph of the variations of all the MOEs for the various fusion methods. The results indicate that for the TIR MOE the Contrast pyramid gives the best performance, whereas by the TBIR the best performance is achieved by the Ratio pyramid. In the case of the Fechner-Weber MOE, SIDWT produces the best result. The Entropy results indicate a number of similarly good performance values, the highest being those of the Morphological and Laplacian pyramid approaches. The results also show that not all of the fusion methods lead to improvements in performance, and the number of methods yielding any improvement at all varies with the MOE used. For the case of TIR, only one method leads to performance improvements over the individual input imagery, and in the case of TBIR the methods used actually show a decrease in performance values. The Fechner-Weber MOE shows that 9 of the 11 fusion methods produce improved performance values; this number is 5 for Entropy and only 1 for Fisher. With the exception of Entropy, which decreases in going from the visible to the IR band, all metrics increase. TIR, TBIR, and the Fisher distance show similar trends across the different fusion methods, with a few exceptions for the Ratio pyramid and Contrast pyramid approaches.

For the second data set, whose composite metrics are shown in Fig. 21, the biologically based fusion method compared well with the other fusion techniques only when Entropy was used as the MOE. For this data set TIR, TBIR, and Fisher all decrease in going from visible to IR, which is somewhat surprising; the cause may be the very low variance of the background regions. Fig. 21 also shows that the best results were obtained with the DWT fusion method for both the Entropy and Fechner-Weber metrics. PCA has the best Fisher metric, but this appears to be due mainly to the small variance of the target regions.

Table 1. Fusion methods used in the evaluation:
Total Probability Density Function (TPE)
Principal Component Analysis (PCA)
Laplacian Pyramid
Filter-Subtract-Decimate Hierarchical Pyramid (FSD)
Ratio Pyramid
Gradient Pyramid
Discrete Wavelet Transform (DWT)
Shift Invariant Discrete Wavelet Transform (SIDWT)
Contrast Pyramid
Morphological Pyramid
Biologically-inspired


5. Summary
In this paper we have presented the results of a study providing a quantitative comparative analysis of a typical set of image fusion algorithms. The results were based on the application of these algorithms to two sets of co-located visible (electro-optic) and infrared (IR) images. The quantitative comparison of their performance used five different measures of effectiveness, based on measuring information content and/or contrast. The results of this study indicate that the comparative merit of each fusion method depends strongly on the measure of effectiveness being used. Moreover, many of the fusion methods produced results with lower measures of effectiveness than their input imagery. The highest relative MOE values were associated with the Fechner-Weber and Entropy measures in both sets. The Fisher metric showed large values mainly due to the low pixel variances in the target and background areas.

Acknowledgements

The author gratefully acknowledges Defense Research Establishment Valcartier for supplying the Ms01 sequence used in this study.

6. References

[1] Ryan, D., and Tinkler, R., "Night Pilotage Assessment with Applications to Image Fusion," SPIE Vol. 2465, 1995.
[2] Waxman, A. M., et al., "Solid-State Color Night Vision: Fusion of Low-Light Visible and Thermal Infrared Imagery," MIT Lincoln Laboratory Journal, Vol. 11, No. 1, 1999.
[3] Toet, A., van Ruyven, L. J., and Valeton, J. M., "Merging thermal and visual images by a contrast pyramid," Optical Engineering, Vol. 28, 1989.
[4] Burt, P., "A gradient pyramid basis for pattern selective image fusion," Society for Information Display (SID) International Symposium Digest of Technical Papers, Vol. 23, pp. 467-470, 1992.
[5] Burt, P., and Adelson, E., "The Laplacian pyramid as a compact image code," IEEE Transactions on Communications, Vol. 31, No. 4, 1983.
[6] Burt, P., "The pyramid as a structure for efficient computation," in Multi-resolution Image Processing and Analysis, A. Rosenfeld, Ed., Springer-Verlag, New York, 1983.
[7] Toet, A., "Image fusion by a ratio of low-pass pyramid," Pattern Recognition Letters, Vol. 9, pp. 245-253, 1989.
[8] Mallat, S., "Wavelets for a Vision," Proceedings of the IEEE, Vol. 84, pp. 604-614, 1996.
[9] Olkkonen, H., and Pesola, P., "Gaussian Pyramid Wavelet Transform for Multiresolution Analysis of Images," Graphical Models and Image Processing, Vol. 58, pp. 394-398, 1996.
[10] Sévigny, L., "Multisensor Image Fusion for Detection of Targets in the Battlefield of the Future," NATO AC/243, Panel 3, RSG.9 38th Meeting Progress Report, Defense Research Establishment Valcartier, Canada, 1996.
[11] Krebs, W. K., Scribner, D. A., Miller, G. M., Ogawa, J. S., and Schuler, J., "Beyond Third Generation: A sensor fusion targeting FLIR pod for the F/A-18," Sensor Fusion: Architectures, Algorithms, and Applications II, SPIE Vol. 3376, pp. 129-140, 1998.
[12] Toet, A., "Hierarchical Image Fusion," Machine Vision and Applications, Vol. 3, pp. 1-11, 1990.
[13] Lejeune, C., "Wavelet transforms for infrared applications," Infrared Technology XXI, SPIE Vol. 2552, pp. 313-324, 1995.
[14] Ulug, M. E., and McCullough, C. L., "A quantitative metric for comparison of night vision fusion algorithms," Sensor Fusion: Architectures, Algorithms, and Applications IV, SPIE Vol. 4051, pp. 80-88, 2000.
[15] Yocky, D. A., "Image merging and data fusion by means of the two-dimensional wavelet transform," Journal of the Optical Society of America A, Vol. 12, No. 9, 1995.
[16] Nunez, J., et al., "Multiresolution-based image fusion with additive wavelet decomposition," IEEE Transactions on Geoscience and Remote Sensing, Vol. 37, No. 3, 1999.
[17] Li, H., Manjunath, B. S., and Mitra, S. K., "Multisensor image fusion using the wavelet transform," Graphical Models and Image Processing, Vol. 57, No. 3, 1995.
[18] Schiller, P. H., "The ON and OFF channels of the visual system," Trends in Neurosciences, Vol. 15, No. 3, pp. 86-92, 1992.
[19] Schiller, P. H., and Logothetis, N. K., "The color-opponent and broad-band channels of the primate visual system," Trends in Neurosciences, Vol. 13, No. 10, pp. 392-398, 1990.
[20] Waxman, A. M., et al., "Neural processing of targets in Visible, Multispectral IR and SAR Imagery," Neural Networks, Vol. 8, No. 7/8, pp. 1029-1051, 1995.
[21] Rockinger, O., "Image Sequence Fusion Using a Shift Invariant Wavelet Transform," Proceedings of the International Conference on Image Processing, 1997.
[22] Newman, E. A., and Hartline, P. H., "Integration of visual and infrared information in bimodal neurons of the rattlesnake optic tectum," Science, Vol. 213, pp. 789-791, 1981.
[23] Newman, E. A., and Hartline, P. H., "The infrared vision of snakes," Scientific American, Vol. 246, pp. 116-127, March 1982.
[24] Ramac, L. C., Uner, M. K., and Varshney, P. K., "Morphological filters and wavelet based image fusion for concealed weapon detection," Proceedings of SPIE, Vol. 3376, 1998.
[25] Li, S. T., and Wang, Y. N., "Multisensor image fusion using discrete multiwavelet transform," Proceedings of the 3rd International Conference on Visual Computing, Mexico City, Mexico, 2000.
[26] Clark, G. A., "Detection of buried objects by fusing dual-band infrared images," Proceedings of the Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, California, 1993.
[27] Anderson, H., "A filter-subtract-decimate hierarchical pyramid signal analyzing and synthesizing technique," U.S. Patent 4,718,104, 1987.
[28] Wang, Q., Shen, Y., Zhang, Y., and Zhang, J. Q., "A Quantitative Method for Evaluating the Performance of Hyperspectral Image Fusion," IEEE Transactions on Instrumentation and Measurement, Vol. 52, No. 4, pp. 1041-1047, August 2003.
[29] Ramesh, C., and Ranjith, T., "Fusion Performance Measures and a Lifting Wavelet Transform based algorithm for Image Fusion," Proceedings of the International Conference on Information Fusion, pp. 317-320, July 2002.
[30] Toet, A., et al., "Fusion of visible and thermal imagery improves situational awareness," SPIE Vol. 3088, 1997.
[31] Xydeas, C., and Petrovic, V., "Objective Pixel-level Image Fusion Performance Measure," Sensor Fusion: Architectures, Algorithms, and Applications IV, SPIE Vol. 4051, pp. 89-98, 2000.

Fig 1. A visible band image of a scene

Fig 2. An infrared band image of the scene shown in Fig 1

Fig 3. Fusion using Laplacian Pyramid

Fig 4. Fusion using Ratio Pyramid

Fig 5. Fusion using Contrast Pyramid

Fig 6. Fusion using Filter-Subtract-Decimate (FSD) technique

Fig 10. Fusion using Discrete Wavelet Transform

Fig 7. Fusion using Gradient Pyramid

Fig 11. Fusion using Shift Invariant Wavelet Transform

Fig 8. Fusion using Principal Component Analysis

Fig 12. Fusion using a Biologically-inspired approach

Fig 9. Fusion using Morphological Pyramids

Fig 13. Variation of MOEs with Different Fusion Algorithms for the Truck Scene

[Schematic flowchart: registered, noise-cleaned low-light visible and thermal infrared imagery; distortion correction; contrast enhancement by adaptive normalization; ON and OFF infrared channels; single-opponent color contrast (warm red, cool blue); hue remap, desaturation, and image select through a color remap table; RGB/HSV and HSV/RGB conversion of the fused image to a color display.]

Fig 14. A Schematic of a Model for Biologically-inspired Image Fusion

Fig 18. Fusion of Figs 16 and 17 using Discrete Wavelet Transform

Fig 15. Image Fusion Evaluation Tool Box

Fig 19. Fusion of Figs 16 and 17 using Laplacian Pyramid

Fig 16. Infrared Image of a Scene Showing a Man

Fig 20. Fusion of Figs 16 and 17 using Total Probability Density Technique

Fig 17. Visible-band Image of the Scene Shown in Fig 16

Fig 21. Variations of MOEs with Different Fusion Algorithms for the Man Scene
