Cloud Detection in Landsat Imagery of Ice Sheets Using Shadow Matching Technique and Automatic Normalized Difference Snow Index Threshold Value Decision
This work presents a new algorithm designed to detect clouds in satellite imagery of ice sheets. The approach identifies possible cloud pixels through the use of the normalized difference snow index (NDSI). Possible cloud pixels are grown into regions and edges are determined. Possible cloud edges are then matched with possible cloud shadow regions.
Hyeungu Choi a,b,*, Robert Bindschadler b

a Science Applications International Corporation (SAIC), San Diego, CA, USA
b Oceans and Ice Branch (Code 971), NASA Goddard Space Flight Center, Greenbelt, MD 20771, USA

Received in revised form 24 March 2004; accepted 30 March 2004

Abstract

This work presents a new algorithm designed to detect clouds in satellite visible and infrared (IR) imagery of ice sheets. The approach identifies possible cloud pixels through the use of the normalized difference snow index (NDSI). Possible cloud pixels are grown into regions and edges are determined. Possible cloud edges are then matched with possible cloud shadow regions using knowledge of the solar illumination azimuth. A scoring index quantifies the quality of each match, resulting in a classified image. The best value of the NDSI threshold is shown to vary significantly, forcing the algorithm to be iterated through many threshold values. Computational efficiency is achieved by using sub-sampled images with only minor degradation in cloud-detection performance. The algorithm detects all clouds in each of eight test Landsat-7 images and makes no incorrect cloud classifications.

© 2004 Elsevier Inc. All rights reserved.

Keywords: Landsat; ETM+; Clouds; Shadow; Classification; Ice sheet; NDSI; Automatic cloud cover assessment (ACCA)

1. Introduction

Automated procedures for detecting clouds have multiple uses. A major application is to assist in searches of optical imagery archives. Cloudier images can usually be ignored in favor of less cloudy images, unless the target is small or the date is an essential search parameter. Accurate cloud assessment also serves a critical role in the scheduling of high-resolution imagers such as the Enhanced Thematic Mapper Plus (ETM+) on Landsat-7 (Arvidson et al., 2001).
Cloud cover of ETM+ images is used to determine whether a desired image collection was successful; if not, the image request is returned to the imaging queue for reacquisition. An incorrect cloud assessment can lead to poor utilization of imaging resources and effort.

Over most of the Earth's surface, clouds can be detected by their high albedo in the visible spectrum and by their cold temperatures. However, both approaches have difficulty discriminating between clouds and ice sheets, because both targets are bright and because temperature inversions in the atmosphere above ice sheets are common, leaving the surface colder than the clouds. Cloud formations are usually distinct and mappable in ice sheet imagery, but their automatic classification as cloud rather than as a formation of the ice sheet is the crux of the difficulty.

The approach examined here utilizes the characteristic that clouds thick enough to mask the surface also cast shadows on the surface. Shadows are much darker than either the ice sheet surface or the clouds, and are easily identified. However, ice sheets do contain limited areas of mountains and bare rock that are also dark. Knowledge of the sun azimuth allows potential cloud features to be matched with potential cloud shadow features to better determine which features are actually clouds. A quantitative index of matching is used to optimize the algorithm, and multiple iterations are necessary to search the image for the set of cloud features that are best matched by the set of potential cloud shadows.

* Corresponding author. Oceans and Ice Branch, Science Applications International Corporation, Code 971, NASA Goddard Space Flight Center, Greenbelt, MD 20771, USA. Fax: +1-301-614-5644. E-mail addresses: choi@ice.gsfc.nasa.gov (H. Choi), bob@igloo.gsfc.nasa.gov (R. Bindschadler).
Remote Sensing of Environment 91 (2004) 237–242. www.elsevier.com/locate/rse
0034-4257/$ - see front matter © 2004 Elsevier Inc. All rights reserved. doi:10.1016/j.rse.2004.03.007

2. Data

Eight Level-1G Landsat-7 ETM+ images of the Antarctic or Greenland ice sheet were used for this research. The selection was made to provide a variety of cloud types and coverage amounts. The test images were converted into calibrated reflectance images using ENVI (RSINC) software. No atmospheric correction was applied. Sun elevation angle and azimuth angle were read from the metadata file.

3. The ACCA algorithm

The automatic cloud cover assessment (ACCA) algorithm (Irish, 2000) was developed for the Landsat processing system (LPS) and is the starting point for the approach discussed here. The LPS retrieves and processes the raw image data and generates Level-0R data with an associated cloud assessment. The ACCA algorithm embedded in LPS generates a cloud cover score for each quarter of each scene.

The ACCA algorithm is a two-pass processing scheme. Pass one applies eight separate filters, while pass two involves thermal channel analysis. The ACCA gives good results over most of the planet with the exception of ice sheets, because ACCA operates on the premise that clouds are colder than the land surface they cover. Only one of the eight pass-one filters is effective for ice sheet images: the normalized difference snow index (NDSI) (Hall et al., 1995). The NDSI was designed to distinguish snow from most other features. The other filters are designed for classifying highly reflective vegetation, rock, and sand. The NDSI filter is expressed as:

$$\mathrm{NDSI} = \frac{\text{band 2} - \text{band 5}}{\text{band 2} + \text{band 5}} \qquad (1)$$

Applying Eq. (1) to every pixel produces an image of NDSI values. A threshold applied to the NDSI image is used to separate cloud pixels from non-cloud pixels.
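As a minimal sketch of Eq. (1) and the thresholding step, the following Python fragment computes NDSI per pixel and builds a binary possible-cloud map. The function names, the nested-list image layout, and the handling of a zero denominator are illustrative assumptions, not part of the original processing chain. Treating values below the threshold as possible cloud is inferred from the text (snow has the higher NDSI), and the default of 0.7 is the ACCA value.

```python
def ndsi(band2, band5):
    """Eq. (1) for one pixel: (band 2 - band 5) / (band 2 + band 5)."""
    total = band2 + band5
    if total == 0:
        return 0.0  # assumption: a zero denominator is mapped to NDSI 0
    return (band2 - band5) / total


def possible_cloud_mask(band2_img, band5_img, threshold=0.7):
    """Binary map that is True where NDSI is below the threshold.

    Images are lists of rows of calibrated reflectance values
    (hypothetical layout chosen for this sketch).
    """
    return [
        [ndsi(b2, b5) < threshold for b2, b5 in zip(row2, row5)]
        for row2, row5 in zip(band2_img, band5_img)
    ]
```

A snow-like pixel (bright in band 2, dark in band 5) yields a high NDSI and is left out of the mask, while a pixel whose band 5 reflectance stays relatively high is flagged as possible cloud.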
The principle behind the NDSI filter is that while snow and cloud are both highly reflective in band 2 (0.52–0.6 μm), the reflectance of clouds in the near-infrared band 5 (1.55–1.75 μm) decreases less than the snow reflectance. Ice sheets are primarily covered in snow, so a snow versus cloud discriminator is expected to be effective. ACCA uses a fixed NDSI threshold value of 0.7. When different threshold values were tried, the set of pixels identified as cloud varied. The examination of the NDSI filter indicated that the best NDSI threshold value varied from one image to another, depending on factors such as sun elevation angle, atmospheric condition, and season.

Another factor reducing the effectiveness of the NDSI as an ice sheet versus cloud discriminator is that snow's near-infrared reflectance increases with snow grain size (Dozier, 1989). Ice sheets are generally covered by older, larger grained snow, a result of infrequent snowfall, wind that transports snow and breaks off a grain's delicate dendritic arms, and large temperature gradients that enhance snow metamorphosing into rounder shapes. The next two sections show how the optimal NDSI threshold for each image is decided.

4. Cloud detection using shadow matching (CDSM) algorithm

The basic concept of our cloud detection using shadow matching (CDSM) algorithm is to detect clouds by matching them with their corresponding shadows. Dark features are easily identified, and shadows comprise a subset of all dark features. Potential clouds are identified through the application of the NDSI, although the set of cloud candidates varies with the threshold used with the NDSI. Knowing the sun azimuth limits the searching necessary to match possible clouds with possible shadows. The CDSM algorithm uses bands 2 through 5 of the calibrated reflectance images. Fig. 1 shows the flow chart of the CDSM algorithm.
Preliminary steps, which are not shown here, are the removal of non-image edge pixels around the perimeter of the image and the detection of water pixels.

Fig. 1. Brief flow chart of CDSM algorithm.

Water pixels are much darker than cloud shadow pixels in the visible spectrum. Each pixel in bands 3 and 4 is compared to a water threshold set at 0.07. Pixels with values below this threshold are classified as water and are not considered further in the cloud detection scheme.

An NDSI threshold is set and, by application of the NDSI formula (Eq. (1)), potential cloud pixels are identified. The default threshold value is 0.7, but, as described below, this value is later varied to optimize the amount of cloud and shadow matching possible for any image. The result is a binary image with each pixel labeled either possible cloud or not-cloud. A morphological closing operator (Castleman, 1996) that removes small holes and narrow gaps is then applied to the binary map. This operation simplifies the shapes of possible clouds and reduces their number, which dramatically reduces the processing times of the remaining steps of the algorithm.

Pixels identified as possible cloud are isolated into regions with a region-labeling algorithm. A region is a set of possible cloud pixels within a neighborhood around the pixel under examination. This labeling operation (Pavlidis, 1982) also tags each potential cloud region with a unique identifier. The CDSM algorithm then tests whether each possible cloud region has a matching shadow. For this test, an edge detection procedure extracts the edges of the possible cloud regions. After a thinning operation, all edges are a single pixel wide.

Bands 3 and 4 are also used for shadow detection. Cloud shadow is brighter than water and darker than both cloud and snow. The brightness of cloud shadow varies depending on sun elevation angle and cloud thickness.
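As a rough sketch of the region labeling and edge extraction applied to the possible cloud map, the Python fragment below labels connected regions with a breadth-first flood fill and then collects the boundary pixels of each region. Several choices here are assumptions made for illustration only: 8-connectivity for the neighborhood, a simple boundary-pixel test in place of the edge detection and thinning steps named in the text, and the function names `label_regions` and `region_edges`.

```python
from collections import deque


def label_regions(mask):
    """Give each 8-connected region of True pixels a unique positive label."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or labels[r][c]:
                continue  # background pixel, or already part of a region
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels, next_label


def region_edges(labels):
    """Map each label to its boundary pixels.

    A pixel is a boundary pixel when one of its 4-neighbours lies outside
    the region or outside the image (a stand-in for edge detection plus
    thinning, not the procedure used in the paper).
    """
    rows, cols = len(labels), len(labels[0])
    edges = {}
    for r in range(rows):
        for c in range(cols):
            lab = labels[r][c]
            if not lab:
                continue
            for ny, nx in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                outside = not (0 <= ny < rows and 0 <= nx < cols)
                if outside or labels[ny][nx] != lab:
                    edges.setdefault(lab, []).append((r, c))
                    break
    return edges
```

The dictionary returned by `region_edges` gives, for each region identifier, the list of edge pixels that a shadow-matching step could later translate across the image.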
When the sun elevation angle is lower than 15 degrees, a maximum reflectance threshold of 0.6 is used. The maximum threshold is increased to 0.7 when the sun elevation angle exceeds 15 degrees. Minimum thresholds are 0.15 and 0.1 for band 3 and band 4, respectively. Pixels in the range between the minimum and maximum thresholds are classified as possible cloud shadow. It is recognized that there may be other classes within this range of brightness, such as bare rock or snow shadowed by steep mountains. The matching of possible clouds with possible cloud shadows is how the actual cloud shadow pixels are separated from the other classes of intermediate brightness.

Fig. 2. Landsat-7 ETM+ images (color composite image from bands 3, 4, and 5) and corresponding cloud mask results. The numbers on each image are Path/Row. Identified clouds (light gray), detected shadows (dark gray), detected water pixels (grid) and the rejected non-cloud pixels (black).

The matching procedure works with the sets of edges of possible clouds, the possible cloud shadows, and the water regions. Starting at any cloud edge, this edge is translated along the image in the direction of solar illumination, searching for cloud shadow. When the edge pixels of a cloud cluster meet shadow, water, or image edge pixels, the shadow, water, and image edge ratios (the number of cloud edge pixels meeting shadow, water, and image edge, respectively, divided by the total number of edge pixels of the cloud cluster) are recorded. For the case of the shadow ratio, the extreme situation is when a small cloud and its complete cloud shadow are identified. In this case, every edge pixel matches a cloud shadow pixel and the shadow ratio is 1. For larger clouds or low clouds with distinct shadows, the cloud could obscure a portion of the shadow and the shadow ratio would decrease toward 0.5.
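The shadow test and the edge translation can be sketched as follows in Python. This is an illustration under stated assumptions rather than the authors' implementation: the image is taken to be north-up with sun azimuth measured clockwise from north, the maximum threshold is applied to both bands, an elevation of exactly 15 degrees is treated as the higher-threshold case, each edge pixel stops at the first shadow, water, or image-edge hit, and the search length `max_steps` is a hypothetical parameter that the text does not specify.

```python
import math


def shadow_step(sun_azimuth_deg):
    """Unit (d_row, d_col) pointing away from the sun (north-up image assumed)."""
    az = math.radians(sun_azimuth_deg)
    return math.cos(az), -math.sin(az)


def is_possible_shadow(band3, band4, sun_elevation_deg):
    """Per-pixel shadow test with the reflectance limits quoted in the text."""
    upper = 0.6 if sun_elevation_deg < 15 else 0.7  # assumption at exactly 15
    return 0.15 <= band3 <= upper and 0.1 <= band4 <= upper


def match_ratios(edge_pixels, classes, sun_azimuth_deg, max_steps=50):
    """Shadow, water, and image-edge ratios for one cloud cluster.

    `classes` is a grid of strings such as "shadow", "water", "cloud",
    or "snow" (hypothetical labels for this sketch).
    """
    rows, cols = len(classes), len(classes[0])
    d_row, d_col = shadow_step(sun_azimuth_deg)
    hits = {"shadow": 0, "water": 0, "image_edge": 0}
    for r, c in edge_pixels:
        for k in range(1, max_steps + 1):
            y, x = round(r + k * d_row), round(c + k * d_col)
            if not (0 <= y < rows and 0 <= x < cols):
                hits["image_edge"] += 1  # walked off the image
                break
            if classes[y][x] in ("shadow", "water"):
                hits[classes[y][x]] += 1
                break
    total = len(edge_pixels) or 1  # avoid dividing by zero for empty input
    return {name: count / total for name, count in hits.items()}
```

With the sun due east (azimuth 90 degrees) the walk proceeds westward along a row, so a small cloud with shadow pixels to its west yields a shadow ratio of 1, the extreme case described above.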
If the water ratio is greater than 0.25 or if the image edge ratio is greater than 0.2, then the possible cloud cluster is classified as cloud without testing shadow matching. Otherwise, if the shadow ratio is greater than 0.2, the cluster is classified as cloud. These thresholds were determined empirically based on the test images available.

The output of the cloud detection algorithm is an image map classified into cloud, water, shadow, and the remaining possible cloud clusters that failed to be classified as cloud. Fig. 2 shows our test Landsat-7 images and the corresponding classified images resulting from the CDSM algorithm, showing identified clouds (light gray), detected shadows (dark gray), detected water pixels (grid), rejected non-cloud pixels (black), and snow-covered ice sheet (white).

5. Automatic NDSI threshold decision (ANTD) algorithm

As discussed earlier, the NDSI threshold value used in the CDSM algorithm cannot be fixed due to the variability of image conditions: specifically sun elevation angles, atmospheric conditions, and seasonal conditions. Significant errors occurred for any constant value of the NDSI threshold. From visual inspection of clouds in our eight test images and the performance with various values of the NDSI threshold, the proper NDSI values ranged from 0.56 to 0.79 (average = 0.675, standard deviation = 0.069).

We introduced an automatic NDSI threshold decision (ANTD) method to deal with this condition (Fig. 3). The ANTD method requires that the full CDSM algorithm be applied for a series of NDSI threshold values. For each iteration in the series, a single value of the NDSI threshold is used and the results of the CDSM algorithm are used to derive a cloud score, defined as:

$$\text{Cloud score} = S_1 \times S_{\mathrm{Ratio}} - 0.5 \times R \qquad (2)$$

where $S_1$ = the number of cloud edge pixels matching cloud shadow; $S_{\mathrm{Ratio}}$ = $S_1$/(total number of cloud edge pixels); and $R$ = the number of cloud edge pixels not matched by cloud shadow.
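Eq. (2) and the selection of the highest-scoring threshold can be illustrated with a short Python sketch. The function names and the dictionary of per-threshold counts are hypothetical, and the sketch assumes that the total number of cloud edge pixels equals the matched count plus the unmatched count, which the text implies but does not state outright.

```python
def cloud_score(matched, unmatched):
    """Eq. (2): S1 * S_Ratio - 0.5 * R.

    matched   -- S1, cloud edge pixels that meet cloud shadow
    unmatched -- R, cloud edge pixels not matched by cloud shadow
    Assumes total edge pixels = matched + unmatched.
    """
    total = matched + unmatched
    if total == 0:
        return 0.0  # assumption: no edge pixels gives a neutral score
    s_ratio = matched / total
    return matched * s_ratio - 0.5 * unmatched


def best_threshold(counts_by_threshold):
    """Return the NDSI threshold whose (matched, unmatched) counts score highest."""
    return max(
        counts_by_threshold,
        key=lambda t: cloud_score(*counts_by_threshold[t]),
    )
```

For example, 80 matched and 20 unmatched edge pixels give 80 × 0.8 − 0.5 × 20 = 54, whereas 50 matched and 150 unmatched give 50 × 0.25 − 0.5 × 150 = −62.5, so a threshold that lets cloud regions grow into unshadowed snow is penalized by both terms.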
The preferred value of the NDSI threshold occurs when the cloud score is a maximum. For the iterations of the ANTD, the NDSI threshold is set to an initial value of 0.6 (0.56 for the image with the lowest sun-elevation angle of 8°) and increased by 0.01 for each iteration. Fig. 4 shows that there is always a maximum cloud score for each image, but that the corresponding preferred value of the NDSI threshold varies from image to image. Fig. 4 also shows that the cloud scores decrease sharply for NDSI thresholds above the preferred value. Higher NDSI thresholds add incorrect pixels to the possible cloud regions. As a result, fewer possible cloud regions match with possible cloud shadow, lowering the value of the first term in Eq. (2). In addition, the increased number of cloud edge pixels that fail to match increases the value of the second term in Eq. (2), which also serves to lower the cloud score.

Eq. (2) is weighted to give preference to the larger cloud regions. Our bias was to ensure that the largest clouds had the greatest certainty of detection, because smaller clouds have a lesser impact on the utility of an image. This weighting, and the coefficient value of 0.5 for the second term, resulted from extensive evaluations, even though the number of images examined was limited to eight.

Fig. 3. Flow chart of ANTD algorithm.

To truncate the iteration process and save processor time, a test is included based on a shadow-cloud ratio (total number of shadow pixels/total number of cloud pixels). We found that if this ratio is less than 0.15 at the end of an iteration, subsequent iterations for other values of the NDSI threshold need not be completed. A small shadow-cloud ratio means that the cloud clusters have grown too much as a result of the NDSI threshold value being too high.

6. Performance

The performances of the CDSM and ANTD algorithms were evaluated by comparing their results to an independent classification of each image based on visual inspection. Even though a cloud may appear very similar to snow in the visible and near-infrared parts of the spectrum, a person can often use cloud shape and shadow to unambiguously distinguish the cloud from snow-covered ice. In all eight cases, the combination of the CDSM and ANTD algorithms found all clouds and made no incorrect cloud classifications. The detected cloud percentage and the optimal NDSI threshold values returned by the ANTD algorithm are shown in Table 1. We consider these results a great improvement over ACCA, which uses NDSI with a fixed threshold. Although our test data set was chosen randomly, from images already on hand for other studies, in half of the images the near-infrared reflectance of the snow-covered regions is so close to that of clouds that the NDSI often identified those regions as cloud. However, the CDSM algorithm correctly reclassified these regions because no matching shadows could be found.

The CDSM and ANTD algorithms attempt to automate some of the procedures a human employs in cloud identification, but the automated procedures necessarily involve many calculations. Each cloud cluster must be tested for shadow matching, and the ANTD algorithm involves an iteration scheme of over 20 CDSM processes. Nearest-neighbor, sub-sampled images were created to examine the effect of sub-sampling on the CDSM/ANTD processing times and on the accuracy of cloud detection. Results for identical CDSM/ANTD processing of the 2 × 2 sub-sampled and 4 × 4 sub-sampled images are given in Table 1 and illustrated in Fig. 5. The calculated cloud-detection results suffer only minor degradation, while the processing time decreases exponentially. The results also show that images with more cloud clusters take more time for the CDSM/ANTD procedure.
Situations are known to occur, such as ground fog, where clouds are at such low elevations that shadows are displaced too short a distance to be resolved in an image. Our approach will fail in such situations. Requiring a shadow to be a minimum of 2 pixels wide (60 meters for ETM+) and sun elevations to be a minimum of 10 degrees implies that only clouds with upper surfaces lower than about 10 meters (60 m × tan 10° ≈ 10.6 m) will be missed. We do not deem this restriction to severely limit the application of our approach.

Fig. 4. Normalized cloud score for each ANTD iteration step. Key gives path and row of each Landsat-7 ETM+ image.

Table 1
Cloud percentage detected by the CDSM algorithms with full, quarter, and 1/16 size images and optimal NDSI threshold values reported by the ANTD algorithm

Path, row   Month/day/year   Cloud% (full size)   Cloud% (1/4 size)   Cloud% (1/16 size)   NDSI threshold
227, 117    01/17/00         15.68                15.60               15.21                0.7
34, 119     02/26/00         1.05                 0.94                0.92                 0.56
12, 115     01/15/00         14.91                14.96               14.60                0.67
7, 121      12/27/99         14.47                14.52               14.24                0.79
229, 118    12/14/99         11.84                11.94               11.74                0.63
229, 119    12/14/99         5.92                 5.96                5.82                 0.71
53, 115     01/16/01         16.90                16.71               16.30                0.63
29, 117     12/21/99         7.47                 7.37                7.13                 0.71

7. Conclusion

Automated cloud detection in space-borne visible, near-infrared, and short-wave infrared imagery of ice sheets has proven to be a challenging problem for many years. We believe that the shadow detection and matching approach of our CDSM algorithm is a novel means that uses more information within the image (i.e., darker regions as possible cloud shadows) and about the image (i.e., metadata of solar azimuth) to provide an improved solution to this problem.

Operational adoption of the CDSM/ANTD approach is more likely given the much-reduced running times on sub-sampled images with little impact on cloud-detection performance.
There appear to be further computational savings possible with greater sub-sampling of the raw imagery; however, this aspect has not been fully explored in this paper.

Further reductions of computational requirements could be achieved if some other means is found to determine the appropriate NDSI threshold. Our data set was not robust enough to indicate the specific conditions that might independently determine this threshold; however, it is possible that the environmental history of a site is so important as to make independent methods untrustworthy.

Finally, an additional advantage of shadow matching is that it is easy to calculate the elevation of each cloud top. We have not included these calculations in our results, but they might prove useful for some scientific studies.

Fig. 5. ANTD algorithm processing times depending on sampled image size. Key gives path and row of each Landsat-7 ETM+ image.

References

Arvidson, T., Gasch, J., & Goward, S. N. (2001). Landsat-7's long-term acquisition plan - an innovative approach to building a global imagery archive. Remote Sensing of Environment, 78(1–2), 13–26.
Castleman, K. R. (1996). Digital image processing. Prentice Hall, NJ.
Dozier, J. (1989). Remote sensing of snow in visible and near-infrared wavelengths. In G. Asrar (Ed.), Theory and applications of optical remote sensing (pp. 527–547). New York: Wiley.
Hall, D. K., Riggs, G. A., & Salomonson, V. V. (1995). Development of methods for mapping global snow cover using moderate resolution imaging spectroradiometer data. Remote Sensing of Environment, 54, 127–140.
Irish, R. (2000). Landsat-7 automatic cloud cover assessment algorithms for multispectral, hyperspectral, and ultraspectral imagery. SPIE, 4049, 348–355.
Pavlidis, T. (1982). Algorithms for graphics and image processing. Computer Science Press, MD.
RSINC (Research Systems), www.rsinc.com.