
A HYBRID APPROACH FOR AUTOMATED DETECTION OF LUNG NODULES IN CT IMAGES

J. DEHMESHKI, X. YE, M. V. CASIQUE, XY. LIN
Medicsight PLC, 46 Berkeley Square, London, W1J 5AT, UK

Abstract
This paper presents a novel shape based Genetic Algorithm Template Matching (GATM) method for the automated detection of lung nodules. The GA process is employed as an optimisation method to search effectively for the location of nodule candidates within the lung area. To define the fitness function for GATM, a 3D geometric shape feature is calculated at each voxel and then combined with the global nodule intensity distribution. Lung nodule phantom images are used as reference images for template matching. The proposed method has been validated on 70 clinical thoracic CT scans containing 178 nodules identified as a gold standard. The proposed method detected 151 nodules, a detection rate of 85%, with approximately 14.0 False Positives (FP) per scan. This high detection performance provides a good basis for a Computer-Aided Detection (CAD) system for lung nodules.

2. METHOD
As is well known, a typical CAD system for lung nodule detection consists of three major phases. The first phase detects all potential nodules (objects). The second phase extracts important features of each object. In the third phase, the extracted features are fed into a classifier to reduce FP objects (normal tissues). The overall performance of a CAD system depends on the performance of each individual phase. Typically, the most challenging phase is the first, object detection. In this paper, we focus on the first phase, aiming to develop a method that detects most of the nodule candidates while passing only a few FP objects to the next phases.

Figure 1 provides a flow diagram outlining the key steps of the proposed approach. The lung area is first extracted using an adaptive thresholding method followed by a rolling ball algorithm [8]. Nodule candidate detection is then carried out within the segmented lung region. Rules based filtering is then used to remove easily dismissible FP such as vessel junctions. The main focus of this study, shape based GATM, is shown in boldface in the figure.

[Flow diagram: CT Lung Image -> Lung Extraction -> Shape-based GATM Process -> Rules-based Filtering for FP Reduction -> Nodule candidates]

Figure 1 Flow diagram of proposed nodule detection system

1.1. Shape based GA template matching

The GATM process is used as an optimisation method to determine the target positions of the nodule candidates within the lung area. Compared to linear searching, the advantage of the GA searching method lies in its stochastic optimisation characteristics, which simulate evolutionary processes such as natural selection and genetic modification. Three key issues in the proposed shape based GATM are: (a) how to define a fitness function that considers both the shape information and the nodule intensity distribution; (b) how to design the chromosome; and (c) how to create the template images.
In the following sections,

1. INTRODUCTION
Lung cancer is the most common cause of cancer death [1]. Early detection and treatment of lung cancer can significantly improve the long term health of those afflicted with it. Nodules can be missed due to low relative contrast, small size, or location within an area of complicated anatomy. Recently, researchers have developed a number of computer-aided lung nodule detection methods to help radiologists identify nodule candidates in CT images. The approaches can be divided into two groups: intensity based [2, 3] and model based detection methods [4, 5, 6]. Although much effort has been devoted to the Computer-Aided Detection (CAD) of lung nodules, lung CAD remains an ongoing research task and needs further improvement [7]. One of the major difficulties is detecting nodules that are adjacent to anatomical structures such as blood vessels or the chest wall, which have very similar X-ray attenuation and appearance in individual cross-sectional CT images, and detecting nodules that are non-spherical. To tackle this problem, a new hybrid approach has been developed, based on shape-based Genetic Algorithm Template Matching (GATM). 3D local shape information is combined with the global nodule intensity distribution in the fitness calculation of the GA process. Furthermore, a new definition of the chromosome is proposed which includes directional information. Lung nodule phantom images are used as references for template matching instead of the synthetic Gaussian templates suggested in [4]. The experimental results in section 3 show that the proposed method is robust to the templates and is also able to detect non-spherical nodules with local spherical elements. Details of the proposed method are described in the following sections.

0-7803-9577-8/06/$20.00 2006 IEEE


ISBI 2006

new definitions are proposed to obtain high performance of lung nodule detection.

1.1.1. Lung phantom images as references for template matching

In [4], Gaussian templates are used as reference images for template matching. This rests on the assumption that the CT value distribution of nodules can be approximated by a Gaussian image model, which might not hold in some cases because of the partial volume effect in CT imaging. Also, in our experience, this method is very sensitive to the parameters used for the models. In our work, QRM lung nodule phantom images are used as reference images. The QRM Lung Phantom (QRM, Moehrendorf, Germany) [9] is a standard synthetic device created to mimic the human lung for CT scanning. It includes spherical objects similar to lung nodules, with known dimensions, in various positions. The plastics used in this semi-anthropomorphic phantom mimic lung tissues with respect to density and attenuation characteristics. The QRM lung phantom was scanned at a 0 degree angle using a 16-slice MDCT (GE LightSpeed) scanner with a slice thickness of 1.25mm and a reconstruction pitch of 0.562. Eight nodule models with sizes ranging from 3mm to 20mm were created from the CT lung phantom image. Figure 2 shows the largest cross-section of each model. The advantage of using lung phantom nodule images as references is that they simulate real nodules better than the synthetic Gaussian models suggested in [4].

Incorporating directional information into the chromosome design improves the intensification property of the proposed GA method. In contrast to chromosomes whose genes represent only static locations, a chromosome with an embedded directional feature converges to the optimal solution more efficiently.
[Figure 3 panels, recovered from the figure labels]
(a) Synthetic 2D image with X and Y axes each running from 0 to 4.
(b) Direction coding: North 000, East 001, South 010, West 011, Back (next slice) 100, Front (previous slice) 101.
(c) Distance coding: 0 pixels 000, 2 pixels 001, 8 pixels 010, 12 pixels 011, 20 pixels 100, 26 pixels 101, 32 pixels 110, 40 pixels 111.
(d) Template coding: T1 000, T2 001, T3 010, T4 011, T5 100, T6 101, T7 110, T8 111.
(e) Example chromosome: Gene 1 (x coordinate) 011; Gene 2 (y coordinate) 011; Gene 3 (z coordinate) 000; Gene 4 (direction) 000; Gene 5 (distance) 001; Gene 6 (template) 010.

Figure 2 Eight lung nodule phantom images (ranging from 3mm to 20mm) as reference images (largest cross-section)

1.1.2. Definition of chromosome

Each chromosome consists of 6 genes in the form of bitstrings. The first three genes represent the geometric location (x, y, z) of one nodule candidate. The fourth gene indicates the moving direction of the nodule candidate in the search space. The fifth gene represents the moving distance along the direction specified by the fourth gene. The new nodule candidate location obtained from the movement specified by these genes is used for fitness evaluation. The last gene chooses one of the template images as the reference for template matching.

Figure 3 illustrates an example of a chromosome. A synthetic 2D image is given in Figure 3 (a), containing an object of 8 pixels shown in black. The chromosome shown in Figure 3 (e) represents the pixel (3,3) highlighted by a circle in Figure 3 (a): the first two genes, 011 and 011, represent the x and y location (in this example, the z coordinate is always 000 since the synthetic image is only 2D). Assuming there are 6 possible searching directions, 8 different moving distances and 8 templates, as depicted in Figure 3 (b), (c) and (d), respectively, the fourth gene is encoded as 000, which means the North direction is chosen, while the moving distance is 2 pixels, as gene 5 is encoded as 001. Consequently, the pixel highlighted with a triangle is selected as the candidate for fitness evaluation. In this example, the third template is selected, as the code for gene 6 is 010.
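The decoding described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the coding tables are those recovered from Figure 3, while the function name `decode`, the `Candidate` record, the `coord_bits` parameter, the axis convention attached to each direction, and the wrap-around for the two unused direction codes are all assumptions made here.

```python
from dataclasses import dataclass

# Coding tables as recovered from Figure 3 (b) and (c).
DIRECTIONS = ["North", "East", "South", "West", "Back", "Front"]  # codes 000..101
DISTANCES = [0, 2, 8, 12, 20, 26, 32, 40]                         # codes 000..111

# ASSUMPTION: axis convention per direction; the paper does not state it.
OFFSETS = {
    "North": (0, -1, 0), "South": (0, 1, 0),
    "East": (1, 0, 0), "West": (-1, 0, 0),
    "Back": (0, 0, 1), "Front": (0, 0, -1),  # next / previous slice
}


@dataclass
class Candidate:
    start: tuple      # (x, y, z) stored in genes 1 to 3
    direction: str    # gene 4
    distance: int     # gene 5, in pixels
    template: int     # gene 6, 0-based (2 means template T3)
    position: tuple   # location actually used for fitness evaluation


def decode(bits, coord_bits=3):
    """Decode a chromosome bitstring into a nodule candidate."""
    widths = [coord_bits] * 3 + [3, 3, 3]
    if len(bits) != sum(widths):
        raise ValueError("unexpected chromosome length")
    values, pos = [], 0
    for width in widths:
        values.append(int(bits[pos:pos + width], 2))
        pos += width
    x, y, z, dir_code, dist_code, template = values
    # ASSUMPTION: the unused codes 110 and 111 wrap around to valid directions.
    direction = DIRECTIONS[dir_code % len(DIRECTIONS)]
    step = DISTANCES[dist_code]
    dx, dy, dz = OFFSETS[direction]
    moved = (x + dx * step, y + dy * step, z + dz * step)
    return Candidate((x, y, z), direction, step, template, moved)


# The example chromosome of Figure 3 (e): 011 011 000 000 001 010
print(decode("011011000000001010"))
```

For a real CT volume the coordinate genes would need more than 3 bits each, which is why the width is left as a parameter in this sketch.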

Figure 3 (a) Synthetic image; (b) coding of gene for direction; (c) coding of gene for moving distance; (d) coding of template; (e) an example of chromosome.

1.1.3. Definition of fitness function

Because an isolated nodule or a nodule attached to a blood vessel either appears as a sphere or has some spherical elements, while a blood vessel appears oblong, a 3D geometric feature can be used to distinguish nodules from adjoining blood vessels. The volumetric shape index is a measure of local shape based on the two principal curvatures defined in Equation 1.
$$k_1(p) = H(p) + \sqrt{H^2(p) - K(p)}, \qquad k_2(p) = H(p) - \sqrt{H^2(p) - K(p)} \tag{1}$$

where $K(p)$ and $H(p)$ are the Gaussian and mean curvatures [10]. Based on these principal curvatures, the volumetric shape index $SI(p)$ for each voxel $p$ is defined as [11]:

$$SI(p) = \frac{1}{2} - \frac{1}{\pi} \arctan \frac{k_1(p) + k_2(p)}{k_1(p) - k_2(p)} \tag{2}$$
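As a quick numerical check of Equations 1 and 2, the following Python sketch computes the principal curvatures and the shape index from given mean and Gaussian curvatures. It is an illustration only: the function names are made up here, `atan2` is used in place of the plain arctangent so that the case $k_1 = k_2$ does not divide by zero, and the sign convention (a bright sphere-like cap has negative principal curvatures) is an assumption chosen so that the sphere and cylinder values quoted in the text are reproduced.

```python
import math


def principal_curvatures(H, K):
    """Equation 1: principal curvatures from mean (H) and Gaussian (K) curvature."""
    root = math.sqrt(max(H * H - K, 0.0))  # clamp tiny negatives caused by rounding
    return H + root, H - root              # k1 >= k2


def shape_index(k1, k2):
    """Equation 2, using atan2 so that k1 == k2 is handled without division."""
    if k1 == 0.0 and k2 == 0.0:
        return None  # plane: the shape index is undefined
    return 0.5 - math.atan2(k1 + k2, k1 - k2) / math.pi


# ASSUMPTION: bright blobs have negative curvatures under this sign convention.
sphere = shape_index(*principal_curvatures(H=-1.0, K=1.0))    # k1 = k2 = -1
cylinder = shape_index(*principal_curvatures(H=-0.5, K=0.0))  # k1 = 0, k2 = -1
print(sphere, cylinder)  # approximately 1.0 and 0.75
```

In practice $H$ and $K$ would be estimated at every voxel from image derivatives, a step this sketch does not cover.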

Every distinct shape, except for the plane, corresponds to a unique value of SI. For example, SI is 1 for a sphere-like shape and 0.75 for a cylinder-like shape. By definition, the shape index directly characterizes the topological shape of an iso-surface in the vicinity of each voxel without explicitly calculating the iso-surface. The shape index encodes 3D local shape information at each voxel, which makes it a very attractive feature for the fitness calculation of the GA process when separating nodules from blood vessels. The fitness of the chromosome is then defined as the similarity measurement


between the selected reference image and the extracted sub-image whose centre is determined by the chromosome and whose size is the same as that of the selected reference image:
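$$f(a,b) = \frac{\sum_{i=0}^{n-1} S_i \times (a_i - m_a) \times (b_i - m_b)}{\sqrt{\sum_{i=0}^{n-1} (a_i - m_a)^2 \, \sum_{i=0}^{n-1} (b_i - m_b)^2}}, \qquad m_a = \frac{1}{n} \sum_{i=0}^{n-1} a_i, \qquad m_b = \frac{1}{n} \sum_{i=0}^{n-1} b_i \tag{3}$$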

where $n$ is the total number of pixels in the image; $a_i$ and $b_i$ are the intensity values of the i-th pixel in the sub-image and the selected reference image, respectively; and $S_i$ is the shape index value of the i-th pixel of the sub-image, obtained from Equation 2. Without the shape index $S_i$, Equation 3 reduces to the normal cross-correlation similarity measurement. As the shape index measures a 3D local feature, different shape index values are obtained for a sphere-like nodule and a cylinder-like blood vessel. By combining this feature into the similarity calculation, the fitness function depends not only on the global CT intensity distribution but also on the 3D local geometric feature. A high shape index value (such as $S_i = 1$ for a sphere-like shape) gives a higher weighting to the similarity measurement, while an elongated shape (such as $S_i = 0.75$ for a cylinder-like shape) receives a lower weighting factor.

Figure 4 (a) shows one sub-image of a nodule attached to a blood vessel. The nodule and the blood vessel have very similar image intensity. Figure 4 (c) and (d) show the shape index map and its highlighted shape index values. The shape index values in the nodule region are much higher than those in the blood vessel area. By applying these shape index values in the fitness function, the nodule receives a higher weighting than the blood vessel. Figure 4 (b) shows the fitness values calculated with the cross-correlation and with the shape index weighted cross-correlation at the 5 sampled points (A-E) indicated in (a). With the shape index as weighting, the difference in fitness between the blood vessel and the nodule is larger than with plain cross-correlation. As a result, a nodule can be more easily distinguished from the adjoining blood vessel by using the shape index weighted fitness function.
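The weighted similarity of Equation 3 can be sketched in plain Python as follows. This is an illustrative reading of the formula rather than the authors' code: the function name and the flattened-list inputs are assumptions, and the square root in the denominator follows the standard normalised cross-correlation that the text says the measure reduces to when the shape index is omitted.

```python
import math


def weighted_fitness(sub, ref, shape_idx=None):
    """Shape-index-weighted cross-correlation between a sub-image and a template.

    sub, ref  : flattened intensity values of equal length n
    shape_idx : per-pixel shape index of the sub-image; None gives the
                plain normalised cross-correlation
    """
    n = len(sub)
    if n == 0 or len(ref) != n:
        raise ValueError("sub and ref must be non-empty and of equal length")
    if shape_idx is None:
        shape_idx = [1.0] * n
    if len(shape_idx) != n:
        raise ValueError("shape_idx must have one value per pixel")
    m_a = sum(sub) / n
    m_b = sum(ref) / n
    num = sum(s * (a - m_a) * (b - m_b) for s, a, b in zip(shape_idx, sub, ref))
    den = math.sqrt(sum((a - m_a) ** 2 for a in sub) *
                    sum((b - m_b) ** 2 for b in ref))
    # ASSUMPTION: a flat sub-image or template is given zero fitness.
    return num / den if den > 0 else 0.0


sub = [1.0, 2.0, 3.0, 4.0]
ref = [2.0, 4.0, 6.0, 8.0]
print(weighted_fitness(sub, ref))              # plain cross-correlation
print(weighted_fitness(sub, ref, [0.75] * 4))  # uniformly cylinder-like weighting
```

With perfectly correlated intensities, a uniform weighting of 0.75 scales the fitness down by the same factor, which is the effect the text attributes to vessel-like regions.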
1.1.4. GA processing on CT lung nodule detection

During the initialisation step, the genes related to nodule candidate location (x, y, z) and template image are randomly generated for each chromosome in the initial population. The genes responsible for directional information are preset to zero, which means no movement is required. The first evaluation of chromosome fitness is therefore based on the randomly generated initial location. In the proposed GA process, we use 6 possible searching directions and 8 different moving distances, as shown in Figure 3 (b) and (c). The eight lung nodule phantom images shown in Figure 2 are used as reference images, resulting in a length of 3 bits for the sixth gene. The population size is set to 1% of the total number of voxels within the lung area. The maximum number of generations is set to 200.

After initialisation, the GA evolving process runs through consecutive generations. In each generation, the fitness of the whole population is evaluated using the shape based modified cross-correlation defined in Equation 3. GA operations are then applied, including selection, crossover and mutation. Roulette wheel selection is used to select the parents for the crossover operation. The 70% of the population with the lowest fitness is replaced by new individuals produced by the crossover and mutation operations.

One-point crossover is employed in the proposed system, and the mutation rate is 5%. Finally, after the process reaches the maximum generation, the chromosomes whose fitness values are greater than a pre-defined value are considered as nodule candidates. The pre-defined value is chosen experimentally to provide the best overall performance.

The input images are partitioned within the minimum rectangle encompassing the lung area, and the GATM process introduced above is performed on each partition individually. The final result is the union of the nodule candidates from each partition. In our experiments, partitioning the lung area and applying GA to each partition individually improved both the sensitivity and the specificity of the detection system compared with applying GA to the whole lung area as a single search space.
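One generation of the GA operations described above might look like the following Python sketch. It is a toy under stated assumptions, not the paper's implementation: chromosomes are plain bitstrings, the fitness function is supplied by the caller, the 5% mutation rate is interpreted as a per-bit flip probability, and fitness values are shifted to be positive before roulette wheel selection. The paper does not specify these details.

```python
import random


def next_generation(population, fitness_fn, rng,
                    replace_frac=0.7, mutation_rate=0.05):
    """Keep the fittest 30% and rebuild the remaining 70% of the population."""
    scored = sorted(population, key=fitness_fn, reverse=True)
    n_replace = int(round(replace_frac * len(scored)))
    survivors = scored[:len(scored) - n_replace]

    # ASSUMPTION: shift fitness so every roulette wheel weight is positive.
    raw = [fitness_fn(c) for c in scored]
    low = min(raw)
    weights = [f - low + 1e-9 for f in raw]

    children = []
    while len(children) < n_replace:
        # Roulette wheel selection of two parents.
        mum, dad = rng.choices(scored, weights=weights, k=2)
        # One-point crossover.
        cut = rng.randrange(1, len(mum))
        child = mum[:cut] + dad[cut:]
        # ASSUMPTION: mutation flips each bit independently with the given rate.
        child = "".join(
            ("1" if bit == "0" else "0") if rng.random() < mutation_rate else bit
            for bit in child
        )
        children.append(child)
    return survivors + children


# Toy usage: fitness is the number of 1 bits, standing in for Equation 3.
rng = random.Random(0)
population = ["".join(rng.choice("01") for _ in range(18)) for _ in range(10)]
for _ in range(200):  # maximum number of generations used in the paper
    population = next_generation(population, lambda c: c.count("1"), rng)
print(max(c.count("1") for c in population))
```

In the detection setting, `fitness_fn` would decode each chromosome, extract the sub-image at the moved location and evaluate the shape-index-weighted similarity against the selected template.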
[Figure 4 panels: (a) sub-image with sampled points A to E; (b) chart titled "SI Effect" plotting Fitness (0 to 0.7) against Point label (A to E) for two series, "Without SI Weighting" and "With SI Weighting"; (c) shape index map; (d) shape index values for the regions labelled Vessel and Nodule.]

Figure 4 (a) Sub-image of nodule adjoining a blood vessel with similar intensity; (b) Fitness curves obtained using the cross-correlation and the shape index (SI) weighted cross-correlation on 5 sampled points; (c) Shape index map; (d) Shape index values for nodule and blood vessel

1.2. Rules based filtering to eliminate FP

After the shape-based GA template matching process, the chromosomes with fitness values higher than a pre-defined threshold are kept as potential nodule candidates. Simple rules based filtering is used to remove easily dismissible FP. For each 3D nodule candidate point, a spherical mask is constructed based on the template image selected by the corresponding chromosome (gene 6). The average Hounsfield Units (HU) value within the spherical mask is calculated and used as the threshold for segmentation. Six features of the segmented objects are then calculated, namely effective diameter, elongation, sphericity, compactness, maximum HU and minimum HU. Rules-based filtering is applied to these calculated features, with parameters determined experimentally. As this study focuses on detecting most of the nodules while introducing only a few FP objects into the next phases for feature calculation and classification, more advanced filtering methods are being investigated.
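The mask-and-threshold step described above can be illustrated with a short Python sketch. Several details are assumptions made here because the paper does not give them: the volume is a nested list indexed as `volume[z][y][x]`, voxels are treated as isotropic, the mask radius is passed in directly rather than derived from the template, and voxels at or above the mean HU are taken to belong to the object. The feature definitions and rule thresholds are not specified in the paper and are left out.

```python
def segment_in_sphere(volume, centre, radius):
    """Threshold the voxels inside a spherical mask at their mean HU value.

    volume : nested lists of HU values indexed as volume[z][y][x]
    centre : (x, y, z) candidate location in voxel coordinates
    radius : mask radius in voxels (ASSUMPTION: isotropic voxels)
    Returns the set of segmented voxel coordinates and the mean HU used.
    """
    cx, cy, cz = centre
    inside = []
    for z, plane in enumerate(volume):
        for y, row in enumerate(plane):
            for x, hu in enumerate(row):
                if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= radius ** 2:
                    inside.append(((x, y, z), hu))
    if not inside:
        return set(), None
    mean_hu = sum(hu for _, hu in inside) / len(inside)
    # ASSUMPTION: the object is the part at or above the mean HU.
    return {pos for pos, hu in inside if hu >= mean_hu}, mean_hu


# Toy volume: lung-like background with one bright voxel in the middle.
volume = [[[-800] * 3 for _ in range(3)] for _ in range(3)]
volume[1][1][1] = 0
voxels, threshold = segment_in_sphere(volume, centre=(1, 1, 1), radius=1)
print(voxels, threshold)
```

The six features would then be measured on the returned voxel set before the experimentally tuned rules are applied.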

3. EXPERIMENTAL RESULTS
The proposed shape based GATM approach was applied to a database of 70 thoracic CT scans from 3 different hospitals. Each scan was read by three thoracic radiologists to produce a gold standard of 178 nodules. Slice thickness varied from 0.5mm to


1.25mm and the total slice number for each scan varied from 79 to 433, with an average of 240 per scan. Dose ranged from 60mA to 325mA. Table 1 shows the experimental results for the different fitness functions used in GATM. Using the proposed method, 151 of the 178 nodules (85%) were detected. The average number of FP is 14.0/scan (0.06/slice). The performance of nodule detection is significantly improved by combining the local shape feature calculation with the global cross-correlation framework for GATM. Compared with using only the cross-correlation coefficient as the similarity measurement, the sensitivity improves by 13.3% in relative terms (from 75% to 85%), while FP is reduced by 51%.

Figure 5 shows examples of the detected nodules. Figure 6 shows examples of nodules detected by using shape index as a weighting factor for the fitness function. These nodules, which have non-spherical shapes or are attached to vessels with similar intensity, were missed when the fitness function was defined by the cross-correlation coefficient only. Figure 7 shows normal vessels eliminated from the nodule candidates by the proposed shape based GATM, which were wrongly identified as nodules by the cross-correlation GATM method.

As mentioned before, the shape index characterizes the local geometric feature and favors regions with highly spherical elements, so a higher fitness value is obtained when a local spherical element matches one of the templates. This is the main reason that the proposed algorithm is able to detect nodules that are non-spherical but have highly spherical local elements. Nodules can still be missed, however, if there are no spherical local elements or the size of the elements does not match any of the templates. Examples of missed nodules are shown in Figure 8. Most of these missed nodules either have irregular shapes close to the chest wall or are Ground Glass Opacity (GGO) nodules with very low contrast.

template-matching technique, IEEE Transactions on Medical Imaging, vol.20, pp.595-604, 2001.
[5] Z. Y. Ge, B. Sahiner, H. P. Chan, et al. Computer-aided detection of lung nodules: False positive reduction using a 3D gradient field method and 3D ellipsoid fitting, Medical Physics, vol.32, pp.2443, 2005.
[6] M. S. Brown, M. F. McNitt-Gray, J. G. Goldin, et al. Patient-Specific Models for Lung Nodule Detection and Surveillance in CT Images, IEEE Trans. Medical Imaging, vol.20, no.12, pp.1242-1250, 2001.
[7] J. M. Goo, Computer-Aided Detection of Lung Nodule on Chest CT: Issues to be Solved before Clinical Use, Journal of Radiology, vol.6, pp.62-63, 2005.
[8] R. Gonzalez, Digital Image Processing, Prentice Hall, 2003.
[9] http://www.qrm.de
[10] O. Faugeras, Three-Dimensional Computer Vision: A geometric view-point, Cambridge, MA: MIT Press, 1993.
[11] J. Dehmeshki, X. Ye, J. Costello, Shape based region growing using derivatives of 3D medical images: application to semi-automated detection of nodules, ICIP, pp.1085-1088, 2003.
Method                                                      Nodules missed   Detection rate   FP per scan
GATM based on fitness function with cross-correlation only  44               75%              29.0
Proposed shape-based GATM                                   27               85%              14.0

Table 1 Experiment results based on different fitness functions on a database of 70 CT scans with 178 nodules and an average slice number 240 per-scan
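The figures quoted in the text can be re-derived from Table 1 with a few lines of arithmetic. The short Python check below uses only the numbers given in the paper; the variable names are chosen here for illustration. Because the table values are themselves rounded, the recomputed FP reduction (about 51.7%) differs slightly from the 51% quoted in the text.

```python
total_nodules = 178
missed_baseline = 44      # cross-correlation only
missed_proposed = 27      # shape-based GATM
fp_baseline = 29.0        # FP per scan
fp_proposed = 14.0        # FP per scan
avg_slices = 240          # average slices per scan

rate_baseline = (total_nodules - missed_baseline) / total_nodules
rate_proposed = (total_nodules - missed_proposed) / total_nodules
fp_per_slice = fp_proposed / avg_slices
relative_gain = (0.85 - 0.75) / 0.75        # from the rounded rates
fp_reduction = (fp_baseline - fp_proposed) / fp_baseline

print(f"detection rate, baseline: {rate_baseline:.1%}")
print(f"detection rate, proposed: {rate_proposed:.1%}")
print(f"FP per slice, proposed:   {fp_per_slice:.2f}")
print(f"relative sensitivity gain: {relative_gain:.1%}")
print(f"FP reduction:              {fp_reduction:.1%}")
```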

4. CONCLUSIONS
By combining the shape index as a weighting factor with the nodule intensity distribution in the fitness function for GATM, the proposed algorithm significantly improved both detection sensitivity and FP reduction compared to GATM using cross-correlation alone as the similarity measurement. The experimental results indicate a nodule detection rate of 85%, with approximately 14.0 FP per scan. The new definition of the GA chromosome and the use of lung nodule phantom templates make the GA process converge to the optimal solutions more efficiently. Some challenging nodules, such as non-spherical nodules or nodules attached to pulmonary vessels with similar intensity, can be identified with a lower rate of FP. Most normal tissues, such as blood vessels, sternum type, apical scarring, etc., can be eliminated from the nodule candidates.

Figure 5 Examples of nodules detected by the proposed method

Figure 6 Non-spherical nodules with spherical elements detected by using shape index in fitness function for GATM

5. REFERENCES
[1] R. Greenlee, T. Murray, S. Bolden and P. Wingo, Cancer statistics 2000, CA: A Cancer Journal for Clinicians, vol.50, pp.7-33, 2000.
[2] S. G. Armato, M. L. Giger and H. MacMahon, Automated detection of lung nodules in CT scans: Preliminary results, Medical Physics, vol.28, pp.1552-1561, 2001.
[3] B. Zhao, G. Gamsu, M. S. Ginsberg, L. Jiang, et al. Automatic detection of small lung nodules on CT utilizing a local density maximum algorithm, Journal of Applied Clinical Medical Physics, vol.4, no.3, 2003.
[4] Y. Lee, T. Hara, H. Fujita, et al. Automated detection of pulmonary nodules in helical CT images based on an improved

Figure 7 Normal tissues (FP) eliminated from nodule candidates by using shape index in fitness function for GATM

Figure 8 Nodules still missed by the proposed method

