
Diabetic Retinopathy Detection and Severity

Classification

A Report Submitted
in Partial Fulfillment of the Requirements
for the Degree of
Bachelor of Technology
in
Information Technology

by
Vaibhav Gupta (20158051)
Sahil Kumar (20158041)
Monika Singh (20158061)
Pasumarthi Eswar Sai (20158091)
Mayank Kumar Meena (20158081)

to the
COMPUTER SCIENCE AND ENGINEERING DEPARTMENT
MOTILAL NEHRU NATIONAL INSTITUTE OF TECHNOLOGY
ALLAHABAD
April, 2018
UNDERTAKING

I declare that the work presented in this report titled “Diabetic Retinopathy Detection and Severity Classification”, submitted to the Computer Science and Engineering Department, Motilal Nehru National Institute of Technology, Allahabad, for the award of the Bachelor of Technology degree in Information Technology, is my original work. I have not plagiarized or submitted the same work for the award of any other degree. In case this undertaking is found incorrect, I accept that my degree may be unconditionally withdrawn.

April, 2018
Allahabad
Vaibhav Gupta (20158051)
Sahil Kumar (20158041)
Monika Singh (20158061)
Pasumarthi Eswar Sai (20158091)
Mayank Kumar Meena (20158081)

ii
CERTIFICATE

Certified that the work contained in the report titled “Diabetic Retinopathy Detection and Severity Classification”, by Vaibhav Gupta (20158051), Sahil Kumar (20158041), Monika Singh (20158061), Pasumarthi Eswar Sai (20158091) and Mayank Kumar Meena (20158081), has been carried out under my supervision and that this work has not been submitted elsewhere for a degree.

(Er. Manoj Wariya)


Computer Science and Engineering Dept.
M.N.N.I.T, Allahabad

April, 2018

iii
Preface

Diabetes occurs when the pancreas does not secrete enough insulin; over time the condition damages the retina of the human eye. As it progresses, the patient's vision deteriorates, leading to diabetic retinopathy.
Diabetic retinopathy (DR) is one of the leading causes of blindness and eye disease in the working-age population of the developed world. This project is an attempt to build an automated and efficient solution that can detect the symptoms of DR from a retinal image within seconds and simplify the process of reviewing and examining such images.
In our approach, we trained a deep Convolutional Neural Network on a small subset of around 500 images (drawn from a full training set of about 35,000) and used various pre-processing and augmentation techniques to achieve higher accuracy.

iv
Acknowledgements

We must mention several individuals who were of enormous help in the development of this work. Er. Manoj Wariya, our mentor, encouraged us to carry out this work. His continuous and invaluable guidance throughout the course of the study helped us bring the work to this stage, and we hope it will continue in further research.
We wish to express our sincere gratitude to Prof. Rajeev Tripathi, Director, MNNIT Allahabad, and Prof. Neeraj Tyagi, Head, Computer Science and Engineering Department, for providing us all the facilities required for the completion of this thesis work. We would also like to thank all our friends for their constant motivation, advice and support.
In addition, the very energetic and competitive atmosphere of the Computer Science and Engineering Department has much to do with this work. We thankfully acknowledge the faculty, the teaching and non-teaching staff of the department, the central library and our colleagues.

v
Contents

Preface iv

Acknowledgements v

1 Introduction 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

2 Related Work 4

3 Proposed Work 6
3.1 Dataset and Preprocessing . . . . . . . . . . . . . . . . . . . . . . 6
3.2 Reducing over-fitting . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.2.1 Data Augmentation . . . . . . . . . . . . . . . . . . . . . . . . 8
3.3 Overall architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

4 Experimental Setup and Results Analysis 11


4.1 Hardware/Software Requirement . . . . . . . . . . . . . . . . . . . . 11
4.2 Result Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

5 Conclusion and Future Work 14

References 15

vi
Chapter 1

Introduction

Diabetic retinopathy is an eye disease caused by diabetes that can lead to loss of vision or even complete blindness. Diabetic retinopathy accounts for 12% of all new cases of blindness in the United States and is the leading cause of blindness for people aged 20 to 64 years. If caught early enough, progression to vision impairment can be slowed, if not altogether stopped; however, this is often difficult because symptoms may appear too late for effective treatment. Diabetic retinopathy (DR) has been estimated to affect about 93 million people globally, though only half of them are aware of it. Diabetic retinopathy may progress through four stages:

1. Mild non-proliferative retinopathy - Small areas of balloon-like swelling in the retina's tiny blood vessels, called microaneurysms, occur at this earliest stage of the disease. These microaneurysms may leak fluid into the retina.

2. Moderate non-proliferative retinopathy - As the disease progresses, blood vessels that nourish the retina may swell and distort. They may also lose their ability to transport blood. Both conditions cause characteristic changes to the appearance of the retina and may contribute to diabetic macular edema (DME).

3. Severe non-proliferative retinopathy - Many more blood vessels are blocked, depriving areas of the retina of their blood supply. These areas secrete growth factors that signal the retina to grow new blood vessels.

4. Proliferative diabetic retinopathy (PDR) - At this advanced stage, growth factors secreted by the retina trigger the proliferation of new blood vessels, which grow along the inside surface of the retina and into the vitreous gel, the fluid that fills the eye. The new blood vessels are fragile, which makes them more likely to leak and bleed. The accompanying scar tissue can pull on the retina and cause retinal detachment, which can lead to permanent vision loss.

Currently, detecting DR is a time-consuming and manual process that requires a trained clinician to examine and evaluate digital color fundus photographs of the retina. By the time human readers submit their reviews, often a day or two later, the delayed results lead to lost follow-up, miscommunication and delayed treatment. Unfortunately, there is no known effective cure for diabetic retinopathy, and the treatments presently available are management strategies at best. It is therefore very important to detect the disease in its early stages.

Figure 1: Various DR Stages

2
1.1 Motivation
Various factors, such as the duration of diabetes, poor control of blood sugar and pregnancy, affect the progression of the disease, but research shows that progression to vision impairment can be slowed or averted if DR is detected at an early stage. A large part of the population suffers from the disease, yet in practice testing is still done manually by trained professionals, which is a time-consuming and lengthy process; miscommunication and delayed results often lead to delayed treatment or to the disease being ignored. The aim of this project is therefore to provide an automated, suitable and sophisticated approach using deep learning, so that DR can be detected easily at an early stage and damage to the retina can be minimized.

3
Chapter 2

Related Work

Historically, image analysis and classification have mostly focused on low-level tasks such as feature extraction and basic color normalization, coupled with classical machine learning classifiers such as regression, SVMs and random forests. Progress was made by the introduction of automated extraction of high-dimensional sets of image features (on the order of thousands). Dimensionality reduction techniques (e.g., sparse regression) were then used to support the construction of simple linear classifiers for the data [1].
Some early approaches used various image processing techniques. For example, Pinz et al. (1998) [2] used a gradient-based method and the Hough transform to map and localize blood vessels, the optic disc and the fovea. Chaudhuri et al. (1989) [3] used two-dimensional matched filters to map the network of blood vessels in the retina, a technique adopted by many later works.
Sinthanayothin et al. (1999) [4] find blood vessels by performing PCA on image gradients and feeding the results to a neural network. Additionally, they localize the optic disc through simple intensity variations in image patches and find the fovea through matched filters. Filtering (Ravishankar et al. 2009) [5] and segmentation (Walter et al. 2002) [6] are other morphological methods.
Machine learning methods - On the other hand, a number of attempts have been made to use machine learning to automatically locate manifestations of retinopathy. Examples include M. Melinscak et al. [7], an automatic segmentation of blood vessels in fundus images. It uses deep max-pooling convolutional neural networks to segment blood vessels, deploying a 10-layer architecture to achieve maximum accuracy while working with small image patches. The pipeline includes preprocessing for resizing and reshaping the fundus images and uses around four convolutional and four max-pooling layers with two additional fully connected layers for vessel segmentation. Srivastava [8] proposed the key idea of randomly dropping units along with their connections during training. This work significantly reduces over-fitting and gives improvements over other regularization techniques, improving the performance of neural networks in vision, document classification, speech recognition and other tasks.
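To make the dropout idea concrete, the following minimal Keras sketch inserts a dropout layer between two dense layers; the layer sizes and the 0.5 drop rate are illustrative assumptions, not values taken from [8] or from our model:

# Minimal sketch of dropout regularization in Keras (TensorFlow backend).
# The layer sizes and the 0.5 dropout rate are illustrative assumptions.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(256,)),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),          # randomly drop half the units during training
    layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")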
Mrinal Haloi [9] proposed a new deep learning based computer-aided system for microaneurysm detection. Compared with other deep neural networks, it requires less preprocessing and vessel extraction, while using deeper layers for training and testing on the fundus image dataset. It consists of five layers, including convolutional, max-pooling and softmax layers, with additional dropout training to improve accuracy. It achieved a low false positive rate.
A joint study [1] by Harvard Medical School and MIT worked on furthering techniques in histopathological image analysis for metastatic breast cancer. Their approach obtained near human-level classification performance using a 27-layer neural network architecture. Similarly, researchers from Cambridge University, Imperial College London, and others [7] introduced an 11-layer 3D convolutional neural network that offers a more computationally efficient approach to brain lesion segmentation. One of the challenges they managed to overcome is the vanishing gradient problem, where the signal of whether the prediction was correct or not becomes greatly attenuated as it propagates through the layers of the neural network. Batch normalization, coupled with other heuristics, was leveraged to preserve this signal and offered a significant performance improvement.
These are still early days for neural network applications, and methods are still being developed in areas as diverse as systems architecture and data augmentation to improve these models' predictive power.

5
Chapter 3

Proposed Work

Our detection approach involves training a Convolutional Neural Network (CNN) to classify the level of DR in images. For training data, we were provided with approximately 500 labelled high-resolution images (drawn from a full training set of about 35,000) taken under a variety of imaging conditions. A clinician has rated the presence of diabetic retinopathy in each image on a scale of 0 to 4: 0 - No DR, 1 - Mild, 2 - Moderate, 3 - Severe and 4 - Proliferative DR. Images are labelled with a subject id as well as either the left or right eye.
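A minimal sketch of how such labels might be loaded is given below; the file name trainLabels.csv and its image/level columns follow the Kaggle release and are assumptions rather than something fixed by this report:

# Hypothetical sketch: load the image-level DR grades (0-4) from a CSV file.
# The file name "trainLabels.csv" and its "image"/"level" columns are assumptions
# based on the Kaggle dataset layout.
import csv

def load_labels(csv_path="trainLabels.csv"):
    labels = {}
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            # e.g. row["image"] == "10_left", row["level"] in "0".."4"
            labels[row["image"]] = int(row["level"])
    return labels

labels = load_labels()
print(len(labels), "labelled images")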

3.1 Dataset and Preprocessing


We use a dataset of retina images from Kaggle.com. These are high-resolution retina images taken in a variety of conditions, including different cameras, colors, lighting and orientations. For each person we have an image of the left and right eye, along with a DR classification diagnosed by a clinician. There is considerable noise and variation in the dataset due to these differing conditions. A key part of setting up our pipeline was pre-processing the color retina images. These images required a hefty amount of pre-processing before we could use them in our neural network. The provided retina images were of different dimensions and resolutions, were taken by different cameras, were in different orientations, and were sometimes not even aligned or cropped similarly. The size of the dataset was intractable to handle with our computational resources. To start, we had to transform the images in such a way that it would be feasible for a neural network, or any learning algorithm, to converge in a reasonable time. This consisted of resizing each image to 256px by 256px. While this helped in making the necessary computation less intensive, it did not help with the fact that the lighting, orientation and alignment were not similar across images. Each image was therefore rescaled so that the eyeball had the same radius, and the local average color, estimated with a Gaussian blur, was subtracted from each pixel. The edges of the images were also clipped, since there is a great deal of variation at the boundaries of the images.
After removing the boundary effect of the Gaussian blur, the processed image was placed in the centre of a 512px by 512px image. We then trained the Convolutional Neural Network (CNN) on these pre-processed images.
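The following sketch outlines this pre-processing with OpenCV and numpy (the libraries listed in Chapter 4); the blur scale, border fraction and other parameter values are assumptions for illustration and do not reproduce the exact settings used for the report:

# Hypothetical pre-processing sketch with OpenCV/numpy: resize, subtract the
# local average color (Gaussian blur), clip the noisy border, and paste the
# result into the centre of a 512x512 canvas. Parameter values are assumptions.
import cv2
import numpy as np

def preprocess(path, size=256, canvas=512):
    img = cv2.imread(path)
    img = cv2.resize(img, (size, size))

    # Subtract the local average color to normalize lighting differences.
    blurred = cv2.GaussianBlur(img, (0, 0), sigmaX=size / 30.0)
    img = cv2.addWeighted(img, 4, blurred, -4, 128)

    # Clip the outer boundary, where blur artefacts and camera vignetting live.
    mask = np.zeros(img.shape[:2], dtype=np.uint8)
    cv2.circle(mask, (size // 2, size // 2), int(size * 0.45), 255, -1)
    img = cv2.bitwise_and(img, img, mask=mask)

    # Place the processed image in the centre of a (canvas x canvas) image.
    out = np.zeros((canvas, canvas, 3), dtype=np.uint8)
    off = (canvas - size) // 2
    out[off:off + size, off:off + size] = img
    return out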

Figure 2: Class 0 retinal image

Figure 3: Class 0 pre-processed retinal image

7
3.2 Reducing over-fitting

3.2.1 Data Augmentation


The easiest and most common method to reduce over-fitting on image data is to artificially enlarge the dataset using label-preserving transformations. We employed transformations that produce new images from the originals with very little computation. Since deep learning models are very flexible, we need large datasets to avoid over-fitting. Here, we randomly rotate, flip and scale the images, effectively creating extra images on the fly. This data augmentation technique allows the model to learn invariance to these random transformations. In our implementation, the transformed images are generated in Python code on the CPU.
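One possible implementation of this CPU-side augmentation uses Keras' ImageDataGenerator, as sketched below; the rotation, zoom and flip settings are assumptions chosen only to illustrate the idea:

# Hypothetical augmentation sketch: random rotations, flips and scaling (zoom)
# applied while batches are generated. The specific ranges are assumptions.
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=180,      # retinas have no canonical "up", so rotate freely
    zoom_range=0.1,          # mild random scaling
    horizontal_flip=True,
    vertical_flip=True,
)

# x_train: (N, 512, 512, 3) pre-processed images, y_train: (N,) labels 0-4
x_train = np.random.rand(8, 512, 512, 3)   # placeholder data for the sketch
y_train = np.random.randint(0, 5, size=8)
batches = augmenter.flow(x_train, y_train, batch_size=4)
x_batch, y_batch = next(batches)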

Figure 4: Augmented Retinal images

3.3 Overall architecture


In image recognition, a Convolutional Neural Network (CNN) is a type of feed-forward artificial neural network in which the connectivity pattern between its neurons is inspired by the organization of the animal visual cortex, whose individual neurons are arranged so that they respond to overlapping regions tiling the visual field. In deep learning, the Convolutional Neural Network uses a complex architecture composed of stacked layers, which makes it particularly well adapted to classifying images. For multi-class classification, this architecture is robust and sensitive to each feature present in the images. The network contains fifteen layers with weights, of which thirteen are convolutional and the remaining two are fully connected. The output of the last fully connected layer is fed to a 5-way softmax, which produces a distribution over the 5 class labels. We also used two consecutive convolution layers followed by a max-pooling layer to make our model deeper and increase the accuracy. Every convolutional and fully connected (dense) layer is followed by a ReLU activation to make the training faster.
The input image was 510x510 and was reduced to 2x2 through our convolutional layers, then fed into a fully connected layer of neurons to learn from. These neurons then make predictions through the softmax layer.
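A Keras sketch of an architecture in this spirit is given below, assuming 3x3 'same' convolutions arranged in two-convolution blocks followed by 2x2 max pooling; the filter counts, dense width and pooling schedule are assumptions and do not necessarily reproduce the exact layer shapes (for example, the final 2x2 feature map) described above:

# Hypothetical sketch of a 13-convolution + 2-dense network in Keras
# (TensorFlow backend). Filter counts and pooling schedule are assumptions.
from tensorflow.keras import layers, models

def build_model(input_shape=(510, 510, 3), num_classes=5):
    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))
    # Six blocks of two 3x3 convolutions followed by 2x2 max pooling,
    # plus one extra convolution, giving thirteen convolutional layers.
    for f in [32, 32, 64, 64, 128, 128]:
        model.add(layers.Conv2D(f, (3, 3), padding="same", activation="relu"))
        model.add(layers.Conv2D(f, (3, 3), padding="same", activation="relu"))
        model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(256, (3, 3), padding="same", activation="relu"))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    # Two fully connected layers; the final 5-way softmax produces the
    # distribution over DR severity classes 0-4.
    model.add(layers.Dense(512, activation="relu"))
    model.add(layers.Dense(num_classes, activation="softmax"))
    return model

model = build_model()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()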

9
Figure 5: CNN layer Architecture
10
Chapter 4

Experimental Setup and Results


Analysis

4.1 Hardware/Software Requirement


The implementation uses the Python programming language, with Anaconda as the open-source Python distribution used to simplify package management and deployment. For pre-processing and augmentation, we used OpenCV for contrast adjustment, cropping, Gaussian blur, edge/boundary removal, color balance adjustment, rotation, flipping and scaling. At the pre-processing stage, black border removal and resizing are done using the numpy package. The Convolutional Neural Network (CNN), a multi-layer deep architecture, is implemented using Keras with TensorFlow as its backend. For handling our large dataset, a Graphics Processing Unit is needed (at least an Nvidia GeForce GTX 550 card with 1 GB memory, or a GTX 980 with 4 GB memory), along with at least 8-16 GB of RAM and a minimum of 160 GB of hard disk space (for converted images, network parameters and extracted features).
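As an illustration of the numpy-based black border removal, a hedged sketch is given below; the brightness threshold is an assumption chosen only for the example:

# Hypothetical sketch of numpy-based black border removal: crop an image to
# the bounding box of pixels brighter than a small threshold. The threshold
# value (10) is an assumption for illustration.
import numpy as np

def crop_black_border(img, threshold=10):
    # img: (H, W, 3) uint8 fundus photograph with a dark background border
    gray = img.mean(axis=2)
    rows = np.where(gray.max(axis=1) > threshold)[0]
    cols = np.where(gray.max(axis=0) > threshold)[0]
    if rows.size == 0 or cols.size == 0:
        return img  # completely dark image; nothing to crop
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]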

4.2 Result Analysis


The performance of any neural network is evaluated on the basis of some specific parameters, which decide whether the current model is suitable for the dataset or not. Since our model is very large (thirteen convolutional layers and two dense layers), it is not easy to train it on our personal laptops. Therefore, we tried to simplify our model (using only a few layers) and train it on a rather small dataset (500 images) with a reduced dimension for each image. On this simplified model and dataset, we obtained the following results.

Figure 6: Accuracy Result

Figure 7: Performance Evaluation Report

12
Figure 8: Confusion Matrix
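The accuracy figure, performance evaluation report and confusion matrix shown above can be computed with scikit-learn as sketched below; the placeholder label arrays are assumptions standing in for our model's actual predictions:

# Hypothetical sketch of how the accuracy, per-class report and confusion
# matrix above could be computed with scikit-learn; y_true/y_pred here are
# placeholder arrays, not the report's actual predictions.
import numpy as np
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

y_true = np.random.randint(0, 5, size=100)   # ground-truth DR grades 0-4
y_pred = np.random.randint(0, 5, size=100)   # model predictions (placeholder)

print("Accuracy:", accuracy_score(y_true, y_pred))
print(classification_report(y_true, y_pred, digits=3))
print(confusion_matrix(y_true, y_pred))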

These are only initial-phase results and were meant as an experiment. With proper deployment of resources, such as a high-performance computing system with a GPU, we are confident of testing on our complete data to improve both performance and accuracy.

13
Chapter 5

Conclusion and Future Work

So far, we have not tested on the complete test data, so our first target would be to do so. However, such a large dataset can only be tested with the availability of fast GPUs and other advanced resources. In our proposed solution, the deep Convolutional Neural Network is a wholesome approach covering all stages of diabetic retinopathy. No manual feature extraction stages are needed. Our network architecture yields significant classification accuracy.

Our main focus would be to design a detection system that is highly accurate and precise. Our network architecture is complex and computation-intensive, requiring a high-end graphics processing unit to process the high-resolution images as more layers are stacked. We could also use a pretrained model such as VGG16 or Inception V3 for better results. Finally, we could deploy our whole model as a mobile application or as a web app, so as to make diabetic retinopathy detection easier and less time-consuming for clinicians.
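A hedged sketch of the transfer-learning idea mentioned above, reusing the ImageNet-pretrained VGG16 shipped with Keras as a frozen feature extractor, is given below; the input size and classification head are assumptions rather than a tested configuration:

# Hypothetical transfer-learning sketch: reuse ImageNet-pretrained VGG16 as a
# frozen feature extractor and train a small 5-way classification head on top.
# Input size and head layout are assumptions chosen only for illustration.
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False   # freeze the pretrained convolutional features

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(5, activation="softmax"),   # DR severity classes 0-4
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])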

14
References

[1] Wang, D.; Khosla, A.; Gargeya, R.; Irshad, H.; and Beck, A. H. Deep learning for identifying metastatic breast cancer. Beth Israel Deaconess Medical Center, Harvard Medical School, and CSAIL, Massachusetts Institute of Technology.

[2] Pinz, A.; Bernögger, S.; Datlinger, P.; and Kruger, A. 1998. Mapping the human retina. IEEE Transactions on Medical Imaging 17:606–619.

[3] Chaudhuri, S.; Chatterjee, S.; Katz, N.; Nelson, M.; and Goldbaum, M. 1989. Detection of blood vessels in retinal images using two-dimensional matched filters. IEEE Transactions on Medical Imaging 8:263–269.

[4] Sinthanayothin, C.; Boyce, J. F.; Cook, H.; and Williamson, T. H. 1999. Automated localization of the optic disc, fovea, and retinal blood vessels from digital color fundus images. Br. J. Ophthalmol. 83:902–910.

[5] Ravishankar, S.; Jain, A.; and Mittal, A. 2009. Automatic feature extraction for early detection of diabetic retinopathy in fundus images. In CVPR.

[6] Walter, T.; Klein, J.-C.; Massin, P.; and Erginay, A. 2002. A contribution of image processing to the diagnosis of diabetic retinopathy - detection of exudates in color fundus images of the human retina. IEEE Transactions on Medical Imaging.

[7] Kamnitsas, K.; Ledig, C.; Newcombe, V. F. J.; Simpson, J. P.; Kane, A. D.; Menon, D. K.; Rueckert, D.; and Glocker, B. Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation.

[8] Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; and Salakhutdinov, R. 2014. Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research 15:1929–1958.

[9] Haloi, M. 2015. Improved microaneurysm detection using deep neural networks. arXiv preprint.

16
