Classification
A Report Submitted
in Partial Fulfillment of the Requirements
for the Degree of
Bachelor of Technology
in
Information Technology
by
Vaibhav Gupta(20158051)
Sahil Kumar(20158041)
Monika Singh(20158061)
Pasumarthi Eswar Sai(20158091)
Mayank Kumar Meena(20158081)
to the
COMPUTER SCIENCE AND ENGINEERING DEPARTMENT
MOTILAL NEHRU NATIONAL INSTITUTE OF TECHNOLOGY
ALLAHABAD
April, 2018
UNDERTAKING
April, 2018
Allahabad
Vaibhav Gupta(20158051)
Sahil Kumar(20158041)
Monika Singh(20158061)
Pasumarthi Eswar Sai(20158091)
Mayank Kumar Meena(20158081)
CERTIFICATE
April, 2018
Preface
Diabetes occurs when the pancreas fails to secrete enough insulin. Over time the
disease damages the retina of the human eye, a condition known as diabetic
retinopathy (DR), and as it progresses the patient's vision deteriorates.
Diabetic retinopathy is one of the leading causes of blindness and eye disease in
the working-age population of the developed world. This project is an attempt at
finding an automated and efficient solution that can detect the symptoms of DR from
a retinal image within seconds and simplify the process of reviewing and examining
images.
In our approach, we trained a deep Convolutional Neural Network model on a small
dataset of around 500 images (the actual training dataset has 35,000 images) and
used various pre-processing and augmentation techniques to achieve higher accuracy.
Acknowledgements
We must mention several individuals who were of enormous help in the development
of this work. Er. Manoj Wariya, our mentor, encouraged us to carry out this work.
His continuous and invaluable guidance throughout the course of the study helped
us to bring the work to this stage, and we hope it will continue in further research.
We wish to express our sincere gratitude to Prof. Rajeev Tripathi, Director,
MNNIT Allahabad, and Prof. Neeraj Tyagi, Head, Computer Science and Engineering
Department, for providing us with all the facilities required for the completion
of this thesis work. We would also like to thank all our friends for their
constant motivation, advice and support.
In addition, the very energetic and competitive atmosphere of the Computer Science
and Engineering Department has much to do with this work. We gratefully acknowledge
the faculty, the teaching and non-teaching staff of the department, the central
library and our colleagues.
Contents
Preface
Acknowledgements
1 Introduction
1.1 Motivation
2 Related Work
3 Proposed Work
3.1 Dataset and Preprocessing
3.2 Reducing over-fitting
3.2.1 Data Augmentation
3.3 Overall architecture
References
Chapter 1
Introduction
Diabetic retinopathy is an eye disease caused by diabetes that can lead to loss of
vision or even complete blindness. Diabetic retinopathy accounts for 12% of all
new cases of blindness in the United States, and is the leading cause of blindness for
people aged 20 to 64 years. If it is caught early enough, progression to vision
impairment can be slowed if not altogether stopped. However, early detection is
often difficult because symptoms may appear too late for treatment to be effective.
Diabetic retinopathy (DR) has been estimated to affect about 93 million people
globally, though only half are aware of it. Diabetic retinopathy may progress
through four stages:
factors secreted by the retina trigger the proliferation of new blood vessels,
which grow along the inside surface of the retina and into the vitreous gel, the
fluid that fills the eye. The new blood vessels are fragile, which makes them
more likely to leak and bleed. Retinal detachment can lead to permanent vision
loss.
1.1 Motivation
Various factors affect diabetes, such as how long the patient has had it, poor
control, and pregnancy, but research shows that progression to vision impairment
can be slowed or averted if DR is detected at an early stage of the disease. A
large part of the population suffers from the disease, yet testing is still done
manually by trained professionals, which is a lengthy and time-consuming process.
Miscommunication and delayed results then often lead to delayed treatment and
neglect. The aim of this project is therefore to provide an automated, suitable
and sophisticated approach using deep learning, so that DR can be detected easily
at an early stage and damage to the retina can be minimized.
Chapter 2
Related Work
Historically, image analysis and classification has mostly focused on low-level image
analysis tasks like feature extraction and basic color normalization, coupled with
classical machine learning classification models like regression, SVMs and random
forests. Progress was made by the introduction of automated extraction of
high-dimensional sets of image features (on the order of thousands). Dimensionality
reduction techniques (e.g., sparse regression) were then used to support the
construction of simple linear classifiers for the data.[1]
Some early approaches used various image processing techniques. For example,
Pinz et al. (1998)[2] used a gradient-based method and the Hough transform
to map and localize blood vessels, the optic disc, and the fovea. Chaudhuri et al.
(1989)[3] used two-dimensional matched filters to map the network of blood vessels
in the retina, a technique adopted by many later works.
Sinthanayothin et al. (1999)[4] find blood vessels by performing PCA on image
gradients and feeding the results to a neural network. Additionally, they localize
the optic disc through simple intensity variations in image patches and find the
fovea through matched filters. Filtering (Saiprasad Ravishankar 2009)[5] and
segmentation (Thomas Walter and Erginay 2002)[6] are also morphological methods.
Machine learning methods - On the other hand, a number of attempts have been
made to use machine learning to automatically locate manifestations of retinopathy.
Examples include M. Melinscak et al.[7], who proposed automatic segmentation of
blood vessels in fundus images using deep max-pooling convolutional neural networks.
They deployed a 10-layer architecture to maximize accuracy, but worked with small
image patches. Their pipeline includes a preprocessing step that resizes and reshapes
the fundus images, and the network has around 4 convolutional and 4 max-pooling
layers with 2 additional fully connected layers for vessel segmentation.
Srivastava [8] introduced the key idea of randomly dropping units, along with their
connections, during training. This significantly reduces over-fitting and gives
improvements over other regularization techniques. It also improves the performance
of neural networks in vision, document classification, speech recognition, etc.
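The idea of randomly dropping units can be illustrated with a minimal sketch. This is not code from [8] or from this project: it uses the common "inverted dropout" variant, which rescales the surviving activations during training, and the function name and parameters are our own assumptions.

```python
import random


def dropout(activations, drop_prob, training=True, rng=random):
    """Inverted dropout on a flat list of activations (illustrative only).

    Each unit is zeroed with probability drop_prob during training; the
    survivors are divided by the keep probability so that the expected
    value of every activation is unchanged. At test time nothing is dropped.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]
```

Because the rescaling happens at training time, the same network can be used unchanged for prediction by passing `training=False`.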
Mrinal Haloi [9] proposed a new deep-learning-based computer-aided system for
microaneurysm detection. Compared with other deep neural networks, it required less
preprocessing and vessel extraction, and used deeper layers for training and testing
on the fundus image dataset. It consists of five layers, which include convolutional,
max-pooling and softmax layers, with additional dropout training to improve
accuracy. It achieved a low false positive rate.
A joint study [1] by Harvard Medical School and MIT worked on furthering
techniques in histopathological image analysis for metastatic breast cancer. Their
approach obtained near human-level classification performance using a 27-layer
neural network architecture. Similarly, researchers from Cambridge University,
Imperial College London, and others [7] introduced an 11-layer 3D convolutional
neural network to offer a more computationally efficient approach to brain lesion
segmentation. One of the challenges they managed to overcome is the vanishing
gradient problem, where the signal of whether the prediction was correct or not
becomes greatly attenuated as it propagates through the layers of the neural
network. Batch Normalization, coupled with other heuristics, was leveraged to
preserve this signal and offered a significant performance improvement.
These are still early days for neural network applications, and methods are still
being developed in areas as diverse as systems architecture and data augmentation
to improve these models' predictive power.
Chapter 3
Proposed Work
intractable to handle with our computational resources. To start, we had to
transform the images so that it would be feasible for a neural network or any
learning algorithm to converge in a reasonable time. This consisted of resizing each
image to 256px by 256px. While this made the necessary computation less intensive,
it did not address the fact that lighting, orientation, and alignment were not
consistent across images. Each image was therefore rescaled so that the eyeball had
the same radius, and the local average color was subtracted from each pixel. The
edges of the images were also clipped, since there is great variation at the image
boundaries.
After removing the boundary effect of the Gaussian blur, we placed the processed
image in the centre of a (512px, 512px) image. We then trained the convolutional
neural network (CNN) on these pre-processed images.
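The pre-processing steps described above can be sketched on a toy example. The sketch below is only an illustration under several assumptions, and is not the code used in this project: it works on a single-channel image stored as a list of lists, it replaces the Gaussian blur with a simple box mean, and the helper names, the mid-grey offset of 128 and the 0.9 clipping factor are hypothetical choices.

```python
def local_mean(img, k):
    """Mean over the (2k+1) x (2k+1) window around each pixel.

    The window shrinks at the image borders. A box mean is used here
    as a stand-in for a Gaussian blur.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        ys = range(max(0, y - k), min(h, y + k + 1))
        for x in range(w):
            xs = range(max(0, x - k), min(w, x + k + 1))
            total = sum(img[j][i] for j in ys for i in xs)
            out[y][x] = total / (len(ys) * len(xs))
    return out


def subtract_local_average(img, k, offset=128.0):
    """Subtract the local mean from each pixel, re-centred on mid-grey."""
    mean = local_mean(img, k)
    return [[p - m + offset for p, m in zip(row, mrow)]
            for row, mrow in zip(img, mean)]


def clip_to_circle(img, keep=0.9, fill=128.0):
    """Replace pixels outside keep * radius of the centre with fill."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r2 = (keep * min(h, w) / 2.0) ** 2
    return [[p if (y - cy) ** 2 + (x - cx) ** 2 <= r2 else fill
             for x, p in enumerate(row)]
            for y, row in enumerate(img)]


def center_on_canvas(img, size, fill=128.0):
    """Place img in the centre of a size x size canvas filled with fill."""
    h, w = len(img), len(img[0])
    if h > size or w > size:
        raise ValueError("image is larger than the canvas")
    top, left = (size - h) // 2, (size - w) // 2
    canvas = [[fill] * size for _ in range(size)]
    for y, row in enumerate(img):
        canvas[top + y][left:left + w] = row
    return canvas


# Toy 8x8 "image" pushed through the three steps onto a 12x12 canvas.
tile = [[float((x * y) % 7) for x in range(8)] for y in range(8)]
processed = center_on_canvas(
    clip_to_circle(subtract_local_average(tile, k=2)), size=12)
```

On real fundus images the same steps would normally be done with an image library on three color channels; the pure-Python loops here are only meant to make each step explicit.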
3.2 Reducing over-fitting
output of the last fully-connected layer is fed to a 5-way softmax which produces a
distribution over the 5 class labels. We also used two consecutive convolution layers
followed by a max-pooling layer to make our model deeper and increase accuracy.
Every convolutional and fully connected (dense) layer is followed by a ReLU
activation to make training faster.
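For reference, the two functions named in this paragraph can be written in a few lines. This is a generic sketch of ReLU and a numerically stable softmax, not an excerpt from our model code, and the example scores are made up.

```python
import math


def relu(values):
    """Element-wise ReLU: negative inputs become 0, others pass through."""
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Turn raw scores into a probability distribution.

    Subtracting the maximum first avoids overflow in exp() and does not
    change the result.
    """
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for the 5 class labels.
probs = softmax([2.0, 1.0, 0.1, -1.0, 0.5])
predicted_class = probs.index(max(probs))
```

The predicted label is the index of the largest probability, which is also the index of the largest raw score.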
The input image was 510x510 and was reduced to 2x2 by our convolutional layers;
the result was then fed into a fully connected layer of neurons. The outputs of
these neurons are converted into predictions using softmax.
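The reduction from 510x510 to 2x2 follows from the usual output-size formula for convolution and pooling layers, out = floor((n + 2p - k) / s) + 1, where n is the input size, k the kernel size, s the stride and p the padding. The sketch below traces this formula through one hypothetical layer stack that happens to end at 2x2; the exact kernel sizes, strides and paddings of our network are not listed in this section, so the stack is an assumption for illustration only.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1


def trace_sizes(n, layers):
    """Return the size after each (kernel, stride, padding) layer in turn."""
    sizes = [n]
    for kernel, stride, padding in layers:
        n = out_size(n, kernel, stride, padding)
        sizes.append(n)
    return sizes


CONV_SAME = (3, 1, 1)   # 3x3 convolution, padded so the size is unchanged
CONV_VALID = (3, 1, 0)  # 3x3 convolution without padding
POOL = (2, 2, 0)        # 2x2 max pooling with stride 2

# Hypothetical stack: six blocks of two convolutions plus one pooling layer,
# then one unpadded convolution and a final pooling layer.
stack = [CONV_SAME, CONV_SAME, POOL] * 6 + [CONV_VALID, POOL]
sizes = trace_sizes(510, stack)
final_size = sizes[-1]
```

With this stack the size goes 510, 255, 127, 63, 31, 15, 7 after the six pooling layers, then 5 after the unpadded convolution and 2 after the last pooling layer.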
Figure 5: CNN layer Architecture
Chapter 4
dataset or not. Since our model is very large (having thirteen convolutional layers
and two dense layers), it is not easy to train such a model on our personal laptops.
Therefore, we tried to simplify our model (using only a few layers) and train it on
a very small dataset (500 images) with smaller image dimensions. On this
simplified model and dataset, we got the following results:
Figure 8: Confusion Matrix
These are only initial results from an experiment. With proper resources, such as
a high-performance computing system with a GPU, we are confident that we can test
on our complete data and improve both performance and accuracy.
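A confusion matrix such as the one in Figure 8, and the accuracy derived from it, can be computed as in the following sketch. The labels used here are made-up placeholders and not our experimental results, and the function names are our own.

```python
def confusion_matrix(y_true, y_pred, num_classes=5):
    """matrix[t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_recall(matrix):
    """For each true class, the fraction of its samples predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]


# Placeholder labels for the 5 DR classes (0 to 4), for illustration only.
y_true = [0, 0, 1, 2, 2, 3, 4, 4]
y_pred = [0, 1, 1, 2, 0, 3, 4, 4]
cm = confusion_matrix(y_true, y_pred)
```

Per-class recall is worth reporting alongside accuracy, because a model can reach a high overall accuracy while still missing most samples of a rare class.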
Chapter 5
So far we have not tested on the complete test data, so our first target is to do
so. However, such a large dataset can only be tested with the availability of
fast GPUs and other advanced resources. In our proposed solution, the deep
Convolutional Neural Network is a single approach covering all stages of diabetic
retinopathy. No manual feature extraction stages are needed. Our network
architecture yields significant classification accuracy.
Our main focus will be to design a detection system that is highly accurate
and precise. Our network architecture is complex and computation-intensive,
requiring a high-end graphics processing unit to process the high-resolution images
as more layers are stacked. We could also use a pretrained model such as
VGG16 or Inception V3 for better results. Finally, we could deploy the whole
model as a mobile application or a web app, to make diabetic retinopathy
detection easier and less time-consuming for clinicians.
References
[1] Dayong Wang, Aditya Khosla, Rishab Gargeya, Humayun Irshad, and Andrew H. Beck
(Beth Israel Deaconess Medical Center, Harvard Medical School; CSAIL,
Massachusetts Institute of Technology). Deep Learning for Identifying Metastatic
Breast Cancer.
[2] Pinz, A.; Berngger, S.; Datlinger, P.; and Kruger, A. 1998. Mapping the human
retina. IEEE Transactions on Medical Imaging 17:606-619.
[3] Chaudhuri, S.; Chatterjee, S.; Katz, N.; Nelson, M.; and Goldbaum, M. 1989.
Detection of blood vessels in retinal images using two-dimensional matched
filters. IEEE Transactions on Medical Imaging 8:263-269.
[4] Sinthanayothin, C.; Boyce, J. F.; Cook, H.; and Williamson, T. H. 1999.
Automated localization of the optic disc, fovea, and retinal blood vessels from
digital color fundus images. Br. J. Ophthalmol. 83:902-910.
[6] Thomas Walter, Jean-Claude Klein, P. M., and Erginay, A. 2002. A contribution
of image processing to the diagnosis of diabetic retinopathy: detection of exudates
in color fundus images of the human retina. IEEE Transactions on Medical Imaging.
, Ben Glockera, Efficient Multi-Scale 3D CNN with fully connected CRF for
Accurate Brain Lesion Segmentation.