ABSTRACT
The number of visually impaired people increases every year owing to eye diseases,
age-related causes, uncontrolled diabetes, accidents, and other reasons. One of the most
significant difficulties for a visually impaired person is reading. Speech and text are the
main media of human communication, and a person needs vision to access the information in
a text. Those with poor vision can, however, gather information from voice. Recent
developments in mobile phones and computers, together with the availability of digital
cameras, make it feasible to assist blind persons through camera-based applications that
combine computer vision tools with existing products such as Optical Character
Recognition (OCR) systems.
The proposed system is a camera-based assistive text reading aid that helps a visually
impaired person read the text present in a captured image. Through mode control, the system
can also detect faces when a person enters the frame. The proposed idea involves extracting
text from a scanned image using Tesseract Optical Character Recognition (OCR) and
converting the text to speech with the e-Speak tool, so that visually impaired persons can
hear the text read aloud. This is a prototype that helps blind people recognize real-world
products by extracting the text in an image and converting it into speech. The proposed
method is implemented on a Raspberry Pi, and portability is achieved with a battery backup.
This technology can help the millions of people worldwide who experience a significant loss
of vision. The project is economical, portable, and implemented with open-source hardware
and software to assist visually impaired persons.
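The text-to-speech pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the `tesseract` and `espeak` command-line tools are installed on the Raspberry Pi, and the function names (`ocr_command`, `speak_command`, `clean_text`, `read_aloud`) and the speaking rate of 140 words per minute are hypothetical choices made for this example.

```python
import shutil
import subprocess
import sys


def ocr_command(image_path, lang="eng"):
    """Build the Tesseract CLI call; the output base 'stdout' prints the text."""
    return ["tesseract", image_path, "stdout", "-l", lang]


def speak_command(words_per_minute=140):
    """Build the eSpeak CLI call; the text to speak is supplied on stdin."""
    return ["espeak", "-s", str(words_per_minute), "--stdin"]


def clean_text(raw):
    """Collapse the line breaks and repeated spaces left by OCR into single spaces."""
    return " ".join(raw.split())


def read_aloud(image_path):
    """Recognize the text in an image, speak it, and return the recognized text."""
    # Assumption: both tools are available on PATH; fail early if they are not.
    for tool in ("tesseract", "espeak"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} is not installed or not on PATH")
    result = subprocess.run(
        ocr_command(image_path), capture_output=True, text=True, check=True
    )
    text = clean_text(result.stdout)
    if text:  # Stay silent when no text was recognized.
        subprocess.run(speak_command(), input=text, text=True, check=True)
    return text


if __name__ == "__main__":
    if len(sys.argv) == 2:
        print(read_aloud(sys.argv[1]))
```

Run as a script with the path of a captured image as its only argument, the sketch prints the recognized text and speaks it through the default audio output. Passing the text to eSpeak on standard input avoids problems with recognized text that begins with a dash.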
ACKNOWLEDGEMENT
The success of any research work depends heavily on the quality of education
received, the quality of teachers, research resources, and an enabling and encouraging
environment. Studying at Alva’s Institute of Engineering and Technology, Mijar, provided
all of these facilities, which made the successful outcome of this research work
possible.
Firstly, our gratitude goes to our guide, Mr. Sahana K Adyanthaya, Assistant
Professor, Department of Electronics and Communication, AIET, who has been our source of
encouragement and motivation throughout this project. Without his valuable guidance, this
work would never have been a successful one.
We would like to express our sincere gratitude to the Head of the Department of
Electronics & Communication Engineering, Dr. D V Manjunatha, for his guidance and
inspiration.
We would like to thank our Principal, Dr. Peter Fernandes, for providing all the
facilities and a proper working environment on the college campus.
We are thankful to all the teaching and non-teaching staff members of the Department of
Electronics & Communication Engineering for their help and support
throughout the project.
TABLE OF CONTENTS
TITLE Page No.
ABSTRACT i
ACKNOWLEDGEMENT ii
CHAPTER 1: INTRODUCTION 1
1.1 Prelude 1
1.2 Aim of the project 1
1.3 Existing system 1
1.4 Proposed system 2
1.5 Objective of the proposed system 2
1.6 Motivation 2
1.7 Organization of the report 3
CHAPTER 2: LITERATURE SURVEY 4
2.1 Introduction 4
2.2 Literature review 4
CHAPTER 3: FUNDAMENTALS OF THE PROJECT 16
3.1 Hardware Components 16
3.2 Software Tools 16
3.3 Block Diagram of the Proposed System 17
3.4 Raspberry pi 17
3.4.1 Hardware 19
3.4.2 Processor 19
3.4.3 Performance 21
3.4.4 Overclocking 21
3.4.5 RAM 23
3.4.6 Software Operating Systems 23
3.4.7 Python Installation on Windows 27
3.4.8 Installation of PuTTY software on Windows 35
3.4.9 Installation of VNC Server in Windows 37
3.4.10 Other Operating Systems (not Unix/Linux-based) 42
3.4.11 Other Operating Systems (Unix/Linux-based) 42
3.5 Pin diagram of Raspberry Pi 45
3.5.1 GPIO Numbering 46
3.5.2 Physical Numbering 46
3.6 Features of Raspberry Pi 46
3.6.1 Advantages of Raspberry Pi 47
3.6.2 Disadvantage of Raspberry Pi 47
3.7 Tesseract OCR 48
3.7.1 Features 48
3.8 E-speak tool 49
3.8.1 Features 49
3.9 Digital Image Processing 49
CHAPTER 4: METHODOLOGY 51
4.1 Architecture of the Proposed System 51
4.2 Flow Chart of the Proposed System 52
CHAPTER 5: IMPLEMENTATION OF THE SYSTEM 54
5.1 Introduction 54
5.2 Working of Proposed System 54
5.2.1 Camera 54
5.2.2 Mode Selection 55
5.2.3 Face Detection 55
5.2.4 Text Detection 55
5.2.5 Noise Correction and Sound Indication 56
5.2.6 Thresholding 56
5.2.7 Tesseract OCR 57
5.2.8 E-speak Tool 57
5.2.9 Audio Output 58
5.2.10 Conversion of Text to Voice using E-speak Tool 58
5.2.11 Software Implementation 59
CHAPTER 6: RESULTS AND DISCUSSIONS 75
APPENDIX
LIST OF FIGURES
Fig. No. DESCRIPTION OF THE FIGURES Page No.
Figure 5.2 Typing the IP address 59
Figure 5.3 Putty login ID 60
Figure 5.4 Password window 60
Figure 5.5 Ls command window 61
Figure 5.6 VNC server path 61
Figure 5.7 Selecting VNC software 62
Figure 5.8 VNC viewer window 62
Figure 5.9 Encryption window 63
Figure 5.10 Authentication window 63
Figure 5.11 Raspberry Pi desktop 64
Figure 5.12 Raspberry Pi project window 65
Figure 5.13 Text recognition python code 66
Figure 5.14 Executing text recognition code 67
Figure 5.15 Face detection code 68
Figure 5.16 Face detection code 69
Figure 5.17 Executing face detection code 70
Figure 5.18 Executing face detection code 71
Figure 5.19 Detected face set1 72
Figure 5.20 Detected face set2 73
Figure 5.21 Shutdown command 73
Figure 6.1 System design 75
LIST OF TABLES
LIST OF ABBREVIATIONS
BCM Broadcom
CV Computer Vision
IP Internet Protocol
OTG On The Go
OS Operating System
SD Secure Digital
SL Spatial Language
SVM Support Vector Machine