e-Yantra
Contents

1 Introduction
2 Vision
    2.1
    2.2 Eyes vs Webcam
    2.3 Models of Vision
    2.4 Gestalt Principles
        2.4.1 Proximity
        2.4.2 Similarity
        2.4.3 Closure
        2.4.4 Common Motion
        2.4.5 Symmetry
        2.4.6 Continuity
3 Software
    3.1 Introduction
    3.2 Installation
    3.3 Python
    3.4 Numpy
    3.5 OpenCV
    3.6 Debugging
4 Images
    4.1 Representation
    4.2 Properties

II
5 Fundamental Programs
    5.1 Structure of a Program
    5.2 Image from file
    5.3 Image from Camera
    5.4 Video from Camera
    5.5 Experiments
    5.6 Debugging
6
    6.1 Colourspaces
    6.2 Thresholding
    6.3 Zooming, Rotating and Panning
    6.4 Experiments
    6.5 Debugging
7 User Interfaces
    7.1 Shapes
    7.2 Buttons
    7.3 Experiments
    7.4 Debugging
8 Drawing on air
    8.1 System
    8.2 Light Pen
    8.3 Steps
        8.3.1 Step 1
        8.3.2 Step 2
        8.3.3 Step 3
        8.3.4 Step 4
        8.3.5 Step 5
        8.3.6 Step 6
        8.3.7 Step 7
        8.3.8 Step 8
    8.4 Experiments
    8.5 Debugging
Bibliography
    Books
    Articles
Index
1. Introduction
Computer Vision is simply the pursuit of teaching a machine to see as humans do. The task seems trivial at the outset, given that we have made so many advances in computing. However, the problem lies in the fact that while we humans certainly use our vision to understand the world around us, the exact way in which we make inferences from the images we see is still not very clear.
For example, let me ask you to take the trouble to pick up a pencil and draw a cube.
2. Vision

2.2 Eyes vs Webcam
Since our goal is to emulate human vision using machines, we must understand the differences between a machine and a human. Let us therefore take our systems to be a human and a machine, and compare the three parts (Sense, Analyze, Control) of each system.
From the previous description, it is clear that the sensor in the case of a human is the human eye, and in the case of a machine it is the camera or, as it is colloquially called, the webcam. The two sensors differ in resolution, receptors, focus, and binocular/stereo vision.
The difference in algorithm cannot be stated precisely, as we do not completely know how humans see, but computer vision is progressing along many paths, a few broad fields of which are listed here. The classification is not mutually exclusive, and many of these fields overlap.
Mapping, Localization, Depth maps
Object detection
Object identification
Image/scene retrieval
Augmented Reality
Image segmentation
Scene Understanding
Action segmentation and recognition
Feature detectors
We can define the output loosely to be the kinds of inferences that can be drawn from an image or video: for example, given a class, to count the number of students attending, to construct a 3D map of the seen world, or to identify an object in the scene. In such a comparison, if you will excuse blatant arrogance, we may say that the outputs of machine vision are often good for specific applications, but no general-purpose system exists that makes all the above inferences in a manner comparable to human vision. However, it is not only a guess, but also a hope, that the future might tip the scale.
2.3
Models of Vision
In order to understand how we have arrived at present day algorithms for computer vision, we
must first know how human vision works. The model that we have today is vastly different from
those we constructed earlier, and a reading of these will help us understand both the strengths and
weaknesses that these models offer.
Emission Theory - Eye emits stuff that interacts with the outer world, perhaps modeled on
the sense of touch.
Intro-mission - Stuff representative of objects enters the eye, perhaps modeled on the sense
of smell.
Unconscious inference (Helmholtz) - Vision is learned from past experience
Gestalt theory - Visual system automatically groups elements into patterns:
Proximity
Similarity
Closure
Symmetry
Common Motion
Continuity
Computational Models - Based upon study of brain functions, uses methods such as machine
learning, neural networks, etc.
2.4 Gestalt Principles

2.4.1 Proximity
This image indicates the Law of Proximity. The birds flock closely together, causing viewers to
perceive them as a group.
2.4.2
Similarity
This image indicates the Law of Similarity. Even though the blue shapes are arranged uniformly,
the triangle made up of blue circles stands out from the rest of the figure.
2.4.3
Closure
This image indicates the Law of Closure. In this very famous logo of WWF, we can see the panda
clearly, even though there are no outlines to specify the head or the body.
2.4.4
Common Motion
This image indicates the Law of Common Motion.
2.4.5 Symmetry
This image indicates the Law of Symmetry. Earlier, we discussed that objects that are close by are grouped together, but here we see three matched pairs of parentheses, even though the differing types of brackets are closer to one another than the matching pairs are.
2.4.6
Continuity
This image indicates the Law of Continuity. Even though, by the law of similarity, we should see
two bent curves touching, we instead see two smooth curves intersecting.
Therefore, we can see that these Gestalt Laws can be used as a foundation upon which we can
build ways to see and understand objects in an image. However, note that they do not point to the
same inference, but instead these laws sometimes compete to form different interpretations of the
same image.
3. Software

3.1 Introduction
In the previous chapter, we saw that the webcam is the Sense part of the system: it retrieves an image for us to analyze. This does not define what exactly an image is. In order to define an image, we must first learn how the image is represented. There are many ways to represent an image; for example, the representation of an image taken by an MRI may differ from one taken by the webcam. Since we will be using the webcam, we will use a representation that is intuitive, widely used and pliable: a matrix of numbers. In order to elaborate on this representation, let us first inspect how the webcam senses/captures the image. A webcam has an array of cells which are arranged in sets of 3 (refer to fig. xx)¹. This choice of three is modeled on the human eye, which can loosely be said to be sensitive to 3 colours: red, green and blue. Therefore, each cell on the array of the webcam has a receptor for blue light, green light and red light. These are each given a value from 0 to 255 based on the amount of that particular light falling on the cell². Therefore, the matrix of numbers that we spoke about is an n*m*3 dimensioned array, with n being the height, m being the width, and 3 for the 8-bit values corresponding to blue, green and red (BGR). Please refer to fig. xx in order to get a better picture. We will be using the library called OpenCV and the language Python to obtain and process this matrix in the rest of the tutorial.
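The matrix representation described above can be sketched directly in NumPy. A minimal sketch, with illustrative sizes and values rather than data from an actual webcam:

```python
import numpy as np

# A tiny image, 2 pixels high and 3 pixels wide, with 3 channels
# (blue, green, red) of 8 bits each -- the n*m*3 array described above.
img = np.zeros((2, 3, 3), dtype=np.uint8)

img[0, 0] = (255, 0, 0)   # top-left pixel: pure blue (BGR order)
img[1, 2] = (0, 0, 255)   # bottom-right pixel: pure red

print(img.shape)          # (2, 3, 3): n rows, m columns, 3 channels
print(img[0, 0])          # [255   0   0]
```

The same layout, with larger n and m, is exactly what cv2.imread and cap.read() hand back later in the tutorial.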
3.2
Installation
In this part, we will install the following three pieces of software:
Python
Numpy
OpenCV
¹ This is a simplification. In commercial CCDs, each group consists of four pixels: one red, one blue and two green (the human eye is more sensitive to green than to either red or blue).
² Assuming the colour is represented in 8 bits.
3.3 Python
Please follow the steps given below:
Download Python from the following link: https://www.python.org/ftp/python/2.7.6/python-2.7.6.msi
Double-click the downloaded file to commence installation
In order to configure the environment variables,
Right-click on My Computer
Click on Properties
Click on Advanced System Settings to open the System Properties dialog box
Under System Properties, select the Advanced tab
Click on Environment Variables
Under System Variables, search for the variable Path
Add C:/Python27;C:/Python27/Scripts; at the start of the textbox
Click on OK
In order to verify your installation,
Open Command Prompt and type python and press enter
You should see the following prompt (fig 3.1):
3.4 Numpy
Download Numpy from the following link: http://sourceforge.net/projects/numpy/files/NumPy/1.7.1/numpy-1.7.1-win32-superpack-python2.7.exe/download
Double-click the downloaded file to commence installation
In order to verify your installation,
Open Command Prompt and type python and press Enter
At the python prompt, type import numpy and press Enter
You should see the following prompt (fig 3.2):
3.5 OpenCV
Download OpenCV from the following link: http://sourceforge.net/projects/opencvlibrary/files/opencv-win/2.4.9/opencv-2.4.9.exe/download
Double-click the downloaded file to commence installation
Navigate to the folder opencv/build/python/2.7
Copy the file cv2.pyd to C:/Python27/lib/site-packages
In order to verify your installation,
3.6 Debugging
4. Images

4.1 Representation
From the earlier section, we learned that one way to represent images is an n*m*3 dimensioned matrix holding the colours blue, green and red. However, there are other representations as well, which are defined in OpenCV. A partial list follows and will be elaborated upon in chapter xx
BGR
RGB
Grayscale
Binary
HSV
YUV
4.2 Properties
The properties of such representations of images are as follows
Width - This is the number of columns in the image matrix
Height - This is the number of rows in the image matrix
Channels - This determines the kind of information stored in each pixel of the image matrix.
For example, a BGR image has 3 channels, blue, green and red, whereas a grayscale image
has only one channel, the grayness value.
Depth - This is the type of information of each pixel of the image matrix. For example, it could be 8 bits, and therefore have a value from 0 to 255, or it could be 16 bits, which would cover the numbers from 0 to (2^16 - 1)
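All four properties can be read off the matrix itself. A short sketch, where the zero array stands in for an image loaded with cv2.imread:

```python
import numpy as np

# Stand-in for a loaded BGR image (8-bit, 480 rows x 640 columns)
img = np.zeros((480, 640, 3), dtype=np.uint8)

height, width, channels = img.shape          # rows, columns, channels
depth_bits = img.dtype.itemsize * 8          # bytes per value * 8 bits

print(height, width, channels)  # 480 640 3
print(depth_bits)               # 8: each value ranges from 0 to 255
```

A grayscale image would instead have shape (height, width), i.e. a single channel.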
5. Fundamental Programs

5.1 Structure of a Program
The structure of a program can be derived from the Sense-Analyze-Control model. Every program that we henceforth write will largely have the same structure, as represented in fig. xx below
5.2 Image from file
#############################################
## Import OpenCV
import numpy
import cv2
#############################################

#############################################
## Read the image
img = cv2.imread('gandalf.jpeg')
#############################################

#############################################
## Do the processing
# Nothing
#############################################

#############################################
## Show the image
cv2.imshow('image', img)
#############################################

#############################################
## Close and exit
cv2.waitKey(0)
cv2.destroyAllWindows()
#############################################
--------------- explanation of statements ---------------
import numpy - This command asks Python to use the Numpy library to manipulate matrices. Since the images we use are stored as matrices, we can then use numpy functions as numpy.<function>. Alternately, we can use the command import numpy as np to shorten the name of the module, so that we can call numpy functions as np.<function>
import cv2 - This command asks Python to use the OpenCV library (also called a module). Once we have imported cv2, Python understands that the functions we use are defined in the OpenCV library, and we can proceed to use them as cv2.<function>
cv2.waitKey(t) - This command waits up to t milliseconds for the user to press a key, and if the user does press a key, it returns the ASCII code of the key pressed. If we pass 0 as the time parameter, it waits indefinitely until the user presses a key.
cv2.destroyAllWindows() - This command asks Python to close all the open windows. This is how we exit the program.
5.3 Image from Camera
--------------- explanation of statements ---------------
cv2.VideoCapture(i) -> cap - This command tells Python to set up an instance of the VideoCapture class and assigns it to the variable cap. In other words, we need to tell Python where we are getting our images from, in this case the number assigned to the camera we need to use. For example, cap = cv2.VideoCapture(0) will make Python use the first camera it finds (usually the laptop camera) whenever we read images using cap.read().
cap.read() -> ret, frame - This is the command we use to retrieve images from the source we have named as our capture. For example, if we have used the command cap = cv2.VideoCapture(0), then our capture is named cap, and cap.read() will return frame, a matrix that stores an image taken from the camera, and ret, which tells us whether the capture was successful.
cap.release() - This command releases the camera that we initialized with the cv2.VideoCapture command. In the absence of this command, we will get errors when we run the program again and try to initialize the same camera without first releasing it.
5.4 Video from Camera
- - - - - - - - - - - - - - - explanation of statements - - - - - - - - - - - - - - - - - - - - F
5.5 Experiments
Try: showing multiple files; different types of files; multiple windows; different waitKey times; catching the key index returned by waitKey; different cameras.
5.6 Debugging
6.1 Colourspaces
#############################################
## Import OpenCV
import numpy
import cv2
#############################################

#############################################
## Read the image
img = cv2.imread('gandalf.jpeg')
#############################################

#############################################
## Do the processing
print img.shape
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print gray.shape
#############################################

#############################################
## Show the image
cv2.imshow('image', gray)
#############################################

#############################################
## Close and exit
cv2.waitKey(0)
cv2.destroyAllWindows()
#############################################
cv2.cvtColor(image, cv2.COLOR_BGR2HSV) - This command converts the image from the BGR colourspace to HSV; other conversion codes follow the same pattern.
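For BGR to grayscale in particular, the conversion is a weighted sum of the three channels, Y = 0.299 R + 0.587 G + 0.114 B (the weights OpenCV documents for COLOR_BGR2GRAY). A NumPy sketch of the same computation, on a hand-made array rather than a real image:

```python
import numpy as np

# Illustrative 1x2 BGR image: one pure-blue pixel, one white pixel
img = np.array([[[255, 0, 0], [255, 255, 255]]], dtype=np.uint8)

b = img[:, :, 0].astype(np.float64)
g = img[:, :, 1].astype(np.float64)
r = img[:, :, 2].astype(np.float64)

# Weighted sum, rounded back to 8-bit values
gray = np.rint(0.299 * r + 0.587 * g + 0.114 * b).astype(np.uint8)

print(gray)  # blue contributes little to brightness: [[ 29 255]]
```

This is why a bright blue object can look surprisingly dark in a grayscale image.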
6.2 Thresholding
Thresholding is a technique used to create a binary image from an existing image by assigning one of two values to each pixel, based on a comparison between the intended threshold value and the pixel value. For example, suppose the image we are thresholding is a grayscale image. Then each of the pixels has a value ranging from 0 to 255. Let us assume that the intended threshold value is 150. Then those pixels in the image that have a value equal to or below 150 are set to 0, and those above are set to 255. Thus the resulting image has pixels that are either black (0) or white (255).
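The rule just described can be written out directly in NumPy; gray here is a hand-made stand-in for a grayscale image:

```python
import numpy as np

gray = np.array([[0, 100, 150],
                 [151, 200, 255]], dtype=np.uint8)

# THRESH_BINARY with threshold 150 and maxval 255:
# pixels above 150 become 255, all others become 0
binary = np.where(gray > 150, 255, 0).astype(np.uint8)

print(binary)
# [[  0   0   0]
#  [255 255 255]]
```

The cv2.threshold call in the program below performs this comparison for the whole image at once, and its other type flags (TRUNC, TOZERO, ...) just change what is assigned on each side of the threshold.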
#############################################
## Import OpenCV
import numpy
import cv2
#############################################

#############################################
## Read the image
img = cv2.imread('lion.jpg')
#############################################

#############################################
## Do the processing
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # We need a grayscale image to do the thresholding
ret, thresh1 = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY)
ret, thresh2 = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV)
ret, thresh3 = cv2.threshold(gray, 127, 255, cv2.THRESH_TRUNC)
ret, thresh4 = cv2.threshold(gray, 127, 255, cv2.THRESH_TOZERO)
ret, thresh5 = cv2.threshold(gray, 127, 255, cv2.THRESH_TOZERO_INV)
#############################################

#############################################
## Show the image
cv2.imshow('image thresh1', thresh1)
cv2.imshow('image thresh2', thresh2)
cv2.imshow('image thresh3', thresh3)
cv2.imshow('image thresh4', thresh4)
cv2.imshow('image thresh5', thresh5)
cv2.imshow('original', img)
#############################################

#############################################
## Close and exit
cv2.waitKey(0)
cv2.destroyAllWindows()
#############################################
cv2.threshold(src, thresh, maxval, type) - Takes src as input and returns the thresholded image. The types of thresholding include THRESH_BINARY, THRESH_BINARY_INV, THRESH_TRUNC, THRESH_TOZERO and THRESH_TOZERO_INV
# inRange program
cv2.inRange - This command is used to apply thresholding when there is more than one channel, e.g. HSV or BGR images.
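inRange keeps a pixel only when every one of its channels falls inside its bounds. A NumPy equivalent, on a made-up 3-channel array with illustrative bounds:

```python
import numpy as np

# Three made-up "HSV" pixels in a 1x3 image
hsv = np.array([[[5, 5, 240], [5, 5, 100], [90, 5, 240]]], dtype=np.uint8)

lower = np.array([0, 0, 230])
upper = np.array([20, 10, 255])

# A pixel passes only if ALL of its channels lie within [lower, upper]
inside = (hsv >= lower) & (hsv <= upper)
mask = np.where(inside.all(axis=2), 255, 0).astype(np.uint8)

print(mask)  # [[255   0   0]]: only the first pixel is within range
```

The result is a single-channel binary mask, which is exactly what the LED-tracking program later uses to isolate the bright spot.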
6.3 Zooming, Rotating and Panning
6.4 Experiments

6.5 Debugging
7. User Interfaces

7.1 Shapes
#############################################
## Import OpenCV
import numpy as np
import cv2
#############################################

#############################################
## Create the image
img = np.zeros((500, 500, 3), np.uint8)
#############################################

#############################################
## Do the processing
# Draw a line
cv2.line(img, (10, 10), (490, 10), (255, 0, 0), 5)
# Draw a rectangle
cv2.rectangle(img, (20, 20), (480, 80), (0, 255, 0), 3)
# Draw a circle
cv2.circle(img, (150, 150), 50, (0, 0, 255), -1)  # Filled
cv2.circle(img, (350, 150), 50, (0, 0, 255), 3)   # Outline
# Draw an ellipse
cv2.ellipse(img, (250, 200), (200, 100), 0, 0, 180, (100, 100, 0), 5)
# Draw a polygon
pts = np.array([[200, 400], [300, 400], [250, 450]], np.int32)
print pts
pts = pts.reshape((-1, 1, 2))
print pts
cv2.polylines(img, [pts], True, (0, 255, 255), 2)
#############################################

#############################################
## Show the image
cv2.imshow('image', img)
#############################################

#############################################
## Close and exit
cv2.waitKey(0)
cv2.destroyAllWindows()
#############################################
--------------- explanation of statements ---------------
cv2.line
cv2.circle
cv2.rectangle
cv2.ellipse

7.2 Buttons
cv2.setMouseCallback
7.3 Experiments

7.4 Debugging
8. Drawing on air

8.1 System

8.2 Light Pen
(Figure: the light pen, (a) On and (b) Off.)
8.3 Steps

8.3.1 Step 1
As a first step, we need to get the video feed from the camera. Therefore, we will use our
camVideo.py as the template. As usual, please choose the correct video channel when using the
VideoCapture function.
#############################################
## Import OpenCV
import numpy
import cv2
# Initialize camera
cap = cv2.VideoCapture(1)
#############################################

#############################################
## Video Loop
while(1):
    ## Read the image
    ret, frame = cap.read()
    ## Do the processing
    ## Show the image
    cv2.imshow('image', frame)
    ## End the video loop
    if cv2.waitKey(1) == 27:  ## 27 ASCII for escape key
        break
#############################################

#############################################
## Close and exit
# close camera
cap.release()
cv2.destroyAllWindows()
#############################################
8.3.2 Step 2
Next, we need to choose the colourspace that is appropriate for our application. We can expect changes in lighting conditions, and we are tracking a certain colour; therefore, we choose the HSV colourspace.
#############################################
## Import OpenCV
import numpy
import cv2
# Initialize camera
cap = cv2.VideoCapture(1)
#############################################

#############################################
## Video Loop
while(1):
    ## Read the image
    ret, frame = cap.read()
    ## Do the processing
    img = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    ## Show the image
    cv2.imshow('image', img)
    ## End the video loop
    if cv2.waitKey(1) == 27:  ## 27 ASCII for escape key
        break
#############################################

#############################################
## Close and exit
# close camera
cap.release()
cv2.destroyAllWindows()
#############################################
8.3.3 Step 3
In this step, we introduce the concept of functions in Python. We simply call the function findPoint
to convert the image from BGR to HSV and return the converted image. Therefore, the output we
shall see will be the same as Step 2.
#############################################
## Import OpenCV
import numpy
import cv2
# Initialize camera
cap = cv2.VideoCapture(1)
#############################################

#############################################
## Finding the point (LED)
def findPoint(img):
    ## Convert to HSV
    img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    return img
#############################################
## Video Loop
while(1):
    ## Read the image
    ret, frame = cap.read()
    ## Do the processing
    output = findPoint(frame)
    ## Show the image
    cv2.imshow('image', output)
    ## End the video loop
    if cv2.waitKey(1) == 27:  ## 27 ASCII for escape key
        break
#############################################

#############################################
## Close and exit
# close camera
cap.release()
cv2.destroyAllWindows()
#############################################
8.3.4 Step 4
Now that we have a suitable image to work with, we need to find the position of the LED. To do that, we first reduce the image to show only the LED, by thresholding. Please use filterFind.py to find the appropriate threshold for the colour of the LED that you are using. The values used here are meant for a red LED. Note that we now return the mask as the output of the function findPoint.
#############################################
## Import OpenCV
import numpy
import cv2
# Initialize camera
cap = cv2.VideoCapture(1)
#############################################

#############################################
## Finding the point (LED)
def findPoint(img):
    ## Convert to HSV
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    ## Define thresholds
    lower = numpy.array([0, 0, 230])
    upper = numpy.array([20, 10, 255])
    ## Threshold the image
    mask = cv2.inRange(hsv, lower, upper)
    return mask
#############################################
## Video Loop
while(1):
    ## Read the image
    ret, frame = cap.read()
    ## Do the processing
    output = findPoint(frame)
    ## Show the image
    cv2.imshow('image', output)
    ## End the video loop
    if cv2.waitKey(1) == 27:  ## 27 ASCII for escape key
        break
#############################################

#############################################
## Close and exit
# close camera
cap.release()
cv2.destroyAllWindows()
#############################################
8.3.5 Step 5
Now we have a binary image in which the LED portion of the image appears as a blob. We are looking for a position (x, y) that tells us which point on the screen the LED indicates. Therefore, we can use contours to find the central point of the blob. In this step, let us display all the possible contours, so that we can later eliminate those that result from noise.
#############################################
## Import OpenCV
import numpy
import cv2
# Initialize camera
cap = cv2.VideoCapture(1)
#############################################

#############################################
## Finding the point (LED)
def findPoint(img):
    ## Convert to HSV
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    ## Define thresholds
    lower = numpy.array([0, 0, 200])
    upper = numpy.array([30, 10, 255])
    ## Threshold the image
    mask = cv2.inRange(hsv, lower, upper)
    ## Find the blob of red
    contours, hierarchy = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    cv2.drawContours(img, contours, -1, (0, 0, 255), 3)  # -1: draw all contours
    return img
#############################################
## Video Loop
while(1):
    ## Read the image
    ret, frame = cap.read()
    ## Do the processing
    output = findPoint(frame)
    ## Show the image
    cv2.imshow('image', output)
    ## End the video loop
    if cv2.waitKey(1) == 27:  ## 27 ASCII for escape key
        break
#############################################

#############################################
## Close and exit
# close camera
cap.release()
cv2.destroyAllWindows()
#############################################
8.3.6 Step 6
We see that the earlier step returns many contours. We must now choose the contour that indicates the LED. To do so, we can use the contour properties of area and moments. First, we dismiss those contours that are very small, i.e. have a low area. Then we pick the biggest of the remaining contours, on the assumption that it will be bigger than those that result from noise.
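The selection loop in the program below boils down to a max-by-area search. A small self-contained sketch, where a plain area function stands in for cv2.contourArea and toy point lists stand in for real contours:

```python
def biggest_contour(contours, area):
    # Keep the contour with the largest area -- the same maxA/maxC
    # bookkeeping as the loop in the program below.
    maxA = 0
    maxC = None
    for cnt in contours:
        a = area(cnt)
        if a > maxA:
            maxC = cnt
            maxA = a
    return maxC

# Toy "contours": lists of points; len() stands in for cv2.contourArea
contours = [[(0, 0)], [(1, 1), (2, 2), (3, 3)], [(5, 5), (6, 6)]]
print(biggest_contour(contours, len))  # [(1, 1), (2, 2), (3, 3)]
```

With real contours this is simply biggest = max(contours, key=cv2.contourArea), guarded by a check that the list is non-empty.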
#############################################
## Import OpenCV
import numpy
import cv2
# Initialize camera
cap = cv2.VideoCapture(1)
#############################################

#############################################
## Finding the point (LED)
def findPoint(img):
    ## Convert to HSV
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    ## Define thresholds
    lower = numpy.array([0, 0, 200])
    upper = numpy.array([30, 10, 255])
    ## Threshold the image
    mask = cv2.inRange(hsv, lower, upper)
    ## Find the blob of red
    contours, hierarchy = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    ## Find blob with biggest area
    if len(contours) > 0:
        maxA = 0
        maxC = []
        for cnt in contours:
            area = cv2.contourArea(cnt)
            if area > maxA:
                maxC = cnt
                maxA = area
        cv2.drawContours(img, [maxC], -1, (0, 0, 255), 3)  # draw the biggest contour
    return img
#############################################
## Video Loop
while(1):
    ## Read the image
    ret, frame = cap.read()
    ## Do the processing
    output = findPoint(frame)
    ## Show the image
    cv2.imshow('image', output)
    ## End the video loop
    if cv2.waitKey(1) == 27:  ## 27 ASCII for escape key
        break
#############################################

#############################################
## Close and exit
# close camera
cap.release()
cv2.destroyAllWindows()
#############################################
8.3.7 Step 7
Now that we have found the blob we need, we must find its centre. This is done in OpenCV using contour moments. Once we find the centre, we return it from the same function, findPoint, and draw circles around it to show where this point is.
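Moments reduce to sums over the blob: m00 is the number of pixels (the area), m10 and m01 are the sums of the x and y coordinates, and the centre is (m10/m00, m01/m00). A NumPy sketch on a tiny hand-made mask (this computes image moments of the mask directly, whereas cv2.moments in the program below computes moments of the contour, the same idea applied to the blob's outline):

```python
import numpy as np

# 5x5 mask with a 2x2 blob occupying columns 2-3 of rows 1-2
mask = np.zeros((5, 5), dtype=np.uint8)
mask[1:3, 2:4] = 255

ys, xs = np.nonzero(mask)
m00 = len(xs)             # number of blob pixels (area)
cx = int(xs.sum() / m00)  # m10 / m00
cy = int(ys.sum() / m00)  # m01 / m00

print((cx, cy))  # (2, 1): int() truncates the true centre (2.5, 1.5)
```

The int() truncation matches the program below, and the guard there against m00 == 0 avoids dividing by zero when the mask is empty.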
#############################################
## Import OpenCV
import numpy
import cv2
# Initialize camera
cap = cv2.VideoCapture(1)
canvas = numpy.zeros((480, 640, 3), numpy.uint8)
#############################################

#############################################
## Finding the point (LED)
def findPoint(img):
    ## Convert to HSV
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    ## Define thresholds
    lower = numpy.array([0, 0, 200])
    upper = numpy.array([30, 10, 255])
    ## Threshold the image
    mask = cv2.inRange(hsv, lower, upper)
    ## Find the blob of red
    contours, hierarchy = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    # contours = []
    ## Default return value
    (cx, cy) = (0, 0)
    ## Find blob with biggest area
    if len(contours) > 0:
        maxA = 0
        maxC = []
        for cnt in contours:
            area = cv2.contourArea(cnt)
            if area > maxA:
                maxC = cnt
                maxA = area
        ## Find the center of that blob
        M = cv2.moments(maxC)
        if M['m00'] != 0:
            cx = int(M['m10'] / M['m00'])
            cy = int(M['m01'] / M['m00'])
    return mask, (cx, cy)
#############################################
## Video Loop
while(1):
    ## Read the image
    ret, frame = cap.read()
    ## Do the processing
    mask, outxy = findPoint(frame)
    ## Draw the point returned
    cv2.circle(canvas, outxy, 5, (0, 0, 255), 2)
    cv2.circle(canvas, outxy, 20, (255, 0, 255), 2)
    cv2.circle(canvas, outxy, 40, (0, 255, 255), 2)
    # cv2.line(canvas, oldPos, pos, colour, t)
    ## Show the image
    cv2.imshow('orig', frame)
    cv2.imshow('image', canvas)
    ## End the video loop
    if cv2.waitKey(1) == 27:  ## 27 ASCII for escape key
        break
#############################################

#############################################
## Close and exit
# close camera
cap.release()
cv2.destroyAllWindows()
#############################################
8.3.8 Step 8
Finally, we need to draw the points that we found earlier. This completes our sample application of using Computer Vision to draw on air.
#############################################
## Import OpenCV
import numpy
import cv2
# Initialize camera
cap = cv2.VideoCapture(1)
# canvas = numpy.zeros((480, 640, 3), numpy.uint8)
#############################################

#############################################
## Finding the point (LED)
def findPoint(img):
    global oldPos
    ## Convert to HSV
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    ## Define thresholds
    lower = numpy.array([0, 0, 200])
    upper = numpy.array([30, 10, 255])
    ## Threshold the image
    mask = cv2.inRange(hsv, lower, upper)
    ## Find the blob of red
    contours, hierarchy = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    ## Default return value
    (cx, cy) = (0, 0)
    ## Find the blob with the biggest area
    ## (maxA starts at 400 so that tiny specks are ignored)
    if len(contours) > 0:
        maxA = 400
        maxC = None
        for cnt in contours:
            area = cv2.contourArea(cnt)
            if area > maxA:
                maxC = cnt
                maxA = area
        if maxC is not None:
            ## Find the centre of that blob
            M = cv2.moments(maxC)
            if M['m00'] != 0:
                cx = int(M['m10'] / M['m00'])
                cy = int(M['m01'] / M['m00'])
    return mask, (cx, cy)
#############################################

#############################################
## Video loop
while True:
    ## Read the image
    ret, frame = cap.read()
    ## Do the processing
    mask, outxy = findPoint(frame)
    ## Draw the point returned
    cv2.circle(frame, outxy, 5, (0, 0, 255), 3)
    cv2.circle(frame, outxy, 20, (255, 0, 255), 3)
    cv2.circle(frame, outxy, 40, (0, 255, 255), 3)
    ## Show the image
    cv2.imshow('image', frame)
    ## End the video loop
    if cv2.waitKey(1) == 27:  ## 27 is ASCII for the escape key
        break
#############################################

#############################################
## Close and exit
# close camera
cap.release()
cv2.destroyAllWindows()
#############################################
8.4 Experiments
8.5 Debugging