
Raspberry Drone: Unmanned Aerial Vehicle (UAV)

Guilherme Lereno Santos Vale

Thesis to obtain the Master of Science Degree in

Telecommunications and Information Engineering

Supervisors: Prof. Nuno Filipe Valentim Roma and Prof. Ricardo Jorge Fernandes
Chaves

Examination Committee
Chairperson: Prof. Paulo Jorge Pires Ferreira
Supervisor: Prof. Nuno Filipe Valentim Roma
Member of the Committee: Prof. Alberto Manuel Ramos da Cunha

May 2015
Acknowledgments

I would like to thank Professor Nuno Roma and Professor Ricardo Chaves for giving me the opportunity
to work under their supervision and for always being available to help me throughout the entire project.

Resumo

Esta tese foca-se no desenvolvimento e implementação de um módulo computacional num unmanned
aerial vehicle (UAV), baseado num mini computador de baixo custo. Este módulo interage com a estação
terrestre, de maneira a transmitir a informação obtida de um conjunto de sensores (GPS, imagem e
outros sensores do UAV). Na estação terrestre, as coordenadas de GPS são subsequentemente usadas
para traçar a trajetória num mapa baseado em Google Maps, embebendo a informação dos sensores
no mapa. A estação móvel foi desenvolvida na linguagem de programação C no mini computador
Raspberry Pi, com uma distribuição de Linux. A estação móvel captura imagens com uma webcam
pronta a usar e usa um GPS convencional para adquirir a latitude/longitude. A informação adquirida
é então enviada para a estação terrestre através de uma ligação de comunicação sem fios WiFi.
A estação terrestre é composta por uma aplicação web, desenvolvida em Node.js, com uma interface
gráfica que mostra (em tempo real) a trajetória do UAV e insere os dados dos sensores e as imagens,
devidamente georreferenciados, como marcadores no mapa. De acordo com a avaliação de desempenho
realizada, a implementação do sistema preenche os requisitos, sendo capaz de oferecer uma taxa de
transferência de 4 imagens por segundo.

Palavras-chave: Unmanned Aerial Vehicle, Sistema Embebido, Mini Computador, Aquisição e Codificação de Imagem, Aquisição de Sinais de Georreferenciação, Interface Gráfica.

Abstract

This thesis focuses on the development and implementation of an on-board computing module for an
unmanned aerial vehicle (UAV), based on a low-cost single-board computer. This module interacts with
a base station, in order to transmit the data gathered from a set of sensors (GPS, image and other UAV
sensors). At the base station, the GPS coordinates are used to track the trajectory in a
user interface based on Google Maps, embedding the sensor data on the map. The mobile station
was developed in the C programming language on a Raspberry Pi single-board computer running a Linux
distribution. It captures images with an off-the-shelf low-cost webcam and uses a conventional GPS
module to acquire the latitude/longitude. The acquired data is then sent to the base station
through a WiFi communication link. The base station consists of a web application, developed
in Node.js, with a graphical interface that shows (in real time) the trajectory of the UAV, embedding
the georeferenced sensor data and images as markers on the map. According to the
conducted performance evaluation, the implemented system fulfils the target requirements,
offering an image acquisition throughput of 4 fps.

Keywords: Unmanned Aerial Vehicle, Embedded System, Single-Board Computer, Image Acquisition and Encoding, Georeference Signal Acquisition, Graphical Interface.

Contents

Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Resumo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiv
Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv

1 Introduction 1
1.1 Context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.4 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

2 Related Work and Technologies 4


2.1 Existing Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1.1 FPV Video Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1.2 Video/Telemetry Transmission Systems . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.3 Existing Solutions Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 Relevant Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2.1 Processing Boards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.2 Video Acquisition Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.3 GPS Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2.4 Data Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2.5 Data Visualisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

3 System Architecture 20
3.1 Overall System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.2 Mobile Station . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.2.1 Hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.2.2 Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.3 Base Station . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

4 System Integration and Implementation 36
4.1 Mobile Station . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.1.1 Operating System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4.1.2 Image Acquisition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.1.3 Image Encode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.1.4 Implementation of the Image Acquisition and Encode . . . . . . . . . . . . . . . . 46
4.1.5 GPS Acquisition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.1.6 Image Geotagging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.1.7 Threads Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.2 Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.3 Base Station . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.3.1 Message Formatting and Protocol Implementation . . . . . . . . . . . . . . . . . . 55
4.3.2 System Control and Data Multiplexing . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.3.3 Map Module . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.3.4 Charts Module . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.3.5 User Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

5 Experimental Results 63
5.1 Mobile Station . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.1.1 Image Acquisition and Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.1.2 GPS Acquisition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.1.3 EXIF: Image Geotag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.1.4 Sensors Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.2 Communication Link . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.3 Base Station . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.4 Final Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

6 Conclusions 73
6.1 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

Bibliography 77

List of Tables

2.1 Considered boards specifications: Raspberry Pi [1], BeagleBone Black [2], Gooseberry . 10
2.2 Comparison of 802.11 standards [3] [4] [5] . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

4.1 V4L2 Device types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40


4.2 Average time to encode one hundred PPM images to JPEG on a Lenovo T430 computer 44
4.3 Average time to acquire and encode one hundred images to JPEG . . . . . . . . . . . . 44
4.4 The packet structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.5 Configuration packet structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.6 Image message to the system control and data multiplexing module . . . . . . . . . . . . 56
4.7 Sensors message to the system control and data multiplexing module . . . . . . . . . . . 56

5.1 Influence of sensors sample rate in the performance of the image module . . . . . . . . . 64
5.2 Sensor acquisition and transmission time in the Raspberry Pi version 1 . . . . . . . . . . 66
5.3 Sensor acquisition and transmission time in the Raspberry Pi version 2 . . . . . . . . . . 66

List of Figures

1.1 Simplified view of the project architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

2.1 First Person View architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5


2.2 Fat Shark PredatorV2 FPV set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.3 Teradek Bond transmission . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.4 System modules: mobile and base stations . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.5 Raspberry Pi single board computer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.6 BeagleBone Black single board computer . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.7 Gooseberry single board computer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.8 Gray levels image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.9 Industrial camera iXU 150 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.10 GoPro Hero 3 camera . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.11 GPS triangulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.12 Throughput comparison of WiLD MAC and stock 802.11 MAC [6] . . . . . . . . . . . . . . 17
2.13 Google Earth globe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

3.1 Block diagram of the proposed architecture: mobile and base stations . . . . . . . . . . . 21
3.2 Mobile station block diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.3 Logitech C270 HD webcam . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.4 ND-100S GPS USB Dongle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.5 Raspberry Pi single board computer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.6 Image acquisition modules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.7 Chrominance subsampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.8 Main steps of JPEG algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.9 DCT coefficients in a zig-zag pattern . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.10 GPS modules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.11 NMEA message format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.12 EXIF marker . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.13 Reliability over UDP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.14 Base station block diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.15 Example of a JSON message . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

3.16 Application modules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

4.1 Mobile station threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37


4.2 Main steps of programming a V4L2 device . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.3 v4l2 buffer state . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.4 Image buffers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.5 EXIF attribute information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.6 Threads execution pattern without scheduling adoption . . . . . . . . . . . . . . . . . . . . 51
4.7 Synchronization of transmission thread . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.8 Thread execution pattern with scheduling adoption . . . . . . . . . . . . . . . . . . . . . . 52
4.9 System integration with Enet protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.10 Base station implementation overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.11 Base station implementation overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.12 Map example with marker and InfoWindow . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.13 Example of a polyline in a map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.14 Base station user interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.15 Pop-up menu to choose the considered sensor . . . . . . . . . . . . . . . . . . . . . . . . 62
4.16 Pop-up menu to choose the image resolution . . . . . . . . . . . . . . . . . . . . . . . . . 62

5.1 System time of mobile station threads (the GPS thread is not represented because its
system time is too small) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.2 Image acquisition and encoding frame-rate in the Raspberry Pi version 1 . . . . . . . . . 64
5.3 Image acquisition and encoding frame-rate in the Raspberry Pi version 2 . . . . . . . . . 65
5.4 Comparison between Raspberry Pi versions 1 and 2 . . . . . . . . . . . . . . . . . . . . . 65
5.5 EXIF geotag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.6 Bandwidth usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.7 Trajectory of the UAV and image/sensors markers . . . . . . . . . . . . . . . . . . . . . . 67
5.8 InfoWindow with image and sensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.9 Plot with sensors data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.10 Base station interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.11 Drone with the mobile station . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.12 Communication bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.13 Communication packet loss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.14 Jitter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.15 Final test performed on the drone . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.16 Final interface of the drone test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

Chapter 1

Introduction

Unmanned Aerial Vehicles (UAVs), commonly referred to as drones, are remotely piloted aircraft systems.
They range from simple hand-operated short-range systems to long-endurance, high-altitude systems.
They are also referred to as Unmanned Aerial Systems (UAS) and Remotely Piloted Aircraft
(RPA). UAVs have civil, commercial and military uses.
In this project, a distributed data acquisition, processing and visualization system, composed of a
mobile station and a base station, will be designed, implemented and integrated in a UAV. The
main goal of this system is to acquire data from navigation sensors and images from a camera installed
in the UAV and send them to the base station. The mobile station also has a GPS receiver, which will
be used to georeference all the data gathered by the sensors and to track the trajectory of the UAV. The
base station consists of a computer (laptop) on the ground and is used to receive the gathered signals
transmitted by the mobile station. After being received, these signals must be processed and
displayed on a map that georeferences the data.

1.1 Context

”The genesis of unmanned flight systems began in 1916 when Lawrence and Elmer Sperry combined
the stabilizing gyro and a steering gyro to make an automatic pilot system designated by Aerial Torpedo”
[7]. Military planners had anticipated the great value of an uninhabited air vehicle, although the U.S.
did not begin experimenting seriously with unmanned vehicles until the late 1950s. The modern UAV
appeared in the early 1970s, when designers in the U.S. and Israel started experimenting with small,
slow and cheap UAVs. Their most important feature was the use of video cameras that could send
pictures to the operator in real time [7].
Nowadays, the UAV industry is growing fast. The most significant catalyst of UAV growth has been
the U.S. army. The U.S. General Accounting Office estimates that the number of countries with UAVs
increased from approximately 41 in 2004 to at least 76 in 2012 [8]. However, the growth in
the use of armed UAVs, particularly by the U.S., raises a significant number of moral, ethical and legal
issues.

UAV technology has matured to the extent that it has become a key asset in military organizations. On
the other hand, the civil and commercial market for UAVs still has unrealised potential in a wide number
of applications, where the available technologies offer the opportunity to replace older solutions and
to be applied in new areas where no viable solution existed in the past [9].
To date, Portugal has little experience in the use of UAVs for military activities, but it is expected to
make significant moves into this area. Furthermore, there are also several civil initiatives in Portugal
focused on the usage of UAVs to assist in the detection of forest fires [9].

1.2 Motivation

UAVs were initially developed for military use, but are increasingly being deployed in civilian applications,
including mapping, monitoring and managing habitats and natural resources.
The high flexibility in the use of UAV systems permits immediate intervention and interactive mea-
surements according to customers' specific needs [10]. In fact, UAVs:

1. Allow mapping and monitoring small areas at extremely fine scales (up to 1 cm).

2. Can reach the desired point of observation in just few minutes.

3. Allow multi-temporal acquisitions over the same area at predefined and desired time intervals.

As an example, UAVs are playing a fundamental role in detecting pests in forests and agriculture, since
the cost and availability of high resolution satellite imagery often limit its application in agriculture.
Consequently, UAVs are an inexpensive and more practical substitute for satellite and general aviation
aircraft for high resolution remotely sensed data. Moreover, UAVs are immediately accessible as a tool
for remote sensing by scientists and farmers. UAV imagery can bridge the gap between remotely sensed
imagery from aerial and satellite platforms and detailed ground-based observations [11]. UAVs also have
several advantages over piloted aircraft. An unmanned system can be deployed quickly and repeatedly
for assessment of effectiveness and also poses less risk because there is no pilot and it is less costly
than piloted aircraft.
As a consequence, it is expected that the flexibility in acquisition and the much lower image extraction
costs will lead the UAV industry to increasingly take over demand from traditional manned aircraft.
Images captured using UAVs generally have a spatial resolution of centimeters, and their acquisition
is easier to manage and less influenced by cloud cover [11].

1.3 Requirements

As referred to before, a UAV is an aircraft system that is able to fly autonomously during all phases of
flight while being monitored from an operator station. Therefore, any new UAV system should fulfil certain
basic requirements and a comprehensive set of system specifications should be considered. Since this
system is a prototype for an air vehicle, it must be lightweight, with small dimensions and low power
consumption. A low price is also a strict requirement.
Nevertheless, an on-board system with enough computational power must be considered in this
project, allowing for image and georeference data acquisition. It must also be able to transmit the
gathered sensor signals to a base station. The base station should receive the data acquired by the
UAV and display it.

1.4 Objectives

This project is a prototype and is part of a UAV currently under development by the Portuguese Air
Force. The main goal of this project is to develop a set of software components to handle the interaction
between the on-board computing module and the base station.
In the on-board computing module, the software must handle the interface with the hardware that
deals with the image and GPS acquisition, as well as with the data transmission system installed in
the UAV, in order to acquire images and GPS coordinates and send them to the base station. Other
information from sensor signals can also be sent over the data link. All the acquired data should be
properly encoded and compressed before transmission.
In the base station, the software must receive the information from the UAV and display the data.
The GPS coordinates will be used to track the trajectory on a map, while displaying the acquired image.
Figure 1.1 shows a simplified view of the project. In particular, the main objectives of this project are:

1. Implementation of the interface software between the on-board computer, the sensors, the GPS
and the video camera.

2. Implementation of the communication between the on-board computer and the base station.

3. Implementation of the user interface, at the base station.

Figure 1.1: Simplified view of the project architecture

Chapter 2

Related Work and Technologies

This chapter presents the existing technologies related to the addressed problem. Section 2.1 presents
existing solutions for video systems (section 2.1.1) and video transmission systems (section 2.1.2).
Section 2.2 presents the relevant hardware platforms: processing boards (section 2.2.1), cameras
for image acquisition (section 2.2.2), GPS systems (section 2.2.3) and data communication technologies
(section 2.2.4). Section 2.2.5 presents some available solutions for data visualization.

2.1 Existing Solutions

There are not many existing solutions or commercial products available that can be installed in a UAV
system and that integrate all the requirements set for the presented system, namely video, GPS and
communication systems. Some of the existing commercial products, which are somehow related to the
developed work, are presented in this section.

2.1.1 FPV Video Systems

On-board video acquisition systems are increasingly used in UAVs. The main objectives of these sys-
tems are live FPV (First Person View) imagery, aerial photography and videography.
FPV is a method used to control a UAV from the pilot's viewpoint. FPV flight
involves mounting a small video camera and an analog television transmitter on an aircraft and flying it by
means of a live video down-link, commonly displayed on video goggles or on an LCD screen. In an FPV
system, the displayed image is the aircraft's perspective. An FPV aircraft can be flown beyond visual
range by using the remote control and the video transmitter. FPV has become increasingly common
and is one of the growing activities involving radio control (RC) aircraft.
There are three major components in FPV systems: the on-board video acquisition system, the
telemetry system and a display (ground station). A basic FPV system consists of a camera and video
transmitter on the aircraft and a video receiver and a display on the ground. More advanced setups
commonly add specialized hardware and software, including on-screen displays with GPS navigation
and flight data, stabilization systems and autopilot devices with "return to home" capability, allowing the
aircraft to fly back to its starting point on its own in the event of a signal loss.
A basic FPV setup (see Figure 2.1) needs at least four items:

• A camera: For video acquisition.

• A video transmitter: To transmit the picture to the ground.

• A video receiver: To receive the picture on the ground.

• A video display: To watch the video fed from the aircraft.

Figure 2.1: First Person View architecture

The following paragraphs describe two commercial solutions currently available on the market:

Fat Shark FPV system


Fat Shark is an FPV manufacturer, offering complete solutions for FPV systems, from cameras to displays.
In particular, the PredatorV2 is a Fat Shark plug-and-play FPV system [12]. This set includes
a wireless 250 mW 5.8 GHz A/V video transmitter and a camera with a 2.8 mm lens for a wide 100
degree angle, ideal for fixed-camera video piloting. The user can choose between two analogue video
standards: NTSC (National Television Systems Committee) or PAL (Phase Alternating Line). NTSC video
has lower resolution than PAL but appears smoother due to the slightly higher frame rate (29.97 fps vs 25 fps).
To watch the video fed from the aircraft, the Fat Shark Predator is equipped with a VGA 640 x 480 FPV
headset. Fat Shark claims that the offered system has an operating range of 1 km. The full set has a
price of 299 EUR [13].

Sky Drone FPV system


The Sky Drone solution offers a low-latency, digital full-HD FPV system that utilizes cellular networks 1 .
This FPV system uses a high resolution camera (1080p) with a wide dynamic range.
1 http://www.skydrone.aero/fpv

Figure 2.2: Fat Shark PredatorV2 FPV set

The main technological differences between the Fat Shark system and this one are the communication link
and the video format. While the Fat Shark system works with analog video, the Sky Drone works with
digital video and sends it through a 3G/4G link rather than the direct link used by the Fat Shark.
By using the existing 3G/4G networks, the Sky Drone system provides a considerably longer range,
the only requirement being the availability of cell tower coverage. In order to achieve a low latency, Sky
Drone uses a custom video codec, streamed via UDP.
This system does not include a ground station display to watch the live video, but comes with a
software application that runs on Android and Windows. The Sky Drone FPV software automatically
adjusts to the network's capabilities and always tries to ensure a smooth video stream. This system has
a price of 499 dollars.

2.1.2 Video/Telemetry Transmission Systems

In this section, a subset of the complete telemetry system solutions available on the market is
presented.

Teradek Clip
The Teradek Clip is an aerial video (H.264) transmitter, developed specifically for UAVs [14]. Clip transmits
directly to a single iOS device with less than 4 frames of latency, giving live video of the UAV flight. This
transmitter can be used for point-to-point broadcast with decoders (such as Teradek decoders), over 2.4/5
GHz a/b/g/n MIMO WiFi. Clip has a micro HDMI input and dual RP-SMA jacks for high-gain antennas,
and can transmit up to 92 m with a video bit rate of 150 kbps to 5 Mbps. The range can be extended by
using directional antennas. Clip also allows transmitting telemetry data to a ground station in real time.

Teradek Bond
The Teradek Bond allows broadcasting 1080p HD video over the aggregated bandwidth of five network
interfaces [15]. Bond can stream from nearly anywhere over five 3G/4G modems. The Teradek Bond
connects to the encoder via USB. Bond devices require a Sputnik server [15], which converts each
bonded feed into a standard video format that can be sent to any streaming platform on the Web or to
several H.264 decoders [15] (figure 2.3).

Figure 2.3: Teradek Bond transmission

2.1.3 Existing Solutions Discussion

Nowadays, several companies in different areas offer commercial products, some of them presented
in this section, that are somehow related to the work developed in this project.
FPV systems can be used to acquire, transmit and display images, but the majority of these systems
work with analog video and do not fulfil the requirements of this project. On the other hand, the Teradek
Clip and Teradek Bond are very good solutions to transmit video and other data. These systems could
be used in the project to transmit the gathered sensor data from the mobile station to the base station
but, for a project which is a proof of concept, they are too expensive: the Teradek Clip costs 599
dollars and the Teradek Bond costs 3,990 dollars. These systems also do not respect the requirements of
this project for low energy consumption and small dimensions.
The existing off-the-shelf solutions are thus not considered herein, since they are not able to fulfil
the main requirements defined.

2.2 Relevant Technology

The objective of this project is to implement a system composed of two units: a mobile station and a base
station. In order to fulfil the requirements of this project, the UAV must be able to acquire sensor signals
(image, georeference data and other sensor data) and transmit the gathered data to the base station,
which will then show it to the user. To develop this system, several hardware/software modules are
needed, defined in figure 2.4, which will be integrated in the mobile and base stations.
For this project (and proof of concept), a simple configuration interface is targeted, where the mod-
ules are easy to connect and integrate, under a strict low-cost requirement. Thus, the system will use a
Figure 2.4: System modules: mobile and base stations

lightweight single-board computer, connected to a low-cost GPS module (to acquire the coordinates)
and an off-the-shelf camera (to acquire the images). The mobile station will interact with the base station
in order to transmit the gathered sensor signals. Therefore, the following subsections present the relevant
implementation technologies and platforms for this project: single-board computers, cameras, GPS systems
and communication systems.

2.2.1 Processing Boards

A single-board computer is a small-sized complete computer, built on a single circuit board, that plugs
into a monitor and a keyboard. It is a very compact computation system with microprocessor, memory
and input/output (I/O), and it can be used for many of the tasks performed by a desktop PC.
The main requirements taken into account to choose the single-board computer for this project are: low cost;
adequate computational power to acquire data from the sensors (GPS, image and other UAV sensors)
and transmit them to the base station; interfaces to connect a video camera, a GPS module and a wireless
transmitter; and support for a robust open-source OS, with device drivers for
acquiring image and GPS signals. Therefore, preference was given to boards with Linux compatibility.
In the following paragraphs, a brief review of a subset of currently available single-board computers
is presented.

Raspberry Pi
Raspberry Pi (see Figure 2.5) is one of the most used mini computer of the market [1]. It is available in
two models (A and B) at a very low price and both models have the same processor (ARM1176JZF-S
700 MHz). One of the strengths of Raspberry Pi are the connectivity options, allowing HDMI out, RCA
out, audio out, the connection of two USB devices, Ethernet port, GPIO pins, and an SD card. The GPIO
pins and the two USB ports of model B are an important feature for this project, since they can be used

8
to connect the sensors modules.
Raspberry Pi model A has 256 MB RAM, one USB port and no Ethernet (network connection). Model
B has 512 MB RAM, two USB ports and an Ethernet port 2 . Table 2.1 shows the main specifications of
Raspberry Pi, among the other considered computing systems.

Figure 2.5: Raspberry Pi single board computer

In the final stage of this work, a new version of the Raspberry Pi was released (February 2015), the
Raspberry Pi 2 Model B. The new Raspberry Pi comes with an upgraded processor, a Broadcom BCM2836
ARMv7 quad-core running at 900 MHz, and the amount of RAM has also been upgraded to 1 GB.
The rest of the hardware, however, matches that of the Raspberry Pi B: a VideoCore GPU, a 40-pin GPIO,
four USB ports and 10/100 Ethernet. Physically, the Raspberry Pi 2 also has the same dimensions
as the previous version.

BeagleBoards
The BeagleBoard family has four models available. Among the offered solutions, the cheapest model is the
BeagleBone Black (see Figure 2.6), which has a better processor than the Raspberry Pi (a 1 GHz ARM Cortex-A8
CPU) and 2 GB of 8-bit eMMC on-board flash storage. One disadvantage of this board is its single
USB port. This board can boot in less than 10 seconds [2]. Table 2.1 presents the specifications of the
BeagleBone Black.

Figure 2.6: BeagleBone Black single board computer

2 http://www.raspberrypi.org

Gooseberry
The Gooseberry Board (see Figure 2.7) is a single-board mini computer slightly bigger than the Raspberry Pi
board. Like the BeagleBone Black, the Gooseberry Board has a better processor than the Raspberry Pi, an ARM
A100 at 1 GHz (overclockable to 1.5 GHz), and an internal 4 GB flash storage. One of the strengths of this
board is the integrated 802.11 b/g/n WiFi.

The main disadvantages of this board are: only one mini USB port (a USB hub is needed to
connect multiple peripherals); no LAN port; and, at present, Android being the only compatible OS for this board
(Ubuntu is also a possibility but requires some customization and additional knowledge). Table 2.1 shows
the specifications of the Gooseberry.

Figure 2.7: Gooseberry single board computer

                             Raspberry Pi (model B)            BeagleBone Black                 Gooseberry
CPU                          ARM1176JZF-S 700 MHz              AM335x 1 GHz ARM Cortex-A8       A100 1 GHz ARM
GPU                          Broadcom VideoCore IV             PowerVR SGX530                   Mali 400 MHz
Memory                       512 MB                            512 MB DDR3                      512 MB
Storage                      SD card slot                      2 GB 8-bit eMMC on-board,        4 GB on-board,
                                                               micro SD card slot               micro SD card slot
Operating System             Linux (Raspbian, Pidora, Arch),   Linux and Android                Android 4.0 ICS
                             OpenELEC, RaspBMC, RISC OS
Dimensions                   85.60 mm x 56 mm                  86.40 mm x 53.3 mm               91.3 mm x 72.1 mm
Idle/max power consumption   1 W / 2.4 W                       1.05 W / 2.3 W                   2.3 W / 4 W
Price                        39 EUR                            55.35 EUR                        51 EUR

Table 2.1: Considered boards specifications: Raspberry Pi [1], BeagleBone Black [2], Gooseberry

Discussion
Despite the better performance of the other single-board computers presented in this section, the board
that best fits this project is the Raspberry Pi model B. Its processor (ARM1176JZF-S 700 MHz) and
memory (512 MB) are enough for the requirements of this project. Two considerable advantages of the
Raspberry Pi are its two USB interfaces to connect other modules and its lower cost. This board also
effortlessly supports a Linux distribution.

2.2.2 Video Acquisition Systems

The two major types of image sensors are CCD (Charge Coupled Device) and CMOS (Complementary
Metal Oxide Semiconductor). Both devices essentially do the same, i.e., convert light intensity into a
digital signal. However, the different manufacturing processes used to make these devices give each
type of sensor advantages and drawbacks, depending on the application. CCD sensors have a higher
manufacturing cost than CMOS and consequently the cameras with CCD sensors are more expensive.
The main requirements taken into account to choose the camera for this project were: low cost; easy
integration with the other modules; the ease of changing to another camera system; and compatibility with a
Linux device driver in order to interact with the OS. In the following paragraphs, the set of possible video
acquisition systems considered herein is discussed.

Industrial Cameras
Industrial cameras have been designed to high standards, with repeatable performance and robustness
to withstand the demands of harsh environments. The term "industrial cameras" is a little misleading,
since these cameras are used not only for machine vision but also for medical, scientific and security
purposes.
Industrial cameras consist of two basic building blocks: the image sensor and the digital interface.
They use one of the two major types of image sensors, CCD or CMOS, to convert light into electrons
(a digital signal). The digital interface forms a gray-level image (figure 2.8) and transfers this image to a
computer. This transfer is usually based on USB 2.0, USB 3.0, FireWire or GigE interfaces [16]. Some
of these cameras have a Bayer filter in order to acquire chromatic (color) images.
Industrial cameras provide a range of resolutions from 640 x 480 to 6576 x 4384 pixels and frame
rates from 2 fps to 300 fps. Most of these cameras provide images free of any pre-processing (RAW format).

Figure 2.8: Gray levels image

Despite the small number of industrial cameras specially developed for aerial UAV systems, it is a growing market.
One of these cameras is the Phase One iXU, which claims to be the world's smallest and lightest
industrial camera for aerial systems, weighing 750 grams. This camera has three versions with a CCD
sensor, which acquire achromatic images (figure 2.8) at a high resolution of 80 MP (Megapixel), and one
version (the iXU 150, figure 2.9) with a CMOS-based sensor, which acquires chromatic images at 50 MP or
1080p video at 30 fps. These sensors have 68% more capture area than any DSLR. Of these cameras, the
one that best fits this project is the last one, because it is a chromatic camera.
The Phase One iXU 150 camera offers three different output formats: RAW or the encoded
formats TIF and JPEG, which are sent to the host through a USB 3.0 interface. Another feature of this
camera is the possibility of incorporating a GPS, which allows image geotagging.
The possibility of processing images into a format more suitable for transmission, like JPEG, and the geotag
system are two features very relevant for this work. The downside of this camera is its high price,
around 60,000 dollars.

Figure 2.9: Industrial camera iXU 150

Action Cameras
Also called point-of-view (POV) cameras, these are small, lightweight, mountable video cameras de-
signed for hands-free recording. The video quality obtained with these cameras is very high, and they offer
features like built-in WiFi and slow-motion video capture.
Among the available brands of this type of device is GoPro [17]. GoPro cameras (see Figure
2.10) come in three different models (the Black, Silver and White editions) with HD image quality and
integrated WiFi. The manufacturer also offers an app that shows a streaming preview of what
the camera is recording on a mobile phone or tablet, but with a short delay of about 1-2 seconds. WiFi
communication on these cameras works in the 2.4 GHz b/g band, and the video stream has poor
quality because of the low bandwidth. To circumvent this limitation, these cameras allow connecting an
audio/video cable to the micro HDMI port, which can be connected to a telemetry system such as the Fat Shark
telemetry module. With this setup, it is possible to live stream 720p video at 60 fps with little delay.
The GoPro offers a huge range of resolutions, from WVGA up to 4K. The top model (Black Edition)
can record 1080p video at 60 fps and 720p video at 100 fps, and also supports 4K Cinema video at 12
fps. The Black edition can shoot burst photos at up to 30 fps, while the Silver and the White shoot at 3
fps and 10 fps, respectively.
The price of the Black edition is 449 EUR, while the Silver is 349 EUR and the White 249 EUR.

Webcam
A webcam is a small camera that fits in the palm of the hand and feeds its image in real time to a device.
This type of camera can capture video at HD resolutions such as 1280 x 720 at 30 fps and generally
uses USB to connect to a device. The most popular use of these cameras is videoconferencing. The great
Figure 2.10: GoPro Hero 3 camera

advantage of webcams is the low price.

Discussion
The best type of camera to install on the UAV would be an industrial camera, but some of them need a frame
grabber to handle the image acquisition and they have a high price. Given the budget limitation, and since this
is an evaluation prototype, a webcam is used herein. Nevertheless, any camera with a USB interface can be used in
the developed system.

2.2.3 GPS Systems

The Global Positioning System (GPS) is a satellite navigation system that provides positioning, naviga-
tion and timing (PNT) information anywhere on Earth, provided there is an unobstructed line of sight
to four or more GPS satellites [18].

The timing service is implemented by incorporating a high-accuracy atomic clock in each GPS satellite.
The satellites permanently broadcast their own time to the receiver, so that it can synchronize
itself. Besides the information about the time of each satellite, the satellites also broadcast their
current position.

With the information about the time the message was sent and the propagation speed (the speed of
light), the GPS module can calculate the distance between itself and each satellite as d = c x Δt,
where Δt is the signal travel time. By knowing the position of the satellites, which is sent in the message,
and by calculating the distance between the GPS module and each satellite, the GPS module can calculate
its own position (see Figure 2.11) [19]. The protocol used by the majority of GPS modules to communicate
with other devices is NMEA 0183, created by the National Marine Electronics Association [20].
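
Purely to illustrate the NMEA 0183 format mentioned above, the following C sketch decodes the latitude/longitude fields of a GGA sentence. It is an illustrative sketch only, not the code used in this work (which, as described in Chapter 3, obtains positions through a GPS daemon); the sample sentence is a commonly quoted example and the helper name nmea_to_degrees is arbitrary.

/* Minimal sketch: decode latitude/longitude from an NMEA 0183 GGA sentence. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Convert the NMEA "ddmm.mmmm" representation into decimal degrees. */
static double nmea_to_degrees(const char *field, char hemisphere)
{
    double value = atof(field);               /* e.g. 4807.038 */
    int degrees = (int)(value / 100);         /* 48            */
    double minutes = value - degrees * 100.0; /* 7.038         */
    double result = degrees + minutes / 60.0;
    return (hemisphere == 'S' || hemisphere == 'W') ? -result : result;
}

int main(void)
{
    const char *sentence =
        "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47";

    char copy[128];
    strncpy(copy, sentence, sizeof copy - 1);
    copy[sizeof copy - 1] = '\0';

    /* Fields are comma separated: type, time, lat, N/S, lon, E/W, fix, ... */
    char *fields[8] = {0};
    int n = 0;
    for (char *tok = strtok(copy, ","); tok != NULL && n < 8; tok = strtok(NULL, ","))
        fields[n++] = tok;

    if (n >= 6) {
        double lat = nmea_to_degrees(fields[2], fields[3][0]);
        double lon = nmea_to_degrees(fields[4], fields[5][0]);
        printf("lat = %.6f, lon = %.6f\n", lat, lon); /* 48.117300, 11.516667 */
    }
    return 0;
}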

The advent of GPS has allowed the development of hundreds of applications, affecting many aspects
of modern life. GPS technology is now in almost every electronic device such as cell phones, watches,
cars, shipping containers and ATM machines [19].

In order to improve the accuracy of the GPS system, a few variations can be considered. DGPS
and RTK-GPS are two such methods, described in the following paragraphs.

Figure 2.11: GPS triangulation

DGPS
Differential GPS (DGPS) is a method to improve the positioning and timing performance of GPS using
one or more reference stations on Earth at known fixed locations, each equipped with at least one GPS
receiver. The accuracy of modern DGPS receivers is within 1 meter [21].

If the station and the terminal are fairly close (within a few hundred kilometers) the signals that reach
both of them will have travelled through virtually the same slice of atmosphere, and so will have virtually
the same errors. The idea behind differential GPS is the existence of one station that measures the
timing errors and then provides correction information to the receivers.

This reference station receives the same GPS signals as the user receiver, but instead of using
timing signals to calculate the position, it uses its known position to calculate timing. It figures out what
the travel time of the GPS signals should be, and compares it with what they actually are. Since the
station has no way of knowing which of the many available satellites a user receiver might be using
to calculate its position, the station computes the errors of each visible satellite and then transmits the
correction to the user receiver [22][19].

RTK-GPS
Real Time Kinematic (RTK) GPS can provide centimeter-accurate measurements in real time [19]. The user
antenna needs to be within a distance of 10 km of the base station, in order to receive the real-time radio
links that transmit the position correction information.

RTK needs an initialization time of about 1 minute in order to give the maximum precision. RTK offers
two types of solutions: float and fixed. RTK float needs at least 4 common satellites and offers an accuracy
between 20 cm and 1 m. RTK fixed needs at least 5 common satellites and offers an accuracy within 2 cm
[23].

Discussion
Despite the great accuracy offered by the DGPS and RTK-GPS technologies, they also have a high cost,
associated with the price of the equipment and with their implementation. To fulfil the strict requirement
for the lowest possible price, the proposed solution will use a low-cost GPS receiver.

2.2.4 Data Communication

A UAV is able to fly autonomously during all phases of the flight, but has to be monitored from a base
station. The communication system should be able to collect the data and transmit it to the base station.
The success of UAV missions is extremely dependent on the availability, high performance and security
of the communication channel. As such, a communication channel is essential in a UAV system.
The main requirements that should be taken into account in order to choose the communication
channel for this project are: low cost and enough bandwidth to transmit the gathered sensor data. To
estimate the necessary bandwidth for the data link, a 106 kB JPEG image (1280 x 720 resolution)
was repeatedly sent at a rate of 10 fps, resulting in a bandwidth of 8480 kbps. The other sensors
require a small bandwidth, less than 100 kbps. Thus, the communication channel requires a bandwidth
of at least 8580 kbps, as summarized below. In the remainder of this section, wireless communication
technologies are presented.
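
For reference, the 8480 kbps figure follows directly from the chosen frame size and rate (assuming the 1 kB = 1000 bytes convention and 8 bits per byte):

    106 kB/frame x 8 bits/byte x 10 frames/s = 8480 kbps
    8480 kbps (images) + 100 kbps (other sensors) = 8580 kbps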

Dedicated Channels
With a dedicated channel there is no sharing of the available bandwidth, since the communication line
is used for one single purpose. A dedicated channel avoids interference and allows data transfer with
higher reception quality. As a consequence, dedicated channels are normally used when a highly reliable
communication channel with minimal signal delay is required. The major disadvantage of these channels
is their high price.
The military UAV industry has commonly used frequencies in the UHF/VHF, S and C bands, but these
data links do not meet the demand for the desired data link rate [24]. Therefore, the more efficient L, X
and Ku frequency bands are used in UAV systems today. Some links allow point-to-point data transmis-
sions of up to 274 Mbps, which is enough to handle full-motion video in UAVs with more than 200 km
range [25].

ZigBee
ZigBee wireless mesh technology has been developed to address sensor and control applications, with
the promise of robust, reliable, self-configuring and self-healing networks that provide a simple, cost-
effective and battery-efficient approach [26]. ZigBee is a specification for a suite of high-level commu-
nication protocols using small low-power digital radios, based on the IEEE 802.15.4-2003 standard for
wireless personal area networks (WPANs) [27]. The technology defined by the ZigBee specification is
intended to be simple and low-cost.
ZigBee operates in the Industrial, Scientific and Medical (ISM) radio bands (868 MHz in Europe).
The data transmission rate of ZigBee ranges from 20 kbps to 250 kbps, with a range of 10 to 20 meters.

WiFi
802.11 is a set of wireless local area network (WLAN) standards developed by the Institute of Electrical
and Electronics Engineers (IEEE) that are mainly used for local wireless communications in the 2.4 GHz
and 5 GHz Industrial, Scientific and Medical (ISM) frequency bands.
The 802.11 standards consist of physical layer and media access control (MAC) protocols [28].
The most popular are those defined by 802.11b, 802.11g, 802.11a and 802.11n (see Table 2.2).

Standard    Frequency (GHz)    Bandwidth (MHz)    Max Data Rate (Mbps)    Approximate Outdoor Range (m)
802.11b     2.4                20                 11                      140
802.11a     5                  20                 54                      120
802.11g     2.4                20                 54                      140
802.11n     2.4, 5             20, 40             +100                    250

Table 2.2: Comparison of 802.11 standards [3] [4] [5]

The several 802.11 wireless standards can differ in terms of speed, transmission ranges, and used
frequencies, but they are similar in terms of actual implementation. All standards can use either an
infrastructure or an ad-hoc network design, and use the same security protocols. Wireless radios are
half-duplex and cannot listen while transmitting, so a CSMA/CA (Carrier-Sense Multiple-Access/Collision
Avoidance) mechanism is used to reduce collisions.
The IEEE 802.11 standard was designed for wireless broadcast environments with many hosts in
close vicinity competing for channel access. The last significant release of 802.11 was IEEE 802.11-
2009, which introduced 802.11n [29]. The goal of 802.11n is to increase significantly the data through-
put rate. While there are a number of technical changes, the addition of Multiple-Input Multiple-Output
(MIMO) and spatial multiplexing is an important one [5].

A MIMO solution can send a single stream, by using spatial diversity, or multiple simultaneous streams,
using spatial multiplexing to increase the transmission rate [30]. The multiple antenna pairs can provide
independent spatial paths between transmitter and receiver. The multiple antennas used in MIMO require
multiple radio chains and thus more electrical power.
Today, most high-rate wireless systems use MIMO technologies, including 802.11n, LTE and WiMAX
[31] [32].
WiFi-based Long Distance (WiLD) networks are emerging as a low-cost connectivity solution for
long distances. Unlike common wireless networks, which use omnidirectional antennas to cater for short
ranges, WiLD networks are composed of point-to-point wireless links, where each link can be as long
as 100 km. To achieve long distances in single point-to-point links, nodes use high-gain (e.g. 30 dBi)
directional antennas [33].
WiLDNet makes several essential changes to the 802.11 MAC protocol, but continues to exploit
standard (low-cost) WiFi network cards. To better handle losses and improve link utilization, WiLDNet
uses an adaptive loss-recovery mechanism using FEC and bulk acknowledgments [34]. Figure 2.12

shows the throughput/distance comparison between WiLD networks and stock 802.11.

Figure 2.12: Throughput comparison of WiLD MAC and stock 802.11 MAC [6]

WiMAX
An alternative for point-to-point links would be WiMAX technology (IEEE 802.16) [35]. WiMAX links have
a few advantages over WiFi: configurable channel spectrum width (and consequently data rate); better
modulation (especially for non line of sight scenarios); operation in licensed spectrum would permit
higher transmitter power, and thus longer distances and better signal strengths. On the other hand,
WiMAX products are more expensive than WiFi.

LTE
LTE (Long Term Evolution), or E-UTRAN (Evolved Universal Terrestrial Radio Access Network), is the latest
standard in mobile network technology and an evolution of the GSM/UMTS standards. LTE uses an IP
network and is based on standards developed by the 3rd Generation Partnership Project (3GPP) [36].
The main objectives of LTE networks are high peak data rates and short round trip times, as well as flexibility
in frequency and bandwidth. LTE networks provide downlink peak rates of 100 Mbps and uplink peak
rates of 50 Mbps with 20 MHz of bandwidth. This network provides a long range, but it needs cell tower coverage. On
the other hand, the usage of an LTE network as a data link can be expensive, because all the data traffic
has to be paid for to a telecom operator [37].

Discussion
To transfer the information acquired by the sensors, several technologies were considered, including
dedicated channels, LTE, ZigBee, WiFi and WiMAX. As mentioned before, the main requirements for the com-
munication link are enough bandwidth to transmit the gathered sensor data and the maximum possible range
at a low price.
Cellular networks (LTE) were eliminated due to the involved cost and to concerns related to coverage.
ZigBee was eliminated due to its low bandwidth and short range. A dedicated channel
would be the best option but, due to the involved cost, it was discarded as well. The technology that best
fits the communication link of this project is therefore standard WiFi.
2.2.5 Data Visualisation

After the data gathered at the UAV has been received by the base station, it is necessary to convert
it to a user-friendly format and display it to the user.
The main requirements of the data visualization for this project are: low cost and the ability to trace the
UAV trajectory and display the data from the gathered sensors. In this section, a set of relevant systems and
technologies to display the gathered data is presented.

Dedicated Maps
To develop a dedicated application, a database with the maps and servers to run the application is
necessary. The main advantage of such a system is the ability to be fully personalized, fast and se-
cure, since it is developed with one exclusive purpose. On the other hand, this type of system is more
expensive, particularly in terms of development and maintenance.

Google Earth
Google Earth is a stand-alone program developed by Google, which displays the world map on a virtual
globe (see figure 2.13), as seen from space, and allows viewing maps, terrain and 3D buildings. It is
possible to zoom in and out for close-up views. In some areas, the close-ups are detailed enough to
make out cars and even people. Google Earth has several features, such as finding driving directions and
measuring the distance between two locations.
The Google Earth Plug-in (with its JavaScript API) provides the ability to embed a true 3D digital globe
into web pages. By using the API, one can draw markers and lines, drape images over the terrain, add
3D models, or load KML files, allowing sophisticated 3D map applications to be built. The Google Earth
API also allows loading KML (Keyhole Markup Language) files, an XML notation for expressing
geographic data developed for use with Google Earth. KML files are very useful to load coordinates
as a track in Google Earth.

Google Maps
Google Maps is a Web-based service, supported by Google, which provides detailed information about
geographical regions around the world. Google offers advanced features that power map-based ser-
vices, including the Google Maps website, Google Ride Finder and Google Transit. The Google Earth and
Google Maps functionalities are similar: Google Earth displays satellite images of varying resolution of
the Earth's surface on a virtual globe, whereas Google Maps is a website to access and browse online
maps.
The Google Maps API is one of the most popular JavaScript libraries on the web and allows embedding
Google Maps into a proprietary web, Android or iOS application and using the map to search and explore
the world. Google Maps comes with three types of maps: street, satellite and terrain. All Maps API
applications load the Maps API by using an API key. Google uses this API key to monitor the application
5 https://developers.google.com/earth/
6 https://developers.google.com/maps/

Figure 2.13: Google Earth globe

Maps API usage and, if the usage exceeds 25,000 map views a day, Google will contact the developer and present
charges. Google Maps API for Business is a paid version of the Google Maps API that uses the same code
base as the standard Google Maps API, but provides the following additional features and benefits:

• Greater capacity for service requests such as geocoding.

• Business-friendly terms and conditions.

• Support and service options, with a robust Service Level Agreement (SLA).

• Intranet application support within the enterprise.

• Control over advertisements within the maps.

Discussion
The best system to monitor and map the UAV trajectory would be dedicated software, since it could
better target the specific usage, with a dedicated database and servers. However, this type of system
is more expensive and would take a long time to develop, so it is not an option.
The Google Earth plug-in and the Google Maps API are similar. Herein, the Google Maps API (free version)
is selected, since it is one of the most popular JavaScript libraries, is very well developed and has a lot
of documentation. For a proof of concept, the free version's limit of 25,000 map views a day is not a
problem.

7 http://www.google.com/enterprise/mapsearth/products/mapsapi.html

Chapter 3

System Architecture

This chapter discusses the main elements of the overall system and of its subsystems, composed
of two main units: the mobile station installed in the UAV, and the base station (see figure 3.1). This
chapter also presents the hardware and software architecture of each station and the protocol
used for signal acquisition and transmission.

3.1 Overall System

As previously mentioned, the main goal of this project is to develop a set of software modules for the different
stations with the following functionalities: GPS acquisition, image acquisition, acquisition of other signals,
geotagging, communication between the two stations and display of all the gathered data at the base station.
In order to accomplish these functionalities, a set of hardware devices will be needed (see figure 3.1).
The system architecture was implemented using a direct host-to-host communication network. The
next sections present the chosen hardware and the software architecture of each station separately.

Figure 3.1: Block diagram of the proposed architecture: mobile and base stations

3.2 Mobile Station

The mobile station will be integrated with the UAV and is based on an on-board computing module.
This module is connected to a GPS receiver, whose characteristics will be described in section
3.2.1.2, and to a camera for image acquisition, whose characteristics will be described in section 3.2.1.1.
The data obtained by the GPS module and the video camera, as well as other data from the UAV sensors,
will be sent to the base station using a data transmitter. For such purpose, the acquired data will be
properly encoded and compressed before transmission via WiFi.
Figure 3.2 shows a block diagram that illustrates the several modules that constitute the mobile
station and that, therefore, must be implemented.

Figure 3.2: Mobile station block diagram

The system is composed of the following elements:

• Webcam module: hardware which assures the image capture.

• Image grabber: software module that interfaces with the video capture device driver.

• Image encoding: module that encodes the image into a format suitable for transmission.

• GPS: hardware which assures the communication with the GPS satellites.

• GPS daemon: daemon responsible for the interface between the application and the GPS hard-
ware.

• GPS acquisition: module that communicates with the GPS daemon and formats the GPS data.

• Geotagging: module that tags the gathered sensors (image and other sensors) with the georefer-
ence data, when the sensors are acquired.

• Other sensors acquisition: software layer that receives the other sensors data from the UAV.

• System control and data multiplexing: module that controls all the data and prioritizes it before it is
sent.

• Message formatting and protocol implementation: module that sends data messages to the
base station.

3.2.1 Hardware

This section discusses the hardware modules used in this project: camera, GPS and single-board com-
puter.

3.2.1.1 Image Acquisition

The first stage of any vision system is the image acquisition stage. Before any video or image processing
can start, an image must be captured by a camera and converted into a manageable entity. Most analog
cameras capture light onto photographic film, while digital cameras capture images using electronic
image sensors that convert light into electrons. To transform the information from the camera sensor into
an image, the voltage of each cell of the sensor is converted into a pixel value in the
range [0, 255]. Such a value is interpreted as the amount of light hitting a cell during the exposure time.
This is denoted the intensity of a pixel, and it is visualized as a shade of gray, denoted a grayscale
or gray-level value, ranging from black (0) to white (255). In order to obtain color images, the sensor uses a
Bayer filter with a pattern of 50% green, 25% red and 25% blue.
Video can be defined as an ordered sequence of images varying at a certain rate. The frame rate,
which denotes the number of still pictures per unit of time, ranges from 6 fps or 8 fps for old mechanical
cameras to 120 or more fps for new high-performance professional cameras.

Given the requirements of this project, the selected camera is a Logitech HD webcam C270 (see
Figure 3.3). This camera is a USB webcam and will interface with the processing board by using the
Video4Linux device driver, provided by the Linux distribution. One important feature of this camera
is its compatibility with the UVC driver1 . For the desired proof of concept prototype, this camera fits the
requirements: low price, lightweight, easy to integrate and small dimensions.

Figure 3.3: Logitech C270 HD webcam

The main features of this camera are:

• Video capture: Up to 1280 x 720 pixels.

• Up to 30 frames per second.

• Photos: Up to 3.0 megapixels.

• Hi-Speed USB 2.0 certified.

• 220 grams weight.

3.2.1.2 GPS Module

The tracking feature of this work will be supported by a GPS receiver. The GPS module
will be responsible for providing location and time information in all weather conditions, anywhere on
Earth, provided there is an unobstructed line of sight to four or more GPS satellites. This device
will be installed in the UAV and connected to the single board computer via USB.
Several GPS hardware devices can be considered to provide the georeference coordinates on Earth.
The requirements considered for this device were:

• Hardware ready to connect to a USB port.

• Compatible with GPSd Linux daemon.

• Compatible with NMEA 0183 protocol.

• Low price.
1 http://www.ideasonboard.org/uvc

Thus, the chosen module was the ND-100S GPS USB Dongle (see Figure 3.4). This module is easy
to install, only needs a USB connection and does not require any additional power source. With this
type of GPS receiver, a horizontal accuracy of about 3 meters (or better) and a vertical accuracy of about
5 meters can be achieved.

Figure 3.4: ND-100S GPS USB Dongle

The main features of this GPS receiver are:

• GPS Chipset SiRF Star III.

• Supported protocols: NMEA 0183 v3 GGA (1 s), GSA (1 s), GSV (5 s), RMC (1 s); GLL and VTG are optional.

• Supported operating systems: MAC OS, Linux, Windows.

• USB 2.0.

• 20 grams weight.

3.2.1.3 Single Board Computer

The single board computer is the main component of the mobile station and will implement many of the
software modules of this project. This single board computer interfaces with the video camera, the GPS
receiver and the remaining UAV sensors, and combines all of the information into a protocol suitable for
streaming via the communication system.
In order to meet the computational restrictions of this proof of concept prototype, the selected single
board computer was the Raspberry Pi (see figure 3.5). It is an extremely lightweight (45 grams) ARM
board, with a 700 MHz processor and 512 MB RAM. Despite the better performance of other single
board computers specified in section 2.2.1, the selection took into account the availability of two USB
ports, which will be used to connect the camera and the GPS module. The SoC (System on a
Chip) contains an ARM1176JZF-S processor, an ARMv6 core fully compatible with the Linux
operating system, which will be further discussed. The compatibility with a Linux distribution is a relevant
feature because it provides the possibility of using the Video4Linux and the GPSd device drivers. The

Raspberry Pi also fulfils the low price requirement and has a comprehensive set of documentation. This
single board also meets the constraints of space, since it only measures 8.5 cm x 5.6 cm.

Figure 3.5: Raspberry Pi single board computer

The other main key features are the following:

• GPU: Broadcom VideoCore IV @ 250 MHz.

• Storage: SD card.

• OS: Linux (Raspbian, Pidora, Arch), OpenELEC, RaspBMC, RISC OS.

• Two USB 2.0 ports.

• 5 V DC-input.

• Incorporates one 10/100 Mbit/s Ethernet adapter.

• 26 GPIO pins.

3.2.1.4 Communication Link

The chosen technology for the communication link was IEEE 802.11 (WiFi). The usage of directional
antennas or WiFi over Long Distance (WiLD) links, in order to increase the range, was discarded because
it would be difficult to keep a perfect and permanent alignment of the antennas. Furthermore, this project
is a proof of concept with strict low-cost requirements.

3.2.2 Software

This section discusses the architecture of the several software modules. These modules will be imple-
mented in the C language, because it produces efficient programs, can handle low-level activities and
can be compiled on a variety of computer platforms.

3.2.2.1 Image Module

The image module is composed of three sub-modules (see figure 3.6). The first module, the webcam, is
the hardware that converts light into electrons, already presented in section 3.2.1.1.

Figure 3.6: Image acquisition modules

The second module, the image grabber, is responsible for the interface with the camera. This module
captures the video signal and stores the digitized image in memory. With this interface, it is possible to
control the capture process and move images from the camera into user space. The image is captured
in the YUV 4:2:2 format, which contains minimally processed data from the camera.

YUV Format
YUV is the native format of TV broadcast and composite video signals. It separates the brightness
information, the luminance (Y), from the color information (Cb and Cr). The color information consists of
blue and red color difference signals; the green component is reconstructed by subtracting Cb and Cr
from the brightness component.
The adoption of the YUV format comes from the fact that early television only transmitted brightness
information. Hence, to add color in a way compatible with existing receivers, a new signal carrier was
added to transmit the color difference signals. One important property of this format is that the U and V
components usually have lower resolution than the Y component, taking advantage of a property of the
human visual system, which is more sensitive to brightness information than to color. This technique,
known as chroma subsampling, has been widely used to reduce the data rate of video signals.
There are several subsampling schemes for the YUV format, but only three are considered relevant
for this project: 4:4:4, 4:2:2 and 4:2:0 (see Figure 3.7). In the following, a brief overview of each of
the schemes is given. In the digital domain, YUV is commonly denoted YCbCr.

YCbCr 4:4:4 Each of the three YCbCr components has the same sample rate. This scheme is
sometimes used in high-end film scanners and cinematic post-production.

YCbCr 4:2:2 The two chroma components are sampled at half the rate of the luminance in the horizontal
direction. This reduces the bandwidth of the video signal by one-third with little to no visual difference.
Many high-end digital video formats and interfaces make use of this scheme, e.g. AVC-Intra 100, Digital
Betacam, DVCPRO50, DVCPRO HD, Digital-S and CCIR 601, and it can also be used to encode the JPEG
format [38].

Figure 3.7: Chrominance subsampling

YCbCr 4:2:0 This scheme is used in JPEG/JPEG File Interchange Format (JFIF) encoding of still
frames and in most video compression standards, such as H.261 and the H.26x/MPEG-x families. The Cb
and Cr components are each subsampled by a factor of 2 horizontally and a factor of 2 vertically.
There are several variants of 4:2:0 subsampling. In JPEG/JFIF, H.261 and MPEG-1, Cb and Cr are
sited interstitially, horizontally halfway between alternate luminance samples. In MPEG-2, Cb and Cr are
co-sited horizontally.
Despite the wide set of digital image formats available, the webcam considered for this project only
supports the YCbCr 4:2:2 format. Note that in the specification of the JPEG image files used today by
most digital still cameras, only two of these patterns are allowed: 4:2:2 and 4:2:0.

Image Encoding
After acquiring the image data in YUV format, there is the need to compress it, mainly because it
is required to reduce the used memory space and the transmission bandwidth needed to stream it.
However, when transmitting compressed data, it must be ensured that both the sender and the receiver
understand the encoding scheme.
There are several different techniques with which image data can be compressed. These techniques
may be lossless or lossy. Lossless compression means that the decompressed image is exactly the

same as the original image, without data loss. In lossy compression, the decompressed image is as
close to the original image as possible, but not exactly the same.
For Internet use, as well as for other applications with limited channel capacity, the two most common
compressed image formats are the Joint Photographic Experts Group (JPEG) format (usually lossy) and
the Graphics Interchange Format (GIF) format (lossless). The lossy JPEG encoding will be used in this
project: despite the loss of data, JPEG encoding takes advantage of the frequency ranges where the
human eye cannot detect errors, which allows a substantial reduction of the image size without visible
degradation. This method will be further described [39].

JPEG Encoding The JPEG standard (ISO International Standard 10918) describes techniques for
coding and compressing still images or individual frames of video. An application can choose among four
coding modes, depending on its requirements, and the color components are processed separately [39].
The JPEG algorithm is composed of several different steps (see figure 3.8). Only the compression steps
will be described, since the decompression works in the opposite order [40].

Figure 3.8: Main steps of JPEG algorithm

Description of the JPEG algorithm represented in figure 3.8:

• The image data in YUV format is broken into 8 x 8 blocks of pixels.

• The Discrete Cosine Transform (DCT) applies a transformation from the spatial domain to the frequency
domain. Each 8 x 8 block is coded separately, by transforming it into an 8 x 8 frequency
space. As such, more bits are assigned to some frequencies than to others. This may be advan-
tageous, because the eye cannot easily see errors in the higher frequency range; as a conse-
quence, the higher frequencies can simply be omitted.

• Quantization reduces the amount of information of the higher frequency components, by dividing
each component in the frequency domain by a constant value defined for that component. De-
pending on the size of the quantization steps, more or less information is lost in this step. The user
can define the strength of the JPEG compression. The quantization is the step where this user
information has influence on the result.

• The final encoder processing step is the entropy coding. This step achieves additional compres-
sion, by encoding the quantized DCT coefficients more compactly based on their statistical char-
acteristics. Many of the higher frequency components are rounded to zero, and many of the rest

become small positive or negative numbers, which take many fewer bits to store. In particular,
JPEG takes advantage of the fact that most components are rounded to zero, by encoding the
components in a zig-zag pattern (see figure 3.9) and by employing a run-length encoding (RLE)
algorithm that groups similar frequencies together. Finally, it uses Huffman tables to encode the
result of the RLE.

Figure 3.9: DCT coefficients in a zig-zag pattern
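
To make the quantization step concrete, the following minimal C sketch divides each coefficient of one 8 x 8 DCT block by the corresponding entry of a quantization table and rounds the result. The table values themselves are not shown because they are not fixed by the standard: encoders such as libjpeg derive them from the user-selected quality factor, so this is only an illustration of the operation, not of the actual encoder code.

/* Minimal sketch of the JPEG quantization step for a single 8 x 8 block.
 * The quantization table passed in is assumed to come from the chosen
 * quality setting; larger steps at high frequencies round most of those
 * coefficients to zero, which the zig-zag scan and RLE then compress well. */
#include <math.h>

#define BLOCK 8

void quantize_block(const double dct[BLOCK][BLOCK],
                    const int quant[BLOCK][BLOCK],
                    int out[BLOCK][BLOCK])
{
    for (int i = 0; i < BLOCK; i++)
        for (int j = 0; j < BLOCK; j++)
            out[i][j] = (int)round(dct[i][j] / quant[i][j]);
}
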

3.2.2.2 GPS Modules

This subsection presents the GPS module, which is composed of four sub-modules. Figure 3.10
shows the GPS modules and the format of the georeference data exchanged between them. These
modules and the data formats are presented in the next paragraphs.

Figure 3.10: GPS modules

The first module is the GPS device (hardware) that receives the data from the GPS satellites, pre-
sented in section 3.2.1.2. The second module, the GPS daemon, is the daemon that interfaces with
the GPS hardware. The GPS daemon communicates with the GPS device through the NMEA 0183 protocol,
in order to grab the GPS information. After this, the daemon converts the data to the geographic co-
ordinate format degrees-minutes-seconds (DMS) and sends this information to the next module, GPS
acquisition. The GPS acquisition module is the developed software that communicates with the GPS dae-
mon, receives messages in the DMS format and processes them into the decimal degrees (DD) format
that will be used for geotagging. Finally, the geotag module tags the gathered sensor data (image and other
sensors) with the GPS data at the moment they are acquired. This module is also responsible for creating

an Exchangeable Image File Format (EXIF) tag, which will geotag the encoded image.
Examples of the two coordinate formats that will be used:

• DMS: N 38° 50′ 30.649″, 9° 18′ 46.702″ W.

• DD: Latitude: 38.841846903808985; Longitude: -9.312973022460938.
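
For illustration, a minimal C sketch of the DMS to DD conversion performed by the GPS acquisition module is given below; the function name and signature are assumptions for this sketch, not the actual implementation.

/* Minimal sketch of a DMS -> decimal degrees conversion. Names are illustrative. */
double dms_to_dd(int degrees, int minutes, double seconds, char hemisphere)
{
    double dd = degrees + minutes / 60.0 + seconds / 3600.0;
    /* South latitudes and west longitudes are negative in DD notation. */
    if (hemisphere == 'S' || hemisphere == 'W')
        dd = -dd;
    return dd;
}

/* Example: N 38 50' 30.649" -> 38.8418..., 9 18' 46.702" W -> -9.3129... ,
 * matching the DD values listed above. */
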

NMEA 0183 Protocol


The NMEA 0183 protocol was created by the National Marine Electronics Association and it is a com-
bined electrical and data specification for communication between marine electronic devices, such as GPS
receivers and many other types of instruments. The NMEA 0183 standard uses simple ASCII sentences at a
bit-rate of 4800 bps and a serial communications protocol that defines how data is transmitted from one
transmitter to multiple receivers2 .
An example of an NMEA 0183 message is the following:
$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47
This message has the following format ( see figure 3.11).

Figure 3.11: NMEA message format
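
For illustration only, the following C sketch shows how the latitude and longitude fields of the example sentence above (encoded as ddmm.mmmm plus a hemisphere letter) map to decimal degrees. In the implemented system this parsing is delegated to the GPSd daemon (section 4.1.5), so this is not part of the developed code.

/* Illustrative parsing of the latitude/longitude fields of the $GPGGA example.
 * Note that strtok() collapses empty fields, which is acceptable here because
 * only the first six fields are used. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const char *gga =
        "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47";
    char buf[128], *fields[16];
    int n = 0;

    strncpy(buf, gga, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';
    for (char *tok = strtok(buf, ","); tok && n < 16; tok = strtok(NULL, ","))
        fields[n++] = tok;

    double lat_raw = atof(fields[2]);      /* 4807.038  -> 48 deg 07.038 min */
    double lon_raw = atof(fields[4]);      /* 01131.000 -> 11 deg 31.000 min */
    int lat_deg = (int)(lat_raw / 100.0);
    int lon_deg = (int)(lon_raw / 100.0);
    double lat = lat_deg + (lat_raw - lat_deg * 100.0) / 60.0;
    double lon = lon_deg + (lon_raw - lon_deg * 100.0) / 60.0;
    if (fields[3][0] == 'S') lat = -lat;
    if (fields[5][0] == 'W') lon = -lon;

    printf("lat=%.6f lon=%.6f\n", lat, lon);   /* 48.117300 11.516667 */
    return 0;
}
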

EXIF
Most digital cameras produce EXIF files, which are JPEG files with extra tags that contain information
about the image.
EXIF is a standard format for storing interchange information in digital photography image files that use
JPEG compression. The EXIF tag structure is borrowed from TIFF, a computer file format
for storing raster graphics images. When EXIF is employed for JPEG files, the EXIF data is stored in
2 http://www.gpsinformation.org/dale/nmea.htm

one of the JPEG-defined utility Application Segments, the APP1 segment (marker 0xFFE1), see figure 3.12,
which in effect holds an entire TIFF file within it. Almost all new digital cameras use the EXIF annotation,
storing information about the image such as the date and time the image was taken, white balance, resolution
and GPS information. In this project, only the GPS information will be stored in the EXIF tag, in order to
easily see where the images were taken. EXIF can store many GPS attributes (see figure 4.5), but for the
purpose of this project only the georeference data (latitude/longitude/altitude) will be inserted
[41].

Figure 3.12: EXIF marker

3.2.2.3 System Control and Data Multiplexing

The system control and data multiplexing module is responsible for controlling and changing the parameters
of the other system modules. When the mobile station receives a new parameter from the base station,
this is the module that changes it. This module is also responsible for prioritizing the data that will be sent to
the base station. As a requirement of this project, the other sensors' data always has priority over image data.

3.2.2.4 Message Formatting and Protocol Implementation

In order to choose the protocol used in this project, the TCP and UDP protocols were analysed. Fast
delivery of messages and a reliable communication are the main requirements for this module.
TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) are transport protocols
that run over IP links, and they define two different ways to send data from one point to another over an
IP network path.
TCP is defined in a way that ensures that each packet of data gets to its recipient. If a network link
using TCP/IP notices that a packet has arrived out of sequence, the protocol stops the transmission,
discards anything from the out-of-sequence packet forward and starts the transmission again. TCP is
a reliable transport protocol, but this reliability has costs: it causes a long transmission delay and makes it
difficult to meet soft real-time demands. On the other hand, some advantages of UDP are: small code and
memory size, small overhead, easy implementation and fast processing of the packets, without requiring
acknowledgements. The UDP protocol has a high transmission rate, since there is no need to maintain a
connection state table (unlike the TCP protocol) and it has a smaller overhead. However, the UDP protocol
does not guarantee the delivery of all packets. Thus, it is faster than the TCP protocol [42].
In this project, a reliable version of the UDP protocol (RUDP) will be used to transmit the data from
the mobile station to the base station. RUDP is based on RFCs 1151 and 908 (Reliable Data Protocol).
RUDP is a simple packet-based transport protocol, layered on top of the UDP/IP protocols, that provides a
reliable connection. Reliable UDP transports can offer the benefit of high-capacity throughput and minimal

overhead, when compared to TCP. The following paragraphs describe the features supported by RUDP in
the system. Transmitter and receiver refer to the client or server that is sending or receiving a data segment,
respectively, on a connection. The client refers to the mobile station that initiates the connection and
the server refers to the base station that listens for a connection.

To achieve reliability over the UDP protocol, acknowledgements are used to know whether a message has
arrived at its destination. The sender keeps the message in its memory as long as it is waiting for the
acknowledgement. Once the message is acknowledged, it can be discarded. If the message is not acknowledged
within some period of time, it must be retransmitted (see figure 3.13).

Figure 3.13: Reliability over UDP

When the sending operation is performed, the protocol engine does the following. First, it assigns
an ID to the message and passes it to the sending module, which transmits the message to its destina-
tion. Then it sets the retransmission timeout of the message and pushes it into the queue of acknowledgement-
pending messages. Whenever the retransmission timeout of a message expires, the protocol engine re-
transmits the message (hands it over to the sender module) and sets a new timeout for it. If an
acknowledgement is received, the protocol engine is notified by the receiver module. It examines the acknowl-
edgement-pending messages and, if the acknowledgement ID matches one of them, it removes the message
from the queue and notifies the application about the successful delivery of the message.

The protocol receiver may get messages out of order because of retransmissions, so the receiver
puts the messages into a heap, ordered by message IDs. The engine knows which message is about
to come next, so it should not pass out-of-order messages to the application, but rather wait for the next
in-order message.
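
A minimal C sketch of the acknowledgement-pending bookkeeping described above is given below; the data structure, the timeout value and the function names are assumptions for illustration, not the actual implementation.

/* Minimal sketch of the acknowledgement-pending queue used for reliability
 * over UDP. Names, sizes and the timeout value are illustrative. */
#include <stdint.h>
#include <string.h>
#include <time.h>

#define MAX_PENDING    64
#define RETRANSMIT_SEC 1          /* assumed retransmission timeout */

struct pending_msg {
    uint32_t id;                  /* ID assigned by the protocol engine        */
    time_t   deadline;            /* when to retransmit if no ACK has arrived  */
    size_t   len;
    uint8_t  data[1500];
    int      in_use;
};

static struct pending_msg queue[MAX_PENDING];

/* Keep a copy of the message until it is acknowledged. */
void track_message(uint32_t id, const uint8_t *data, size_t len)
{
    for (int i = 0; i < MAX_PENDING; i++) {
        if (!queue[i].in_use) {
            queue[i].id = id;
            queue[i].deadline = time(NULL) + RETRANSMIT_SEC;
            queue[i].len = len < sizeof(queue[i].data) ? len : sizeof(queue[i].data);
            memcpy(queue[i].data, data, queue[i].len);
            queue[i].in_use = 1;
            return;
        }
    }
}

/* Called by the receiver module when an ACK with this ID arrives. */
void ack_received(uint32_t id)
{
    for (int i = 0; i < MAX_PENDING; i++)
        if (queue[i].in_use && queue[i].id == id)
            queue[i].in_use = 0;  /* delivery confirmed, drop the stored copy */
}

/* Periodically called to retransmit every message whose timeout expired. */
void check_timeouts(void (*send_fn)(const uint8_t *, size_t))
{
    time_t now = time(NULL);
    for (int i = 0; i < MAX_PENDING; i++) {
        if (queue[i].in_use && now >= queue[i].deadline) {
            send_fn(queue[i].data, queue[i].len);
            queue[i].deadline = now + RETRANSMIT_SEC;
        }
    }
}
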

3.3 Base Station

The base station consists of a PC (laptop) with an integrated WiFi interface (receiver). The base station
is responsible for receiving, processing and representing the data collected at the UAV (mobile station).
The base station will receive the sensor data, i.e. GPS coordinates, images and other sensor data, and
present it to the user. The base station will display a map widget, where the location of the UAV is
represented by drawing the current track and the images are represented by markers at the location
of their acquisition. The images acquired by the mobile station are also displayed as a stream of JPEG
images (similar to MJPEG), and the data from the sensors is displayed in charts, all in soft real time. This
architecture is composed of a group of modules responsible for post-processing and presenting the data
from the sensors (see figure 3.14).

Figure 3.14: Base station block diagram

The system is composed of the following elements:

• Message formatting and protocol implementation: this module receives data messages from
the mobile station.

• System control and data multiplexing: module that provides the data management support, by
receiving the data from the communication platform and dispatching it to the other modules respon-
sible for the coordinates, image and sensors processing. This module is also responsible
for managing the window events and the user input.

• Location track module that converts the coordinates into a track.

• Process image: this module formats the received image to be displayed on the screen and on the
map.

• Map: this module is responsible for loading the map widget where the location of the UAV will be
represented and tracked, and where the captured images will be inserted. The communication between the
modules and the map widget is made through JavaScript Object Notation (JSON) messages.

• Other sensors data: this module formats the sensors data to be displayed in charts.

• Screen: this is the interface module between the application and the user, where the data will be
displayed after being processed. It is the module responsible for showing the user the map with
the UAV track and the images at the locations of their acquisition. It is also responsible for displaying
the images as a stream of JPEG images and the sensor data in a chart format.

• User input: module that receives the user input and sends it to the system control module.

JSON (JavaScript Object Notation), originally derived from the JavaScript scripting language, is an
open standard, lightweight data-interchange format. The code for parsing and generating JSON data
is readily available in many programming languages. It is easy for machines to parse and generate
and easy for humans to read and write. It is used primarily to transmit data between a server and a web
application, as an alternative to XML.

An example of a JSON message is shown in figure 3.15.

Figure 3.15: Example of a JSON message
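
As an additional illustration, a hypothetical message of the kind exchanged between the modules could look as follows; the field names are illustrative assumptions and do not necessarily match the implemented message format (the latitude/longitude values reuse the DD example from section 3.2.2.2).

{
  "type": "gps",
  "latitude": 38.841846,
  "longitude": -9.312973,
  "altitude": 120.0,
  "timestamp": "2015-05-10T14:32:07Z"
}
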

3.3.0.5 Base Station Application

To develop the base station application, two options were considered: web application and stand-alone
application.

In a Web application, the user interface is rendered on a client machine, by using a web browser, and
the user interface capabilities on the client machine are limited to what the web browser supports. The
business logic and the data storage are not on the client machine and the communication between the
client and the server occurs using HTTP or HTTPS.

In a stand-alone application, there is a vast number of different architectures and some of them
can be quite similar to a web hosted application. The business logic layer and the data layer may reside on
the same machine or on a remote server.

For this project, it was decided to opt for a Web application, since it is more versatile. A Web
application is compatible with all operating systems, and it only requires a web browser to execute.

Figure 3.16 represents the interaction between the application modules.

Figure 3.16: Application modules

• Client: the browser where the information is displayed.

• Presentation Layer: the web server that will deliver the web content, serving as the data translator
for the network (Apache is an example of a web server).

• Business Logic layer: where the data is received and processed.

Chapter 4

System Integration and


Implementation

This chapter presents, in more technical detail, the technologies used and the implementation
of the architecture described in the previous chapter. The developed system is divided into two sub-
systems: the mobile station, located in the UAV, and the base station on the ground. The system developed
for the mobile station simultaneously acquires all the signals, encodes and compresses the data
and sends it to the other station through a communication link. On the other hand, the base
station receives, processes and displays all the data, and also receives parameters from
the user and sends them to the mobile station.
The developed modules in each station are:

• Mobile station: image acquisition and encoding module, GPS acquisition and geotag module,
other sensors acquisition module, controller and transmission module.

• Base station: receiver module, control module, image processing module, tracking module and map
module.

4.1 Mobile Station

The mobile station runs on the Raspberry Pi single board computer. The developed system is a multi-
threaded program written in C, which allows the acquisition of the different sensors in parallel. Figure
4.1 shows a simple overview of the conceived system. The detailed implementation of this system is
presented in the following sections.

• Image acquisition - Thread that grabs the image from the camera into a buffer, in YCbCr format.

• Image encoding - Thread that encodes the acquired image using the JPEG standard, and outputs
it into another buffer.

Figure 4.1: Mobile station threads

• GPS - Thread that acquires the location coordinates from the GPS device and processes them into the
DD format, to georeference the gathered sensor data.

• Sensors acquisition - Thread that grabs the other sensors' data and transfers it to a buffer.

• Transmission thread - Thread with the highest priority, which controls the priority of each packet (sen-
sors have priority over images) and sends it to the base station. It is also responsible for changing
the parameters received from the base station.

4.1.1 Operating System

The mobile station will run an operating system, which will manage the interaction with the single
board computer hardware. One of the advantages of using an operating system is that it greatly facili-
tates the access to the peripherals, along with other procedures that control the resources of the board.
The adopted operating system must provide the drivers for video and GPS acquisition and for
interfacing with the webcam.
The Raspberry Pi board supports Linux operating systems and, since the availability of drivers for
video and GPS acquisition is a concern of this project, Linux was the selected OS.
There are many Linux distributions supported by the Raspberry Pi Foundation. The two most suitable
operating systems for this project are Arch Linux ARM and Raspbian. These operating systems are
supported by the Raspberry Pi community and are the two with the best performance.
In particular, we have chosen Raspbian Wheezy1 , which was completed in June 2012. However,
1 http://www.raspbian.org/

Raspbian is still under active development and can be downloaded from the Raspberry Pi website.
Raspbian is a freely available version of Debian Linux which has been customized to run on the Pi. It
has been designed to make use of the Raspberry Pi floating-point hardware, thus enhancing
performance. It comes with over 35,000 pre-compiled software packages, bundled in a nice format for
easy installation and optimized for best performance on the Raspberry Pi. It is a fast, easy-to-use and
light Linux distribution that can be set up relatively quickly. Thanks to the support of the Raspberry
Pi Foundation, it gets a lot more development and attention from the community than any other Linux
distribution for the Raspberry Pi.

4.1.2 Image Acquisition

The developed system makes use of an internal kernel API designed for image acquisition: the
Video4Linux2 API. The aim is to implement an application that programs the capture interface in order
to acquire images from the webcam.
As referred before, the digital camera converts the signal directly to a digital output, without the
need of an external A/D converter. These devices acquire the image data in a raw pixel format, which
is obtained almost directly from the image sensor of the camera. As a consequence, the image data must
be encoded in order to be more suitable for transmission.
Video for Linux 2 (V4L2) is the second version of the Video for Linux API, a Linux kernel interface for
analog radio and video capture and output drivers. V4L is intended to provide a common programming
interface for the many computer TV and capture cards on the market, as well as parallel port and USB
video cameras. Radio, teletext decoders and vertical blanking data interfaces are also provided.
V4L2 was designed to support a wide variety of devices, only some of which are truly
”video” in nature2 :

• Video capture interface - grabs video data from a tuner or camera device. This interface will be
further emphasized.

• Video output interface - allows applications to drive peripherals which can provide video images
- perhaps in the form of a television signal - outside of the computer.

• Video overlay interface - is a variant of the capture interface, whose job is to facilitate the direct
display of video data from a capture device. Video data is transferred directly from the capture
device to the display.

• Vertical Blanking Interval (VBI) interfaces - provide access to data transmitted during the video
blanking interval.

• Radio interface - provides access to audio streams from Amplitude modulation (AM) and Fre-
quency modulation (FM) tuner devices.
2 https://lwn.net/Articles/203924/

Video devices differ from many others in the vast number of ways in which they can be configured. As
a result, much of a V4L2 driver implements code which enables applications to discover a given device’s
capabilities and to configure that device to operate in the desired manner. Accordingly, programming
a V4L2 device consists of the steps of figure 4.2. Some of these steps depend on the V4L2 device
type and most of them can be executed out of order. This section discusses some important
concepts and parameters needed to use the V4L2 device driver; the programming of these steps will be
further presented in section 4.1.4.

Figure 4.2: Main steps of programming a V4L2 device

4.1.2.1 V4L2: Opening and Closing Devices

To open and close V4L2 generic devices, the user application uses the open() and close() functions.
After the open function is called, the driver can access the device and retrieve its private data, which
contains information specific to that device. V4L2 devices can be opened more than once
simultaneously, either by the same application or by different applications.

Device Naming
V4L2 drivers are implemented as kernel modules, loaded manually by the system administrator or auto-
matically when a device is first discovered. The driver modules plug into the ”videodev” kernel module.

Conventionally, V4L2 video capture devices are accessed through character devices named /dev/video
and /dev/video0 to /dev/video63 (see table 4.1).

Device Name    Minor Range    Function
/dev/video     0-63           Video capture interface
/dev/radio     64-127         AM/FM radio devices
/dev/vtx       192-223        Teletext interface chips
/dev/vbi       224-239        Raw VBI data

Table 4.1: V4L2 device types

After the open, all the other system calls access the private data of the device via the struct file
pointer, and the return value is passed back to the application. When this is supported by the driver, users
can, for example, start a ”panel” application to change controls like brightness or audio volume, while another
application captures video and audio.

4.1.2.2 V4L2: Querying Capabilities

Because V4L2 covers a wide variety of devices, not all aspects of the API are equally applicable to
all types of devices. Furthermore, devices of the same type have different capabilities. The
VIDIOC_QUERYCAP ioctl is available to check whether the kernel device is compatible with this specification,
and to query the functions and I/O methods supported by the device. Other features can be queried
by calling the respective ioctl. Hence, all V4L2 drivers must support VIDIOC_QUERYCAP, and applications
should always call this ioctl after opening the device.
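
For illustration, a minimal C sketch of these first two steps (opening the device and querying its capabilities) is given below; error handling is reduced to the essential checks and the helper name is an assumption of this sketch.

/* Minimal sketch: open the capture device and query its capabilities. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/videodev2.h>

int open_and_query(const char *dev)          /* e.g. "/dev/video0" */
{
    int fd = open(dev, O_RDWR);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    struct v4l2_capability cap;
    memset(&cap, 0, sizeof(cap));
    if (ioctl(fd, VIDIOC_QUERYCAP, &cap) < 0) {
        perror("VIDIOC_QUERYCAP");
        close(fd);
        return -1;
    }

    /* The webcam must support video capture and streaming I/O (memory mapping). */
    if (!(cap.capabilities & V4L2_CAP_VIDEO_CAPTURE) ||
        !(cap.capabilities & V4L2_CAP_STREAMING)) {
        fprintf(stderr, "%s: required capabilities missing\n", dev);
        close(fd);
        return -1;
    }
    return fd;
}
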

4.1.2.3 V4L2: Data Format Negotiation

Different I/O devices exchange different kinds of data with applications (for example, video images, raw
or sliced VBI data, RDS datagrams). The data format negotiation allows the application to ask for a
particular data format, upon which the driver selects and reports the best the hardware can do to satisfy
such a request. A single mechanism exists to negotiate all data formats, using the aggregate struct
v4l2_format and the get (VIDIOC_G_FMT) and set (VIDIOC_S_FMT) ioctls, which must be sup-
ported by all drivers. When the application calls the VIDIOC_S_FMT ioctl with a pointer to a v4l2_format
structure, the driver checks and adjusts the parameters against the hardware abilities. If the application
omits the VIDIOC_S_FMT ioctl, its locking side effects are implied by the following steps, corresponding to
the selection of an I/O method with the VIDIOC_REQBUFS ioctl or implicitly with the first read() system
call.
To query the current image format definition, the application sets the type field of a struct v4l2_format
to V4L2_BUF_TYPE_VIDEO_CAPTURE and calls the VIDIOC_G_FMT ioctl with a pointer to this structure.
The driver will then fill the pix member of the fmt union enclosed within struct v4l2_format. In order to
request different parameters, the application sets the type field of struct v4l2_format and initializes the
respective member of the fmt union. An alternative and more expedite
solution consists in just modifying the results of VIDIOC_G_FMT and calling the VIDIOC_S_FMT ioctl with

40
a pointer to this structure. The driver will then adjust the specified parameters and return them, just as
VIDIOC_G_FMT does.
In order to exchange images between drivers and applications, it is necessary to have standard
image data formats which both sides interpret in the same way. V4L2 includes several formats; drivers
must provide a default and the selection persists across closing and reopening a device.
Nevertheless, applications should always negotiate a data format before engaging in data exchange.
The image format negotiation allows the application to ask for a particular data format, upon which the
driver selects and reports the best the hardware can do to satisfy such a request. In this particular project,
the format supported by the camera is YCbCr 4:2:2 (V4L2_PIX_FMT_YUYV), explained in
section 3.2.2.1.
The V4L2 standard formats are mainly uncompressed formats. The pixels are always arranged in
memory from left to right, and from top to bottom. The first byte of data in the image buffer is always
for the leftmost pixel of the topmost row. Following that is the pixel immediately to its right, and so on
until the end of the top row of pixels. Following the rightmost pixel of the row there may be zero or more
bytes of padding to guarantee that each row of pixel data has a certain alignment. Following the padding
bytes, (if any) is the data for the leftmost pixel of the second row from the top, and so on. The last row
has just as many pad bytes after it as the other rows.
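
As an illustration of this negotiation, a minimal C sketch of requesting the 640 x 480 YUYV format used in this project is given below; error handling is simplified and the helper name is an assumption of this sketch.

/* Minimal sketch of negotiating the 640 x 480 YUYV (YCbCr 4:2:2) capture
 * format. The driver may adjust the fields to what the hardware supports,
 * so the result is checked afterwards. */
#include <string.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

int set_capture_format(int fd)
{
    struct v4l2_format fmt;
    memset(&fmt, 0, sizeof(fmt));
    fmt.type                = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    fmt.fmt.pix.width       = 640;
    fmt.fmt.pix.height      = 480;
    fmt.fmt.pix.pixelformat = V4L2_PIX_FMT_YUYV;   /* YCbCr 4:2:2 */
    fmt.fmt.pix.field       = V4L2_FIELD_NONE;

    if (ioctl(fd, VIDIOC_S_FMT, &fmt) < 0)
        return -1;

    /* The driver reports the format it actually selected. */
    if (fmt.fmt.pix.pixelformat != V4L2_PIX_FMT_YUYV)
        return -1;
    return 0;
}
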

4.1.2.4 V4L2: Inputs/Outputs

The V4L2 API defines several different methods to read() from or write() to a device. All drivers ex-
changing data with applications must support at least one of them. The classic I/O method, based on
the read() and write() system calls, is automatically selected after opening a V4L2 device. When the
driver does not support this method, attempts to read() or write() will fail. As such, other methods must
be negotiated in order to select a streaming I/O method, either with memory-mapped or user buffers.
To do this negotiation, the application should call the VIDIOC_REQBUFS ioctl. The method supported by the
developed system is streaming I/O with memory mapping; the following paragraphs describe
the various I/O methods in more detail.

Read/Write
Input and output devices support the read() and write() functions, respectively, when the V4L2_CAP_
READWRITE flag in the capabilities field of struct v4l2_capability is returned by the VIDIOC_QUERYCAP
ioctl.
In this mode, the drivers may need the CPU to copy the data, but they may also support direct
memory access (DMA) to or from user memory, so this I/O method is not necessarily less efficient than
other methods that merely exchange buffer pointers. However, it is considered inferior because no meta-
information like frame counters or timestamps is passed. This information is necessary to recognize
frame dropping and to synchronize with other data streams. Even so, this is also the simplest I/O method,
requiring little or no setup to exchange data.

Streaming I/O (Memory Mapping)
Input and output devices support this I/O method when the V4L2_CAP_STREAMING flag in the ca-
pabilities field of struct v4l2_capability is returned by the VIDIOC_QUERYCAP ioctl. In order to determine
whether memory mapping is supported, the application must call the VIDIOC_REQBUFS ioctl.
This is a streaming I/O method where only pointers to buffers are exchanged between the application
and the driver: the data itself is not copied. Memory mapping is primarily intended to map buffers in device
memory into the application’s address space. Device memory can be, for example, the video memory
on a graphics card within a video capture device.
To allocate device buffers, applications should call the VIDIOC_REQBUFS ioctl with the desired num-
ber of buffers and the corresponding buffer type (for example, V4L2_BUF_TYPE_VIDEO_CAPTURE). This
ioctl can also be used to change the number of buffers or to free the allocated memory, provided that
none of the buffers is still mapped.
Before applications can access the buffers, they must map them into their address space with the
mmap() function. The location of the buffers in device memory can be determined with the VID-
IOC_QUERYBUF ioctl. These buffers are allocated in physical memory, as opposed to virtual memory,
which can be swapped out to disk.
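
A minimal C sketch of this buffer setup is given below; the number of buffers requested is an assumption and error handling is reduced to the essentials.

/* Minimal sketch of requesting memory-mapped buffers and mapping them into
 * the application's address space. The buffer count is illustrative. */
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/videodev2.h>

#define N_BUFFERS 3

struct mapped_buf { void *start; size_t length; };
static struct mapped_buf buffers[N_BUFFERS];

int setup_mmap_buffers(int fd)
{
    struct v4l2_requestbuffers req;
    memset(&req, 0, sizeof(req));
    req.count  = N_BUFFERS;
    req.type   = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    req.memory = V4L2_MEMORY_MMAP;
    if (ioctl(fd, VIDIOC_REQBUFS, &req) < 0)
        return -1;

    for (unsigned i = 0; i < req.count; i++) {
        struct v4l2_buffer buf;
        memset(&buf, 0, sizeof(buf));
        buf.type   = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        buf.memory = V4L2_MEMORY_MMAP;
        buf.index  = i;
        if (ioctl(fd, VIDIOC_QUERYBUF, &buf) < 0)     /* where is this buffer? */
            return -1;

        buffers[i].length = buf.length;
        buffers[i].start  = mmap(NULL, buf.length, PROT_READ | PROT_WRITE,
                                 MAP_SHARED, fd, buf.m.offset);
        if (buffers[i].start == MAP_FAILED)
            return -1;
    }
    return (int)req.count;
}
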

Streaming I/O (User Pointers)

Input and output devices support this I/O method when the V4L2_CAP_STREAMING flag in the capa-
bilities field of struct v4l2_capability returned by the VIDIOC_QUERYCAP ioctl is set. In particular, to
determine whether the user pointer method is supported, applications should call the VIDIOC_REQBUFS ioctl.
This I/O method combines advantages of both the read/write and the memory mapping methods.
Buffers are allocated by the application itself, and can reside in virtual or shared memory. Only pointers
to data are exchanged. These pointers and the meta-information are passed in a struct v4l2_buffer. The driver
must be switched into user pointer I/O mode by calling VIDIOC_REQBUFS with the desired buffer
type. No buffers are allocated beforehand; consequently, they are not indexed and cannot be queried
like mapped buffers with the VIDIOC_QUERYBUF ioctl.
Buffer addresses and sizes are passed on the fly with the VIDIOC_QBUF ioctl. Since buffers are
commonly cycled, applications can pass different addresses and sizes at each VIDIOC_QBUF call. If
required by the hardware, the driver swaps memory pages within physical memory to create a contiguous
area of memory. This happens transparently to the application in the virtual memory subsystem of the
kernel. When buffer pages have been swapped out to disk, they are brought back and finally locked
in physical memory for DMA. Filled or displayed buffers are dequeued with the VIDIOC_DQBUF ioctl.
The driver can unlock the memory pages at any time between the completion of the DMA and this ioctl.
The memory is also unlocked when VIDIOC_STREAMOFF or VIDIOC_REQBUFS is called, or when the
device is closed. When VIDIOC_STREAMOFF is called, it removes all buffers from both queues and
unlocks all buffers as a side effect.
Drivers implementing user pointer I/O must support the VIDIOC_REQBUFS, VIDIOC_QBUF, VID-
IOC_DQBUF, VIDIOC_STREAMON and VIDIOC_STREAMOFF ioctls, and the select() and poll() functions.

4.1.2.5 V4L2: Buffers

When streaming I/O is active, frames are passed between the application and the driver using the struct
v4l2_buffer. Only pointers to buffers are exchanged: the data itself is not copied. These pointers, together
with meta-information (like timestamps or field parity), are stored in a struct v4l2_buffer, the argument to the
VIDIOC_QUERYBUF, VIDIOC_QBUF and VIDIOC_DQBUF ioctls. There are three fundamental states
that a buffer can be in: incoming queue, outgoing queue and user space (see figure 4.3). The streaming
drivers maintain two buffer queues, an incoming and an outgoing queue. The queues are organized as
FIFOs: buffers are output in the order they were enqueued in the incoming FIFO, and are captured in the
order they are dequeued from the outgoing FIFO.

Figure 4.3: v4l2 buffer state

• Incoming queue: for a video capture device, buffers in the incoming queue are empty, waiting
for the driver to fill them with video data. For an output device, these buffers have frame data
to be sent to the device.

• Outgoing queue: buffers that have been processed by the driver and are waiting for the application
to claim them. For capture devices, outgoing buffers have new frame data; for output
devices, these buffers are empty.

• Neither queue (user space): in this state, the buffer is owned by user space and will not normally
be touched by the driver. This is the only time the application should do anything with the
buffer. This state is called ”user space”.

To enqueue and dequeue a buffer, applications use the VIDIOC_QBUF and VIDIOC_DQBUF ioctls.
The status of a buffer being mapped (enqueued, full or empty) can be determined at any time using the
VIDIOC_QUERYBUF ioctl. By default, VIDIOC_DQBUF blocks when no buffer is in the outgoing queue.
To start and stop capturing or output, applications should call the VIDIOC_STREAMON and
VIDIOC_STREAMOFF ioctls. VIDIOC_STREAMOFF removes all buffers from both queues as a side
effect. As such, if an application needs to synchronize with another event, it should examine the struct
v4l2_buffer timestamp of the captured buffers before enqueuing buffers for output.
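
A minimal C sketch of the resulting capture loop is given below; the process_frame() callback stands in for the JPEG encoding stage and, like the helper name, is an assumption of this sketch.

/* Minimal sketch of the streaming capture loop: enqueue the buffers, start
 * streaming, then repeatedly dequeue a filled buffer, hand it to the consumer
 * and re-enqueue it. buf_starts[] holds the mmap()ed buffer addresses. */
#include <string.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

extern void process_frame(const void *data, size_t length);   /* assumed */

int capture_loop(int fd, int n_buffers, void *const buf_starts[], int n_frames)
{
    enum v4l2_buf_type type = V4L2_BUF_TYPE_VIDEO_CAPTURE;

    /* All buffers start in the incoming (empty) queue. */
    for (int i = 0; i < n_buffers; i++) {
        struct v4l2_buffer buf;
        memset(&buf, 0, sizeof(buf));
        buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        buf.memory = V4L2_MEMORY_MMAP;
        buf.index = i;
        if (ioctl(fd, VIDIOC_QBUF, &buf) < 0) return -1;
    }
    if (ioctl(fd, VIDIOC_STREAMON, &type) < 0) return -1;

    for (int n = 0; n < n_frames; n++) {
        struct v4l2_buffer buf;
        memset(&buf, 0, sizeof(buf));
        buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        buf.memory = V4L2_MEMORY_MMAP;
        if (ioctl(fd, VIDIOC_DQBUF, &buf) < 0) return -1;  /* blocks for a frame */

        process_frame(buf_starts[buf.index], buf.bytesused);  /* e.g. encoding */

        if (ioctl(fd, VIDIOC_QBUF, &buf) < 0) return -1;   /* give the buffer back */
    }
    return ioctl(fd, VIDIOC_STREAMOFF, &type);
}
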

Webcam capabilities
The Logitech webcam used in this project only supports the YCbCr 4:2:2 format, so the image is

acquired in that format at a resolution of 640 x 480 pixels. From the set of resolutions that this camera
supports, this one was chosen because it offers the best compromise between quality,
processing power (encoding) and transmission bandwidth.

4.1.3 Image Encode

After acquiring the image data in YCbCr 4:2:2 format (640 x 480 resolution), there is the need to com-
press it, mainly because it is required to reduce the memory space and the transmission bandwidth
needed to stream it. For such purpose, it was decided to use the JPEG standard, mostly due to the wide
availability of visualization tools at the receiver side. Furthermore, the JPEG format also allows geotagging
the image by using an EXIF tag.
In a first approach of this project, the image was not encoded directly from YCbCr 4:2:2 to JPEG.
First, it was converted into several intermediate formats: YCbCr 4:4:4, RGB, PPM and finally JPEG.
For the final encoding, three different encoders were considered: cjpeg, ppm2jpeg and ImageMagick3 .
This test was performed on a personal computer with the main objective of finding which of these encoders
is faster. As such, a shell script test was run, which encoded a hundred images from PPM to
JPEG (see table 4.2). The one that showed the best performance was the cjpeg encoder.

Encoder        Lenovo T430
cjpeg          31 ms
ppm2jpeg       33 ms
ImageMagick    64 ms

Table 4.2: Average time of encoding a hundred PPM images to JPEG on a Lenovo T430 computer

However, when the first tests were performed on the Raspberry Pi, cjpeg did not show enough
performance for the desired system. There was the need to improve it, and the solution was to use
libjpeg4 , which encodes directly from YCbCr to JPEG. Some tests were performed in order
to compare the performance improvement (see table 4.3). In this test, it is important to note that, while the
cjpeg test needs to acquire an image in YCbCr 4:2:2, convert it to 4:4:4, then to RGB, then to PPM
and finally encode it to JPEG, the libjpeg test only needs to acquire the image in YCbCr and encode it to
JPEG.

Encoder    Raspberry Pi    Lenovo T430
cjpeg      4182 ms         129 ms
libjpeg    302 ms          41 ms

Table 4.3: Average time to acquire and encode a hundred images to JPEG

By taking these measurements into account, it was decided to use libjpeg to implement the JPEG encod-
ing.
Libjpeg is a widely used free library that provides C code to read and write JPEG compressed image
files. The surrounding application program receives or supplies image data a scanline at a time, using
3 http://www.imagemagick.org/
4 http://www.ijg.org/

a straightforward uncompressed image format. All details of color conversion and other preprocess-
ing/postprocessing can be handled by the library.
The library includes a substantial amount of code that is not covered by the JPEG standard but
is necessary for typical applications of JPEG. These functions preprocess the image before JPEG
compression or post process it after decompression. They include colorspace conversion, downsam-
pling/upsampling, and color quantization. The application indirectly selects this code by specifying the
format in which it wishes to supply or receive image data.
In this library, it is possible to adjust the quality of the JPEG encoding but it does not allow lossless
JPEG.
Before diving into procedural details, it is helpful to understand the image data format that the JPEG
library expects or returns. The standard input image format is a rectangular array of pixels, for which it must
be specified how many components there are and what the colorspace interpretation of the components is.
Most applications will use RGB data (three components per pixel), grayscale data (one component
per pixel) or YCbCr data. Pixels are stored by scanlines, with each scanline running from left to right. The
component values for each pixel are adjacent in the row. The array of pixels is formed by making a list
of pointers to the starts of scanlines, so the scanlines need not be physically adjacent in memory. The
library accepts or supplies one or more complete scanlines per call. It is not possible to process part of
a row at a time and the scanlines are always processed top-to-bottom.
A JPEG compression object holds the parameters and working state for the JPEG library, which makes
the creation/destruction of the object separate from starting or finishing the compression of an image. The
same object can be re-used for a series of image compression operations, which makes it easy to re-use
the same parameter settings for a sequence of images.
The outline of a JPEG compression operation is:

1. Allocate and initialize a JPEG compression object: a JPEG compression object is parametrized
by a ”struct jpeg_compress_struct”, together with a bunch of subsidiary structures which are allocated
via malloc().

2. Specify the destination for the compressed data: the JPEG library delivers compressed data
to a ”data destination” module (e.g. a file).

3. Set parameters for compression, including image size and colorspace: information about the
source image must be supplied by setting the following fields in the JPEG object (cinfo struc-
ture):

• image_width: width of the image, in pixels.

• image_height: height of the image, in pixels.

• input_components: number of color channels (samples per pixel).

• in_color_space: color space of the source image (e.g. RGB, grayscale or YCbCr).

4. JPEG start compress: after establishing the data destination and setting all the necessary source
image information and other parameters, call jpeg_start_compress() to begin a compression cycle. This

will initialize the internal state, allocate working storage, and emit the first few bytes of the JPEG data
stream header.

5. Loop writing JPEG scanlines: write all the required image data by calling the
function jpeg_write_scanlines() one or more times. It is possible to pass one or more scanlines in
each call, up to the total image height. In most applications it is convenient to pass just one or a
few scanlines at a time.

6. JPEG finish compress: after the image data is written, call jpeg_finish_compress() to complete
the compression cycle. This step is essential to ensure that the last buffer load of data is written
to the data destination. jpeg_finish_compress() also releases the working memory associated with the
JPEG object.

7. Release the JPEG compression object: when the JPEG compression object is no longer needed, it can
be destroyed by calling jpeg_destroy_compress(). This will free all subsidiary memory (regardless of
the previous state of the object).
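
A minimal C sketch of this compression cycle is given below. It assumes the YUYV 4:2:2 samples have already been unpacked into one (Y, Cb, Cr) triple per pixel and writes the result to a file; the constants and the function name are illustrative rather than the actual implementation.

/* Minimal sketch of the libjpeg compression cycle outlined above. */
#include <stdio.h>
#include <jpeglib.h>

#define WIDTH  640
#define HEIGHT 480

void encode_jpeg(const unsigned char *ycbcr, const char *path, int quality)
{
    struct jpeg_compress_struct cinfo;
    struct jpeg_error_mgr jerr;
    FILE *out = fopen(path, "wb");
    if (!out) return;

    /* 1-2: create the compression object and set the data destination. */
    cinfo.err = jpeg_std_error(&jerr);
    jpeg_create_compress(&cinfo);
    jpeg_stdio_dest(&cinfo, out);

    /* 3: describe the source image. */
    cinfo.image_width      = WIDTH;
    cinfo.image_height     = HEIGHT;
    cinfo.input_components = 3;
    cinfo.in_color_space   = JCS_YCbCr;
    jpeg_set_defaults(&cinfo);
    jpeg_set_quality(&cinfo, quality, TRUE);

    /* 4-5: start the cycle and feed one scanline per call. */
    jpeg_start_compress(&cinfo, TRUE);
    while (cinfo.next_scanline < cinfo.image_height) {
        JSAMPROW row = (JSAMPROW)&ycbcr[cinfo.next_scanline * WIDTH * 3];
        jpeg_write_scanlines(&cinfo, &row, 1);
    }

    /* 6-7: flush the last data and release the compression object. */
    jpeg_finish_compress(&cinfo);
    jpeg_destroy_compress(&cinfo);
    fclose(out);
}
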

Colour Space
The JPEG standard itself is ”color blind” and does not specify any particular color space. It is cus-
tomary to convert color data to a luminance/chrominance color space before compressing, since this
permits greater compression. The existing JPEG file interchange format standards specify YCbCr or
GRAYSCALE data (JFIF version 1); GRAYSCALE, RGB, YCbCr, CMYK, or YCCK (Adobe); or BG_RGB
and BG_YCC (big gamut color spaces, JFIF version 2).

4.1.4 Implementation of the Image Acquisition and Encode

The implementation of the image acquisition and encoding involves several tasks that were described in
the previous sections. The acquisition process and the encoding process each run in a different
thread. Therefore, both processes run in parallel and share the image buffers. The
acquisition module captures the image into a buffer, which will be used by the encoder module. Subse-
quently, the encoder module compresses the YCbCr image data from the acquisition buffer and writes it
to another output buffer.
In order to acquire and encode/compress the images, the following tasks must be accomplished (see
the buffering in figure 4.4):

1. Open the device /dev/video0 with the open() system call

2. Initialize the image capture device

(a) Query the device capabilities with the VIDIOC_QUERYCAP ioctl

(b) Negotiate the image format with the VIDIOC_S_FMT ioctl (width, height, pixel format, field)

(c) Request the driver to accommodate the required frames in memory mapping mode, with the
VIDIOC_REQBUFS ioctl

(d) Set up the buffers' characteristics, with the VIDIOC_QUERYBUF ioctl

3. Initialize a JPEG compression object and the encoding buffers.

4. Run the acquisition and encoding loops in parallel

(a) Image acquisition

i. Request the driver to enqueue the buffers with the VIDIOC_QBUF ioctl

ii. Start the acquisition with the VIDIOC_STREAMON ioctl

iii. Dequeue the acquired frame with the VIDIOC_DQBUF ioctl and unlock the encoder semaphore

iv. After the image is processed, call the VIDIOC_QBUF ioctl to re-enqueue the buffer.

(b) Image encoding

i. Specify the destination buffer and the image parameters (width, height, number of com-
ponents, color space, JPEG quality)

ii. Start the JPEG encoding with jpeg_start_compress()

iii. Loop for writing the JPEG scanlines

iv. Write the JPEG scanlines with jpeg_write_scanlines()

v. Finish the compression with jpeg_finish_compress()

vi. Change buffer and lock the encoder semaphore

5. The transmission thread sends the encoded/compressed image to the base station

In order to take advantage of all the resources of the system, a producer-consumer architecture with
three buffers was developed; a minimal sketch of this exchange is shown after figure 4.4. With this
architecture it is possible to simultaneously acquire one image and encode another, and the producer does
not need to wait until the consumer finishes. If the producer is much faster than the consumer, some
frames can be discarded. Using figure 4.4 as an example: the acquisition thread may be acquiring one
image into buffer A1 and, when the image is acquired, the thread starts acquiring into buffer A2. Unless
the encoding thread releases the A3 buffer, the acquisition thread keeps alternating between A1 and A2.
When the encoder finishes the processing of buffer A3, it starts the processing/encoding of the last acquired
image. The buffer pointer changes are controlled with locks to avoid synchronization problems.

Figure 4.4: Image buffers
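
The following minimal C sketch illustrates the three-buffer exchange described above, using a mutex and a condition variable; the names and sizes are illustrative assumptions, and the real system additionally uses the encoder semaphore and carries the geotag metadata with each frame.

/* Minimal sketch of a triple-buffered producer/consumer exchange. The
 * acquisition thread always has a free buffer to write into; older unread
 * frames are silently dropped when the encoder is slower. */
#include <pthread.h>

#define FRAME_SIZE (640 * 480 * 2)          /* one YUYV 4:2:2 frame */

static unsigned char frames[3][FRAME_SIZE];
static int write_idx = 0;                   /* being filled by the producer   */
static int ready_idx = 1;                   /* most recently completed frame  */
static int read_idx  = 2;                   /* owned by the encoder           */
static int frame_is_new = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  frame_ready = PTHREAD_COND_INITIALIZER;

/* Producer only: buffer to capture the next image into. */
unsigned char *current_write_buffer(void) { return frames[write_idx]; }

/* Producer: publish the frame just captured and take the spare buffer. */
void publish_frame(void)
{
    pthread_mutex_lock(&lock);
    int tmp = ready_idx;
    ready_idx = write_idx;                  /* hand over the finished buffer  */
    write_idx = tmp;                        /* keep writing into the old spare */
    frame_is_new = 1;
    pthread_cond_signal(&frame_ready);
    pthread_mutex_unlock(&lock);
}

/* Consumer (encoder): block until a new frame exists, then take it. */
unsigned char *acquire_latest_frame(void)
{
    pthread_mutex_lock(&lock);
    while (!frame_is_new)
        pthread_cond_wait(&frame_ready, &lock);
    int tmp = read_idx;
    read_idx = ready_idx;                   /* take the newest frame          */
    ready_idx = tmp;                        /* return the previous read buffer */
    frame_is_new = 0;
    pthread_mutex_unlock(&lock);
    return frames[read_idx];
}
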

4.1.5 GPS Acquisition

The GPSd is a Linux daemon that communicates with the attached GPS device through NMEA mes-
sages, over RS232, USB, Bluetooth, TCP/IP or UDP links. Reports are normally shipped to TCP/IP port
2947, but they can also go out via a shared-memory or D-BUS interface. The GPSd responds to queries
with a format that is substantially easier to parse than the NMEA 0183, emitted by most GPS receivers.
The GPSd distribution includes a linkable C service library, a C++ wrapper class, and a Python module
that developers of GPSd-aware applications can use to encapsulate all communication with GPSd.
In accordance, this daemon was chosen because it transforms the NMEA 0183 interface into a C data structure, accessible through a socket. In fact, thanks to the client libraries that are provided by GPSd, the client applications do not even need to know about the protocol format. Instead, getting sensor information becomes a simple function call. Since the mobile station is implemented in C code running on a Linux system, the use of this daemon makes the integration of the GPS module with the system easier and more reliable.
In this project, in order to acquire the GPS information, the libgps library is used, which is the interface that goes through the GPSd to communicate with the GPS device. This interface returns the GPS data in a struct gps_data_t with the gathered coordinates, time, speed and other system parameters.
The main functions of this library are the following (a minimal usage sketch is presented after the list):

• gps_open(): initializes a GPS data structure to hold the data collected by the GPS device, and returns a socket attached to the GPSd daemon.

• gps_stream(): registers with the GPSd daemon to stream its reports, making them available when polling for updates from the device.

• gps_waiting(): checks whether new data is available from the daemon.

• gps_read(): accepts a response, or sequence of responses, from the daemon and interprets it.

• gps_close(): ends the session.
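
The following sketch shows how these functions fit together in a minimal polling loop. Error handling is reduced to the bare minimum, "2947" is the default GPSd port, and the one-argument gps_read() signature of the libgps version available at the time is assumed.

/* Minimal libgps polling loop (sketch). */
#include <gps.h>
#include <math.h>
#include <stdio.h>

int main(void)
{
    struct gps_data_t gps;

    if (gps_open("localhost", "2947", &gps) != 0)            /* attach to GPSd       */
        return 1;
    gps_stream(&gps, WATCH_ENABLE | WATCH_JSON, NULL);       /* register for reports */

    for (;;) {
        if (!gps_waiting(&gps, 5000000))                      /* new data? (5 s)      */
            continue;
        if (gps_read(&gps) == -1)                             /* read and interpret   */
            break;
        if ((gps.set & LATLON_SET) && !isnan(gps.fix.latitude))
            printf("lat=%f lon=%f\n", gps.fix.latitude, gps.fix.longitude);
    }

    gps_stream(&gps, WATCH_DISABLE, NULL);
    gps_close(&gps);                                          /* end the session      */
    return 0;
}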

4.1.6 Image Geotagging

In order to geotag the encoded JPEG images, an EXIF tag is created with the coordinates of the acquisition moment. For such purpose, the libexif library is used. This library was written in C for parsing, editing, and saving EXIF data. The EXIF metadata can store lots of information (e.g. date, white balance, resolution, etc.); however, in this particular project we are only interested in writing the EXIF information used to georeference the images, which uses a specific sub-Image File Directory (IFD). libexif uses the standard EXIF tags to insert the data in the correct position.
EXIF can store a lot of GPS information, but the implemented solution only regards the georeference data. The attribute information (field names and codes) recorded in the GPS Info IFD is presented in figure 4.5, followed by an explanation of the contents of the GPS attributes used in this project (latitude/longitude).

Figure 4.5: EXIF attribute information

• TAG ID 1 : GPSLatitudeRef, indicates whether the latitude is north or south latitude. The ASCII
value ’N’ indicates north latitude, and ’S’ is south latitude.

• TAG ID 2 : GPSLatitude, indicates the latitude. The latitude is expressed as three RATIONAL
values giving the degrees, minutes, and seconds, respectively.

• TAG ID 3 : GPSLongitudeRef, indicates whether the longitude is east or west longitude. ASCII ’E’
indicates east longitude, and ’W’ is west longitude.

• TAG ID 4 : GPSLongitude, indicates the longitude. The longitude is expressed as three RATIONAL
values giving the degrees, minutes, and seconds, respectively.

In order to use libexif to insert the georeference information, the following functions must be invoked (a sketch of their use for the latitude is presented after the list):

• exif_data_fix(): creates the mandatory EXIF fields with default data.

• init_tag(): gets an existing tag or creates one if it does not exist.

• create_tag(): creates a brand-new tag with a data field of the given length, in the given IFD. This is needed when init_tag() is not able to create this type of tag itself, or when the default data length it creates is not the correct length.

• exif_set_rational(): function that inserts a rational value in the tag data field.
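
As an illustration, the sketch below shows how the latitude could be written into the GPS IFD with these functions. The create_tag() helper is the one described above (its implementation is not repeated here), the conversion of a decimal latitude into three RATIONAL values follows the degrees/minutes/seconds convention of tags 1 and 2, and the 1/100 s precision for the seconds is an assumption of the example.

/* Sketch: writing GPSLatitudeRef and GPSLatitude with libexif (assumed helper). */
#include <libexif/exif-data.h>
#include <math.h>
#include <string.h>

/* helper described above: creates a tag with a data field of the given length */
ExifEntry *create_tag(ExifData *exif, ExifIfd ifd, ExifTag tag, size_t len);

void set_latitude(ExifData *exif, double lat)
{
    const size_t rsize = exif_format_get_size(EXIF_FORMAT_RATIONAL);
    ExifByteOrder order = exif_data_get_byte_order(exif);

    /* TAG ID 1: GPSLatitudeRef, 'N' or 'S' */
    ExifEntry *ref = create_tag(exif, EXIF_IFD_GPS, EXIF_TAG_GPS_LATITUDE_REF, 2);
    ref->format = EXIF_FORMAT_ASCII;
    ref->components = 2;
    memcpy(ref->data, lat >= 0.0 ? "N" : "S", 2);

    /* TAG ID 2: GPSLatitude, three RATIONALs (degrees, minutes, seconds) */
    ExifEntry *e = create_tag(exif, EXIF_IFD_GPS, EXIF_TAG_GPS_LATITUDE, 3 * rsize);
    e->format = EXIF_FORMAT_RATIONAL;
    e->components = 3;

    double a = fabs(lat);
    ExifRational deg = { (ExifLong)a, 1 };
    ExifRational min = { (ExifLong)((a - deg.numerator) * 60.0), 1 };
    ExifRational sec = { (ExifLong)((((a - deg.numerator) * 60.0)
                                     - min.numerator) * 60.0 * 100.0), 100 };

    exif_set_rational(e->data,             order, deg);
    exif_set_rational(e->data + rsize,     order, min);
    exif_set_rational(e->data + 2 * rsize, order, sec);
}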

4.1.7 Threads Management

In this section, the scheduling policies of the Raspbian Linux distribution with relevance for this project are discussed. The process scheduler is the component of the kernel that selects which process to run next. The Linux scheduler offers three different scheduling policies: one for normal processes and two for real-time applications. A static priority value (sched_priority) is assigned to each process and this value can be changed only via system calls. Conceptually, the scheduler maintains a list of runnable processes for each possible sched_priority value, and sched_priority can have a value in the range 0 to 99. In order to determine the process that runs next, the Linux scheduler looks for the non-empty list with the highest static priority and takes the process at the head of this list. The scheduling policy determines, for each process, where it will be inserted into the list of processes with equal static priority and how it will move inside this list.
The Linux scheduling policies are:

• SCHED_OTHER: the standard Linux time-sharing scheduler, intended for all processes that do not require the special static priority real-time mechanisms. The process to run is chosen from the static priority 0 list, based on a dynamic priority that is determined only inside this list.

• SCHED_FIFO: can only be used with static priorities higher than 0, which means that when a SCHED_FIFO process becomes runnable, it will always immediately preempt any currently running normal SCHED_OTHER process. SCHED_FIFO is a simple scheduling algorithm without time slicing. For processes scheduled under the SCHED_FIFO policy, the following rules are applied: a SCHED_FIFO process that has been preempted by another process of higher priority will stay at the head of the list for its priority and will resume execution as soon as all processes of higher priority are blocked again; when a SCHED_FIFO process becomes runnable, it will be inserted at the end of the list for its priority.

• SCHED_RR: everything described above for SCHED_FIFO also applies to SCHED_RR, except that each process is only allowed to run for a maximum time quantum. If a SCHED_RR process has been running for a time period equal to or longer than the time quantum, it will be put at the end of the list for its priority. A SCHED_RR process that has been preempted by a higher priority process and subsequently resumes execution as a running process will complete the unexpired portion of its round-robin time quantum.

Only processes with superuser privileges can get a static priority higher than 0 and can therefore be scheduled under SCHED_FIFO or SCHED_RR.
In a first approach, the strict scheduling of the threads was not the main priority of this project and, consequently, when the system was executed on the Raspberry Pi, the performance was not very good, with only 2 fps (max) (see figure 4.6) and a considerable latency of about 2 seconds at the moment of visualization. As a consequence of such low performance, thread scheduling became one of the main concerns in the development of this project. The modules that required more attention in that matter were the image acquisition and encoding modules, because of their greater need for computational power, and the transmission module, which is the one responsible for delivering the acquired data as soon as possible to the user.

Figure 4.6: Threads execution pattern without scheduling adoption

In order to improve the performance, the default scheduling policy (SCHED_OTHER) of the transmission thread was changed to SCHED_RR, so that the acquired data is transmitted to the base station as soon as possible. This need was identified after analysing the graph of figure 4.6, where it can be seen that when an image finished being encoded it was not immediately sent.
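
As a sketch of the mechanism (not the literal project code), the policy of an already running thread can be switched to SCHED_RR with pthread_setschedparam(); the priority value of 50 is an illustrative assumption and superuser privileges are required.

/* Sketch: switching a thread (e.g. the transmission thread) to SCHED_RR. */
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

void make_round_robin(pthread_t thread)
{
    struct sched_param param;
    memset(&param, 0, sizeof(param));
    param.sched_priority = 50;                 /* static priority in the 1..99 range */

    int err = pthread_setschedparam(thread, SCHED_RR, &param);
    if (err != 0)                              /* typically EPERM without privileges */
        fprintf(stderr, "SCHED_RR not applied: %s\n", strerror(err));
}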

Figure 4.7: Synchronization of transmission thread

To synchronize this high-priority thread with the others, a semaphore is used, which keeps the thread waiting until any gathered sensor data is ready to be sent (see figure 4.7). Accordingly, this thread always waits until an image or sensor data is ready to be sent. At that moment, the thread passes the semaphore and uses the processor until the transmission finishes. The encoding thread also has a semaphore that suspends its activity until there is a new image to be encoded. However, since acquiring an image is faster than encoding it, this semaphore is normally always unlocked. Nevertheless, should the system be deployed on different hardware, this measure can be relevant (for example, hardware with a dedicated encoding module). After implementing the synchronization and scheduling schemes, the final system improved its performance considerably, allowing up to 4 fps (see figure 4.8) and around 0.7 s of latency.

Figure 4.8: Thread execution pattern with scheduling adoption

4.2 Communication

To develop the communication protocol between the mobile and base stations, the ENet library was
used. ENet is a C library with the purpose of providing a simple and robust network communication
layer on top of UDP (User Datagram Protocol). The primary feature it provides is the reliable, in-order delivery of packets. The operation of Reliable UDP was presented in section 3.2.2.4.
ENet provides sequencing for all packets, by assigning to each sent packet a sequence number
that is incremented as the packets are sent. ENet guarantees that no packet with a higher sequence
number will be delivered before a packet with a lower sequence number, thus ensuring that packets
are delivered exactly in the order they are sent. For unreliable packets, ENet will simply discard the
lower sequence number packet if a packet with a higher sequence number has already been delivered.
This allows such packets to be dispatched immediately as they arrive, reducing the latency of unreliable packets to an absolute minimum. For reliable packets, if a higher sequence number packet arrives but the preceding packets in the sequence have not yet arrived, ENet will stall the delivery of the higher sequence number packet until its predecessors have arrived. ENet uses a checksum in order to ensure that packets do not have errors. ENet provides optional reliability of packet delivery by ensuring that the foreign host acknowledges the receipt of all reliable packets. ENet will attempt to resend a packet up to a reasonable number of times if no acknowledgement of the packet's receipt happens within a specified timeout. Retry timeouts are progressive and become more lenient with every failed attempt, to allow for temporary turbulence in network conditions.
The liveness of the connection is actively monitored by pinging the foreign host at frequent intervals.
It also monitors the network conditions from the local host to the foreign host, such as the mean round
trip time and packet loss. With this information, ENet can change the packet size in order to obtain the maximum throughput of the connection.
Some important specifications and functions of ENet are the following (a minimal client-side usage sketch is presented after the list):

• enet_initialize(): before using ENet, the program must call this function to initialize the library. Upon program exit, enet_deinitialize() should be called, so that the library may clean up any used resources.

• enet_host_create(): this function is used to create both the server and the client. In the provided parameters, an address on which to receive data and new connections must be introduced, as well as the maximum allowed number of connected peers. Optionally, the incoming and outgoing bandwidth of the server (in bytes per second) can be specified, so that ENet may try to statically manage bandwidth resources among connected peers in addition to its dynamic throttling algorithm; specifying 0 for these two options will cause ENet to rely entirely upon its dynamic throttling algorithm to manage bandwidth. When done with a host, it may be destroyed with enet_host_destroy(). All peers connected to the host will be reset, and the resources used by the host will be freed.

• enet_host_connect(): function that connects to a foreign host; it receives the address of that host.

• enet_host_service(): ENet uses a polled event model to notify the programmer of significant events. ENet hosts are polled for events with enet_host_service(), where an optional timeout value in milliseconds can be specified to control how long ENet will poll; if a timeout of 0 is specified, enet_host_service() will return immediately if there are no events to dispatch. enet_host_service() will return 1 if an event was dispatched within the specified timeout. There are four types of significant events in ENet:

– ENET_EVENT_TYPE_NONE: this event is returned if no event occurred within the specified time limit. enet_host_service() will return 0 with this event.

– ENET_EVENT_TYPE_CONNECT: this event is returned when either a new client host has connected to the server host or when an attempt to establish a connection with a foreign host has succeeded.

– ENET_EVENT_TYPE_RECEIVE: this event is returned when a packet is received from a connected peer. The "peer" field contains the peer the packet was received from, "channelID" is the channel on which the packet was sent, and "packet" is the packet that was sent. The packet contained in the "packet" field must be destroyed with enet_packet_destroy() when its contents have been inspected.

– ENET_EVENT_TYPE_DISCONNECT: this event is returned when a connected peer has either explicitly disconnected or timed out. Only the "peer" field of the event structure is valid for this event and contains the peer that disconnected. Only the "data" field of the peer is still valid on a disconnect event and must be explicitly reset.

• enet_packet_create(): function used to create a packet. The size of the packet must be specified. To specify that the packet must use reliable delivery, the flag ENET_PACKET_FLAG_RELIABLE must be supplied. If this flag is not specified, the packet is assumed to be an unreliable packet, and no retry attempts will be made nor acknowledgements generated.

• enet_peer_send(): function used to send a packet to a foreign host. Once the packet is sent, ENet will handle its deallocation.

• enet_peer_disconnect(): a disconnect request will be sent to the foreign host, and ENet will wait for an acknowledgement from the foreign host before finally disconnecting.
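
The minimal client-side sketch below illustrates how these functions are combined: connect to a peer, send one reliable packet and disconnect. The ENet 1.3 API is assumed, and the base station address, port, channel count and payload are illustrative assumptions; the real mobile station sends the packet structures described next.

/* Minimal ENet client sketch: connect, send one reliable packet, disconnect. */
#include <enet/enet.h>

int main(void)
{
    if (enet_initialize() != 0)
        return 1;

    /* NULL address = client host; 1 peer, 1 channel, no bandwidth limits */
    ENetHost *client = enet_host_create(NULL, 1, 1, 0, 0);

    ENetAddress address;
    enet_address_set_host(&address, "192.168.1.10");     /* base station (assumed) */
    address.port = 1234;                                  /* assumed port           */

    ENetPeer *peer = enet_host_connect(client, &address, 1, 0);

    ENetEvent event;
    if (enet_host_service(client, &event, 5000) > 0 &&
        event.type == ENET_EVENT_TYPE_CONNECT) {
        const char payload[] = "sensor packet";
        ENetPacket *packet = enet_packet_create(payload, sizeof(payload),
                                                ENET_PACKET_FLAG_RELIABLE);
        enet_peer_send(peer, 0, packet);                  /* channel 0              */
        enet_host_flush(client);                          /* push it out now        */

        enet_peer_disconnect(peer, 0);
        while (enet_host_service(client, &event, 3000) > 0)
            if (event.type == ENET_EVENT_TYPE_DISCONNECT)
                break;
    }

    enet_host_destroy(client);
    enet_deinitialize();
    return 0;
}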

In accordance, ENet guarantees the delivery of the data transmitted from the UAV to the base station and also ensures the required fragmentation of the packets, taking into account the observed packet loss ratio of the link. Furthermore, the adoption of the ENet protocol alleviates the need to encapsulate the packets with complex control information (see figure 4.9).

Figure 4.9: System integration with Enet protocol

Accordingly, there is no need to send extra fields in the packet (like a sequence number or a checksum), since ENet already provides them. Table 4.4 shows the structure of the packets sent over the ENet protocol, followed by an explanation of its fields.

16 bits | 64 bits | Data size
ID | GPS coordinates (lat/long) | Data

Table 4.4: The packet structure

The packet structure:

• Packet ID: there are 3 types of packets: image packet, sensors packet (the sensors packet has an extra field with the size of each sensor) and configuration packet. The acknowledgement and retransmission packets are provided by the ENet protocol.

• GPS coordinates: the packet geotag, with the coordinates of the acquired data.

• Data: data to be sent.

The configuration packet does not have GPS coordinates. It only has an ID and the configuration command, and its size varies with the size of the command (see table 4.5).

16 bits | Configuration command size
ID | Command

Table 4.5: Configuration packet structure
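
One possible C view of these headers is sketched below. The field names and, in particular, the encoding of the 64-bit coordinate field as two 32-bit fixed-point values are assumptions made only for illustration; the text above does not specify the exact on-the-wire encoding.

/* Sketch of the packet headers of tables 4.4 and 4.5 (illustrative layout). */
#include <stdint.h>

#pragma pack(push, 1)

typedef struct {
    uint16_t id;            /* 1 = image, 2 = sensors (assumed type codes)       */
    int32_t  latitude;      /* 64-bit GPS coordinate field, here assumed to be   */
    int32_t  longitude;     /* two 32-bit fixed-point values                     */
    /* the payload ("Data") follows immediately after the header */
} data_packet_header_t;

typedef struct {
    uint16_t id;            /* 3 = configuration (assumed type code)             */
    /* the configuration command follows; its length varies with the command */
} config_packet_header_t;

#pragma pack(pop)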

4.3 Base Station

The architecture of the base station consists of a web application system responsible for receiving the acquired data from the mobile station. The GPS coordinates are used to track the trajectory on a map, while displaying the acquired images and the other sensors data in charts. The map should also have markers with the sensors and image locations. The user interface should also allow the user to change some parameters in the UAV (see figure 4.10).
Like any web application, this system has two modules: the client side, which runs in the browser, and the server side, which runs on the server. The system modules were developed in C (receiver module), HTML (interface), JavaScript (client side support modules) and Node.Js (control and data multiplexing).

Figure 4.10: Base station implementation overview

4.3.1 Message Formatting and Protocol Implementation

This module is the server side of the communication between the mobile station and the base station, implementing the RUDP protocol. This module receives the acquired sensors data from the mobile station and extracts the information.
To ensure the communication between this module and the system control module, a continuous connection through a TCP socket is created, allowing both sides to know the data format. One type of message can be received from the system control module and two types of messages can be sent to the system control module:

1. Image message This message is received from the mobile station. The image and the GPS coordinates are extracted from the packet. After the extraction, the image receives an ID and is written to a RAM disk, and a message is sent to the system control module with the image ID and the GPS coordinates (see table 4.6).

2. Sensors message When this message is received, the sensors data and the corresponding coordinates are sent to the system control module (see table 4.7) and written to a file as backup.

3. Configuration message These messages are sent from the base station (system control module) to the mobile station, in order to change a parameter in the mobile station.

16 bits | 64 bits | 32 bits
ID | GPS coordinates (lat/long) | Image ID

Table 4.6: Image message to the system control and data multiplexing module

16 bits | 64 bits | Sensors data
ID | GPS coordinates (lat/long) | Data

Table 4.7: Sensors message to the system control and data multiplexing module

This module also receives from the system control and data multiplexing module a configuration message, already in the format to be sent to the mobile station (presented before; see table 4.5).

4.3.2 System Control and Data Multiplexing

This is the main module of the base station. It was developed in Node.Js5 , by using the web application
framework Express, which helps to develop a REST (Representational State Transfer) application. This
is the module that creates the HTTP server.
In order to create the HTTP server, an anonymous function is provided as an argument to the function
createServer, acting as a callback that defines how each HTTP request should be handled. The callback
function accepts two arguments: request and response. When the callback executes, the HTTP server
will populate these arguments with objects that, respectively, allow the program to work out the details
of the request and send back a response.
This module is also responsible for processing the information that is received from the message formatting and protocol implementation module. The information is received as presented before (see section 4.3.1). When the information is received, this module extracts it, analyses it and sends it to a module on the client side (as a JSON object). This module can send five types of messages and receives one configuration message with the configuration command. Figure 4.11 shows the types of messages, together with the used sockets.

1. Image message to the user interface Sends the last acquired image directly to the user interface,
providing a sequence of images similar to MJPEG.
5 https://nodejs.org/

2. Image message to map Sends the image (with its coordinates) to the map module, together with all the sensors data acquired at the same coordinates. This message is sent periodically, in intervals of 30 seconds.

3. Coordinates message to the marker Sends the coordinates to the marker module, to create the marker. This message is sent on the same channel as the "image message to map".

4. Coordinates message to polyline Sends the last coordinates to the polyline module, in order to create the UAV track. This message is only sent if the current coordinates are different from the ones received in the previous message from the mobile station.

5. Sensors data Sends the sensors data to the charts module in order to be displayed.

Figure 4.11: Base station implementation overview

In order to ensure the communication between the several modules on different sides (client and server), Socket.IO was used. However, since HTTP is a stateless protocol, meaning that the client is only able to make single, short-lived requests to the server, and the server has no real notion of connected or disconnected users, it was necessary to adopt the WebSocket protocol [43], which specifies a way for browsers to maintain a full-duplex connection to the server, allowing both ends to send and receive data simultaneously. The problem with the WebSocket protocol is that it is not yet finalized, and although some browsers have begun shipping with WebSocket, there are still a lot of older versions out there, especially of Internet Explorer. Socket.IO solves this problem by utilizing WebSocket when it is available in the browser, and falling back to other browser-specific tricks to simulate the behaviour that WebSocket provides, even in older browsers [43].

Node.Js
As referred before, the actual implementation of this module was carried out using Node.Js, which is a JavaScript runtime that uses the V8 engine developed by Google for use in Chrome. It is important to recall that other web development technologies were also considered, such as PHP and Python. However, Node.Js has better performance than traditional PHP in high-concurrency environments: PHP handles small requests well, but struggles with large ones. Furthermore, Node.Js is more suited to I/O-intensive applications than to compute-intensive sites; Python-Web is likewise not suitable for a compute-intensive website [44].
In addition to fast JavaScript execution, Node.Js uses an event loop. The event loop is a single thread
that performs all I/O operations asynchronously. Traditionally, I/O operations either run synchronously
(blocking) or asynchronously by spawning off parallel threads to perform the work. When a Node.Js
application needs to perform an I/O operation, it sends an asynchronous task to the event loop, along
with a callback function, and then continues to execute the rest of its program. When the async operation
completes, the event loop returns to the task to execute its callback. With this event loop system, reading
and writing to network connections, reading/writing to the file system, and reading/writing to the database
are very fast operations. As a result, Node.Js allows building fast, scalable network applications capable of handling a huge number of simultaneous connections with high throughput. Since the base station application maintains a persistent connection from the browser back to the server, Node.Js is the better choice.

4.3.3 Map Module

Google Maps is a set of technologies (HTML, CSS, and JavaScript) working together. The map is represented by images that are loaded in the background with Ajax calls and then inserted into a div element in the HTML page. When the user navigates through the map, the API sends information about the new coordinates and zoom levels of the map in Ajax calls that return new images. The API itself basically consists of JavaScript files that contain classes with methods and properties that can be used to tell the map how to behave.
Some advantages of using Google Maps in this project are:

• The Google Maps API is currently provided for free.

• Google Maps provides a map service with vector maps and high-resolution satellite images, which are updated by Google from time to time. Users can always enjoy the latest map information services.

• The map is not dynamically generated based on user requests, but rather pre-processed into image pyramids stored on the server side.

• Each map is simple to operate, including moving (mouse drag) and free scaling.

In this particular project, the map is used to display the current position and trajectory of the UAV.
This module is also responsible for displaying the image and sensors data georeferenced by their GPS

location.
To accomplish these tasks, the Google Maps API includes some modules which interact with the
map. There are three main modules with relevance for this project:

• Marker, which allows inserting markers on the map. This module is used in the project to place a marker where the images and sensors were acquired (see figure 4.12).

• InfoWindow, which allows showing the sensors and image data on the map (see figure 4.12). The info window is used together with the marker: when the user clicks on the marker, the InfoWindow appears with the sensors and image information.

• Polyline, which allows drawing on the map. This module is used in the project in order to track the UAV route in real time.

Figure 4.12: Map example with marker and InfoWindow

In order to load the map module, some basic steps have to be ensured. The first step is the inclusion of the Maps API JavaScript using a script tag and the creation of a JavaScript object literal to hold a number of map properties. Secondly, a div element named “map-canvas” is created to hold the map, and the reference of this element is obtained from the browser document object model (DOM). Thirdly, a JavaScript function is written to create a map object (google.maps.Map), which takes two arguments: a reference to the HTML element where the map will reside, and an object (MapOptions) that contains the initial settings for the map, such as the starting zoom level, where the center of the map should be, and what kind of map should be displayed. Finally, the map object is initialized from the body tag onload event. In the HTML page, the document object model (DOM) is built out, and any external images and scripts are received and incorporated into the document object. The map modules (i.e. polylines, markers and infowindows) have their own scripts, where they create the data arrays and send them to the previously created map object.

4.3.3.1 Map Marker

To create a marker on the map (see figure 4.12), the marker object (google.maps.Marker) was used. It takes only one parameter, which is a MarkerOptions object (google.maps.MarkerOptions). This object has several properties that can be used to make the marker look and behave in different ways, but the two main properties are: position and map.

• position: this property defines the coordinates where the marker will be placed. It takes an argu-
ment in the form of a LatLng object (google.maps.LatLng), representing the latitude and longitude
coordinates.

• map: the map property is a reference to the map to which the marker will be added.

4.3.3.2 Map InfoWindow

In order to show the image and sensors information related to a specific location, Google Maps API
offers the InfoWindow (see figure 4.12). This information appears over a marker when such marker is
clicked.
Much like the Marker object, the InfoWindow object resides in the google.maps namespace and takes
only one argument, called InfoWindowOptions. Like the MarkerOptions object, the InfoWindowOptions
object has several properties, but the most important one is the content. This property controls what will
show inside the InfoWindow. In this system, plain text was used to present the sensors values and an HTML element to display the image.
In order to open the InfoWindow when the marker is clicked, it was necessary to create an event listener, which receives the object it is attached to (the marker), the event it should listen for (click) and the event handler, which is the function to be called (the InfoWindow open() method). The InfoWindow object has a method called open() that will open the InfoWindow and make it visible on the map. The open() method takes two arguments: the first is a reference to the map object it will be added to; the second is the object that the InfoWindow will attach itself to. In our system, it was attached to the marker being clicked.

4.3.3.3 Map Polylines

As referred before, the polylines are used to show the current position and track the UAV in real time. Polylines are made up of several connected lines. A line consists of two points: a starting point and an end point. These points are made up of coordinates.
The polyline object takes one argument, of type PolylineOptions. The PolylineOptions type has several properties (path, color, opacity, etc.), but only one is required, corresponding to the property "path", which takes an array of google.maps.LatLng objects. When this module receives the coordinates from the control module, these coordinates are converted to objects of the type google.maps.LatLng and added to an array, which is then passed in the path property. After creating the polyline object, it is necessary to add it to the map with a method called setMap(). This method takes the Map object as argument.

Figure 4.13: Example of a polyline in a map

4.3.4 Charts Module

When the sensors data arrives at the base station (message formatting and protocol implementation module), it is sent to the system control module and multiplexed, in order to create a JSON object with the sensors data. Then, it is sent to the charts module, which shows the data in a chart format.
The Google Charts API6 was used to display the sensors data in a chart format, enabling the chart to be updated in real time. This API was chosen because it allows creating charts from data sources and embedding them in web pages. Furthermore, it is free and has a JavaScript interface similar to the API used in this project. Moreover, since Google charts are based on pure HTML5 and scalable-vector-graphics technology, the charts can be displayed on various browsers and platforms, with no plug-ins required. Furthermore, it is simple to update the chart in real time: it is only necessary to add a new point and refresh the chart object.

4.3.5 User Interface

The interface runs in the browser of the base station personal computer and enables the user to see the signals that were acquired by the mobile station in a responsive interface (see figure 4.14). Furthermore, it allows interacting with the application and setting up some parameters. This interface allows the user to see the UAV trajectory, the sequence of acquired images (similar to an MJPEG video) and the sensors in charts, all in soft real time.
For the frontend development, Bootstrap7 was used, which is a CSS framework that brings many advantages, such as: a responsive 12-column grid, layouts and components; many different elements such as headings, lists, tables, buttons and forms; interactive elements with JavaScript; and good documentation.
Figure 4.14 illustrates the conceived user interface. In the following, a brief description of the several menus, buttons and the main display is presented.

1. Button that selects the layout embedding the map, image data and sensors data.

2. Button that selects the layout with the map in full screen.
6 https://developers.google.com/chart/
7 http://getbootstrap.com/

Figure 4.14: Base station user interface

3. Button that selects the layout with only the image data stream.

4. Button that selects the layout with all the sensors data.

5. Pop-up menu to choose the desired image resolution.

6. Map where the UAV is tracked and where the images and sensors are georeferenced.

7. Stream of image data.

8. Plots of the sensors data.

9. Pop-up menu which selects the observed sensor (see figure 4.15). More sensors could be added.

Figure 4.15: Pop-up menu to choose the considered sensor

Upon startup, all the options are set to their default values. The user may just run the application with the default values or, if desired, change the parameters. In this prototype, the user can change the image resolution (see figure 4.16).

Figure 4.16: Pop-up menu to choose the image resolution

The frame rate of the image acquisition is always the maximum that the system supports for that
resolution.

Chapter 5

Experimental Results

In this chapter, an analysis and evaluation of the prototyped system is presented. In the following paragraphs, a summary of the obtained results is given.
To perform the considered tests, a Raspberry Pi board was used to run the mobile station code, connected to the base station through a WiFi wireless link. The base station code was run on a personal computer (Lenovo T430). This computer was also connected to the internet, in order to use the Google Maps service. Several tests were conducted to assess the obtained performance of the system, as well as its overall functionality. In order to evaluate the system on hardware more capable than the 700 MHz CPU of the Raspberry Pi, version 2 of the Raspberry Pi board, with a 900 MHz quad-core CPU, was also used.

5.1 Mobile Station

To test the performance of the mobile station, a set of system timers was used to measure how much time each module spends on one step of the process. From the acquired times, the image acquisition time, encoding time, GPS acquisition time, sensors acquisition time and, finally, the transmission time are calculated (see figure 5.1). It is important to note that, for each recorded step, the obtained measure represents the time used by that thread in the system until the cycle finishes. Hence, if the thread is waiting to enter a lock, the timer continues running. Figure 5.1 shows the impact of each thread on the system. The thread that takes more time is the encoding thread, followed by the acquisition thread; although the transmission thread shows a considerable time in the graph, most of that time is spent waiting for the semaphore to open.

5.1.1 Image Acquisition and Encoding

The majority of the tests that were performed in the mobile station focused on performance. Since
the module that needs more computational power is the image module (acquisition and encoding),
several tests were made to evaluate the influence of the acquisition rate of the other external sensors

on the resulting image encoding rate. Table 5.1 presents the obtained results. As can be observed, at 1 Msamples/s there is a significant impact on the image module.

Figure 5.1: System time of the mobile station threads (the GPS thread is not represented because its system time is too small)

Sample rate | Average FPS
1 ksamples/s | 4.07
10 ksamples/s | 3.9
100 ksamples/s | 3.88
1 Msamples/s | 3

Table 5.1: Influence of the sensors sample rate on the performance of the image module

On the other hand, the maximum frame rate that allows the application to run without losing frames is
4 fps (max) in the Raspberry Pi version 1 (the second version allows 9 fps (max)). Figure 5.2 represents
the obtained average frame rate in the Raspberry Pi first version. Figure 5.3 shows the same test, when
performed in the Raspberry Pi version 2.
To evaluate this performance metric, each image was tagged with a unique identifier, in order to identify it as it passed through each buffer and to measure how much time was taken in each stage. This image tag was also used to know whether an image was lost in a buffer or discarded in the encoder or transmission buffer. Based on the tests performed, once an image is acquired it is not discarded while waiting for the encoding or transmission process.

Figure 5.2: Image acquisition and encoding frame-rate in the Raspberry Pi version 1

Figure 5.3: Image acquisition and encoding frame-rate in the Raspberry Pi version 2

Figure 5.4 compares the obtained frame rate on the Raspberry Pi 1 and 2. It is possible to see that both systems only stabilize, and thus reach full performance, after 50 seconds.

Figure 5.4: Comparative between Raspberry Pi version 1 and 2

5.1.2 GPS Acquisition

To simulate the GPS behaviour, the gpsfake tool was used in this performance test. This tool takes as input an NMEA file with a predefined list of coordinates to simulate the presence of a GPS device, conveniently connected to the GPSd, just like a real GPS device. This allows testing the solution with a previously recorded log of GPS data, instead of using a real GPS device, which needs a clear view of the sky in order to receive data from the satellites. This tool was very useful to test the GPS acquisition without the need to walk around with the GPS device in order to acquire GPS coordinates; most of the tests were made with this tool.

5.1.3 EXIF: Image Geotag

In order to test the EXIF geotag, when an image arrives at the base station it is opened with the ImageMagick tool1, using the command: identify -format %[exif:*] file.jpg
1 http://www.imagemagick.org/

Figure 5.5 shows the EXIF geotag.

Figure 5.5: EXIF geotag

5.1.4 Sensors Data

This project was assumed to be completely agnostic regarding the format and type of sensors data that the mobile station receives from the UAV and sends to the base station. The only predefined requisite was the existence of enough bandwidth to support 10 kbps. Another requirement was the guaranteed reliability of the packets sent from the mobile to the base station.
Several tests were performed on both Raspberry Pi boards (versions 1 and 2) to evaluate the impact of different sensors sampling rates. Due to the lower processing capacity of the Raspberry Pi version 1, when the sample rate changes, the time to acquire and to transmit also suffers some changes (see table 5.2). This is not as noticeable on the Raspberry Pi version 2: when the sample rate changes, the acquisition and transmission times suffer only small changes, due to its superior processing capacity (see table 5.3).

 | 1 ksamples/s | 10 ksamples/s | 100 ksamples/s | 1 Msamples/s
Sensors | 0.1243652 s | 0.125421 s | 0.127044 s | 0.2451976 s

Table 5.2: Sensor acquisition and transmission time in the Raspberry Pi version 1

 | 1 ksamples/s | 10 ksamples/s | 100 ksamples/s | 1 Msamples/s
Sensors | 0.14175 s | 0.143375 s | 0.145088 s | 0.146317 s

Table 5.3: Sensor acquisition and transmission time in the Raspberry Pi version 2

5.2 Communication Link

The communication platform is the part of the system that allows the mobile station to communicate with the base station. The first test was done by trying to establish a communication link between the mobile and base applications on the same computer, using the localhost address. Then, the wireless link and the mobile station hardware were used, in order to send the acquired data to the base station through WiFi. To test the reliability of the protocol, a data file was sent from the mobile station to the base station through the wireless link and the two files (sent and received) were compared, using the Unix command cmp.
In order to monitor the communication bandwidth usage, the nload tool was used. The nload tool was installed on the base station (laptop) to monitor the incoming traffic from the mobile station at the wireless LAN interface (wlan0). This test was made with different image acquisition frame rates at a resolution of 640 x 480. The result, as expected, was linear (see figure 5.6) and the maximum bandwidth usage was around 2.2 Mbps.

Figure 5.6: Bandwidth usage

5.3 Base Station

The base station is the component of the system where the data is stored and processed, creating a georeference system that relates the data from the UAV sensors with the place where the data was collected. To evaluate the base station, a test was performed in order to check whether the main goals were accomplished. In order to test the base station with all its features, it was necessary to use the mobile station implemented on the Raspberry Pi, connected to the computer through a wireless link. The GPS coordinates were acquired from gpsfake, which was replaying an NMEA log of a flight. To simulate the sensors, the temperature and CPU usage of the Raspberry Pi were acquired and sent to the base station, in order to visualize them at the user interface in real time.
The features tested were:

• Trajectory and position of the UAV on the map (see figure 5.7)

Figure 5.7: Trajectory of the UAV and image/sensors markers

• Georeference images and sensors on the map (see figure 5.8).

Figure 5.8: InfoWindow with image and sensors

• Show the acquired sensors data. Figure 5.9 shows an example chart of the temperature sensor (Raspberry Pi temperature). The chart is updated in real time when a sensor packet arrives at the base station.

Figure 5.9: Plot with sensors data

• Show the acquired images in a sequence similar to MJPEG.

Figure 5.10 shows the full interface with all the features presented before.

Figure 5.10: Base station interface

5.4 Final Test

In order to understand the limitations of the developed prototype, a test was performed using a drone (tricopter) (see figure 5.11).

In this test, the Raspberry Pi 2 was used because it has four USB ports, which allows connecting the GPS dongle, the webcam and the WiFi dongle simultaneously without using a USB hub, which would add more weight to be carried by the drone. To power the Raspberry Pi, a power bank battery connected through a micro USB cable was used. The base station was running on a laptop (Lenovo T430) running Ubuntu and configured to work as an Access Point (AP).

Figure 5.11: Drone with the mobile station

Devices installed on the drone (see figure 5.11):

1. Raspberry Pi 2.

2. Logitech HD C270 webcam.

3. GPS Dongle ND-100S.

4. USB WiFi Dynamode.

5. Battery (power bank).

In order to understand the range at which the drone would be able to fly without losing the communication connectivity between the Raspberry Pi and the laptop, a test was performed using just the Raspberry Pi with a WiFi dongle. To perform this test, the Raspberry Pi was connected to the laptop via WiFi and the iperf tool was used to measure the bandwidth (see figure 5.12), the packet loss (see figure 5.13) and the jitter (see figure 5.14) of the connection link.

Figure 5.12: Communication bandwidth

Figure 5.13: Communication packet loss

With the results of this test, it was expected that the drone could fly to a distance of 110 m without losing the connection, and to a distance of around 60 m while keeping a bandwidth above 2.2 Mbps (see figure 5.12), which was the maximum bandwidth usage of the system. However, the greater the distance between the mobile and base stations, the higher the expected latency of the data visualisation at the base station, because the packet loss and the jitter increase with the distance (see figures 5.13 and 5.14).

Figure 5.14: Jitter

When the drone was flying (see figure 5.15), the delay of the data visualization was greater than in the previous tests, because the drone was in movement. The packet loss also increased with the movement of the drone, which further increased the latency of the data visualization.

Figure 5.15: Final test performed on the drone

However, during the test it was always possible to visualize, at the base station, the trajectory of the drone and the acquired images. Figure 5.16 shows the base station interface at the end of the test. This figure shows the drone trajectory on the map, the temperature sensor chart (Raspberry Pi temperature) and the last acquired image.

Figure 5.16: Final interface of the drone test

Chapter 6

Conclusions

The aim of this work was the development of a prototype of a UAV sub-system, able to detect its current position and also to relate that position with the acquired sensors data (image and other sensors). The UAV module communicates with the base station module, in order to transmit the gathered sensor data to be displayed. According to the introduction of this document, the objectives defined for this work were divided into three main modules:

• Implementation of the interface software, between the on-board computer, the sensors, the GPS and the video camera.

• Implementation of the communication software, between the on-board computer and the base
station.

• Implementation of the user interface, at the base station, with georeferenced system.

The implemented system, on the Raspberry Pi single-board computer, is able to acquire and encode images at a frame rate of 4 fps, acquire the sensors data, acquire the position of the mobile station and send the gathered information to the base station. The system also allows georeferencing the acquired images by creating an EXIF tag.
The project also comprehended the implementation of the communication software between the on-board computer and the base station. The implemented system has a wireless WiFi communication link between the mobile and base stations, implemented using the RUDP protocol in order to fulfil the data reliability requirement. The image data, which is the one that needs more bandwidth, is compressed to a more suitable format (JPEG) before being sent to the base station.
The user interface was developed in Node.Js and is able to receive the data from the mobile station, process it and display it on a map with the UAV trajectory, together with the images and sensors embedded on the map, all in real time. The interface displays the acquired images in a sequence similar to MJPEG and the sensors in charts. The base station is also able to send parameters from the user interface to the mobile station (e.g. image resolution).
To respect the requirements in terms of space, low price and weight, an architecture based on a modular system was designed, with the aim of providing functional integration into another system and easy maintenance.

6.1 Future Work

The work described in this thesis concerned the development of a proof-of-concept prototype. As such, some aspects can be improved at both levels: hardware and software. An important aspect concerns the possibility of using another type of camera, with better quality. Another interesting aspect would be dedicated hardware to encode images in both formats, still and video. In what concerns the communication link, MIMO WiFi could be used in order to have more bandwidth and thus transmit images and video with better quality. Ciphering the information to be transmitted would also be a good privacy measure. Another possible improvement to this project is to add more options to the user interface: the user might be able to configure other parameters that are only available to the programmer, such as the quality of the JPEG encoding.

Bibliography

[1] R S Components Vsn. Raspberry Pi Getting Started Guide. 2012.

[2] Beagle boards. BeagleBone Black System Reference Manual. 2013.

[3] Aeroflex. IEEE 802.11ac: Technical Overview. 2009.

[4] Jangeun Jun, Pushkin Peddabachagari, and Mihail Sichitiu. Theoretical maximum throughput of
IEEE 802.11 and its applications. 2003.

[5] Wi-Fi Alliance. Multimedia-Grade Wi-Fi Networks. pages 1–18, 2007.

[6] Lakshminarayan Subramanian, Sonesh Surana, Rabin Patra, and Sergiu Nedevschi. Rethinking
wireless for the developing world. 2010.

[7] Timothy Cox, Christopher Nagy, Mark Skoog, and Ivan Somers. Civil UAV Capability Assessment.
2004.

[8] Louisa Brooke-holland. Unmanned Aerial Vehicles ( drones ): an introduction. 2012.

[9] European Commission. Study Analysing the Current Activities in the Field of UAV. Technical report,
2007.

[10] Chunhua Zhang and John M. Kovacs. The application of small unmanned aerial systems for preci-
sion agriculture: A review. 2012.

[11] Andrea S Laliberte, Albert Rango, and Jeff Herrick. Unmanned Aerial Vehicles for Rangeland
Mapping and Monitoring : a Comparison of Two Systems. 2007.

[12] Fat Shark. Predator V2 specification document.

[13] Fat Shark. RC Vision Systems PREDATOR V2 RTF FPV KIT. 2013.

[14] Teradek. Teradek Clip. 2013.

[15] Teradek. Teradek BondII. 2013.

[16] René Fintel. Comparison of the Most Common Digital Interface Technologies in Vision Technology.
2012.

[17] GoPro. GoPro Hero 3 - Manual. 2013.

[18] Scott Pace, Gerald Frost, Irving Lachow, David Frelinger, Donna Fossum, Don Wassem, and Mon-
ica Pinto. The Global Positioning System.

[19] ED Kaplan and CJ Hegarty. Understanding GPS: principles and applications. 2005.

[20] National Marine Electronics Association (US). NMEA 0183–Standard for Interfacing Marine Elec-
tronic Devices. 2002.

[21] Francois Koenig and David Wong. Differential Global Positioning System (DGPS) Operation and
Post-Processing Method for the Synchronous Impulse Reconstruction (SIRE) Radar. 2007.

[22] Jerome R Vetter and William A Sellers. Differential Global Positioning System Navigation Using
High-Frequency Ground Wave Transmissions. 1998.

[23] Yoichi Morales and Takashi Tsubouchi. DGPS, RTK-GPS and StarFire DGPS Performance Under
Tree Shading Environments. March 2007.

[24] Mustafa Dinc. Design Considerations for Military Data Link Architecture in Enabling Integration of
Intelligent Unmanned Air Vehicles ( UAVs ) with Navy Units.

[25] Raj Jain, Fred Templin, and Kwong-Sang Yin. Wireless Datalink for Unmanned Aircraft Systems:
Requirements, Challenges, and Design Ideas. March 2011.

[26] Jon T Adams. Texas Wireless Symposium 2005. 2005.

[27] Smartroom Network, Remote Power Management, and The Zigbee Alliance. What is Zigbee ?
2005.

[28] LAN/MAN standards Committee. Part 11: Wireless LAN medium access control (MAC) and physi-
cal layer (PHY) specifications. IEEE-SA Standards Board, 2012(March), 2003.

[29] Computer Society IEEE. IEEE Std 802.11n. 2009.

[30] Lara Deek, E Garcia-Villegas, and Elizabeth Belding. Joint rate and channel width adaptation in
802.11 MIMO wireless networks. 2013.

[31] Daniel Halperin, Wenjun Hu, Anmol Sheth, and David Wetherall. 802.11 with Multiple Antennas for Dummies.

[32] WL Shen, YC Tung, and KC Lee. Rate adaptation for 802.11 multiuser mimo networks. 2012.

[33] Rabin Patra and Eric Brewer. WiLDNet : Design and Implementation of High Performance WiFi
Based.

[34] A. Sheth, S. Nedevschi, R. Patra, S. Surana, E. Brewer, and L. Subramanian. Packet Loss Characterization in WiFi-Based Long Distance Networks. IEEE INFOCOM 2007 - 26th IEEE International Conference on Computer Communications, pages 312–320, 2007.

[35] Annalisa Durantini and Marco Petracca. Experimental Evaluation of IEEE 802.16 WiMAX Performances at 2.5 GHz Band. pages 338–343, 2008.

[36] David Astély, Erik Dahlman, Anders Furuskär, Ylva Jading, Magnus Lindström, and Stefan Parkvall.
LTE : The Evolution of Mobile Broadband. (April):44–51, 2009.

[37] Mehdi Alasti and Behnam Neekzad. Quality of Service in WiMAX and LTE Networks. (May):104–
111, 2010.

[38] Douglas a Kerr. Chrominance subsampling in digital images. The Pumpkin,(1), November, (3),
2005.

[39] International Telecommunication Union. Terminal equipment and protocols for telematic services.
Study Group VIII, ITU, Geneva, 1988.

[40] G.K. Wallace. The JPEG still picture compression standard. IEEE Transactions on Consumer
Electronics, 38(1):1–17, 1992.

[41] Technical Standardization Committee on AV & IT Storage Systems and Equipment. Exchangeable image file format for digital still cameras. 2002.

[42] N Boettcher, Luciano Ahumada, Roberto Konow, and Luis Loyola. Empirical efficiency gains of
high-speed UDP-based protocols in realistic settings.

[43] Mike Cantelon, Marc Harter, T J Holowaychuk, and Nathan Rajlich. Node.js in Action.

[44] Kai Lei, Yining Ma, and Zhi Tan. Performance Comparison and Evaluation of Web Development
Technologies in PHP, Python, and Node.js. 2014.

