
Application Notes

Shih-Chia Huang, Department of Electronic Engineering, National Taipei University of Technology, Taiwan
Bo-Hao Chen, Department of Computer Science and Engineering, Yuan Ze University, Taiwan
Sheng-Kai Chou, Department of Electronic Engineering, National Taipei University of Technology, Taiwan
Jenq-Neng Hwang, Department of Electrical Engineering, University of Washington, USA
Kuan-Hui Lee, Department of Electrical Engineering, University of Washington, USA

Smart Car

Abstract
In contrast to a traditional mechanical car, the Smart Car is a highly computerized automobile featuring ubiquitous computing, intuitive human-computer interaction, and an open application platform. In this paper, we propose an advanced Smart Car demonstration platform with a transparent windshield display and various motion sensors, on which drivers can manipulate a variety of car-appropriate applications in augmented reality. Similar to smartphones, drivers can customize their Smart Car through free downloads of car-appropriate applications according to their needs. Additionally, three potential car-appropriate applications related to computer vision are investigated and implemented in our platform for increased driving safety. The first and second car-appropriate applications aim to enhance the driving visual field by restoring the low-visibility scenes captured during inclement-weather and nighttime driving conditions, respectively, to high-visibility ones and displaying them on the transparent windshield display. We also survey pedestrian tracking techniques that combine information from multiple driving recorders into a mobile surveillance network, including one proposed framework we have developed as the third car-appropriate application. By embedding these car-appropriate applications, the Smart Car has the potential to increase the safety of driving conditions in both daytime and nighttime, even in bad weather.

Digital Object Identifier: 10.1109/MCI.2016.2601758. Date of publication: 12 October 2016. Corresponding author: Shih-Chia Huang (e-mail: schuang@ntut.edu.tw).

I. Introduction

Ubiquitous sensors, devices, networks, and information are opening the door to a smart world in which smart devices have extended computational intelligence throughout the physical environment to provide reliable, relevant services to users [1]. These devices are getting smarter, more multi-functional, and more customizable, allowing users to access and store comprehensive information via many downloadable applications.

In general, smart devices are characterized by three important properties:

Ubiquitous computing (ubicomp): accessing information, or being accessed, interactively and autonomously everywhere and anywhere via various sensors over different wireless protocols.

Human-computer interaction (HCI): the essential interfaces of smart devices that support interaction between human and computer.

Application platform: allowing users to download third-party application (app) software to customize their smart devices.

Thanks to these properties, smart devices can act as critical facilitators for the Internet of Things (IoT) through the use of recent information and communication technologies (ICT), such as mobile operating systems (e.g., Android, iOS, Windows Phone), multimedia interfaces, internet access (e.g., Bluetooth, NFC, Wi-Fi, 3G, 4G/LTE), mobile apps, digital cameras, global positioning system (GPS) sensors, motion sensors, and so on.

So what will the next-generation devices be like? Let us take a look at the facts in terms of mobile phones and televisions. Mobile phones, which were truly the first pervasive transportable computers, have evolved into


multi-service devices now called Smartphones. The prevalence of these computerized phones allows users to send photos, audio, written documents, and videos to contacts via their phones instead of just using their phones to make calls, as can be seen in Fig. 1(a). In recent years, these modern properties have also been enabling new television (TV) features, turning the dumb TV into the Smart TV, which gives users great potential to watch on-demand videos and access internet-based options via the Smart TV's range of online applications (see Fig. 1(b)). Nowadays, the automobile can be regarded as an electronic system in addition to a mechanical one, as it incorporates computers [2], [3], informational storage [4], [5], and software [6].

Looking beyond prevalence, we believe that in the near future such advanced properties will extend to vehicle technologies, thereby turning dumb cars into Smart Cars. This may include transparent windshield displays (see Fig. 1(c)) to increase driver and passenger safety, convenience, and performance. Hence, the Smart Car should include the following properties:

Ubicomp for Smart Cars: The in-vehicle system should be networked, autonomously context-aware, and transparently accessible. Additionally, the system should be able to handle multiple datasets and interactions acquired quickly from various sensors.

HCI for Smart Cars: The windshield should be replaced entirely by a see-through display while embedding various sensors to convey information in a straightforward and easy manner.

Application platform for Smart Cars: The development platform should be opened up. Hence, car-appropriate applications from third-party developers would allow users to customize vehicle capabilities and features to their wants and needs.

Figure 1 Lives before and after changes in technology for mobile phones, televisions, and cars. (a) Lifestyle brought by smartphones nowadays: people once used phones just to make calls, and now use them to send photos, audio, documents, and video. (b) Lifestyle brought by smart TVs nowadays: people once had access only to the television programs broadcasting companies chose to air, and now have access to on-demand programs and films as well as internet-based options such as YouTube. (c) Lifestyle brought by Smart Cars in the future: people used cars solely for transport, while many cars will offer access to information through transparent windshield displays, built-in GPS, etc.

To illustrate our claim, we present a novel Smart Car demonstration platform (see Fig. 2) equipped with various sensors and a transparent windshield display to receive and display virtual information to the driver. The Smart Car, which can provide virtual management of every detail of our lives and navigate everyday traffic in augmented reality on highways, urban streets, and unstructured scenarios, will become a reality in the next few decades. Additionally, we


discuss three potential applications, including a visibility restoration application, a nighttime contrast enhancement application, and a driving environment understanding application, to enhance operational safety for Smart Cars when driving in urban street scenarios.

II. Smart Car Platform

The system architecture of the Smart Car demonstration platform is divided into three major units (see Fig. 3):
1) The interaction unit, with a see-through windshield monitor that displays content quickly in place of a traditional windshield, and with a variety of in-vehicle sensors to convey information to users or to the computing and communication unit.
2) The computing and communication unit, which provides access to and execution of both external and internal services with internet access capability.
3) The application unit, which facilitates downloading of third-party apps.

Figure 2 Exterior and interior views of the Smart Car demonstration platform.

Figure 3 System architecture of the Smart Car demonstration platform: an application unit (applications from automobile companies, government and private utility providers, and insurance companies), a computing and communication unit (communication management over Wi-Fi, 3G/3.5G/4G/LTE, NFC, Bluetooth, etc., device management, and the operating system), and an interaction unit (image, voice, eye, and gesture sensors plus the transparent windshield display).

A. Interaction Unit
In our Smart Car demonstration platform (see Fig. 4), there are two classes of input sensors that convey information about the environment outside the vehicle and the user's intended actions, and one class of output devices that conveys feedback from the computing and communication unit.

Input sensors (vehicle surround sensors): There are six image sensors plus a GPS sensor around the Smart Car demonstration platform for capturing road scenes and locating the car's position. The configuration of these image sensors, which jointly offer the driver a 360-degree view around the vehicle, is given on the left side of Fig. 4.

Input sensors (user behavioral sensors): In our Smart Car demonstration platform, there are three kinds of motion sensors (a gesture sensor, a voice sensor, and an eye-tracking sensor) mounted on the dashboard of the vehicle; they receive the user's commands and transmit them to the computing and communication unit, as shown on the right side of Fig. 4.

Output devices (transparent windshield display): The Smart Car demonstration platform has a transparent display built into its windshield. Hence, the windshield acts as a display that allows users not only to see through the screen but also to show on-demand information with a real-time, augmented-reality display through this visual interface. Here, the on-demand information refers to the driver's requests regarding the content of the Smart Car's


applications, which are made available to the driver as needed.

Figure 4 The Smart Car's configuration: image sensors capture the road scene from the front, rear, left, and right views and from the left-hand and right-hand side pillars, and a rear-view image sensor captures the inside view of the vehicle; the eye-tracking, gesture, and voice sensors receive the user's commands and transmit them; a GPS sensor locates the current position and transmits it to the in-vehicle computing system; the transparent LCD windshield display shows the on-demand information with a real-time display in augmented reality; and the in-vehicle computing system provides access to and execution of both external and internal services with internet access capability.

B. Computing and Communication Unit
The computing and communication unit of the Smart Car should provide real-time access to and execution of external and internal services and process multiple data sets acquired from the varied sensors of the interaction unit. Hence, the growing complexity of in-vehicle systems requires integration and cooperation of the software modules, multi-sensor data fusion, stable storage, and plug-and-play capability. Therefore, the in-vehicle computing system should include specific features, as listed in Table 1.

Table 1 Specifications of the in-vehicle computing system.
Embedded multi-core processors: base frequency 2.2 GHz, max turbo frequency 3.1 GHz, cache 4 MB.
Internal memory: memory interface DDR3L 1600, memory size 16 GB.
Solid state disk: capacity 1 TB, form factor 2.5-inch, interface SATA III.
Communications: Bluetooth/WLAN/FM transmitter/receiver (802.11a/b/g/n, Bluetooth V2.1+EDR, 65 nm).
Global positioning system: tracking sensitivity -163 dBm, NMEA 0183 data protocol, time to first fix, built-in SuperCap.
Graphics processing unit: memory size 4 GB, memory clock 2500 MHz, memory interface GDDR5, 1536 CUDA cores.
Power management: output 19.5 V DC, 9.23 A, 230 W.
Mobile operating system: Windows 8.1 Enterprise.

C. Application Unit
We gave the Smart Car an open platform where users are able to download applications from automobile companies, government and private utility providers, and insurance companies in order to customize its features, performance, and capabilities. The applications on this platform can be manipulated through gestures, voice, and eyes. Additionally, the user interface of the transparent windshield display consists of a header container (upper portion of the screen), a content container (mid portion of the screen), and a footer container (lower portion of the screen), as shown in Fig. 5. The header container always displays information such as the status of the car, traffic information, weather conditions, etc. The content and footer containers show the complete and simplified content from the activated application, respectively. The application platform can be downloaded from [7].
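To make the three-unit architecture concrete, the toy Python sketch below shows how a recognized user command might flow from the interaction unit through the computing and communication unit to a downloaded application and back to the windshield display. Every name in it (the classes, the hypothetical navigation app, the container strings) is an assumption made for illustration; the article does not publish the platform's internal interfaces.

from dataclasses import dataclass

@dataclass
class Command:
    source: str      # "gesture", "voice", or "eye" sensor on the dashboard
    action: str      # e.g. "open_app"
    payload: str     # e.g. the application name

class WindshieldDisplay:
    """Output device: header, content, and footer containers of the transparent display."""
    def render(self, header, content, footer):
        return {"header": header, "content": content, "footer": footer}

class ComputingUnit:
    """Computing and communication unit: dispatches commands to downloaded applications."""
    def __init__(self, display):
        self.display = display
        self.apps = {}                    # third-party apps installed from the platform

    def install(self, name, app):
        self.apps[name] = app

    def handle(self, cmd):
        app = self.apps[cmd.payload]
        full, brief = app(cmd)            # complete and simplified content for the containers
        return self.display.render("car status / traffic / weather", full, brief)

# usage: a hypothetical navigation app activated by a gesture command
unit = ComputingUnit(WindshieldDisplay())
unit.install("navigation", lambda cmd: ("turn-by-turn route overlay", "next turn: 200 m"))
frame = unit.handle(Command("gesture", "open_app", "navigation"))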


Figure 5 Layout combination of the transparent windshield display [45]: a header container in the upper portion, a content container in the mid portion, and a footer container in the lower portion of the screen.

III. Smart Car Applications
In this section, we implement and discuss three potential applications for Smart Cars related to computer vision which can be used to enhance operational safety. All three applications were previously developed by the authors of this paper and were adapted to fit into this Smart Car demonstration platform. The icons used for these three applications are given in Table 2.

Table 2 Icons of the three potential applications for operational safety enhancement in the Smart Car demonstration platform: the visibility restoration application, the nighttime contrast enhancement application, and the driving environmental understanding application.

A. Visibility Restoration Application
Inclement weather, such as heavy haze, fog, sandstorms, and so on, can make driving quite difficult and dangerous. For instance, a driver may have trouble seeing the road clearly and need more time to make a decision to brake when driving in heavy haze conditions, as shown in Fig. 6. A transparent windshield display that enhances visibility could alert a driver to potential dangers when his or her vision field is diminished in inclement weather conditions.

In general, visibility restoration of a single image can be performed by using the optical model. This model recognizes that the reflected light and distance are linearly correlated with the observed object and camera [8]. The intensity of a hazy image I from the observed object to the camera can be represented by direct attenuation (involving the intensity of the haze-free image J and the intensity of transmission T) and airlight (involving the global atmospheric light A):

I = JT + A(1 - T).   (1)

Let us assume a homogeneous medium and that the intensity of transmission T represents the amount of light reflected back to the camera from the particle surface, in which case the transmission can be determined as T = e^(-βd), where β is the medium attenuation coefficient and d is the distance between the camera and the particle surface. In particular, the aim of visibility restoration algorithms [9]-[11], [13], [14] is to estimate the latent haze-free image J via the transmission T from the hazy image I when no additional information is available regarding depth and airlight.
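As a minimal illustration of Eq. (1), the following Python sketch (plain NumPy; the array names, the atmospheric light value, and the attenuation coefficient are illustrative assumptions) synthesizes a hazy image from a haze-free one and a depth map, and inverts the model once a transmission estimate is available.

import numpy as np

def apply_haze(J, depth, A=0.9, beta=1.2):
    """Forward optical model of Eq. (1): I = J*T + A*(1 - T), with T = exp(-beta*d)."""
    T = np.exp(-beta * depth)              # per-pixel transmission
    T = T[..., np.newaxis]                 # broadcast over the RGB channels
    return J * T + A * (1.0 - T), T

def restore(I, T, A=0.9, t0=0.1):
    """Invert Eq. (1) for J given an estimated transmission map T."""
    T = np.maximum(T, t0)                  # lower bound avoids division blow-up
    return np.clip((I - A) / T + A, 0.0, 1.0)

# toy example: a gray scene whose depth grows toward the vanishing point
J = np.full((4, 4, 3), 0.5)
depth = np.linspace(0.5, 5.0, 16).reshape(4, 4)
I, T = apply_haze(J, depth)
J_hat = restore(I, T)                      # equals J up to clipping because T is exact here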


In practice, some driving situations can disturb these visibility restoration algorithms. We identified the following challenging situations in the field of visibility restoration for driving:

Local lights: When driving in hazy conditions, either the headlights of vehicles or the streetlights are usually turned on to improve the driver's vision field. This often causes misjudgement of the airlight for some visibility restoration algorithms, resulting in artifact effects.

Colorcast problems: Different kinds of particles absorb different portions of the color spectrum, which usually results in different distributions of the color channels, thereby producing a restored image that still suffers from colorcast problems.

Gray road: The roadway is usually gray in images captured in the daytime. In foggy conditions, this may result in over-restoration of the road part of the image due to the similarity between the road color and the fog color.

Deep depth of field: Under hazy weather conditions, the haze is thicker toward the vanishing point of the road than it is in front of the vehicle (which is close to haze-free). When driving in situations with deep depths of field, the uneven haze density in the image might be misjudged by visibility restoration algorithms, thereby resulting in a restored image that still appears hazy.

Planar surface: Hazy images captured on roadways during driving conditions are generally assumed to be planar. As such, some visibility restoration algorithms are not able to effectively restore visibility in hazy images that were not taken along a planar surface.

Complex architecture: Complex architecture may exist along a road, such as pylons, streetlamps, etc. A restored image can easily suffer from halo effects along the edges of these structures.

Table 3 gives an overview of these common problematic issues and their corresponding solutions.

Table 3 Challenges and solutions for visibility restoration algorithms in driving conditions. The first column lists the challenges and the second column lists the corresponding solutions.
Local lights: hybrid dark channel prior [9].
Colorcast: gray world assumption [9], white patch-Retinex theory [10].
Gray road: no-black-pixel constraint [11].
Deep depth of field: bi-histogram modification [12].
Planar surface: flat-world assumption [13].
Complex architecture: nonlinear filtering [14].

In this article, we take a cue from reference [9] when considering the frequent appearance of local lights and colorcast problems during driving conditions. In the literature [9], a haze removal approach was developed that consists of a hybrid dark channel prior (HDCP) module, a color analysis (CA) module, and a visibility recovery (VR) module. In the HDCP module, a hybrid transmission map is produced to estimate the haze density while avoiding the misjudgement of local lights as atmospheric lights. The hybrid transmission map can be expressed as

t_h = 1 - [ωα/(α + β)] d_Ω(I) - [ωβ/(α + β)] d_n(I),   (2)

where ω, α, and β are constant factors that can be acquired according to [9], and d_Ω(I) and d_n(I) are the dark channel prior applied to each color channel of the hazy image I using patches of sizes 3 × 3 and 45 × 45, respectively.
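A rough sketch of Eq. (2) is given below. It is not the implementation of [9]; it is only a plain NumPy/SciPy rendering of the formula as written above, with the dark channel computed as a channel-wise minimum followed by a patch-wise minimum filter. The numerical values of ω, α, and β are placeholders (the article says they are acquired according to [9]).

import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(I, patch):
    """Dark channel prior: per-pixel minimum over color channels, then over a square patch."""
    return minimum_filter(I.min(axis=2), size=patch)

def hybrid_transmission(I, omega=0.95, alpha=1.0, beta=1.0):
    """Hybrid transmission map of Eq. (2), combining 3x3 and 45x45 dark channels."""
    d_small = dark_channel(I, 3)     # fine-scale dark channel, d_Omega in the text
    d_large = dark_channel(I, 45)    # coarse-scale dark channel, d_n in the text
    w = omega / (alpha + beta)
    return 1.0 - w * alpha * d_small - w * beta * d_large

# usage on a normalized RGB road image of shape (M, N, 3)
I = np.random.rand(120, 160, 3)
t_h = hybrid_transmission(I)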
Moreover, the colorcast problem is suppressed by the CA module, where the color adjustment γ^c for the cth color channel is measured as

γ^c = m_r(I) / m_c(I),   (3)

where m_r(I) and m_c(I) are the averages of the R color channel and the cth color channel in RGB color space, respectively.

Next, the VR module restores the hazy image to a haze-free one, J^c, by

J^c = γ^c (I^c - A^c) / max(t_h, t_0) + A^c + v^c (γ^c - 1),   (4)

where v^c is the gain factor for the cth color channel and can be produced by

v^r = argmax_l PMF(I^r),   (5)

v^g = (1/2) [argmax_l PMF(I^r) + argmax_l PMF(I^g)],   (6)

v^b = (1/2) [argmax_l PMF(I^g) + argmax_l PMF(I^b)],   (7)

where l is the intensity level. Note that PMF(·) is the probability mass function and A^c is the atmospheric light for the cth color channel, obtained from d_n [9].
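The color analysis and visibility recovery steps of Eqs. (3)-(7) can be sketched in the same spirit. Again, this is an assumed rendering of the formulas as printed, not the code of [9]; the atmospheric light A is taken as a given per-channel estimate (the article obtains it from d_n), and the 8-bit histogram handling is a simplification.

import numpy as np

def color_gains(I):
    """Eq. (3): per-channel adjustment as the ratio of the R-channel mean to each channel mean."""
    means = I.reshape(-1, 3).mean(axis=0)            # averages of R, G, B
    return means[0] / means                          # gamma^c for c in {R, G, B}

def pmf_peaks(I8):
    """Eqs. (5)-(7): gain factors from the peak intensity level of each channel's PMF."""
    peaks = [np.bincount(I8[..., c].ravel(), minlength=256).argmax() / 255.0
             for c in range(3)]                      # argmax_l PMF(I^c)
    return np.array([peaks[0],
                     0.5 * (peaks[0] + peaks[1]),
                     0.5 * (peaks[1] + peaks[2])])

def recover(I8, t_h, A, t0=0.1):
    """Eq. (4): restore each color channel given transmission t_h and atmospheric light A (0-1)."""
    I = I8.astype(np.float64) / 255.0
    gamma = color_gains(I)
    v = pmf_peaks(I8)
    t = np.maximum(t_h, t0)[..., np.newaxis]
    J = gamma * (I - A) / t + A + v * (gamma - 1.0)
    return np.clip(J, 0.0, 1.0)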

In addition, the computational complexity of the haze removal approach is principally embodied in its three modules. The complexity of the HDCP module is O(kMN) to produce the hybrid transmission map for an image of M rows and N columns via patches containing k pixels each. The complexities of both the CA and VR modules are O(MN). After turning on the visibility restoration application (see Fig. 6), the vision field of the driver is improved, such that the driver can see the surrounding environment more accurately and engage in better decision-making.

Figure 6 Manipulation of the visibility restoration application in the Smart Car. Upper portion: driving in conditions with poor visibility; mid portion: turning on the visibility restoration application; lower portion: the driver's vision field has been improved after turning on the application.


B. Nighttime Contrast Enhancement Application
At night, the nighttime contrast enhancement application combines enhanced night vision with the transparent windshield display to allow drivers to both identify and highlight the location of threats (e.g., a pothole or bump), as shown in Fig. 7.

Figure 7 Driver performing the nighttime contrast enhancement application, in which the driver's vision field is increased. The upper and lower portions represent the driver's vision field before and after using the application.

In general, the current image I captured by an imaging device consists of both the light reflectance R and the visible-light illumination L according to the literature [15]. This can be expressed as

I = LR.   (8)

The light reflectance R is not changed by illumination variance in the same scene. When the imaging device receives sufficient illumination, the ideal illumination image I_i can be obtained under the ideal illumination L_i. On the contrary, poor illumination L_p produces a restored image with poor illumination I_p.

The main aim of the various image contrast enhancement algorithms [16]-[25] is to increase the illumination level via a processing function of contrast enhancement f(·), while preserving the image features. This can be expressed as follows:

I_i ≈ f(I_p).   (9)

Various algorithms have been developed to achieve the processing function of contrast enhancement. However, the use of these algorithms during nighttime driving conditions should specifically address the following issues:

Artifact tolerance: When driving at night, there are various light sources along the roadway to enhance a driver's vision field. The image contrast enhancement algorithms should avoid introducing certain artifacts (also called blocking effects) as the vehicle is moving.

Uniform contrast: The image contrast enhancement algorithms should be able to simultaneously and effectively enhance all portions of the current image during night driving.

Brightness preservation: After the contrast of the current image is enhanced, the enhanced image should not lose the brightness of the various light sources along the roadway.

Hence, several approaches have been proposed to overcome these issues, as listed in Table 4.

Table 4 Challenges and solutions for image contrast enhancement algorithms when driving at night. The first column lists the challenges and the second column lists the corresponding solutions.
Artifact tolerance: recursive sub-image histogram equalization [16], optimum recursion based histogram separation [17], local histogram equalization [18].
Uniform contrast: spatial entropy-based contrast enhancement [19], centre-surround Retinex contrast enhancement [20], Gaussian mixture model histogram modification [21].
Brightness preservation: brightness-preserving bi-histogram equalization [22], dualistic sub-image histogram equalization [23], bilateral-Bezier-curve based histogram modification [24], adaptive gamma correction with weighting distribution based histogram modification [25].

In this article, we implement the nighttime contrast enhancement application by using the algorithm of [25] for brightness preservation and artifact tolerance. In the literature [25], adaptive gamma correction with weighting distribution (AGCWD) was proposed to enhance the image contrast for visual quality improvement. First, the transformation curve T of the adaptive gamma correction is devised as follows:

T(l) = l_max (l / l_max)^γ,   (10)



where l and l_max are each intensity level and the maximum level for an 8-bit pixel on the V channel of the HSV color model in a given image I, respectively, and γ is the adaptive parameter for each level, which can be defined as

γ = 1 - CDF(l),   (11)

where CDF(·) is the cumulative distribution function, which can be expressed as follows:

CDF(l) = Σ_{j=0}^{l} W(j) / ΣW,   (12)

and ΣW is the sum of the weighting distribution function W, which can be expressed as follows:

ΣW = Σ_{l=0}^{l_max} W(l).   (13)

Next, the weighting distribution function W can be expressed as follows:

W(l) = PDF_max [(PDF(l) - PDF_min) / (PDF_max - PDF_min)]^α,   (14)

where α is an adjusted parameter [25], PDF(·) is the probability density function, and PDF_max and PDF_min are the maximum and minimum values of the statistical histogram in PDF(∀l), respectively.
of the statistical histogram in PDF ^6l h, Therefore, the challenge is to successfully based tracking techniques, is adopted in
respectively. By using the transformation detect human images captured by moving our Smart Car system. As shown in
curve to the dimmed image in the cameras, and then apply the tracking tech- Fig. 8, this system starts with human
application, the enhanced image poten- niques to those detected, resulting in so- detection and V-SLAM for camera
tially provides richer road information called tracking-by-detection schemes. calibration. Then, the ground planes are
to the Smart Cars driver, as can be seen By applying a human detector to estimated based on the camera motions,
in Fig. 7. The overall efficacy of above each frame of a video sequence, the so that the 3-D locations of the pedes-
processes on computational complexity tracking scheme becomes a task that trians (relative to the cameras) can be
is O(MN ), where M and N are the associates the detected human objects inferred. Subsequently, the absolute
number of rows and columns in a given with each other frame-by-frame. In 3-D locations of tracked humans will
image I, respectively. Therefore, a tem- general, human detection follows two become available since the GPS loca-
poral-based entropy model was suggest- basic steps [26]: foreground segmenta- tion information of the camera is also
ed in [25] and employed to reduce the tion and object classification. available. By taking 3-D information
computational complexity for the visi- After human object detection, a into account, the tracking problem can
bility restoration application and night- tracking framework is applied to the be reformulated as a constrained 3-D
time contrast enhancement application. detected objects. Previous research of multiple-kernel (CMK) tracking prob-
For tth frame, the entropy E t can be human tracking with moving cameras lem [32], which can effectively resolve
measured as follows: has involved Kalman filters [27], [28], occlusions during tracking by globally
l max Particle filters [29], [30], and kernel- optimizing the data association
E t = - / PDF (l ) log ^PDF (l ) h .(15) based tracking [31], [32], which are between consecutive frames.
l=0
widely used in tracking and indepen- In [32], the CMK tracking scheme
Here, the first incoming frame is uti- dently consider each tracked human tracks video objects in 2-D space (image),
lized to produce the hybrid transmission within the temporal frames. Some stud- i.e., x ! 0 2 # Nk in Eq. (16). To efficiently
map and the transformation curve for ies adopt Multi-Hypothesis Tracking integrate the depth information into the

C. Driving Environmental Understanding Application
Thanks to the increasing development of autonomous technologies, many video analytics applications are being applied to moving platforms such as Smart Cars. Human tracking is quite challenging since humans may vary greatly in appearance due to different viewing perspectives, non-rigid deformations, intra-class variability in shape, and other visual properties. The challenge increases when moving cameras (such as car recorders) are employed, due to the effects of egomotion, blur, and the issues mentioned above. The introduction of a moving camera invalidates many effective moving-object tracking techniques used with a static camera, such as background subtraction and a constant ground plane assumption, thus making the task more difficult. Rather than using background modeling-based methods to extract captured human objects, human detectors are widely used to detect images of people in the video frame. Therefore, the challenge is to successfully detect human images captured by moving cameras, and then apply tracking techniques to those detections, resulting in so-called tracking-by-detection schemes.

By applying a human detector to each frame of a video sequence, the tracking scheme becomes a task that associates the detected human objects with each other frame by frame. In general, human detection follows two basic steps [26]: foreground segmentation and object classification.

After human object detection, a tracking framework is applied to the detected objects. Previous research on human tracking with moving cameras has involved Kalman filters [27], [28], particle filters [29], [30], and kernel-based tracking [31], [32], which are widely used in tracking and independently consider each tracked human within the temporal frames. Some studies adopt Multi-Hypothesis Tracking (MHT) [33], [34] and Joint Probabilistic Data Association Filters (JPDAFs) [35] to optimize detected-target associations by jointly considering all tracked targets' information over several time steps.

Alternatively, several approaches based on the structure-from-motion (SfM) framework have been developed. The SfM framework, when combined with visual simultaneous localization and mapping (V-SLAM), can be used to calibrate and localize the 3-D positions of the moving camera, and eventually to detect, track, and reconstruct moving objects with respect to the static background. This results in a so-called dynamic scene reconstruction. The advantage of the SfM-based approaches is that they locate the objects in 3-D space, so as to better deal with occlusion issues during tracking. Table 5 summarizes several existing human tracking-by-detection schemes based on a moving camera, with or without 3-D inference [36]-[44].


Table 5 Several existing human tracking-by-detection schemes based on a moving camera, with or without 3-D inference.

2-D-based individual-object independent tracking (challenge: local-optimum data association). Solutions: multi-body-part tracking combining an implicit shape model (ISM) detector and a stereo-odometry-based tracker [32]; a part-based human detector with a Gaussian process latent variable model to compute the temporal consistency of detections over time [33]; a color model and the event cone, i.e., the time-space volume in which the trajectory of a tracked object is sought in 3-D space [34].

2-D-based multiple-object joint tracking (challenge: global-optimum data association). Solutions: a maximum-a-posteriori (MAP) data association formulation of a cost-flow network with a non-overlap constraint on trajectories [35]; a cost function with the objects' birth and death states, solved by a greedy algorithm [36]; a coupling formulation that avoids the problem of error propagation and overcomes partial or complete occlusions [37]; constraints of piecewise constant-velocity path smoothness incorporated into the flow-network framework [38].

SfM-based 3-D inference tracking (challenge: tracking moving objects within a static background with a moving camera). Solutions: an incremental V-SLAM system that allows choosing between full 3-D reconstruction or simply tracking moving objects [39]; ground plane estimation using sparse features, dense inter-frame stereo, and object detection based on a real-time monocular SfM framework [40].

A robust moving-platform-based human tracking system [45], which takes advantage of the tracking-by-detection scheme and successfully integrates V-SLAM, human detection, ground plane estimation, and kernel-based tracking techniques, is adopted in our Smart Car system. As shown in Fig. 8, this system starts with human detection and V-SLAM for camera calibration. Then, the ground planes are estimated based on the camera motions, so that the 3-D locations of the pedestrians (relative to the cameras) can be inferred. Subsequently, the absolute 3-D locations of the tracked humans become available, since the GPS location information of the camera is also available. By taking 3-D information into account, the tracking problem can be reformulated as a constrained 3-D multiple-kernel (CMK) tracking problem [32], which can effectively resolve occlusions during tracking by globally optimizing the data association between consecutive frames.

Figure 8 A moving-platform-based human tracking system [47]: video frames feed human detection and SfM; camera motions, 2-D locations, and the depth map support ground plane and pose estimation; the resulting 3-D locations and constructed depth map drive the CMK tracking that produces the tracking results.

In [32], the CMK tracking scheme tracks video objects in 2-D (image) space, i.e., x ∈ R^(2×N_k) in Eq. (16). To efficiently integrate the depth information into the CMK framework, we need to reformulate the problem. First, we extend Eq. (16) from 2-D to 3-D space:

J_i(X) = Σ_{k=1}^{N_i} w_ik J_ik(X),   X ∈ R^(3×N_i).   (16)

This equation is regarded as the local optimization for each individual target i with multiple (N_i) kernels, each of which is weighted by w_ik. Second, considering the depth information, we assign the visibility of each target as a weight v_i to deal with the global optimization. In other words, the total cost function becomes

J(X) = Σ_{i=1}^{M} v_i J_i(X) = Σ_{i=1}^{M} Σ_{k=1}^{N_i} v_i w_ik J_ik(X),   X ∈ R^(3×M×N_i),   (17)

where M is the number of targets in one video frame, and X = [(X_1^1)^T, ..., (X_i^k)^T]^T collects the ith target and the kth kernel.

Necessarily, the constraint functions C(X) = 0 must be considered to maintain the relative locations of the kernels. In [32], two-kernel and four-kernel layouts are proposed to describe a human. Unlike the constraints used in [32], which are mainly based on 2-D geometry, we set the constraints based on 3-D geometry [45]. Hence, for each target i, the movement vector Δx can be iteratively solved by using the projected gradient method [32]. The computational complexity is relatively high, due to the use of the human detector on every frame of video, as well as the ground plane estimation; the CMK tracking in 3-D space based on the projected gradient itself can be very fast. At this moment, a high-end desktop CPU still requires a couple of seconds to complete one frame of video. This complexity can be relieved with cloud computing and GPU speedup, in which case real-time processing can be expected.
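The structure of the global cost in Eq. (17) and of one projected-gradient step can be sketched as follows. The per-kernel cost J_ik, its gradient, and the linear constraint matrix are left abstract (callables and arrays assumed to be supplied by the tracker), so this is only a shape-level illustration of the formulation, not the solver of [32] or [45].

import numpy as np

def total_cost(X, targets):
    """Eq. (17): per-kernel costs weighted by kernel weights w_ik and target visibility v_i."""
    return sum(t["v"] * sum(w * J(X) for w, J in zip(t["w"], t["J"]))
               for t in targets)

def projected_gradient_step(X, grad, C, step=0.1):
    """One projected-gradient step: move along -grad projected onto {dX : C @ dX = 0},
    so the relative 3-D kernel layout encoded by the linear constraints is preserved."""
    Ct = C.T
    P = np.eye(C.shape[1]) - Ct @ np.linalg.pinv(C @ Ct) @ C   # projector onto the null space of C
    return X - step * (P @ grad)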
As examples, representative performances of human tracking for videos obtained from four separate car recorders are shown in the left column of Fig. 9. Moreover, a 3-D visualization of dynamic on-road scenes can also be reconstructed. Its purpose is not only to visualize the pedestrians' paths and movements in a 3-D environment, but also to avoid issues of privacy invasion by using avatar-like 3-D models. When effectively integrated with a 3-D map service, such as Google Earth, we can treat this new 3-D augmented-reality visualization as a dynamic 3-D GPS


navigation system, as shown in the middle column of Fig. 9. When there is an event, such as a road crossing or other activity, we can see the dynamic scene from different viewing aspects (see the right column of Fig. 9), which can be effectively displayed within the Smart Car to facilitate the driver's viewing experience.

During driving conditions, the more local lights appear, the more serious the artifact effects caused in the recovery results; by employing the hybrid dark channel prior we also overcome the generation of these artifact effects.

Figure 9 3-D visualization of the scene recorded by four driving recorders. Each row belongs to one driving recorder; the leftmost column shows the video frames, the middle column shows the corresponding view of the 3-D visualization, and the right column shows the scene visualized from different aspects [47].

Moreover, due to the video analytics on each individual Smart Car, it is highly desirable to collectively combine analytics information, such as on-road pedestrian tracking, from multiple nearby vehicles to dynamically facilitate better monitoring of the on-road situation and to cooperatively share the information. When tracking pedestrians in a Smart Car with a single moving camera, the camera's field of view (FOV) is limited to the front of the vehicle and dynamically changes when the vehicle moves. To achieve a more complete understanding of the car's surroundings, the tracking must be implemented across multiple moving cameras based on the results of the tracking in a single moving camera. The challenge is that appearance features extracted from the same pedestrian by different cameras may be inconsistent. There are two scenarios associated with tracking across multiple moving cameras: overlapping and non-overlapping FOV scenarios. In an overlapping FOV scenario, a pedestrian simultaneously appears in two or more different cameras' FOVs; therefore, re-identification can be facilitated by exploiting the cameras' relationship based on the overlapping views, so as to obtain consistent features from different cameras. In a non-overlapping FOV


scenario, a pedestrian enters a camera's FOV, then leaves, and later enters other cameras' FOVs that do not share a common region with the previous FOV. The problem is more difficult due to the dynamic environment, lighting changes, and unexpected camera locations. Hence, the methods for tracking across multiple static cameras [46] cannot be applied to tracking across moving cameras. Moreover, both overlapping and non-overlapping cases may occur alternately during tracking, thus making this task even more challenging.

To effectively deal with tracking across multiple driving recorders, we developed a framework [47] to track on-road pedestrians recorded in the videos, where a cloud server is used to collect the driving information of several nearby vehicles via a mobile surveillance network. First, pedestrian tracking in a single moving camera is applied to each video. Based on the single-camera tracking results, we treat the problem of tracking across cameras as a multi-label classification task, which determines whether each target belongs to one or several cameras' FOVs by considering the association likelihood of the target, calculated from the target's motion cues and appearance features. When a target is out of all cameras' FOVs, we predict the target's locations via an open map service such as Google Maps. Moreover, by using the Google Earth service, a 3-D visualization of a dynamic scene can be reconstructed to allow users to see a holistic view or different viewing perspectives of a 3-D scene reconstructed from multiple videos.

Figure 10 shows the system of the developed framework. First, pedestrian tracking, which produces the moving trajectory and associated features of each tracked person (a tracklet) in 3-D space, is applied to each video from a single camera. The videos are then used to build Brightness Transfer Functions (BTFs) to compensate for the color diversity of the cameras. After estimating the pedestrians' 3-D tracklets in each camera, pedestrian tracking across multiple cameras is applied, facilitated by the BTFs and the map prior. Finally, the pedestrians' 3-D tracklets are summarized and visualized in the 3-D real-world environment based on an open map service like Google Earth.
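A highly simplified sketch of the cross-camera association step is given below: a brightness transfer function is approximated by matching cumulative histograms between two cameras, and a target is assigned to the candidate track with the highest combined appearance and motion likelihood. This is an assumed illustration of the idea only, not the framework of [47], which additionally uses map priors and 3-D tracklets.

import numpy as np

def brightness_transfer_function(src_gray, dst_gray):
    """Approximate a BTF by histogram matching: map each source level to the destination
    level with the same cumulative frequency."""
    def cdf(img):
        h = np.bincount(img.ravel(), minlength=256).astype(np.float64)
        return np.cumsum(h) / h.sum()
    return np.searchsorted(cdf(dst_gray), cdf(src_gray)).clip(0, 255).astype(np.uint8)

def associate(target_hist, target_pos, candidates, w_app=0.7, w_motion=0.3):
    """Pick the candidate track whose BTF-compensated appearance histogram and predicted
    position best match the target; returns the index of the best candidate."""
    scores = []
    for cand in candidates:
        app = np.minimum(target_hist, cand["hist"]).sum()      # histogram intersection
        motion = 1.0 / (1.0 + np.linalg.norm(target_pos - cand["pred_pos"]))
        scores.append(w_app * app + w_motion * motion)
    return int(np.argmax(scores))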

Figure 10 A system of human tracking across multiple moving car cameras [47]: each camera's video and GPS data feed per-camera pedestrian tracking and 3-D localization; BTF construction provides the BTFs and prior information used for tracking across the multiple moving cameras; and the resulting 3-D tracklets are visualized as a dynamic scene in a 3-D virtual world via Google Map/Earth.

As shown in Fig. 11, the pedestrians (in yellow and green bounding boxes) appearing in Camera 1's FOV are successfully tracked (in the same-colored boxes) later in Camera 3's FOV.

Car-appropriate applications can be diverse to meet the needs of users, including communication and social, entertainment, travel, shopping, utilities, etc.

IV. Conclusions
We proposed a demonstration platform for a Smart Car and further implemented three potential applications on this platform. To implement the Smart Car demonstration platform, we included three important properties. First, the Smart Car is networked and is able to process multiple datasets from various sensors based on the ubicomp property. Next, due to the use of the HCI property, the traditional windshield is entirely replaced on the Smart Car by a transparent windshield display, on which the driver can see graphic information and interact with the car via embedded motion sensors.


Figure 11 Visual tracking results, where the top rows are the recorded frames and the bottom rows are the corresponding 3-D visualizations [47]. (a) Two human-tracked frames from driving recorder 1. (b) Two human-tracked frames from driving recorder 3.

Finally, the application platform of the Smart Car is designed based on the platforms of the smartphone and the smart TV, is opened up to third-party developers, and can be downloaded from [7]. This Smart Car demonstration platform was tested on the three proposed applications. Additionally, we have demonstrated the potential capability of a Smart Car to effectively


implement and run these three applications during driving conditions. We believe that in the future there will be a great diversity of car-appropriate applications released on the application platform. At that point, the smart world's drivers will be able to build their own Smart Car based on their wants and needs. Consequently, the Smart Car will provide not only a more interesting driving environment, but also a safer one.

V. Acknowledgements
This work was supported by the Ministry of Science and Technology, Taiwan, under Grant Nos. MOST 105-2923-E-027-001-MY3, MOST 103-2221-E-027-031-MY2, MOST 103-2221-E-027-030-MY2, MOST 103-2923-E-002-011-MY3, MOST 104-2221-E-027-020, and MOST 105-2218-E-155-003.

References
[1] S. Poslad, Ubiquitous Computing: Smart Devices, Environments and Interactions. New York: Wiley, 2009.
[2] C. T. Lin, L. W. Ko, and T. K. Shen, "Computational intelligent brain computer interaction and its applications on driving cognition," IEEE Comput. Intell. Mag., vol. 4, no. 4, pp. 32-46, Nov. 2009.
[3] P. G. Balaji and D. Srinivasan, "Multi-agent system in urban traffic signal control," IEEE Comput. Intell. Mag., vol. 5, no. 4, pp. 43-51, Nov. 2010.
[4] B. Lin and C. Wu, "Mathematical modelling of the human cognitive system in two serial processing stages with its applications in adaptive workload-management systems," IEEE Trans. Intell. Transport. Syst., vol. 12, no. 1, pp. 221-231, Mar. 2011.
[5] D. Diaz, A. Cesta, A. Oddi, R. Rasconi, and M. D. R-Moreno, "Efficient energy management for autonomous control in rover missions," IEEE Comput. Intell. Mag., vol. 8, no. 4, pp. 12-24, Nov. 2013.
[6] J. Ziomek, L. Tedesco, and T. Coughlin, "My car, my way: Why not? I paid for it!," IEEE Consumer Electron. Mag., vol. 2, no. 3, pp. 25-29, July 2013.
[7] S. C. Huang, B. H. Chen, S. K. Chou, J. N. Hwang, and K. H. Lee. Smart car application platform [Online]. Available: https://github.com/smartCarLab/smartCar
[8] H. Koschmieder, "Theorie der horizontalen Sichtweite," in Beiträge zur Physik der freien Atmosphäre. Munich, Germany: Keim & Nemnich, 1924.
[9] S. C. Huang, B. H. Chen, and Y. J. Cheng, "An efficient visibility enhancement algorithm for road scenes captured by intelligent transportation systems," IEEE Trans. Intell. Transport. Syst., vol. 15, no. 5, pp. 2321-2332, Oct. 2014.
[10] S. C. Huang, J. H. Ye, and B. H. Chen, "An advanced single image visibility restoration algorithm for real-world hazy scenes," IEEE Trans. Ind. Electron., vol. 62, no. 5, pp. 2962-2972, May 2015.
[11] J.-P. Tarel, N. Hautiere, L. Caraffa, A. Cord, H. Halmaoui, and D. Gruyer, "Vision enhancement in homogeneous and heterogeneous fog," IEEE Intell. Transport. Syst. Mag., vol. 4, no. 2, pp. 6-20, 2012.
[12] B. H. Chen, S. C. Huang, and J. H. Ye, "Hazy image restoration by bi-histogram modification," ACM Trans. Intell. Syst. Technol., vol. 6, no. 4, Article 50, July 2015.
[13] N. Hautiere and D. Aubert, "Contrast restoration of foggy images through use of an onboard camera," in Proc. IEEE Conf. Intelligent Transportation Systems, Sept. 2005, pp. 601-606.
[14] S. C. Huang, B. H. Chen, and W. J. Wang, "Visibility restoration of single hazy images captured in real-world weather conditions," IEEE Trans. Circuits Syst. Video Technol., vol. 24, no. 10, pp. 1814-1824, Oct. 2014.
[15] L. Wang, L. Xiao, H. Liu, and Z. Wei, "Variational Bayesian method for retinex," IEEE Trans. Image Processing, vol. 23, no. 8, pp. 3381-3396, Aug. 2014.
[16] K. S. Sim, C. P. Tso, and Y. Y. Tan, "Recursive sub-image histogram equalization applied to gray scale images," Pattern Recognit. Lett., vol. 28, no. 10, pp. 1209-1221, July 2007.
[17] S. C. Huang and C. H. Yeh, "Image contrast enhancement for preserving mean brightness without losing image features," Eng. Appl. Artif. Intell., vol. 26, no. 5-6, pp. 1487-1492, May-June 2013.
[18] Z. Y. Chen, B. R. Abidi, D. L. Page, and M. A. Abidi, "Gray-level grouping (GLG): An automatic method for optimized image contrast enhancement, part I: The basic method," IEEE Trans. Image Processing, vol. 15, no. 8, pp. 2290-2302, Aug. 2006.
[19] T. Celik, "Spatial entropy-based global and local image contrast enhancement," IEEE Trans. Image Processing, vol. 23, no. 12, pp. 5298-5308, Dec. 2014.
[20] D. J. Jobson, Z. Rahman, and G. A. Woodell, "Properties and performance of the center/surround retinex," IEEE Trans. Image Processing, vol. 6, no. 3, pp. 451-462, Mar. 1997.
[21] T. Celik and T. Tjahjadi, "Automatic image equalization and contrast enhancement using Gaussian mixture modeling," IEEE Trans. Image Processing, vol. 21, no. 1, pp. 145-156, Jan. 2012.
[22] Y. T. Kim, "Contrast enhancement using brightness preserving bi-histogram equalization," IEEE Trans. Consumer Electron., vol. 43, no. 1, pp. 1-8, Feb. 1997.
[23] Y. Wang, Q. Chen, and B. Zhang, "Image enhancement based on equal area dualistic sub-image histogram equalization method," IEEE Trans. Consumer Electron., vol. 45, no. 1, pp. 68-75, Feb. 1999.
[24] F. C. Cheng and S. C. Huang, "Efficient histogram modification using bilateral Bezier curve for the contrast enhancement," IEEE/OSA J. Disp. Technol., vol. 9, no. 1, pp. 44-50, Jan. 2013.
[25] S. C. Huang, F. C. Cheng, and Y. S. Chiu, "Efficient contrast enhancement using adaptive gamma correction with weighting distribution," IEEE Trans. Image Processing, vol. 22, pp. 1032-1041, Mar. 2013.
[26] D. Geronimo, A. M. Lopez, A. D. Sappa, and T. Graf, "Survey of pedestrian detection for advanced driver assistance systems," IEEE Trans. Pattern Anal. Machine Intell., vol. 32, no. 7, pp. 1239-1258, July 2010.
[27] D. M. Gavrila and S. Munder, "Multi-cue pedestrian detection and tracking from a moving vehicle," Int. J. Comput. Vision, vol. 73, no. 1, pp. 41-59, 2007.
[28] M. Bertozzi, A. Broggi, A. Fascioli, A. Tibaldi, R. Chapuis, and F. Chausse, "Pedestrian localization and tracking system with Kalman filtering," in Proc. IEEE Intelligent Vehicles Symp., 2004, pp. 584-589.
[29] V. Philomin, R. Duraiswami, and L. Davis, "Pedestrian tracking from a moving vehicle," in Proc. IEEE Intelligent Vehicles Symp., 2000, pp. 350-355.
[30] R. Arndt, R. Schweiger, W. Ritter, D. Paulus, and O. Lohlein, "Detection and tracking of multiple pedestrians in automotive applications," in Proc. IEEE Intelligent Vehicles Symp., 2007, pp. 13-18.
[31] D. Comaniciu, V. Ramesh, and P. Meer, "Kernel-based object tracking," IEEE Trans. Pattern Anal. Machine Intell., vol. 25, no. 5, pp. 564-577, May 2003.
[32] C. T. Chu, J. N. Hwang, H. I. Pai, and K. M. Lan, "Tracking human under occlusion based on adaptive multiple kernels with projected gradients," IEEE Trans. Multimedia, vol. 5, no. 7, pp. 1602-1615, Nov. 2013.
[33] I. J. Cox, "A review of statistical data association techniques for motion correspondence," Int. J. Comput. Vision, vol. 10, no. 1, pp. 53-66, 1993.
[34] D. B. Reid, "An algorithm for tracking multiple targets," IEEE Trans. Automat. Contr., vol. 24, no. 6, pp. 843-854, Dec. 1979.
[35] T. E. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Sonar tracking of multiple targets using joint probabilistic data association," IEEE J. Oceanic Eng., vol. 8, no. 3, pp. 173-184, July 1983.
[36] A. Ess, B. Leibe, K. Schindler, and L. Van Gool, "Robust multiperson tracking from a mobile platform," IEEE Trans. Pattern Anal. Machine Intell., vol. 31, no. 10, pp. 1831-1846, Oct. 2009.
[37] M. Andriluka, S. Roth, and B. Schiele, "People-tracking-by-detection and people-detection-by-tracking," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, June 2008.
[38] B. Leibe, K. Schindler, N. Cornelis, and L. Van Gool, "Coupled object detection and tracking from static cameras and moving vehicles," IEEE Trans. Pattern Anal. Machine Intell., vol. 30, no. 10, pp. 1683-1698, Oct. 2008.
[39] L. Zhang, Y. Li, and R. Nevatia, "Global data association for multi-object tracking using network flows," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, June 2008.
[40] H. Pirsiavash, D. Ramanan, and C. C. Fowlkes, "Globally-optimal greedy algorithms for tracking a variable number of objects," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, June 2011.
[41] Z. Wu, A. Thangali, S. Sclaroff, and M. Betke, "Coupling detection and data association for multiple object tracking," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, June 2012, pp. 1948-1955.
[42] A. A. Butt and R. T. Collins, "Multi-target tracking by Lagrangian relaxation to min-cost network flow," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, June 2013.
[43] A. Kundu, K. M. Krishna, and C. V. Jawahar, "Realtime multibody visual SLAM with a smoothly moving monocular camera," in Proc. IEEE Int. Conf. Computer Vision, 2011, pp. 2080-2087.
[44] S. Song and M. Chandraker, "Robust scale estimation in real-time monocular SFM for autonomous driving," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, June 2014.
[45] K. H. Lee, J. N. Hwang, G. Okapal, and J. Pitton, "Driving recorder based on-road pedestrian tracking using visual SLAM and constrained multiple-kernel," in Proc. IEEE Conf. Intelligent Transportation Systems, Oct. 2014, pp. 2629-2635.
[46] C. T. Chu and J. N. Hwang, "Fully unsupervised learning of camera link models for tracking humans across non-overlapping cameras," IEEE Trans. Circuits Syst. Video Technol., vol. 24, no. 6, pp. 979-994, June 2014.
[47] K. H. Lee and J. N. Hwang, "On-road pedestrian tracking across multiple driving recorders," IEEE Trans. Multimedia, vol. 17, no. 9, Sept. 2015.
