Amlan Chakrabarti
Neha Sharma
Valentina Emilia Balas
Editors

Advances in Computing Applications
Editors

Amlan Chakrabarti
A.K. Choudhury School of Information Technology
University of Calcutta
Kolkata, West Bengal, India

Neha Sharma
Zeal Institute of Business Administration, Computer Application and Research
Zeal Education Society
Pune, Maharashtra, India

Valentina Emilia Balas
Faculty of Engineering, Department of Automatics and Applied Software
Aurel Vlaicu University of Arad
Arad, Romania

ISBN 978-981-10-2629-4
ISBN 978-981-10-2630-0 (eBook)


DOI 10.1007/978-981-10-2630-0
Library of Congress Control Number: 2016950755

© Springer Science+Business Media Singapore 2016


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature


The registered company is Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #22-06/08 Gateway East, Singapore 189721, Singapore
Preface

Globalization has influenced almost every facet of human life due to the emergence
of new and affordable digital and computing technologies as well as current trends
in management. At the same time, informatics, with its strong focus on providing
fast and ready access to information, is the backbone of many of the present-day
intelligent applications serving mankind. All this creates a perfect landscape for
widespread research interest in information and communication technologies along
with management policies, thus impacting people's lives from entertainment to
health care and from databases to e-governance.
This edited volume presents the latest high-quality technical contributions and
research results in the areas of computing, informatics, management, and information
management. It deals with state-of-the-art topics, provides challenges and solutions,
and explores future research directions. Original, unpublished research work
highlighting specific research domains from all viewpoints is contributed by
scientists from throughout the globe. The main goal of this volume is not only to
summarize new research findings but also to place these in the context of past work.
This volume is designed for a professional audience composed of researchers,
practitioners, scientists, and engineers in both academia and industry.
The following is a brief summary extracted from the respective chapters and
their abstracts:
Chapter 1, by Vamsi Krishna Myalapalli, proposes an approach to overcome the
sundry wait events that befall the internal database engine, through query and
PL/SQL rewrite methodologies and additional overhauling approaches to increase
the data hit ratio. The experimental progression and approaches evinced that CPU
impact, wait events, and other performance bottlenecks are minimized. This paper
could serve as a tuning tool to boost query as well as database performance by
wait event tuning, and can also serve as a utility for database administrators, SQL
programmers, and database operators.
Chapter 2, by Snehalata Shirude and Satish Kolhe, proposes an agent-based
library recommender system with the objective of providing effective and intelligent
use of library resources, such as finding the right books and relevant research
journal papers and articles.
Chapter 3, by Kalyan Baital and Amlan Chakrabarti, presents a scheduling
algorithm whereby random tasks generated at different time intervals, with different
periodicities and execution times, can be accommodated into a system that is
already running a set of tasks, while meeting the deadline criteria of the tasks.
Chapter 4, by Manoj K. Sabnis and Manoj Kumar Shukla, explains a model-based
approach for shadow detection in static images using two methods, i.e.,
color-based and texture-based.
Chapter 5, by Nitin Vijaykumar Swami et al., covers the concepts of Li-Fi:
how Li-Fi technology can be leveraged in mobile communication, how it
works, the Li-Fi cellular network, some ubiquitous computing applications,
common misconceptions about Li-Fi, Li-Fi in solar cells, and the Internet of Things (IoT).
Chapter 6, by Shraddha Oza and K.R. Joshi, analyzes the performance of
denoising filters, namely the non-local mean (NLM) spatial-domain filter and the
bilateral and linear Gaussian filters, using PSNR, MSE, and SSIM for magnetic
resonance (MR) images.
Chapter 7, by Günter Fahrnberger, presents a detailed view of SecureString 3.0.
The homomorphic cryptosystem SecureString 3.0 aims to recapture cloud
users' faith in secure cloud computing on encrypted character strings by combining
the paradigms of blind computing and secret sharing. Implementation details of
this cryptosystem, given in pseudocode, allow researchers to realize their own
prototypes and practitioners to integrate SecureString 3.0 into their own security
solutions.
Chapter 8, by Sujay D. Mainkar and S.P. Mahajan, focuses on the
development of feature extraction and accurate classification of a variety of acoustic
sounds in unstructured environments, where adverse effects such as noise and
distortion are likely to dominate. This chapter attempts to classify 10 different
unstructured real-world acoustic environments using empirical mode decomposition
(EMD), which accounts for the inherent non-stationarity of acoustic signals by
decomposing the signal into intrinsic mode functions (IMFs).
Chapter 9, by Mrunal Pathak and N. Srinivasu, presents an overview of different
multimodal biometric (multibiometric) systems and their fusion techniques with
respect to their performance.
Chapter 10, by Mohan S. Khedkar and Vijay Shelake, proposes a technique
based on the concept of the dynamic secret key (DSK), which is used to generate
symmetric cryptographic keys for designing an authentication and encryption
scheme for smart grid wireless communication. In this scheme, a recently added
device (e.g., a smart meter) is authenticated by a randomly chosen already-authenticated
device. The scheme enables mutual authentication between a control center situated
in a local management office and the randomly chosen device acting as an authenticator,
to generate a proper dynamic-secret-based dynamic encryption key (DEK) for
subsequent secure data communications.
Chapter 11, by Vyanktesh Dorlikar and Anjali Chandavale, proposes a smart
security framework integrating enhanced existing technologies. This
authentication framework provides security for high-risk locations, such as
government office buildings, airports, military bases, and space stations.
Chapter 12, by Ashwini Shewale et al., attempts to scale down medical
image processing time without adversely affecting image quality, using
efficient computational methods for medical image processing.
Chapter 13, by Supriya Kunjir and Rajesh Autee, attempts to develop a
cost-effective security system based on a radar sensor network to prevent terrorism to
a great extent. The system specifically aims at detecting obstacles by
means of an ultrasonic radar sensor network, provides photographs of the detected
obstacles using a camera, and also provides a total count of detected obstacles by
means of a counter. The ultrasonic sensor network, coupled with the counter and
display unit, is then coupled to an FM transceiver to produce voice announcements.
Chapter 14, by Moumita Acharya et al., proposes a low-resource and
energy-aware hardware design for the discrete wavelet transform (DWT) through
dynamic bit-width adaptation, thus performing the computation in an inexact way.
The authors have performed a field-programmable gate array (FPGA)-based
prototype hardware implementation of the proposed design.
Chapter 15, by Valentina Emilia Balas et al., aims to minimize the possibility of
an avalanche by systematically analyzing its cause. A novel and
efficient attack model is proposed to evaluate the degree of vulnerability in a
dependency-based system caused by its members. This model uses an algorithmic
approach to identify, quantify, and prioritize, i.e., rank, the extent of vulnerability
due to the active members in a dependency-based system.
Chapter 16, by Neha Sharma and Hari Om, presents a case study predicting the
survival rate of oral malignancy patients with the help of two predictive models:
linear regression (LR), a contemporary statistical model, and the multilayer
perceptron (MLP), an artificial neural network model.
We are grateful to Springer, especially to Ms. Swati Meherishi (Senior Editor,
Applied Sciences & Engineering) and her team, for the excellent collaboration,
patience, and help during the evolution of this volume.
We hope that the volume will provide useful information to professors,
researchers, and graduate students in the areas of computing, informatics, and management.

Kolkata, India Amlan Chakrabarti


Pune, India Neha Sharma
Arad, Romania Valentina Emilia Balas
Acknowledgement

We, the editors of this book, Dr. Amlan Chakrabarti, Dr. Neha Sharma, and
Dr. Valentina Emilia Balas, take this opportunity to express our heartfelt gratitude
toward all those who have contributed to this book and supported us in one
way or the other. This book incorporates the work of many people all over the
globe. We are indebted to all those people who helped us in the making of this
high-quality book, which deals with state-of-the-art topics in the areas of computing,
informatics, management, and information management.
At the outset, we would like to extend our deepest gratitude and appreciation to
our affiliations: Dr. Amlan Chakrabarti to the University of Calcutta, India; Dr. Neha
Sharma to the Zeal Institute of Business Administration, Computer Application and
Research of S.P. Pune University, India; and Dr. Valentina Emilia Balas to the
Department of Automatics and Applied Software, Faculty of Engineering of
University Aurel Vlaicu of Arad, Romania, for providing all the necessary
support throughout the process of book publishing. We are grateful to all the
officers and staff members of our affiliated institutions, who have always been very
supportive, have been our companions, and have contributed graciously to
the making of this book.
Our sincere appreciation goes to our entire families for their undying prayers,
love, encouragement, and moral support, and for being with us throughout this
period, constantly encouraging us to work hard. Thank you for being our
backbone during this journey of compiling and editing this book.

Amlan Chakrabarti
Neha Sharma
Valentina Emilia Balas

Contents

1  Wait Event Tuning in Database Engine .............................. 1
   Vamsi Krishna Myalapalli

2  Machine Learning Using K-Nearest Neighbor for Library Resources
   Classification in Agent-Based Library Recommender System ......... 17
   Snehalata B. Shirude and Satish R. Kolhe

3  An Efficient Dynamic Scheduling of Tasks for Multicore
   Real-Time Systems ................................................. 31
   Kalyan Baital and Amlan Chakrabarti

4  Model-Based Approach for Shadow Detection of Static Images ....... 49
   Manoj K. Sabnis and Manoj Kumar Shukla

5  Light Fidelity (Li-Fi): In Mobile Communication
   and Ubiquitous Computing Applications ............................. 75
   Nitin Vijaykumar Swami, Narayan Balaji Sirsat
   and Prabhakar Ramesh Holambe

6  Performance Analysis of Denoising Filters for MR Images .......... 87
   Shraddha D. Oza and K.R. Joshi

7  A Detailed View on SecureString 3.0 ............................... 97
   Günter Fahrnberger

8  Performance Comparison for EMD Based Classification
   of Unstructured Acoustic Environments Using GMM
   and k-NN Classifiers .............................................. 123
   Sujay D. Mainkar and S.P. Mahajan

9  Performance of Multimodal Biometric System Based
   on Level and Method of Fusion ..................................... 137
   Mrunal Pathak and N. Srinivasu

10 DSK-Based Authentication Technique for Secure Smart
   Grid Wireless Communication ....................................... 153
   Mohan S. Khedkar and Vijay Shelake

11 A Smart Security Framework for High-Risk Locations
   Using Wireless Authentication by Smartphone ....................... 173
   Anjali Chandavale and Vyanktesh Dorlikar

12 High Performance Computation Analysis for Medical
   Images Using High Computational Method ............................ 193
   Ashwini Shewale, Nayan Waghmare, Anuja Sonawane,
   Utkarsha Teke and Santosh D. Kumar

13 Terrorist Scanner Radar and Multiple Object Detection System ..... 209
   Supriya Kunjir and Rajesh Autee

14 Inexact Implementation of Wavelet Transform and Its Performance
   Evaluation Through Bit Width Reduction ............................ 227
   Moumita Acharya, Chandrajit Pal, Satyabrata Maity
   and Amlan Chakrabarti

15 A Vulnerability Analysis Mechanism Utilizing Avalanche
   Attack Model for Dependency-Based Systems ......................... 243
   Sirshendu Hore, Sankhadeep Chatterjee, Nilanjan Dey,
   Amira S. Ashour and Valentina Emilia Balas

16 Performance of Statistical and Neural Network Method
   for Prediction of Survival of Oral Cancer Patients ................ 263
   Neha Sharma and Hari Om

Author Index ......................................................... 285
About the Editors

Dr. Amlan Chakrabarti is a professor and coordinator at the A.K. Choudhury
School of Information Technology, University of Calcutta, India. He did his
doctoral research on quantum computing and related VLSI design at the Indian
Statistical Institute, Kolkata, from 2004 to 2008. He was a postdoctoral fellow at the
School of Engineering, Princeton University, USA, during 2011-2012. He is the
recipient of the BOYSCAST fellowship award from the Department of Science and
Technology, Government of India, in 2011 and the Indian National Science Academy
Visiting Scientist Fellowship in 2014. He has published around 80 research papers
in refereed journals and conferences. He is a Senior Member of IEEE, a Member of
ACM, and a life member of the Computer Society of India. He has been a reviewer for
IEEE Transactions on Computers, IET Computers & Digital Techniques, Simulation
Modeling Practice and Theory, and Journal of Electronic Testing: Theory and
Applications. His research interests are quantum computing, VLSI design,
embedded system design, video and image processing algorithms, and
pattern recognition.
Dr. Neha Sharma is the director of Zeal Education Society's Institute of Business
Administration, Pune, Maharashtra, India. She completed her Ph.D. at the
prestigious Indian School of Mines, Dhanbad. She is the Student Activity Committee
Chair for the IEEE Pune Section. She has organized several national and international
conferences and seminars and is the chief editor of the International Journal of
Advances in Computing and Management. She has published several papers in
reputed indexed journals. She has 11 years of teaching experience and 4 years of
industrial experience. Her areas of research interest include data mining, database
design, analysis and design, artificial intelligence, software engineering, and metadata.
Dr. Valentina Emilia Balas is currently an associate professor in the Department
of Automatics and Applied Software at the Faculty of Engineering, University
Aurel Vlaicu, Arad (Romania). She holds a Ph.D. in applied electronics and
telecommunications from the Polytechnic University of Timisoara. She is an author
of more than 160 research papers in refereed journals and international conferences.
Her research interests are in intelligent systems, fuzzy control, soft computing,
smart sensors, information fusion, modeling, and simulation. She is the editor in
chief of the International Journal of Advanced Intelligence Paradigms (IJAIP), an
Editorial Board member of several national and international journals, and an
evaluator expert for national and international projects. She is a member of
EUSFLAT and ACM, a Senior Member of IEEE, a member of the Technical
Committee on Fuzzy Systems (IEEE CIS), the Technical Committee on Emergent
Technologies (IEEE CIS), and the Technical Committee on Soft Computing
(IEEE SMCS), and also a member of IFAC TC 3.2 Computational
Intelligence in Control.
Chapter 1
Wait Event Tuning in Database Engine

Vamsi Krishna Myalapalli

Abstract The magnitude of data in concurrent databases is escalating exponentially
with respect to time. This advent conveys a challenge to database
administrators and SQL developers in the arena of performance, due to incessant
data accumulation and manipulation. As such, this paper proposes an approach to
overcome the sundry wait events that befall the internal database engine, through
query and PL/SQL rewrite methodologies and additional overhauling approaches
to increase the data hit ratio. Our experimental progression and approaches evinced
that CPU impact, wait events, and other performance bottlenecks are minimized.
This paper could serve as a tuning tool to boost query as well as database
performance by wait event tuning, and can also serve as a utility for database
administrators, SQL programmers, database managers, and database operators.

Keywords: Database engine tuning · Database performance tuning · Database
tuning · Query optimization · Query rewrite · Query tuning · SQL optimization ·
SQL tuning

1.1 Introduction

Tuning a database is an activity whose goal is to make the database run better than
it did earlier. Even a small database will, with time and use, grow in size and
complexity. A process or report that once performed with an adequate runtime will
run more slowly the longer the system is in use. This is due to more demands
being made on the system as well as the increasing amount of data through which
the system needs to search to perform operations. All too often, this happens
gradually until the users face a critical deadline; in the worst case it may take the
form of a crisis. This realization triggers a reactive tuning session as the DBA
attempts to respond to a sudden demand to enhance database performance.

V.K. Myalapalli (✉)
JNTUK University College of Engineering, Vizianagaram, Andhra Pradesh, India
e-mail: vamsikrishna.vasu@gmail.com


Tuning has always been an important part of the DBA's job, next to backup and
recovery. It is sometimes deemed "black magic". The analytical process is
centered on using ratios to determine the health of a database or of a component of
the database.
The rest of the paper is organized in the following manner. Section 1.2 explains the
related work and background. Section 1.3 presents the proposed benchmark, i.e., the
methodologies. Section 1.4 demonstrates the experimental setup, i.e., implements the
methodologies in a pragmatic manner. Section 1.5 explains the comparative analysis,
and finally Sect. 1.6 concludes the paper.

1.2 Background and Related Work

Query optimization can take the form of either rule-based or cost-based optimization,
referred to as Rule-Based Optimization (RBO) and Cost-Based Optimization (CBO),
respectively. High-Performance SQL [1] explained RBO- and CBO-level tuning
models. It made part of the queries faster via the CBO approach and the rest via the
RBO approach.
An appraisal to optimize SQL queries [2] explained a basic model that rewrites
sundry queries to minimize CPU cost and raise index utilization. It also reduced the
rate of hard parsing and increased query reuse (soft parsing).
Augmenting database performance via SQL tuning [3] explained query tuning
through index tuning, data access types, and hints. In addition, it delineated
database tuning through session handling.
High-performance PL/SQL [4] explained a tuning model to reduce the rate of
context switching (an overhead) between the SQL and PL/SQL engines. It also
explained the performance gains obtainable from implicit versus explicit cursors.
explained gaining performance over implicit and explicit cursors.
This paper explains reactive and a few proactive tuning approaches toward
minimizing wait events and other bottlenecks in the database engine. In reactive tuning,
it has to be determined whether the problem is the buffer cache not containing enough
information for quick retrieval, execution plans being aged out of the shared pool
(shared memory), locking or latching issues, excessive parsing, or any of the myriad
of other components in the DBMS engine.

1.3 Proposed Methodology

This paper explains sundry query rewriting methodologies, choosing the precise
query based on the scenario, and other overhauling approaches for reducing the
impact on the database engine.

In the case of wait event tuning, there are plenty of v$ views (dynamic performance
views) in the database that contain comprehensive information about wait events
as they occur.
Query Tuning #(1-8):
Query tuning constitutes the major part (>70 %) of comprehensive database
tuning.
(1) Deter Full Table Scans (FTS): An FTS retrieves each and every bit of data from
the table, i.e., it triggers a lot of disk I/O. An FTS may arise due to the following
reasons (see the sketch after this list):
(a) Absence of a WHERE clause.
(b) Stale index and statistics of the table or view.
(c) Absence of row filtering; data type mismatch; not-equal, LIKE, and NULL
operators in the WHERE clause.
(d) Using functions in the SELECT clause.
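A minimal illustrative sketch (the emp table and its indexed column emp_id are
hypothetical, not from the paper): an unfiltered query forces an FTS, while a
selective predicate lets the optimizer use the index.

    -- Full table scan: no WHERE clause, every block is read
    SELECT * FROM emp;
    -- Indexed access: the filter lets the optimizer use the index on emp_id
    SELECT * FROM emp WHERE emp_id = 7369;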
(2) Pivoting vs. Grouping: Grouping queries can be rewritten as pivot statements
to reduce the overhead of grouping; pivoting also transforms rows into columns.
Succinctly, prefer pivoting to simple grouping when the output format is not a
concern.
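A hedged sketch of the rewrite on a hypothetical emp table, using the Oracle 11g+
PIVOT clause (column names are assumptions):

    -- Grouping: one row per department, with a hash aggregation step
    SELECT deptno, COUNT(*) FROM emp GROUP BY deptno;
    -- Pivoting: departments become columns instead
    SELECT * FROM (SELECT deptno, empno FROM emp)
    PIVOT (COUNT(empno) FOR deptno IN (10, 20, 30));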
(3) Efficiently handling function(s) in the WHERE clause: Occasionally, functions
in the WHERE clause will lead to an FTS. This is caused when a WHERE clause
predicate is invalidated by a pre-defined function.
Ex: The following query constraints involving a date range may lead to an FTS.
Case 1: where trunc(join_date) > trunc(sysdate - 7);
Case 2: where to_char(join_date, 'yyyy-mm-dd') = '1991-12-27';
Even though there exists an index on the column join_date, the trunc() and
to_char() pre-defined functions would invalidate the index, leading to suboptimal
execution and needless I/O.
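A hedged rewrite sketch: expressing the same filters as plain range predicates on
the raw column keeps the (assumed) index on join_date usable.

    -- Case 2 rewritten: a half-open date range replaces to_char(join_date, ...)
    SELECT * FROM emp
    WHERE join_date >= DATE '1991-12-27'
      AND join_date <  DATE '1991-12-28';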
(4) Logical vs. Physical Delete of a Column: In a huge database environment, a
physical drop of a column would instigate higher resource consumption apart
from consuming more time.
Logical deletion of a column allows the DBA or developer to physically delete
the column at a later time (during non-peak hours).
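A minimal sketch of the two alternatives (table and column names are illustrative):

    -- Logical delete: a fast dictionary-only operation, the column becomes invisible
    ALTER TABLE emp SET UNUSED COLUMN remarks;
    -- Physical delete, deferred to a non-peak window
    ALTER TABLE emp DROP UNUSED COLUMNS;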
Exploiting Index #(5-8):
(5) Efficient Truncate: Using the truncate function on a column prevents the use of
an index [8]. Hence, the query should be rewritten to take advantage of the index
for faster access.
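If the trunc() call cannot be removed from the predicate, a function-based index is
one way to retain indexed access; a hedged sketch (index name is illustrative):

    -- Index the truncated expression so trunc(join_date) predicates stay indexed
    CREATE INDEX emp_join_date_fbi ON emp (TRUNC(join_date));
    SELECT * FROM emp WHERE TRUNC(join_date) = TRUNC(SYSDATE);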
(6) Deter using columns on both sides of the operator: If an indexed column is
present on both sides of the operator, then the index is not used.
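A common illustration of the rule (hypothetical emp table and bind variable):
keeping the indexed column alone on one side of the operator preserves index use.

    -- Index on sal is ignored: the indexed column sits inside an expression
    SELECT * FROM emp WHERE sal - 1000 > :min_sal;
    -- Rewritten: the arithmetic is moved to the other side, index on sal usable
    SELECT * FROM emp WHERE sal > :min_sal + 1000;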

(7) Enforcing the CBO to Pick the Best Access Path: This is performed to make an
operation more or less index friendly, i.e., compelling the optimizer to be less or
more prone to select an index access path over an FTS. This behavior can be
leveraged through the parameter OPTIMIZER_INDEX_COST_ADJ.
On an Online Transaction Processing (OLTP) system, altering this parameter
to a lesser value (preferably 10-30) can lead to a massive performance gain.
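A minimal sketch of adjusting it per session for an OLTP-style workload:

    -- Lower values make index access paths look cheaper to the CBO (default 100)
    ALTER SESSION SET optimizer_index_cost_adj = 20;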
(8) Index Merge: Index merging permits merging separate indexes and using the
result, instead of visiting the table from one of the indexes. This reduces the
number of bytes processed and the CPU cost of processing the query.
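A sketch, assuming hypothetical single-column indexes on deptno and job; the
INDEX_JOIN hint asks the optimizer to answer the query from the indexes alone:

    SELECT /*+ INDEX_JOIN(e) */ e.deptno, e.job
    FROM emp e
    WHERE e.deptno = 20 AND e.job = 'CLERK';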
Optimization via Parameters #(9-12):
(9) Dynamic Sampling: This lets the optimizer take sample rows of a table to
compute missing statistics. This behavior is driven by the parameter
OPTIMIZER_DYNAMIC_SAMPLING or the DYNAMIC_SAMPLING hint.
This practice is beneficial when there are frequently executed n-way joins.
By sampling a small subset of data, the DBMS engine's cost-based optimizer
looks for a faster join order for the tables.
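A minimal sketch of both forms (the table new_orders is illustrative):

    -- Session level: raise the sampling level (0-10) for missing statistics
    ALTER SESSION SET optimizer_dynamic_sampling = 4;
    -- Statement level: sample only the unanalyzed table aliased t
    SELECT /*+ DYNAMIC_SAMPLING(t 4) */ COUNT(*) FROM new_orders t;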
(10) Optimization Level: This denotes the level of optimization used for compiling
PL/SQL library unit(s). The higher the value, the higher the compiler effort
(range: 0 to 2). Nevertheless, a higher value entails comparatively slower
compilation.
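A hedged sketch of raising the level for one program unit (proc_name is a
placeholder):

    ALTER SESSION SET plsql_optimize_level = 2;
    ALTER PROCEDURE proc_name COMPILE;   -- picks up the session setting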
(11) Performing Optimization Contingent on Data Retrieval: The OPTIMIZER_MODE
parameter is very beneficial and important for tuning queries, especially if rows
are retrieved in chunks (ex: the first 10 or 100 rows). By default, it is set to the
ALL_ROWS mode. If chunks of data are to be retrieved, then setting this
parameter to FIRST_ROWS_n will increase performance.
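A minimal sketch; the same effect is available per statement through the
FIRST_ROWS(n) hint (the orders table is illustrative):

    ALTER SESSION SET optimizer_mode = FIRST_ROWS_10;
    SELECT /*+ FIRST_ROWS(10) */ * FROM orders ORDER BY order_date DESC;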
(12) Determining Wait Events: A wait event may take the form of sundry kinds of
internal contention. Enabling timed statistics and the wait interface allows us to
see what various components are doing by looking at where they are
spending their time waiting.
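A sketch of inspecting the wait interface once timed statistics are enabled:

    ALTER SYSTEM SET timed_statistics = TRUE;
    -- Top system-wide wait events by accumulated wait time
    SELECT event, total_waits, time_waited
    FROM v$system_event
    ORDER BY time_waited DESC;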
Ratio-Based Tuning #(13-19):
(13) Ensuring In-Memory Sorts: If memory is insufficient for sorting, disk memory
is used, which slows down a query. This situation can be recognized using the
v$sysstat view. To resolve this issue, the main memory limit has to be raised via
the parameter SORT_AREA_SIZE.
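A minimal sketch: compare memory and disk sorts, then widen the sort area if
disk sorts dominate (the size shown is an arbitrary example):

    SELECT name, value FROM v$sysstat
    WHERE name IN ('sorts (memory)', 'sorts (disk)');
    -- Raise the per-session sort area (value in bytes)
    ALTER SESSION SET sort_area_size = 10485760;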
(14) Disk I/O Tweaking: The time taken to read/write database files or external files
should be minimized. This can be addressed by defragmenting files at the OS level.
(15) User Wait Events: Users experiencing higher wait times (or logged in for a long
time) can be terminated to release the resources held by them and can be
rescheduled later, so that the other current users can make use of the released
resources.
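A hedged sketch; the sid and serial# values are placeholders to be taken from
the first query:

    -- Identify sessions currently waiting the longest
    SELECT sid, serial#, username, event, seconds_in_wait
    FROM v$session
    WHERE wait_time = 0
    ORDER BY seconds_in_wait DESC;
    -- Terminate a chosen session to release its resources
    ALTER SYSTEM KILL SESSION '23,5471';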
(16) Enhancing Parse Once, Execute Many: Ideally, hard parsing [2] should be
eliminated. If the ratio of hard parsing is found to be much higher than that of
soft parsing, enforce cursor sharing [4] and bind variables [4].
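A minimal sketch (emp is illustrative): the literal form is hard parsed per distinct
value, while the bind form is parsed once and executed many times.

    -- Hard parsed anew for each distinct literal
    SELECT ename FROM emp WHERE emp_id = 7369;
    -- Parsed once, reused for every value bound to :id
    SELECT ename FROM emp WHERE emp_id = :id;
    -- Or let the engine replace literals with system-generated binds
    ALTER SESSION SET cursor_sharing = FORCE;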
(17) Ascertaining Hard Parses: A higher amount of hard parsing designates that a
repeatedly executed query is getting aged out of the shared pool (shared memory).
Increasing the size of the shared pool [5], pinning the object (ex: table/procedure)
to the pool, or caching the table [2] will minimize hard parses.
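A hedged sketch of both remedies (object names are illustrative; the
DBMS_SHARED_POOL package may need to be installed by the DBA):

    -- Pin a frequently used procedure/package in the shared pool
    EXEC DBMS_SHARED_POOL.KEEP('SCOTT.PKG_PAYROLL');
    -- Keep a small lookup table's blocks at the warm end of the buffer cache
    ALTER TABLE currency_codes CACHE;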
(18) Minimizing Paging and Swapping: Effectively modifying (mostly increasing)
the size of the shared global area (SGA, shared memory) will reduce the rate of
paging and swapping.
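A minimal sketch, assuming automatic shared memory management (the size is
an arbitrary example):

    -- Grow the SGA target; recorded in the spfile for the next startup
    ALTER SYSTEM SET sga_target = 2G SCOPE = SPFILE;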
(19) Tuning the PGA for Optimal Memory Usage: The size of the program global
area (user-specific session memory) affects the cache hit ratio [7]. The parameter
PGA_AGGREGATE_TARGET (which specifies the target aggregate memory)
should be altered to enhance the cache hit ratio.
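A sketch: check the PGA cache hit percentage, then widen the aggregate target
if it is low (the size is an arbitrary example):

    SELECT value FROM v$pgastat WHERE name = 'cache hit percentage';
    ALTER SYSTEM SET pga_aggregate_target = 1G;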
Mending Database Objects #(20-21):
(20) Cleaning Invalid Object(s): Often during application maintenance, there may
be code mutations, which might affect other objects (ex: through dependencies),
making them invalid. This leads to build failure or functional failure at run
time. To prevent this, we should heed the status of dependent objects.
(21) Disabled Triggers: These are more deleterious than invalid objects, as they
simply do not execute. They will lead to failure of the business logic of an
application.
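A minimal dictionary sweep for both problems (the object names in the remedies
are illustrative):

    SELECT object_name, object_type FROM user_objects WHERE status = 'INVALID';
    SELECT trigger_name FROM user_triggers WHERE status = 'DISABLED';
    -- Remedy the findings
    ALTER PROCEDURE calc_bonus COMPILE;
    ALTER TRIGGER trg_audit_emp ENABLE;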
PL/SQL Tuning #(22-25):
(22) Reducing Calls to SYSDATE: Calls to SYSDATE escalate the overhead of a
query. If multiple calls to SYSDATE are inevitable (prevalently in a loop),
then the technique of code motion should be preferred.
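A hedged PL/SQL sketch of code motion (the audit_log table is illustrative):

    DECLARE
      v_now DATE := SYSDATE;  -- hoisted out of the loop: one call, not 10000
    BEGIN
      FOR i IN 1 .. 10000 LOOP
        INSERT INTO audit_log (id, logged_at) VALUES (i, v_now);
      END LOOP;
    END;
    /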
(23) Limiting Dynamic SQL: Dynamic SQL is beneficial to an application from the
functional perspective. However, it opens windows for performance
degradation and SQL injection [6]. If dynamic SQL is integrated with static
SQL, optimizing the generated SQL statements becomes highly challenging.
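Where dynamic SQL is unavoidable, binding its inputs limits both re-parsing and
injection; a hedged PL/SQL fragment (p_emp_id is a hypothetical parameter):

    -- Bind the runtime value instead of concatenating it into the statement
    EXECUTE IMMEDIATE
      'DELETE FROM emp WHERE emp_id = :1'
      USING p_emp_id;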
(24) Implicit versus Explicit Cursors: Deem implicit cursor(s) worthier than explicit
cursor(s) for faster code processing, since implicit cursors are pre-optimized.
Explicit cursors must pass through the expensive Declare, Open, Fetch, and
Close phases.
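A minimal sketch contrasting the two forms (emp and the key value are
illustrative):

    DECLARE
      v_name emp.ename%TYPE;
      CURSOR c_emp IS SELECT ename FROM emp WHERE emp_id = 7369;
    BEGIN
      -- Implicit cursor: one pre-optimized statement
      SELECT ename INTO v_name FROM emp WHERE emp_id = 7369;
      -- Explicit cursor: the Declare/Open/Fetch/Close phases are paid separately
      OPEN c_emp;
      FETCH c_emp INTO v_name;
      CLOSE c_emp;
    END;
    /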
(25) Using Associative Arrays: These ensure fast reference table lookups. When
reference tables are searched via a key, query performance over the reference
tables can be drastically enhanced by loading them into associative arrays, also
referred to as index-by tables.
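A hedged sketch of loading a reference table into an index-by table for in-memory
keyed lookups (the dept table is illustrative):

    DECLARE
      TYPE t_dept_map IS TABLE OF dept.dname%TYPE INDEX BY PLS_INTEGER;
      v_depts t_dept_map;
    BEGIN
      FOR r IN (SELECT deptno, dname FROM dept) LOOP
        v_depts(r.deptno) := r.dname;       -- load the reference table once
      END LOOP;
      DBMS_OUTPUT.PUT_LINE(v_depts(20));    -- keyed lookup, no further SQL
    END;
    /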

1.4 Experimental Setup

In this section, the methodologies specied in the earlier section will be explained
via queries or code snippets that serve in the form of exemplars.

1.5 Experimental Results

After tuning the queries, CPU cost and response time were enhanced, and wait events
were significantly minimized. Some of the results are represented below in the
form of output screens.
In order to get statistics for each query, we fired some queries on the database prior
to tuning, for evaluating statistics and execution plans.

    SQL> set timing on             -- displays elapsed time
    SQL> set feedback on           -- metadata of the output
    SQL> set autotrace on explain  -- traces every query

Figure 1.1 shows the member count in each department retrieved through the
grouping method.

Fig. 1.1 Retrieving data via grouping (#1 Raw Query)



Figure 1.2 shows that the data of Fig. 1.1 are retrieved via the pivoting method
with less CPU cost and elapsed time. Also, this method of retrieving the data does
not involve hashing, as the former method does.
Figure 1.3 shows the statistics (CPU cost 31, bytes accessed 31,458,942, and
rows processed 1,023,029) of a query that retrieves rows from tables in chunks
(50 at a time). Here, the optimizer mode is set to ALL_ROWS (the default mode).
Figure 1.4 shows the statistics (CPU cost 9, bytes accessed 31,457,406, rows
processed 1,022,981) after changing the optimizer mode to FIRST_ROWS_1. Here,
though the operations performed as per the execution plan are the same, the CPU
cost is reduced by more than 70 %.

Fig. 1.2 Retrieving data via pivoting (#1 Rewritten Query)

Fig. 1.3 Statistics before changing optimizer mode (#11)



Fig. 1.4 Statistics after changing optimizer mode (#11)

Fig. 1.5 Reduced disk sorts (#13 Post-overhauling)

Figure 1.5 designates that after increasing the size of the sort area, disk sorts
diminished to 0.
After enforcing the proposed approaches, tuning is ensured at the database
engine level as well as at the query level.

1.6 Conclusion

Database tuning necessitates every bit of our ingenuity to evade harmful practices.
A query rewrite can alter the way the data are accessed or processed. An efficient
query will always minimize the impact on the underlying database.

The approach of utilizing waits to tune database engine performance is to use
real-time system wait events. At any time, wait events are those events that cause the
database engine to wait or hamper it from accomplishing work as swiftly as possible.
All these incidents are logged by the DBMS engine as they ensue, i.e., in real time. To
make use of them, we must hold this information and log it frequently to a
table that can be referenced later to see what happened at a previous time. Nonetheless,
there exist several ways to accomplish this, each method having its own
merits and demerits.
Ratio-based tuning allows us to mark the bottleneck(s) based on existing details
of the database and summarize them into a single metric. On the other hand, a slight
code amendment could bring a hit ratio down, yet still increase performance due to
a reduction in the data that need to be processed.
Trying to implement a tuning project that involves wildly different schools of
thought is a recipe for chaos. Hence, only the core reason for a bottleneck must be
recognized and resolved, instead of tuning everything.

References

1. Myalapalli VK, Savarapu PR (2014) High performance SQL. In: 11th IEEE international
conference on emerging trends in innovation and technology, Pune, India, Dec 2014
2. Myalapalli VK, Shiva MB (2015) An appraisal to optimize SQL queries. In: IEEE international
conference on pervasive computing, Pune, India, Jan 2015
3. Myalapalli VK, Totakura TP, Geloth S (2015) Augmenting database performance via SQL
tuning. In: IEEE international conference on energy systems and application, Pune, India, Oct
2015
4. Myalapalli VK, Teja BLR (2015) High performance PL/SQL programming. In: IEEE
International conference on pervasive computing, Pune, India, Jan 2015
5. Myalapalli VK, Chakravarthy ASN, Reddy KP (2015) Accelerating SQL queries by
unravelling performance bottlenecks in DBMS Engine. In: IEEE international conference on
energy systems and application, Pune, India, Oct 2015
6. Myalapalli VK, Chakravarthy ASN (2014) A unified model for cherishing privacy in database
system. In: IEEE international conference on networks and soft computing, Andhra Pradesh,
India
7. Burleson DK, Celko J, Cook JP, Gulutzan P (2003) Advanced SQL database programmer
handbook, 1st edn. Rampant TechPress, Aug 2003
8. Mishra S, Beaulieu A (2002) Mastering Oracle SQL, 1st edn. O'Reilly, Apr 2002
Chapter 2
Machine Learning Using K-Nearest Neighbor for Library Resources
Classification in Agent-Based Library Recommender System

Snehalata B. Shirude and Satish R. Kolhe

Abstract An agent-based library recommender system is proposed with the objective
of providing effective and intelligent use of library resources, such as finding the right
book(s) and relevant research journal papers and articles. It is composed of a profile
agent and a library recommender agent. The library recommender agent performs the
main tasks of filtering and providing recommendations. The library resources include
book records having tables of contents, and journal articles including abstracts and
keywords. This provides a rich set of keywords for computing similarity. The library
resources are classified into the fourteen categories specified in the ACM Computing
Classification System 2012. The identified category provides a way to obtain
semantically related keywords for the library resources. The results of k-Nearest
Neighbor (k-NN) for the library recommender system are encouraging, as there is an
improvement over existing results. The use of ACM CCS 2012 as ontology, semantic
similarity computation, implicit auto-update of user profiles, and a variety of users in
the evaluation are the features of the complete recommender system that make it
useful and novel. This paper details the classification of library resources performed
by the library recommender agent.

Keywords: Library resources classification · Machine learning · k-Nearest
Neighbor · Recommender system · Recommender agent

S.B. Shirude (✉) · S.R. Kolhe
School of Computer Sciences, North Maharashtra University, Jalgaon,
Maharashtra, India
e-mail: snehalata.shirude@gmail.com

S.R. Kolhe
e-mail: srkolhe2000@gmail.com


2.1 Introduction

A recommender system can help the user obtain relevant books and journal articles.
It filters library resources according to the user's interest. An agent-based architecture
is used to design the framework of the recommender system. The recommendation
performance of the agents is improved by machine learning. The library recommender
agent performs the main tasks of filtering and providing recommendations. The
recommender agent makes use of user profiles to filter the library resources. User
profiles and library resources are represented as vectors containing term frequencies
for every significant keyword. Among the library resources, most of the book records
have tables of contents, and journal articles include abstracts and keywords. This
provides a rich set of keywords while computing similarity. More correct
recommendations are possible by the addition of semantically equivalent keywords
into the vectors. The ACM Computing Classification System (ACM CCS) 2012 is
used as static ontology [1]. ACM CCS is in Simple Knowledge Organization System
(SKOS) format and has 14 categories. Machine learning is useful to identify the
category of a book. The results of the experiments performed using k-Nearest
Neighbor are discussed in this paper.

2.2 Literature Study

A literature study was performed to learn the present state of the work in the field of
development of recommender systems for library resources. The overall study takes
into consideration various views, viz., the approach used, the techniques applied for
tasks such as similarity computation and classification, the method for relevance
feedback, and the performance of the system. This paper describes library resources
classification using k-Nearest Neighbor, which is the important stage of the library
recommender agent in the complete library recommender system. Therefore, the
part of the literature study focusing on agent-based systems performing classification
or similar tasks to achieve similar objectives is given in this paper.
A survey of recommender systems for libraries was performed by Gottwald and
Koch in 2011. BibTip, ExLibris bX, Techlens, Foxtrot, Fab, and LIBRA are listed
as existing solutions in the field [2]. The workshop on new trends in content-based
recommender systems (CBRecSys 2014) is also identified. The goal of CBRecSys 2014
was to provide a platform for papers dedicated to all aspects and new trends of
content-based recommendation. There are many recommendation domains and
applications where content and metadata play a key role. In domains such as
movies, the relationship between content and usage data has already seen thorough
investigation, but for many other domains, such as books, news, scientific articles,
and Web pages, it is still not known if and how these data sources should be
combined to provide the best recommendation performance. This motivates
researchers in the field of recommender systems to work on the library domain
specifically [3].
The use of an agent-based approach is found in [4-10]. Software agents are used in
diverse fields such as personalized information management, electronic commerce,
interface design, computer games, and the management of complex commercial and
industrial processes. The key hallmarks of agenthood are given as autonomy, social
ability, responsiveness, and proactiveness [4]. DIGLIB is an architecture proposed
for a digital library which helps a user or a group of users to identify and find reading
material of his or her interest. It uses a software agent, a unique combination of a
filtering agent and an information agent, to facilitate intelligent search. The authors
compared conventional information retrieval and agent-based information retrieval.
Agent-based information retrieval has features such as user interaction and the
dynamic nature of information spread over a heterogeneously distributed
environment, whereas conventional information retrieval is designed for relatively
static databases. The filtering agent works using the Boolean operators OR, AND,
and NOT for search refinement. An evaluation of the implementation on a sample
dataset is not included in the paper, but it gives the idea that agent-based information
retrieval is useful [5]. BibTip is a recommender system developed at Karlsruhe
University. The architecture of BibTip consists of an OPAC observation agent, an
aggregation agent, and a recommendation agent. Co-occurrences between pairs of
titles are established when multiple titles are viewed in one session, and are
summarized in a co-occurrence matrix. The matrix is evaluated while generating
recommendations. Repeat-buying theory is used in the recommender system. The
importance of the system was evaluated by conducting a survey which rated the
recommendation service at 4.21 (on a scale of 1 to 5). This proved the need for
recommender systems among library users [6]. Morales-del-Castillo, Peis, and
Herrera-Viedma have presented a multiagent recommender system prototype for
digital libraries designed for the e-scholars community. This provides an integrated
solution to minimize the problem of accessing relevant information in vast document
repositories. They used semantic Web technologies and several fuzzy linguistic
modeling techniques to define a richer description of information. The system is
evaluated in terms of precision, recall, and f1 measure, giving values of 50.00, 70.66,
and 58.19, respectively [8]. The results seem to be improvable further by the use
of other techniques. The conceptual framework of a multiagent-based recommender
system is proposed by Pakdeetrakulwong and Wongthongtham. This framework
provides active support to access and utilize knowledge and project information in
the software engineering ontology. The framework consists of user agents, a semantic
recommender agent, ontology agents, and an evolution agent. All these agents work
collaboratively while performing processes such as the semantic annotation process,
ontology population process, query process, recommendation process, ontology
evolution update process, and issue raising with instance update process [9]. This
work is performed using only the software engineering ontology.

Collaborative annotation of learning resources with a lightweight resource annotation
metadata scheme is proposed by Simon Boung-Yew et al. In the personalized
learning process, parameters such as learning style and competency level are
specified by the learner. Resources are classified into two classes: good and poor.
k-Nearest Neighbor is used to perform the task [11]. The classification performed is
binary, on the annotated tags. The system explicitly asks users a set of questions to
take feedback. Normally, users do not like to fill up questionnaires; therefore, correct
feedback collection becomes difficult this way. Implicit feedback, along with very
limited explicit information such as ratings, can improve the performance. Tag-based
interests are utilized in a recommender system by Cheng-Lung Huang et al. using
folksonomy. They worked on finding users having similar interests on social network
Web sites [12]. The nature of the data on social network Web sites is different from
the contents of library resources. This gives a direction to perform more work in the
library domain. The gray sheep users problem in recommender systems is analyzed
by Mustansar Ali Ghazanfar and Adam Prugel-Bennett. This work is useful for
recommender systems because such a system applies data mining and machine
learning techniques to unseen information for prediction [13]. A location-aware book
recommendation system is proposed by Chih-Ming Chen. The use of a k-NN
classifier is described in the process of deciding the learner's location. This system is
specifically designed to facilitate book searches on handheld mobile devices [14].
The cooperative learning model is used. The recommendation mechanism filters the
records based on the similarity of the book titles and learner search queries. The
recommendation accuracy can be improved if more details of the books are input to
the filtering process. Anna Hulseberg and Sarah Monson have given a taxonomy of
a student-driven library Web site. There is a need to implement diverging
terminologies for efficient searching and research guidance for library users [15].
The addition of semantic information while filtering the records can improve the
performance of the system. A library recommender system based on a personal
ontology is designed by Kuo-Fong Kao and I-En Liao. The system creates a personal
ontology for each user based on the favorite value of each category. The concept of
tree distance is used for comparing two personal ontologies while providing
recommendations [16]. A broader use of ontology to acquire semantically related
keywords can allow specifying weights for keywords. This can increase the number
of relevant matches according to the interests of users.
Summing up the review, it is learned that the agent-based approach is the most
suitable, as the recommendation process involves decision making and performing
actions based upon the perception of the recommender. Though some previous works
are identified in this field, performance improvement is possible. The most identified
difficulty is the lack of existing datasets with rich data about library resources,
including tables of contents, indexes, abstracts, keywords, etc. It is possible to
improve the performance of filtering by considering semantically related keywords
and weight assignment. This task needs the use of an ontology and a richer dataset of
library resources.

2.3 Architecture of Library Recommender Agent

Figure 2.1 shows the architectural design of the library recommender agent.


The process of recommendation is different from searching. The recommender
system provides a variety of library resources, such as journal articles, books,
individual chapters, and theses, in the form of recommendations. The generation of
recommendations is based on the user profiles, while searching is query based. The
system provides recommendations to its members only after it has created profiles
for them. The library recommender agent uses a hybrid approach to generate the
recommendations. The advantage of the hybrid approach is that it combines the
results generated by two agents, a content-based agent and a collaborative agent,
as shown in Fig. 2.1. The main task of the library recommender agent is to provide
recommendations to the user. Library resources and user profiles are represented as
vectors consisting of the term frequencies of the contained keywords after removal
of stop words. The content-based agent matches the user profile with the available
library resources by computing the similarity between the vectors representing
library resources and the vectors representing user profiles. Library resources are
enriched with the addition of semantically related terms retrieved from the
knowledgebase by the content-based agent. The recommendations provided by the
content-based agent to a particular user are combined with the recommendations
provided by the collaborative agent; hence, the approach is hybrid. The collaborative
agent provides recommendations by finding users similar to the active user. This
agent refers to knowledge about past users of the system. The library recommender
agent is utility based. The performance measure is the correctness of the provided
recommendations. The environment is

Fig. 2.1 Architecture of library recommender agent



user prole, knowledgebase, and library resources. The properties of task environ-
ment are partially observable, deterministic, sequential, static, discrete, and single
agent. Actuators have ability to follow the links that are user interface and display the
information. Sensors have ability to parse documents and Web pages.

2.4 Library Recommender Classifier

(A) Dataset Design

The library resources are in MARC format. They were added into the dataset from
the Library of Congress using the Z39.50 protocol. Records of books belonging to
areas such as cluster analysis, coding theory, computer network evaluation,
distribution, knowledge management, and business computer programs were
downloaded. In addition, resources were added from the catalogs of PHI Learning,
Laxmi, Pearson Education, Cambridge, and McGraw Hill publications. Library
resources in the computer science area are used as the dataset for the experiments.
The MARC-format records were exported to XML with the use of Koha. One of the
important fields of the records of these library resources is the table of contents. A
collection of conference and journal papers from the computer science field was
added into the dataset in XML form. The records of journal articles contain abstracts,
which are helpful while training the system. There are 705 library resources in the
dataset. The ACM Computing Classification System 2012 is used as the
knowledgebase for the experiments [1].
(B) k-Nearest Neighbor-Based Classifier

The library resources classification is carried out on test data containing 1463
columns for input features and one attribute for the output. The input features are
generated from the knowledgebase; they are the unique keywords present in each
category after removal of stop words. In the sample data, one row represents a single
library resource. The value under each column is the frequency of the keyword
present in the particular library resource. The output attribute represents the category
to which the library resource belongs (Table 2.1).
The k-NN classifier [11] takes decisions for 231 test library resources using
distance-based and accuracy-based approaches. Hold out, tenfold, and leave-one-out
cross-validation (LOOCV) techniques are applied. The experiments are carried out
using the Rough Set Exploration System version 2.2 [17]. The classifier has two
stages: first, the determination of the nearest neighbors, and second, the determination
of the class using those neighbors. The nearest neighbors are determined using the
metric type City-SVD (city block combined with singular value decomposition). The
distance measure d is the weighted sum of the distance measures d_a for the
particular attributes a_1, ..., a_1463. It can be calculated using the following equation.

Table 2.1 Sample vectors of library resources

[Input]                                        [Output]
[2,4,0,6,0,0,1,2,3,4,0,0,0,1,7,0,…,4,2,0,1]    [1]
[0,1,2,1,1,0,0,0,1,0,0,4,0,0,1,0,…,0,0,2,0]    [2]
[2,2,0,3,0,0,0,1,0,0,3,0,0,2,0,3,…,0,0,2,0]    [3]
[0,0,1,0,2,5,0,2,0,1,0,0,0,0,0,0,…,1,0,0,1]    [4]
[1,4,0,3,0,0,0,2,0,0,0,0,0,0,0,1,…,0,1,0,2]    [5]
[4,0,0,1,1,0,0,0,0,2,0,0,0,0,1,0,…,0,0,3,0]    [6]
[0,0,2,0,0,0,0,1,0,0,4,0,0,0,2,0,…,0,1,0,0]    [7]
[0,2,1,0,2,0,1,0,0,2,0,0,0,0,0,2,…,0,0,0,0]    [8]
[0,1,0,3,1,0,2,0,1,0,0,0,0,0,0,0,…,5,0,1,2]    [9]
[5,1,2,0,0,0,1,1,0,0,1,0,0,2,2,0,…,0,0,0,0]    [10]
[0,0,1,0,1,0,4,1,0,0,0,0,0,1,0,1,…,4,2,0,0]    [11]

$$ d(x, y) = \sum_{a} w_a \, d_a\big(a(x), a(y)\big), $$

where a ranges over the 1463 attributes and w_a is the weight of attribute a.

The city block metric, the so-called Manhattan metric, is combined for numerical
attributes in City-SVD. The distance between two numerical values of an attribute is
the absolute value of their difference, |a(x) - a(y)|, divided by the length of the range
of the attribute a in the training data, since the normalization parameter is set to the
value "range". The length is the difference between the largest and smallest values
of the attribute in the training data.
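As an illustrative worked instance under these definitions (the values are
hypothetical): if attribute a spans the range [0, 20] in the training data, then for
a(x) = 4 and a(y) = 9,

$$ d_a\big(a(x), a(y)\big) = \frac{|4 - 9|}{20 - 0} = 0.25. $$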
The number of neighbors (k) is set to 1, 2, 3, 4, and 5. Finally, the option for the
number of neighbors is set to search for the optimum between 1 and 100 to identify
the right value of k. Attribute weighting/scaling is done by two methods:
distance-based and accuracy-based. The distance-based method works iteratively to
choose the weights so as to optimize the distance to a correctly identified training
library resource L_i. The accuracy-based attribute scaling method also works
iteratively, choosing the weights to optimize the accuracy of decision prediction for a
training library resource L_i. Twenty iterations are performed. In the accuracy-based
method, every iteration increases the weight of attributes with high accuracy of the
classifier with k = 1. The second stage of the k-NN classifier is determining the
category using the recognized neighbors. The distance-weighted voting method is
selected to perform this: the votes are weighted using the distance between the
neighbor and the library resource whose category needs to be identified. Finally, the
category having the largest total of weighted votes is selected. If a neighbor generates
a local rule inconsistent with other members of the neighborhood, then the classifier
excludes it from the voting process, as the option "filter neighbors using rules" is
set to true.
Figure 2.2 shows the design of the k-NN classifier for one sample experiment.
Testing and training library resource records are in table format. Eleven categories
among the fourteen in ACM CCS 2012 are taken for classification; three are ignored
because they are general. Experiments are performed using hold out, tenfold, and
leave-one-out cross-validation techniques. The values of k are 1, 2, 3, 4, and 5, and
the range 1 to 100, for all techniques.

Fig. 2.2 Design of k-NN classifier

Figure 2.3 gives the confusion matrix generated for one sample experiment.
For correctly classified library resources, classification accuracy is computed for the
experiments. For evaluation, precision, recall, and f1 values [18] are computed for
each of the folds and for the LOOCV technique.

$$ \text{Precision} = \frac{\text{No. of relevant retrieved concepts}}{\text{Total no. of retrieved concepts}} \times 100 $$

$$ \text{Recall} = \frac{\text{No. of relevant retrieved concepts}}{\text{Total no. of relevant concepts}} \times 100 $$

$$ F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} $$
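An illustrative arithmetic check of the F1 formula with round, hypothetical
numbers: for Precision = 75 % and Recall = 60 %,
F1 = (2 × 75 × 60)/(75 + 60) = 9000/135 ≈ 66.7 %, i.e., the harmonic mean of
the two.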

Fig. 2.3 Result of one sample experiment by k-NN classifier



2.5 Results and Discussion

The results of the accuracy-based, distance-based, and LOOCV approaches of the
k-NN classifier are given below. Table 2.2 shows the results of the accuracy-based
k-NN classifier using the tenfold technique. Table 2.3 shows the results of the
distance-based k-NN classifier using the tenfold technique.
The total of correctly classified library resources and the average classification
accuracy of the distance-based and accuracy-based measures are calculated, and a
summary is given in Table 2.4.

Table 2.2 Results of accuracy-based k-NN classifier, 10-fold approach

Accuracy-based k-NN classifier
k=1
Fold                        1     2     3     4     5     6     7     8     9     10
Correctly classified LR     10    9     12    14    14    15    13    10    18    16
Classification accuracy (%) 45.50 40.90 54.50 63.60 63.60 68.20 59.10 45.50 81.80 48.50
k=2
Correctly classified LR     10    11    12    14    14    15    13    10    18    16
Classification accuracy (%) 45.50 50.00 54.50 63.60 63.60 68.20 59.10 45.50 81.80 48.50
k=3
Correctly classified LR     10    11    14    14    14    16    13    10    17    17
Classification accuracy (%) 45.50 50.00 63.60 63.60 63.60 72.70 59.10 45.50 77.30 51.50
k=4
Correctly classified LR     10    11    14    14    15    16    13    10    16    17
Classification accuracy (%) 45.50 50.00 63.60 63.60 68.20 72.70 59.10 45.50 72.70 51.50
k=5
Correctly classified LR     10    12    14    14    16    15    12    10    16    16
Classification accuracy (%) 45.50 54.50 63.60 63.60 72.70 68.20 54.50 45.50 72.70 48.50
k in range 1 to 100
Correctly classified LR     10    11    14    15    16    15    13    9     19    12
Classification accuracy (%) 45.50 50.00 63.60 68.20 72.70 68.20 59.10 40.90 86.40 54.50

Table 2.3 Results of distance-based k-NN classifier, 10-fold approach

Distance-based k-NN classifier
k=1
Fold                        1     2     3     4     5     6     7     8     9     10
Correctly classified LR     9     9     14    13    14    14    14    11    19    10
Classification accuracy (%) 40.90 40.90 63.60 59.10 63.60 63.60 63.60 50.00 86.40 45.50
k=2
Correctly classified LR     9     9     14    13    14    14    14    11    19    10
Classification accuracy (%) 40.90 40.90 63.60 59.10 63.60 63.60 63.60 50.00 86.40 45.50
k=3
Correctly classified LR     10    9     14    13    14    13    14    11    19    10
Classification accuracy (%) 45.50 40.90 63.60 59.10 63.60 59.10 63.60 50.00 86.40 45.50
k=4
Correctly classified LR     10    10    14    13    14    14    13    11    19    10
Classification accuracy (%) 45.50 45.50 63.60 59.10 63.60 63.60 59.10 50.00 86.40 45.50
k=5
Correctly classified LR     10    10    14    15    14    14    13    11    19    10
Classification accuracy (%) 45.50 45.50 63.60 68.20 63.60 63.60 59.10 50.00 86.40 45.50
k in range 1 to 100
Correctly classified LR     10    10    13    15    15    14    13    11    19    19
Classification accuracy (%) 45.50 45.50 59.10 68.20 68.20 63.60 59.10 50.00 86.40 57.60

From the above tables, it is clear that the average classification accuracy is better
when the value of k lies between 1 and 100. The precision, recall, and f1 values
computed for these experiments are given in Table 2.5.
A graph is plotted for the above values to compare the approaches, as given in
Fig. 2.4.
k-NN classier in experimented approaches takes decision for 231 library
resources. In tenfold approach, rst ninefold takes 22 records of library resources
while in the last tenfold 33 remaining records are added. In LOOCV technique, the
single record of library resource from all testing records is selected at a time. The
coverage for all experiments using k-NN is 100 % because the classier works
nonlinearly and classifying all testing library resources. It tries to nd nearest
category that may result in misclassication. Figure 2.4 shows the graphical

Table 2.4 Average classification accuracy

Value of k                               k = 1   k = 2   k = 3   k = 4   k = 5   Range 1–100
Accuracy-based k-NN classifier (10-fold)
Total correctly classified resources     131     133     136     136     135     134
Average accuracy (%)                     57.12   58.03   59.24   59.24   58.93   60.91
Distance-based k-NN classifier (10-fold)
Total correctly classified resources     127     127     127     128     130     139
Average accuracy (%)                     57.72   57.72   57.73   58.19   59.10   60.32
k-NN classifier (LOOCV approach)
Total correctly classified resources     141     141     141     144     147     152
Average accuracy (%)                     60.70   60.70   60.70   62.00   63.30   65.52

Table 2.5 Precision, recall, and F1 values

When k is in range 1 to 100
                        Precision (%)   Recall (%)   F1 (%)
Accuracy-based k-NN     73.78           60.91        66.60
Distance-based k-NN     74.40           60.32        66.47
LOOCV-based k-NN        65.52           65.52        65.52

Fig. 2.4 Graph of precision, recall, and F1 values

Table 2.6 Comparison with other similar work

Udsanee P. and Pornpit W. work 2013 | Hulseberg and Monson work 2012 | Liao, Hsu, et al. work 2010 | Proposed recommender system
Specifically designed for the subject software engineering only | No use of ontology | Use of personal ontology for each user | Use of ACM CCS 2012 as ontology
Semantic annotation is conceptually defined | Semantic annotation is not implemented | Semantic annotation is not implemented | Semantic similarity is computed
Implicit feedback is taken | Explicit set of questions asked to update user profiles | User profiles are not updated automatically | Auto-updates user profiles implicitly
Very limited number of users | Very limited number of users | Very limited number of users | Variety of users; performance is better

comparison of the precision, recall, and F1 values. The precision of the distance-based k-NN shows better classification with respect to the total retrieved library resources. Similarly, the LOOCV-based k-NN has high recall, so with respect to the total number of samples its performance is improved compared to the other approaches. The work of the recommender system is compared with similar work identified in the literature study. Each row of Table 2.6 gives the comparison with [9, 15, 16] according to features such as use of ontology, semantic similarity, feedback mechanism, and variety of users.

2.6 Conclusion

The library recommender agent provides recommendations to users by a hybrid approach. The variety of library resources suggests the need for classifying and grouping them, which resembles the idea of arranging similar library records on a common shelf. The tenfold and LOOCV approaches used in k-NN classification give better performance than hold-out, as cross-validation averages over all possibilities for choosing the test instances of library resources. The LOOCV approach outperforms the other approaches since n − 1 out of n library resources are used for training. The LOOCV approach has a single library resource record in every turn of the experiment; therefore, the bias is low, which improves the result. The estimation is comparatively good and unbiased for the tenfold and LOOCV approaches. The latter step of the library recommender agent uses the results of classification while generating recommendations.

References

1. http://www.acm.org/about/class/2012
2. Gottwald S, Koch T (2011) Recommender systems for libraries. In: ACM recommender systems 2011, Chicago
3. Bogers T, Koolen M, Cantador I (2014) Workshop on new trends in content-based recommender systems (CBRecSys 2014). In: Proceedings of the 8th ACM conference on recommender systems. ACM, pp 379–380
4. Jennings N, Wooldridge M (1996) Software agents. IEE Rev 42(1):17–20
5. Prakash N (2004) Intelligent search in digital libraries. In: 2nd convention PLANNER. Manipur University, Imphal. Copyright INFLIBNET Centre, Ahmedabad, pp 83–90
6. Mönnich M, Spiering M (2008) Adding value to the library catalog by implementing a recommendation system. D-Lib Mag 14(5):4
7. Prakasam S (2010) An agent-based intelligent system to enhance e-learning through mining techniques. Int J Comput Sci Eng
8. Morales-del-Castillo JM, Peis E, Herrera-Viedma E (2010) A filtering and recommender system for e-scholars. Int J Technol Enhanc Learn 2(3):227–240
9. Pakdeetrakulwong U, Wongthongtham P (2013) State of the art of a multi-agent based recommender system for active software engineering ontology. Int J Digit Inf Wirel Commun (IJDIWC) 3(4):29–42
10. Bedi P, Vashisth P (2015) Argumentation-enabled interest-based personalised recommender system. J Exp Theor Artif Intell 27(2):199–226
11. Lau SB-Y, Lee C-S, Singh YP (2015) A folksonomy-based lightweight resource annotation metadata schema for personalized hypermedia learning resource delivery. Interact Learn Environ 23(1):79–105
12. Huang C-L, Yeh P-H, Lin C-W, Den-Cing W (2014) Utilizing user tag-based interests in recommender systems for social resource sharing websites. Knowl-Based Syst 56:86–96
13. Ghazanfar MA, Prügel-Bennett A (2014) Leveraging clustering approaches to solve the gray-sheep users problem in recommender systems. Expert Syst Appl 41(7):3261–3275
14. Chen C-M (2013) An intelligent mobile location-aware book recommendation system that enhances problem-based learning in libraries. Interact Learn Environ 5:469–495
15. Hulseberg A, Monson S (2012) Investigating student driven taxonomy for library website design. J Electron Res Librariansh 361–378
16. Liao I-E, Hsu WC, Cheng MS, Chen LP (2010) A library recommender system based on PORE and collaborative filtering technique for English collections. Electron Libr 28:386–400
17. Bazan JG, Szczuka MS, Wroblewski J (2002) A new version of rough set exploration system. In: Rough sets and current trends in computing. Springer, Berlin, pp 397–404
18. Gunawardana A, Shani G (2009) A survey of accuracy evaluation metrics of recommendation tasks. J Mach Learn Res 10:2935–2962
Chapter 3
An Efficient Dynamic Scheduling of Tasks
for Multicore Real-Time Systems

Kalyan Baital and Amlan Chakrabarti

Abstract Embedded real-time systems are increasing day by day to execute high-performance-oriented applications on multicore architectures. Efficient task scheduling in these systems is very necessary so that the majority of tasks can be scheduled within their deadlines, thus providing the needed throughput. This paper presents a scheduling algorithm where random tasks generated at different time intervals, with different periodicities and execution times, can be accommodated into a system that is already running a set of tasks, while meeting the deadline criteria of the tasks. The idle time of the cores is found based on the execution time of the existing tasks. Using the concept of Pfair scheduling, random new tasks are divided to fit into the idle times of the different cores of the system. We verify the proposed algorithm using generated task sets, and the results show that our algorithm performs excellently in all the cases.

Keywords Dynamic scheduling · Pfair · Multicore · Real-time system · Task scheduling · RTOS · Idle time · Two-level queues

3.1 Introduction

(A) Task scheduling in real-time multicore systems

Kopetz defines a real-time system as follows: "A real-time computer system is a computer system in which the correctness of the system behaviour depends not only on the logical results of the computation but also on the physical instant at which these
K. Baital (✉)
National Institute of Electronics and Information Technology,
Kolkata, India
e-mail: kalyan_baital@yahoo.co.in
A. Chakrabarti
A. K. Choudhury School of Information Technology,
University of Calcutta, Kolkata, India
e-mail: amlanc@ieee.org


results are produced" [1]. In a non-real-time system, throughput is the indicator of system performance; that is, more and more tasks are required to be executed in a certain time period to increase performance. Performance in a real-time system is measured by the following criterion: as many tasks as possible should be processed such that they produce the desired results before their deadlines. Hence, the real-time system must be predictable in nature. Therefore, in strict real-time systems, a delayed result is not just late but useless. In a real-time system, the system time (internal time) is measured with the same timescale as the controlled environment (external time). The deadline parameter marks the main difference between real-time and non-real-time systems. In a real-time system, deadlines have to be met under all circumstances, even the worst. Real-time tasks are divided into three types depending upon their strictness levels, described as follows:
(a) A real-time task is called hard if missing its deadline may cause disastrous consequences on the environment under control.
(b) A real-time task is soft if meeting its deadline is desirable, but missing it does not cause serious damage.
(c) A real-time task is called firm if missing its deadline makes the result useless, but missing it does not cause serious damage.
With respect to periodicity, there are two types of real-time tasks, namely periodic and aperiodic. In a periodic task, a sequence of instances is generated with a fixed period; in the case of an aperiodic task, no period is present, that is, the next instance of the task is not known.
High-speed processors are essential for real-time systems, and the reduced cost of high-speed processors has shown the way for meeting real-time system demands more efficiently [2–5]. Today, multicore systems are integrated into real-time systems. A multicore system integrates two or more processors in an integrated circuit for performance enhancement and optimum power consumption. It executes multiple tasks simultaneously with efficiency. Unlike a single core, it performs concurrent processing of tasks involving more than one core. Switching takes place either at regular intervals or when the currently executing task releases control of the processor. Multicore is currently an emerging trend due to its speed and performance [6].
There are different ways of organizing a multicore system, which are summarized below [7]:
(a) Symmetric multiprocessing (SMP): A single instance or image of the real-time operating system manages the cores and shared resources of the system. All the resources are available to all the cores and tasks; hence, no external communication is required between the cores.

(b) Asymmetric multiprocessing (AMP): Instead of sharing the cores under one image, it has one image per core and hence treats each core as a separate CPU. A task of one core can only use the resources of that core.
Given a set of cores P, a set of tasks K, and a set of resources R, where K >> P, there may exist precedence constraints, which can be shown using a precedence graph (DAG); considering a real-time system, timing constraints are associated with each of the tasks [8]. The objective of real-time scheduling is to allocate processors from P and resources from R to the K tasks in such a way that all task instances are completed under the constraints. In its general form, the problem is NP-complete [9, 10]. Therefore, relaxed situations have to be considered and/or proper heuristic algorithms have to be applied. In principle, scheduling is an online algorithm, though under certain circumstances large parts of scheduling can be done offline. However, in any case, all exact or heuristic algorithms should have low complexity.
There are two types of scheduling algorithms, namely preemptive and non-preemptive. In preemptive scheduling, a running task instance may be preempted as per requirement and restarted at any time later. Any preemption means a delay in executing the task instance, which needs to be addressed appropriately. A task instance in non-preemptive scheduling will be executed undisturbed until it finishes. Non-preemptive approaches involve fewer context switches. It may appear that non-preemptive scheduling should be better for real time, but preemption is required for better processor utilization.
(B) Existing Works
Many scheduling algorithms, including many good heuristic approaches, namely Earliest Deadline First (EDF: deadline-driven scheduling which dynamically assigns priorities according to the deadline), Rate Monotonic (RM: static priority-driven scheduling where each periodic task is assigned a priority inversely based on its period), Least Common Multiple (LCM: a scheduling technique considering the least common multiple of the process periods), and Ant Colony Optimization (ACO: scheduling based on the ant colony optimization algorithm), have been developed for efficient mapping and allocation of tasks to processing cores [11–19]. Some of the algorithms highlight single scheduling approaches, and some algorithms sketch several other paradigms. Some consider static scheduling, while others consider dynamic scheduling. Some work well in underloaded conditions and some in overloaded conditions.
Timing analysis of concurrent tasks is an approach to meet task deadlines in a real-time system. Suhendra et al. [20] present a timing analysis technique for concurrent tasks running on multicores with shared cache memory. The authors of [21] present a framework to check whether a time-dependent task set is executable or not, and to order those executable tasks for self-organized resource scheduling. The authors of [22] address the problem of scheduling periodic parallel tasks on a multiresource platform with tasks having real-time constraints. The scheduling approach uses

a Worst-Case Execution Time (WCET) estimation per task, which is the maximum value over all execution scenarios of the task. [23] presents an approach that considers not a single WCET estimation but a set of WCET estimations per task. This allows the algorithm to reduce the number of resources for a given task set and hence enhance the performance of the system. The authors of [24] implement a staggered model for Proportionate-fair (Pfair) scheduling on a symmetric multiprocessor for performance improvement, and [25] uses Pfair to reduce the scheduling overhead while also supporting task–interrupt co-scheduling for embedded multicore systems. One can visit [26–38] for a more exhaustive study on scheduling considering different parameters, namely processor resource allocation, task sequence, dependent tasks, parallel multithreading, power optimization, temperature and memory access.
From all the previous work related to real-time scheduling done by researchers, it may be concluded that there is a need for a flexible dynamic scheduling model for real-time task execution in a multicore system under all conditions, preserving the deadlines of the tasks and also optimizing the throughput of the system.
(C) Claim for novelty
This paper proposes a simulation model to find the best possible way to schedule a given set of tasks efficiently to the available processing cores so that all the task deadlines are met and the throughput of the system is optimal for the set of tasks.
The model maintains two-level queues: global and local [39, 40]. It also uses the concepts of two algorithms, namely EDF and Pfair, as follows:
(a) EDF at the global queue of new tasks and
(b) Pfair for context switching and forwarding the jobs to the local queues.
The time complexity of both the EDF and Pfair algorithms is O(log n); therefore, the time complexity of the proposed model is also O(log n), but the model has some novelties with respect to efficient utilization of the processors as well as fitting almost all the tasks efficiently while meeting the deadline condition, as given below:
The model is better than the EDF scheme because EDF works well in underloaded conditions and when there is only one processor, whereas our proposed model works in overloaded conditions and with multiple processors. The same model can also be used in an underloaded condition with a single processor.
Further, we can allocate almost all the new tasks to the cores if
1. the periodicity of the new task is big,
2. the execution time of the new task is small,
3. the task is divided into a large number of jobs, thus decreasing their execution times so that the jobs can be fitted before the next instance of the existing task starts.

In RM scheduling, all the deadlines of randomly generated periodic tasks can be met if CPU utilization is less than or equal to 85 % [41]. When scheduling periodic tasks with equal deadlines and periods, the utilization bound of EDF is 100 %, but this is an ideal situation where the system is not in an overloaded condition. The CPU utilization of our proposed model varies from 95 % to 100 %, meeting all the task deadlines under all system conditions. Hence, the CPU utilization of our model is very high compared to RM scheduling and EDF scheduling.

3.2 Proposed Task Scheduling Scheme

(A) Problem statement

Given a set of cores, assume a scenario where the system initiates with a set of tasks equal to the number of cores. Each task has a periodicity (pi), and its deadline (di) can be assumed to be the same as its period interval, i.e. di = pi. Each task also has an execution time (ei). During the run of the system, at different time intervals, random tasks may arrive. The system has to accommodate these tasks based on a scheduling scheme or decide whether a task will be accepted by the system or not.
(B) Problem solution
Assumptions:
All tasks are independent.
Preemption is allowed.
No task can suspend itself, for example on I/O operation.
All overheads in the kernel are assumed to be zero.
The time required for context switching can be ignored.
Tasks are released as soon as they arrive, i.e. φi = 0 (φi = occurrence of the first instance of task Ti; the second instance occurs at φi + pi, and so on) (Fig. 3.1). A compact C sketch of this task model is given below.

Fig. 3.1 Periodic task T for core C
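As a compact restatement of the assumptions above, the task model can be written as a C structure; this struct and its field names are illustrative, not taken from the chapter's implementation:

/* Periodic real-time task model: deadline equals period (d = p),
   phase phi = 0, so instance j of a task is released at time j * p. */
typedef struct {
    int    id; /* task identifier                  */
    double p;  /* period (= relative deadline), ms */
    double e;  /* execution time, ms               */
} Task;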

Initial Task Stage:

1. Let tasks T1, T2 and T3 be assigned to cores C1, C2 and C3, respectively.
2. Let core C1 have execution time = 1 ms and period = 2 ms. Therefore, the idle time of core C1 is from 1 ms–2 ms, 3 ms–4 ms, 5 ms–6 ms and so on.
3. Similarly for C2, say the execution time of task T2 is 2 ms and its period is 4 ms. Therefore, the idle time is from 2 ms–4 ms, 6 ms–8 ms, 10 ms–12 ms, etc.
4. A similar time plan can be drawn for the other core C3.
5. We draw a graph as shown in Fig. 3.2 for all the idle times of the cores and calculate the idle times corresponding to the cores.
6. We maintain the calculated idle times corresponding to the cores in the Time–Core Map table as shown in Table 3.1 (a sketch of this idle-time enumeration is given below).
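A minimal sketch, assuming the periodic task model above with one task per core, of how the idle intervals of steps 2–5 can be enumerated; the function name is ours, and the values reproduce the C1 and C2 examples:

#include <stdio.h>

/* A core whose task has execution time e and period p runs
   instance j in [j*p, j*p + e), so it is idle in [j*p + e, (j+1)*p).
   Print the first n idle intervals of the core. */
static void print_idle(const char *core, double e, double p, int n)
{
    for (int j = 0; j < n; j++)
        printf("%s idle: %g ms - %g ms\n", core, j * p + e, (j + 1) * p);
}

int main(void)
{
    print_idle("C1", 1.0, 2.0, 3); /* 1-2, 3-4, 5-6 ms   */
    print_idle("C2", 2.0, 4.0, 3); /* 2-4, 6-8, 10-12 ms */
    return 0;
}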
New Task Stage:
Assumptions:
Maximum number of tasks in the global queue is p.
Maximum number of jobs in the local queue is l (p > l).
Waiting time at the local queue is zero.
A new task also has a periodicity P (= deadline).
A new task also has an execution time E.

Fig. 3.2 Graph of idle times of cores C1, C2 and C3

Table 3.1 Time–Core Map table

Time (Idle) ms   Core
1                C1
2                C2
3                C1, C2, C3
4                C3
5                C1, C3

The architecture of the algorithm model with two-level queues is shown below (Fig. 3.3).
The working principle of the proposed model is described below:
1. New tasks are first stored in the global queue (say T5, T6, T7) and are sorted in a priority queue according to EDF or least time to go, i.e. the task with the earliest deadline is given the highest priority. The rearranged queue is (T7, T6, T5). Whenever a scheduling event occurs (a task finishes, a new task is released, etc.), the queue is searched for the task with the earliest deadline.
2. Using the Pfair concept, the front task (say T5) is divided into a number of jobs z with execution time Et ms (where Et = E/z), i.e. the execution time is divided among the jobs, where Et is the execution time of the divided jobs and the jobs have the same periodicity P as the arriving task. As per the Pfair algorithm, each job is associated with a pseudo-release time and a pseudo-deadline. The pseudo-release time and pseudo-deadline of the ith job are as follows:

r_i = ⌊(i − 1)/wt(T)⌋
d_i = ⌈i/wt(T)⌉ − 1

where wt(T) = E/P and i ≥ 1, and where r1, r2, …, rz are the release times and d1, d2, …, dz are the deadlines of the respective jobs (an illustrative computation of these values is given at the end of this list).
3. Next, based on the pseudo-release time (say 5 ms) of the new job of task T5 in the global queue, a search on the Time–Core Map table is done, and the job is accordingly forwarded to the local queue of either C1 or C3, depending upon the lesser CPU load/utilization of C1 or C3.
4. Based on the release time and execution time of the new job, we estimate whether the job can be allocated to core C1 or C3 before the start of the next
Fig. 3.3 Architectural model: two-level queues, global and local

instance of the existing task at core C1 or C3, honouring the deadline constraints. If the condition is satisfied, we assign the job from the local queue of C1 or C3.
5. The newly assigned job at a core is preempted when another job from the global queue with the highest priority (earliest deadline) is released, or when the existing task of that core starts its next instance (the chance of which is near zero, as we break the new task into a number of jobs to decrease the execution time of the jobs, which can then easily fit before the next instance of the existing task starts). The preempted newly assigned job is switched to the global queue, and the global queue is rearranged based on the EDF scheme.
6. Then again, mapping from the global queue to the Time–Core Map table occurs for the next job, to find a suitable core.
7. The global queue is also rearranged when (a) a new task arrives at the global queue and (b) a new instance of a task in the global queue starts.
8. The process continues till all the tasks get executed.
9. Each core has a private first-level (L1) cache, and a second-level (L2) cache is shared by all the cores. Two-level caches (L1 and L2) are available in many existing multicore architectures [20, 42].
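As referenced in step 2, a small illustrative computation of the Pfair pseudo-release times and pseudo-deadlines, under the reconstructed formulas r_i = ⌊(i − 1)/wt(T)⌋ and d_i = ⌈i/wt(T)⌉ − 1; this follows the standard Pfair definitions [43, 44] and is an assumption rather than the chapter's own code:

#include <math.h>
#include <stdio.h>

/* Pseudo-release r_i and pseudo-deadline d_i of the i-th of z jobs
   of a task with execution time E and period P, weight wt = E/P. */
int main(void)
{
    double E = 2.0, P = 10.0;   /* example task: wt = 0.2 */
    double wt = E / P;
    int z = 5;                  /* task divided into five jobs */
    for (int i = 1; i <= z; i++) {
        double r = floor((i - 1) / wt);
        double d = ceil(i / wt) - 1;
        printf("job %d: release = %g, deadline = %g\n", i, r, d);
    }
    return 0;
}

For the example weight wt = 0.2, job 1 gets release 0 and deadline 4, job 2 gets release 5 and deadline 9, and so on: the jobs are scattered evenly across the task's period, which is exactly what lets them fit into the idle times of different cores.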

3.3 Algorithm and Its Description

Algorithm Dynamic_Schedule
Input: (T, C, P, I, e, N, Gq, J, z, r, Et, cn, d, p, l)
T = initial task, C = core executing task T, P = period of task T, I = idle time of core, e = execution time of task T, N = number of new tasks, Gq = global queue, J = new job, z = number of jobs for each new task, r = release time of job J, Et = execution time of job J, cn = number of cores, d = deadline of job J, p = number of slots in global queue Gq for storing p tasks, l = number of units of the local queue for storing l jobs
Output: Assign job J into the system within its deadline

Step 1: /* Assign all the initial tasks (T) to all the cores (C), where number of initial tasks = number of cores */

Call Initial_assignment(T, C, cn) (a)

Step 2: /* For all cores, calculate idle time */

Call Calculate_idletime(e, P, I, cn) (b)

Step 3: /* Display periods corresponding to cores */

Call Period_core(P, cn) (c)

/* Arrival of New Tasks */

Step 4: /* N new tasks arrive and are stored in global queue Gq based on the availability of slots */

Call newtask_global(N, Gq, p) (d)

Step 5: /* Compare idle time–core table with release time r */

Call time_core_release(I, p, z, r) (e)

Step 6: /* Calculate the lesser CPU utilization among the cores whose idle time matches the release time */

Call less_cpu(l, C) (f)

Step 7: /* Job is assigned from the global queue to the local queue of the selected core with lesser CPU utilization */

Call Assigned_tolocalqueue(l, C, J) (g)

Step 8: /* Assign jobs from the local queue to the core with lesser CPU utilization */

Call Localqueue_tocore(r, Et, d, P, I) (h)



(a) Initial_assignment(T, C, cn)

for (i = 0; i < cn; i++)
    C[i] = T[i];                 /* one initial task per core */

(b) Calculate_idletime(e, P, I, cn)

/* P[cn][10]: first 10 period boundaries per task, e.g. {Pi, 2Pi, ..., 10Pi} for T1 */
/* e[cn][10]: corresponding completion times, e.g. {ei, Pi+ei, 2Pi+ei, ...} for T1  */
for (j = 0; j < cn; j++) {
    for (k = 0; k < 10; k++) {
        if (e[j][k] < P[j][k]) {
            I[j][k] = e[j][k];   /* store start of idle interval in I */
            Display I[j][k];     /* display idle time */
        }
    }
    Display j + 1;               /* display core number of the idle time */
}

(c) Period_core(P, cn)

/* P[cn][10]: first 10 period boundaries per task, as in (b) */
for (i = 0; i < cn; i++) {
    for (j = 0; j < 10; j++) {
        Display P[i][j]; } }

(d) newtask_global(N, Gq, p)

for (i = 0; i < p; i++) {
    if (Gq[i] == null)
        Gq[i] = N[i];            /* store new task in a free slot */
}

(e) time_core_release(I, p, z, r)

/* Assume current time = release time. */
/* Compare the two matrices I (idle times) and r (release times) to find a match. */
for (r1 = 0; r1 < p; r1++)
    for (c1 = 0; c1 < z; c1++)
        for (r2 = 0; r2 < z; r2++)
            for (c2 = 0; c2 < p; c2++)
                if (I[r2][c2] == r[r1][c1]) {
                    Display I[r2][c2];  /* idle time that matches the release time   */
                    Display r2 + 1;     /* idle core corresponding to that idle time */
                }

(f) less_cpu(l, C)

for (k = 0; k < l; k++) {
    if (C[k] != null)
        count++; }
Display count;  /* number of filled units in the local queue */
/* Repeat the process for all the selected idle cores;        */
/* the minimum-count core is selected as the one with lesser CPU utilization. */

(g) Assigned_tolocalqueue(l, C, J)

for (k = 0; k < l; k++) {
    if (C[k] == null && di >= current time)
        C[k] = Ji;  /* ith job Ji with deadline di            */
                    /* assigned to the less utilized core's queue */
}

(h) Localqueue_tocore(r, Et, d, P, I)

TotalTime = r + Et;

for (i = 0; i < 10; i++) {
    /* Check whether the job can complete before the next instance of the
       existing task starts, while also meeting the job's own deadline. */
    if (r == I[i] && TotalTime < P[i] && TotalTime < di)
        Then the job is allocated
    Else
        Break; }
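The EDF rearrangement of the global queue (working-principle steps 1, 5 and 7) is not expanded in the pseudocode above; a hedged C sketch of one way to realize it, where the Job type and function names are ours, is:

#include <stdio.h>
#include <stdlib.h>

typedef struct { int id; double deadline; } Job;

/* qsort comparator: earliest absolute deadline first (EDF). */
static int by_deadline(const void *a, const void *b)
{
    double da = ((const Job *)a)->deadline;
    double db = ((const Job *)b)->deadline;
    return (da > db) - (da < db);
}

int main(void)
{
    Job gq[3] = {{5, 12.0}, {6, 7.5}, {7, 9.0}}; /* example queue */
    /* Re-sort the global queue on every scheduling event. */
    qsort(gq, 3, sizeof gq[0], by_deadline);
    for (int i = 0; i < 3; i++)
        printf("T%d (d = %g)\n", gq[i].id, gq[i].deadline);
    return 0;
}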

3.4 Verification of the Proposed Algorithm

The algorithm is verified through simulation in a C programming environment. In our system, each core is assigned one task, and we take five cores as well as five tasks. The periodicity and execution time of each task are taken randomly, and 10 (ten) instances of the period and execution time are stored in an array to find the idle time of each core. The idle time and period of each core are also stored in an array.
We take 10 (ten) slots in the global queue, and therefore ten new tasks will be stored in the global queue according to the priority order of earliest deadline. We take the periodicity and execution time of the new tasks randomly, and each task is divided into 5 (five) jobs with execution time 5 times smaller and with the same periodicity. The release time and deadline of each job of each task are calculated and stored in the queue based on the Pfair scheduling formulas [43, 44]. Thereafter, the release time of each job is compared with the idle times of the cores to find out whether the job can be allocated to a core or not. If the release time of a job matches the same idle time of multiple cores, then we find the minimum utilized core. The minimum utilized core is found based on the utilization of the local queue of each core. Next, we allocate the job from the global queue to the local queue of the minimum utilized core. Lastly, we assign the job from the local queue to the corresponding core under the condition: (release time of job + execution time of job) < (start of the next instance of the existing task of the selected core, and deadline of the job).

3.5 Experimental Results

The following experimental results can be concluded:

(a) We can allocate almost all the new tasks to the cores if the periodicity of the new task is big, hence increasing the utilization of the system.
The proposed algorithm is designed to increase the throughput of the system by allocating the maximum number of jobs to the idle times of the cores and executing them within their deadlines and before the next instance of the existing task starts.
The algorithm uses the concept of the Pfair algorithm, which depends on the weight w = e/p, where e is the execution time of the job and p its period. If we increase the periodicity, keeping the execution time constant, the value of w decreases, and the formulas for release time and deadline yield scattered values. Scattered job release times and deadlines are easily fitted into the system. In our experimental result, 4 jobs found idle time at different cores when the period of the new task was 12 ms, but only one could be allocated because the others could not be fitted within their deadlines. We draw a graph with the experimental results, illustrating the variation in the number of jobs getting accommodated with the increase in the periodicity of tasks (Fig. 3.4).

Fig. 3.4 Taking a single new task, which is divided into five jobs with respective release times and deadlines. It reflects that a larger number of jobs get accommodated into cores when the period of the task is larger

We can draw similar graphs for all the new tasks with different period ranges and different execution times, maintaining the w value in such a way that almost all the jobs of the new tasks are accommodated into the system, as drawn in Fig. 3.4. Hence, from the experimental results, we can conclude that approximately all the random new tasks with different periods and execution times are accommodated in the system meeting the deadline criteria.
We also obtain the CPU utilization of the system based on the formula:

U = Σ (Ci / Ti)

where Ci is the computation time of the ith task and Ti is its period. We find that at a fixed time period, the CPU utilization of the proposed system is greater than 95 %. The CPU utilization can be increased further towards 100 % if we take the periodicity and execution time in such a way that the release times and deadlines of all the jobs are fitted into the system.
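A small sketch of the utilization formula above; the task parameters are illustrative only, not taken from the chapter's experiments:

#include <stdio.h>

/* Total utilization U = sum over all tasks of C_i / T_i. */
static double utilization(const double *C, const double *T, int n)
{
    double u = 0.0;
    for (int i = 0; i < n; i++)
        u += C[i] / T[i];
    return u;
}

int main(void)
{
    double C[] = {1.0, 2.0, 3.0};  /* computation times (ms), illustrative */
    double T[] = {4.0, 8.0, 12.0}; /* periods (ms), illustrative           */
    printf("U = %.1f%%\n", 100.0 * utilization(C, T, 3)); /* prints 75.0% */
    return 0;
}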
(b) We can allocate almost all the new tasks to the cores if the execution time of the new task is reduced, hence increasing the utilization of the system even more.
In our experimental result, 4 out of 5 jobs found idle time at different cores when the execution time of the new task was 4 ms (keeping the periodicity constant), but only one could be allocated at 10 ms execution time because the others could not be fitted within their deadlines, or their release times could not be fitted into the system. We draw a graph with the experimental results, illustrating the variation in the number of jobs getting accommodated with the reduction in the execution time of tasks (Fig. 3.5).
The experimental results show that more and more jobs get accommodated into the system if we reduce the execution time of the new task. Using the formula above, we calculate that the CPU utilization at a fixed time interval approaches 100 %.
From the experimental results, we find that our model achieves the maximum CPU utilization when we combine the above two scenarios, i.e. increasing the periodicity along with decreasing the execution time.

Fig. 3.5 Taking a single new task, which is divided into five jobs with respective release times and deadlines. It reflects that a larger number of jobs get accommodated into cores when the execution time of the task is reduced

Fig. 3.6 The sample job release time is 15 ms and execution time is 0.4 ms, as the task is divided into five jobs (execution time of the task is 2 ms, and execution time of each divided job is 0.4 ms). The job is easily fitted into core C1 as it gets completed before the next instance of C1 at 16 ms. If the task is divided into four jobs, the job will take 15.5 ms to complete, and if it is divided into 2 jobs, the job will not be fitted into the core as it will take 16 ms to complete, at which time the next instance of C1 will start. Therefore, it can be concluded that a job will fit easily into a core if the new task is divided into a larger number of jobs

(c) We can allocate almost all the new tasks to the cores if the task is divided into a large number of jobs, thus decreasing their execution times so that the jobs can easily be fitted before the next instance of the existing task.
The result achieved through simulation is shown in Fig. 3.6.

3.6 Conclusion

This paper presents a dynamic scheduling algorithm to accommodate random tasks generated at different time intervals with different periods and execution times. The proposed multicore task scheduling model uses EDF at the global queue of new tasks and Pfair for context switching and forwarding jobs to the local queues. The proposed algorithm has been verified through simulation, and the results show that our algorithm performs excellently in all the cases. Other important issues, such as load balancing among the processing cores, task migration, memory overheads and energy parameters, resource sharing and other performance parameters, will be considered in the different phases of this model's development.

References

1. Kopetz H (1997) Real-time systems: design principles for distributed embedded applications. Kluwer Academic, Norwell
2. Nakate MSS, Meshram BB, Chavan MJP (2012) New trends in real time operating systems. IOSR J Eng 2(4):883–892
3. Rutzig MB, Madruga F, Alves MA, Cota H, Beck ACS, Maillard N, Navaux POA, Carro L (2010) TLP and ILP exploitation through a reconfigurable multiprocessor system. IEEE 978-1-4244-6534-7/10
4. Luque C, Moreto M, Cazorla FJ, Gioiosa R, Buyuktosunoglu A, Valero M (2012) CPU accounting for multicore processors. In: IEEE transactions on computers, vol 61, no 2, Feb 2012
5. Ju M, Jung H, Che H (2014) A performance analysis methodology for multicore, multithreaded processors. In: IEEE transactions on computers, vol 63, no 2, Feb 2014
6. Singh AK, Shafique M, Kumar A, Henkel J (2013) Mapping on multi/many-core systems: survey of current and emerging trends. In: DAC '13, May 29–June 07 2013, Austin, TX
7. Vaidehi M, Gopalakrishnan Nair TR (2008) Multicore applications in real time systems. J Res Ind 1(1):30–35
8. Amalarethinam DIG, Mary GJJ (2011) A new DAG based dynamic task scheduling algorithm (DYTAS) for multiprocessor systems. Int J Comput Appl (0975–8887) 9(8), April 2011
9. Rammig F, Ditze M, Janacik P, Heimfarth T, Kerstan T, Oberthuer S, Stahl K (2009) Basic concepts of real time operating systems. Springer Science+Business Media B.V.
10. Mohammadi A, Akl SG (2005) Technical report no. 2005-499: scheduling algorithms for real-time systems. School of Computing, Queen's University, Kingston, Ontario, 15 July 2005
11. Lehoczky J, Sha L, Ding Y (1989) The rate monotonic scheduling algorithm: exact characterization and average case behavior. IEEE. CH2803-5/89/0000/0166
12. Buttazzo GC (2005) Rate monotonic versus EDF: judgment day. Real-time systems, vol 29. Springer Science+Business Media, Inc., Manufactured in The Netherlands, pp 5–26
13. Kato S, Takeda A, Yamasaki N (2008) Global rate-monotonic scheduling with priority promotion. IPSJ Trans Adv Comput Syst (ACS) 2(1):64–74
14. Zhang J, Fang X, Qi L (2014) LCM cycle based optimal scheduling in robotic cell with parallel workstations. In: 2014 IEEE international conference on robotics and automation (ICRA), 31 May 2014–7 June 2014
15. Wang G, Gong W, Kastner R (2008) Operation scheduling: algorithms and applications. In: Coussy P, Morawiec A (eds) High-level synthesis. Springer Science+Business Media B.V.
16. Mathiyalagan P, Dhepthie UR, Sivanandam SN (2010) Enhanced hybrid PSO–ACO algorithm for grid scheduling. ICTACT J Soft Comput (01), July 2010
17. Bertogna M, Baruah S (2009) Limited preemption EDF scheduling of sporadic task systems. In: IEEE transactions on industrial informatics, vol X, no X, Nov 2009
18. Herman JL, Kenna CJ, Mollison MS, Anderson JH, Johnson DM (2012) RTOS support for multicore mixed-criticality systems. In: RTAS '12: proceedings of the 2012 IEEE 18th real-time and embedded technology and applications symposium

19. Mollison MS, Anderson JH (2013) Bringing theory into practice: a userspace library for multicore real-time scheduling. In: IEEE real-time and embedded technology and applications symposium 2013, pp 283–292
20. Li Y, Suhendra V, Liang Y, Mitra T, Roychoudhury A (2009) Timing analysis of concurrent programs running on shared cache multi-cores. In: 2009 30th IEEE real-time systems symposium
21. Pacher M, Brinkschulte U (2011) Ordering of time-dependent tasks for self-organized resource scheduling. In: 2011 14th IEEE international symposium on object/component/service-oriented real-time distributed computing workshops
22. Holenderski M, Bril RJ, Lukkien JJ (2012) Parallel-task scheduling on multiple resources. In: 2012 24th IEEE Euromicro conference on real-time systems
23. Paolieri M, Quinones E, Cazorla FJ, Davis RI, Valero M (2011) IA^3: an interference aware allocation algorithm for multicore hard real-time systems. In: 2011 17th IEEE real-time and embedded technology and applications symposium
24. Holman P, Anderson JH. Implementing Pfairness on a symmetric multiprocessor. In: Proceedings of the 10th IEEE real-time and embedded technology and applications symposium (RTAS '04)
25. Park S (2014) Task-I/O co-scheduling for Pfair real-time scheduler in embedded multicore systems. In: 2014 IEEE international conference on embedded and ubiquitous computing
26. Jayaseelan R, Mitra T (2009) Temperature aware scheduling for embedded processors. In: 2009 IEEE 22nd international conference on VLSI design
27. Chen L, Boichat N, Mitra T (2011) Customized MPSoC synthesis for task sequence. In: 2011 IEEE 9th symposium on application specific processors (SASP)
28. Chen L, Marconi T, Mitra T (2012) Online scheduling for multi-core shared reconfigurable fabric. 978-3-9810801-8-6/DATE12 2012 EDAA
29. Ding H, Liang Y, Mitra T (2012) WCET-centric partial instruction cache locking. In: DAC 2012, Jun 3–7, San Francisco, California, USA
30. Pricopi M, Mitra T (2014) Task scheduling on adaptive multi-core. IEEE Trans Comput 63(10):2590–2603
31. De Giusti L, Chichizola F, Naiouf M, De Giusti A, Luque E (2010) Automatic mapping tasks to cores: evaluating AMTHA algorithm in multicore architectures. IJCSI Int J Comput Sci Issues 7(2)(1), Mar 2010
32. Aghazarian V, Ghorbannia A, Motlagh NG, Naeini MK (2011) RQSG-I: an optimized real time scheduling algorithm for tasks allocation in grid environments. IEEE 978-1-61284-486-2/11
33. Bamakhrama M, Stefanov T (2011) Hard-real-time scheduling of data-dependent tasks in embedded streaming applications. In: EMSOFT '11, Taipei, Taiwan, 9–14 Oct 2011
34. Kim J, Shin T, Ha S, Oh H (2011) Resource minimized static mapping and dynamic scheduling of SDF graphs. IEEE 978-1-4577-2122-9/11
35. Wu G, Li Y, Ren J, Lin C (2013) Partitioned fixed-priority real-time scheduling based on dependent task-split on multicore platform. In: 2013 12th IEEE international conference on trust, security and privacy in computing and communications
36. Qu WX, Fan XY, Liu Y, Yang H, Chen L (2010) Memory system prefetching for multi-core and multi-threading architecture. In: 2010 3rd international conference on advanced computer theory and engineering (ICACTE). IEEE
37. Peternier A, Ansaloni D, Bonetta D, Pautasso C, Binder W (2012) Hardware-aware thread scheduling: the case of asymmetric multicore processors. In: 2012 IEEE 18th international conference on parallel and distributed systems
38. Anane A, Aboulhamid EM, Savaria Y (2012) System modeling and multicore simulation using transactions. IEEE 978-1-4673-2297-3/12
39. Pericas M, Cristal A, Cazorla FJ, Gonzalez R, Veidenbaum A, Jimenez DA, Valero M (2008) A two-level load/store queue based on execution locality. In: International symposium on computer architecture, IEEE

40. Jayaseelan R, Mitra T (2009) A hybrid local-global approach for multi-core thermal management. In: ICCAD '09, 2–5 Nov 2009, San Jose, California, USA. Copyright 2009 ACM 978-1-60558-800-1/09/11
41. Lehoczky J, Sha L, Ding Y (1989) The rate monotonic scheduling algorithm: exact characterization and average case behavior. In: 1989 IEEE real-time systems symposium, pp 166–171
42. Suhendra V, Mitra T (2008) Exploring locking & partitioning for predictable shared caches on multi-cores. In: DAC 2008, 8–13 June 2008, Anaheim, California, USA. Copyright 2008 ACM 978-1-60558-115-6/08/0006
43. Levin G, Funk S, Sadowski C, Pye I, Brandt S (2010) DP-Fair: a simple model for understanding optimal multiprocessor scheduling. In: 2010 IEEE 22nd Euromicro conference on real-time systems
44. Anderson JH, Srinivasan A (2000) Pfair scheduling: beyond periodic task systems. IEEE 1530-1427/00
Chapter 4
Model-Based Approach for Shadow
Detection of Static Images

Manoj K. Sabnis and Manoj Kumar Shukla

Abstract The existing computer vision systems accept computer-based images for processing. These images can be processed differently depending upon the type of application. One such application, which maps to the domain of our work, is tracking, in which the image has a number of objects of which only the object of interest is mapped for tracking. Due to the light sources and surrounding light conditions, the object of interest, if opaque, creates a hard cast shadow on the background, while if the object is semi-opaque, it creates soft cast shadows on the background. These shadows resemble the shape of the object very closely. Thus, there are chances that they may be mistaken for the object, leading to the problem of false tracking. To avoid this problem, the shadows have to be detected in the computer vision system before they enter the processing unit. To add to this, computer vision systems cannot differentiate between an object and its shadow and thus pass both of them. So it was required, as a necessary and important stage, to incorporate a shadow detection and elimination stage between the computer vision and processing stages. This stage, which is a complete system on its own, has its own input stage, processing, output and evaluation stages. The scope of the work presented in the paper covers two methods, i.e. colour based and texture based, along with a justified evaluation of the results obtained by them.

Keywords Tracking · Shadow detection · Colour models and texture

M.K. Sabnis (✉)
CS&IT Department, JVWU, Jharna, Jaipur, Rajasthan, India
e-mail: manojsab67@yahoo.co.in
M.K. Shukla
Amity School of Engineering, Amity University Noida, Noida, India
e-mail: mkshukla001@gmail.com


4.1 Introduction

As computer vision systems are not capable of separating shadows from images, these shadows in general have to be separated from the objects before entering the processing unit. This is done through the shadow detection and elimination stage [1–4].
Shadow detection is a compulsory stage, but shadow elimination is optional, depending on the image type and its use.
Figures 4.1 (Image 1) and 4.2 (Image 2) represent objects along with their shadows having similar features in terms of shape and colour. In the case of tracking, this may confuse the tracking system into selecting the shadow instead of the object. This is called false tracking. It can also lead to miscounting of objects. In such types of applications, along with shadow detection, elimination is also important [5–7].
For Figs. 4.3 (Image 3) and 4.4 (Image 4), the shadows can be used to advantage after their detection. For counting applications, the shadows of Fig. 4.3 can be used, as the objects are not clearly detectable. For applications like object identification, the shadow of Fig. 4.4 can be used, as the object is not visible.

Fig. 4.1 Image 1

Fig. 4.2 Image 2

Fig. 4.3 Image 3



Fig. 4.4 Image 4

The shadow detection and elimination system has four stages. Stage one, the input stage, gives an idea of the type of images required for colour- and texture-based methods. It is a collection of these types of images.
The processing stage first evaluates the existing techniques in colour-based and texture-based methods and further suggests how these techniques can be used to develop an improved version of the algorithm.
The output stage gives the output image in which the shadow is detected. The percentage of shadows detected by a particular algorithm is then evaluated using qualitative and quantitative parameters in the evaluation stage.
The research work done in the domain of image processing for shadow detection using these two methods is presented in the remaining part of the paper in the form of a detailed explanation of each stage.

4.1.1 Shadows

As seen in Fig. 4.5 (shadow formation), the light coming from a single light source reaches the background partially, due to occlusion of light by the object. This dark region on the background is called the cast shadow. It has two regions: the central dark region without any light from the light source, called the umbra region, and the region between the umbra and the outer boundary, where there is a soft transition from dark to bright due to some ambient diffused light or the effect of other light sources, called the penumbra region. In other words, the umbra corresponds to the background area where the direct light is totally blocked by the foreground object, and the penumbra to the region where the light is partially blocked [8–11].

4.1.2 Shadow Types

There are a large number of shadow types, depending upon their type of formation, the properties they exhibit, or the viewpoint of their usage. But with shadow detection as the working domain, shadow types have been classified into a limited number of levels, which are represented in a hierarchical form [9].

Fig. 4.5 Shadow formation

The shadow type classification follows two levels of hierarchy. The first level classifies the image as either moving or stationary. This is referred to as image type or coarse-level classification [12].
The next level fine-tunes the selected object shadows into a number of different shadow classes based on their prominent observed properties, i.e. sharp and complete, partial, or attached. This type of classification is called property-based shadow classification [12].

4.1.2.1 Image Type Classification

In the case of videos, the objects and their shadows are on the move. Such shadows are called dynamic shadows. In the case of images, the object and its shadow are stationary. Such shadows are called static shadows [8].
This work mainly focuses on static images because a lot of research has already been done on dynamic images in the domain of object tracking and detection [7, 8].

4.1.2.2 Property-Wise Classification

The object, i.e. the occluder, comes in the direction of light and casts its shadow on the background. Depending on the properties, such as position, intensity and edges, which the shadow exhibits, shadows are classified as cast, self or attached [13, 14].
Occlusion of the light coming from the light source, due to the foreground object, causes cast shadows on the background. The occlusion of the light source can be either entire or partial, which reduces the light incident on the background region. Therefore, shadow points have lower luminance but similar chromaticity [15, 16].

The cast shadows, as shown in Fig. 4.6, are divided into two regions, the umbra and the penumbra. The complete blockage of direct incident light is referred to as the umbra region, and the one with partial blockage is referred to as the penumbra region [9, 10, 17].
The umbra, being the darkest part of the shadow, is easily seen and detected due to its distinct edges. If these edges are not detected, however, the shadow is detected as the object instead of the object of interest itself [18]. It is also referred to as the projected shadow, since the cast shadow is the projection of the object on the surface background [19].
When light falls on the object, and if the object is not of a regular shape, then the light does not reach some parts of the object due to occlusion of light by other parts of the object. This creates a dark region on the object itself, which is called self-shadow [12]. This is shown in Fig. 4.7.
Self-shadows do not create a problem because the object's outer boundaries are clearly detectable, which is the basic requirement of shadow detection [14, 17]. Compared to cast shadows, they have a higher brightness level as they receive more secondary lighting from surrounding illuminated objects [14]. Figure 4.8 represents attached shadow conditions [11, 17].
There are two views in the case of attached shadows: one in which the shadow is attached to the object, and a second in which the shadow is attached to another shadow. In the first case, it matters where the shadow is attached to the object. If it is attached only at one place and at the bottom side, then such images can be used to separate the shadows and their objects by geometrical methods.

Fig. 4.6 Cast shadow

Fig. 4.7 Self-shadow



Fig. 4.8 Attached shadows

Now for the second case: in a scene with multiple objects casting multiple shadows, depending upon the objects' nearness to each other, or due to multiple un-oriented light sources, the shadow of one object may fall on the shadow of another object. Such shadows are called attached shadows.

4.1.3 Shadow Taxonomy

There are many approaches proposed for shadow detection and its elimination. This is because these approaches depend upon shadow models, which vary depending on the conditions applied, i.e. from an elementary model with assumptions to a more practical one with realistic environmental conditions and minimal assumptions [10].
Also, the shadow model for one environmental condition becomes inapplicable when the environmental conditions change, making the algorithm unstable.
In 2003, Prati et al. reviewed all the shadow detection approaches and proposed a standard shadow detection method having limited algorithms. These algorithms, based on a two-level taxonomy, were known as the algorithm-based shadow detection taxonomy [10].
The first or primary level considers the decision-making ability of the algorithm to be certain or uncertain, thus defining two approaches: the statistical and the deterministic [10].
The second layer further classified the statistical methods as parametric or nonparametric, while the deterministic methods were classified as model-based or nonmodel-based approaches. This was called the primary level classification [10].
This model finalized the area of work for actual algorithmic development. However, for this, the information required is the image type and the prominent areas to be targeted within the image, respectively.
Another classification was also specified by Prati et al., called the secondary level classification, which was also hierarchical but with only a single level. This classification was basically feature based. These methods used features for

algorithm development and have been broadly classified into three working classes: spectral, spatial and temporal [19, 20].
Thus, the algorithms based on properties such as intensity and chromaticity map into the spectral domain. Geometry- and texture-based features map into the spatial domain, and their working areas are confined to frame based, region based within the selected frame, and pixel based within the selected region. Redundant information is reflected in the temporal domain, which can be used for further improvement [4].

4.1.4 Advantages of Shadows

Shadow detection in static images serves the application domain of 3D analysis of objects for the purposes of extraction, light source detection, object shape, illumination and occluder geometry [2, 10, 17].
In dynamic scenes, shadow detection can be used for change detection, scene matching, and understanding relative object position and size [1, 17].

4.1.5 Limitations of Shadows

(1) Cast shadows cause loss of information about the surface under the shadow, as this surface can be another shadow, the object itself, another object or the background. This presents difficulties in image interpretation, image matching, detection and other applications [8, 21].
(2) Attached shadows cause object merging, object shape distortion and object loss.
(3) Further, shadows usually degrade the visual quality of the image, leading to serious problems in segmentation and object extraction [13, 22].

4.2 Input Stage

A large number of different types of images are available. These images are rich in features such as intensity, colour, texture, geometry and orientation. Algorithms are developed based on the features that are prominent in an image so as to give maximum output results. To limit the number of algorithms, four standard classes of methods are developed for the four prominent features observable in an image [19].
This leads to the formation of four standard image classes which are method based. A standard demarcation is not present between these image classes, as some images can belong to more than one class. The image selection for a class also

depends on image parameters such as camera properties, illumination conditions, light directions, number of light sources, and amount of reflection and distortion [19].
With all such limitations and with a number of assumptions, the images are classified into four classes, namely:
Intensity-based method class: For this type of images, only intensity change is observed and is within a measurable limit. Greyscale or RGB images are present in this class. The images are indoor images with a single and uniform light source [2, 4].
Colour-based method class: Both colour and intensity changes are observed. The image types are RGB or HSV images of indoor scenes with multiple and changing light conditions, and all types of outdoor images [5, 6].
Geometry-based method class: For images to fall in this class, the condition that needs to be satisfied is that the object and its shadow are connected at the bottom edge and have a measurable orientation angle between them [7, 8].
Texture-based method class: When the colours of the background, the shadow and the object are almost the same, and the angle of orientation between the object and its shadow is not measurable, then such images are identified as the texture-based method class [9, 10].

4.3 Processing Stage

This is the main stage, which represents the bulk of the work done in the domain of shadow detection for colour- and texture-based methods.
As shown in Fig. 4.9, the processing module has two stages. The first is the preprocessing stage, which is common to both methods, for image enhancement and filtering. The second stage is the processing stage, which is divided into a number of blocks, where each block corresponds to the model it represents [5].

4.3.1 Colour-Based Method

The basic shadow model is acted upon by two types of light sources: direct light and ambient light. Direct light comes from a point source, and ambient light is environmental light from the reflections of surrounding surfaces. This source is

Fig. 4.9 Processing module



regarded as an area light. For the formation of the shadow area, the direct light is occluded totally or partially [7, 11–13].
The shadow model is represented as follows:

Ii = (ti cos θi · Ld + Le) Ri    (4.1)

Expanding Eq. 4.1,

Ii = ti cos θi · Ld · Ri + Le · Ri    (4.2)

Let ti cos θi = ki be the shadow coefficient of the ith pixel, and let r = Ld/Le,
where
r is the ratio between the direct light and the environmental light,
Ii the intensity value of the ith pixel, Ld the intensity of the direct light,
Le the intensity of the environmental light,
Ri the surface reflectance of the pixel,
θi the angle between the direct lighting direction and the surface normal, and
ti the attenuation factor of the direct light.
If ti = 1, the pixel points are in the sunshine (shadow-free) region, where

Ii(shadow-free) = (cos θi · Ld + Le) Ri    (4.3)

If ti = 0, the pixel point is in the shadow region, i.e. in the umbra, where

Ii(shadow, umbra) = Le · Ri    (4.4)

If 0 < ti < 1, the pixel points are in the penumbra region, where

Ii(shadow, penumbra) = (ti cos θi · Ld + Le) Ri    (4.5)

For a more realistic shadow model, the image is captured by a camera and the shadow model is represented as follows [8]:

F(x, y) = i(x, y) · r(x, y)    (4.6)

where
F(x, y) is the intensity of the pixel (x, y),
r(x, y) the reflectance component of the object surface, and
i(x, y) the illumination component of the object surface.
i(x, y) is computed as the amount of light power received by the object per surface area, and it is further expressed as follows:

i(x, y) = Ca + Cp · cos θ    (4.7)
58 M.K. Sabnis and M.K. Shukla

Equation 4.6 is for illuminated area, where Ca: intensity of ambient light, Cp:
intensity of light source and : Angle enclosed by light source direction and surface
normal.

ix y = Ca + tx y.cp.cos 4:8

Equation 4.7 is for the penumbra region.


t: transition inside the penumbra which depends on the light source and scene
geometry and the range is 0 t(x y) 1.

ix y = Ca 4:9

This denes the umbra region.


The intensity value is affected by camera-related problems such as gamma
correction, window aperture, shutter speed and lens distance.
The work carried out in the domain of shadow detection is represented as four
models for four different cases of image types.

4.3.1.1 Model Representation

The four models represent four types of image subclasses which can be evaluated under colour-based methods. These models are the brightness model, chromaticity model, brightness chromaticity model and nonlinearity model.
Brightness Model: It is also called the illumination model. It works for greyscale images which are indoor scenes having a uniform light source. This model represents an intensity-invariant colour space where only the intensity changes, whereas the colour information remains unchanged [14].
Chromaticity Model: It works on colour (RGB) images which are indoor scenes where only an intensity change is observed. It is not suitable for complex scenes [14].
Brightness Chromaticity Model: In the case of outdoor scenes, both intensity and colour information change. RGB images are not able to follow these changes in a linear manner, so the HSV, HSI and C1C2C3 colour models are used [14]. This model works under a number of assumptions, such as: the shadow is cast on a flat, nontextured surface; objects are uniformly coloured; only one light source illuminates the scene; the light source is strong and the scene well illuminated; the camera is static and accurate; and there are no sudden changes in the scene. Environmental changes and the noise level are well within the accepted limits [14].
Brightness Chromaticity Distortion (Nonlinearity) Model: The image class is that of complicated outdoor scenes where no assumptions are applicable to give a practical-conditions scenario. To model such a scene, a lot of corrections have to be applied, which makes the modelling complicated.

4.3.1.2 Algorithms

Existing algorithms used thresholding as the initial approach for static shadow detection [13]. Then, the method of automatic thresholding was used for dynamic images [11, 16, 18, 21]. The limitation of these methods was that they assumed an equal distribution of intensity levels within the image, which is not practical.
O'Gorman proposed a thresholding technique based on image connectivity. The image was thresholded at multiple intensity levels depending upon the calculated connectivity values. From this stable set, connectivity values were selected to give multiple intensities within the selected range [15, 17].
An improvement suggested over connectivity intensities was the use of regions. The region-based method had the advantage that the Euler number is locally countable and can be determined in a single raster scan of the image [15, 19].
The method which led to the idea of multiple-intensity-based thresholding was basically edge based, using Canny edge detection. The edges obtained from the image were put into three predefined classes: H (higher-level threshold), L (lower-level threshold) and M (medium-level threshold) [15, 20].
The edge-based method was further modified into a region-based one: using a thresholding method in which two threshold levels, H and L, are defined, the region above H is retained as objects, and the region below L or connected to L is rejected as shadows [15].
The suggested hysteresis methodology is further used to combine local and global information, where the global thresholding is image-wise and the local thresholding is region-wise [23]. This is used as the basis of the proposed algorithm.
Two algorithms are proposed: one with a single intensity for greyscale images and another for colour images, both using the thresholding concept.

Intensity-Based Algorithm

(1) The greyscale image is read along with its dimensions. The window size is defined as per the output window requirements.
(2) The average intensity of the image is calculated to get a single threshold value called the global threshold.
Using the image dimensions, all pixel values are added and their average is obtained. This average, if taken as the global threshold, will be lower than lighted objects, but the darker objects will fall below it and will not be detected. Thus, to separate the darker objects from shadows, sixty per cent of the average is added to the average to get the global threshold.
(3) A binary image of the input image is now obtained.
Logic 1 is entered for those pixels whose intensity is greater than the reference global threshold value, and logic 0 is entered for those pixels whose intensity value is less than the threshold value.
(4) Morphological operations such as closing, thickening and dilation are performed [24, 25].
The closing operation is done for noise removal; it expands the boundaries and closes in on the background. Thickening is performed for selective growing of the foreground. Then, foreground boundaries are expanded by dilation.
(5) The binary image so formed has logic 1 for objects and logic 0 for shadows and dark objects. To work with shadows, the image is inverted, so logic 1 now represents shadows and dark objects and logic 0 represents the objects.
(6) Region labelling is performed.
The image has objects and their shadows. With the shadow size so selected, regions are formed by the four-connected method and labelled. The binary labelled image with the number of labels is returned.
(7) Smaller regions are eliminated.
For the binary labelled image, a second threshold is found. This threshold value is the image size divided by the shadow size. Labelled areas smaller than this threshold are found, set to logic 0 and thus put in the background.
(8) Mask formation.
The binary image now has only the larger regions. The mask so formed has logic 1 for shadows and logic 0 for objects.
(9) Shadow region selection.
This binary mask is multiplied element-wise with the original image. For better resolution, the variable is taken as double float. By eliminating the image contents under the mask with logic 1, a shadow-free image is presented.
(10) Image representation.
For shadow representation, during mask comparison, shadow regions are filled with green colour and objects darker than the threshold are represented with red colour. A minimal code sketch of steps (2)–(9) is given below.
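The following is a minimal Python sketch of steps (2)–(9), assuming a greyscale NumPy array as input and using SciPy for morphology and labelling; the thickening step and the red/green visualization of step (10) are omitted, and the shadow_size parameter is an assumption.

import numpy as np
from scipy import ndimage

def detect_shadows_intensity(gray, shadow_size=100):
    # Step 2: global threshold = average intensity plus sixty per cent of it.
    avg = gray.mean()
    global_thresh = avg + 0.6 * avg
    # Step 3: binary image, logic 1 for pixels above the global threshold.
    binary = gray > global_thresh
    # Step 4: morphological clean-up (closing, then dilation).
    binary = ndimage.binary_dilation(ndimage.binary_closing(binary))
    # Step 5: invert so that logic 1 marks shadows and dark objects.
    candidates = ~binary
    # Step 6: four-connected region labelling.
    labels, n = ndimage.label(candidates)
    # Step 7: drop regions smaller than image size / shadow size.
    min_area = gray.size / shadow_size
    areas = ndimage.sum(candidates, labels, index=np.arange(1, n + 1))
    mask = np.isin(labels, np.flatnonzero(areas >= min_area) + 1)
    # Steps 8-9: mask (logic 1 = shadow) zeroes out the shadow content.
    shadow_free = gray.astype(float) * ~mask
    return mask, shadow_free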
The input image of Fig. 4.10, being greyscale, has a similar colour range for the object and its shadow. So the shadow in the output image is presented in a different colour, as shown in the output image of Fig. 4.11.

Fig. 4.10 Input



Fig. 4.11 Output

Colour-Based Algorithm

The reference algorithm on which the proposed algorithm is based transforms the basic RGB colour model into invariant colour models such as YCbCr, HCV, YIQ, HSI and HSV, where Y is the luma, Cb is the blue-difference chroma, Cr is the red-difference chroma, H is the hue content, C is the chroma and V is the intensity value.
The HSI model is selected because of the higher hue in the shadow region and the lower blue value, with a smaller difference between the green and blue values.
The proposed algorithm depends on a ratio map technique using thresholding.
The proposed algorithm is as follows:
(1) Input image selection.
RGB is converted into HSV, as RGB gives output only under limited environmental conditions, whereas HSV has a better capacity to adapt to environmental variations.
(2) Channel separation.
From the converted image, the hue, saturation and intensity value channels are separated.
(3) The ratio map is constructed for all pixels as follows:

R(x, y) = (H_e(x, y) + 1) / (I_e(x, y) + 1)   (4.10)

where
R(x, y): pixel at position (x, y) of R
H_e(x, y): pixel at position (x, y) of H_e
I_e(x, y): pixel at position (x, y) of I_e

(4) The ratio map is scaled to the range 0 to 255.
(5) The ratio map is then converted to a modified ratio map by applying an exponential function to it. This stretches the gap between the ratio values of shadow and nonshadow pixels:

R_new(x, y) = exp(R(x, y))   (4.11)

H_e(x, y) is modified to H_e(x, y) + 1, and I_e(x, y) is scaled in the range [0, 1]. This works satisfactorily at small and medium range but gives unsatisfactory shadow detection results at large range.
(6) Then, the global threshold value T is calculated for the images by using Otsu's thresholding method [23, 26], where

R_n = (R − min(R)) / (max(R) − min(R))   (4.12)

and, from the modified ratio map,

R_n,new = (R_new − min(R_new)) / (max(R_new) − min(R_new))   (4.13)

By using the graythresh function of MATLAB, two threshold levels are obtained: one for R_n and the other for R_n,new.
(7) Image conversion.
The image is now converted into a binary image.
(8) Edge detection.
This returns classified edges for the intensity value of the input image.
(9) Morphological operations.
Opening, thickening and dilation are done on the binary image.
(10) Gradient map.
The gradient map of the intensity value of the HSV image is obtained, and this is then available as the binary output.
(11) Shadow removal.
Three methods are defined for this: the additive method, the basic light model and the advanced light model. In this algorithm, the advanced light model is selected, as it is based on global and directed light as in the basic light model, but contains two types of light.
(12) Image representation.
The dark background and the shadow are represented in the output image. The shadow region is then removed so as to have a shadow-free image. A minimal code sketch of steps (1)–(6) is given below.
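Below is a minimal Python sketch of the ratio-map stages (steps 1–6), assuming scikit-image is available; combining the two Otsu-thresholded maps with a logical OR is an illustrative choice, not something prescribed by the text, and the later edge/morphology steps are omitted.

import numpy as np
from skimage.color import rgb2hsv
from skimage.filters import threshold_otsu

def shadow_mask_colour(rgb):
    hsv = rgb2hsv(rgb)                    # step 1: RGB to HSV
    h, v = hsv[..., 0], hsv[..., 2]       # step 2: hue and intensity channels
    ratio = (h + 1.0) / (v + 1.0)         # step 3: ratio map, Eq. (4.10)
    ratio_new = np.exp(ratio)             # step 5: stretch the shadow gap
    def normalise(r):                     # steps 4/6: Eqs. (4.12)-(4.13)
        return (r - r.min()) / (r.max() - r.min())
    rn, rn_new = normalise(ratio), normalise(ratio_new)
    t1, t2 = threshold_otsu(rn), threshold_otsu(rn_new)  # step 6: Otsu levels
    # Shadow pixels have a high hue-to-intensity ratio.
    return (rn > t1) | (rn_new > t2)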


These algorithms are applicable to the brightness model, chromaticity model and brightness chromaticity model. The nonlinearity model is a practical approach where corrections have to be applied for each assumption removed, which makes the algorithm design more complex.
Figure 4.12 represents a colour input image. The image has cast and self-shadow. The output image (Fig. 4.13) shows the darker region as shadow. As the edge of the object and the colour of the background are comparable, some part of the dark background is also detected as shadow.
Figure 4.14 is an input test image having very distinct colours of object, shadow and background. The output image (Fig. 4.15) shows clear shadow detection. This condition is satisfied because the threshold selections in the algorithm are perfectly suited to this input image.
The input image seen in Fig. 4.16 shows light and dark shadow regions on a green background. The output image shows the shadow clearly, as seen in Fig. 4.17.

Fig. 4.12 Input

Fig. 4.13 Output

Fig. 4.14 Input



Fig. 4.15 Output

Fig. 4.16 Input

Fig. 4.17 Output

Thus, it can be concluded that the output mainly depends on the type of input image and the selection of the threshold. Though the algorithm for greyscale is simple, it has a lot of limitations in its output representation.

4.3.2 Texture-Based Method

Texture is a very general idea that can be attributed to almost everything in nature. For a human, texture analysis was initially dependent on the look-and-feel concept, where texture related mostly to a specific spatially repetitive structure of surface points formed by repeating a particular element or several elements in different relative spatial positions [9].

The differences seen and felt by humans are difficult to define in a quantitative manner. This leads to defining texture in the form of features which can be measured [9].
This leads to the definition of informal qualitative structural features such as fineness, coarseness, smoothness, granularity, lineation, directionality, roughness, regularity and randomness [9].
Further, it is difficult to use human classification as a basis for a formal definition of image textures, because there is no obvious way of associating these features, which are easily perceived by human vision, with computational models that have the goal of describing the texture [9].
After several decades of research and development on texture analysis and synthesis, a variety of computational characteristics and properties have been defined for images [9].
Image texture can be defined as a natural textured surface or an artificially created visual pattern. For an image, the texture is considered as a texture pattern which covers the entire image, or a texture pattern within a region of the image which repeats itself in a definite sequence covering the entire image [9].
The texture pixels have some definite property variations, described as the local arrangement of the image signal in the spatial domain or in the domain of the Fourier or another spectral transform. This is further used to turn the texture image into a computational model.
The pixels have a relationship with their immediate neighbourhood pixels, which helps them to define a texture region with boundary pixels. This pixel set of a region as a whole has a relationship with the pixel sets of other regions, to identify identical repeated patterns.
For modelling texture patterns in an image, at pixel level, the texture-related property of the pixel which is most prominent is selected.
There are a number of such properties, but the CBIR standard has defined and accepted six texture-based properties: coarseness, contrast, degree of directionality, line likeness, regularity and roughness. These features are called Tamura's texture features [27–29].

4.3.2.1 Algorithm

The existing algorithms examined, on which the proposed algorithm is based, are as follows:
K. Emily Esther Rani suggested a solution for dynamic images by initially using a change detection mask and then a Canny edge map to separate shadow and shaded areas [7].
A. Leone and C. Distante proposed a texture-based solution using a feature extraction method, in which a texture-wise characterization of neighbourhood pixels is done by projecting them onto a set of Gabor functions [12].
The proposed algorithm is as follows:

(1) Read the input image and convert the image into greyscale. The image usually selected is HSV.
(2) Find the entropy of the image. In an image, many pixel changes take place. Entropy is the measure of this uncertainty of random variables, where the entropy E(k) is

E(k) = X + Y + Z   (4.14)

where

X = log A + log B
Y = log C − log A
Z = log D − log B

A: sum of all probability values from 1 to k
B: sum of all probability values from 1 to 256
C: sum of all nonzero probability values from 1 to k
D: sum of all nonzero probability values from 1 to 256

(3) If the calculated entropy value is less than zero, then entropy(k) = 0. For k = 1 to k = 256, calculate E(k) and select the greatest value of E(k) as the entropy of the image; then perform histogram equalization on the image to increase its contrast.
(4) Select a mask size m compatible with the image, take the positions of the two least entropy values, set those positions to 1 in the mask, and set the remaining mask positions to 0. Shadow pixels are those pixels with minimum variation and thus less entropy, i.e. 1 in the mask.
(5) Replace the pixels of the original image wherever the mask is 1, so as to remove the shadow pixels. A small code sketch of the entropy computation is given below.
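As a rough illustration of the entropy measure in steps (2)–(3), here is a minimal Python sketch; since the X/Y/Z decomposition above is only partially recoverable from the text, the standard Shannon entropy of the grey-level histogram is used as a stand-in.

import numpy as np

def image_entropy(gray):
    # Histogram of an 8-bit greyscale image and its probability estimates.
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]                       # keep nonzero probabilities only
    return -np.sum(p * np.log2(p))     # Shannon entropy in bits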
Figure 4.18 is the input image. This cannot be used with the colour method, as the colours of the object and background are almost the same. Similarly, it cannot be used with the geometry-based method, as the object and its shadow are not connected so as to have an angle of orientation. Thus, we resort to the texture-based method. All shadows are obtained by the texture algorithm, though our interest is only in the shadow of the dog. This is seen in Fig. 4.19.
As shown in Fig. 4.20, the object has illumination almost similar to the background, and even though the object and its shadow are connected, they do not have a single orientation angle, so we have to resort to texture-based techniques. The output is shown in Fig. 4.21.
As the shadow is more prominent in the input image (Fig. 4.22), advantage is taken of this. The basic principle of texture analysis is used here. The texture of the

Fig. 4.18 Input

Fig. 4.19 Output

Fig. 4.20 Input

Fig. 4.21 Output



Fig. 4.22 Input

Fig. 4.23 Output

background does not change due to a shadow falling on it, so background with an object gives a texture change, while background with shadow gives no texture change; the latter is therefore detected as the shadow region, as represented in Fig. 4.23.

4.4 Output Stage

A visual representation of the output image is available. This output image is then evaluated on qualitative and quantitative bases for its justification.

4.5 Evaluation Stage

Both shadow methods give results under different conditions and types of images. Thus, they cannot be compared on a direct basis. The types of errors found common to all these methods are shadow detection failure (SDF) and object detection failure (ODF).
At the shadow edges, the pixel information closely maps to the object pixels; therefore, these pixels are misclassified as shadow pixels. This is called shadow detection failure. Similarly, in the case of some dark regions of the objects, the pixels are detected as shadows. This is called object detection failure.
Now, to measure and minimize these failures, some evaluation techniques are suggested at the qualitative and quantitative levels.

4.5.1 Qualitative Evaluation

The metrics defined for qualitative evaluation are as follows [30]: robustness to noise, object independence, scene independence, computational load, flexibility to shadow strength, width and shape, and detection of indirect cast shadows and penumbra. Table 4.1 explains what the state of these metrics should be for faithful shadow detection.

4.5.2 Quantitative Evaluation

Three important parameters of measurement defined at a theoretical level are good detection, good discrimination and good localization [31].
Good detection concerns the probability of misclassifying a shadow point. This value has to be as low as possible. Its aim is to minimize the false negatives (FN), i.e. shadow points being classified as foreground or background [7, 30].
Good discrimination is the probability of nonshadow points being classified as shadow points. This value should also be low. It is also referred to as the false alarm rate and is represented by the false positives (FP) [7, 30].
Good localization indicates the nearness of the marked shadow points to the real shadow point positions [5].
For these three measures, two metrics are defined: the detection rate (DR) and the false alarm rate (FAR) [7]. The detection rate is often referred to as the true positive rate, known as recall in the classification literature.
The detection rate (DR) is as follows:

DR = TP / (TP + FN)   (4.15)

Table 4.1 Qualitative evaluation

Object independence: The object should depend on its shadow to form a BLOB, but it should be independent of the background for background subtraction.
Scene independence: As the scene moves far away from its environmental conditions, it loses its reality.
Computational load: The computational load can be reduced by avoiding repeated scanning.
Flexibility to shadow strength: This should be high, to accommodate changing environmental conditions.
Width: The window size should be adjustable as per requirements.
Shape: The shape of the shadow should map to some object for its detection.
Indirect cast shadow detection: Indirect cast shadows do not create any problem as long as they do not cross the cast shadow boundaries.
Penumbra detection: If the edges of the cast shadow are detected properly, then penumbra detection does not create any problems.

The false alarm rate (FAR) is one minus the precision of classification theory [7]:

FAR = FP / (TP + FP)   (4.16)

where FN (false negative): shadow points classified as background or foreground; TP (true positive): shadow points classified as shadows; FP (false positive): foreground or background points detected as shadow points.
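A minimal Python sketch of Eqs. (4.15) and (4.16), assuming boolean NumPy masks of the same shape for the detected and ground-truth shadow pixels (and at least one detected or true shadow pixel, so the denominators are nonzero):

import numpy as np

def shadow_metrics(detected, truth):
    tp = np.sum(detected & truth)    # shadow points classified as shadow
    fn = np.sum(~detected & truth)   # shadow points missed
    fp = np.sum(detected & ~truth)   # nonshadow points flagged as shadow
    dr = tp / (tp + fn)              # detection rate, Eq. (4.15)
    far = fp / (tp + fp)             # false alarm rate, Eq. (4.16)
    return dr, far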

4.5.2.1 Colour-Based Method

Table 4.2 presents the detection rate and false alarm rate for the different images taken.
For a greyscale image with similarly coloured object and background, the detection rate is not that high, and more false pixels are detected as shadow pixels (Fig. 4.11). Due to the large difference in colour between shadow and image, the detection rate is high (Figs. 4.13, 4.15 and 4.17).

4.5.2.2 Texture-Based Method

Table 4.3 represents the detection rate and false alarm rate for different images
tested under this method.

Table 4.2 Evaluation of the colour-based method

Output image   Detection rate   False alarm rate
Figure 4.11    0.751            0.14826
Figure 4.13    0.58033          0.0595
Figure 4.15    0.8696           0.18142
Figure 4.17    0.94698          0.91152

Table 4.3 Evaluation of the texture-based method

Output image   Detection rate   False alarm rate
Figure 4.19    0.95             0.2472
Figure 4.21    1                0.2864
Figure 4.23    1                0.14

If the background is plain or its texture is uniform, then shadow detection by the texture method gives accurate results, as seen in Figs. 4.19, 4.21 and 4.23. But this method is best suited to a single object-and-shadow pair. This is so because BLOBs are formed, as the algorithm works on the single criterion of texture detection.

4.6 Conclusion

It can be concluded that no single method can be used to get accurate results. Hierarchical methods are recommended so that the accuracy can be improved. The selection of the methods and their sequence of implementation again depends on the individual type of image.
The intensity method can be used for greyscale or colour (RGB) images of simple scenes with a uniform, single light source.
In the case of multiple objects, even with a single light source, there are multiple reflections. The colour model captures both intensity and colour variations. In this case, thresholding is preferred, as the threshold value selection depends entirely on the algorithm developer within the domain of image types.
When none of these conditions are met in an image, the texture method is the only solution available.

4.7 Future Scope

As this field is very vast, the scope of this work is limited to only two methods. The scope can be extended beyond these two methods to include the intensity- and geometry-based methods so as to cover images of all types.
In the case of thresholding, a maximum of two thresholds with fixed values is recommended. The values of these two thresholds can be made adaptive to the image requirements within a specific range.
In the case of textures, the direct analysis of images using CBIR texture properties makes the computation very complicated, and only limited texture-related image processing methods are available.

References

1. Chen T-M, Wang W-X (2006) Image shadow detection and classification based on rough sets. In: Proceedings of the fifth international conference on machine learning and cybernetics, Dalian, 1-4244-0060-0/60/, IEEE, pp 3891–3896, 13–16 Aug 2006
2. Blajovici C, Kiss PJ, Bonus Z, Varga L (2011) Shadow detection and removal from a single image. In: 19th summer school on image processing, Szeged, Hungary, pp 1–6, 7–16 July
3. Gudmundsson K, Benediktsson JA, Cagatin FS (2008) Shadow extraction, seventh Euro-American workshop on information optics. J Phys Conf Ser 139(012032):1–7. IOP Publishing Ltd
4. Lakshmi S, Sankaranarayanan V (2010) Cast shadow detection and removal in a realtime environment. IEEE, pp 245–247. ISBN: 978-1-4244-9008-0/10/
5. Xu L, Qi F, Jiang R, Hao Y, Wu G, Shadow detection and removal in real images: a survey, CVLAB, Shanghai Jiao Tong University, P.R. China
6. Sanin A, Sanderson C, Lovell BC (2012) Shadow detection: a survey and comparative evaluation of recent methods. Pattern Recognition, Elsevier, vol 45, no 4, pp 1684–1695. ISSN: 0031-3203
7. Withagen PJ, Groen FCA, Schutte K (2007) Shadow detection using a physical basis, intelligent autonomous systems technical report, pp 1–14
8. Horprasert T, Harwood D, Davis L (2008) A statistical approach for real-time robust background subtraction and shadow detection, Computer Vision Lab, University of Maryland, pp 1–19
9. Madsen CB, Moeslund TB, Pal A, Balasubramanian S (2009) Shadow detection in dynamic scenes using dense stereo information and an outdoor illumination model, Computer Vision and Media Technology Lab, Aalborg University, Denmark, pp 110–125
10. Rani EE, Jemilda G (2011) Shadow detection in single still image using TAM based multi-step algorithm, IJCST, vol 2, issue 4, pp 496–500, Oct–Dec 2011. ISSN (Online): 0976-8491, ISSN (Print): 2229-4333
11. Moving cast shadow detection, vision systems: segmentation and pattern recognition, pp 47–58 (2011)
12. Onoguchi K (1998) Shadow elimination method for moving object detection. In: Proceedings of the fourteenth international conference on pattern recognition 1:583–587
13. https://www.cs.auckland.ac.nz/courses/compsci708s1c/lectures/Glect-html/topic4c708FSC.htm
14. Howorth P, Ruger S, Evaluation of texture features for content-based image retrieval. Department of Computing, Imperial College London, South Kensington Campus, London, SW7 2AZ
15. Guo R, Dai Q, Hoiem D, Single image shadow detection and removal using paired regions, supported in part by the National Science Foundation under IIS-0904209, pp 2033–2038
16. Hierarchical static shadow detection method, US 7,970,168 B1
17. Rosin PL, Ellis T, Image difference threshold strategies and shadow detection, Institute for Remote Sensing Applications, Joint Research Centre, Italy
18. Jyothirmai MSV, Srinivas K, Srinivasa Rao V (2012) Enhancing shadow area using RGB colour space. IOSR J Comput Eng 2(1):24–28, July–Aug 2012. ISSN: 2278-0661
19. Bichsel M (1994) Segmenting simple connected moving objects in a static scene. IEEE Trans PAMI 16:1138–1142
20. O'Gorman L (1994) Binarization and multi-thresholding of document images using connectivity, symposium on document analysis and information retrieval, pp 237–252
21. Leone A, Distante C (2006) Shadow detection for moving objects based on texture analysis. Pattern Recognition, Elsevier 40(2007):1222–1233, Sept 2006. ISSN: 0031-3203
22. Otsu N (1979) A threshold selection method from grey-level histograms. IEEE Trans Syst Man Cybern SMC-9(1):62–69, Jan 1979
23. Gray SB (1971) Local properties of binary images in two dimensions. IEEE Trans Comput 20:551–561
24. Morphological image processing. https://www.cs.auckland.ac.nz/courses/compsci773s1c/lectures/ImageProcessing-html/topic4.htm
25. Morphology fundamentals: dilation and erosion. http://in.mathworks.com/help/images/morphology-fundamentals-dilation-and-erosion.html
26. Canny J (1986) A computational approach to edge detection. IEEE Trans PAMI 8:679–698
27. Tuceryan M, Jain AK (1998) Texture analysis, the handbook of pattern recognition and computer vision (2nd edn). World Scientific Publishing Co, pp 207–248
28. Srinivasan GN, Shobha G (2008) Statistical texture analysis. In: Proceedings of the world academy of science, engineering and technology, vol 36, pp 1264–1269, Dec 2008. ISSN: 2070-3747
29. Materka A, Strzelecki M (1998) Texture analysis methods: a review, Technical University of Lodz, Institute of Electronics, COST B11 report, Brussels
30. Wang SKE, Qin BO, Fan Z-H, Ma Z-S (2007) Fast shadow detection according to the moving region. In: Proceedings of the 6th international conference on machine learning and cybernetics, Hong Kong, IEEE 1-4244-0973-X/07, pp 1590–1595
31. Chung KL, Lin YR, Huang YH (2009) Efficient shadow detection of colour aerial images based on a successive thresholding scheme, transactions on geoscience and remote sensing, 0196-2892, vol 42, no 2. IEEE, pp 671–682
Chapter 5
Light Fidelity (Li-Fi): In Mobile
Communication and Ubiquitous
Computing Applications

Nitin Vijaykumar Swami, Narayan Balaji Sirsat


and Prabhakar Ramesh Holambe

Abstract With the increasing volume of big data during the last few years, the problem of data traffic has arisen; moreover, with the pervasive nature of smartphones, the number of end-user smartphones has overtaken the entire population of the planet. The emerging technology of Li-Fi has been found to be the best solution to these problems. It works on the concept of visible light communication (VLC) and also offers many solutions to reduce cellular infrastructure traffic and its needs. Not only mobile communication but also ubiquitous computing is adding more load to the RF spectrum. In this paper, we cover the concepts of Li-Fi: how Li-Fi technology can be applied in mobile communication, how it works, the Li-Fi cellular network, some ubiquitous computing applications, common misconceptions about Li-Fi, Li-Fi in solar cells and the Internet of Things (IoT).

Keywords: Li-Fi, Li-Fi AP, LED, VLC, IoT, Ubiquitous computing

5.1 Introduction

According to ITU statistics, mobile and Internet use will reach around 24 exabytes of data per month, which is more than 30 times the size of the entire global Internet in 2000 [1]. Also, different computing devices like laptops, tablets and sensors are adding more load to the existing infrastructure, making it

N.V. Swami (✉) · N.B. Sirsat


Shri Yogeshwari Polytechnic, Ambajogai, Beed, India
e-mail: nitinswami59@gmail.com
N.B. Sirsat
e-mail: narayansirsat45@gmail.com
P.R. Holambe
College of Engineering, Pune, India
e-mail: prabhakar.holambe2010@gmail.com


complicated. Surely, these technologies will occupy the electromagnetic spectrum band in the coming decades.
The German physicist Harald Haas proposed Li-Fi, a technology which is bidirectional, high speed and fully networked like Wi-Fi, using the light flickering concept [2]. The simple concept of light flicker is used, in which the light ON condition indicates binary one and light OFF indicates binary zero. As the speed of light is 1080 million km/h, Li-Fi achieved a speed of more than 10 Gbps in recent experiments, theoretically allowing HD films to be downloaded in only 30 s [3]. The visible light spectrum comprises hundreds of THz of free bandwidth, which is 10,000 times more than the entire RF spectrum [4]. This will offload all the limitations for the next generations of mobile communication.
The Li-Fi network uses LEDs to generate the data streams and photodetectors to receive them. The small 5 mW bulbs developed by the University of Strathclyde, Glasgow, use red, green and blue elements to transmit three 3.5 Gbps streams, which together exceed a speed of 10 Gbps [5].

5.2 The Working Principle of Li-Fi

Li-Fi works on the principle of visible light communication, which provides a much wider licence-free visible light spectrum to the service provider. Continuous data streams are generated by flickering an LED at the transmitter side at a high rate. This flickering is imperceptible to the human eye. The PCB controls the electrical inputs and outputs of the lamp and houses the microcontroller used to manage different lamp functions [6].
Voltage regulator and level shifter circuits are used on both sides to regulate the flickering [7]. Finally, at the receiver side, photodetectors are used to sense the light and to convert it into the corresponding pulses. These pulses are then amplified and processed to recover the original data stream. The merging of illumination with wireless communication provides a measurable reduction in both infrastructure complexity and energy consumption [8]. As lighting is used in indoor environments even during the daytime, the energy used for communication would practically be zero (Fig. 5.1).
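To make the ON/OFF idea concrete, here is a minimal Python sketch of on-off keying over an idealized noiseless channel; real Li-Fi systems use far more sophisticated modulation, so this is purely illustrative.

def ook_modulate(bits):
    # Each bit maps directly to an LED state for one symbol period.
    return [1 if b else 0 for b in bits]

def ook_demodulate(samples, threshold=0.5):
    # The photodetector output is thresholded back into bits.
    return [1 if s > threshold else 0 for s in samples]

data = [1, 0, 1, 1, 0]
assert ook_demodulate(ook_modulate(data)) == data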

5.3 Li-Fi Transmitter: LED

It was estimated that in 2014, LEDs would secure 24 % of the lighting market. The invention of the blue LED changed the view of light sources. In Li-Fi, an LED is used as a data transmitter that generates thousands of data streams, as compared to an IR LED [9]. The data are transmitted from an RS-232 cable to the power line and then to the LED [10]. This transmission is not limited to the line-of-sight direction. We can use a single LED or an array of LEDs to transmit the data. The potential of these LEDs can be controlled using a microcontroller and can be adjusted by some luminaire design optimization technique. We can use different coloured LEDs for different intensity channels. Experiments with various dimmed intensities observed at the Li-Fi R&D centre [9] show satisfactory reports for energy saving. Innovations like the Nobel-prize-winning blue LED add more advantages for Li-Fi by increasing the LED lifetime to 100,000 h at a cheaper cost [11].

Fig. 5.1 Working principle of Li-Fi [8]

5.4 Li-Fi Receiver: Photodiode

Avalanche photodiodes have been found to be the better receiver in Li-Fi communication. They read the flickering patterns of the LED and interpret them as data. Prof. Harald Haas has shown the first receiver chip for Li-Fi with integrated avalanche photodiodes on CMOS [9], a 7.8 mm² IC that houses 49 photodiodes [12]. When compared to a Wi-Fi antenna, it is just 13 % of its length.
To make full-duplex communication possible, the mobile terminals use infrared (IR) transmission to talk with the nearest optical AP. Various experiments combining both emitter and photodiode properties are in progress [10]. Some companies, like SunPartner Technologies and the French technology company Oledcomm, have succeeded in developing sensors that both download data and charge the battery using those light rays [13].

5.5 The Li-Fi Cellular Network with Attocell

The visible light coverage area forms a cell in the Li-Fi cellular network. It can be very small in size, depending on the potential of the LED source used. These cells are called optical attocells [14]. The decrease in cell size can significantly increase cellular capacity and user data rates [15]. The optical attocells can cover an area of 1–10 m² and distances of about 3 m [14].
The size of an attocell is very small, and walls prevent the system from suffering from co-channel interference between rooms [4]. A degree of security is also maintained, as compared to Wi-Fi. The only issue is that the rate of hand-off will increase due to the small size of the attocells. As Li-Fi works on VLC, no heavy setup like an RF network is required, so we can use the Internet and make mobile phone calls without any additional international roaming rates.
Li-Fi attocells can be deployed as part of a heterogeneous VLC-RF network. They do not cause any additional interference to RF macro- and picocells, allowing the system to hand off users between the RF and Li-Fi subnetworks [4]. In a Li-Fi cellular network, the indoor attocells are surrounded by a microcellular network [14]. At a glance, the data coverage in a room contains many attocells, forming a very dense cellular attocell network.

5.6 Hand-off in Hybrid Network (Li-Fi Plus RF Network)

The Li-Fi attocells and the RF macro- and picocells together form a wide, unregulated hybrid network in which the mobile users are served by both technologies. Due to this hybrid nature, the hand-off falls into four categories: Li-Fi to RF, Li-Fi to Li-Fi, RF to RF and RF to Li-Fi [16]. As studied in [16], both mobile users and fixed users are served by the hybrid network. As an attocell in indoor navigation has a small size, bandwidth reuse is possible, which results in high spatial spectral efficiency provided to the users [17]. On handover, the channel state information (CSI) must be provided to the central unit (CU), which is monitoring the system continuously. Continuous handover takes place in the case of mobile users, and the signaling information is exchanged between the users and the CU. This process takes an average time ranging from around 30 ms to 3000 ms, depending on the algorithm used. Sometimes the light beams are blocked by some obstacle; at such times, the RF network provides the coverage, that is, users having low optical CSI are served by RF APs. A data rate threshold is used to identify whether a user is served by a Li-Fi AP or an RF AP [16].
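As a rough illustration of the data rate threshold rule just described, here is a minimal Python sketch; the 50 Mbps threshold is a hypothetical value, not a figure from the text.

def choose_access_point(lifi_rate_mbps, threshold_mbps=50.0):
    # Users whose optical link sustains the threshold stay on Li-Fi; blocked
    # or weak optical links (low optical CSI) fall back to the RF AP.
    return "Li-Fi AP" if lifi_rate_mbps >= threshold_mbps else "RF AP"

print(choose_access_point(120.0))  # Li-Fi AP
print(choose_access_point(5.0))    # RF AP (e.g. light beam blocked)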

5.7 Ubiquitous Computing Applications of Li-Fi

The combination of ubiquitous and smart computing makes things live in the environment. These things can talk, interact and operate with each other to make the environment more sustainable and smarter. In this section, we are going to see how effectively Li-Fi can be deployed in ubiquitous computing.
(a) Smart Homes:
Each home appliance can be enhanced with a Li-Fi LED. It can be your microwave oven cooking food for you; a vacuum cleaner cleaning the room carpet by checking the status of the dust collected; a washing machine, fridge, thermos, clock, chair or fan with an LED; or a smart LED TV playing your favourite channels by identifying your mood. Together, it all makes up the social web of things in your home.
Many companies like Ericsson, Luminous, Revolv, Philips Hue, Staples Connect and Apple are participating to stay on top of the developments in smart home appliances [11].
(b) Dense Areas:
One of the achievements of this technology is that it works where most Wi-Fi technology fails. As we know, in congested, dense areas where more disturbances, such as construction materials, are present, Wi-Fi is not a suitable option. Instead, Li-Fi serves in a better way through smart lighting and frequency reuse. This application is possible in hotels, congested residential areas, etc.
(c) Ubiquitous Healthcare:
This is the delivery of health-related services and information via ubiquitous computing technologies to your family doctor [18]. It can be achieved by using wearable Li-Fi transmitters. These can be rings, earrings, wrist watches, jewellery, etc., allowing the monitoring of your health and also adding more beauty to your personality; for example, your shirt buttons measuring your heart rate. The information is sent over to the smart applications of the family doctor through the Internet. A fixed rate of data packets can be decided for these data transfers, such as 30 packets per hour.
(d) Learning Environment:
By allowing each classroom light bulb to work as a Li-Fi AP, we can make a smart classroom in which all students have access to a variety of digital devices, like PDAs, laptops, cell phones, headsets, wireless modems and webcams, and services, whenever and wherever they need them. Both teachers and students can be active participants in the learning process, both collaboratively and individually, through the one-to-one, one-to-many or many-to-many communication facilities of the learning environment.

(e) Indoor Navigation:
A Li-Fi AP can be fitted to each indoor light, as in shopping malls, cinema theatres, government offices, workstations and museums, using the existing power line infrastructure [9]. Efficient intensity modulation, making the light visually off, also helps Li-Fi to be used in the daytime.
(f) Intelligent Traffic Control System:
Several research efforts have been made towards vehicle-to-vehicle communication using different techniques, such as vehicular ad hoc network (VANET) technology based on Wi-Fi, Wi-Max and DSRC technologies in addition to 3G networks [19, 20], or using a Programmable Interface Controller (PIC) sonar which sends 40 kHz short pulses of sound [21]. With Li-Fi, each headlight and backlight can be turned into a Li-Fi transmitter, developing an intelligent transport system where cars can talk to each other and also to the traffic signal lights and street lights, providing the statistics and positions of vehicles [22]. By doing this, we can reduce the number of accidents and make an intelligent traffic control system (Fig. 5.2).
(g) Smart Lighting:
(g) Smart Lighting:
The cellular towers require heavy equipment and a large space for their construction. More than 35,000 towers across 18 states of the union of India have been developed by Bharti Airtel [23]. These counts definitely reach some tens of lakhs when considering the other telecom companies and all other countries. Li-Fi technology will reduce the entire cellular infrastructure through smart lighting using the

Fig. 5.2 Li-Fi in trafc control system [27]



existing power line infrastructure. So it can be effectively fitted to street lamps serving multiple users, and multiple users can also talk to street lamps.
(h) Indoor Positioning System (IPS):
An indoor positioning system is a solution for locating objects or people inside a building using sensory information collected by mobile devices [24]. With the very small size of the attocell, LED object tags and LED appliances, we can easily locate and identify objects in indoor locations. The Li-Fi IPS overcomes the problem of signal attenuation caused by construction materials found in the GPS system.
(i) Location-Based Services:
As each thing in Li-Fi technology is equipped with a Li-Fi transmitter and receiver, it is easy to track things in the environment. Each transmitter is uniquely identified, so the location of any VLC device can be identified quickly and accurately. Pet tracking is the best example of this service.
(j) Augmented Reality:
Li-Fi can also be enhanced to support other technologies. As light is present everywhere, the Li-Fi transmitter can be used as a part of augmented reality, delivering statistical information about subjects. For example, when Alice goes to a museum and holds her computing device in front of an object, it will show its information; similarly, when we go to a shop in a big shopping mall, it will show the stock, offers and shopping details on our mobile by identifying Li-Fi tags.
(k) Advertising through Li-Fi:
Imagine you are walking through the streets and, at each street lamp, your mobile device shows you an ad. Your shop's door light shows you the current offers of that shop, what new products are available, etc. This will create paper-free advertising, helping to save trees.
(l) EMI-Sensitive Environments:
The Li-Fi system can be used to serve data in EMI-sensitive environments, such as aircraft and radioactive research laboratories. In aircraft, this will provide continuous data access to the passengers and also reduce the complicated cabling weight.
(m) Cellular Communication:
Cellular communication with the Li-Fi system will surely offload the major burdens of establishing base stations: installation, power supply arrangements, space arrangements, spectrum limitations, etc. The costs of these burdens are reduced effectively by using the existing power supply. With the help of Li-Fi, the distance between cells will be reduced to a few hundred metres. As discussed earlier, street lamps will provide both illumination during the night and high-speed data communication 24/7.

5.8 Common Misconceptions About Li-Fi

The common misconceptions raised about Li-Fi are as follows:
(a) Lights cannot be dimmed:
Prof. Harald Haas has shown that dimmed-intensity lights have no adverse effect. For better performance, different intensity modulation techniques are used, which make the light visually off.
(b) The lights flicker:
The flickering rate of a Li-Fi light source is so high that it cannot be recognized by the human eye. The human eye can recognize only low-rate flickering, around 120-150 Hz.
(c) This is for downlink only:
For the uplink, either a low-intensity LED or an IR LED is used to complete the Li-Fi system. It can be fitted in the existing mobile camera flash LED.
(d) There will be interference from sunlight:
Sunlight is the biggest source of energy, which can be intelligently utilized by the system using various energy harvesting techniques to strengthen the Li-Fi system. Also, programmable photodiodes can be manufactured to sense the different coloured intensity lights. As discussed above, the receivers are only interested in the flickering of lights, and sunlight does not flicker, so sunlight is not a bottleneck at all. Instead, the Li-Fi solar cell will produce a big energy harvesting project.
(e) Lights need to be ON, so this is insufficient:
As compared to the cellular network, we use only the existing power lines, so no special arrangement is required to deploy a Li-Fi system. We need to just update the light sources and the end-user equipment. So much more energy will be saved.
(f) This is a line-of-sight technology:
The experiments at the R&D centre of pureLiFi, and also Professor Haas in his talk at Edinburgh University, have shown that the Li-Fi system works beyond the line-of-sight direction. As the receiver can sense even small changes in light rays, it is not limited to line-of-sight only.

5.9 Li-Fi in Solar Cell

The latest invention of Professor Harald Haas's team has made it possible not only to transfer a high-speed stream of data with the help of a solar cell, but also to provide power for the system to run independently. This makes for a best-case scenario of power saving by equipping each electronic object with a self-powered Li-Fi solar cell. Such a solar cell can act as a Li-Fi receiver and a solar cell at the same time. A demonstration of this experiment was given by Professor Haas at the TED Global 2015 event in London. He also stated that over four billion people in the world still do not have access to the Internet, but now, with a little energy infrastructure using solar energy, this situation can be changed in developing countries.
They are focusing on the integration of power gathering and data reception at solar panels, turning them into communication devices. In effect, solar cells within the panel become communication nodes that receive high-bandwidth data while also providing electrical power for the node's operation. These solar panels can be used on the roofs of houses or vehicles to act as broadband receivers from a nearby Li-Fi transmitter. Translucent solar cells can be integrated into windows, doors and other glass furniture. They can also be integrated into street furniture, with billions of such devices forming the Internet of Things (IoT) [25].
Companies like Google and Facebook are working to provide Internet to the entire population of the earth through efforts like the Google Loon project, based on Internet balloons. Worldwide information and communication technologies will require more than 100 nuclear power plants to serve the world population. These self-powered nodes will remove a major barrier to data communication growth. In conventional optical wireless communications, the steady background component of the received optical signal is usually discarded, but it can instead be used to directly power the receiving terminal [26].

5.10 Li-Fi in IoT

The Internet of Things (IoT) is one of the growing industries in the market, connecting the network of physical objects, devices, vehicles, buildings and other items which are embedded with electronics, software, sensors and network connectivity, enabling these objects to collect and exchange data. Depending on the industry, the sensor data can be related to temperature, humidity, pressure, machine vibration, leakage and many other things. Now, of course, modern automation systems are Internet enabled, and these systems can be called Internet of Things (IoT) applications. The pervasive nature of the Internet of Things will make pervasive connections between otherwise unconnected services, machines, businesses and individuals.
Professor Harald Haas stated in his speech that 50 billion devices are going to be connected by 2020 [26], which will require a huge amount of energy to serve them. The Li-Fi solar cell can be employed in this Internet of Things concept for self-powered, battery-free devices. It is easier to attach Li-Fi tags than RFID tags. The Li-Fi tags will communicate with each other through visible light communication.

5.11 Conclusion

From the above discussion, we can say that Li-Fi will serve many upcoming generations of mobile communication and other pervasive computing. It is perfectly suited to the mobile communication infrastructure in many ways, providing a high degree of mobility while saving energy, space and the cost of deployment and maintenance. It can also be used to make the environment greener, safer and smarter. No doubt it will fulfil and shape the future technologically by enabling a more sustainable lifestyle for mankind.

References

1. Cisco visual networking index, Global mobile data traffic forecast update, 2014–2019, white paper, CISCO
2. Li-Fi. Accessed https://www.en.m.wikipedia.org/wiki/Li-Fi
3. Rani J, Chauhan P, Tripathi R (2012) Li-Fi (Light Fidelity): the future technology in wireless communication. In: International Journal of Applied Engineering Research. http://www.ripublication.com/ijaer.htm
4. Tsonev D, Videv S, Haas H, Light fidelity (Li-Fi): towards all-optical networking. In: Institute for Digital Communications, Li-Fi R&D Centre, Edinburgh, UK
5. Newton T (2013, Oct) British scientists shove 10 Gbps through micro light bulbs in LiFi experiment. https://www.recombu.com/digital/article
6. How light emitting plasma works. Accessed http://www.luxim.com
7. Bhut JH, Parmar DN, Mehta KV (2014, Jan) LI-FI technology: a visible light communication. Int J Eng Dev Res
8. http://www.pureli.com/li-re/li-ame/
9. Haas H (2014, Apr) My Li-Fi revolution, presented at the Tam Dalyell prize lecture. http://www.m.youtube.com/
10. Pujapanda KP (2013, Apr) LiFi integrated to power-lines for smart illumination cum communication. In: International conference on communication systems and network technologies. http://www.ieeexplore.ieee.org/
11. Wolf M (2014, Dec) 5 smart home companies to watch in 2014. www.forbes.com/sites/michaelwolf/2013/12/31/5-smart-home-companies-to-watch-in-2014/
12. Savage N (2014, Nov) Li-Fi gets ready to compete with Wi-Fi. http://www.spectrum.ieee.org/telecom/internet/
13. Piltch A (2014, Jan) Bright idea: smartphone sensor receives data via light, Laptop. https://www.m.blog.laptopmag.com/wysip-connect-li-downloads-data
14. Haas H (2013) High-speed wireless networking using visible light. Accessed https://www.spie.org/x93593.xml
15. UEdinburgh, NI partner on Li-Fi (2013, Nov). www.photonics.com/m/Article
16. Wang Y, Videv S, Haas H (2014) Dynamic load balancing with handover in hybrid Li-Fi and Wi-Fi networks. In: IEEE 25th international symposium on personal, indoor and mobile radio communications
17. Stefan I, Burchardt H, Haas H (2013) Area spectral efficiency performance comparison between VLC and RF femtocell networks. In: 2013 IEEE international conference on communications (ICC), June 2013, pp 3825–3829
18. Neethu MS (2013, Mar) Ubiquitous computing. https://www.lbsitbytes2010.wordpress.com/2013/03/19/ubiquitous-computing-2/
19. Jin W-L (2012) SPIVC: a smartphone-based inter-vehicle communication system. Proc Transportation Research Board annual meeting
20. Boukerche A et al (2008) Vehicular ad hoc networks: a new challenge for localization-based systems. Comput Commun, ScienceDirect, pp 1–12
21. Husain Fidvi NM, Car to car communication system. Source: car communication system. http://www.engineersgarage.com/contribution/car-to-car-commuincation-system?page=1
22. Swami NV (2015, Mar) Li-Fi (Light Fidelity): the changing scenario of wireless communication. Int J Res Eng Technol. http://www.ijret.org
23. Tower infrastructure solutions. Accessed https://www.bharti-infratel.com/cps-portal/web/passive-infrastructure-solutions.html
24. Curran K, Furey E, Lunney T, Santos J, Woods D, Mc Caughey A (2011) An evaluation of indoor location determination technologies. J Location Based Serv 5(2):61–78, June 2011. ISSN: 1748-9725, doi:10.1080/17489725.2011.562927, Taylor & Francis
25. http://www.li-centre.com/the-connected-solar-panel/
26. http://www.ted.com/talks/harald_haas_a_breakthrough_new_kind_of_wireless_internet?utm_campaign=social&utm_medium=referral&utm_source=t.co&utm_content=talk&utm_term=technology
27. http://visiblelightcomm.com/top-10-visible-light-communications-applications/automotive/
Chapter 6
Performance Analysis of Denoising Filters
for MR Images

Shraddha D. Oza and K.R. Joshi

Abstract Medical imaging modalities play an extremely significant role in treating human diseases. Magnetic resonance imaging (MRI) is one such imaging technique, popularly used for its ability to scan and generate a view of any internal body organ or tissue. The MR image suffers from signal-dependent noise which obeys a Rician distribution. This multiplicative noise is difficult to remove, but its removal is necessary for accurate diagnosis. For this noise removal, denoising filters are added in the preprocessing stage of MRI. The denoising filters can be implemented in the spatial or temporal domain, and they can be broadly classified as linear or nonlinear. Frequency-domain wavelet-based filters have been implemented for noise removal in MR images, but they may add characteristic artifacts which can be critical. The NLM and bilateral filters are nonlinear neighborhood filters which are preferred for denoising for their better edge-preserving ability. The quality of the denoised image may be evaluated using pixel-based parameters such as PSNR and MSE, and the SSIM index, which indicates the structural content of the image and is closer to the human visual system. The paper analyzes the performance of the NLM, bilateral, and linear Gaussian filters using PSNR, MSE, and the SSIM index. In future, an edge quality measuring metric may be used for better evaluation of filter performance.

Keywords: MRI, Denoising, Nonlinear filters, NLM, Bilateral, Image quality assessment

S.D. Oza (✉)
E & Tc Department, Army Institute of Technology, Pune, India
e-mail: sdoza@aitpune.edu.in
K.R. Joshi
E & Tc Department, PES Modern College of Engineering, Pune, India
e-mail: krjpune@gmail.com


6.1 Introduction

Biomedical image processing is an important research domain contributing to human health care. Medical imaging is a noninvasive tool for the visualization of internal organs and helps in proper diagnosis. There are different imaging modalities in use, such as MRI, CT, PET, ultrasound, and X-ray.
The magnetic resonance imaging (MRI) modality is a popularly used technique, as it can generate a view of every possible section of an internal organ. It does not need any ionizing radiation and can discriminate among different tissues on the basis of their physical and biochemical properties. The MR image resolution is limited by the magnetic field strength and the imaging time. The MRI acquisition process adds signal-dependent noise to the resultant image that is Rician distributed [4]. This noise degrades further processing of the image, such as segmentation, classification, and registration. It is a critical but important task to remove this noise and improve the quality of the image while preserving the anatomical content for better diagnosis.
Various filters are used to remove MR image noise. The conventional neighborhood filters such as Gaussian and Wiener, though simple, fail to reduce Rician noise, as they presume the noise to be Gaussian distributed. These filters tend to blur fine details while removing high-frequency content in the image. In recent years, nonlinear filters such as the non-local means and bilateral filters have shown good edge-preserving performance [1, 2, 5, 6].
The NLM and bilateral filters both look for similar pixels in the whole image and not only in the close neighborhood, as against the linear Gaussian filter. This makes their denoising performance much better [1, 6]. Manjon et al. [2] in their work compared the unbiased NLM with other filters such as the anisotropic diffusion filter [7] and a wavelet-based denoising filter [8]. The experimental results of this work show that the NLM outperforms the others.
The filter performance needs to be evaluated using different computational models to measure image quality. PSNR and MSE are the most commonly used conventional metrics; they operate directly on the intensity of the image and do not correlate well with subjective fidelity. Human visual system (HVS)-based metrics developed with extensive efforts of researchers, such as the SSIM index and UQI, are sensitive to luminance, contrast, and the structural content of the image [9].
The paper analyzes the performance of non-local means lter and bilateral lter
over Gaussian lter for MR images with the help of PSNR, MSE, and SSIM index
as a quality metric.
The rest of the paper is organized as follows. Section 6.2 describes the basic
mechanism of MRI and elaborates the Rician noise model. In Sect. 6.3, the basic theory
of different filters is briefly discussed. Section 6.4 details the different IQA metrics used.
Section 6.5 provides the details of the experimentation done and the results
obtained. Finally, Sect. 6.6 gives the conclusions drawn from the observations.

6.2 MRI and Rician Noise Model

The magnetic resonance imaging (MRI) modality operates on the principle of
the directional magnetic field associated with hydrogen nuclei in motion [10]. In the
process of MRI, the body to be diagnosed is placed in a magnetic field of the order of
1.5 T or more. This causes the hydrogen nuclei within the body to spin like
gyroscopes. These nuclei are further excited by applying RF pulses at the Larmor
frequency. On the removal of the RF pulses, the nuclei emit energy and return to
their original state. The MR image is reconstructed using the decaying induction
signal, which in turn indicates the distribution of atoms in the selected tissue. MRI is
sensitive to physical parameters like blood flow [10, 11].
The MR image acquisition process introduces noise in the image. The raw data
obtained during an MRI scan are complex in nature. These data represent the Fourier
transform of the magnetization distribution of a section of tissue at a certain point in time.
The complex data points are further converted into magnitude and phase
components using the inverse Fourier transform. These complex components represent
the physiological features of the organ tissue [11]. The noise in the reconstructed
image follows the Rician distribution PDF [4, 12], which is represented as follows:
p_M(M | A, σ_n) = (M / σ_n²) · exp(−(M² + A²) / (2σ_n²)) · I₀(AM / σ_n²) · ε(M)     (6.1)

where I₀ is the modified zeroth-order Bessel function of the first kind, ε(·) is the
Heaviside step function, σ_n² is the noise variance, and A is the noise-free signal level.
For high SNR, the Rician distribution tends to be Gaussian. As the SNR falls to a low
level (<2), it tends to be Rayleigh distributed.
The Rician noise is signal dependent and thus very difficult to remove. It is
necessary to denoise the MR image to support accurate diagnosis.
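As a concrete illustration of Eq. (6.1), the following minimal Java sketch (an illustrative addition, not part of the original experiments, which were done in MATLAB) corrupts a noise-free magnitude image with Rician noise by adding independent Gaussian components to the real and imaginary channels before taking the magnitude:

import java.util.Random;

// Simulating Rician noise on a magnitude image: each noise-free pixel A is
// observed as M = sqrt((A + n1)^2 + n2^2) with n1, n2 ~ N(0, sigma^2),
// which follows the Rician PDF of Eq. (6.1).
public class RicianNoise {
    public static double[][] addRicianNoise(double[][] image, double sigma, long seed) {
        Random rng = new Random(seed);
        int rows = image.length, cols = image[0].length;
        double[][] noisy = new double[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                double n1 = sigma * rng.nextGaussian(); // real-channel noise
                double n2 = sigma * rng.nextGaussian(); // imaginary-channel noise
                noisy[r][c] = Math.hypot(image[r][c] + n1, n2); // magnitude
            }
        }
        return noisy;
    }
}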

6.3 Denoising Filters

Denoising filters are broadly classified as linear and nonlinear. Linear filters
are the ones which obey the principles of superposition and shift invariance.
The Gaussian filter is a classic example of a linear filter. Linear filters can be
implemented in the spatial domain or the temporal (frequency) domain. Spatial-domain
filtering techniques directly manipulate the intensity of pixels in the image [4],
while the performance of a frequency-domain filter is governed by the selection of
the frequency response of the filter. The linear filter, though simple to implement, tends
to give a blurring effect. The nonlinear filter response does not obey the linearity
principles. Its output can vary in an intuitive way and tends to respect the edges in
the image. Non-local means (NLM), bilateral, and anisotropic diffusion filters are a
few examples of this category [2, 5, 13–16].

a. Non-local Means Filter

The non-local means (NLM) filter is a nonlinear spatial filter which was first
introduced by Buades et al. in 2005 [1]. For denoising an image, rather than looking
for similar pixels, NLM looks for similar patterns (a pixel with its neighborhood) in
the whole image. The pixel to be filtered is replaced by the weighted
average of all the pixels with a similar neighborhood. It does not penalize similar
neighborhoods which are far off from the current pixel in the image matrix.
If I is a noisy image, then the filtered value at point r in the image I is computed
using the equation:

NLM(I(r)) = Σ_s w(r, s) · I(s)     (6.2)
Here, 0 ≤ w(r, s) ≤ 1 and Σ_s w(r, s) = 1,
where r is the point being filtered and s represents any other pixel in the same
image. If N_r and N_s are the neighborhoods of the pixels r and s, respectively, then the
weight w(r, s) indicates the similarity between N_r and N_s.
Thus, this method averages all pixels in the image with a similar neighborhood,
while the weight w(r, s) judges the similarity between the neighborhoods. The
computation of the weights leads to a huge time overhead. This can be optimized by
restricting the search to a region with radius R instead of the whole image.
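A minimal single-threaded Java sketch of this idea follows (an illustrative rewrite, not the authors' MATLAB code): every pixel is replaced by a weighted average of pixels whose surrounding patches look similar, the search being restricted to a window of radius searchRadius; the assumed parameter h controls the decay of the weights.

public class NlmFilter {
    public static double[][] denoise(double[][] img, int patchRadius, int searchRadius, double h) {
        int rows = img.length, cols = img[0].length;
        double[][] out = new double[rows][cols];
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++) {
                double sum = 0, weightSum = 0;
                for (int i = Math.max(0, r - searchRadius); i <= Math.min(rows - 1, r + searchRadius); i++)
                    for (int j = Math.max(0, c - searchRadius); j <= Math.min(cols - 1, c + searchRadius); j++) {
                        double d2 = patchDistance(img, r, c, i, j, patchRadius);
                        double w = Math.exp(-d2 / (h * h)); // similarity weight
                        sum += w * img[i][j];
                        weightSum += w;
                    }
                out[r][c] = sum / weightSum; // weights normalized to sum to 1
            }
        return out;
    }

    // Mean squared difference between the patches around (r1,c1) and (r2,c2).
    private static double patchDistance(double[][] img, int r1, int c1, int r2, int c2, int p) {
        int rows = img.length, cols = img[0].length;
        double d2 = 0; int n = 0;
        for (int dr = -p; dr <= p; dr++)
            for (int dc = -p; dc <= p; dc++) {
                double diff = img[clamp(r1 + dr, rows)][clamp(c1 + dc, cols)]
                            - img[clamp(r2 + dr, rows)][clamp(c2 + dc, cols)];
                d2 += diff * diff; n++;
            }
        return d2 / n;
    }

    private static int clamp(int i, int size) { return Math.min(Math.max(i, 0), size - 1); }
}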
b. Bilateral Filter

The bilateral filter is a nonlinear neighborhood filter. It was first
introduced as a nonlinear Gaussian filter in 1995 by Aurich et al. In 1998, Tomasi
and Manduchi rediscovered it and gave it the name bilateral [6].
The bilateral filter replaces the current pixel intensity with a weighted average
based on the geometric (spatial) and photometric (intensity) distances between neighboring
pixels in the search region. The weight assigned to each neighbor decreases with the
distance in the image plane (the spatial domain S) as well as the distance on the
pixel intensity axis (the range domain R). This helps in preserving edges along with
noise removal. The expression for the filtered value of pixel r is given as follows:

Î(r) = (1/c) · Σ_{s ∈ N(r)} exp(−‖s − r‖² / (2σ_d²)) · exp(−|I(s) − I(r)|² / (2σ_r²)) · I(s)     (6.3)

where σ_d and σ_r are the standard deviations in the spatial and intensity domains,
respectively, and c is a normalization constant [17, 18].
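A minimal Java sketch of Eq. (6.3) follows (again an illustrative rewrite, not the authors' implementation): each output pixel is a normalized average whose weights fall off with both spatial distance (sigmaD) and intensity difference (sigmaR), which is what preserves edges.

public class BilateralFilter {
    public static double[][] denoise(double[][] img, int radius, double sigmaD, double sigmaR) {
        int rows = img.length, cols = img[0].length;
        double[][] out = new double[rows][cols];
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++) {
                double sum = 0, norm = 0;
                for (int i = Math.max(0, r - radius); i <= Math.min(rows - 1, r + radius); i++)
                    for (int j = Math.max(0, c - radius); j <= Math.min(cols - 1, c + radius); j++) {
                        double spatial2 = (i - r) * (i - r) + (j - c) * (j - c);
                        double range = img[i][j] - img[r][c];
                        double w = Math.exp(-spatial2 / (2 * sigmaD * sigmaD))
                                 * Math.exp(-(range * range) / (2 * sigmaR * sigmaR));
                        sum += w * img[i][j];
                        norm += w; // plays the role of the constant c in Eq. (6.3)
                    }
                out[r][c] = sum / norm;
            }
        return out;
    }
}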
c. Gaussian Filter

The Gaussian filter computes a weighted average of the intensities of the adjacent positions,
wherein the weight decreases with the spatial distance to the center position p (origin):
g(r, s) = (1 / (2πσ²)) · exp(−(r² + s²) / (2σ²))     (6.4)

where r is the distance from the origin along the horizontal axis, s is the distance from the
origin along the vertical axis, and σ is the standard deviation of the Gaussian distribution.
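A small illustrative helper (assumed, not from the chapter) builds the normalized Gaussian kernel of Eq. (6.4), which a linear filter then convolves with the image:

public class GaussianKernel {
    public static double[][] build(int radius, double sigma) {
        int size = 2 * radius + 1;
        double[][] k = new double[size][size];
        double norm = 0;
        for (int r = -radius; r <= radius; r++)
            for (int s = -radius; s <= radius; s++) {
                k[r + radius][s + radius] = Math.exp(-(r * r + s * s) / (2 * sigma * sigma));
                norm += k[r + radius][s + radius];
            }
        for (int i = 0; i < size; i++)          // normalize so the weights sum to 1
            for (int j = 0; j < size; j++)
                k[i][j] /= norm;
        return k;
    }
}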

6.4 Image Quality Assessment

Image quality assessment using appropriate measures is significant in evaluating
the quality of the denoised image. The quality metrics can be classified depending on
whether the original noise-free image, with which the noisy image is to
be compared, is available. The commonly used full-reference approach uses a known reference
image for computing performance metrics. Popular examples of this type are
the peak signal-to-noise ratio (PSNR) and the mean-squared error (MSE) [19]. They opt for
pixel-based measurement. PSNR is the ratio of the maximum signal power to the
noise power. Though simple to calculate, MSE and PSNR do not correlate well
with the perceived quality of the image. In recent years, extensive research has been
carried out on the development of measures based on the human visual system (HVS).
The structural similarity (SSIM) index is one of the HVS-based metrics and indicates
the structural quality of the image [9]. The SSIM metric compares two images
using three parameters: luminance, contrast, and structure.
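A hedged Java sketch of the two pixel-based metrics discussed above: MSE averages the squared intensity differences, and PSNR relates the squared peak value (passed in explicitly, e.g., 255 for 8-bit images) to the MSE in decibels.

public class QualityMetrics {
    public static double mse(double[][] ref, double[][] test) {
        double sum = 0;
        int rows = ref.length, cols = ref[0].length;
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++) {
                double d = ref[r][c] - test[r][c];
                sum += d * d;           // accumulate squared error
            }
        return sum / (rows * cols);     // mean over all pixels
    }

    public static double psnr(double[][] ref, double[][] test, double peak) {
        return 10 * Math.log10((peak * peak) / mse(ref, test));
    }
}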

6.5 Experimentation and Results

The three filters, viz. the NLM, bilateral, and linear filters, were implemented using
MATLAB (ver. 2010) on an Intel i5 machine with 4 GB RAM working at 2.2 GHz. The
image database includes ten MR images of bone and joint (658 × 495) and brain slices
(512 × 512). To each image, Rician noise was added with standard deviation varying
from S = 3 to S = 24 in five steps. For each noise level, the filter performance
was tested using three parameters, namely PSNR, MSE, and the SSIM index.
In the case of each filter, as the noise level was increased, the MSE increased
(Fig. 6.1), while the PSNR (Fig. 6.2) and SSIM index reduced, as was expected
(Fig. 6.3).
On application of the NLM filter to the brain MRI, the PSNR was 32.748 for S = 3,
which reduced to 23.4484 as S was increased to 24. When the bilateral filter was
applied to the brain MRI, it outperformed the rest, giving a PSNR of
82.0602 for noise level S = 3 and 64.4933 for S = 24. In the case of the linear filter,
the PSNR was observed to be 24.7394 for the minimum noise level of 3. This was the
lowest result as compared to the other two filters (Fig. 6.2).
The SSIM index was noted to be 0.9356 for the minimally noisy image in the case
of the NLM filter. It was found to reduce to 0.7871 as the noise level was increased to
24. Here too, the bilateral filter proved to be better with the minimum drop, i.e., with a
maximum index of 0.9999 and 0.9319 being the least, even for the noisiest image with
S = 24 (Fig. 6.4).

Fig. 6.1 MSE versus standard deviation for the NLM, LF (linear filter), and bilateral filters

Fig. 6.2 PSNR (dB) versus standard deviation for the NLM, LF, and bilateral filters

Fig. 6.3 SSIM index versus standard deviation for the NLM, LF, and bilateral filters

Fig. 6.4 Noisy image with S = 24

Fig. 6.5 Computation time versus standard deviation

The average computation time for each filter was noted. For
the brain MR image, the NLM filter with a search width of 15 was the slowest, with an
average computation time of 180 s, while the linear Gaussian filter proved to be
the fastest, with an average computation time of 3.8 s. The bilateral filter took an optimal
average time of 4.5 s and provided the best results (Fig. 6.5).

6.6 Conclusion and Future Scope

The experimentation results imply that the bilateral filter performs better than the NLM
filter in terms of PSNR as well as the SSIM index. As shown in Figs. 6.6, 6.7, and 6.8, the
SSIM index is close to the human perception of the quality of the denoised image.

Fig. 6.6 Bilateral filter output (PSNR = 64.4933, SSIM = 0.9313)

Fig. 6.7 NLM filter output (PSNR = 23.4484, SSIM = 0.7871)

Fig. 6.8 Linear filter output (PSNR = 20.9520, SSIM = 0.4070)

It can also be seen that the NLM and bilateral filters try to retain edges, while the linear filter
blurs the image. In future, the performance may be further analyzed using an edge
quality measure. In the present work, the computation time of the NLM filter was
noted to be very large as compared to the rest (Fig. 6.5). It may be reduced by
implementing the filter on parallel architectures [20]. The NLM filter, being a
neighborhood filter, can easily be mapped to parallel architectures. Also, variants of
NLM [21] and the improvised bilateral, i.e., trilateral filter [22], can be analyzed for
performance.

References

1. Buades A et al (2005) A non-local algorithm for image denoising. In: IEEE international conference on computer vision and pattern recognition (CVPR), vol 2, pp 60–65
2. Manjón JV et al (2008) MRI denoising using non-local means. Med Image Anal 12:514–523. Elsevier
3. Mohan J et al (2014) A survey on the magnetic resonance image denoising methods. Biomed Signal Process Control 9:56–69
4. Gudbjartsson H, Patz S (1995) The Rician distribution of noisy MRI data. NIH Public Access Author Manuscript
5. Kundu R et al (2014) De-noising image filters for bio-medical image processing. CSI Commun
6. Tomasi C, Manduchi R (1998) Bilateral filtering for gray and color images. In: Proceedings of the 1998 IEEE international conference on computer vision, Bombay
7. Perona P, Malik J (1990) Scale-space and edge detection using anisotropic diffusion. IEEE Trans Pattern Anal Mach Intell 12:629–639
8. Wu ZQ et al (2003) Wavelet-based Rayleigh background removal in MRI. IEEE Electron Lett 39:603–605
9. Avcibas I et al (2002) Statistical evaluation of image quality measures. J Electron Imaging 11(2):207
10. https://www.cs.sfu.ca/stella/papers/blairthesis/main/node11.html
11. http://www.drcmr.dk/mriinbrief
12. Coupé P et al (2010) Robust Rician noise estimation for MR images. Med Image Anal 14:483–493. Elsevier
13. Dolui S et al (2013) A new similarity measure for non-local means filtering of MRI images, pp 1040–1054
14. Teuber T et al (2012) A new similarity measure for non-local filtering in the presence of multiplicative noise. Comput Stat Data Anal 56:3821–3842. Elsevier
15. Manjón J et al (2010) Adaptive non-local means denoising of MR images with spatially varying noise levels. J Magn Reson Imaging 31:192–203. Wiley-Liss
16. Rajan J et al (2012) An adaptive non-local maximum likelihood estimation method for denoising magnetic resonance images. IEEE
17. Paris S et al (2008) Bilateral filtering: theory and applications. Found Trends Comput Graph Vis 4(1):1–73
18. Zhang M (2009) Bilateral filtering for image processing. Thesis, Beijing University
19. Wang Z et al (2004) Image quality assessment: from error measurement to structural similarity. IEEE Trans Image Process 13(1)
20. Eklund A et al (2013) Medical image processing on the GPU: past, present and future. Med Image Anal. http://www.dx.doi.org/10.1016/j.media.2013.05.008. Linköping University Post Print
21. Kim DW et al (2011) Rician non-local means denoising for MR images using nonparametric principal component analysis. EURASIP J Image Video Process 2011:15
22. Wong WCK et al (2004) Trilateral filtering for biomedical images. 0-7803-8388-5/04, IEEE
Chapter 7
A Detailed View on SecureString 3.0

Günter Fahrnberger

Abstract These days, vicious globally playing enterprises and various culprits try
to exploit individuals' sensitive data whenever opportunities arise. Public clouds and
private users with low (or even no) security awareness facilitate such malefactions.
On account of this, a rational data owner's trust in public clouds should reach the
level semi-honest or semi-honest-but-curious at best, due to their uncertainty about the
location(s) of their data, respectively about undesired access to them. The homomorphic
cryptosystem SecureString 3.0 remedies this and recaptures the cloud user's faith in secure
cloud computing on encrypted character strings by combining the paradigms blind
computing and secret sharing. While the existing literature already covers the principles
of SecureString 3.0, herein, the adduced implementation details of this cryptosystem,
given in pseudocode, allow researchers to realize their own prototypes and practitioners
to integrate SecureString 3.0 into their own security solutions. Decent security and
performance analyses prove the applicability of this cryptosystem in the field, e.g.,
for secure instant messaging sifters.

Keywords Blind computing · Character string · Character string function · Character string operation · Cloud · Cloud computing · Secret sharing · Secure computing · String · String function · String operation

7.1 Introduction

Nowadays, it is common practice for private users to have their files and electronic
mail accounts outsourced to public clouds due to comfortable accessibility
from nearly all over the world. Daily news about exploitations of sensitive data lets
private users' security awareness upswing awhile before it reverts to its original
level; or, even worse, people do not fear for the security of their data because they
suppose that their data do not arouse anyone's interest. Even the latter would immediately

G. Fahrnberger (✉)
University of Hagen, North Rhine-Westphalia, Germany
e-mail: guenter.fahrnberger@studium.fernuni-hagen.de


change their opinion if somebody successfully looted their bank account due
to their stored plaintext credentials in an online folder.
Companies usually act much more carefully, because lost or exploited business
data could entail the complete ruin of the reputation or even existence of a firm. An
organization will not make use of a public cloud as long as its personnel assess
uncertainty for its valuable data there. If the security staff of an enterprise have
confidence in a cryptosystem, then they will not hesitate to employ it for the storage
of crucial information in public clouds. Evenhandedly, the responsible staff will
only incorporate a viable homomorphic encryption scheme for blind computations
on ciphered data in public clouds if they rely on it.
Unfortunately, wrongdoing corporations take advantage of their prudent background
knowledge about IT security and the users' ignorance of equivalent ken to exploit user
data for entrepreneurial purposes. Ironically, the firms characterize such data abuse
as marketing action, because the illegally observed user habits permit them to proffer
their products more purposively.
Altogether, a decision maker has two options available. Either they entirely confide
in a public cloud, or they completely suspect it. On account of this, basically,
trust represents a dichotomous property between two objects. Some authors allude
to semi-honest [22] or semi-honest-but-curious [5] parties which are only partially
trusted. Albeit such shady actors do not behave viciously, they can easily cause crucial
security breaches if they ignore or negate any prophylactic measures or responsive
countermeasures against genuine villains. The treatise about SIMS (Secure
Instant Messaging Sifter) characterized transmission or IM (Instant Messaging) platform
providers with such a misbehavior as neutralists [10]. The airy way is to eschew
such neutral resources, but in spite of the security risks they often promise luring commercial
advantages. Cloud computing exemplifies a classical neutralist. SLAs (Service
Level Agreements) and nondisclosure agreements bode well, but they cannot
change the fact that cloud capabilities opaquely reside and even roam in any computing
centers on our planet without control possibilities for the data possessor.
For this reason, a policy maker can consider involving a public cloud in such a
way that an interloper cannot figure out what the cloud stores or computes. While
topical cryptosystems applied for end-to-end encryption assure the safe detention of
sensitive data in cloud storages, SSE (Searchable Symmetric Encryption) schemes
abet keyword searches in ciphertexts. Reasonable modifications of encrypted data
require the employment of homomorphic cryptosystems. While the famous encryption
schemes of Rivest, Shamir and Adleman (RSA) [25], Paillier [23] and Gentry
[16] ensure homomorphic operations on ciphered numerical values, the SecureString
family with the cryptosystems SecureString 1.0 (see Sect. 7.2.1), SecureString 2.0
(see Sect. 7.2.2), and SecureString 3.0 (see Sect. 7.2.3) proffers various homomorphic
functions on encrypted character strings. SecureString 2.0 improved the privacy
of SecureString 1.0, and SecureString 3.0 hardened SecureString 2.0 with better
authenticity, integrity, and privacy. The creator of these encryption schemes earmarked
them for the secure sifting of chat and instant messages [10, 13], which does
not exclude them from utilization for other assignments.

Homomorphic cryptosystems follow the paradigm of blind computing and presume
the existence of a homomorphic function g for a (dyadic) function f, so that
D(g(E(v), E(u))) = f(v, u), if E denotes an encryption function, D denotes a decryption
function, and v and u denote input variables. It means that g fabricates convenient
encrypted output out of ciphertext input without the possibility and the need to
decrypt anything. An exemplary dyadic function for character strings would be the
querying operation that claims a first input parameter in which it seeks for appearances
of the second input parameter.
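A toy Java illustration of the blind computing identity D(g(E(v), E(u))) = f(v, u) follows (an assumed example, not part of SecureString): textbook RSA, cited above, is multiplicatively homomorphic, so g = modular multiplication of ciphertexts corresponds to f = multiplication of plaintexts.

import java.math.BigInteger;
import java.util.Random;

public class RsaHomomorphismDemo {
    public static void main(String[] args) {
        Random rng = new Random(42);
        BigInteger p = BigInteger.probablePrime(512, rng);
        BigInteger q = BigInteger.probablePrime(512, rng);
        BigInteger n = p.multiply(q);
        BigInteger e = BigInteger.valueOf(65537);
        BigInteger d = e.modInverse(
                p.subtract(BigInteger.ONE).multiply(q.subtract(BigInteger.ONE)));

        BigInteger v = BigInteger.valueOf(42), u = BigInteger.valueOf(17);
        BigInteger ev = v.modPow(e, n), eu = u.modPow(e, n); // E(v), E(u)
        BigInteger g = ev.multiply(eu).mod(n);               // g works on ciphertexts only
        System.out.println(g.modPow(d, n));                  // prints 714 = f(v, u) = v * u
    }
}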
The existing poster disquisition about SecureString 3.0 [12] omitted to treat
realization details and appropriate security and comparative performance analyses.
Thence, this sequel to SecureString 3.0 outweighs this deficit by dint of the
following structure. Due to already existent exhaustive contrasting juxtapositions
of SecureString 1.0 and SecureString 2.0 with comparable cryptosystems [7, 8],
Sect. 7.2 rather explicates the functionality of all SecureString releases as well as
related work about the exerted core techniques pseudorandomness and re-keying
for a better understanding of the novelty of SecureString 3.0. Section 7.3 dedicates
itself to the algorithms established in SecureString 3.0. The security analysis in
Sect. 7.4 discusses the hardiness of SecureString 3.0 against a plurality of offensives.
Section 7.5 comprises a convenient performance analysis of SecureString 3.0, including
a comparative consideration of SecureString 2.0. The endmost Sect. 7.6 summarizes
this publication, recapitulates its merits, and recommends valuable future work.

7.2 Related Work

7.2.1 SecureString 1.0

The first and sole paper about SecureString 1.0 was released under the title "Computing
on Encrypted Character Strings in Clouds" [7]. It treats a homomorphic
cryptosystem with the following benefits:

- Underlying replaceable symmetric cryptosystem: SecureString 1.0 performs
  each ciphering by a topical underlying exchangeable symmetric cryptosystem that
  becomes inevitably exchanged if its vulnerability exceeds any observable modern
  security norms.
- Polygraphic encryption: SecureString 1.0 splits plaintext words into their disjoint
  polygrams of length n in order to encrypt them individually.
- Intra-word polyalphabetism [3]: SecureString 1.0 enciphers each polygram of a
  plaintext word with a separate encryption transformation.
- Position-based encryption transformation: SecureString 1.0 achieves unique
  encryption transformations by ciphering each polygram together with its beginning
  position within its encircling word.
- Polygram order scrambling: SecureString 1.0 brings the ciphertext polygrams
  out of sequence without losing the possibility to operate on them, because their
  order can be reconstructed after their decryption through their enveloped position
  information.
The developer of SecureString 1.0 himself detected some drawbacks in his cryptosystem,
as follows:

- Finite length support: SecureString 1.0 can only fully support character string
  operations on ciphertext objects up to a predefined length.
- Overt word boundaries: Due to the finite support of ciphertext object lengths,
  the intentional use of an own SecureString 1.0 object for each word suggests itself.
  Disadvantageously, this approach discloses the character string boundaries of texts
  and abets effective repetition pattern attacks on them.
- Repetitive ciphertexts: If polygrams are encrypted together with their beginning
  positions but without any randomness (such as salts or nonces), then ciphertext
  repetitions likely promote effectual statistical distribution attacks.

Due to these recognized flaws, the inventor of SecureString 1.0 felt obliged to launch
the enhanced successor SecureString 2.0.

7.2.2 SecureString 2.0

The foremost essay about SecureString 2.0 came up with the undermentioned advantages
[8]:

- Underlying replaceable symmetric cryptosystem: SecureString 2.0 performs
  each ciphering by a topical underlying exchangeable symmetric cryptosystem that
  becomes inevitably exchanged if its vulnerability exceeds any observable modern
  security norms.
- Inter-word polyalphabetism [3]: SecureString 2.0 enciphers each plaintext word
  with a unique encryption information. Therefrom, the arbitrary number of concatenated
  words per SecureString 2.0 object conceals their boundaries.
- Salting: Morris and Thompson came along with the idea to salt plaintext passwords
  (i.e., to prolong them with a random character string) before their encipherment,
  to obtain different ciphertext versions of them by adhering various salts [20].
  SecureString 2.0 attains a distinct encryption transformation for every plaintext
  word by exclusively enciphering each character of the same plaintext character
  string together with an identical salt.
- Automatic salt updating [2, 10]: Each salt serves as input for a hash function that
  outputs the salt for the encryption transformation of the successive plaintext word.
  The so-called TTPG (Trusted Third Party Generator) can prepare the homomorphic
  computations for an untrustworthy (external) node just with a starting salt
  (see Sect. 7.3.2).

Hereinafter, a list itemizes the features of SecureString 2.0 which represent pros and
cons at the same time:

- Monographic encryption [3]: SecureString 2.0 substitutes plaintext monographically
  (character by character) to avoid time-consuming inter-polygram computations.
  Cut-and-splice attacks could take advantage of this property.
- Intra-word monoalphabetism [3]: SecureString 2.0 applies the same encryption
  transformation for all characters of a string to reduce the volume of all client and
  cloud repositories. Disadvantageously, repetition pattern attacks might profit from
  this mitigation.
- Non-size-preservation: Advantageously, the decryption scheme can simply ignore
  salts in unscrambled ciphertexts rather than care about them. Adversely, sizable
  space overhead eventuates as a drawback of salting; this means that the produced
  ciphertext becomes longer than the concealed plaintext.

Among a formal refinement of the en- and decryption scheme as well as of
the character string functions of the cryptosystem, a dedicated treatise conducted
a detailed performance analysis and exposed the odds to break SecureString 2.0
objects with repetition pattern attacks or distribution attacks [9].
Another contribution to secure cloud computing spotted the arduousness of deriving
homomorphic character string functions and the vulnerability of ciphertexts with
observable word delimiters [17].
The very last essay about SecureString 2.0 resolved this issue with multi-word-containing
SecureString 2.0 objects with an unknown number of comprised words
and blurred boundaries between them [11]. Beyond that, it scrutinized the success
probability of repetition pattern attacks on the following three sorts of SecureString 2.0
objects: single-word-containing ones, multi-word-containing ones with a
known number of words plus unknown delimiter positions, and multi-word-containing
ones with an unknown number of words plus unknown boundary locations.
Subsequent scientific work dealt with the feasibility of applications based on
SecureString 2.0.
Elgaby designed the protection of a rudimentary SIP (Session Initiation Protocol)
network with SecureString 2.0 [6]. Moreover, he commended further research
regarding a dynamic dictionary for registered SIP entities.
Freitag examined how the SMPP (Short Message Peer-to-Peer) protocol can be
safeguarded if confident ESMEs (External Short Message Entities) mistrust an outsourced
SMSC [15]. Furthermore, she advocated adherence to the security aims
authenticity, integrity, privacy, and resilience in future innovations.
Another academic draft amalgamated SecureString 2.0 with the 4-CBAF (4-layer
Context Based Authentication Framework) [21] to SafeChat, which shields children
from cyberbullying and their communication from explicit messages [13]. Further,
the scholarly paper advised improving SafeChat by joining it with a searchable
encryption scheme that supports similarity queries, just to deter inventive teenagers
from bypassing SafeChat by deliberately misspelling cusses.
Its successor, SIMS (Secure Instant Messaging Sifter), put this suggestion into
action. On top of that, it provided a comprehensive concept to sustain authenticity,
integrity, privacy, and resilience for IM (Instant Messaging) platforms [10].

While untrusted SIP proxies respectively untrustworthy SMSCs (Short Message
Service Centers) can make secure blind routing decisions with the aid of SecureString 2.0,
semi-honest IM platforms may eradicate explicit expressions from instant
messages without becoming aware of the meaning of any messages or filter terms.
Objections are legitimate which advocate IM consigners or consignees to straightforwardly
perform desired purges of sent respectively received plaintext messages
themselves. A few reasons counter these demurs as follows:

- Frequent transfers of ever-changing repositories with the explicit phrases over
  narrow-band (air) interfaces to (mobile) sender or receiver terminals exhaust bandwidth
  (or the free data volume of mobile tariffs).
- Recipients subjected to censorship (e.g., children at their legal guardians' demand)
  can impede censorship by using hard- or software without sifting abilities.
- Addressees may distrust originators who (pledge to) sieve their transmitted messages
  themselves. Eventually, consigners allegorize the main reason for message
  filtering.
- Even if consigners encrypt their created messages with an SSE scheme in
  order to empower their IM platform to let only messages with unobjectionable
  content pass, consignees may mistrust them. Addressers could run cracked client
  software that either skips the encipherment of offensive keywords or wrongly encrypts
  them in order to delude the search function in the IM platform.

That is why a prudent IM solution must inevitably build on a centralized sifter that
does not have the chance of getting to know and exploiting any instant messages.
Designs based on SecureString 2.0 were steps in the right direction, but too weak
achievements of the security goals authenticity, integrity, privacy, and resilience led
to the development of SecureString 3.0.

7.2.3 SecureString 3.0

Like its predecessors SecureString 1.0 and SecureString 2.0, SecureString 3.0 [12]
poses a non-size-preserving cryptosystem. The encryption scheme does nothing else
but salt each individual plaintext character of a string, i.e., append an arbitrary
character sequence, and encipher the amalgamation of plaintext character and salt
with a contemporary high-performing cryptosystem in ECB (Electronic CodeBook)
mode [18].
The encryption algorithm may prolong as many characters with the same salt as long as
no ciphertext repetition appears. Therefore, before it would scramble a dedicated character
with an equal salt twice, it must switch to another salt and thereby defuses the

hazardousness of repetitions despite the usage of ECB mode. Automatic salt updating
by inputting the recent salt into a (trapdoor) hash function denotes an easy and
secure alternative to obtaining a fresh salt from an RNG (Random Number Generator)
[2]. Blum and Micali pioneered with their general algorithmic scheme for constructing
polynomial-time deterministic algorithms which stretch a short real random
input (also called a seed) into a long sequence of unpredictable pseudo-random
bits [4]. Stream ciphers and re-keying approaches [1] capitalize on these algorithms
and draw pseudo-random material from their PRNGs (Pseudo-Random Number
Generators). In contrast to the PRNG of Petit et al. [24], SecureString 3.0 cannot
include payload content in salt derivations, because the TTPG must not encounter
any message texts due to secret sharing with the untrusted host.
This kind of salt production does not only perpetuate backward security, i.e., no
salt can be concluded from its successors, but also lets a TTPG prepare homomorphic
computations for an untrustworthy node just with a starting salt per cohesive text
body. Every TTPG and every trusted sender (see Sect. 7.3.1) must possess the ability
to fabricate arbitrary salts. Especially mobile devices in the role of trusted senders
without a swift on-board RRNG (Real Random Number Generator) yearn for an
acceptable alternative.
If a modern block cipher combined with a safe mode of operation [18] (e.g., CBC
(Cipher Block Chaining), CFB (Cipher FeedBack), or OFB (Output FeedBack)) were
utilized instead of salting and ECB mode (as done in the SecureString series), then it
would also warrant backward security, but its output ciphertext blocks would depend
on their antecedent blocks, and a TTPG could not derive ciphertext alternatives for
any plaintext characters. It goes without saying that stream ciphering also does not
permit a TTPG to compute ciphertext alternatives for plaintext characters.
To make a long story short, SecureString 3.0 achieves non-repetitive ciphertexts
with controlled randomness attained through hashed salts in lieu of haphazardly
compassed ciphertexts through the contents of foregoing blocks.
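The following minimal Java sketch (an assumed illustration, not the reference implementation) captures the encryption idea described above: every plaintext character is concatenated with the current salt and enciphered as one AES/ECB block, and whenever a character–salt combination would repeat, the salt is renewed by hashing the previous one (automatic salt updating, which also yields backward security).

import java.security.MessageDigest;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

public class SaltChainSketch {
    // Backward-secure chain: the next salt is the hash of the previous one.
    public static byte[] nextSalt(byte[] salt) throws Exception {
        return MessageDigest.getInstance("SHA-256").digest(salt);
    }

    // One AES block per character: 1 plaintext byte + 15 salt bytes.
    // key must be 16/24/32 bytes; salt must provide at least 15 bytes.
    public static byte[] encryptChar(char c, byte[] salt, byte[] key) throws Exception {
        Cipher aes = Cipher.getInstance("AES/ECB/NoPadding");
        aes.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"));
        byte[] block = new byte[16];
        block[0] = (byte) c;                     // plaintext character
        System.arraycopy(salt, 0, block, 1, 15); // salt fills the rest of the block
        return aes.doFinal(block);
    }
}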

7.3 Algorithms

The implementation of SecureString 3.0 needs at least the availability of the five
components depicted in the architecture model in Fig. 7.1. The trusted sender A
(see Sect. 7.3.1) scrambles an entered plaintext with SecureString 3.0 and securely
announces its length, its positions of the needed salt changes, and an initial salt to
the TTPG (see Sect. 7.3.2) via the untrusted host B (see Sect. 7.3.3). The TTPG compiles
a public ciphertext repository with which B can find unwanted ciphertext parts
and obliterate respectively substitute them before B forwards the cleaned ciphertext
to the trusted receiver C (see Sect. 7.3.4). The CA (Certification Authority) (see
Sect. 7.3.5) administers the certificates of all aforementioned components.

Fig. 7.1 Architecture model of SecureString 3.0

The confidential components belong either to one or to several scattered trustable
environments, i.e., the individual elements are connected either through trusted or
through untrusted paths. The latent worst case of trustless routes coerces to apply
parametrized ciphering and, hence, each pair or triplet of interacting objects to mutually
negotiate an extra session key.
The next subsections enclose the algorithmic flows in the particular items and their
explanations. Beforehand, Definition 1 prescinds abbreviations which are commonly
alluded to in the algorithms to simplify their readability.

Definition 1 Let A be the trusted sender, let Acert be A's certificate with A's public
key Ae, let AR be an efficient data structure in A, let ARNG be A's random number
generator, let B be the untrusted host, let BR be B's public repository, let C
be the trusted receiver, let CA be the Certification Authority, let TTPG be the
Trusted Third Party Generator, let TTPGR be an efficient data structure in the TTPG,
let sACTTPG be a securely stipulated session key between A, C and the TTPG, let sAB
be a securely stipulated session key between A and B, let sBC be a securely stipulated
session key between B and C, let sBTTPG be a securely stipulated session key
between B and the TTPG, let →sXY→ be transport encryption between two parties X and
Y based on a topical symmetrical cryptosystem and a securely stipulated session
key sXY, let Σ be an alphabet, let D_s: Σ^m → Σ^m (ciphertext → plaintext) be a decryption
and E_s: Σ^m → Σ^m (plaintext → ciphertext) be an encryption function of a contemporary
symmetrical cryptosystem based on a session key s ∈ Σ^m, let H: Σ^(m−1) → Σ^(m−1) be a
hash function, let v ∈ Σ^lv be plaintext of length lv and {vj ∈ Σ | 1 ≤ j ≤ lv} be
its ordered set of plaintext characters, let Jv be the ordered set of the character
positions of the needed salt changes for v, let w ∈ Σ^(lv·m) be the SecureString 3.0
object for v and {wj ∈ Σ^m | 1 ≤ j ≤ lv} be its ordered set of ciphertext characters,
let R be a plaintext repository with search strings and their replacement
strings, let (ui, ti) ∈ R | 1 ≤ i ≤ |R| be the ith element of R, let Jui be the ordered
set of the character positions of the needed salt changes for ui, let lui be the length of
ui, let d be a beginning index, let e be an ending index, and let salt ∈ Σ^(m−1)
be a random word.

7.3.1 Trusted Sender A

If a trusted receiver C does not confide in the computations of a trusted sender A
(such as queries or screenings), then an invoked external resource B must take them
over. Neither A nor C can entirely trust such a B (e.g., a public cloud) if it is not under
their physical control. For this reason, according to Algorithm 1, A enciphers all input
plaintexts that shall be truncated by B with Algorithm 5. Additionally, it triggers an
appropriate TTPG with an initial salt as well as with the positions of the needed salt
changes to prepare all ciphertexts which need to be merely sought by dint of Algorithm
3 or accessorily deleted/replaced with the aid of Algorithm 4 in SecureString 3.0
objects by B. The common encipherment and conveyance of A's certificate, the
initial salt, the word length lv, and the positions of the needed salt changes
Jv in step 9 of Algorithm 1 makes sense, because an attacker could unnoticeably
alter unsigned salts or unsigned transformation change counters during their transit
through B and, therefore, cause the production of unusable public repositories.

Algorithm 1 encrypt
Require: Acert with Ae, AR, Jv, sAB, sACTTPG, sBTTPG, v
Ensure: w or error

1: A: ARNG creates salt {Salt creation}
2: A: AR ← {}; Jv ← {}; w ← {} {Initialization}
3: for j ← 1 to lv do {Iteration over plaintext characters}
4:   if vj ∈ AR then {Salt change}
5:     A: AR ← {}; Jv ← (Jv ∪ {j}); salt ← H(salt)
6:   end if
7:   A: AR ← (AR ∪ {vj}); w ← (w ∪ {E_sACTTPG(vj, salt)}) {Character encryption}
8: end for
9: A →sAB→ B: E_sACTTPG(Acert, salt, lv, Jv) {Safe transfer of initial salt, plaintext length, and salt change positions}
10: B →sBTTPG→ TTPG: E_sACTTPG(Acert, salt, lv, Jv) {Safe transfer of initial salt, plaintext length, and salt change positions}
11: TTPG: (Acert, salt, lv, Jv) ← D_sACTTPG(E_sACTTPG(Acert, salt, lv, Jv)) {Salt decryption}
12: TTPG → CA: Acert {Sender certificate transfer}
13: if Acert invalid then {CA rejects sender certificate}
14:   CA → TTPG: error {Transfer of rejection}
15:   TTPG →sBTTPG→ B: error {Safe transfer of rejection}
16:   B →sAB→ A: error {Safe transfer of rejection}
17:   return error
18: else {CA approves sender certificate}
19:   CA → TTPG: approved {Transfer of approval}
20:   TTPG: BR ← createPublicRepository {see Algorithm 2}
21:   TTPG →sBTTPG→ B: BR {Safe public repository transfer}
22:   B →sAB→ A: approved {Safe transfer of approval}
23:   return w
24: end if

7.3.2 Trusted Third Party Generator (TTPG)

A TTPG relieves A of precomputing ciphertexts (destined to be queried, exchanged,
or removed) and relieves the interface between A and B of voluminous uploads. As soon as
a TTPG has received an initial salt, v's length lv, and v's positions of the needed salt
changes Jv from A and verified their authenticity via A's certificate, it procures the
suitable plaintext expressions and enciphers those which are not longer than
v and do not necessitate more salt changes than v. Ultimately, the TTPG stores the
engendered ciphertexts in the public repository BR. Thence, BR at least comprehends
all ciphertexts which B must seek in w with the querying Algorithm 3. If B executes
the replacing Algorithm 4, then it requires a BR that does not only embrace the sought
ciphertexts but also their substitutes. Algorithm 2 generates a BR for the replacing
Algorithm 4, which also complies with the querying Algorithm 3.

Algorithm 2 createPublicRepository
Require: Jv, R, TTPGR, lv, sACTTPG, salt
Ensure: BR

1: TTPG: BR ← {} {Initialization}
2: for i ← 1 to |R| do {Iteration over search strings}
3:   if lui ≤ lv then {Search string not too long}
4:     TTPG: Jui ← {}; TTPGR ← {} {Initialization}
5:     for k ← 1 to lui do {Iteration over search string characters}
6:       if uik ∈ TTPGR then {Salt change incrementation}
7:         TTPG: Jui ← (Jui ∪ {k}); TTPGR ← {}
8:       end if
9:       TTPG: TTPGR ← (TTPGR ∪ {uik})
10:    end for
11:    if |Jui| ≤ |Jv| then {Proper number of salt changes in search string}
12:      TTPG: salt′ ← salt {Salt initialization}
13:      for j ← 1 to lv − lui + 1 do {Iteration over starting string character}
14:        TTPG: TTPGR ← {}; qi ← {}; salt ← salt′ {Initialization}
15:        for k ← 1 to lui do {Iteration over search string characters}
16:          if (j + k − 1) ∈ Jv then {Salt change}
17:            TTPG: TTPGR ← (TTPGR ∪ {k})
18:          end if
19:        end for
20:        for k ← 1 to lui do {Iteration over search string characters}
21:          if k ∈ TTPGR then {Salt change}
22:            TTPG: salt ← H(salt)
23:          end if
24:          if not (|Jui| > 0 and |TTPGR| > 0 and min(Jui) = min(TTPGR) and Jui ⊆ TTPGR) then {Useful search string}
25:            TTPG: qi ← (qi ∪ {E_sACTTPG(uik, salt)})
26:          end if {Search string character encryption}
27:        end for
28:        TTPG: TTPGR ← {}; ri ← {}; salt ← salt′ {Initialization}
29:        for k ← 1 to lti do {Iteration over replacement string characters}
30:          if tik ∈ TTPGR then {Salt change}
31:            TTPG: TTPGR ← {}; salt ← H(salt)
32:          end if
33:          TTPG: TTPGR ← (TTPGR ∪ {tik}); ri ← (ri ∪ {E_sACTTPG(tik, salt)})
34:        end for {Replacement string character encryption}
35:        TTPG: BR ← (BR ∪ {(qi, ri)}); salt′ ← H(salt′)
36:      end for
37:    end if
38:  end if
39: end for
40: return BR

7.3.3 Untrusted Host B (e.g., Public Cloud)

Once a faithless host B has got a BR from its assigned TTPG, it can properly execute
the querying Algorithm 3 or the replacing Algorithm 4 on the accordant input
ciphertext w. Without an acquired BR, B can still produce reasonable substrings of
w by pruning w with the picking Algorithm 5. While a TTPG as the producer of BR
becomes aware of the meaning of all elements in BR, it neither gets in touch with v
nor with w nor with querying results. In contrast, B as the consumer of BR can read
and even write on w, but cannot learn anything about the content of v. Thus, this
division of labor between a TTPG and B enforces secret sharing.

Algorithm 3 query
Require: BR, m, sAB, w
Ensure: true or false

1: A →sAB→ B: w {Safe ciphertext word transfer}
2: B: lv ← l_w / m {Calculation of plaintext word length}
3: for o ← 1 to lv do {Iteration over beginning ciphertext character}
4:   for j ← o to lv do {Iteration over ending ciphertext character}
5:     B: q ← {wo, …, wj} {Ciphertext substring creation}
6:     if q ∈ BR then {Substring found}
7:       B →sAB→ A: true
8:       return true
9:     end if
10:  end for
11: end for
12: return false
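The following Java sketch renders the querying idea of Algorithm 3 concrete (all names and the repository encoding are assumptions for illustration): the untrusted host B only compares opaque ciphertext blocks against the public repository BR prepared by the TTPG and never decrypts anything.

import java.util.Base64;
import java.util.List;
import java.util.Set;

public class BlindQuery {
    // w: the ciphertext blocks of one word; repository: the search strings of
    // BR, each encoded as dot-separated Base64 blocks in the same way as q.
    public static boolean query(List<byte[]> w, Set<String> repository) {
        for (int o = 0; o < w.size(); o++) {       // beginning ciphertext character
            StringBuilder q = new StringBuilder();
            for (int j = o; j < w.size(); j++) {   // ending ciphertext character
                q.append(Base64.getEncoder().encodeToString(w.get(j))).append('.');
                if (repository.contains(q.toString()))
                    return true;                   // substring found
            }
        }
        return false;
    }
}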

Algorithm 4 replace
Require: Acert with Ae, BR, m, sAB, sACTTPG, sBC, w
Ensure: v or error

1: A →sAB→ B: w {Safe ciphertext word transfer}
2: B: lv ← l_w / m {Calculation of plaintext word length}
3: for o ← 1 to lv do {Iteration over beginning ciphertext character}
4:   for j ← o to lv do {Iteration over ending ciphertext character}
5:     B: q ← {wo, …, wj} {Ciphertext substring creation}
6:     if q ∈ BR then {Replacement of found substring}
7:       B: (q, r) ← (qi, ri) ∈ BR | qi = q; w ← {w1, …, w(o−1), r, w(j+1), …, w(lv)}
8:       B: lv ← (lv − l_q/m + l_r/m) {Recalculation of plaintext word length}
9:     end if
10:  end for
11: end for
12: B →sAB→ A: w {Safe ciphertext word transfer}
13: A: v ← decrypt(w) {see Algorithm 6}
14: if A approves v then {Sender approves changed ciphertext word}
15:   A →sAB→ B: E_sACTTPG(Acert, HA(w)) {Safe signature transfer}
16:   B →sBC→ C: w, E_sACTTPG(Acert, HA(w)) {Safe transfer of ciphertext word and signature}
17:   C: (Acert, HA(w)) ← D_sACTTPG(E_sACTTPG(Acert, HA(w))) {Signature decryption}
18:   if HC(w) = HA(w) then {Receiver approves signature}
19:     C → CA: Acert {Sender certificate transfer}
20:     if Acert valid then {CA approves sender certificate}
21:       CA → C: approval {Transfer of approval}
22:       C: v ← decrypt(w) {see Algorithm 6}
23:       return v
24:     else {CA rejects sender certificate}
25:       return error
26:     end if
27:   else {Receiver rejects signature}
28:     return error
29:   end if
30: end if

Algorithm 5 pick
Require: Acert with Ae, d, e, sAB, sBC, w
Ensure: v or error

1: A →sAB→ B: w, d, e {Safe transfer of ciphertext word, beginning index, and ending index}
2: B: w ← {wd, …, we} {Ciphertext substring creation}
3: B →sAB→ A: w {Safe ciphertext word transfer}
4: A: v ← decrypt(w) {see Algorithm 6}
5: if A approves v then {Sender approves changed ciphertext word}
6:   A →sAB→ B: E_sACTTPG(Acert, HA(w)) {Safe signature transfer}
7:   B →sBC→ C: w, E_sACTTPG(Acert, HA(w)) {Safe transfer of ciphertext word and signature}
8:   C: (Acert, HA(w)) ← D_sACTTPG(E_sACTTPG(Acert, HA(w))) {Signature decryption}
9:   if HC(w) = HA(w) then {Receiver approves signature}
10:    C → CA: Acert {Sender certificate transfer}
11:    if Acert valid then {CA approves sender certificate}
12:      CA → C: approval {Transfer of approval}
13:      C: v ← decrypt(w) {see Algorithm 6}
14:      return v
15:    else {CA rejects sender certificate}
16:      return error
17:    end if
18:  else {Receiver rejects signature}
19:    return error
20:  end if
21: end if

7.3.4 Trusted Receiver C

The only technical job of a beneficial trustworthy consignee C is to decipher incoming
SecureString 3.0 objects in accordance with Algorithm 6. Nevertheless, C needs
to trust in the following responsibilities of the other involved parties:

- A trusted sender A double-checks the revisions of a TTPG and chooses to either
  acknowledge modified messages with its signature or to drop them.
- For each plaintext v, a TTPG gains a starting salt from A plus a (cached) set of
  objectionable terms from an online database and precomputes a public repository
  BR out of them.
- An untrusted host B notifies A of the appearance of any BR element in the relevant
  ciphertext w, or it eliminates/replaces all appearances of BR elements in w and
  returns the amended SecureString 3.0 objects to A for review.

Algorithm 6 decrypt
Require: m, sACTTPG, w
Ensure: v

1: C: lv ← l_w / m; v ← {} {Calculation of plaintext word length; Initialization}
2: for j ← 1 to lv do {Iteration over ciphertext characters}
3:   C: vj ← (first character of D_sACTTPG(wj)) {Character decryption}
4:   C: v ← (v ∪ {vj}) {Plaintext reassembly}
5: end for
6: return v

7.3.5 Certification Authority (CA)

All components of the architectural model confide in the used certification authority.
The CA generally behaves passively and becomes merely active on demand to
approve/reject authentication requests for valid/invalid certificates (see Algorithms
1, 4, and 5), to issue new certificates, and to blacklist compromised certificates.
The next section describes how SecureString 3.0 withstands various typical offenses
against cryptosystems.

7.4 Security Analysis

Menezes et al. itemized six attack classes whose objective is to systematically recover
plaintext from ciphertext or, even more drastically, to deduce the decryption key [19].
This section investigates the implications of probable examples of each class on
SecureString 3.0 objects in the descending order of their difficulty, respectively in
the ascending order of their dangerousness.
All subsections below assume the encryption scheme of SecureString 3.0 in Definition 2.

Definition 2 Let Σ denote an alphabet; then SecureString 3.0 extends each single
character vj ∈ Σ | 1 ≤ j ≤ |v| of a new plaintext message v with a salt sj ∈
Σ^(m−1) in a vein so that the lowest number of salt changes adheres to (∀j)(∄i ≠
j) | vj sj = vi si. Thereafter, it lets the underlying cryptosystem scramble each plaintext
character–salt combination vj sj ∈ Σ^m | 1 ≤ j ≤ |v| with a secret key k ∈ Σ^l to the
ciphertext: E: Σ^l × Σ^m → Σ^m, (k, vj sj) ↦ E(k, vj sj) = wj.

7.4.1 Ciphertext-Only Attacks

A ciphertext-only attack indicates an advance of an adversary in which they try to
break a cryptosystem by only observing ciphertext. This sort of strike usually occurs
if a raider sniffs and analyzes scrambled packets traversing a network hop.
Each ciphertext block of a SecureString 3.0 object disguises a plaintext character
concatenated with a salt. The covertness of these blocks primarily results from
the privacy quality of the used underlying high-performing cryptosystem. Prudential
secrecy means that, for instance, hackers cannot draw conclusions whether two
adjacent ciphertext blocks hide the identical salt. In an extreme case, two neighbored
plaintext characters merely differ in one bit and become prolonged with a common
salt. Upon conversion of them into two ciphertext blocks, differential cryptanalysis
must not be able to unveil the tiny disparity between the two plaintexts with the help
of the ciphertext blocks.
Salt swaps (just before the encryption scheme of SecureString 3.0 would twice
encipher a certain plaintext character–salt combination) guarantee unique ciphertext
characters within SecureString 3.0 objects. This statement holds as long as the
employed (trapdoor) hash function h does not doubly confect any salt sj ∈ Σ^(m−1) during
the assembly of a SecureString 3.0 object. If it happened anyway, then not only a
salt sk | k ∈ ℕ would become repeated, but also all its successors, due to sk+1 = h(sk)
and so forth. Successful collision attacks could be the consequence. Thus, SecureString
3.0 inherits its privacy from h as well, according to Theorem 1.

Theorem 1 As long as a practiced (trapdoor) hash function h: Σ^(m−1) → Σ^(m−1) does
not repeatedly generate any salt during the creation of a SecureString 3.0 object
w ∈ Σ^(m·|v|) in agreement with Definition 2, w cannot be broken with ciphertext-only
attacks due to its resemblance to a random character string.

Proof The timely salt renewals of the encryption scheme of SecureString 3.0 warrant
the singularity of all enriched plaintext characters within the plaintext v, i.e.,
(∀j)(∄i ≠ j) | vj sj = vi si. The bijective characteristic of the encryption function
E leaves the ciphertext characters in w the uniqueness of the plaintext character–salt
combinations in v.

The prevention of succeeding collision attacks forces the ciphering scheme of
SecureString 3.0 to create a fresh random initial salt s1 at the encryption start of each
subsequent plaintext object. After the exhaustion of all possible salts, at the latest, a
new secret key k becomes badly required. Otherwise, fruitful collision attacks would
become highly supposable.
The encryption schemes of all SecureString versions produce one ciphertext
block for each plaintext character. Therefrom, an evildoer can effortlessly and clandestinely
cut out substrings of SecureString 3.0 objects or permute ciphertext characters
inside them. Originators that confirm modifications of their SecureString 3.0
objects with their signature provide the most auspicious counteraction against such
integrity violations.

7.4.2 Known-Plaintext Attacks

A known-plaintext attack takes place if an offender attempts to reveal ciphertext or
the secret key k by a quantity of plaintext and corresponding ciphertext. This attack
only deserves its name if a cryptanalyst must work with given plain- and ciphertext.
Otherwise, if plain- or ciphertext is selectable, then one of the successional chosen-text
attacks goes on. To take a case in point, spyware can debunk a bulk of plaintext
(intended for safe conveyance) in a trustworthy environment (like in a client device),
and the related ciphertext is to be ascertained in the client application as well or in
the transmission cloud.
In the case of SecureString 3.0, attackers will not find any replicated ciphertext
block, not even in the face of recurring plaintext characters, due to the well-timed salt
changes. If salts and the exerted hash function become apparent to assailants, then
they can predict successive salts. Even so, they have not learned the secret key k of the
underlying cryptosystem and can neither correctly encipher any plaintext characters
with any foretold salt nor decrypt ciphertexts, pursuant to Theorem 2.

Theorem 2 If the plaintext character substring {vj ∈ Σ | 1 ≤ j ≤ i < |v|} ⊂ v and
the appended salts {sj ∈ Σ^(m−1)}, the (trapdoor) hash function h: Σ^(m−1) → Σ^(m−1),
the ciphertext string w ∈ Σ^(m·|v|), the ciphering function E of the encryption scheme
according to Definition 2 and its inverse decryption function D: Σ^l × Σ^m → Σ^m,
(k, wj) ↦ D(k, wj) = vj sj are known, and the secret key k ∈ Σ^l is unknown, then the
complementary plaintext character substring {vj ∈ Σ | i + 1 ≤ j ≤ |v|} ⊂ v can
be solely revealed with a brute-force attack.

Proof The direct decryption {D: Σ^l × Σ^m → Σ^m, (k, wj) ↦ D(k, wj) = vj sj | i + 1 ≤
j ≤ |v|} fails due to the lack of k, and a brute-force attack with all keys of the
key space Σ^l takes too long and drops out. The reciprocal plan to guess and encrypt
a vj plus its derived adhesive salt sj with all keys of the key space Σ^l until the result
coincides with wj lasts even longer and drops out as well.

Nonetheless, the underlying cryptosystem must be bulletproof against differential
cryptanalyses to avoid prosperous findings about correlations between ciphertext
differences.

7.4.3 Chosen-Plaintext Attacks

A chosen-plaintext attack happens if a miscreant challenges an encryption scheme
with an amount of discretionary plaintext in order to gain the commensurate ciphertext.
For example, a malicious regular user of a service can simply try to operate this
raid type if they feed the encryption scheme with arbitrary plaintext and monitor the
output.

In comparison to the aforementioned known-plaintext attacks, an assault against
SecureString 3.0 may lay open plaintext–ciphertext relations for repetitions of self-chosen
plaintext rather than for a caught one. If they also gather the opportunity to
see the attached salts, then they can verify the deployed hash function. Again, as long
as differential cryptanalyses do not do the trick against the underlying cryptosystem,
the inefficient brute-force attack described in Theorem 2 remains their mere prospect
and, on that account, the sky has not fallen. If a culprit takes cover as a normal subscriber
of a service, they might know their own secret key, but even in that case, they
cannot elicit foreign secret keys.

7.4.4 Adaptive Chosen-Plaintext Attacks

An adaptive chosen-plaintext attack ameliorates a chosen-plaintext attack by letting
a wretch select plaintext in accordance with the ciphertext obtained from previous
requests. This offense can be conducted by the recently mentioned maleficent user
if they choose and encipher plaintext depending on formerly gotten ciphertext.
Adaptive chosen-plaintext attacks on SecureString 3.0 additionally advantage differential
cryptanalyses compared to non-adaptive ones, because they admit short-term
decisions to encrypt either the same plaintext character with a consecutive salt
or another plaintext character with the equal salt, for example. While the underlying
cryptosystem resists them, the ineffective brute-force attack delineated in Theorem 2
persists as the sole option.

7.4.5 Chosen-Ciphertext Attacks

A chosen-ciphertext attack is carried out if an opponent can retrieve plaintext out
of picked ciphertext. One way to mount such an offense is to grab ciphertext in the
transmission cloud in order to attain access with malware to the proper equipment used
for decryption (but not to the secret key k that hopefully securely resides in the
TPM (Trusted Platform Module) of the device).
If a foe proceeds against SecureString 3.0 in this manner, then they aim to discover
the correct secret key k and to emulate the decryption scheme of the underlying
cryptosystem without needing admittance to the original decryption terminal, in
order to be able to decipher different ciphertext. In the event of a vulnerable underlying
cryptosystem, they even could descry the secret key k. In the case of an invulnerable
underlying cryptosystem, they need to kick off a brute-force attack on the basis of
the plaintext–ciphertext pairs at hand to glean k as per Theorem 3.

Theorem 3 If the ciphering function E of the encryption scheme under Definition 2,
its inverse decryption function D in virtue of Theorem 2 and all possible appertaining
pairs of cipher- and enriched plaintext characters {(vj sj, wj) ∈ Σ^m × Σ^m | 1 ≤ j ≤
|Σ^m|, E(k, vj sj) = wj} are known, and the secret key k ∈ Σ^l is unknown, then k
can be solely recovered with a brute-force attack.

Proof The brute-force attack merely needs (vi si, wi) as one of the pairs of cipher- and
enriched plaintext characters in order to either encrypt vi si or decrypt wi with all
keys of the key space Σ^l till it finds the k that leads to E(k, vi si) = wi respectively
D(k, wi) = vi si.

7.4.6 Adaptive Chosen-Ciphertext Attacks

An adaptive chosen-ciphertext attack is a chosen-ciphertext attack in which an enemy
selects ciphertext dependent on the plaintext retrieved from precedent inquiries. In
reference to the example for chosen-ciphertext attacks, an antagonist takes the decision
about the next ciphertext (that they want to process) after each acquired plaintext.
In comparison with non-adaptive chosen-ciphertext attacks on SecureString 3.0,
adaptive ones connote a higher hazardousness, because they imitate the decryption
scheme of the underlying cryptosystem without the need for the secret key k. In the
worst case of a violative underlying cryptosystem, a transgressor could even fully
break it by uncovering the secret key k without the referred brute-force attack in
Theorem 3.
The subjacent section moves away from security and inspects the efficacy of
SecureString 3.0.

7.5 Performance Analysis

Due to the discriminative encryption scheme of SecureString 3.0 in contrast with
SecureString 2.0 concerning salt generation, this section dedicates itself to a comparative
survey of their performance.

7.5.1 Experimental Environment

An expedient comparison requires at least a similar experimental setup for both
cryptosystems. On this account, the test setting of SecureString 2.0 [9] has been reused
for SecureString 3.0, complemented with a TTPG and a common RMI (Remote
Method Invocation) registry for all remote objects. Figure 7.2 illustrates the logical
components of the elaborated experimental environment as well as a rudimentary
overview of the messages exchanged between them. The test program initially
executes the black-colored steps to interconnect the elements as needed. It solely
reruns them for recovery purposes after detected functional outages. The red-colored
sequence is iterated many times to accomplish a statistically significant sample
size for each evaluated data value.

Fig. 7.2 Experimental environment and message flow

The lines below recapitulate the role of each unit in the test design.

Trusted Sender: This machine sets out to automatically fabricate and encrypt random
messages when all constituents have become interlinked through the (black-colored)
dialogs. The sender coercively pauses if at least one part of the testbed malfunctions
and resumes once everything is up and running again. This entails that the present
lab configuration does not support the examination of a store-and-forward mode,
which would be necessary in a real use case to retry failed deliveries towards an
absent trusted receiver. The confidential sender generates and conveys a haphazard
initial salt for each message to the TTPG before it pushes the enciphered message
to the untrustworthy host. Together with the salt, the consigner lets the TTPG know
about the positions of salt transitions for each message to help the TTPG minimize
the magnitude of the prepared cloud repository. In addition, the sending process
takes the responsibility to gage the turnaround time of every spawned message
through the red-colored flow in order to compile final performance statistics.
Trusted Third Party Generator (TTPG): This trusty entity is responsible for quickly
converting one or multiple sought plaintext keywords into a cloud repository with
all possible ciphertext alternatives in compliance with the assigned salt change
positions. Applications are conceivable in which such a cloud repository embraces
a ciphertext replacement string for each ciphertext search string. Just to give an
example, the TTPG of SIMS [10] fetches a bunch of deprecated vocables from an
online plaintext bad word dictionary, and for each of them it opposes an equally
long chain of asterisks as replacement string; alternatively, the empty string acts
as replacement string (a toy sketch of this repository generation follows the
component list below). In the performance analysis of SecureString 2.0 [9], the
trustworthy sender generated one random replacement string for each search string.
A convincing performance analysis of SecureString 3.0 calls for dealing in the same
way. The job of the TTPG finishes with the handover of the distilled cloud
repositories to the untrusted host.
RMI Registry: The Java implementation of SecureString 3.0 works with the RMI
mechanism and, hence, demands a suitable reference directory, called an RMI
registry. While each of the two launched RMI servers in the SecureString 2.0
architecture (untrusted host, trusted receiver) sparked off its own RMI registry,
all RMI servers in the architectural layout of SecureString 3.0 (TTPG, untrusted
host, trusted receiver) share a collective RMI registry. The black-colored arrows
in Fig. 7.2 represent the registrations of the RMI servers in the RMI registry as
well as the lookups of the RMI clients (trusted sender, TTPG, untrusted host) for
RMI servers in the RMI registry.
Untrusted Host: This computer simulates a cloud resource that does not possess the
full confidence of the TTPG and of the trusted senders and receivers. Nevertheless,
it shall take over considerable workload due to tendered unrivaled benefits. The
workload encloses three simple assignments. First, the untrusted host queries
incoming ciphertext messages for ciphertext keywords out of the associated cloud
repository and passes messages without any found items on to the trusted receiver.
Second, it replaces each finding by the appendant ciphertext replacement string out
of the cloud repository and relays every processed message to the trusted receiver.
Third, it truncates the inbound messages and remits the resulting substrings to the
trusted receiver. Contrary to the recommendation in [12], the untrusted host in the
experimental buildup does not let the trusted sender double-check and sign/reject
modified ciphertexts, in order to comply with the experimental makeup of
SecureString 2.0 [9] and chalk up a meaningful comparative performance analysis.

Trusted Receiver: This node just deciphers each arriving SecureString 3.0-object
and notifies the untrusted host of its accomplished reception. The untrusty node is
in charge of forwarding all these notifications to the trusted sender.
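As a rough illustration of the TTPG's assignment described above, the following hedged Python toy derives a cloud repository that maps ciphertext variants of one keyword to equally long asterisk replacement strings, in the spirit of SIMS [10]; the keyed hash E, the use of a single salt per keyword and all names are simplifying assumptions rather than the real SecureString 3.0 primitives.

```python
# Hypothetical toy sketch of the TTPG's job: derive a cloud repository mapping
# each ciphertext variant of a search keyword (one per admissible salt) to a
# ciphertext replacement string; salt transition positions are simplified away.
import hashlib

def E(key, char, salt):
    """Deterministically encipher one salted plaintext character (toy cipher)."""
    return hashlib.sha256(key + char.encode() + bytes([salt])).hexdigest()[:8]

def build_repository(key, keyword, salts):
    repo = {}
    for salt in salts:  # one ciphertext alternative per admissible salt
        ciphertext = "".join(E(key, c, salt) for c in keyword)
        repo[ciphertext] = "*" * len(keyword)  # equally long asterisk chain
    return repo

repo = build_repository(b"shared-key", "badword", range(4))
print(len(repo), "ciphertext alternatives prepared")
```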

Every reader of the introductive work [12] will notice that the experimental
installation for SecureString 3.0 misses a CA. The rationale is the same as for the
lack of sender signatures: maintaining comparability with the performance of
SecureString 2.0.
By the same token, both SecureString editions resort to the consistent, below-quoted
test details:

- Hardware: HP DL380 G5 with 8 x 2.5 GHz cores and 32 GB main memory
- Operating system: Fedora Core 64-bit Linux
- Programming language/virtual machine: Java 8 Standard Edition
- Underlying high-performing cryptosystem: AES (Advanced Encryption Standard) with 128 bits
- Sample size: 1,000,000 random messages for each pictured data point

7.5.2 Querying Performance

The querying performance is measured combinatorially as the mean processing
time for each plaintext message length $|v| \in \{1, \ldots, 64\}$ and each plaintext search
string length $|u| \in \{1, \ldots, |v|\}$, i.e., for $\sum_{|v|=1}^{64} |v| = \frac{64 \cdot (64+1)}{2} = 2{,}080$ data points. A
Java RNG in the trusted sender issues one arbitrary search string per data point
that becomes enciphered in the TTPG and sought in all 1,000,000 samples by the
untrusted host. Figure 7.3 comparatively shows that SecureString 3.0 approximately
consumes 4.5 times the processing time of SecureString 2.0. The throughput time
rises from 0.1 ms to 0.45 ms for |u| = |v| = 1 with the utilization of SecureString 3.0
instead of SecureString 2.0. Likewise, the diagram peak increases from 1 ms to
4.5 ms.

Fig. 7.3 Querying performance of SecureString 2.0 [9] (left) and SecureString 3.0 (right)

Fig. 7.4 Replacing performance of SecureString 2.0 [9] (left) and SecureString 3.0 (right)

7.5.3 Replacing Performance

The replacing performance is likewise clocked as the average cycle time of replacing
operations for 2,080 data values. For all 1,000,000 samples of each particular |v|-|u|-
combination, the identical random search string is employed as for the equivalent
data point of the querying performance evaluation. In this test run, the untrusted host
exchanges all occurrences of search strings in messages for the empty string, i.e., it
shortens affected messages. In contrast to the querying performance, Fig. 7.4
evidences a lower rate of lead time increase for replacing operations with the practice
of SecureString 3.0 in place of SecureString 2.0. The rise from 0.3 to 0.6 ms for
|u| = |v| = 1 implies a mere doubling. The increase of the peak from 1.4 to 5.24 ms
roughly amounts to a factor of 3.7.

7.5.4 Picking Performance

The picking performance again becomes quantified as the mean runtime for 2,080
diagram points. A Java RNG in the trusted sender randomizes a beginning index
$d \le |v|$ and an ending index $e \le |v|$ for all 1,000,000 samples of each particular
|v|-|u|-combination such that the condition $e - d = |u|$ is fulfilled. Figure 7.5 attests for
|u| = |v| = 1 that the run duration of 0.29 ms for SecureString 2.0 imperceptibly grows
to 0.32 ms for SecureString 3.0. Furthermore, it merely substantiates a marginal
growth of the summit from 0.51 to 0.70 ms.

Fig. 7.5 Picking performance of SecureString 2.0 [9] (left) and SecureString 3.0 (right)

7.6 Conclusion

Seemingly, even huge data scandals sensitize most private online users only for a
short time or do not bother them at all. Thereby, they carelessly expose intimate
information about themselves in/through the Internet without being aware that
maleficent parties are just waiting to exploit their behavior. Homomorphic
cryptosystems thwart such villains with blind computations on enciphered data.

This scholarly piece continues previous introductive work about the homomorphic
cryptosystem SecureString 3.0. After an introductory section and a further one
about related work, it unveils its implementation mechanisms with the help of a
rudimentary architecture model. The remaining section concerns the description of
the necessary constituents in this model and of six (pseudocode) algorithms spanning
them.

The security analysis examines the impact of Menezes' six attack classes on
SecureString 3.0-objects. It arrives at the conclusion that the concealment of
SecureString 3.0 mainly rests on an underlying high-performing cryptosystem with
strong robustness (e.g., against differential cryptanalyses) and on a collision-free
(trapdoor) hash function used for the fabrication of nonrecurring secret salts.
Moreover, SecureString 3.0 achieves integrity by obliging message originators to
sign accepted manipulated messages.

A performance analysis of SecureString 3.0 constitutes another merit of this
document. It demonstrates that designers who count on SecureString 3.0 rather than
on SecureString 2.0 to eliminate perilous security lapses (like repetition pattern
attacks) must pay the price of worse but still agreeable time performance. In other
words, secure design philosophy forbids cutting security corners in the name of
efficiency, because there exist too many fast, insecure systems [14]. Under the
adopted laboratory conditions of SecureString 2.0, querying operations on
SecureString 3.0-objects elongate around 4.5 times, replacing operations lengthen
up to 3.7 times, and picking operations last just slightly longer.

This disquisition prompts the authoring of successional treatises that cope with the
embedding of the demonstrated SecureString 3.0 algorithms in live multi-user
applications.

Acknowledgements Many thanks to Bettina Baumgartner from the University of Vienna for proofreading this contribution!

References

1. Abdalla M, Bellare M (2000) Increasing the lifetime of a key: a comparative analysis of the security of re-keying techniques. In: Okamoto T (ed) Advances in cryptology – ASIACRYPT 2000. Lecture notes in computer science, vol 1976. Springer, Berlin, Heidelberg, pp 546–559. https://dx.doi.org/10.1007/3-540-44448-3_42
2. Anderson RJ (2008) Security engineering: a guide to building dependable distributed systems, 2nd edn. Wiley. https://www.cl.cam.ac.uk/~rja14/book.html
3. Bauer FL (2010) Decrypted secrets: methods and maxims of cryptology, 4th edn. Springer Publishing Company. https://dx.doi.org/10.1007/978-3-540-48121-8
4. Blum M, Micali S (1984) How to generate cryptographically strong sequences of pseudorandom bits. SIAM J Comput 13(4):850–864. https://dx.doi.org/10.1137/0213053
5. Chai Q, Gong G (2012) Verifiable symmetric searchable encryption for semi-honest-but-curious cloud servers. In: 2012 IEEE international conference on communications (ICC), Jun 2012, pp 917–922. https://dx.doi.org/10.1109/ICC.2012.6364125
6. Elgaby M (2013) Computing on confidential character strings in an enhanced SIP-framework. Master's thesis, University of Hagen. https://dx.doi.org/10.13140/2.1.2059.4241
7. Fahrnberger G (2013) Computing on encrypted character strings in clouds. In: Hota C, Srimani PK (eds) Distributed computing and internet technology. Lecture notes in computer science, vol 7753. Springer, Berlin, pp 244–254. https://dx.doi.org/10.1007/978-3-642-36071-8_19
8. Fahrnberger G (2013) SecureString 2.0: a cryptosystem for computing on encrypted character strings in clouds. In: Eichler G, Gumzej R (eds) Networked information systems. Fortschritt-Berichte Reihe 10, vol 826. VDI, Düsseldorf, Jun 2013, pp 226–240. https://dx.doi.org/10.13140/RG.2.1.4846.7521/3
9. Fahrnberger G (2014) A second view on SecureString 2.0. In: Natarajan R (ed) Distributed computing and internet technology. Lecture notes in computer science, vol 8337. Springer International Publishing, pp 239–250. https://dx.doi.org/10.1007/978-3-319-04483-5_25
10. Fahrnberger G (2014) SIMS: a comprehensive approach for a secure instant messaging sifter. In: 2014 IEEE 13th international conference on trust, security and privacy in computing and communications (TrustCom), Sep 2014, pp 164–173. https://dx.doi.org/10.1109/TrustCom.2014.25
11. Fahrnberger G (2015) Repetition pattern attack on multi-word-containing SecureString 2.0 objects. In: Natarajan R, Barua G, Patra MR (eds) Distributed computing and internet technology. Lecture notes in computer science, vol 8956. Springer International Publishing, pp 265–277. https://dx.doi.org/10.1007/978-3-319-14977-6_26
12. Fahrnberger G, Heneis K (2015) SecureString 3.0: a cryptosystem for blind computing on encrypted character strings. In: Natarajan R, Barua G, Patra MR (eds) Distributed computing and internet technology. Lecture notes in computer science, vol 8956. Springer International Publishing, pp 331–334. https://dx.doi.org/10.1007/978-3-319-14977-6_33
13. Fahrnberger G, Nayak D, Martha VS, Ramaswamy S (2014) SafeChat: a tool to shield children's communication from explicit messages. In: 2014 14th international conference on innovations for community services (I4CS), Jan 2014, pp 80–86. https://dx.doi.org/10.1109/I4CS.2014.6860557
14. Ferguson N, Schneier B (2003) Practical cryptography. Wiley
15. Freitag D (2013) Erweiterung eines SMPP-Frameworks zur sicheren Verarbeitung vertraulicher Zeichenketten. Master's thesis, University of Hagen. https://dx.doi.org/10.13140/2.1.4680.8641
16. Gentry C (2009) A fully homomorphic encryption scheme. Ph.D. thesis, Stanford University, Stanford, CA, USA. http://crypto.stanford.edu/craig/craig-thesis.pdf
17. Halang WA, Komkhao M, Sodsee S (2014) Secure cloud computing. In: Boonkrong S, Unger H, Meesad P (eds) Recent advances in information and communication technology. Advances in intelligent systems and computing, vol 265, May 2014. Springer International Publishing, pp 305–314. https://dx.doi.org/10.1007/978-3-319-06538-0_30
18. ISO/IEC 10116:2006 (2006) Information technology – security techniques – modes of operation for an n-bit block cipher. International Organization for Standardization. https://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=38761
19. Menezes AJ, van Oorschot PC, Vanstone SA (1996) Handbook of applied cryptography, 1st edn. CRC Press, Boca Raton, Florida, USA. http://cacr.uwaterloo.ca/hac
20. Morris R, Thompson K (1979) Password security: a case history. Commun ACM 22(11):594–597. https://dx.doi.org/10.1145/359168.359172
21. Nayak D, Swamy M, Ramaswamy S (2013) Supporting location information privacy in mobile devices. In: Hota C, Srimani P (eds) Distributed computing and internet technology. Lecture notes in computer science, vol 7753. Springer, Berlin, pp 361–372. https://dx.doi.org/10.1007/978-3-642-36071-8_28
22. Örencik C, Savaş E (2014) An efficient privacy-preserving multi-keyword search over encrypted cloud data with ranking. Distrib Parallel Databases 32(1):119–160. https://dx.doi.org/10.1007/s10619-013-7123-9
23. Paillier P (1999) Public-key cryptosystems based on composite degree residuosity classes. In: Stern J (ed) Advances in cryptology – EUROCRYPT '99. Lecture notes in computer science, vol 1592. Springer, Berlin, pp 223–238. https://dx.doi.org/10.1007/3-540-48910-X_16
24. Petit C, Standaert FX, Pereira O, Malkin TG, Yung M (2008) A block cipher based pseudo random number generator secure against side-channel key recovery. In: Proceedings of the 2008 ACM symposium on information, computer and communications security (ASIACCS '08). ACM, New York, NY, USA, pp 56–65. https://dx.doi.org/10.1145/1368310.1368322
25. Rivest RL, Shamir A, Adleman L (1978) A method for obtaining digital signatures and public-key cryptosystems. Commun ACM 21(2):120–126. https://dx.doi.org/10.1145/359340.359342
Chapter 8
Performance Comparison for EMD Based
Classification of Unstructured Acoustic
Environments Using GMM and k-NN
Classifiers

Sujay D. Mainkar and S.P. Mahajan

Abstract The environment around us is very rich in acoustic information, with its
scope extending beyond mere speech signals. Perception of acoustic information
plays a significant role in allowing human beings to comprehend the sounds that they
hear in their surroundings. This paper focuses on the development of feature
extraction and the accurate classification of a variety of acoustic sounds in unstructured
environments, where adverse effects such as noise and distortion are likely to dominate.
This work attempts to classify ten different unstructured real-world acoustic
environments using empirical mode decomposition (EMD), which considers the inherent
non-stationarity of acoustic signals by decomposing the signal into intrinsic mode
functions (IMFs). These IMFs are used for feature extraction. This work suggests the
utility of a composite feature set for classification and proposes an optimized, robust,
best-suitable feature set for classification of the diverse acoustic environments. For
the classification task, Gaussian mixture model (GMM) and k-nearest neighbor
(k-NN) classifiers are used. Utilization of this optimized best-suitable feature set
yields a maximum classification accuracy of 100 % with the GMM classifier and an
average accuracy of 95 % for k-NN (k = 1, 3, 5). Lastly, this study presents the use
and comparison of various performance metrics to evaluate the classification
techniques used.

Keywords Empirical mode decomposition · Intrinsic mode function · Composite feature set · Performance measures

S.D. Mainkar ()
Finolex Academy of Management and Technology, Ratnagiri, Maharashtra, India
e-mail: smart.extc@gmail.com
S.P. Mahajan
College of Engineering, Pune, Maharashtra, India
e-mail: spm.extc@coep.ac.in


8.1 Introduction

The rapid development in the fields of digital networks and multimedia technologies
during the last few years has opened a new paradigm of huge expansion in audio
signal processing. In particular, for a mobile device to be smart, it requires
knowledge about the context of the surrounding environment. The introduction of
mobility in the system implies, in turn, that the user may access the system in an
adverse environment where acoustic disturbances may severely degrade the system
performance. Such diversity in the acoustic environment arising due to user mobility
can provide a rich source of information about the current context. In this concern,
this work suggests the use of the surrounding acoustic environment as a hint for
contextual identification. So, this paper focuses on the most accurate recognition of
acoustic events in challenging real-world, unstructured environments.
There are different factors which distinguish this work on acoustic environment
recognition from the topic of speech recognition. Firstly, the properties of
acoustic environments differ from those of speech, as the frequency content,
duration and profile of the sounds have a much wider variety than those of speech
alone. Secondly, no subword dictionary exists for real-world acoustic sounds, in
contrast to the possibility of decomposing words into their constituent phonemes.
Moreover, factors such as noise, reverberation and multiple sources are possible in
real-world unstructured scenarios, whereas research in the domain of speech
recognition has traditionally ignored these by assuming the use of single speakers
with close-talking microphones.
Practical applications of such a system include smart sensing of the acoustic
environment through intelligent wearable gadgets and hearing aids. Such knowledge
about the surroundings enables a device to provide better service as per the user's
needs. The mode of operation of a device can be adjusted in accordance with the
context; for instance, a smart phone may automatically switch to silent mode when
the user is at a lecture or in a conference meeting, and in contrast it may ring louder
when the person is in a noisy place like a market or station. Further, a device may
also adjust processing parameters by appropriately judging the surrounding
environment. For example, some recent hearing aids have multiple equalization
filters for different scenarios but require manual switching between filters to be done
by the user. As an improvement to this, an implementation of the proposed system
could smartly identify the surrounding acoustic environment so as to automatically
apply the desired settings to the hearing aid.

8.2 Related Work

In the past, most of the work has been done on audio categorization and framing
by using different features and methods. The use of a feature extraction matrix is
elaborated in [1] to classify five different audio categories, namely speech, music,
noisy speech, speech with music and silence. The discrimination of vehicular traffic
noise sources including cars, motorbikes and heavy trucks is done in [2] by using
different spectral features and a temporal feature. Three feature vectors representing
the pitch content, rhythmic content and timbral texture of music signals are
suggested and assessed using statistical pattern recognition classifiers with 61 %
accuracy in [3] for classifying ten musical genres. Categorization of environmental
sounds is carried out in [4] with the help of Chirplet, curvelet and Hilbert
transforms. In [5], the authors have made use of the discrete wavelet transform (DWT) to
differentiate between speech and music. In [6], the authors have put forth an algorithm
for segmentation and discrimination of audio clips into male speech, female speech,
silence, music and noise. They have also put forward best-suited features for
multiclass classification, yielding an accuracy of 96.34 % in audio classification.
Background noise sources of four different types are classified using EMD with a
discrimination success rate of 77 to 85 % in [7]. In [8], the author has attempted
noise event recognition (car, truck, airplane, etc.) using three classifiers.
Classification of five everyday noise types was done in [9] by comparing multiple
approaches, of which filter bank features proved to have an edge over the others. In [10],
the authors have designed noise classification algorithms by using four pattern
recognition frameworks. In [11], the authors have done a review on the latest trends in
the recognition of environmental sounds. A context awareness system on the basis of
acoustic signals is suggested in [12] for the detection of sound events in five different
real-world environmental sound categories.
Concerning machine learning methodologies, they have been utilized in various
audio processing tasks, e.g., music classification [13], generalized audio signal
categorization (news, music, sports, etc.) [14] and quantification of speech
intelligibility [15]. This literature review identified the need for a system design for multiclass
acoustic environment discrimination with the prime target of improvement in
accuracy, using EMD to deal with the intrinsic non-stationarity of acoustic signals.
The work presented in this paper builds on the basis of the abovementioned
related works with the following chief contributions. First of all, in this paper, we
put forward a unique, optimized, most appropriate composite feature set to make a
distinction between multiple classes of 10 different real-world unstructured acoustic
environments from the Diverse Environments Multi-channel Acoustic Noise Database
(DEMAND), achieving a maximum classification accuracy of 100 % using the GMM
classifier and an average accuracy of 95 % by the k-NN (k = 1, 3, 5) classifier with
Euclidean distance and of 96 % with city-block distance. Another novel contribution
of this paper is the use and comparison of performance criteria to evaluate the
different multiclass classifiers used, with the help of the confusion matrix.
The further outline of this paper includes an overview of EMD in Sect. 8.3. The
basic idea behind the proposed methodology is presented in Sect. 8.4. Section 8.5
describes feature extraction and feature selection. The classification methodologies
employed are explained in Sect. 8.6. Experimental results along with the database
description are covered in Sect. 8.7, followed by the conclusion and future scope in
Sect. 8.8.

8.3 Empirical Mode Decomposition

The empirical mode decomposition (EMD) is a completely data-driven method
suitable for the analysis of nonlinear and non-stationary time series. The EMD
algorithm performs the expansion of a time series into an ensemble of functions
defined by the signal itself, rather than using an a priori selection of basis functions
or filters to segregate a frequency component. The summation of amplitude
modulated (AM) and frequency modulated (FM) constituents is used for the
representation of a signal. The sifting process is the heart of the EMD algorithm, which
expands the signal into a set of zero-mean AM-FM constituents referred to as
intrinsic mode functions (IMFs).
For signals of a non-stationary and nonlinear nature, the local mean is associated
with a local time scale for the calculation of the mean, which is impractical to
compute. As a substitute, EMD uses a spline interpolation approach for defining
envelopes. These envelopes are defined in terms of the local maxima and local
minima. The IMFs correspond to the oscillatory modes embedded within the signal.
Each IMF actually is a zero-mean mono-component AM-FM signal of the following
form:

$$s(t) = \sum_{k=1}^{n} a_k(t) \cos \phi_k(t)$$

where the envelope amplitude $a_k(t)$ and phase $\phi_k(t)$ are time-varying quantities.
Physical interpretation and mathematical meaning can be extracted by means of the
amplitude and phase. In practice, most signals consist of more than one oscillatory
mode and so cannot be treated as IMFs. That means that EMD is equivalent to a
numerical sifting process which empirically splits a signal into a finite number of
IMFs that are basically hidden fundamental intrinsic oscillatory modes.
The mechanism of the sifting technique for extraction of these modes from a given
acoustic time series s(t) can be briefly explained as follows [16]:
1. Make out all local extrema (maxima and minima) of s(t);
2. Interpolate between the maxima (resp. minima) to obtain two envelopes $s_{up}(t)$ and $s_{low}(t)$;
3. Estimate the mean envelope $e_1(t) = [s_{up}(t) + s_{low}(t)]/2$ and extract the difference

$$d_1(t) = s(t) - e_1(t) \qquad (8.1)$$

4. Check whether $d_1(t)$ is an IMF. If it is not an IMF, then the above steps should be repeated until an IMF is obtained. So, consider $d_1(t)$ as the new data and iterate on it:

$$d_{11}(t) = d_1(t) - e_{11}(t) \qquad (8.2)$$

5. Suppose that after the completion of k such iterations $d_{1k}$ becomes an IMF, that means

$$d_{1k}(t) = d_{1(k-1)}(t) - e_{1k}(t) \qquad (8.3)$$

So, this is assigned as:

$$C_1(t) = d_{1k}(t) \qquad (8.4)$$

The first IMF $C_1(t)$ comprises the highest-frequency components of the signal. The residual signal $r_1(t)$ is then given by

$$r_1(t) = s(t) - C_1(t) \qquad (8.5)$$

6. Treat $r_1(t)$ as new data and repeat the above steps so as to extract all the IMFs. The sifting procedure is ceased once the remainder of the signal becomes zero-mean or of monotonic nature, as per the predecided stoppage criterion given by

$$SD_k = \sum_{t=0}^{T} \frac{\left[d_{1(k-1)}(t) - d_{1k}(t)\right]^2}{d_{1k}^2(t)} \qquad (8.6)$$

7. Finally, the original signal s(t) can be expressed as:

$$s(t) = \sum_{j=1}^{m} C_j(t) + r_n(t) \qquad (8.7)$$

where $C_j$ indicates the j-th IMF (the intrinsic modes are nearly orthogonal to each
other) and $r_n$ symbolizes the residual trend. In this manner, EMD operates on
non-stationary signals, thereby yielding narrowband components of decreasing
frequency.
An oscillation must satisfy two criteria for consideration as an IMF:
(a) at any time, the mean envelope derived by averaging the local maxima and
local minima envelopes must be zero;
(b) the number of zero-crossings must be equal to the number of extrema, or at
most they may differ by 1.
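The sifting procedure above translates almost directly into code. The following minimal Python sketch (assuming numpy and scipy; boundary effects and the exact stoppage handling are simplified compared to production EMD implementations) mirrors Eqs. (8.1)-(8.7):

```python
# Simplified EMD sketch: spline envelopes, sifting loop, IMF extraction.
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def mean_envelope(x):
    """Spline-interpolated mean of the upper and lower envelopes."""
    t = np.arange(len(x))
    maxima = argrelextrema(x, np.greater)[0]
    minima = argrelextrema(x, np.less)[0]
    if len(maxima) < 4 or len(minima) < 4:  # too few extrema: residual is a trend
        return None
    upper = CubicSpline(maxima, x[maxima])(t)
    lower = CubicSpline(minima, x[minima])(t)
    return (upper + lower) / 2.0

def sift_imf(x, sd_threshold=0.3, max_iter=100):
    """Iterate d_{1k} = d_{1(k-1)} - e_{1k} until SD_k (Eq. 8.6) is small enough."""
    d = x.copy()
    for _ in range(max_iter):
        e = mean_envelope(d)
        if e is None:
            return None
        d_new = d - e
        sd = np.sum((d - d_new) ** 2) / (np.sum(d_new ** 2) + 1e-12)
        d = d_new
        if sd < sd_threshold:
            break
    return d

def emd(x, max_imfs=8):
    """Decompose x into IMFs plus a residual trend (Eq. 8.7)."""
    imfs, residual = [], x.astype(float)
    for _ in range(max_imfs):
        imf = sift_imf(residual)
        if imf is None:
            break
        imfs.append(imf)
        residual = residual - imf  # remove C_j(t) before extracting the next mode
    return imfs, residual

# Toy usage: a two-tone signal should split roughly into its components.
t = np.linspace(0, 1, 1000)
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)
imfs, trend = emd(signal)
print(len(imfs), "IMFs extracted")
```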

8.4 Proposed Methodology

Appropriate preprocessing can be used for time-frequency analysis of an input
acoustic signal with inherent non-stationarity. In the present approach, silence is
first removed from the audio recording under test. The remaining audio clip is then
segmented into frames with a 50-ms frame period and a 25-ms overlap duration.
This frame overlapping makes sure that audio features occurring at a point of
discontinuity appear at least entirely in the next overlapped frame. Then, frame
decomposition is done so as to split these frames into a number of IMFs. From
these IMFs, different temporal and spectral features are extracted to form composite
feature vectors as described in Sect. 8.5. Finally, classification is done using
Gaussian mixture model (GMM) and k-nearest neighbor (k-NN) classifiers, followed
by performance evaluation, so as to conclude with an EMD-based unique optimized
feature set which is best suitable for discrimination of various real-world acoustic
environments. The generalized outline is shown in Fig. 8.1.

Fig. 8.1 Generalized outline for EMD-based classification of diverse environmental sounds using multiclass GMM/k-NN classifiers
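A minimal sketch of the described framing step follows, assuming numpy, a mono signal and the 48-kHz sampling rate of the DEMAND recordings; silence removal is omitted:

```python
# Split a signal into 50-ms frames with a 25-ms hop (i.e., 50 % overlap).
import numpy as np

def frame_signal(x, fs, frame_ms=50, hop_ms=25):
    frame_len = int(fs * frame_ms / 1000)
    hop = int(fs * hop_ms / 1000)  # 25-ms hop yields 50 % frame overlap
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

fs = 48000                      # DEMAND recordings are sampled at 48 kHz
x = np.random.randn(fs * 2)     # 2 s of dummy audio in place of a real clip
frames = frame_signal(x, fs)
print(frames.shape)             # (n_frames, 2400)
```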

8.5 Feature Extraction

For obtaining the characteristics of audio data, feature extraction is the vital
processing step in the multiclass classification of environmental sounds. The primary
goal is to acquire the set of features from the real-world audio clips of interest
which is capable of expressing the utmost information regarding the preferred
characteristics of the actual signal. The analysis of the environmental sound audio clip is
done by using feature extraction. The two major sorts of feature extraction methods
are temporal analysis and spectral analysis approaches. Temporal analysis
makes use of the time-domain waveform of the audio signal itself, whereas the
frequency-domain representation of the audio signal is utilized in spectral analysis.
IMFs, generated by splitting the input audio signals into a series of analysis windows
or frames, are used for feature extraction, and finally a single value corresponding
to every feature per frame is computed.
In this work, we have selected short-time energy (STE) and zero-crossing rate
(ZCR) as temporal features and spectral flux (SF), spectral roll-off (SR) and spectral
centroid (SC) as spectral features [3]. Further, we have also selected mel-frequency
cepstral coefficients (MFCCs) as one more feature. Here, we experiment with
composite feature vectors formulated by aggregating different features over all
frames of the first IMF.
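As a hedged illustration of the named features, the following Python sketch computes ZCR, STE and the spectral centroid per frame of the first IMF; the MFCC computation is omitted, as it would typically be delegated to a signal processing library, and all shapes are assumptions:

```python
# Per-frame temporal/spectral features: zero-crossing rate, short-time energy
# and spectral centroid, stacked into a composite feature vector per frame.
import numpy as np

def zero_crossing_rate(frame):
    return np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0

def short_time_energy(frame):
    return np.sum(frame.astype(float) ** 2)

def spectral_centroid(frame, fs):
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    return np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12)

def frame_features(frames, fs):
    """One (ZCR, STE, SC) feature triple per frame."""
    return np.array([
        (zero_crossing_rate(f), short_time_energy(f), spectral_centroid(f, fs))
        for f in frames
    ])

fs = 48000
frames = np.random.randn(10, 2400)       # stand-in for frames of the first IMF
print(frame_features(frames, fs).shape)  # (10, 3)
```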

8.6 Classification

For the task of classification of environmental sounds, we have used k-nearest
neighbor (k-NN) and Gaussian mixture model (GMM) classifiers.
(A) k-Nearest Neighbor Classifier
The k-nearest neighbors (k-NN) technique is a simple lazy-learner classification
algorithm that stores all obtainable data points and classifies newly introduced
instances based on a similarity measure (e.g., distance functions). It belongs to the
set of supervised learning algorithms and has been popularly utilized in the domain
of pattern recognition since the beginning of the 1970s as a nonparametric technique.
During the training phase, the algorithm simply stores the data points including
their class labels, and all computation is postponed until the classification process.
So k-NN is based on the rule that instances which are in close proximity to one
another have comparable properties. Thus, to categorize new unclassified instances,
one simply has to consider their k nearest neighbors to figure out the classification
label. The class membership can be decided by a majority vote of the immediate
k neighbors, or the neighbors can be ranked and weighted as per their distance from
the new instance. A general weighting scheme consists of giving each neighbor a
weight of 1/d, where d is the distance to the neighbor.
An example of the effect of the k value on classification is shown in Fig. 8.2, where
a k-NN classifier classifies two-dimensional data into two classes.

Fig. 8.2 Illustrative example of k-NN classification [17]

The first circle represents a region with the three neighbors that are involved in
deciding where the unknown data point (red star) belongs. In this case, the k value
is set to three (k = 3), and the classified star belongs to the class of blue polygons.
The second circle represents the five neighbors (k = 5) considered in the
classification task. In the second case, the classification result is the opposite, and the
unknown star belongs to the class of green squares [17].
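A minimal sketch of the described k-NN classification with scikit-learn (assumed available) follows, covering k = 1, 3, 5 and both the Euclidean and city-block metrics used later in the experiments; the random data merely stands in for the composite feature vectors:

```python
# k-NN over toy composite feature vectors for ten environment classes.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 3))      # composite feature vectors
y_train = rng.integers(0, 10, size=100)  # ten environment class labels
X_test = rng.normal(size=(5, 3))

for k in (1, 3, 5):
    for metric in ("euclidean", "cityblock"):
        knn = KNeighborsClassifier(n_neighbors=k, metric=metric)
        knn.fit(X_train, y_train)
        print(k, metric, knn.predict(X_test))
```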
(B) Gaussian Mixture Model (GMM) Classifier
In GMM classification, a Gaussian mixture model is used for the statistical
representation of environmental audio recordings. The distribution of feature vectors
extracted from these audio clips is modeled by a mixture of Gaussian density
functions (Fig. 8.3).

Fig. 8.3 Illustrative example of GMM classification [17]

A complete GMM is defined by its mean vectors, covariance matrices and mixture
weights. Every identified environment type has its own model, which is then used as
its characteristic representation instead of speakers and their utterances, in order to
identify the neighboring atmosphere of the speaker [17]. The identification
assignment is a maximum likelihood classifier. The main task of the system is to decide
whether an input acoustic recording belongs to one of the set of environmental sounds,
which are represented by their models. This decision is based on the computation of
the maximum posterior probability for the input feature vector [17].
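The following hedged Python sketch mirrors this decision rule with scikit-learn: one Gaussian mixture is fitted per environment class, and a test clip is assigned to the class model with the maximum summed log-likelihood (equal priors assumed; class names and data are illustrative):

```python
# One GMM per class; classify by maximum log-likelihood over all frames.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
classes = ["bus", "kitchen", "park"]
train = {c: rng.normal(loc=i, size=(200, 3)) for i, c in enumerate(classes)}

models = {c: GaussianMixture(n_components=4, random_state=0).fit(X)
          for c, X in train.items()}

def classify(X):
    # Sum of per-frame log-likelihoods under each class model.
    scores = {c: m.score_samples(X).sum() for c, m in models.items()}
    return max(scores, key=scores.get)

test_clip = rng.normal(loc=1, size=(50, 3))  # frames of an unknown recording
print(classify(test_clip))                   # expected: "kitchen"
```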

8.7 Database Description and Experimental Results

This section highlights the description of the database used, followed by a focus on
the experimental results along with a detailed discussion.
(A) Database Description
For experimentation, the Diverse Environments Multi-channel Acoustic Noise
Database (DEMAND) is used. These 16-channel recordings were carried out using
arrays of 16 Sony ECM-C10 omnidirectional electret condenser microphones.
The recordings were captured at a 48-kHz sampling rate and with a 5-min (300 s)
target length, after the removal of all sorts of setup noises and other artifacts by
trimming.

Table 8.1 Description of real-world acoustic environments

Sr. no. Environment Description
1 Bus Public transit bus
2 Field Sports field with activity nearby
3 Kitchen Inside a kitchen while food preparation
4 Living Inside a living room
5 Meeting Meeting room while the microphone array is discussed
6 Office Small office with three people using computers
7 Park Well-visited city park
8 Restaurant University restaurant at lunch time
9 River Creek of running water
10 Station Main transfer area of a busy subway station

The database is comprised of recordings in 6 broad categories with 3 different
environments within each category. Four of these categories consist of indoor
environments, whereas the remaining two belong to outdoor recordings. The indoor
environments are categorized as domestic, office, public and transportation; the
open-air environments include nature and street [18].
Among these 18 noisy real-world acoustic environments, we have used the
16-channel recordings of 10 environments for experimentation. So, each of these ten
individual categories consists of 160 min (9,600 s) of recordings considering 50 %
frame overlapping, of which 80 min of recordings are used for training and the
remaining 80 min for testing in each case. These ten real-world noisy acoustic
environments are summarized in Table 8.1 [19].
(B) Experimental Results and Discussion
(a) Evaluation of feature-set effectiveness
With the aim of evaluating the effectiveness of each composite set of features
under consideration, we have experimented with 18 different varieties of composite
feature vectors comprising aggregations of multiple features corresponding to the
first IMF, for all the frames of all 16 samples associated with the 10 real-world noisy
acoustic environments, using both GMM and k-NN (k = 1, 3, 5), as presented in
Table 8.2.
Out of the 18 different composite feature pool combinations, ZCR-STE-MFCC
(highlighted with bold faces in Table 8.2) outperforms the rest, irrespective of the
classifier used, and hence only the results corresponding to ZCR-STE-MFCC are
presented and discussed hereafter in this paper.
(b) Performance comparison of k-NN and GMM classifiers
Table 8.3 represents the comparative performance of the k-NN (k = 1, 3, 5) and
GMM classifiers when the three features ZCR, STE and MFCC are used together
with their respective first IMFs to form a composite feature pool. Table 8.3
illustrates the effect of combining the effective composite feature set ZCR-STE-MFCC
with the efficient GMM classifier, collectively yielding 100 % classification accuracy.

Table 8.2 Composite feature description


Set no. Composite feature set Set no. Composite feature set
1 ZCR-SF 10 STE-SF-MFCC
2 ZCR-SF-MFCC 11 SC-SF-SR
3 SF-MFCC 12 SC-STE-SF
4 ZCR-MFCC 13 SR-STE-SF-MFCC
5 STE-MFCC 14 SR-STE-SC-MFCC
6 STE-ZCR 15 ZCR-STE-SF-MFCC
7 STE-SF 16 ZCR-STE-SF-SC
8 ZCR-SC-MFCC 17 ZCR-STE-SF-SR
9 ZCR-STE-MFCC 18 SC-STE-SF-SR

Table 8.3 Illustration of comparative classifier performance for 10 real-world acoustic environments

Environment Classification accuracy (%)
            GMM  1-NN  3-NN  5-NN
Living      100  75    75    75
Kitchen     100  100   100   100
Park        100  100   100   100
Meeting     100  100   87.5  87.5
River       100  100   87.5  75
Office      100  100   100   100
Sports field 100 100   100   100
Restaurant  100  100   100   100
Station     100  87.5  100   100
Public bus  100  100   100   100

The reason is obvious, as a GMM is a weighted sum of Gaussian probability
density functions.
GMM-based classification algorithms describe the data distribution according to
the given data, thus giving a better estimation of the whole data space and robustness
to the sample data distribution, provided the dataset is considerably large. Further,
classifier algorithms do not work well with raw data such as a huge array of numbers
representing an audio clip. Data have to be preprocessed to extract a few pieces of
valuable information, called features.
In this work, features are formulated as composite feature vectors. After vast
experimentation, the ZCR-STE-MFCC feature set along with the GMM classifier
was found to yield 100 % classification accuracy, as shown in Fig. 8.4.

Fig. 8.4 Performance comparison of GMM and k-NN classifiers using ZCR-STE-MFCC as composite feature set

Table 8.4 Summary of performance evaluation measures

Measure Formula Evaluation focus
Sensitivity (Recall) TP/(TP + FN) Effectiveness of a classifier to identify positive labels
Specificity TN/(TN + FP) Ability of a classifier to exclude negative labels
Accuracy (TP + TN)/(TP + TN + FP + FN) Overall effectiveness of a classifier
Precision TP/(TP + FP) Class agreement of data labels with the positive labels given by a classifier

(c) Comparison using different performance measures
The metrics for evaluation used in this work are based on the standard elements
of the confusion matrix: true positive (TP), false positive (FP), true negative
(TN) and false negative (FN). TP represents the total of correctly classified samples,
and TN is the total number of samples that were correctly not assigned to a class
they did not belong to. Similarly, FP cases are those that did not belong to a class
but were allocated to it; on the other hand, FN cases are those that belong to a class
but were not allocated to it. Further, there exist numerous performance measures
expressed in terms of TP, TN, FP and FN. A brief summary of the evaluation
parameters used here, along with their formulae, is given in Table 8.4.
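A small Python sketch of the Table 8.4 measures computed from the confusion-matrix counts of a single class in a one-vs-rest view follows; the example counts are hypothetical, not taken from the experiments:

```python
# Table 8.4 measures from TP/TN/FP/FN counts, expressed in percent.
def class_metrics(tp, tn, fp, fn):
    return {
        "sensitivity": 100.0 * tp / (tp + fn),  # effectiveness on positive labels
        "specificity": 100.0 * tn / (tn + fp),  # ability to exclude negatives
        "accuracy":    100.0 * (tp + tn) / (tp + tn + fp + fn),
        "precision":   100.0 * tp / (tp + fp),
    }

# Hypothetical example: 15 of 16 clips of one environment recognized, with one
# false alarm among the clips of the remaining classes.
print(class_metrics(tp=15, tn=143, fp=1, fn=1))
```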
The evaluations discussed so far help us in assessing the cost of making wrong
decisions, which in turn lead to incorrect classification by the system under
consideration, as illustrated in Fig. 8.5.
As can be observed from Fig. 8.5, all characterizing parameters under
consideration show good performance, with their values ranging from 93.75 to 100 %,
irrespective of the classifier used. This further highlights the effectiveness of
EMD-based feature extraction for the classification task.

Fig. 8.5 Performance comparison of GMM and k-NN classifiers using different machine learning measures

8.8 Conclusion

In this paper, we explained and evaluated the GMM and k-NN (k = 1, 3, 5) classifiers
used for the multiclass real-world acoustic environment classification task. The
classification accuracy for the 10 diverse environments from the DEMAND database
was computed using a feature set obtained by employing feature fusion over the
first IMFs, formulated on the basis of EMD. The topmost accuracy of 100 % was
reached by the GMM classifier using ZCR-STE-MFCC as the composite feature set,
irrespective of the variation in acoustic environment. Further, this feature set has
also provided an accuracy of more than 90 % for the k-NN (k = 1, 3, 5) classifier,
with both the Euclidean and Manhattan distance metrics. Thus, the robustness of
ZCR-STE-MFCC as a composite feature set, along with its great multiclass
discrimination accuracy, has been proven by our experiments.
An obvious path for potential research is further increasing the number of classes
for multiclass classification and focusing on validation methods for performance
characterization apart from the generic performance measures used here. Alternative
classifiers such as the support vector machine (SVM) could also be used.

References

1. Nitanda N, Haseyama M, Kitajima H (2005) Accurate audio segment classification using feature extraction matrix. In: Proceedings of ICASSP
2. Sobreira-Seoane MA, Moleras AR, Castro JLA (2008) Automatic classification of traffic noise. In: Proceedings of Acoustics'08, Paris, June 29-July 4 2008, pp 6221–6226
3. Tzanetakis G, Cook P (2002) Musical genre classification of audio signals. IEEE Trans Speech Audio Process 10(5)
4. Han B, Hwang E (2009) Environmental sound classification based on feature collaboration. In: Proc ICME, pp 542–545
5. Ramalingam T, Dhanalakshmi P (2014) Speech/music classification using wavelet based feature extraction techniques. J Comput Sci 10(1):34–44
6. Mahajan SP, Sahu J, Sutaone MS, Kokate VK (2010) Improving performance of multiclass audio classification using SVM. CIIT Int J Data Mining Knowl Eng 2(5):95–103. ISSN 0974-9683
7. Jhanwar D, Sharma KK, Modani SG (2013) Classification of environmental background noise sources using Hilbert-Huang transform. Int J Signal Process Syst 1(1):68–73
8. Couvreur C (1997) Environmental sound recognition: a statistical approach. Ph.D. thesis, Faculte Polytechnique de Mons, Belgium, June 1997
9. Sawhney N (1997) Situational awareness from environmental sounds
10. El-Maleh K, Samouelian A, Kabal P (1999) Frame level noise classification in mobile environments. In: Proceedings of the 1999 IEEE international conference on acoustics, speech, and signal processing, pp 237–240
11. Chachada S, Kuo C-CJ (2013) Environmental sound recognition: a survey. In: Signal and information processing association annual summit and conference (APSIPA), 2013 Asia-Pacific, Oct-Nov 2013, pp 1–9
12. Sivaprakasam T, Dhanalakshmi P (2013) A robust environmental sound recognition system using frequency domain features. Int J Comput Appl 80(9):5–10, Oct 2013. ISSN 0975-8887
13. Bergstra J, Casagrande N, Erhan D, Eck D, Kegl B (2006) Aggregate features and ADABOOST for music classification. Mach Learn 65(2-3):473–484
14. Dhanalakshmi P, Palanivel S, Ramalingam V (2011) Classification of audio signals using AANN and GMM. Appl Soft Comput 11(1):716–723
15. Li FF, Cox TJ (2007) A neural network model for speech intelligibility quantification. Appl Soft Comput 7(1):145–155
16. Huang NE, Shen Z, Long SR, Wu ML, Shih HH, Zheng Q, Yen NC, Tung CC, Liu HH (1998) The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc R Soc Lond A 454:903–995
17. Wu X, Kumar V (2009) The top ten algorithms in data mining. Chapman & Hall/CRC
18. Thiemann J, Ito N, Vincent E (2013) The diverse environments multi-channel acoustic noise database (DEMAND): a database of multichannel environmental noise recordings. In: 21st international congress on acoustics, Jun 2013, Montreal, Canada
19. http://parole.loria.fr/DEMAND/
Chapter 9
Performance of Multimodal Biometric
System Based on Level and Method
of Fusion

Mrunal Pathak and N. Srinivasu

Abstract User authentication is essential to provide security that restricts access to
system and data resources. Automated recognition of individuals based on their
biological and behavioral characteristics is referred to as a biometric system.
Recognition of a legitimate user depends upon feature vector(s) extracted from either
their distinguishing behavioral traits or both distinguishing behavioral and
physiological traits such as face, finger, speech, iris and gait. Research on biometrics has
distinctly increased for solving identification and authentication issues in forensics,
physical and computer security, customs and immigration. However, unimodal
biometric systems are not able to fulfill the reliability constraints, speed and
acceptability of authentication in real applications due to noise in the sensed data, spoof
attacks, data quality, lack of distinctiveness, restricted degrees of freedom,
non-universality and other factors. Therefore, multimodal biometric systems are used to
increase security as well as to obtain better performance. To establish the identity of
individuals, multimodal biometric systems unite the information presented by
multiple biometric sensors, samples, algorithms, units and traits. Multimodal biometric
systems are not only used for enhancing matching performance, but these systems
also provide improved population coverage, discourage deceit, make continuous
monitoring possible and impart fault tolerance to biometric applications. This paper
presents an overview of different multimodal biometric (multibiometric) systems
and their fusion techniques with respect to their performance.

Keywords Biometrics · Unimodal · Multimodal · Fusion · Multibiometric system

M. Pathak ()
AISSMS Institute of Information Technology, Savitribai Phule Pune University,
Pune, India
e-mail: mrunalkpathak@gmail.com
N. Srinivasu
K. L. University, Guntur, India
e-mail: srinivasu28@kluniversity.in


9.1 Introduction

In today's scenario, security is a major concern. High-level industries use biometric
authentication systems established on evidence from an individual source of
information, called unimodal systems [1], which make use of physiological
characteristics such as iris, ear, fingerprint, face, teeth, retina, palm print or veins, and of
signature, voice and gait as behavioral characteristics [2, 3]. Each biometric has its
own strengths and weaknesses in terms of accuracy, user acceptance and applicability,
and accordingly each biometric is used in an authentication application. The
advantage of a biometric is that it cannot be changed, stolen or misplaced. In a
real-world application, none of the individual biometric systems is anticipated to
effectively meet all requirements after deployment. A variety of problems occur in
unimodal biometric systems, such as [4]:
(a) Noisy sensor data: for example, a voice sample altered by a cold or a fingerprint
with a scar. Ambient conditions and defective or improperly maintained sensors
lead to noisy data and hence to inaccurate matching or false rejection.
(b) Non-universality: a subset of users of the biometric system may not be able to
provide meaningful biometric data due to illness or disabilities.
(c) Intra-class variation: this variation is caused by the user when the sensor
characteristics change during authentication or if the user interacts incorrectly with
the sensor, for example with a wrong facial pose. The false rejection rate (FRR)
of a biometric system increases with large intra-class variation.
(d) Inter-class similarities: this means overlapping of the feature spaces
corresponding to several users. The false acceptance rate (FAR) of a biometric system
increases if inter-class similarities increase, for example in the face recognition
of identical twins.
(e) Failure-to-enroll: unsuccessful attempts to create a template from low-quality
inputs.
(f) Spoof attacks: submitting fake biometric traits such as a face mask or mimicry
of voice or signature causes a spoofing attack. Due to the imitability of the data,
unimodal biometric systems are vulnerable to spoofing.
(g) Restricted degree of freedom: in a unimodal biometric system, we use features
from a single biometric trait such as face, iris or palm, which restricts the
recognition performance.
(h) Unacceptable error rate: the equal error rate (EER) is decided based on the value
at which FRR and FAR coincide. The lower the EER, the more accurate the
system is considered to be; otherwise, the error rate will be unacceptable.
A multimodal biometric system overcomes some of the limitations of unimodal
biometric systems. A multibiometric system utilizes information from one or more
modalities, or multiple processing techniques, or both. Therefore, multimodal
biometric systems are those which integrate more than one physiological and/or
behavioral characteristic for enrollment, identification or verification in order to
improve performance and reliability [5, 6]. Some common multimodal biometrics
are face and iris; iris and fingerprints; face and fingerprints; face and voice; face,
fingerprints and iris; and face, fingerprint and signature.
In this paper, the details of multimodal biometric systems are reviewed in different
sections. Section 9.1 is the introduction. In Sect. 9.2, the general biometric system
is discussed. Section 9.3 is an introductory section on multimodal biometric systems;
it gives an overview of a selection of well-known multimodal biometric systems
and set-ups that are used by researchers worldwide. Subsequently, Sects. 9.4 and 9.5
address an overview of the levels and methods of multimodal fusion. Operation
modes are discussed in Sect. 9.6, whereas Sect. 9.7 focuses on issues in multimodal
biometrics, with its applications in Sect. 9.8. The paper is concluded in Sects. 9.9
and 9.10 with experimental results, a conclusion and a discussion of future
directions in multimodal biometrics.

9.2 Biometric System

A variety of civilian, commercial and forensic applications have used biometric
systems for person verification. Conventional methods to secure person
authentication applications include tokens as well as passwords and PINs, and magnetic
and smart cards. However, biometric technologies have a supreme benefit when it
comes to person identity assurance. A biometric system is a branch of pattern
recognition systems which works on features extracted from acquired information
about the biometric traits of each individual.
A standard biometric system has four phases [7]: (a) Information about an
individual's biometric trait is captured in the form of raw biometric data in the
enrollment phase. (b) The feature extraction phase processes the data to remove artifacts
from the sensor and uses a few types of normalization to build an extracted feature
set that is a compact representation of the trait. The synthesis of the significant
characteristics extracted from the features is known as a template. (c) A score is
generated based on how closely the sample matches the created templates, which
are stored in the database, in the matching and comparing phase. (d) Acceptance or
rejection of a user is based on the matching score obtained from the matching
module in the decision-making phase. Identification and verification are the two
modes of biometric recognition: identification involves checking the acquired
information about an unknown person against the templates corresponding to all
users in the database to give an identity for the unrecognized person in the form of
a name or identification number, whereas the verification mode compares the
acquired biometric information with the templates corresponding to the claimed
identity. Finally, authentication occurs based on pattern matching (Fig. 9.1).
Most of the existing biometric systems were developed based on single biometric
features (fingerprint, ear, face, iris and so on). Every biometric trait used for
authentication has its own strengths and weaknesses. Figure 9.2 shows different
biometric traits that are currently popular.

Fig. 9.1 Block diagram of general biometric system

Fig. 9.2 Examples of biometric traits



The choice of biometric trait depends on the application, because no single biometric
is the ultimate recognition tool. A comparison of the above biometric traits based
on their characteristics is provided in Table 9.1.
False acceptance rate (FAR) and false rejection rate (FRR) at various thresholds
[8] are used to measure the performance of a biometric system.
False acceptance rate (FAR): It is defined as the probability that the system
incorrectly authorizes a non-authorized person, due to incorrectly matching the
biometric input with a stored template. The ratio of the number of false acceptances
to the number of identification attempts is the false acceptance rate. The value of
the FAR is measured as a percentage.
False rejection rate (FRR): It is defined as the probability that the system
incorrectly denies access to an authorized person, due to a failure to match the biometric
input with a stored template. The FRR is normally calculated as the ratio of the
number of false rejections to the number of identification attempts. Its value is
measured as a percentage.
A threshold value based on the FAR and FRR values calculated over all possible
legitimate and imposter matching scores decides the acceptance or rejection of a
match. Two feature vectors of the same individual are compared to measure a
genuine matching score, and two feature vectors from different individuals are
compared to obtain an impostor matching score (Table 9.1).
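As a hedged illustration, the following Python sketch computes FAR and FRR from genuine and impostor score lists at a few thresholds (assuming that higher scores indicate better matches; all score values are made up):

```python
# FAR/FRR at a decision threshold, expressed in percent.
def far_frr(genuine_scores, impostor_scores, threshold):
    # FAR: fraction of impostor attempts wrongly accepted.
    false_accepts = sum(s >= threshold for s in impostor_scores)
    far = 100.0 * false_accepts / len(impostor_scores)
    # FRR: fraction of genuine attempts wrongly rejected.
    false_rejects = sum(s < threshold for s in genuine_scores)
    frr = 100.0 * false_rejects / len(genuine_scores)
    return far, frr

genuine = [0.91, 0.85, 0.78, 0.88, 0.60]
impostor = [0.30, 0.45, 0.62, 0.20, 0.55]
for t in (0.4, 0.6, 0.8):
    far, frr = far_frr(genuine, impostor, t)
    print(f"threshold={t:.1f}: FAR={far:.0f}%  FRR={frr:.0f}%")
```

Sweeping the threshold in this way traces out the trade-off between FAR and FRR; the operating point where the two curves cross corresponds to the EER mentioned earlier.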

9.3 Multimodal Biometric System

Unimodal biometric systems have some limitations, such as inter-class similarities,
intra-class variation, noisy data, spoofing and non-universality, which can be
overcome by a multimodal biometric system by synthesizing information from
multiple biometric traits to claim the identity of a person [9]. When multiple samples of a
single biometric trait are captured, this is known as multisample biometrics, and/or
when samples of more than one biometric trait are captured, this is known as
multisource or multimodal biometrics. The identity of the claimed person is based on
input taken from single or multiple sensors measuring two or more different
modalities of biometric characteristics. Forging a single biometric characteristic is much
easier than forging multiple biometric characteristics; therefore, multimodal
biometric systems are generally much more resilient to fraudulent technologies. Hence,
these systems are more reliable, highly accurate and secure in biometric
identification as compared to individual biometric modalities [6]. One of the major
advantages of such a system is that the failure-to-enroll (FTE) rate is significantly reduced
in multimodal evaluation. Nowadays, multimodal biometric systems have been
widely deployed and adapted in various civilian applications such as banking
security, to check online credit card transactions, and ATM security, by providing
login privileges.

Table 9.1 Comparison of various biometric traits based on characteristics


Biometric characteristics Universality Distinctiveness Permanence Collectability Performance Acceptability Circumvention
Facial thermogram H H L H M H L
Hand vein M M M M M M L
Gait M L L H L H M
Keystroke L L L M L M M
Oder H H H L L M L
Ear M M H L L M L
Hand geometry M M M H M M M
Fingerprint M H H M H M M
Face H L M H L H H
Retina H H M L H L L
Iris H H H M H L L
Palmprint M H H M H M M
Voice M L L M L H H
Signature L L L H L H H
DNA H H H L H L L

Equal error rate (EER), false acceptance rate (FAR) and false rejection rate (FRR) are used to measure the accuracy of the system [10].
Several biometric systems developed in the literature are accomplished by fusing a single trait with multiple representations and multiple matchers, more than one trait of an individual, or multiple feature extraction and matching algorithms applied to the same biometric. In single-trait multiple-representation fusion, each representation has its own classifier, and fusion takes place at the matching stage after each classifier reports a score for a class. Performance degradation can be avoided by selecting classifiers based on their goodness [11]. In the matching module, multiple matching strategies of the biometric system are incorporated with the help of single-biometric multiple matchers, and performance is improved by combining the scores generated by these strategies [12]. To improve the speed and reliability of biometric systems, matching scores obtained from multiple biometric sources are integrated [13].
Various levels of fusion, possible scenarios, modes of operation, integration strategies and design issues for multimodal biometric systems have been proposed by Ross and Jain (2003). Serial, parallel and hierarchical modes are the operational modes of a multimodal system. Serial mode forces the user to use the modalities one after another; therefore, there is no need to acquire information from multiple sources (e.g. multiple traits) simultaneously, and the overall recognition time is reduced because a decision can be made before acquiring all the traits. In the parallel mode of operation, information from multiple modalities is used simultaneously to perform recognition. The strength of a multimodal biometric system can be enhanced by fusing measurements from different biometric traits [14]. A general multimodal biometric system can be represented as shown in Fig. 9.3.

Fig. 9.3 Block diagram of general multimodal biometric system



Generally there are two phases, i.e. the enrollment phase and the authentication phase, in which a multimodal biometric system operates. The two phases are described as follows [15].
Enrollment phase: Biometric traits of a user for two or more modalities are captured and stored as templates for that user in the system database, which are later used in the authentication phase.
Authentication phase: Biometric traits of a user for different modalities are captured and used either to identify or to verify a person, by comparing them with the templates stored for all users in the database [16].
A multimodal biometric system can be described with the help of the following four modules:
1. Sensor module
2. Feature extraction module
3. Matching module
4. Decision-making module
Sensor module: Biometric traits for the different modalities are captured and, after preprocessing, used as input for the feature extraction module.
Feature extraction module: Compact representations of the different biometric traits or modalities, known as features, are extracted; they are then given to the matching module for comparison and are also stored as templates in the database.
Matching module: The templates in the database are compared with the extracted features.
Decision-making module: The decision to accept or reject the user is taken against the template stored in the database.

9.4 Levels of Multimodal Fusion

Features from various biometric traits are fused in multimodal biometric fusion to enhance the strength and reduce the weakness of each measurement. The purpose of multimodal biometric fusion is to extract content from a group of input biometric traits. Multimodal fusion in biometric systems is classified into two broad types [17]: pre-classification and post-classification. In pre-classification fusion, information is combined before applying any classification method or matching algorithm; in post-classification fusion, information is combined after the decisions of the classifiers. Pre-classification fusion uses raw input data from different biometric traits [18, 19], so biometric fusion takes place either at feature level (early fusion) or at data level (sensor level). Post-classification fusion is divided into dynamic classifier selection, abstract-level fusion, rank-level fusion and matching score-level fusion [17, 20, 21].
(A) Data-Level Fusion
It is the process of integrating multiple data and knowledge representing more than one signal from a set of same-type modality sources (e.g. the same view recorded by two webcams from different viewpoints) into a consistent, accurate and useful representation, without loss of information. It is highly vulnerable to noise and failures due to the absence of preprocessing. Data-level fusion is not commonly used, because the data required for fusion should be compatible, which is rare among biometric sensors (Fig. 9.4).

(B) Feature-Level Fusion

Time-synchronized (tightly coupled) modalities are fused in feature-level fusion. Features extracted from different modalities are independent, so it is sufficient to combine two feature vectors into a single vector. A feature reduction technique is employed to extract the useful features from the large feature set. Feature selection is done using a defined, precise mechanism such as sequential forward selection, in which the distance score between the query vector and the vector stored in the database is calculated using measures such as the Euclidean distance; an example is the fusion of speech and lip movement in speaker recognition. Feature-level fusion is at risk from time synchronization issues between the multimodal features and from low-level information loss, although it handles noise well and achieves better task accomplishment. A dynamic classifier selection scheme is used to estimate the accuracy of each individual classifier in the local region surrounding the input pattern to be classified, and mostly selects the classifier most likely to produce the correct decision for the given input pattern [22, 23]. Dynamic selection requires large data sets for estimating local classifier accuracy. Feature-level fusion produces better results when the modalities are related than when they are unrelated [24, 25].

Fig. 9.4 Levels of fusion

(C) Decision-Level Fusion

In decision-level fusion, features are extracted from every biometric trait, and these extracted features are classified as accept or reject after the matching module. The final outputs of the multiple classifiers for the different modalities are then combined. Decision-level fusion commonly faces the possibility of a tie; this can be avoided by using more classifiers than classes, for example by taking three classifiers in a verification application. Methods such as majority voting, the AND rule and OR rule, weighted voting based on the Dempster–Shafer theory of evidence, and behavior-knowledge space are then used to arrive at the final decision. Decision-level fusion has some advantages over feature-level fusion, like scalability in terms of the modalities used in the fusion process; it also allows a suitable method for analyzing each single modality, such as a support vector machine (SVM) for image and a hidden Markov model (HMM) for audio. A disadvantage of decision-level fusion is that the learning process is tedious and time consuming, as it uses a different classifier method to obtain a local decision for every modality used. Also, decision-level fusion uses abstract-level information, which holds binary values, so it is less preferred.
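As a minimal illustration of decision-level fusion, the Python sketch below combines the binary accept/reject outputs of several per-modality classifiers by majority voting; the decision values are hypothetical, and an odd number of classifiers is used to avoid the tie problem mentioned above.

```python
from collections import Counter

def majority_vote(decisions):
    """Fuse per-modality accept/reject decisions by majority voting.

    `decisions` is a list of booleans (True = accept) produced by the
    individual classifiers; an odd count avoids ties.
    """
    counts = Counter(decisions)
    return counts[True] > counts[False]

# Hypothetical local decisions from face, fingerprint and iris classifiers.
print(majority_vote([True, False, True]))   # -> True (accept)
print(majority_vote([False, False, True]))  # -> False (reject)
```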

(D) Rank-Level Fusion

It is preferred in biometric identification systems to improve performance. In rank-level fusion, each classifier associates a rank with every enrolled identity. Each biometric matcher provides its results as a subset of possible matches ranked in descending confidence values. The highest rank method, the Borda count method and the logistic regression method are used to combine the ranks assigned by different matchers. In the highest rank method, the different matchers compute the highest (minimum) rank for each possible match. In the Borda count method, the sum of the ranks provided by each matcher is used to calculate the combined rank. In the logistic regression method, logistic regression [26, 27] is used to determine weights, which are then used to calculate a weighted sum of the individual ranks. Fusion can be done by consolidating two or more biometric matching ranks associated with an identity to determine a new unique rank, which is used in the final decision [28].
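To make the Borda count concrete, the sketch below (with invented identities and ranks) sums the ranks each matcher assigns to every enrolled identity; the identity with the smallest combined rank is the fused best match.

```python
def borda_count(rankings):
    """Combine per-matcher rankings by summing ranks (Borda count).

    `rankings` is a list of dicts mapping identity -> rank (1 = best).
    The identity with the lowest total rank is the fused best match.
    """
    totals = {}
    for ranking in rankings:
        for identity, rank in ranking.items():
            totals[identity] = totals.get(identity, 0) + rank
    return min(totals, key=totals.get), totals

# Hypothetical ranks from a face matcher and a fingerprint matcher.
face = {"alice": 1, "bob": 2, "carol": 3}
finger = {"alice": 1, "bob": 3, "carol": 2}
print(borda_count([face, finger]))
# -> ('alice', {'alice': 2, 'bob': 5, 'carol': 5})
```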

(E) Match Score-Level Fusion

The similarity between the input biometric and the template biometric stored in the database is measured by the match score. Integration can be done at the matching score level when the output from each biometric matching module is a group of possible matches along with the quality of the individual matching score and its associated confidence value. Measurement- or confidence-level fusion is also referred to as match score-level fusion. The matching score level is a commonly used approach in multimodal biometrics because the richest information about the input pattern is available in the matcher output, which gives a more accurate decision. Also, the scores generated by different matchers are relatively easy to access and combine [29, 30].
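The following sketch shows one common way to realize match score-level fusion: min-max normalization of each matcher's scores to [0, 1], followed by a weighted sum rule. The scores, score ranges and weights are hypothetical, and the sum rule is only one of several combination rules discussed in Sect. 9.5.

```python
def min_max_normalize(score, lo, hi):
    """Map a raw matcher score into [0, 1] given that matcher's range."""
    return (score - lo) / (hi - lo)

def weighted_sum_fusion(scores, ranges, weights):
    """Fuse normalized match scores with a weighted sum rule."""
    normalized = [min_max_normalize(s, lo, hi)
                  for s, (lo, hi) in zip(scores, ranges)]
    return sum(w * n for w, n in zip(weights, normalized))

# Hypothetical face and fingerprint scores with different native ranges.
fused = weighted_sum_fusion(scores=[72.0, 0.81],
                            ranges=[(0.0, 100.0), (0.0, 1.0)],
                            weights=[0.4, 0.6])
print(fused)  # 0.4*0.72 + 0.6*0.81 = 0.774; accept if above a threshold
```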
Information integration at an early stage (pre-classification fusion) is more effective than integration at a later stage (post-classification fusion) in multimodal biometrics. It is therefore expected that feature-level fusion gives better recognition results, but it is difficult to integrate features at this level due to the large feature set as well as the incompatibility of features from different modalities. Also, most commercial biometric systems do not allow access to the feature sets used in their products. Integration of information at decision-level fusion inevitably loses useful detailed information, as it uses abstract data. Match score-level fusion is therefore mostly preferred, because the scores of different modalities are easily accessed and combined [1, 2, 31–33].

9.5 Methods for Multimodal Fusion

Rule-based methods, classification-based methods and estimation-based methods are the three categories of multimodal fusion, based on the nature of their methods and the classification of the problem space [17, 34, 35].
(a) Rule-Based Fusion Methods
Basic rules for combining multimodal information are used in rule-based fusion methods. MAX, MIN, AND, OR, linear weighted fusion (sum and product) and majority voting are some statistical rule-based methods [36]. Custom-defined rules can also be constructed for a specific application perspective; however, these rules are domain-specific, and precise knowledge of the domain is required to define them. Among rule-based methods, linear weighted fusion is commonly used because it is simple and computationally inexpensive, and it performs well if the weights of the different modalities are appropriately determined [37]. Domains such as sports video analysis and multimodal dialog systems mostly use rule-based fusion methods.

(b) Classification-Based Fusion Methods

Classification-based fusion methods include a range of classification techniques to categorize multimodal observations into one of several pre-defined classes. Methods for classification include the support vector machine (SVM), Bayesian inference, the dynamic Bayesian network (DBN), the neural network (NN), Dempster–Shafer theory and the maximum entropy model. Bayesian inference fusion works on probabilistic principles and uses a priori information, which allows easy integration of new observations. However, in the absence of appropriate a priori information it may produce inaccurate fusion results, and it is not suitable for handling mutually exclusive hypotheses. Time-series data commonly use the DBN in its different forms (e.g. the HMM); however, it is difficult to determine the right DBN state. For high-dimensional problem spaces, the NN method is generally used to generate a high-order nonlinear mapping, but due to the complicated network type there is a problem of slow training. SVM and DBN are widely used to enhance classification performance.

(c) Estimation-Based Fusion Methods

The problem of estimating parameters is solved by estimation-based fusion methods, which use the Kalman filter, the extended Kalman filter and particle filter methods. These methods are primarily used to calculate and predict the fused observation over a period of time, and are preferred for tracking tasks and object localization. The Kalman filter is suitable for linear models and the extended Kalman filter for nonlinear models, while particle filter methods are robust in nonlinear and non-Gaussian models.
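For intuition, the sketch below implements a scalar Kalman filter that fuses a stream of noisy measurements into a running state estimate; the process/measurement noise values and the measurement sequence are invented purely for illustration.

```python
def kalman_1d(measurements, q=1e-3, r=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter: fuse noisy measurements of a (nearly)
    constant quantity into a running estimate.

    q: process noise variance, r: measurement noise variance.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q                  # predict: uncertainty grows over time
        k = p / (p + r)            # Kalman gain
        x = x + k * (z - x)        # update with the new measurement
        p = (1.0 - k) * p          # updated uncertainty
        estimates.append(x)
    return estimates

# Hypothetical noisy observations of a true value around 1.0.
print(kalman_1d([1.1, 0.9, 1.05, 0.98, 1.02])[-1])  # converges near 1.0
```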

9.6 Modes of Operation

A multimodal biometric system works in three modes of operation:

a. Parallel mode: Multiple sources of information are acquired simultaneously to perform recognition [32].
b. Serial mode: Information from more than one biometric trait is not obtained simultaneously. The output of an individual biometric trait is used to scale down the number of possible identities before acquiring the next trait, so information from multiple sources is not acquired simultaneously, and the recognition time is reduced [38].
c. Hierarchical mode: A tree-like structure is formed to combine the individual classifiers in hierarchical mode. It is used when there is a large number of classifiers.
Parallel fusion mode demands that, in both the enrollment and recognition stages, all types of required traits are always captured for each user. Because of this, parallel fusion becomes inefficient and inconvenient due to the redundant capturing and matching of all the traits. In serial fusion mode, the user is authenticated against an individual biometric trait at every stage, one by one. A certain type of trait is sampled and matched against its template in each stage, and after a valid authentication all later stages are bypassed, so the user's effort and time are greatly reduced and system efficiency is improved.

9.7 Issues in Multimodal Biometrics

The design and implementation of a multibiometric recognition system architecture is based on software and hardware techniques, which may raise the following issues [39, 40]:
• Optimal biometric modalities and number of traits to select;
• Gathering adequate training data;
• Modeling uncertainty with different combinations of modalities at different times;
• Correlation between different modalities;
• Confidence level of different modalities, based on learning the weights assigned to individual biometrics;
• Fusion methodology and fusion level;
• Cost versus performance trade-off and accuracy versus reliability trade-off;
• Temporal modeling;
• Non-ideal and challenging conditions;
• The synchronization of modalities such as DNA and RFID for person identification and verification, and haptics for HCI and dialog systems;
• In multimodal fusion, it is difficult to take the appropriate synchronization decision regarding the type of modalities, the amount of their data to be processed and the time of multimodal fusion when analyzing a multimedia task;
• Choosing the best samples for a particular biometric.
To optimize the benefits of multibiometric recognition, the design issues must first be better understood, so that more effective multibiometric systems can be developed with respect to architecture and design methodology.

9.8 Applications

Today biometrics is used in a wide variety of applications to provide security, convenience and privacy enhancement in many commercial, criminal and civil settings, e.g. to provide secure, economical and user-friendly systems to access personal information and perform business transactions protected against fraud or information theft.

9.9 Result and Discussion

Multimodal and multiclassifier are the two different levels used to design a multimodal biometric system. Multiple algorithms or classifiers are used together to obtain better results at the multiclassifier level. Previous research shows different biometric and multibiometric systems that were developed and tested against FAR, FRR and accuracy to check their performance.

Table 9.2 Previous research-based comparison of different fusion techniques and their FAR, FRR and accuracy rates by various authors

Author              Fusion    Data set                  FAR (%)  FRR (%)  Accuracy (%)
Muhammed et al.     Score     Face, finger, vein        0.05     0.23     95
Mohammed et al.     Score     Face, speech              0.087    0.67     96
Dhanashree et al.   Feature   Palmprint, palm vein      0.029    1.0      99
Rattani et al.      Feature   Face, fingerprint         1.98     3.18     98
Bhagat et al.       Feature   Palm vein, face           0.5      1.0      98
Feifei et al.       Score     Fingerprint, finger vein  1.2      0.75     95
Krishneswari et al. Feature   Palm and fingerprint      1.02     0.9      98
Nazmeen et al.      Decision  Face, ear                 0        4        96
Nageshkumar et al.  Score     Palmprint, face           2.4      0.8      97
Krzyszof et al.     Decision  Face, speech              1.1      3.0      87
Mohamed et al.      Decision  Fingerprint, iris         2        2        98
Lin Hong et al.     Decision  Palmprint, face           1        1.8      92
Gayatri et al.      Feature   Face, palmprint           0.5      1.2      95
Mitul et al.        Feature   Palm and fingerprint      0.2      1.1      87
Jegadeesan et al.   Feature   Fingerprint, iris         10       5.3      91

Table 9.2 shows the FAR, FRR and accuracy of these systems [41].
The performance of a multibiometric system is evaluated on the basis of score, decision and feature fusion. Feature fusion techniques are a better choice compared to score and decision fusion techniques [41].

9.10 Conclusion and Future Direction

In this paper, we highlighted biometric systems and the limitations of individual biometrics. A multimodal biometric system is better than a unimodal biometric system, as it overcomes the problems of unimodal biometric systems such as noisy data, intra-class variation, inter-class similarities, non-universality and spoofing. The multimodal biometric authentication process provides and maintains authentication security that is as strong as possible, giving higher accuracy. The various plausible fusion levels, the integration strategies that can be adopted to consolidate information, and the methods of multimodal systems were discussed. Many multimodal biometric systems for person authentication already exist, but there are still some challenging issues in the design of multimodal biometric systems, such as the selection of an appropriate model, the choice of the optimal fusion level and redundancy in the extracted features, which need to be solved. A multibiometric system requires additional hardware and matching algorithms, so it becomes more expensive and complex. By combining multiple biometric traits, the performance of a biometric system and the level of security can be enhanced.

References

1. Jain AK, Ross A, Prabhakar S (2004) An introduction to biometric recognition. IEEE Trans Circuits Syst Video Technol 14(1):4–20
2. Ross A, Flynn P, Jain AK (2008) Handbook of biometrics. Springer, New York, USA
3. Aggithaya V, Zhang D, Luo N (2012) A multimodal biometric authentication system based on 2D and 3D palmprint features. Proc SPIE 6944:69440C-1
4. Ross A, Nandkumar K, Jain A (2008) Handbook of multibiometrics. Springer international edition, pp 271–292
5. Telgad RL, Siddiqui AMN, Deshmukh PD (2013) Automated biometric verification: a survey on multimodal biometrics. Int J Comput Sci Business Inform 6(1):1–8. ISSN 1694-2108
6. Ahuja MS, Chabbra S (2011) A survey of multimodal biometrics. Int J Comput Sci Appl 1:157–160
7. Ahuja MS, Chabbra S (XXX) A survey of multimodal biometrics. Int J Comput Sci Appl 157–160. ISSN 2250-3765
8. Hamming RW (1950) Error detecting and error correcting codes. Bell Syst Tech J 29(2):147–160
9. Sanjekar PS, Patil JB (2013) An overview of multimodal biometrics. Signal Image Process Int J (SIPIJ) 4(1)
10. Mishra A (2010) Multimodal biometrics it is: need for future systems. Int J Comput Appl 3(4). ISSN 0975-8887
11. Prabhakar S, Jain AK (2002) Decision-level fusion in fingerprint verification. Pattern Recogn 35:861–874
12. Jain AK, Prabhakar S, Chen S (1999) Combining multiple matchers for a high security fingerprint verification system. Pattern Recogn Lett 20:1371–1379
13. Ross A, Jain A (2003) Information fusion in biometrics. Pattern Recogn Lett 24:2115–2125
14. Rodriguez LP, Crespo AG, Lara M, Mezcua M (2008) Study of different fusion techniques for multimodal biometric authentication. In: IEEE International Conference on Wireless and Mobile Computing, Networking and Communications
15. Chaudhary S, Nath R (2015) A new multimodal biometric recognition system integrating iris, face and voice. Int J Adv Res Comput Sci Softw Eng 5(4):45150. ISSN 2277-128X
16. Golfarelli M, Maio D, Maltoni D (1997) On the error-reject tradeoff in biometric verification systems. IEEE Trans Pattern Anal Mach Intell 19(7):786–796
17. Atrey P, Hossain A, El Saddik A, Kankanhalli M (2010) Multimodal fusion for multimedia analysis: a survey. Multimedia Syst 16:345–379
18. Sharma R, Pavlovic VI, Huang TS (1998) Toward multimodal human-computer interface. Proc IEEE 86(5):853–860
19. Ross A, Jain A (2003) Information fusion in biometrics. J Pattern Recogn Lett 24(13):2115–2125
20. Zheng Y, Elmaghraby A (2011) A brief survey on multispectral face recognition and multimodal score fusion. In: Proceedings of IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), Bilbao, pp 543–550, 14–17 Dec 2011
21. Kaur D, Kaur G (2013) Level of fusion in multimodal biometrics: a review. Int J Adv Res Comput Sci Softw Eng 3(2)
22. Woods K, Kegelmeyer WP Jr, Bowyer K (1997) Combination of multiple classifiers using local accuracy estimates. IEEE Trans Pattern Anal Mach Intell 19(4):405–410
23. Giacinto G, Roli F (2001) Dynamic classifier selection based on multiple classifier behaviour. Pattern Recogn 34:179–181
24. Nadheen F, Poornima S (2013) Feature level fusion in multimodal biometric authentication system. Int J Comput Appl 69(18). ISSN 0975-8887
25. Ross A, Govindarajan R (2005) Feature level fusion in biometric systems. In: SPIE Conference on Biometric Technology for Human Identification II, vol 5779, Orlando, USA, March 2005, pp 196–204
26. Abaza A, Ross A (2009) Quality based rank-level fusion in multibiometric systems. In: Proceedings of 3rd IEEE International Conference on Biometrics: Theory, Applications and Systems (BTAS), Washington DC, USA, September 2009
27. Siddiqui AMN, Telgad R, Deshmukh P (2014) Multimodal biometric systems: study to improve accuracy and performance. Int J Current Eng Technol 4(1). ISSN 2277-4106
28. Meva DT, Kumbharana CT (2013) Comparative study of different fusion techniques in multimodal biometric authentication. Int J Comput Appl 66(19). ISSN 0975-8887
29. Garje PD, Agrawal SS (2012) Multibiometric identification system based on score level fusion. IOSR J Electron Commun Eng (IOSRJECE) 2(6):07–11. ISSN 2278-2834
30. Kazi M, Rode Y (2012) Multimodal biometric system using face and signature: a score level fusion approach. Adv Comput Res 4(1)
31. Ross A, Govindarajan R (2005) Feature level fusion using hand and face biometrics. Proc SPIE 5779:196–204
32. Hong L, Jain A (1998) Integrating face and fingerprints for personal identification. IEEE Trans Pattern Anal Mach Intell 20(12):1295–1307
33. Fierrez-Aguilar J, Ortega-Garcia J, Garcia-Romero D, Gonzalez-Rodriguez J (2003) A comparative evaluation of fusion strategies for multimodal biometric verification. In: Kittler J, Nixon M (eds) Proceedings of 4th International Conference on Audio- and Video-Based Biometric Person Authentication, LNCS 2688, pp 830–837
34. Aguilar JF, Garcia JO, Romero DG, Rodriguez JG (2003) A cooperative evaluation of fusion strategies for multimodal biometric verification. In: International Conference on Video-Based Biometric Person Authentication, Guildford, pp 830–837
35. Razzak MI, Alghathbar K, Yusof R (2011) Multimodal biometric recognition based on fusion of low resolution face and finger veins. Int J Innov Comput Inf Control 7(8):4679–4689. ISSN 1349-4198
36. Indovina M, Uludag U, Snelick R, Mink A, Jain A (XXX) Multimodal biometric authentication methods: a COTS approach
37. Kisku D, Rattani A, Gupta P, Sing J (2009) Biometric sensor image fusion for identity verification: a case study with wavelet-based fusion rules and graph matching. In: Proceedings of IEEE Conference on Technologies for Homeland Security (HST '09), Boston, pp 433–439, 11–12 May 2009
38. Marcialis GL, Mastinu P, Roli F (2010) Serial fusion of multimodal biometric systems. In: Proceedings of BIOMS, Taranto, Italy, Sept 2010, pp 1–7
39. Anzar SM, Sathidevi PS (2012) Optimal score level fusion using modalities reliability and separability measures. Int J Comput Appl 51(16). ISSN 0975-8887
40. Asha S, Chellappan C (2008) Authentication of e-learners using multimodal biometric technology. In: International Symposium on Biometrics and Security Technologies (ISBAST 2008), Islamabad, pp 1–6
41. Soruba Sree SR, Radha N (2014) A survey on fusion techniques for multimodal biometric identification. Int J Innov Res Comput Commun Eng 2(12). ISSN (Online) 2320-9801, ISSN (Print) 2320-9798
Chapter 10
DSK-Based Authentication Technique for Secure Smart Grid Wireless Communication

Mohan S. Khedkar and Vijay Shelake

Abstract The evolution of the smart grid involves the development and integration of advanced information and communication technology (ICT) models into all aspects of the grid to improve and optimize power generation, distribution, delivery, pricing, consumption, and other operations. The advanced functionality that comes with this integration of information technology also brings many safety and privacy issues for user-related data in smart grid communication. In the proposed technique, the concept of the dynamic secret key (DSK) is used to generate symmetric cryptography keys for designing an authentication and encryption scheme for smart grid wireless communication. In this scheme, a recently added device (e.g., a smart meter) is authenticated by a randomly chosen, already authenticated device. The scheme enables mutual authentication between a control center situated in the local management office and the randomly chosen device acting as an authenticator, in order to generate a proper dynamic secret-based dynamic encryption key (DEK) for subsequent secure data communications. As stated in [1], a single misjudged or missing retransmission sequence (RS) in the communication will prevent an adversary from obtaining the dynamic encryption keys and hence prevent unauthenticated access. Transmission failures and packet loss in wireless communication are unpreventable and cannot be predicted, so it is not possible for an adversary to trace the updating of the dynamic encryption key (DEK).

Keywords Smart grid · Authentication · Encryption · Decryption · Adversary · Packet loss · Retransmission · Dynamic secret · Initialization · Authenticator

M.S. Khedkar ()
Department of Computer Engineering, SESs GOIFE,
University of Mumbai, Mumbai, India
e-mail: mohan_khedkar@hotmail.com
V. Shelake
Department of Computer Engineering, SESs YTCEM,
University of Mumbai, Mumbai, India
e-mail: vijay.shelake@tasgaonkartech.com


10.1 Introduction

The smart grid is a next-generation revolutionary and evolutionary power grid system. In particular, with the combination of super computing and ICT, the smart grid is expected to significantly enhance the reliability and efficiency of future power grids [2]. Together with the salient features of the smart grid, information network security emerges as a highly critical issue, since many of the electronic smart devices such as smart meters and the control center are interconnected by communication networks together with power facilities, which have an immediate effect on the reliability of such a smart grid infrastructure.
A smart grid network integrates much of the technology in use today, including ICT, geographical information systems, and wireless communication techniques. Each component of the smart grid brings its own system and societal advantages to improve electricity delivery and usage. The smart grid consists of four main elements: (I) advanced distribution operations, (II) advanced metering infrastructure (AMI), (III) advanced transmission operations, and (IV) advanced asset management [3]. The integration of wireless communication technologies into the power grid for applications such as metering, system monitoring, and data collection goes back a number of decades; however, these technologies have the potential to be used to a much greater extent as utilities deploy infrastructure for the smart grid [4]. Since the advent of the smart grid concept, security has always been a primary concern. Electricity usage information, pricing information, and also control actions are transmitted via the information network [1]. Various attacks, such as information tampering, eavesdropping, and malicious control command injection, which have almost ruined the Internet, could impose serious threats on secure, stable smart grid operation [5]. Moreover, the SG is an attractive target for various hackers with diversified motivations. Many wireless devices lack the computing power needed for prime number generation, and many cannot tolerate the delays associated with public key operations. The complexity of the public key infrastructure (PKI) makes it less favorable as a general and easy-to-adopt technique for wireless devices. The standard cryptographic and security techniques used in IT and communication networks may therefore not be applicable to SG wireless communication. Hence, we propose a novel technique for secure wireless smart grid communication and authentication.
The remainder of this paper is organized as follows. In Sect. 10.2, we describe related work on smart grid security. Section 10.3 clarifies the proposed method. Section 10.4 presents the system flow, and Sect. 10.5 focuses on the analysis and results of encryption and authentication based on transmission errors in wireless communication. Finally, conclusions and recommendations for future work are drawn in Sect. 10.6.

10.2 Related Work

Implementing a fine-grained and integrated security solution that can address possible security-, user-, and customer-related data privacy concerns in the subsystems of the smart grid is critical. Furthermore, the design of the security scheme must also consider additional features of the smart grid as well as the underlying power grid system. It is therefore important to address the privacy and communication security issues of smart grid metering, the local management office, and the control system from the viewpoint of cryptographic solutions [6, 7]. We discuss different cryptographic solutions; the goal is to supply a comprehensive vulnerability analysis and also a lightweight solution for smart grid cybersecurity. Although the smart grid is an important component of the new energy economy, the current power grid is not sustainable financially and technologically, because the smart meter does not balance the supply and demand of renewable energy sources.
Wu and Zhou [8] studied the issue of secure key management for the smart grid and explained that existing standard key management techniques are not suitable for deployment in the smart grid. The authors proposed a key management scheme that combines a symmetric key scheme with an elliptic curve public key scheme. The symmetric key scheme is based on the Needham–Schroeder authentication protocol.
In their study, Rongxing Lu, Xiaohui Liang, Xiaodong Lin, and Xuemin (Sherman) Shen [9] proposed an efficient and privacy-preserving aggregation scheme (EPPA) for secure smart grid communications. It realizes a multidimensional data aggregation approach, rather than the traditional one-dimensional one, based on the homomorphic Paillier cryptosystem. Compared to other data aggregation methods, EPPA can significantly reduce computational cost and significantly improve communication efficiency while satisfying the real-time high-frequency data collection requirements of smart grid communications.
Mi Wen, Rongxing Lu, Kuan Zhang, and Jingsheng Lei [10] proposed a novel privacy-preserving range query (PaRQ) technique over encrypted metering data to address privacy issues in financial auditing for the smart grid. PaRQ allows residential users to store their metering data on a cloud server in encrypted form. In PaRQ, only authorized users can obtain query results, while data confidentiality and query privacy are also preserved.
Several choices of wireless communications for connecting, controlling, and monitoring consumer electric devices are discussed in [11, 12]. However, most of them, e.g., wireless local area networks or wireless sensor networks, are not directly applicable to smart grid communication due to several issues. As the smart grid communication network consists of millions of embedded computing systems with limited computational ability, computational efficiency becomes an important factor for an encryption scheme to be adopted in the smart grid [13, 14].

10.2.1 Comparative Study of Various Security Techniques

As mentioned above, various security solutions are available for information networks; in this section we summarize some existing and standard security approaches and their limitations. An encryption technique is essential for data integrity and confidentiality in wireless smart grid communication. Thus, in this subsection, we summarize the various existing approaches for secure wireless smart grid communication in Table 10.1.
Since each of the different communication protocols and technologies used in wireless smart grid communication has its own cryptography and security needs, achieving interoperability between these systems is a critical task, and a whole solution based on this idea is needed. Hence, we took this as a challenge and propose the method introduced in the next section.

10.3 Proposed System

We propose a novel lightweight technique for authentication and encryption that can protect secret keys and automatically update keys based on transmission errors in wireless smart grid communication. It is based on the simple fact that errors in wireless communication are unavoidable, and it needs little computing power. In the proposed technique, a dynamic secret key (DSK)-based scheme is designed to authenticate a new device and to secure the wireless communication between smart devices and the local management office or control center. The concept of the dynamic secret is used to generate the dynamic encryption key (DEK) for smart grid wireless communication. This symmetric key is used to add, initialize, and authenticate a new device in the smart grid.
The block diagram of the proposed scheme is shown in Fig. 10.1, including the retransmission sequence generation (RS), dynamic secret generation (DS), verification, authentication, encrypt, and decrypt modules. We discuss the concept of the DSK used to generate dynamic secret keys in Sect. 10.3.1 and the authentication protocol in Sect. 10.3.2.

10.3.1 Concept of Dynamic Secret Key (DSK)

In DSE [1], dynamic encryption keys are generated based on retransmissions in wireless communication. The concept of dynamic secrets is applied to update the symmetric keys. We outline the steps used to obtain dynamic secret keys in subsections A to C below; Sect. 10.3.2 then explains the DSK-based authentication technique for the wireless smart grid.

Table 10.1 Various existing approaches for secure smart grid wireless communication

1. Title: Fault-Tolerant and Scalable Key Management for Smart Grid, 2012. Authors: Wu and Zhou.
   Advantages: (1) The proposed key management technique offers strong security, accessibility, scalability, fault tolerance, and efficiency. (2) The key management at trust anchors is notably simplified. (3) It does not need to maintain shared symmetric keys.
   Disadvantages: (1) In the case of cross-realm secure access, a data aggregator may send a request to the local trust anchor for a session key for communicating with a remote data aggregator, and the remote trust anchor will instead issue the actual session key. (2) It cannot be applied to wireless smart grid communication because of its heavy computation.

2. Title: Security Technology for Smart Grid Networks, IEEE Trans. Smart Grid, vol. 1, pp. 99–107, 2010. Authors: Metke and Ekl.
   Advantages: (1) It suggests a public key infrastructure for security. (2) It provides data encryption and authentication. (3) It uses standard cryptographic techniques.
   Disadvantages: (1) All the techniques suggested in this paper increase the complexity of the SG. (2) Security issues also increase, as a third party is involved. (3) It requires more resources and computation power.

3. Title: EPPA: An Efficient and Privacy-Preserving Aggregation Scheme for Secure Smart Grid Communications, 2013. Authors: Rongxing Lu, Xiaohui Liang, Xiaodong Lin, Xuemin (Sherman) Shen.
   Advantages: (1) Data aggregation is multidimensional instead of the existing one-dimensional approach; e.g., it considers the amount of energy consumed, at what time and for what purpose the consumption occurred, and so on. (2) It also considers that power usage information is often small in size compared to the plaintext space of the encryption algorithm used.
   Disadvantages: (1) EPPA does not improve the efficiency of the GW-to-OA communication. (2) It also does not address a technique for multidimensional data aggregation.

4. Title: Security framework for wireless communications in smart grid distribution, IEEE Trans. Smart Grid, vol. 2, pp. 809–818, 2011. Authors: W. Xudong and Y. Ping.
   Advantages: (1) Wireless provides easy connection setup, low cost, and high speed. (2) A smart tracking firewall is developed to detect and respond to security attacks. (3) It is based on a wireless mesh network, which is a multi-hop network.
   Disadvantages: (1) Wireless networks are more vulnerable to security attacks than wired ones. (2) It does not provide authentication or encryption for wireless communication.

5. Title: PaRQ: A Privacy-Preserving Range Query Scheme Over Encrypted Metering Data for Smart Grid, 2014. Authors: Mi Wen, Rongxing Lu, Kuan Zhang, Jingsheng Lei, Xiaohui Liang, and Xuemin Shen.
   Advantages: (1) The individual residential user's data confidentiality can be achieved. (2) The requester's query privacy can be preserved.
   Disadvantages: (1) The technique increases communication overhead. (2) It needs more computational power. (3) It is not feasible for low-powered devices.

6. Title: Development of a prototype for remote current measurements of PV panel using WSN, 2014. Authors: S. Zahurul, V. Grozescu, M. Lutfi, H. Hashim, M. Amran, Izham.
   Advantages: (1) It uses the common electricity measurement sensors (current sensors) that are widely used in electrical meters to measure electricity consumption. (2) Arduino can be used in various perspectives; integration of a current sensor is one of the interactive jobs to measure current for different applications.
   Disadvantages: (1) The most important security issues of future distributed wireless smart grid communication, e.g., renewable generation to build an efficient energy management and demand response system, are not addressed. (2) It only considers ZigBee micropower wireless communication and does not recognize other protocols, e.g., IP.

7. Title: Security for Smart Distribution Grid by Using Wireless Communication, 2014. Authors: S.K. Saranya and R. Karthikeyan.
   Advantages: (1) The mentioned requirements of SDG communications lie in the advantages of a wireless mesh network. (2) The smart grid concept is used to control power grid operation.
   Disadvantages: (1) In an SDG, mesh networking can create a complex network. (2) In this technique, there is no flexibility in topology formation.

8. Title: A Dynamic Secret-Based Encryption Scheme for Smart Grid Wireless Communication, IEEE Transactions on Smart Grid, 2014. Authors: Ting Liu, Yang Liu, Yashan Mao, Yao Sun, Xiaohong Guan, Weibo Gong, and Sheng Xiao.
   Advantages: (1) The technique protects users against eavesdropping by updating the encryption key periodically based on the retransmission sequence. (2) It uses simple operations such as MD2 and XOR, which makes it lightweight. (3) It utilizes very low bandwidth for communication, resulting in low cost and easy maintenance. (4) It dynamically generates keys during normal communication without additional traffic or control commands. (5) It has good compatibility and can be integrated into many wireless techniques and applications, such as ZigBee and Modbus. (6) It provides a symmetric encryption technique, i.e., each domain generates its own encryption key.
   Disadvantages: (1) The solution does not provide a mechanism to scale the network. (2) It also does not provide authentication for newly added nodes.

Fig. 10.1 Block diagram of the proposed method. CC: control center, RS: retransmission sequence, DS: dynamic secret, TM: transceiver modules, SM: smart meter

A. Retransmission Sequence Generation (RS)

Before two parties start communication, an analysis of the data transmission is performed to generate the retransmission sequence (RS). The RS module monitors the link layer's retransmitted packets: retransmitted packets are marked as 1 and non-retransmitted packets are marked as 0. In this way, all previously transmitted packets are coded as a 0/1 sequence known as the retransmission sequence (RS). The retransmission sequence is applied to generate the dynamic secret [1].
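The sketch below illustrates RS generation under a stop-and-wait transfer, marking each packet 1 if it had to be retransmitted and 0 otherwise. The send_packet function and its failure behavior are placeholders for the real link layer, not part of the paper.

```python
import random

def send_packet(packet, loss_rate=0.04):
    """Placeholder link-layer send: returns True if the packet was
    acknowledged, False if it was lost and must be retransmitted."""
    return random.random() > loss_rate

def generate_rs(packets):
    """Build the retransmission sequence: 1 for packets that needed
    at least one retransmission, 0 for packets sent successfully."""
    rs = []
    for packet in packets:
        retransmitted = 0
        while not send_packet(packet):   # stop-and-wait: retry until acked
            retransmitted = 1
        rs.append(retransmitted)
    return rs

rs = generate_rs(range(3000))
print(sum(rs), "of", len(rs), "packets were retransmitted")
```

Because retransmissions are visible to both the sender and the legitimate receiver but look different to an eavesdropper with its own packet losses, both legitimate ends arrive at the same bit sequence.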
B. Dynamic Secret Generation (DS)

When the retransmission sequence reaches the threshold length (L_RS), it is compressed into a dynamic secret (DS). Considering the limitations on computation power, the MD2 hash function is selected in this module as a low-computation function for compression [1].

DS(k) = F_hash(L_RS)    (10.1)

where DS(k) is the kth new dynamic secret, F_hash is the hash function, and L_RS is the RS with length L.
C. Dynamic Secret Key (DSK)

In this module, the generated dynamic secret (DS) is applied to generate and update the symmetric key at both sides of the communication. This DS-based updated dynamic secret key (DSK) is known as the dynamic encryption key (DEK). The XOR function is used for encryption and decryption as a low-computation function. The sender encrypts and the receiver decrypts the cipher using DEK(k), which is updated as follows [1]:

DEK(k) = DS(k) XOR DEK(k-1)    (10.2)

In the proposed technique, this dynamic encryption key DEK(k) is used to authenticate smart devices at the authentication server, to encrypt the data at the sender, and to decrypt the cipher at the receiver.
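Putting (10.1) and (10.2) together, the minimal sketch below compresses an accumulated RS into a dynamic secret with a hash and XORs it into the previous key. SHA-1 is used here (as in the implementation of Sect. 10.4) in place of MD2, since Python's hashlib does not ship MD2; the threshold, RS bits, and initial key are illustrative.

```python
import hashlib

L_RS = 110  # threshold length of the retransmission sequence (illustrative)

def dynamic_secret(rs_bits):
    """Eq. (10.1): compress a full-length RS into a dynamic secret DS(k)."""
    rs_bytes = "".join(str(b) for b in rs_bits).encode()
    return hashlib.sha1(rs_bytes).digest()  # SHA-1 stands in for MD2 here

def update_dek(prev_dek, ds):
    """Eq. (10.2): DEK(k) = DS(k) XOR DEK(k-1)."""
    return bytes(a ^ b for a, b in zip(prev_dek, ds))

# Illustrative run: both sides hold the same RS, so both derive the same DEK.
rs = [0] * L_RS
rs[7] = rs[42] = rs[99] = 1           # positions of retransmitted packets
dek0 = b"\x00" * 20                   # initial key material (illustrative)
dek1 = update_dek(dek0, dynamic_secret(rs))
print(dek1.hex())
```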

10.3.2 A DSK-Based Authentication for Secure Smart Grid Wireless Communication

Authentication is the process used to achieve data integrity [15]. Any new device must be verified by the remote authentication server located at the control center (local management office) before it is added to the smart grid network. In the proposed system, the authentication and verification modules therefore play an important role. An authenticated device such as a smart meter can play the role of authenticator to exchange messages between the new device, called the supplicant, and the authentication server. Figure 10.2 illustrates this process in detail.

A. Verification

As shown in Fig. 10.1, the verification module verifies devices in the smart grid at the control center or local management office; if a device is new, it invokes the registration and authentication process. The next section describes the proposed authentication protocol.
B. Registration and Authentication

In the registration and authentication process, the new device (the supplicant) and the authentication server share a preconfigured key K0 that is not disclosed to anyone else; even the authenticator is not aware of this key. The supplicant and the authentication server identify each other with the key K0, and for subsequent data encryption/decryption between them each updates K0 by applying the DSK concept described in Sect. 10.3.1. We denote this updated key as DEKinit. Figure 10.2 illustrates the DSK-based authentication protocol for each new smart meter.

Fig. 10.2 A DSK-based authentication protocol

If the supplicant is identified as a legal device, the authenticator initiates the authentication process by generating an initial vector (IV). Then IV and DEKinit are encrypted, denoted as EK0(IV || DEKinit), as shown in Fig. 10.2. At the other side of the communication, the supplicant decrypts this with K0 and retrieves IV and DEKinit. At the same time, the authentication server forwards the same DEKinit to a randomly chosen or neighboring authenticated smart meter, i.e., the authenticator, encrypted with their shared DEK#, denoted as EDEK#(IV || DEKinit), since the authenticator was already authenticated with DEK#. In this way, the authenticator also obtains DEKinit. Using DEKinit, both the supplicant and the authenticator then update their keys based on the DSK concept to obtain DEKa, which is used for message authentication code generation and validation between the supplicant and the authenticator. A four-way handshake is carried out to complete another mutual authentication between the supplicant and the authenticator. Once the mutual authentication has completed successfully, the key DEKa at both sides is validated and made ready for subsequent message authentication code generation, validation, and encryption/decryption purposes. After this initialization process, the proposed method transmits control messages, pricing information, and meter readings between the smart meters and the control center.
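The sketch below simulates the key-derivation flow described above: IV and DEKinit are wrapped under K0 for the supplicant and under DEK# for the authenticator, after which both ends derive the same DEKa from DEKinit. The XOR-based wrap and the SHA-1 derivation are illustrative stand-ins for the scheme's lightweight primitives, and helper names such as derive_deka are not from the paper.

```python
import hashlib, os

def xor_wrap(key, payload):
    """Illustrative XOR 'encryption': key stream = repeated key bytes."""
    stream = (key * (len(payload) // len(key) + 1))[:len(payload)]
    return bytes(a ^ b for a, b in zip(stream, payload))

def derive_deka(dek_init, rs_bits):
    """Hypothetical helper: update DEKinit to DEKa from a shared RS."""
    ds = hashlib.sha1(bytes(rs_bits)).digest()
    return bytes(a ^ b for a, b in zip(dek_init, ds))

k0 = os.urandom(20)              # preconfigured supplicant/server secret
dek_hash = os.urandom(20)        # DEK# shared by server and authenticator
iv, dek_init = os.urandom(4), os.urandom(20)

msg_to_supplicant = xor_wrap(k0, iv + dek_init)          # EK0(IV || DEKinit)
msg_to_authenticator = xor_wrap(dek_hash, iv + dek_init)  # EDEK#(IV || DEKinit)

# Each side unwraps with its own key; the XOR wrap is its own inverse.
sup_dek_init = xor_wrap(k0, msg_to_supplicant)[4:]
auth_dek_init = xor_wrap(dek_hash, msg_to_authenticator)[4:]

rs = [0, 1, 0, 0, 1, 0, 1, 0]    # shared retransmission observations
assert derive_deka(sup_dek_init, rs) == derive_deka(auth_dek_init, rs)
```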

10.4 Proposed Algorithm

In this section, we describe the algorithm for secure SG communication. The algorithm is based on the concept of the DSK, so it continuously tracks the communication for transmission errors and updates the DEK used for authentication and encryption/decryption. The system flow diagram is shown in Fig. 10.3.
The algorithm consists of the following steps.
(1) Start
(2) Generate the retransmission sequence (RS)
(3) Generate the dynamic secret (DS)
(4) Update the dynamic encryption key (DEK)
(5) Authenticate using the DEK
(6) Encrypt/decrypt data
(7) If the communication is over, go to step 9; else go to step 8
(8) Check the threshold; if the threshold is reached, go to step 2; otherwise, go to step 6
(9) Stop
In step 2, the retransmission sequence (RS) is generated at the sender as well as the receiver by coding each transmitted packet as 0 or 1 based on packet loss or transmission error. To obtain the RS, a simple stop-and-wait protocol is used. In step 3, the generated RS is compressed to produce the dynamic secret (DS). As smart devices have limited computation power, the simple hash function SHA-1 is used. In step 4, whenever a new dynamic secret is generated, it is applied to update the encryption and authentication key at both the sender's and receiver's sides. In step 5, the proposed DSK-based authentication protocol is used, which generates the dynamic encryption key (DEK) for authentication and registration as described in Sect. 10.3.2 (B). Once the device is authenticated in step 5, the updated DEK is used in step 6 to encrypt data at the sender and decrypt the cipher at the receiver; if the DEK is shorter than the data, it is replicated and padded to a length equal to that of the raw data or ciphertext. On reaching the threshold, the DEK is updated during the communication, because of which a third party cannot track the secrets. This DSK-based scheme can thus prevent eavesdropping, unauthorized access, and forging by exploiting the inevitable errors in wireless communication, which also reduces the cost of computation and storage.
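A minimal sketch of step 6, assuming the key-replication rule above: the DEK is repeated and truncated to the message length and XORed with the payload; decryption is the same operation. The key bytes and meter reading are invented for illustration.

```python
def xor_crypt(dek, data):
    """Encrypt or decrypt by XORing data with the DEK, replicating the
    key until its length equals the data length (step 6)."""
    keystream = (dek * (len(data) // len(dek) + 1))[:len(data)]
    return bytes(k ^ d for k, d in zip(keystream, data))

dek = bytes.fromhex("1f8a03b27c")          # illustrative short DEK
reading = b"meter-id=42;kwh=3.71"          # illustrative meter reading
cipher = xor_crypt(dek, reading)
assert xor_crypt(dek, cipher) == reading   # XOR is its own inverse
```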

10.5 Result and Analysis

A dynamic secret key-based demo system was designed and implemented for secure smart grid wireless communication and authentication using the Java ME embedded emulator. The implemented system secures wireless smart grid communication through encryption based on the dynamic encryption key. It also supports the addition, authentication, and configuration of new devices in the smart grid.

Fig. 10.3 System flow diagram

10.5.1 Outcomes

In this section (Sects. 10.5.1.1–10.5.1.5), the different results of the implemented system on the Windows platform are discussed.

Fig. 10.4 Control center waiting for connection

10.5.1.1 Receiver (CC) in Smart Grid Waiting for Connection

The screen in Fig. 10.4 shows the receiver (CC) opening a passive connection on a port and waiting for the sender's (SM) request. Once the connection is established, the receiver receives messages from the various smart meters present in the smart grid network.

10.5.1.2 RS Generation at Sender (SM)

Once the connection is established, as discussed in Sect. 10.3, the system monitors errors in the communication and generates the RS successfully. Figure 10.5 shows the generated RS on the embedded smart device, i.e., on the smart meter.

10.5.1.3 RS Generation at Receiver (CC)

The other embedded smart device, i.e., the receiver (CC or local management office), accepts the connection from the sender (SM), starts packet transmission, and performs an analysis of each packet, and the same RS is generated at the receiver. Figures 10.5 and 10.6 show that the same RS is generated at both the sender and the receiver.

Fig. 10.5 The SM requests a connection; once the connection is accepted by the CC, the SM generates the RS

Fig. 10.6 CC accepts connection from SM and generates RS

10.5.1.4 DS, DEK Generation at SM, and CC

Once the RS reaches the threshold L_RS (the length of the retransmission sequence), the generated RS is compressed into a dynamic secret DS(k) using SHA-1 as a secure hash algorithm at both sides of the communication. This newly generated DS is applied to update the secrets at the sender and receiver, as shown in Figs. 10.7, 10.8, and 10.9.

Fig. 10.7 DS, DSK generation at smart meter

Fig. 10.8 DS, DSK generation at CC

Fig. 10.9 Encryption at smart meter

10.5.1.5 Encryption/Decryption

The newly generated DEK is used to authenticate smart devices in the SG network. If the device is valid, the same key is used for further communication. The XOR function was selected as one of the most lightweight and easy-to-implement algorithms for encryption and decryption at both the sender and receiver sites.

10.5.2 Analysis

Numerous experiments were conducted to analyze the security of the proposed system. First, the retransmitted packet ratio, packet loss ratio, and length of the RS were investigated to guide the design of the system. Then we compared public key encryption with dynamic secret-based encryption, and password-based authentication with authentication with encryption.
The RS was generated on the CC and SM in an experiment in which 3000 packets were sent from Embedded Smart Device 1 (the smart meter) on the Windows platform at 1 packet/second. Based on the proposed protocol, Embedded Smart Device 2 (the control center) and the smart meter obtained the same RS, in which there were 69 retransmitted packets among the 3000 packets. The adversary obtains a different RS from the CC and SM and hence generates the wrong DS and fails to track the DEK. If the adversary attacks the RS, the complexity is related to the number of retransmitted packets, the packets lost by the adversary, and the length of the RS.

The complexity of the RS is determined by the number of retransmissions. For example, if there are no retransmitted (or no non-retransmitted) packets, the retransmission sequence RS is all zeros (or all ones), and if there is only one retransmitted packet, an adversary can easily crack the RS using brute force. Thus, to prevent brute-force cracking, we need enough non-retransmitted and retransmitted packets. Two factors determine the number of retransmitted packets: the retransmitted packet ratio (RPR) and the length of the RS. By conducting more experiments, it was found that (1) it is hard to track or predict how many packets will be retransmitted in wireless smart grid communication, and (2) the retransmitted packet ratio is sufficient to prevent the RS from being cracked. The average retransmitted packet ratio over all conducted experiments was 4.1 %.
If we assume the length of the RS is 110 and the number of retransmitted packets is 4, then the number of possible combinations of the RS is C(110, 4) = 5,773,185, i.e., roughly 5.77 million, as verified below.
In Section A, we first compare public key encryption and dynamic secret-based encryption, and in Section B, password-based authentication and authentication with encryption.
A. Comparison Between Public Key Encryption and Dynamic Secret-Based Encryption

From these comparisons, we conclude that in public key encryption the key must be kept hidden from the adversary so that no one can access it. From Fig. 10.10, it is clear that the time required (in seconds) for encryption and decryption is higher for public key encryption. We took different file sizes in bytes, namely 120, 250, 350, and 1000, and the encryption time increases compared to dynamic secret-based encryption.
B. Comparisons Between Password-Based Authentication and Authentication with Encryption

We conducted an experiment with four trials to compare the password-based authentication system and authentication with encryption. It is observed that (1) a password-based system can be brute-forced by literally reading through a provided dictionary of terms and trying each word until the password matches.
Fig. 10.10 Encryption time (s) of public key encryption versus dynamic secret-based encryption for file sizes of 120, 250, 350, and 1000 bytes

Fig. 10.11 Password-based authentication versus authentication with encryption over four trials

To avoid brute force, we would need to create complex passwords, which is not feasible for the SG due to its limitations and constraints. (2) Also, in a password-based system, we must store passwords in a database to authenticate users, and we must then provide strong security for this database. It is concluded that the authentication-with-encryption technique remains stable and superior, as it depends on the dynamic nature and randomness of the proposed technique. The implemented technique is more secure than the password-based one, because the secret key is updated every time based on the retransmitted packets. The percentage score of the DSK-based technique increases with each trial, as shown in Fig. 10.11.

10.6 Conclusion

We designed and implemented a DSK-based encryption and authentication demo system using the Java ME embedded emulator. A recently added device (e.g., a smart meter) is authenticated by a randomly chosen, already authenticated device. The scheme achieves mutual authentication between the control center located in the local management office and a randomly selected smart meter in order to obtain proper cryptographic keys for subsequent secure smart grid wireless data communication. Readings from smart meters and control messages from the control center or local management offices can be encrypted to meet security needs and constraints. The RS is frequently updated to refresh the dynamic encryption key DEK, enforcing strong security against eavesdropping. SHA-1 and XOR were selected as low-computation algorithms for message digest and encryption/decryption. The study shows that retransmitted packets in wireless communication are unavoidable and unpredictable. Therefore, tracking the DSK-based dynamic encryption key is not possible for an adversary, and hence our proposed mechanism achieves privacy as well as providing authenticated access to system resources.

Acknowledgements The authors would like to thank Dr. S.K. Ukarande, Principal, SES's YTCEM and former Dean, Faculty of Technology, University of Mumbai, and Dr. N.M. Shekokar, H.O.D. (Computer Engineering, DJSCOE), for their suggestions and help with system evaluation.

References

1. Liu T, Liu Y, Mao Y, Sun Y, Guan X, Gong W, Xiao S (2014) A dynamic secret-based encryption scheme for smart grid wireless communication. IEEE Trans Smart Grid 5(3)
2. Smart grid cyber security potential threats, vulnerabilities and risks (2009) California Energy Commission, PIER Energy-Related Environmental Research Program
3. Yan Y, Hu Q (2013) An efficient security protocol for advanced metering infrastructure in smart grid. IEEE Netw
4. Li F, Xiong P (2013) Practical secure communication for integrating wireless sensor networks into the Internet of Things. IEEE Sensors J 13
5. Metke AR, Ekl RL (2010) Security technology for smart grid networks. IEEE Trans Smart Grid 1(1)
6. Jose EF, Hopkinson KM (2014) A trust-management toolkit for smart grid protection systems. IEEE Trans Power Deliv
7. Grilo A, Sarmento H, Nunes M, Gonçalves J, Pereira P (XX) A wireless sensors suite for smart grid applications
8. Wu D, Zhou C (2011) Fault-tolerant and scalable key management for smart grid. IEEE
9. Lu R, Liang X (2013) EPPA: an efficient and privacy-preserving aggregation scheme for secure smart grid communications. IEEE Trans Parallel Distrib Syst
10. Wen M, Lu R, Zhang K, Lei J (2013) PaRQ: a privacy-preserving range query scheme over encrypted metering data for smart grid. IEEE Trans Emerg Topics Comput
11. Saranya SK, Karthikeyan R (2014) Security for smart distribution grid by using wireless communication. IEEE
12. Zahurul S, Mariun N, Grozescu V (XX) Development of a prototype for remote current measurements of PV panel using WSN
13. Al-Anbagi I, Hussein T (2014) A reliable IEEE 802.15.4 model for cyber physical power grid monitoring systems. IEEE Trans Emerg Topics Comput, University of Ottawa
14. Fouda MM, Fadlullah ZM (2011) A lightweight message authentication scheme for smart grid communications. IEEE Trans Smart Grid
15. Nicanfar H, Jokar P (2014) Efficient authentication and key management mechanisms for smart grid communications. IEEE Syst J 8
Chapter 11
A Smart Security Framework
for High-Risk Locations Using Wireless
Authentication by Smartphone

Anjali Chandavale and Vyanktesh Dorlikar

Abstract The development and urbanization of the past decades led to the
concept of smart cities. These smart cities contain various high-risk locations, so
there is a need for a smart authentication technique that is more reliable and secure.
We propose a smart security framework built by integrating enhanced existing
technologies. This authentication framework provides security for high-risk
locations such as government office buildings, airports, military bases, and space
stations. The area contains various Wi-Fi access points creating the geo-fence. As
soon as a person enters the geo-fence with his smartphone, biometric
authentication is carried out using the smartphone, and accordingly the person is
allowed to enter the geo-fence. At the same time, the Wi-Fi positioning system
tracks the path of the person in the geo-fence. Biometric authentication uses the local
binary pattern matching algorithm, which has an accuracy of 80 %. The location of the
person inside the geo-fence is tracked with the help of the Trilateration method. The
Wi-Fi positioning system based on the Trilateration method finds the location of the
person with an error of 5 m, which is considerably lower than that of the Global
Positioning System. This wireless authentication framework provides a better
solution for the security of smart cities, which in turn will be beneficial for maintaining
peaceful and healthy social relations.

Keywords Wireless authentication system · Face recognition · Wi-Fi positioning · Geo-fence · Smart security · Trilateration

A. Chandavale ()
SMIEEE, Department of Information Technology, MIT College of Engineering, Pune, India
e-mail: c.anjali38@gmail.com
V. Dorlikar
IBM, Pune, India
e-mail: v.vyenkatesh@gmail.com


11.1 Introduction

With the aim of providing people a better place to live, the concept of the city was
born. Cities are developing quickly all over the world; this urbanization introduces
the concept of smart cities and simultaneously makes them a necessity. Smart
cities contain various communication centers, research centers, and space centers.
With the aim of providing them a smart security solution, we propose this
framework. The proposed smart security framework is a novel approach based
upon the concepts of the geo-fence and Wi-Fi positioning of the person. The proposed
system consists of several nodes such as a base station, sensor nodes, and cluster
members. Smartphone nodes are generic wireless devices having smart capabilities.
Sensor nodes in a cluster communicate among each other and also communicate
with the cluster head. Cluster heads are more resource-rich than cluster members.
In this framework, we use smartphones for authentication, basically the
camera of the smartphone. The smartphone has an application installed on
it, which automatically responds when the smartphone enters the specific
area (military area, space station, or research center) (Fig. 11.1).
The geographical areas covered by the Wi-Fi range create the virtual
boundaries called the geo-fence. The Wi-Fi nodes are also referred to as sensor nodes,
and the base station is the server maintaining the authentication activity. An intranet
Web site, managed by an administrator with various access rights, is used to maintain
the authentication activity. Based upon the access control policy, a person is either
granted or denied use of the area.
Section 11.2 presents the related work in the area of authentication systems.
Section 11.3 presents the methodology of the proposed framework. Section 11.4
discusses implementation details. Section 11.5 shows the performance analysis, and
Sect. 11.6 concludes the paper.

Fig. 11.1 Wireless authentication system

11.2 Related Work

The framework consists of three different modules. The first is geo-fencing to create
a protected area, the second is a biometric recognition system, and the last is Wi-Fi
positioning.
(A) Geo-fence
Creation of a geo-fence means the creation of a virtual boundary that is mapped onto
a physical area. The virtual area is created with the help of electronically steerable
directional antennas [1]. These antennas are controlled by adjusting the orientation
and transmit power of the directional antennas to focus on the virtual area
(Fig. 11.2).
Geo-fencing using directional antennas follows three different approaches. The first
is the omni-directional approach, in which directional antennas are not used and the
network region is spread over 360°. The angle-of-arrival approach and the minimum
overlap approach are realized only with the help of directional antennas [1].
In region selectivity, regions should be well defined, and client devices outside the
defined regions by more than a few feet should not be able to access the network.
While 0 % packet reception outside the network region and a 100 % packet
reception rate inside it would be ideal, this cannot be practically achieved. Studies
on UDP- and TCP-based applications such as Skype show that such applications
are unusable beyond a link-layer loss rate of 25 %. Based on these observations,
we desire an absolute link-layer packet reception rate threshold of <70 % (>30 %
loss) outside the region and >90 % (<10 % loss) within the region [2].

Fig. 11.2 Different antenna orientation approaches: (a) omni-directional approach, (b) angle-of-arrival approach, (c) minimum overlap approach
The main limitations of creating a geo-fence with GPS are that GPS cannot be
restricted to a small area of a few square meters, and steerable antennas increase the
overall cost. To overcome these limitations, the framework proposed by us uses the
omni-directional approach to create a virtual boundary with the help of Wi-Fi access
points (APs). Wi-Fi APs are restricted according to areas, which also works well in
indoor locations where GPS fails. In the omni-directional approach, steerable antennas
are not required, which automatically reduces the cost.
(B) Biometric Capture and Recognition
The second module is based upon authentication. Using a smartphone as an
authentication device is a challenging task because no dedicated biometric input
devices are available with these systems. The main challenge is to capture the biometric
data from the smartphone while minimizing the surrounding noise, collect it, and
compare it with large databases. A multilevel verification scheme using face,
fingerprint, hand geometry, and other biometric signatures scanned through the
camera of the smartphone increases the reliability of the system [3].
Face recognition is a unique and reliable authentication method that can be used with
smartphones. The face recognition module can integrate 3D face geometry with
skin texture analysis, making it one of the most accurate biometric systems [4].
In the available Eigenface and Fisherface approaches, the scatter is maximized
within the classes as well as between the classes [5]. This reduces the accuracy of the
Fisherface and Eigenface algorithms. Most in-class scatter occurs due to
light variation, and it can be minimized by discarding the three most significant
principal components [6, 7].
(C) Wi-Fi Positioning
Extensive research has happened in Wi-Fi positioning in the past decade, and
applications have been developed for it. There is considerable scope for accurate
Wi-Fi location using the Internet [8].
Wi-Fi positioning algorithms are categorized into two types: RF fingerprint based
and access point position based; some algorithms use both schemes. In an RF
fingerprint-based algorithm, the overall space is divided into grids. Dedicated operators
collect radio scans at every grid. The data gathered from the entire grid construct an
RF signal map of the space [9]. When a smartphone scans radio signals at a
position, the radio scan is searched in the stored signal map. The position of the best
matching scan on the signal map is taken as the device position. This
technique consumes more processing power and time. The approximation and
Trilateration methods fall into the category of access point-based algorithms. The
approximation algorithm is the least accurate among these algorithms [10]. A
Trilateration-based algorithm consumes less processing power, but is less accurate
than an RF fingerprint-based algorithm. Wi-Fi Trilateration accuracy is generally
between 10 and 20 m; radio fingerprinting positioning shows an accuracy of at
least 5–10 m in the investigation, depending on the strength of Wi-Fi in the geo-fence
zone [11, 12].

11.3 Proposed Work

We propose to develop a smart security framework based on wireless authenti-
cation techniques such as biometrics and Wi-Fi positioning. For authentication,
we propose to use a face recognition algorithm with password-based access, with
control over the movement of the person through Wi-Fi positioning [13].
A person is identified in the geo-fence and has to go through a face recognition
system. The local binary pattern (LBP) histogram algorithm is used for face recogni-
tion. In addition, the person's location in (x, y) coordinates is recorded at the server.
The location of the person is calculated with the help of a received signal
strength-based Trilateration algorithm, and the location coordinates are stored in the
Google App Engine (GAE) database. Figure 11.3 shows the overview of the proposed
smart security architecture.
The wireless authentication system works only within the geo-fence. The geo-fence is
created using access points with the help of the omni-directional approach [1]. These
access points are situated at appropriate distances from each other so that they
can communicate with the server. The distance of each AP is registered at the time of
programming. The IP address of each AP is also registered in the program for
communication purposes. When a smartphone with the proposed and developed Android
application installed enters the geo-fence, the application issues
a notification for authentication with continuous vibration and ringing. To stop the
notification, the person has to authenticate with a biometric feature, i.e., the face.
The face captured by the smartphone is sent to the server together with the IMEI
(International Mobile Equipment Identity) number of the smartphone. To minimize the
searching time at the server side, the IMEI number is searched first; then the
calculated binary patterns are compared with the existing patterns. Face recognition
is done with the help of the LBP algorithm, where the face pattern is calculated in
binary format.
If face authentication fails 3 times, the smartphone application issues a
notification for PIN (Personal Identification Number) authentication. The PIN is
generated at the time of the person's registration using a random number generation
algorithm (Fig. 11.4).

Fig. 11.3 Overview of the system: create a geo-fence for the protected area, execute face recognition for wireless authentication, and track the person's location inside the fence to monitor the person's activities
After successful authentication, the next module, Wi-Fi positioning, is started for
monitoring the person's activity inside the geo-fence. The RSS values are
converted into real-time distances in meters. A minimum of three access points is
required to execute the Trilateration algorithm. The location values are calculated as
(X, Y) coordinates with respect to the 4th quadrant. The data related to the person
(name, IMEI, e-mail id, PIN, and location coordinates) are stored in the Google App
Engine database. This database works on localhost.
This extensive approach, using various means such as person identification, face
recognition, and location tracking in a geo-fence, is a secure and reliable approach
for implementation in the field of computer science.
(A) Creating Geo-fence
We create the geo-fence using an omni-directional approach instead of a tradi-
tional directional antenna approach to minimize the implementation cost. The
proposed method uses overlapping coverage to create the geo-fence [1]. The physical
SSID of every AP inside the geo-fence is registered in the application code.
Figure 11.5 shows an overview of creating the geo-fence. Four Wi-Fi APs of the same
specification are used to create a geo-fence around a specific geographic area. All these
access points are registered in the program with the help of their SSIDs. When the
Wi-Fi coverage of a registered AP is found, the authentication procedure starts [13].

Fig. 11.4 Data flow diagram for wireless authentication system
The algorithm for creating the geo-fence is given below:
Step 1: While the Wi-Fi state is enabled:
Step 2: Search for available Wi-Fi access points.
Step 3: For count = 0, set the initializing flag = false and increment the counter
according to the number of available Wi-Fi SSIDs, say n.
Step 4: Check each available Wi-Fi SSID; if an SSID registered inside the code
matches an available SSID, set flag = true and run the wireless authen-
tication application.
Step 5: If the expected SSID is not found by count = n (the expected SSIDs are
already registered inside the code; n is the total number of available access points):
Step 6: Set count = 0 and go to Step 2.
(B) Biometric Signature Capture and Recognition
For biometric signature recognition, a camera of at least 1.3 megapixels is required to
capture the facial image. The framework supports face recognition and a PIN for
authentication. The IMEI number is used for indexing the various face binary patterns.
The local binary pattern histogram algorithm is used for face recognition. The local
binary pattern methodology has its roots in 2D texture analysis. The basic idea
behind local binary patterns is to summarize the local structure in an image by
comparing each pixel with its neighborhood. An input image of size 200 × 200 pixels
is divided into 7 × 7 matrix zones. In each zone, the intensity of each central pixel is
compared with that of its neighbors: if the intensity of a neighbor is greater than or
equal to that of the central pixel, the neighbor is assigned the value 1; otherwise, it is
assigned 0 (matching the sign function S below). That ends up with a binary number
for each pixel. This process continues for all zones and generates the local binary
patterns, sometimes referred to as LBP codes [14].

Fig. 11.5 Creating a geo-fence with 4 access points

LBP face recognition algorithm:

Step 1: Calculate the LBP value at the central pixel:

$\mathrm{LBP}(X_c, Y_c) = \sum_{p=0}^{P-1} 2^p \, S(I_p - I_c)$

where $(X_c, Y_c)$ is the central pixel, and $I_c$ and $I_p$ are the intensities of the central pixel and of neighbor pixel $p$, respectively.

Step 2: S is the sign function, defined as:

$S(x) = \begin{cases} 1 & \text{if } x \ge 0 \\ 0 & \text{otherwise} \end{cases}$

Step 3: Align the P neighbors on a circle with a variable radius in pixels, which enables capturing the following neighborhoods.

Step 4: For a given point $(X_c, Y_c)$, the position of neighbor $(X_p, Y_p)$, $p \in \{0, \ldots, P-1\}$, is calculated by:

$X_p = X_c + R\cos\!\left(\frac{2\pi p}{P}\right), \qquad Y_p = Y_c - R\sin\!\left(\frac{2\pi p}{P}\right)$

where R is the radius of the circle and P is the number of sample points.

Step 5: If a point's coordinates on the circle do not correspond to image coordinates, the point is interpolated using bilinear interpolation [15]:

$f(x, y) \approx \begin{bmatrix} 1-x & x \end{bmatrix} \begin{bmatrix} f(0,0) & f(0,1) \\ f(1,0) & f(1,1) \end{bmatrix} \begin{bmatrix} 1-y \\ y \end{bmatrix}$

Here, the face of the person is captured by the application in PNG format. The
image is captured in RGB32 format, meaning each pixel is 32 bits, and is then
converted to grayscale8. The dimension of the captured facial image is
200 × 200 pixels. The binary code is calculated for each image and tagged to the
image when storing it in the database.
The local binary pattern histogram algorithm is used for face recognition.
This algorithm is supervised-learning based: the LBP code of the facial image
is compared with the existing LBP codes in the database. The captured color image is
converted into a grayscale image before calculating the binary code. The LBP code is
calculated by dividing the facial image into a 7 × 7 grid, so that 49
zones are created. For every zone, the LBP code or binary code is calculated using a
circular neighborhood algorithm. The set of LBP codes for a single image is further
referred to as a pattern or byte code.
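
To make the per-pixel computation concrete, the following minimal Java sketch (our illustration, not the chapter's implementation) computes the LBP code of one interior pixel from its eight immediate neighbors; the circular, interpolated neighborhood described in the algorithm above generalizes this fixed 3 × 3 case.

// Illustrative 8-neighbor LBP code for the interior pixel at (x, y) of a
// grayscale image stored as gray[row][col]; neighbors with intensity >=
// the center contribute a 1-bit, matching S(Ip - Ic) above.
public final class Lbp {
    // Offsets of the 8 neighbors, enumerated clockwise from the top-left.
    private static final int[] DX = {-1, 0, 1, 1, 1, 0, -1, -1};
    private static final int[] DY = {-1, -1, -1, 0, 1, 1, 1, 0};

    // Caller must keep 1 <= x <= width-2 and 1 <= y <= height-2.
    public static int lbpCode(int[][] gray, int x, int y) {
        int center = gray[y][x];
        int code = 0;
        for (int p = 0; p < 8; p++) {
            int neighbor = gray[y + DY[p]][x + DX[p]];
            if (neighbor >= center) {
                code |= (1 << p); // contributes 2^p * S(Ip - Ic)
            }
        }
        return code; // value in 0..255
    }
}

A 256-bin histogram of these codes is then computed per zone, and the 49 zone histograms concatenated form the pattern that is matched against the database.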
(C) Wi-Fi Positioning
To implement a Wi-Fi positioning system, dedicated access points are respon-
sible for gathering radio scans with ground-truth data. Off-the-shelf smartphones
have the capability to scan radio signals from Wi-Fi access points (APs) [4]. We
propose to use a Trilateration-based scheme to construct the Wi-Fi positioning system
(Fig. 11.6).

Fig. 11.6 Overview of the Trilateration method

Trilateration algorithm:
Step 1: Let P1(x1, y1, r1), P2(x2, y2, r2), and P3(x3, y3, r3) be three access points,
where (xi, yi) is the known position of access point i and ri is the radius of its
distance circle.
Step 2: Calculate dx and dy, the horizontal and vertical distances between the
centers of P1 and P2:

dx = x2 − x1
dy = y2 − y1

Step 3: Determine the straight-line distance between the centers:

$d = \sqrt{dx^2 + dy^2}$

Step 4: Check for solvability:
If d is greater than the sum of r1 and r2, the AP coverage circles do not intersect.
If d is less than the difference of r1 and r2, one AP circle is contained within the other.
Step 5: Let

num_y = (x2 − x1)(x3² + y3² − r3²) + (x1 − x3)(x2² + y2² − r2²) + (x3 − x2)(x1² + y1² − r1²)
denum_y = 2[y3(x2 − x1) + y2(x1 − x3) + y1(x3 − x2)]
Y = num_y / denum_y

Step 6: Now

num_x = r2² + x1² + y1² − r1² − x2² − y2² − 2(y1 − y2)·Y
denum_x = 2(x1 − x2)

Step 7: X = num_x / denum_x.
So the position of the mobile device is (X, Y).
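
A direct Java rendering of Steps 5-7 is sketched below (our illustration); it solves the linear system obtained by subtracting the circle equations pairwise. Note that this simple closed form assumes P1 and P2 do not share an x-coordinate, so with the AP layout of Table 11.3 the access points must be ordered accordingly.

// Illustrative Java version of the Trilateration steps above: given three
// APs at known (x, y) positions and their RSS-derived circle radii r,
// solve for the device position (X, Y).
public final class Trilateration {

    /** @return {X, Y}, the estimated device position (requires x1 != x2). */
    public static double[] locate(double x1, double y1, double r1,
                                  double x2, double y2, double r2,
                                  double x3, double y3, double r3) {
        // Step 5: Y from the linear system of pairwise circle differences.
        double numY = (x2 - x1) * (x3 * x3 + y3 * y3 - r3 * r3)
                    + (x1 - x3) * (x2 * x2 + y2 * y2 - r2 * r2)
                    + (x3 - x2) * (x1 * x1 + y1 * y1 - r1 * r1);
        double denY = 2 * (y3 * (x2 - x1) + y2 * (x1 - x3) + y1 * (x3 - x2));
        double y = numY / denY;

        // Steps 6-7: back-substitute to obtain X.
        double numX = r2 * r2 + x1 * x1 + y1 * y1
                    - r1 * r1 - x2 * x2 - y2 * y2
                    - 2 * (y1 - y2) * y;
        double denX = 2 * (x1 - x2);
        double x = numX / denX;
        return new double[] {x, y};
    }
}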

11.4 Implementation

We developed the wireless authentication system, in which a person is identified
through a face recognition system. Once the person is authenticated, the location of the
person is recorded at the server side. Figure 11.7 shows the graphical user interface of
the proposed and developed system on the client side.

Fig. 11.7 Implementation details: client side

Source code for module 1: Geo-fence detection.
In the Android code given below, the notification for login is issued for all 4 access
points. Here, the stored SSIDs are compared with the scanned SSIDs.

For implementation of this task, the imported packages are given below:
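
A minimal sketch of this check, assuming Android's standard WifiManager API (the class name and the four SSID strings are illustrative, not the original listing), is:

import android.content.Context;
import android.net.wifi.ScanResult;
import android.net.wifi.WifiManager;
import java.util.Arrays;
import java.util.List;

// Sketch of module 1: compare the SSIDs registered in the code with the
// SSIDs visible in the latest Wi-Fi scan; if any registered AP is in
// range, the device is inside the geo-fence and authentication starts.
// (Requires ACCESS_WIFI_STATE and location permissions in the manifest.)
public class GeofenceDetector {
    // Registered geo-fence APs (illustrative names, not the originals).
    private static final List<String> REGISTERED_SSIDS = Arrays.asList(
            "GEOFENCE_AP1", "GEOFENCE_AP2", "GEOFENCE_AP3", "GEOFENCE_AP4");

    public static boolean insideGeofence(Context context) {
        WifiManager wifi =
                (WifiManager) context.getSystemService(Context.WIFI_SERVICE);
        if (wifi == null || !wifi.isWifiEnabled()) {
            return false; // Step 1: only proceed while Wi-Fi is enabled
        }
        for (ScanResult result : wifi.getScanResults()) { // Step 2
            if (REGISTERED_SSIDS.contains(result.SSID)) { // Step 4
                return true; // flag = true: launch the authentication UI
            }
        }
        return false; // Steps 5-6: expected SSID not found, rescan later
    }
}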

In Wi-Fi positioning, static access points are used to calculate the position of the
person inside the geo-fence. The standard constants taken into consideration are as follows:
When distance and frequency are in centimeters and kilohertz, the constant
becomes 87.55;
When distance and frequency are in meters and megahertz, the constant
becomes 27.55;
When distance and frequency are in kilometers and megahertz, the constant
becomes 32.45.

These are the constant values used to convert an RSS value to centimeters, meters, and
kilometers. There must be a minimum of 3 access points to execute Wi-Fi posi-
tioning. The access points are registered in the code with their distances in any
one quadrant (the distances given below are in the 4th quadrant).
Source code of module 3: Wi-Fi positioning.
In the Android code given below, the known static routers are registered at
coding time with their distances in the 4th quadrant.
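
A small Java sketch of this conversion and registration is given below (our illustration): the constant 27.55 comes from the free-space path-loss relation FSPL(dB) = 20 log10(d) + 20 log10(f) − 27.55, with d in meters and f in MHz, so a measured RSS level can be inverted to a distance estimate. The router coordinates are the node positions of Table 11.3.

// Sketch of the RSS-to-distance conversion behind the constants above,
// using the free-space path-loss relation in which 27.55 appears when
// distance is in meters and frequency in MHz.
public final class WifiPositioning {

    // Known static routers registered at coding time (from Table 11.3).
    static final double[][] ROUTERS = { {0, 0}, {0, 9}, {9, 0} };

    /**
     * @param rssiDbm measured signal level, e.g. -60 dBm
     * @param freqMhz channel frequency, e.g. 2437 MHz
     * @return estimated distance in meters
     */
    public static double distanceMeters(double rssiDbm, double freqMhz) {
        double exp = (27.55 - 20 * Math.log10(freqMhz) + Math.abs(rssiDbm)) / 20.0;
        return Math.pow(10.0, exp);
    }
}

The three distances obtained this way become the radii r1, r2, r3 fed to the Trilateration step of Sect. 11.3.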

(A) Database and Intranet Server


The backend server uses the Google App Engine (GAE). Google App Engine
enables you to build and run applications on Google's infrastructure. App
Engine applications are easy to create, maintain, and scale as your traffic and
data storage needs change. Applications are sandboxed and run on multiple servers.
App Engine offers automatic scaling for Web applications: as the number of
incoming requests to a Web application increases, GAE automatically
allocates more resources to the Web application to handle the additional demand.
Here, the App Engine is basically modified to work with the local host.

The code given below shows how the communication server is registered. The
address 192.168.0.15 is the IP address of the access point through which the server
can communicate with the other access points and smartphone devices.
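
A hedged sketch of the client side of this registration is shown below; the /register path and the form-encoded payload are our assumptions, while the address 192.168.0.15 and port 8888 are taken from the text.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Sketch of how the phone reaches the backend through the AP at
// 192.168.0.15 (the GAE dev server listens on port 8888). The "/register"
// endpoint and payload format are illustrative assumptions.
public final class ServerClient {
    private static final String SERVER = "http://192.168.0.15:8888/register";

    public static int register(String imei, double x, double y) throws Exception {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(SERVER).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        String body = "imei=" + imei + "&x=" + x + "&y=" + y;
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        return conn.getResponseCode(); // 200 on successful registration
    }
}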

Figure 11.8 shows the working of the backend module. After running the server
successfully in Android Studio, the local host with port number 8888 starts serving
the backend Web page. On the same Web page, the admin can see the registered
devices and their locations.

Fig. 11.8 Working of the backend (Google App Engine)

(B) Requirements and Development Platform
Software:
Android Studio
Intranet Web server (Google App Engine)
Platform:
Windows
Android 2.2+
Hardware (minimum requirement):
PC/Laptop
Core 2 Duo processor
Minimum RAM 2 GB
HDD space minimum 4 GB
Access points
Handset:
Android handset with Android 2.2+ operating system and a camera of at least 1.3
megapixels.
Figure 11.9 shows the deployment of the system. An Android application is
installed on the smartphone, where a temporary database is created to store the facial
image. The smartphone is connected to the server through the set of Wi-Fi access points.
The server computer holds the facial image database and the person's information, which
is accessible with the help of a Web browser.

Fig. 11.9 Deployment diagram

11.5 Result and Analysis

(A) Geo-fence
Figure 11.10 delineates distance as a function of received signal strength: the
received signal strength value keeps decreasing as the distance increases. The RSS
value is recorded with the help of the Wi-Fi Analyzer testing tool. The results
differ from access point to access point. The experimental access point taken into
consideration is a D-Link DIR-605L (2.4 GHz).

Fig. 11.10 RSSI versus distance

Battery consumption inside the geo-fence:

Fig. 11.11 Battery level versus time

Figure 11.11 delineates battery level as a function of time; the experimental
results are observed on a Sony Xperia C within the geo-fence. As far as the power
consumption of the smartphone is concerned, battery drain with Wi-Fi is considerably
lower than battery drain with GPS.

(B) Face Recognition
To calculate the results of face recognition, we created a scale according to the
clarity of the face image (Fig. 11.12) [14].
Face recognition rate: the local binary pattern matching algorithm works best on
good-quality captured images. As shown in Fig. 11.13, the recognition rate of the
face increases with increasing quality of the facial image.
Comparison of LBP approaches: Table 11.1 shows the different zone grids that can
be used in the LBP algorithm. Considering accuracy and time together, LBP 7 × 7
is proposed for use; 7 × 7 means the face is divided into 49 parts for further pattern
calculation. The time given in the table is the face-matching time at the server end,
and the space given in the table is the size required for each face block.

Fig. 11.12 Face scale according to noise in the image

Fig. 11.13 Recognition rate versus face quality

Table 11.1 LBP zones

Zones   Accuracy (%)   Time (ms)   Space (bytes)
3 × 3   60             257         198
5 × 5   70             325         329
7 × 7   80             423         524
9 × 9   96             736         780
Comparison between the accuracy of different face recognition algorithms:
Figure 11.14 shows the accuracy (in percent) of various face recognition algo-
rithms with respect to an increasing number of face training sets. Among all the
algorithms, LBP 7 × 7 gives the best result [14].
Recognition versus face quality:

Accuracy = (Total number of correctly recognized images / Total images) × 100

As per the results shown in Fig. 11.14, we executed the LBP, Eigenface, and
Fisherface algorithms and compared the results of Eigenface and Fisherface with the
LBP 7 × 7 algorithm. As shown in Fig. 11.15, the accuracy of LBP 7 × 7 is 10–20 %
higher than that of the Eigenface and Fisherface algorithms (Fig. 11.16 and Table 11.2).

Fig. 11.14 Accuracy of different face recognition algorithms [14]

Fig. 11.15 Recognition rate


of Eigenface, Fisherface, and
LBP algorithm
190 A. Chandavale and V. Dorlikar

0 1 2 3 4 5 6 7 8 9 10 11

Fig. 11.16 Sensitivity and specicity

Table 11.2 Comparison results for the Eigenface, Fisherface, and LBP algorithms

Algorithm    Total no. of     Unrecognized     Incorrectly identified   Correctly recognized
             facial images    facial images    facial images            facial images
Eigenface    100              14               21                       65
Fisherface   100              13               17                       70
LBP          100              10               10                       80

Sensitivity and specificity:

Sensitivity: TPR = TP/P = TP/(TP + FN) = 80/(80 + 20) = 0.8
Specificity: SPC = TN/N = TN/(TN + FP) = 20/(20 + 10) = 0.66
(C) Wi-Fi Positioning Result
Table 11.3 shows the actual distance of the smartphone from 3 different access
points. With the help of the Trilateration algorithm, the location of the smartphone
is calculated.

Table 11.3 Estimated distances

Sr no   (Xa, Ya) mobile   Node 1 (0, 0)   Node 2 (0, 9)   Node 3 (9, 0)   (Xest, Yest)   Error (m)
Mob A   (1, 1)            42              88              86              (1.2, 0.9)     0.22
Mob B   (4, 2)            81              86              80              (2.52, 1.58)   3.52
Mob C   (6, 2)            79              87              76              (7.37, 3.74)   3.45
Fig. 11.17 Approximate error position

Fig. 11.18 Comparison between the Trilateration and approximation algorithms

The approximate error rate is found to be 3 m, which is considerably lower than
that of GPS. Figure 11.17 shows that the approximate error rate keeps decreasing
as the number of access points is increased.
Comparative results for Wi-Fi positioning: Fig. 11.18 compares the approximation
and Trilateration algorithms. The Trilateration algorithm requires a minimum of 3
access points to calculate the position of the smartphone. According to the results,
the Trilateration algorithm works with higher accuracy than the approximation
algorithm, and the accuracy of the location increases as the number of access
points increases.

11.6 Conclusion

The smart security concept is gaining popularity among various cities of the world.
In this paper, a novel approach is proposed for wireless authentication with the help
of a smartphone and the biometric features of a person. The accuracy of the face
recognition algorithm based on the local binary pattern histogram technique is
found to be 80 %. The Wi-Fi positioning Trilateration method works with an
approximate error rate of 5 m, which is considerably smaller than that of the Global
Positioning System. With the advancement of technology and the widespread use
of smartphones, various techniques have been developed for authentication; here,
the integration of two or more techniques provides a more secure and reliable
framework. The proposed approach is independent of Android versions and can be
applied on various smartphones without requiring dedicated biometric scanners.

References

1. Sheth A, Seshan S, Wetherall D (2009) Geo-fencing: confining Wi-Fi coverage to physical
boundaries. In: 7th international conference on pervasive computing, vol 5538. Springer,
Heidelberg, pp 274–279
2. Namiot D, Sneps-Sneppe M (2013) Geo-fence and network proximity. In: 13th international
conference, NEW2AN, vol 8121. Springer, Heidelberg, pp 117–127
3. Narendar M, Mohan Rao M, Babu MY (2010) Multi-layer person authentication approach
for electronic business using biometrics. Glob J Comput Sci Technol 10:63–67
4. Islam MF, NazrulIslam DCM (2012) Biometrics-based secure architecture for smart phone
computing. In: Systems, applications and technology conference (LISAT). IEEE Long Island,
pp 1–5
5. Chowdhury A, Tripathy SS (2014) Human skin detection and face recognition using fuzzy
logic and eigenface, ICGCCEE. IEEE, pp 1–4
6. Shiji SK (2013) Biometric prediction on face images using eigenface approach, ICT. IEEE,
pp 103–109
7. Li B, Ma K-K (2009) Fisherface versus Eigenface in the dual-tree complex wavelet domain.
In: Intelligent information hiding and multimedia signal processing. IEEE, pp 30–33
8. Koo J, Cha H (2012) Unsupervised locating of Wi-Fi access points using smart phones. IEEE
Trans Syst Man Cybern Part C: Appl Rev 42(6), Nov 2012
9. Kelley KJ (2014) Wi-Fi location determination for semantic locations. In: The Hilltop
Review, vol 7(1), Article 9. http://scholarworks.wmich.edu/hilltopreview/vol7/iss1/9
10. Mazuelas S, Bahillo A, Lorenzo RM, Fernandez P, Lago FA, Garcia E, Blas J, Abril EJ
(2009) Robust indoor positioning provided by real-time RSSI values in unmodified WLAN
networks. IEEE J Sel Top Sign Proces 3(5), Oct 2009
11. Mohammed MM, Elsadig M (2013) A multi-layer of multi-factor authentication model for
online banking services. In: Computing, electrical and electronics engineering (ICCEEE),
pp 220–224
12. Kabaou H, Lorenz P, Tabbne S (2014) Mixed positioning system guaranteeing the continuity
of indoor/outdoor tracking. In: IEEE international conference on advances in computing,
communications and informatics (ICACCI), pp 1928–1935
13. Dorlikar V, Chandavale A (2015) A framework for defence security using smart phone in
geofence. In: IPGCON 2015, pp 1–5
14. Oravec M, Pavlovicova J, Mazanec J, Omelina L, Feder M, Ban J (2011) Efficiency of
recognition methods for single sample per person based face recognition. In: Corcoran P (ed)
Reviews, refinements and new ideas in face recognition. InTech. ISBN: 978-953-307-368-2.
doi:10.5772/18432
15. Face recognition with OpenCV article. http://docs.opencv.org/modules/contrib/doc/facerec/
facerec_tutorial.html
Chapter 12
High Performance Computation Analysis
for Medical Images Using High
Computational Method

Ashwini Shewale, Nayan Waghmare, Anuja Sonawane,


Utkarsha Teke and Santosh D. Kumar

Abstract Medical image processing can be done through various methods, but it
may take a long time. As the processing time increases, report generation is delayed,
which can affect a patient's life. So, to scale down the medical image processing time
without adversely affecting the quality of the image, high-computation medical image
processing methods are used. In the proposed system, different methods are used to
deal with the medical image (image registration, image segmentation, image
de-noising). Image registration aligns the image, and the image is then segmented into
identical structures using image segmentation. The image may contain different types
of noise, so image de-noising is used to expel this noise. As the processing is done in
separate environments (CPU and GPU), the efficiency in both situations is
analyzed. Reasonably, the GPU will be the more capable environment for medical image
processing. The proposed system can be used to process images from different
modalities such as MRI and CT scan.

Keywords MRI · GPU (CUDA) · Shared memory (OpenMP) · Medical image processing

Technical Keywords (as per ACM keywords):
1. Software
(a) Programming techniques (E)
i. Concurrent programming
A. Distributed programming
B. Parallel programming

12.1 Introduction

Medical image processing on the GPU [13] is constantly becoming more popular, as
this high-tech method makes it feasible to apply progressive methods and to perform
them quickly in an analytic framework.

A. Shewale · N. Waghmare · A. Sonawane · U. Teke · S.D. Kumar
SITRC, Nashik, India
e-mail: santosh.kumar@sitrc.org


Considering the challenges in the field of medical image processing [1, 2, 4, 5], full
computation power is needed. To achieve high accuracy and real-time performance,
shared memory is used, and a parallel architecture such as CUDA [13] (compute
unified device architecture) is also used. Notable examples include reviews focused
on specific medical imaging algorithms such as image registration [4, 12], image
segmentation [1, 3], and image de-noising [5, 8].
Here, we consider a range of medical imaging methods and the types of algorithms
and basic functions involved. The aim of the proposed system is to give an analysis of
medical image processing on GPU and CPU, introducing parallel programming
approaches [6] to many algorithms.
The focus of the proposed system is to use the GPU [13] and a shared memory
environment to accelerate image processing. The front end of the proposed system
takes a medical image (MRI: magnetic resonance imaging), which is processed through
the different image processing algorithms to obtain a high-quality processed image.
While being processed, the image undergoes different stages: aligning the image
(image registration), segmenting the image into similar structures (image
segmentation), and image de-noising. The efficiency of the proposed system is
evaluated separately in each high-computation environment (GPU and CPU [6]).
The most efficient environment for a high-performance system is then concluded.

12.2 Literature Survey

Image enhancement, image de-noising, and image segmentation represent a
methodological overview of the study of image processing using Fourier analysis,
mathematical morphology, etc. The various approaches are interconnected, which
ultimately improves the quality of the image.
The image segmentation [3] methods include the multi-channel logical segmentation
technique, which incorporates logical partitioning of multiple channels.
The de-noising techniques and algorithms make it possible to remove salt-and-
pepper noise. An extension to multi-channel TV de-noising is also provided.
The medical image analysis procedure may consume a large amount of time,
which can be reduced with the help of high computational processing speed. To
accelerate the computation, the GPU and shared memory can be used for
medical image processing.
As medical images are complex in nature, medical image processing is done to
reduce their complexity. The proposed system uses adequate methods for faster
execution and reduces the complexity of processing.
For image registration, the most efficient method found is the frequency domain
method, which uses Fourier-based correlation via the FFT (fast Fourier transform) for
the alignment of multiple images.
For image segmentation [14], the clustering method is the most suitable, using the
K-means clustering algorithm for segmentation of the image.
For image de-noising, the Gaussian filtering method is the most efficient algorithm
for removing noise from the source image.

12.3 Problem Statement

Medical image [9, 10] data are collected with different medical imaging devices
such as MRI [16] and CT scan (computed tomography scan) and are used for
diagnosis by different practitioners. Processing these medical images using a CPU
may take a large amount of time, which delays report generation and may thereby
adversely affect a patient's life. So there is a need to increase the efficiency and
performance of medical image processing without affecting the quality of the image.
Medical image analysis involves interleaving numerous methods, consisting of
different algorithms for processing. By running the algorithms on a high-com-
putation processor, and without compromising the quality of the image, the execution
time is sped up. To check the efficiency of each algorithm, it has been analyzed on
two different architectures (CPU and GPU).

12.4 Methodology

12.4.1 Medical Image Processing

In recent years, image processing has been playing a major role in the medical field.
But the overall process performed during medical image processing [7, 15] is very time
consuming. So, to reduce time consumption, medical image processing is
done with the help of the GPU and shared memory.
Image processing consists of many algorithms through which an image can be pro-
cessed, but in medical image processing the basic algorithms used are image
pre-processing (image acquisition), image registration, image segmentation, and
image de-noising. All these algorithms consist of methods through which the image
processing is done. The methods and the basic algorithms [11] are discussed further
in detail.

12.4.2 How MRI Image Is Generated?

In magnetic resonance imaging, a strong magnetic field is generated and high-radio-
frequency waves are emitted over the body. The magnetic resonance image is an
exposure of the radio frequency (RF) signals that are emitted by tissue during the MR
image [10] scanning process. MR image acquisition is nothing but retrieving the data
from the RF signal parameters. Performing image acquisition is the first step in image
processing (without an image, no processing can be done). The magnetic field is
changed during the acquisition process so that the distinct physical nature of tissue
or fluid, or spectroscopic properties related to molecular structure, becomes visible
in the image. An image acquisition consists of a cycle that is repeated many times.
The tissue magnetization is forced through a series of changes during each cycle,
and not all tissues and fluids progress through these changes at the same rate. The
level of magnetization present at a specific picture-snapping time at the end of each
cycle determines the intensity of the resulting tissue brightness in the image. The basic
image types in MRI are proton density (PD), T1-weighted, and T2-weighted images.
Proton density imaging produces strong signals, or a bright appearance, where the
density of protons is higher. T1 and T2 characterize the image by magnetic
relaxation time: the tissue is flipped into an unstable state, and the time taken for the
tissue to return to its normal state is measured, thereby differentiating normal and
unstable tissues or parts of the body. These types are used in combination, not
singly, to gain more accuracy.

12.4.3 Algorithm

There are many different algorithms used in medical image processing: image
pre-processing, in which image acquisition is done; image registration, which aligns
two images taken from two different locations or at two different times; image
segmentation, through which a particular part, such as a tissue or a membrane, can
be extracted from the MRI image; and image de-noising, a noise reduction step in
which noise such as salt-and-pepper noise, shot noise, and impulse noise is removed
using different methods (Fig. 12.1).
The basic algorithms are as follows:
Image acquisition
Image registration
Image segmentation
Image de-noising
The first process that takes place in image processing is image pre-processing,
i.e., image acquisition.

12.4.3.1 Image Acquisition

Image acquisition is the first process in image processing, in which the image is
retrieved from the source or hardware and stored temporarily for processing.
Carrying out image acquisition is always the first step in the work-flow cycle
because, without an image, no processing is achievable. The acquired image is
completely unprocessed, exactly as obtained from the hardware used to generate it.
One form of image acquisition is real-time image acquisition, which creates a stream
of files stitched into a single medical format, placed in a queue, etc.

Fig. 12.1 Example of a brain MRI medical image

12.4.3.2 Image Registration

Image registration is a process of aligning two images taken at two different times or
angles. To register two images means to line them up so that common features
coincide and the differences between them are emphasized and readily visible
to the naked eye. Image registration refers to the process of comparing two
images taken at different times.
While registering images, we find a geometric transformation that
adjusts one image to fit another. Basic image subtraction does not work for a
number of reasons. MRI images are assembled one slice at a time.
Chances are that the slices from two volumes are not parallel when a
6-month-old MR volume is correlated with one captured yesterday; as a result, the
content would differ. Consider a right circular cone: a plane slicing
through the cone parallel to its base forms a circle; if the slice is slightly off
parallel, an ellipse results. So, for image registration, different algorithms are used.
One algorithm that is efficient and suitable for medical image processing is
Fourier-based correlation using the FFT. This algorithm is an integral part of the
frequency domain method.
Taking the Fourier transform of the given images:

$I_a = \mathcal{F}\{i_a\}, \quad I_b = \mathcal{F}\{i_b\}$   (12.1)

1. Applying a translation by (a, b) and a rotation by θ:

$i_a(x, y) = i_b(x\cos\theta + y\sin\theta - a,\; -x\sin\theta + y\cos\theta - b)$   (12.2)

2. By the shift and rotation properties of the Fourier transform:

$I_a(\xi, \eta) = e^{-i(a\xi + b\eta)}\, I_b(\xi\cos\theta + \eta\sin\theta,\; -\xi\sin\theta + \eta\cos\theta)$   (12.3)

3. Taking the squared magnitude (the power spectrum) removes the translation:

$|I_a(\xi, \eta)|^2 = |I_b(\xi\cos\theta + \eta\sin\theta,\; -\xi\sin\theta + \eta\cos\theta)|^2$   (12.4)
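
The practical consequence of (12.1)-(12.4) is that a translation shows up only as a phase factor, so it can be recovered by phase correlation. The following self-contained Java sketch (our illustration, using a naive O(n²) DFT for brevity; a real implementation would use the FFT) recovers the circular shift between two 1D signals, which is the one-dimensional analogue of the frequency-domain registration described above.

// Illustrative 1D phase correlation: the peak of the inverse DFT of the
// normalized cross-power spectrum gives the shift between two signals.
public final class PhaseCorrelation {

    /** Returns the circular shift s such that b[i] == a[(i - s) mod n]. */
    public static int estimateShift(double[] a, double[] b) {
        int n = a.length;
        double[] reA = new double[n], imA = new double[n];
        double[] reB = new double[n], imB = new double[n];
        dft(a, reA, imA);
        dft(b, reB, imB);

        int best = 0;
        double bestVal = Double.NEGATIVE_INFINITY;
        for (int m = 0; m < n; m++) {      // inverse DFT of the normalized
            double sum = 0;                // cross-power spectrum, at index m
            for (int k = 0; k < n; k++) {
                // conj(A(k)) * B(k): its phase is -2*pi*k*s/n for a pure shift
                double re = reA[k] * reB[k] + imA[k] * imB[k];
                double im = reA[k] * imB[k] - imA[k] * reB[k];
                double mag = Math.hypot(re, im);
                if (mag < 1e-12) continue; // skip empty frequency bins
                double ang = Math.atan2(im, re) + 2 * Math.PI * k * m / n;
                sum += Math.cos(ang);      // real part of the unit phasor
            }
            if (sum > bestVal) { bestVal = sum; best = m; }
        }
        return best;                       // peak location = shift s
    }

    // Naive discrete Fourier transform of a real signal.
    private static void dft(double[] x, double[] re, double[] im) {
        int n = x.length;
        for (int k = 0; k < n; k++) {
            for (int t = 0; t < n; t++) {
                double ang = -2 * Math.PI * k * t / n;
                re[k] += x[t] * Math.cos(ang);
                im[k] += x[t] * Math.sin(ang);
            }
        }
    }
}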

12.4.3.3 Image Segmentation

Image segmentation algorithms are designed to segment the image into structures
that are similar or have similar features. Image segmentation arranges an image
into assorted sectors according to features of the image, for example, the frequency
response. Many image segmentation algorithms exist and are abundantly applied in
day-to-day life. According to the segmentation method, we can classify approaches
into region-based segmentation, region growing, region splitting and merging, data
clustering, and edge-based segmentation.
In the survey of medical image processing, various image segmentation algo-
rithms are found to be good or efficient in their working. We cannot process the
whole image directly, because that is inefficient and impractical; thus image
segmentation algorithms are used for applications such as image recognition
or compression, and they can identify the regions of interest in a scene.
For image segmentation, the clustering method is the most beneficial for segmenting
the image, and within clustering methods the K-means clustering algorithm is the most
efficient.
The K-means clustering algorithm includes the following steps (a short code sketch follows the list):
(1) Decide the number N of clusters required in the final segmented result, and
randomly select N patterns from the whole database as the N initial centroids of
the N clusters.
(2) Assign each pattern to the closest cluster centroid. The closest centroid is
usually determined by pixel-value similarity, but other features may also be
considered.
(3) Recompute the cluster centroids, so that there are again N centroids for N
clusters, as in step 2.
(4) Repeat steps 2 and 3 until the convergence criterion is met.
(5) Typical convergence criteria are no reassignment of any pattern from one
cluster to another, or a minimal decrease in squared error.
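
The sketch below is a minimal Java rendering of these steps over grayscale pixel intensities (our illustration, not the chapter's implementation): each pixel is assigned to the nearest of N centroids, and the centroids are recomputed until assignments stop changing.

import java.util.Random;

// Minimal 1D K-means over pixel intensities, following steps (1)-(5).
public final class KMeansSegmentation {

    /** @return cluster label (0..n-1) for every pixel. */
    public static int[] segment(int[] pixels, int n, long seed) {
        Random rnd = new Random(seed);
        double[] centroids = new double[n];
        for (int c = 0; c < n; c++) {            // step 1: random initial centroids
            centroids[c] = pixels[rnd.nextInt(pixels.length)];
        }
        int[] labels = new int[pixels.length];
        boolean changed = true;
        while (changed) {                        // step 4: iterate to convergence
            changed = false;
            for (int i = 0; i < pixels.length; i++) {   // step 2: nearest centroid
                int best = 0;
                for (int c = 1; c < n; c++) {
                    if (Math.abs(pixels[i] - centroids[c])
                            < Math.abs(pixels[i] - centroids[best])) best = c;
                }
                if (labels[i] != best) { labels[i] = best; changed = true; }
            }
            double[] sum = new double[n];
            int[] count = new int[n];
            for (int i = 0; i < pixels.length; i++) {   // step 3: recompute centroids
                sum[labels[i]] += pixels[i];
                count[labels[i]]++;
            }
            for (int c = 0; c < n; c++) {
                if (count[c] > 0) centroids[c] = sum[c] / count[c];
            }
        }
        return labels; // step 5: no reassignment occurred in the last pass
    }
}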

12.4.3.4 Image De-noising

Noise is usually introduced into the image during image transmission. The added
noise can take various forms, such as shot noise, impulse noise, structured noise,
salt-and-pepper noise, and Gaussian noise. Depending on the type of noise, the
degradation of the image will differ. Conventional methods of noise removal
include clustering, the NL-means filter, Gaussian filtering, anisotropic PDEs (partial
differential equations), the Rudin-Osher-Fatemi TV model, shrinkage models, and
various transforms. Wavelet thresholding, isotropic diffusion, wavelet multi-frame
de-noising, anisotropic diffusion, Bayesian estimation de-noising, the curvelet
transform, and the wave atom transform are efficient image noise reduction
algorithms. The noise removal algorithm must be chosen according to the per-
centage of image quality reduction.
Most of the noise removal techniques suggested so far depend on the type of
noise introduced; the required noise reduction algorithm is chosen according to the
image or video at hand. Gaussian filtering is considered a perfect blur
for many applications, provided the kernel support is large enough to fit the
essential part of the Gaussian. In the case of 2D filtering, it can be decomposed into a
series of 1D filterings over rows and columns. When the filter radius is narrow,
direct 1D convolution is the fastest and most efficient way to calculate the filtering
result. First, observe that the result of convolution has a length
N + M − 1, where N is the signal size and M is the filter kernel size (equal to 2r + 1),
i.e., the output signal is longer than the input signal. Second, calculating the FFT of
a complete image row is not optimal, as the complexity of the FFT is O(N log N).
By breaking the kernel into sections of approximate length M and performing
overlap-add convolution section-wise, the cost of the FFT can be reduced. When
the FFT size F is selected as the smallest power of 2 larger than 2M, and the signal
section size is selected as F − M + 1 for full utilization of the FFT blocks, the best
performance is achieved. This reduces the overall complexity of the 1D convolution
to O(N log M), so the per-pixel complexity of the Gaussian blur becomes O(log r);
hence, for many practical purposes, Gaussian blur can be successfully implemented
with simpler filters, as the constant involved is quite large.
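
As a concrete reference point, a minimal separable Gaussian blur in Java is sketched below (our illustration): a 1D kernel of radius r is built from the Gaussian and applied to the rows, and the same pass applied to the transposed result completes the 2D filter; the FFT-based overlap-add variant discussed above replaces the inner convolution loop when r is large.

// Separable Gaussian blur: a normalized 1D kernel applied to rows, then
// (after transposing) to columns, is equivalent to the 2D convolution.
public final class GaussianBlur {

    static double[] kernel(int r, double sigma) {
        double[] k = new double[2 * r + 1];      // kernel size M = 2r + 1
        double sum = 0;
        for (int i = -r; i <= r; i++) {
            k[i + r] = Math.exp(-(i * i) / (2 * sigma * sigma));
            sum += k[i + r];
        }
        for (int i = 0; i < k.length; i++) k[i] /= sum; // normalize weights
        return k;
    }

    /** Horizontal 1D pass; transpose the image and reapply for the vertical pass. */
    static double[][] blurRows(double[][] img, double[] k) {
        int h = img.length, w = img[0].length, r = k.length / 2;
        double[][] out = new double[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                double acc = 0;
                for (int i = -r; i <= r; i++) {
                    int xx = Math.min(w - 1, Math.max(0, x + i)); // clamp border
                    acc += img[y][xx] * k[i + r];
                }
                out[y][x] = acc;
            }
        }
        return out;
    }
}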

12.4.4 High Computational Language (CPU and GPU)

12.4.4.1 GPU

CUDA stands for compute unified device architecture; it is a hardware and
software architecture for issuing and executing computations on the GPU as a
data-parallel computing device, without the constraint of mapping them to a
graphics application programming interface. The operating system's multi-tasking
mechanism manages access to the GPU by the several CUDA and
graphics applications running concurrently. By using CUDA, medical image
processing can be done faster (Fig. 12.2).

Fig. 12.2 GPU computational architecture



12.4.4.2 Shared Memory Architecture (CPU)

OpenMP (Open Multi-Processing) is a framework for shared-memory
parallel computing supported by standard C/C++ and FORTRAN compilers.
Compiler directives indicate where parallelism should be used: C/C++ uses
#pragma directives, while FORTRAN uses structured comments.
A library provides support routines.
OpenMP is based on the fork/join model; a shared-memory program proceeds
through the following steps:
The program starts as a single thread.
At designated parallel regions, a pool of threads is formed.
The threads execute in parallel throughout the region.
At the end of the region, the threads wait for the whole team of threads to arrive.
The master thread then continues until the next parallel region.
Advantages:
Usually the same code can also run sequentially.
Parallelism can be added incrementally.
The compiler can optimize.
The OpenMP runtime creates and manages the separate threads.
OpenMP is much easier to use than low-level threads.

12.4.4.3 The Proposed System

The proposed system takes an MR image as the input image. The images are
processed with the help of the core image processing algorithms: image reg-
istration, image segmentation, and image de-noising. The whole processing is done
in two different environments, shared memory (CPU) and GPU (CUDA).
The analysis of both environments enables a more precise decision regarding
the efficiency of the system. In the proposed system architecture, the system takes its
input from an MRI machine. The input image undergoes three image processing
algorithms: image registration, image segmentation, and image de-noising. Image
registration is the most common algorithm in medical imaging. One reason for this
is the GPU's hardware support for linear interpolation, which makes it possible to
transform images and volumes very efficiently, whereas image segmentation is often
used for segmenting brain structures, blood vessels, organs, tumors, and bones.
It allows multiple candidate segmentation algorithms to be compared immediately,
and once the segmentation algorithm has been well established, it works on massive
datasets. It also enables interactive segmentation and visualization.
In the case of image de-noising, the main aim is to improve overall image quality;
image registration and segmentation algorithms can also benefit from a decreased
noise level. After processing, we obtain a high-quality processed image from the
MRI input (Fig. 12.3).

Fig. 12.3 System architecture of medical image analysis

12.5 Problem Formulation

12.5.1 Algorithm: GPU

Start
Obtain the MRI image from the Web by image acquisition.
Study the characteristics of the MRI.
Implement image registration using the Fourier-based correlation algorithm.
Implement segmentation using the K-means clustering algorithm.
De-noise the image using the Gaussian filter algorithm.
Analyze the performance of the system.
Stop

12.5.2 Flowchart

The flowchart (Fig. 12.4) gives a brief idea of the whole process that takes place
in medical image processing. The process starts from the basic
stage of image processing, image acquisition (image retrieval); after studying the
image's characteristics, the main image processing begins with the image
registration, image segmentation, and image de-noising algorithms used for effective
image processing. The overall process takes place in two different environments,
and the performance is then measured in both of them; whichever is more
efficient will be concluded to be the better system of the two environments.
As shown in the figure, the whole sequence of the system over time is:
An MRI image is provided as input to the system.
Image acquisition takes place.
After image acquisition, if the image is not aligned, image registration is done.
After aligning the image, it goes for segmentation, in which the image is
partitioned into different structures.
After the segmentation process, it is forwarded for de-noising.
At last, the processed output image is obtained (Fig. 12.4).

Fig. 12.4 Flowchart of medical image analysis

12.5.3 UML (Unified Modeling Language)

To describe the overall medical image processing, a class diagram is drawn in
which the three basic algorithms are shown as three different classes. All three
classes perform the medical image processing algorithms, in which different methods
are used: the Fourier-based coefficient method for image registration, the K-means
clustering method for segmenting the image into different structures, and the
Gaussian filtering method for de-noising the image.
The class diagram for the proposed system (Fig. 12.5) is described as follows:
The first class is the Fourier-based coefficient method for image registration,
which consists of the Fourier transform, translations and rotations, and the
properties of the Fourier transform.
The second class is the K-means clustering method used for segmenting the image
into different structures, which consists of the clusters.
The third class is for de-noising the image, i.e., the Gaussian filtering method,
which consists of the FFT and the Gaussian filter (Fig. 12.5).

Fig. 12.5 Class diagram of medical image analysis

12.6 Mathematical Model

One view of analysis modeling, called structural analysis, considers data and the
processes that transform the data as separate entities. Data objects are modeled in a
way that defines attributes and relationships. Processes that manipulate data objects
are modeled in a manner that shows how they transform data as the data objects flow
through the system.

Fig. 12.6 Venn diagram

Let I be the set of inputs containing the image and its properties:

I = {I1, I2, I3, …, In}   (12.5)

Let O be the set of outputs containing the processed image:

O = {o1, o2, o3, …, on}   (12.6)

Consider the Venn diagram for the given proposed system (Fig. 12.6), where:
E = the universal set containing the environments:

E = {G, F, C}

where
G = GPU environment
C = CPU environment
F = set of functions, F = {f1, f2, f3, f4}, with G ∩ C = F

Let
f1 = Image Acquisition
f2 = Image Registration
f3 = Image Segmentation
f4 = Image De-noising

f1 = Image Acquisition: the MR image is retrieved from the source and stored
temporarily for processing.
f2 = Image Registration: taking the Fourier transform of the given images,

$I_a = \mathcal{F}\{i_a\}, \quad I_b = \mathcal{F}\{i_b\}$   (12.1)

applying a translation by (a, b) and a rotation by θ,

$i_a(x, y) = i_b(x\cos\theta + y\sin\theta - a,\; -x\sin\theta + y\cos\theta - b)$   (12.2)

using the shift and rotation properties of the Fourier transform,

$I_a(\xi, \eta) = e^{-i(a\xi + b\eta)}\, I_b(\xi\cos\theta + \eta\sin\theta,\; -\xi\sin\theta + \eta\cos\theta)$   (12.3)

and taking the squared magnitude (the power spectrum), which removes the translation:

$|I_a(\xi, \eta)|^2 = |I_b(\xi\cos\theta + \eta\sin\theta,\; -\xi\sin\theta + \eta\cos\theta)|^2$   (12.4)

f3 = Image Segmentation: K-means clustering of the image patterns into N clusters
(Sect. 12.4.3.3).

f4 = Image De-noising: the density function of the normal distribution is known as
the Gaussian function. In one dimension it is:

$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-x^2/(2\sigma^2)}$

From the one-dimensional function we can obtain the two-dimensional function:

$G(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-(x^2 + y^2)/(2\sigma^2)}$

With the above two-dimensional function, we can calculate the weight of each
point, where σ is the standard deviation of the distribution and the mean of
the distribution is assumed to be 0.
The Gaussian function is used in different research areas:
It denotes a probability distribution for noise or data.
It is a smoothing operator.
It is also used in mathematics.
An important property of the Gaussian function is verified by its integral:

$I = \int_{-\infty}^{\infty} \exp(-x^2)\, dx = \sqrt{\pi}$
Let S be the universal set for medical image processing:

S = {f1 ∪ f2 ∪ f3 ∪ f4} = O, the set of outputs, i.e., the processed image.

Hence S ≡ O, where O = {o1, o2, o3, …, on} is the processed image.

12.7 Conclusion

Hence, the high-performance computational analysis will be done with the help of
high-performance computing languages. For the analysis, three algorithms are used:
image registration, image segmentation and image de-noising. These are the basic
algorithms in medical image processing. In image registration, the Fourier-based
coefficient method is used for aligning the images. In image segmentation, the
K-means clustering algorithm is used for segmenting the image into similar
structures. For de-noising the image, the Gaussian method is used. Using all these
methods and algorithms, the system performance is analyzed, and after the
performance analysis it will be concluded which system performs better.

Chapter 13
Terrorist Scanner Radar and Multiple
Object Detection System

Supriya Kunjir and Rajesh Autee

Abstract The requirement for security at the national border increases day by day,
leading to a demand for an advanced sensor-network security system beyond simple
security applications. Detection of terrorists or any terrorist activity at the national
border is an important concern. Many security systems are already in operation at the
national border, such as manual screening, auto bomb blasters, long-range firing
equipment, and cordless transmitters and receivers common in controlled-access
zones such as the entry-restricted area of the national border, but there are certain
problems with them. The purpose of the presented system is to build a more secure
system, based on a sensor network, for enhanced security at the national border,
which can automatically detect obstacles entering the entry-restricted area. The
system specifically aims at the task of detecting obstacles by means of an ultrasonic
radar sensor network, provides photographs of the detected obstacles using a camera,
and also provides a total count of the detected obstacles by means of a counter. The
ultrasonic sensor network coupled with the counter and display unit is then coupled
as a whole assembly to the FM transceiver to give voice announcements at the
military control department. The aim is to develop a cost-effective security system
based on a radar sensor network to prevent terrorism to a great extent.

Keywords FM transceiver · Obstacles · Security · Terrorism · Ultrasonic sensors

S. Kunjir · R. Autee (✉)
Electronics and Communication Department, Deogiri Institute of Engineering
and Management Studies, Aurangabad, Maharashtra, India
e-mail: rmautee@gmail.com
S. Kunjir
e-mail: supriyakunjir@gmail.com


13.1 Introduction

In today's world, the security system is becoming more essential and, as per the
need, more advanced. In this rapidly growing world, as technology changes every
field at a very rapid pace, the requirement for security simultaneously becomes an
important need at every place in the world, and changing technology has brought a
great revolution in the field of security as well. In response to the recent incidents
of terrorism in the nation, it becomes necessary to develop an advanced and fast
security system against terrorism for the national border. Therefore, an attempt is
made here to build a security system for the national border against terrorism using
an ultrasonic sensor network, which helps to protect the nation against terrorism to
a great extent. The goal of the proposed system is to build a more advanced and
sensitive security system which can control terrorism. The development of the
system is achieved using an ultrasonic radar sensor network. Considering the
purpose of a security system for the national border, the system is designed to
continuously scan the border. A pair of ultrasonic transmitter and receiver sensors
with a camera is placed on a motor assembly and rotated by the rotating assembly,
i.e., the motor, through a 180° angle using limit switches at the entry-restricted area
of the national border. This sensor network and camera are then coupled to the
counter and display section, which gives the count of the obstacles found by the
sensor network and also gives the photographs taken by the camera when any
obstacle comes into the scanning path. Further, the counter and display system is
coupled with a recording and playback system to give voice announcements at the
military base camp, providing continuous updates on the positions of the obstacles
from the border area. Due to the advantage of image provision, suitable control
action can be taken against terrorists as per the situation. Figure 13.1 shows the
overall system configuration.

Fig. 13.1 System configuration (playback system at the military base camp; main
system; scanning assembly at the border area)



13.2 Literature Survey

A typical radar technology includes emitting radio waves, receiving their reflection
and using this information to generate data. For imaging radar, the returning waves
are used to create an image. When the radio waves reflect off objects, the changes in
the radio waves provide data about the objects, including how far the waves travelled
and what kind of objects they encountered. Using the acquired data, a computer can
create a 3D or 2D image of the target. Radar imaging for combating terrorism
(Griffiths and Baker) [1] presented a review of the application of imaging radar
systems to counter terrorism. In their case, the imaging radar is used to distinguish
targets from the background by exploiting differences in signature and, wherever
possible, making use of prior knowledge. Further information may be obtainable
through the use of radar polarimetry and interferometry.
There are two distinct stages to this:
(i) The production of high-quality, artifact-free imagery and
(ii) The extraction of information from the imagery.
This is followed by a discussion of four specific applications to counterterrorism:
(i) The detection of buried targets,
(ii) Through-wall radar imaging,
(iii) Radar tomography and detection of concealed weapons and
(iv) Passive bistatic radar.
Radar is widely used in the application of weapon detection against terrorism.
Inmarsat Global Limited (Bulletin Global Government, November 2013) [2] has
identified the Blighter family of surveillance radars as the key component of a
solution that will enable border agencies to deliver very significant improvements to
the security of their borders at an affordable cost.
The combination of BGAN with Blighter generates a highly capable, versatile
and mobile capability which fully meets the requirement and offers excellent value
for money. Blighter is the name of a family of modern state-of-the-art electronic
scanning radars, which are designed and built to provide continuous persistent
surveillance at borders, boundaries and perimeters. They detect moving targets over
both land and water, covering a wide area. Both mobile and man-portable variants
are available.
Blighter radars employ electronic rather than mechanical scanning. The
elimination of mechanical scanning provides the highest possible levels of system
availability and reliability. Blighter can detect very slow moving targets, down to
0.4 km/h. This ensures that targets moving almost tangentially to the radar can still
be detected. Advances in sensor technology and computer processing power can
significantly enhance border security. It is also a highly effective backup system that
can be guaranteed to provide resilient communications in the event of accidental
or deliberate disruption of a primary terrestrial communications network.

Ultrasonic sensors emit short, high-frequency sound pulses at regular intervals,
which propagate in the air at the velocity of sound. If they strike an object, they are
reflected back as echo signals to the sensor, which itself computes the distance to the
target based on the time span between emitting the signal and receiving the echo.
Terrorist scanner radar with military headquarters informing system (Kausadikar
et al. 2013) [3] proposed a border security system which is mainly based on the
radar principle. Radar is a system that uses radio waves to detect objects such as
aircraft, ships, terrain or rain, determine their direction, distance and/or speed, and
map them.
A transmitter emits radio waves, which are reflected by the target and detected
by a receiver, typically in the same location as the transmitter. Although the returned
radio signal is usually very weak, radio signals can easily be amplified, so radar
can detect objects at ranges where other emissions, such as sound or visible light,
would be too weak to detect.
Radar is used in many contexts, including meteorological detection of precipitation,
air traffic control, police detection of speeding traffic and by the military. In
this paper, the proposed system can only detect an object within its working range;
it is not possible to determine exactly what kind of object has been detected. As the
system is not able to identify the exact type of object, it is not possible to take suitable
control action. As a control action, it simply takes a firing action for a specific time
period. Sometimes the detected obstacle does not require such a long firing time, and
sometimes the firing time may be insufficient. Such problems occur with the system
because it does not have any facility to capture images of the objects being detected.
Project review on ultrasonic distance measurement (Shrivastava et al.
2014) [4] constructed a device that can measure distance in the range of 0.5–4 m with
an accuracy of 1 cm. This device measures distance using ultrasonic
sensors. It works by transmitting ultrasonic waves at 40 kHz. The transducers then
measure the amount of time taken for a pulse of sound to travel to a
particular surface and return as the reflected echo. After that, a circuit programmed
with an ATmega microcontroller calculates the distance based
on the speed of sound at the ambient temperature of 25 °C and the time
taken. The distance is then displayed on an LCD module. The importance of the
device lies in calculating an accurate distance from any obstacle that we want to measure.
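
The distance computation in [4] follows from the echo round trip: d = v · t / 2, where v is the speed of sound at the ambient temperature. A minimal Python sketch (the temperature model v ≈ 331.3 + 0.606 · T m/s is a standard approximation, not a formula stated in the surveyed paper):

    def speed_of_sound(temp_c):
        """Approximate speed of sound in air (m/s) at temp_c degrees Celsius."""
        return 331.3 + 0.606 * temp_c

    def echo_distance(echo_time_s, temp_c=25.0):
        """Distance to the reflecting surface: the pulse travels out and back."""
        return speed_of_sound(temp_c) * echo_time_s / 2.0

    # Example: a 23.3 ms round trip at 25 degrees C is roughly 4 m
    print(echo_distance(23.3e-3))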
Radar-based automatic target system (Dingley and Alabaster 2009)
[5] described the novel application of both dual-tone CW and ISAR techniques to
measure the position of a small high-velocity projectile as it passes through a defined
virtual sensory plane, so forming the basis of an automatic targeting system for live
fire training. Commercially available systems operate using either acoustic or
optical principles. The various technical problems associated with optical and
acoustic automatic target systems have made the development of a radar-based
system desirable, as such a system will generally be more compact, capable of
measuring both subsonic and supersonic projectiles and less influenced by
the characteristics of the local environment. This contribution describes the work so
far in developing a radar-based targeting system for small targets.

Ajay Kumar Shrivastava et al. (Nov. 2009) [6] presented a system which gives
the effect of variation in the separation between the ultrasonic transmitter and receiver
on the accuracy of distance measurement. Distance measurement of an object in front
of, or by the side of, a moving or stationary entity is required in a large number of
devices. To maintain the accuracy of the measured distance, the separation between
transmitter and receiver is very important. In this system, the transmitter and
receiver are separated and readings are taken with the distance varied in intervals of
5 cm. When the distance increases, the error becomes constant and very small.
A correction may be applied to calculate the correct distance.
Ankita Sharma et al. (April 2011) [7] presented a report on an ultrasonic 3D locator
which aims to locate the 3D coordinates of a point in space. Ultrasonic signals have
been used for the purpose. The device has two basic modules: location indicator and
position calculator. The location indicator is the transmitter module which is
independent of the calculation unit. The transmitter module sends out a modulated
pulse with carrier in ultrasound range. The receiver module has 5 receivers located
at pre-calculated positions so as to calculate the position of transmitter accurately.
The 3D coordinate of the object is detected based on the time difference of arrival at
the receivers. The transmitter sends pulses to the receivers, and the central unit
calculates the delay between the receivers. Using these measurements, it then
calculates the coordinate of the object (the transmitter). Continuous-wave radar is a
type of radar system where known stable frequency continuous-wave radio energy
is transmitted and then received from any reflecting objects.
A microwave measurement system for metallic object detection using swept
frequency radar (Yong Li et al. 2008) [8] described a microwave measurement
system for metallic object (guns, knives) detection using swept frequency radar. This
work reports on a 1–14 GHz swept frequency radar system for metallic object
detection using a reflection configuration. The swept frequency response and resonant
frequency behavior of a number of metallic objects in terms of position, object
shape, rotation and multiple objects have been tested and analyzed. The system,
working from 1 to 14 GHz, has been set up to implement sensing of metal items at a
standoff distance of more than 1 m.
Helmut Essen et al. [9] presented two different approaches to concealed weapon
detection with active and passive millimeter wave sensors. The design of a passive
radiometric sensor in the W-band is presented. On the active side, an FMCW radar
system at 94 GHz is introduced for the scanning of persons. Sensors used for security
purposes have to cover the noninvasive control of humans, baggage and letters with
the aim to detect weapons, explosives and chemical or biological threat material.
Those sensors have to cope with different environmental conditions. Preferably, the
control of people has to be done over a longer distance. In times of increasing threat
by terrorist attacks, the control of passengers at airports and stations is one of the
major items. People carrying concealed weapons or explosives or those who have
other terroristic attacks in mind have to be detected under all circumstances.
Image segmentation of concealed objects detected by terahertz imaging (Sheeja
Agustin et al. 2010) [10] presents a method for image segmentation of concealed
objects detected by terahertz imaging. Terahertz radiation is emitted as part of the
black body radiation from anything with a temperature greater than about 10 K.
Terahertz waves achieve a reasonable penetration depth in certain common materials
such as cloth, plastic, wood, sand and soil. Therefore, THz radiation can detect
concealed weapons, since many nonmetallic, nonpolar materials are transparent to
this type of radiation (and are not transparent to visible radiation). In this system,
the multi-level thresholding method is applied to get the initial segmentation of
concealed objects in terahertz images. Then the Gonzalez method and an improved
Gonzalez method are proposed to detect and segment concealed objects of specific
shape in terahertz images more correctly.
Clive M. Alabaster et al. (2013) [11] proposed a virtual target radar (VTR) system
for small arms fire training. It provides the miss distance of a bullet from an aim point
in two axes as the bullet passes through the target plane. The VTR consists of two
radars at ground level separated by a baseline distance, d, of approximately
2 m. Each radar boresight is inclined upwards at 45° so that the boresights
intersect at right angles. The region in which the beams overlap in the plane defined
by the two boresight vectors defines the targeting plane. The shooter is presented
with an aim point by the intersection of two visible laser beams from a pair of diode
lasers aligned to the boresight of each radar. A bullet passing through the beams
between set velocity and amplitude limits triggers the recording of the data by each
radar, which is processed using inverse synthetic aperture radar (ISAR)-like
techniques to extract the range to the bullet. Each radar measures the range to the bullet
as it passes through the targeting plane. This yields the x and y coordinates of the
bullet and hence its miss distance in each plane from the target center.
Naidu (2009) [12] developed a system for 3D target tracking named fusion
of radar and IRST sensor measurements for 3D target tracking using an extended
Kalman filter. Filters and multiple sensors are used to enhance the target-tracking
capabilities. Radar can measure the azimuth, elevation and range of the target. It can
measure range with good resolution, but the angular measurements are of lower
resolution. Radar provides sufficient information to track the target, since it
measures both the angular position and the range of the target. An infrared search and track
(IRST) sensor (sometimes known as infrared sighting and tracking) can measure the
azimuth and elevation of a target with good resolution. It can provide only the
direction of the target but not its location, because it does not provide the range. The
uncertainty associated with IRST might be represented as a square whose
dimensions are comparatively small perpendicular to the measured line of sight. With the
fusion of measurements from radar and IRST, the resultant uncertainty of the
estimated position of the target is smaller than the uncertainty of the estimates with
either measurement alone.
Technology Focus (Bulletin of the Defence Research and Development Organisation,
April 2013) [13] describes different radars with their technology and
applications for the defense area. Since radar is used to detect a target, it is necessary
to get an image of the target so that the exact scene of the target can be found and
suitable action taken against it. A radar imaging system gives images of the
target, but it should not be expected that radar images will look like photographs.
Radar is also used for the detection of metallic objects and concealed weapons, but it
will not detect weapons other than metallic objects. Applications of radar
also include determination of the position, distance and speed of a target. Further,
the effect of variation in the separation between transmitter and receiver on the
accuracy of distance measurement has also been studied. To overcome these
drawbacks, a specific target detection arrangement is made in the proposed system:
a camera is placed with the radar to capture images of the target. This reduces the
problem of obtaining photographs of the target. The system thus provides an image
of the target; however, it needs a specific camera arrangement to be mounted on the
radar.

13.3 System Development

The proposed system, the terrorist scanner radar and multiple object detection
system, consists of an ultrasonic transmitter/receiver pair, the main part of the
system, which is used to detect any object that comes into its path. Once an object is
detected by the ultrasonic transmitter/receiver, its output is further sent to the counter
and display unit. The counter counts the number of objects detected by the ultrasonic
transmitter, and the display unit displays that count. Simultaneously, the system
generates a voice signal using the message driver unit and sends that voice signal to
the military base camp using the FM transmitter, where it is played via the FM
receiver. The system designed here is divided into three main parts: the ultrasonic
sensor section, located at the national border area; the counter, display and message
driver section, located at the military office; and the voice announcement section,
located at the military base camp. The proposed system consists of the blocks shown
in Fig. 13.2.
When the rotating assembly, i.e., the motor, rotates through a 180° angle in the
forward and backward directions, the ultrasonic transmitter and receiver and the
camera placed on the motor also rotate. While rotating, the assembly scans the area
within its range, and if it finds any obstacle in its scanning path, the motor stops its
rotation and the frequency transmitted by the transmitter is reflected by the obstacle
and received by the receiver. When the receiver receives an echo signal from an
obstacle, it sends one high pulse to the counter. The counter is incremented by one
for one high pulse, by two for two high pulses, and so on for every successive high
pulse received from the ultrasonic receiver. One high pulse corresponds to one
obstacle. Simultaneously, the camera placed on the motor takes photographs of the
obstacle detected by the ultrasonic sensors and sends the images to the display unit.
The display unit consists of a seven-segment display and a PC or laptop as display
devices.
The seven-segment displays are used to display the count of the counter section,
and the PC/laptop is used to display the images captured by the camera. This whole
section is then coupled with the message driver unit, where a voice message is
generated after a specific count is reached by the counter. This voice message is then
passed to the FM transmitter, which transmits it to the military base camp, where an
FM receiver and a recording and playback assembly are located to play the voice
announcement.
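
The counting behaviour can be illustrated with a simple threshold-based edge counter over the sampled receiver output (a Python sketch; the 15 V threshold is an assumed value chosen between the idle level of about 6.87 V and the detection level of 25.4 V reported in Sect. 13.4):

    import numpy as np

    def count_detections(voltages, threshold=15.0):
        """Count rising edges where the receiver output crosses the threshold;
        each sustained high pulse (one obstacle) is counted once."""
        above = np.asarray(voltages) > threshold
        rising = above[1:] & ~above[:-1]
        return int(rising.sum())

    # Idle level ~6.87 V; a detection raises the output to 25.4 V
    trace = [6.8, 7.0, 25.4, 25.4, 25.4, 6.8, 7.0]
    print(count_detections(trace))   # prints 1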

Fig. 13.2 Overview of the proposed system (blocks: ultrasonic transmitter and
receiver with camera on the rotating assembly, 40 kHz pulse generator, three-stage
amplifier, Schmitt trigger, flip-flop, 2 Hz oscillator, counter and display unit,
message driver and inverter unit, power supply unit (5 V, ±12 V), FM transmitter,
FM receiver, and recording and playback unit)

13.4 Experimental Results

This terrorist scanner radar gives a simple and efficient method to detect,
count and identify terrorists at the national border. As the other existing systems
do not provide exact identification of the terrorist, the proposed system helps to
provide analysis of the detection and identification of terrorists.

13.4.1 Parametric Evaluation of the Proposed System

The system analysis consists of the output voltages at different test points before
and after obstacle detection (Table 13.1).

Table 13.1 Voltages at different test points before and after obstacle detection

Sr. no. | Test point | Test point description | Voltage at TP before obstacle detection | Voltage at TP after obstacle detection
1 | TP1 | Across pin no. 4 and pin no. 6 of IC CA3140, the output of the ultrasonic receiver | 0 V | 11.80 V
2 | TP2 | Across pin no. 1 and pin no. 3 of IC 555, the output of the motor control circuit | 0 V | 9.25 V
3 | TP3 | Across pin no. 1 and pin no. 3 of IC 555, the output of the counter circuit | 0 V | 7.53 V
4 | TP8 | Across pin no. 1 and pin no. 3 of IC 555, the output of the message driver circuit | 0 V (count below 8) | 10.52 V (count above 8)

13.4.2 Output Response of the Ultrasonic Transceiver
with an Obstacle in its Scanning Path

Figure 13.3 shows the output waveform of the ultrasonic transceiver when one
obstacle comes into its scanning path: a positive-going high pulse appears when an
obstacle is detected. Normally the output is at a lower voltage level of approximately
6.87 V, and when an object is detected in its path the voltage level increases to
25.4 V, as given in Table 13.2.
Figure 13.4 shows the output waveform of the ultrasonic transceiver when two
obstacles come into its scanning path: two positive-going high pulses appear when
two obstacles are detected. Normally the output is at a lower voltage level of
approximately 6.87 V, and when objects are detected in its path the voltage level
increases to 25.4 V, as given in Table 13.3.

Fig. 13.3 One object detected waveform (time (ms) on the horizontal axis; voltage
(V) on the vertical axis)



Table 13.2 Data table for one object detection


Voltage (V)  Time (ms)  Voltage (V)  Time (ms)  Voltage (V)  Time (ms)  Voltage (V)  Time (ms)
7 1.68 7 9.52 7 13.28 7 17.04
6.8 1.76 6.8 9.6 7 13.36 7 17.12
6.8 1.84 6.8 9.68 6.8 13.44 7 17.2
6.8 1.92 6.8 9.76 6.8 13.52 7 17.28
7 2 7 9.84 6.8 13.6 7 17.36
6.8 2.08 6.8 9.92 7 13.68 7 17.44
6.8 2.16 6.8 10 6.8 13.76 7 17.52
7 2.24 7 10.08 6.8 13.84 6.8 17.6
7 2.32 7 10.16 7 13.92 6.8 17.68
7 2.4 7 10.24 7 14 6.8 17.76
7 2.48 7 10.32 7 14.08 7 17.84
7 2.56 7 10.4 7 14.16 6.8 17.92
7 2.64 7 10.48 7 14.24 6.8 18
7 6.8 6.8 10.56 7 14.32 7 18.08
25.4 6.88 6.8 10.64 7 14.4 7 18.16
25.4 6.96 6.8 10.72 6.8 14.48 7 18.24
25.4 7.04 6.8 10.8 6.8 14.56 7 18.32
25.4 7.12 6.8 10.88 6.8 14.64 7 18.4
25.4 7.2 6.8 10.96 7 14.72 7 18.48
25.4 7.28 6.8 11.04 6.8 14.8 7 18.56
25.4 7.36 6.8 11.12 6.8 14.88 6.8 18.64
25.4 7.44 6.8 11.2 7 14.96 6.8 18.72
25.4 7.52 6.8 11.28 7 15.04 6.8 18.8
25.4 7.6 7 11.36 7 15.12 6.8 18.88
25.4 7.68 7 11.44 7 15.2 6.8 18.96
25.4 7.76 7 11.52 7 15.28 6.8 19.04
7 7.84 7 11.6 7 15.36 6.8 19.12
6.8 7.92 7 11.68 7 15.44 6.8 19.2
6.8 8 7 11.76 6.8 15.52 6.8 19.28
6.8 8.08 7 11.84 6.8 15.6 6.8 19.36
7 8.16 7 11.92 6.8 15.68 6.8 19.44
6.8 8.24 7 12 7 15.76 7 19.52
6.8 8.32 7 12.08 6.8 15.84 6.8 19.6
7 8.4 7 12.16 6.8 15.92 6.8 19.68
7 8.48 7 12.24 7 16 7 19.76
7 8.56 6.8 12.32 7 16.08 7 19.84
7 8.64 6.8 12.4 7 16.16 7 19.92
7 8.72 6.8 12.48 7 16.24 7 20
7 8.8 6.8 12.56 7 16.32 7 20.08
(continued)

Table 13.2 (continued)


Voltage (V)  Time (ms)  Voltage (V)  Time (ms)  Voltage (V)  Time (ms)  Voltage (V)  Time (ms)
7 8.88 7 12.64 7 16.4 7 20.16
7 8.96 6.8 12.72 7 16.48 7 20.24
7 9.04 6.8 12.8 6.8 16.56 6.8 20.32
7 9.12 7 12.88 6.8 16.64 6.8 20.4
7 9.2 7 12.96 6.8 16.72 6.8 20.48
7 9.28 7 13.04 7 16.8 7 20.56
7 9.36 7 13.12 6.8 16.88 6.8 20.64
7 9.44 7 13.2 6.8 16.96 6.8 20.72

Figure 13.5 shows the output waveform of the ultrasonic transceiver when three
obstacles come into its scanning path: three positive-going high pulses appear when
three obstacles are detected. Normally the output is at a lower voltage level of
approximately 6.87 V, and when objects are detected in its path the voltage level
increases to 25.4 V, as given in Table 13.4.

Fig. 13.4 Two object detected waveform (time (ms) on the horizontal axis; voltage
(V) on the vertical axis)

Fig. 13.5 Three object detected waveform (time (ms) on the horizontal axis;
voltage (V) on the vertical axis)



Table 13.3 Data table for two object detection


Voltage (V)  Time (ms)  Voltage (V)  Time (ms)  Voltage (V)  Time (ms)  Voltage (V)  Time (ms)
7 1.68 6.8 9.52 6.8 13.28 7 17.04
6.8 1.76 6.8 9.6 6.8 13.36 7 17.12
6.8 1.84 7 9.68 7 13.44 7 17.2
6.8 1.92 7 9.76 6.8 13.52 7 17.28
7 2 7 9.84 6.8 13.6 7 17.36
6.8 2.08 7 9.92 7 13.68 7 17.44
6.8 2.16 7 10 7 13.76 6.8 17.52
7 2.24 7 10.08 7 13.84 6.8 17.6
7 2.32 7 10.16 7 13.92 7 17.68
7 2.4 7 10.24 7 14 7 17.76
7 2.48 7 10.32 7 14.08 7 17.84
7 2.56 7 10.4 7 14.16 7 17.92
7 2.64 7 10.48 7 14.24 7 18
25.4 6.8 25.4 10.56 7 14.32 7 18.08
25.4 6.88 25.4 10.64 7 14.4 7 18.16
25.4 6.96 25.4 10.72 6.8 14.48 6.8 18.24
25.4 7.04 25.4 10.8 6.8 14.56 6.8 18.32
25.4 7.12 25.4 10.88 6.8 14.64 6.8 18.4
25.4 7.2 25.4 10.96 7 14.72 7 18.48
25.4 7.28 25.4 11.04 6.8 14.8 6.8 18.56
25.4 7.36 25.4 11.12 6.8 14.88 6.8 18.64
25.4 7.44 25.4 11.2 7 14.96 7 18.72
25.4 7.52 25.4 11.28 7 15.04 7 18.8
25.4 7.6 25.4 11.36 7 15.12 7 18.88
25.4 7.68 25.4 11.44 7 15.2 7 18.96
25.4 7.76 25.4 11.52 7 15.28 7 19.04
25.4 7.84 6.8 11.6 7 15.36 7 19.12
6.8 7.92 6.8 11.68 7 15.44 7 19.2
6.8 8 7 11.76 6.8 15.52 6.8 19.28
7 8.08 7 11.84 6.8 15.6 6.8 19.36
6.8 8.16 7 11.92 6.8 15.68 6.8 19.44
6.8 8.24 7 12 7 15.76 7 19.52
7 8.32 7 12.08 6.8 15.84 6.8 19.6
7 8.4 7 12.16 6.8 15.92 6.8 19.68
7 8.48 7 12.24 7 16 7 19.76
7 8.56 7 12.32 7 16.08 7 19.84
7 8.64 7 12.4 7 16.16 7 19.92
7 8.72 7 12.48 7 16.24 7 20
7 8.8 7 12.56 7 16.32 7 20.08
(continued)

Table 13.3 (continued)


Voltage (V)  Time (ms)  Voltage (V)  Time (ms)  Voltage (V)  Time (ms)  Voltage (V)  Time (ms)
7 8.88 7 12.64 7 16.4 7 20.16
7 8.96 7 12.72 7 16.48 7 20.24
7 9.04 7 12.8 6.8 16.56 6.8 20.32
7 9.12 7 12.88 6.8 16.64 6.8 20.4
6.8 9.2 7 12.96 6.8 16.72 6.8 20.48
6.8 9.28 7 13.04 7 16.8 7 20.56
6.8 9.36 7 13.12 6.8 16.88 6.8 20.64
7 9.44 6.8 13.2 6.8 16.96 6.8 20.72

Figure 13.6 shows the output waveform of the ultrasonic transceiver when four
obstacles come into its scanning path: four positive-going high pulses appear when
four obstacles are detected. Normally the output is at a lower voltage level of
approximately 6.87 V, and when objects are detected in its path the voltage level
increases to 25.4 V, as given in Table 13.5.

Table 13.4 Data table for three object detection


Voltage (V)  Time (ms)  Voltage (V)  Time (ms)  Voltage (V)  Time (ms)  Voltage (V)  Time (ms)
7 1.68 7 9.52 7 13.28 7 17.04
6.8 1.76 6.8 9.6 7 13.36 7 17.12
6.8 1.84 6.8 9.68 6.8 13.44 6.8 17.2
6.8 1.92 6.8 9.76 6.8 13.52 6.8 17.28
7 2 7 9.84 6.8 13.6 6.8 17.36
6.8 2.08 6.8 9.92 7 13.68 7 17.44
6.8 2.16 6.8 10 6.8 13.76 6.8 17.52
7 2.24 7 10.08 6.8 13.84 6.8 17.6
7 2.32 7 10.16 7 13.92 7 17.68
7 2.4 7 10.24 7 14 7 17.76
7 2.48 7 10.32 7 14.08 7 17.84
7 2.56 7 10.4 7 14.16 7 17.92
7 2.64 7 10.48 7 14.24 7 18
25.4 6.8 25.4 10.56 7 14.32 7 18.08
25.4 6.88 25.4 10.64 25.4 14.4 7 18.16
25.4 6.96 25.4 10.72 25.4 14.48 7 18.24
25.4 7.04 25.4 10.8 25.4 14.56 7 18.32
(continued)

Table 13.4 (continued)


Voltage (V)  Time (ms)  Voltage (V)  Time (ms)  Voltage (V)  Time (ms)  Voltage (V)  Time (ms)
25.4 7.12 25.4 10.88 25.4 14.64 7 18.4
25.4 7.2 25.4 10.96 25.4 14.72 7 18.48
25.4 7.28 25.4 11.04 25.4 14.8 7 18.56
25.4 7.36 25.4 11.12 25.4 14.88 7 18.64
25.4 7.44 25.4 11.2 25.4 14.96 7 18.72
25.4 7.52 25.4 11.28 25.4 15.04 7 18.8
25.4 7.6 25.4 11.36 25.4 15.12 7 18.88
25.4 7.68 25.4 11.44 25.4 15.2 7 18.96
25.4 7.76 25.4 11.52 25.4 15.28 7 19.04
25.4 7.84 25.4 11.6 25.4 15.36 7 19.12
6.8 7.92 6.8 11.68 25.4 15.44 6.8 19.2
6.8 8 7 11.76 7 15.52 6.8 19.28
7 8.08 6.8 11.84 7 15.6 6.8 19.36
7 8.16 6.8 11.92 7 15.68 7 19.44
7 8.24 7 12 7 15.76 6.8 19.52
7 8.32 7 12.08 7 15.84 6.8 19.6
7 8.4 7 12.16 7 15.92 7 19.68
7 8.48 7 12.24 7 16 7 19.76
7 8.56 7 12.32 7 16.08 7 19.84
7 8.64 7 12.4 7 16.16 7 19.92
7 8.72 7 12.48 7 16.24 7 20
7 8.8 7 12.56 7 16.32 7 20.08
7 8.88 7 12.64 7 16.4 7 20.16
7 8.96 7 12.72 7 16.48 7 20.24
7 9.04 7 12.8 7 16.56 7 20.32
7 9.12 7 12.88 7 16.64 7 20.4
7 9.2 7 12.96 7 16.72 7 20.48
7 9.28 7 13.04 7 16.8 7 20.56
7 9.36 7 13.12 7 16.88 7 20.64
7 9.44 7 13.2 7 16.96 7 20.72

13.4.3 Output of Inverter

Figure 13.7 shows the output of the inverter, which is required to provide the
trigger as a low pulse to the timer in monostable mode.

Fig. 13.6 Four object detected waveform (time (ms) on the horizontal axis; voltage
(V) on the vertical axis)

Table 13.5 Data table for four object detection


Voltage (V)  Time (ms)  Voltage (V)  Time (ms)  Voltage (V)  Time (ms)  Voltage (V)  Time (ms)
7 1.68 6.8 9.6 6.8 13.44 6.8 17.28
6.8 1.76 6.8 9.68 6.8 13.52 6.8 17.36
6.8 1.84 6.8 9.76 6.8 13.6 7 17.44
6.8 1.92 7 9.84 7 13.68 6.8 17.52
7 2 6.8 9.92 6.8 13.76 6.8 17.6
6.8 2.08 6.8 10 6.8 13.84 7 17.68
6.8 2.16 7 10.08 7 13.92 7 17.76
7 2.24 7 10.16 7 14 7 17.84
7 2.32 7 10.24 7 14.08 7 17.92
7 2.4 7 10.32 7 14.16 7 18
7 2.48 7 10.4 7 14.24 7 18.08
7 2.56 7 10.48 7 14.32 25.4 18.16
7 2.64 25.4 10.56 25.4 14.4 25.4 18.24
25.4 6.8 25.4 10.64 25.4 14.48 25.4 18.32
25.4 6.88 25.4 10.72 25.4 14.56 25.4 18.4
25.4 6.96 25.4 10.8 25.4 14.64 25.4 18.48
25.4 7.04 25.4 10.88 25.4 14.72 25.4 18.56
25.4 7.12 25.4 10.96 25.4 14.8 25.4 18.64
25.4 7.2 25.4 11.04 25.4 14.88 25.4 18.72
25.4 7.28 25.4 11.12 25.4 14.96 25.4 18.8
25.4 7.36 25.4 11.2 25.4 15.04 25.4 18.88
25.4 7.44 25.4 11.28 25.4 15.12 25.4 18.96
25.4 7.52 25.4 11.36 25.4 15.2 25.4 19.04
25.4 7.6 25.4 11.44 25.4 15.28 25.4 19.12
25.4 7.68 25.4 11.52 25.4 15.36 25.4 19.2
25.4 7.76 25.4 11.6 25.4 15.44 6.8 19.28
(continued)

Table 13.5 (continued)


Voltage (V)  Time (ms)  Voltage (V)  Time (ms)  Voltage (V)  Time (ms)  Voltage (V)  Time (ms)
25.4 7.84 6.8 11.68 7 15.52 6.8 19.36
6.8 7.92 7 11.76 7 15.6 7 19.44
6.8 8 6.8 11.84 7 15.68 7 19.52
7 8.08 6.8 11.92 7 15.76 7 19.6
7 8.16 7 12 7 15.84 7 19.68
7 8.24 7 12.08 7 15.92 7 19.76
7 8.32 7 12.16 7 16 7 19.84
7 8.4 7 12.24 7 16.08 7 19.92
7 8.48 7 12.32 7 16.16 7 20
7 8.56 7 12.4 7 16.24 7 20.08
7 8.64 7 12.48 7 16.32 7 20.16
7 8.72 7 12.56 7 16.4 7 20.24
7 8.8 7 12.64 7 16.48 7 20.32
7 8.88 7 12.72 7 16.56 7 20.4
7 8.96 7 12.8 7 16.64 7 20.48
7 9.04 7 12.88 7 16.72 7 20.56
7 9.12 7 12.96 7 16.8 7 20.64
7 9.2 7 13.04 7 16.88 7 20.72
7 9.28 7 13.12 7 16.96 7 20.8
7 9.36 7 13.2 7 17.04 7 20.88
7 9.44 7 13.28 7 17.12 7 20.96
7 9.52 7 13.36 6.8 17.2 7 21.04

Fig. 13.7 Inverter output waveform (time (ms) on the horizontal axis; voltage (V)
on the vertical axis)



13.4.4 Received Frequency Calculation

The frequency received by the ultrasonic receiver is computed from the measured
time period of the received signal, T = 11.40 µs:

F = \frac{1}{T} \quad (13.1)

F = \frac{1}{11.40 \times 10^{-6}} \quad (13.2)

F = 87.719 \text{ kHz}

13.5 Conclusion

Mainly for the security of the national border, the government is taking a lot of
effort and spending a lot of money. By employing a system such as the terrorist
scanner radar and multiple object detection system, we can achieve higher security
at the national border at a very nominal cost. The basic aim of the project is to
provide security at the national border. The system is basically designed to provide
continuous scanning of the entry-restricted area, to find any obstacle entering
that entry-restricted area and to detect the exact kind of obstacle. While scanning
the entry-restricted area assigned to the system, if it finds any obstacle in its path, it
first detects the exact type of obstacle; then, according to the type of obstacle, the
necessary control action takes place. In the control action, not only does it start
firing, but additional necessary control action can also be taken by the military, as
the FM transmitter transmits a message to the military head office. The firing time
varies according to the obstacle detected. Simultaneously, the FM transmitter
transmits the message to the FM receiver located at the military headquarters so that
further necessary action can be taken by the military.

Acknowledgments The completion of this system is a task which would not have been
accomplished without the cooperation of my guide, Prof. R. M. Autee. I would like to thank my
parents for their encouragement. Lastly, I am also thankful to my friends who have helped me in
the completion of this system.

References

1. Griffiths HD, Baker CJ (2006) Radar imaging for combating terrorism. Springer, Netherlands
2. Bulletin Global Government (2013) Border surveillance systems: Blighter scanning radar and
sensor solutions
3. Kausadikar RB et al (2013) Terrorist scanner radar with military headquarters informing
system. Int J Comput Technol Electron Eng 3
4. Shrivastava P et al (2014) Project review on ultrasonic distance measurement. Int J Eng Tech
Res. ISSN 2321-0869
5. Dingley G, Alabaster C (2009) Radar based automatic target system. In: IEEE international
WD&D conference
6. Shrivastava AK et al (2009) Effect of variation of separation between the ultrasonic
transmitter and receiver on the accuracy of distance measurement. Int J Comput Sci Inf
Technol 1(2)
7. Sharma A et al (2011) Ultrasonic 3D locator. EE318 electronic lab project report, EE Dept,
IIT Bombay, April 2011
8. Li Y et al (2008) A microwave measurement system for metallic object detection using
swept-frequency radar. Proc SPIE 7117:71170K
9. Essen H et al Concealed weapon detection with active and passive millimeter wave sensors
10. Agustin SA et al (2010) Image segmentation of concealed objects detected by terahertz
imaging. In: IEEE conference on computational intelligence and computing research
11. Alabaster CM, Hughes EJ, Flores-Tapia D (2013) A virtual target radar system for small arms
fire training. IEEE
12. Naidu VPS (2009) Fusion of radar and IRST sensor measurements for 3D target tracking
using extended Kalman filter. Defence Sci J 59(2):175–182
13. Bulletin of Defence Research and Development Organisation (2013) Technol Focus 21(2)
Chapter 14
Inexact Implementation of Wavelet
Transform and Its Performance Evaluation
Through Bit Width Reduction

Moumita Acharya, Chandrajit Pal, Satyabrata Maity and Amlan Chakrabarti

Abstract Resource and energy optimization in computing is gaining a lot of
importance due to the increasing demand for smart and portable devices. These
devices have a stiff budget in terms of resource and energy. Most of the applications
running on these devices are media intensive, and hence special efforts are needed
to minimize the resource and energy requirements of the various computational
tasks involved in media processing. The discrete wavelet transform (DWT) is an
important transform, utilized in various forms of image and video processing
applications. It is a complex transform and hence demands a direct hardware
implementation, instead of software execution, in many application scenarios to
increase the overall system throughput. Inexact computing sacrifices the precision
of the computation by rejecting one or a few bits of data storage. The inexactness in
computing does not hamper those applications whose quality is not much
compromised by such inaccuracy. In this paper, we propose a low-resource and
energy-aware hardware design for DWT through dynamic bit width adaptation, thus
performing the computation in an inexact way. We have performed a field
programmable gate array (FPGA) based prototype hardware implementation of the
proposed design. To the best of our knowledge, this is the first such modeling of
DWT involving inexact computing.

Keywords Inexact computing · Wavelet · Image pyramid · PSNR · Discrete Haar
wavelet transform · System generator

M. Acharya
C.V. Raman College of Engineering, Bhubaneswar 752054, Odisha, India
e-mail: acharya_moumita@yahoo.co.in
C. Pal (✉) · S. Maity · A. Chakrabarti
A.K. Choudhury School of Information Technology, University of Calcutta,
92 APC Road, Kolkata, India
e-mail: palchandrajit@gmail.com
URL: http://www.caluniv.ac.in/


14.1 Introduction

The inexact computing technique is a tradeoff between computation quality (e.g.,
accuracy) and computational effort (e.g., energy) [1]. Earlier, the foremost concerns
of a VLSI designer were performance, area and cost; power was a secondary issue.
Nowadays, the efficient use of energy is a major concern of every design [2–4]. As
per Moore's law, in nanometric VLSI technology power dissipation has become a
significant factor [5]. Inexact computing is one of the best possible ways to achieve
high-performance, energy-efficient and reliable designs. Approximation is thus a
low-power design methodology by which battery lifetime and packaging cost can
be reduced. In the field of inexact computing, it is possible to implement image
processing through dynamic bit width adaptation using the discrete cosine transform
(DCT) for low-power system design [6]. In this paper, we propose a completely new
technique for inexact computing through bit width reduction in the discrete Haar
wavelet transform of images, with a hardware design consuming low resource and
power. Good reviews on inexact computing can be found in [7–10].
Wavelet transformation is one of the most popular time-frequency
transformations in many image and video processing applications, and hence we
consider it a good candidate for inexact computing. Discrete wavelets are not
continuously scalable and translatable but can only be scaled and translated in
discrete steps. In the early days, Burt [11] defined a subband coding technique to
decompose discrete-time signals, named pyramidal coding, which is also known as
multi-resolution analysis. An image pyramid is a filter-based representation that
extracts image features at multiple scales, reducing redundancy for image modeling
and increasing efficiency for coding and image analysis/synthesis. Some related
works on image pyramids can be found in [12, 13]. Wavelets are a more general
way to represent and analyze multiscale images. One of the earliest basic discrete
wavelet transformations (DWT) is the Haar wavelet transformation, which provides
a time-frequency analysis of the local features of a signal in applications such as
image compression. Related works on the discrete Haar wavelet transform can be
found in [14–17].

14.2 Inexact Computing

Inexact computing is used for those computing applications where perfect accuracy
is not mandatory, and where relaxing accuracy improves efficiency and reduces
power consumption [2]. In embedded systems, the most expensive resource is
energy, and its efficient utilization is a necessity. So, when implementing a design in
the image processing domain, one should be very careful about the tradeoff between
image accuracy and power consumption. In inexact computing we can tolerate some
loss of quality in the resultant image, provided it remains within the limits of human
visual perception. Applications in the domain of video or image processing benefit
from approximate computing because the input data, and the human senses that
perceive the output data, are themselves inexact.

14.3 Image Processing Algorithms

Image processing is a process to translate an image into digital form and then analyze
and manipulate that digitized image, including data compression and image
enhancement, to improve its quality or to extract useful information from it. As
shown in Fig. 14.1, a simple way to process the input image is to compute it in the
frequency domain (using the wavelet transform), extract the information using
different processes such as bit width reduction and filtering, and perform the inverse
transformation to reconstruct the enhanced image. An image is usually treated as a
two-dimensional function f(x, y), where x and y are spatial coordinates and the
amplitude f at any pair of coordinates (x, y) is the intensity or grayscale level of the
image at that point. An image is continuous with respect to the x and y coordinates
and in amplitude; when x, y and the amplitude of f are finite, discrete quantities, the
image is known as a digital image. The purposes of image processing include
visualization for human eyes, image sharpening and restoration, multiscale signal
analysis and image retrieval. Techniques generally used for digital image processing
include linear filtering, wavelets, pixelation, etc. In natural scenes, an image may be
composed of multiple resolutions because there is no particular scale or spatial
frequency that has a special status. For this reason, any visual system may follow
some uniformity in image representation and image processing over multiple scales
[13, 14]. An image pyramid is the representation of an image at multiple levels.

Fig. 14.1 Image processing block diagram (input image matrix, wavelet transform,
processing, inverse wavelet transform, output image matrix)



Fig. 14.2 a Image pyramid; b Levels of image pyramid

14.3.1 Image Pyramid

A pyramid is a multiscale representation built by a recursive method that naturally
leads to a self-similarity structure [13]. In Fig. 14.2 the image pyramid and its
different levels of arrangement are shown. An image pyramid helps to extract image
features such as edges at multiple scales and reduces redundancy by modeling the
image through analysis and synthesis processes. Well-known image pyramids are
the Gaussian pyramid, the Laplacian pyramid and the wavelet pyramid. In this paper,
the wavelet pyramid concept is used to implement the discrete wavelet
transformation for low-power image processing techniques. Wavelets generally
permit analyzing a signal (image) in the spatial and frequency domains [12]. Images
of different spatial resolutions build up the levels of the pyramid.
The original image is at level j, having the highest resolution, and it is the lowest
pyramid level. Each higher level contains a lower resolution image, and the
resolutions are usually half of the previous one. The image at the top, level 0 (the
apex), holds only one pixel. If level j is formed by smoothing and downsampling the
image at level j + 1, the result is known as an approximation pyramid; if level j is
formed as the difference between the upsampled and interpolated level j − 1 image
and the level j approximation image, the result is known as a prediction residual
pyramid [14]. In Fig. 14.3 the block diagram and its output are shown.
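
One approximation/prediction-residual step can be sketched as follows (Python with NumPy assumed; 2 x 2 block averaging stands in for the smoothing filter and nearest-neighbour upsampling for the interpolation filter, both illustrative choices):

    import numpy as np

    def pyramid_step(level_jp1):
        """One level of the approximation/prediction-residual pyramid.

        level_jp1 is the image at level j + 1 (even height and width).
        """
        # Approximation (level j): smooth with a 2x2 mean and downsample
        approx = 0.25 * (level_jp1[0::2, 0::2] + level_jp1[0::2, 1::2] +
                         level_jp1[1::2, 0::2] + level_jp1[1::2, 1::2])
        # Prediction: upsample and interpolate the approximation back
        prediction = np.repeat(np.repeat(approx, 2, axis=0), 2, axis=1)
        residual = level_jp1 - prediction   # prediction residual at level j + 1
        return approx, residual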

14.3.2 Subband Coding and Filter Banks

Subband coding decomposes a signal (image) into two or more components known
as subbands. The operation is performed with two filter banks, an analysis filter
bank and a synthesis filter bank, each formed by one low-pass filter h0(n) and one
high-pass filter h1(n). The analysis bank decomposes the input sequence f(n) and
produces two half-length sequences: the approximation subband flp(n) and the
detail subband fhp(n), shown in Fig. 14.4. The same holds for the synthesis filter

(b) (c)
(a) Approximaion 2 Approximation
Filter Downsampler (Level j-1)

2
Upsampler

(d) (e)
Interpolation
Filter

Prediction

Input -- Prediction
Image +
residual
(Level j) (Level j)

Fig. 14.3 a Image pyramid block diagram; b Original image; c Original image histogram; d
Pyramidal representation of original image; e Pyramidal image histogram

Fig. 14.4 Block diagram of analysis filter bank (cascaded H(z) and G(z) filters,
each followed by downsampling by 2, produce C1(n), D1(n), C2(n), D2(n), C3(n),
D3(n) from f(n) = C0(n))

Fig. 14.5 Block diagram of synthesis filter bank (upsampling by 2 followed by H(z)
and G(z) filters recombines C3(n), D3(n), D2(n), D1(n) into f(n) = C0(n))

bank, which recombines the two subbands flp(n) and fhp(n) to form the
reconstructed signal, as shown in Fig. 14.5. If the analysis filter bank is applied
recursively to the approximation subband, an approximation pyramid is formed. The
coefficients Ck and Dk are produced by convolving the digital signal with each
filter, followed by decimation of the output.
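
A one-level, one-dimensional analysis step matching Fig. 14.4 can be sketched as follows (Python with NumPy assumed; the orthonormal Haar pair is used as an illustrative choice for h0(n) and h1(n)):

    import numpy as np

    def analysis_step(signal):
        """Split a 1D signal into approximation and detail subbands."""
        h0 = np.array([1.0, 1.0]) / np.sqrt(2.0)    # low-pass filter h0(n)
        h1 = np.array([1.0, -1.0]) / np.sqrt(2.0)   # high-pass filter h1(n)
        f_lp = np.convolve(signal, h0)[1::2]        # convolve, then decimate by 2
        f_hp = np.convolve(signal, h1)[1::2]
        return f_lp, f_hp

    # Recursive application to the approximation yields C1, D1; C2, D2; ...
    f = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
    c1, d1 = analysis_step(f)   # half-length subbands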

Fig. 14.6 a Overall decomposition process for an image (analysis filters applied
along rows, then along columns, with decimation, yielding the LL, LH, HL and HH
subbands); b Original image; c Resultant image after subband coding

Subband coding for a 2D signal (image) is applied consecutively along the rows
and columns. Figure 14.6 shows the whole decomposition process of subband
coding and its result. Initially, the analysis filters are applied to the rows of the
image, producing two new images: one is the set of approximation (low-pass) row
coefficients and the other is the set of detail row coefficients. Next, the analysis
filters are applied to the columns of each new image, producing four different
images called subbands: the approximation A (LL), the vertical detail V (HL), the
horizontal detail H (LH) and the diagonal detail D (HH) [15]. Rows and columns
analyzed with a high-pass filter are designated with an H; similarly, rows and
columns analyzed with a low-pass filter are designated with an L. For instance, if a
subband image is produced by using a low-pass filter on the rows and a high-pass
filter on the columns, it is called the LH subband. Each subband carries different
image information. The LL subband consists of a coarse approximation of the image
and removes all high-frequency information. The LH subband removes all
high-frequency information along the rows and attenuates high-frequency
information along the columns; as a result, the vertical edges of the image are
highlighted. Likewise, the HL subband highlights horizontal edges and the HH
subband the diagonal edges of the image.

14.4 Discrete Wavelet Transformation

A wave is an oscillating function of time and space, like a sinusoid. The most
popular wave analysis is the Fourier transform, which only deals with the frequency
content of a signal; the temporal details are not present. To overcome this problem,
wavelet analysis provides an alternative solution. A wavelet is a small wave that
concentrates its energy in time as well as frequency, providing a technique for the
analysis of nonstationary, time-varying phenomena. By the Heisenberg uncertainty
principle, it is impossible to know simultaneously the exact time and frequency of
occurrence of a frequency component in a signal: we have either low frequency
resolution with good temporal resolution or high frequency resolution with poor
temporal resolution. Accordingly, in the wavelet transform the basis functions differ
in both frequency and spatial range. This type of transformation is intended to give
good frequency resolution for low-frequency components, which are basically the
average intensity values of the image, and good temporal resolution for
high-frequency components, which form the edges of the image. Wavelet analysis is
generally used to extract information from many different types of data, including
audio signals and images, and to segregate that information into approximation and
detail sub-signals. There are two types of wavelet transformation: the continuous
wavelet transformation (CWT) and the discrete wavelet transformation (DWT). To
overcome the problem of redundancy in the CWT, the discrete wavelet transform
was introduced. The DWT is not continuously translatable and scalable, but only in
discrete steps. When the DWT is applied to a continuous signal, it generates a series
of wavelet coefficients, known as a wavelet series decomposition.

14.4.1 Haar Wavelet Transform

The basic and simplest wavelet transformation is the Haar wavelet transformation,
introduced in 1910 by the Hungarian mathematician Alfréd Haar. The Haar
transform uses the Haar functions, a compact, dyadic, orthonormal, rectangular pair.
The Haar transform works as a model for the wavelet transformation and is closely
related to the discrete Haar wavelet transform [16, 17]. For multiresolution analysis,
the scaling function \varphi(x) creates a series of approximations of a signal or
image, each varying by a factor of 2 from its nearest neighboring approximation.
Wavelet functions \psi(x) are used to encode the difference (detail) in information
between adjacent approximations. So the Haar transform corresponds to the Haar
scaling function:
\varphi(x) = \begin{cases} 1 & 0 \le x < 1 \\ 0 & \text{otherwise} \end{cases} \quad (14.1)

and the Haar wavelet function:

\psi(x) = \begin{cases} 1 & 0 \le x < 0.5 \\ -1 & 0.5 \le x < 1 \\ 0 & \text{otherwise} \end{cases} \quad (14.2)

So, after shifting and scaling, the Haar wavelet functions are defined as

\varphi_{j,k}(x) = 2^{j/2}\, \varphi(2^j x - k) \quad (14.3)

\psi_{j,k}(x) = 2^{j/2}\, \psi(2^j x - k) \quad (14.4)

where k is the position of the function and j is the width, whose value defines the
layer in the image pyramid. The wavelet function \psi(x) relates to the difference
between layers j and j + 1.
Now assume A is an N × N image, where N is an even number. To compute
W_N A and W_N A W_N^T, the DHWT is applied to each column of A and to each
row of A, respectively, where W_N is the transformation matrix. For N = 4:

A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix} \quad (14.5)

W_N A W_N^T = \begin{pmatrix} H \\ G \end{pmatrix} A \begin{pmatrix} H \\ G \end{pmatrix}^T = \begin{pmatrix} H A H^T & H A G^T \\ G A H^T & G A G^T \end{pmatrix} = \begin{pmatrix} \mathcal{A} & V \\ H & D \end{pmatrix} \quad (14.6)

where \mathcal{A} denotes the approximation coefficients, V the vertical coefficients,
H the horizontal coefficients and D the diagonal coefficients.

H A H^T = \frac{1}{4} \begin{pmatrix} a_{11}+a_{12}+a_{21}+a_{22} & a_{13}+a_{14}+a_{23}+a_{24} \\ a_{31}+a_{32}+a_{41}+a_{42} & a_{33}+a_{34}+a_{43}+a_{44} \end{pmatrix} \quad (14.7)

Partition A into 2 × 2 blocks as

A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, \quad \text{where } A_{11} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \text{ and so on.}

Then the (i, j) element of H A H^T is simply the average of the elements in A_{ij},
so H A H^T is an approximation of the original image and is denoted \mathcal{A}.

H A G^T = \frac{1}{4} \begin{pmatrix} (a_{12}+a_{22})-(a_{11}+a_{21}) & (a_{14}+a_{24})-(a_{13}+a_{23}) \\ (a_{32}+a_{42})-(a_{31}+a_{41}) & (a_{34}+a_{44})-(a_{33}+a_{43}) \end{pmatrix} \quad (14.8)
With the same 2 × 2 partition of A, the (i, j) element of H A G^T can be viewed as a
difference between the columns of A_{ij}. So H A G^T defines the vertical
differences and is denoted V.

G A H^T = \frac{1}{4} \begin{pmatrix} (a_{21}+a_{22})-(a_{11}+a_{12}) & (a_{23}+a_{24})-(a_{13}+a_{14}) \\ (a_{41}+a_{42})-(a_{31}+a_{32}) & (a_{43}+a_{44})-(a_{33}+a_{34}) \end{pmatrix} \quad (14.9)
(14.9)
With the same partition, the (i, j) element of G A H^T can be viewed as a difference
between the rows of A_{ij}. So G A H^T defines the horizontal differences and is
denoted H.

G A G^T = \frac{1}{4} \begin{pmatrix} (a_{11}+a_{22})-(a_{12}+a_{21}) & (a_{13}+a_{24})-(a_{23}+a_{14}) \\ (a_{31}+a_{42})-(a_{32}+a_{41}) & (a_{33}+a_{44})-(a_{43}+a_{34}) \end{pmatrix} \quad (14.10)
Partition A into 2 × 2 blocks as before.

The (i, j) element of GAG^T can be viewed as a difference between the diagonals of A_ij. So GAG^T gives the diagonal differences and is denoted D. The resultant image is shown in Fig. 14.7. If the HWT process is applied again to the approximation, it is called the iterated HWT.
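
The 2 × 2 block computations of Eqs. (14.7)–(14.10) map directly to code. The following NumPy sketch is an illustrative Python equivalent (the chapter's own implementation is in MATLAB); the function name and the 1/4 normalization follow the equations above.

import numpy as np

def haar_dwt2_level1(A):
    # One-level 2D Haar transform of an N x N image (N even),
    # following Eqs. (14.7)-(14.10): each output pixel comes from
    # one 2 x 2 block of the input.
    a = A[0::2, 0::2].astype(float)   # a11 of each block
    b = A[0::2, 1::2].astype(float)   # a12
    c = A[1::2, 0::2].astype(float)   # a21
    d = A[1::2, 1::2].astype(float)   # a22
    approx     = (a + b + c + d) / 4.0        # HAH^T, Eq. (14.7)
    vertical   = ((b + d) - (a + c)) / 4.0    # HAG^T, Eq. (14.8)
    horizontal = ((c + d) - (a + b)) / 4.0    # GAH^T, Eq. (14.9)
    diagonal   = ((a + d) - (b + c)) / 4.0    # GAG^T, Eq. (14.10)
    return approx, vertical, horizontal, diagonal

Applying haar_dwt2_level1 again to the returned approximation band gives the iterated HWT described above.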

Fig. 14.7 a Input image; b Level one HWT image; c Iterated HWT image

14.5 Strategy for Inexact Computing

In this section we discuss the proposed methodology for the bit-width reduction
algorithm using the discrete Haar wavelet transform, to make a tradeoff between image
quality and computational energy. Generally, the peak signal-to-noise ratio (PSNR)
is an evaluation parameter of image quality; it is an approximate measure of
human perception of reconstruction quality. The PSNR is computed between the
reconstructed image and the original image. It is defined via the mean-squared
error (MSE) of two m × n images f and z, where f is the original image and z is
the reconstructed image:
$$MSE = \frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\left[f(i,j) - z(i,j)\right]^2 \qquad (14.11)$$

The PSNR (in dB) is defined as

$$PSNR = 10\log_{10}\!\left(\frac{MAX_I^2}{MSE}\right) \qquad (14.12)$$

$$PSNR = 20\log_{10}\!\left(\frac{MAX_I}{\sqrt{MSE}}\right) \qquad (14.13)$$

$$PSNR = 20\log_{10}(MAX_I) - 10\log_{10}(MSE) \qquad (14.14)$$

Here MAX_I is the maximum possible pixel value of the image under consideration.
In general, when samples are represented using linear PCM with B bits per
sample, MAX_I is 2^B − 1. In our experiment the original image pixels are initially
represented using 8 bits per sample, hence MAX_I is 255; after bit-width reduction
the image pixels become 7 bits per sample, therefore MAX_I is 127.
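
A minimal sketch of Eqs. (14.11) and (14.12) is given below (NumPy is used for illustration; the chapter's experiments were run in MATLAB):

import numpy as np

def psnr(f, z, max_i=255.0):
    # PSNR in dB between original f and reconstruction z,
    # per Eqs. (14.11)-(14.12); max_i = 2**B - 1 for B-bit samples.
    mse = np.mean((f.astype(float) - z.astype(float)) ** 2)  # Eq. (14.11)
    if mse == 0:
        return float('inf')   # identical images
    return 10.0 * np.log10(max_i ** 2 / mse)                 # Eq. (14.12)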

Algorithm 14.1 Inexact computing of a grayscale image through LSB bit slicing
using the discrete Haar wavelet transform
Input: Load input image f(i) from the workspace. It is a grayscale image of resolution 256 × 256.
Output: Reconstructed image z(i).
Step 1. A first-level 2D DHWT is applied on the image; the decomposition generates the
coefficients cA1, cH1, cV1, cD1.
Step 2. Through single-level reconstruction, A1, H1, V1, D1 are extracted from
cA1, cH1, cV1, cD1, respectively.
Step 3. Each of A1, H1, V1, D1 is then converted into an 8-bit binary number representation.
Step 4. Clip the LSB from each 8-bit binary number, converting it into its 7-bit equivalent
binary representation.
Step 5. Rescale the bit-sliced images as the coefficient matrices cA2, cH2, cV2, cD2.
Step 6. Finally, reconstruct the image using the IDHWT and compute the PSNR and Entropy.
Return: Reconstructed image z(i).

In this algorithm cA1, cH1, cV1, cD1 denote the approximation, horizontal,
vertical, and diagonal coefficients, respectively. A1 is the one-level approximation
image, H1 the one-level horizontal image, V1 the one-level vertical image, and D1
the one-level diagonal image. DHWT and IDHWT denote the discrete Haar wavelet
transform and the inverse discrete Haar wavelet transform, respectively, and PSNR
the peak signal-to-noise ratio. The whole process is shown in Fig. 14.8.
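
The flow of Algorithm 14.1 can be sketched as follows. This is an illustrative Python approximation, assuming the PyWavelets (pywt) library for the Haar analysis and synthesis steps; the single-level reconstruction and rescaling of Steps 2–5 are abstracted into a direct one-bit reduction of each subband coefficient, whereas the authors' MATLAB/Simulink design operates on the reconstructed 8-bit images.

import numpy as np
import pywt  # PyWavelets, assumed available

def clip_lsb(band):
    # Steps 3-5 of Algorithm 14.1, abstracted: represent each
    # coefficient as an integer and zero its least significant bit.
    b = np.rint(band).astype(np.int64)
    return (np.sign(b) * ((np.abs(b) >> 1) << 1)).astype(float)

def inexact_haar_roundtrip(img):
    # Step 1: one-level 2D Haar DWT.
    cA1, (cH1, cV1, cD1) = pywt.dwt2(img.astype(float), 'haar')
    # Steps 3-5: LSB bit slicing of each subband.
    cA2, cH2, cV2, cD2 = [clip_lsb(s) for s in (cA1, cH1, cV1, cD1)]
    # Step 6: inverse DWT reconstruction z(i).
    return pywt.idwt2((cA2, (cH2, cV2, cD2)), 'haar')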

14.6 Hardware Implementation

Referring to Fig. 14.9, the From Workspace block, designated by the ImageSource
variable, reads data from the MATLAB workspace, where the image data are stored in
the form of a 2D array. A MATLAB preload script reads the image and

Fig. 14.8 Design flow: one-level 2D DWT decomposition of the 256 × 256 grayscale image into cA1 (LL), cH1 (HL), cV1 (LH), cD1 (HH); synthesis of A1, H1, V1, D1 (each 256 × 256); LSB bit slicing and re-scaling into cA2, cH2, cV2, cD2 (each 128 × 128); and one-level 2D IDWT reconstruction of the 256 × 256 image z



Fig. 14.9 Hardware design: Xilinx System Generator model with four parallel channels computing CA2, CH2, CV2, and CD2; each channel uses line buffers, unit delays, an adder, a downsampler, counters with relational blocks, a constant 0.5 multiplier (inverse of 2), and fixed-point conversion followed by bit slicing between the gateway-in and gateway-out blocks



serializes the pixels and temporarily stores them in the workspace, in a format
suitable for hardware execution [18]. We have designed four parallel channels for
the computation of CA2, CH2, CV2, and CD2, respectively. The details of the
computation have been described in Eqs. (14.5)–(14.10). In the first phase cA1
is computed. The input image matrix is broken down into 2 × 2 segments and, as
per Eq. (14.7), the elements of each 2 × 2 segment are first summed and the result
is then divided by 2 (the 0.5 multiplier in Fig. 14.9), which produces a single value
of A1. This operation is performed over the entire input image matrix, say (M × N),
and as a result a half-resolution image, i.e., (M/2 × N/2), is generated after the
DWT. In the second phase cH1 is computed. The input image matrix is broken down
into 2 × 2 segments and, as per Eq. (14.9), the subtraction within each 2 × 2 segment
is first done and the result is then divided by 2, which produces a single value of
H1. This operation is performed over the entire input image matrix (M × N), again
generating a half-resolution image (M/2 × N/2) after the DWT. Similarly, the other
two parallel sections are computed as per Eqs. (14.8) and (14.10), producing single
values of V1 and D1, respectively, and likewise generating half-resolution
(M/2 × N/2) images after the DWT.
Throughput: The design requires 512 × 512 + 1000 = 263,144 clock cycles
to process one frame of resolution 512 × 512. The end blocks work at a frequency
of 50 MHz, so each clock period is 20 ns. Therefore 263,144 × 20 = 5,262,880 ns
≈ 5.3 ms is needed to process one frame, and the design can process roughly
190 frames per second for a frame of resolution 512 × 512. The algorithm has been
successfully implemented in FPGA hardware using the Xilinx System Generator
platform with an Intel(R) Core(TM) 2 Duo CPU T6600 host platform and the
Xilinx Virtex-5 LX110T OpenSPARC Evaluation Platform (100 MHz).
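
The frame-time arithmetic above can be checked in a few lines (a sketch using the cycle and clock figures quoted in this section):

cycles_per_frame = 512 * 512 + 1000      # pixels plus pipeline overhead
clock_period_ns = 20                     # 50 MHz end blocks
frame_time_s = cycles_per_frame * clock_period_ns * 1e-9
print(frame_time_s)                      # ~0.00526 s per frame
print(1.0 / frame_time_s)                # ~190 frames per second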

14.7 Results and Discussion

Here we have implemented the bit-width reduction algorithm in MATLAB for
four different images. We have then performed the FPGA-based hardware design and
its implementation for this inexact computing. We have compared the software and
hardware implementations for all four tested images, as shown in Fig. 14.6. The
resource utilization and power consumption data are provided in Tables 14.1
and 14.2, respectively, along with the corresponding resource and power reduction
percentages. The resource and power reduction was achieved through a one-bit slice
reduction in the design. We compared the PSNR values between the original image
and the DWT image, and between the original image and the reconstructed IDWT
image, for both the software- and hardware-based approaches. Table 14.3 shows the
comparative PSNR values for the four 512 × 512 grayscale images. Figure 14.10
shows the supporting images for the software and hardware validation.

Table 14.1 Resource utilization

Resource | Original image | Reconstructed image after IDWT | Percentage reduction
Slices | 575 out of 17280 | 504 out of 17280 | 12.34
Flip-flops | 1113 out of 69120 | 977 out of 69120 | 12.21
BRAMs | 9 | 9 | 0
LUTs | 1322 out of 69120 | 1213 out of 69120 | 8.25
Mults/DSP48s | 25 out of 64 | 17 out of 64 | 32

Table 14.2 Power consumption

 | Original image (W) | Reconstructed image after IDWT (W) | Percentage reduction
Dynamic power | 0.010 | 0.009 | 10
Quiescent power | 1.187 | 1.186 | 0.1
Power consumption | 1.197 | 1.195 | 0.2

Table 14.3 PSNR calculation for four grayscale images, measured in decibels (dB)

Sl. No. | Image (res. 512 × 512) | Hardware: PSNR for DWT image | Hardware: PSNR for reconstructed image after IDWT | Software: PSNR for DWT image | Software: PSNR for reconstructed image after IDWT
1. | Leena.jpg | 22.3167 | 20.0550 | 28.9009 | 28.1333
2. | Cameraman.jpg | 18.9383 | 17.2801 | 23.8522 | 22.8284
3. | Einstein.jpg | 24.0937 | 21.6293 | 26.7061 | 26.0342
4. | Mandrill.jpg | 19.7734 | 18.3522 | 24.2948 | 23.3573

Fig. 14.10 a software output; b hardware output



14.8 Conclusion

We have performed a hardware design for the DWT involving dynamic bit-width
reduction, and hence an inexact computing approach. The prototype design has been
implemented on a Virtex-5 OpenSPARC FPGA. Implementation results show that,
compared to the standard hardware implementation of the DWT, our approach gains
in terms of reduced resource utilization and power consumption while sacrificing a
negligible amount of image quality. To the best of our knowledge, our work can be
considered a first attempt at an inexact-computing-based hardware design of the
discrete wavelet transform. We are in the process of developing a more robust
algorithm for resource and power adaptability.

Acknowledgements This work has been supported by the Department of Science and Technology,
Govt. of India under grant No. DST/INSPIRE FELLOWSHIP/2012/320, as well as a grant from TEQIP
phase 2 (COE), University of Calcutta, for the experimental equipment. We thank C.V. Raman
College of Engineering, Bhubaneswar, India for facilitating our work. We also thank Prof. (Dr.)
Kaushik Roy, School of Electrical and Computer Engineering, Purdue University, USA for the
encouragement and motivation provided to us.

References

1. Sinha A, Chandrakasan AP (1999) Energy efficient filtering using adaptive precision and variable voltage. In: IEEE international conference on ASIC/SOC, 15–18 Sept 1999, pp 327–331. doi:10.1109/ASIC.1999.806528
2. Nawab SH et al (1997) Approximate signal processing. J VLSI Signal Process Syst Signal Image Video Technol 15(1/2):177–200
3. Keating M, Flynn D, Aitken R, Gibsons A, Shi K (2007) Low power methodology manual for system on chip design. Springer Publications, New York
4. Banerjee N, Karakonstantis G, Roy K (2007) Process variation tolerant low power DCT architecture. In: Proceedings of the conference on design, automation and test in Europe, p 16
5. Gupta V, Mohapatra D, Raghunathan A, Roy K (2013) Low-power digital signal processing using approximate adders. IEEE Trans CAD Integr Circ Syst 32(1):124–137
6. Park J, Choi JH, Roy K (2010) Dynamic bit-width adaptation in DCT: image quality versus computation energy trade-off. IEEE Trans Very Large Scale Integr (VLSI) Syst 18(5), May 2010
7. Allam MW (2000) New methodologies for low-power high performance digital VLSI design. Ph.D. thesis, Waterloo, Ontario, Canada
8. Bellaouar A, Elmasry MI (1995) Low power digital VLSI design circuit and system. Kluwer Academic Publication
9. Chippa VK, Roy K, Chakradhar ST, Raghunathan A (2013) Managing the quality vs. efficiency trade-off using dynamic effort scaling. ACM Trans Embed Comput Syst 12(2s):90
10. Chippa VK, Venkataramani S, Chakradhar ST, Roy K, Raghunathan A (2013) Approximate computing: an integrated hardware approach. In: ACSSC 2013, pp 111–117
11. Burt PJ, Adelson EH (1983) The Laplacian pyramid as a compact image code. IEEE Trans Commun 31(4)
12. Adelson EH, Simoncelli EP, Freeman WT (1990) Pyramids and multiscale representations. In: Proceedings of European conference on visual perception
13. Adelson EH, Anderson CH, Bergen JR, Burt PJ, Ogden JM (1984) Pyramid methods in image processing. RCA Eng 29(6):33–41

14. Börlin N (2009) Image analysis: wavelets and multi-resolution processing. Department of Computing Science, Umeå University, 20 Feb 2009
15. Toufik B, Mokhtar N (2012) The wavelet transform for image processing applications. Automatic Department, Laboratory of Non Destructive Testing, Jijel University, Algeria; Bristol Robotic Laboratory, University of the West of England, UK
16. Chao P-Y. Haar transform and its applications. In: Time frequency analysis and wavelet transform tutorial (D00945005)
17. Bénéteau C (2011) Discrete Haar wavelet transforms. UNM–PNM statewide mathematics contest. University of South Florida, Tampa, FL, USA
18. https://www.mathworks.com/products/Image-analysis.html. Image analysis using discrete wavelet transformation
Chapter 15
A Vulnerability Analysis Mechanism
Utilizing Avalanche Attack Model
for Dependency-Based Systems

Sirshendu Hore, Sankhadeep Chatterjee, Nilanjan Dey, Amira S. Ashour and Valentina Emilia Balas

Abstract The impact of an avalanche attack is huge in any computing system,
predominantly in security systems. Thus, this work aims to minimize the possibility
of avalanche by systematically analyzing the cause behind it. A novel and efficient
attack model is proposed to evaluate the degree of vulnerability in a dependency-based
system caused by its members. The model uses an algorithmic approach to identify,
quantify, and prioritize, i.e., rank, the extent of vulnerability due to the active
members in a dependency-based system. It is implemented using heuristic search
techniques, with the Simulated Annealing method, either to pinpoint the member
whose absence contributes most to vulnerability or to find the safest member having
minimum participation. Both the maximization and minimization problems are
successfully solved, as the results locate the desired objectives, which supports the
ingenuity of the proposed model. The results prove that the proposed method is
superior to the uniform search method.

S. Hore S. Chatterjee
Department of CSE, Hooghly Engineering & Technology College,
Hooghly, West Bengal, India
e-mail: shirshendu_hore@yahoo.com
S. Chatterjee
e-mail: sankha.hetc@gmail.com
N. Dey ()
Department of Information Technology, Techno India College
of Technology, Kolkata, India
e-mail: neelanjan.dey@gmail.com
A.S. Ashour
Faculty of Engineering, Department of Electronics and Electrical
Communication Engineering, Tanta University, Tanta, Egypt
e-mail: amirasashour@yahoo.com
V.E. Balas
Faculty of Engineering, Aurel Vlaicu University of Arad, Arad, Romania
e-mail: balas@drbalas.ro


Keywords Avalanche · Dependency-based systems · Maximization and minimization problem · Vulnerability analysis · Simulated annealing

15.1 Introduction

Recent progress in information technologies has made the secure
transmission of digital data a significant design point of interest. Cryptography
plays a foremost role in information security. Cryptography is essentially concerned
with converting data in order to make them immune and secure against attacks, whereas
cryptanalysis is concerned with the breaking of codes [1]. Several researchers and hackers
are always endeavoring to break cryptographic algorithms using side-channel
attacks and brute force [2]. There are several tools involved in computer and
network attacks, with some common threats used to attack the system, such as
virus threats, hackers, spyware threats, viral web sites, unsecured wireless
access points, and phishing threats [3]. In the year 2010, the Anti-Phishing
Working Group (APWG) stated that about two-thirds of all phishing attacks in
the second half of 2009 had been due to Avalanche. Thus, they described Avalanche
as one of the most sophisticated and damaging operations on the Internet and the world's
most prolific phishing gang [4].
Avalanche attacks on computer systems occur in high volume [5, 6]. Typically,
the avalanche effect occurs when a slight change in the input leads to a
significant change in the output. In the context of nature, an avalanche signifies a catastrophic
change caused by a small but effective change in nature. Numerous such
phenomena can be understood by observing the sophisticated relation between
avalanche and the problems of interest.
There is a strong relationship between avalanche and dependency. Dependency
can be of two types: (a) single-node dependency and (b) multi-node dependency.
For example, in a computer network, particularly in a star topology, the terminals
are connected to a centralized hub or switch through which data pass. An attack on,
or failure of, the switch or hub may bring destruction to the network [7, 8].
Similarly, in the relational data model, changing or removing a primary key and
its corresponding foreign key may destroy the integrity base of the relational model [9].
The presence of such dependency is not limited, as it exists in several fields such as
economics, linguistics, logic, medicine, music, physiology, and politics. Therefore,
a profound impact of dependency in the case of avalanche and its consequences can be
observed.
The primary cause behind avalanche, where a change of the element or elements
on which others depend brings destruction, can be classified into two categories:
a single element and a block/group of elements; for instance, computational
performance depends on the processor, random access memory, cache memory,
virtual memory, graphics card, etc.

Generally, avalanche is based on systems/resources, as its effect can be
observed in hardware as well as in software. The resources and systems
affected by avalanche are as follows:
I. Hardware Resources
Switch/Router: Absence of a switch/router can marginalize or separate a
system from the internal or external network.
Radar: Malfunction of radar misleads pilots or ATC.
PLC: Abnormal behavior can stall an important automatic system such as an
atomic power plant.
II. Software Resources
Text: Replacement of the logical operator "and" with the logical operator "or" in
part of a Boolean equation incurs significant errors in the whole equation.
Image/Video: Changing the sequence of frames in a video clip can
entirely change its meaning.
File: Change of file attributes, such as archive to hidden, or change in read,
write, and execute permissions in UNIX.
Database: Modification of constraints may bring redundancy, affect
integrity, and reduce security in the system.
Operating System: Alteration of the access policy may expose the entire
system.
Intellectual Property Right: A change in IPR may economically hit a person, a
group of persons, or an organization.
In addition, avalanche can be based on an operation, such as the following:
A. Insertion: In computer security, SQL injection may have catastrophic effects on
system security.
B. Deletion: In a public place, removal of the word "not" from a "Do not Smoke"
warning entirely changes the sense of the warning, and thus can have severe effects.
C. Modification: Changing the subnet from 2 to 20 in a computer network
(192.168.2.10 to 192.168.20.10) causes a significant anomaly.
Avalanche can be categorized according to severity into high, medium, and
low, where:
A. High: In a structure of cards, removal of the base-level cards brings down the entire
structure, or the shift of a single tectonic plate brings destruction (tsunami).
B. Medium: If one or more data centers are down in a cloud-based system, then
the performance (latency time) of the overall system may be hit badly.
C. Low: Performance may degrade significantly if one data center is down in the
cloud-based system.
Vulnerability analysis (assessment) is a procedure that identifies and
classifies the security holes (vulnerabilities) in a network, computer, or communications
infrastructure. In addition, it can estimate and evaluate the effectiveness of

proposed countermeasures. Vulnerability analysis and defense are therefore currently
among the leading foundations of computer security confrontations [10].
The present work is inspired by the natural avalanche effect, as it tries to locate
the most vulnerable part, an attack on which would cause maximum damage to the
system. Consequently, the objective of the work is to discover a generalized solution
that finds the most depended-upon element, or set of elements, within a system, on
which the entire system largely depends, through a dependency detection algorithm.
In addition, the aim is to inform the system owner/administrator so as to protect the
system's most valuable/key/depended-upon element or elements from attackers;
thus the effect of avalanche can to some extent be prevented. Conversely, the
proposed approach could be used to destroy an enemy's (attacker's) system. In order
to determine the optimal solution, linear programming, which refers to problems
defined as maximization or minimization of a linear objective function subject to
constraints that are linear equalities and inequalities, is used.
This work is organized as follows: Sect. 15.2 includes a background survey.
Then, in Sect. 15.3 a system overview is addressed. In Sect. 15.4, the proposed
system is illustrated, while the experimental results and discussion are presented in
Sect. 15.5. Finally, the conclusion is introduced in Sect. 15.6.

15.2 Background

Since the proposed work tries to find a generalized solution for finding the primary
cause behind avalanche, an extensive background analysis was done.
Korel [11] proposed a well-practiced method known as the Goal-Oriented
Approach. In that work, the author used data dependence analysis to guide
the process of test data generation. Data dependence analysis automatically identifies
statements that affect the execution of the selected statement, and this information
is used to guide the search process toward the goal.
Ibrahim et al. [12] provided an analysis of the Feistel Network (FN) structural
models known as Extended Feistel Networks (EFN). The work inspected the models
with respect to their avalanche criterion to establish the optimal scheme proper for a
flexible block size cipher. The analysis showed that the F-function plays a significant
role in achieving the avalanche effect. The empirical analysis demonstrated
that the more functions are used in each round, the smaller the threshold number
of rounds required by the model to attain an avalanche probability of 0.5. In the
same year, Newsome and Song [13] suggested dynamic taint analysis for automatic
detection and analysis of exploits on commodity software. TaintCheck is a
novel mechanism which uses dynamic taint analysis to detect a buffer overrun
vulnerability or a format string vulnerability.
Kasthurirathna et al. studied the failure tolerance of mechatronic software systems
under different software and/or hardware failures by using a complex network
approach [14]. Subsequently, the constructed, simulated system architectures were

subjected to various forms of attack, which emulated failures of significant hardware/
software. The authors inspected four attack types, namely degree centrality,
betweenness centrality, closeness centrality, and random attack. A robustness
coefficient metric on the connectedness of the attacked network was used to measure
the failure tolerance of the system. This study thus provided a data-driven
technique to engineer the architecture of mechatronic software systems for failure
tolerance.
For information security, Patidar et al. proposed a block-based encryption model
using an arrangement of logical/mathematical operations [15]. This method was used
as a pre-encryption step to confuse the association between the original data and the
generated data. In order to compute the data security level, the avalanche
effectiveness, efficiency, and execution time were calculated. The experimental results
showed that the existing algorithms had lower efficiency and higher execution
time. Sensarma and SenSarma proposed a design for a secure graph-based encryption
algorithm, which did not depend completely on the secret key [16]. This algorithm
was proposed to protect against unauthorized attacks. To decrease the probability
of assorted attacks, the proposed algorithm constructed different cipher texts using
the same secret key for the same basic text. The authors recommended implementing
the proposed technique in embedded systems, smart card security, and cloud
computing security.
Recently, Berezin et al. suggested a general spatially embedded network model
with dependencies, under localized attacks [17]. In order to predict the effect of
localized attacks on the suggested system model, theoretical and numerical
approaches were developed. The experimental results demonstrated that a localized
attack can cause significantly more damage than an equivalent random attack.
Moreover, it was observed that for a broad range of parameters, systems which
appear stable are in fact metastable. The high-risk possibility of localized attacks
on spatially embedded network systems with dependencies was noted.
Kirkpatrick et al. [18] briefly reviewed the central constructs in combinatorial
optimization and in statistical mechanics. A deep and useful connection between
statistical mechanics and multivariate or combinatorial optimization (finding the
minimum of a given function depending on many parameters) has been reported.
Reeves [19] showed how modern approaches are helpful in solving combinatorial
problems. Corne et al. [20] reported a new idea regarding the use of a Multiple Ant
Colony System (MACS) for vehicle routing problems with time windows, where the
authors successively optimized a multiple-objective function. Bäck [21] presented the
theory and practice of evolutionary algorithms, while an overview of
evolutionary algorithms covering genetic algorithms, evolution strategies, genetic
programming, and evolutionary programming was provided by Whitley [22].
Search-based software engineering was suggested by Harman et al. [23]; they
suggest that search-based software engineering (SBSE) applications can be implemented
using metaheuristic search techniques, such as genetic algorithms, simulated
annealing, and tabu search. Clark et al. [24] reviewed the principal metaheuristic
search techniques, surveyed existing work on the application of metaheuristics,
and showed how metaheuristics can be applied to predict cost/effort.

Consequently, the proposed study suggests a model using heuristic search
techniques to find the safest member, i.e., the one whose absence contributes least
to making the system insecure.

15.3 System Overview

Typically, optimization problems are concerned with determining values for
one or more decision variables that best meet the objective(s) without
violating the constraint(s). Depending on the objective function, optimization
problems may have multiple solutions, some of which may be local optima.

15.3.1 Minimization and Maximization Problem

Generally, optimization problems do not admit an optimal solution that can be
computed in a reasonable, i.e., polynomial, time; instead, an optimal
solution or the best available values of some objective function within a defined domain
are obtained by using techniques from linear programming (LP). A linear program
concerns the maximization or minimization of a linear function while
satisfying a finite set of linear constraints [25].
Thus, LP is applicable when the objective function is linear in the decision
variables and the constraints are all linear equalities/inequalities. Therefore,
linear programming is an effective way to solve such optimization
problems. The function that has to be maximized is called the problem's objective
function. Consequently, a general statement would be as follows:
Maximize $\sum_{j=1}^{n} c_j x_j$ subject to the constraints $\sum_{j=1}^{n} a_{ij} x_j \le b_i$ for all $1 \le i \le m$ and $x_j \ge 0$ for all $1 \le j \le n$, where the $a_{ij}$ and $b_i$ capture the constraints, while the variables $x_j$ are referred to as decision variables. The following functions are to be used:

Maximize[{expr, cons}, {x1, x2, ...}]   (15.1)

maximizes expr subject to the constraints cons.

Minimize[{expr, cons}, {x1, x2, ...}]   (15.2)

minimizes expr subject to the constraints cons. All constraints are inequalities
and all variables are nonnegative.
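
The notation above corresponds to Mathematica's Maximize and Minimize functions. As an illustrative alternative, a small LP of the general form stated earlier can be solved in Python with SciPy's linprog, which minimizes, so a maximization is posed by negating the objective (the numbers below are hypothetical):

from scipy.optimize import linprog

# Maximize 3*x1 + 2*x2 subject to x1 + x2 <= 4, x1 + 3*x2 <= 6,
# x1, x2 >= 0. linprog minimizes, so the objective is negated.
res = linprog(c=[-3, -2],
              A_ub=[[1, 1], [1, 3]],
              b_ub=[4, 6],
              bounds=[(0, None), (0, None)])
print(res.x, -res.fun)   # optimal decision variables and objective value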

15.3.2 Heuristic Models

Recently, heuristic search optimization techniques have found numerous applications
in place of classical optimization techniques [26–30]. Heuristic optimization
(HO) methods force the search toward promising regions/solutions in the space.
These techniques are very flexible and less restricted, with the advantage
of converging to the optimum through iterated search, with less possibility of
ending up in a local optimum. The essential steps of all HO algorithms are as
follows: (i) start with an arbitrary initial solution, (ii) iteratively construct new
solutions using a generation rule, (iii) evaluate these new solutions, and (iv)
ultimately report the best solution found during the search process. Selected examples
of HO methods are as follows.
Hill Climbing
Hill climbing is a well-practiced, rule-based [31, 32] local search algorithm. It is a
variant of generate-and-test in which feedback from the test procedure helps to
improve a solution by maximizing an objective function. Hill climbing starts with
a random solution that is potentially poor and iteratively makes small changes to the
solution by testing the neighborhood of this solution. If a better solution is
found, it replaces the current solution. This process continues until convergence
to a near-optimal/optimal solution.
A random ascent (steepest ascent) strategy considers all the moves from the
current state and selects the best one as the next state. In simple hill climbing, the
first improving node is chosen, whereas in steepest ascent all successors are compared and
the one closest to the solution is chosen. Hill climbing is best suited to problems where
the heuristic gradually improves and gets closer to the solution; it works poorly
where there are sharp drops, getting caught in local maxima, plateaus, or ridges. A remedy
for this algorithm is to include a series of restarts involving different initial
solutions, to sample more of the search space and minimize this problem as much as
possible. A high-level description of the algorithm is illustrated below.
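
(The original listing could not be recovered from the source; the sketch below is a minimal Python rendering consistent with the description, in which max_restarts and max_steps are illustrative parameters.)

import random

def hill_climb(S, N, obj, max_restarts=10, max_steps=1000):
    # Steepest-ascent hill climbing with random restarts.
    best = None
    for _ in range(max_restarts):
        s = random.choice(S)                  # random initial solution
        for _ in range(max_steps):
            neighbors = N(s)
            if not neighbors:
                break
            s_next = max(neighbors, key=obj)  # best move from current state
            if obj(s_next) <= obj(s):         # local maximum reached
                break
            s = s_next
        if best is None or obj(s) > obj(best):
            best = s
    return best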

where S is the solution space; N is the neighborhood structure; and the objective
function to be maximized is obj.

Simulated Annealing
Simulated Annealing (SA) [33, 34] is a variation of hill climbing that includes a
general survey of the landscape to avoid climbing foothills. The whole space is
explored initially, and being caught on a plateau or ridge is avoided by allowing a
probabilistic acceptance of poorer solutions. SA applies a probabilistic rule to decide
whether a new solution replaces the current one or not. In this algorithm, the
probability of acceptance P of an inferior solution changes as the search progresses
and is calculated as follows:

$$P = \exp(-\Delta E / kT) \qquad (15.3)$$

where k is Boltzmann's constant and ΔE is the difference in objective value
between the current solution and the neighboring inferior solution being considered.
T denotes the temperature, which is initially high in order to allow free movement
around the search space and to avoid dependency on the starting solution.
Metropolis et al. [35] simulated the change in energy of a system subjected to a
cooling process until it converges to a steady state; this algorithm was later proposed
as the basis of the search mechanism. The SA process is broadly divided into two
processes, and the principle of convergence of the system to a steady state has been
incorporated here to locate the vulnerable part of a dependency-based system.
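
A minimal sketch of the acceptance rule of Eq. (15.3) inside an SA loop is shown below, assuming a neighbor-generating function and a geometric cooling schedule with illustrative parameters T0, alpha, and steps; the authors' MATLAB implementation may differ in these details.

import math
import random

def simulated_annealing(s0, neighbor, obj, T0=100.0, alpha=0.95,
                        steps=1000, k=1.0):
    # Maximizes obj; an inferior neighbor is accepted with
    # probability P = exp(-dE / (k*T)), per Eq. (15.3).
    s, T = s0, T0
    for _ in range(steps):
        s_new = neighbor(s)
        dE = obj(s) - obj(s_new)   # positive when s_new is worse
        if dE <= 0 or random.random() < math.exp(-dE / (k * T)):
            s = s_new
        T *= alpha                 # cooling schedule
    return s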
Evolutionary Algorithm
Evolutionary Algorithms (EA) utilize simulated evolution as a search procedure
to generate candidate solutions, inspired by genetics and natural selection.
The most well-known type of evolutionary algorithm is the genetic algorithm (GA). In
genetic algorithms, the search is primarily driven by the use of recombination, a
mechanism of exchange of information between solutions to breed new ones,
whereas evolution strategies principally use mutation, a process of randomly
modifying solutions.

15.4 Proposed Work

The proposed methodology considers each system under concern as a
dependency-based system. Commonly, dependence analysis involves identification
of the interdependent elements of a system. It is referred to as a reduction
method, since the interdependent elements induced by a given inter-element
relationship structure form a subset of the system. It has been extensively studied for
purposes such as automatic program parallelization, code restructuring for
optimization, and test-case generation [30]. Consequently, several definitions are
given as follows.

Fig. 15.1 Typical dependency-based system having 6 nodes

Definition 1 (Dependency-based System) A dependency-based system D is a relation
defined upon a set of vertices V such that D = {(i, j) | i, j ∈ V and (i, j) ∈ E},
where E is the set of edges such that (i, j) ∈ E iff an edge exists from i to j, which
denotes a dependency of vertex i upon j.
Formally, a dependency-based system can be depicted using a simple directed
graph. Each node of the graph represents one active member of that system. For
example, Fig. 15.1 depicts a typical dependency-based system having 6 entities and
10 dependencies. Each edge of the graph depicts one dependency. For instance,
node '3' is dependent upon '2', '4', and '5', but no node depends upon node '3'.
Definition 2 (Direct Dependency) For any nodes i and j, if (i, j) ∈ E then there exists
a direct dependency of i upon j.
Definition 3 (Indirect Dependency) There exists an indirect dependency of node i
upon j iff {(i, n₁), (n₁, n₂), (n₂, n₃), ..., (n_{m−1}, n_m), (n_m, j)} ⊆ E for some m ≥ 1.
Meanwhile, one node may have both direct and indirect dependencies. For example,
node '2' has a direct dependency upon '1', whereas '6' has a direct dependency
upon '2', indicating an indirect dependency of '6' upon '1'. The calculation of
dependency (direct and indirect) in the proposed work is explained through the
example of a dependency-based system having 6 nodes, as shown in Fig. 15.1. For
each direct dependency the value assigned is 2, and for each indirect dependency the
value assigned is 1.

15.4.1 Avalanche Attack

Definition 4 (Active node) If for any node i ∈ V, where V is the set of vertices in the
dependency-based system, there exists at least one direct dependency, it is called an
active node.

Definition 5 (Dead node) If for any node i ∈ V, where V is the set of vertices in the
dependency-based system, there exists neither a direct dependency nor an indirect
dependency, it is called a dead node.
Definition 6 (Chain Reaction) A cumulative effect caused by the removal of one
active node from a dependency-based system.
Definition 7 (Avalanche Factor (AF)) A heuristic which is used to measure the
effect of the removal of one active node from the dependency-based system. It also
measures the effect of the chain reaction (Definition 6) initiated by that active node.
For example, in Fig. 15.1 the avalanche factor of node '1' is 7, and that of node '5'
is 10. Algorithm 1 has been devised to find the avalanche factor of
any node in a dependency-based system.
Definition 8 (Priority (Rank)) A heuristic which is used to measure the local
minima in the minimization problem and the local maxima in the maximization problem.
The priority decreases with increasing AF.
It is quite obvious that different active nodes in a dependency-based system have
different direct and indirect dependencies, upon which the chain reaction
depends. Thus, different direct and indirect dependencies
steer the chain reaction in different directions from the starting node and
cover dissimilar sets of nodes of the dependency-based system. Theoretically, there
may exist at least two nodes for which the direct and indirect
dependencies are the same. These create the same pattern of chain reaction
unless different direct and indirect dependencies are found at some stage
of the chain reaction. Hence, the starting node does not have a strict and sole effect
on the chain reaction it has initiated; rather, the pattern of the chain reaction may
depend upon the active node which initiated it.
An avalanche attack aims at finding the weakest or strongest point of a
dependency-based system by pointing out the node whose deletion will affect
the system to the lowest or highest extent. It is obvious that in a
dependency-based system the removal of one node will affect other nodes that are
directly or indirectly dependent on it. A node is said to be dead if it does not have
any type of connection with any other node of the system; that is, it neither
depends on any node nor does any other node depend upon it. The existence of such a
dead node is unrealistic in dependency-based systems. Hence, removal of one node
may create a dead node, which in turn removes that freshly spawned dead node from
the system.
It is also noted that removal of one node from the system may partially affect
nodes having multiple dependencies in the system, which in turn may affect other
indirectly dependent nodes of the system. The chain reaction initiated by the
removal of one node stops when no further node can be affected. It can be observed
that the overall effect on the system depends on the starting node; depending upon
the dependencies, a different starting node may result in a different pattern of
effects on the system.

The level of the effect due to the removal of one node is given by the
avalanche factor (AF), a numeric value which gives an idea about the consequences
that will take place after the exclusion of one node from the system. A higher avalanche
factor indicates a higher vulnerability of that node in the system.
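
Algorithm 1 itself is not reproduced here. The sketch below illustrates one plausible scoring of the chain reaction, consistent with Definitions 6 and 7 and the weights stated earlier (2 per severed direct dependency, 1 per chain-affected node); the precise scoring rules of the authors' Algorithm 1 may differ, so the exact AF values of Tables 15.1 and 15.2 should not be expected from this sketch.

def avalanche_factor(edges, start):
    # edges: set of (i, j) pairs, meaning node i depends upon node j.
    # Deleting `start` severs every direct dependency upon it and
    # propagates a chain reaction through the dependents.
    af = 0
    removed = {start}
    frontier = {start}
    while frontier:
        severed = [(i, j) for (i, j) in edges if j in frontier]
        af += 2 * len(severed)                     # direct dependencies lost
        hit = {i for (i, j) in severed} - removed  # newly affected dependents
        af += len(hit)                             # chain-reaction effect
        removed |= hit
        frontier = hit
    return af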

It is clear that Algorithm 1 returns a numeric value which reflects the effect
caused by the removal of one particular node of a system. Figure 15.2 depicts the
dependency-based system of Fig. 15.1 after applying the avalanche attack
with starting node '5'. The nodes filled in red depict the nodes which are
excluded due to the attack. Nodes with red boundaries depict nodes that are affected
but not deleted. Edges in red are dependencies removed after the attack, and the dark
blue ones carry the indirect effect of the attack to other nodes. Nodes and edges in
light blue are unaffected. Figure 15.2 reveals that due to the removal of node '5'

Fig. 15.2 After the avalanche attack with starting node '5' on the system depicted in Fig. 15.1

Fig. 15.3 After the avalanche attack with starting node '3' on the system depicted in Fig. 15.1

all the nodes of the system are affected, whereas Fig. 15.3 reveals the effect of the
attack started at node '3': none but '3' is affected, leaving the others
untouched.

15.5 Experimental Results and Discussion

An extensive experiment was conducted to find the cause behind the stated problem.
An IBM-compatible PC was used to implement the algorithms, which were written in
MATLAB (version 2013). The avalanche attack model proposed by the authors has
been implemented via a heuristic search technique and compared with the results found
by an uninformed search technique to find the maximum and minimum dependency.
The search operation is deployed using all possible starting nodes to check the
validity of the claim of obtaining the optimized result. The severity measurement of the
attack model is obtained by using the average avalanche factor (AF) over all starting

nodes. The average AF is compared with the average AF obtained from the uninformed
search to observe the effect of deploying different search techniques. The information
guiding the heuristic search is obtained by using Algorithm 1, which
gives a quite satisfactory approximation of the effect of the individual members of a
dependency-based system. The Simulated Annealing (SA) search technique has been
deployed to implement the model upon randomly generated dependency-based
systems with a quite large variation of dependencies. The problem of finding the
member having the highest effect on the system in its absence is the
maximization problem, which aims at finding the member whose termination
would have the highest effect on the system of interest. Similarly, the minimization
problem is to find the member having the lowest effect on the
system in its absence.

15.5.1 Minimization Problem

Table 15.1 illustrates the search results of the minimization problem for a typical
dependency-based system having 6 nodes, with ten iterations.
It is clear from Table 15.1 that node '5' has the highest avalanche factor, which
indicates a higher vulnerability of that node in the system.
Figure 15.4 plots the avalanche factor (AF) versus the starting node for the
solution of the minimization problem of a typical dependency-based system having 6
nodes. The figure shows the relation between the AF and the average AF using
uniform search, as well as that using the heuristic simulated annealing (SA) search
method. It is clear that the proposed method using the heuristic search attains the
minimized AF, as required, compared to the other methods.
Figure 15.4 reveals the degree of vulnerability in terms of AF for each particular
starting node. The search operation is deployed for each possible starting node of
the system, thereby resulting in a broader spectrum of visualization of the overall
performance of the proposed model. The plot in red depicts the average effect over
all possible searches using the uninformed search technique, the blue one shows the
effect over all possible searches using the SA, and finally, the green one reflects
the average of the latter. It is observed that the SA provides better results in all

Table 15.1 Search results of a minimization problem

Starting node | No. of iterations | Optimized result | Avalanche factor
1 | 10 | 4 | 7
2 | 10 | 4 | 7
3 | 10 | 3 | 4
4 | 10 | 4 | 7
5 | 10 | 5 | 10
6 | 10 | 6 | 7

Fig. 15.4 Avalanche factor (AF) versus the node for the minimization problem

possible searches and has the ability to maintain a reasonable difference from the
average obtained from the uninformed search.

15.5.2 Maximization Problem

For the maximization problem of the dependency-based system with 6 nodes
illustrated in Fig. 15.1, the results are demonstrated in Table 15.2.
It is clear that, for the same number of iterations, in the maximization problem
nodes '2', '3', and '5' have the highest (maximum) AF as well as the highest
optimization results.
Figure 15.5 represents the results of the maximization problem as given in
Table 15.2. The figure plots the AF versus the starting node for the maximization of a
typical dependency-based system having 6 nodes. It shows that the proposed
method attained the maximum AF compared to the uniform search method. In
addition, it has nearly the same maximum AF at all nodes.

Table 15.2 Search results of the maximization problem for the same system as in Fig. 15.1

Starting node | No. of iterations | Optimized result | Avalanche factor
1 | 10 | 2 | 9
2 | 10 | 5 | 10
3 | 10 | 5 | 10
4 | 10 | 2 | 9
5 | 10 | 5 | 10
6 | 10 | 2 | 9

Fig. 15.5 Avalanche factor (AF) versus the node for the maximization problem

Figure 15.5 reveals that the results obtained using SA are always superior to the
results obtained using the uninformed search.
Generally, in both problems (minimization/maximization) the SA provided
superior results irrespective of the selection of the starting node, which clearly
indicates the ingenuity of the proposed model.
i. Different numbers of nodes:
To examine the effect of system size, Figs. 15.6, 15.7, 15.8, 15.9,
15.10, and 15.11 depict the results obtained by following the previous configuration
on dependency-based systems having 12, 25, and 200 nodes, respectively, for both
the minimization and maximization problems.

Fig. 15.6 Avalanche factor (AF) versus the starting node for minimization of a typical dependency-based system having 12 nodes

Fig. 15.7 Avalanche factor (AF) versus the starting node for maximization of a typical dependency-based system having 12 nodes

Fig. 15.8 Avalanche factor (AF) versus the starting node for minimization of a typical dependency-based system having 25 nodes

The figures reveal that in all cases SA provides more accurate and optimized
results than the uninformed search.
It is evident that, using the information obtained by Algorithm 1, the heuristic search
technique has been able to find the node having the highest or lowest post-attack
participation in its absence, irrespective of the selection of the starting node.
In the minimization problem, the local minima have the highest priority, and the priority

Fig. 15.9 Avalanche factor (AF) versus the starting node for maximization of a typical dependency-based system having 25 nodes

Fig. 15.10 Avalanche factor (AF) versus the starting node for minimization of a typical dependency-based system having 200 nodes

Fig. 15.11 Avalanche factor (AF) versus the starting node for maximization of a typical dependency-based system having 200 nodes

decreases with increasing value of the AF; in the maximization problem, the
local maxima have the highest priority, and it decreases with increasing value of AF.
From the preceding experimental results, it is clear that the proposed SA method
achieved superior performance for both the minimization and the maximization
problems compared to the uniform search method. It also provides more stable
results compared to the AF method without search (the traditional method).

15.6 Conclusion

This paper introduced a new vulnerability assessment analysis through an
avalanche attack model in a dependency-based system. The results show how
effectively the avalanche attack model can safeguard the proposed system and help
prevent any destruction that may ruin a dependency-based system. The extensive
experimental results show that using a heuristic technique allows easy identification
of the most valuable part or parts of the proposed system, so that we can either
prevent attacks or destroy the enemy's system.

References

1. Kahate A (2009) Cryptography and network security, 2nd edn. McGraw-Hill
2. Kumar A, Tiwari N (2012) Effective implementation and avalanche effect of AES. Int J Secur Priv Trust Manage (IJSPTM) 1:3/4
3. Ahmad A (2012) Type of security threats and its prevention. Int J Comput Technol Appl 3(2):750–752
4. Aaron G (2010) Director, key account management and domain security at Afilias; Rod Rasmussen, President and CTO at Internet Identity
5. McMillan R (2010) Report blames Avalanche group for most phishing. Network World
6. Alrabady A, Mahmud S (2005) Analysis of attacks against the security of keyless-entry systems for vehicles and suggestions for improved designs. IEEE Trans Veh Technol 54(1)
7. Tanenbaum AS (2012) Computer networks. Pearson Education/PHI, pp 13–14
8. Forouzan BA, Fegan CS (2011) Data communications and networking. McGraw-Hill Inc, pp 120–145
9. Korth HF, Abraham S, Sudarshan S (2003) Database system concepts. McGraw-Hill Inc, pp 35–36
10. Brumley D (2008) Analysis and defense of vulnerabilities in binary code. Ph.D. thesis, School of Computer Science, Carnegie Mellon University, Pittsburgh
11. Korel B (1992) Dynamic method for software test data generation. Softw Test Verification Reliab 2(4):203–213
12. Ibrahim S, Maarof M, Idris N (2005) Avalanche analysis of extended Feistel network. In: Proceedings of the postgraduate annual research seminar, pp 265–269
13. Newsome J, Song D (2005) Dynamic taint analysis for automatic detection, analysis, and signature generation of exploits on commodity software. In: Proceedings of 12th network and distributed system security symposium (NDSS 05)
14. Kasthurirathna D, Dong A, Piraveenan M, Tumer I (2013) The failure tolerance of mechatronic software systems to random and targeted attacks. In: Proceedings of the 2013 ASME international design engineering technical conferences & computers and information in engineering conference IDETC/CIE, Portland, Oregon, USA
15. Patidar G, Agrawal N, Tarmakar S (2013) A block based encryption model to improve avalanche effect for data security. Int J Sci Res Publ 3(1)
16. Sensarma D, SenSarma S (2014) A graph based modified data encryption standard algorithm with enhanced security. IJRET: Int J Res Eng Technol 3(3)
17. Berezin Y, Bashan A, Danziger M, Li D, Havlin S (2015) Localized attacks on spatially embedded networks with dependencies. Sci Rep 5(8934):1–5
18. Kirkpatrick S, Gelatt CD, Vecchi MP (1983) Optimization by simulated annealing. Science 220(4598):671–680
19. Reeves CR (ed) (1995) Modern heuristic techniques for combinatorial problems. McGraw-Hill
20. Corne D, Dorigo M, Glover F (1999) New ideas in optimization. McGraw-Hill
21. Bäck T (1996) Evolutionary algorithms in theory and practice. Oxford University Press, New York
22. Whitley D (2001) An overview of evolutionary algorithms: practical issues and common pitfalls. Inf Softw Technol 43(14):817–831
23. Harman M, Jones B (2001) Search-based software engineering. Inf Softw Technol 43(14):833–839
24. Clark J, Dolado JJ, Harman M, Hierons R, Jones B, Lumkin M, Mitchell B, Mancoridis S, Rees K, Roper M, Shepperd M (2003) Reformulating software engineering as a search problem. IEE Proc Softw 150(3):161–175
25. Cook WJ, Cunningham WH, Pulleyblank WR, Schrijver A (1998) Combinatorial optimization. Wiley-Interscience Series in Discrete Mathematics and Optimization, pp 37–119

26. Kriti, Virmani J, Dey N, Kumar V (2015) PCA-PNN and PCA-SVM based CAD systems for breast density classification. Appl Intell Optim Biol Med Curr Trends Open Probl
27. Dey N, Samanta S, Yang X-S, Chaudhri SS, Das A (2013) Optimisation of scaling factors in electrocardiogram signal watermarking using cuckoo search. Int J Bio-Inspired Comput (IJBIC) 5(5):315–326
28. Samanta S, Acharjee S, Mukherjee A, Das D, Dey N (2013) Ant weight lifting algorithm for image segmentation. In: 2013 IEEE international conference on computational intelligence and computing research (ICCIC), Madurai
29. Chakraborty S, Samanta S, Mukherjee A, Dey N, Chaudhuri SS (2013) Particle swarm optimization based parameter optimization technique in medical information hiding. In: 2013 IEEE international conference on computational intelligence and computing research (ICCIC), Madurai
30. Rani J, Seth K (2012) Dependency analysis for component based systems using minimum spanning tree. In: International conference on advances in computer applications (ICACA), pp 11–16
31. Russell S, Norvig P (2009) Artificial intelligence: a modern approach. Prentice Hall Press
32. Sullivan KA, Jacobson SH (2001) A convergence analysis of generalized hill climbing algorithms. IEEE Trans Autom Control 46:1288–1293
33. Granville V, Krivanek M, Rasson JP (1994) Simulated annealing: a proof of convergence. IEEE Trans Pattern Anal Mach Intell 16:652–656
34. Fleischer MA (1995) Simulated annealing: past, present, and future. In: Alexopoulos C, Kang K, Lilegdon WR, Goldsman D (eds) Proceedings of the 1995 winter simulation conference. IEEE Press, Arlington, Virginia, pp 155–161
35. Metropolis N, Rosenbluth A, Rosenbluth M, Teller A, Teller E (1953) Equation of state calculations by fast computing machines. J Chem Phys 21(6):1087–1092
Chapter 16
Performance of Statistical and Neural
Network Method for Prediction
of Survival of Oral Cancer Patients

Neha Sharma and Hari Om

Abstract Traditionally, standard statistical methods with direct hands-on data
analysis were used for prediction, as the data were simple and the volume was small.
However, the increasing volume of data and its complex nature have motivated the
study of automatic data analysis using artificial neural networks, with the help of
more sophisticated tools which can operate directly on data. This paper presents a
case study to predict the survival rate of oral malignancy patients with the help of
two predictive models: linear regression (LR), which is a contemporary statistical
model, and the multilayer perceptron (MLP), which is an artificial neural network
model. The data of more than 1000 patients who visited a tertiary care center from
June 2004 to June 2009, collected through the active case finding method, are
used to build the models. An analytical comparison of the performance of both models
is carried out. The experimental results show that the classification accuracy of the
MLP model is 70.05 % and that of the LR model is 60.10 %. After comparison on various
evaluation criteria, it is concluded that the MLP model is certainly a better model for
predicting the survival rate of oral malignancy patients than the LR model.
Besides, there are many other benefits of neural networks, such as less formal
statistical training needed, the ability to detect nonlinear complex relationships between
independent and dependent variables, the capability to detect the most practicable
interactions among predictor factors, as well as the availability of several
training algorithms.

Keywords Oral cancer · Data mining · Linear regression · Multilayer perceptron · Statistical method · Neural networks · Artificial intelligence

N. Sharma ()
Zeal Institute of Business Administration, Computer Application
and Research, Savitribai Phule Pune University, Ganesh Khind,
411007 Pune, Maharashtra, India
e-mail: nvsharma@rediffmail.com
H. Om
Computer Science and Engineering Department, Indian School of Mines,
Dhanbad, Jharkhand, India
e-mail: hari.om.cse@ismdhanbad.ac.in


16.1 Introduction

Recently, it has been observed that statistical methods and artificial neural network
techniques are widely used in the healthcare industry, as modern medical facilities
generate huge amounts of heterogeneous data on a daily basis [1]. The areas of
medical services that require data to be analyzed mainly concern prediction of the
effectiveness of surgical procedures, medical tests, and medication, and the
discovery of relationships among clinical and diagnostic data [2]. This study intends
to provide useful insight into the capabilities and performance of
statistical methods and artificial neural networks for prediction of the survival of oral
cancer patients, as oral cancer is considered a grave problem globally and in India as well.
The information extracted from the survey carried out by GLOBOCAN indicates
that oral cancer ranks among the top three types of cancer in India and has one of
the highest incidences in the Indian subcontinent [3], as shown in Fig. 16.1.
Approximately 83,000 new cases and 46,000 deaths are recorded each year
because of oral malignancy in the country [4]. Figure 16.2 presents the statistics of
oral cancer cases in India stratified by sex. The statistics show a higher mortality
rate in males than in females [5]. If we do not take this disease seriously,
then the future situation is going to be very alarming, as projected by the
GLOBOCAN 2008 data (refer to Fig. 16.3).

Fig. 16.1 Incidence and mortality rates (age standardized) by cancer type in India (sexes combined); data extracted from GLOBOCAN 2008: cancer incidence and mortality worldwide

Fig. 16.2 Incidence and mortality rates (age standardized) for lip and oral cancer in India, by sex (male, female, combined); extracted from GLOBOCAN 2008: cancer incidence and mortality worldwide
Fig. 16.3 Crude incidence projection for lip/oral cavity cancer (2008–2030); data extracted from GLOBOCAN 2008: cancer incidence and mortality worldwide

In this paper, we attempt to build two models for predicting the survivability of oral malignancy patients. The first model is a linear regression (LR) model, which is a statistical method, and the second is a multilayer perceptron (MLP) model, which is an artificial neural network. Linear regression performs better for extremely small sample sizes and when hypothesis or experience indicates an underlying relationship between the dependent and predictor variables [6], whereas the true power of the multilayer perceptron model lies in its ability to learn linear as well as nonlinear relationships directly from the data being modeled, and its performance improves with sample size [7]. This paper is organized as follows: the following section reviews the related work done by different researchers; Sect. 16.3 gives information about the predictive models adopted in this research; the experimental results are presented in Sect. 16.4; and Sect. 16.5 covers the conclusions. Finally, the acknowledgments and references are given in the last sections.

16.2 Related Work

In the recent two decades, the application of neural networks to therapeutic information and medical data has increased copiously in comparison with statistical methods, as neural networks surpass conventional statistical methods in terms of estimation and prediction capability [8–13]. Sunny et al. [14] estimated the time trends and the risk of developing oral cancer in Mumbai, India, from the incidence recorded at the Bombay Population Based Cancer Registry during the 15-year period from 1986 to 2000. A linear regression model based on the logarithm of the observed incidence rates was applied to evaluate the trend. The result showed a decreasing trend in oral cancers in Indian men during the period from 1986 to 2000, with a yearly decrease of 1.70 %, which was attributed to a decrease in the usage of pan and tobacco. The high prevalence of smokeless tobacco usage among young adult men and women explains the stable trend in oral cancer incidence in this group. These findings help to strengthen the association between tobacco use and oral cancer risk. Hari Kumar et al. [15] considered 100 breast cancer and 125 oral cancer patients and analyzed the classification accuracy of the TNM (tumor, lymph nodes, metastasis) staging system along with that of the Chi-square test and neural networks. The Chi-square classification almost emulated the medical examination with respect to the TNM classification. However, when TNM prognostic factors alone are used, the artificial neural networks (MLP and RBF) are certainly more accurate than the TNM staging system. Moreover, many researchers have studied oral tumor evolution and progression using various types of neural networks and data mining approaches and found them useful in identifying early potential relapses [16–21].

16.3 Predictive Models

16.3.1 Linear Regression (LR)

It is a statistical method that allows summarizing and studying relationships between two continuous (quantitative) variables [22]. It is a method of predicting scores on one variable from the scores on a second variable [23]. The variable being predicted is referred to as Y, and the variable that helps in predicting is referred to as X [23]. Linear regression is the oldest among the statistical methods and the most extensively used predictive model; it fits a set of data points to a linear function, expressed as Y = a + bX, where Y is the target variable, X is the predictor variable, and a and b are constant coefficients multiplied with the predictor variable [23]. The residual error associated with a data point can be calculated using the following formula:

Residual Error = Y_actual − Y_predicted  (16.1)

where Y_actual is the actual target value and Y_predicted is the observed or predicted target value. The sum of the squared residuals (SSR) is calculated to yield the goodness of fit of the linear function, and can be expressed in the following form:

SSR = Σ_{i=1}^{N} (Y_actual − Y_predicted)²  (16.2)

where N is the number of instances. The smaller the value of SSR, the closer the predicted target value is to the actual observed target value. Therefore, the goal of regression analysis is to obtain the regression equation by adjusting the coefficients a and b until the sum of the squared residual errors associated with all the points reaches its minimum. This is known as the least-squares regression (LSR) fit or ordinary least-squares (OLS) regression. The coefficients a and b can be derived as follows:

b = (N ΣXY − ΣX ΣY) / (N ΣX² − (ΣX)²),  a = Ȳ − bX̄  (16.3)

where Ȳ and X̄ denote the means of Y and X.

Several other statistics, such as the t-statistic, prob(|t|), and confidence interval (CI), along with the standard error of the coefficient, are displayed in addition to the coefficient values. The t-statistic is a measure of the likelihood that the actual value of the coefficient is not zero. The larger the absolute value of t, the less the chance that the actual value of the coefficient could be zero. The prob(|t|) is the probability of obtaining the estimated value of the coefficient if the actual coefficient value is zero. The smaller the value of prob(|t|), the more significant the coefficient and the less likely that its actual value is zero. Here the absolute value of t is considered in prob(|t|). The confidence interval (CI) gives the range of values for the calculated coefficient that covers the actual coefficient value with the specified confidence. Linear regression can be performed using several algorithms; in this case study, the singular value decomposition (SVD) algorithm is used to implement linear regression, as it is robust and less sensitive to predictor variables that are nearly collinear [24, 25].
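As an illustration, the following is a minimal sketch, in Python with NumPy (not part of the original study, which used the DTREG tool), of fitting Y = a + bX by ordinary least squares. The closed-form expressions for a and b follow Eq. (16.3), the SSR follows Eq. (16.2), and an SVD-based fit via numpy.linalg.lstsq is shown for comparison; the data values are illustrative assumptions.

```python
import numpy as np

# Illustrative data: X is a single predictor, y the target (assumed values).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])
N = len(X)

# Closed-form OLS coefficients, as in Eq. (16.3).
b = (N * np.sum(X * y) - np.sum(X) * np.sum(y)) / (N * np.sum(X**2) - np.sum(X)**2)
a = np.mean(y) - b * np.mean(X)

# Goodness of fit: sum of squared residuals, as in Eq. (16.2).
y_pred = a + b * X
ssr = np.sum((y - y_pred) ** 2)

# Equivalent SVD-based least-squares fit (lstsq uses SVD internally),
# which is more robust when predictors are nearly collinear.
A = np.column_stack([np.ones(N), X])        # design matrix [1, X]
(a_svd, b_svd), *_ = np.linalg.lstsq(A, y, rcond=None)

print(f"a={a:.3f}, b={b:.3f}, SSR={ssr:.3f}")
print(f"SVD fit: a={a_svd:.3f}, b={b_svd:.3f}")
```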

16.3.2 Multilayer Perceptron (MLP)

A single-layer network can realize only a linear discriminant function [26]. Nonlinear discriminant functions for linearly non-separable problems can be treated as piecewise linear functions [26], and a piecewise linear discriminant function can be implemented by a multilayer network [27]. MLP is a supervised network because it needs the outputs in the data set in order to learn [28]. The objective of such a network is to build a model that can predict the correct output when the desired output is unknown [28]. MLP is a multilayered, feed-forward, and fully connected network. Except for the input nodes, each node is a processing unit with an activation function. Back propagation, which is a supervised learning technique, is used to train the network [28]. MLP is a modification of the standard linear perceptron that can distinguish data that are not linearly separable [29].
The basic model has three layers: one input, one (or more) hidden, and one output layer [30]. The input layer comprises all predictor variables (x_i), and each of them is associated with a random weight W^h_ij from the ith input layer node to the jth hidden layer node, where i and j are indices of neurons [30]. The input layer distributes the input values to each neuron in the hidden layer, where a summation function computes the weighted sum (u_i) and feeds it to the activation function, which computes the output of the hidden layer (h_i) as follows:

Weighted Sum = u_i = Σ_{i=1}^{n} x_i W^h_ij  (16.4)

Output of Hidden layer = h_i = f(u_i) = 1 / (1 + e^(−u_i))  (16.5)

The hidden layer of the MLP model uses the nonlinear sigmoid function f(·). The output from the hidden layer is distributed to each of the neurons in the output layer [30]. The output layer also applies the summation function to compute the weighted sum, denoted by v_i, and feeds it to the activation function to compute the output of the layer (y_i) as follows:

Weighted Sum = v_i = Σ_{i=1}^{n} h_i W^y_ij  (16.6)

Output of Output layer = y_i = f(v_i) = 1 / (1 + e^(−v_i))  (16.7)

where h_i is the output of the hidden layer and W^y_ij is the weight from the ith hidden layer node to the jth output layer node (i and j are indices of neurons). The back-propagation learning algorithm is based on minimizing the error between the actual output (Y_i) of the network and the desired output (Y_i(d)). The error E is calculated using the Euclidean function as follows:

E = (1/2) (Y_i(d) − Y_i)²  (16.8)

where E is the error of the ith neuron of the output layer. To reduce the error, the weight of the neuron must be updated as follows:

W^(K+1)_ij = W^K_ij − ∂E/∂W_ij  (16.9)

where W^(K+1)_ij is the updated weight, W^K_ij is the old weight, and ∂E/∂W_ij is the rate of change of the error with respect to the weight. The learning algorithm adopted by the multilayer perceptron neural network proceeds as follows (a minimal sketch in code is given after the steps):
Step 1: Read the inputs and desired/expected output of the network.
Step 2: Initialize the input weights with random values.
Step 3: Compute the summation function.
Step 4: Compute the activation function.
Step 5: Calculate the prediction error E (= desired output − actual output).
Step 6: Average the error over the complete set of training cases.
Step 7: Transmit the error backward through the network and calculate the gradient of the change in error with respect to changes in weight.
Step 8: Reduce the error by adjusting or updating the weights.
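The following is a minimal sketch, in Python with NumPy, of the forward pass of Eqs. (16.4)–(16.7) and the weight-update loop of Steps 1–8. It is an illustration only, not the DTREG implementation (which initializes weights with the Nguyen-Widrow method and optimizes them by conjugate gradient rather than the plain gradient descent shown here); the layer sizes, learning rate lr, and data are assumptions.

```python
import numpy as np

def sigmoid(z):
    # Logistic activation used in Eqs. (16.5) and (16.7)
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 4, 8, 1                      # assumed layer sizes
W_h = rng.normal(scale=0.5, size=(n_in, n_hidden))   # input-to-hidden weights
W_y = rng.normal(scale=0.5, size=(n_hidden, n_out))  # hidden-to-output weights

# Illustrative training data: X (cases x predictors), y_d desired outputs in {0, 1}
X = rng.random((20, n_in))
y_d = (X.sum(axis=1, keepdims=True) > 2.0).astype(float)

lr = 0.5                                   # assumed learning rate
for epoch in range(1000):                  # each weight-update cycle is an epoch
    # Forward pass: Eqs. (16.4)-(16.7)
    h = sigmoid(X @ W_h)                   # hidden-layer outputs
    y = sigmoid(h @ W_y)                   # network outputs
    # Error of Eq. (16.8), averaged over the training cases (Step 6)
    E = 0.5 * np.mean((y_d - y) ** 2)
    # Backward pass: gradients of E w.r.t. the weights (Step 7)
    delta_y = (y - y_d) * y * (1 - y)           # output-layer error signal
    delta_h = (delta_y @ W_y.T) * h * (1 - h)   # hidden-layer error signal
    # Weight updates in the spirit of Eq. (16.9) (Step 8)
    W_y -= lr * h.T @ delta_y / len(X)
    W_h -= lr * X.T @ delta_h / len(X)

print(f"final mean squared error: {E:.4f}")
```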

16.4 Model Estimation

The ENT (Ear-Nose-Throat) and Head and Neck departments of three tertiary care hospitals in Pune, Maharashtra, India, were visited to gather the data of 1025 patients. The details of the data set are presented in a previous paper [31]. The data were subsequently stored in comma-separated values (csv) file format and used to build the linear regression and MLP models using the DTREG tool [32]. The database is described with the help of 35 variables. The attribute 'survival' is considered as the target variable, and tenfold cross-validation is used for validating both the models (a minimal sketch of such an evaluation is given below). The details of the development of each model are then discussed.
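For readers who want to reproduce the evaluation protocol outside DTREG, the following is a minimal sketch, in Python with scikit-learn, of a tenfold cross-validated accuracy estimate; the file name oral_cancer.csv and the column name survival are assumptions, and scikit-learn's MLPClassifier is used as a stand-in for DTREG's multilayer perceptron.

```python
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# Assumed file and column names; the data set itself is described in [31].
df = pd.read_csv("oral_cancer.csv")
X = pd.get_dummies(df.drop(columns=["survival"]))  # one-hot encode categorical predictors
y = df["survival"]                                 # target: 'A' (alive) or 'D' (dead)

# One hidden layer of 8 logistic units, mirroring the architecture in Table 16.2.
mlp = MLPClassifier(hidden_layer_sizes=(8,), activation="logistic", max_iter=2000)

# Tenfold cross-validated classification accuracy.
scores = cross_val_score(mlp, X, y, cv=10, scoring="accuracy")
print(f"mean accuracy over 10 folds: {scores.mean():.4f}")
```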

16.4.1 Model Development

(i) Linear Regression

The linear regression model (Y = a + bX) is developed, where Y is the target variable Survival = 'Dead', X is the category of each predictor variable, and the constant a is 1.30653e+014. Table 16.1 presents the computed coefficient b values along with the standard error, t-statistic, prob(|t|), and 95 % confidence interval.

(ii) Multilayer Perceptron

The architecture of the multilayer perceptron network is presented in Table 16.2, and the training statistics of the network are given in Table 16.3. DTREG adopts the Nguyen-Widrow algorithm to create a set of random starting values within the range specified by the algorithm and then uses the conjugate gradient method to optimize the set of network parameters. The conjugate gradient algorithm is used to perform backward propagation to find the optimal network weights; each cycle of weight updates is called an epoch.

Table 16.4 presents the model size summary report for the network, produced using fourfold cross-validation. The network has been built using eight neurons for hidden layer 1, as this size gives the minimum percentage of misclassifications; a sketch of this kind of size search is given below.
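The following is a minimal sketch, in Python with scikit-learn, of the hidden-layer size search summarized in Table 16.4: each candidate size from 2 to 16 is scored by fourfold cross-validated misclassification rate, and the size with the lowest error is kept. It assumes X and y prepared as in the earlier sketch and again uses MLPClassifier as a stand-in for DTREG's network.

```python
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

def best_hidden_size(X, y, sizes=range(2, 17), folds=4):
    """Return (size, misclassification %) with the lowest cross-validated error."""
    results = {}
    for n in sizes:
        mlp = MLPClassifier(hidden_layer_sizes=(n,), activation="logistic",
                            max_iter=2000)
        acc = cross_val_score(mlp, X, y, cv=folds, scoring="accuracy").mean()
        results[n] = (1.0 - acc) * 100.0   # % misclassifications, as in Table 16.4
    best = min(results, key=results.get)   # size with minimum misclassification
    return best, results[best]
```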
Model estimation in terms of the misclassification table, confusion matrix, sensitivity-specificity, and lift-gain for linear regression and multilayer perceptron is presented below.

16.4.2 Misclassification Table

This table presents the number of rows in a particular category (A = Alive or D = Dead) that were misclassified by the model, for the training as well as the validation data. Along with the misclassified count, the table also presents the misclassified weight, misclassified percentage, cost, and classification accuracy of the model. Table 16.5 presents the misclassification table for the MLP model, and Table 16.6 presents the misclassification table for the linear regression model, for training and validation data, respectively.

Table 16.1 Linear regression parameters for Survival = 'Dead': coefficient b values, standard error, t-statistic, prob(|t|), 95 % confidence interval
Variable | Coefficient | Std. error | t | Prob(|t|) | 95 % Confidence interval
Clinical symptom
Burning sensation 0.0248006 0.03886 0.64 0.52350 0.1011 0.05146
Loosening of tooth 0.0293704 0.06485 0.45 0.65070 0.1566 0.09788
Mass 0.459061 0.1678 2.74 0.00634 0.7884 0.1298
History of addiction
None 7.52112 e 2.424 e 0.31 0.75645 5.509 e 4.005 e
+012 +013 +013 +01
Tobacco chewing 7.52112 e 2.424 e 0.31 0.75645 5.509 e 4.005 e
+012 +013 +013 +01
History of addiction 1
None 6.25376 e 2.821 e 2.22 0.02689 1.179 e 7.17 e
+013 +013 +014 +012
Comorbid condition 1
None 1.32596 e 5.517 e 2.40 0.01643 2.433 e 2.409 e
+014 +013 +013 +014
Gross examination
Inltrative 0.0429256 0.1913 0.22 0.82246 0.4182 0.3324
Plaque like 0.246707 0.2641 0.93 0.35043 0.7649 0.2715
Polypoidal 0.184125 0.177 1.04 0.29848 0.5315 0.1632
Ulcero-proliferative 0.178139 0.1131 1.58 0.11546 0.4 0.04374
Site
BM 0.267512 0.1413 1.89 0.05867 0.5448 0.009821
LA 0.0985091 0.0981 1.00 0.31555 0.291 0.094
Palate 0.0596517 0.131 0.46 0.64907 0.1975 0.3168
RMT 0.0883389 0.2038 0.43 0.66471 0.3115 0.4882
Tongue 0.0658472 0.09115 0.72 0.47021 0.2447 0.113
Tumor size
24 cm 5.96114 e 4.236 e 1.41 0.15972 2.352 e 1.427 e
+013 +013 +013 +014
<2 cm 7.00587 e 3.193 e 2.19 0.02848 7.393 e 1.327 e
+013 +013 +012 +014
Neck nodes
Present 7.52112 e 2.424 e 0.31 0.75645 5.509 e 4.005 e
+012 +013 +013 +013
LFT
Normal 0.199452 0.2686 0.74 0.45789 0.7265 0.3276
FNAC of neck node
Yes 0.0277584 0.04156 0.67 0.50435 0.1093 0.0538
Diagonistic biopsy
None 4.87633 e 2.126 e 0.23 0.81859 4.659 e 3.683 e
+013 +014 +014 +014
(continued)
Table 16.1 (continued)
Variable | Coefficient | Std. error | t | Prob(|t|) | 95 % Confidence interval
SCC 9.94372 e 3.8 e 0.26 0.79360 8.451 e 6.462 e
+013 +014 +014 +014
USG
Yes 5.20903 e 4.315 e 1.21 0.22760 3.258 e 1.368 e
+013 +013 +013 +014
CTScan/MRI
Normal 0.287663 0.6471 0.44 0.65676 1.558 0.9822
Diagnosis
Acantholytic 2.58254 e 5.287 e 0.49 0.62533 7.793 e 1.296 e
+014 +014 +014 +015
Adenocarcinoma 5.31094 e 3.346 e 0.16 0.87393 6.036 e 7.098 e
+013 +014 +014 +014
Basaloid 1.01163 e 5.378 e 0.19 0.85084 9.543 e 1.157 e
+014 +014 +014 +015
Benign 1.04387 e 2.17 e 0.48 0.63058 3.214 e 5.302 e
+014 +014 +014 +014
Lymphoepithelioma 7.55833 e 1.24 e 0.61 0.54221 1.677 e 3.189 e
+014 +015 +015 +015
Plaque like 8.78053 e 9.825 e 0.89 0.37169 1.05 e 2.806 e
+014 +014 +015 +015
Sarcomatoid 1.56412 e 1.143 e 1.37 0.17148 3.807 e 6.788 e
+015 +015 +015 +014
SCC 2.348 e+015 1.495 e 1.57 0.11659 5.856 e 5.282 e
+015 +014 +015
Staging
I 1.60254 e 5.823 e 0.28 0.78322 9.825 e 1.303 e
+013 +013 +013 +014
II 8.5043 e 4.776 e 0.18 0.85871 8.522 e 1.022 e
+012 +013 +013 +014
IV 1.60254 e 5.823 e 0.28 0.78322 9.825 e 1.303 e
+013 +013 +013 +014
Radiotherapy
Y 2.191 e 1.424 e 1.54 0.12426 4.986 e 6.037 e
+015 +015 +015 +014
Chemotherapy
Y 0.0979578 0.2071 0.47 0.63633 0.5044 0.3085
Histopathology
Acantholytic 2.58254 e 5.287 e 0.49 0.62533 1.29 6 e 7.793 e
+014 +014 +015 +014
Adenocarcinoma 5.31094 e 3.346 e 0.16 0.87393 7.098 e 6.036 e
+013 +014 +014 +014
Basaloid 1.01163 e 5.378 e 0.19 0.85084 1.157 e 9.543 e
+014 +014 +015 +014
(continued)
Table 16.1 (continued)
Variable | Coefficient | Std. error | t | Prob(|t|) | 95 % Confidence interval
Benign 5.75667 e 4.833 e 1.19 0.23390 1.524 e 3.727 e
+013 +013 +014 +013
Lymphoepithelioma 7.55833 e 1.24 e 0.61 0.54221 3.189 e 1.677 e
+014 +015 +015 +015
Plaque like 8.78053 e 9.825 e 0.89 0.37169 2.806 e 1.05 e
+014 +014 +015 +015
Sarcomatoid 1.56412 e 1.143 e 1.37 0.17148 6.788 e 3.807 e
+015 +015 +014 +015
SCC 5.75667 e 4.833 e 1.19 0.23390 1.524 e 3.727 e
+013 +013 +014 +013
Schonama 5.75667 e 4.833 e 1.19 0.23390 1.524 e 3.727 e
+013 +013 +014 +013
Constant 1.30653 e 5.601e 2.33 0.01986 2.406 e 2.074 e
+014 +013 +014 +013

Table 16.2 Architecture of multilayer perceptron network


Layer Neurons Activation Min. weight Max. weight
Input 48 Passthru
Hidden 1 8 Logistic 1.504e+000 1.660e+000
Output 2 Logistic 1.366e+000 1.288e+000

Table 16.3 Training statistics of multilayer perceptron network
Process | Time | Evaluations | Error
Conjugate gradient | 00:00:00.4 | 153,135 | 1.1655e-001

Table 16.4 Model size summary report for MLP
Hidden layer 1 neurons | % Misclassifications
2 30.14634
3 31.41463
4 30.92683
5 31.90244
6 30.92683
7 30.24390
8 29.65854 Optimal size
9 30.04878
10 30.63415
11 30.34146
12 30.92683
13 31.21951
14 30.63415
15 30.82927
16 31.31707
Table 16.5 Misclassification table for multilayer perceptron (training and validation data)
Category Misclassied (training data) Misclassied (validation data)
Count Weight % Cost Count Weight % Cost
A 212 212 34.6 0.34 236 236 38.56 0.38
D 95 95 23.0 0.23 74 74 17.91 0.17
Total 307 307 29.9 0.30 310 310 30.24 0.30
Overall accuracy = 70.05 % Overall accuracy = 69.76 %

Table 16.6 Misclassification table for linear regression (training and validation data)
Category Misclassied (training data) Misclassied (validation data)
Count Weight % Cost Count Weight % Cost
A 10 10 1.63 0.01 196 196 32.02 0.32
D 399 399 96.6 0.96 201 201 48.66 0.48
Total 409 409 39.9 0.39 397 397 38.73 0.38
Overall accuracy = 60.10 % Overall accuracy = 61.27 %

16.4.3 Confusion Matrix

The classification of data rows by each model is presented in the confusion matrix. The matrix has a row and a column for each category (A = Alive and D = Dead) of the target variable. The categories in the first column are the actual categories of the target variable, and the categories shown across the top of the table are the predicted categories. The figure in each cell is the number of cases with the actual category of the row and the predicted category of the column. The diagonal cells contain the values that are correctly classified by the model, whereas the off-diagonal cells contain the misclassified data. The confusion matrices for training and validation data for MLP and LR are shown in Table 16.7.

Table 16.7 Confusion matrix for multilayer perceptron and linear regression (training and
validation data)
Category Multilayer perceptron Linear regression
Training data Validation data Training data Validation data
A D A D A D A D
A 400 212 376 236 602 10 416 196
D 95 318 74 339 399 14 201 212
16.4.4 Sensitivity and Specificity

The sensitivity and specificity report concerns the positive and the negative categories. Sensitivity represents the probability of an algorithm correctly predicting malignancy, and it is computed as Sensitivity = TP/(TP + FN). Specificity represents the probability of an algorithm correctly predicting non-malignancy, and it is computed as Specificity = TN/(FP + TN). Survival = D is considered positive and Survival = A is considered negative for both the models developed. The report also provides details about the negative predictive value (NPV), positive predictive value (PPV), geometric mean of sensitivity and specificity, geometric mean of PPV and NPV, recall, precision, F-measure, and area under the ROC curve for the training and validation data for both the models in Table 16.8.
The area under the ROC curve (AUC) is obtained by plotting the true positive rate against the false positive rate for the different possible cut-points of a diagnostic test. It shows the trade-off between sensitivity and specificity (any increase in sensitivity will be accompanied by a decrease in specificity). The area under the curve ranges between 0 and 1 and is used as a criterion for measuring the prediction ability of a model; the closer the value is to 1, the better the prediction. Figures 16.4 and 16.5 present the area under the ROC curve for Survival = 'Dead' for the MLP and linear regression models, respectively.
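To make these definitions concrete, the following is a minimal sketch, in Python with scikit-learn, that derives sensitivity, specificity, PPV, NPV, F-measure, and AUC from a confusion matrix and predicted scores; the label and score arrays are illustrative stand-ins, not the study's data.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

# Illustrative labels: 1 = 'D' (positive, dead), 0 = 'A' (negative, alive).
y_true  = np.array([1, 0, 1, 1, 0, 0, 1, 0, 0, 1])
y_pred  = np.array([1, 0, 0, 1, 0, 1, 1, 0, 0, 1])
y_score = np.array([0.9, 0.2, 0.4, 0.8, 0.1, 0.7, 0.6, 0.3, 0.2, 0.95])

# Confusion matrix laid out as in Table 16.7 (rows actual, columns predicted).
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

sensitivity = tp / (tp + fn)   # probability of correctly predicting malignancy
specificity = tn / (fp + tn)   # probability of correctly predicting non-malignancy
ppv = tp / (tp + fp)           # positive predictive value (= precision)
npv = tn / (tn + fn)           # negative predictive value
g_mean = np.sqrt(sensitivity * specificity)
f_measure = 2 * ppv * sensitivity / (ppv + sensitivity)
auc = roc_auc_score(y_true, y_score)  # area under the ROC curve

print(f"sens={sensitivity:.2f} spec={specificity:.2f} "
      f"ppv={ppv:.2f} npv={npv:.2f} g-mean={g_mean:.2f} "
      f"F={f_measure:.2f} AUC={auc:.2f}")
```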

16.4.5 Lift and Gain Chart

Lift is the ratio between the expected response using the predictive model and without any model for a specific part of the population, and gain is the ratio between the expected response using the predictive model and without any model for the entire population. The lift and gain chart uses the following terms:
Bin index: The entire dataset is divided into an equal number of bins. The first bin represents the data rows that have the highest predicted probability for the specified target category.
Cutoff probability: The smallest predicted probability of the data rows falling in the bin.
Class % of bin: The percentage of the cases in the bin that have the specified category of the target variable.
Cumulative % population: The cumulative percentage of the total cases (with any category) falling in bins up to and including the current one.
Cumulative % of class: The cumulative percentage of the rows with the specified category falling in bins up to and including the current one.
Table 16.8 Sensitivity and specificity report for multilayer perceptron and linear regression (training and validation data)
Category | Description | MLP training (%) | MLP validation (%) | LR training (%) | LR validation (%)
True positive (TP) | Predicting malignant patients among the malignant patients | 31.02 | 33.07 | 1.37 | 20.68
True negative (TN) | Predicting non-malignant patients among non-malignant patients | 39.02 | 36.68 | 68.73 | 40.59
False positive (FP) | Patients who are predicted as malignant among non-malignant patients | 20.68 | 23.02 | 0.98 | 19.12
False negative (FN) | Patients who are predicted as non-malignant among malignant patients | 9.27 | 7.22 | 38.93 | 19.61
Sensitivity | Probability to correctly predict malignancy | 77.00 | 82.08 | 3.39 | 51.33
Specificity | Probability to correctly predict non-malignant cases | 65.36 | 61.44 | 98.37 | 67.97
Geometric mean of sensitivity and specificity | Geometric mean of sensitivity and specificity | 70.94 | 71.01 | 18.26 | 59.07
Positive predictive value (PPV) | Proportion of patients with the disease who are correctly predicted to have the disease | 60.00 | 58.96 | 58.33 | 51.96
Negative predictive value (NPV) | Proportion of patients who do not have the disease who are correctly predicted as not having the disease | 80.81 | 83.56 | 60.14 | 67.42
Geometric mean of PPV and NPV | Geometric mean of PPV and NPV | 69.63 | 70.19 | 59.23 | 59.19
Precision | Proportion of cases selected by the model that have the true value; precision is equal to PPV | 60.00 | 58.96 | 58.33 | 51.96
Recall | Proportion of the true cases that are identified by the model; recall is equal to sensitivity | 77.00 | 82.08 | 3.39 | 51.33
(continued)
Table 16.8 (continued)
Category | Description | MLP training (%) | MLP validation (%) | LR training (%) | LR validation (%)
F-measure | Combines precision and recall to give an overall measure of the quality of the prediction | 0.674 | 0.686 | 0.064 | 0.516
Area under ROC curve | Area under the receiver operating characteristic (ROC) curve for the model | 0.769 | 0.739 | 0.722 | 0.631

Fig. 16.4 Area under ROC curve for multilayer perceptron model

Cumulative gain: The ratio of the cumulative percentage of the class divided by the cumulative percentage of the population.
% of population: The percentage of the total cases falling in the bin.
% of class: The percentage of the cases with the specified category that were placed in this bin.
Lift: Calculated by dividing the percentage of rows in a bin with the specified target category (% of class) by the total percentage of cases in the bin (% of population).
The lift and gain values presented below show how much improvement the linear regression and multilayer perceptron models provide; a minimal sketch of the per-bin computation follows.
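For illustration, the following is a minimal sketch, in Python with NumPy, of how the per-bin lift and cumulative gain values in the tables below can be computed from predicted probabilities; the ten-bin split mirrors the tables, and the scores and labels are assumed for the example.

```python
import numpy as np

def lift_gain_table(y_true, y_score, n_bins=10):
    """Per-bin lift and cumulative gain for the positive class (e.g. Survival = 'D')."""
    order = np.argsort(-y_score)             # sort rows by descending predicted probability
    y_sorted = y_true[order]
    bins = np.array_split(y_sorted, n_bins)  # bin 1 holds the highest-probability rows
    n_total, n_pos = len(y_true), y_true.sum()
    cum_pop = cum_pos = 0.0
    rows = []
    for b in bins:
        cum_pop += len(b)
        cum_pos += b.sum()
        pct_class = b.sum() / n_pos * 100               # % of class captured by this bin
        pct_pop = len(b) / n_total * 100                # % of population in this bin
        lift = pct_class / pct_pop                      # per-bin lift
        gain = (cum_pos / n_pos) / (cum_pop / n_total)  # cumulative gain
        rows.append((pct_pop, pct_class, lift, gain))
    return rows

# Illustrative data (assumed): 1 = 'D', 0 = 'A'
rng = np.random.default_rng(1)
y_score = rng.random(1000)
y_true = (rng.random(1000) < y_score).astype(int)  # scores loosely predictive
for i, (pp, pc, lift, gain) in enumerate(lift_gain_table(y_true, y_score), 1):
    print(f"bin {i:2d}: %pop={pp:5.2f} %class={pc:5.2f} lift={lift:4.2f} cum gain={gain:4.2f}")
```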

Fig. 16.5 Area under ROC for linear regression model

Table 16.9 Calculation of lift and gain for linear regression for training data for Survival = 'Dead'
Bin index | Cutoff probability | Class % of bin | Cum % population | Cum % of class | Cum gain | % of population | % of class | Lift
1 0.79395 64.08 10.05 15.98 1.59 10.05 15.98 1.59
2 0.73145 63.11 20.10 31.72 1.58 10.05 15.74 1.57
3 0.66895 64.08 30.15 47.70 1.58 10.05 15.98 1.59
4 0.60645 59.22 40.20 62.47 1.55 10.05 14.77 1.47
5 0.57520 51.46 50.24 75.30 1.50 10.05 12.83 1.28
6 0.45020 51.46 60.29 88.14 1.46 10.05 12.83 1.28
7 0.15332 17.48 70.34 92.49 1.31 10.05 4.36 0.43
8 0.05957 13.59 80.39 95.88 1.19 10.05 3.39 0.34
9 0.00000 10.68 90.44 98.55 1.09 10.05 2.66 0.27
10 0.00000 6.12 100.00 100.00 1.00 9.56 1.45 0.15

Table 16.10 Lift and gain for linear regression for training and validation data
Lift/Gain Training data Validation data
A D A D
Average gain 1.259 1.4134 1.149 1.170
Maximum lift 1.54 1.59 1.51 1.59
Percent of cases with survival (%) 59.71 40.29 59.71 40.29
Fig. 16.6 Lift and gain for Survival = 'Dead' for linear regression model: (a) lift, (b) gain

(i) Linear Regression

Tables 16.9 and 16.10 provide the details of the calculations done for computing lift and gain, average gain, maximum lift, and the percentage of patients who survived. Figure 16.6 presents the lift and gain graphs for the model.
Table 16.11 Calculation of lift and gain for multilayer perceptron for training data for Survival = 'Dead'
Bin index | Class % of bin | Cum % population | Cum % of class | Cum gain | % of population | % of class | Lift
1 66.02 10.05 16.46 1.64 10.05 16.46 1.64
2 65.05 20.10 32.69 1.63 10.05 16.22 1.61
3 58.25 30.15 47.22 1.57 10.05 14.53 1.45
4 66.02 40.20 63.68 1.58 10.05 16.46 1.64
5 47.57 50.24 75.54 1.50 10.05 11.86 1.18
6 46.60 60.29 87.17 1.45 10.05 11.62 1.16
7 24.27 70.34 93.22 1.33 10.05 6.05 0.60
8 14.56 80.39 96.85 1.20 10.05 3.63 0.36
9 4.85 90.44 98.06 1.08 10.05 1.21 0.12
10 8.16 100.00 100.00 1.00 9.56 1.94 0.20

Table 16.12 Lift and gain for multilayer perceptron for training and validation data
Lift/Gain Training data Validation data
A D A D
Average gain 1.278 1.416 1.278 1.308
Maximum lift 1.61 1.64 1.53 1.57
Percent of cases with survival (%) 59.71 40.29 59.71 40.29

(ii) Multilayer Perceptron

Tables 16.11 and 16.12 provide the details of the calculations done for computing lift and gain, average gain, maximum lift, and the percentage of patients who survived. Figure 16.7 presents the lift and gain graphs for the model.

16.5 Comparing the Models

An attempt has been made to compare the linear regression and MLP models on the basis of various criteria to assess their performance, as discussed below:
Misclassification Table: The classification accuracy of the MLP model is 70.05 % for training data and 69.76 % for validation data, whereas the classification accuracy of the linear regression model is 60.10 % for training data and 61.27 % for validation data.
Fig. 16.7 Lift and gain for Survival = 'Dead' for multilayer perceptron model: (a) lift, (b) gain

Confusion Matrix: The matrix shows that the MLP model generates better true positive and false positive results, whereas linear regression generates better true negative and false negative results.
Fig. 16.8 Comparison of multilayer perceptron and linear regression for training data

Sensitivity and Specificity: These indicate that the MLP model can predict the malignant cases more efficiently than linear regression.
Lift and Gain: The average gain and the percentage of cases with Survival = A or D are slightly better for the MLP model in comparison to the linear regression model.
Figures 16.8 and 16.9 present the graphical comparison of both models for the training and validation datasets. MLP displays better results in terms of accuracy, true negatives, false negatives, specificity, geometric mean of sensitivity and specificity, positive predictive value, geometric mean of PPV and NPV, precision, F-measure, and area under the ROC curve in comparison with linear regression. The results clearly indicate that MLP is the better model to predict the survivability of oral cancer patients.
Fig. 16.9 Comparison of multilayer perceptron and linear regression for validation data

16.6 Conclusions

In this paper, we have built two models for the prediction of the survival of oral cancer patients: a contemporary statistical model, i.e., the linear regression model, and a neural network model, i.e., the multilayer perceptron model. The two models differ in the way they are developed and the way they handle the dataset. The performance of both models has been compared systematically. The classification accuracy of the neural network model is certainly better than that of the statistical method, and it is definitely a better model to predict the survival of oral cancer patients.

Acknowledgments The authors would like to place on record their sincere thanks to Dr. Vijay Sharma, MS, ENT, for his valuable contribution in understanding the occurrence and diagnosis of oral cancer. The authors offer their gratitude to the management and staff of the Indian School of Mines for their constant support.

References

1. Setyawati BR, Sahirman S, Creese RC (2002) Neural networks for cost estimation. AACE Int Trans EST13:13.1–13.8
2. Smith K, Palaniswamy M, Krishnamoorthy M (1996) A hybrid neural approach to combinatorial optimization. Comput Oper Res 23(6):597–610
3. Buntine WL, Weigend AS (1991) Bayesian back-propagation. Complex Syst 5:603–643
4. Warner B, Misra M (1996) Understanding neural networks as statistical tools. Am Stat 50(4):284–293
5. Lippmann RP (1987) An introduction to computing with neural nets. IEEE ASSP Mag 4(2):4–22
6. Freeman JA (1994) Simulating neural networks with Mathematica. Addison-Wesley, Reading, MA
7. Hertz J, Krogh A, Palmer RG (1991) Introduction to the theory of neural computation. Santa Fe Institute studies in the sciences of complexity, vol 1. Addison-Wesley, Redwood City, CA
8. Fausett L (1994) Fundamentals of neural networks: architecture, algorithms and applications. Prentice Hall
9. Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536
10. Smith M (1993) Neural networks for statistical modeling. Van Nostrand Reinhold, New York
11. Taylor JG (1999) Neural networks and their applications. Wiley
12. White H (1992) Artificial neural networks: approximation and learning theory. Basil Blackwell, Oxford
13. Langdon JD, Russel RC, Williams NS, Bulstrode CJK (2000) Arnold: oral and oropharyngeal cancer practice of surgery. Hodder Headline Group, London
14. Dinshaw KA, Ganesh B (2008) Annual report 2002–2005, hospital based cancer registry. Tata Memorial Hospital
15. Willium GS, Hine MK, Levy BM. A textbook of oral pathology, 4th edn. W.B. Saunders Company, Philadelphia
16. Baxt WG (1995) Application of artificial neural networks to clinical medicine. Lancet 346:1135–1138
17. Bottaci L (1997) Artificial neural networks applied to outcome prediction for colorectal cancer patients in separate institutions. Lancet 350:469–472
18. Fogel DB, Wasson EC, Boughton EM, Porto VW, Angeline PJ (1998) Linear and neural models for classifying breast masses. IEEE Trans Med Imaging 17:485–488
19. Guh JY, Yang CY, Yang JM, Chen LM, Lai YH (1998) Prediction of equilibrated postdialysis BUN by an artificial neural network in high-efficiency hemodialysis. Am J Kidney Dis 31:638–646
20. Lapeer RJA, Dalton KJ, Prager RW, Forsström JJ, Selbmann HK, Derom R (1995) Application of neural networks to the ranking of perinatal variables influencing birth weight. Scand J Clin Lab Invest 55:83–93
21. Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2:359–366
22. Fujita H, Katafuchi T, Uehara T, Nishimura T (1992) Application of artificial neural network to computer-aided diagnosis of coronary artery disease in myocardial SPECT bull's-eye images. J Nucl Med 33(2):272–276
23. Poli R, Cagnoni S, Livi R, Coppini G, Valli G (1991) A neural network expert system for diagnosing and treating hypertension. Computer:64–71
24. Shang JS, Lin YE, Goetz AM (2000) Diagnosis of MRSA with neural networks and logistic regression approach. Health Care Manag Sci 3(4):287–297
25. HariKumar R, Vasanthi NS, Balasubramani M (2012) Performance analysis of artificial neural networks and statistical methods in classification of oral and breast cancer stages. Int J Soft Comput Eng (IJSCE) 2(3):263–269
26. Exarchos KP, Rigas G, Goletsis Y, Fotiadis DI (2012) Modelling of oral cancer progression using dynamic Bayesian networks. In: Data mining for biomarker discovery. Springer optimization and its applications, pp 199–212
27. Rosenblatt F (1961) Principles of neurodynamics: perceptrons and the theory of brain mechanisms. Spartan Books, Washington DC
28. Kent S (1996) Diagnosis of oral cancer using genetic programming. Technical report CSTR-96-14
29. Kaladhar DSVGK, Chandana B, Bharath Kumar P (2011) Predicting cancer survivability using classification algorithms. Int J Res Rev Comput Sci (IJRRCS) 2(2):340–343
30. Milovic B, Milovic M (2012) Prediction and decision making in health care using data mining. Int J Public Health Sci 01(2):69–78
31. www.dtreg.com
32. Sharma N, Om H (2012) Framework for early detection and prevention of oral cancer using data mining. Int J Adv Eng Technol 4(2):302–310
Author Index

A M
Acharya, Moumita, 227 Mahajan, S.P., 123
Ashour, Amira S., 243 Mainkar, Sujay D., 123
Autee, Rajesh, 209 Maity, Satyabrata, 227
Myalapalli, Vamsi Krishna, 1
B
Baital, Kalyan, 31 O
Balas, Valentina E., 243 Om, Hari, 263
Oza, Shraddha D., 87
C
Chakrabarti, Amlan, 31, 227 P
Chandavale, Anjali, 173 Pal, Chandrajit, 227
Chatterjee, Sankhadeep, 243 Pathak, Mrunal, 137

D S
Dey, Nilanjan, 243 Sabnis, Manoj K., 49
Dorlikar, Vyanktesh, 173 Sharma, Neha, 263
Shelake, Vijay, 153
F Shewale, Ashwini, 193
Fahrnberger, Günter, 97
Shukla, Manoj Kumar, 49
H Sirsat, Narayan Balaji, 75
Holambe, Prabhakar Ramesh, 75 Sonawane, Anuja, 193
Hore, Sirshendu, 243 Srinivasu, N., 137
Swami, Nitin Vijaykumar, 75
J
Joshi, K.R., 87 T
Teke, Utkarsha, 193
K
Khedkar, Mohan S., 153 W
Kolhe, Satish R., 17 Waghmare, Nayan, 193
Kumar, Santosh D., 193
Kunjir, Supriya, 209
