
International Journal of Computer Science and Business Informatics (IJCSBI.ORG)

ISSN: 1694-2507 (Print)

VOL 1, NO 1, MAY 2013
Table of Contents - VOL 1, NO 1, MAY 2013

Implementation of Image Steganography Using 2-Level DWT Technique
Aayushi Verma, Rajshree Nolkha, Aishwarya Singh and Garima Jaiswal

Efficient Neighbor Routing in Wireless Mesh Networks
V. Lakshmi Praba and A. Mercy Rani

Content Based Messaging Model for Library Information System
Surbhi Agarwal, Chandrika Chanda and Senthil Murugan B.

Building an Internal Cloud for IT Support Organisations: A Preview
S. M. M. M Kalyan Kumar and Dr S. C. Pradhan

Use of Intelligent Business, a Method for Complete Fulfillment of E-government
M. Nili Ahmadabadi, Masoud Najafi and Peyman Gholami

Comparison of Swarm Intelligence Techniques
Prof. S. A. Thakare

An Efficient Rough Set Approach in Querying Covering Based Relational Databases
P. Prabhavathy and Dr. B. K. Tripathy

Implementation of Image
Steganography Using 2-Level DWT
Technique
Aayushi Verma, Rajshree Nolkha, Aishwarya Singh and
Garima Jaiswal
Department of Computer Science and Engineering,
Inderprastha Engineering College,
Gautam Buddh Technical University

ABSTRACT
Image steganography is a significant discipline of information hiding: the process of concealing secret information behind an image. The Discrete Wavelet Transform (DWT) is one of the well-known methods used in steganography. The focus of the work proposed in this paper is on decreasing the complexity of image hiding with the DWT technique while providing better undetectability and less distortion in the stego image. The paper proposes algorithms for embedding a secret image behind a grayscale cover image and for extracting it. An analysis based on performance measurement methods such as peak signal to noise ratio (PSNR) and mean square error (MSE) gives an experimental summary for four different cases, each spanning different sizes of cover and secret image, comparing the cover image and stego image at the sender's side and the embedded secret and extracted secret at the receiver's side. Stego attacks are then applied on the stego image and, after each attack, the secret image is extracted from the distorted image. For better analysis, this extracted secret is compared with the expected result on the basis of PSNR and MSE. The proposed algorithm is also compared with an existing DWT-based method proposed by K. B. Shiva Kumar et al. [7].

Keywords
Cover image, DWT, Key information, Secret image, Stego image

1. INTRODUCTION
Image steganography has been a vast area of research for many years.
It is a process that hides a secret image behind a cover image in such a
way that the presence of the secret image is concealed and the cover image
appears unchanged [1]. In this way, digital information can be
embedded and transferred to the destination with minimum risk of
detectability. The need for undetectability has driven the use of
steganography in areas such as commerce, national security,
banking and other private communication. Other
information hiding methods such as cryptography, watermarking and digital
signatures differ from steganography, as steganography allows


the communication itself to be hidden and also provides better quality of the secret image.

Figure 1. Principles of Steganography


Figure 1 shows the basic flow of processes that take place in image
steganography. Steganography [2] is a two-sided method: on one side the
secret image is embedded in the cover image, and on the other side the
secret image is extracted by performing inverse operations on the stego
image.

In this paper, the discrete wavelet transform (DWT) technique, one of the
most robust, secure and high-capacity image steganographic techniques, is
used to accomplish the embedding of the image.

2. DISCRETE WAVELET TRANSFORM (DWT)


Discrete wavelet transforms are used to convert the image from the spatial domain
to the frequency domain, where the wavelet coefficients so generated are
modified to conceal the image. In this kind of transformation the wavelet
coefficients separate the high- and low-frequency information on a
pixel-to-pixel basis [3]. The DWT applied in the proposed work is the
Haar DWT, the simplest of all wavelet transform approaches. In this
transform, the signal is passed through low-pass and high-pass filters, and
the high- and low-frequency wavelet coefficients are generated by taking the
difference and the average of two adjacent pixel values, respectively [4]. The
operation of the Haar DWT on the cover image results in the formation of four
sub-bands, namely the approximate band (LL), the horizontal band (HL), the
vertical band (LH) and the diagonal band (HH). The approximate band
contains the most significant information of the spatial-domain image, while the
other bands contain high-frequency information such as edge details.
Thus, the DWT decomposes the image into four non-overlapping
multi-resolution sub-bands. This process can be


iterated on one of the sub-bands of the first-level DWT to obtain the second-level sub-bands for better results.

LL HL

LH HH

Figure 2. Sub-bands formed after applying the Haar DWT [4]


Figure 2 shows the 4 sub-bands that are formed after applying 1-level Haar
DWT on a 2-dimensional image.
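As an illustration only (not the authors' MATLAB code), one level of this decomposition can be sketched in Python with the PyWavelets package; the random stand-in image and variable names below are assumptions made for the example.

import numpy as np
import pywt

# One level of the 2-D Haar DWT on an 8-bit grayscale image.
# pywt.dwt2 returns the approximation band and a tuple of detail bands
# (horizontal, vertical, diagonal), which correspond to the paper's LL,
# HL/LH and HH sub-bands (band naming conventions differ between tools).
cover = np.random.randint(0, 256, (256, 256)).astype(np.float64)  # stand-in cover image
LL, (H, V, D) = pywt.dwt2(cover, 'haar')
print(LL.shape, H.shape, V.shape, D.shape)  # each sub-band is 128 x 128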

3. STEGANALYSIS
Steganalysis [5] is the art of identifying stego images that contain a secret
image. It does not, however, consider the successful extraction of the secret
image, which is a requirement for cryptanalysis. Steganalysis is a very
difficult task, as it relies on weaknesses of the steganography being used. Recently,
steganalysis has received a lot of attention from the media and the legal
world. The attacker can destroy or disable the secret image, or may
add counter-information over the original secret image, which leads to
statistical differences in the secret image.

4. PROPOSED MODEL
The model proposed in this paper is a unique attempt to simplify the
embedding procedure and reduce the effort of concealing the secret image
in the cover image while still offering better results. The model can be broadly
divided into two sub-modules, where one module deals with the proper
concealing of the secret image and the other module extracts the secret image.
The models are explained in a step-wise procedure below.

4.1 Embedding model


This model will take the cover image and secret image as inputs and will
output the stego image which appears to be the same as the cover image but
will have the secret image within it.

STEP 1 - Input the cover image and apply the first-level DWT to it. This
results in the formation of four bands, i.e. LL1, HL1, LH1 and HH1. For
better imperceptibility, the DWT is applied once again on the HH1 band to
get the next coarser scale of wavelet coefficients, resulting in a second level
of sub-bands within the HH1 band: LL2, HL2, LH2 and HH2. Here, the LL2
band is selected to embed the secret


because hiding in the approximate band results in a smoother and better extraction of the secret at the receiver's side.

Figure 3. 2-level DWT operation on cover image

STEP 2 - Starting from the top left corner of the LL2 band, replace the
5 LSBs of each LL2 coefficient with the 5 MSBs of the corresponding secret-image pixel.

Figure 4. Example depicting the operation
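As a complement to the example in Figure 4, a minimal numerical sketch of this bit replacement follows; the coefficient and pixel values are made up for illustration, and the LL2 coefficient is treated as an 8-bit integer.

# Replace the 5 LSBs of one LL2 coefficient with the 5 MSBs of one secret pixel.
coeff = 0b10110111          # example LL2 coefficient (183)
secret_pixel = 0b11010110   # example secret-image pixel (214)
stego_coeff = (coeff & 0b11100000) | (secret_pixel >> 3)
print(bin(stego_coeff))     # 0b10111010, i.e. 186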

STEP 3 - Iterate the above step for all n*n pixels of the secret image (where n*n is the size of the secret image) to obtain the embedded secret.

Figure 5. Secret image embedded
Figure 6. Stego image

STEP 4 - Apply the inverse DWT twice to translate the frequency-domain
information back to the spatial domain; this yields the stego image, which
appears to be the same as the cover image.


The stego image so formed is transferred over the public network with the
least risk. The key information is also sent to the receiver, because without
prior knowledge of the key information the secret image cannot be extracted.
The key information is a combination of the size of the secret image, the name
of the band where the secret is embedded and the number of MSBs of the
secret that were embedded.

Figure 7. Key information

Figure 7 shows how the key information is generated by the sender; it is sent to the receiver along with the stego image.

4.2 Extracting model


The stego image is taken as input by this model, and the secret image is
extracted from it after processing it according to the key information. The
extraction module is simpler than the embedding module, as it is just the
reverse process of embedding: it simply employs the Haar DWT operations
and the corresponding extraction of the MSBs of the secret.

STEP 1 - The stego image is loaded as the input. The receiver has prior
knowledge of the location of the secret, as it is provided in the key
information. Thus, to obtain the required band, the stego image is
transformed from the spatial domain to the frequency domain by applying
the 2-level DWT to it. After this step, the receiver has the LL2 band, which
contains the secret image's bits.

Figure 8. 2-level DWT operation on received image

STEP 2 - Starting from the top left corner of the second-level approximate band,
i.e. the LL2 band, extract the 5 LSBs of each coefficient into a new matrix.


Figure 9. Example depicting the operations

STEP 3 - After iterating the above step over all n*n coefficients (where n is the
size of the secret image as provided by the sender in the key information K),
we get the secret image as an n*n matrix.

Figure 10. Secret image extracted

5. THE BASIC ALGORITHM


The basic steps involved in the entire process as explained in the proposed
model can be enumerated in an algorithmic way showing the proper flow of
operations.

5.1 Embedding Algorithm


1. Input the cover image.
2. Apply the 2-level DWT on the cover image.
3. Select the band to be modified as m (i.e. LL2).
4. Input the secret image.
5. Obtain the size of the secret as n.
6. For each of the n*n coefficients of the m band, replace the p LSBs
(i.e. 5 bits) with the p MSBs of the secret image.
7. Apply IDWT (Inverse DWT) operation twice and the stego image is
obtained.
8. The key information is formed as: K = n + m + p.
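The following is a hedged Python sketch of the embedding algorithm, not the authors' MATLAB implementation. It assumes the PyWavelets package, square power-of-two images, and integer-like LL2 coefficients; PyWavelets' orthonormal Haar actually yields scaled floating-point coefficients, so the rounding here is an approximation of the scheme described above.

import numpy as np
import pywt

def embed(cover, secret, p=5):
    # Hide an n*n 8-bit secret image in the LL2 band obtained by a
    # 2-level Haar DWT (level 1 on the cover, level 2 on the HH1 band).
    LL1, details1 = pywt.dwt2(cover.astype(np.float64), 'haar')
    HH1 = details1[2]                                  # diagonal (HH1) detail band
    LL2, details2 = pywt.dwt2(HH1, 'haar')
    n = secret.shape[0]
    coeffs = np.rint(LL2[:n, :n]).astype(np.int64)
    # Replace the p LSBs of each coefficient with the p MSBs of the secret pixel.
    coeffs = (coeffs & ~((1 << p) - 1)) | (secret[:n, :n].astype(np.int64) >> (8 - p))
    LL2[:n, :n] = coeffs
    HH1_mod = pywt.idwt2((LL2, details2), 'haar')
    stego = pywt.idwt2((LL1, (details1[0], details1[1], HH1_mod)), 'haar')
    key = (n, 'LL2', p)                                # key information K = n + m + p
    return np.clip(np.rint(stego), 0, 255).astype(np.uint8), key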

5.2 Extracting Algorithm


1. Input the stego image.
2. Apply 2-level DWT transform on the stego image.
3. Load the key information K and assign the corresponding values in m, n
and p.
4. Starting from the top left corner of the m band, extract the p LSBs of each
band coefficient into the p MSBs of a new matrix.
5. Repeat this step n times in both dimensions; the resulting matrix is the secret image.
6. Output the secret image.
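A matching extraction sketch under the same assumptions as the embedding sketch above (PyWavelets, integer-like coefficients, and 'LL2' as the band name carried in the key) could look as follows.

import numpy as np
import pywt

def extract(stego, key):
    # Recover the n*n secret from the stego image using the key information.
    n, band, p = key                      # band is assumed to be 'LL2' here
    LL1, details1 = pywt.dwt2(stego.astype(np.float64), 'haar')
    LL2, _ = pywt.dwt2(details1[2], 'haar')
    coeffs = np.rint(LL2[:n, :n]).astype(np.int64)
    # The p LSBs of each coefficient become the p MSBs of the recovered pixel.
    return ((coeffs & ((1 << p) - 1)) << (8 - p)).astype(np.uint8)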

6. STEGO ATTACKS
When the stego image formed at the end of the embedding module is passed
over the public network, an intruder may acquire it and deliberately modify
it to distort the secret hidden behind it. The algorithm proposed in this paper
is robust to various kinds of stego attacks.

6.1 Geometric Attacks


These attacks, such as rotation, scaling and translation, are applied to the
2-D matrix of the stego image. Figure 11 shows the distortion created in the
stego image and the corresponding extracted secret images.

Figure 11. Extracted secret images after applying Geometrical distortions

Figure 11 shows the extracted secret images, which are translated, scaled and
rotated corresponding to the translation, scaling and rotation of the stego image.
When the stego image is rotated, the secret image, which is stored in the
LL2 band's coefficients in a sequential right-to-left, top-to-bottom manner,
also gets rotated, and hence we obtain a rotated extracted secret. The
same happens with the other geometrical attacks, as shown above.

6.2 Adding Noise


These attacks are applied on the stego image by adding noise such as
Gaussian noise, salt and pepper noise and speckle noise.


The figure below shows the application of these noises, with the default amount
of noise, to the stego image, along with the extracted secret images for each case.

Figure 12. Extracted secret images after applying noise

Figure 12 shows the extracted secret images after speckle, salt-and-pepper and
Gaussian noise have been added to the stego image.

6.3 Pixel Arithmetic


These attacks, such as transposition, thresholding, brightening and darkening,
are applied directly to the pixels of the stego image. The figure given below
shows the application of these kinds of attacks and the corresponding
extracted secrets.

Figure 13. Extracted secret image after pixel arithmetic operations


Figure 13 shows the extracted secret images, which are transposed, thresholded,
brightened and darkened corresponding to the transposed, thresholded,
brightened and darkened stego images.

7. QUALITY MEASUREMENT TECHNIQUES


The quality of the stego image and the extracted secret image is measured
by calculating certain quality measurement metrics [6]. These metrics
give a quantitative comparison between the original image and the modified
image, and the quality may be assessed on the basis of these values. The
metrics used in this paper are as follows:

7.1 Peak signal to noise ratio (PSNR)


The PSNR measures the quality of reconstruction of a processed image.
This metric is used here for discriminating between the cover and stego image,
and its main advantage is easy computation. It is formulated as:

PSNR = 10 * log10( 255^2 / MSE ) dB    ...Eq. 1 [6]

A low value of PSNR shows that the reconstructed image is of poor quality.

7.2 Mean square error (MSE)


MSE is one of the most frequently used quality measurement techniques,
followed by PSNR. The MSE [6] can be defined as the average of
the squares of the differences between the intensities of the stego image
and the cover image. It is popularly used because of the mathematical
tractability it offers. It is represented as:

MSE = (1 / (M * N)) * Σ_(i=1..M) Σ_(j=1..N) [ f(i, j) - f'(i, j) ]^2    ...Eq. 2 [6]

where f(i, j) is the original image and f'(i, j) is the stego image, each of size
M * N. A large value of MSE means that the image is of poor quality.

7.3 Normalised Correlation (NK)


Normalised correlation measures the similarity between two images, i.e.
the original image and the stego image. Its value tends to one as the
difference between the two images tends to zero, and values farther from one
indicate poorer image quality [6]. Normalised correlation is formulated as:


NK = Σ_(i,j) [ f(i, j) * f'(i, j) ] / Σ_(i,j) [ f(i, j) ]^2    ...Eq. 3 [6]

7.4 Normalised absolute error (NAE)


The NAE [7] measures how distant the modified image is from the
original image, with a value of zero being a perfect fit. The normalised
absolute error can be calculated as:

NAE = Σ_(i,j) | f(i, j) - f'(i, j) | / Σ_(i,j) | f(i, j) |    ...Eq. 4 [6]
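For reference, a small Python sketch of these four metrics follows. It is an illustration consistent with Eqs. 1-4, not code taken from [6]; the function names are assumptions.

import numpy as np

def mse(f, g):
    # Mean square error between the original f and the modified g (Eq. 2).
    return np.mean((f.astype(np.float64) - g.astype(np.float64)) ** 2)

def psnr(f, g, peak=255.0):
    # Peak signal to noise ratio in dB (Eq. 1); higher means the images are closer.
    e = mse(f, g)
    return float('inf') if e == 0 else 10.0 * np.log10(peak ** 2 / e)

def nk(f, g):
    # Normalised correlation (Eq. 3); tends to one as the images coincide.
    f, g = f.astype(np.float64), g.astype(np.float64)
    return np.sum(f * g) / np.sum(f ** 2)

def nae(f, g):
    # Normalised absolute error (Eq. 4); zero means a perfect fit.
    f, g = f.astype(np.float64), g.astype(np.float64)
    return np.sum(np.abs(f - g)) / np.sum(np.abs(f))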

8. EXPERIMENTAL SUMMARY
On the basis of the formulae discussed above, various sets of cover and
secret images are compared. The cover images and the secret image used
are shown below in Figure 14 and Figure 15, respectively.

Figure 14. Cover images (.bmp)

Figure 15. Secret Image embedded (.bmp)

Figure 16. Four cases with various sizes of cover and secret image

Figure 16 shows the four cases, each with a different combination of cover
and secret image sizes.


8.1 Comparison between cover image and stego image


The table given below shows the PSNR and MSE values comparing the
original cover image and the stego image formed at the end of the embedding
module, for all four cases defined.

Table 1. PSNR and MSE values

The highlighted region in Table 1 shows that the maximum PSNR value
obtained is 56.24 dB, for case 3 (cover size 512*512 and secret size 64*64),
which is a very high value.

8.2 Comparison between embedded secret image and extracted secret image
The table given below shows the PSNR and MSE values comparing the
embedded secret and the extracted secret obtained at the end of the extracting
module for all four cases.

Table 2. PSNR and MSE values


Table 2 shows the PSNR and MSE values for each case. The PSNR values
range from 17 dB to 18 dB, which is fair when smaller-sized images are compared.

8.3 Comparison between embedded secret and extracted secret after applying stego attacks
The table given below shows a PSNR comparison for one of the cases (case 4,
where the sizes of the cover and secret are 512*512 and 128*128 respectively)
between the secret image embedded at the sender and the secret extracted
after applying various stego attacks.

Table 3. PSNR value for case 4 after applying attacks

Table 3 shows the PSNR values comparing the embedded secret and the
extracted secret after distortion of the stego image (for case 4 only). These
range from 7 dB to 13 dB, approximately 40%-75% of the PSNR value
obtained (from Table 2) when no attack was performed on the stego image.

9. SIMULATION RESULTS
In this section, an experiment is carried out to demonstrate the efficiency of the
proposed method. The proposed scheme has been simulated using
MATLAB 7.6 running on a Windows 7 platform. An 8-bit grayscale image
of size 256*256 is used as the cover image to form the stego image, concealing a
90*90 secret image. Both the secret image and the cover image are in
.png format.


Figure 17. Boatman.png
Figure 18. Pout.png

Our proposed algorithm was applied to this pair of cover and secret images
with the given sizes, and the respective PSNR was calculated. The PSNR
obtained by the existing method proposed in [7] was then compared with our
PSNR values.

Table 4. Comparison of Results

PSNR value
Existing method [7]: 32.18
Proposed method: 46.77

The PSNR value between the original cover image and the stego image
computed by our proposed method was found to be better than that of the
existing method [7]: it is about 45% higher than the PSNR value obtained
with the existing method.

10. CONCLUSIONS
The proposed model for image steganography is a simple, secure and robust
technique for image hiding that provides good embedding capacity, where the
maximum size of the secret is limited by the LL2 band, which is one quarter
of the cover image in each dimension. The stego image formed using the
proposed algorithm appears to be the same as the cover image, offering the
high PSNR values shown in Table 1 and Table 2. The stego image is partially
robust to various geometrical and statistical attacks but still delivers the exact
pattern of the secret to the receiver even if the stego image appears to be highly
distorted. The PSNR value comparing the embedded secret and the extracted
secret after distortion of the stego image is approximately 40%-75% of the
PSNR value obtained when no attack was performed on the stego image.
Also, the PSNR value between the original cover image and the stego image,
as computed by our proposed method, was found to be about 45% higher than
the PSNR value calculated from the existing method.


11. REFERENCES

[1] L. Marvel, C. G. Boncelet, Jr., and C. T. Retter, "Spread spectrum image steganography", IEEE Trans. Image Process., Vol. 8, No. 8, pp. 1075-1083, Aug. 1999.

[2] Ying Wang and Pierre Moulin, "Perfectly Secure Steganography: Capacity, Error Exponents, and Code Constructions", IEEE Trans. on Information Theory, Vol. 54, No. 6, June 2008.

[3] Po-Yueh Chen and Hung-Ju Lin, "A DWT Based Approach for Image Steganography", International Journal of Applied Science and Engineering, 2006, Vol. 4, No. 3, pp. 275-290.

[4] Yedla Dinesh and Addanki Purna Ramesh, "Efficient Capacity Image Steganography by Using Wavelets", International Journal of Engineering Research and Applications (IJERA), Vol. 2, Issue 1, Jan-Feb 2012, pp. 251-259.

[5] Prajakta Deshmane and S. R. Jagtap, "Skin Tone Steganography for Real Time Images", International Journal of Engineering Research and Applications (IJERA), Vol. 3, Issue 2, Mar-Apr 2013, pp. 1246-1249.

[6] Sumathi Poobal and G. Ravindran, "The Performance of Fractal Image Compression on Different Imaging Modalities Using Objective Quality Measures", International Journal of Engineering Science and Technology (IJEST), Vol. 2, Issue 1, Jan-Feb 2011, pp. 239-246.

[7] K. B. Shiva Kumar, K. B. Raja, R. K. Chhotaray and Sabyasachi Pattnaik, "Performance Comparison of Robust Steganography Based on Multiple Transformation Techniques", International Journal of Comp. Tech. App., Vol. 2(4), July-Aug 2011, pp. 1035-1047.


Efficient Neighbor Routing in Wireless Mesh Networks
V. Lakshmi Praba
Assistant Professor
Govt.Arts College for Women, Sivaganga, India

A. Mercy Rani
Research Scholar
Manonmaniam Sundaranar University, Tirunelveli, India

ABSTRACT
Wireless Mesh Network (WMN) is a rising technology in the wireless field with
advanced features such as self-healing, self-configuring, low deployment cost,
easy network maintenance and robustness. Recent research has paid increasing
attention to efficient route construction in such networks. An efficient route can
be constructed by choosing the best neighbor for transmitting the packets. The
on-demand protocol AODV selects a route based only on the minimum hop count,
but this is not enough for constructing the best route; node energy is also an
important constraint. This paper considers the neighbor with maximal net energy
for routing the packets, and the AODV protocol is enhanced to construct a route
with maximal energy. For the performance evaluation, the packet delivery ratio,
dropped packets and energy consumed per packet were analyzed using the NS-2
simulator by varying energy ranges. The observed results show a substantial
increase in packet delivery ratio and a decrease in dropped packets and energy
consumed per packet.
Keywords
Net Energy, Efficient Neighbor, Energy Consumption, Route lifetime.

1. INTRODUCTION
Wireless Mesh Network (WMN) is a communication network made up of
radio nodes organized in a mesh topology. Wireless mesh networks often
consist of mesh clients, mesh routers and gateways. The mesh clients are
often stationary devices, laptops, mobile phones and other wireless devices.
The mesh routers forward messages to and from the gateways and also
forward packets to distant nodes through other routers located within
a few hops. A gateway may connect to the Internet by a wired or wireless
link. A mesh network is reliable and provides redundancy: when one node
fails in the network, the remaining nodes can communicate with each other,
directly or through one or more intermediate nodes [1][2]. WMNs possess
the advanced features of robustness, wide area coverage, easy network
deployment and maintenance, self-healing, self-configuring, low
deployment cost and self-organizing. Due to these features, WMNs are mainly
used in healthcare, disaster recovery, home automation, historical
monuments and industry [3].


On-demand routing protocols such as AODV (Ad hoc On-Demand Distance
Vector) [4] and DSR (Dynamic Source Routing) [5] are designed to find the
route from the source to the destination node using minimum hop count. It is
not assured that all minimum-hop-count routes will always lead to quick and
successful delivery of packets to the destination. A node failure in the route
would frequently trigger a route discovery to find another path for transmitting
the packets. On-demand protocols are designed to obtain routing information
only when it is desired, and the nodes maintain only the desired routes. The
drawback of this approach is that, in intermittent-data applications and dense
population scenarios, route discoveries increase whenever a new route is
requested [6]. Frequent route discovery attempts lead to high route discovery
latency and can affect the network performance.
In some applications there is a need for a stronger route that can deliver more
packets in a shorter time and increase the route lifetime. This can be achieved
if the nodes participating in the route have the maximum net energy to support
the communication. Node energy is an important constraint for selecting the
best node to forward the packets, since a maximum-energy node will stay active
for a longer time on a route, which improves the network performance by
increasing the throughput and route lifetime. In this paper we aim to enhance
the AODV protocol for mesh networks to construct efficient routes based on
maximum-net-energy nodes. The paper is organized as follows: Section 2 deals
with the architecture of WMN; Section 3 reviews related work; Section 4
discusses the proposed AODV in WMN; Section 5 describes the simulation
process and results; and Section 6 presents the conclusion and future scope.

2. ARCHITECTURE OF WIRELESS MESH NETWORK


The architecture of a Wireless Mesh Network consists of mesh routers, mesh
clients and a gateway. Mesh clients are mobile devices such as mobile
phones, laptops and PDAs, while mesh routers and gateways are immobile
nodes. The immobile mesh routers form the mesh backbone network. Mesh
clients access the network through mesh routers as well as by directly
connecting with each other. The gateway is also a mesh router, with a
high-bandwidth wired connection to the Internet. Figure 1 shows the
architecture of a Wireless Mesh Network.

Figure 1. Architecture of Wireless Mesh Network


The mesh backbone is connected to the Internet through the gateway using a
wired connection, whereas the other connections in the network, such as mesh
clients to mesh routers, are wireless. The mesh routers are connected with
each other to share their information; the Internet connection is optional. The
mesh routers and mesh clients are connected in a multihop style: each mesh
router and mesh client is connected to more than one mesh router or mesh
client, so that if a mesh client or mesh router in the network fails, a different
route for sending the packets to the destination is found automatically.

3. RELATED WORK
Energy-based routing has been studied in multihop wireless communication
networks. A few of the important findings in WMN and ad hoc networks are
listed below.
Yumei et al. [6] proposed a routing protocol which exploited the maximal
minimal nodal remaining energy concept and balanced the nodal energy
consumption. This protocol finds the minimal nodal remaining energy of
each route in the route discovery process, then sorts the multiple routes by
descending nodal net energy and uses the route with maximal minimal net
energy to forward the data packets. Vazifehdan et al. [7] proposed
energy-aware routing algorithms for ad hoc networks with both battery-powered
and mains-powered nodes; the results showed that they reduce the routing
overhead and increase the network lifetime.
Visu et al. [8] proposed energy-efficient routing protocols using an Artificial
Bee Colony (ABC) based routing algorithm. The performance of the

proposed algorithm was discussed in terms of response time and throughput.
Getsy et al. [9] proposed a new on-demand protocol, E2AOMDV, for saving
battery energy in a dense mobile network with high traffic loads. Ajit Singh
et al. [10] presented a survey on energy-efficient routing protocols for
wireless ad hoc networks, focusing on recent developments and modifications
in this widely used field.
Shivendu et al. [11] measured the energy consumption in traffic models
using the routing protocols AODV, OLSR and AOMDV and observed
that AOMDV consumed less energy than OLSR and AODV with increasing
number of nodes, average speed and send rate. Antonio et al. [12] proposed a
novel routing algorithm for 802.11-based wireless mesh networks called
Energy and Throughput-aware Routing (ETR). The design objectives of
ETR were to provide flows with throughput guarantees and to minimize the
overall energy consumption in the mesh network.
Annapurna et al. [13] proposed a protocol that combines two energy cost
metrics in a single protocol and evaluated its performance against the two
protocols chosen for combination and against the traditional AODV,
analyzing various performance metrics.
All the above works proposed routing protocols to construct energy-efficient
routes. This paper proposes an enhanced version of AODV that selects the
efficient neighbor with the maximum net energy for routing the packets, since
a maximum-energy node will stay active for a longer time on the route.

4. PROPOSED AODV IN WMN


In a WMN, route and link failures occur when the nodes on the path have low
power or energy, which leads to frequent route discovery. This can be avoided
by selecting the node with the maximum net energy for transmitting the packets
to the destination, which clearly increases the performance of the network.
Considering this, the proposed protocol has been designed in WMN to select
the efficient neighbor with maximum net energy. The proposed protocol is
compared with AODV for the energy ranges 0-25, 25-50, 50-75, 75-100 and
100-125 joules.
4.1 Efficient Neighbor Selection
Selection of the maximum-energy neighbor for transmitting packets from
source to destination is the key approach used in this paper. The proposed
protocol is an extension of AODV. In this proposed protocol, each node in
the network transmits HELLO messages carrying its net energy value.
Initially, each node in the network is assigned the maximum energy of the
energy range. Adjacent nodes' energy values are stored in the

neighbor table of the communicating nodes, along with the neighbor-id. The
energy threshold is the minimum energy of the energy range. Nodes are
allowed to participate in the route selection process when their net energy
value is above the energy threshold.
The control packets RREQ and RREP of the AODV protocol are used for
setting up the communication with the nearby nodes. In the proposed
protocol the RREQ packet is extended with two additional fields,
RREQ_ENERGY and FW_NBR, and the RREP packet with one
additional field, RREP_ENERGY.
The RREQ_ENERGY field is assigned the ENERGY_THRESHOLD
value; a node whose energy is below the ENERGY_THRESHOLD
discards the packet. The FW_NBR field stores the node-id of the neighbor
that possesses the maximal net energy value among the neighbors. The
RREP_ENERGY field in the RREP packet is assigned the node's net
energy at the time of the reply.
The MaxNBR() procedure integrated into the existing AODV protocol
determines the node with the maximal net energy among the neighbors of
the current node. The proposed protocol maintains the list of neighbors of
each node. The MaxNBR() procedure, given a node-id as a parameter,
retrieves the neighbor-id which possesses the maximal net energy; the
returned neighbor-id is stored in the FW_NBR field of the RREQ packet.
The source sends the RREQ packet to all its neighbors. Each neighbor
receives the RREQ packet and checks whether its energy is above or below
RREQ_ENERGY to accept or reject the route, as explained in the algorithm below.
4.2 Algorithm
Step 1: Initialize ENERGY_THRESHOLD as the minimum energy of the energy range.
Step 2: The source node does the following before sending the RREQ packet:
i) Set the RREQ_ENERGY field of the RREQ packet to ENERGY_THRESHOLD.
ii) Call the MaxNBR() procedure to:
a) find the neighbor which has maximum net energy;
b) assign its node-id to the FW_NBR field of the RREQ packet.
Step 3: The node that receives the RREQ packet does the following:
i) Calculate the net energy of the node using the energy model.
ii) If CurNode's net energy is less than RREQ_ENERGY and CurNode is
not the destination, drop the packet and go to Step 4.
iii) If CurNode is the destination, go to Step 5.
iv) Otherwise, if the FW_NBR field of the RREQ matches CurNode.id: if
CurNode is the destination, go to Step 5; otherwise forward the RREQ packet.
Step 4: Step 3 is repeated for each neighbor until the destination is found.
Step 5: Send the RREP packet to select the route for transmitting the data packets.
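The neighbor-selection logic of Steps 2 and 3 can be sketched in Python as an illustration; this is not the authors' NS-2 implementation, and the data structures and names (neighbor_table, the rreq dictionary fields, the return strings) are assumptions made for the example.

ENERGY_THRESHOLD = 75  # minimum energy of the chosen range, e.g. 75-100 J

def max_nbr(node, neighbor_table):
    # The MaxNBR() procedure: return the neighbor-id with maximal net energy,
    # ignoring neighbors whose net energy is below the threshold.
    candidates = {nbr: e for nbr, e in neighbor_table[node].items()
                  if e >= ENERGY_THRESHOLD}
    return max(candidates, key=candidates.get) if candidates else None

def handle_rreq(cur_node, cur_energy, rreq, destination, neighbor_table):
    # Decide whether the current node drops, answers or forwards an incoming RREQ.
    if cur_energy < rreq['RREQ_ENERGY'] and cur_node != destination:
        return 'drop'                      # Step 3(ii): below threshold, not destination
    if cur_node == destination:
        return 'send RREP'                 # Step 3(iii) / Step 5
    if rreq['FW_NBR'] == cur_node:
        rreq['FW_NBR'] = max_nbr(cur_node, neighbor_table)
        return 'forward'                   # Step 3(iv): this node is the selected forwarder
    return 'ignore'                        # not the selected forwarder

# Illustrative neighbor table for the source S (values chosen so that C,
# the neighbor with the maximal net energy, is selected as in the example).
neighbor_table = {'S': {'A': 87.5, 'B': 94.1, 'C': 98.8}}
print(max_nbr('S', neighbor_table))        # -> 'C'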


When the destination is reached, the net energy of the current node is assigned
to the RREP_ENERGY field of the RREP packet. The further process is similar to
AODV: the route is made available by unicasting a RREP back to the
originator of the RREQ. Each node receiving the request caches a route
back to the originator of the request, so that the RREP can be unicast from
the destination along a path to that originator, or likewise from any
intermediate node that is able to satisfy the request. Finally, the proposed
protocol selects a route with maximal net energy for transmitting the
packets from the source to the destination.

Figure 2. Efficient neighbor selection process


Figure 2 shows the efficient neighbor selection process of the proposed
protocol for the energy range 75-100. Each node in the network is labelled
with its net energy value, and RREQ_ENERGY is assigned the
ENERGY_THRESHOLD of 75. Initially, the neighbors A, B and C of the
source S receive the RREQ packet. The neighbor which has the maximum net
energy is chosen for forwarding the packets; in this example, neighbor C is
selected. Next, C forwards the RREQ packet to its neighbors E and F, from
which E is selected. E forwards the RREQ packet to its neighbors I and J;
I is rejected since its net energy is less than the ENERGY_THRESHOLD and
is not even considered by the MaxNBR() procedure, so J is selected for
forwarding the packets. J then forwards the packet to its neighbor D. Even
though the net energy of D is less than the ENERGY_THRESHOLD, it is
considered in the selection process since it is the destination node. After
reaching the destination, D sends the RREP packet to the source through the
selected nodes. Throughout the process, the route with maximum-net-energy
nodes (S->C->E->J->D) is selected for transmitting the data packets to the
destination. Table 1 summarizes the above process.

Table 1. Efficient neighbor selection process summary

Source/intermediate node on the route | Neighbors receiving the RREQ | Rejected (net energy < energy threshold) | Selected for forwarding (maximum net energy)
S | A, B and C | - | C
C | E, F and G | - | E
E | I and J | I | J
J | D | - | D

5. SIMULATION PROCESS & RESULTS


The simulations are performed using Network Simulator 2 (NS-2) [14]. For
the performance evaluation of the proposed protocol in WMN, a network
with 5 mesh clients, 8 mesh routers and one gateway has been created. The
simulation layout is shown in Figure 3. The gateway, mesh clients and
mesh routers are placed in an area of 800 x 800 meters. Mesh routers are
placed at fixed positions so that they assist the mesh clients in establishing
reliable connections to the gateway and to other mesh routers and mesh clients.
CBR connections are created between the gateway, mesh routers and mesh clients.

Figure 3. Simulation layout


The simulation layout shown in Figure 3 serves as the basis for evaluating
the performance of the proposed protocol. Table 2 shows the simulation
parameters used for the evaluation.

Table 2. Simulation parameters
Parameter | Value
Simulator | NS-2
Simulation area | 800 x 800 m
Simulation time | 200 s
Transmission range | 200 m
Packet size | 512 bytes
Transmission rate | 1 Mb
No. of mesh clients | 5
No. of mesh routers | 8
No. of gateways | 1
Routing protocol | AODV
Packets | CBR
Energy ranges | 0-25, 25-50, 50-75, 75-100, 100-125 joules
Initial energy | 25, 50, 75, 100, 125 J
Energy threshold | 0, 25, 50, 75, 100 J
rxPower | 35.28e-3 W
txPower | 31.32e-3 W
idlePower | 712e-6 W
sleepPower | 144e-9 W
transitionTime | 0.003 s

5.1 Performance Metrics


5.1.1 Packet Delivery Ratio
The ratio between the number of packets successfully received at the
destinations and the total number of packets sent by the sources.
5.1.2 Dropped Packets
The number of packets dropped during transmission.
5.1.3 Energy Consumed per Packet
The total energy consumption divided by the total number of packets
received at the destination. This metric reveals the energy efficiency of the
proposed protocol.
This paper considers the packet delivery ratio, dropped packets and energy
consumed per packet as the performance metrics, which were analyzed by
varying the energy ranges.
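These three metrics can be computed from the simulation counters as follows; this is a simple illustrative sketch, and the counters themselves come from the simulator's trace files, which are not shown here.

def packet_delivery_ratio(received, sent):
    # Packets successfully received at the destinations / packets sent by the sources.
    return received / sent

def dropped_packets(sent, received):
    # Packets lost during transmission.
    return sent - received

def energy_per_packet(total_energy_joules, received):
    # Total energy consumed divided by the packets received at the destination.
    return total_energy_joules / received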
5.2 Simulation Results
The performance analysis was conducted on the simulation layout to
evaluate the performance of the proposed protocol and AODV in WMN by
varying energy ranges. The simulation results are shown in the form of graphs. The


proposed protocol is compared with AODV for the energy ranges 0-25, 25-50,
50-75, 75-100 and 100-125 joules. Figures 4 to 6 show the graphs for the
considered metrics.
Figure 4 shows the performance of the proposed protocol and AODV on the
basis of the considered performance metrics by varying energy ranges. For each
energy range, its respective lowest energy is set as the energy threshold.

Figure 4. Packet delivery ratio Vs Energy


From Figure 4, it is observed that the PDR value of the proposed protocol is
better than that of the existing AODV for the various energy ranges.

Figure 5. Dropped packets Vs Energy


From Figure 5, it is observed that the dropped packets have decreased
significantly when compared to the existing AODV.


Figure 6. Energy consumed per packet Vs Energy


From Figure 6, it is observed that the proposed protocol consumed less
energy for transmitting a packet in all energy ranges when compared with
the existing AODV.

6. CONCLUSION
In this paper an effective routing protocol is proposed to improve the
performance of the wireless mesh network. The proposed protocol selects
the efficient neighbor for route construction. Due to this efficient
neighbor selection, the route lifetime is increased and more packets are
transmitted than with the existing AODV, which increases the PDR value
and decreases the dropped packets. Even though the proposed protocol uses
the maximum-energy neighbor for routing, the energy consumed per packet
is reduced compared with the existing AODV. Future work will focus on
finding the optimal route for transmitting the packets by considering various
other metrics.

7. REFERENCES
[1] http://en.wikipedia.org/wiki/Wireless_mesh_network
[2] Guokai Zeng, Bo Wang, Yong Ding, Li Xiao and Matt W. Mutka, "Efficient Multicast Algorithms for Multichannel Wireless Mesh Networks", IEEE Transactions on Parallel and Distributed Systems, Vol. 21, No. 1, January 2010, pp. 86-99.
[3] "What's so good about mesh networks?", Daintree Networks, www.daintree.net
[4] Perkins C. E. and Royer E. M., "Ad hoc On Demand Distance Vector (AODV) Routing" (Internet draft), Mobile Ad-hoc Network (MANET) Working Group, IETF, 1998.
[5] David B. Johnson, David A. Maltz and Josh Broch, "DSR: The Dynamic Source Routing Protocol for Multi-Hop Wireless Ad Hoc Networks". www.monarch.cs.rice.edu/monarch-papers/dsr-chapter00.pdf
[6] Yumei Liu, Lili Guo, Huizhu Ma and Tao Jiang, "Energy efficient on-demand multipath routing protocol for multi-hop ad hoc networks", Proceedings of ISSSTA-08, IEEE 10th International Symposium on Spread Spectrum and Applications, Bologna, Italy, August 25-27, 2008, pp. 592-597.
[7] Vazifehdan Javad, Prasad Venkatesha, Onur Ertan and Niemegeers Ignatius G. M. M., "Energy-aware routing in wireless ad hoc networks with mains-powered nodes", Future Network and Mobile Summit, 2010, IEEE Xplore, Print ISBN: 978-1-905824-16-8, pp. 1-12.
[8] P. Visu, J. Janet, E. Kannan and S. Koteeswaran, "Optimal Energy Management in Wireless Ad hoc Network using Artificial Bee Colony Based Routing Protocol", European Journal of Scientific Research, 2012, ISSN 1450-216X, Vol. 74, No. 2, pp. 301-307.
[9] Getsy S. Sara, Neelavathy Pari S. and Sridharan D., "Energy Efficient Ad Hoc On Demand Multipath Distance Vector Routing Protocol", International Journal of Recent Trends in Engineering, Vol. 2, No. 3, November 2009, pp. 10-12.
[10] Ajit Singh, Harshit Tiwari, Alok Vajpayee and Shiva Prakash, "A Survey of Energy Efficient Routing Protocols for Mobile Ad-hoc Networks", (IJCSE) International Journal on Computer Science and Engineering, Vol. 02, No. 09, 2010, pp. 3111-3119.
[11] Shivendu Dubey and Rajesh Shrivastava, "Energy Consumption using Traffic Models for MANET Routing Protocols", International Journal of Smart Sensors and Ad Hoc Networks (IJSSAN), Volume 1, Issue 1, 2011, pp. 84-89.
[12] Antonio De La Oliva, Albert Banchs and Pablo Serrano, "Throughput and energy-aware routing for 802.11 based mesh networks", Computer Communications, Elsevier, 2012.
[13] Annapurna P. Patil, K. Rajani Kanth, Bathey Sharanya, M. P. Dinesh Kumar and Malavika J., "Design of an Energy Efficient Routing Protocol for MANETs based on AODV", IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 4, No. 1, July 2011, ISSN (Online): 1694-0814. ijcsi.org/papers/IJCSI-8-4-1-215-220.pdf
[14] NS-2 Network Simulator, http://www.isi.edu/nsnam/ns.


Content Based Messaging Model for Library Information System
Surbhi Agarwal, Chandrika Chanda, Senthil Murugan B.
School of Information Technology and Engineering,
Vellore Institute of Technology
Vellore, Tamil Nadu, India.

ABSTRACT
This paper presents an application that introduces a publish-subscribe model based
service into the existing library information system. It applies a content-based
messaging model in order to provide a due-date reminder service through SMS in
the library system. The users of the library can subscribe to receive SMS
notifications about the due dates of their books at their convenience, and the
library system publishes the information to its users. The users themselves select
when they want to receive their notifications; hence this model is based on
content-based filtering. This paper proposes the addition of such a module to
existing library information systems in an attempt to enhance them.
Keywords
Content-based messaging, library management system, publish-subscribe, messaging
model, web services.

1. INTRODUCTION
In today's cyber-world, it is becoming increasingly essential for all day-to-day
facilities to be available in the form of web services. The majority of the
student population finds it more convenient for all services and notifications
to be delivered to their mobile phones as SMS instead of e-mails. Many
libraries that have a web interface provide services like searching for
available books, reserving books that are loaned out and requesting for new
books through their websites. Services like re-issuing of books or reminders
about books that are due are also provided, by sending a notification e-mail.
The Short Messaging Service (SMS) has almost made e-mails obsolete in
many walks of life.
The advancement of technology, especially in the areas of computer and
networking technologies, marks an information era. This change needs to
be reflected in all walks of life in order to keep up with technological
advancements. There is a need to improve the traditional library
advancements. There is a need to improve the traditional library
management systems with the inclusion of emerging technological trends
that are more user-friendly and make the library experience traditional yet
cutting edge. The most common medium for providing any service today is
the internet, and so it is essential for the library to start offering web services
to maintain its clientele.

Existing enterprise library management systems provide solutions for the basic
needs of library management from an administrative perspective.
The only interaction that the client of the library has with the system is
while accessing the OPAC, and that again is not usually available on the
internet. Newer functionalities like auto-generated reminders, subscribing to
books of interest and requesting renewals could be implemented in the
existing systems. These features can be provided by the use of concepts like
the publish-subscribe messaging model.
This paper presents a messaging model enhancement for an existing library
system to implement the features of auto-generated reminders using Short
Messaging Service (SMS) and notification services to interested subscribers
using a web based platform. This system requires the Java 2 Enterprise
Edition, Apache Tomcat Server, JSP, and Oracle 10g Database on a
Windows 7 operating system for its implementation.
This paper proceeds as follows. In the next section, the related works in this
field are described. Section 3 describes the proposed work. Section 4
describes a simple architecture for the proposed system along with a short
algorithm. Section 5 discusses the results after implementation of the system
and section 6 gives a conclusion.
2. RELATED WORKS
The existing library management systems are based on a two-tiered client-server
architecture [1]. This architecture has several disadvantages, since the
development cycle is quite long and the amount of resources required from
the client side is significant. Problems related to installation,
maintenance and scalability also grow due to the usage of this model [1].
In India, very few colleges provide a web-based library service for the
students and the staff. These libraries implement an Integrated Library
System, which is an Enterprise Resource Planning system for libraries. It
comprises relational databases, middleware software to interact with the
database and a visually aesthetic graphical user interface for students and
staff. It has various modules for acquisitions, cataloguing, classifying and
indexing all kinds of materials, serials and OPAC [2]. Most libraries have
their own system to which the user has limited access: library users generally
fill in forms and data entry operators are employed to enter the form contents
into the database. When the due date of an item passes, the user often forgets
and is unable to return the item on time. Thus he/she has to pay a large fine,
and other users who may have requested the book have to keep waiting.
During our research we found quite a few popular library management
systems. One such system is provided by Navayuga Infotech. The system
does have a 'generate overdue notification' module [3], but this module
does not send the generated notification to the borrowers (subscribers) of the

library service. Similar open source library management software is
Evergreen, which also provides management services to the library and only
searching facilities to the borrowers or patrons of the library [4]. Another
such software that we came across is PhpMyBibli, an open source integrated
library system which provides cataloguing and circulation features with no
mobile-based notification services [5].
The existing systems provide features for improving the management of the
library alone. They still do not provide any function for enhanced user
experience. In the existing systems all service notifications are sent to the
library users through emails. It is also common knowledge that students in
general are lax about checking their emails regularly. The above stated
problems have led to our proposal which will be able to solve most of these
issues and make the library experience more modern and more suited to the
younger generations of today.
3. PROPOSED WORK
As discussed in the previous section, the stated problems provide new
research scope for improving the library information system. We intend to
convert the system of sending notifications through emails into a system that
sends notifications through SMS to the mobile phones of the subscribers.
For this we use the publish-subscribe messaging model with content-based
filtering.
This system is able to solve the problems faced till now. Since there is
no need to check emails, the problems of slow internet access and
intranet-based service are eliminated. Also, students who instantly check
their SMS inbox will be reminded on time about the due dates of their library
items and hence will save on the large fines they would have to pay on
overdue items.
The system will have a backend database which will be accessed using JSP
at the front end. There will be a login module for students, staff and
administrators. The entire system will be web based. We will also provide a
new-book arrival notification service to all interested subscribers.
Subscribers will also be notified on the arrival of a book that they have
requested but which was not available earlier.
Since the system is being developed as a web application, it will
be reusable and can be implemented by any library vendor to improve
his/her business. As print media is almost dying [6], we think that many
such libraries will be interested in looking for ways to attract customers and
keep their business running by using simple web applications to provide
quality services to their customers.

4. METHODOLOGY
The three main components of the publish-subscribe model are the subscriber,
the publisher and the message. The subscriber is the one who subscribes to
receive notifications about some information. The publisher is the one who
supplies or sends the information to all its subscribers. The message is the
way in which information is communicated between publishers and
subscribers. The event-driven, or notification-based, interaction pattern is most
commonly used for inter-object communication [7]. The notification
service described above serves as a middle layer between publishers and
subscribers, so that each publisher does not need to know all possible
subscribers.
The publisher and the subscriber both communicate with a single entity, the
notification service. All the subscriptions associated with the respective
subscribers are stored by the notification service. This service also
dispatches the published notifications to the correct subscribers [8]. The
filtering process is based on the content, topic or type of the message.
Content-based filtering is most suitable for our application, since the
subscribers receive the messages which match the constraints defined by the
users themselves. In this case the user (student or staff) decides when he/she
wishes to receive the reminder notification, a number of days in advance,
and also decides which updates regarding new books he/she wishes to receive.

Figure 1. Publish-Subscribe Architecture Model for the Library Reminder Notification Service

Figure 1 shows the basic architecture of the publish-subscribe model using the
content-based filtering technique. As shown in the diagram, the librarian plays
the role of the publisher, and the students and staff, that is the borrowers, play
the role of the subscribers. The publisher publishes the services and the
subscriber subscribes to these services according to the filtering constraints.
The subscriber service also allows the subscriber to subscribe or unsubscribe
to any registered service.

send(msg, subscriber_id) {
    if (subscriber_id is a student id) {
        no = retrieve mobile number from the student table;
        delvr_msg(no, msg);
    }
    else if (subscriber_id is a staff id) {
        no = retrieve mobile number from the staff table;
        delvr_msg(no, msg);
    }
}

notify(publisher_id, subscriber_id) {
    // date_diff: days between today and the due date of the borrowed item
    subscribe_filter = date_diff;
    switch (subscribe_filter) {
        case 1: send(msg, subscriber_id); break;
        case 2: send(msg, subscriber_id); break;
        case 3: send(msg, subscriber_id); break;
    }
}

notifynewbook(publisher_id, subscriber_id, category) {
    subscribe_filter = category;
    switch (subscribe_filter) {
        case category1: send(msg, subscriber_id); break;
        case category2: send(msg, subscriber_id); break;
        ...
        case categoryN: send(msg, subscriber_id); break;
    }
}

pub_sub_msg() {   // main entry point: the librarian is the publisher,
                  // students and staff are the subscribers
    subscriber_id = retrieve each subscriber id according to date_diff;
    notify(publisher_id, subscriber_id);
    notifynewbook(publisher_id, subscriber_id, category);
}

Figure 2. The proposed algorithm

Figure 2 describes the algorithm used to implement the system. The algorithm
has two primary modules, each of which implements the publish-subscribe
model. The pub_sub_msg() function acts as the entry point. It identifies the
publisher, that is, the librarian, and the subscribers, that is, the staff and
students who avail themselves of this library facility. In the next step,
content-based filtering is applied in both modules. The content pertains to the
number of days before the due date at which the notification must be sent, as
well as the list of book categories subscribed to. The subscribers are
identified according to their subscription requests for the reminder time of
the due date of books and for the availability of books of interest. In the
first module, the difference between the due date of the book and the present
date is calculated and a reminder message is sent accordingly, based on the
subscriber's preference. In the second module, the subscribers specify the
categories they are interested in, and whenever a new book is added to one of
those categories, a notification is sent to the subscriber.
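A minimal sketch of the date-difference check used by the first module is given below, using the standard java.time API; how the due date is retrieved from the database is not shown, and the method name is our own illustration rather than the system's code.

import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

class ReminderCheck {
    // True when the reminder should go out today, i.e. when the number of days
    // remaining until the due date equals the lead time chosen by the subscriber.
    static boolean isDueForReminder(LocalDate dueDate, int reminderDaysBefore) {
        long daysLeft = ChronoUnit.DAYS.between(LocalDate.now(), dueDate);
        return daysLeft == reminderDaysBefore;
    }
}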
To implement our library notification system, we used Java 2 Enterprise
Edition to build a web service in the NetBeans 7.2 IDE. The service uses JSP
pages for the front end and standard Java classes and Servlets for the
business logic. The notifications are sent using URL modification, the service
is hosted on an Apache Tomcat 6.0 server, and an Oracle 10g relational
database on the Windows 7 operating system is used for the backend.
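The URL-modification delivery step could look roughly like the sketch below; the gateway host, path and query parameters are placeholders for whichever SMS provider is used, not the actual service endpoint.

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class SmsSender {
    // Deliver a message by appending the recipient number and text to the
    // gateway URL (placeholder endpoint) and issuing an HTTP GET.
    static int delvrMsg(String mobileNo, String msg) throws IOException {
        String query = "to=" + URLEncoder.encode(mobileNo, StandardCharsets.UTF_8)
                     + "&text=" + URLEncoder.encode(msg, StandardCharsets.UTF_8);
        URL url = new URL("https://sms-gateway.example.com/send?" + query);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        return conn.getResponseCode();   // the gateway's HTTP status code
    }
}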
5. RESULTS
The proposed system was implemented as a web application. It was
deployed on the local server of a machine and temporary data was inserted
into the database to test the working of the system.

Figure 3. Notification SMS

Figure 3 shows the messages received for books due and for new additions
to the library. Messages are sent to subscribers, but not to all of them: the
subscribers to whom a message is delivered are selected according to the
constraints they specified, as per the content-based filtering technique.
Content-based filtering is applied by categorizing the contents as subscription
options and allowing the subscriber to select the content to which he/she
wishes to subscribe. The selection of the number of days before the due date at
which the reminder notification is sent, as well as the selection of the book
categories of interest, provides the constraints needed for content-based
filtering in the publish-subscribe model.

Figure 4. Screenshot of the subscription page


Figure 4 shows a screenshot of the page displayed when a subscriber wishes to
subscribe to a category of books about which he/she wants to be notified.
Based on the options selected by the subscriber on this page, appropriate
notifications are sent to him/her whenever the category attribute of a new
book matches the constraints specified by any of the subscribers.
The data relating to users' subscriptions was retrieved from the database in
the form of the number of reminder days, and the appropriate message was sent.
Successful delivery of a message was displayed on screen, and
so was any delivery failure. On average, the messages were successfully
delivered to the subscribers, helping them avoid paying fines for the late
return of library items.
6. CONCLUSION
The student population will benefit greatly from this new system, and the
library services provided to students will be considerably enhanced in various
areas. Students often have limited finances, and paying large fines to the
library for delayed returns leads to irritation and frustration. This system
helps students remember at the right time and avoid paying fines. It also
informs students about new arrivals in which they have shown an interest, so
they do not need to visit the library every day to find out whether their
books have arrived. Since it is a web service, it can be deployed anywhere
with little maintenance, and any library vendor can make use of it. The
service thus improves library facilities and keeps them at par with the rapid
advancement of technology.
7. REFERENCES
[1] Yujun Li, Hao Zheng, Tengfei Yang, Zhiqiang Liu. Design and Implementation of a
Library Management System Based on the Web Service. Multimedia Information
Networking and Security (MINES), 2012 Fourth International Conference on,
pp. 433-436, 2-4 Nov. 2012. doi: 10.1109/MINES.2012.94
[2] Integrated Library Systems : http://en.wikipedia.org/wiki/Integrated_library_system
[Accessed on 10 April 2013]
[3] Navayuga Infotech Library Management System Brochure :
http://www.navayugainfotech.com/multimedia/lms.pdf [Accessed on 25 April 2013]
[4] Evergreen Library Management : http://evergreen-ils.org/ [Accessed on 25 April 2013]
[5] PhpMyBibli : http://en.wikipedia.org/wiki/PhpMyBibli [Accessed on 25 April 2013]
[6] Death of print media : http://www.davidakka.com/enterprise-
mobility/deathofprintmedia/ [Accessed on 25 April 2013]
[7] Oasis Web Services Base Notification : http://docs.oasis-open.org/wsn/wsn-
ws_base_notification-1.3-spec-os.htm [Accessed on 7 April 2013]
[8] Shehnaaz Yusuf. Survey of Publish Subscribe Communication System. Advanced
Internet Application and System Design, 24 December 2004


Building an Internal Cloud for IT Support Organisations: A Preview
S. M. M. M Kalyan Kumar
National Informatics Centre, India

Dr S. C. Pradhan
National Informatics Centre, India

ABSTRACT
Most of our computing environment has been designed as a centralized data centre
since our organization's creation. Users from various development projects deploy
their services and connect remotely to the data centre resources from all the
stations of NIC. Currently these servers are mostly underutilized due to the
static, conventional approaches used for accessing these resources. We therefore
built and prototyped a private cloud system called nIC (NIC Internal Cloud) to
leverage the benefits of a cloud environment and make optimal use of the
centralized resources. For this system we adopted a combination of various
techniques from the open source software community. The user base consists of
developers, web and database administrators, service providers and cloud users
from various projects of NIC. We can optimize resource usage by customizing
user-based template services on the virtualized infrastructure. It also increases
the flexibility of managing and maintaining operations like archiving, disaster
recovery and scaling of resources. In this paper, we describe the design and an
analysis of the implementation issues of internal cloud environments in NIC and
similar organizations.

Keywords
Internal Cloud, Open Source, Authentication, Virtualization.

1. INTRODUCTION
Cloud computing is a supercomputing model that offers services which efficiently
address a vast range of user requirements. It gives end users provisioning for
parallel and dynamic processing and offers virtualized, scalable, on-demand
resources over the internet. It eliminates the challenges of non-cloud
techniques in scaling resources up and down, upgrading hardware and software
components, and monitoring services. Further on, we discuss the appropriateness
of and necessity for cloud environments at organizations like NIC [1].
An Internal cloud aims to deliver many of the characteristics of public
cloud computing such as scalability and elasticity, the pooling of shared
infrastructure, user self-service, availability and reliability. However, by
taking an internal cloud approach, organizations can deliver on these goals
while still using their private physical resources, allowing them to retain
complete control and security over their data and applications. By giving
application owners better visibility into their resource usage, organizations
are able to more easily apply their strategies to enhance throughput. A
self-service interface, to which standardized services are published by the IT
provider, lets application owners and other internal users easily provision
resources dynamically.
Conventional, centralized approaches are not well suited to the dynamic and
non-uniform nature of resource requirements. The cloud is an effective reuse
model in which reusable services are deployed once and shared by many potential
consumers.
The internal cloud pattern can enhance the user experience and decrease
operational costs. It adapts to situations such as sudden increases or decreases
in the rate of demand for resources and outperforms traditional methods, which
fail in those situations. We can offer various heterogeneous combinations of
software stacks for different [12] project requirements, including database,
ticket, domain, Kerberos, mail, print, middleware, client, network, storage,
build, test and versioning services. The platform can host different
combinations of operating systems and software and provisions an on-demand
service environment, which increases productivity by offloading users from
managing these platforms. The seamless working of these platforms, from the
users' perspective, in a medium-sized organization like NIC justifies
considering such solutions for production environments.
National Informatics Centre (NIC) is a premier institute and government
software agency running various software projects at data centres spread
across the states of India. The data centres are well connected by a
high-bandwidth network backbone that provides rich network services. The
availability of this huge footprint of hardware resources offers a good
opportunity to implement large-scale cloud environments.
However, implementing the required private cloud architecture at
production-level data centres needs seamlessly working software, and such
software can be found in the open source community. Infrastructure
virtualization components like Xen or XCP [2], virtual desktop software like
XVP [9], storage virtualization/cluster components like SWIFT [5], and
orchestration components like CloudStack [3] and OpenStack [4] give wide
options for implementing the internal cloud.
In this paper, we propose an internal cloud architecture to deliver
infrastructure, platform and software as a service to developers and users
from various projects [1] of NIC, together with appropriate maintenance and
monitoring techniques to control the system. The rest of the paper is
organized as follows. Section 2 describes the building and design of the
system, Section 3 presents the deployment and its performance, and Section 4
concludes the paper.

2. BUILDING OF INTERNAL CLOUD

2.1 Key Technologies of the nIC Environment


In this section we present the key technology choices used to implement the
private cloud model, including virtualization software, user management and
orchestration software, VDI and storage service technologies.

2.1.1 Server Virtualization


The virtualization layer provides significant benefits for organizations
delivering internal cloud solutions through enhanced scalability and virtual
machine mobility. Server virtualization is at the root of any cloud setup,
segregating or aggregating the computing server pools. The computing pools
include central processing units (CPU), memory, disk, and I/O channels. It
works transparently to the hosted application and is intended to improve
stability, utilization, or ease of management of the system. Many
virtualization variants are available, including hardware virtualization,
para-virtualization, full virtualization and operating-system-level
virtualization, according to the type of resources available. Open source
solutions like Xen or XCP give a complete enterprise level of virtualization
services, using hardware and para-virtualization techniques.

Hardware Virtualization
Hardware supported virtualization is where the CPU has additional
hardware support/instructions to facilitate some common tasks usually seen
in virtualization. The hardware provides architectural support that facilitates
building a virtual machine monitor and allows guest OS's to be run in
isolation.

Para Virtualization
Para virtualization is the concept of making changes to the kernel of a guest
operating system to make it aware that it is running on virtual, rather than
physical, hardware, and so exploit this for greater efficiency or performance
or security. It gives more flexibility and security to the guest instances
running on the virtualized platform.

XCP
XCP runs as a bare-metal hypervisor and consists of a set of tools to
manage the virtual instances running on it. The architecture of XCP is
very simple and gives a production-ready service to end users. It is able
to create server pools consisting of one master node and multiple slave nodes.
It can scale the servers up and down easily and migrate VMs within the server
pool. A rich set of management software is also available to configure and
monitor the XCP platform, such as XenCenter, OpenXenManager and XenWebManager.

2.1.2 Desktop Services


The centralized virtual servers need to be accessed by the end users in order
to use those resources. Desktop virtualization provides access to remote
resources and enables IT to deliver the right desktops to meet the needs of
every user. With a centralized desktop service, users' data is well protected,
a rich set of resources is provisioned, and operational costs decrease. Users
can connect to their virtual instances through the VNC protocol, SSH, or XVP.

XVP
The Xen VNC Proxy (XVP) gives access to Xen-based virtual instances deployed
on XCP machines. XVP provides proper user management in the private cloud
setup, so that users can also access, start and stop the instances in the
virtual farms.

2.1.3 Storage Services

The storage services in a cloud are characterized by repeatable, automated
provisioning to the end users. Conventional storage devices like SAN, NAS and
local hard disks are turned into the infrastructure for these services and
provide a consistent, cost-effective solution. Various types of storage
services are available: permanent and transient, high and low latency, high
and low bandwidth, and high and low protection. Techniques like LVM, cluster
file systems, GlusterFS, DRBD and SWIFT together give seamless storage
services [13] in cloud environments.

SWIFT
Swift is a multi-tenant, highly scalable and durable object storage system
that was designed to store large amounts of unstructured data at low cost via
a RESTful http API. Swift is used to meet a variety of needs. Swift's usage
ranges from small deployments for "just" storing VM images, to mission
critical storage clusters for high-volume websites, to mobile application
development, custom file-sharing applications, data analytics and private
storage infrastructure-as-a-service.
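For illustration, one object could be stored through Swift's RESTful HTTP API roughly as follows; the proxy endpoint, account, container and object names are placeholders, and the authentication token is assumed to have been obtained from the identity service beforehand.

import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

class SwiftPutExample {
    // Upload one object with an HTTP PUT to /v1/{account}/{container}/{object}.
    static int putObject(String token, byte[] data) throws IOException {
        URL url = new URL("http://swift-proxy.example.com:8080/v1/AUTH_demo/backups/vm-image-1");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("PUT");
        conn.setRequestProperty("X-Auth-Token", token);   // token from the identity service
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(data);
        }
        return conn.getResponseCode();   // 201 Created indicates a successful upload
    }
}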

DRBD
DRBD [6] stands for Distributed Replicated Block Device, which replicates
data at the block level between two or more sites. DRBD is widely used as a
high-availability and disaster-recovery replication technology. DRBD takes the
data, writes it to the local disk and sends it to the other host; on the other
host, it writes the data to the disk there.

GlusterFS
GlusterFS [7] is a powerful network/cluster file system written in user space
which uses FUSE to hook itself into the VFS layer. GlusterFS takes a layered
approach to the file system, where features are added or removed as per
requirements. Though GlusterFS is a file system, it uses already tried and
tested disk file systems like ext3, ext4, xfs, etc. to store the data. It can
easily scale up to petabytes of storage, which is available to the user under
a single mount point.

2.1.4 Cloud Orchestration and Networking

The management of virtual resources is a critical step in forming the private
cloud. Open source tools like CloudStack and OpenStack give enough flexibility
to deploy an on-the-fly private cloud on the premises. With these tools we
provision self-service portals for virtual resources to the end users, along
with monitoring and metering of resource usage.

CloudStack
CloudStack can be used as an IaaS solution for building private clouds. It
enables compute orchestration, Network-as-a-Service, user and account
management, a full and open native API, resource accounting, and a first-class
user interface. CloudStack works in a monolithic way, with a management server
taking control of the whole setup. It has two modes of operation: basic, for
simple network setups, and advanced, for complicated network setups.

OpenStack
The OpenStack software is designed to orchestrate large networks of virtual
machines and gives an available, scalable, on-premise cloud infrastructure
platform. It consists of various modules that together enable complete IaaS
services: the image store provides a catalog and repository for virtual disk
images, Compute provides virtual servers on demand, the dashboard provides a
modular web-based user interface, Identity provides authentication and
authorization, the Quantum service provides network connectivity as a service,
Ceilometer provides metering, and the block store provides persistent block
storage to guest VMs. The stack of different technologies used for nIC is
presented in Figure 1.

Figure 1: nIC Paradigm

2.2 Architecture of the nIC

2.2.1 Prototype for nIC


The framework consists of all the components required to establish the
complete cloud environment and orchestrates the virtual resources among the
user requests. The components play different roles, from client interaction to
the actual processing and storing of data. The top layer of the framework
consists of user-interfacing tools like xvpweb, XenCenter, OpenXenManager, the
CloudStack GUI and OpenStack Keystone. The middle layer consists of
hypervisors like Xen, orchestration software such as CloudStack and OpenStack,
and user management systems like LDAP and NIS servers.

The bottom layer consists of stores of virtual machine images, data
repositories, and storage services like Swift, DRBD, NFS and GlusterFS. All
these layers are loosely coupled and interact among themselves to give
complementary services in the private cloud environment. The private cloud
resources can be segregated according to the respective project domains. On a
per-project basis we can customize the software stack and provide it as an
on-the-fly platform for project development. The layers, according to the
different functionalities of the system, can be seen below:

Three layers of the nIC


* The Framework/Admin receives the resource requests from cloud clients.
* Requests are processed by creating new virtual resource allotments:
1. Accessing of cloud resources from the top layer of the framework
a. SSH, VNC and RDP protocol access.
b. XVP and VDI access
2. Computational services from the middle layer of the framework
a. System and Management services
b. Virtual hardware services.
3. Storage services from the bottom layer of the framework
a. Storage services for on-the-fly computational requests
b. Storage services for archiving requests
* Monitoring and Maintenance of production services

2.2.2 Project based resource allotment


We can divide the server pools on a per-data-centre or per-project basis.
Within a data centre there exist multiple projects with allocated physical
servers. According to the project allotments of the servers, we create a
customized private cloud on those servers. Requirements shared by multiple
projects can be serviced from the servers of a single data centre. The
disaster recovery site can be placed at a geographically different data centre
in order to protect the cloud.

Figure 2: Project based Virtual farms

At the higher level, users see only their dedicated allocation of servers and
have no idea of the physical placement of the virtual servers. The virtual
farm works in a completely transparent way to dedicate the resources to
end-user workloads or projects.

2.2.3 Template Management


The users of nIC come from different projects with different requirements.
According to those requirements, nIC has ready-made templates to build
on-the-fly resources. Users can also build their own templates and use them
for their own services. The templates are customized with the various software
stacks needed for project development, deployment, etc. Users can instantiate
any number of virtual instances from a single template. The life cycle from
template creation to virtual instance is explained in the following routine (a
small state-machine sketch of this routine follows the list).

Life Process of a Virtual Instance


1. Booking of the template from the template store.
a. Pre-configured template available in the store.
b. On-the-fly Creation of template by the user.
2. Attributing the instance
a. Network service assignment.
b. Storage service assignment.
3. Start the instance on server pool.
4. Remote service assignment.
5. Monitoring and backup service assignment.
6. Stop and destroy the instance.
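The routine above can be read as a simple state machine; the sketch below is purely illustrative (it is not the orchestration code used in nIC) and encodes the allowed forward transitions between the lifecycle stages.

import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Illustrative lifecycle states of a virtual instance, following the routine above.
enum InstanceState { TEMPLATE_BOOKED, ATTRIBUTED, RUNNING, MONITORED, STOPPED, DESTROYED }

class InstanceLifecycle {
    // Allowed forward transitions between lifecycle states.
    private static final Map<InstanceState, Set<InstanceState>> NEXT = Map.of(
        InstanceState.TEMPLATE_BOOKED, EnumSet.of(InstanceState.ATTRIBUTED),
        InstanceState.ATTRIBUTED,      EnumSet.of(InstanceState.RUNNING),
        InstanceState.RUNNING,         EnumSet.of(InstanceState.MONITORED, InstanceState.STOPPED),
        InstanceState.MONITORED,       EnumSet.of(InstanceState.STOPPED),
        InstanceState.STOPPED,         EnumSet.of(InstanceState.DESTROYED),
        InstanceState.DESTROYED,       EnumSet.noneOf(InstanceState.class)
    );

    static boolean canMove(InstanceState from, InstanceState to) {
        return NEXT.getOrDefault(from, Set.of()).contains(to);
    }
}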

2.2.4 Design of nIC


The design of nIC consists of the specification of all the modules and
services involved in the internal cloud. nIC offers all the dimensions that a
cloud consists of: infrastructure service, platform service and software
service. The module service specifications are presented in the following
subsections.

1) User management
There are several people from different projects in the organization who deal
with the data centre and cloud. It is required to define roles, assign people
to those roles and, based on their role, give them different access.
Administrators should be able to view and change everything, while users
should only be able to view everything without modifying it. Each
project has different requirements for roles and access. nIC provides user
management capability at different levels of the framework.

Authentication services in nIC


1. Authenticate with the nIC framework.
2. Authenticate with the virtual image service.
3. Authenticate with the storage service.
4. Authenticate with the network and remote access service.

2) Network management:
The network services in a private cloud offer isolation, flexibility and
self-service among the virtual resources allocated to the end users. The
virtual networks assigned to the instances define the reachability and
accessibility of the user's service. Software switches give the flexibility to
tune and configure performance controls on the virtual networks, while
software-defined firewalls and load balancers allow flexible tuning of
security and performance controls. DHCP-based IP assignment and
data-centre-level VLANs give a rich network service to the cloud users.

Virtual network services in nIC


1. Assign the virtual networks from network pools.
2. Assign the IP and MAC address from address pool.
3. Assign the firewall rules.
4. Assign the network load balancer rules.

3) Storage Management:
Private cloud storage is elastic, automated and multi-tenant. Depending on the
service, the storage is either transient and low latency or permanent and high
latency. The virtual instances are created on the minimal storage required for
the instance to run. For archiving of data, object-based storage services are
used, which give reliable and rapid provisioning of storage.

Storage services in nIC


1. Assign the block storage.
2. Assign the object storage.
3. Assign the centralized storage services.
4. Assign the cluster and highly available storage services.

4) Remote Access Control:


The remote-access hosted cloud scenario provides a secure way for users to
access resources in the private cloud over the internet. With this setup,
users are relieved of the burden of having high-end resources at their local
sites.

Remote services in nIC


1. Assign the remote agent.
2. Assign the vdi service.
3. Assign the centralized desktop services.

5) Virtual Farms:
nIC spans the servers of different data centres. The virtual farms created
upon them belong to the respective project domains and are isolated from each
other by means of physical servers, VLAN separation or application-level
isolation. Shared cloud resources are common to the servers of each domain,
while individually isolated cloud resources are allocated within the virtual
farms. Each virtual farm consists of a dynamic number of virtual instances
with pre-configured software stacks, grouped according to the project
requirements. Service login within the farm is enabled through NIS and
auto-mount, and a private storage space that is part of the farm is created
using NFS. All these virtual farms are managed through centralized
orchestration techniques. The overall view of the setup can be seen in
Figure 3.

Figure 3: Virtual farm Deployment

The primary site is fully replicated to the secondary site, and live
synchronization of data between the two sites is done automatically. The
secondary site acts as a disaster recovery site and automatically takes over
the services when the primary site stops working. The two sites are placed in
geographically different locations to ensure higher reliability.

3. PERFORMANCE OF THE DEPLOYMENT


In this section we present the deployment environment in the data centre for
various projects. The framework is deployed using the technologies mentioned
in the technology stack. Building up the cloud environment by integrating
these different components provides rich service availability to the end
users. We can observe the management of components in each part of the
framework and evaluate such a system for efficiency-measurement purposes. The
utilization of the different resources can then be evaluated in nIC.

A typical deployment consists of the following resource configuration. The
physical data centre consists of hundreds of servers, on which we constitute
the following configuration, as shown in Table 1.

Table 1: H/W and S/W Specification

Server          Dell Blade/Rack server/Quad-Core AMD Opteron
CPU             Intel Xeon
Hypervisor      XCP/Xen
Orchestration   CloudStack/OpenStack
Block storage   SAN/Disk
Object storage  NFS/Disk

Virtual Instance configuration details are listed here. Further these are
configured according to software stacks required by the nIC users.

Table 2: Virtual Instance Specification

Operating System   Linux (CentOS 6)
CPU                16 cores (2 GHz each)
Storage            2 TiB
Memory             128 GiB
Virtualization     Para/Hardware
Networks           VLANs (4)

The Virtual farms are created according to the project requirements. The
typical farm was created using the following specification as in Table 3.

Table 3: Virtual Farm Specification

Servers in the Hardware Pool   16
Virtual Instances              500
Swift                          1 TiB
SAN                            5 TiB
ISO/Template Store             1
LDAP/NIS Cluster               1
Management Server Cluster      1
VDI Server Cluster             1

Using these specifications, we set up a production-level deployment in the
data centre for internal projects. The performance of this deployment can be
seen in terms of its seamless working state and its response to different
workloads. The statistics presented here are taken from the working sets of
production deployments.

3.1 CPU Utilization

This section shows CPU resource utilization for the various kinds of project
workloads in nIC. The setup of nIC and the allocation of initial resources are
planned according to the workloads and benchmarked in Figure 4 and Figure 5.

Figure 4. CPU Usage among Service Instances

Figure 5. CPU Usage among Development Instances

4. CONCLUSION
In this paper, we described the design of our private cloud system nIC and
evaluated a typical deployment. For our system we adopted open source software
for the various components of nIC. During our study, we considered performance
criteria and the seamless working state, and analyzed various project
workloads in the cloud. Our experimental architecture demonstrated a reference
system design of an internal cloud for various project workloads. We presented
a production-level deployment scenario and CPU-usage performance counters on
this reference design. The difficulty in evaluating production-level
deployments is the lack of standardized methods for obtaining normalized
results for each part. Further, we will work on disaggregation of the system
for better control and optimization of system behaviour, and we are in the
process of gathering practical results on the efficiency of the system.

REFERENCES
[1] National Informatics Centre. [Online] Available from: http://www.nic.in/

[2] XEN and XCP (Xen Cloud Platform). [Online] Available from: http://www.xen.org/

[3] CloudStack - An Open-Source Cloud Computing Platform. [Online] Available from: http://cloudstack.apache.org/

[4] OpenStack - An Open-Source Cloud Computing Platform. [Online] Available from: http://www.openstack.org/

[5] SWIFT - An Open-Source Object Storage Platform. [Online] Available from: http://docs.openstack.org/developer/swift/

[6] DRBD - A block-level distributed storage system for the GNU/Linux platform. [Online] Available from: http://www.drbd.org/

[7] GlusterFS - A scale-out NAS file system. [Online] Available from: http://www.gluster.org/

[8] Open vSwitch - A software switch for cloud environments. [Online] Available from: http://openvswitch.org/

[9] XVP - Cross-platform VNC-based and Web-based Management for Xen Cloud Platform. [Online] Available from: http://www.xvpsource.org/

[10] VCL - A self-service system used to dynamically provision and broker remote access to a dedicated compute environment for an end-user. [Online] Available from: http://vcl.apache.org/

[11] XenServer Configuration Limits. [Online] Available from: http://support.citrix.com/servlet/KbServlet/download/32312-102692726/CTX134789-XenServer6.1.0_ConfigurationLimits.pdf

[12] Kumar, SMMM Kalyan, and SD Madhu Kumar. A Mobile-Cloud Paradigm for Constraint-less Computing. [Online] Available from: http://www.ewh.ieee.org/soc/e/sac/itee/index.php/meem/article/viewFile/217/224 [Accessed: 1st June 2012].

[13] Toor, S., Toebbicke, R., Resines, M.Z., Holmgren, S. Investigating an Open Source Cloud Storage Infrastructure for CERN-specific Data Analysis. Networking, Architecture and Storage (NAS), 2012 IEEE 7th International Conference on, pp. 84-88, 28-30 June 2012. doi: 10.1109/NAS.2012.14 [Online] http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6310879&isnumber=6310787

[14] Guanghui Xu, Feng Xu, Hongxu Ma. Deploying and researching Hadoop in virtual machines. Automation and Logistics (ICAL), 2012 IEEE International Conference on, pp. 395-399, 15-17 Aug. 2012. doi: 10.1109/ICAL.2012.6308241 [Online] http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6308241&isnumber=6308132

Use of Intelligent Business, a Method for Complete Fulfillment of E-government
M. Nili Ahmadabadi
Department of Management,
Qom University,
Qom, Iran

Masoud Najafi
Department of Management,
Islamic Azad University, Najafabad Branch,
Isfahan, Iran

Peyman Gholami
Department of human sciences,
Payam Noor University, Kazerun branch,
Fars, Iran

ABSTRACT
In this paper, the concepts and missions of electronic government
(e-government) are considered, and the current status of these concepts is
compared with the desired status. The use of tools such as public service
portals, public communication portals, and other e-government services shows
that IT is currently used only in the input and output systems, while the
brain of the system, which has the capability to control, supervise, plan and
lead, remains untouched. This paper draws attention to the use of business
intelligence, a new and emerging idea in e-government system engineering, for
fulfilling e-government goals. The utilization of business intelligence tools
and techniques at inter- and intra-organizational levels, to support the
internal processes of government in line with non-business purposes such as
establishing social equity, controlling criteria and increasing the leading
ability of society, is put forward as a way of fulfilling electronic
government.
Keywords
Business intelligence, E-government, IT management, E-government processes, Iran code.

1. INTRODUCTION
Since the emergence of the concept of electronic government (e-government),
a lot of effort has been made to develop and expand this concept and to make
use of the potential of IT. According to Bekertz, IT services in government,
which are referred to as e-government, can be summarized as follows: public
information services (information services), discussion halls
and virtual associations (subscription services), electronic entrance, renewal
of permits, tax payment (transaction services), public reports and
communications (communication services), and transferring data among people,
the government and its representatives, and the private sector (data transfer
services) [1].
As can be understood from the items mentioned above, the government uses IT
tools in order to establish a direct and fast channel of communication between
organizations and public information centres on one side, and people, private
sectors and suppliers on the other. There are plenty of definitions of
electronic government, but a definition agreed upon by many scholars is:
electronic government is the selection, implementation, and use of information
and communication technologies in government to provide public services,
improve managerial effectiveness, and promote democratic values and
mechanisms, as well as the development of a legal and regulatory framework
that facilitates information-intensive initiatives and fosters the knowledge
society [2].
It seems that the interactions between the government and its environment have
been deemed more important than other aspects, to the extent that scholars
have not investigated other IT potential within the government structure, even
though the government has many more responsibilities, such as planning, policy
making, supervising and controlling, beyond the aforementioned items [3]. The
use of IT in the correct and timely performance of these tasks should also be
considered. In this paper, this aspect of IT in the complete fulfillment of
e-government is brought up.
This paper has been organized as follows: In section 2, related works are
presented, in section 3 E-government as an open system is clarified; section
4 deals with BI definition, section 5 shows unnoticed aspects of the BI,
section 6 talks about the needed requirements of the BI, in section 7 a case
study is presented, section 8 offers the results and section 9 is the
conclusions.
2. RELATED WORKS
The importance of e-government practices cannot be overstated, as it
focuses the direction of government technology funding for future years [4].
To that end, the goals the President of the United States has set forth for e-
government are to increase the ease of access for citizens; to increase
efficiency/effectiveness of government; and to increase government
responsiveness to citizens [5].

Mahdi Bahrami et al. state that, nowadays, a business organization needs to
analyze the market so that it is able to stay stable in the face of market
changes and eventually to handle market management; for this purpose, the
organization should update its business processes by utilizing modern
technologies, which is called BI [6]. In today's highly competitive and
increasingly uncertain world, the quality and timeliness of an organization's
BI can mean not only the difference between profit and loss, but even the
difference between survival and bankruptcy [6].

Different levels of functionality and technical aspects of electronic
government systems and applications are represented by the characteristics
of electronic government. These characteristics provide a way of measuring
the success of initiatives in terms of how they meet technical requirements
such as usability, quality of information, privacy, or security. Furthermore,
they reflect the level of sophistication of these systems, differentiating, for
example, between applications that only provide information and those that
serve to carry out application processes or government services associated
with health, education, and other important policy areas [7]. Among the
main electronic government results identified in the literature are the
following: improvements in the quality of public services [8], efficiency and
productivity in processes and government operations [9], more effective
programs and policies [10], transparency and accountability [2,11], citizen
participation [12], a regulatory framework that supports electronic
government [13], a legal and regulatory framework that encourages the
information society [2,14], and transformation of government structures
[15].

3. E-GOVERNMENT AS AN OPEN SYSTEM


If we consider e-government as a dynamic, constantly updated open system that
interacts with its environment permanently, the process of this system can be
divided into three main parts (Figure 1). In this section, the current status
of e-government is investigated and then compared with the desired status.
3.1 The Internal Process of E-government
As mentioned, processes such as policy making, planning, controlling, and
supervising the country's affairs are up to the government, and performing
these affairs is among its main responsibilities. The government should
maintain a permanent interaction with society and should consider the
conditions and contingencies of the time.
3.2 IT Supporting Tools of E-government Internal Processes
With due consideration to the nature of these processes, it can be understood
that they belong to the higher levels of the organization and are mostly
related to management and decision making. Among the huge number of
tools offered in the IT field, decision support systems and expert systems can
be singled out for this purpose.
Having access to accurate and correct information from all the engaged parties
of society is the prerequisite for strategic management, long-term policy
making, macro planning, and so on. This information should be entered via the
interacting input and output portals of e-government. After the data is
collected it should be processed, and this is exactly what this paper
addresses. The cycle of data accumulation from the environment is illustrated
in Figure 2.
Data accumulation from all around is the key factor in establishing an
efficient and successful e-government, so it remains of paramount importance
to improve and expand the input and output processes. But how data is
accumulated and integrated depends largely on the outlook and goals of the
system: if we have a long-term plan for the internal process of the system, we
should definitely take a long-term view of data integration. Ultimately, the
thinking brain of the e-government system will encounter a huge amount of data
which cannot be processed by the human brain alone, and the right decisions
might not be made.

4. BUSINESS INTELLIGENCE (BI)


Different authors define this concept differently. Using a broad definition,
business intelligence is a set of methodologies, processes, architectures, and
technologies that transform raw data into meaningful and useful information
used to enable more effective strategic, tactical, and operational insights
and decision-making [3]. The authors believe that intelligence in business
includes characteristics such as the ability to collect, process and
accumulate information so that people at all levels of the organization can
access it according to their own requirements, help shape it in the future,
and protect the organization against competitive threats. To define business
intelligence briefly and clearly, we can say that business intelligence is
collecting information about competitors and the environment to create and
sustain competitive advantage [16].

In fact, intelligence in business is a systematic process for ensuring
up-to-date, exact and relevant information about competitors. An intelligent
system refers to a set of programs and sources used by managers in order to
access daily information about the marketing environment. BI is a set of
concepts, methods, and processes to improve business decisions, which use
information from multiple sources and apply experience and assumptions to
develop an accurate understanding of business dynamics. It integrates the
analysis of data with decision support systems to provide information to
people throughout the organization in order to improve
strategic and tactical decisions. BI is considered a strategic management tool
and one of the fastest growing areas of the business world, as shown in
Figure 3 [17].

5. NON-BUSINESS USE OF BI


Business intelligence is a set of application software, technologies and
processes used for accumulating, arranging, accessing and analyzing data in
order to make the right decisions [18].
The above definition shows that BI, as the tool suggested in this paper, is
used in a supporting IT role in the internal process of the e-government
system. As is apparent from the term itself, business intelligence was first
used to help managers make the right decisions and gain competitive advantage
in today's turbulent market; it attempted to uncover the hidden aspects of the
data in the organization's transactions, as well as of the data in the
organization's competitive environment, by means of data mining techniques,
and in some cases to identify condition-altering patterns and the effective
variables in a model as well [19].
This paper, however, considers another usage of BI, one which goes far beyond
business borders and seeks its non-business use. Examples include utilizing
business intelligence to predict and analyze the cultural situation of
society, such as crime and violence rates or marriage and divorce rates; the
scientific situation, such as the rate of knowledge growth in the country or
the quality and quantity of university students in technical fields; or market
control, such as investigating the required rates of export and import, or
people's complaints and dissatisfaction with different guilds. These can be
considered just a small part of the non-business usage of business
intelligence in the complete realization of e-government. For the time being,
however, this dimension is a rather distant goal, since its fulfillment
requires a highly developed infrastructure.
If the idea of utilizing business intelligence to support the internal process
of e-government is accepted by experts, some big steps with a long-term
approach should be taken to pave the path for this idea.

6. INFRASTRUCTURE OF BI
For the implementation of BI, governments around the world take different
kinds of strategic measures, which are beyond the scope of this paper.
Inter-organizational BI requires complete integration of information among
different organizations [20]. For example, if two organizations wish to have
an effective management system in order to make better use of supply chain
advantages, they should integrate their data; in other words, if such a system
is created and
implemented in both organizations, all the required information in both
management systems can be used and the best decisions can be made. As a
result, e-government cannot make use of these beneficial tools without
integrating the required information across the organizations' management
systems. It therefore seems that one of the requirements of an auspicious
infrastructure for e-government is having a common management information
system, standard and coding. In the last part of this paper, Iran code is
studied as one such coding system.

7. IRAN CODE SYSTEM AS A CASE STUDY


Iran code, or the national system of goods and services classification, is a
system which codifies goods and services information. It provides an identity
for entities in the supply chain and simplifies the information flow in the
chain. This system tries to create a common language between suppliers and
customers and, in general, all those involved in the goods and services chain
nationally.
It is aimed at developing a common language in the national supply chain of
goods and services; creating information backgrounds for companies and
products nationally; simplifying different processes of the supply chain;
creating a background to develop and simplify inter-organizational processes;
and creating backgrounds to develop modern methods of trading [21].
The main purpose of this paper is to investigate the advantages of using and
expanding Iran code for the development of e-government. If the government
extends this integration to all parts of the concerned information systems, it
will be a giant step towards providing the necessary background for the
utilization of BI.

8. RESULTS
There is no doubt that information technologies have great potential to
improve government all around the world. However, it is necessary to explore
and exploit all of their hidden aspects and make this technology as efficient
as possible. Achieving the advantages of business intelligence, from a
non-business point of view, in supporting the internal process of
e-government, as the key factor for its complete fulfillment, requires a
long-term outlook along with strategic planning for the development of
information systems. The frameworks and infrastructures needed for its
complete execution and fulfillment should also be provided. Those interested
in this field of research are encouraged to pay more attention to new and
emerging sciences, especially IT.

Figure 1: e-government as an open system (input, processing and output)

1- Input: gathering the required information from different information
databases, such as citizens, companies and suppliers.
2- System internal process: such as policy making, planning and controlling.
3- Output: the results of the second part of the process, which can be
extended to society in order to fulfill e-government.

Figure 2: Accumulating information from inputs in order to make decisions (diagram: government, citizens, companies, staff and suppliers exchanging communications, information and transactions with the e-government system)

Figure 3: Forming areas in business intelligence [17]


Figure 3 shows the different areas involved. In order to have a complete form
of BI, these areas should be respected. As shown, three main activity areas,
business, information technology and management, are the basic forming areas
of BI, and together they result in the new field known as BI.

9. CONCLUSIONS
It is clear that the full potential of information technologies has not yet
been made use of. Exploring the hidden aspects of these emerging technologies
and making them efficient is therefore of paramount importance. We put forward
a practical framework for the complete execution of e-government. By providing
the required background for e-government, not only can its current status be
improved, but there will also be actual impacts on the welfare of citizens and
of government. In this paper, after a broad clarification of e-government and
of the non-business purposes of business intelligence, the infrastructures
needed for its complete execution have been presented.

REFERENCES
[1] E-Government Development in Taiwan, Research, Development, and Evaluation Commission, the Executive Yuan, November 2003.
[2] Gil-Garcia, J. R., Luna-Reyes, L. F. Integrating conceptual approaches to e-government. Encyclopedia of e-commerce, e-government and mobile commerce (pp. 636-643). Hershey, PA: Idea Group Inc., 2006.
[3] http://en.wikipedia.org/wiki/E-Government
[4] Evans, D., & Yen, V. E-government: An analysis for implementation - Framework for understanding cultural and social impact. Government Information Quarterly, 22, 2005, 354-373.
[5] US Government Report, US Government (Feb 2002). E-government strategy: Implementing the President's management agenda for e-government - Simplified delivery of services to citizens. http://www.whitehouse.gov/omb/inforeg/egovstrategy.pdf
[6] L. Fuld, The New Competitor Intelligence, Wiley, New York, 1995.
[7] Luis, F., Gil-Garcia, J. R. Towards a multidimensional model for evaluating electronic government: Proposing a more comprehensive and integrative perspective. Government Information Quarterly, 29, 2012, 324-334.
[8] Brown, M. M., & Brudney, J. L. Achieving advanced electronic government services: Opposing environmental constraints. Public Performance & Management Review, 28(1), 2004, 96-114.
[9] Estevez, J., & Joseph, R. C. A comprehensive framework for the assessment of E-Government projects. Government Information Quarterly, 25, 2008, 118-132.
[10] Dawes, S. S. Interagency information sharing: Expected benefits, manageable risks. Journal of Policy Analysis and Management, 15(3), 1996, 377-394.
[11] Rocheleau, B. Politics, accountability and governmental information systems. In G. D. Garson (Ed.), Public information technology: Policy and management issues (pp. 20-52). Hershey, PA: Idea Group Publishing, 2003.
[12] Fountain, J. E. Prospects for improving the regulatory process using e-rulemaking. Communications of the ACM, 46(1), 2003, 43-44.
[13] Andersen, D. F., & Dawes, S. S. Government information management: A primer and casebook. Englewood Cliffs, NJ: Prentice Hall, 1991.
[14] Helbig, N., Gil-García, J. R., & Ferro, E. Understanding the complexity of electronic government: Implications from the digital divide literature. Paper presented at the Americas Conference of Information Systems, Omaha, NE, USA, August 11-14, 2005.
[15] Garson, G. D. The promise of digital government. In A. Pavlichev, & G. D. Garson (Eds.), Digital government: Principles and best practices (pp. 2-15). Hershey, PA: Idea Group Publishing.
[16] Hugh J. Watson, Business Intelligence - Past, Present, and Future, Department of MIS, University of Georgia, Nov 2009.
[17] Mahdi Bahrami, Innovation in Market Management by Utilizing Business Intelligence: Introducing a Proposed Framework, Procedia - Social and Behavioral Sciences 41, 160-167, 2012.
[18] Assessment of Worldwide Municipal Web Portals.
[19] http://en.wikipedia.org/wiki/E-Government [Accessed on 4 April 2013]
[20] http://www.irancode.ir [Accessed on 11 February 2013]
[21] Shojaei, Seyed Mahmood. A study on the effect of business intelligence on managers' decision making. The First Conference of Organizational Intelligence, Tehran, 2010.

Comparison of Swarm Intelligence Techniques
Prof. S. A. Thakare
Assistant Professor
JDIET, Yavatmal (MS), India.

ABSTRACT
Swarm intelligence is a computational intelligence technique for solving complex
real-world problems. It involves the study of the collective behaviour of
decentralized, self-organized systems, natural or artificial. Swarm Intelligence
(SI) is an innovative distributed intelligent paradigm for solving optimization
problems that originally took its inspiration from biological examples of
swarming, flocking and herding phenomena in vertebrates. In this paper, we make
an extensive analysis of the most successful optimization techniques inspired by
Swarm Intelligence (SI): Ant Colony Optimization (ACO) and Particle Swarm
Optimization (PSO). An analysis of these algorithms is carried out with fitness
sharing, aiming to investigate whether this improves performance when
implemented in evolutionary algorithms.

1. INTRODUCTION
Swarm intelligence (SI) is "the emergent collective intelligence of groups of
simple agents" (Bonabeau et al., 1999). It gives rise to complex and often
intelligent behavior through simple, unsupervised (no centralized control)
interactions between a large number of autonomous swarm members. This results
in the emergence of very complex forms of social behavior which fulfill a
number of optimization objectives and other tasks. A swarm may consist of
biological insects or animals such as ants, bees, wasps, fish, etc. In this
paper we consider ants and flocking birds for our study.
Ants possess the following characteristics:
(1) Scalability: The ants can change their group size by local and
distributed agent interactions. This is an important characteristic by
which the group is scaled to the desired level.
(2) Fault tolerance: Each ant follows a simple rule and does not rely on a
    centralized control mechanism, which allows graceful, scalable degradation.
(3) Adaptation: Ants always search for new paths by roaming around their
    nest. Once they find food, their nest members follow the shortest path,
    while some members of the colony keep searching for an alternative
    shortest path. To accomplish this they change, die or reproduce for the
    colony.
(4) Speed: In order to let other ants know about the food source, they move
    quickly back to their nest. Other ants find more pheromone on the path and
    follow it to the food source. Thus changes are
propagated very fast to communicate to other nest mates in order to
follow the food source.
(5) Modularity: Ants follow the simple rule of following the path which
has a higher level of pheromone concentration. They do not interact
directly and act independently to accomplish the task.
(6) Autonomy: No centralized control and hence no supervisor is
needed. They work for the colony and always strive to search food
source around their colony.
(7) Parallelism: Ants work independently and the task of searching food
source is carried out by each ant in parallelism. It is parallelism due
to which they change their path, if a new food source is found near
their colony. These characteristics of biological insects such as ants
resemble the characteristics of Mobile Ad Hoc Networks. This helps
us to apply the food searching characteristics of ants for routing
packets in Mobile Ad Hoc Networks.

2. SWARM INTELLIGENCE (SI) MODELS


Swarm intelligence models are referred to as computational models inspired
by natural swarm systems. To date, several swarm intelligence models
based on different natural swarm systems have been proposed in the
literature, and successfully applied in many real-life applications. Examples
of swarm intelligence models are: Ant Colony Optimization, Particle
Swarm Optimization, Artificial Bee Colony, Bacterial Foraging, Cat Swarm
Optimization, Artificial Immune System, and Glowworm Swarm
Optimization. In this paper, we will primarily focus on two of the most
popular swarm intelligence models, namely, Ant Colony Optimization and
Particle Swarm Optimization.

2.1 Ant Colony Optimization (ACO) Model


The ant colony optimization meta-heuristic is a particular class of ant
algorithms. Ant algorithms are multi-agent systems, which consist of agents
with the behavior of individual ants [1]. The ant colony optimization
algorithm (ACO) is a probabilistic technique for solving computational
problems which can be reduced to finding better paths through graphs. In
the real world, ants (initially) wander randomly, and upon finding food
return to their colony while laying down pheromone trails. If other ants find
such a path, they are likely not to keep travelling at random, but to instead
follow the trail; returning and reinforcing it if they eventually find food.

Over time, however, the pheromone trail starts to evaporate, thus reducing
its attractive strength. The more time it takes for an ant to travel down the
path and back again, the more time the pheromones have to evaporate. A
short path, by comparison, gets marched over faster, and thus the

pheromone density remains high as it is laid on the path as fast as it can
evaporate. Pheromone evaporation has also the advantage of avoiding the
convergence to a locally optimal solution. If there were no evaporation at
all, the paths chosen by the first ant would tend to be excessively attractive
to the following ones. In that case, the exploration of the solution space
would be constrained [3]. Thus, when one ant finds a good (i.e., short) path
from the colony to a food source, other ants are more likely to follow that
path, and positive feedback eventually leads all the ants following a single
path. The idea of the ant colony algorithm is to mimic this behavior with
"simulated ants" walking around the graph representing the problem to
solve.

The original idea comes from observing the exploitation of food resources among ants, in which ants with individually limited cognitive abilities have collectively been able to find the shortest path between a food source and the nest [4, 5, 7].

1. The first ant finds the food source (F), via any way (a), then returns
to the nest (N), leaving behind a trail of pheromone (b)
2. Ants indiscriminately follow four possible ways, but the
strengthening of the runway makes it more attractive as the shortest
route.
3. Ants take the shortest route; long portions of other ways lose their
trail pheromones.

In a series of experiments on a colony of ants with a choice between two


unequal length paths leading to a source of food, biologists have observed
that ants tended to use the shortest route. A model explaining this behavior
is as follows:
1. An ant (called "blitz") runs more or less at random around the
colony;
2. If it discovers a food source, it returns more or less directly to the
nest, leaving in its path a trail of pheromone;
3. These pheromones are attractive, nearby ants will be inclined to
follow, more or less directly, the track;
4. Returning to the colony, these ants will strengthen the route;
5. If two routes are possible to reach the same food source, the shorter one will, in the same amount of time, be traveled by more ants than the long route;
6. The short route will be increasingly enhanced, and therefore become more attractive;
7. The longer route will eventually disappear, since pheromones are volatile;
8. Eventually, all the ants have determined and therefore "chosen" the shortest route.

Ants use the environment as a medium of communication. They exchange information indirectly by depositing pheromones, all detailing the status of their "work". The information exchanged has a local scope: only an ant located where the pheromones were left has a notion of them. This system is called "stigmergy" and occurs in many societies of social animals (it has been studied in the case of the construction of pillars in the nests of termites). The mechanism by which a problem too complex to be addressed by single ants is solved is a good example of a self-organized system. This system is based on positive feedback (the deposit of pheromone attracts other ants that will strengthen it by themselves) and negative feedback (dissipation of the route by evaporation prevents the system from thrashing). Theoretically, if the quantity of pheromone remained the same over time on all edges, no route would be chosen. However, because of feedback, a slight variation on an edge will be amplified and thus allow the choice of an edge. The algorithm will move from an unstable state in which no edge is stronger than another, to a stable state where the route is composed of the strongest edges.

Figure 1. Behavior of real ant movements

2.1.1 Ant Colony Optimization meta-heuristic Algorithm


Let G = (V, E) be a connected graph with n = |V| nodes. The simple ant colony optimization meta-heuristic can be used to find the shortest path between a source node vs and a destination node vd on the graph G [7]. The path length is given by the number of nodes on the path. Each edge e(i, j) ∈ E of the graph connecting the nodes vi and vj has a variable φi,j (artificial pheromone), which is modified by the ants when they visit the node. The pheromone concentration φi,j is an indication of the usage of this edge. An ant located in node vi uses the pheromone φi,j of a node vj ∈ Ni to compute the probability of choosing vj as the next hop, where Ni is the set of one-step neighbors of node vi.


pi,j = φi,j / Σ(k ∈ Ni) φi,k,   if j ∈ Ni
pi,j = 0,                       if j ∉ Ni

The transition probabilities pi, j of a node vi fulfill the constraint:


Σ(j ∈ Ni) pi,j = 1,   for all i ∈ [1, n]

During the route finding process, ants deposit pheromone on the edges. In
the simple ant colony optimization meta-heuristic algorithm, the ants
deposit a constant amount of pheromone. An ant changes the amount of
pheromone of the edge e(vi, vj) when moving from node vi to node vj as
follows:
φi,j := φi,j + Δφ                                        (1)

Like real pheromone the artificial pheromone concentration decreases with


time to inhibit a fast convergence of pheromone on the edges. In the simple
ant colony optimization meta-heuristic, this happens exponentially:

φi,j := (1 - q) · φi,j,   q ∈ (0, 1]                     (2)
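As a small illustration of the meta-heuristic just described, the following Python sketch puts equations (1) and (2) together with the pheromone-proportional transition rule; the graph, parameter values and function names are illustrative assumptions, not the authors' implementation.

import random

def simple_aco(neighbors, source, dest, n_ants=50, n_iters=100,
               deposit=1.0, evaporation=0.1):
    """Simple ACO meta-heuristic for fewest-hop paths on a graph.

    neighbors: dict mapping each node to the list of its one-step neighbors.
    Pheromone phi[i][j] drives the next-hop choice: the transition probability
    is phi[i][j] divided by the pheromone sum over the neighbors of i.
    """
    # Initialise a small, equal pheromone value on every edge.
    phi = {i: {j: 1.0 for j in neighbors[i]} for i in neighbors}
    best = None

    for _ in range(n_iters):
        for _ in range(n_ants):
            node, path = source, [source]
            while node != dest and len(path) <= len(neighbors):
                opts = neighbors[node]
                total = sum(phi[node][j] for j in opts)
                r, acc, nxt = random.uniform(0, total), 0.0, opts[-1]
                for j in opts:                      # roulette-wheel selection
                    acc += phi[node][j]
                    if r <= acc:
                        nxt = j
                        break
                path.append(nxt)
                node = nxt
            if node == dest:
                # Equation (1): deposit a constant amount on the edges used.
                for a, b in zip(path, path[1:]):
                    phi[a][b] += deposit
                if best is None or len(path) < len(best):
                    best = path
        # Equation (2): exponential evaporation with rate q in (0, 1].
        for i in phi:
            for j in phi[i]:
                phi[i][j] *= (1.0 - evaporation)
    return best

# Example: shortest path from the nest 'N' to the food source 'F'.
graph = {'N': ['A', 'B'], 'A': ['N', 'F'], 'B': ['N', 'C'],
         'C': ['B', 'F'], 'F': ['A', 'C']}
print(simple_aco(graph, 'N', 'F'))

Because short paths are reinforced more often per unit time while evaporation weakens all edges equally, repeated runs concentrate the pheromone on the shorter route, mirroring the behaviour of real ant colonies described above.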

2.2 Particle Swarm Optimization (PSO) Model


The second example of a successful swarm intelligence model is Particle
Swarm Optimization (PSO), which was introduced by Russell Eberhart, an
electrical engineer, and James Kennedy, a social psychologist, in 1995. PSO
was originally used to solve non-linear continuous optimization problems,
but more recently it has been used in many practical, real-life application
problems. For example, PSO has been successfully applied to track dynamic
systems [9], evolve weights and structure of neural networks, analyze
human tremor [11], register 3D-to-3D biomedical image [12], control
reactive power and voltage [13], even learning to play games [14] and
music composition [15]. PSO draws inspiration from the sociological
behaviour associated with bird flocking. It is a natural observation that birds can fly in large groups, without collision, over extended distances, maintaining an optimum distance between themselves and their neighbours. This section presents some details about birds in nature and overviews their capabilities, as well as their sociological flocking behaviour.

2.2.1 Birds in Nature


Vision is considered as the most important sense for flock organization. The
eyes of most birds are on both sides of their heads, allowing them to see
objects on each side at the same time. The larger size of birds' eyes relative to other animal groups is one reason why birds have one of the most highly developed senses of vision in the animal kingdom. As a result of the large size of birds' eyes, as well as the way their heads and eyes are arranged,
most species of birds have a wide field of view. For example, Pigeons can
see 300 degrees without turning their head, and American Woodcocks have,
amazingly, the full 360-degree field of view. Birds are generally attracted
by food; they have impressive abilities in flocking synchronously for food
searching and long-distance migration. Birds also have an efficient social
interaction that enables them to be capable of: (i) flying without collision
even while often changing direction suddenly, (ii) scattering and quickly
regrouping when reacting to external threats, and (iii) avoiding predators.

Birds Flocking Behaviour


The emergence of flocking and schooling in groups of interacting agents (such as birds, fish, penguins, etc.) has long intrigued a wide range of scientists from diverse disciplines, including animal behaviour, physics, social psychology, social science, and computer science. Bird flocking can be defined as the social collective motion behaviour of a large number of interacting birds with a common group objective. The local interactions among birds (particles) usually give rise to the shared motion direction of the swarm. Such interactions are based on the nearest-neighbour principle, where birds follow certain flocking rules to adjust their motion (i.e., position and velocity) based only on their nearest neighbours, without any central coordination. In 1986, bird flocking behaviour was first simulated on a computer by Craig Reynolds. The pioneering work of Reynolds proposed three simple flocking rules to implement a simulated flocking behaviour of birds: (i) flock centering (flock members attempt to stay close to nearby flockmates by flying in a direction that keeps them closer to the centroid of the nearby flockmates), (ii) collision avoidance (flock members avoid collisions with nearby flockmates based on their relative position), and (iii) velocity matching (flock members attempt to match velocity with nearby flockmates).

Although the underlying rules of flocking behavior can be considered


simple, the flocking is visually complex with an overall motion that looks
fluid yet it is made of discrete birds. One should note here that collision
avoidance rule serves to establish the minimum required separation
distance, whereas velocity matching rule helps to maintain such separation
distance during flocking; thus, both rules act as a complement to each other.
In fact, both rules together ensure that members of a simulated flock are free
to fly without running into one another, no matter how many they are. It is
worth mentioning that the three aforementioned flocking rules of Reynolds
are generally known as cohesion, separation, and alignment rules in the
literature. For example, according to the animal cognition and animal

behavior research, individuals of animals in nature are frequently observed
to be attracted towards other individuals to avoid being isolated and to align
themselves with neighbours. Reynolds' rules are also compared to the
evaluation, comparison, and imitation principles of the Adaptive Culture
Model in the Social Cognitive Theory.

2.2.2 The Original PSO Algorithm


The original PSO was designed as a global version of the algorithm, that is,
in the original PSO algorithm, each particle globally compares its fitness to
the entire swarm population and adjusts its velocity towards the swarm's
global best particle. There are, however, recent versions of local/topological
PSO algorithms, in which the comparison process is locally performed
within a predetermined neighbourhood topology. Unlike the original version
of ACO, the original PSO is designed to optimize real-value continuous
problems, but the PSO algorithm has also been extended to optimize binary
or discrete problems. The original version of the PSO algorithm is
essentially described by the following two simple velocity and position
update equations, shown in 3 and 4 respectively.

vid(t+1) = vid(t) + c1 R1 (pid(t) - xid(t)) + c2 R2 (pgd(t) - xid(t))        (3)


xid(t+1) = xid(t) + vid(t+1) (4)

Where:
vid represents the rate of the position change (velocity) of the ith particle
in the dth dimension, and t denotes the iteration counter.
xid represents the position of the ith particle in the dth dimension. It is
worth noting here that xi is referred to as the ith particle itself, or as a
vector of its positions in all dimensions of the problem space. The n-
dimensional problem space has a number of dimensions equal to the number of variables of the desired fitness function to be optimized.
pid represents the historically best position of the ith particle in the dth
dimension (or, the position giving the best ever fitness value attained by
xi).

pgd represents the position of the swarm's global best particle (xg) in
the dth dimension (or, the position giving the global best fitness value
attained by any particle among the entire swarm).
R1 and R2 are two n-dimensional vectors with random numbers
uniformly selected in the range of [0.0, 1.0], which introduce useful
randomness for the search strategy. It is worth noting that each dimension
has its own random number, r, because PSO operates on each dimension
independently.

c1 and c2 are positive constant weighting parameters, also called the cognitive and social parameters, respectively, which control the relative importance of the particle's private experience versus the swarm's social experience (in other words, they control the movement of each particle towards its individual versus the global best position). It is worth emphasizing that a single weighting parameter, c, called the acceleration constant or the learning factor, was initially used in the original version of PSO, and was typically set to 2 in some applications (i.e., it was initially considered that c1 = c2 = c = 2). But, to better control the search ability, recent versions of PSO use different weighting parameters which generally fall in the range [0, 4] with c1 + c2 = 4 in some typical applications. The values of c1 and c2 can remarkably affect the search ability of PSO by biasing the new position of xi toward its historically best position (its own private experience, Pi), or the globally best position (the swarm's overall social experience, Pg):
High values of c1 and c2 can provide new positions in relatively distant regions of the search space, which often leads to better global exploration, but they may cause the particles to diverge.
Small values of c1 and c2 limit the movement of the particles, which generally leads to a more refined local search around the best positions achieved.
When c1 > c2, the search behaviour will be biased towards the particle's historically best experiences.
When c1 < c2, the search behaviour will be biased towards the swarm's globally best experience.
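To make equations (3) and (4) concrete, consider a single dimension with purely illustrative values (not taken from the paper): vid(t) = 0.5, xid(t) = 1.0, pid(t) = 2.0, pgd(t) = 3.0, c1 = c2 = 2, and sampled random numbers R1 = 0.3 and R2 = 0.6. Equation (3) gives vid(t+1) = 0.5 + 2(0.3)(2.0 - 1.0) + 2(0.6)(3.0 - 1.0) = 0.5 + 0.6 + 2.4 = 3.5, and equation (4) then gives xid(t+1) = 1.0 + 3.5 = 4.5, so the particle overshoots both its personal best and the global best, illustrating how large values of c1 and c2 favour exploration at the risk of divergence.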

The velocity update equation in (3) has three main terms: (i) The first term, vid(t), is sometimes referred to as inertia, momentum or habit. It ensures that the velocity of each particle is not changed abruptly; rather, the previous velocity of the particle is taken into consideration. That is why the particles generally tend to continue in the same direction they have been flying, unless there is a really major difference between the particle's current position on one side, and the particle's historically best position or the swarm's globally best position on the other side (which means the particle has started to move in the wrong direction).

This term has a particularly important role for the swarm's globally best particle, xg. This is because if a particle, xi, discovers a new position with a better fitness value than the fitness of the swarm's globally best particle, then it becomes the global best (i.e., xg). In this case, its historically best position, pi, will coincide with both the swarm's global best position, pg, and its own position vector, xi, in the next iteration (i.e., pi = xi = pg).

Therefore, the effect of the last two terms in equation (3) will no longer be there, since in this special case pid(t) - xid(t) = pgd(t) - xid(t) = 0.

This would prevent the global best particle from changing its velocity (and thus its position), so it would keep staying at its same position for several iterations, as long as no inertial movement were available and no new best position had been discovered by another particle. Alternatively, when the previous velocity term is included in the velocity update equation (3), the global best particle will continue its exploration of the search space using the inertial movement of its previous velocity. (ii) The second term, (pid(t) - xid(t)), is the cognitive part of the equation that implements a linear attraction towards the historically best position found so far by each particle. This term represents the private-thinking or self-learning component of each particle's flying experience, and is often referred to as local memory, self-knowledge, nostalgia or remembrance. (iii) The third term, (pgd(t) - xid(t)), is the social part of the equation that implements a linear attraction towards the globally best position ever found by any particle. This term represents the experience-sharing or group-learning component of the overall swarm's flying experience, and is often referred to as cooperation, social knowledge, group knowledge or shared information.
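For completeness, the following compact Python sketch implements the original global-best PSO described by equations (3) and (4); the sphere objective, the swarm size and the parameter values (c1 = c2 = 2, as in the original formulation) are illustrative assumptions rather than the authors' code.

import random

def pso(fitness, dims, n_particles=30, n_iters=200, c1=2.0, c2=2.0,
        lo=-5.0, hi=5.0):
    """Original global-best PSO (minimisation) following equations (3)-(4)."""
    x = [[random.uniform(lo, hi) for _ in range(dims)] for _ in range(n_particles)]
    v = [[0.0] * dims for _ in range(n_particles)]
    p = [xi[:] for xi in x]                        # personal best positions
    p_fit = [fitness(xi) for xi in x]              # personal best fitness values
    g = p[min(range(n_particles), key=lambda i: p_fit[i])][:]  # global best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dims):
                r1, r2 = random.random(), random.random()  # R1, R2 per dimension
                # Equation (3): previous velocity + cognitive term + social term.
                v[i][d] = (v[i][d]
                           + c1 * r1 * (p[i][d] - x[i][d])
                           + c2 * r2 * (g[d] - x[i][d]))
                # Equation (4): move the particle.
                x[i][d] += v[i][d]
            f = fitness(x[i])
            if f < p_fit[i]:                       # update personal best
                p[i], p_fit[i] = x[i][:], f
                if f < fitness(g):                 # update global best
                    g = x[i][:]
    return g

# Usage: minimise the sphere function sum(x_d^2); the optimum is the origin.
print(pso(lambda xs: sum(xd * xd for xd in xs), dims=3))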

3. COMPARISON BETWEEN THE TWO SI MODELS


Although both models are principally similar in their inspirational origin (the intelligence of swarms) and are based on nature-inspired properties, they are fundamentally different in the following aspects.

Table 1: Comparison of ACO and PSO

Communication Mechanism: ACO uses an indirect communication mechanism among ants, called stigmergy, which means interaction through the environment. In PSO, the communication among particles is rather direct, without altering the environment.

Problem Types: ACO was originally used to solve combinatorial (discrete) optimization problems, but it was later modified to adapt to continuous problems. PSO was originally used to solve continuous problems, but it was later modified to adapt to binary/discrete optimization problems.

Problem Representation: ACO's solution space is typically represented as a weighted graph, called a construction graph. PSO's solution space is typically represented as a set of n-dimensional points.

Algorithm Applicability: ACO is commonly more applicable to problems where source and destination are predefined and specific. PSO is commonly more applicable to problems where previous and next particle positions at each point are clear and uniquely defined.

Algorithm Objective: ACO's objective is generally searching for an optimal path in the construction graph. PSO's objective is generally finding the location of an optimal point in a Cartesian coordinate system.

Examples of Algorithm Applications: ACO: sequential ordering, scheduling, assembly line balancing, probabilistic TSP, DNA sequencing, 2D-HP protein folding, and protein-ligand docking. PSO: tracking dynamic systems, evolving NN weights, analyzing human tremor, 3D-to-3D biomedical image registration, reactive power and voltage control, and even playing games.

4. CONCLUSION
ACO and PSO can be analyzed for future enhancement, so that new research can focus on producing better solutions by improving their effectiveness and reducing their limitations. More possibilities for dynamically determining the best destination through ACO can be evolved, and we plan to endow PSO with fitness sharing in order to investigate whether this helps to improve performance. In future work, the velocity of each individual should be updated by taking the best element found over all iterations rather than that of the current iteration only.

REFERENCES
[1] R Kumar, M K Tiwari and R Shankar, " Scheduling of flexible manufacturing systems:
an ant colony optimization approach", Proceedings Instn Mech Engrs Vol. 217, Part B: J
Engineering Manufacture, 2003,pp 1443-1453.

[2] Kuan Yew Wong, Phen Chiak See, " A New minimum pheromone threshold
strategy(MPTS) for Max-min ant system ", Applied Soft computing, Vol. 9, 2009, pp 882-
888.

[3] David C Mathew, Improved Lower Limits for Pheromone Trails in ACO", G Rudolf et
al(Eds), LNCS 5199, pp 508-517, Springer Verlag, 2008.

[4] Laalaoui Y, Drias H, Bouridah A and Ahmed R B, " Ant Colony system with stagnation
avoidance for the scheduling of real time tasks", Computational Intelligence in scheduling,
IEEE symposium, 2009, pp 1-6.

[5] E Priya Darshini, " Implementation of ACO algorithm for EDGE detection and Sorting
Salesman problem",International Journal of Engineering science and Technology, Vol 2, pp
2304-2315, 2010

[6] Alaa Alijanaby, KU Ruhana Kumahamud, Norita Md Norwawi, "Interacted Multiple Ant Colonies Optimization Framework: an experimental study of the evaluation and the

exploration techniques to control the search stagnation", International Journal of
Advancements in computing Technology Vol. 2, No 1, March 2010, pp 78-85

[7] Raka Jovanovic and Milan Tuba, " An ant colony optimization algorithm with
improved pheromone correction strategy for the minimum weight vertex cover problem",
Elsevier, Applied Soft Computing, pp. 5360-5366, 2011.

[8] Zar Ch Su Su Hlaing, May Aye Lhine, " An Ant Colony Optimisation Algorithm for
solving Traveling Salesman Problem", International Conference on Information Communication and Management (IPCSIT), Vol. 6, pp. 54-59, 2011.

[9] D. Karaboga, An Idea Based On Honey Bee Swarm for Numerical Optimization,
Technical Report-TR06,Erciyes University, Engineering Faculty, Computer Engineering
Department, 2005.

[10] M. Bakhouya and J. Gaber, An Immune Inspired-based Optimization Algorithm:


Application to the Traveling Salesman Problem, Advanced Modeling and Optimization,
Vol. 9, No. 1, pp. 105-116, 2007.

[11] K. N. Krishnanand and D. Ghose, Glowworm Swarm Optimization for searching


higher dimensional spaces. In: C. P. Lim, L. C. Jain, and S. Dehuri (eds.) Innovations in
Swarm Intelligence. Springer, Heidelberg, 2009.

[12] M. P. Wachowiak, R. Smolkov, Y. Zheng, J. M. Zurada, and A. S. Elmaghraby, An


approach to multimodal biomedical image registration utilizing particle swarm
optimization, IEEE Transactions on Evolutionary Computation, 2004.

[13] L. Messerschmidt, A. P. Engelbrecht, Learning to play games using a PSO-based


competitive learning approach, IEEE Transactions on Evolutionary Computation, 2004.

[14] T. Blackwell and P. J. Bentley, Improvised music with swarms, In David B. Fogel,
Mohamed A. El-Sharkawi, Xin Yao, Garry Greenwood, Hitoshi Iba, Paul Marrow, and
Mark Shackleton (eds.), Proceedings of the 2002 Congress on Evolutionary Computation
CEC 2002, pages 14621467, IEEE Press, 2002.


An Efficient Rough Set Approach in Querying Covering Based Relational Databases
P. Prabhavathy
School of Information Technology and Engineering
Vellore Institute of Technology, Vellore, India

Dr. B. K. Tripathy
School of Computing Science and Engineering
Vellore Institute of Technology, Vellore, India

ABSTRACT
Handling uncertainty and incompleteness of knowledge is a challenging task in information systems. Rough set theory enhances databases by allowing for the management of uncertainty. Rough sets, due to their versatility, can be integrated into an underlying database model, relational or object oriented, and can also be used in the design and querying of databases. Beaubouef and Petry extended relational databases to introduce rough relational databases. Rough Relational Databases (RRDB) are databases that can have multivalued attributes. Querying data from these databases becomes quite difficult because these multivalued attributes are indiscernible. In the past, the concept of rough sets has been used to query data from RRDB. In this paper, we introduce second type covering-based rough sets to query data by involving a cover set instead of the conventional equivalence class. This increases the number of possible data retrievals. Also, we encode multivalued attributes into a simplified binary code, which makes data querying more efficient. Subsequently, a comparative study between classical rough sets and second type covering-based rough sets for querying data is drawn.

Keywords
Rough Sets, Relational Databases, Query, Covering.

1. INTRODUCTION

Impreciseness has become a common feature of most databases in recent times. Fuzzy sets and neural networks were considered to be two of the efficient models to handle uncertainty [1] in databases. Since the introduction of rough sets by Pawlak in 1982 [8], this theory has overtaken most of the other techniques and


has proved itself as an effective tool used to handle uncertainty in information


systems. Rough set theory was incorporated into the classical relational
databases, and the result is termed the rough relational database model for uncertain information systems. There are several features common to the rough relational database (RRDB) and the relational database (RDB); the primary difference between them is that in the relational database model (RDM) the attribute values
are atomic and singleton whereas in rough relational database model (RRDM)
[2] the attribute values are multivalued. Previous work on RRDB includes its
architecture, rough information entropy, rough relational operations, rough
functional dependency, rough normalization and the theory of rough data
querying [4], [5], [6]. Rough set theory [7] is a mechanism which can be used
for rough data management as well as query handling. Data querying to fetch
attribute values is divided into two types: certain data querying and possible
data querying. RRDBs involve rough sets that are used to query indiscernible
data through the use of equivalence classes to determine lower and upper
approximation regions.

In the following sections, we introduce second type covering-based rough sets


to increase the number of possible data retrievals. This is done by involving a
cover set instead of the conventional equivalence class. Also, we use the
encoding technique for querying data [9] to maximize the efficiency. This is
achieved by means of a function mapping to establish a relationship between
the attribute values and the elements of the covering set.

Section 2 describes related work. Section 3 reviews some basic concepts about rough set theory, covering based rough sets and the rough relational database. Sections 4 and 5 discuss encoding using rough sets and using second type covering-based rough sets, respectively. Section 6 discusses querying data using the encoded values. Section 7 gives the performance analysis. Finally, we conclude our work in Section 8.

2. RELATED WORKS
In rough relational databases, knowledge about entropy [3] can either guide the
database user towards less uncertain data or act as a measure of the uncertainty
of a data set or relation. As rough relations become larger in terms of the
number of tuples or attributes, the automatic calculation of some measure of
entropy becomes a necessity. The decomposition principle and project principle
were applied for querying data from RRDB. In the decomposition approach [9], the


RRDB is decomposed into standard relational tables according to the semantics of the queried data, and then SQL and rough relational operators are used to get the results. This approach needs the decomposition of the RRDB and querying according to the semantics of the queried data, which wastes time and storage space. A rough relational database transform approach [12] is based on decomposing the data of the rough relational database and transforming it into a relational database, according to the characteristics of rough relational and relational databases, by virtue of the multiplication principle and the Cartesian product of basic relational algebra operations. Redundant data are then deleted and the RDB is optimized. Finally, a translation algorithm is designed and applied to a soil example, which demonstrates its validity. However, the data value items of the transformed and optimized RDB are repeated.

Rough data querying has also been discussed based on granular computing [10]. That approach calculates the lower approximation and upper approximation of every atomic value in the attribute domains, and obtains the final results by rough set operation principles. However, calculating the lower and upper approximations of an atomic value requires scanning all tuples of a table, so calculating all the atomic values takes a very long time; it also needs processing of the semantics of the queried data. Covering based rough sets [11], [13] provide generality as well as better modelling power than basic rough sets. Also, this new model unifies many other extensions of the basic rough set model. SQL languages have been used in rough data querying, where the authors extended SQL [4] and obtained results based on comparison between equivalence classes rather than values. But comparison between equivalence classes is processed by comparing the sets of the equivalence classes, and its efficiency is a big problem.

3. BASIC CONCEPTS
3.1 Rough set theory
Definition 1
Let U be a finite set and R an equivalence relation on U. R generates a partition U/R = {Y1, Y2, ..., Ym} on U, where Y1, Y2, ..., Ym are the equivalence classes generated by the equivalence relation R. For any X ⊆ U, the lower approximation of X is the union of all equivalence classes Yi that are fully contained in X, and the upper approximation of X is the union of all equivalence classes Yi that have a non-empty intersection with X.


3.2 Covering based rough sets


Let (U, C) be a covering approximation space.
Definition 2 (Covering)
Let U be a universe and C a family of subsets of U. C is called a cover of U if no subset in C is empty and ∪C = U. The ordered pair (U, C) is called a covering approximation space if C is a cover of U.

It is clear that a partition of U is certainly a covering of U, so the concept of a


covering is an extension of a partition. In the following discussion, unless
stated to the contrary, the coverings are considered to be finite, that is,
coverings consist of a finite number of sets in them. First, we list some
definitions about coverings to be used in this paper.

3.2.1 Second type of covering based rough sets


For the second type of covering-based rough set model, the lower
approximation is the same as in the first type of covering based rough set
model.

Definition 3 (SL)
By the second type of lower approximation of a set X ⊆ U in the space <U, C> we mean the set:
For all X ⊆ U, SL(X) = ∪{K | K ∈ C, K ⊆ X}
We define the second type of covering based upper approximation operation.

Definition 4 (SH)
Let C be a covering of U. The second type of covering upper approximation operation is defined as follows:
For all X ⊆ U, SH(X) = ∪{K | K ∈ C, K ∩ X ≠ ∅}
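As a small, hedged illustration of the two operators above (the universe, cover and function names below are made up for the example and are not taken from the paper), the lower approximation unions the cover blocks contained in X, while the upper approximation unions the blocks that meet X:

def second_type_lower(cover, X):
    # SL(X): union of all cover blocks fully contained in X.
    result = set()
    for K in cover:
        if K <= X:
            result |= K
    return result

def second_type_upper(cover, X):
    # SH(X): union of all cover blocks that intersect X.
    result = set()
    for K in cover:
        if K & X:
            result |= K
    return result

# Illustrative cover of a small universe; blocks may overlap.
cover = [{1, 2}, {2, 3, 4}, {4, 5}]
X = {2, 3}
print(second_type_lower(cover, X))   # set() - no block lies wholly inside X
print(second_type_upper(cover, X))   # {1, 2, 3, 4}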

The concept and properties of second type covering based rough sets have been sufficiently discussed in the literature. We now discuss how these concepts are incorporated in rough relational databases.

3.3 Rough relational database model (RRDM)


The rough relational database has several features in common with the ordinary
relational database. Both models represent data as a collection of relations


containing tuples. The integration of rough sets with the traditional database model is defined as the rough relational database model (Beaubouef, Petry, & Buckles, 1995). It is a logical database model where domains are partitioned into equivalence classes. The domains of the attributes are partitioned by some equivalence relation designated by the database user. Within each domain, a group of values that is considered indiscernible forms an equivalence class.

These relations are considered as sets where the tuples in the relation are the elements and, like the elements of sets in general, are unordered and non-duplicated. In the ordinary relational database, a tuple ti takes the form (di1, di2, ..., dim), where Dj is a domain set and dij ∈ Dj.

But in a rough relational database dij ⊆ Dj, and although it is not required that dij be a singleton, dij ≠ ∅ (since it includes non-first normal form relations). Let P(Di) denote the powerset of Di minus the empty set.

3.4 Rough relational database model definition (RRDB)


A rough relational database is defined as follows:
S = (U, A, D, R), where U is the set of all the tuples, A is the attribute set, D is the set of domains of the attributes, and R is the set of equivalence classes on D.
In an RRDB, for an attribute Ai ∈ A, DAi is the domain of Ai, RAi is the equivalence class of attribute Ai, and for a tuple r ∈ U, r(Ai) is the tuple r's value on attribute Ai, with r(Ai) ⊆ DAi.
In fact, an RRDB is a special kind of multi-valued information system according to the definition of an information system.
Definition 5: A rough relation is a subset of the set cross product P(D1) × P(D2) × ... × P(Dm).
Definition 6: An interpretation α = (a1, a2, ..., am) of a rough tuple ti = (di1, di2, ..., dim) is any value assignment such that aj ∈ dij for all 1 ≤ j ≤ m; aj is called a sub-interpretation of dij.

We have considered the following TV Shows data as an example for the discussion. This data describes the various TV shows watched by persons of different age groups.


Table 1: TV Shows data


ROW ID | Person Age Groups | TV Shows
ROW1 | {Toddler} | {Baby Shows}
ROW2 | {Kid, Pre-teen, Teenage, Toddlers} | {Cartoons, Educational}
ROW3 | {Pre-teen, Teenage} | {Teenage comedy shows}
ROW4 | {Pre-teen, Teenage, Young Adult} | {Rom-com, Sitcom}
ROW5 | {Young Adult, Adult} | {Sitcom, Serials, Music videos}
ROW6 | {Pre-teen, Teenage, Young Adult, Adult, Young Middle Aged, Middle Aged, V. Young SC} | {Movies, Documentaries}
ROW7 | {Pre-teen, Teenage, Young Adult, Adult, Young Middle Aged} | {Reality Shows}
ROW8 | {Adult, Young Middle Aged, Middle Aged, V. Young SC, Young SC, SC, Old SC} | {News}
ROW9 | {Kid, Pre-teen, Teenage, Young Adult, Young Middle Aged, Middle Aged, V. Young SC, Young SC} | {Sports Shows, Infotainment}
ROW10 | {V. Young SC, Young SC, SC, Old SC} | {Old Classics, Serials}
ROW11 | {SC, Old SC} | {Religious Shows, Serials}

In Table 1, we have shown a relationship between the TV Shows and its


viewers who are of different age groups.
The different age groups are categorized as follows:
Toddler: 3-5 years; Kid: 6-9 years; Pre-teen: 10-12 years; Teenage: 13-17 years
Young Adult: 18-20 years; Adult: 21-39 years; Young Middle Aged: 40-49
years; Middle Aged: 50-54 years; Very Young Senior Citizen: 55-64 years;
Young Senior Citizen: 65-74 years; Senior Citizen: 75-84 years;
Old Senior Citizen: 85+ years.

4. ENCODING USING ROUGH SETS


Let the domain and equivalence classes of a multivalued attribute a be represented by Da and Ra respectively. Let K be an arbitrary value of this multivalued attribute for which the encoding is to be done. Let the encoded data be represented by E = e1e2...em, where m is the number of equivalence classes. Then ei (i = 1, 2, ..., m) = 1 if K partly or wholly belongs to the equivalence class Yi of Ra, and ei = 0 otherwise.

Consider the multivalued attribute Person Age Groups in the rough relational
database depicted in Table 1.

DPAG = {Toddler, Kid, Pre-teen, Teenage, Young Adult, Adult, Young Middle
Aged, Middle Aged, Very Young Senior Citizen, Young Senior Citizen, Senior
Citizen, Old Senior Citizen}

RPAG= {Y1, Y2, Y3, Y4, Y5} = {[Toddler, Kid] [Pre-teen, Teenage] [Young
Adult, Adult] [Young Middle Aged, Middle Aged] [Very Young Senior
Citizen, Young Senior Citizen, Senior Citizen, Old Senior Citizen]}

Now, for instance, to encode the arbitrary value K = {Kid, Pre-teen, Teenage, Toddlers} of the multi-valued attribute Person Age Groups in ROW2, it is compared with each equivalence class of RPAG. K exists in the first two equivalence classes, due to which the first two bits are 1 each and the remaining bits are 0. Therefore, the PAG_code for ROW2 is 11000, as shown in Table 2.

Similarly, consider the multi-valued attribute TV Shows depicted in Table 1. DTV_Shows = {Baby Shows, Cartoons, Educational, Teenage
Comedy Shows, Rom Com, Sitcom, Serials, Music Videos, Movies,
Documentaries, RealityShows, News, SportsShows, Infotainment, OldClassics,
ReligiousShows}

RTV_Shows= {[Baby Shows, Cartoons, Educational] [Teenage Comedy Shows,


Rom coms, Sitcoms] [Serials, Reality Shows] [Movies, Old Classics, Music
Videos] [Documentaries, Infotainment, News] [Sports, News] [Religious
Shows]}.

To encode the arbitrary value K = {Cartoons, Educational} of the multi-valued


attribute TV_Shows in ROW2, it is compared with each equivalence class of
RTVS. K exists in the first equivalence class due to which the first bit is 1, the
remaining bits are 0. Therefore, the TVS_code for ROW2 is 1000000 as shown
in Table 2.


Table 2. TV Shows Data after Encoding


ROW ID | Person Age Groups | PAG_Code (rough set-based) | PAG_Code (covering-based) | TV Shows | TVS_Code (rough set-based) | TVS_Code (covering-based)
ROW1 | {Toddler} | 10000 | 10000 | {Baby Shows} | 1000000 | 100000
ROW2 | {Kid, Pre-teen, Teenage, Toddlers} | 11000 | 11000 | {Cartoons, Educational} | 1000000 | 100000
ROW3 | {Pre-teen, Teenage} | 01000 | 01000 | {Teenage comedy shows} | 0100000 | 110000
ROW4 | {Pre-teen, Teenage, Young Adult} | 01100 | 01100 | {Rom-com, Sitcom} | 0100000 | 110000
ROW5 | {Young Adult, Adult} | 00100 | 01100 | {Sitcom, Serials, Music videos} | 0111000 | 011100
ROW6 | {Pre-teen, Teenage, Young Adult, Adult, Young Middle Aged, Middle Aged, V. Young SC} | 01111 | 01110 | {Movies, Documentaries} | 0001100 | 001110
ROW7 | {Pre-teen, Teenage, Young Adult, Adult, Young Middle Aged} | 01110 | 01110 | {Reality Shows} | 0010000 | 011000
ROW8 | {Adult, Young Middle Aged, Middle Aged, V. Young SC, Young SC, SC, Old SC} | 00111 | 00111 | {News} | 0000100 | 000110
ROW9 | {Kid, Pre-teen, Teenage, Young Adult, Young Middle Aged, Middle Aged, V. Young SC, Young SC} | 11111 | 11110 | {Sports Shows, Infotainment} | 0000010 | 000111
ROW10 | {V. Young SC, Young SC, SC, Old SC} | 00001 | 00011 | {Old Classics, Serials} | 0011000 | 011100
ROW11 | {SC, Old SC} | 00001 | 00001 | {Religious Shows, Serials} | 0010001 | 011001

The above table shows the TV Shows data after applying encoding using rough sets and covering based rough sets.


5. ENCODING USING COVERING-BASED SET


Let the domain and covering set of a multivalued attribute a be represented by Dac and Rac respectively. Let Kc be an arbitrary value of this multivalued attribute for which the encoding is to be done.

Let the encoded data be represented by Ec = e1c e2c ... enc, where n is the number of covering sets. Then eic (i = 1, 2, ..., n) = 1 if Kc partly or wholly belongs to the covering set Zic of Rac, and eic = 0 otherwise. Consider the multivalued attribute Person Age Groups in the relational database depicted in Table 1. DPAGc = {Toddler, Kid, Pre-teen, Teenage, Young Adult, Adult, Young Middle Aged, Middle Aged, Very Young Senior Citizen, Young Senior Citizen, Senior Citizen, Old Senior Citizen}
RPAGc = {Z1c, Z2c, Z3c, Z4c, Z5c} = {[Toddler, Kid] [Kid, Pre-teen, Teenage, Young
Adult] [Young Adult, Adult, Young Middle Aged] [Young Middle Aged,
Middle Aged, Very Young Senior Citizen, Young Senior Citizen, Senior
Citizen, Old Senior Citizen] [Senior Citizen, Old Senior Citizen]}
To encode the arbitrary value Kc = {Kid, Pre-teen, Teenage, Toddlers} of the
multi-valued attribute Person Age Groups in ROW2, it is compared with
each covering set of RPAGc. Kc exists in the first two covering sets, due to which the first two bits are 1 each and the remaining bits are 0. The PAG_code is 11000 as
shown in Table 2.
Similarly, DTV_Shows = {Baby Shows, Cartoons, Educational, Teenage Comedy
Shows, Rom Com, Sitcom, Serials, Music Videos, Movies, Documentaries,
Reality Shows, News, Sports Shows, Infotainment, Old Classics, Religious
Shows}.
RTVSc = {[Baby Shows, Cartoons, Educational, Teenage Comedy Shows, Rom
com, Sitcom] [Teenage Comedy Shows, Rom com, Sitcom, Serials, Reality
Shows] [Serials, Reality Shows, Movies, OldClassics, MusicVideos] [Movies,
OldClassics, MusicVideos, Documentaries, Infotainment, News]
[Documentaries, Infotainment, News, SportShows] [SportShows,
ReligiousShows]}
To encode the arbitrary value Kc = {Cartoons, Educational} of the multi-valued
attribute TV_Shows in ROW2, it is compared with each covering set of
RTVSc. Kc exists in the first covering set, due to which the first bit is 1 and the remaining bits are 0. The TVS_code is 100000, as shown in Table 2.
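The encoding step of Sections 4 and 5 can be sketched as a single routine that tests an attribute value against a list of blocks (equivalence classes for the rough-set code, possibly overlapping covering sets for the covering-based code). The function name and data literals below are ours, chosen only to mirror the ROW2 example, and are not part of the paper.

def encode(value, blocks):
    # Bit i of the code is 1 if the value shares at least one member with block i.
    return ''.join('1' if set(value) & set(block) else '0' for block in blocks)

# Equivalence classes of "Person Age Groups" used for the rough-set code (RPAG).
R_PAG = [["Toddler", "Kid"],
         ["Pre-teen", "Teenage"],
         ["Young Adult", "Adult"],
         ["Young Middle Aged", "Middle Aged"],
         ["Very Young Senior Citizen", "Young Senior Citizen",
          "Senior Citizen", "Old Senior Citizen"]]

K = ["Kid", "Pre-teen", "Teenage", "Toddler"]   # the ROW2 value
print(encode(K, R_PAG))                          # prints "11000", matching Table 2

Passing the covering sets of RPAGc instead of the equivalence classes of RPAG produces the covering-based code in exactly the same way.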


6. QUERYING DATA USING THE ENCODED VALUES

Let us consider the target data A = {Young Adult, Adult, Middle Aged}, A ⊆ DPAG. We determine all those TV Shows that are applicable to one or more age groups in the target data A.

Similarly, taking the target data B = {Teenage Comedy Shows, Serials}, B ⊆ DTVS, we find the corresponding Person Age Groups. For target sets similar to
A and B, we calculate retrievals using both Rough and second type Covering
Encodings separately.

6.1 Algorithm for Rough Set

1. Establish multivalued attributes a and b.
2. Establish a target set X and set increment i to 1.
3. For each x ∈ X do
4. While i is less than or equal to the number of EQUIVALENCE CLASSES
   4.1. Establish Yi ∈ Ra.
   4.2. If (x ∩ Yi) is not equal to ∅
        4.2.1. Add Yi to the upper approximation set.
5. Establish a retrieved data set Result.
6. Set increment k to 1 and set increment j to 1.
7. While k is less than or equal to the number of tuples
   7.1. Establish Ek ∈ a_code.
   7.2. While j is less than or equal to the number of EQUIVALENCE CLASSES
        7.2.1. Establish ej ∈ Ek.
        7.2.2. Set increment l to 1.
        7.2.3. While l is less than or equal to the number of terms in the upper approximation set
               7.2.3.1. Establish Yi in the upper approximation set.
               7.2.3.2. For each Yi do
                        7.2.3.2.1. If (j equals i) and (ei equals 1)
                                   7.2.3.2.1.1. Add bk to Result.
                                   7.2.3.2.1.2. Goto 8.


6.2 Algorithm for 2nd Type Covering-Based Rough Set

1. Establish multivalued attributes a and b.
2. Establish a target set X.
3. Set increment i to 1.
4. For each x ∈ X do
5. While i is less than or equal to the number of COVER_SETS
   5.1. Establish Yi ∈ Ra.
   5.2. If (x ∩ Yi) is not equal to ∅
        5.2.1. Add Yi to the upper approximation set.
6. Establish a retrieved data set Result.
7. Set increment k to 1.
8. Set increment j to 1.
9. While k is less than or equal to the number of tuples
   9.1. Establish Ek ∈ a_code.
   9.2. While j is less than or equal to the number of COVER_SETS
        9.2.1. Establish ej ∈ Ek.
        9.2.2. Set increment l to 1.
        9.2.3. While l is less than or equal to the number of terms in the upper approximation set
               9.2.3.1. Establish Yi in the upper approximation set.
               9.2.3.2. For each Yi do
                        9.2.3.2.1. If (j equals i) and (ei equals 1)
                                   9.2.3.2.1.1. Add bk to Result.
                                   9.2.3.2.1.2. Goto 8.
Lines 1 to 5 in the algorithms above give the upper approximation for the target set X. Lines 6 to 9 are used to retrieve the possible data.
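A minimal Python sketch of this retrieval step is given below: the blocks meeting the target set form the upper approximation, and any tuple whose code carries a 1 in one of those positions is returned as a possible answer. The row layout, function names and data literals are our own simplification of Table 2, not the paper's code.

def upper_approximation_indices(target, blocks):
    # Indices of the blocks (equivalence classes or covers) that meet the target.
    return [i for i, block in enumerate(blocks) if set(block) & set(target)]

def possible_query(rows, code_column, target, blocks):
    # Fetch the rows whose code has a 1 in any upper-approximation position.
    hits = upper_approximation_indices(target, blocks)
    return [row for row in rows
            if any(row[code_column][i] == '1' for i in hits)]

# Two rows of Table 2 in a simplified layout: (row id, PAG_code, TV Shows).
rows = [("ROW4", "01100", "{Rom-com, Sitcom}"),
        ("ROW10", "00001", "{Old Classics, Serials}")]
R_PAG = [["Toddler", "Kid"], ["Pre-teen", "Teenage"], ["Young Adult", "Adult"],
         ["Young Middle Aged", "Middle Aged"],
         ["Very Young Senior Citizen", "Young Senior Citizen",
          "Senior Citizen", "Old Senior Citizen"]]
A = ["Young Adult", "Adult", "Middle Aged"]
print(possible_query(rows, 1, A, R_PAG))   # ROW4 is fetched, ROW10 is not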

Consider A
1. The rough encoding upper approximation can be given by: {Y3, Y4}
Now, putting 1s at the 3rd and/or 4th positions of the rough encoding data {e1, e2, e3, e4, e5}, we get {00100}, {00010} and {00110}. All the tuples whose rough encoding values have a 1 in these positions are retrieved. For example, from Table 1, ROW4, ROW5, ROW6, ROW7, ROW8 and ROW9 are fetched. Here, the total number of tuples is 6.


2. The second type covering-based upper approximation can be given by: {Z2, Z3, Z4}
Now, putting 1s at the 2nd and/or 3rd and/or 4th positions of the covering encoding data {ec1, ec2, ec3, ec4, ec5}, we get {01000}, {00100}, {00010}, {01100}, {00110}, {01010} and {01110}. All the tuples whose covering encoding values have a 1 in these positions are retrieved. For example, from Table 1, ROW2, ROW3, ROW4, ROW5, ROW6, ROW7, ROW8, ROW9 and ROW10 are fetched. Here, the total number of tuples is 9.

Consider B
1. The rough encoding upper approximation can be given by: {Y2, Y3}
Now, putting 1s at the 2nd and/or 3rd positions of the rough encoding data {e1, e2, e3, e4, e5, e6, e7}, we get {0100000}, {0010000} and {0110000}. All the tuples whose rough encoding values have a 1 in these positions are retrieved. For example, from Table 1, ROW3, ROW4, ROW5, ROW7, ROW10 and ROW11 are fetched. Here, the total number of tuples is 6.

2. The second type covering-based upper approximation can be given by: {Z1, Z2, Z3}
Now, putting 1s at the 1st and/or 2nd and/or 3rd positions of the covering encoding data {ec1, ec2, ec3, ec4, ec5, ec6}, we get {100000}, {010000}, {001000}, {110000}, {011000}, {101000} and {111000}. All the tuples whose covering encoding values have a 1 in these positions are retrieved. For example, from Table 1, ROW1, ROW2, ROW3, ROW4, ROW5, ROW6, ROW7, ROW10 and ROW11 are fetched. Here, the total number of tuples is 9.

7. PERFORMANCE ANALYSIS

We have considered the number of data retrievals as a metric for querying rough relational databases with the rough set approach and the second type covering based rough set approach. The second type covering based rough set approach retrieves a larger number of data items than the rough set approach. This is very useful for a user looking for the maximum amount of data while querying the database.


In order to fetch maximum data from the table for a target X, we calculate upper approximations for both the rough and the second type covering approaches and compare the number of retrievals. We take 10 such target sets and plot them against the number of retrievals, as shown in Figure 1 and Figure 2.

The sample target data set taken for Figure 1:

{YoungAdult, Adult, MiddleAged}, {Pre-teen, Adult} , {V.YoungSC, OldSC},


{YoungMiddleAged, OldSC}, {SeniorSC, OldSC}, {Toddler, Kid,
YoungAdult},{Kid, YoungMiddleAged},{Kid, Pre-teen}, {YoungMiddleAged,
MiddleAged}, {Toddler, OldSC}.

The sample target data set taken for Figure 2:

{TeenageComedyShows, Serials}, {RealityShows, OldClassics}, {BabyShows,


Cartoons}, {ReligiousShows}, {RomCom, Movies}, {MusicVideos, News,
SportShows}, {ReligiousShows, Serials}, {OldClassics, Movies,
Documentaries}, {SportShows, News}, {Educational, Infotainment}.

Figure 1: Using Person Age Groups to retrieve TV Shows


Figure 2: Using TV Shows to retrieve Person Age Groups

From Figure 1 and Figure 2, we see that querying data using the second type covering increases the number of possibilities, irrespective of the covering set chosen.

8. CONCLUSIONS
Uncertainty is a key challenge in the database domain. The multivalued attributes present in a rough relational database make data querying a complex task. In this paper, we have proposed a rough set approach and a second type covering-based approach to make data querying an easier task. In these approaches, the multivalued attributes are encoded into a simple binary code, which makes data retrieval more efficient. We addressed the second type covering based rough set, which produces a better result than the traditional rough set approach by giving the maximum possible data for querying. In future work, in order to further improve performance and the response time of queries, we need to analyze the query using various types of covering based rough sets. Another future research topic would be to apply covering based rough set theory to the spatial database domain.


REFERENCES
[1] Beaubouef, T, 1994, Uncertainty processing in a relational database model via a rough set
representation, University Microfilms International, A Bell & Howell Information Company,
PhD, dissertation, pp. 67-76.

[2] Beaubouef, T., Petry, F. and Buckles, B., 1995, Extension of the relational database and its algebra with rough set techniques, Computational Intelligence, vol.11, no.2, pp.233-245.

[3] Beaubouef, T., Petry, F. and Aroar, G., 1998, Information theoretic measures of uncertainty for rough sets and rough relational databases, Information Science, vol.109, pp.185-195.

[4] Cao, F., Liang, J, 2004, The Rough Data Query Based on SQL Language, Computer
Science, vol.31, no.2.

[5] Hu, Xing lei, Hong, Xiaoguang and Yuan, Yu, 2007, A high efficiency approach to
querying rough data, Fourth International Conference on Fuzzy Systems and Knowledge
Discovery.

[6] Nakata, M., Murai, T, 2001, Data Dependencies over Rough Relational Expressions, In:
IEEE Intl. Fuzzy Systems Conf, pp. 1543-1546.

[7] Pawlak, Z, 1982, Rough Sets, International Journal of Computer and Information science,
vol.11, no.5, pp.341-356.

[8] Pawlak, Z, 1991, Rough sets - Theoretical aspects of reasoning about data, Dordrecht:
Kluwer Academic Publishers, pp. 68-162.

[9] Qiusheng, A, Wang, G., Shen, J. and Xu, J, 2003, Querying Data from RRDB Based on
Rough Sets Theory, Springer-Verlag, pp. 342-345.

[10] Qiusheng, A, Yusheng, Z. and Wenxiu Z, 2005, The study of rough relational database
based on granular computing, Granular Computing, IEEE International Conference on
Granular Computing, vol.1, pp.108-111.

[11] Tripathy, B.K. and Patro, V.M, 2009, Covering Based Rough set approach to uncertainty
management in databases, IIM Ahmadabad.

[12] Wei, Ling-ling, Zhang, Z, 2009, A method for rough relational database transformed into
relational database, IITA International Conference on Services Science, Management and
Engineering.

[13] William Zhu, 2006, Properties of the second type covering-based rough sets, In
Workshops Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence
and Intelligent Agent Technology, pp.494-497.
