Doctor of Philosophy

By
Shashi Bhushan
Reg. No.: 2K06NITK-PhD-1095

Under the supervision of

Dr. Mayank Dave
Associate Professor
Dept. of Computer Engineering
NIT, Kurukshetra

Dr. R. B. Patel
Associate Professor
Dept. of Computer Science & Engineering
G. B. Pant Engineering College, Pauri Garhwal (Uttarakhand)
Shashi Bhushan
This is to certify that the above statement made by the candidate is correct to the best of
our knowledge.
Date:
Acknowledgement
Success in life is never attained single-handed. First and foremost, I would like to express my sincere gratitude to my supervisor, Dr. Mayank Dave, for his continuous support, encouragement and enthusiasm. I thank him for all the energy and time he has spent on me, discussing everything from research to career choices, reading my papers and guiding my research through the obstacles and setbacks. His professional yet caring approach towards people, his way of working and his passion for living life to the fullest have truly inspired me.
It is extremely difficult for me to express in words my gratitude towards my co-supervisor, Dr. R. B. Patel, who stood by me throughout my research work and guided me not only towards becoming an able researcher but also a good human being. His constant motivation made me believe in myself throughout this research work. Without his persuasion and interest, it would not have been possible for me to gain the confidence that I have today.
I express my deepest love to my loving kids, Ashu and Abhi, for their cooperation and for sacrificing the childhood time they could have enjoyed with their father. They always pray to God for my success in this work.
Shashi Bhushan
Abstract
In P2P networks, peers are rich in computing resources/services, viz., data files, cache storage, disk space, processing cycles, etc. Collectively, these peers contribute a huge pool of resources and collaboratively perform computing tasks using them. Peers can serve as both clients and servers, eliminating the need for a centralized node. A major drawback of P2P systems is that resources and nodes are only temporarily available: a network element may disappear from the network at any time and reappear at another locality with an unpredictable pattern.
One of the most challenging problems is how to place and access real-time information over the network, because requesters should always be able to locate the resources they need within some bounded delay. This requires management of information under time constraints and under the dynamism of the peers. Multiple challenges must be addressed to implement Real Time Distributed Database Systems (RTDDBS) over dynamic P2P networks. To enable resource awareness in such a large-scale dynamic distributed environment, a specific management system is required that takes into account the following P2P characteristics: reduction in redundant network traffic, data distribution, load balancing, fault tolerance, replica placement/update/assessment, data consistency, concurrency control, design and maintenance of a logical structure for replicas, etc. In this thesis, we have developed a solution for resource management that supports fault-tolerant operations, short path lengths for requested resources, low overhead in network management operations, well-balanced load distribution among the peers and a high probability of successful access from the defined quorums.
In this thesis, we have proposed a self-managed, fault-adaptive and load-adaptive middleware architecture called Statistics Manager and Action Planner (SMAP) for implementing a Real Time Distributed Database System (RTDDBS) over P2P networks. Various algorithms are also proposed to enhance the performance of different modules of SMAP. A Matrix Assisted Technique (MAT) is proposed to partition the database for implementing the RTDDBS; this approach also provides primary security to the database over unreliable peers and easy access to the information over P2P systems. A 3-Tier Execution Model (3-TEM) that integrates MAT for parallel execution is also proposed. 3-TEM enhances the throughput of the P2P system and balances the load among participating peers. A Timestamp based Secure Concurrency Control Algorithm (TSC2A) is also developed, which handles the issues of concurrent execution of transactions in the dynamic environment of P2P networks and provides security to both arriving transactions and data items. An approach called Common Junction Methodology (CJM) is proposed to reduce redundant traffic and improve response time in P2P networks through common junctions in the paths. The quorum acquisition time is reduced through a novel fault-adaptive algorithm called the Logical Adaptive Replica Placement Algorithm (LARPA), which implements a logical structure for dynamic environments and efficiently distributes replicas on one-hop-distance sites to improve data availability in an RTDDBS over a P2P system. A self-organized Height Balanced Fault Adaptive Reshuffle (HBFAR) scheme is proposed for improving hierarchical quorums over P2P systems; it improves data availability through logical arrangement of replicas. We finally conclude and compare the proposed middleware with some existing schemes.
Table of Contents

Candidate's Declaration
Acknowledgement
Abstract
Chapter 1: Introduction
    1.4  Motivation
    1.8  Organization of Thesis
    1.9  Summary
Chapter 2: Literature Review
    2.6.1  Partitioning Methods
    2.7  Concurrency Control
Chapter 3: Statistics Manager and Action Planner (SMAP)
    3.1  Introduction
    3.2  System Architecture
    Discussion
    3.5  Summary
Chapter 4: 3-Tier Execution Model (3-TEM)
    4.1  Introduction
    Assumptions
    Simulation Model
    4.7  Advantages of 3-TEM
    4.8  Discussion
    4.9  Summary
Chapter 5: Timestamp based Secure Concurrency Control Algorithm (TSC2A)
    5.1  Introduction
    Performance Metrics
    Assumptions
    5.7  Discussion
    5.8  Summary
Chapter 6: Topology Adaptive Traffic Controller for P2P Networks
    6.1  Introduction
    6.3  System Architecture
    6.4.2  System Analysis
    Simulation Model
    6.5.2  Performance Metrics
    Discussion
    6.8  Summary
Chapter 7: Logical Adaptive Replica Placement Algorithm (LARPA)
    7.1  Introduction
    7.3.1  LARPA Topology
    7.4  Implementation
    Performance Metrics
    7.6  Discussion
    7.7  Summary
Chapter 8: Height Balanced Fault Adaptive Reshuffle (HBFAR) Scheme
    8.1  Introduction
    8.3  System Architecture
    8.4.3  Rule Set-III: Rules for replica joining into the replica logical structure
    8.4.4  Rule Set-IV: Rules for Acquisition of Read/Write Quorum from HBFAR Logical Tree
    8.4.5  Correctness Proof of the Algorithm
    Performance Metrics
    8.6  Discussion
    8.7  Summary
Chapter 9: Conclusions and Future Scope
    9.1  Contributions
List of Figures

2.10  The path taken by a message originating from node 67493 destined for node 34567 in a Plaxton mesh using decimal digits of length 5 in Tapestry
2.11  Chord identifier circle consisting of the three nodes 0, 1 and 3. In this figure, key 1 is located at node 1, key 2 at node 3 and key 6 at node 0
2.12  (a) Example 2-d [0,1]x[0,1] coordinate space partitioned between 5 CAN nodes. (b) Example 2-d space after node F joins
2.13  JXTA Architecture
2.14  APPA Architecture
3.1  Architecture of Statistics Manager and Action Planner (SMAP)
4.1  3-Tier Execution Model (3-TEM) for P2P Systems
4.9  Relationship between Mean Transaction Arrival Rate vs. Miss Ratio
4.10  Relationship between Mean Transaction Arrival Rate vs. Restart Ratio
4.11  Relationship between Mean Transaction Arrival Rate vs. Abort Ratio
6.9  Average percentage reduction in Path Cost vs. Overlay Path (Hop Count)
6.10  Average percentage reduction in Response Time vs. Overlay Hop Count
7.10  Relationship between average search time and quorum size
7.11  Variation in network traffic with quorum size
7.12  Probability to Access Updated Data vs. Peer Availability
7.13  Response Time Comparison between LARPA1 and LARPA2
7.14  Message Overhead Comparison between LARPA1 and LARPA2
8.5  The HBFAR structure after Peer 2 leaves: Peer 4 takes the position of Peer 2, which has already left the network, and all other replicas in the downlink are readjusted accordingly
8.8  Comparison of average search time to form the quorum from the networks
List of Tables

4.2  Performance Metrics-II
7.2  Performance Metrics-III
List of Abbreviations

1-TEM
3-LTMS
3-TEM    3-Tier Execution Model
7-LTMS
AM    Authenticity Manager
APC
APL
ART
CCM
CJM    Common Junction Methodology
CL    Control Layer
CPU    Central Processing Unit
DA    Data Administrator
DAT
DBA    Database Administrator
DBMS    Database Management System
DCE
DD    Data Distributor
DL    Data Layer
DM    Data Manager
DS    Data Scheduler
DSS    Decision Support Systems
GCM
HBFAR    Height Balanced Fault Adaptive Reshuffle
HQC    Hierarchical Quorum Consensus
IL    Interface Layer
LA    Load Analyzer
LARPA    Logical Adaptive Replica Placement Algorithm
LD    Local Database
MAT    Matrix Assisted Technique
MTAR    Mean Transaction Arrival Rate
NCM
NL    Network Layer
NM    Network Manager
P2P    Peer-to-Peer
PA    Peer Analyzer
PAL    Peer Allocator
PC    Path Cost
PCS
PL    Path Length
PPQ
QEE
QI    Query Interface
QM    Quorum Manager
QO    Query Optimizer
QP    Quorum Processor
RA    Resource Allocator
RC    Result Coordinator
RDA
RL    Replication Layer
RM    Resource Manager
ROM
ROWA    Read One Write All
RP    Result Pool
RPB    Resource Publisher
RSM    Result Manager
RT    Response Time
RTDB    Real Time Database
RTDBS    Real Time Database System
RTDDBS    Real Time Distributed Database System
RTM
RTR
SC    Security Checker
SI
SM    Security Manager
SMAP    Statistics Manager and Action Planner
SQSM    Schema Scheduler
SSM
TAR
TC    Transaction Coordinator
TI    Transaction Interface
TLO
TM    Transaction Manager
TMR
TPP
TRR
TSC2A    Timestamp based Secure Concurrency Control Algorithm
TSR
TTL    Time to Live
UM    Update Manager
Chapter 1
Introduction
Peer-to-Peer (P2P) networks were developed in the early 90s and were used mostly for in-house purposes within companies and for limited applications of sharing information between cooperating researchers. When the Internet began to explode in the mid-90s, a new wave of ordinary people began to use it to exchange email, access web pages, and buy things, which was much different from the initial usage. As intelligent systems become more pervasive and homes become better connected, a new generation of applications is being deployed over the Internet [1]. In this scenario, P2P applications become very attractive because they improve scalability and enhance performance by enabling direct and real-time communication among the peers.
The rest of the chapter is organized as follows. P2P networks are introduced in Section 1.1. Objectives of P2P networks are presented in Section 1.2. Applications of P2P systems are given in Section 1.3. Section 1.4 discusses the motivation behind this research. Section 1.5 presents challenges in P2P systems. Section 1.6 gives the problem statement of this research. Section 1.7 presents the contributions of the thesis. The organization of the thesis is explored in Section 1.8. Finally, the chapter is summarized in Section 1.9.
1.4 Motivation
In the traditional client/server model, one powerful machine acts as the server, i.e., the service provider, and all other attached machines are clients, the service consumers. But over the last two decades this model has been facing new challenges due to increased demands in computing and data sharing. Capacity enhancement in the client/server model
In this thesis, we are looking for a self-organized system that addresses some of the above-mentioned issues. Thus, we are required to develop a suitable solution for resource management that supports fault-tolerant operations, short path lengths for requested resources, low overhead in network management operations, well-balanced load distribution between the peers and a high probability of successful access from the defined quorums. The developed system must be decentralized in nature, managing P2P applications and system resources in an integrated way; it must monitor the behavior of P2P applications transparently, obtain accurate resource projections, manage the connections between the peers, and distribute objects (data items/replicas) in response to user requests under dynamic processing and networking conditions. The developed system should also place/disseminate dynamic data intelligently at the appropriate peers. To achieve the desired data availability, data must be replicated over a group of suitable peers by the system. Further, the system should manage data consistency among replicas. It should be fault tolerant, capable of managing the load at every peer in the system, adaptable to peers joining and leaving the network, and able to address the database-related issues.
1. SMAP enables fast and cost-efficient deployment of information over the P2P network. It is a self-managed P2P system with the capability to deal with a high churn rate of peers in the network. SMAP is fault adaptive and provides load balancing among participating peers. It permits a true distributed computing environment in which every peer can use the resources of all other peers participating in the network. It provides data availability by managing replicas in an efficient logical structure. SMAP provides fast response times for transactions with time constraints. It reduces redundant traffic in P2P networks by shortening the conventional overlay path. It also addresses most of the implementation issues of P2P networks for an RTDDBS.
The underlay paths corresponding to different overlay hops may intersect at some point; this point is referred to as a Common Junction and is utilized to reroute the messages. CJM thereby reduces traffic in the underlay network without affecting the search scope of the P2P network. It supports fast response times by reducing the effective length of the overlay path; thus, the cost of transferring a unit of data from one peer to another is also reduced. The correctness of CJM is analyzed through a mathematical model as well as through simulation. It is implemented in the Network Layer of SMAP.
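To illustrate the idea, the following is a minimal sketch (illustrative Python; the peer/router names and the hop-count cost model are assumptions, not the thesis's implementation) of finding a common junction between the underlay paths of two consecutive overlay hops:

    # Hypothetical sketch of the common-junction idea (not the thesis code).
    # An overlay hop A->B and the next hop B->C each map to a multi-hop
    # underlay path; if the two underlay paths share a router J, the
    # message can be rerouted at J directly towards C, skipping the
    # detour through B.

    def common_junction(path_ab, path_bc):
        """Pick the shared underlay node that saves the most hops, if any.

        path_ab, path_bc: lists of underlay node ids, e.g. ["A", "r1", "B"].
        Returns (junction, hops_saved); (None, 0) if rerouting never helps.
        """
        pos_bc = {node: i for i, node in enumerate(path_bc)}
        cost_original = (len(path_ab) - 1) + (len(path_bc) - 1)
        best, saved = None, 0
        for i, node in enumerate(path_ab):
            if node in pos_bc:
                # hops to reach the junction, then hops from it to C
                cost_via = i + (len(path_bc) - 1 - pos_bc[node])
                if cost_original - cost_via > saved:
                    best, saved = node, cost_original - cost_via
        return best, saved

    # Example: the two underlay paths cross at router r2.
    print(common_junction(["A", "r1", "r2", "B"], ["B", "r2", "r3", "C"]))
    # -> ('r2', 2): two redundant underlay hops are avoided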
Chapter 1 briefly defines P2P networks and discusses their limitations and applications. A look is also taken at the challenges in existing systems and in the development of new ones. The objective of this research is followed by the contributions made in this dissertation. The chapter gives a roadmap of the dissertation and finally summarizes it. Chapter 2 presents a detailed review of the literature.
Chapter 3 presents a fast, cost-efficient and self-managed P2P system capable of dealing with a high churn rate of peers in the network. It gives the architectural view of the Statistics Manager and Action Planner (SMAP) system and the advantages behind the development of SMAP, followed by the summary of the chapter.
Finally, Chapter 9 concludes the work presented in this thesis, followed by the future scope.
1.9 Summary
In this chapter we have briefly defined P2P networks, their limitations and applications. A look has also been taken at the challenges in existing systems. The motivation for doing this research is followed by the contributions made in this dissertation. This chapter also gives a roadmap of the thesis.
In the next chapter we present the literature review.
Chapter 2
Literature Review
In recent years, the evolution of a new wave of innovative network architectures for P2P networks has been witnessed [17]. In these networks, all peers cooperate with each other to perform critical functions in a decentralized manner. All peers, i.e., both users and resource providers (service providers), can access each other directly without intermediary agents. Compared with a centralized system, a P2P system provides an easy way to aggregate large amounts of resources residing at the edge of the Internet or in ad hoc networks, with a low cost of system maintenance.
The rest of the chapter is organized as follows. P2P networks are explored in Section 2.1. Types of P2P networks are presented in Section 2.2. File sharing systems are given in Section 2.3. Section 2.4 discusses underlay and overlay networks. Section 2.5 presents challenges in P2P systems. Section 2.6 discusses parallelism in databases. Section 2.7 presents concurrency control. The topology mismatch problem is explored in Section 2.8. Replication for availability is given in Section 2.9. Section 2.10 explores quorum consensus. Databases are presented in Section 2.11 and some middleware in Section 2.12. An analysis of the reviewed work is presented in Section 2.13. Finally, the chapter is summarized in Section 2.14.
[Figure 2.1: a P2P network of directly connected peers]
P2P systems attract increasing attention from researchers. Such systems are characterized by direct access between peer systems, rather than access through a centralized server. More simply, a P2P network links the resources of all the peers on a network and allows these resources to be shared in a manner that eliminates the need for a central host. In P2P systems, peers with equal roles and responsibilities, often with various capabilities, exchange information or share resources directly with each other. Such systems function without any central administration or coordination instance. A P2P network differs from conventional client/server or multi-tiered server networks: the peers are both suppliers and consumers of resources, in contrast to the traditional client/server model where only servers supply and clients consume (see Figure 2.2).
[Figure 2.2: a client/server network versus a P2P network of peers]
[Figure 2.3: a distributed hash table, in which a hash function maps data keys onto peers across the Internet]
DHTs form an infrastructure that can be used to build P2P networks. DHT-based networks have been widely utilized for efficient resource discovery in grid computing systems, as they aid in resource management and in the scheduling of applications. Resource discovery involves searching for the appropriate resource types that match the user's application requirements. Recent advances in the domain of decentralized resource discovery have been based on extending existing DHTs with the capability of multidimensional data organization and query routing.
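As an illustration of the DHT principle described above, here is a minimal consistent-hashing sketch (illustrative Python; TinyDHT, the ring size and the peer names are invented for this example, and real DHTs such as Chord add finger tables and replication):

    import hashlib
    from bisect import bisect_left

    def h(value: str, space: int = 2**16) -> int:
        """Map a string to a point on the identifier ring."""
        return int(hashlib.sha1(value.encode()).hexdigest(), 16) % space

    class TinyDHT:
        def __init__(self, peers):
            # Each peer sits on the ring at the hash of its name.
            self.ring = sorted((h(p), p) for p in peers)

        def locate(self, key: str) -> str:
            """Return the peer responsible for a data key: the first
            peer at or after the key's ring position (wrapping around)."""
            ids = [pid for pid, _ in self.ring]
            i = bisect_left(ids, h(key)) % len(self.ring)
            return self.ring[i][1]

    dht = TinyDHT(["peer-A", "peer-B", "peer-C"])
    print(dht.locate("song.mp3"))  # deterministic owner of the key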
In pure P2P networks, peers act as equals, merging the roles of client and server. In such networks there is no central server managing the network, nor a central router. Some examples of pure P2P application-layer networks designed for file sharing are Gnutella and Freenet [27].
There also exist hybrid P2P systems, which divide their clients into two groups: client peers and overlay peers. Typically, each client is able to act according to the momentary needs of the network and can become part of the respective overlay network used to coordinate the P2P structure. This division between normal and better peers is made in order to address the scaling problems of early pure P2P networks such as Gnutella (version 2.2).
[Figure 2.4: a hybrid P2P network, in which peers register with a directory server, send search requests to it, and exchange data directly after receiving a response]
Another type of hybrid P2P network uses central server(s) or bootstrapping mechanisms on the one hand, and P2P data transfers on the other. These networks are generally called centralized networks because of their inability to work without their central server(s), e.g., the eDonkey network (eD2k) [28].
P2P file sharing architectures can be classified according to the extent to which they rely on one or more servers to facilitate the interaction between peers. P2P systems are categorized [29] into centralized, decentralized structured and decentralized unstructured, as shown in Figure 2.5.
Centralized: In this type of system, there is central control over the peers. A server carries the information regarding the peers, data files and other resources. Any peer that wants to communicate with another peer, or to use its resources, has to send a request to the server. The server searches for the location of the peer/resource in its database/index; after getting this information, the requesting peer communicates directly with the desired peer. This design is very similar to the client/server model; an example is Napster, which is very popular for sharing music files. Security measures can be implemented thanks to the central server: at the time a request is sent, the authorization and authentication of the peer can be checked.
[Figure 2.5: classification of Peer-to-Peer systems: Centralized (e.g., Napster); Decentralized Structured (e.g., Chord, CAN); Decentralized Unstructured (e.g., Gnutella, Freenet)]
It is easy to locate and search for an object/peer thanks to the central server, and these systems are easy to implement as the structure is similar to the client/server model, i.e., complexity is low.
However, these systems are not scalable due to limitations of computational capability, bandwidth, etc. They have poor fault tolerance due to the lack of object replication [30] and load balancing, and they are not reliable because of the single point of failure, malicious attacks and network congestion near the server. Such systems are the least secure, and the overhead on system performance is also high. Distributed databases may be used in these types of systems.
In centralized P2P systems, resource discovery is done using the central server, which keeps all the information regarding resources, e.g., Napster [13]. Multiple servers are used to enhance performance in centralized systems [31].
Decentralized Structured: Decentralized structured P2P networks (e.g., Chord [31], CAN [32, 33], Tapestry [34, 35], Pastry [34] and TRIAD [36]) use a logical structure to organize the peers of the network. These networks use a distributed hash table-like mechanism to look up files and are efficient in locating objects quickly, because the logical structure reduces the search space exponentially. It is easy to locate and search for an object/peer due to the logical structure, and message traffic is reduced in these networks. These systems are scalable due to dynamic routing protocols; they have good performance and are least affected by scale. They are reliable in nature and support failed-peer detection and replication of objects.
However, because these systems impose tight control over the overlay topology, they are not robust to peer dynamics. Their performance is severely affected if the churn rate is high, and they are not suitable for ad hoc peers. Database searching is comparatively complex compared with centralized systems [37].
Decentralized Unstructured: These systems are the actual P2P systems, i.e., the ones closest to the definition of P2P systems [38, 39]. There is no central control, and every peer may act both as a server (providing services) and as a client (consuming services). A peer that wants to communicate with another peer has to broadcast/flood its request to all connected peers in order to search for the peer/data object. Only a peer holding the data responds and sends the data object along the reverse path to the requesting peer. The flooding or broadcasting of requests creates unnecessary traffic on the network, which is the main drawback of these systems. A lot of work is going on to reduce this traffic, and various techniques have been proposed, viz., forwarding-based, cache-based and overlay optimization [40], etc.
These systems do not have tight control over the overlay topology, so they support peer dynamics; performance is not much affected by a high churn rate. They are distributed in nature, so there is no single point of failure.
Scalability is poor due to the traffic overhead of discovering objects/peers: as the system grows beyond a limit, its performance keeps decreasing, and it is very costly to search for a resource in an unstructured system. Flooding is used to search for resources; to enhance performance, Random Walk [41] and Location-aware topology matching [42] are used. For fault tolerance, self-maintenance and self-repairing techniques are used [17].
For providing security to information, these systems use PKI [18]. Alliatrust, a reputation management scheme [19], deals with threats such as free riders, polluted content, etc.
To cope with query loss and system overloading, a congestion-aware search protocol may be used [17]. This includes Congestion Aware Forwarding (CAF), Random Early Stop (RES) and Emergency Signaling (ES). Location-dependent queries use the Voronoi Diagram [43]. Structured and unstructured P2P networks each have their own advantages and disadvantages, and the file sharing system of a P2P network depends upon the application deployed on it. For implementing databases over P2P networks, structured file sharing systems have an advantage over unstructured ones, because of the multiple communications between peers and the reduced search time for data in the network.
Overlay networks have also been proposed as a way to improve Internet routing, such as through quality of service (QoS) guarantees to achieve higher quality streaming media [47]. Earlier proposals such as IntServ, DiffServ, and IP Multicast have not seen wide acceptance, largely because they require modification of all routers in the network. An overlay network, on the other hand, may be incrementally deployed on end hosts running the overlay protocol software, without cooperation from ISPs. The overlay has no control over how packets are routed in the underlying network between two overlay peers, but it controls the sequence of overlay peers a message traverses before reaching its destination.
[Figure 2.6: node mapping between the overlay topology at the application layer (logical connections) and the underlay topology at the network layer (physical connections)]
affinity (computation, interest, etc.) between peers. Researchers need to use this approach to provide algorithmic foundations for large-scale dynamic systems.
Fault tolerance becomes extremely important for P2P systems. The term fault tolerance means that a system can provide its services even in the presence of faults caused either by internal system errors or by some influence of its environment.
Thus, scalability [49] and reliability are defined in traditional distributed-system terms, such as bandwidth usage, how many systems can be reached from one peer, how many systems can be supported, how many users can be supported, and how much storage can be used. However, sometimes it is not possible to recover from a failure; it is then necessary that the system be capable of adequately providing its services in the presence of such partial failures. In case of a failure, a P2P system must be capable of providing continuous service while necessary repairs are being made. In other words, an operation such as routing between any two peers n1 and n2 must complete successfully even when some peers on the way from n1 to n2 fail unpredictably.
Reliability is related to system and network failures, disconnection, availability of resources, etc. Given the lack of a strong central authority for autonomous peers, improving system scalability and reliability is an important goal. As a result, algorithmic innovation in the area of resource discovery and search has been a clear area of research, resulting in new algorithms for existing systems and in the development of new P2P platforms.
Low cost for network maintenance [5, 50, 51]: The management of a peer's insertion into or deletion from the network, as well as the dissemination and replication of resources, generates control messages in the network. Control messages are mainly used to keep the changing network topology up-to-date and in a consistent state. However, since the number of control messages can become very large, and can even exceed the number of data packets, the proportion of control messages to data packets must be kept as low as possible. The cost of resource management should not be higher than the cost of the network resource utilization itself.
Load Balancing [51]: Load distribution is measured by investigating how well the network management duties are distributed between the peers in the network. Parameters for assessing this are, for example, the routing table and the location table at each peer of the system. A suitable resource management strategy for P2P should ensure a well-balanced distribution of the management duties between the peers of the system [3, 51, 53].
Peer Availability [54]: A peer's lifetime is the time between when it enters the overlay for the first time and when it leaves the overlay permanently. A peer's session time is the elapsed time between when it joins the overlay and when it subsequently leaves it. The sum of a peer's session times divided by its lifetime is defined as its uptime, or availability [55, 56, 57, 58, 59]. The availability of a P2P management solution defines the probability that a resource is successfully located in the system. A resource management strategy is said to be highly available when it enables any existing resource of the system to be found, when requested, with a probability of almost 100%. This depends on fault-tolerant routing and on the resource distribution strategies [2, 60].
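Restating this definition compactly (the symbols are introduced here for illustration): a peer with session times $s_1, \ldots, s_k$ and lifetime $\ell$ has availability

    $A = \dfrac{\sum_{i=1}^{k} s_i}{\ell}, \qquad 0 \le A \le 1,$

so, for example, a peer that was present for a total of 18 of the 24 hours of its lifetime has $A = 0.75$.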
Cost sharing/reduction: Centralized systems that serve many clients typically bear the majority of the cost of the system. When that main cost becomes too large, a P2P architecture can help spread it over all the peers [1, 18]. For example, in the file sharing space, the developed system will enable cost sharing of file storage and will be able to maintain the index required for sharing. Much of the cost sharing is realized by the utilization and aggregation of otherwise unused resources, which results both in net marginal cost reductions and in a lower cost for the most costly system component. Because peers tend to be autonomous, it is important for costs to be shared reasonably.
Logical Structure: The structures in which replicas/peers are connected play an important role in reducing the search time for replicas and the network traffic. The messages propagated to search for replicas in the structure can generate huge network traffic because of the topology mismatch problem. Structures should therefore be selected so as to minimize both search time and network traffic.
Underlay/Overlay Paths [61, 62]: A message travels multiple hops in the underlay for every one-hop path in the overlay. Each forwarding of a message along an overlay path adds heavy traffic to the physical network: the peers in the underlay are traversed multiple times while messages are forwarded along the overlay path. This causes redundant traffic in the network, and the network may slow down to the extent of choking. Unnecessary message forwarding, at least at the overlay level, should be minimized.
Resource Aggregation and Interoperability [7, 50]: A decentralized approach lends itself naturally to the aggregation of resources. Each peer in the P2P system brings with it certain resources, such as computing power or storage space. Applications that benefit from huge amounts of these resources, such as compute-intensive simulations or distributed file systems, naturally lean toward a P2P structure to aggregate the resources and solve the larger problem. Interoperability is also an important requirement for the aggregation of diverse resources.
Dynamism [6, 51]: P2P systems assume that the computing environment is highly dynamic: resources, such as compute peers, enter and leave the system continuously. When an application is intended to support a highly dynamic environment, the P2P approach is a natural fit. In communication applications, such as Instant Messaging, so-called Buddy Lists are used to inform users when persons with whom they wish to communicate become available. Without this support, users would be required to poll for chat partners by sending periodic messages to them.
Dynamic Service Relationships [63, 64]: Dynamic service relationships become an important issue in P2P systems because those systems are non-deterministic, dynamic and self-organizing based on the immediately available resources. A P2P system is typically loosely coupled; moreover, it is capable of adapting to changes in the system structure and its environment, viz., the number of peers, their roles, and the infrastructure. In order to build a loosely coupled system that is capable of dynamic reconfiguration, several mechanisms need to be in place.
Data/Peer Discovery: There must be a distributed search mechanism that allows services and service providers to be found based on certain criteria. The challenge is to find the right number of lookup services that should be available in the system. Another challenge is how to decide which peer will run a lookup service in a fully distributed environment; again, we need a decision-making system or voting. Running a lookup service requires additional resources, such as power and memory, from a peer, and therefore cannot always be requested from the peer free of charge.
Thus, the shortest path of the resource lookup operation is a benchmark for the effectiveness of resource management. Any requested resource should be found within an optimal lookup path length that is as close as possible to the Moore bound $D = \log_{\delta-1}(N_{\max}(\delta-2)+2) - \log_{\delta-1}\delta$ [65, 66], where $\delta$ is the peer degree and $D$ is the diameter of a Moore graph, defined as the lowest possible end-to-end distance between any two peers in a connected graph.
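For intuition, the bound can be evaluated numerically; the sketch below (illustrative Python, with made-up values for N_max and the peer degree) computes the diameter the formula predicts:

    import math

    def moore_diameter(n_max: int, degree: int) -> float:
        """Moore-bound diameter: log_{d-1}(N(d-2) + 2) - log_{d-1}(d)."""
        base = degree - 1
        return (math.log(n_max * (degree - 2) + 2, base)
                - math.log(degree, base))

    # e.g. 10,000 peers, each maintaining 8 neighbours:
    print(round(moore_diameter(10_000, 8), 2))  # -> 4.59 hops (approx.)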
Naming/Addressing [6]: In order to identify a resource (peer or service), a unique identification mechanism or naming concept needs to be introduced into a P2P system. How should a peer be addressed in the global network? Addresses that are normally used to reach a peer (such as an IP address in a TCP/IP network) do not help much, since the P2P system is heterogeneous; different addressing protocols can therefore, in theory, be used within one P2P network.
Security [67, 68, 69, 70]: P2P systems are subject to numerous security challenges. One is making sure a user of the system really is who it claims to be. In P2P systems, service and resource consumers might require proof of information about the provider; otherwise authentication cannot be considered successful. Therefore, distributed trust establishment mechanisms are needed to decide whether a user is authenticated to access the system. In centralized systems the user rights are predefined, and the decision to allow access for a certain user is taken based on these predefined rights [13]. In P2P systems the requestor is not known a priori, which leads to a complex decision-making process. Further challenges include making sure data cannot be accessed by unauthorized parties, making sure it is not modified on the wire without this being recognized, and proving from whom the data came, for example with cryptographic signatures, or making sure that actions that have been executed cannot later be denied (non-repudiation). The system must be especially hardened against insider attacks, because people can easily become insiders.
State and Data Management: P2P systems are characterized by the fact that a single failing peer must not bring down the system as a whole. Of course, specific services (those that lived on the dying peer) might not be available anymore, but the system still fulfills a useful purpose. In many systems this requires facilities for some kind of distributed data management [10]. As a consequence, we have to look at the following challenges: replication [71, 72, 73], caching [63], consistency and synchronization, and finding the nearest copy.
Parallelism means that many processes do part of the work at the same time. Parallelism is effective in systems having all of the following characteristics: (a) Symmetric Multi-Processors (SMP), clusters, or massively parallel systems; (b) sufficient I/O bandwidth; (c) underutilized or intermittently used CPUs (for example, systems where CPU usage is typically less than 30%); and (d) sufficient memory to support additional memory-intensive processes such as sorts, hashing, and I/O buffers. An example is four processes handling four different tasks at a workplace instead of one process handling all four tasks by itself; the improvement in performance can be quite high. In this case, each task will be a partition, a smaller and more manageable unit of an index or table. The most common uses of parallelism are in Decision Support Systems (DSS) and data warehousing environments. Parallel execution significantly reduces response time for data-intensive operations on large databases and is used in DSS and data warehouses. Complex queries, such as those involving joins of several tables or searches of very large tables, are often best executed in parallel. If a system lacks any of these characteristics, parallelism might not significantly improve performance.
Range Partitioning
Hash Partitioning
List Partitioning
Composite Partitioning
Each partitioning method has different advantages and design considerations. Thus,
each method is more appropriate for a particular situation.
Range Partitioning: Range partitioning maps data to partitions based on ranges of partition key values established for each partition. It is the most common type of partitioning and is often used with dates, e.g., to partition sales data into monthly partitions. Range partitioning maps rows to partitions based on ranges of column values. It is defined by the partitioning specification for a table or index in partition by range (column_list), and by the partitioning specification for each individual partition in values less than (value_list), where column_list is an ordered list of columns that determines the partition to which a row or an index entry belongs. These columns are called the partitioning columns, and the values in the partitioning columns of a particular row constitute that row's partitioning key.
Hash Partitioning: Hash partitioning maps data to partitions based on a hashing algorithm applied to a partitioning key. The hashing algorithm evenly distributes rows among partitions, giving partitions approximately the same size. Hash partitioning is the ideal method for distributing data evenly across devices. It is a good, easy-to-use alternative to range partitioning when the data is not historical and there is no obvious column or column list where logical range partition pruning could be advantageous. Oracle Database uses a linear hashing algorithm to prevent data from clustering within specific partitions.
List Partitioning: List partitioning enables explicit control over how rows map to partitions, by specifying a list of discrete values for the partitioning column in the description of each partition. This differs from range partitioning, where a range of values is associated with a partition, and from hash partitioning, where the user has no control over the row-to-partition mapping. The advantage of list partitioning is that unordered and unrelated sets of data can be grouped and organized in a natural way.
Composite Partitioning: Composite partitioning combines range partitioning with hash or list partitioning. In Oracle databases, data is first distributed into partitions by range; Oracle then uses a hashing algorithm to further divide the data into subpartitions within each range partition. For range-list partitioning, Oracle divides the data into subpartitions within each range partition based on the explicit list.
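As a toy illustration of how the range and hash methods assign a row to a partition (illustrative Python with invented boundaries and key names; this is not Oracle's implementation):

    from bisect import bisect_right
    import hashlib

    # Range partitioning: partition i holds keys below BOUNDARIES[i],
    # mirroring "values less than (...)"; the boundaries are invented.
    BOUNDARIES = ["2019-04-01", "2019-07-01", "2019-10-01", "2020-01-01"]

    def range_partition(sale_date: str) -> int:
        """Index of the partition whose bound the key falls under."""
        return bisect_right(BOUNDARIES, sale_date)  # ISO dates sort lexically

    # Hash partitioning: a hash of the key spreads rows evenly.
    def hash_partition(key: str, n_partitions: int = 4) -> int:
        digest = hashlib.md5(key.encode()).hexdigest()
        return int(digest, 16) % n_partitions

    print(range_partition("2019-05-20"))  # -> 1, the second quarterly partition
    print(hash_partition("customer-42"))  # stable, roughly uniform bucket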
In combination with parallelism, partitioning can improve performance in data warehouses and other systems. Partitioning significantly enhances data access and improves overall application performance. Partitioned tables and indexes facilitate administrative operations by enabling these operations to work on subsets of data, e.g., creating a new partition, reorganizing an existing partition, or dropping a partition, while causing less than a second of interruption to a read-only application. Partitioned data greatly improves the manageability of very large databases and dramatically reduces the time required for administrative tasks such as backup and restore. Granularity can easily be added to or removed from a partitioning scheme by splitting partitions, and partitioning also allows one to swap partitions with a table. To improve the performance of databases over dynamic P2P networks, parallelism and database partitioning may be useful.
Several optimistic concurrency control protocols have been proposed to avoid unnecessary transaction restarts [112, 113, 114, 116]. Among these protocols, OCC-TI [112] and OCC-DATI [113], which are based on time intervals, are better than OCC-DA [114], which is based on a single timestamp, since a time interval can capture the partial ordering among transactions more flexibly. OCC-DATI is better than OCC-TI since it avoids some unnecessary restarts present in the latter, but some unnecessary restarts still remain with these protocols. A newer version of OCC-DATI is the Timestamp Vector based Optimistic Protocol (OCC-TSV); with this protocol, more unnecessary restarts of transactions can be avoided. A feedback-based secure concurrency control for MLS distributed databases, which secures multi-level databases, is presented in [89].
All these conventional protocols are defined for static networks; for dynamic networks like P2P, modifications have to be made, and the protocols/algorithms must include the constraints of P2P environments. The concurrent processes are distributed over unreliable peers, which are prone to leaving the network. In such an environment, one-copy serializability at the global and local levels is hard to achieve during transaction execution. To achieve secure transaction execution over secure data items, with one-copy serializability of transactions at the global and local levels, a secure protocol needs to be identified for the dynamic environment of P2P.
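To make the timestamp-based idea concrete, the following is a minimal sketch of the classic basic timestamp-ordering read/write rules on which such protocols build (a textbook baseline in illustrative Python, not the thesis's TSC2A):

    class DataItem:
        """Tracks the largest timestamps that have read/written the item."""
        def __init__(self):
            self.read_ts = 0    # youngest transaction that read the item
            self.write_ts = 0   # youngest transaction that wrote the item

    def try_read(item: DataItem, ts: int) -> bool:
        # A transaction must not read a value written by its "future".
        if ts < item.write_ts:
            return False        # too late: abort/restart the reader
        item.read_ts = max(item.read_ts, ts)
        return True

    def try_write(item: DataItem, ts: int) -> bool:
        # Reject a write if a younger transaction already read or wrote
        # the item (basic timestamp ordering, no Thomas write rule).
        if ts < item.read_ts or ts < item.write_ts:
            return False
        item.write_ts = ts
        return True

    x = DataItem()
    assert try_write(x, ts=5)        # T5 writes x
    assert not try_read(x, ts=3)     # older T3 must restart
    assert try_read(x, ts=7)         # younger T7 reads fine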
distance between peers. This measurement is conducted in a global P2P domain and
needs the support of additional landmarks. Similarly, this approach also affects the
search scope in P2P systems.
GIA [94] introduces a topology adaptation algorithm to ensure that high-capacity peers are the ones with high degree and that low-capacity peers are within short reach of high-capacity peers. It addresses a different matching problem in overlay networks.
To tackle topology mismatch, Minimum Spanning Tree (MST) based approaches are used in [95, 96]. In these, peers build an overlay MST among the source peer and its neighbors within a certain hop count, and then optimize connections that are not on the tree. An early attempt at alleviating topology mismatch is called Location-aware Topology Matching (LTM) [95], in which each peer issues a detector message in a small region so that the peers receiving the detector can record relative delay information. Based on the delay information, a receiver can detect and cut most of the inefficient and redundant logical links, as well as add closer peers as direct neighbors. The major drawback of LTM is that it needs to synchronize all peering peers and thus requires the support of NTP [97], which is a critical dependency.
In [92] the authors discuss the relationship between message duplication in overlay connections and the number of overlay links. They propose Two Hop Neighbor Comparison and Selection (THANCS) to optimize the overlay network [98], which may change the overlay topology; such a change may not be acceptable in many cases.
In the above approaches, only the overlay network is considered for optimization; the problems of the underlay network are not addressed. The network search scope should not be altered, as that changes the overlay structure of the network. Thus, a methodology is required that reduces network traffic at the underlay level without affecting the overlay topology of the network, making the system fast and scalable.
In a dynamic environment, the probability of accessing stale data from replicas is higher than in a static environment where peers do not leave the system. Several protocols have been developed to solve the problem of accessing updated data items from replicas in dynamic environments. Examples include the single lock, distributed lock, primary copy, majority [126], biased, and quorum consensus protocols [128, 129, 130, 131]. These protocols are used to keep data consistent and to access updated data items [124] using the multiple replicas maintained in the distributed system.
A group of replicas is accessed to get updated data items; this group is generally known as a Quorum [122], and depending upon the operation, it is called a Read Quorum or a Write Quorum. To guarantee access to the updated data item, read-write quorums and two consecutive write-write quorums must intersect. The intersection is the set of replicas common to the read-write and two consecutive write-write quorums; it ensures that a read quorum always obtains updated data from the system, which can then be propagated to all other replicas. The degree of intersection of two quorums determines how resilient the system is to the churn rate of the peers.
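In the common size-based formulation (a standard sketch; the structured quorum protocols discussed below refine it), intersection is guaranteed by choosing read and write quorum sizes r and w over n replicas such that r + w > n and 2w > n:

    def valid_quorum_sizes(n: int, r: int, w: int) -> bool:
        """Classic quorum-consensus intersection conditions.

        r + w > n : every read quorum overlaps every write quorum
        2 * w > n : two consecutive write quorums overlap
        """
        return r + w > n and 2 * w > n

    # With n = 5 replicas, r = 2 and w = 4 is valid, while r = 2, w = 3
    # is not (2 + 3 = 5 does not exceed 5, so a read could miss the
    # latest write).
    print(valid_quorum_sizes(5, 2, 4))  # True
    print(valid_quorum_sizes(5, 2, 3))  # False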
In the literature, many replication protocols have been suggested [63, 132], including replica management in a binary balanced tree. The simplest replication protocol is Read One Write All (ROWA) [133]. This protocol is suitable for static networks having fixed and dedicated servers for replication; it has the minimum read cost among these protocols and is highly fault tolerant, but it has the maximum communication cost for write operations, and this cost increases with the number of replicas. In a dynamic system, updating all replicas creates the problem of unlimited wait. A variation of this technique is known as Read One Write All Available (ROWAA): the scheme performs a write operation on all currently available replicas, which improves data availability in dynamic environments [134]. A Read-Few Write-Many approach is presented in [136].
The Dynamic Voting protocol [135] and the Majority Consensus protocol [126] perform better than ROWAA in dynamic environments. In both protocols, replicas are accessed in groups. These protocols have good read and write availability but suffer from high read cost, and they have long search times because the replicas are stored randomly in the network.
Rather than storing replicas randomly, logical structures [137, 138, 139, 140] have been proposed for storing replicas over a dynamic network. These protocols reduce the search time needed to form a quorum from the replicas and reduce communication cost. The Multi Level Voting protocol, Adaptive Voting [141], Weighted Voting, the Grid protocol [142] and the Tree Quorum protocol [143] are such replication protocols, each with a different operational process. The Multi Level Voting protocol is based on the concepts of the Hierarchical Quorum Consensus (HQC) strategy. HQC [132, 144, 145, 146] is a generalization of the majority scheme. In its tree structure, the replicas are located only at the leaves, whereas the non-leaf peers of the tree act as logical replicas, which in a way summarize the state of their descendants. The advantage of a tree structure is that it reduces the time to find replicas compared with a random structure. HQC+ [147] is also a generalization of other protocols that uses a grid logical structure to form quorums. A tree structure also reduces the message transfer needed to find replicas and hence the network traffic generated in the system. A disadvantage of the Tree Quorum protocol is that the number of replicas grows rapidly as the tree level grows. In the Adaptive Voting and Weighted Voting protocols, the formed quorum must satisfy two conditions: (a) write and read quorums are always made up of more than half of the replicas, and (b) write and read quorums must intersect with each other. The disadvantage of these protocols is that the size of the quorums grows with the number of replicas; hence, network overhead automatically increases in the system.
Bandwidth Hierarchy Replication (BHR) is proposed in [148]; BHR reduces data access time by avoiding network congestion in a data grid network. In [149] the authors propose a BHR algorithm using three-level hierarchical structures; the proposal addresses both scheduling and replication problems. Two replication algorithms for multi-tier data grids, Simple Bottom Up (SBU) and Aggregate Bottom Up (ABU), are proposed in [150]. These algorithms minimize data access time and network load; in them, replicas of the data are created and spread from the root center to regional centers, or even to national centers. These strategies are applicable only to multi-tiered grids. The strategy proposed in [151] creates replicas automatically in a generic decentralized P2P network; the goal of the proposed model is to maintain replica availability with some probabilistic measure. Various replication strategies are discussed in [152], all of which are tested on a hierarchical grid architecture. A different cost model was proposed in [149] to decide on dynamic replication; this model weighs the data access gains from creating a replica against the costs of creating and maintaining it. Probabilistic Quorum Systems are presented in [153].
There are several challenges in updating and accessing replicated data items over a dynamic network such as a P2P network. Data consistency [134], the degree of intersection between two consecutive quorums, the search time to find replicas, and fault tolerance are some of the identified problems. There is a need for new proposals for the dynamic environment of P2P systems that facilitate low search time, low network traffic, fast recovery from faults, a high degree of quorum intersection and access to updated data.
2.11 Databases
Database systems are designed to manage large bodies of information. The management of data involves both the definition of structures for the storage of information and the provision of mechanisms for its manipulation. Thus, a database is a collection of objects that satisfy a set of integrity constraints [82, 154].
Centralized Database Systems: These run on a single computer system and do not interact with other computer systems. Such systems span a range from single-user database systems running on personal computers to high-performance database systems running on mainframes [154].
Distributed Database Systems [155]: consist of a collection of sites connected via some kind of communications network, in which each site is a database system in its own right, but the sites have agreed to work together so that a user at any site can access data anywhere in the network exactly as if the data were all stored at the user's own site. A distributed database system can thus be regarded as a kind of partnership among the individual local DBMSs at the individual local sites; a new software component at each site, logically an extension of the local DBMS, provides the necessary partnership functions, and it is the combination of this new component with the existing DBMS that constitutes what is usually called the distributed database management system.
after the deadline. The questions to be addressed in these applications include how to identify the proper value function, which may be application dependent.
Firm Deadline Real Time Applications: These applications differ from soft deadline applications in that tasks which miss their deadlines are considered worthless (and may even be harmful if executed to completion) and are thrown out of the system immediately. The emphasis, thus, is on the number of tasks that complete within their deadlines. Our interest in RTDB systems is in applications in the firm deadline real time domain [164]. We believe that understanding firm deadline RTDB systems will provide the necessary insight into RTDB technology, which is needed for addressing the more complex framework of soft deadline applications. Therefore, we have carried out our work from the perspective of a Firm Deadline Real Time Database System [163, 154, 158].
addresses. At startup, the client contacts the central server and reports a list of the files it maintains. When the server receives a query from a user, it searches for matches in its index and returns a list of users that hold the matching file. The user then connects directly to the peer that holds the requested file and downloads it, as shown in Figure 2.7. There are problems with using a centralized server, including the fact that it is a single point of failure. Napster does not replicate data; it uses "keepalives" to make sure that its directories are current.
Maintaining a unified view is computationally expensive in Napster, and it does not provide scalability. The focus of Napster as a music sharing system in which users must be active in order to participate has made it exceedingly popular. Napster does not support resource sharing, but it does use distributed file management. Regarding routing, it is simply a centralized directory system using Napster servers. The main advantage of Napster and similar systems is that they are simple and locate files quickly and efficiently. The main disadvantage is that such centralized systems are vulnerable to malicious attack and technical failure. Furthermore, these systems are inherently not very scalable, as there are bound to be limitations on the size of the server database and its capacity to respond to queries. The system is not reliable, as it is prone to single point failure and is easily attacked by DoS. Napster provides communication level fault tolerance, as any packet dropped due to congestion can be retransmitted. Napster provides communication level security; it does not support system level or application level security. The performance of Napster is good under load, but it falls sharply when the server is overloaded. The response time increases when the number of peers and requests exceeds the capability of the server.
Figure 2.7 The Napster model: peers send queries to a central server, which returns the peers holding the matching file; the file is then downloaded directly from that peer.
Gnutella [18, 5]: The Gnutella network originated as a project at Nullsoft, a subsidiary of America Online. Gnutella is one of the earliest P2P file sharing systems that is completely decentralized. The general architecture of Gnutella is given in Figure 2.8. Like most P2P systems, Gnutella builds, at the application level, a virtual overlay network with its own routing mechanisms. In Gnutella, each peer is identified by its IP address and is connected to some other peers. All communication is done over the TCP/IP protocol. To join the network, a new peer needs to know the IP address of one peer that is already in the system. It first broadcasts a join message via that peer to the whole system. Each of these peers then responds to indicate its IP address, how many files it is sharing, and how much space those files take up. So, on connecting, the new peer immediately knows how much is available on the network to search through. Gnutella uses the file name as the key. In order to search for a file in such an unstructured system, random searches are the only option, since the peers have no way of guessing where the file may lie. Each peer handles a search query in its own way. To save bandwidth, a peer does not have to respond to a query if it has no matching items; a peer also has the option of returning only a limited result set. After the client peer receives responses from other peers, it uses HTTP to download the files it wants.
Gnutella is completely decentralized, but the peers are organized loosely, so the costs of peer joining and searching are O(N), which means that Gnutella cannot grow to a very large scale. It is more reliable than Napster, as there is no single point of failure; objects are replicated proportionally to the square root of their query rate. Node failure can be detected by neighbors, and multiple paths exist to reach a peer. Gnutella provides similar functionality to Napster. It does not provide resource sharing; it uses distributed file management. Gnutella provides fault tolerance at the system level, as a process can be recovered owing to multi-point execution. Data replication is also provided by the system. It also provides fault tolerance at the communication level: dropped packets may be recovered by retransmission. Channel level tolerance, however, is not supported. Gnutella does not support security at any level (system, communication or application). Threats include flooding, malicious content, virus spreading, attacks on queries, etc. Its scalability is a little better than Napster's, but Gnutella cannot grow beyond a limit, as the performance of the system drops sharply as the traffic on the network grows. Further, the response time is higher in Gnutella.
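The O(N) search cost follows from Gnutella-style query flooding, sketched minimally below; the Peer class, the neighbor lists and the default TTL are illustrative assumptions, not the actual wire protocol:

    from dataclasses import dataclass, field

    @dataclass
    class Peer:
        id: int
        files: set = field(default_factory=set)
        neighbors: list = field(default_factory=list)

    def flood_query(peer: Peer, filename: str, ttl: int = 7, seen=None):
        """Forward a query to all neighbors until the TTL expires.

        Every reachable peer within ttl hops handles the query, so the
        total message cost grows with the size of the network."""
        seen = set() if seen is None else seen
        if ttl == 0 or peer.id in seen:
            return []
        seen.add(peer.id)
        hits = [peer.id] if filename in peer.files else []
        for neighbor in peer.neighbors:
            hits += flood_query(neighbor, filename, ttl - 1, seen)
        return hits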
Figure 2.8 The general architecture of the Gnutella network: fully decentralized peers connected in an unstructured overlay.
where to send the request next. Since there is no direct connection between the requester and the actual data source, anonymity is maintained, and the owners of cached files cannot be held responsible for the content of their caches (file encryption with original text names as keys is a further measure that is taken). Figure 2.9 shows the discovery mechanism in Freenet. Freenet supports multi-path searching, and a faulty peer can be detected by its neighbor peers, so Freenet is reliable in nature. Freenet uses file storing rather than file sharing. Load balancing and resource sharing are not supported by Freenet; it also does not support fault tolerance or security at any level. The performance and scalability of Freenet are not good.
Figure 2.9 The Freenet Chain Mode files discovery mechanism. The query is forwarded from
peer to peer using the routing table, until it reaches the peer which has the requested data. The
reply is passed back to the original peer following the reverse path.
TRIAD [165]: TRIAD is not a comprehensive P2P system, but a solution to the problem of content based routing. Its goal is to reduce the time needed to access content. Although it focuses on the performance problem, it also improves other traits. The core idea in TRIAD is network integrated content routing. It is an intermediary system between a centralized model and a fully decentralized model, because it relies on replicated servers: a client can go through any of a variety of servers to reach content, as long as each server hosts the content. Content routers are integrated into the system and act as both IP routers and name servers. The main idea is that the content routers hold name-to-next-hop information, so that all routing is done through adjacent servers and each step is on the path to the data, avoiding some of the back-and-forth calling of traditional DNS (Domain Name Server). They also explore piggybacking connection setup on the name lookup, so that immediately upon locating the data the connection is already established. Reliability is increased because the system topology is structured so that there are multiple paths to content. TRIAD increases performance by proposing its name based content routing as a topological enhancement. This removes a lot of the overhead of a DNS based system. Its protocols make the system easier to maintain by using routing aggregates instead of a large number of individual names. The core ideas in TRIAD relate to P2P because, in such a system, end users' machines can act as content routers or servers, or both. At a minimum, this system could replace the centralized servers of a Napster type system. TRIAD supports distributed file management but does not support resource sharing or load balancing. TRIAD does not support fault tolerance or security at any level. TRIAD has good scalability.
Pastry: Pastry [55] is a generic P2P content location and routing system based on a self-organizing overlay network of peers connected via the Internet. It is completely decentralized, scalable and fault resilient, and it reliably routes a message to the live peer whose peerId is numerically closest to the key of that message; it automatically adapts to the arrival, departure and failure of peers.
Each peer in the Pastry P2P overlay network has a unique 128-bit peerId, assigned randomly when a peer joins the system by computing a cryptographic hash of the peer's public key or its IP address. With this naming mechanism, Pastry makes the important assumption that peerIds are generated such that the resulting set of peerIds is uniformly distributed in the peerId space. Each data item also has a 128-bit key, which can be the original key or generated by a hash function. The data is stored on the peer whose id is numerically closest to the key.
Each Pastry peer maintains a routing table, a neighborhood set and a leaf set. The neighborhood set contains the peerIds and IP addresses of the peers that are closest to the present peer. The leaf set contains the peerIds and IP addresses of the half of the peers with numerically closest larger peerIds, and the half with numerically closest smaller peerIds, relative to the present peer's peerId. Given a message, the peer first checks whether the key falls within the range of peerIds covered by its leaf set. If so, the message is forwarded directly to the destination peer, namely the peer in the leaf set whose peerId is closest to the key. If the key is not covered by the leaf set, then the routing table is used and the message is forwarded to a peer that shares a common prefix with the key by at least one more digit. In certain cases, the appropriate entry in the routing table may be empty or the associated peer unreachable, in which case the message is forwarded to a peer that shares a prefix with the key at least as long as the present peer's, and is numerically closer to the key than the present peer's peerId. Such a peer must be in the leaf set, unless the message has already arrived at the peer with the numerically closest peerId.
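This three-step routing decision (leaf set, routing table, then fallback) can be sketched as follows, with peerIds modeled as fixed-length digit strings and the state kept in plain Python structures; the names here are illustrative, not Pastry's actual API:

    def shared_prefix_len(a: str, b: str) -> int:
        """Length of the common leading-digit prefix of two ids."""
        n = 0
        for x, y in zip(a, b):
            if x != y:
                break
            n += 1
        return n

    def next_hop(key: str, self_id: str, leaf_set: list, routing_table: dict) -> str:
        """Pick the next peer for a message with the given key.

        routing_table maps (prefix_length, next_digit) -> peerId."""
        if key == self_id:
            return self_id
        dist = lambda p: abs(int(p) - int(key))
        # 1. Key covered by the leaf set: deliver to the numerically closest peer.
        candidates = leaf_set + [self_id]
        if leaf_set and min(map(int, candidates)) <= int(key) <= max(map(int, candidates)):
            return min(candidates, key=dist)
        # 2. Routing table: forward to a peer sharing one more prefix digit.
        l = shared_prefix_len(key, self_id)
        entry = routing_table.get((l, key[l]))
        if entry is not None:
            return entry
        # 3. Rare case: any known peer whose shared prefix is at least as long
        #    and that is numerically closer to the key than the present peer.
        known = leaf_set + list(routing_table.values())
        closer = [p for p in known
                  if shared_prefix_len(key, p) >= l and dist(p) < dist(self_id)]
        return min(closer, key=dist, default=self_id)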
Pastry supports dynamic data object insertion and deletion, but does not explicitly support mobile objects. Pastry is reliable due to multi-path search and replication of data objects. Pastry supports dynamic peer joining and departure. Pastry supports distributed file management and load balancing. It also supports communication level fault tolerance by maintaining routing tables and a neighborhood set. Communication level security is supported, as hash functions and cryptography are used in communication. Pastry has good performance due to its content location mechanism, and it is scalable due to its self-organization.
Tapestry: Tapestry [55, 47] is an overlay infrastructure designed as the routing and location layer in OceanStore [32]. Tapestry's mechanisms are modeled after the Plaxton scheme. Tapestry provides adaptability, fault tolerance against multiple faults, and introspective optimizations. In Tapestry, each peer has a neighbor map, organized into routing levels; each level contains entries that point to a set of peers closest in network distance that match the suffix for that level. Each peer also maintains a back pointer list that points to the peers that refer to it as a neighbor. These are used in the peer integration algorithm to generate neighbor maps for a new peer and to integrate it into Tapestry. Tapestry uses a distributed algorithm, called Surrogate Routing, to incrementally compute a unique root peer for an object; moreover, each object gets multiple root peers by concatenating a small, globally constant sequence of salt values to each object ID and then hashing the result to identify the appropriate roots. The root searching is shown in Figure 2.10.
When locating an object, Tapestry performs the hashing process with the target object ID, generating a set of roots to search. Tapestry stores the locations of all such replicas to increase semantic flexibility. Only small modifications to the routing mechanism are needed to improve fault tolerance; e.g., when a bad link is encountered, routing can continue by jumping to a random neighbor peer.
Tapestry sends publish and delete messages to multiple roots and provides explicit support for mobile objects. Node insertion is easily implemented through populating neighbor maps and neighbor notification; node deletion is less trivial. It is worth noting that Tapestry provides two introspective mechanisms that allow it to adapt to environmental changes. First, in order to adapt to changes in network distance and connectivity, Tapestry peers tune their neighbor pointers by running a refresher thread that uses network pings to update the network latency to each neighbor. Second, Tapestry presents an algorithm that detects query hotspots and offers suggestions on locations where additional copies can significantly improve query response time. Tapestry is reliable in nature, as it supports multi-path searching, a failed peer detection mechanism and data replication. It does not support resource sharing, but databases are shared between the peers. It supports distributed file management and a load balancing mechanism. Tapestry does not support security at any level. The performance of Tapestry is good due to reduced search time (additional copies at hotspots), and it has good scalability due to its neighbor map population and neighbor notification techniques.
Figure 2.10 The path taken by a message originating from peer 67493 destined for peer
34567 in a Plaxton mesh using decimal digits of length 5 in Tapestry.
Chord: Chord [166] is a distributed lookup protocol designed at MIT (see Figure 2.11). It supports fast data location and peer joining/leaving. Each machine is assigned an m-bit peerID, obtained by hashing its IP address. Each data record (K, V) has a unique key K; in Chord it is likewise assigned an m-bit ID by hashing the key, P = hash(K). This ID indicates the location of the data.
All the possible N = 2^m peerIDs are ordered on a one-dimensional circle, and the machines are mapped to this virtual circle according to their peerIDs. For each peerID, the first physical machine on its clockwise side is called its successor peer, or succ(peerID). Each data record (K, V) has an identifier P = hash(K), which indicates its virtual position on the circle. The record (K, V) is stored in the first physical machine clockwise from P, as shown in Figure 2.11. This machine is called the successor peer of P, or succ(P). To route efficiently, each machine holds part of the mapping information. In the view of each physical machine, the virtual circle is partitioned into 1 + log N segments: the machine itself, and log N segments of length 1, 2, 4, ..., N/2. The machine maintains a table with log N entries; each entry contains the information for one segment: its boundaries and the successor of its first virtual peer. In this way, each machine needs only O(log N) memory to maintain the topology information, which is sufficient for fast location/routing.
On a query for a record with key K, the virtual position is first calculated: P = hash(K). The lookup can start from any physical machine. Using the mapping table, the successor of the segment that contains P is selected as the next router, until P lies between the start of the segment and the successor (which means the successor is also P's successor, i.e., the target). The distance between the target and the current machine decreases by at least half after each hop; thus the routing time is O(log N).
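A minimal sketch of this placement and lookup rule is shown below, assuming peer positions are known globally (a simplification; real Chord peers only hold their finger tables):

    import hashlib

    M = 16                       # identifier bits; the circle has N = 2**M positions
    N = 2 ** M

    def chord_id(key: str) -> int:
        """Map a key (or an IP address) onto the identifier circle."""
        return int(hashlib.sha1(key.encode()).hexdigest(), 16) % N

    def successor(peers: list, p: int) -> int:
        """First peer clockwise from position p on the circle."""
        for peer in sorted(peers):
            if peer >= p:
                return peer
        return min(peers)        # wrap around the circle

    # A record (K, V) is stored at succ(hash(K)); in a real lookup each hop
    # roughly halves the distance to the target, giving O(log N) routing time.
    peers = [chord_id(f"10.0.0.{i}") for i in range(1, 6)]
    home = successor(peers, chord_id("some-file.txt"))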
For high availability, the data can be replicated using multiple hash functions; the data can also be replicated at the r machines succeeding its data ID. Chord also supports a failed peer detection mechanism; hence the system is reliable. The time taken by each operation is O(log N). In Chord, machines can join and leave at any time. For normal peer arrival and departure, the cost is O(log^2 N) with high probability, but in the worst case the cost is O(N). Peer failure can also be detected and recovered from automatically if each peer maintains a successor list of its r nearest successors on the Chord ring. Chord is reliable, as it supports failure detection and data replication. It supports distributed file management, but does not support resource sharing. No security is provided at any level. The performance of Chord is good due to fast location of objects (a distributed hash table is used for the purpose) and replication of objects using multiple hash functions. Chord has good scalability due to its distributed lookup protocol, which supports peer joining/leaving.
Figure 2.11 Chord identifier circle consisting of the three peers 0, 1 and 3. In this figure, key 1 is located at peer 1, key 2 at peer 3 and key 6 at peer 0.
sharing. It supports good fault tolerance at the system and communication levels, as a peer can copy its contents to one or more of its neighbors. No security is provided at any level. The performance of CAN is good due to its distributed hash based infrastructure, which provides fast lookup of content. Through topology updating, CAN supports dynamic machine joining and leaving. The average cost for machine joining is (d/4)(n^(1/d)) hops; for machine leaving and failure recovery it is constant time. CAN is scalable.
Figure 2.12 (a) Example 2-d [0,1]×[0,1] coordinate space partitioned between 5 CAN peers. (b) Example 2-d space after peer F joins.
JXTA [167]: The JXTA architecture is organized in three layers, as shown in Figure 2.13: the JXTA core, JXTA services and JXTA applications. The core layer provides the minimal and essential primitives that are common to P2P networking. The services layer includes network services that may not be absolutely necessary for a P2P network to operate, but are common or desirable in the P2P environment. The application layer provides integrated applications that aggregate services and usually provide a user interface.
Edutella [168]: attempts to design and implement a schema based P2P infrastructure for the Semantic Web. It uses the W3C standards RDF and RDF Schema as the schema language to annotate resources on the web, achieving a mark-up for educational resources. Edutella provides metadata services such as querying and replication, as well as semantic services such as mapping, mediation and clustering. Edutella services are built over JXTA [167], a widely used framework for building P2P applications. The Edutella query service provides the syntax and semantics for querying individual RDF repositories and for distributed querying across repositories. Edutella uses mediators to provide coherent views across data sources through semantic reconciliation. Edutella was envisioned to provide a platform for educational institutions to participate in a global information network while retaining the autonomy of learning resources.
The same authors have also attempted to use a super peer based organization of the Edutella peers to make searching more efficient. The paper [169] describes an organization of the super peers based on HyperCuP, a structured P2P system based on the hypercube topology [170]. The super peers maintain metadata for a set of peers, instead of each peer maintaining its own metadata. The super peers themselves are connected using the HyperCuP overlay. This makes searching for metadata quite efficient, as searches are executed only in the super peer overlay. They also use super peer indices based on schema information to facilitate faster search.
Atlas Peer-to-Peer Architecture (APPA): is a data management system that provides scalability, availability and performance for advanced P2P applications that deal with semantically rich data, viz., XML documents and relational tables, using a high level SQL-like query language. The replication service is placed in the upper layer of the APPA architecture, and APPA provides an Application Programming Interface (API) to make it easy for P2P collaborative applications to take advantage of data replication. The architecture design also establishes the integration of the replication service with other APPA services by means of service interfaces. APPA has a layered service-based architecture, shown in Figure 2.14. Besides the traditional advantages of using services (encapsulation, reuse, portability, etc.), this enables APPA to be network-independent, so it can be implemented over different structured (e.g., DHT) and super-peer P2P networks. The advanced services layer provides advanced services for semantically rich data sharing, including schema management, replication [171], query processing [172], security, etc., using the basic services.
Piazza [173]: is a peer data management system that facilitates decentralized sharing of heterogeneous data. Each peer contributes schemas, mappings, data and/or computation. Piazza provides query answering capabilities over a distributed collection of local schemas and pairwise mappings between them. It essentially provides a decentralized schema mediation mechanism for data integration over a P2P system. Peers in the system contribute stored relations, similar to data sources in data integration systems. Query reformulation occurs through stored relations, stored either locally or at other peers. Piazza also addresses the key issue of security, which enables users to share their data in a controlled manner. Another paper [179] describes the way a single data item is published in protected form using cryptographic techniques: the owner of the data item encrypts the data and can specify access control rights declaratively, restricting users to parts of the data.
PIER: P2P Information Exchange and Retrieval (PIER) [43] is a P2P query engine for query processing in Internet scale distributed systems. PIER provides a mechanism for scalable sharing and querying of fingerprint information, used in network monitoring applications such as intrusion detection. PIER uses four guiding principles in its design. First, it provides relaxed consistency semantics (best effort results), as achieving the ACID properties may be difficult in Internet scale systems [174]. Second, it assumes organic scaling, meaning that there are no data centers/warehouses and machines can be added in typical P2P fashion. Third, the query engine assumes data is available in native file systems and need not necessarily be loaded into local databases. The fourth principle is that, instead of waiting for breakthroughs in semantic technologies for data integration, PIER tries to combine local and reporting mechanisms into a global monitoring facility. PIER is realized over CAN, the coordinate space based P2P system [33].
PeerDB [175]: is an object management system that provides sophisticated searching capabilities. PeerDB is realized over BestPeer [176], which provides P2P enabling technologies. PeerDB can be viewed as a network of local databases on peers. It allows data sharing without a global schema by using metadata for each relation and attribute. A query proceeds in two phases: in the first phase, relations that match the user's search are returned by searching on neighbors. After the user selects the desired relations, the second phase begins, in which queries are directed to the peers containing the selected relations. Mobile agents are dispatched to perform the queries in both phases.
NADSE: Neighbor Assisted Distributed and Scalable Environment (NADSE) [180] enables the fast and cost-efficient deployment of self-managed intelligent systems with low management cost at each peer. NADSE implements a structured P2P concept that enables efficient resource management in P2P systems even during high rates of network churn. It provides a distributed computing environment [177] to every peer node by grouping the nodes into clusters, deputing a node as cluster head (CH), and treating the whole network as a group of clusters. Every CH manages
2.13 Analysis
A number of existing P2P systems, such as Napster, Gnutella, Kazaa and Overnet, are popular for file sharing over the Internet. Most of these systems deal with static data. Despite good research in this socially popular and emerging field of P2P networks and systems, there is still much scope for research. It is identified that most P2P systems are popular for static data, which does not change while it is shared among the networks. Little work has been done in the direction of sharing dynamic data among P2P systems, i.e., data that changes while it is being shared. This motivates utilizing the freely available, otherwise wasted resources of P2P systems for implementing real time information.
To achieve the above objective of placing real time data over a P2P environment, data must be partitioned and replicated over multiple peers for a number of reasons, e.g., security, data availability, etc. A mechanism is also required to enhance the throughput of the system to match the expectations of an RTDBS. A distributed concurrency control mechanism is also required for the execution of concurrent processes, which will maintain data consistency, serializability, etc. in the system. Network traffic is a major issue in P2P networks because of the topology mismatch problem. A mechanism is required to reduce this heavy traffic, as peers communicate with each other to a large extent, which can cause network choking. Schemes for placing the replicas are required to reduce the replica search time; thus, logical structures in which to place the replicas have to be identified.
Reliability is another issue that needs more attention from the research community. Other issues are concurrency control, fault tolerance and load balancing. The response time and traffic cost need to be measured and compared as performance measures for the network. In order to enable resource awareness in such a large scale dynamic distributed environment, specific middleware is required that takes into account the following P2P characteristics: managing underlay/overlay topologies, reduction in redundant network traffic, data distribution, load balancing, fault tolerance, replica placement/updation/assessment, data consistency, concurrency control, design and maintenance of logical structures for replicas, controlling network traffic of overlay and underlay networks, etc. The architecture of the proposed middleware should be suitable for the dissemination of dynamic information in P2P networks [41, 116, 117]. In Table 2.1 we present a comparison of various P2P middleware approaches.
2.14 Summary
In this chapter, we have presented Peer-to-Peer (P2P) networks, types of P2P networks, overlay networks, overlay P2P networks, limitations of P2P systems and parallelism in databases. Concurrency control and the topology mismatch problem are also discussed, along with replication for availability, quorum consensus, information regarding databases and their requirements in the P2P environment, and some P2P middleware. At the end of the chapter, an analysis of the literature survey is presented, followed by this summary.
In the next chapter we propose the Statistics Manager and Action Planner (SMAP) for P2P systems.
Table 2.1 Comparison of various P2P middleware approaches (CAN, Tapestry, Chord, Pastry, Napster, Gnutella, Freenet, APPA, Piazza, PIER, PeerDB and NADSE) with respect to load balancing, fault tolerance (communication link), replication, resource sharing, security, scalability (little/good), performance (good, better under load, poor at overload), distributed file management, data partitioning, traffic optimization, concurrency control (local), parallel execution, schema management (global/pairwise), degree of decentralization (centralized, decentralized, distributed, hybrid) and network structure (structured, loosely structured, unstructured).
Chapter 3
Statistics Manager and Action Planner (SMAP)
3.1 Introduction
P2P technology facilitates sharing the resources of geographically distributed peers connected to the Internet. As technology advances, the computation power, storage capacity and input/output capability of computing devices keep increasing. A major part of the CPU ticks and storage space of computing devices is wasted owing to the limited requirements of typical users. Distributed technologies enable the sharing of data as well as resources. P2P networks share data storage, computation power, communications and administration among thousands of individual client workstations. The ability of P2P systems to share data and resources can be utilized to pool these wasted resources, e.g., CPU ticks, storage space of participating peers, etc. Utilizing this pool of storage space for implementing an RTDDBS over a P2P network is a pressing challenge. This vision is also supported by the increased usage and availability of the Internet and the popularity of P2P systems. To address these challenges, a number of issues related to P2P systems, databases and the real time constraints of databases have to be addressed.
The key issue in implementing an RTDDBS over P2P systems is to efficiently maintain target data and peer availability in an environment of high node churn and heavy network traffic, while providing the fast response time and high throughput acceptable in a real time environment. Load balancing, fault tolerance and replication are other issues without which such a system cannot be useful.
We are required to develop a computing/communication P2P system that fulfills most of the above challenges. Thus, a system is needed for P2P networks that will increase the availability of data items, reduce the response time of the system, provide fast updates to the database, arrange secure access, distribute information over the P2P network and manage the other dynamic issues in the database.
applications transparently, obtains accurate resource projections, manages
Authenticity Manager (AM): This module looks after the authenticity of a user and checks whether the user is authorized to use the system or not. Various privileges/permissions, e.g., read/write/execute/update, granted to the user are also verified by this module. Conventional techniques are used to avoid unauthorized access, e.g., login ids, code exchange techniques, etc. To avoid misuse by malware, techniques such as CAPTCHA may be used, so that an unauthorized program cannot tamper with the information or the complete system.
Resource Manager (RM): It manages the resources of the system (viz., up/down bandwidth, storage space, CPU ticks, etc.) and also controls the participation of peers in the network. RM mainly collects the resources and maintains statistics of the
Figure 3.1 Architecture of SMAP: an Application Interface Layer (Query Interface, Schema Manager) over a Control Layer comprising the Resource Manager, Authenticity Manager, Query Processor, Query Optimizer, Query Execution Engine, Data Scheduler, Data Manager, Data Storage Space, Replica Topology Manager, Replica Update Manager, Replica Search Manager, Quorum Manager, Peer Analyzer and Network Manager, with group communication over the Internet.
Resource Allocator (RA): allocates and controls the resources for newly subscribed
services. Resources are allocated fairly among peers, at the same time fulfilling
individual peer requirements. RA keeps the global state of the distributed resources
consistent among all local resources based on a given coherence strategy.
Security Manager (SM): It provides coordination among all the applications running on any number of peers. Security, trust and privacy are addressed from the very beginning of system design and at all levels, such as hardware, operating system, protocols and architecture. SM has the following roles: (1) protecting channels against unauthorized access or modification; (2) program validation/verification (what an uploaded/downloaded piece of software really does) and trust modeling; (3) efficiently sharing fragments of information in a controlled manner, together with key/certificate management; and (4) handling the implications of a dynamic P2P network (what can be done without trusted servers). SM also provides and looks after the security levels of the data/query/user, etc. at various levels in the system.
Query Interface (QI): It accepts queries from the outside world. Before accepting a query, it forwards it to SM for validation of the user. If the user is authenticated, it returns the information of the RC from which the user will receive the results of the submitted queries. All the required information is exchanged with the user for smooth functioning of the operation. The submitted query is then forwarded to the query analyzer (QA).
A discussion of the various components that execute the functionality of MAT and TSC2A of DL follows.
Schema Scheduler (SS): It is responsible for handling the global databases. The global database is further partitioned horizontally, vertically or both. SS ensures the partitioning and reassembly of data in the system. It helps DL in compiling the final results from the partial results received from the various peers holding the replicas.
Query Processor (QP): It subdivides a received query into subqueries according to the database schema and distributes them to the corresponding replicas. It uses the global schema, the local schemas and the MAT data partitioning algorithm for subdividing the queries. QP also helps DL in compiling the partial results.
Query Optimizer (QO): It analyzes, resolves and optimizes the received queries. It is also responsible for breaking a query into subqueries. QO decides whether a peer is suited for a particular subquery or not.
Query Execution Engine (QEE): It is responsible for executing the subqueries and producing the partial results corresponding to them. These partial results are further sent to the peers responsible for compiling them, and on to QP. QEE also ensures parallel execution of the submitted subqueries and manages the various stages used in the execution. It produces a timely response for each subquery and ensures the execution of all subqueries corresponding to a query. It dispatches a subquery to a suitable peer and gets back the information/partial results.
Data Scheduler (DS): It maintains the global/local schemas of the database. DS finds the correlation between the global and local schemas. It checks and distributes the information to the selected peers through a predefined pattern decided by the administrator. DS is used to gather information from the replicas of data partitions stored at the peers.
Replica Topology Manager (RTM): RTM places the replicas in a logical structure that eases access to the information, and it is responsible for searching any replica in the group of replicas. It permits read/write quorums. It implements the Logical Adaptive Replica Placement Algorithm (LARPA) and the Height Balanced Fault Adaptive Reshuffle (HBFAR) scheme to reduce the search time of the replicas. The LARPA and HBFAR schemes identify the replicas for forming a quorum. RTM maintains, from time to time, the logical structure in which the replicas are placed. Whenever a replica leaves the network, it readjusts the structure by rearranging the replica addresses in the logical structure. The LARPA and HBFAR schemes are also used to maintain the overlay topology of the system. RTM plays a key role in SMAP.
Replica Update Manager (RUM): The major aim of RUM is to maintain the freshness of data items. It uses the LARPA and HBFAR schemes for maintaining the latest information. The probability of accessing stale data from the system is minimized by minimizing the update time of the system.
Traffic Load Optimizer (TLO): Huge network traffic is generated in P2P networks. This is the main bottleneck of a P2P network, preventing it from scaling beyond a limit. TLO reduces traffic in the network by optimizing the network paths. It analyzes network traffic and provides statistics to the system, which are used in managing the network traffic. TLO implements
Peer Analyzer (PA): PA is responsible for collecting statistics on the peers available in the network. Peers are selected for storing replicas depending upon the statistics received. Peers can leave and join the network with or without informing the system. To trace the behavior of peers, PA keeps track of each peer's leaving and joining times; the bandwidth with which a peer is connected to the network, its available storage space and its CPU utilization are further parameters analyzed by PA.
up the information retrieving process. SMAP makes the system behave like a conventional file management system stored on a static network. SMAP is a highly fault tolerant system, and the availability of data items is also improved through it. SMAP receives both data and queries for processing and management, and avoids any leakage of information from a high security level to a low security level.
SMAP also provides route management in the P2P network. It reduces network traffic on a large scale, providing scalability to the network. This solves the topology mismatch problem faced in P2P networks, which generates heavy redundant traffic. SMAP helps to balance the traffic load over the network.
Using SMAP, one may deploy large-scale intelligent systems without the need for cost-intensive supercomputing infrastructure, whose management is highly complex and requires highly skilled administrators for maintenance. The approach is evolutionary in the sense that it gives a new approach to applying P2P to real-time service scenarios.
3.4 Discussion
SMAP enables fast and cost-efficient deployment of information over the P2P network with high availability of data and peers. It provides a distributed computing environment in which every peer uses the resources of all other peers participating in the network. It utilizes the wasted resources of peers, offered by their owners, to implement an RTDDBS. SMAP is a self-managed P2P system and has the capability to deal with a high churn rate of peers in the network. It also reduces the redundant network traffic generated by the topology mismatch problem in any P2P system. It provides efficient replica placement in the network, which supports high data availability in the system.
3.5 Summary
In this chapter, we have presented the Statistics Manager and Action Planner (SMAP) for P2P networks. It is an evolutionary approach to P2P systems that implements dynamic information (which can change while shared between peers) over the highly dynamic environment of P2P networks. SMAP enables fast and cost-efficient deployment of a self-managed P2P system with high overall management cost but low management cost at each peer. It implements a structured P2P concept that enables efficient resource management in P2P systems even during high churn rates of peers in the network. SMAP reduces the redundant traffic generated by the topology mismatch problem. It provides a distributed computing environment to every peer participating in the network.
In the next chapter, the Data Placement and Execution Model for RTDDBS will be discussed.
Chapter 4
Data Placement and Execution Model
4.1 Introduction
A large number of peers participate in P2P networks. P2P systems are dynamic in nature because participating peers may join or leave the network with or without informing the system. The churn rate is the rate at which peers leave and join the system. Each peer has a session time for which it is connected to the system. It is very difficult to select a suitable peer for a particular task from among the participating peers, which have a variety of parameters that can affect system performance. P2P systems are popular for their unrestricted sharing of data files, e.g., Napster, Gnutella. In such an environment, processes require only small amounts of CPU time to execute. Managing data availability in the presence of churn during the service time is an issue to be addressed.
For the implementation of databases over P2P networks, a system has to address the challenges related to both P2P networks and databases. The challenges related to P2P networks are peer selection, churn rate, session time, network traffic, overlay and underlay topologies and the topology mismatch problem, etc.
The challenges related to databases are data availability, replication, concurrency control, security and real time access of data, etc. Databases may be partitioned to maintain data availability, peer availability, primary security, peer load, etc. System performance also depends upon how the database is divided into partitions and how these partitions are accessed for the execution of a submitted query. A global schema is partitioned into local schemas. Proper placement of the partitions improves the performance of the system. To execute a global query through local schemas, an arrangement for mapping between the global and local schemas is required; the schema mapping technique also affects system performance. Another challenge is that a real time environment expects the execution of a query in bounded time. Such time bound execution of queries and high system throughput are hard to achieve in P2P networks due to the churn rate of the peers.
To address a few of the above issues, we have developed a 3-Tier Execution Model that addresses discovery and peer selection, churn rate, data partitioning, data availability, primary security, schema mapping and data consistency issues.
relational database DB has nf fields f1, f2, f3, ..., fnf and nr records r1, r2, r3, ..., rnr. The database DB is divided into n partitions Db1, Db2, ..., Dbn such that Db1 ∪ Db2 ∪ ... ∪ Dbn = DB, where ∪ is the operation that compiles the partial results to produce the final result corresponding to the global schema. Each partition Dbi is stored at a set of replica peers RDbi = {p1^i, p2^i, p3^i, ..., pr^i}. A transaction Ti over database DB
Figure 4.1 Overview of the 3-Tier Execution Model: the user/requester submits a transaction to the Transaction Coordinator (TC), which distributes subtransactions to the Transaction Processing Peers (TPPs).
response time, these subprocesses are managed by dedicated peers along with the required information. The partial results received back from the subprocesses are compiled at RC into the final results corresponding to the global schema.
These stages share control information for the execution of the subtransactions of a parent transaction. The three components require only small chunks of CPU time to execute their corresponding responsibilities and may be executed in parallel. This parallelism improves the throughput of the system. A timestamp is used to maintain serializability among the subtransactions.
TC receives the global transactions from the users, and translates and decomposes a transaction T^i against the global schema into local subtransactions {ts_1^i, ts_2^i, ts_3^i, ..., ts_t^i}, depending upon the partitioning mechanism used. The local subtransactions are further routed to the corresponding TPPs in serializable order for execution. A subtransaction may be executed on a number of TPPs. 3-TEM coordinates among the TPPs during the execution of subtransactions. To improve the performance of 3-TEM in terms of response time, each TPP executes the subtransactions in the order of the timestamps associated with them. RC receives the partial results from the TPPs and compiles them into the final result for the global transaction. The final result is returned to the owner of the global transaction. The details of the different components of 3-TEM are given in Figure 4.2.
Transaction Manager (TM): It handles the transactions and data in the system and ensures global serializability. TM resolves the global transaction T^i into
Load Analyzer (LA): It analyzes the load at each participating peer and maintains statistics of the load. Depending upon these statistics, a load distribution mechanism is activated to balance the load over the peers participating in the system.
Data Administrator (DA): It is responsible for all data and database related activities in the system. DA keeps track of the peers where the partitions are stored in the form of an address table. It also sends an update message to the DAT in the event of any data update.
Data Access Tracker (DAT): It controls, manages and provides the required information in the system. It keeps track of the read/write timestamps associated with each data item. Every time a data item is added, read or updated by a transaction, the corresponding timestamp of the data item is also updated in the DAT. Two types of timestamps are associated with every data item, i.e., read and write. The read timestamp is the timestamp of the last global transaction that read the particular data item. The write timestamp is the timestamp of the last global transaction that wrote this data item. DAT also detects and resolves conflicts between global transactions.
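A minimal sketch of this bookkeeping is given below; the class and method names are our own, and the conflict rule shown is plain timestamp ordering rather than the full TSC2A algorithm described in Chapter 5:

    class DataAccessTracker:
        """Track the read/write timestamps of each data item."""

        def __init__(self):
            self.read_ts = {}    # data item -> timestamp of last reading transaction
            self.write_ts = {}   # data item -> timestamp of last writing transaction

        def note_read(self, item, ts):
            self.read_ts[item] = max(ts, self.read_ts.get(item, 0))

        def note_write(self, item, ts):
            self.write_ts[item] = ts

        def write_conflicts(self, item, ts) -> bool:
            """A write with timestamp ts conflicts if a younger transaction
            has already read or written the item."""
            return ts < self.read_ts.get(item, 0) or ts < self.write_ts.get(item, 0)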
Peer Identifier (PI): It keeps track of the addresses of the peers where database partitions are stored. PI also holds the routing information of the network. It implements the peer selection procedure.
Figure 4.2 Components of 3-TEM: the Transaction Coordinator (TC) with its Transaction Interface, Security Checker, Transaction Manager, Load Analyzer, Data Administrator, Peer Identifier and Data Access Tracker (DAT); the Transaction Processing Peer (TPP) with its Subtransaction Interface, Subtransaction Manager, Data Manager and Local Database; and the Result Coordinator (RC) with its Result Manager, Result Data Administrator and Result Pool, which returns results to the requester.
Subtransaction Manager (SSM): resolves a subtransaction ts_r^i and decides the data required by it at the TPP. It also checks the feasibility and availability of the requested data items in its local database. The subtransaction is further sent to the Data Manager (DM) for data mapping, which identifies the data items corresponding to the subtransaction in the local database and maintains local serializability.
Data Manager (DM): is responsible for mapping a subtransaction ts_r^i to its required data items in the database available at a TPP. It is responsible for all operations performed on data items corresponding to read/write subtransactions. It also maintains data consistency.
Local Database (LD): is the actual partition of the global database within which the data items reside. It provides the data items corresponding to read/write subtransactions.
Result Manager (RSM): is similar to the Transaction Manager (TM) but works in the reverse direction of TC. It ensures global serializability of the final results; serializability of the results is identified through the timestamps of the partial results. RSM is responsible for compiling the partial results rs_r^i into the global result Rs^i. It also compares the partial results received from the various replicas to identify the most up-to-date one. RSM sends a message to the user indicating that the result is ready. It hands over the final result to the user (transaction owner) after checking the user's authenticity; a user is identified by comparing the token issued against the submitted transaction.
Result Data Administrator (RDA): It manages the global as well as local databases. It
helps in compilation of partial results.
Result Pool (RP): holds the results until they are handed over to their owners. It keeps a log of the peers from which the partial results were received. RP also keeps track of the deadline attached to each transaction and its corresponding results, which is utilized to discard the result after a certain period of time.
results are stored in the RP. The result is forwarded to the requester after authenticating the requester and the token ids/timestamp associated with the query.
the transactions according to the database partitions and compiles the partial results received from the partitions after execution is proposed. It is a simple, fast and efficient technique for the P2P environment that addresses the above issues.
MAT is inspired by the method used to locate an element of a 2-D matrix in memory (RAM). It identifies the partition number and the local record number within that partition for a given (Row No., Column No.) of a record. MAT calculates the partition number and record number using the following procedure: Sr.No. ÷ df_r = (Q_r, R_r), where Q_r and R_r are the row-wise quotient and remainder, respectively. The quotient Q_r is the partition id and the remainder R_r is the local record id within the partition. Similarly, for the columns, Column No. ÷ df_c = (Q_c, R_c).
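In code, the MAT address calculation is a pair of divmod operations. The sketch below uses the notation above; the sample divide factors correspond to the example database that follows:

    def mat_locate(sr_no: int, col_no: int, df_r: int, df_c: int):
        """Map a global (row, column) position to a partition and a local position.

        Q_r and Q_c identify the partition [Q_r, Q_c]; R_r and R_c are the
        record and field positions inside that partition."""
        q_r, r_r = divmod(sr_no, df_r)
        q_c, r_c = divmod(col_no, df_c)
        return (q_r, q_c), (r_r, r_c)

    # With df_r = 10 and df_c = 3, global record 17, field 4 lies in
    # partition [1, 1] at local position (7, 1).
    assert mat_locate(17, 4, 10, 3) == ((1, 1), (7, 1))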
[Sample global database used to illustrate MAT: 30 records (Sr. No. 0-29) with fields PAN No, Name, Address, Age and four further fields, partitioned row-wise and column-wise into nine partitions labelled [0, 0] to [2, 2].]
2. TC resolves the incoming global query and identifies the position of the record to be accessed in the global database, i.e., its Sr.No. The primary key of the record may be used to identify the Sr.No. against the global schema. Conventional methods may be used to resolve the query (as per the primary key).
3. The record may be accessed from various partitions stored at remote locations. The partition ids, i.e., from where the record is to be accessed, are calculated.
(i) Row check: Sr.No. ÷ df_r = (Q_r, R_r)
Case I. If all columns/fields of the record are to be accessed, then the partitions with ids (Q_r, i), where i = 0 ... n−1, are to be accessed.
Case II. If a specific column corresponding to the specified row is to be accessed.
(v) Result information packets for the RC(s) are prepared, i.e., PacketRC[], which include the operation to be performed on the TPPs, the number of replicas for each partition, the partial results/acknowledgements expected from the TPPs, the query ids, timestamps and information about the requester.
(vi) The result information packets are sent to the RC(s).
Operations at TPP:
After receiving the information packet from TC, the TPP checks the serial number (Sr.No.) of the record to be accessed in its local database, i.e., the position of the record.
(i) The information packet is analyzed for the operation to be performed and the data required for performing it.
(ii) The position of the required record within the local database of the corresponding replica is calculated from the supplied R_r and R_c.
(iii) The specified operation is performed on the data.
(iv) Partial data/acknowledgements in the form of result packets are sent to the RC(s) after completion of the operation.
(v) An acknowledgement is sent to TC for successful access of the data items.
Operations at RC:
After receiving the information packets from TC, RC prepares the store to hold the partial results/acknowledgements expected in the form of result packets from the TPPs. The store is prepared to hold all result packets from the multiple replicas corresponding to each partition of the database.
(i) The result information packet for RC is analyzed for the operation to be performed and the space required for holding the results/acknowledgements corresponding to each replica.
(ii) RC waits until the result packets are received from the TPPs.
(iii) Each partial result/acknowledgement from the TPPs is positioned at its proper place, so that the partial results can be compiled into the final result corresponding to the global query.
(iv) RC collects the process completion messages from each TPP and compiles them. It also sends a process completion message to TC.
(v) The compiled result is placed in the result pool.
The result is handed over to the authenticated requester after matching its corresponding token number.
The following equation can be used to compute the candidature of peers. This candidature is further utilized to select the peers for holding replicas:

Cd_i = AST_i w1 + FSA_i w2 + CPA_i w3 + BD_i w4 + Cr_i w5    (4.1)

where Cd_i is the candidature of peer P_i; AST_i is the average session time for which peer P_i is active in the system; FSA_i, CPA_i, BD_i and Cr_i are the free storage available, the CPU availability, the bandwidth and the churn rate of peer P_i, respectively; and w1, ..., w5 are the corresponding weights.
Two parameters are considered while selecting a peer for storing replicas. The first is the candidature of the peer, which measures how capable a peer is of storing a replica. The second is the distance of each participating peer in the system, a measure of the cost spent sending and receiving messages between the peer and the centre peer; for an efficient system this distance should be minimized. A priority queue is used to store the best peers, i.e., those with the largest candidature among all peers. The length of the priority queue is double the number of peers required by the system (the length may be varied depending upon system requirements). All these peers are the best suited to store replicas among all peers participating in the overlay.
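A sketch of this selection step is given below, with the statistics of Eq. (4.1) supplied as plain tuples and a bounded min-heap serving as the priority queue; the weight values are arbitrary examples:

    import heapq

    WEIGHTS = (0.3, 0.2, 0.2, 0.2, 0.1)   # w1..w5, illustrative values only

    def candidature(stats, weights=WEIGHTS):
        """Weighted sum of Eq. (4.1); stats = (AST, FSA, CPA, BD, Cr)."""
        return sum(s * w for s, w in zip(stats, weights))

    def best_peers(peer_stats: dict, replicas_needed: int):
        """Keep the 2 * replicas_needed peers with the largest candidature."""
        queue_len = 2 * replicas_needed
        heap = []                          # min-heap of (candidature, peer id)
        for peer_id, stats in peer_stats.items():
            heapq.heappush(heap, (candidature(stats), peer_id))
            if len(heap) > queue_len:
                heapq.heappop(heap)        # drop the weakest candidate
        return sorted(heap, reverse=True)  # best candidates first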
4.6.1 Assumptions
A relational database is considered for the simulation. It is assumed that the database satisfies all the normal forms and is free from any anomaly. TC takes care of the ACID properties of the database. All subtransactions carry the same security level as the main transaction. The serializability of transactions and subtransactions is maintained by TC and the TPPs. Timestamps are used to compare the freshness of data items; each subtransaction carries the timestamp of its parent transaction. TC does the data placement and decides the divide factors and the type of database partitioning (horizontal, vertical, or both). It has complete information about all the fields in the database (DB). TC manages the concurrent execution of processes.
To evaluate the performance of 3-TEM, an event driven simulation model for a firm deadline real time distributed database system has been developed (Figure 4.4). This model is an extended version of the model defined in [164]. The model consists of an RTDDBS distributed over n peers connected by a secure network. The various components of the model are categorized into global and local components, described as follows:
system with a specified mean transaction arrival rate. It also provides timestamps to the arrived transactions. The Transaction Manager models the execution behavior of the
Local Components: Ready Queue — all arrived subtransactions that are ready to execute are initially placed in it according to their priority. Subtransactions get CPU ticks one by one, in order of their priority. Wait Queue holds the subtransactions that are blocked for any reason, e.g., resource conflicts or the concurrent execution of processes. It holds the subtransactions until their corresponding conflicts are resolved. A transaction from the blocked queue also gets the CPU when it is ready to execute, i.e., when all its corresponding conflicts are resolved.
The Concurrency Control Manager (CCM) implements the Timestamp based Secure Concurrency Control Algorithm (TSC2A) {described in Chapter 5}. It manages the concurrent execution of processes, and all conflicts over resources and processes are resolved with the help of the timestamps associated with them. The Local Scheduler is responsible for managing the locks for subtransactions. Depending on the CCM, it decides whether a lock-requesting subtransaction can be processed, blocked in the wait queue, or restarted. It schedules the subtransactions and controls their access to the CPU. At any given time, the transaction that has the highest priority gets the CPU, unless it is blocked by other transactions due to a lock conflict.
In a firm deadline system, transactions that have missed their deadlines are useless and are aborted from the system. The deadline of each transaction is checked before execution. Ready transactions wait for execution in the ready queue according to their priorities. Since main memory database systems can better support real time applications, it is assumed that the databases reside in main memory. A transaction requests a lock on data items before it executes on them. A restarted subtransaction releases all locked resources and is restarted from its beginning. A successfully committed subtransaction likewise releases all the resources locked by it. Finally, the Sink collects statistics on the completed transactions from the peer.
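A minimal sketch of this firm-deadline admission step, assuming a priority-ordered ready queue (the types and names are illustrative, not the simulator's own): a transaction whose deadline has already expired is dropped instead of being scheduled.

    #include <queue>
    #include <vector>

    struct Txn {
        double priority;   // e.g., derived from the deadline
        double deadline;   // absolute firm deadline
    };
    struct ByPriority {
        bool operator()(const Txn& a, const Txn& b) const { return a.priority < b.priority; }
    };

    // Pop the highest-priority ready transaction; abort any whose firm deadline
    // has already passed (missed transactions are useless in a firm-deadline system).
    bool nextRunnable(std::priority_queue<Txn, std::vector<Txn>, ByPriority>& ready,
                      double now, Txn& out) {
        while (!ready.empty()) {
            Txn t = ready.top(); ready.pop();
            if (t.deadline <= now) continue;   // deadline expired: drop (abort) it
            out = t;                           // deadline still ahead: schedule it
            return true;
        }
        return false;                          // nothing runnable
    }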
{Figure 4.4: Event-driven simulation model for the firm deadline RTDDBS: a Transaction Generator, Transaction Manager, Transaction Scheduler, Transaction Dispatcher, Coordinator and Network Manager at the global level, and, at each peer, a Local Scheduler with Ready Queue, Wait Queue, Concurrency Control Manager, Memory Database and Sink.}
The transaction scheduler is responsible for managing the locks for transactions. Depending on the Timestamp based Secure Concurrency Control Algorithm (TSC2A), the transaction scheduler determines whether a lock-requesting transaction can be processed, blocked, or restarted. A restarted transaction releases all locked resources and is restarted from its beginning. A transaction, after successful commitment, releases all the locked resources. The deadlines of the firm real time transactions are defined based on the execution time of the transactions as:
TDeadline = TArrivalTime + (TExecutionTime + SF)        (4.2)
Where:
TArrivalTime is the time when a transaction arrives in the system, TExecutionTime is its execution time, and SF is the slack factor, computed by eqn (4.3) as a function of the number of operations in the transaction (No. of Operations) and the time TTimeLock required to lock a data item.
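Taken literally, eqn (4.2) gives the deadline as the arrival time plus the slack-adjusted execution time. A small helper in the simulator's language, with SF supplied by the caller since the exact form of eqn (4.3) is not reproduced here:

    // Deadline per eqn (4.2): T_Deadline = T_ArrivalTime + (T_ExecutionTime + SF).
    // SF is the slack factor of eqn (4.3), computed elsewhere and passed in.
    double firmDeadline(double arrivalTime, double executionTime, double slackFactor) {
        return arrivalTime + (executionTime + slackFactor);
    }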
The performance of the MAT integrated 3-TEM is evaluated and compared with other
existing systems through simulation. In the simulation we have used the performance
metrics defined in Table 4.1 and Table 4.2 and the performance parameters defined in
Table 4.3.
Peer Availability: the total up time of an individual peer out of its total time.
Partition Availability: the total up time of a group of peers out of its total time.
Transaction Miss Ratio (TMR): the percentage of input transactions that are unable to complete before the expiry of their deadline, over the total number of transactions submitted to the system: TMR = TMissed / TTotal.
Transaction Restart Ratio (TRR): the percentage of transactions that are restarted for any reason, over the total number of transactions submitted to the system: TRR = TRestart / TTotal.
Transaction Success Ratio (TSR): the percentage of transactions that are committed successfully within their deadline, over the total number of transactions submitted to the system: TSR = TSuccess / TTotal.
Transaction Abort Ratio (TAR): the percentage of transactions that are aborted for any reason, over the total number of transactions submitted to the system: TAR = TAbort / TTotal.
Throughput: the number of transactions committed within their deadlines in unit time. Logical structures with high throughput can be utilized for high performance databases over P2P networks: Throughput = TCommitted / TotalTime.
Response Time: the time duration between a transaction being submitted and its first result being received.
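These metrics are simple counter ratios over the run; a sketch with illustrative field names:

    // Transaction counters and the ratio metrics of Tables 4.1/4.2 (illustrative).
    struct Stats {
        long missed = 0, restarted = 0, committed = 0, aborted = 0, total = 0;
        double totalTime = 1.0;                 // observation period; assumed > 0
        double TMR() const { return double(missed)    / total; }  // assumes total > 0
        double TRR() const { return double(restarted) / total; }
        double TSR() const { return double(committed) / total; }
        double TAR() const { return double(aborted)   / total; }
        double throughput() const { return committed / totalTime; }  // tps
    };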
Table 4.3 Performance Parameters and their Default Settings

Parameter            Default Setting
Num_Peer             200
DB_Size
Mean_Arrival_Rate    0.2-2.0
Ex_Pattern           Sequential
Num_CPUs
SlackFactor
Min_HF               1.2
Commit_Time          40 ms
Num_Operation
U_Carda              5-15, uniformly distributed
O_Carda              5-10, uniformly distributed
Latency              5-20, uniformly distributed
{Figure 4.5: Individual, partition and system availability vs. peer availability.}
Figure 4.6 shows that the throughput initially increases with increase in the mean transaction arrival rate (MTAR). After reaching its peak it starts decreasing with further increase in MTAR. At the peak of the throughput curve the system is at its peak performance. It is also observed that 1-TEM produces its best performance at MTAR values in the range 1-1.2, whereas 3-TEM produces its best at an MTAR value of 1.4. Thus 3-TEM bears extra load and executes more transactions per second as compared to 1-TEM. The throughput of 3-TEM is higher than that of the 1-TEM system for all values of MTAR, because the 3-TEM system takes a smaller span of CPU time for the execution of a transaction.
Figure 4.7 shows that the response times for 1-TEM (the conventional execution model) and 3-TEM are the same in both cases. It is also observed that the response time goes on increasing with an increase in the number of partitions. This may be due to network delay, which increases with the number of partitions.
{Figure 4.6: Throughput (tps) vs. Mean Transaction Arrival Rate for 1-TEM and 3-TEM.}
{Figure 4.7: Response Time vs. Number of Partitions for 1-TEM and 3-TEM.}
Figure 4.8 shows that the Query Completion Ratio is initially high and then decreases with increase in the value of MTAR. For smaller values of MTAR, the system has sufficient time and resources to execute the small number of queries arriving per second. The Query Completion Ratio in the case of 3-TEM is always higher as compared to 1-TEM; it starts decreasing near an MTAR value of 1.4, as compared to 1 in the case of 1-TEM. This indicates that 3-TEM completes more transactions than 1-TEM and bears a higher load (more transactions per second) as compared to 1-TEM.
{Figure 4.8: Query Completion Ratio vs. Mean Transaction Arrival Rate for 1-TEM and 3-TEM.}
Figure 4.9 shows that the miss ratio starts increasing with increase in the value of MTAR. It is also observed that after a certain value of MTAR the miss ratio increases rapidly; this value is 1.6 in the case of 3-TEM and 0.6 in the case of 1-TEM. This is because, beyond a certain value of MTAR, the number of transactions in the system grows beyond the execution rate. Most resources are occupied by transactions and the dependency among resources also increases, resulting in more transactions getting blocked in the wait queue or aborted.
From Figure 4.10 it is observed that the restart ratio of transactions goes on increasing with increase in the value of MTAR and starts decreasing after some value of MTAR, which is 1.2 in the case of 1-TEM and 1.4 for 3-TEM. The peaks in the restart ratio show that beyond that value the transactions do not have sufficient time to execute, i.e., the system refuses to allocate resources to the transactions in the wait queue because the available remaining time falls short of the transactions' deadlines. It is also observed that the restart ratio of 3-TEM is much lower than that of 1-TEM, because 3-TEM executes the subtransactions in parallel and resources are freed stage by stage after execution, becoming readily available for the next subprocess.
{Figure 4.9: Miss Ratio vs. Mean Transaction Arrival Rate for 1-TEM and 3-TEM.}
{Figure 4.10: Restart Ratio vs. Mean Transaction Arrival Rate for 1-TEM and 3-TEM.}
Figure 4.11 shows that the abort ratio starts increasing with increase in the value of MTAR. After a certain value of MTAR the abort ratio starts increasing rapidly; this value is 0.6 for 3-TEM and 0.8 for 1-TEM. The abort ratio for 3-TEM is much lower than that of 1-TEM. Thus, the resource utilization in 3-TEM is higher as compared to 1-TEM.
{Figure 4.11: Abort Ratio vs. Mean Transaction Arrival Rate for 1-TEM and 3-TEM.}
4.8 Discussion
From the simulation results it is observed that partition availability reaches the acceptable range of more than 0.7 when the availability of individual peers is 0.35. To achieve availability at the acceptable level of 0.95, 6-7 peers with more than 0.7 availability may be recommended for the system. It is observed that the 1-Tier Execution Model (1-TEM) produces its best performance at MTAR values from 1 to 1.2, whereas for 3-TEM this range is from 1.2 to 1.4. Thus, 3-TEM can bear extra load, i.e., it can execute more transactions per second as compared to 1-TEM. The throughput is always higher for all values of MTAR in the case of 3-TEM, because a smaller span of CPU time is required to execute a transaction; 3-TEM executes the subtransactions in parallel.
The response time increases with increase in the number of partitions; the higher network delay is responsible for this. The query completion ratio in the case of 3-TEM is always high as compared to 1-TEM, because of parallelism and the resource availability in the system. The resource wastage is very low for 3-TEM as compared with 1-TEM, because the restart ratio and miss ratio for 3-TEM are always less than those of 1-TEM. Thus, the resource utilization is higher for 3-TEM. The resource availability in the system is also high for 3-TEM because, with small subprocesses, a process holds a resource only for a short duration of execution. The reduced load at TC also helps in improving the performance of 3-TEM.
4.9 Summary
In this chapter the 3-Tier Execution Model (3-TEM) is presented, which divides the complete execution process into three independent stages. It provides high throughput and bears a higher transaction load as compared to the conventional 1-TEM.
Chapter 5
5.1 Introduction
In a Real Time Distributed Database System (RTDDBS), multiple database sites are linked by a communication system in such a way that the data at any site is available to users at other sites. Such a system has several characteristics: (1) a transparent interface between users and data sites; (2) the ability to locate the data; (3) a Database Management System (DBMS) to process queries; (4) distributed concurrency control and recovery procedures on the network; and (5) mediators which translate queries and data between heterogeneous systems.
A Secure Real Time Distributed Database System (SRTDDBS) consists of
security classes and restricts database operations based on the security levels. It
secures each transaction and data in the system. A security level for a transaction
represents its clearance, security and classification levels. Concurrency control is an essential component of an SRTDDBS. The set of operations of a transaction in an SRTDDBS is defined as Op = {Op1, Op2, Op3, ..., Opk}, where k is a number less than the maximum number of operations defined for the transactions. A global transaction Tr can be divided into i subtransactions and can be defined as Tr = {tr1, tr2, tr3, ..., tri}. The coordinator assigns a timestamp Ts to every global transaction entering the system. All transactions are ordered in ascending order of their timestamps. A subtransaction carries the timestamp tsi of its corresponding parent transaction Tri. Sc is the partially ordered set of security levels with an ordering relation Sci <= Scj, and Lv is a mapping from Dt ∪ Tr (data items and transactions) to Sc. A subtransaction tri can precede a subtransaction trj if the timestamp tsi of tri is smaller than the timestamp tsj of trj, i.e., tsi < tsj.
Every data item is assigned a security level: ∀x ∈ Dt, Lv(x) ∈ Sc. For a secure database N, Lvmin(N), Lvmax(N) ∈ Sc and Lvmin(N) <= Lvmax(N). In other words, every secure database in the distributed system has a range of security levels associated with it. A data item x is stored in a secure database N if it satisfies the condition Lvmin(N) <= Lv(x) <= Lvmax(N). A database Ni is allowed to exchange data with a database Nj iff Lvmax(Ni) = Lvmax(Nj). Access control is based on the Bell-La Padula model [190] and enforces the following restrictions: a transaction T can read a data object x only if Lv(x) <= Lv(T), and can write x only at its own level. Thus, a transaction can read objects at its level or below, but it can write objects only at its level. A transaction with a low security level is not allowed to write higher security level data objects. This property is used for incorporating database integrity. In addition to these two requirements, a secure system must guard against illegal information flows through covert channels.
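The two Bell-La Padula restrictions reduce to simple level comparisons; a minimal sketch, assuming a three-valued security class (the enum values are illustrative):

    enum class Level { Low = 0, Medium = 1, High = 2 };   // security classes Sc

    // Bell-La Padula style checks as stated above: a transaction may read
    // objects at its level or below, and may write only at its own level.
    bool canRead (Level txn, Level obj) { return obj <= txn; }
    bool canWrite(Level txn, Level obj) { return obj == txn; }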
For the proposed algorithm, the second and third configurations are favored over the first one, because it is difficult to maintain a copy of the global schema at every peer; doing so also hinders the expandability and simplicity of the system. In case (c), the coordinator, which is responsible for assigning timestamps to all global transactions, solves the problem of timestamp assignment. Case (c) is therefore considered for the implementation of TSC2A.
Subtransactions may reach a TPP out of order due to communication delays, path failures, etc. The TPP itself arranges these subtransactions in order, depending upon their associated timestamps. A subtransaction may be blocked after a subtransaction having a lower timestamp is received, and blocked transactions may be restarted at their turn in order. Hence, local serializability is also guaranteed through this mechanism.
Let Tr be the set of global transactions to be executed. A transaction from Tr is resolved into subtransactions tr. Subtransactions from tr are executed such that, if a subtransaction tri precedes a subtransaction trj in this ordering, then for every pair of atomic operations Oip and Ojp from tri and trj respectively, Oip precedes Ojp in each local schedule. The execution of subtransaction trj can be blocked after tri is received by the TPP, which yields local serializability. Therefore, if the Coordinator submits subtransactions in a serializable order to the TPPs, then each TPP executes the subtransactions in that serializable order, guaranteeing overall serializability in the system.
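This TPP-side rule can be sketched as a timestamp-ordered pending set from which only the smallest-timestamp subtransaction is released for execution (a sketch under these assumptions, not the thesis code):

    #include <set>

    struct Sub { long ts; int op; };   // subtransaction carrying its parent's timestamp

    struct ByTs { bool operator()(const Sub& a, const Sub& b) const { return a.ts < b.ts; } };

    // Pending subtransactions at a TPP, kept in ascending timestamp order.
    // Only the subtransaction with the smallest timestamp may execute; later
    // arrivals with larger timestamps stay blocked, preserving local serializability.
    class TppQueue {
        std::multiset<Sub, ByTs> pending_;
    public:
        void arrive(const Sub& s) { pending_.insert(s); }
        bool hasWork() const      { return !pending_.empty(); }
        Sub  next() const         { return *pending_.begin(); }     // smallest timestamp
        void completeNext()       { pending_.erase(pending_.begin()); }
    };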
{
    Writelock( x );
    Execution( x );
    WTs( x ) = Tsi;
    Update DAT to Tsi;
}
Else
{
    Abort( Si );   /* access denied due to security */
}
} // end Algorithm
}
Else
{
    Abort( Si );
    Rollback( Si );
}
} // end Algorithm
Global transactions are not likely to be rolled back frequently. But when a global subtransaction is rolled back by TSC2A, all subtransactions corresponding to its global transaction are rolled back. TSC2A enhances the execution autonomy of the TPPs by rolling back a global transaction at the coordinator site before its subtransactions are sent to the relevant TPPs. This job is normally done by the DAT.
The performance of the TSC2A is evaluated and compared for three cases, viz., low,
medium and high security through simulation. In the simulation we have used the
performance parameters defined in Table 4.3 and performance metrics defined in
Table 4.2 {Chapter 4} to study performance of the TSC2A.
5.6.2 Assumptions
The following assumptions are made during the implementation of the TSC2A:
The model assumes that each global transaction is assigned a unique identifier.
To execute a transaction, the use of the CPU and of the data items located at a peer is required.
There is no global shared memory in the system, and all peers communicate via message exchange over the communication links.
Each transaction is assigned a globally distinct real time priority using a specific priority assignment technique; Earliest Deadline First is used in the simulation (a comparator sketch follows this list).
The cohorts of a transaction are activated at the corresponding TPPs to perform the operations.
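A minimal Earliest-Deadline-First comparator consistent with the assumption above (illustrative types, not the simulator's own):

    // Earliest Deadline First: the transaction with the nearer absolute
    // deadline gets the higher real-time priority (illustrative sketch).
    struct Txn { long id; double deadline; };
    struct EdfFirst {
        // For std::priority_queue this puts the smallest deadline on top.
        bool operator()(const Txn& a, const Txn& b) const {
            return a.deadline > b.deadline;
        }
    };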
From Figure 5.1 it is observed that the miss ratio is higher for high security transactions as compared with low security transactions. This shows that implementing high security transactions in any distributed database system may compromise the miss ratio, and the throughput achieved for these high security transactions is reduced.
From Figure 5.2 it is observed that the transaction restart ratio increases with increase in the value of MTAR. After reaching a peak, the restart ratio goes on decreasing for further increase in the value of MTAR; this peak is the point where the maximum number of transactions are restarted. The decrease in restart ratio is due to the fact that transactions no longer have sufficient remaining time to execute before the expiry of their deadlines; at this point the abort ratio of the transactions is high. Hence, the restart ratio of the transactions goes on decreasing beyond this value of MTAR, which is different for each security level. The transaction restart ratio is higher in the case of high security transactions, because transactions restart due to their wait for high security level data items.
{Figure 5.1: Comparison between Miss Ratio of Transactions and Mean Transaction Arrival Rate (MTAR) for high, medium and low security transactions.}
{Figure 5.2: Restart Ratio vs. MTAR for high, medium and low security transactions.}
It is observed from Figure 5.3 that initially the success ratio increases for all three security levels. After a certain value of MTAR, the success ratio goes on decreasing. The variation in this particular value is due to the transaction execution rate, which is higher for low security transactions than for medium and high security ones, and higher for medium than for high security transactions. The system exhausts at a higher value of MTAR in the case of low security transactions as compared to medium and high security transactions, because transactions have to wait for their turn in the wait queue due to unavailability of resources. The success ratio starts decreasing after a certain value of MTAR because, up to this point, the system has sufficient resources and time to execute transactions at the rate at which they arrive; beyond it, the wait queue starts growing because the arrival rate exceeds the completion rate. Under this load all resources become busy and transactions have to wait further for locks on resources. Thus, the success ratio starts decreasing with further increase in MTAR. The low security level transactions have the highest success ratio among the three security levels, i.e., high, medium and low.
{Figure 5.3: Success Ratio vs. MTAR for high, medium and low security transactions.}
From Figure 5.4 it is observed that the transaction abort ratio increases with increase in the mean transaction arrival rate. The abort ratio increases in all three cases: high, medium and low security levels. The abort ratio is highest for high security transactions as compared to the other security levels, because higher priority is given to low security level transactions: a high security level transaction is aborted and restarted after some delay whenever a data conflict occurs between a high security level and a low security level transaction.
{Figure 5.4: Abort Ratio vs. MTAR for high, medium and low security transactions.}
Figure 5.5 shows the transaction throughput as a function of the MTAR per peer. It can be seen that the throughput of TSC2A initially increases with increasing arrival rates and then decreases with further increase in arrival rates. The peak values indicate the transaction arrival workload that the system can bear. This value is different for the three cases and is highest for low security transactions. However, the overall throughput of high security transactions is always less than that of medium and low security level transactions. It is also observed that the throughput of high security level transactions falls further below that of low security level transactions as the arrival rate increases.
{Figure 5.5: Throughput (tps) vs. Mean Transaction Arrival Rate (MTAR) for high, medium and low security transactions.}
5.7 Discussion
From the simulation results it is observed that the restart ratio degrades the performance of the system, both in the time required by the database coordinator to reset the database to its previous state and in the computation time of the individual transactions. This performance degradation increases with increase in the number of concurrent transactions. Restarts increase because of the time required for taking permission to access high security data items and because of other conflicts over resource locks. The transaction arrival rate should therefore be managed so that the abort and restart ratios of the system are minimized and the throughput is maximized.
The transaction execution rate is higher for less secure transactions than for medium and high security transactions. The system starts exhausting at a different value of MTAR in each of the three cases, and this value is highest for low security transactions. The load bearing capacity of the system also varies with the security levels used for the transactions and the data items: in terms of the rate of transaction execution, it is higher for low security transactions as compared to medium and high security ones.
It is also observed that the throughput of the system decreases as the security level of the transactions increases, since the probability of successfully executing a transaction decreases; there is a tradeoff between the security level and the throughput of the system.
5.8 Summary
In this chapter, we have presented the Timestamp based Secure Concurrency Control Algorithm (TSC2A). This algorithm takes care of the security of transactions, of the data items provided to the transactions, and of the data items stored at the various peers. It also controls the flow of high security data items so that they cannot be accessed by low security transactions.
TSC2A secures data items and transactions through security levels and restricts data access across security levels. It also avoids the covert channel problem in accessing the database. TSC2A ensures serializability in the execution of transactions, enforcing the serializability property at the global (TC) and local (TPPs) levels in the system.
In the next chapter we discuss a topology adaptive traffic controller for P2P networks.
Chapter 6
6.1. Introduction
A Peer-to-Peer (P2P) network is an abstract, logical network called an overlay network. Instead of strictly decomposing the system into clients (which consume services) and servers (which provide services), each peer in the system elects to provide services as well as consume them. All participating peers form a P2P network over a physical network. The network overlay abstraction provides flexible and extensible application-level management techniques that can be easily and incrementally deployed over the underlying network. When a new peer joins the network, a bootstrapping node provides a list of IP addresses of existing peers in the network. The new peer then tries to connect with these peers. If some attempts succeed, the connected peers become the new peer's neighbors. Once a peer has connected into the network, it periodically pings its network connections and obtains the IP addresses of some other peers, which it caches. When a peer leaves the network and later wants to rejoin (i.e., it is no longer joining for the first time), it tries to connect to the peers whose IP addresses it has already cached.
Peers randomly joining and leaving the network cause a topology mismatch between the P2P logical overlay and the physical underlying network, producing a large volume of redundant traffic. The flooding-based routing algorithm generates 330 TB/month in a Gnutella network with only 50,000 nodes [10]. A large portion of the heavy P2P traffic is caused by this topology mismatch between the overlay and underlay topologies, which makes unstructured P2P systems unscalable. A message exchanged between peers in the overlay topology travels a multiple-hop distance in the underlay topology, and to maintain the topology many data and control messages have to be sent from one peer to another in the overlay network. Generally a flooding technique, bounded by TTL, is used for searching for a peer or data item in P2P overlay networks: the search messages are sent to all connected peers. The load of these messages in the overlay is multiplied several times in the underlay topology, thus generating heavy redundant traffic in the network.
In P2P networks peer nodes rely on one another for services, rather than relying solely on dedicated and often centralized infrastructure; decentralized data-sharing and discovery algorithms/mechanisms are therefore the boosting option for the deployment of P2P networks. The challenge for researchers is to address the topology mismatch problem so as to remove unnecessary redundant traffic from the network. This problem can be introduced more clearly using Figure 6.1.
In Figure 6.1, eight peers (numbered 1-8) participate in the underlay network, out of which only four peers are in the overlay network. We operate in the overlay network, and therefore we have two cases: first, we deliberately send a message from one overlay peer to another; second, there is no option to send the message from one peer to another without using some intermediate peer in the overlay network. Both cases cause heavy redundant traffic on the physical network. In Figure 6.1, suppose we send a query from peer 1 to peer 6 in the overlay; let this be Path(1) from (1,6).
{Figure 6.1: Overlay topology of four peers mapped onto an underlay topology of eight peers (topology mismatch example).}
In the underlay this path is an ordered set of peers; for the considered example:
Path(1) = {1,2,3,5,6} from (1,6)
Path(2) = {6,5,3,4} from (6,4)
Path(3) = {4,3,7,8} from (4,8)
Say the query is sent through peers (1, 6, 4) in the overlay. The query then travels in the underlay as {1,2,3,5,6,5,3,4}, which means the traffic cost over {3,5,6} is spent twice and wasted. Similarly, if we send the query over 3 hops in the overlay network, twice the traffic cost of {3,5,6} and {3,4} is wasted. This is one of the major reasons for unwanted heavy traffic in any P2P network, and it is this unwanted traffic that we have to save.
A mechanism to reduce redundant traffic in the P2P network is therefore required. The search scope of the overlay should not be changed while the network traffic is reduced, and the overlay topology must remain unaltered while the path is reduced.
Let Cost^(i)_(a,b) denote the cost of the i-th path from peer a to peer b. The cost of two consecutive paths is
Cost^(i,j)_(a,d) = Cost^(i)_(a,b) + Cost^(j)_(c,d) if b = c, and infinity otherwise.
{Figure 6.2: 3-Layer Traffic Management System (3-LTMS) for Overlay Networks: the application layer; a middle layer comprising the Query Analyzer, Query Optimizer, Query Execution Engine and Path Manager over the overlay network; and the Physical Network Manager over the underlay network.}
The first layer is implemented over the application layer of the network. The next layer comprises four components. The Query Analyzer accepts queries, resolves them and forwards them to the Query Optimizer; it is also responsible for breaking a query into subqueries. The Query Optimizer decides whether or not a peer is suitable for a particular subquery. The Query Execution Engine executes the subquery and produces partial results corresponding to it; these partial results are then sent to the requesting peer or to the peer responsible for compiling them. The Path Manager is responsible for reducing the path and implements CJM: it reduces the logical path between the peers in the overlay network, and all paths are checked and reduced (if possible) using the database of the underlay peers. The third layer, the Physical Network Manager, is responsible for managing the underlay; it utilizes the information from the Path Manager.
To avoid redundant traffic in the P2P network, any two physical paths in the underlay may be checked for a common junction other than the source/destination. This common junction may be used to reroute the traffic and avoid the conventional path of higher path cost. The following algorithm is used to find the common junction.
Step 8. End; // the path Path^(i,j)_(S,D) obtained by CJM holds the shortest path
To get the paths, CJM finds the Common Junction and diverts the query from the regular path to the reduced/shorter one.
In Figure 6.1, for the two-hop path the Common Junction set CJ{} is obtained by Path(1) ∩ Path(2) = CJ{3}, and the query is diverted through this junction to save the unwanted traffic cost of the network. The query now has to travel the path {1,2,3,4}, whose traffic cost is much less as compared to the conventional path. Second, taking the 3-hop route travelling through Path(1), Path(2) and Path(3), we get the Common Junction peer 3 from the 1st and 3rd paths, which means the query has to travel the path {1,2,3,7,8}. As a result, CJM saves a large amount of traffic cost on the identified paths.
If a query is sent logically from peer 1 to peer 4 through peer 6 in the overlay network, it is a two-hop-count path {Figure 6.1}. The physical path traversed by the messages is [1 → 2 → 3 → 5 → 6 → 5 → 3 → 4], i.e., Path^(1,2) = (1,2,3,5,6,5,3,4). This path is a combination of two paths, i.e., path 1 and path 2; with the conventional methodology these two paths can be merged only when the destination of the first path and the source of the second path are the same peer. At this point we can save traffic cost by routing the query through the path [1 → 2 → 3 → 4], avoiding the redundant segment [3 → 5 → 6 → 5 → 3] and saving that portion of the traffic cost. Here peer 3 is the common junction between the two paths, and CJM is based on this key idea of a common junction between two paths. The common junction can be identified as Path(1)(1,2,3,5,6) ∩ Path(2)(6,5,3,4) = CJ{3,6}, where CJ{} is the ordered set of peers common to both paths. Here peers 3 and 6 are the common junctions of the two paths Path(1) and Path(2), from where a query may be diverted. If the query is diverted through peer 6, the path remains the conventional one; but by choosing peer 3 for routing the query, we can save a lot of traffic cost. Similarly, a common junction can be identified between any two available paths; these paths may or may not be continuous.
The objective of CJM is to minimize the cost of the existing paths when data is sent across multiple hops.
Let Path^(i)_(j,k) be the i-th path, an ordered set of the peers coming across the path from source peer j to destination peer k. To find a two-hop path from (l,n), we have to find Path^(i,j)_(l,n) = Path^(i)_(l,k) . Path^(j)_(k,n). Assume that there is no direct path between the peers.
The traffic cost TC^(i)_(i,j) of the i-th path is the cost to transfer unit data from peer i to peer j along the conventional path. The cost of the two-hop path Path^(i,j)_(l,n) is
TC^(i,j)_(l,n) = TC^(i)_(l,k) + TC^(j)_(m,n) if k = m, and infinity otherwise.
CJM finds the minimum path length between two connected paths, i.e., P^(i)_(k,l) and P^(j)_(m,n). It searches for a common junction between the paths using:
P^(i)_(k,l) ∩ P^(j)_(m,n) = CJ{}        (6.1)
Case I. CJ{} = φ        (6.2)
In Case-I there is no common junction between the two paths, i.e., there is no continuous path and no common junction peer present in the two paths. The costs computed with the help of eqn (6.3) and (6.4) are as follows:
TC^(i,j)_(k,n) = TC^(i)_(k,l) + TC^(j)_(m,n) = infinity        (6.3)
TCC^(i,j)_(k,n) = infinity        (6.4)
From eqn (6.3) and (6.4), both cases show a total traffic cost of infinity, i.e., there is no path available from source to destination.
Case II. CJ{} ≠ φ
This means that there is a common junction between the two paths; the path may be either continuous or have some common junction peers. The traffic costs for these two cases are computed as follows:
Case-II (a): When k = m there is a continuous path, i.e., the destination of the 1st path is equal to the source of the 2nd path. In this case the common junction r between the two paths is the destination of the 1st path and the source of the 2nd path, i.e., the last peer of the 1st path and the starting peer of the 2nd path. The traffic costs for the normal (conventional) path and for CJM are computed as:
TC^(i,j)_(k,n) = TC^(i)_(k,l) + TC^(j)_(m,n)        (6.5)
TCC^(i,j)_(k,n) = TC^(i)_(k,r) + TC^(j)_(r,n)        (6.6)
From eqn (6.5) and (6.6) it is observed that the traffic cost is the same in both cases.
Case-II (b): When k ≠ m and a common junction r is found between the two paths, the traffic cost of the joint path is:
TCC^(i,j)_(k,n) = TC^(i)_(k,r) + (TC^(j)_(m,n) - TC^(j)_(m,r))        (6.7)
TC^(i,j)_(k,n) = TC^(i)_(k,l) + TC^(j)_(m,n)        (6.8)
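The case analysis amounts to intersecting the two underlay paths and splicing the route at a junction peer. A compact sketch in C++ (the simulator's language; the function and type names are illustrative, not the thesis implementation):

    #include <cstddef>
    #include <unordered_map>
    #include <vector>

    using Path = std::vector<int>;   // ordered peer ids along an underlay path

    // CJM core idea: find a common junction r of two consecutive paths and
    // splice the route there, dropping the redundant segment beyond r.
    // Falls back to plain concatenation (assuming p1 ends where p2 starts)
    // when no junction is found.
    Path spliceAtCommonJunction(const Path& p1, const Path& p2) {
        std::unordered_map<int, std::size_t> posIn2;          // peer -> index in p2
        for (std::size_t j = 0; j < p2.size(); ++j) posIn2.emplace(p2[j], j);

        // Scan p1 from the source; the first shared peer is the junction
        // giving the largest saving on p1's side.
        for (std::size_t i = 0; i < p1.size(); ++i) {
            auto it = posIn2.find(p1[i]);
            if (it == posIn2.end()) continue;
            Path reduced(p1.begin(), p1.begin() + i + 1);          // source .. r
            reduced.insert(reduced.end(),
                           p2.begin() + it->second + 1, p2.end()); // r .. destination
            return reduced;
        }
        Path whole = p1;                                      // CJ{} empty
        whole.insert(whole.end(), p2.begin() + 1, p2.end());  // conventional path
        return whole;
    }

Applied to Path(1) = {1,2,3,5,6} and Path(2) = {6,5,3,4} from Figure 6.1, this sketch returns {1,2,3,4}, exactly the rerouted path derived above.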
To study the behavior of CJM, an event driven simulation model has been developed in C++ {Figure 6.3}. A brief overview of the different components of the model is as follows.
{Figure 6.3: Components of the CJM simulation model: the peers P1...Pn, Time Scheduler, Underlay Topology Manager, Overlay Topology Manager, Path Manager, Network Analyzer and Network Manager.}
Peers are the active entities participating in the network. Each peer has its predefined
availability factor which is decided at the time of its generation. Availability factor
decides the availability behavior of a peer in the network.
Time Scheduler schedules all time based events for the system. It analyzes the session
time and other statistics of peers and networks. Time scheduler decides the time of
joining/leaving of a peer depending upon the availability factor from the network.
The Underlay Topology Manager binds the peers and manages their underlay topology. The number of connections a peer has is decided by the cardinality of the peer: the Underlay Topology Manager randomly decides the cardinality of a peer (from the range set by the user) at the time of connection. The other related parameters, viz., the latency of the communication link, etc., are also decided at the time of connection.
The Overlay Topology Manager manages the topology used to connect the selected peers. It uses the overlay cardinality to decide the number of connections a peer has in the overlay, and also implements the logical topologies used to analyze the structure of the network.
Network Analyzer keeps track of statistics of the various elements, viz., peers,
network, paths, and cost. It collects information from all other components which
helps in making decisions about the network.
Path Manager manages all the paths to connect the peers in underlay/overlay. It uses
CJM algorithm to find the shortest path in underlay and required paths in overlay
topology. It also keeps track of underlay path to connect two peers in overlay. The
Path Manager provides multiple paths in underlay against any connection between
two peers in overlay. It updates paths in underlay after leaving of any peer from the
underlay.
To study the behavior of a CJM based P2P network, we have considered n distributed peer elements connected by communication links in the underlay. The following metrics are considered:
Response Time (RT): the time taken by a test message to traverse the maximum hop count path between source and destination. Average Response Time (ART): the average over all possible paths of every hop count, computed as
ART = ( Σ_{i=1}^{t} RT of Path(i) ) / t        (6.9)
where t is the number of paths considered.
Path Length (PL): the maximum hop count between source and destination of a path. Average Path Length (APL): the average of all PLs in the network, computed as
APL = ( Σ_{i=1}^{t} PL(i) ) / t        (6.11)
Path Cost (PC): the cost spent by the test message to travel over the communication links between source and destination; the path cost comprises all costs, including bandwidth, latency, etc. Average Path Cost (APC): the average of the cost spent over all possible paths of every hop count, computed as
APC = ( Σ_{i=1}^{t} PC(i) ) / t        (6.12)
The average percentage reduction in path cost achieved by CJM is
((APC - APC_CJM) / APC) × 100        (6.14)
where APC_CJM is the Average Path Cost through CJM. Similarly, the percentage Response Time Reduction (RTR) is
((ART - ART_CJM) / ART) × 100        (6.15)
where ART_CJM is the Average Response Time through CJM.
{Figure 6.4: Number of Partitions vs. Underlay Cardinality.}
Figure 6.5 shows that the maximum path length used in the underlay to transfer data/control messages depends upon the underlay cardinality (the number of connections a peer has in the network). The cardinality in the overlay and underlay is assumed to be more than 3. Initially, at the cardinality value of 3, the path length is high; it starts reducing with increase in the cardinality and starts stabilizing after the value of 13.
{Figure 6.5: Average Path Length for Maximum Reachability vs. Underlay Cardinality.}
From Figure 6.6 it is observed that the average path cost in a P2P system also depends upon the cardinality of the participating peers in the overlay topology. A high path cost is observed in overlay topologies with low cardinality values, and it reduces with increase in the cardinality of a peer. The path cost in the network starts stabilizing after a cardinality value of 6, because the number of connections then becomes sufficient to contact any peer at approximately the same path cost.
Figure 6.7 shows that the path cost is initially approximately the same and starts decreasing with increase in underlay cardinality. In all three cases CJM provides the minimum path cost as compared to the conventional path and the path suggested by the THANCS algorithm. The path cost starts stabilizing after an underlay cardinality of 13. The removal of redundant segments from the conventional paths reduces the path cost.
{Figure 6.6: Average Path Cost vs. Overlay Cardinality for Normal Path, THANCS and CJM.}
{Figure 6.7: Average Path Cost vs. Underlay Cardinality for Normal Path, THANCS and CJM.}
It is observed from Figure 6.8 that the average response time of a path in the overlay increases steadily with increase in overlay hop count in the case of the conventional path. The response times for THANCS and CJM are much lower than for conventional paths, with CJM providing the minimum response time among the three, because it uses the common junction between two paths to route the messages. The reduction in average response time grows with the hop count of the overlay path, because for long paths the possibility of finding a common junction is higher.
From the results shown in Figure 6.9 it is observed that for short paths the average percentage reduction in path cost for CJM is lower than for THANCS. But once the path length increases beyond a 3 hop count, the average reduction in path cost for CJM increases sharply and exceeds that of THANCS. The maximum average reduction in path cost observed is 61% for CJM and 46% for THANCS. The reason for this reduction is that the actual path travelled by the messages (through CJM) shrinks in the network.
{Figure 6.8: Average Response Time vs. Overlay Hop Count for Normal Path, THANCS and CJM.}
{Figure 6.9: Average percentage reduction in Path Cost vs. Overlay Path Hop Count for THANCS and CJM.}
{Figure 6.10: Average percentage Reduction in Response Time vs. Overlay Hop Count for THANCS and CJM.}
From Figure 6.10 it is observed that for initial values of the overlay path hop count, the RTR percentage for CJM is lower than for THANCS. But beyond a hop count of 3 the RTR percentage of CJM becomes higher than that of THANCS and remains higher for larger hop counts. CJM provides up to approximately 45% RTR, giving comparatively fast data transfer in P2P networks.
6.7 Discussion
Simulation results show that the average saving in path cost increases with the hop count of the path. The maximum path cost saved is up to 61%, for a hop count of 11 in the best case. The average saving in path cost initially increases sharply and, beyond a limit, levels off, after which the additional saving is small. It is also observed that the proposed technique CJM gives good results up to a hop count of 8. CJM also significantly reduces the response time of the network: approximately 45% of the network's response time is saved on average using CJM. The results also show that a significant amount of response time is saved up to 9 hops; after that the reduction in response time is minor. CJM reduces the physical path without altering the overlay connections or the search scope of the system. It is useful for any overlay topology in the system. CJM provides better results than THANCS for the majority of the performance metrics considered.
6.8 Summary
In this chapter, we have proposed the Common Junction Methodology (CJM). This technique yields substantial savings in path cost and reductions in the response time of the network, and it solves the topology mismatch problem to a large extent. CJM works with any overlay topology, centralized or decentralized, and can be implemented without changing the topology. Other salient features of CJM are its fast convergence and an unaltered search scope in the network.
In the next chapter an efficient replica placement algorithm, LARPA, is discussed.
Chapter 7
7.1 Introduction
Peer-to-Peer (P2P) networks are low-maintenance, massively distributed computing systems in which peers (nodes) communicate directly with one another to distribute tasks, exchange information, or share resources. P2P networks are also known for their huge amount of network traffic due to the topology mismatch problem: a large portion of the heavy P2P traffic arises from the mismatch between the overlay topology and the underlay topology. A number of P2P systems currently in operation, viz., Gnutella [67], construct unstructured overlays without rigid constraints on the search and placement of data items. However, there is no guarantee of finding an existing data object within a bounded number of hops.
P2P systems are rich in freely available computing power and storage space. A Real Time Distributed Database System (RTDDBS) is one of the applications suitable for such resources. But there are various issues to be handled before implementation, viz., the time constraints on executing transactions in such a system. Depending upon the type of application, real time transactions can be categorized into three types: hard, soft and firm deadline transactions. In the case of firm deadline transactions, any transaction that misses its deadline is considered worthless and is thrown out of the system immediately.
In replication, data items are replicated over a number of the peers participating in the system. Replication is a lifeline for environments where nodes are prone to leave the system and data availability is the primary challenge. The data replication technique is used to provide fault tolerance; it improves the performance and reliability of distributed systems, and also reduces the response time and increases the data availability of conventional distributed systems.
Replica logical structures further improve the performance of the system by reducing the time of quorum formation. The quorums are decided from the structure in such a way that data consistency and data availability are preserved. A special replica overlay structure is used to place replicas. Data availability is also a primary objective of P2P networks.
Normally, the number of replicas is increased blindly to improve data availability. Due to the large number of replicas, heavy redundant traffic is generated during the system maintenance and updating phases. Maintaining data consistency is also a challenge in a quorum system: increasing the number of replicas makes data consistency harder to maintain, as it takes more time to update all the replicas present in the system. The network overhead of the system increases exponentially with the number of replicas. This problem has a major impact in P2P networks, where the network overhead is already very large due to the topology mismatch problem, because messages are transferred through a number of peers present in the underlay topology that are transparent in the overlay topology.
To implement a RTDDBS over P2P networks, the data distribution must be efficient enough to match the deadline requirements of transactions. To improve data availability and provide fast data access, replicas are normally placed in an efficient overlay structure, and the replica overlay topology requires suitable modification to reduce the network traffic and address other challenges in P2P networks. We have considered a few of the above challenges and developed LARPA for P2P networks. It is adaptive in nature and tolerates up to n - 1 faults. LARPA efficiently stores replicas on sites at one hop distance to improve data availability in a RTDDBS over a P2P system. A comparative study is also made with some existing systems.
The P2P network is modelled as a graph G = (P, E), where P and E are the set of peers participating in the network and the set of edges connecting the participating peers, respectively. Two peers p1 and p2 in a graph are said to be connected if there exists a series of consecutive edges {e1, e2, e3, ..., ep} such that e1 is incident upon the vertex p1 and ep is incident upon the vertex p2. An edge (p1, p2) in E means that p1 knows a direct way to send a message to p2. Henceforth, we use the terms graph and network interchangeably; similarly, the terms peer and vertex are used equivalently, and so are the terms edge and connection. The series of edges leading from p1 to p2 is called a path from vertex p1 to p2, represented as p1 ⇝ p2 if they are at more than one hop distance and as p1 → p2 in the case of one hop distance. The length of a path, Hopl(1,2), is the number of edges in the path from peer p1 to p2. The distance is the measure of the total cost, including all types of cost, to send unit data from source p1 to destination p2, and is defined as the shortest distance Dist(p1, p2) calculated in the underlay topology. Replicas of the database are stored on the peers selected through some criterion among the peers in P. This set of replicas is defined by PR = {pr1, pr2, pr3, ..., pri}, where pri ∈ P and PR ⊆ P. The replicas form a replica overlay topology; hence the replica overlay topology can also be defined by a graph G1, where G1 ⊆ G, with vertex set PR and edge set ER ⊆ {E ∪ newly established overlay links}. The one hop neighborhood of a peer p1 is N¹_p1 = {p2 | (p1, p2) ∈ E}, and the two hop neighborhood is N²_p1 = {p3 | (p3, p2) ∈ E, p3 ∈ P, p2 ∈ N¹_p1} ∪ N¹_p1 ∪ {p1}.
Data items D are defined by the set of tuples <Vri, Dci>, where Vri is the version number of the data item; the highest version number implies the latest value of the data item. For every committed write query, the version number is incremented by one at the particular replica. Dci is the value of the data contents stored at a replica.
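The tuple <Vri, Dci> maps directly onto a small record with a version-based freshness test; a minimal sketch (names illustrative):

    #include <string>

    // Data item as the tuple <Vr, Dc>: version number and data contents.
    struct DataItem {
        long        Vr = 0;     // version; the highest Vr is the latest value
        std::string Dc;         // data contents held at this replica
    };

    // A committed write bumps the version at the replica that applies it.
    void commitWrite(DataItem& d, const std::string& value) {
        d.Dc = value;
        ++d.Vr;
    }

    // Freshness comparison used when reconciling replicas: prefer higher Vr.
    const DataItem& fresher(const DataItem& a, const DataItem& b) {
        return (a.Vr >= b.Vr) ? a : b;
    }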
For any read/write quorum, a group of replicas is selected from the logical structure to execute read/write operations in the system. The time spent in generating the quorums also affects system performance. LARPA selects a limited number of peers for placing replicas and forms a logical structure on the basis of the resource availability of the peers. One best peer among the identified peers is selected as the centre peer; this is the point from where a query enters the system for execution and from where the quorums for it are selected. All remaining special peers establish direct overlay connections with the centre peer. In LARPA, these one hop overlay connections improve the search time for replicas in the replica overlay and thereby reduce the time to select a quorum. These peers may also establish extra connections with the peers at one hop distance from the centre.
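One plausible reading of this construction is sketched below under stated assumptions: the replica peers are taken as given (already chosen by candidature), the centre is taken here as the peer minimizing the total underlay distance to the others (the thesis says only that the "best" peer is chosen), and every other replica is linked to the centre at one hop. dist[i][j] is an assumed precomputed underlay distance matrix.

    #include <cstddef>
    #include <utility>
    #include <vector>

    struct Larpa {
        std::size_t centre = 0;
        std::vector<std::pair<std::size_t, std::size_t>> links;  // (replica, centre)
    };

    // Build the LARPA structure over the selected replica peers.
    Larpa buildStructure(const std::vector<std::vector<double>>& dist) {
        Larpa s;
        double best = 1e300;
        for (std::size_t i = 0; i < dist.size(); ++i) {
            double total = 0;                       // total distance from i to all others
            for (std::size_t j = 0; j < dist.size(); ++j) total += dist[i][j];
            if (total < best) { best = total; s.centre = i; }
        }
        for (std::size_t j = 0; j < dist.size(); ++j)
            if (j != s.centre) s.links.push_back({j, s.centre});  // one-hop connections
        return s;
    }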
For the data to be available, at least one replica peer in the system must be active; with n replicas this probability is defined by P = 1 - ∏_{i=1}^{n} (1 - Pi), where Pi is the availability of peer i. Table 7.1 shows the effect of peer availability over data availability in the system.

Table 7.1 Effect of Peer Availability over Data Availability in the System

Replicas    Peer Availability
            0.3        0.4        0.5        0.6        0.7        0.8        0.9        1
2           0.51       0.64       0.75       0.84       0.91       0.96       0.99       1
3           0.657      0.784      0.875      0.936      0.973      0.992      0.999      1
4           0.7599     0.8704     0.9375     0.9744     0.9919     0.9984     0.9999     1
5           0.83193    0.92224    0.96875    0.98976    0.99757    0.99968    0.99999    1
6           0.882351   0.953344   0.984375   0.995904   0.999271   0.999936   0.999999   1
7           0.917646   0.972006   0.992188   0.998362   0.999781   0.999987   1          1
8           0.942352   0.983204   0.996094   0.999345   0.999934   0.999997   1          1
9           0.959646   0.989922   0.998047   0.999738   0.99998    0.999999   1          1
This data availability goes on increasing with increase in peer availability. From these facts it is concluded that one should avoid an unnecessarily large number of replicas in the system and limit the number of replicas to about 7. Hence, if peers having more than 40% availability are selected for storing replicas, the data availability will be in the acceptable range. To guarantee the data availability, the peers having more than 0.5 availability are considered for storing the replicas.
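The entries of Table 7.1 follow directly from P = 1 - ∏(1 - Pi); for n replicas of identical availability p this is 1 - (1 - p)^n, which the short program below reproduces (e.g., n = 7, p = 0.3 gives 0.917646, matching the table).

    #include <cmath>
    #include <cstdio>

    // Data availability with n independent replicas of availability p:
    // P = 1 - (1 - p)^n, i.e., the probability that at least one is active.
    double dataAvailability(double p, int n) {
        return 1.0 - std::pow(1.0 - p, n);
    }

    int main() {
        // Reproduces the first column of Table 7.1 (p = 0.3).
        for (int n = 2; n <= 9; ++n)
            std::printf("n=%d  p=0.3 -> %.6f\n", n, dataAvailability(0.3, n));
        return 0;
    }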
{Figure 7.1: Peers Selection and Logical Connection for the LARPA Structure.}
In Figure 7.1, the circles filled with gray represent the peers selected for storing replicas among all the peers; peer p5, filled with dark gray, is selected as the centre. The bold dark connectors represent existing overlay connections among replicas, while the bold gray (curved) connectors represent the new connections established for LARPA. All other dashed connectors represent other overlay connections, which can be utilized in case of any path failure in LARPA. The LARPA logical structure presented in Figure 7.2 is obtained from the existing and newly generated connections of the network shown in Figure 7.1.
{Figure 7.2: The LARPA logical structure obtained from the network shown in Figure 7.1.}
7.4 Implementation
The arrivals of transactions at a site (peer) are independent of the arrivals at other sites. The model assumes that each global transaction is assigned a unique identifier. Each global transaction is decomposed into subtransactions to be executed by remote peers.
{Figure 7.3: LARPA structure when the replica p14 departs the network.}
{Figure 7.4: LARPA structure when the centre replica p5 departs the network.}
Replica Leave: A replica in LARPA can leave with or without informing the system; it simply stops working and stops forwarding data/control messages. A ping message is regularly sent to the centre to confirm the active status of each replica; this indicates whether a replica is active, working properly and maintaining an updated copy of the data. A replica leaving the system does not affect the functioning of the system as long as a single replica remains active {Figure 7.3}.
Centre Leave: If the centre fails, as shown in Figure 7.4, the next replica automatically takes charge as centre and manages the system from its present location in the structure.
Replica Joins: A rejoining replica first tries to connect with its old neighbors. The active central replica may also be utilized for updating its data items. The replica announces its active status, through control message passing, after successfully updating its data items.
Centre Joins: When the centre replica wants to rejoin the system, it first tries to connect with its old connections (stored in its memory). After connecting to the replicas in the system, the centre updates its data contents by matching the version numbers and their contents. The centre replica receives data update acknowledgements from all available replicas participating in the system, and announces its active status, through a control message, after successfully updating its data items.
Network Traffic: In the case of LARPA, the network traffic is much lower than that of the Random, Hierarchical Quorum Consensus and Extended Hierarchical Quorum Consensus structures [145, 147]. This reduced traffic is due to LARPA's logical structure and placement of replicas: network traffic due to message passing is limited to the replica overlay, which in turn reduces the traffic in the underlay topology and on the Internet.
Fault Tolerance: This is a high priority requirement for the system, especially in the case of P2P systems. LARPA works with only its last active replica available in the system; it tolerates ns - 1 faults (where ns is the number of replicas in the system), as a single replica provides the complete information to the system.
Quorum Search Time: the time duration between the submission of a request for a quorum and the identification of the replicas required for that quorum in the system.
{Figure 7.5: Relationship between the session time of a peer and its availability in P2P networks.}
Figure 7.6 compares the response times of the Random [153], HQC [145] and HQC+ [147] replication systems with LARPA. LARPA minimizes the response time; this shorter response time helps the fast execution of transactions in the system and reduces its workload. The lowest response time is due to the placement of only the minimum required replicas in the LARPA-based system, which can consequently bear more workload than the other considered systems.
{Figure 7.6: Response Time vs. Quorum Size for Random, HQC, HQC+ and LARPA.}
It is observed from Figure 7.7 that the LARPA structure has the best restart ratio among all the considered structures. Each logical structure has its highest restart ratio at some value of the Mean Transaction Arrival Rate (MTAR): MTAR = 0.8 for Random, 1.2 for HQC, 1.2 for HQC+ and 1.4 for LARPA. This indicates that each system starts exhausting near these values of MTAR. After this peak, the restart ratio decreases with increase in MTAR, because the time left to a transaction is too little for a restart; the time is consumed in communication delays and other factors affecting the performance. The restart ratio for LARPA is the minimum, and LARPA can bear a transaction load up to approximately MTAR = 1.4.
Figure 7.8 presents the relationship between the success ratio and MTAR in the system. The success ratio is the number of transactions completed successfully within their deadline over the number of transactions submitted for execution. At MTAR = 0.2 the success ratio is the same for all considered systems; beyond it, the success ratio decreases with increase in MTAR, falling very sharply for the Random system and much more gently for the LARPA-based system. LARPA executes more transactions successfully as compared with the other systems.
{Figure 7.7: Transaction Restart Ratio vs. MTAR for LARPA, HQC, HQC+ and Random.}
{Figure 7.8: Success Ratio vs. MTAR for LARPA, HQC, HQC+ and Random.}
Figure 7.9 shows the throughput against MTAR: beyond the peak, the wait queue keeps growing and the throughput starts decreasing. The value of MTAR at the peak for LARPA is approximately 1.4, which is the highest among the selected logical structures.
{Figure 7.9: Throughput (tps) vs. MTAR for Random, HQC, HQC+ and LARPA.}
It is observed from Figure 7.10 that the smaller the search time for quorum formation, the faster the response; a small search time also reduces the time to execute the transactions. LARPA performs better than the other logical structures and takes the minimum search time to form a quorum.
{Figure 7.10: Relationship between average search time and quorum size for Random, HQC, HQC+ and LARPA.}
Figure 7.11 shows that the LARPA-based system requires the minimum average number of message transfers. This reduced average message transfer is due to the least number of replicas activated in the system. It is also observed that LARPA generates the minimum network load.
It is observed from Figure 7.12 that LARPA has a high probability of accessing updated data. This is because the time to update the system is minimal, due to the one hop distance and the reduced response time of the replicas.
{Figure 7.11: Average message transfers vs. Quorum Size for LARPA, HQC, HQC+ and Random.}
{Figure 7.12: Probability of accessing updated data vs. Peer Availability.}
Figure 7.13 presents the comparison of response time between LARPA1 and LARPA2; it is observed that LARPA2 provides a better response time than LARPA1, because in LARPA2 it takes less time to travel from the centre to a peer in the LARPA structure, as the peers are selected on the basis of minimum distance from the centre.
Figure 7.14 gives the comparison of network overhead between LARPA1 and LARPA2; it is observed that LARPA2 generates less network overhead than LARPA1.
{Figure 7.13: Response Time vs. Quorum Size for LARPA1 and LARPA2.}
{Figure 7.14: Message Overhead vs. Quorum Size for LARPA1 and LARPA2.}
7.6 Discussion
LARPA permits a replica to keep its old addresses and to rejoin at the same place from where it had left, which reduces the rejoin overhead of the replica. Peer resume time is minimal, amounting to little more than checking its old neighbors. Thus, the system reconciliation time is very small as compared to the other considered systems. LARPA exploits the advantages of overlay topology in establishing connections, and connections can be disconnected or established without affecting the peers in the underlay topology. The network overhead is minimized by limiting the number of replicas in the system. All replicas are placed at one hop distance from the centre peer, from where any search starts. Data availability of the system is maintained by placing the replicas on the peers with maximum candidature value. Fault detection is fast due to the one hop distance of all the replicas from the centre. LARPA is adaptive in nature and tolerates up to ns − 1 faults, where ns is the number of replicas in the structure. It allows the system to keep working as long as at least one replica remains active.
In the best case, a LARPA based system needs to access only one replica. It provides a high probability of accessing updated data from the system, and it also has the minimum quorum acquisition time and response time. LARPA provides the minimum transaction restart ratio, better throughput, and a better transaction success ratio. On the basis of the comparative analysis it is found that LARPA2 provides better response time and generates less network overhead. All these features recommend the LARPA2 structure for dynamic environment applications where high throughput is required.
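The selection difference between the two variants can be sketched as follows. This is an illustrative Python fragment, with the candidature and distance attributes and the trade-off weight alpha as assumed names and parameters rather than definitions from the thesis:

    def select_replicas_larpa1(peers, k):
        # LARPA1: pick the k peers with the highest candidature value only
        return sorted(peers, key=lambda p: p["candidature"], reverse=True)[:k]

    def select_replicas_larpa2(peers, k, alpha=0.5):
        # LARPA2: trade candidature against distance from the centre peer,
        # preferring nearby peers so centre-to-replica paths stay short
        score = lambda p: p["candidature"] - alpha * p["distance"]
        return sorted(peers, key=score, reverse=True)[:k]

Under this sketch, LARPA2 sacrifices some candidature value to keep replicas close to the centre, which is consistent with its observed lower response time and network overhead.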
7.7 Summary
In this chapter we have presented a Logical Adaptive Replica Placement Algorithm (LARPA). LARPA matches the requirements of RTDDBS over P2P networks, where fast response is expected from the system. It uses its own peer selection criterion to maintain the data availability of the system within an acceptable range, and it efficiently stores replicas on one hop distance peers to improve data availability for RTDDBS over P2P systems.
To avoid long waiting times, LARPA inherits the read/write quorum attributes of the ROWAA protocol. LARPA is adaptive in nature and tolerates up to ns − 1 faults. It shows the minimum response time, search time to generate read/write quorums, transaction restart ratio, and transaction miss ratio. It generates the lowest message traffic for updates in P2P networks, and a LARPA based system bears the maximum workload. It is further observed that algorithm LARPA2 performs slightly better than algorithm LARPA1 due to its shorter distance between centre and replicas. LARPA is suitable for implementing reliable RTDDBS over P2P networks.
Chapter 8
simulation and performance study. Section 8.6 discusses the findings. Finally, the chapter is summarized in Section 8.7.
8.1 Introduction
Data replication is one of the techniques used to enhance the performance of distributed databases. In replication, data is distributed over geographically separated systems, and each copy of a data item stored on a peer is called a replica. Multiple replicas are consulted to get fresh data items from the distributed system, which makes the system reliable and resilient to faults. Data replication is a fundamental requirement of distributed database systems deployed on networks that are dynamic in nature, viz., P2P networks. Peers can join or leave the network at any time, with or without prior information, and it is found in the literature that the churn rate of peers in P2P networks is high. For such a highly dynamic environment, the probability of accessing stale data is comparatively high as compared to a static environment. There are a number of challenges in implementing databases over dynamic systems like P2P networks. The major challenges are summarized as follows: data consistency, one copy serializability, fault tolerance, availability of data items, response time, churn rate of the peers, and network overhead.
A number of protocols and algorithms have been proposed in the literature to implement and maintain consistency in distributed databases; examples include single lock, distributed lock, primary copy, majority protocol, biased protocol, and quorum consensus. The availability of replicas in a dynamic P2P network is a major challenge because of the churn rate of peers, and data availability is also affected by peer availability in the system. To maintain data availability within an acceptable range, a quorum consensus protocol for accessing the replicas is quite a good option. A system with replicas stored in a logical structure also improves the quorum acquisition time.
If a quorum is formed such that it contains the maximum number of updated replicas, then the probability of accessing stale data is obviously reduced. To improve the probability of accessing updated data from the set of replicas, the degree of intersection between two consecutive quorums must be high. To improve the degree of intersection among consecutive read-write and write-write quorums, the logical structure needs to be accessed in a special way.
A logical structure of the replicas reduces the unnecessary network traffic caused by multicasting of search messages/queries for the existing replicas. The network traffic can be reduced by prioritizing the access of logical structures in P2P systems, and self organization of the logical structures may further improve the performance of the system. The network traffic can be reduced further by optimizing the underlay path.
In order to reduce search time, we may take advantage of the overlay topology in the P2P network. If a logical structure is organized in such a way that all updated replicas are popped up towards the root, then the search time reduces drastically and the probability of accessing an updated data item improves. To address the above challenges we have developed the Height Balanced Fault Adaptive Reshuffles (HBFAR) scheme for P2P systems. It is a self organized scheme that arranges replicas in a complete binary tree with some special attributes. It improves the probability of accessing updated data from the quorums and provides a high degree of intersection between two consecutive quorums.
Let P be the set of all peers participating in the system. The replica set R is the set of l peers holding replicas, where R is a subset of P, i.e., R ⊆ P. A replica may be active or inactive according to its availability in the system.
Ra is the set of all active replicas in the system. Ra = {Ra1, Ra2, Ra3, ..., Ras} is an ordered set of all active replicas arranged in the logical tree structure: Ra1 has a longer session time than Ra2, Ra2 has a longer session time than Ra3, and so on. Replicas in the logical structure are managed according to session time, and the replica with the longest session time is placed at the root. Rd is the set of all dead/inactive replicas at a particular time. Ra ∩ Rd = ∅, i.e., each replica is either active or dead depending upon the present state of the peer. The replica set R can also be defined in terms of Ra and Rd as R = Ra ∪ Rd.
A write quorum is an ordered set of replicas from Ra. Qw = {Qw1, Qw2, Qw3, ..., Qws} is the set of write quorums at various times. A write quorum starts from the 1st replica of Ra and includes replicas up to the quorum size decided by the administrator, say n.
Let the write quorum be Qwj = {Rai^ti | ti > t(n+1), i = 1...n}, where n is the size of the write quorum and ti is the session time of the replica, i.e., the period for which it has been active in the system. Here i is the position of the replica in the logical structure starting from the root: the root of the logical structure is at position 1, and its left and right children are at positions 2 and 3, respectively.
Read quorums Qr = {Qr1, Qr2, Qr3, ..., Qrs} form the set of read quorums at various times. A read quorum is defined as Qrj = {Rai^ti | ti > t(m+1), i = 1...m}, where m is the size of the read quorum, decided by the administrator. The read quorum must follow the conditions m ≤ n and ti ≥ t(i+1); tm and tn are the session times of the last replicas involved in the read and write quorums, respectively. The replicas having the longest session times are involved in the read and write quorums, and a read quorum is always a subset of the write quorum, i.e., for every Qwi ∈ Qw and Qri ∈ Qr, Qri ⊆ Qwi. Thus, Qri will always contain updated replicas, because all replicas with greater session time, including the root of the logical structure, are involved in the quorums. Only when all replicas go down, including the root, does the system fail; otherwise the read quorum always has updated information in its replicas.
The quorum size depends upon the availability of the peers in the system. The values of m and n may be increased in case of low availability of the peers holding replicas, and may be decreased in case of highly available peers.
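Since Ra is kept ordered by session time, quorum selection amounts to taking a prefix of that ordered set. The following Python sketch illustrates this under that assumption, with the adaptation thresholds chosen arbitrarily for illustration:

    def write_quorum(Ra, n):
        # first n replicas of the ordered set Ra, i.e., the longest sessions
        return Ra[:n]

    def read_quorum(Ra, m, n):
        assert m <= n, "read quorum size must not exceed write quorum size"
        # a prefix of the same ordered set, so Qr is always a subset of Qw
        return Ra[:m]

    def adapt_quorum_size(size, peer_availability, low=0.3, high=0.8):
        # grow the quorum when peers are flaky, shrink it when they are stable
        if peer_availability < low:
            return size + 1
        if peer_availability > high:
            return max(1, size - 1)
        return size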
[Figure: System architecture showing the Query Optimizer, Quorum Manager, and Update Manager over the P2P network]
Quorum Manager (QM): is responsible for deciding the quorum consensus to access the data items. The quorum is decided such that the system achieves acceptable replica availability, i.e., the number of replicas to be accessed is increased if the availability of the peers storing replicas is low, and may be reduced if peer availability is high, in order to reduce network overhead. QM is responsible for maintaining the availability of the replicas at the desired level. It identifies the replicas to be accessed from the logical structure and implements the read/write quorum algorithm.
Replica Search Manager (RSM): is responsible for searching any replica from the group of replicas. These replicas are arranged into a logical structure maintained by ROM. RSM also facilitates the searching of read/write quorums. It uses algorithms for searching replicas, e.g., HQC, HQC+ and HBFASR.
Replica Organization Manager (ROM): maintains the logical structure in which the replicas are placed; a well organized structure may reduce the search time of replicas. ROM identifies the replicas for making the quorum and maintains the logical structure from time to time. Every time a replica leaves the network, ROM readjusts the replicas by rearranging the addresses of these replicas in the logical structure.
Update Manager (UM): implements all the update strategies. The eager methodology is used to update the prioritized replicas selected for the write quorum, while the lazy methodology is used for the remaining replicas in the system; this update is performed through ROM. UM maintains the freshness of the data items available with the replicas and implements the eager/lazy update algorithm. The algorithm is selected such that it updates information within minimum time, because update time plays a key role in system performance: the probability of accessing stale data from the system is minimized by minimizing the update time.
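The eager/lazy split that UM implements can be sketched as below. The Replica class and its methods are hypothetical stand-ins for the real update machinery, not an API from the thesis:

    class Replica:
        def __init__(self, name):
            self.name, self.version, self.lazy_queue = name, 0, []

        def apply(self, update):
            self.version = update           # eager, synchronous update
            return True                     # acknowledge

        def enqueue_lazy(self, update):
            self.lazy_queue.append(update)  # applied later in the background

    def propagate_update(update, write_quorum, all_replicas):
        # eager (write-through): the write commits only after all quorum acks
        acks = sum(1 for r in write_quorum if r.apply(update))
        if acks < len(write_quorum):
            return False                    # write-through did not complete
        for r in all_replicas:
            if r not in write_quorum:
                r.enqueue_lazy(update)      # lazy (write-back) for the rest
        return True                         # write commits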
In the HBFAR scheme, replicas are arranged as parent and child according to the session time of each replica in the system. The replicas having higher session times are placed towards the root of the logical structure. These top replicas participate in every read/write quorum, and their regular participation maintains updated copies of the data items, which increases the probability of accessing an updated copy of the data. The HBFAR scheme has some special properties, viz., replicas are arranged in the tree such that the session time of each parent replica is greater than that of its children, and the session time of the left child is greater than that of the right child of any parent. All peers participating in the system find alternate paths to all their ancestors, i.e., all parent replicas along the path from leaf to root.
While including replicas in a quorum, those with the highest session times are given priority over those with smaller session times. The quorum is formed by first taking the replica at the root of the tree and then the replicas at the branches, i.e., from top to bottom. The branches of a parent node are accessed such that the left child is accessed before the right child, i.e., from left to right. This pattern of accessing replicas from the tree is referred to as Top-to-Bottom and Left-to-Right. The HBFAR scheme permits each read quorum to get an updated copy of the data items.
Accessing replicas from the same positions in the logical structure every time increases the degree of intersection among consecutive quorums, and the common replicas increase the probability of accessing updated data items from the system. Since the replicas are accessed from the upper part of the tree in Top-to-Bottom and Left-to-Right fashion, the replica search time is minimized. All read-write and write-write quorums intersect each other; hence, every read quorum accesses an updated copy of the data item. The maximized intersection of two consecutive read-write and write-write quorums ensures access to fresh data items.
The problem of finding a path between a pair of source and destination peers in the overlay is the problem of finding a route between the source and destination peers in the underlay topology. As shown in Figure 8.2, the path between peers P1 and P2 is direct, with a one hop count distance, while the path between peers P1 and P4 has a two hop count distance. The route identified between source and destination in the underlay may or may not be the shortest path; however, a shortest path in the underlay is advantageous for reducing communication cost.
The working of the HBFAR scheme is divided into two independent parts: accessing the group of replicas and maintaining the logical structure. Both parts are executed in parallel to improve the efficiency of the system. The replicas from the root towards the terminal nodes are included in quorums to maximize the degree of intersection. The level up to which replicas are included depends upon the size of the quorum; e.g., let replicas 1, 2, 3, 4, 5, 6, 7, 8 generate the HBFAR tree shown in Figure 8.3. Replicas 1, 2, 3 are used for a quorum of size three, and replicas 1, 2, 3, 4 are used to form a quorum of size four. The replicas with higher session times are given priority in forming the quorums, as shown in Figure 8.3.
Figure 8.2 The arrangement of peers to make a Height Balanced Fault Adaptive Reshuffle tree over the peers from the underlay topology of P2P networks. The dotted connectors show connections between peers in the overlay topology; the dark connectors show connections between peers in the replica tree topology. P14 is shown as an isolated peer in the network.
[Figure 8.3 shows the HBFAR tree (P1 at the root; P2 and P3 its children; P4 and P5 under P2; P6 and P7 under P3; P8 under P4) together with the parent addresses stored by each peer:]

    Peer Id   1 Hop   2 Hop   3 Hop   4 Hop
    P1        X       X       X       X
    P2        P1      X       X       X
    P3        P1      X       X       X
    P4        P2      P1      X       X
    P5        P2      P1      X       X
    P6        P3      P1      X       X
    P7        P3      P1      X       X
    P8        P4      P2      P1      X

Figure 8.3 Replica arrangements in the HBFAR scheme generated from Figure 8.2. The session time of P1 is greater than those of P2 and P3. The order of the replicas according to session time in the HBFAR scheme is P1, P2, P3, P4, P5, P6, P7, and P8.
The performance of the system remains approximately the same even under a high churn rate of peers. With maximally overlapped replicas in consecutive read and write quorums, this scheme ensures access to fresh data items from any number of replicas in the system. The HBFAR scheme also provides high fault tolerance: with self organization, the scheme tolerates up to n − 1 faults among n replicas in the system. Multicasting and directional forwarding are used to transfer messages in the system.
The HBFAR scheme triggers a maintenance procedure for every replica that leaves. The other replicas in the system are adjusted according to their session times: the replica with the next higher session time takes the position of the replica that left the network. By default, the replicas with longer session times move upwards with the passage of time. The scheme works on four rule sets, which are defined in the following sections.
8.4.1 Rule Set-I: Rules for Building the Replica Logical Structure
Each peer stores the addresses of peers at adjacent levels in the HBFAR structure, i.e., each peer stores the addresses of its directly connected siblings and of its parent. Simultaneously, each peer stores the addresses of all its grandparent peers along the path from that peer to the root. All peers/replicas follow the rules (Rule Set-I) of the HBFAR structure to make this overlay logical structure, or replica overlay. The session time of each peer/replica is used as the key in the replica overlay; using session time as the key results in the movement of replicas with longer session times towards the root. Each newly joined peer connects at a leaf position in the logical structure, decided according to the rules (Rule Set-I) of HBFAR. These peers also search for an alternate path to each parent up to the root. Each peer holding a replica transmits a beacon indicating its active status to all its directly connected peers. These addresses and beacons are used for re-establishing connections in case of any failure.
The addresses of the peers are used to access the peers in a particular sequence. HBFAR reduces the search time in building the quorums by using a minimum hop count; the resulting search time is even lower than that achieved by HQC and HQC+.
8.4.3 Rule Set-III: Rules for Replica Joining into the Replica Logical Structure
When any replica rejoins the system, the following rules (Rule Set-III) are applied to maintain the replica overlay topology:
(i) Initially, the rejoining peer searches for its position through ping-pong messages starting from the root; assume the position is at level k.
(ii) The replica establishes a connection with its parent peer, and both save each other's addresses.
(iii) The replica updates its data items by comparing them with those of its alive parent and updates the version numbers of the data items.
(iv) The replica stores the addresses of all its ancestors up to the root, through the path find message.
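A minimal runnable sketch of these rejoin rules, using the level-order array model of the tree; the slot search, version map, and ancestor walk below are stand-ins for the ping-pong, version-comparison, and path-find messages described above:

    def ancestors(tree, i):
        # addresses of all grandparents from position i up to the root
        path = []
        while i > 0:
            i = (i - 1) // 2
            path.append(tree[i])
        return path

    def rejoin(tree, replica, versions):
        # (i) locate a free leaf position (abstracting the ping-pong search)
        slot = tree.index(None) if None in tree else len(tree)
        if slot == len(tree):
            tree.append(None)
        # (ii) occupy the slot; parent and child now know each other's address
        tree[slot] = replica
        parent = tree[(slot - 1) // 2]
        # (iii) refresh stale data items from the alive parent's versions
        versions[replica] = versions[parent]
        # (iv) store the addresses of all ancestors up to the root
        return ancestors(tree, slot)

    tree = ["P1", "P4", "P3", "P8", "P5", "P6", "P7", None]
    print(rejoin(tree, "P2", {p: 7 for p in tree if p}))  # ['P8', 'P4', 'P1']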
[Figure 8.4 shows the same tree and parent-address table as Figure 8.3, with Peer 2 drawn in dotted lines.]
Figure 8.4 Replica arrangements in an HBFAR logical tree structure. Peer 2, shown by dotted lines, is a peer leaving the network.
[Figure 8.5 shows the readjusted tree (P1 at the root; P4 and P3 its children; P5 and P8 under P4; P6 and P7 under P3) together with the stored parent addresses:]

    Peer Id   1 Hop   2 Hop   3 Hop
    P1        X       X       X
    P3        P1      X       X
    P4        P1      X       X
    P5        P4      P1      X
    P6        P3      P1      X
    P7        P3      P1      X
    P8        P4      P1      X

Figure 8.5 The HBFAR logical tree structure after the departure of Peer 2. Peer 4 takes the position of Peer 2, which has left the network; all other replicas in the downlink are readjusted accordingly.
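The readjustment of Figures 8.4 and 8.5 can be sketched as a heap-style promotion on a level-order array, where index 0 is the root and the children of index i are at 2i+1 and 2i+2. This is a simplified illustration of the rule, not thesis code:

    def on_replica_leave(tree, i):
        # fill the departed replica's slot with its left child, which by
        # construction has the higher session time, and percolate down
        left = 2 * i + 1
        while left < len(tree):
            tree[i] = tree[left]
            i, left = left, 2 * left + 1
        tree[i] = None              # the vacated leaf slot becomes empty

    # Example: P2 (index 1) leaves the tree of Figure 8.4
    tree = ["P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"]
    on_replica_leave(tree, 1)
    print(tree)   # ['P1', 'P4', 'P3', 'P8', 'P5', 'P6', 'P7', None]

In this sketch P4 fills P2's slot and P8 fills P4's old slot, giving the parent relationships shown in Figure 8.5.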
8.4.4 Rule Set-IV: Rules for Acquisition of Read/Write Quorum from HBFAR
Logical Tree
The following rules (Rule Set-IV) are defined to access the HBFAR logical tree to
form the read/write quorums:
(i) Sizeof(Qri) ≤ Sizeof(Qwi), i.e., the size of a read quorum is always less than or equal to the size of the write quorum.
(ii) Quorum acquisition always starts from the root, i.e., the root is always included in the read/write quorum.
(iii) For any integer k, if a replica at level k is in the quorum, then every replica at level k − 1 must be in the quorum of the HBFAR logical tree. This rule is referred to as Top-to-Bottom.
(iv) If a replica from the right descendants of a parent replica is in the quorum, then the corresponding replica from the left descendants must also be in the quorum. This rule is referred to as Left-to-Right.
These rules are implemented in the HBFAR scheme by combining Top-to-Bottom and Left-to-Right access to replicas from the HBFAR logical structure, as the sketch below illustrates.
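With the tree stored in level order, combining Top-to-Bottom and Left-to-Right reduces to taking the first q live replicas of the array. A minimal Python sketch under that assumption:

    def acquire_quorum(tree, q):
        # level order already is top-to-bottom, left-to-right
        quorum = [r for r in tree if r is not None][:q]
        if len(quorum) < q:
            raise RuntimeError("not enough active replicas for a quorum")
        return quorum

    tree = ["P1", "P4", "P3", "P8", "P5", "P6", "P7"]   # ordering of Figure 8.5
    print(acquire_quorum(tree, 3))   # ['P1', 'P4', 'P3']
    print(acquire_quorum(tree, 4))   # ['P1', 'P4', 'P3', 'P8']

The two printed quorums match the size-three and size-four examples given below for the structure of Figure 8.5.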
Read/Write Quorum: The quorum size depends upon the overall availability of the replicas and the overhead of the network. The size of the quorums may be increased in case of low availability and reduced in case of high availability of the replicas. The replicas included in the quorums are selected according to Rule Set-IV. The quorum size also affects the network traffic: the number of messages transferred to maintain the HBFAR logical structure increases with quorum size, and therefore the network overhead of the system increases. The quorum size is directly proportional to the network overhead, i.e., there is a tradeoff between network overhead and the quorum size of the replicas.
The HBFAR scheme uses a fixed number of replicas in its quorums, chosen after considering all factors affecting the system for read/write quorums. The sizes of the read and write quorums may differ depending upon the requirements of a system. The replicas are included in sequence from the HBFAR logical structure, according to session time, to form the quorums. All replicas accessed in a read quorum are compared for the updated version of the data items; in the best case, only the root may need to be accessed for updated data.
Write quorums are decided in the same way as read quorums: the replicas are selected from the HBFAR logical structure from top to bottom and left to right. Whenever a write query is executed in the system, all the replicas in the quorum are updated by the write-through method, i.e., the write is committed after receiving acknowledgements from all available replicas in the quorum. The remaining replicas in the structure are updated with the write-back method. Most queries are answered by the topmost replicas of the HBFAR logical structure, which have the longest session times. The replicas not used in write quorums are updated through the lazy method, and the extra links reduce the time for update messages to reach all replicas in the system.
If the quorum size is equal to four, then the four replicas at the top, taken from root to branch and left to right, will be in the read/write quorum of the HBFAR logical structure. Considering the logical structure shown in Figure 8.5, peers P1 and P4 are used for a quorum of size two; peers P1, P4 and P3 are used in a quorum of size three; and peers P1, P4, P3 and P8 are used in a quorum of size four.
Basis:
(i) Assume the height of the HBFAR tree is 0, i.e., only one peer/replica is in the structure (placed at the root). According to Rule Set-IV (ii), both the read quorum and the write quorum must involve the root peer, so every read/write quorum includes the replica at the root and every access gets the updated data items from the root. These quorums are described as:
Qw0 = {P0}, Qr0 = {P0}, Qr0 ⊆ Qw0, Qw0 ∩ Qr0 ≠ ∅
The read quorum and write quorum intersect each other, i.e., the read quorum gets the updated data items. Thus the HBFAR scheme provides updated data for height 0. {Hence Proved}
(ii) Assume the height of the HBFAR logical structure is 1, i.e., the HBFAR logical tree has at most 2-3 peers: one replica at the root and 1-2 in the downlink of the root. According to Rule Set-IV, the size of the write quorum is greater than or equal to the size of the read quorum, and the replicas in the quorums are selected Top-to-Bottom and Left-to-Right. The possible write and read quorums are:
Qw1 = {{P0}, {P0, P1}, {P0, P1, P2}}        (8.1)
Qr1 = {{P0}, {P0, P1}, {P0, P1, P2}}        (8.2)
For every possible pair of read and write quorums, the quorums intersect each other; thus a read always gets the updated information. Using eqns (8.1) and (8.2) we conclude the following:
∀ Qr1, Qw1 : Qr1 ∩ Qw1 ≠ ∅
All possible read quorums always contain at least one updated replica. This implies that a read quorum always accesses the fresh data item, as the intersection is always non-empty. Thus, the HBFAR scheme provides updated data for height 1. {Hence Proved}
Hypothesis:
Assume an HBFAR logical tree of height i, and let Qwi and Qri be the write and read quorums of sizes l and k respectively, with k ≤ l, defined as:
Qwi = {P1, P2, ..., Pk, ..., Pl}        (8.3)
Qri = {P1, P2, ..., Pk}        (8.4)
Qwi ∩ Qri = Qri ≠ ∅        (8.5)
Therefore, each read quorum accesses the updated replicas, as the intersection of the write and read quorums is not empty.
Inductive Step:
We have to prove that the property also holds for an HBFAR logical structure of height i + 1. According to Rule Set-IV, the write and read quorums of sizes n and m are defined as:
Qwi+1 = {P1, P2, P3, ..., Pk, ..., Pl, ..., Pn}        (8.6)
Qri+1 = {P1, P2, ..., Pk, ..., Pm}        (8.7)
Here the size selected for the write quorum is n and that for the read quorum is m, where l ≤ n and k ≤ m, and the quorums are generated through Rule Set-IV. From eqns (8.6) and (8.7):
Qwi ⊆ Qwi+1, Qri ⊆ Qri+1        (8.8)
Combining (8.8) with eqn (8.5), the intersection of the read and write quorums remains non-empty. This proves that every read quorum intersects the write quorum in the HBFAR scheme; therefore, every read quorum carries the updated information. {Hence Proved}
Adaptive & Fault Tolerance: The HBFAR scheme easily adapts to peer faults. It works in both cases, a peer leaving or joining the system, and tolerates up to n − 1 faults among the n replicas participating in the system.
Availability: It is the probability that at least one replica is available in the system and is given as:
Availability = 1 − ∏(i=1..n) (1 − Pri)        (8.9)
where Pri is the probability of the i-th replica staying alive and n is the number of replicas.
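For example, with three replicas each alive with probability 0.6, eqn (8.9) gives 1 − 0.4³ = 0.936. A one-function Python sketch:

    def availability(probs):
        # probability that at least one replica is alive: 1 - prod(1 - Pr_i)
        result = 1.0
        for p in probs:
            result *= (1.0 - p)
        return 1.0 - result

    print(availability([0.6, 0.6, 0.6]))   # ~0.936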
The logical structure maintains the connections between the peers. A replica stores the addresses of the other connected replicas, which helps in searching for replicas while forming the read/write quorums. A peer may hold stale information when it rejoins the system.
Availability: The availability of a peer is calculated as the total active time of the peer over its total time, including active and down time. This is a measure of the participation of a peer in the system: the longer the participation time, the more the peer contributes.
Percentage of Stale Data: This is calculated as the number of accessed replicas holding stale data over the total accesses in the quorum. Any system should keep stale data access minimal; a system with a lower value of stale data access is considered better.
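Both metrics are simple ratios; for concreteness (the argument names below are assumptions, not thesis identifiers):

    def peer_availability(active_time, down_time):
        # total active time over total time (active plus down)
        return active_time / (active_time + down_time)

    def percent_stale(stale_accesses, total_accesses):
        # stale replica accesses over total quorum accesses, as a percentage
        return 100.0 * stale_accesses / total_accesses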
[Figure 8.6 Percentage reachability of up peers and reachable peers versus availability]
Figure 8.7 shows that the probability of accessing stale data decreases with an increase in the availability of the peers. In the HBFAR scheme, the probability of accessing stale data is very low compared to Random and HQC. Accesses by all subqueries are counted when measuring stale data access. It is also observed that the probability of accessing stale data stays within an acceptable range when the replicas have availability greater than 0.7; peers with availability above 0.7 may therefore be given priority for storing replicas, so that the performance of the system is improved.
[Figure 8.7 Probability of accessing stale data versus availability for Random, HQC, and HBFAR]
Figure 8.8 Comparison of average search time (ms) to form the quorum versus quorum size for HBFAR, HQC, and Random
Figure 8.8 shows that the quorum acquisition time increases with quorum size. The average search time for Random Quorum Consensus is considerably higher than for the HBFAR scheme: HBFAR takes less time to search for a peer because the peers are located at known positions in the structure, whereas Random Quorum Consensus finds peers through flooding and therefore takes more time than the structured schemes.
[Figure 8.9 Comparison of average response time versus quorum size for Random, HQC, and HBFAR]
It is observed from Figure 8.9 that the average response time increases with quorum size and becomes approximately constant in all cases after a quorum size of 12. The response time of the HBFAR scheme is the lowest among all the considered schemes, and its quorum acquisition time is much lower than that of Random and HQC.
[Figure 8.10 Average message transfer versus quorum size for Random and Hierarchical quorum consensus]
Figure 8.10 shows that the network overhead for Random Quorum Consensus increases rapidly as the quorum size increases. The network overhead of hierarchical quorum consensus is small compared to random quorum consensus because the hierarchical scheme uses the binary logical structure.
8.6 Discussion
The simulation results show that the average message transfer in the P2P network is minimized through directional search as compared with random search. The message transfer time in the hierarchical topology is also less than in the random topology. The quorum acquisition time is a major factor in performance: a system that takes less time to search for replicas performs better. The HBFAR scheme takes less search time than Random and HQC because the search locations are fixed in the logical structure, whereas in Random and HQC the replicas are searched randomly, which lengthens quorum formation and affects the performance of the system.
The HBFAR scheme performs better than HQC in terms of search time to form quorums, response time, and the probability of accessing updated data items in the dynamic environment of the network. It provides better data availability in the system, maximizes the degree of intersection among consecutive read-write and write-write quorums, and thereby provides a better probability of accessing updated data items. The HBFAR scheme easily adapts to any peer leaving or joining the system; system performance does not seriously degrade with an increase in the churn rate of the peers. It also keeps working in case of faults and may tolerate up to n − 1 faults.
8.7 Summary
In this chapter we have presented the HBFAR logical structure for overlay networks. The HBFAR logical structure is organized in such a way that all updated replicas are popped towards the root and only updated replicas participate in any quorum formation. The replicas having long session times are on the root side, while replicas with lower session times are arranged towards the branches. The structure adjusts itself whenever a replica leaves, so the replicas that have spent the longest session times are always at the top of the tree. This reduces the time spent in making a quorum of replicas and improves the response time of the system.
In the next chapter, the work is concluded with recommendations for future scope.
Chapter 9
9.1 Contributions
The contributions of this dissertation are as follows.
1. We have designed the Statistics Manager and Action Planner (SMAP) system for P2P networks, along with various algorithms to enhance the performance of its modules. SMAP enables fast and cost-efficient deployment of information over the P2P network. It is a self-managed P2P system capable of dealing with a high churn rate of peers in the network. SMAP is fault adaptive and provides load balancing among the participating peers. It permits a true distributed computing environment, in which every peer node can use the resources of all other peers participating in the network, and it provides data availability by managing replicas in an efficient logical structure. To improve throughput, the execution process is divided by the system into three independent sub-processes that can execute in parallel. SMAP provides fast response times for transactions with time constraints, reduces redundant traffic in P2P networks by shortening the conventional overlay path, and addresses most of the issues related to RTDDBS implemented over P2P networks.
2. We have proposed the Logical Adaptive Replica Placement Algorithm (LARPA), in which the peers holding replicas are identified through peer selection criteria. All peers are placed at one hop distance from the centre of LARPA, the place from where any search starts. Depending upon the selection of peers for the logical structure, LARPA is classified as LARPA1 and LARPA2. LARPA1 uses only the peers with the highest candidature values, calculated through the peer selection criteria; in LARPA2 the candidature value is traded off against the distance of the peers from the centre. LARPA improves the response time of the system, the throughput, the data availability, and the degree of intersection between two consecutive quorums. It also provides a high probability of accessing updated data items from the system and a short quorum acquisition time. The reconciliation of LARPA is fast because the system updates itself at a fast rate, and LARPA reduces the network traffic in the P2P network due to its one hop distance logical structure with a minimum number of replicas.
1. For future research, we can extend this work for secure dissemination of information by integrating a security framework, in terms of trust establishment and trust management, in P2P networks. The system can also be developed further for exploring and solving security issues on open networks.
2. To address unique security concerns, it would be imperative to study adjacent technological advances in distributed systems, ubiquitous computing, broadband wireless communication, nanofabrication and bio-systems.
3. Despite good research in this socially popular and emerging field of P2P networks and systems, there is still a lot of scope for research.
4. We identified that most P2P systems are popular for static data, i.e., data that does not change while it is shared over the network. Little work has been done in the direction of sharing dynamic data, i.e., data that changes while it is shared, among P2P systems. We have developed SMAP and tested it through simulation; in the future it will be ported to real networks.
5. Reliability is another issue that needs more attention from the research community, along with secure concurrency control, secure fault tolerance and secure load balancing.
[Table: Comparison of SMAP with existing P2P middleware (CAN, Tapestry, Chord, Pastry, Napster, Gnutella, Freenet, APPA, Piazza, PIER, PeerDB, NADSE) on load balancing, fault tolerance at the communication link and host levels, replication, reliability, resource sharing, secure communication, scalability, performance under load, distributed file management, data partitioning, traffic optimization, concurrency control, parallel execution, schema management, and degree of decentralization (structured, unstructured, or hybrid)]
List of Publications
International Journals