Advanced DBMS Concepts, presented at Kantipur City College, Kathmandu

Advanced DBMS Concepts
Raj Kishore
-----------------------
D2Hawkeye Services Pvt. Ltd.
ISO 9001:2000 Certified
Distributed Databases
• Data is stored at several locations (sites)
• Each site is managed by a DBMS that can run autonomously
• Ideally, the location of data is transparent to the client
• Clients can write transactions regardless of where the affected data is located
Types of Distributed Database
• Homogeneous: every site runs the same type of DBMS (e.g. all sites run Oracle)
• Heterogeneous: different sites run different DBMSs (e.g. Oracle, Microsoft SQL Server, DB2)
Distributed Databases
Distributed DBMS Architectures
• Client-Server:
o Client sends the query to each database server in the distributed system
o Client caches and accumulates the responses
• Collaborating Server:
o Client sends the query to the “nearest” server
o That server forwards the query to other servers, as required
o That server sends the response back to the client
Storing the Distributed Data
• In fragments at each site
o Split the data up
o Each site stores one or more fragments
• In complete replicas at each site
o Each site stores a replica of the complete data
• A mixture of fragments and replicas
o Each site stores some replicas and/or fragments of the data
Advantages of Distributed DBMS
• Fragmentation (subsets of the data)
o Exploits data access locality
o Puts data near its consumers
o Less network traffic
o Better response time
o Better availability
o Spreads the load
• Replicated data (complete copies)
o Improves availability
o Allows disconnected (mobile) operation
o Reads are cheaper
Fragmentation (Sub-Setting)
• Horizontal (“row-wise”)
o A subset of the rows of a table makes up one fragment
• Vertical (“column-wise”)
o A subset of the columns of a table makes up one fragment
• Selected tables reside in selected locations
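The two fragmentation styles above can be sketched in a few lines of Python. This is only an illustration: the employee table, its column names, and the helper functions are hypothetical and not part of the original slides. The sketch assumes that every vertical fragment keeps the key column so the table can be rebuilt by a join.

```python
# Hypothetical employee table; rows and column names are invented for illustration.
employees = [
    {"id": 1, "name": "Asha", "city": "Kathmandu", "salary": 50000},
    {"id": 2, "name": "Bikram", "city": "Pokhara", "salary": 42000},
    {"id": 3, "name": "Chandra", "city": "Kathmandu", "salary": 61000},
]

def horizontal_fragment(rows, predicate):
    """Row-wise: keep the whole rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]

def vertical_fragment(rows, columns, key="id"):
    """Column-wise: keep only some columns, plus the key (an assumption of this sketch)."""
    keep = [key] + [c for c in columns if c != key]
    return [{c: row[c] for c in keep} for row in rows]

def rejoin_vertical(left, right, key="id"):
    """Rebuild full rows by joining two vertical fragments on the key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]

# Horizontal fragments, e.g. one per site.
kathmandu_rows = horizontal_fragment(employees, lambda r: r["city"] == "Kathmandu")
pokhara_rows = horizontal_fragment(employees, lambda r: r["city"] == "Pokhara")

# Vertical fragments, e.g. directory data at one site and payroll data at another.
directory = vertical_fragment(employees, ["name", "city"])
payroll = vertical_fragment(employees, ["salary"])

# Both fragmentations are lossless: the original table can be reconstructed.
rebuilt_from_rows = sorted(kathmandu_rows + pokhara_rows, key=lambda r: r["id"])
rebuilt_from_columns = rejoin_vertical(directory, payroll)
```

Horizontal fragments are recombined with a union, vertical fragments with a join on the key.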
Replication
• Make synchronized or unsynchronized copies of data at different servers
o Synchronized: data is always current; updates are constantly shipped between replicas
o Unsynchronized: updates are queued up for later synchronization; good for read-only data
• Increases availability of data
• Makes query execution faster, since reads can be served by any replica
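The difference between synchronized and unsynchronized replication can be illustrated with a small in-memory sketch. The `ReplicatedStore` class and its methods are hypothetical names chosen for this example; it assumes that all writes arrive at replica 0 and ignores failures and conflicts.

```python
class ReplicatedStore:
    """Toy key-value store with several replicas (illustrative only)."""

    def __init__(self, replica_count, synchronized):
        self.replicas = [{} for _ in range(replica_count)]
        self.synchronized = synchronized
        self.pending = []  # updates queued for later shipping (unsynchronized mode)

    def write(self, key, value):
        # Assumption of this sketch: every write is applied at replica 0 first.
        self.replicas[0][key] = value
        if self.synchronized:
            # Synchronized: ship the update to every other replica immediately.
            for replica in self.replicas[1:]:
                replica[key] = value
        else:
            # Unsynchronized: remember the update and ship it later.
            self.pending.append((key, value))

    def synchronize(self):
        """Ship all queued updates to the other replicas, in write order."""
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica[key] = value
        self.pending.clear()

    def read(self, key, replica_index):
        """Read from one replica; returns None if that replica has no value yet."""
        return self.replicas[replica_index].get(key)


sync_store = ReplicatedStore(replica_count=3, synchronized=True)
sync_store.write("price", 100)

lazy_store = ReplicatedStore(replica_count=3, synchronized=False)
lazy_store.write("price", 100)
stale_read = lazy_store.read("price", 2)   # replica 2 has not seen the update yet
lazy_store.synchronize()
fresh_read = lazy_store.read("price", 2)   # now it has
```

The unsynchronized store answers reads from a remote replica with stale data until `synchronize()` runs, which is why the slide recommends it mainly for read-only data.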
Replication Catalogue
• Which objects are being replicated
• Where objects are being replicated to
• How updates are propagated
• The catalogue is a set of tables that can be backed up and recovered (like any other table)
• These tables are themselves replicated to each replication site
o No single point of failure in the distributed database
Distributed Transaction
• All changed data must be propagated before the transaction commits (distributed, replicated data)
• Before a transaction can commit, it obtains locks on all modified copies
• It sends lock requests to remote sites and holds the locks
• If links or remote sites fail, the transaction cannot commit until the links/sites are restored
• The commit protocol is complex and involves many back-and-forth messages
Distributed Locking
• How to manage locks across many sites?
o Centrally: one site does all locking
▪ Vulnerable to single-site failure
o Primary copy: all locking for an object is done at the primary-copy site for that object
▪ Reading requires access to the locking site as well as the site that stores the object
o Fully distributed: locking for a copy is done at the site where the copy is stored
▪ Requires locks at all sites while writing an object
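As a rough sketch of the three schemes, the function below returns the set of sites at which a lock must be obtained. The function name, the site labels, and two conventions are assumptions of this example rather than anything stated in the slides: the first entry of `copy_sites` is treated as the primary copy, and under fully distributed locking a read locks only the single copy it reads.

```python
def sites_to_lock(scheme, operation, copy_sites, central_site="S0", read_site=None):
    """Return the set of sites where a lock is requested (illustrative sketch).

    scheme      -- "central", "primary_copy" or "fully_distributed"
    operation   -- "read" or "write"
    copy_sites  -- sites holding a copy of the object; the first one is
                   treated as the primary copy (an assumption of this sketch)
    """
    if scheme == "central":
        # One site does all locking, whatever the object or operation.
        return {central_site}
    if scheme == "primary_copy":
        # All locking for the object happens at its primary-copy site.
        return {copy_sites[0]}
    if scheme == "fully_distributed":
        if operation == "write":
            # A write must lock every copy of the object.
            return set(copy_sites)
        # A read locks only the copy it actually reads.
        return {read_site or copy_sites[0]}
    raise ValueError(f"unknown locking scheme: {scheme}")


copies = ["A", "B", "C"]
central = sites_to_lock("central", "write", copies)
primary = sites_to_lock("primary_copy", "read", copies)
write_all = sites_to_lock("fully_distributed", "write", copies)
read_one = sites_to_lock("fully_distributed", "read", copies, read_site="B")
```

The result sets make the trade-off visible: central and primary-copy locking contact one site per lock, while fully distributed locking makes writes contact every site that stores a copy.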
Two-Phase Commit
• The site that originates the transaction is the coordinator; the other sites involved in the transaction are subordinates
• When the transaction needs to commit:
o Coordinator sends a “prepare” message to each subordinate
o Each subordinate force-writes an abort or prepare log record, and sends a “no” or “yes” message to the coordinator
o If the coordinator gets unanimous “yes” messages, it force-writes a commit log record and sends a “commit” message to the subordinates; otherwise it force-writes an abort log record and sends an “abort” message
o Subordinates force-write an abort/commit log record accordingly, then send an “ack” message to the coordinator
o Coordinator writes an end log record after receiving all acks
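The message flow above can be simulated in memory as follows. This is a simplified sketch, not a real implementation: the `Subordinate` class and `two_phase_commit` function are hypothetical names, log records are plain strings in a list instead of forced disk writes, and site or link failures are not modelled.

```python
class Subordinate:
    """A participant site with a local log (a plain list in this sketch)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.log = []

    def prepare(self):
        # Phase 1: log the vote before answering the coordinator.
        if self.can_commit:
            self.log.append("prepare")
            return "yes"
        self.log.append("abort")
        return "no"

    def decide(self, decision):
        # Phase 2: log the coordinator's decision, then acknowledge.
        # A subordinate that voted "no" has already logged its abort.
        if self.log[-1] != "abort":
            self.log.append(decision)
        return "ack"


def two_phase_commit(subordinates):
    """Run both phases and return (decision, coordinator_log)."""
    coordinator_log = []
    votes = [s.prepare() for s in subordinates]              # "prepare" round
    decision = "commit" if all(v == "yes" for v in votes) else "abort"
    coordinator_log.append(decision)                          # decision record
    acks = [s.decide(decision) for s in subordinates]         # decision round
    if all(a == "ack" for a in acks):
        coordinator_log.append("end")                         # all acks received
    return decision, coordinator_log


ok_sites = [Subordinate("S1"), Subordinate("S2")]
ok_result = two_phase_commit(ok_sites)

mixed_sites = [Subordinate("S1"), Subordinate("S2", can_commit=False)]
mixed_result = two_phase_commit(mixed_sites)
```

A single “no” vote is enough to abort the transaction everywhere, including at the sites that voted “yes”.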
Parallel Processing
• Parallel processing divides a large task into many smaller tasks and executes the smaller tasks concurrently on several nodes. As a result, the larger task completes more quickly
• A node is a separate processor, often on a separate machine. Multiple processors, however, can reside on a single machine
Sequential Processing of a Single Task
Executing Component Tasks in Parallel
Problems of Parallel Processing
• Effective implementation of parallel processing involves two challenges:
o Structuring tasks so that some tasks execute at the same time (“in parallel”)
o Preserving task sequencing for tasks that must execute serially
Characteristics of a Parallel Processing System
• A parallel processing system has the following characteristics:
o Each processor in the system can perform tasks concurrently
o Tasks may need to be synchronized
o Nodes usually share resources, such as data, disks, and other devices
Parallel Processing for SMPs and MPPs
• Parallel processing architectures support:
o Clustered and massively parallel processing (MPP) hardware, where each node has its own memory
o Single-memory systems, also known as “symmetric multiprocessing” (SMP) hardware, where multiple processors use one memory resource
The Goals of Parallel Processing
• Speedup is the extent to which more hardware can perform the same task in less time than the original system
• Scaleup is the factor that expresses how much more work can be done in the same time period by a larger system
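Both goals are commonly expressed as simple ratios. The sketch below uses those ratios with invented numbers; the function names and the sample timings and transaction counts are assumptions for illustration, not measurements from the slides.

```python
def speedup(time_original, time_parallel):
    """Same task, more hardware: how many times faster the task finishes."""
    return time_original / time_parallel

def scaleup(volume_parallel, volume_original):
    """Same elapsed time, larger system: how many times more work gets done."""
    return volume_parallel / volume_original

# Hypothetical example: a report takes 60 minutes on the original system
# and 20 minutes after more nodes are added.
report_speedup = speedup(time_original=60, time_parallel=20)

# Hypothetical example: in one hour the original system handles 1000
# transactions and the larger system handles 2000.
oltp_scaleup = scaleup(volume_parallel=2000, volume_original=1000)
```

With these numbers the report shows a speedup of 3 and the transaction workload a scaleup of 2.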
Speedup and Scaleup with Different Workloads
Workload        Speedup   Scaleup
------------------------------------------
OLTP            No        Yes
DSS             Yes       Yes
Parallel Query  Yes       Yes
Batch (Mixed)   Possible  Yes
Benefits of Parallel Databases
• Parallel database technology can benefit certain kinds of applications by enabling:
o Higher performance: with more CPUs available to an application, higher speedup and scaleup can be attained
o High availability: nodes are isolated from each other, so a failure at one node does not bring the entire system down
Multi-Instance Database System
Distributed Database System
Parallel Execution
• With parallel execution features, the DBMS can divide the work of processing SQL statements among multiple query server processes
• Provides the framework for parallel execution to work between nodes
• The data server must parallelize individual queries into units of work that can be processed simultaneously in multiprocessing systems
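The idea of splitting one query into units of work can be sketched with Python's standard `concurrent.futures` module. Here a SUM over a hypothetical `sales` table is divided among worker threads, which stand in for the query server processes; the table, the `amount` column, and the helper names are assumptions of this example.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(rows, parts):
    """Split the rows into `parts` roughly equal units of work."""
    return [rows[i::parts] for i in range(parts)]

def partial_sum(rows):
    """Work done by one query server: aggregate its own partition."""
    return sum(row["amount"] for row in rows)

def parallel_sum(rows, workers=4):
    """Coordinator: hand out the partitions, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum, partition(rows, workers)))
    return sum(partials)

# Hypothetical table with amounts 1..100.
sales = [{"amount": n} for n in range(1, 101)]
total = parallel_sum(sales)
serial_total = sum(row["amount"] for row in sales)
```

The parallel result equals the serial one because SUM can be computed per partition and then combined; not every operation decomposes this easily.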
Example of Parallel Execution Processing
Multi-dimensional models
• Data items are each related by several attributes, or dimensions.
o A quantity, e.g. detergent sales, has dimensions including
▪ time (when sold),
▪ cost,
▪ location (where sold and in which type of store)
Usage of Multi-Dimensional Data
• Question: How many 3 kg packs of laundry powder did we sell in the Eastern Region in the last three months?
o Involves several dimensions of information and can be answered straightforwardly if a multi-dimensional view of the data is available
o Complicated, because individual dimensions often have an inherent hierarchy of variables within them
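The question above can be answered with one filter per dimension once the data is held in a multi-dimensional form. The sketch below is purely illustrative: the sales figures, months, city-to-region mapping, and function name are all invented, and the location hierarchy is reduced to a single city-to-region level.

```python
# Hypothetical location hierarchy: city -> region.
REGION_OF_CITY = {"Biratnagar": "Eastern", "Dharan": "Eastern", "Pokhara": "Western"}

# Hypothetical fact data: (month, city, product, pack size in kg) -> units sold.
sales = {
    ("2024-01", "Biratnagar", "laundry powder", 3): 120,
    ("2024-02", "Dharan", "laundry powder", 3): 80,
    ("2024-03", "Biratnagar", "laundry powder", 3): 100,
    ("2024-03", "Pokhara", "laundry powder", 3): 90,   # wrong region
    ("2024-03", "Dharan", "laundry powder", 1): 60,    # wrong pack size
    ("2023-12", "Dharan", "laundry powder", 3): 70,    # outside the period
}

def units_sold(facts, months, region, product, pack_size_kg):
    """Sum the units over every cell that matches all four dimensions."""
    return sum(
        units
        for (month, city, prod, size), units in facts.items()
        if month in months
        and REGION_OF_CITY[city] == region   # roll the city up to its region
        and prod == product
        and size == pack_size_kg
    )

last_three_months = {"2024-01", "2024-02", "2024-03"}
answer = units_sold(sales, last_three_months, "Eastern", "laundry powder", 3)
```

Only the three matching cells (120, 80, and 100 units) contribute, so `answer` is 300. The region filter is where the hierarchy shows up: the data is stored per city, but the question is asked per region.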
Online Analytical Processing (OLAP)
• Fast Analysis of Shared Multidimensional Information (FASMI)
o FAST: most responses reach users within about 5 seconds; the simplest analyses take no more than 1 second, and very few take more than 20 seconds
o ANALYSIS: any business logic and statistical analysis that is easy enough for the target user
o SHARED: all the security requirements for confidentiality
o MULTIDIMENSIONAL: a multidimensional conceptual view of the data, including full support for hierarchies
o INFORMATION: all of the data and derived information needed
OLAP vs. OLTP (Online Transaction Processing)
