
Teradata

M.S.Prasad
165916
Contents
Introduction to Teradata
Teradata Architecture
Data Distribution
PI characteristics
Data Access
Teradata's scalability
Data Protection features
Introduction to Teradata
Teradata is a Relational Database Management
System (RDBMS):
1. Designed to run the world's largest commercial databases.
2. Preferred solution for enterprise data warehousing (OLAP).
3. Executes on UNIX-MP-RAS or NT-based system platforms
4. Compliant with ANSI industry standards
5. Runs on single (SMP) or multiple (MPP) nodes
6. Acts as Database server to client applications throughout the enterprise
7. Uses Parallelism to manage Terabytes of data
8. Shared-Nothing Architecture
Advantage Teradata
1. Unlimited, Proven Scalability
2. Unlimited Parallelism - Parallel sorts/aggregations, temporary tables,
Shared-Nothing architecture
3. Mature Optimizer - Complex queries, joins per query, ad-hoc processing.
It is a cost-based optimizer.
4. Model the Business - 3NF, robust view processing, star schema
5. Lowest TCO - ease of setup & maintenance, robust parallel utilities, no re-orgs,
lowest disk to data ratio, robust expansion utility
6. High Availability - No single point of failure,
scalable data loading, parallel load utilities

Note: If the table demographics are well defined, the optimizer will choose the
best plan for the query execution.
Advantage Teradata
7. Enormous capacity
Billions of rows
Terabytes of data
8. High-performance parallel processing
9. Single database server for multiple clients - Single Version of the Truth
10. Network and Mainframe connectivity
11. Industry standard access language (SQL)
12. Manageable growth via modularity
13. Fault tolerance at all levels of hardware and software
14. Data integrity and reliability
Advantage Teradata DBA
Things Teradata DBAs NEVER Have to Do!

1. Reorganize data or index space
2. Pre-prepare data for loading (convert, sort, split, etc.)
3. Ensure that queries run in parallel
4. Unload/reload data spaces due to expansion
5. Design, implement and support partition schemes
6. Write programs to figure out how to divide data into partitions
7. Write or run programs to split the input data into partitions for loading

They know that if data doubles, the system can expand easily to
accommodate it.

The workload for creating a table of 100,000 rows is the same as
creating 1,000,000,000 rows!
Advantage Teradata Warehouse

[Figure: operational data sources (ATM, MVS, POS) feed the Teradata data
warehouse, which end users query through access tools (Cognos, Access, BO).]
Architecture
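[Figure: two client paths into a Teradata node.
Channel-Attached System: Client Application -> CLI -> TDP -> Channel -> Channel Driver -> Parsing Engine.
Network-Attached System: Client Application -> CLI -> MTDP -> MOSI -> LAN -> Teradata Gateway -> Parsing Engine.
Inside the node, the TPA runs on PDE over the OS (UNIX/NT); the Parsing Engines
communicate over the BYNET with the AMPs, and each AMP manages its own VDisk.]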
Architecture In Detail
Node
1. The basic building block for a Teradata system, the node is where the processing
occurs for the database.
2. A node is a term for a general-purpose processing unit under the control of a single
operating system.
3. A Teradata system contains one or more nodes.
Single Node - Symmetric Multiprocessing (SMP)
Multi Node - Massively Parallel Processing (MPP)
Node Components:
1. Parsing Engine
2. BYNET
3. AMP
Understanding Node

[Figure: within a node, the Parsing Engine communicates over the BYNET with
four AMPs, each of which manages its own Vdisk.]
Node Components
Component: Parsing Engine
Functionality:
1. Managing individual sessions (up to 120)
2. Parsing and optimizing your SQL requests
3. Dispatching the optimized plan to the AMPs
4. ASCII / EBCDIC conversion (if necessary)
5. Sending the answer set response back to the requesting client

Component: AMP
Functionality:
1. Storing and retrieving rows to and from the disks
2. Lock management
3. Sorting rows and aggregating columns
4. Join processing
5. Output conversion and formatting
6. Creating answer sets for clients
7. Disk space management and accounting
8. Special utility protocols
9. Recovery processing

Component: Vdisk
Functionality:
1. A vdisk (pronounced "VEE-disk") is the logical disk space that is managed by an AMP.
Node - Other components
Component: Channel Driver
Functionality: Channel driver software is the means of communication
between the PEs and applications running on channel-attached
(mainframe) clients.

Component: Gateway
Functionality: The Teradata Gateway software is the means of communication
between the PEs and applications running on:
1. LAN-attached clients
2. A node in the system

Component: PDE
Functionality: The PDE (Parallel Database Extensions) software layer runs on
top of the operating system on each node. It was created by NCR to
support the parallel environment.

Component: TPA
Functionality: A Trusted Parallel Application (TPA) uses PDE to implement
virtual processors (vprocs). The Teradata RDBMS is classified as a TPA.
Other Components Continued..
Component: CLI (Call Level Interface)
Functionality:
1. Library of routines for blocking/unblocking requests
and responses to/from the RDBMS
2. Performs logon and logoff functions

Component: Teradata Director Program (TDP)
Functionality:
1. The Teradata Director Program is used by the mainframe host to
communicate with the Teradata system.
2. It manages all traffic between the Call Level Interface
(CLI) and the Teradata system. Its functions include
session initiation and termination, logging,
verification, recovery, and restart.

Component: MTDP (Micro Teradata Director Program)
Functionality: Performs many of the TDP functions, including session
management but not session balancing.

Component: MOSI (Micro Operating System Interface)
Functionality: Provides an interface that is independent of the operating
system and network protocol.
The Parsing Engine

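[Figure: an SQL request enters the Parsing Engine and passes through Session
Control, the Parser, the Optimizer and the Dispatcher. The Dispatcher sends the
steps over the BYNET to the AMPs, and the Parsing Engine returns the answer set
response to the client.]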


Data Distribution
The Parsing Engine uses the Hashing Algorithm to distribute data across the
AMPs.
Data distribution is dependent on the hash value of the Primary index (PI).
Hashing Algorithm
1. The Hashing Algorithm acts like a mathematical "blender." It takes up to 16
columns of mixed data as input and generates a single 32-bit binary value called a
Row Hash.
2. Input to the algorithm is the Primary Index (PI) value of a row.
3. Row Hash uniqueness depends directly on PI uniqueness.
4. Good data distribution depends directly on Row Hash uniqueness.
5. The algorithm produces random, but consistent, Row Hashes.
6. The same PI value and data type combination always hashes identically.
7. Rows with the same Row Hash will always go to the same AMP.
8. Different PI values rarely produce the same Row Hash (a collision).
Row Hash
1. A 32-bit binary value.
2. The logical storage location of the row.
3. Used to identify the AMP of the row.
4. Table ID + Row Hash is used to locate the Cylinder and Data Block.
5. Used for distribution, placement, and retrieval of the row.
Primary Index (PI) Hash Mapping
[Figure: the Primary Index value for a row is fed to the Hashing Algorithm,
which produces the Row Hash (32 bits). The first 16 bits form the DSW
(Destination Selection Word); the remaining 16 bits complete the Row Hash.
The DSW selects an entry in the Hash Map (65,536 entries, memory resident), and
the Message Passing Layer (PDE and BYNET) delivers the row to the chosen AMP
(AMPs 0 to 9 in the illustration).]
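The mapping above can be sketched in Python. This is an illustrative model only: Teradata's real hashing algorithm is not given in these slides, so `zlib.crc32` stands in for it. The round-robin hash map, the 10-AMP count and the function names are all assumptions chosen to mirror the figure.

```python
import zlib

NUM_AMPS = 10                 # assumption: AMPs 0-9, as in the figure
HASH_MAP_ENTRIES = 65_536     # one entry per possible 16-bit DSW

# Assumption: a simple round-robin hash map; the real map is built by the system.
HASH_MAP = [bucket % NUM_AMPS for bucket in range(HASH_MAP_ENTRIES)]


def row_hash(pi_value) -> int:
    """Stand-in 32-bit row hash (CRC-32, NOT Teradata's algorithm).

    repr() keeps the data type in the input, so 19 and "19" are hashed
    as different byte strings.
    """
    return zlib.crc32(repr(pi_value).encode("utf-8")) & 0xFFFFFFFF


def dsw(hash32: int) -> int:
    """Destination Selection Word: the first (high-order) 16 bits of the row hash."""
    return hash32 >> 16


def amp_for(pi_value) -> int:
    """Find the owning AMP by looking the DSW up in the hash map."""
    return HASH_MAP[dsw(row_hash(pi_value))]


print("PI value 19 is stored on AMP", amp_for(19))
```

Because the lookup depends only on the PI value, inserting a row and later retrieving it by PI value lead to the same AMP without any directory search.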
Data Distribution
[Figure: records arrive from the client in random sequence
(2, 32, 67, 12, 90, 6, 54, 75, 18, 25, 80, 41). Records from a host are
converted from EBCDIC to ASCII by the Parsing Engine(s), hashed, distributed
over the Message Passing Layer, then formatted and stored by AMPs 1 to 4.]
PI Characteristics
Primary Indexes (UPI and NUPI)
1. A Primary Index may be different from a Primary Key.
2. Every table has only one Primary Index.
3. A Primary Index may contain null(s).
4. Single-value access uses ONE AMP and, typically, one I/O.

Unique Primary Index (UPI)


1. Involves a single base table row at most.
2. No spool file is ever required.
3. The system automatically enforces uniqueness on the index value.

Non-Unique Primary Index (NUPI)


1. May involve multiple base table rows.
2. A spool file is created when needed.
3. Duplicate values go to the same AMP and the same data block.
4. Only one I/O is needed if all the rows fit in a single data block.
5. Duplicate row check for a Set table is required if there is no USI on the table.
PI Considerations
ACCESS
Maximize one-AMP operations:
Choose the column most frequently used for access.
Consider both join and value access.
DISTRIBUTION
Optimize parallel processing:
Choose a column that provides good distribution.
VOLATILITY
Reduce maintenance resource overhead (I/O):
Choose a column with stable data values.

The column chosen for the PI should be unique or nearly unique to achieve good
distribution of data. The more even the distribution, the higher the parallelism.
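The effect of PI uniqueness on distribution can be seen with a small simulation. This is a sketch under assumptions: CRC-32 modulo the AMP count stands in for the real hash and hash map, and the four-AMP system and sample PI values are hypothetical.

```python
import zlib
from collections import Counter

NUM_AMPS = 4  # assumption: a small system for illustration


def amp_for(pi_value) -> int:
    """Stand-in placement: CRC-32 of the PI value modulo the AMP count."""
    return zlib.crc32(str(pi_value).encode("utf-8")) % NUM_AMPS


def rows_per_amp(pi_values) -> Counter:
    """Count how many rows each AMP would store for the given PI values."""
    return Counter(amp_for(value) for value in pi_values)


# A unique PI (e.g. a hypothetical order number) versus a PI with two values.
unique_pi = range(1, 10_001)
skewed_pi = ["M" if i % 10 else "F" for i in range(10_000)]

print("unique PI:", sorted(rows_per_amp(unique_pi).items()))
print("skewed PI:", sorted(rows_per_amp(skewed_pi).items()))
```

With only two distinct PI values, at most two AMPs can receive rows, so the remaining AMPs sit idle during an all-AMP operation on that table.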
AMP Operations
1. Single-AMP operation (typical UPI access)
2. Multi-AMP operation
3. All-AMP operation
Single AMP operation - Illustration
Table SAMPLE (NUMBER is the UPI):
NUMBER LETTER
1 P
2 U
3 Y
4 T
5 R
6 E
7 W
8 Q
9 A
10 S
11 D
12 F
13 G
14 H
15 J
16 K
17 L
18 M
19 N
20 B
21 V
22 C
23 X
24 Z

SELECT LETTER
FROM SAMPLE
WHERE NUMBER = 19
;
ANSWER: N
Application to PE

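[Figure: APPL 1 and APPL 2 connect to PE 1 and PE 2. The rows of SAMPLE
(NUMBER LETTER) are spread across AMPs 1 to 8 as follows.]
AMP 1: 13 G, 6 E, 12 F
AMP 2: 15 J, 4 T, 17 L
AMP 3: 20 B, 19 N, 2 U
AMP 4: 7 W, 16 K, 11 D
AMP 5: 14 H, 23 X, 3 Y
AMP 6: 9 A, 21 V, 24 Z
AMP 7: 22 C, 5 R, 18 M
AMP 8: 1 P, 10 S, 8 Q

SQL Request
SELECT LETTER
FROM SAMPLE
WHERE NUMBER = 19;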

1. APPL 1 establishes a user session on PE 1.


2. APPL 1 sends the SQL request to the PE on the forward channel.
3. PE 1 acknowledges the message on the back channel.
4. PE 1 parses and optimizes the request.
PE to AMP

1. PE 1 produces a one-step plan as a message to the BYNET.
2. BYNET uses the hash map to determine that the destination is AMP 3.
3. BYNET sends the message to AMP 3 on the forward channel.
4. AMP 3 acknowledges the message across the back channel.
AMP to PE

1. AMP 3 sends the answer set to PE 1 on the forward channel.
2. PE 1 acknowledges receipt across the back channel.
PE to Application
Single-AMP Query

1. PE 1 forwards response parcels to APPL 1 on the forward channel.
2. APPL 1 acknowledges the messages on the back channel.
3. APPL 1 processes the response and generates output.
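The single-AMP walk-through can be replayed with a few lines of Python. The row placement is copied from the figure; the `OWNER` lookup table is a hypothetical stand-in for hashing the PI value and consulting the hash map, and the function name is an assumption.

```python
# Row placement copied from the figure: AMP number -> {NUMBER: LETTER}
AMPS = {
    1: {13: "G", 6: "E", 12: "F"},
    2: {15: "J", 4: "T", 17: "L"},
    3: {20: "B", 19: "N", 2: "U"},
    4: {7: "W", 16: "K", 11: "D"},
    5: {14: "H", 23: "X", 3: "Y"},
    6: {9: "A", 21: "V", 24: "Z"},
    7: {22: "C", 5: "R", 18: "M"},
    8: {1: "P", 10: "S", 8: "Q"},
}

# Hypothetical stand-in for "hash the PI value, then consult the hash map".
OWNER = {number: amp for amp, rows in AMPS.items() for number in rows}


def select_letter(number):
    """UPI access: send the one-step plan to the single AMP that owns the row."""
    amp = OWNER[number]
    return amp, AMPS[amp][number]


amp, letter = select_letter(19)
print(f"AMP {amp} answered: {letter}")  # AMP 3 answered: N
```

Only AMP 3 does any work for this request; the other seven AMPs never see it.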
All-AMP operation with a Sort
SELECT NUMBER, LETTER
FROM SAMPLE
WHERE NUMBER > 9
ORDER BY LETTER
;

Table SAMPLE (NUMBER is the UPI), listed as NUMBER LETTER:
1 P, 2 U, 3 Y, 4 T, 5 R, 6 E, 7 W, 8 Q,
9 A, 10 S, 11 D, 12 F, 13 G, 14 H, 15 J, 16 K,
17 L, 18 M, 19 N, 20 B, 21 V, 22 C, 23 X, 24 Z

ANSWER:
20 B
22 C
11 D
12 F
13 G
14 H
15 J
16 K
17 L
18 M
19 N
10 S
21 V
23 X
24 Z
Application to PE
All-AMP Query with Sort

SQL Request
SELECT NUMBER, LETTER
FROM SAMPLE
WHERE NUMBER > 9
ORDER BY LETTER ;

1. APPL 1 establishes a user session on PE 1.


2. APPL 1 sends the SQL request to the PE on the forward channel.
3. PE 1 acknowledges the message on the back channel.
4. PE 1 parses and optimizes the request.
PE to AMPs
All-AMP Query with Sort

1. PE 1 produces a three-step plan.
2. PE 1 gives the first step to BYNET to send to all AMPs.
3. BYNET sends the step over the forward channel to all AMPs.
4. All AMPs acknowledge receipt over the back channel.
PE to AMPs
All-AMP
Query with Sort

1. PE1 sends out step 2 over the BYNET.


2. BYNET sends step to all AMPs.
AMPs to Merge Process
All-AMP
Query with Sort

Each AMP sends its first block of sorted data to BYNET merge process.
AMP to Merge
All-AMP
Query with Sort

Plan
1. GET NUMBER, LETTER
WHERE NUMBER > 9
2. SORT ON LETTER
3. MERGE ON LETTER
1. The merge process continues to request sorted blocks from the AMPs until all AMPs
have exhausted their spool supply.
2. When the merge process has an EOF from each AMP, the answer set is complete.
Note: Spool is a temporary space used by the AMPs to store
the intermediate results.
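The three-step plan can be sketched in Python using the row placement from the single-AMP illustration. This is a toy model: each AMP filters and sorts its own rows into a list standing in for spool, and `heapq.merge` plays the role of the BYNET merge process. Function names are assumptions.

```python
import heapq

# Row placement from the earlier figure: AMP number -> {NUMBER: LETTER}
AMPS = {
    1: {13: "G", 6: "E", 12: "F"},
    2: {15: "J", 4: "T", 17: "L"},
    3: {20: "B", 19: "N", 2: "U"},
    4: {7: "W", 16: "K", 11: "D"},
    5: {14: "H", 23: "X", 3: "Y"},
    6: {9: "A", 21: "V", 24: "Z"},
    7: {22: "C", 5: "R", 18: "M"},
    8: {1: "P", 10: "S", 8: "Q"},
}


def amp_steps(rows):
    """Steps 1 and 2 on one AMP: keep NUMBER > 9, then sort the spool by LETTER."""
    spool = [(letter, number) for number, letter in rows.items() if number > 9]
    return sorted(spool)


def merge_step(spools):
    """Step 3: merge the per-AMP sorted spools until every one is exhausted."""
    return [(number, letter) for letter, number in heapq.merge(*spools)]


answer = merge_step([amp_steps(rows) for rows in AMPS.values()])
for number, letter in answer:
    print(number, letter)
```

The filter and sort run independently on every AMP; only the final merge is a single stream, which is why the sort scales with the number of AMPs.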
PE to Application
All-AMP
Query with Sort

PE1 then sends the answer set to the requesting application.


Linear Growth and Expandability
[Figure: Parsing Engines (sessions), AMPs (parallel processing), disk space
(data) and nodes can each be added to the system.]

Components may be added as requirements grow without loss of performance.
1. Double the number of AMPs while the number of users stays the same:
performance doubles.
2. Double the number of AMPs and double the number of users: performance
stays the same.
Teradata is linearly expandable.
Data Protection
Teradata provides the following data protection features:

Protection Method      Type
Locks                  Software
Fallback               Software
RAID Protection        Hardware
Cliques                Hardware
Transient Journal      Software
Permanent Journal      Software
Archive and Restore    Software
Locks
There are four types of locks:
Exclusive - prevents any other type of concurrent access
Write - prevents other Read, Write, Exclusive locks
Read - prevents Write and Exclusive locks
Access - prevents Exclusive locks only

Locks may be applied at three database levels:
Database - applies to all tables/views in the database
Table/View - applies to all rows in the table/views
Row Hash - applies to all rows with same row hash

Lock types are automatically applied based on the SQL command:
SELECT - applies a Read lock
UPDATE - applies a Write lock
CREATE TABLE - applies an Exclusive lock
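The lock rules listed above amount to a compatibility table. A minimal Python sketch follows, with the table transcribed directly from the four lock descriptions; the function names are assumptions, and real lock management also involves queuing and lock levels that are not modelled here.

```python
# Lock types that each HELD lock prevents, as listed above.
BLOCKS = {
    "EXCLUSIVE": {"ACCESS", "READ", "WRITE", "EXCLUSIVE"},
    "WRITE": {"READ", "WRITE", "EXCLUSIVE"},
    "READ": {"WRITE", "EXCLUSIVE"},
    "ACCESS": {"EXCLUSIVE"},
}

# Default lock applied per SQL command, as listed above.
DEFAULT_LOCK = {"SELECT": "READ", "UPDATE": "WRITE", "CREATE TABLE": "EXCLUSIVE"}


def is_compatible(held: str, requested: str) -> bool:
    """True if `requested` can be granted while `held` is in place on the same object."""
    return requested not in BLOCKS[held]


def can_run(statement: str, held_locks) -> bool:
    """True if the statement's default lock is compatible with every held lock."""
    requested = DEFAULT_LOCK[statement]
    return all(is_compatible(held, requested) for held in held_locks)


print(can_run("SELECT", ["READ"]))    # True: readers share the object
print(can_run("SELECT", ["WRITE"]))   # False: a Write lock prevents Read locks
```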
Access Locks
Advantages of Access locks:
1. Permit quicker access to tables in a multi-user environment.
2. Have minimal blocking effect on other queries.
3. Very useful for aggregating large numbers of rows.
4. Sometimes called a Dirty-Read or Stale-Read lock.
Disadvantages of Access locks:
1. May produce erroneous results if performed during table maintenance.
Rule
Lock requests are queued behind all outstanding
incompatible lock requests for the same object.

A new ACCESS lock request is granted immediately, provided no Exclusive lock
request is outstanding on the object.


Fallback
Fallback is a software mechanism. The fallback row is a copy of a primary row
stored on a different AMP.
A fallback table is fully available in the event of an unavailable AMP.

[Figure: two PEs connect over the BYNET to AMPs 1 to 4. Each AMP stores its own
primary rows plus fallback copies of rows whose primary copies are on other AMPs.]

Benefits of Fallback
1. Permits access to table data during AMP off-line period
2. Adds a level of data protection beyond disk array RAID
3. Automatically restores data changed during AMP off-line
4. Critical for high availability applications
Cost of Fallback
1. Twice the disk space for table storage is needed
2. Twice the I/O for INSERTs, UPDATEs and DELETEs is needed
Fallback Cluster
A defined number of AMPs treated as a fault-tolerant unit.
Fallback rows for AMPs in a cluster reside in the cluster.
Loss of an AMP in the cluster permits continued table access.
Loss of two AMPs in the cluster causes the RDBMS to halt.

Two Clusters of Four AMPs Each

Cluster 1        AMP 1       AMP 2       AMP 3       AMP 4
Primary rows     62 8 27     34 22 50    5 78 19     14 1 38
Fallback rows    5 34 14     19 38 8     22 62 1     50 27 78

Cluster 2        AMP 5       AMP 6       AMP 7       AMP 8
Primary rows     41 66 7     58 93 20    88 2 45     17 37 72
Fallback rows    93 72 88    45 7 17     37 58 41    20 2 66

Lose AMP 3 from cluster 1 -> AMPs 1, 2 and 4 experience a 33% increase in workload.
Lose AMP 6 from cluster 2 -> AMPs 5, 7 and 8 experience a 33% increase in workload.
Also lose AMP 7 (a second AMP in the same cluster as AMP 6) -> System halts.
System performance can be adversely affected where any AMP has a
disproportionate burden.
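The cluster behaviour can be checked against the figure's row placement. The Python sketch below copies the primary and fallback rows from the two-cluster illustration; the function names are assumptions, and "workload" is approximated simply as the number of rows an AMP has to serve.

```python
# Primary and fallback row placement copied from the two-cluster figure.
PRIMARY = {1: [62, 8, 27], 2: [34, 22, 50], 3: [5, 78, 19], 4: [14, 1, 38],
           5: [41, 66, 7], 6: [58, 93, 20], 7: [88, 2, 45], 8: [17, 37, 72]}
FALLBACK = {1: [5, 34, 14], 2: [19, 38, 8], 3: [22, 62, 1], 4: [50, 27, 78],
            5: [93, 72, 88], 6: [45, 7, 17], 7: [37, 58, 41], 8: [20, 2, 66]}
CLUSTERS = [{1, 2, 3, 4}, {5, 6, 7, 8}]


def system_halts(down_amps):
    """Two or more down AMPs in the same cluster halt the RDBMS."""
    return any(len(cluster & set(down_amps)) >= 2 for cluster in CLUSTERS)


def extra_rows_served(down_amp):
    """Rows each surviving AMP serves from fallback on behalf of the down AMP."""
    lost = set(PRIMARY[down_amp])
    return {amp: sorted(lost & set(rows))
            for amp, rows in FALLBACK.items()
            if amp != down_amp and lost & set(rows)}


print(extra_rows_served(3))    # {1: [5], 2: [19], 4: [78]}
print(system_halts({3, 6}))    # False: the down AMPs are in different clusters
print(system_halts({6, 7}))    # True: both down AMPs are in cluster 2
```

Each surviving AMP in the cluster picks up one extra row on top of its own three primary rows, which is the 33% workload increase quoted above.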
Fallback vs. Non-Fallback Tables
FALLBACK TABLES
ONE AMP DOWN
Data fully available
TWO OR MORE AMPs DOWN
In different clusters: data fully available
In the same cluster: system halts

NON-FALLBACK TABLES
ONE AMP DOWN
Data partially available
Queries avoiding down AMP succeed
TWO OR MORE AMPs DOWN
In different clusters: data partially available; queries avoiding down AMPs succeed
In the same cluster: system halts
RAID Protection
Two types of disk array protection:
RAID-1 (Mirroring)
1. Each physical (primary) disk in the array has an exact copy (mirror) in the same array.
2. The array controller can read from either disk and write to both.
3. When one disk of the pair fails, there is no change in performance.
4. Mirroring reduces available disk space by 50%.
5. Array controller reconstructs failed disks quickly.

RAID-5 (Parity)
1. For every 3 blocks of data, there is a parity block on a 4th disk.
2. Parity Algorithm is applied to determine the parity block.
3. If a disk fails, any missing block may be reconstructed using the other three disks.
4. Parity reduces available disk space by 25% in a 4-disk rank.
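5. Array controller reconstruction of failed disks takes longer than with RAID-1.
[Figure: block layout across a 4-disk rank; the parity block moves to a
different disk in each row]
Block 0    Block 1    Block 2    Parity
Parity     Block 3    Block 8    Block 4
Block 5    Parity     Block 6    Block 7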
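The reconstruction step for RAID-5 can be illustrated in Python. The slides do not name the parity algorithm, so this sketch assumes byte-wise XOR, the usual choice for RAID-5. Block contents and function names are hypothetical.

```python
from functools import reduce


def parity(blocks):
    """XOR equal-length blocks byte by byte to produce the parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks):
    """Rebuild the one missing block from the other three (data and/or parity)."""
    return parity(surviving_blocks)


data = [b"AAAA", b"BBBB", b"CCCC"]   # three data blocks on three disks
p = parity(data)                     # parity block on the fourth disk

# Simulate losing the disk that held data[1], then rebuild it.
rebuilt = reconstruct([data[0], data[2], p])
print(rebuilt == data[1])            # True
```

Every read of a missing block needs the three surviving blocks, which is why RAID-5 performance drops while a disk is failed, whereas a RAID-1 mirror can be read directly.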

Summary
RAID-1 - Good performance with disk failures
Higher cost in terms of disk space
RAID-5 - Reduced performance with disk failures
Lower cost in terms of disk space
Recovery Journal For Down AMPs
Recovery Journal is:
1. Automatically activated when an AMP is taken off-line
2. Maintained by other AMPs in the cluster
3. Totally transparent to users of the system

While AMP is off-line:
1. Journal is active
2. Table updates continue as normal
3. Journal logs Row-IDs of changed rows for down-AMP

When AMP is back on-line:
1. Restores rows on recovered AMP to current status
2. Journal discarded when recovery complete

[Figure: AMP 1 (primary rows 41, 66, 7) is off-line. AMPs 2, 3 and 4 hold the
fallback copies of those rows and keep Recovery Journal (RJ) entries for
Row-IDs 7, 41 and 66 respectively.]
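The Recovery Journal cycle above can be modelled in a few lines of Python. This is a toy sketch, not Teradata's implementation: each row has one primary and one fallback copy, the journal is kept in one dictionary rather than on the other AMPs of the cluster, and the class and method names are assumptions.

```python
class Cluster:
    """Toy model: every row has a primary copy and a fallback copy on another AMP."""

    def __init__(self, primary_amp, fallback_amp):
        self.primary_amp = primary_amp      # row_id -> AMP holding the primary copy
        self.fallback_amp = fallback_amp    # row_id -> AMP holding the fallback copy
        self.data = {}                      # (amp, row_id) -> value
        self.down = set()                   # AMPs currently off-line
        self.journal = {}                   # down AMP -> Row-IDs changed meanwhile

    def update(self, row_id, value):
        """Apply an update to both copies; journal the Row-ID for a down AMP."""
        for amp in (self.primary_amp[row_id], self.fallback_amp[row_id]):
            if amp in self.down:
                self.journal.setdefault(amp, set()).add(row_id)
            else:
                self.data[(amp, row_id)] = value

    def take_offline(self, amp):
        self.down.add(amp)

    def bring_online(self, amp):
        """Restore journaled rows from the surviving copy, then discard the journal."""
        self.down.discard(amp)
        for row_id in self.journal.pop(amp, set()):
            if self.primary_amp[row_id] == amp:
                other = self.fallback_amp[row_id]
            else:
                other = self.primary_amp[row_id]
            self.data[(amp, row_id)] = self.data[(other, row_id)]


# Rows 7, 41, 66: primary on AMP 1, fallback on AMPs 2, 3, 4 (as in the figure).
cluster = Cluster({7: 1, 41: 1, 66: 1}, {7: 2, 41: 3, 66: 4})
cluster.update(7, "v1")
cluster.take_offline(1)
cluster.update(7, "v2")          # AMP 1 is down: only Row-ID 7 is journaled
cluster.bring_online(1)
print(cluster.data[(1, 7)])      # v2
```

Note that the journal stores only Row-IDs, not row contents; the current data is copied from the surviving copy when the AMP returns.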


Cliques
A clique (pronounced "kleek") is a group of nodes. Two or more TPA nodes that
have access to the same disk arrays are called a clique.
[Figure: four SMP nodes (SMP 1 to SMP 4) run AMPs 1 to 8, two AMPs per node.
All nodes attach to the shared disk array controllers (DAC).]
AMP vprocs can run on any node within the clique and still have full access to
their disk array space. If a node fails, AMPs migrate to another node in the
clique.
[Figure: the same clique after a node failure. AMP 3 and AMP 4 have migrated to
other nodes in the clique and still reach their disk array space through the DACs.]
Note: Failure of a node within a clique increases the workload for the
other nodes within the clique.
Transient Journal
1. Consists of a journal of transaction before images.
2. Provides rollback in the event of transaction failure.
3. Is automatic and transparent.
4. Before images are reapplied to table if transaction fails.
5. Before images are discarded upon transaction completion.
Successful transaction:
BEGIN TRANSACTION
UPDATE Row A (Add $100 to checking) - Before image Row A recorded
UPDATE Row B (Subtract $100 from savings) - Before image Row B recorded
END TRANSACTION - Discard before images

Failed transaction:
BEGIN TRANSACTION
UPDATE Row A - Before image Row A recorded
UPDATE Row B - Before image Row B recorded
(Failure occurs)
(Rollback occurs) - Reapply before images
(Terminate TXN) - Discard before images
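The before-image mechanism can be sketched in Python. This is a toy model over a dictionary, not Teradata's implementation; the class name, method names and account values are hypothetical.

```python
class TransientJournal:
    """Toy before-image journal for one transaction over a dict-based table."""

    _MISSING = object()  # marks rows that did not exist before the transaction

    def __init__(self, table):
        self.table = table
        self.before_images = {}

    def update(self, row_id, new_value):
        # Record the before image only on the first change to a row.
        if row_id not in self.before_images:
            self.before_images[row_id] = self.table.get(row_id, self._MISSING)
        self.table[row_id] = new_value

    def commit(self):
        """END TRANSACTION: discard the before images."""
        self.before_images.clear()

    def rollback(self):
        """Reapply the before images, then discard them."""
        for row_id, image in self.before_images.items():
            if image is self._MISSING:
                self.table.pop(row_id, None)
            else:
                self.table[row_id] = image
        self.before_images.clear()


accounts = {"checking": 500, "savings": 900}
txn = TransientJournal(accounts)
txn.update("checking", accounts["checking"] + 100)
txn.update("savings", accounts["savings"] - 100)
txn.rollback()                       # simulate a failure before END TRANSACTION
print(accounts)                      # {'checking': 500, 'savings': 900}
```

Had `commit()` been called instead of `rollback()`, the before images would simply be dropped and both updates would stand.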
The Permanent Journal
An optional, user-specified, system-maintained journal used for database recovery to
a specified point in time.
1. Used for recovery from unexpected hardware or software disasters. May be
specified for:
One or more tables
One or more databases
2. Permits capture of BEFORE images for database rollback.
3. Permits capture of AFTER images for database rollforward.
4. Permits archiving change images during table maintenance.
5. Reduces need for full-table backups.
6. Provides a means of recovering NO FALLBACK tables.
7. Requires additional disk space for change images.
8. Requires user intervention for archive and recovery activity.
Note:
The user cannot directly query the permanent journal table.
Permanent Journal occupies Permanent space and hence needs to be
cleaned up periodically.
Archiving and Recovering Data
ARC Utility
1. The Archive/Restore utility
2. Runs on IBM, UNIX and NT
3. Archives data from RDBMS
4. Restores data from archive media
5. Permits data recovery to a specified checkpoint

Common uses of ARC

1. Dump database objects for backup or disaster recovery


2. Restore non-fallback tables after disk failure.
3. Restore tables after corruption from failed batch
processes.
4. Recover accidentally dropped tables, views, or macros.
5. Recover from miscellaneous user errors.
Summary
In today's session we have learnt about:

1. The Teradata architecture and how it achieves parallelism and
scalability.
2. The concept of Shared-Nothing Architecture.
3. The way data is distributed using the Hashing Algorithm.
4. The significance of the PI in row distribution.
5. How data rows are fetched.
6. The various protection features in Teradata.
References
Teradata Basics, official curriculum, published by NCR Teradata
Solutions Group.
