
Teradata Database 13.10 Overview


Todd Walter
CTO Teradata Labs

2
Fine Print
Nothing in this presentation constitutes a
commitment to deliver any specific
functionality at any specific time.
The current planned date for the 13.10 release is Q3 2010.
Key Features
4
What is a Temporal Database
Definitions
Temporal – the ability to store all historic states of a given set of data
(a database row) and, as part of the query, select a point in time at which
to reference the data. Examples:
> What was this account balance (share price, inventory level, asset value,
etc) on this date?
> What data went into the calculation on 12/31/05, and what adjustments
were made in 1Q06?
> On this historic date, what was the service level (contract status, customer
value, insurance policy coverage) for said customer?
Three Types of Temporal Tables
> Valid Time Tables
When a fact is true in the modeled reality
User specified times
> Transaction Time Tables
When a fact is stored in the database
System maintained time, no user control
> Bitemporal Tables
Both Transaction Time and Valid Time
User Defined Time
> User can add time period columns, and take advantage of the added
temporal operators
> Database does not enforce any rules on user defined time columns
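As a sketch of how these table types are declared (table and column names are illustrative, not from this deck), a bitemporal table carries one period column per time dimension, each marked with its temporal qualifier:

```sql
-- Bitemporal table sketch: both valid time and transaction time.
CREATE MULTISET TABLE policy
  ( policy_id  INTEGER NOT NULL
  , coverage   VARCHAR(30)
  -- Valid time: when the fact is true in the modeled reality (user specified).
  , vt PERIOD(DATE) NOT NULL AS VALIDTIME
  -- Transaction time: when the fact is stored (system maintained).
  , tt PERIOD(TIMESTAMP(6) WITH TIME ZONE) NOT NULL AS TRANSACTIONTIME
  )
PRIMARY INDEX (policy_id);
```

Dropping the TRANSACTIONTIME column yields a valid-time table; dropping the VALIDTIME column yields a transaction-time table.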
5
Temporal Query
Provide a list of members who were reported as covered on
Jan. 15, 2000 in the Feb. 1, 2000 NCQA report, with names as
accurate as our best data shows today.
With Temporal Support:

VALIDTIME AS OF DATE '2000-01-15' AND
TRANSACTIONTIME AS OF DATE '2000-02-01'
SELECT member.member_id, member.member_nm
FROM edw.member_x_coverage, edw.member
WHERE member_x_coverage.member_id = member.member_id;

Without Temporal Support:

SELECT member.member_id
,member.member_nm
FROM edw.member_x_coverage coverage
,edw.member
WHERE coverage.member_id = member.member_id
AND coverage.observation_start_dt <= DATE '2000-02-01'
AND (coverage.observation_end_dt > DATE '2000-02-01'
     OR coverage.observation_end_dt IS NULL)
AND coverage.effective_dt <= DATE '2000-01-15'
AND (coverage.termination_dt > DATE '2000-01-15'
     OR coverage.termination_dt IS NULL);
6
Temporal Update BiTemporal Table
With Temporal Support
UPDATE objectlocation
SET location = 'External'
WHERE item_id = 125
AND item_serial_num = 102;
Without Temporal Support
INSERT INTO objectlocation
SELECT item_id, item_serial_num, 'External', CURRENT_TIME, END(vt), CURRENT_TIME,
UNTIL_CLOSED
FROM objectlocation
WHERE item_id = 125 AND item_serial_num = 102
AND BEGIN(vt) <= CURRENT_TIME
AND END(vt) > CURRENT_TIME
AND END(tt) = UNTIL_CLOSED;
INSERT INTO objectlocation
SELECT item_id, item_serial_num, location, BEGIN(vt), CURRENT_TIME, CURRENT_TIME,
UNTIL_CLOSED
FROM objectlocation
WHERE item_id = 125 AND item_serial_num = 102
AND BEGIN(vt) <= CURRENT_TIME
AND END(vt) > CURRENT_TIME
AND END(tt) = UNTIL_CLOSED;
UPDATE objectlocation
SET END(tt) = CURRENT_TIME
WHERE item_id = 125 AND item_serial_num = 102
AND BEGIN(vt) <= CURRENT_TIME
AND END(vt) > CURRENT_TIME
AND END(tt) = UNTIL_CLOSED;
INSERT INTO objectlocation
SELECT item_id, item_serial_num, 'External', BEGIN(vt), END(vt), CURRENT_TIME,
UNTIL_CLOSED
FROM objectlocation
WHERE item_id = 125 AND item_serial_num = 102
AND BEGIN(vt) > CURRENT_TIME
AND END(tt) = UNTIL_CLOSED;
UPDATE objectlocation
SET END(tt) = CURRENT_TIME
WHERE item_id = 125 AND item_serial_num = 102
AND BEGIN(vt) > CURRENT_TIME
AND END(tt) = UNTIL_CLOSED;

Current valid time, current transaction time update:
jeans (item 125, serial 102) are sold today (2005-08-30)
7
Moving Current Date in PPI

Description
> Support use of CURRENT_DATE and CURRENT_TIMESTAMP built-in
functions in Partitioning Expression.
> Ability to reconcile the values of these built-in functions to a newer date or
timestamp using ALTER TABLE.
Optimally reconciles the rows with the newly resolved date or timestamp value.
Reconciles the PPI expression.
Benefit
> Users can define partitioning with moving dates and timestamps
easily instead of manually redefining the PPI expression using constants.
Date-based partitioning is a typical use of PPI. If a PPI is defined with a
moving current date or current timestamp, the partition that contains the most
recent data can be kept as small as possible for efficient access.
> Required by the Temporal semantics feature, which provides the ability to
define current and history partitions.
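As an illustrative sketch (table, column, and range choices are hypothetical), a partitioning expression can reference CURRENT_DATE and later be reconciled with ALTER TABLE ... TO CURRENT:

```sql
-- Partition on a moving current date so the most recent partition stays small.
CREATE TABLE orders
  ( order_id INTEGER NOT NULL
  , order_dt DATE NOT NULL
  )
PRIMARY INDEX (order_id)
PARTITION BY RANGE_N(
  order_dt BETWEEN CURRENT_DATE - INTERVAL '11' MONTH
           AND     CURRENT_DATE
           EACH    INTERVAL '1' MONTH,
  NO RANGE);

-- Later: reconcile the resolved CURRENT_DATE to a newer date,
-- moving rows between partitions only where necessary.
ALTER TABLE orders TO CURRENT;
```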
8
Time Series Expansion Support

Description
> New EXPAND ON clause added to SELECT to expand row with a
period column into multiple rows
EXPAND ON clause allowed in views and derived tables
> EXPAND ON syntax supports multiple ways to expand rows

Benefit
> Permits time-based analysis on period values
Allows business questions such as "Get the month-end average
inventory cost during the last quarter of the year 2006"
Allows OLAP analysis on period data
> Allows charting of period data in an Excel format
> Provides infrastructure for sequenced query semantics on
Temporal tables

9
Time series Expansion support

What will it do?
> Expand a time period column and produce value-equivalent rows,
one for each time granule in the period
Time granule is user specified
Permits a period representation of the row to be changed into an event
representation
> The following forms of expansion are provided:
Interval expansion
By user-specified intervals such as INTERVAL '1' MONTH
Anchor point expansion
By user-specified anchored points in a time line
Anchor period expansion
By user-specified anchored time durations in a time line
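A sketch of an interval expansion (table and column names are illustrative): each row whose period column spans several months is expanded into one row per month, which can then be grouped like ordinary event data:

```sql
-- Expand each inventory row into one row per month of its duration period.
SELECT item_id
     , cost
     , BEGIN(expd) AS month_start  -- start of each expanded time granule
FROM inventory
EXPAND ON duration AS expd BY INTERVAL '1' MONTH
FOR PERIOD(DATE '2006-10-01', DATE '2007-01-01');
```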
10
Geospatial Enhancements

Description
> Enhancements to the Teradata 13 Geospatial offering drastically
increasing performance, adding functionality and providing
integration points for partner tools
Benefits
> Increased performance by changing UDFs to Fast Path system
functions
> Replaces the Shape File Generator client tool (org2org) with a
stored procedure for tighter integration with the database and
tools such as ESRI ArcGIS
> Provides geodetic distance methods such as SphericalBufferMBR()
> WFS server provides better tool integration support for MapInfo
and ESRI products

11
ESRI ArcGIS Connecting to Teradata via Safe
Software FME
1. FME connection in
ArcView
2. Connect to Teradata
via TPT API
3. Select Teradata
tables for ArcView
analysis
12
Projection of Impact Zone
& Storm Path to Google Earth
Where do I deploy my cat (catastrophe) management team?
13
Algorithmic Compression
Description
> Provides users the option of defining compression/decompression
algorithms, implemented as UDFs, that are specified and applied to
data at the column level in a row. Initially, Teradata will provide
two compression/decompression algorithm pairs: one for UNICODE
columns and another for LATIN columns.
Benefit
> Data compression is the process by which data is encoded so that
it consumes less physical storage space. This capability reduces
both the overall storage capacity needs and the number of
physical disk I/Os required for a given operation. Additionally,
because less physical data is being operated on there is the
potential to improve query response time as well.
Considerations
> Compressed data must be decompressed when it is accessed. This
consumes extra CPU cycles, but in general the advantages of
compression outweigh the extra cost of decompression.
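For illustration, the Teradata-supplied UTF-8 transform function pair can be attached to a Unicode column with COMPRESS USING / DECOMPRESS USING (table and column names are hypothetical):

```sql
-- Algorithmic compression on a Unicode column using the supplied
-- TransUnicodeToUTF8 / TransUTF8ToUnicode function pair.
CREATE TABLE customer_notes
  ( note_id  INTEGER NOT NULL
  , note_txt VARCHAR(1000) CHARACTER SET UNICODE
        COMPRESS USING TD_SYSFNLIB.TransUnicodeToUTF8
        DECOMPRESS USING TD_SYSFNLIB.TransUTF8ToUnicode
  )
PRIMARY INDEX (note_id);
```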
14
Multi-Value Compression For Varchar Columns
Example Multi-Value Compression for Varchar Column:

CREATE TABLE Customer
(Customer_Account_Number INTEGER
,Customer_Name VARCHAR(150)
COMPRESS ('Rich','Todd')
,Customer_Address CHAR(200));

15
Block Level Compression
Description
> Feature provides the capability to perform compression on whole
data blocks at the file system level before the data blocks are
actually written to storage.
Benefit
> Block level compression yields benefit by reducing the actual
storage required for the data, especially cool/cold data, and
significantly reducing the I/O required to read the data.
Considerations
> There is a CPU cost to compress and decompress whole data
blocks, but this is generally considered a good trade since CPU
cost is decreasing while I/O cost remains high.

16
User-Defined SQL Operators
Description
> This feature allows users to define and encapsulate complex SQL
expressions in a User Defined Function (UDF) database object.
Benefits
> The SQL UDF feature allows users to define their own functions
written using SQL expressions. Previously, the desired SQL
expression had to be written into the query for each use, or
alternatively an external UDF could be written in another
programming language to provide the same capability.
> Additionally, SQL UDFs allow one to define functions available in
other databases and with alternative syntax (e.g. ANSI).
Considerations
> The Teradata SQL UDF feature is a subset of the SQL function
feature described in the ANSI SQL:2003 standard.
> Additionally, this feature does not introduce any changes to the
definition of the Dictionary Tables per se, but will add additional
rows into the DBC.TVM and DBC.UDFInfo tables to indicate the
presence of a SQL UDF.

17
SQL UDF - Example
The Months_Between Function:

CREATE FUNCTION Months_Between
(Date1 DATE, Date2 DATE)
RETURNS Interval Month (4)
LANGUAGE SQL
DETERMINISTIC
CONTAINS SQL
PARAMETER STYLE SQL
RETURN (CAST(Date1 AS DATE) - CAST(Date2 AS DATE)) MONTH(4);

SELECT MONTHS_BETWEEN ('2008-01-01', '2007-01-01');
MONTHS_BETWEEN ('2008-01-01', '2007-01-01')
---------------------------------------------------
12
Performance
19
Character-Based PPI (CPPI)
Description
> This feature leverages current Teradata Partitioned Primary Index
(PPI) technology and extends it to allow the use of character
data (CHAR, VARCHAR, GRAPHIC, VARGRAPHIC) as a table
partitioning mechanism.
Benefit
> Currently, only integer datatypes are allowed in a PPI
partitioning scheme, which facilitates superior query performance
via partition elimination. Extending this capability to
character-based datatypes allows more partitioning options and
in turn yields query performance advantages similar to what
current PPI technology provides today.
Considerations
> As with all Teradata indexes or partitioning database design
choices, the Optimizer will determine the appropriate index/PPI to
use that will provide the best-cost plan for executing the query.
No end-user query modification is required.
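As an illustrative sketch (table and range boundaries are hypothetical), a character-based RANGE_N partitioning expression might look like this:

```sql
-- Partition customers by leading letters of last name; queries that
-- constrain last_name can then eliminate whole partitions.
CREATE TABLE customer
  ( cust_id   INTEGER NOT NULL
  , last_name VARCHAR(30) NOT NULL
  )
PRIMARY INDEX (cust_id)
PARTITION BY RANGE_N(
  last_name BETWEEN 'A', 'G', 'N', 'T' AND 'ZZZZ',
  NO RANGE);
```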
20
Timestamp Partitioning
Description
> Provide the capability that allows users to explicitly specify a time
zone for PPI tables involving DateTime partitioning expressions in
order to make the expressions deterministic (e.g., not dependent
on the session time zone).
> Implement the enhancements that will extend the PPI partition
elimination capability to include timestamp data types in
partitioning expressions.
Benefit
> Ensuring that DateTime partitioning expressions are deterministic
eliminates the possibility of errors that may occur as a result
of incorrect dependence on session time zones.
> The extension of this capability to use timestamp data types as a
partitioning mechanism will allow for more partitioning options
and in-turn yield similar query performance advantage as the
current PPI technology gleans today.
Considerations
> Enhancements related to deterministic time zone handling will
also be applied to sparse join index search conditions.
21
Fastpath Functions
Description
> The Fastpath Function project combines the extensibility, short
development cycles, and ease-of-use aspects of UDFs with the
high performance and ease-of-use aspects of Teradata system
functions to yield an alternate development path by which
Teradata Engineering software developers may add new Teradata
system functions to the Teradata server.
Benefit
> The Fastpath Function project will allow Teradata to use a shorter
development cycle to fulfill many customer specific requests for
implementing new system functions that additionally perform in
the same manner as native Teradata system functions.
Considerations
> Source code and/or libraries used in the development of Teradata
system functions must be solely managed and maintained by
Teradata Engineering. End-users will not be able to develop
Fastpath system functions.

22
FastExport Without Spooling
Description
> Enhance the FastExport utility to provide an option that would
allow the utility to execute in a mode that eliminates the
requirement that the query data be spooled prior to the actual
export process.
Benefit
> The direct without spooling method provides a mechanism to
extract data from a Teradata table quickly and efficiently, with
the main benefits being a performance gain and minimal resource
utilization.
Considerations
> The direct without spooling method is not transparent to the
user and must be specified as a discrete option when executing
the FastExport utility. It is a user decision to choose between
using either the spool or no spool method.
Teradata Workload Management
24
TASM: Additional Workload Definitions
Description
> Feature increases the number of available TASM Workload
Definitions (WDs) to 250 (instead of 40).
Benefits
> Complex mixed workloads require the ability to have a finer
degree of granular control over the parts of the workload.
Increasing the number of WDs will allow customers to better
manage and report on resource usage of their system to meet
either subject area (e.g. by country, application or division)
resource distribution requirements, or category-of-work (e.g. high
vs. low priority) resource distribution requirements.
Considerations
> Administrators should be aware that when defining a large
number of workloads that run concurrently, it becomes difficult
to create significant differentiation among them as the
resource-division granularity itself gets very small.

25
TASM: Common Classifications
Description
> This feature makes Workload Definition classification criteria
available across Teradata Workload Management categories 1, 2,
and 3 (Filters, System Throttles, and Workload Definitions), and
additionally extends wildcard support to Filters and Throttles.
Benefit
> The implementation of Common Classifications addresses the
differences and delivers consistency between the TDWM
categories (Filters, System Throttles and Workload Definitions),
which improves the Teradata Workload Management user
interface and its subsequent usability.
Considerations
> Consideration should be given to re-evaluating the current
settings for the different categories insofar as common
classification extends the ability to manage a workload in an
easier and simpler fashion.

26
TASM: Common Classifications
Who Criteria
> Account String / Account Name
> Teradata Username / Teradata Profile
> Application Name
> Client Address or Client Name
> QueryBand
Where Criteria (Data Objects)
> Databases
> Tables / Views / Macros
> Stored Procedures
What Criteria
> Statement Type (SELECT, DDL, DML)
> Utility Type
> AMP Limits, Row Count, Final Row Count
> Estimated Processing (CPU time)
> Join Types
ALL or no joins
ALL or no product joins
ALL or no unconstrained product joins

27
TASM Utility Management
Description
> This feature augments the existing Teradata Utility Management
capability with controls similar to the workload management of
regular SQL requests, and provides automatic selection of the
number of sessions used by Teradata utilities.
Benefits
> Feature provides for more granular and centralized control of utility
execution and allows deployment to a much wider audience of users
and applications. Additionally, the use of Teradata utility sessions is
moved inside the database and is automated to eliminate the
detailed management of sessions in each job.
Considerations
> Consideration should be given to a reevaluation of current rule sets
and settings to maximize control of the workload and relative utility
execution.
> Throttling in TASM eliminates the need for TENACITY and SLEEP.
Execution of queued jobs becomes FIFO, and queued jobs start
immediately when resources become available rather than at the
end of the SLEEP interval.
28
TASM Utility Session Configuration Rules
For FastLoad, MultiLoad, and FastExport utilities, the DBS
default for number of AMP sessions is one per AMP.
On a large system with hundreds or thousands of AMPs, this
default becomes inappropriate.
Currently, a user can override this default by changing
individual load/export scripts, changing the MAXSESS parameter
in the configuration file, or specifying runtime parameters
(i.e., MAXSESS or -M).
These overriding methods are inconvenient.
This feature allows a DBA to define TDWM rules in one central
place that specify the number of AMP sessions to be used based
on a combination of the following criteria:
> Utility Name
> Who criteria (user, account, client address, query band, etc.)
> Data size
29
TASM Utility Session Configuration Rules
Session configuration rules are optional.
These rules are active when any category of TDWM is enabled.
In each session configuration rule, the DBA specifies the
criteria and the number of sessions to be used when these
criteria are met.
For example, for stand alone MultiLoad jobs submitted by user
Charucki, use 10 sessions.
Session configuration rules also support the Archive/Restore
utility.
The DBA can define similar rules to specify the number of
HUTPARSE sessions to be used for a specific set of criteria.
A new internal DBSControl field, DisableTDWMSessionRules, is
provided to disable user-defined session configuration rules
and default session rules while TDWM is enabled.
When this field is set, the client and DBS operate as in
Teradata 13.
Availability, Serviceability, DBA Tasks
Improvements
31
Fault Isolation
Description
> Remove cases where faults can cause restarts
> Specific cases
EVL fault isolation
Unprotected UDFs
Dictionary cache re-initialization
Benefits
> Identify and isolate the fault to only the query or session
> Issues in query calculation and qualification will be isolated
> Badly behaving UDFs will have less opportunity to affect the
system
> Faults in the dictionary cache will result in the dictionary cache
being flushed and reloaded rather than affecting the entire system
32
AMP Fault Isolation

Description
> This feature is intended to catch those AMP errors that currently
cause DBS restarts where the error can be dealt with by taking a
snapshot dump and aborting the transaction that caused the error
Benefit
> This feature can reduce the number of DBS restarts for customers,
thus improving overall system availability
What will it do?
> Current AMP Fault Isolation only avoids a full database restart for
errors when accessing spool tables
> The scope of fault isolation will be increased to cover ERRAMP* and
ERRFIL* errors on permanent tables as well as spool tables
> Retrofitted to current supported releases
33
Read From Fallback
Description
> In the event of encountering a data block read error, either
unreadable or corrupt data blocks, this feature will leverage the
pre-existing Fallback Table facility to transparently retrieve the
required data block from the fallback copy.
Benefit
> When fallback is available, this feature significantly improves fault
tolerance and system availability. It markedly improves the value
of having fallback and protects non-redundant (RAID 0 or JBOD)
storage media, such as SSDs, from data loss without a restart/failover.
Considerations
> Fallback does not need to be instantiated as a system-wide property;
because fallback is a table-level attribute, it can be applied
selectively to the largest/most critical customer tables.
> This facility does not in-and-of itself repair bad data blocks, but
allows them to be read from fallback until they can be repaired.


34
Read From Fallback - Particulars
Reading data blocks from the Fallback copy is transparent to both a
user and/or application. Manual intervention is not required
whatsoever.
Feature does not require any special or particular locking mechanism.
A manual process is still required to rebuild the table to repair
unreadable or corrupt data blocks.
Facility cannot recover from data block errors in the Cylinder Index,
NUSI Secondary Index or Permanent Journals.
Read errors are fallback-recoverable on Teradata Data Dictionary tables,
with the exception of unhashed system tables such as the WAL
log, Transient Journal, and Space Accounting tables.
The facility applies to SQL queries with data block read errors, SQL
INSERT-SELECT statements, and the Archive utility where the block read
error is on the source table only.
35
Transparent Cylinder Packing
Description
> A new file system background task pro-actively and transparently
monitors the utilization (high or low) of user data cylinders and
packs/unpacks those cylinders accordingly, with the goal of
returning them to a more efficiently utilized state.
Benefit
1. Cylinder packing results in cylinders having a higher data block
to cylinder index ratio, making Cylinder Read operations more
effective by reading fewer unoccupied sectors.
2. Higher cylinder utilization translates into data tables occupying
fewer cylinders, leaving more cylinders available for other purposes.
3. Diminishes the chance that a mini-cylpack operation will be
executed and lessens the need for administrators to perform
regularly scheduled PackDisk operations.
Considerations
> This feature has several customer-tunable parameters in
DBSControl that allow customers to manage and adjust the
level of impact of the Transparent Cylinder Packing operations.
36
Merge Data Blocks
During Full Table Modify Operations
Description
> During full table modification operations such as MultiLoad,
Insert-Select, and Update or Delete Where, combine adjacent
blocks when small blocks are present.
Benefit
> Small data blocks increase the I/Os necessary to read a table and
interfere with features such as compression and large cylinders.
> Reduces the instances of small data blocks by combining them
when doing work on those blocks or adjacent ones.
37
Archive DBQL Rule Table
Description
> Enhance the Teradata Archive utility to include two additional DBC
tables to the DBC database (Dictionary) backup/restore:
DBC.DBQLRuleTbl
DBC.DBQLRuleCountTbl
Benefit
> Inclusion of these DBC tables in the DBC Archive/Restore
process provides a mechanism by which they can be
archived/restored, and eliminates the cumbersome task of
redefining the appropriate DBQL rules after each Dictionary
initialization.
> Implementation of this feature avoids the possibility of table
synchronization issues and offers simplicity, convenience, and
integrity when conducting a DBC archive/restore.
Considerations
> DBC Archive will include these tables automatically in the
Dictionary Archive; no user intervention is required.

Be Aware
Especially if Considering Tech Refresh
39
Large Cylinder Support
Description
> This feature increases the data storage cylinder size, the basic
allocation unit for disk space in the Teradata file system. This also
includes an increase in the Cylinder Index size, allowing a
commensurate increase in the number of data blocks stored per cylinder.
Benefit
> Eliminates the inefficiency associated with managing a large
number of small cylinders on very large disk drives, allows larger
AMP sizes (~10 TB per AMP), permits the more efficient storage of
Large Objects and provides the foundation for block level
compression by allowing more small blocks on a cylinder.
Consideration
> This capability is available only from Teradata 13.10 onward and
requires a System Initialization (SysInit) to be performed so that
large cylinder support can be engaged. It is anticipated that this
activity would typically be performed during technology refresh
opportunities.
40
Packed Row format for 64-bit platforms
Description
> With the introduction of Teradata 13.10, data is now stored in
the database in byte-packed format, whereas previously data was
stored in byte-aligned format.
Benefits
> Translates directly into a 4-7% disk space savings, insofar as less
disk space is required to store byte-packed data than byte-aligned
data. Additionally, it enables data rows to be accessed using
fewer I/Os, potentially enhancing the performance of some
workloads.
Considerations
> This capability is available only from Teradata 13.10 onward and
requires a System Initialization (SysInit) to be performed so that
packed row format support can be engaged. It is anticipated that
this activity would typically be performed during technology
refresh opportunities.

41
Enhanced Teradata Hashing Algorithm
Description
> Enhance the Teradata Hashing Algorithm to reduce the effects of
irregularities in character data on hash results.
Benefit
> This enhancement is targeted to reduce the number of hash
collisions for character data stored as either Latin or Unicode,
notably strings that contain primarily numeric data. Reduction in
hash collisions reduces access time per AMP and produces a more
balanced row distribution which in-turn improves parallelism.
Reduced access time and increased parallelism translate directly
to better performance.
Considerations
> This capability is available only from Teradata 13.10 onward and
requires a System Initialization (SysInit) to be performed so that
the enhanced hashing algorithm can be engaged. It is anticipated
that this activity would typically be performed during technology
refresh opportunities.
42 >
Teradata Database 13.10

Quality/Support-ability
AMP fault isolation
Parser diagnostic information capture
Dictionary cache re-initialization
EVL fault isolation and unprotected UDFs

Performance
FastExport without spooling
Character-based PPI
Timestamp partition elimination
User Defined Ordered Analytics
Merge data blocks during full table modify operations
Statement independence
TVS Initial suggested temperature tables

Active Enable
Restart time reduction
Read from Fallback
TASM: Workload Designer
TASM: Utilities Management
TASM: Additional Workload Definitions

Ease of Use
Teradata 13.10 Teradata Express Edition
Domain Specific System Functions
Moving current date in PPI
Automatic cylinder packing

Enterprise Fit
Algorithmic Compression for Character Data
VLC for VARCHAR columns
Block level compression
Variable fetch size (JDBC)
User Defined SQL Operators
Temporal Processing
  Temporal table support
  Period data type enhancements
  Replication support
  Time series Expansion support
Archive DBQL rule table
Enhanced trusted session security
External Directory support enhancements
Geospatial enhancements
Statement Info Parcel Enhancements (JDBC)
Support for IPv6
Support unaligned row format for 64-bit platforms
Enhanced hashing algorithm
Large cylinder support
3/18/10
43 >
Teradata Developer Exchange
http://developer.teradata.com/
What is it?
> Portal for technical insights
Articles, blogs, podcasts
Forums, FAQs, how-tos, etc.
> Community of Teradata experts
Customers, Teradata R&D and PS
> Share software
Portlets, UDFs, SPs, scripts, etc.
Sample applications
Who can use it?
> Anyone (read only)
> Registered contributors
Blogs, code, ratings, articles, etc.
