
Teradata Interview questions - Part 6

What is a Sparse Index?


Answer:
A sparse join index is a type of join index that contains a WHERE clause, which reduces the
number of rows that would otherwise be included in the index. All types of join indexes,
including single-table, multitable, simple, or aggregate, can be sparse.
What is the difference between a Stored Procedure and a Macro?
Answer:
Stored procedure:
It does not return rows directly to the user.
It has to use cursors to fetch multiple rows.
It uses INOUT/OUT parameters to send values back to the user.
It contains comprehensive SPL (Stored Procedure Language).
It is stored in DATABASE or USER PERM space.
It provides both input and output capabilities.
Macro:
It returns a set of rows to the user.
It is stored in DBC PERM space.
It accepts only input values.
How do you execute a given SQL statement repeatedly in BTEQ?
Answer:
SELECT TOP 1 * FROM database.table1;
=n
Here "=n" reruns the previous SQL statement "n" times.
What are the types of Partitioned Primary Index (PPI) in Teradata?
Answer:
1. Partition by CASE
CREATE TABLE ORDER_Table
(
ORD_number integer NOT NULL,
customer_number integer NOT NULL,
order_date date,
order_total integer
)
PRIMARY INDEX (customer_number)
PARTITION BY CASE_N (
order_total < 10000,
order_total < 20000,
order_total < 30000,
NO CASE OR UNKNOWN);

2. Partition by Range - example using date range

CREATE TABLE ORDER_Table
(
ORD_number integer NOT NULL,
customer_number integer NOT NULL,
order_date date,
order_total integer
)
PRIMARY INDEX (customer_number)
PARTITION BY RANGE_N (
order_date BETWEEN DATE '2010-01-01' AND DATE '2010-12-01'
EACH INTERVAL '1' MONTH,
NO RANGE OR UNKNOWN);

P.S.: If we specify NO RANGE or NO CASE, all values that fall outside the defined ranges or
conditions are placed in a single partition.
If we specify UNKNOWN, all rows whose partitioning expression evaluates to NULL are placed in that partition.
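
As a rough illustration of how rows map to partition numbers under the two definitions above, here is a small Python sketch. It is not Teradata code: the function names case_n_partition and range_n_partition are made up for this example, and it assumes that the combined NO CASE OR UNKNOWN (or NO RANGE OR UNKNOWN) catch-all takes the next partition number after the listed conditions or ranges.

```python
from datetime import date

def case_n_partition(order_total):
    """Model CASE_N(order_total < 10000, < 20000, < 30000, NO CASE OR UNKNOWN)."""
    limits = [10000, 20000, 30000]
    catch_all = len(limits) + 1          # assumed shared NO CASE OR UNKNOWN partition
    if order_total is None:              # NULL makes every comparison unknown
        return catch_all
    for number, limit in enumerate(limits, start=1):
        if order_total < limit:          # the first true condition wins
            return number
    return catch_all                     # no condition was true

def range_n_partition(order_date):
    """Model RANGE_N(order_date BETWEEN 2010-01-01 AND 2010-12-01 EACH 1 month)."""
    start, end = date(2010, 1, 1), date(2010, 12, 1)
    ranges = (end.year - start.year) * 12 + (end.month - start.month) + 1
    catch_all = ranges + 1               # assumed shared NO RANGE OR UNKNOWN partition
    if order_date is None or not (start <= order_date <= end):
        return catch_all
    return (order_date.year - start.year) * 12 + (order_date.month - start.month) + 1

print([case_n_partition(t) for t in (500, 10000, 29999, 30000, None)])
# [1, 2, 3, 4, 4]
print([range_n_partition(d) for d in (date(2010, 3, 15), date(2010, 12, 1), date(2011, 1, 1), None)])
# [3, 12, 13, 13]
```

Note that an order_total of exactly 10000 lands in partition 2, because the first condition uses a strict less-than comparison.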
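Can we define a Partitioned Primary Index (PPI) on a Primary Index column in a table?
Explain why.
Answer:
Yes. The partitioning column may be the PI column itself or a different column, because the two serve
different purposes. The PI is hashed to locate the AMP (row hash), while the partitioning expression only
decides how rows are grouped within that AMP. If the partitioning columns are not all part of the PI,
the PI cannot be defined as unique.
In a PPI table, the rows stored on each AMP are grouped by partition number and ordered by row
hash within each partition (ROW KEY = partition number + row hash).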
What is a Cross Join?
Answer:
A cross join is an unconstrained join that is equivalent to a product join: every row of one table is
paired with every row of the other.
There is no ON clause in a CROSS JOIN; any restriction is written in the WHERE clause.
SELECT EMP.ename, DPT.Dname
FROM employee EMP
CROSS JOIN
Department DPT
WHERE
EMP.deptno = DPT.deptno;

What does the SLEEP command do in FastLoad?


Answer:
The SLEEP command specifies the number of minutes to wait before retrying to log on and
establish all sessions.
The SLEEP command can be used with all load utilities, not only FastLoad.

This situation can occur if all of the loader slots are in use or if the number of requested sessions is
not available. The default value is 6 minutes. If TENACITY is set to 2 hours and SLEEP to 10 minutes,
the utility retries the logon every 10 minutes for up to 2 hours.
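
The arithmetic behind that example can be sketched in a few lines of Python. This is only a back-of-the-envelope model, not utility code: the function name retry_schedule is hypothetical, and it assumes one retry at the end of each sleep interval until the tenacity window is used up.

```python
def retry_schedule(tenacity_hours, sleep_minutes):
    """Return the minutes (from the first failed logon) at which retries would happen."""
    window = tenacity_hours * 60
    return list(range(sleep_minutes, window + 1, sleep_minutes))

schedule = retry_schedule(tenacity_hours=2, sleep_minutes=10)
print(len(schedule))   # 12
print(schedule[:3])    # [10, 20, 30]
```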
What is TENACITY? What is its default value?
Answer:
TENACITY specifies the amount of time, in hours, to keep retrying to obtain a loader slot or to
establish all requested sessions. The default for FastLoad is no tenacity, meaning that it will
not retry at all. If several FastLoad jobs are executed at the same time, we recommend setting
TENACITY to 4, meaning that the system will keep trying to log on for the number of
sessions requested for up to four hours.
How do you skip the first record while using BTEQ IMPORT?
Answer:
By specifying SKIP=1 in the .IMPORT command, we can skip the first record. For example, for a comma-delimited file:
.IMPORT VARTEXT ',' FILE=<filename>, SKIP=1;
How many types of index are present in Teradata?
Answer:
There are 5 different types of index in Teradata:
1. Primary Index
a. Unique Primary Index
b. Non-Unique Primary Index
2. Secondary Index
a. Unique Secondary Index
b. Non-Unique Secondary Index
3. Partitioned Primary Index
a. Case partition (e.g. age, salary...)
b. Range partition (e.g. date)
4. Join Index
a. Single-table join index
b. Multiple-table join index
c. Sparse join index (constraint applied on the join index in a WHERE clause)
5. Hash Index

Sparse Index (Sparse Join Index)


It is a type of join index that contains a WHERE clause, which reduces the number of rows included
from the base tables.
All types of join indexes, including single-table, multitable, simple, or aggregate, can be sparse.

By default, a join index has a NUPI defined on its first column if none is specified in the CREATE
JOIN INDEX statement. We can explicitly name a column to be the NUPI of the join index.
Any combination of AND, OR, and IN (logical) conditions can be used in the WHERE clause of a
sparse index.
The following example shows a sparse join index.
CREATE JOIN INDEX DEV.testjoin AS
SELECT A.deptno, B.sal
FROM dept A
INNER JOIN employee B
ON A.deptno = B.dept
WHERE A.deptno = 20 -- sparse index
PRIMARY INDEX (sal);
In the DDL above for the join index,
- sal is explicitly specified as the NUPI
- a filter is applied on dept.deptno = 20, which reduces the number of rows (an ideal case for a
sparse index)

P.S.: A join index never allows a unique index to be created.

Performance tuning tips: LIKE clause


While tuning queries in Teradata, we take care of the major performance issues but often ignore small
cases that can still cause a query to perform badly.
One such case is the LIKE clause, which even people who are good at performance tuning tend to
miss, assuming that LIKE patterns do not harm performance. In reality, they can.
If LIKE is used in a WHERE clause, it is better to start the pattern with one or more leading characters, if
at all possible.
e.g. LIKE '%STRING%' will be processed differently from LIKE 'STRING%'.
If the pattern begins with leading characters, as in 'STRING%', the Optimizer may be able to make use
of an index to resolve the query, thereby increasing performance.
But if the pattern begins with a wildcard (say '%'), as in '%STRING%', the Optimizer will not be able to
use an index, and a full table scan (FTS) must be run, which reduces performance and takes more time.
Hence it is suggested to use '%STRING%' only if STRING can appear in the middle of the value, say in 'SUBSTRING'.
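
The reasoning above can be illustrated with a generic ordered-index model in Python. This is only a sketch of the general idea, not Teradata internals: a sorted list stands in for an ordered index, the sample values and the function names prefix_search and contains_search are invented for the example, and Teradata's own index structures and optimizer choices may differ.

```python
from bisect import bisect_left

# A sorted list stands in for an ordered index on the column (assumption).
names = sorted(["STRAND", "STRING", "STRINGENT", "SUBSTRING", "TEST"])

def prefix_search(sorted_values, prefix):
    """LIKE 'prefix%': matches form one contiguous slice of the sorted values."""
    low = bisect_left(sorted_values, prefix)
    high = bisect_left(sorted_values, prefix + "\uffff")  # just past the last match
    return sorted_values[low:high]

def contains_search(values, fragment):
    """LIKE '%fragment%': sort order does not help, so every value is inspected."""
    return [value for value in values if fragment in value]

print(prefix_search(names, "STRING"))    # ['STRING', 'STRINGENT']
print(contains_search(names, "STRING"))  # ['STRING', 'STRINGENT', 'SUBSTRING']
```

The prefix search touches only the matching slice, while the contains search has to look at all five values to find 'SUBSTRING'.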

Rank vs Row_Number in Teradata


Anyone working on Teradata would be aware of the two most commonly used OLAP functions in
Teradata, i.e.

1. Rank()
2. Row_Number()

In one of my posts I mentioned how to generate a sequence number (e.g. a surrogate key)
using Row_Number. Today let us find out the difference in usage between the rank and row number
functions using the following scenario on the table sales_daily.

SELECT
item_id,
sales_date,
Actual_sales,
RANK() OVER (ORDER BY Actual_sales DESC), -- RANK function
ROW_NUMBER() OVER (ORDER BY Actual_sales DESC) -- ROW_NUMBER function
FROM Sales_daily
WHERE
sales_date BETWEEN DATE '2010-01-01'
AND DATE '2010-03-01';

The result set is as follows:

Item_id  Sales_date  Actual_sales  Rank  Row_Number()
120      4/1/2010    550           1     1
120      4/2/2017    550           1     2
120      4/2/2020    450           3     3
120      4/2/2006    350           4     4
120      4/2/2027    350           4     5
120      4/1/2005    350           4     6
120      4/1/2003    250           7     7
120      4/2/2003    250           7     8
120      4/1/2025    200           9     9
120      4/1/2002    200           9     10
120      4/1/2021    150           11    11
120      4/2/2001    150           11    12
120      4/1/2001    150           11    13
120      4/1/1931    100           14    14

In the result set, we can observe the following:

1) Rows 1 and 2 have the same rank assigned since they have the same actual_sales value.

2) Rows 1 and 2 have different row numbers assigned even though they have the same actual_sales value.

3) Row 3 has rank 3 assigned, since the tied rows occupy ranking positions 1 and 2 and rank 2 is skipped.

Hence,
rows are always assigned an incremented sequence number with ROW_NUMBER.
Rows with the same value are always assigned the same number with the RANK function.
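
To make the difference concrete, the two numbering rules can be reproduced outside the database with a short Python sketch. It is only an illustration of the semantics, not Teradata code: the function name rank_and_row_number is hypothetical, and the input list simply repeats the Actual_sales values from the result set above.

```python
sales = [550, 550, 450, 350, 350, 350, 250, 250, 200, 200, 150, 150, 150, 100]

def rank_and_row_number(values):
    """Return (value, rank, row_number) tuples, ordered by value descending."""
    result = []
    for position, value in enumerate(sorted(values, reverse=True), start=1):
        if result and result[-1][0] == value:
            rank = result[-1][1]   # a tie reuses the rank of the previous row
        else:
            rank = position        # otherwise the rank jumps to the row position
        result.append((value, rank, position))
    return result

for value, rank, row_number in rank_and_row_number(sales)[:4]:
    print(value, rank, row_number)
# 550 1 1
# 550 1 2
# 450 3 3
# 350 4 4
```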