Performance Tuning in Informatica

Performance Tuning for
Informatica PowerCenter
What Affects Performance?

These can make a substantial performance difference:
Database Tuning Sources, Targets, Repositories, Query

Tuning, Statistics Collection, Indexes, other techniques
Network - Capacity, Utilization

Disk Speed, Space, Tuning, Storage Type, Mount Type
Server Database, Informatica RAM, CPU,
Configuration
PowerCenter add on features Grid, Partitioning
Informatica Platform Architecture

Informatica
Analyst
Analyst Service
Model
Repository
Service
Informatica
Developer
Reposito
ry
Data Integration Service
Informatica
Administrator
ISP
Profile
Service
Mapping
Service
Staging Data
SQL
Service
Integration Service
Data Object Cache

Profile
Warehouse
Workflow
Manager
Mapping
Designer
Repository
Service
Reposito
ry
Metadata
Manager
Metadata Manager Service
MM
Warehouse
Comparison Across Tools

Most
techniques are the same except:
PowerCenter
Intended for larger data volumes
Use the Source Qualifier JOIN, SORT
and FILTER features instead of
additional JOINER, SORTER or
FILTER transformations in the
mapping (Why: Includes the join, sort
and filter conditions in the WHERE
clause in the Source Qualified
database query)
Cache Lookup transformations (Why:

Prepare for managing larger amounts
of data from Source)
Memory Settings in Session
Data Services
Intended for small data volumes
Use additional JOINER, SORTER or
FILTER transformations in the
mapping instead of placing in the
READ transformation (Why: Uses
native Pushdown Optimization and
avoids blocking pushdown on
subsequent transformations)
Do not cache Lookup Transformations

(Why: minimize I/O to a few rows of
required data from Source based on
request. Select only what is used for
those rows.)
Memory settings in DIS
Key Techniques
Adjusting Memory for DTM Buffer (PC and IDS)

Prevent Blockage of Pushdown (IDS, and PC if you have
PDO)
Lookup Caching (PC)

Cache and Optimization (IDS)
Avoid Unnecessary I/O
Avoid Duplicate Work
Anatomy of a Session
Integration Service or
Data Integration Service
Data Transformation Manager
(DTM)
DTM Buffer
Source
data READER
WRITER
TRANSFORMER
Target
data
Transformation
caches
PC Session Memory Settings
Generally set the

DTM Buffer Size
to Auto
For large jobs,

set as large as
needed, defining
it using all
numeric values
(2000000000 for
2 GB)
Set to fit 100
rows or more
Maximum Memory Settings
System uses the smaller of the two

Applies to DTM Buffer Size, Buffer Block Size, and
transformation caches set to Auto
Small changes can have a huge impact on performance, as it
provides more DTM Buffer space
Throughput and Busy Percentage in Log

View Throughput values in PowerCenter Workflow Monitor
Values less than 5000 rows/sec are slow and need tuning,
unless the table size is small too
Review log file Busy Percentage to evaluate

performance/bottleneck status
Source, target and transformation Busy Percentage
Even distribution all 100% -- is ideal
One section busy and others not busy indicates a

bottleneck
Other memory efficiency messages in the log review and

modify memory accordingly
Lookups
Minimize Lookup calls with Unconnected Lookups when a conditional lookup is required
Eliminate Records from the Lookup

Leverage the Lookup source filter property
SQL Override to include only rows in the Source for
insert/update check on target or referential integrity checks
For large Lookups, consider joining the rows instead

Uncached lookups require queries of the source for each
lookup row
Cached lookups query the source once for a complete set

of data
Lookup Cache
Reuse Cache within a Mapping
Structure the lookups identically, same ports, same lookup

condition, and it will be reused automatically
Include additional ports needed by all lookups to the table,

even if not used in all to enable reuse
Reuse Cache Between Mappings Named Cache
Enable concurrent caches
Build cache first time and use across several mappings

In the first session, indicate to rebuild the cache if it exists
Add pipelines to build cache at the start of the mapping for

all lookups concurrently, instead of default sequentially when
lookup process the first row
Increase the index and data cache sizes - hold all data in
memory without paging to disk
Determine Potential Cache Size

Source
Lookup
Lookup
Table
Items
Manufacturer
DIM_Items
Rows
5000
200
100000
Example 1 Lookup
Manufacturer
Cached
Build Cache
Example 2 Lookup
Dim_Items
Uncached
Cached
200
Uncached
100000
Read Source Records 5000
5000
5000
5000
Execute Lookup
5000
5000
Total Disk Reads
5200
10000
105000
10000
In this example, for a small lookup table, caching is best, for a

large one, uncached is best
Reduce I/O and Reduce Number of Rows
Reduce amount of data moving across the network - Filter

data in the database
Reduce the amount of data selected from the source Reduce the number of ports selected in sources, lookups
Select Distinct in Source Qualifier Filter rows

Reduce writes to the log file
Set Trace Level lower (e.g. to WARN)
Fix Transformation Errors
Fix Predicate syntax to avoid messages (view in IDS logs)
Single Pass Reads
Read source once and write to multiple targets
Reduce I/O and Reduce Number of Rows
Remove unnecessary ports and links to unnecessary

ports
Reduce the number of transformations - keep the Source

and Target anchors
Tune Database Queries and SQL statements see log file

for SQL
Reduce Rows in Pipeline

Move transformations that decrease rows upstream in
mapping (Filter, Joiner, Aggregator, Rank)
Move transformations that increase rows downstream

(Normalizer, Joiner, SQL, Java, Lookup with Return All Rows
selected)
Avoid Duplicate Work
Place common logic upstream

Perform functions needed in each data stream prior to the
split to reduce duplicating logic or function calls
Calculate once, use many times
Avoid calculating or testing the same value over and over

Calculate it once in an Expression transformation, and set a
true/false flag
Within an Expression, use variables to calculate a value

used several times
Additional Informatica Resources

More resources available at:
http://mysupport.informatica.com
http://velocity.informatica.com/
http://marketplace.informatica.com
PowerCenter and Data Services Help
Product manuals
PowerCenter
and
Data
Performance Tuning Guides
Services
PowerCenter and Informatica Developer

Transformation Guides
PowerCenter and Informatica Administrator

Guides
Developer Community
Informatica Professional Services
Additional Performance
Improvement
Tips/Techniques
Efficient Data Formats
Flat File vs. Database Table - Flat file located on the server
loads faster than database table located on the server
Fixed-width files are faster than delimited - Delimited files

require extra parsing
UNICODE vs. ASCII data movement mode - Unicode data

requires 2 bytes per character; ASCII requires 1 byte per
character
Disable High Precision - Applies to decimal data type.

Use native drivers instead of ODBC drivers
Transformation Tips
Avoid Stored Procedures, External Procedures, and

Advanced External Procedures - Slow and has overhead
Use sorted input for Aggregators, Union and Joiners

Connect only the ports that will be used downstream Lookup and other transformations
Use the where clause in the Lookup Filter instead of a

Lookup SQL Override
Sort in Source Qualifier (PC)

Join in Source Qualifier (PC)
Pushdown Sort (IDS) with Sorter transformation
Pushdown Join (IDS) with Joiner transformation
Mapping Tips
Simplify nested functions
instead of:
IIF(condition1,result1,IIF(condition2, result2,IIF ))))))))))))
try:
DECODE (TRUE, condition1, result1, , conditionn, resultn)

DECODE processing stops as soon as a condition is matched
Entire IIF statement evaluated even if earlier condition succeeds
Functions are more expensive than operators

Use || instead of CONCAT()
Mapping Tips
Avoid Data Type Conversions

Avoid implicit conversion, especially repeated ones
Avoid repeated conversions by explicitly converting and managing
the data type as it moves through the mapping
Trim spaces before comparing CHAR and VARCHAR

ports
Use Integer values for Filter and Lookup conditions, even
if datatype conversion is required, instead of Char or Data

Minimize the number of targets per mapping
More targets = poorer performance
Reduce the use of function calls

SUM(Col A) + SUM(Col B) Can be SUM(Col A + Col B)
Mapping Tips
Use Integers instead of Strings or Dates and Avoid
Complex Expressions
Comparison in Lookups, Filters and Routers

Group By in Aggregator
XML Reader / Mid-Stream XML Parser

Remove unprojected groups
Dont leave port size lengths as infinite.
Minimize use of Update Strategy transformations
Mapping Tips
Avoid Unnecessary Sorting on already sorted input

Replace multiple filter transformations with a router
transformation - Reduces the number of transformations
in the mapping
Use Filters or Routers to drop rejected rows from an

Update Strategy transformation
Select the appropriate Master Source in Joiner

Source with fewer rows (unsorted joiner)
Source with fewer duplicate keys (sorted joiner)
Normal Joins are faster

Join in database
Mapping Tips
Reusable Sequence Generator - Number of Cached
Values Property
Setting too low causes frequent repository access,
which impacts performance set to 1000 or similar
Unused values in a block are lost leading to gaps in the
sequence
Change unused lookup ports to be neither input nor
output reduces generated SQL results

Use a SQL override with an ORDER BY clause to override
the default ORDER BY for lookups with many output ports
Mapping Tips
Minimize unconnected lookup execution in IIF expression

- Set up the condition so unconnected lookup is in the
THEN clause
Use "Return any matching value" in Lookup whenever

possible - Creates index based on key ports rather than all
lookup ports
Place Lookup Conditions in the following order:

Equal to (=) (numeric ports first, then string or date)
Less than (<), greater than (>), less than or equal to (<=),
greater than or equal to (>=)
Not equal to (!=)
Session/Runtime Settings
Increase the Target Commit Interval - Reduce database

overhead
Use pre-post session commands to drop and rebuild

indexes and constraints - Improves load time when loading
many rows, balance with index/constraint rebuild time

Performance Tuning in Informatica

Enviado por

Dados do documento

Título original

Direitos autorais

Formatos disponíveis

Compartilhar este documento

Compartilhar ou incorporar documento

Opções de compartilhamento

Você considera este documento útil?

Este conteúdo é inapropriado?

Direitos autorais:

Formatos disponíveis

Performance Tuning in Informatica

Enviado por

Direitos autorais:

Formatos disponíveis

Performance Tuning for

What Affects Performance?

Database Tuning Sources, Targets, Repositories, Query

Network - Capacity, Utilization

Informatica Platform Architecture

Data Object Cache

Metadata Manager Service

Comparison Across Tools

techniques are the same except:

Cache Lookup transformations (Why:

Memory Settings in Session

Do not cache Lookup Transformations

Memory settings in DIS

Adjusting Memory for DTM Buffer (PC and IDS)

Lookup Caching (PC)

PC Session Memory Settings

Generally set the

For large jobs,

Set to fit 100

Maximum Memory Settings

System uses the smaller of the two

Throughput and Busy Percentage in Log

Review log file Busy Percentage to evaluate

One section busy and others not busy indicates a

Other memory efficiency messages in the log review and

Eliminate Records from the Lookup

For large Lookups, consider joining the rows instead

Cached lookups query the source once for a complete set

Reuse Cache within a Mapping

Structure the lookups identically, same ports, same lookup

Include additional ports needed by all lookups to the table,

Reuse Cache Between Mappings Named Cache

Enable concurrent caches

Build cache first time and use across several mappings

Add pipelines to build cache at the start of the mapping for

Determine Potential Cache Size

Read Source Records 5000

Total Disk Reads

In this example, for a small lookup table, caching is best, for a

Reduce I/O and Reduce Number of Rows

Reduce amount of data moving across the network - Filter

Select Distinct in Source Qualifier Filter rows

Single Pass Reads

Read source once and write to multiple targets

Reduce I/O and Reduce Number of Rows

Remove unnecessary ports and links to unnecessary

Reduce the number of transformations - keep the Source

Tune Database Queries and SQL statements see log file

Reduce Rows in Pipeline

Move transformations that increase rows downstream

Avoid Duplicate Work

Place common logic upstream

Calculate once, use many times

Avoid calculating or testing the same value over and over

Within an Expression, use variables to calculate a value

Additional Informatica Resources

PowerCenter and Informatica Developer

PowerCenter and Informatica Administrator

Efficient Data Formats

Fixed-width files are faster than delimited - Delimited files

UNICODE vs. ASCII data movement mode - Unicode data

Disable High Precision - Applies to decimal data type.

Avoid Stored Procedures, External Procedures, and