
Performance Tuning in Informatica using persistent cache


Author: Madhuri V
E-mail-ID: madhuri.veluri@wipro.com
Location: EC-2, Bangalore
Account:

[Intended Audience: This paper expects the readers to have a fair knowledge of working with Informatica]

Table of contents

1. Introduction
2. Lookup Transformation
3. Methods to Improve or Tune Lookup Transformation
4. Use of persistent cache in Lookup
5. DWH Scenario - Daily load
6. Conclusion
7. Acknowledgement
8. References

1. Introduction


Informatica is a powerful, high-performance ETL tool from Informatica Corporation, a leading provider of enterprise data integration software. To get the best performance out of it, we need to make maximum use of its features. The objective of tuning lookups is to optimize overall ETL performance. The performance of Informatica depends on the performance of several components: the database, the network, the system hosting Informatica, and the transformations, mappings, and sessions themselves. In practice, most performance issues arise in the Lookup transformation. In a DWH, it is common to look up the same dimension in every fact load. Using a persistent cache for that lookup removes much of the time otherwise spent rebuilding the cache for each fact load.

2. Lookup Transformation
A Lookup transformation looks up values in a relational table, view, or synonym, or in a flat file. The developer defines the lookup match criteria. A lookup can be connected or unconnected, and static or dynamic. Different caches can be used with a lookup: static, dynamic, persistent, and shared. Each cache type serves a different purpose. The Lookup transformation is passive. The lookup definition can be imported from either source or target tables. For example, suppose we want to retrieve all the sales of the product with ID 10, and the sales data resides in another table called "Sales". Instead of adding the Sales table as one more source, use a Lookup transformation to look up the data for the product with ID 10 in the Sales table. In the Lookup transformation, configure the following properties:


Lookup condition: Allows the Integration Service to compare the input column containing codes with the lookup table column.

Lookup SQL override: Ensures the Integration Service extracts only the lookup table data that relates to the input data.

Lookup cache: Allows the Integration Service to cache the lookup source and look up values in the cache instead of querying the lookup source for every row. When you configure a lookup cache, you can configure the following cache settings:

- Building caches: You can configure the session to build caches sequentially or concurrently. When you build sequential caches, the Integration Service creates caches as the source rows enter the Lookup transformation. When you configure the session to build concurrent caches, the Integration Service does not wait for the first row to enter the Lookup transformation before it creates caches. Instead, it builds multiple caches concurrently.
- Persistent cache: You can save the lookup cache files and reuse them the next time the Integration Service processes a Lookup transformation configured to use the cache.
- Re-cache from source: If the persistent cache is not synchronized with the lookup table, you can configure the Lookup transformation to rebuild the lookup cache.
- Static cache: You can configure a static, or read-only, cache for any lookup source. By default, the Integration Service creates a static cache. It caches the lookup file or table and looks up values in the cache for each row that comes into the transformation. When the lookup condition is true, the Integration Service returns a value from the lookup cache. The Integration Service does not update the cache while it processes the Lookup transformation.
- Dynamic cache: To cache a table, flat file, or source definition and update the cache, configure a Lookup transformation with a dynamic cache. The Integration Service dynamically inserts or updates data in the lookup cache and passes the data to the target. The dynamic cache is synchronized with the target.
- Shared cache: You can share the lookup cache between multiple transformations. You can share an unnamed cache between transformations in the same mapping. You can share a named cache between transformations in the same or different mappings.
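The difference between a static and a dynamic cache can be illustrated with a minimal Python sketch. This is only an analogy, not Informatica code: the dictionary `LOOKUP_SOURCE`, the customer codes, and the key-generation rule in `dynamic_lookup` are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical lookup source: customer code -> customer key.
LOOKUP_SOURCE: Dict[str, int] = {"C001": 1, "C002": 2}


def static_lookup(cache: Dict[str, int], code: str) -> Optional[int]:
    """Static (read-only) cache: return the match or None, never modify the cache."""
    return cache.get(code)


def dynamic_lookup(cache: Dict[str, int], code: str) -> int:
    """Dynamic cache: insert a new entry when the row is not found, then return it."""
    if code not in cache:
        # Assumed rule for this sketch: next key = highest existing key + 1.
        cache[code] = max(cache.values(), default=0) + 1
    return cache[code]


static_cache = dict(LOOKUP_SOURCE)   # built once from the lookup source
dynamic_cache = dict(LOOKUP_SOURCE)

static_result = static_lookup(static_cache, "C003")     # no match, cache unchanged
dynamic_result = dynamic_lookup(dynamic_cache, "C003")  # row inserted into the cache
```

After these calls the static cache still holds two entries, while the dynamic cache has grown by one and stays in step with the rows that would be written to the target.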

3. Methods to Improve or Tune Lookup Transformation


Performance tuning of Lookup transformation
Lookup transformations are used to look up a set of values in another table. Lookups slow down performance. Below are some of the key points that help improve the performance of lookups:

- Cache the lookup tables. Informatica can cache all the lookup and reference tables; this makes operations run very fast.
- Even after caching, performance can be further improved by minimizing the size of the lookup cache. Reduce the number of cached rows by using a SQL override with a restriction.
- In lookup tables, delete all unused columns and keep only the fields that are used in the mapping.
- If possible, replace lookups with a Joiner transformation or a single Source Qualifier. A Joiner transformation takes more time than a Source Qualifier transformation.
- If the Lookup transformation specifies several conditions, place the conditions that use the equality operator '=' first in the Conditions tab.
- The SQL override query of the lookup table contains an ORDER BY clause. Remove it if it is not needed, or put fewer column names in the ORDER BY list.
- Replace the lookup with DECODE or IIF (for small sets of values).
- Do not use caching in the following cases:
  - The source is small and the lookup table is large.
  - The lookup is done on the primary key of the lookup table.
- Always cache the lookup table columns in the following case:
  - The lookup table is small and the source is large.
- If the lookup data is static, use a persistent cache. Persistent caches save and reuse cache files. If several sessions in the same job use the same lookup table, a persistent cache lets the sessions reuse the cache files. For static lookups, the memory cache is built from the cache files instead of from the database, which improves performance.
- If the source is huge and the lookup table is also huge, use a persistent cache as well.
- If the target table is the lookup table, use a dynamic cache. The Informatica server updates the lookup cache as it passes rows to the target.
- If there are several lookups with the same data set, share the caches.
- If only 1 row is going to be returned, use an unconnected lookup.
- All data are read into the cache in the order the fields are listed in the lookup ports. If there is an index that is even partially in this order, loading of these lookups can be sped up.
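The effect of restricting rows and dropping unused columns can be sketched with Python's built-in `sqlite3` module. This is an illustration under assumptions, not an Informatica SQL override: the table `dim_customer`, its columns, and the `active_flag` filter are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_customer ("
    "customer_key INTEGER, customer_code TEXT, name TEXT, "
    "address TEXT, active_flag TEXT)"
)
conn.executemany(
    "INSERT INTO dim_customer VALUES (?, ?, ?, ?, ?)",
    [
        (1, "C001", "Asha", "Pune", "Y"),
        (2, "C002", "Ravi", "Delhi", "N"),
        (3, "C003", "Mina", "Chennai", "Y"),
    ],
)

# Unrestricted query: every column and every row, sorted on all columns.
full_rows = conn.execute(
    "SELECT * FROM dim_customer ORDER BY 1, 2, 3, 4, 5"
).fetchall()

# Restricted query: only the columns the mapping needs, only rows that
# can match, and no sort on columns that are not used.
restricted_rows = conn.execute(
    "SELECT customer_code, customer_key FROM dim_customer "
    "WHERE active_flag = 'Y'"
).fetchall()

lookup_cache = dict(restricted_rows)             # code -> key
full_cells = sum(len(row) for row in full_rows)  # values held without tuning
cached_cells = sum(len(row) for row in restricted_rows)
```

Comparing `full_cells` with `cached_cells` gives a rough feel for how much smaller the cache becomes once the restriction and the column pruning are applied.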

4. Using a Persistent Lookup Cache


You can configure a Lookup transformation to use a non-persistent or persistent cache. The Integration Service saves or deletes lookup cache files after a successful session based on the Lookup Cache Persistent property. If the lookup table does not change between sessions, you can configure the Lookup transformation to use a persistent lookup cache. The Integration Service saves and reuses cache files from session to session, eliminating the time required to read the lookup table.

Using a Persistent Cache


If you want to save and reuse the cache files, you can configure the transformation to use a persistent cache. Use a persistent cache when you know the lookup table does not change between session runs. The first time the Integration Service runs a session using a persistent lookup cache, it saves the cache files to disk instead of deleting them. The next time the Integration Service runs the session, it builds the memory cache from the cache files. If the lookup table changes occasionally, you can override session properties to re-cache the lookup from the database. When you use a persistent lookup cache, you can specify a name for the cache files. When you specify a named cache, you can share the lookup cache across sessions.
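The save-and-reuse behaviour described above can be mimicked in a few lines of Python. This is a sketch of the idea only: the function `load_lookup_cache`, the `recache` flag, the pickle file format, and the file name `lkp_demo.cache` are hypothetical and do not reflect how the Integration Service stores its cache files.

```python
import os
import pickle
import tempfile
from typing import Callable, Dict


def load_lookup_cache(
    cache_path: str,
    read_source: Callable[[], Dict[str, int]],
    recache: bool = False,
) -> Dict[str, int]:
    """Return the lookup cache, reading the source only when needed."""
    if not recache and os.path.exists(cache_path):
        # Later runs: build the memory cache from the saved file.
        with open(cache_path, "rb") as fh:
            return pickle.load(fh)
    # First run, or a forced re-cache: read the lookup source.
    cache = read_source()
    # Keep the file on disk instead of deleting it after the run.
    with open(cache_path, "wb") as fh:
        pickle.dump(cache, fh)
    return cache


source_reads = {"count": 0}


def read_dimension() -> Dict[str, int]:
    """Stand-in for the expensive read of the lookup table."""
    source_reads["count"] += 1
    return {"C001": 1, "C002": 2}


cache_dir = tempfile.mkdtemp()
path = os.path.join(cache_dir, "lkp_demo.cache")

first = load_lookup_cache(path, read_dimension)                # reads the source
second = load_lookup_cache(path, read_dimension)               # reads the file
third = load_lookup_cache(path, read_dimension, recache=True)  # reads the source again
```

Only the first and third calls touch the lookup source; the second call is served entirely from the saved file, which is where the time saving comes from.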

Rebuilding the Lookup Cache


You can instruct the Integration Service to rebuild the lookup cache if you think that the lookup source changed since the last time the Integration Service built the persistent cache. When you rebuild a cache, the Integration Service creates new cache files, overwriting existing persistent cache files. The Integration Service writes a message to the session log when it rebuilds the cache. You can rebuild the cache when the mapping contains one Lookup transformation or when the mapping contains Lookup transformations in multiple target load order groups that share a cache. You do not need to rebuild the cache when a dynamic lookup shares the cache with a static lookup in the same mapping. If the Integration Service cannot reuse the cache, it either re-caches the lookup from the database, or it fails the session, depending on the mapping and session properties. The table below summarizes how the Integration Service handles persistent caching for named and unnamed caches:

Table: Integration Service Handling of Persistent Caches

Mapping or Session Changes Between Sessions | Named Cache | Unnamed Cache
Integration Service cannot locate cache files. | Rebuilds cache. | Rebuilds cache.
Enable or disable the Enable High Precision option in session properties. | Fails session. | Rebuilds cache.
Edit the transformation in the Mapping Designer, Mapplet Designer, or Reusable Transformation Developer.* | Fails session. | Rebuilds cache.
Edit the mapping (excluding Lookup transformation). | Reuses cache. | Rebuilds cache.
Change database connection or the file location used to access the lookup table. | Fails session. | Rebuilds cache.
Change the Integration Service data movement mode. | Fails session. | Rebuilds cache.
Change the sort order in Unicode mode. | Fails session. | Rebuilds cache.
Change the Integration Service code page to a compatible code page. | Reuses cache. | Reuses cache.
Change the Integration Service code page to an incompatible code page. | Fails session. | Rebuilds cache.

*Editing properties such as transformation description or port description does not affect persistent cache handling.
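For quick reference, the rules in the table above can be written as a small Python lookup. This is only a restatement of the table: the change labels used as dictionary keys and the function `persistent_cache_outcome` are hypothetical names chosen for this sketch.

```python
from enum import Enum


class Outcome(Enum):
    REUSE = "reuses cache"
    REBUILD = "rebuilds cache"
    FAIL = "fails session"


# Keys are hypothetical labels for the rows of the table; values are the
# outcomes listed for a named cache.
NAMED_CACHE = {
    "cache_files_missing": Outcome.REBUILD,
    "high_precision_toggled": Outcome.FAIL,
    "lookup_transformation_edited": Outcome.FAIL,
    "mapping_edited_elsewhere": Outcome.REUSE,
    "connection_or_file_location_changed": Outcome.FAIL,
    "data_movement_mode_changed": Outcome.FAIL,
    "unicode_sort_order_changed": Outcome.FAIL,
    "code_page_changed_compatible": Outcome.REUSE,
    "code_page_changed_incompatible": Outcome.FAIL,
}


def persistent_cache_outcome(change: str, named: bool) -> Outcome:
    """Return how a persistent cache is handled after the given change."""
    if change not in NAMED_CACHE:
        raise KeyError(f"unknown change: {change}")
    if named:
        return NAMED_CACHE[change]
    # Unnamed caches are rebuilt in every listed case except a change
    # to a compatible code page, where the cache is reused.
    if change == "code_page_changed_compatible":
        return Outcome.REUSE
    return Outcome.REBUILD


named_result = persistent_cache_outcome("data_movement_mode_changed", named=True)
unnamed_result = persistent_cache_outcome("data_movement_mode_changed", named=False)
```

The sketch makes the practical risk visible: a change that merely rebuilds an unnamed cache fails the session when the cache is named, so named caches need more care when mappings or session properties are edited.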

5. DWH Scenario - Daily load


Consider a daily (incremental) load of a DWH, which is very common in any project. Where multiple facts are built on one dimension, every fact load may need a lookup on that dimension. In such a case, rather than building the cache each time, we can use one flow to build the cache once and then reuse it across all the fact loads through a persistent cache.


In this mapping there is a dummy source and a dummy target. The flow is mainly used to create a lookup cache.


The property Lookup cache persistent is enabled here, i.e. once the cache is built with the name (lkp_Dim_Order_Header) specified in the property Cache File Name Prefix, it can be used across any other flow. The property Re-cache from lookup source is also enabled here, which means that the cache is rebuilt every time this session runs. The cache built above can then be used in another session, as described below.

As the property Lookup cache persistent is enabled, the cache with the name lkp_Dim_Order_Header, which was already built in the previous flow, will be used.
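The daily-load pattern can be summarized in a short Python sketch: one session rebuilds the named cache, and every fact load afterwards reuses it. Only the cache name lkp_Dim_Order_Header comes from the scenario; the functions, the in-memory `named_caches` dictionary standing in for cache files, and the sample order rows are hypothetical.

```python
from typing import Dict, List, Tuple

source_reads = {"count": 0}
named_caches: Dict[str, Dict[str, int]] = {}  # stands in for cache files on disk


def read_dim_order_header() -> Dict[str, int]:
    """Stand-in for reading the dimension table: order number -> order key."""
    source_reads["count"] += 1
    return {"ORD-1": 101, "ORD-2": 102}


def get_cache(prefix: str, recache: bool) -> Dict[str, int]:
    """Return the named cache, rebuilding it only when asked or when missing."""
    if recache or prefix not in named_caches:
        named_caches[prefix] = read_dim_order_header()
    return named_caches[prefix]


def build_cache_session() -> None:
    """Dummy source to dummy target: exists only to rebuild the named cache."""
    get_cache("lkp_Dim_Order_Header", recache=True)


def fact_load_session(rows: List[Tuple[str, float]]) -> List[Tuple[int, float]]:
    """Fact load: resolve order keys from the already built named cache."""
    cache = get_cache("lkp_Dim_Order_Header", recache=False)
    return [(cache[order_no], amount) for order_no, amount in rows]


build_cache_session()                                # runs once per daily load
fact_sales = fact_load_session([("ORD-1", 250.0)])   # reuses the cache
fact_returns = fact_load_session([("ORD-2", 40.0)])  # reuses the cache
```

However many fact loads follow, the dimension is read from the database only once per daily run, in the cache-building session.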

6. Conclusion
Using these performance tuning tips, the high performance capability of Informatica can be put to use to meet ever-increasing user requirements and exploding data volumes.


7. Acknowledgement
I am thankful to Rahul Deshpande, who made time in his busy schedule and helped me publish this technical paper successfully.

8. References
1. Informatica Workflow Administration help manual.
2. Informatica portal.

Glossary:
ETL: Extract Transform Load
DWH: Data Warehouse
