Course Objectives
By the end of this course you will:
Understand how to use the major PowerCenter components for development
Be able to build basic ETL mappings and mapplets*
Be able to create, run and monitor workflows
Understand available options for loading target data
Be able to troubleshoot most problems
Note: The course does not cover PowerCenter optional features or XML support.
* A mapplet is a subset of a mapping
ETL
[Figure: Extract, Transform and Load flow into a Decision Support Data Warehouse]
Extract (source data): transaction level data; optimized for transaction response time; current; normalized or de-normalized data
Transform: aggregate data; cleanse data; consolidate data; apply business rules; de-normalize data
PowerCenter 7 Architecture
[Figure: architecture diagram. Components: Sources, Informatica Server, Targets, Repository. Client tools: Repository Manager, Designer, Workflow Manager, Workflow Monitor, Rep Server Administration Console. Connection labels: Native, TCP/IP.]
Not Shown: Client ODBC connections from Designer to sources and targets for metadata
Import from: relational database, flat file or XML object; or create manually
[Figure: diagram labels "Repository Server" (TCP/IP), "Repository Agent" (Native), "Repository", "DEF", "View", "Synonym"]
Transformation Views
A transformation has three views:
Iconized: shows the transformation in relation to the rest of the mapping
Normal: shows the flow of data through the transformation
Edit: shows transformation ports (= table columns) and properties; allows editing
Expression Transformation
Perform calculations using non-aggregate functions (row level)
Ports: mixed; variables allowed
Create expression in an output or variable port
Usage: perform the majority of data manipulation
Expression Editor
An expression formula is a calculation or conditional statement for a specific port in a transformation
Performs calculations based on ports, functions, operators, variables, constants and return values from other transformations
Expression Validation
The Validate or OK button in the Expression Editor will:
Parse the current expression
Remote port searching (resolves references to ports in other transformations)
Parse default values
Check spelling, correct number of arguments in functions, other syntactical errors
Variable Ports
Use in another variable port or an output port expression
Local to the transformation (a variable port cannot also be an input or output port)
Use for temporary storage
Variable ports can remember values across rows; useful for comparing values
Variables are initialized (numeric to 0, string to "") when the Mapping logic is processed
Variable ports are not visible in Normal view, only in Edit view
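The way a variable port remembers a value across rows can be pictured with a short Python sketch. This is an analogy, not PowerCenter code: the function name, the `is_new_group` output and the row layout are invented for illustration, and the variable is assumed to start as an empty string, as described above for string variables.

```python
from typing import Dict, Iterable, Iterator


def flag_changed_keys(rows: Iterable[Dict[str, str]], key: str) -> Iterator[Dict[str, object]]:
    """Mimic a variable port that remembers the previous row's key value."""
    v_prev_key = ""  # assumption: string variables start as the empty string
    for row in rows:
        # "Output port": compare the current value with the remembered one.
        is_new_group = row[key] != v_prev_key
        yield {**row, "is_new_group": is_new_group}
        # "Variable port": store the current value for the next row.
        v_prev_key = row[key]


if __name__ == "__main__":
    sample = [{"cust": "A"}, {"cust": "A"}, {"cust": "B"}]
    for out_row in flag_changed_keys(sample, "cust"):
        print(out_row)
```

The comparison only works as intended if the rows arrive in a meaningful order, which is why this pattern is usually paired with sorted input.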
Informatica Datatypes
[Figure: native datatypes and transformation datatypes; data passes Native -> Transformation -> Native]
Transformation datatypes allow mix and match of source and target database types
When connecting ports, native and transformation datatypes must be compatible (or must be explicitly converted)
For further information, see the PowerCenter Client Help > Index > port-to-port data conversion
Mappings
By the end of this section you will be familiar with:
Mapping Designer
[Figure: Mapping Designer window showing the Transformation Toolbar, the Mapping List and an iconized Mapping]
Usage
Modify SQL statement
User Defined Join
Source Filter
Sorted ports
Select DISTINCT
Pre/Post SQL
To use a semicolon outside of quotes or comments, escape it with a backslash (\)
Connection Validation
Examples of invalid connections in a Mapping:
Connecting ports with incompatible datatypes
Connecting output ports to a Source
Connecting a Source to anything but a Source Qualifier or Normalizer transformation
Mapping Validation
Mappings must:
Be valid for a Session to run
Be end-to-end complete and contain valid expressions
Pass all data flow rules
Mappings are always validated when saved; they can also be validated without being saved
Workflows
By the end of this section, you will be familiar with:
The Workflow Manager GUI
Creating and configuring Workflows
Workflow properties
Workflow components
Workflow tasks
[Figure: window layout showing the Workspace, Status Bar and Output Window]
Task Developer
Create Session, Shell Command and Email tasks
Tasks created in the Task Developer are reusable
Worklet Designer
Creates objects that represent a set of tasks
Worklet objects are reusable
Workflow Structure
A Workflow is a set of instructions for the Informatica Server to perform data transformation and load
Combines the logic of Session Tasks, other types of Tasks and Worklets
The simplest Workflow is composed of a Start Task, a Link and one other Task
[Figure: a Start Task connected by a Link to a Session Task]
Creating a Workflow
Select a Server
Workflow Properties
Customize Workflow Properties
Workflow log displays
Workflow Scheduler
Workflow Links
Required to connect Workflow Tasks
Can be used to create branches in a Workflow
All links are executed unless a link condition evaluates to false for that link
[Figure: a Workflow branching through Link 1, Link 2 and Link 3]
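The branching rule for links can be sketched in Python. This is a simplified model rather than Workflow Manager behaviour: the task names, the status strings and the idea of passing all task statuses in up front are assumptions made for the example. A link with no condition is always followed; a link with a condition is followed only when the condition is true.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (source task, target task, optional link condition)
Link = Tuple[str, str, Optional[Callable[[Dict[str, str]], bool]]]


def tasks_to_run(links: List[Link], status: Dict[str, str], start: str = "Start") -> List[str]:
    """Follow links from the start task, skipping links whose condition is false."""
    reached = [start]
    queue = [start]
    while queue:
        current = queue.pop(0)
        for source, target, condition in links:
            if source != current or target in reached:
                continue
            # An unconditional link always fires; a conditional one must be true.
            if condition is None or condition(status):
                reached.append(target)
                queue.append(target)
    return reached


if __name__ == "__main__":
    example_links: List[Link] = [
        ("Start", "s_load", None),
        ("s_load", "email_ok", lambda st: st["s_load"] == "SUCCEEDED"),
        ("s_load", "cmd_cleanup", lambda st: st["s_load"] == "FAILED"),
    ]
    print(tasks_to_run(example_links, {"s_load": "SUCCEEDED"}))
```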
Workflow Summary
1. Add Sessions and other Tasks to the Workflow
2.
3.
4.
Session Tasks
After this section, you will be familiar with:
Session Tasks can be created in the Task Developer (reusable) or the Workflow Designer (Workflow-specific)
To create a Session Task:
Select the Session button from the Task Toolbar
Or select menu Tasks | Create and select Session from the drop-down menu
Set properties
Monitoring Workflows
By the end of this section you will be familiar with:
The Workflow Monitor GUI
Monitoring views
Server monitoring modes
Workflow Monitor
The Workflow Monitor is the tool for monitoring Workflows and Tasks
Choose between two views:
Gantt chart
Task view
Displays real-time information from the Informatica Server and the Repository Server about current workflow runs
Monitoring Operations
Perform operations in the Workflow Monitor
Stop, Abort, or Restart a Task, Workflow or Worklet
Resume a suspended Workflow after a failed Task is corrected
Reschedule or Unschedule a Workflow
committing data during the timeout period, the threads and processes associated with the Session are killed
Monitoring filters can be set using drop-down menus; they reduce the number of items displayed in Task View
Right-click on a Session to retrieve the Session Log (from the Server to the local PC Client)
Filter Toolbar
Select type of tasks to filter
Select servers to filter
Filter tasks by specified criteria
Display recent runs
Repository Manager
The Repository Manager's Truncate Log option clears the Workflow Monitor logs
Filter Transformation
Drops rows conditionally
Sorter Transformation
Can sort data from relational tables or flat files
Sorts data from any source, at any point in a data flow
Ports: input/output
Define one or more sort keys
Define sort order for each key
Example of usage: sort data before an Aggregator to improve performance
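Sorting on several keys, each with its own sort order, can be illustrated with a small Python sketch. It only models the idea of ordered sort keys; the function name and the sample port names are made up for the example and do not correspond to Sorter properties.

```python
from typing import Dict, List, Tuple


def sort_rows(rows: List[Dict[str, object]], sort_keys: List[Tuple[str, bool]]) -> List[Dict[str, object]]:
    """Sort rows by several keys; each key is (port name, descending?)."""
    result = list(rows)
    # Python's sort is stable, so sorting by the least significant key first
    # and the most significant key last yields a multi-key ordering.
    for port, descending in reversed(sort_keys):
        result.sort(key=lambda r: r[port], reverse=descending)
    return result


if __name__ == "__main__":
    data = [
        {"region": "E", "amount": 5},
        {"region": "W", "amount": 7},
        {"region": "E", "amount": 9},
    ]
    # Region ascending, then amount descending within each region.
    print(sort_rows(data, [("region", False), ("amount", True)]))
```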
Sorter Properties
Aggregator Transformation
By the end of this section you will be familiar with:
Aggregator properties
Using sorted data
Aggregator Transformation
Performs aggregate calculations
Ports: mixed I/O ports allowed; variable ports allowed; Group By allowed
Create expressions in variable and output ports
Usage: standard aggregations
Aggregate Expressions
Aggregate functions are supported only in the Aggregator Transformation
Conditional Aggregate expressions are supported
Conditional SUM format: SUM(value, condition)
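The effect of the conditional form SUM(value, condition) can be shown with a Python analogy: only rows for which the condition holds contribute to the total. The function and the sample port names (`amount`, `state`) are illustrative assumptions, and NULL handling is not modelled.

```python
from typing import Callable, Dict, Iterable


def conditional_sum(rows: Iterable[Dict[str, object]], value_port: str,
                    condition: Callable[[Dict[str, object]], bool]) -> float:
    """Add up value_port for the rows where condition(row) is true."""
    return sum(row[value_port] for row in rows if condition(row))


if __name__ == "__main__":
    orders = [
        {"amount": 100, "state": "MA"},
        {"amount": 50, "state": "NY"},
        {"amount": 25, "state": "MA"},
    ]
    # Comparable in spirit to SUM(amount, state = 'MA')
    print(conditional_sum(orders, "amount", lambda r: r["state"] == "MA"))
```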
Aggregator Functions
AVG COUNT FIRST LAST MAX MEDIAN MIN PERCENTILE STDDEV SUM VARIANCE
Aggregator Properties
Sorted Input Property
Instructs the Aggregator to expect the data to be sorted
Set Aggregator cache sizes for the Informatica Server machine
Sorted Data
The Aggregator can handle sorted or unsorted data
Sorted data can be aggregated more efficiently, decreasing total processing time
The Server will cache data from each group and release the cached data upon reaching the first record of the next group
Data must be sorted according to the order of the Aggregator's Group By ports
Performance gain will depend upon varying factors
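Why sorted input helps can be seen in a short Python sketch that holds only one group at a time and emits its result as soon as the next group begins. It uses `itertools.groupby` as a stand-in for the caching behaviour described above; the function and port names are assumptions for illustration.

```python
from itertools import groupby
from operator import itemgetter
from typing import Dict, Iterable, Iterator, Tuple


def aggregate_sorted(rows: Iterable[Dict[str, object]], group_by: str,
                     value_port: str) -> Iterator[Tuple[object, float]]:
    """Sum value_port per group, assuming rows are already sorted by group_by."""
    for key, group in groupby(rows, key=itemgetter(group_by)):
        # Only the rows of the current group are consumed before emitting.
        yield key, sum(r[value_port] for r in group)


if __name__ == "__main__":
    sorted_rows = [
        {"region": "E", "amount": 1},
        {"region": "E", "amount": 2},
        {"region": "W", "amount": 5},
    ]
    print(list(aggregate_sorted(sorted_rows, "region", "amount")))
```

If the input is not sorted on the group key, this approach emits the same group more than once, which is the reason the sort order has to match the Group By ports.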
Joiner Transformation
By the end of this section you will be familiar with:
Joiner Transformation
Performs heterogeneous joins on different data flows
Active Transformation
Ports: all input or input/output; "M" denotes a port that comes from the master source
Examples:
Join two flat files
Join two tables from different databases
Join a flat file with a relational table
Joiner Conditions
Joiner Properties
Join types:
Normal (inner)
Master outer
Detail outer
Full outer
Set Joiner Caches
Joiner can accept sorted data (configure the join condition to use the sort origin ports)
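The four join types can be compared with a small Python sketch. It reflects the usual reading of the terms (a master outer join keeps all detail rows, a detail outer join keeps all master rows) and caches the master rows by key, which is an assumption about how the Joiner works rather than something stated above. All names are illustrative.

```python
from typing import Dict, List, Optional

Row = Dict[str, object]


def join_rows(master: List[Row], detail: List[Row], key: str,
              join_type: str = "normal") -> List[Dict[str, Optional[Row]]]:
    """Join detail rows against cached master rows on a single key."""
    cache: Dict[object, List[Row]] = {}
    for m in master:
        cache.setdefault(m[key], []).append(m)

    matched = set()
    joined: List[Dict[str, Optional[Row]]] = []
    for d in detail:
        matches = cache.get(d[key], [])
        for m in matches:
            joined.append({"master": m, "detail": d})
        if matches:
            matched.add(d[key])
        elif join_type in ("master outer", "full outer"):
            # Keep the unmatched detail row with no master data.
            joined.append({"master": None, "detail": d})

    if join_type in ("detail outer", "full outer"):
        for m in master:
            if m[key] not in matched:
                # Keep the unmatched master row with no detail data.
                joined.append({"master": m, "detail": None})
    return joined


if __name__ == "__main__":
    masters = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    details = [{"id": 2, "qty": 10}, {"id": 3, "qty": 20}]
    for kind in ("normal", "master outer", "detail outer", "full outer"):
        print(kind, len(join_rows(masters, details, "id", kind)))
```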
Lookup Transformation
By the end of this section you will be familiar with:
Lookup principles
Lookup properties
Lookup conditions
Lookup techniques
Caching considerations
Persistent caches
Lookup Transformation
Looks up values in a database table or flat file and provides data to other components in a mapping
Ports: mixed; "L" denotes a Lookup port; "R" denotes a port used as a return value (unconnected Lookup only; see later)
Specify the Lookup Condition
Usage: get related values; verify whether records exist or whether data has changed
Lookup Conditions
Lookup Properties
Lookup table name
Lookup condition
Policy on multiple match:
Use first value
Use last value
Report error
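The three multiple-match policies can be sketched in Python as follows. The function, the policy strings and the sample ports are hypothetical, and "first" and "last" here simply mean the order of the rows in the list, which may differ from how the Server orders lookup rows.

```python
from typing import Dict, List, Optional


def lookup_value(lookup_rows: List[Dict[str, object]], condition_port: str, value: object,
                 return_port: str, policy: str = "first") -> Optional[object]:
    """Return the looked-up value, applying a policy when several rows match."""
    matches = [r[return_port] for r in lookup_rows if r[condition_port] == value]
    if not matches:
        return None  # no match found
    if len(matches) > 1 and policy == "error":
        raise ValueError(f"multiple matches for {condition_port}={value!r}")
    return matches[-1] if policy == "last" else matches[0]


if __name__ == "__main__":
    prices = [
        {"item": "pen", "price": 1.0},
        {"item": "pen", "price": 1.5},
        {"item": "ink", "price": 4.0},
    ]
    print(lookup_value(prices, "item", "pen", "price", "first"))
    print(lookup_value(prices, "item", "pen", "price", "last"))
```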
Lookup Caching
Caching can significantly impact performance
Cached:
Lookup table data is cached locally on the Server
Mapping rows are looked up against the cache
Uncached:
Each Mapping row needs one SQL SELECT
Rule of thumb: cache if the number (and size) of records in the Lookup table is small relative to the number of mapping rows requiring the lookup
Persistent Caches
By default, Lookup caches are not persistent; when the session completes, the cache is erased
When a cache is persistent, the next time the Session runs, cached data is loaded fully or partially into RAM and reused
A named persistent cache may be shared by different sessions
Target Options
By the end of this section you will be familiar with:
Constraint-based loading
Target Properties
Edit Tasks: Mappings Tab (Session Task)
Select target instance
Target load type
Row loading operations
Error handling
Delete SQL
DELETE from <target> WHERE <primary key> = <pkvalue>
Constraint-based Loading
[Figure: Target1 (PK1), Target2 (FK1, PK2), Target3 (FK2)]
To maintain referential integrity, primary keys must be loaded before their corresponding foreign keys; here, in the order Target1, Target2, Target3
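The load order implied by the key relationships is a topological ordering of the targets. A minimal Python sketch, assuming the dependencies are supplied as a dictionary (this is not how PowerCenter derives them), uses the standard library `graphlib` module, available from Python 3.9:

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def load_order(references: Dict[str, Set[str]]) -> List[str]:
    """Order targets so every referenced (primary key) target loads first.

    references maps each target to the set of targets its foreign keys point to.
    """
    return list(TopologicalSorter(references).static_order())


if __name__ == "__main__":
    # Target2.FK1 references Target1.PK1; Target3.FK2 references Target2.PK2.
    deps = {"Target1": set(), "Target2": {"Target1"}, "Target3": {"Target2"}}
    print(load_order(deps))
```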
Ports: all input/output
Specify the Update Strategy Expression; IIF or DECODE logic determines how to handle the record
Example: updating Slowly Changing Dimensions
Appropriate SQL (DML) is submitted to the target database: insert, delete or update
DD_REJECT means the row will not have SQL written for it; the target will not see that row
Rejected rows may be forwarded through the Mapping
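The flagging logic can be mimicked in Python. Only DD_REJECT appears in the text above; DD_INSERT, DD_UPDATE and DD_DELETE and the numeric values 0 to 3 are given here as recalled from the PowerCenter documentation, and the `customer_id` rule is an invented example of IIF-style logic.

```python
from typing import Dict, Optional, Set

# Row flags (values as recalled from the PowerCenter documentation).
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3


def flag_row(row: Dict[str, object], existing_ids: Set[object]) -> int:
    """Rough analogue of IIF(ISNULL(id), DD_REJECT, IIF(exists, DD_UPDATE, DD_INSERT))."""
    if row["customer_id"] is None:
        return DD_REJECT
    if row["customer_id"] in existing_ids:
        return DD_UPDATE
    return DD_INSERT


def sql_operation(flag: int) -> Optional[str]:
    """Name the DML submitted for a flag; rejected rows produce no SQL."""
    return {DD_INSERT: "insert", DD_UPDATE: "update", DD_DELETE: "delete"}.get(flag)


if __name__ == "__main__":
    known = {101, 102}
    for r in ({"customer_id": 101}, {"customer_id": 200}, {"customer_id": None}):
        print(r, sql_operation(flag_row(r, known)))
```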
Router Transformation
Rows sent to multiple filter conditions
Ports: all input/output
Specify filter conditions for each Group
Usage: link source data in one pass to multiple filter conditions
Router Groups
Input group (always one)
User-defined groups
Default group (always one): can capture rows that fail all Group conditions
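Router behaviour can be pictured with a short Python sketch: each row is tested against every user-defined group condition, and a row that satisfies none of them goes to the default group. The group names and the `DEFAULT` label are placeholders chosen for the example.

```python
from typing import Callable, Dict, List


def route(row: Dict[str, object],
          groups: Dict[str, Callable[[Dict[str, object]], bool]]) -> List[str]:
    """Return every group whose condition the row satisfies, else the default group."""
    hits = [name for name, condition in groups.items() if condition(row)]
    return hits or ["DEFAULT"]


if __name__ == "__main__":
    user_groups = {
        "LARGE": lambda r: r["amount"] >= 1000,
        "EAST": lambda r: r["region"] == "E",
    }
    print(route({"amount": 1500, "region": "E"}, user_groups))
    print(route({"amount": 10, "region": "W"}, user_groups))
```

Unlike a chain of Filter transformations, a single row can land in more than one group, and no row is lost unless the default group is left unconnected.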
Ports: two predefined output ports, NEXTVAL and CURRVAL; no input ports allowed
Usage: generate sequence numbers; shareable across mappings
System variables
Mapping parameters and variables
Parameter files
System Variables
SYSDATE
SESSSTARTTIME
$$$SessStartTime
Returns the system date value as a string
Uses system clock on machine hosting Informatica Server
Format of the string is database type dependent
Used in SQL override
Has a constant value
Set datatype
User-defined names
Set aggregation type
Set optional initial value
SETMINVARIABLE($$Variable,value): sets the specified variable to the lower of the current value or the specified value
SETVARIABLE($$Variable,value): sets the specified variable to the specified value
SETCOUNTVARIABLE($$Variable): increases or decreases the specified variable by the number of rows leaving the function (+1 for each inserted row, -1 for each deleted row, no change for updated or rejected rows)
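The three functions can be modelled with a small Python class. This is a sketch of the described semantics only: the class, its method names and the row-type strings are hypothetical, and it ignores how the final value is saved to the repository at the end of a session.

```python
class MappingVariable:
    """Toy model of a mapping variable and the SET* functions described above."""

    def __init__(self, initial: float = 0) -> None:
        self.value = initial

    def set_variable(self, value: float) -> float:
        # SETVARIABLE: overwrite with the given value.
        self.value = value
        return self.value

    def set_min_variable(self, value: float) -> float:
        # SETMINVARIABLE: keep the lower of the current and given values.
        self.value = min(self.value, value)
        return self.value

    def set_count_variable(self, row_type: str) -> float:
        # SETCOUNTVARIABLE: +1 per inserted row, -1 per deleted row,
        # unchanged for updated or rejected rows.
        self.value += {"insert": 1, "delete": -1}.get(row_type, 0)
        return self.value


if __name__ == "__main__":
    counter = MappingVariable()
    for kind in ("insert", "insert", "update", "delete", "reject"):
        counter.set_count_variable(kind)
    print(counter.value)
```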
Parameter Files
You can specify a parameter file for a session in the session editor
The parameter file contains the folder.session name and initializes each parameter and variable for that session. For example:
[Production.s_m_MonthlyCalculations]
$$State=MA
$$Time=10/1/2000 00:00:00
$InputFile1=sales.txt
$DBConnection_target=sales
$PMSessionLogFile=D:/session logs/firstrun.txt
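The layout of a parameter file, a bracketed folder.session heading followed by name=value lines, is easy to inspect with a few lines of Python. The parser below is a hypothetical helper for checking such files outside PowerCenter; it makes no claim about how the Server itself reads them.

```python
from typing import Dict

SAMPLE = """[Production.s_m_MonthlyCalculations]
$$State=MA
$$Time=10/1/2000 00:00:00
$InputFile1=sales.txt
$DBConnection_target=sales
$PMSessionLogFile=D:/session logs/firstrun.txt
"""


def parse_parameter_file(text: str) -> Dict[str, Dict[str, str]]:
    """Return {section heading: {parameter name: value}}."""
    sections: Dict[str, Dict[str, str]] = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
        elif "=" in line and current is not None:
            # Split on the first "=" only, so values may contain "=" or spaces.
            name, value = line.split("=", 1)
            current[name.strip()] = value.strip()
    return sections


if __name__ == "__main__":
    for section, params in parse_parameter_file(SAMPLE).items():
        print(section, params)
```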
Unconnected Lookups
By the end of this section you will know:
Unconnected Lookup
Physically unconnected from other transformations; NO data flow arrows lead to or from an unconnected Lookup
Lookup data is called from the point in the Mapping that needs it
Lookup function can be set within any transformation that supports expressions
[Figure: a function in the Aggregator calls the unconnected Lookup]
IIF ( ISNULL(customer_id),:lkp.MYLOOKUP(order_no))
The condition is evaluated for each row, but the Lookup function (:lkp.MYLOOKUP) is called only if the condition is satisfied
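The point of the conditional call is that the lookup runs only for rows that need it. A Python sketch of the same idea follows; the function names are invented, and keeping the existing `customer_id` in the "else" case is an assumption, since the expression above does not say what is returned when the condition is false.

```python
from typing import Callable, Dict, List, Optional


def resolve_customer_id(row: Dict[str, Optional[str]],
                        lookup: Callable[[str], Optional[str]]) -> Optional[str]:
    """Call the lookup only when customer_id is missing."""
    if row["customer_id"] is None:
        return lookup(row["order_no"])
    # Assumption: otherwise keep the value already on the row.
    return row["customer_id"]


if __name__ == "__main__":
    calls: List[str] = []

    def fake_lookup(order_no: str) -> str:
        calls.append(order_no)  # record each lookup call
        return "C-" + order_no

    rows = [
        {"customer_id": "C-1", "order_no": "1"},
        {"customer_id": None, "order_no": "2"},
    ]
    print([resolve_customer_id(r, fake_lookup) for r in rows], calls)
```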
Must check a Return port in the Ports tab, else fails at runtime
Connected Lookup:
Part of the mapping data flow
Returns multiple values (by linking output ports to another transformation)
Executed for every record passing through the transformation
More visible, shows where the lookup values are used
Default values are used
Unconnected Lookup:
Separate from the mapping data flow
Returns one value, by checking the Return (R) port option for the output port that provides the return value
Only executed when the lookup function is called
Less visible, as the lookup is called from an expression within another transformation
Default values are ignored
Mapplets
By the end of this section you will be familiar with:
Mapplet Designer
Mapplet advantages
Mapplet types
Mapplet rules
Active and Passive Mapplets
Mapplet Parameters and Variables
Mapplet Designer
Mapplet Advantages
Useful for repetitive tasks / logic
Unsupported Transformations
You cannot use the following in a mapplet:
Normalizer Transformation
XML source definitions
Target definitions
Other mapplets
External Sources
Mapplet contains a Mapplet Input transformation
Mixed Sources
Mapplet contains one or more Mapplet Input transformations AND one or more Source Qualifiers
Receives data from the Mapping it is used in, AND from the Mapplet
Passive Transformation
Connected
Ports: output ports only
Usage: only those ports connected from an Input transformation to another transformation will display in the resulting Mapplet
Connecting the same port to more than one transformation is disallowed; pass to an Expression transformation first
The resulting Mapplet HAS input ports
When used in a Mapping, the Mapplet may occur at any point in mid-flow
Usage: only those ports connected to an Output transformation (from another transformation) will display in the resulting Mapplet
One (or more) Mapplet Output transformations are required in every Mapplet
CAUTION: Changing a passive Mapplet into an active Mapplet may invalidate Mappings which use that Mapplet, so do an impact analysis in Repository Manager first
Multiple Active Mapplets, or Active and Passive Mapplets, cannot populate the same target instance
Reusable Transformations
By the end of this section you will be familiar with:
Transformation Developer
Reusable transformation rules
Promoting transformations to reusable
Copying reusable transformations
Transformation Developer
Make a transformation reusable from the outset, or test it in a mapping first
Reusable Transformations
Define once, reuse many times
Reusable Transformations:
Can be a copy or a shortcut
Edit Ports only in Transformation Developer
Can edit Properties in the mapping
3. Drop the transformation into the mapping
4. Save the changes to the Repository
Workflow Configuration
(Custom)
(External Database Loaders)
Session Configuration
Define properties to be reusable across different sessions
Defined at folder level
Must have one of these tools open in order to access