Workflow Administration Guide

Informatica PowerCenter
(Version 7.1.1)

Informatica PowerCenter Workflow Administration Guide


Version 7.1.1
August 2004
Copyright (c) 1998-2004 Informatica Corporation.
All rights reserved. Printed in the USA.
This software and documentation contain proprietary information of Informatica Corporation and are provided under a license agreement
containing restrictions on use and disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No
part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise)
without prior consent of Informatica Corporation.
Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software
license agreement as provided in DFARS 227.7202-1(a) and 227.7202-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR
12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as applicable.
The information in this document is subject to change without notice. If you find any problems in the documentation, please report them to
us in writing. Informatica Corporation does not warrant that this documentation is error free.
Informatica, PowerMart, PowerCenter, PowerChannel, PowerCenter Connect, MX, and SuperGlue are trademarks or registered trademarks
of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be
trade names or trademarks of their respective owners.
Portions of this software are copyrighted by DataDirect Technologies, 1999-2002.
Informatica PowerCenter products contain ACE (TM) software copyrighted by Douglas C. Schmidt and his research group at Washington
University and University of California, Irvine, Copyright (c) 1993-2002, all rights reserved.
Portions of this software contain copyrighted material from The JBoss Group, LLC. Your right to use such materials is set forth in the GNU
Lesser General Public License Agreement, which may be found at http://www.opensource.org/licenses/lgpl-license.php. The JBoss materials
are provided free of charge by Informatica, as-is, without warranty of any kind, either express or implied, including but not limited to the
implied warranties of merchantability and fitness for a particular purpose.
Portions of this software contain copyrighted material from Meta Integration Technology, Inc. Meta Integration is a registered trademark
of Meta Integration Technology, Inc.
This product includes software developed by the Apache Software Foundation (http://www.apache.org/).
The Apache Software is Copyright (c) 1999-2004 The Apache Software Foundation. All rights reserved.
DISCLAIMER: Informatica Corporation provides this documentation as is without warranty of any kind, either express or implied,
including, but not limited to, the implied warranties of non-infringement, merchantability, or use for a particular purpose. The information
provided in this documentation may include technical inaccuracies or typographical errors. Informatica could make improvements and/or
changes in the products described in this documentation at any time without notice.

Table of Contents
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxv
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxi
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxv
New Features and Enhancements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxvi
PowerCenter 7.1.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxvi
PowerCenter 7.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxviii
PowerCenter 7.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xlii
About Informatica Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xlviii
About this Book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xlix
Document Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xlix
Other Informatica Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . l
Visiting Informatica Customer Portal . . . . . . . . . . . . . . . . . . . . . . . . . . . l
Visiting the Informatica Webzine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . l
Visiting the Informatica Web Site . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . l
Visiting the Informatica Developer Network . . . . . . . . . . . . . . . . . . . . . . l
Obtaining Technical Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . li

Chapter 1: Understanding the Server Architecture . . . . . . . . . . . . . . . 1


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Workflow Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Pipeline Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
PowerCenter Server Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Running a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Load Manager Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Managing Workflow Scheduling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Locking and Reading the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Reading the Parameter File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Creating the Workflow Log File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Running Workflow Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Distributing Sessions to Worker Servers. . . . . . . . . . . . . . . . . . . . . . . . . . 9
Starting the DTM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Running Sessions from Master Servers . . . . . . . . . . . . . . . . . . . . . . . . . . 10

Writing Historical Information to the Repository . . . . . . . . . . . . . . . . . . 10
Sending Post-Session Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Data Transformation Manager (DTM) Process . . . . . . . . . . . . . . . . . . . . . . . 11
Reading the Session Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Expanding Variables and Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Creating the Session Log File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Validating Code Pages. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Verifying Connection Object Permissions . . . . . . . . . . . . . . . . . . . . . . . 12
Running Pre-Session Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Running the Processing Threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Running Post-Session Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Sending Post-Session Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Understanding Processing Threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Thread Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Threads and Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
PowerCenter Server Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Reading Source Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Blocking Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Block Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
System Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
CPU Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Load Manager Shared Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
DTM Buffer Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Cache Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Code Pages and Data Movement Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
ASCII Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Unicode Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Output Files and Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
PowerCenter Server Log . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Workflow Log File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Session Log File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Session Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Performance Detail File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Reject Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Row Error Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Recovery Tables and Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Control File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Indicator File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Output File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Cache Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

Chapter 2: Configuring the Workflow Manager . . . . . . . . . . . . . . . . . 37


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Setting the Date/Time Display Format . . . . . . . . . . . . . . . . . . . . . . . . . 38
Customizing the Workflow Manager Options . . . . . . . . . . . . . . . . . . . . . . . . 39
Configuring General Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Configuring Format Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Configuring Miscellaneous Options . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Enabling Enhanced Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Registering the PowerCenter Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Server Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Steps for Registering a PowerCenter Server . . . . . . . . . . . . . . . . . . . . . . 48
Deleting a PowerCenter Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Configuring Connection Object Permissions . . . . . . . . . . . . . . . . . . . . . . . . 51
Connection Object Permissions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Setting Up a Relational Database Connection . . . . . . . . . . . . . . . . . . . . . . . 53
Database Connect Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
Database Connection Code Pages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Configuring Environment SQL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Configuring a Relational Database Connection . . . . . . . . . . . . . . . . . . . 56
Deleting Connection Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Copying a Relational Database Connection . . . . . . . . . . . . . . . . . . . . . . 59
Replacing a Relational Database Connection . . . . . . . . . . . . . . . . . . . . . . . . 62

Chapter 3: Using the Workflow Manager . . . . . . . . . . . . . . . . . . . . . . 65


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Workflow Manager Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Workflow Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Workflow Manager Windows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
Navigating the Workspace. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Customizing Workflow Manager Windows . . . . . . . . . . . . . . . . . . . . . . 69
Using Toolbars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Searching for Items . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
Arranging Objects in the Workspace . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

Zooming the Workspace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Working with Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Viewing Object Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Entering Descriptions for Repository Objects . . . . . . . . . . . . . . . . . . . . . 73
Renaming Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Checking Out and In Versioned Repository Objects . . . . . . . . . . . . . . . . . . . 74
Checking Out Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Checking In Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Searching For Versioned Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
Copying Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Copying Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Copying Workflow Segments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Comparing Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Steps for Comparing Objects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Working with Metadata Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Creating a Metadata Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Editing a Metadata Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Deleting a Metadata Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
Keyboard Shortcuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

Chapter 4: Working with Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . 87


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
Workflow Privileges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Developing Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Creating a New Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Adding Tasks to Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
Working with Links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
Using the Expression Editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Deleting a Workflow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Editing a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
Using the Workflow Wizard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
Step 1. Assign a Name and PowerCenter Server to the Workflow . . . . . . . 99
Step 2. Create a Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
Step 3. Schedule a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Using Workflow Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
Pre-Defined Workflow Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
User-Defined Workflow Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

Scheduling a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
Creating a Reusable Scheduler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Configuring Scheduler Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Editing Scheduler Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Disabling Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Validating a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Expression Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Task Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Workflow Properties Validation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Running Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Running the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
Selecting a Server to Run the Workflow . . . . . . . . . . . . . . . . . . . . . . . . 122
Assigning the PowerCenter Server to a Workflow . . . . . . . . . . . . . . . . . 122
Running a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Running a Part of a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Running a Task in the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Suspending the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
Configuring Suspension Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
Stopping or Aborting the Workflow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Server Handling of Stop and Abort . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Stopping or Aborting a Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

Chapter 5: Working with Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
Creating a Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
Creating a Task in the Task Developer . . . . . . . . . . . . . . . . . . . . . . . . . 133
Creating a Task in the Workflow or Worklet Designer . . . . . . . . . . . . . 133
Configuring Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Reusable Workflow Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
AND or OR Input Links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
Disabling Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
Failing Parent Workflow or Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . 138
Validating Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
Working with the Assignment Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Working with the Command Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Using Session Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Creating a Command Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143

Executing Commands in the Command Task . . . . . . . . . . . . . . . . . . . . 145
Working with the Control Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
Working with the Decision Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
Using the Decision Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
Creating a Decision Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
Working with Event Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
Example of User-Defined Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
Working with Event-Raise Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
Working With Event-Wait Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
Working with the Timer Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161

Chapter 6: Working with Worklets . . . . . . . . . . . . . . . . . . . . . . . . . . . 163


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
Suspending Worklets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
Developing a Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
Creating a Reusable Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
Creating a Non-Reusable Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
Configuring Worklet Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
Adding Tasks in Worklets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
Nesting Worklets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
Using Worklet Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
Persistent Worklet Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
Overriding Initial Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
Validating Worklets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171

Chapter 7: Working with Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . 173


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
Creating a Session Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
Session Privileges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
Steps to Create a Session Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
Editing a Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
Edit Session Privilege . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
Applying Attributes to All Instances . . . . . . . . . . . . . . . . . . . . . . . . . . 178
Creating a Session Configuration Object . . . . . . . . . . . . . . . . . . . . . . . . . . 183
Using Pre- and Post-Session SQL Commands . . . . . . . . . . . . . . . . . . . . . . . 186
Guidelines for Entering Pre- and Post-Session SQL Commands . . . . . . . 186
Error Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

Using Pre- or Post-Session Shell Commands . . . . . . . . . . . . . . . . . . . . . 188
Using Server and Session Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
Configuring Non-Reusable Shell Commands . . . . . . . . . . . . . . . . . . . . 189
Configuring Reusable Shell Commands . . . . . . . . . . . . . . . . . . . . . . . . 192
Using Server Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
Pre-Session Shell Command Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
Using Post-Session Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
Validating a Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
Validating Multiple Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
Running the Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
Selecting a Server to Run the Session . . . . . . . . . . . . . . . . . . . . . . . . . . 197
Assigning the PowerCenter Server to a Session . . . . . . . . . . . . . . . . . . . 198
Stopping and Aborting a Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
Threshold Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
Fatal Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
ABORT Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
User Command . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
PowerCenter Server Handling for Session Failure . . . . . . . . . . . . . . . . . 201
Mapping Parameters and Variables in Sessions . . . . . . . . . . . . . . . . . . . . . . 203
Handling High Precision Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204

Chapter 8: Working with Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . 207


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
Globalization Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
Source Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
Permissions and Privileges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
Allocating Buffer Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
Partitioning Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
Configuring Sources in a Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
Configuring Readers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
Configuring Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
Configuring Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
Working with Relational Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
Selecting the Source Database Connection . . . . . . . . . . . . . . . . . . . . . . 214
Defining the Treat Source Rows As Property . . . . . . . . . . . . . . . . . . . . 214
Configuring the Table Owner Name . . . . . . . . . . . . . . . . . . . . . . . . . . 216
Overriding the SQL Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216

Working with File Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
Configuring Source Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
Configuring Fixed-Width File Properties . . . . . . . . . . . . . . . . . . . . . . . 220
Configuring Delimited File Properties . . . . . . . . . . . . . . . . . . . . . . . . . 222
Configuring Line Sequential Buffer Length . . . . . . . . . . . . . . . . . . . . . 225
Server Handling for File Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
Character Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
Multibyte Character Error Handling . . . . . . . . . . . . . . . . . . . . . . . . . . 227
Null Character Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
Row Length Handling for Fixed-Width Flat Files . . . . . . . . . . . . . . . . . 228
Numeric Data Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
Using a File List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
Creating the File List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
Configuring a Session to Use a File List . . . . . . . . . . . . . . . . . . . . . . . . 231

Chapter 9: Working with Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
Globalization Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
Target Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
Partitioning Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
Permissions and Privileges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
Configuring Targets in a Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
Configuring Writers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
Configuring Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
Configuring Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
Working with Relational Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
Target Database Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
Target Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
Truncating Target Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
Deadlock Retry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
Dropping and Recreating Indexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
Constraint-Based Loading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
Bulk Loading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
Table Name Prefix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
Reserved Words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
Working with Target Connection Groups . . . . . . . . . . . . . . . . . . . . . . . . . . 257
Working with Active Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259

Working with File Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
Configuring Target Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
Configuring Fixed-Width Properties . . . . . . . . . . . . . . . . . . . . . . . . . . 265
Configuring Delimited Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
Server Handling for File Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
Writing to Fixed-Width Flat Files with Relational Target Definitions . . 268
Writing to Fixed-Width Files with Flat File Target Definitions . . . . . . . 269
Writing Multibyte Data to Fixed-Width Flat Files . . . . . . . . . . . . . . . . 270
Null Characters in Fixed-Width Files . . . . . . . . . . . . . . . . . . . . . . . . . 272
Character Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
Writing Metadata to Flat File Targets . . . . . . . . . . . . . . . . . . . . . . . . . 273
Working with Heterogeneous Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274

Chapter 10: Understanding Commit Points . . . . . . . . . . . . . . . . . . . 275


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
Target-Based Commits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
Source-Based Commits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
Determining the Commit Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
Switching from Source-Based to Target-Based Commit . . . . . . . . . . . . . 280
User-Defined Commits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
Rolling Back Transactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
Understanding Transaction Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
Transformation Scope. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
Understanding Transaction Control Units . . . . . . . . . . . . . . . . . . . . . . 289
Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
Setting Commit Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292

Chapter 11: Recovering Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
Preparing for Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
Configuring the Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
Configuring the Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
Configuring the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
Configuring the Target Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
Creating pmcmd Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
Working with Repeatable Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
Recovering a Suspended Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305

Recovering a Suspended Workflow with Sequential Sessions . . . . . . . . . 305
Recovering a Suspended Workflow with Concurrent Sessions . . . . . . . . 306
Steps for Recovering a Suspended Workflow . . . . . . . . . . . . . . . . . . . . . 307
Recovering a Failed Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
Recovering a Failed Workflow with Sequential Sessions . . . . . . . . . . . . . 308
Recovering a Failed Workflow with Concurrent Sessions . . . . . . . . . . . . 309
Steps for Recovering a Failed Workflow . . . . . . . . . . . . . . . . . . . . . . . . 310
Recovering a Session Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
Recovering Sequential Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
Recovering Concurrent Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
Steps for Recovering a Session Task . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
Server Handling for Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
Verifying Recovery Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
Running Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
Completing Unrecoverable Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316

Chapter 12: Sending Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
Configuring Email on UNIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
Configuring Email on Windows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
Step 1. Verify the Informatica Service Startup Account . . . . . . . . . . . . . 322
Step 2. Configure a Microsoft Outlook User . . . . . . . . . . . . . . . . . . . . 322
Step 3. Configure Logon Network Security . . . . . . . . . . . . . . . . . . . . . 325
Step 4. Create Distribution Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
Step 5. Configure the PowerCenter Server Setup . . . . . . . . . . . . . . . . . 327
Working with Email Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
Email Address Tips and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
Steps to Create an Email Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
Working with Post-Session Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
Using Server Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
Email Variables and Format Tags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
Configuring Post-Session Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
Sample Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
Working with Suspension Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
Using Email Tasks in a Workflow or Worklet . . . . . . . . . . . . . . . . . . . . . . . 341
Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342

Chapter 13: Pipeline Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . 345


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
Partition Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
Number of Partitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
Partition Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
Configuring Partitioning Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
Adding and Deleting Partition Points . . . . . . . . . . . . . . . . . . . . . . . . . 353
Adding and Deleting Partitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
Entering Partition Descriptions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
Specifying Partition Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
Adding Keys and Key Ranges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
Cache Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
Round-Robin Partition Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
Hash Keys Partition Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
Hash Auto-Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
Hash User Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362
Key Range Partition Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
Adding a Partition Key . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
Adding Key Ranges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
Adding Filter Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
Pass-Through Partition Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
Database Partitioning Partition Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
Partitioning Relational Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
Entering an SQL Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
Entering a Filter Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
Partitioning File Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
Guidelines for Partitioning File Sources . . . . . . . . . . . . . . . . . . . . . . . . 374
Using One Thread to Read a File Source . . . . . . . . . . . . . . . . . . . . . . . 375
Using Multiple Threads to Read a File Source . . . . . . . . . . . . . . . . . . . 375
Configuring for File Partitioning. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
Partitioning Relational Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
Database Compatibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
Partitioning File Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380
Configuring Connection Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380
Configuring File Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
Partitioning Joiner Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384
Partitioning Sorted Joiner Transformations . . . . . . . . . . . . . . . . . . . . . 384

Using Sorted Flat Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
Using Sorted Relational Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
Using Sorter Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
Optimizing Sorted Joiner Transformations with Partitions . . . . . . . . . . 390
Partitioning Lookup Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391
Partitioning Sorter Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392
Configuring Sorter Transformation Work Directories . . . . . . . . . . . . . . 392
Mapping Variables in Partitioned Pipelines. . . . . . . . . . . . . . . . . . . . . . . . . 394
Partitioning Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395
Restrictions on the Number of Partitions . . . . . . . . . . . . . . . . . . . . . . . 395
Partition Restrictions for Editing Objects . . . . . . . . . . . . . . . . . . . . . . . 396
Partition Restrictions for Informatica Application Products . . . . . . . . . . 397
Partitioning Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398

Chapter 14: Monitoring Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . 401


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402
Permissions and Privileges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
Using the Workflow Monitor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404
Opening the Workflow Monitor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404
Connecting to Repositories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
Connecting to PowerCenter Servers . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
Filtering Tasks and Servers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
Opening and Closing Folders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407
Viewing Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 408
Viewing Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 408
Customizing Workflow Monitor Options . . . . . . . . . . . . . . . . . . . . . . . . . . 409
Configuring General Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
Configuring Gantt Chart View Options . . . . . . . . . . . . . . . . . . . . . . . . 411
Configuring Task View Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
Configuring Advanced Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
Using Workflow Monitor Toolbars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415
Working with Tasks and Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 416
Running a Task, Workflow, or Worklet . . . . . . . . . . . . . . . . . . . . . . . . 416
Resuming a Workflow or Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
Recovering a Workflow or Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
Stopping or Aborting Tasks and Workflows . . . . . . . . . . . . . . . . . . . . . 418
Scheduling and Unscheduling Workflows . . . . . . . . . . . . . . . . . . . . . . . 418

Viewing Session Logs and Workflow Logs . . . . . . . . . . . . . . . . . . . . . . 419
Viewing History Names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419
Workflow and Task Status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 421
Using the Gantt Chart View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
Organizing Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
Listing Tasks and Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424
Navigating the Time Window in Gantt Chart View . . . . . . . . . . . . . . . 425
Zooming the Gantt Chart View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 426
Performing a Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
Opening All Folders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429
Using the Task View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430
Filtering in Task View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 431
Opening All Folders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 433
Monitoring Session Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 434
Creating and Viewing Performance Details . . . . . . . . . . . . . . . . . . . . . . . . 436
Enabling Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436
Viewing Session Performance Details. . . . . . . . . . . . . . . . . . . . . . . . . . 436
Memory Requirement for Performance Details . . . . . . . . . . . . . . . . . . . 437
Understanding Performance Counters . . . . . . . . . . . . . . . . . . . . . . . . . 437
Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441

Chapter 15: Using Multiple Servers. . . . . . . . . . . . . . . . . . . . . . . . . . 443


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 444
Using Server Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445
Using a File Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445
Running Sessions with Cache Files . . . . . . . . . . . . . . . . . . . . . . . . . . . 445
Working with Server Grids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
Distributing Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
Server Grid Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
Server Grid Guidelines and Requirements . . . . . . . . . . . . . . . . . . . . . . 448
Configuring Server Grids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
Configuring Server Grid Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
Configuring Workflow Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
Configuring Session Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
Override Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
Steps for Creating a Server Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451

Chapter 16: Log Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 456
Workflow Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
Workflow Log Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
Configuring Workflow Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459
Viewing Workflow Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
Session Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
Session Log Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
Load Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
Detailed Transformation Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
Configuring Session Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
Viewing Session Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474
Reject Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 476
Locating Reject Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 476
Reading Reject Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 477

Chapter 17: Row Error Logging. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 481


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 482
Error Log Code Pages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 482
Understanding the Error Log Tables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 483
PMERR_DATA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 483
PMERR_MSG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 485
PMERR_SESS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 486
PMERR_TRANS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
Understanding the Error Log File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 489
Configuring Error Log Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 493

Chapter 18: Session Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496
Session Log Parameter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497
Changing the Session Log Name . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497
Changing the Session Log Name and Location . . . . . . . . . . . . . . . . . . . 498
Steps for Using $PMSessionLogFile . . . . . . . . . . . . . . . . . . . . . . . . . . . 498
Database Connection Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 499
Source File Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 502
Changing the Source File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 502
Changing the Source File and Directory . . . . . . . . . . . . . . . . . . . . . . . . 503
Steps for Using a Source File Parameter . . . . . . . . . . . . . . . . . . . . . . . . 503
Target File Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504
Changing the Target File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504
Changing the Target File and Directory . . . . . . . . . . . . . . . . . . . . . . . . 505
Steps for Using a Target File Parameter . . . . . . . . . . . . . . . . . . . . . . . . 505
Lookup File Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 506
Changing the Lookup File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 506
Changing the Lookup File and Directory . . . . . . . . . . . . . . . . . . . . . . . 507
Steps for Using a Lookup File Parameter . . . . . . . . . . . . . . . . . . . . . . . 507
Reject File Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 508
Changing the Reject File Name . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 508
Changing the Reject File and Directory . . . . . . . . . . . . . . . . . . . . . . . . 509
Steps for Using a Reject File Parameter . . . . . . . . . . . . . . . . . . . . . . . . 509
Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 510

Chapter 19: Parameter Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 512
Parameter File Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513
Guidelines for Creating Parameter Files . . . . . . . . . . . . . . . . . . . . . . . . . . . 515
Sample Parameter File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517
Configuring the Parameter File Location . . . . . . . . . . . . . . . . . . . . . . . . . . 518
Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 520
Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 521

Chapter 20: External Loading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 523


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 524
External Loader Permissions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 525
Permissions and Privileges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 525
External Loader Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526
Loading Data Using Named Pipes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526
Staging Data to Flat Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526
Partitioning Sessions with External Loaders . . . . . . . . . . . . . . . . . . . . . 526
Errors and Error Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527
Loading to DB2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 528
Setting DB2 External Loader Operation Modes . . . . . . . . . . . . . . . . . . 528
Configuring Authorities, Privileges, and Permissions . . . . . . . . . . . . . . 528
Configuring DB2 EE External Loader Attributes . . . . . . . . . . . . . . . . . 529

Configuring DB2 EEE External Loader Attributes . . . . . . . . . . . . . . . . 530
Loading to Oracle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
Loading Multibyte Data to Oracle . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
Oracle External Loader Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
Reject File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534
Loading to Sybase IQ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535
Using Sybase IQ External Loader on UNIX . . . . . . . . . . . . . . . . . . . . . 535
Loading Multibyte Data to Sybase IQ . . . . . . . . . . . . . . . . . . . . . . . . . 535
Sybase IQ External Loader Attributes . . . . . . . . . . . . . . . . . . . . . . . . . 536
Loading to Teradata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 538
Overriding the Control File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 539
Teradata MultiLoad External Loader Attributes . . . . . . . . . . . . . . . . . . 540
Teradata TPump External Loader Attributes . . . . . . . . . . . . . . . . . . . . . 542
Teradata FastLoad External Loader Attributes . . . . . . . . . . . . . . . . . . . . 545
Teradata Warehouse Builder External Loader Attributes . . . . . . . . . . . . 547
Creating an External Loader Connection . . . . . . . . . . . . . . . . . . . . . . . . . . 551
Configuring External Loading in a Session . . . . . . . . . . . . . . . . . . . . . . . . . 553
Configuring a Session to Write to a File . . . . . . . . . . . . . . . . . . . . . . . . 553
Configuring File Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 554
Selecting an External Loader Connection . . . . . . . . . . . . . . . . . . . . . . . 555
Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557

Chapter 21: Using FTP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 559


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 560
Mainframe Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 560
Creating an FTP Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 561
FTP Permissions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 561
Steps for Creating an FTP Connection . . . . . . . . . . . . . . . . . . . . . . . . 562
Creating an FTP Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 565
FTP File Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 565
FTP File Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 568

Chapter 22: Using Incremental Aggregation. . . . . . . . . . . . . . . . . . . 573


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574
PowerCenter Server Processing for Incremental Aggregation . . . . . . . . . . . . 575
Reinitializing the Aggregate Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 576
Moving or Deleting the Aggregate Files . . . . . . . . . . . . . . . . . . . . . . . . . . . 577


Finding Index and Data Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577


Partitioning Guidelines with Incremental Aggregation . . . . . . . . . . . . . . . . 578
Preparing for Incremental Aggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 579
Configuring the Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 579
Configuring the Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 579

Chapter 23: Using pmcmd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 581


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 582
Configuring Environment Variables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 585
Configuring PM_CODEPAGENAME . . . . . . . . . . . . . . . . . . . . . . . . . 585
Configuring PMTOOL_DATEFORMAT . . . . . . . . . . . . . . . . . . . . . . 585
Configuring Repository Username and Password . . . . . . . . . . . . . . . . . 586
Configuring PM_HOME . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587
Using the Command Line Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589
Connecting to the PowerCenter Server in the Command Line Mode . . . 589
pmcmd Return Codes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 590
Using the Interactive Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 592
Connecting to the PowerCenter Server in the Interactive Mode . . . . . . . 592
Setting Defaults in the Interactive Mode . . . . . . . . . . . . . . . . . . . . . . . 593
pmcmd Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 594
Command Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 594
Using Quotation Marks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 595
Syntax Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 595
Aborttask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 596
Abortworkflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 597
Connect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 597
Disconnect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 598
Exit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 598
Getrunningsessionsdetails . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 598
Getserverdetails . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 599
Getserverproperties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 599
Getsessionstatistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 600
Gettaskdetails . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 601
Getworkflowdetails . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 601
Help . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 602
Pingserver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 602
Quit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 602


Resumeworkflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 603
Resumeworklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 603
Scheduleworkflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 604
Setfolder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 604
Setnowait . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 605
Setwait . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 605
Showsettings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 605
Shutdownserver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 605
Starttask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 606
Startworkflow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607
Stoptask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 609
Stopworkflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 609
Unscheduleworkflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 610
Unsetfolder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 610
Version . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 611
Waittask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 611
Waitworkflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 611

Chapter 24: Session Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 614
Memory Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 614
Cache Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 615
Determining Cache Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617
Cache Calculations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617
Cache Column Sizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 618
Cache Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 620
Aggregator Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 621
Calculating the Aggregator Index Cache. . . . . . . . . . . . . . . . . . . . . . . . 621
Calculating the Aggregator Data Cache . . . . . . . . . . . . . . . . . . . . . . . . 622
Joiner Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624
Calculating the Number of Master Rows . . . . . . . . . . . . . . . . . . . . . . . 625
Calculating the Joiner Index Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
Calculating the Joiner Data Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . 626
Lookup Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628
Static Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628
Dynamic Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628
Sharing Partitioned Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 629


Calculating the Lookup Index Cache . . . . . . . . . . . . . . . . . . . . . . . . . . 629


Calculating the Lookup Data Cache . . . . . . . . . . . . . . . . . . . . . . . . . . 631
Rank Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 632
Calculating the Rank Index Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . 632
Calculating the Rank Data Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633

Chapter 25: Performance Tuning. . . . . . . . . . . . . . . . . . . . . . . . . . . . 635


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 636
Identifying the Performance Bottleneck . . . . . . . . . . . . . . . . . . . . . . . . . . . 637
Identifying Target Bottlenecks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 637
Identifying Source Bottlenecks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 637
Identifying Mapping Bottlenecks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 638
Identifying a Session Bottleneck . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 639
Identifying a System Bottleneck . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 640
Optimizing the Target Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 642
Dropping Indexes and Key Constraints . . . . . . . . . . . . . . . . . . . . . . . . 642
Increasing Checkpoint Intervals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 642
Bulk Loading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 642
External Loading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 643
Increasing Database Network Packet Size . . . . . . . . . . . . . . . . . . . . . . . 643
Optimizing Oracle Target Databases . . . . . . . . . . . . . . . . . . . . . . . . . . 643
Optimizing the Source Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 645
Optimizing the Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 645
Using tempdb to Join Sybase and Microsoft SQL Server Tables . . . . . . . 646
Using Conditional Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 646
Increasing Database Network Packet Sizes . . . . . . . . . . . . . . . . . . . . . . 646
Connecting to Oracle Source Databases . . . . . . . . . . . . . . . . . . . . . . . . 646
Optimizing the Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647
Configuring Single-Pass Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647
Optimizing Datatype Conversions . . . . . . . . . . . . . . . . . . . . . . . . . . . 648
Eliminating Transformation Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . 648
Optimizing Lookup Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . 649
Optimizing Filter Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . 650
Optimizing Aggregator Transformations . . . . . . . . . . . . . . . . . . . . . . . 650
Optimizing Joiner Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . 651
Optimizing Sequence Generator Transformations . . . . . . . . . . . . . . . . . 652
Optimizing Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 652


Optimizing the Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 655


Pipeline Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 655
Allocating Buffer Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 655
Increasing the Cache Sizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 658
Increasing the Commit Interval . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 658
Disabling High Precision . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 658
Reducing Error Tracing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 659
Removing Staging Areas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 659
Optimizing the System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 660
Improving Network Speed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 660
Using Multiple PowerCenter Servers . . . . . . . . . . . . . . . . . . . . . . . . . . 661
Using Server Grids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 661
Running the PowerCenter Server in ASCII Data Movement Mode . . . . . 661
Using Additional CPUs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 661
Reducing Paging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 662
Using Processor Binding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 662
Pipeline Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 663
Optimizing the Source Database for Partitioning . . . . . . . . . . . . . . . . . 663
Optimizing the Target Database for Partitioning . . . . . . . . . . . . . . . . . 664

Appendix A: Session Properties Reference . . . . . . . . . . . . . . . . . . . 667


General Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 668
Properties Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 670
General Options Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 670
Performance Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673
Config Object Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675
Advanced Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675
Log Options Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 677
Error Handling Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 678
Mapping Tab (Transformations View) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 681
Connections Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 681
Sources Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 683
Targets Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 692
Mapping Tab (Partitions View) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 705
Partition Properties Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 705
KeyRange Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 706
HashKeys Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 706


Partition Points Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 706


Non-Partition Points Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 709
Components Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 710
Reusable Pre- or Post-Session Commands . . . . . . . . . . . . . . . . . . . . . . 711
Non-Reusable Pre- or Post-Session Commands . . . . . . . . . . . . . . . . . . 712
Reusable Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 714
Non-Reusable Email. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 715
Email Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 715
Metadata Extensions Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 718

Appendix B: Workflow Properties Reference . . . . . . . . . . . . . . . . . . 721


General Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 722
Properties Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 724
Scheduler Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 726
Edit Scheduler Settings. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 727
Variables Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 731
Events Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 732
Metadata Extensions Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 733

Appendix C: Session Properties Comparison Reference . . . . . . . . 735


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 736
General Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737
General Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737
Source Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 738
Target Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 743
Session Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 750
Performance Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 752
Source Location Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 754
Time Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 755
Schedule Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 755
Start Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 756
Duration Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 756
Use Absolute Time Option . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 757
Log and Error Handling Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 758
Log File Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 758
Parameter File Option . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 759
Batch Handling Option . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 759


Error Handling Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 759


Transformations Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 761
Partitions Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 762

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 763


List of Figures
Figure 1-1. PowerCenter Server and Data Movement . . . 2
Figure 1-2. Partitioned Mapping . . . 3
Figure 1-3. PowerCenter Connectivity . . . 6
Figure 1-4. Thread Creation for a Simple Mapping . . . 15
Figure 1-5. Thread Creation for a Pass-through Pipeline . . . 15
Figure 1-6. Pipeline Stages in a Mapping With an Unsorted Aggregator Transformation . . . 17
Figure 1-7. Pipeline Stages in a Mapping with an Additional Partition Point . . . 18
Figure 1-8. Thread Creation for a Mapping with Three Partitions . . . 18
Figure 1-9. Thread Creation with Joiner Transformation . . . 20
Figure 1-10. Thread Creation with a Partition Point at a Joiner Transformation . . . 20
Figure 1-11. Target Load Order Groups and Source Pipelines . . . 22
Figure 1-12. Event Viewer Application Log Message . . . 29
Figure 1-13. Application Log Message Detail . . . 30
Figure 2-1. Workflow Manager General Options . . . 40
Figure 2-2. Workflow Manager Format Options . . . 42
Figure 2-3. Copy Wizard, Versioning, and Target Load Type Options . . . 43
Figure 3-1. Sample Workflow . . . 66
Figure 3-2. Workflow Manager Windows . . . 68
Figure 3-3. Check In Workflow Manager Objects . . . 75
Figure 3-4. Query Browser . . . 76
Figure 3-5. Diff Tool Window . . . 81
Figure 4-1. Sample Workflow . . . 88
Figure 4-2. Sample Workflow With Two Branches . . . 89
Figure 4-3. Valid Workflow . . . 93
Figure 4-4. Example of a Loop . . . 93
Figure 4-5. Setting Link Condition . . . 95
Figure 4-6. Displaying Link Condition in the Workflow . . . 95
Figure 4-7. Expression Editor . . . 96
Figure 4-8. Expression Editor . . . 104
Figure 4-9. Expression Using a Pre-Defined Workflow Variable . . . 107
Figure 4-10. Status Variable Example . . . 107
Figure 4-11. PrevTaskStatus Variable Example . . . 108
Figure 4-12. Sample Workflow Using Workflow Variable . . . 108
Figure 4-13. Schedule tab . . . 115
Figure 4-14. Customized Repeat Dialog Box . . . 116
Figure 4-15. Example Workflow - Validation . . . 120
Figure 4-16. Running Part of a Workflow - Example . . . 125
Figure 5-1. General Tab - Edit Tasks Dialog Box . . . 135
Figure 5-2. Revert Button in Session Properties . . . 137
Figure 5-3. Run If Previous Completed Option . . . 146
Figure 5-4. Example Workflow Using a Decision Task . . . 150
Figure 5-5. Example Workflow without a Decision Task . . . 150
Figure 5-6. Expanded Example Workflow Using a Decision Task . . . 151
Figure 5-7. Example of User-Defined Event . . . 153
Figure 5-8. Example Workflow Using the Timer Task . . . 161
Figure 6-1. Workflow with Multiple Worklets . . . 167
Figure 6-2. Workflow with Nested Worklets . . . 168
Figure 6-3. Example of Persistent Worklet Variable . . . 169
Figure 7-1. Session Properties . . . 177
Figure 7-2. Session Target Object Settings . . . 179
Figure 7-3. Connection Options . . . 181
Figure 7-4. Config Object Tab . . . 183
Figure 7-5. Session Configuration Browser . . . 184
Figure 7-6. Stop or Continue the Session on Pre- or Post-Session SQL Errors . . . 187
Figure 7-7. Make Reusable Option for Pre-Session Shell Commands . . . 189
Figure 7-8. Stop or Continue the Session on Pre-Session Shell Command Error . . . 193
Figure 7-9. Assign Server Dialog Box . . . 198
Figure 8-1. Sources Node of the Session Properties . . . 210
Figure 8-2. Readers Settings in the Sources Node of the Mapping Tab . . . 211
Figure 8-3. Connections Settings in the Sources Node . . . 212
Figure 8-4. Properties Settings in the Sources Node of the Mapping Tab . . . 213
Figure 8-5. Treat Source Rows As Property . . . 215
Figure 8-6. Source Table Owner Name Property . . . 216
Figure 8-7. SQL Query Override Property in the Session Properties . . . 217
Figure 8-8. Properties Settings in the Sources Node for a Flat File Source . . . 219
Figure 8-9. Flat Files Dialog Box . . . 221
Figure 8-10. Fixed-Width File Properties Dialog Box . . . 221
Figure 8-11. Flat Files Dialog Box . . . 223
Figure 8-12. Delimited File Properties Dialog Box . . . 223
Figure 8-13. Line Sequential Buffer Length Property for File Sources . . . 225
Figure 9-1. Defining Target Properties in the Session Properties . . . 236
Figure 9-2. Writers Settings on the Mapping Tab of the Session Properties . . . 237
Figure 9-3. Connections Settings on the Mapping Tab of the Session Properties . . . 238
Figure 9-4. Properties Settings on the Mapping Tab of the Session Properties . . . 239
Figure 9-5. Properties Settings on the Mapping Tab for a Relational Target . . . 242
Figure 9-6. Test Load Options . . . 244
Figure 9-7. Session Retry on Deadlock . . . 247
Figure 9-8. Mapping Using Constraint-Based Loading . . . 250
Figure 9-9. Properties Settings on the Mapping Tab for a Flat File Target . . . 262
Figure 9-10. Test Load Options . . . 264
Figure 9-11. Flat Files Dialog Box . . . 265
Figure 9-12. Fixed Width Properties Dialog Box . . . 265
Figure 9-13. Flat Files Dialog Box . . . 266
Figure 9-14. Delimited File Properties Dialog Box . . . 267
Figure 10-1. Mapping with a Single Commit Source . . . 279
Figure 10-2. Mapping with Multiple Commit Sources . . . 280
Figure 10-3. Mapping with Targets Connected to a Commit Source . . . 281
Figure 10-4. Mapping a Custom Transformation with a Commit Source . . . 282
Figure 10-5. Roll Back on Failed Commit Example . . . 286
Figure 10-6. Transaction Control Units . . . 290
Figure 10-7. Session Commit Properties . . . 292
Figure 11-1. Mapping You Can Enable for Recovery . . . 303
Figure 11-2. Mapping You Cannot Enable for Recovery . . . 304
Figure 11-3. Modified Mapping You Can Enable for Recovery . . . 304
Figure 11-4. Resuming a Suspended Workflow with Sequential Sessions . . . 306
Figure 11-5. Resuming a Suspended Workflow with Concurrent Sessions . . . 307
Figure 11-6. Recovering Part of a Workflow With Sequential Sessions . . . 308
Figure 11-7. Recovering Part of a Workflow with Concurrent Sessions . . . 309
Figure 11-8. Recovering Concurrent Sessions Individually . . . 312
Figure 12-1. Email Task . . . 328
Figure 12-2. Post-Session Email Properties . . . 332
Figure 12-3. Suspension Email . . . 339
Figure 12-4. Email Task in a Workflow . . . 341
Figure 12-5. Using Post-Session Commands to Generate Reports . . . 342
Figure 12-6. Using Email Variables to Attach Reports . . . 343
Figure 12-7. Sending Email without Microsoft Outlook . . . 343
Figure 13-1. Default Partition Points and Stages in a Sample Mapping . . . 347
Figure 13-2. Threads Created for a Sample Mapping with Three Partitions . . . 348
Figure 13-3. Sample Mapping . . . 349
Figure 13-4. Session Properties Partitions View on the Mapping Tab . . . 351
Figure 13-5. Edit Partition Point Dialog Box . . . 352
Figure 13-6. Sample Mapping Showing Valid Partition Points . . . 354
Figure 13-7. Mapping where Round-robin Partitioning Can Increase Performance . . . 360
Figure 13-8. Mapping where Hash Partitioning Can Increase Performance . . . 361
Figure 13-9. Edit Partition Key Dialog Box . . . 362
Figure 13-10. Mapping where Key Range Partitioning Can Increase Performance . . . 363
Figure 13-11. Edit Partition Key Dialog Box . . . 364
Figure 13-12. Adding Key Ranges . . . 365
Figure 13-13. Mapping where Pass-through Partitioning Can Increase Performance . . . 367
Figure 13-14. Overriding the SQL Query and Entering a Filter Condition . . . 371
Figure 13-15. Properties Settings for Relational Targets in the Session Properties . . . 378
Figure 13-16. Connections Settings for File Targets in the Session Properties . . . 381
Figure 13-17. Properties Settings for File Targets in the Session Properties . . . 382
Figure 13-18. Sorted File Data with 1:n Partitions . . . 386
Figure 13-19. Sorted File Data Passed Through a Single Partition . . . 387
Figure 13-20. Sorted Relational Data with 1:n Partitioning . . . 388
Figure 13-21. Sorted Relational Data Passed Through a Single Partition . . . 389
Figure 13-22. Using Sorter Transformations with Hash Auto-Keys to Maintain Sort Order . . . 390
Figure 13-23. Session Properties - Configuring Sorter Transformations . . . 393
Figure 14-1. Workflow Monitor . . . 403
Figure 14-2. Workflow Monitor Statistics Dialog Box . . . 408
Figure 14-3. General Tab for Workflow Monitor Options . . . 410
Figure 14-4. Gantt Chart Options . . . 411
Figure 14-5. Task View Options . . . 412
Figure 14-6. Advanced Tab for Workflow Monitor Options . . . 413
Figure 14-7. Standard Toolbar . . . 415
Figure 14-8. Server Toolbar . . . 415
Figure 14-9. View Toolbar . . . 415
Figure 14-10. Filter Toolbar . . . 415
Figure 14-11. History Names Dialog Box . . . 420
Figure 14-12. Gantt Chart View . . . 423
Figure 14-13. Organizing Gantt Chart . . . 426
Figure 14-14. Zooming the Gantt Chart View . . . 427
Figure 14-15. Task View . . . 431
Figure 14-16. Session Properties Transformation Statistics . . . 434
Figure 15-1. Distributing Sessions in a Server Grid . . . 446
Figure 15-2. Running a Non-session Task on the Master Server . . . 447
Figure 16-1. Properties Settings on the Mapping Tab . . . 477
Figure 18-1. Using $PMSessionLogFile as the Name of the Session Log . . . 497
Figure 18-2. Using Parameters to Change the Session Source File . . . 502
Figure 18-3. Using Parameters to Change the Session Target File . . . 504
Figure 18-4. Using Parameters to Change the Session Lookup File . . . 506
Figure 18-5. Using Parameters to Change the Reject File Name . . . 508
Figure 20-1. Control File Editor Dialog Box for Teradata . . . 539
Figure 20-2. Writers Settings on the Mapping Tab . . . 553
Figure 20-3. Properties Settings on the Mapping Tab . . . 554
Figure 20-4. Connections Settings on the Mapping Tab . . . 556
Figure 22-1. Incremental Aggregation Session Properties . . . 580
Figure 25-1. Single-Pass Reading . . . 648
Figure A-1. General Tab . . . 668
Figure A-2. Properties Tab - General Options Settings . . . 670
Figure A-3. Properties Tab - Performance Settings . . . 673
Figure A-4. Config Object Tab - Advanced Settings . . . 676
Figure A-5. Config Object Tab - Log Option Settings . . . 677
Figure A-6. Config Object Tab - Error Handling Settings . . . 679
Figure A-7. Mapping Tab - Connections Settings . . . 681
Figure A-8. Mapping Tab - Sources Node - Readers Settings . . . 684
Figure A-9. Mapping Tab - Sources Node - Connections Settings . . . 685
Figure A-10. Mapping Tab - Sources Node - Properties Settings . . . 686
Figure A-11. Flat Files Dialog Box for Sources . . . 688
Figure A-12. Fixed Width Properties . . . 689
Figure A-13. Delimited Properties for File Sources . . . 690
Figure A-14. Mapping Tab - Targets Node - Writers Settings . . . 693
Figure A-15. Mapping Tab - Targets Node - Connections Settings . . . 694
Figure A-16. Mapping Tab - Targets Node - Properties Settings (Relational) . . . 696
Figure A-17. Mapping Tab - Targets Node - File Properties Settings . . . 699
Figure A-18. Flat Files Dialog Box for Targets . . . 701
Figure A-19. Fixed-Width Properties for File Targets . . . 702
Figure A-20. Delimited Properties for File Targets . . . 702
Figure A-21. Mapping Tab - Transformations Node . . . 704
Figure A-22. Mapping Tab - Partitions Properties Node . . . 705
Figure A-23. Mapping Tab - KeyRange Node . . . 706
Figure A-24. Mapping Tab - Partition Points Node . . . 707
Figure A-25. Edit Partition Point Dialog Box . . . 708
Figure A-26. Edit Partition Key Dialog Box . . . 709
Figure A-27. Components Tab . . . 710
Figure A-28. Task Browser . . . 712
Figure A-29. Edit Pre-Session Command Dialog Box . . . 713
Figure A-30. Email Object Browser . . . 715
Figure A-31. On-Success or On-Failure Email - General Tab . . . 716
Figure A-32. On-Success or On-Failure Email - Properties Tab . . . 717
Figure A-33. Metadata Extensions Tab . . . 718
Figure B-1. Workflow Properties - General Tab . . . 722
Figure B-2. Workflow Properties - Properties Tab . . . 724
Figure B-3. Workflow Properties - Scheduler Tab . . . 726
Figure B-4. Workflow Properties - Scheduler Tab - Edit Scheduler Dialog Box . . . 727
Figure B-5. Workflow Properties - Customized Repeat Dialog Box . . . 729
Figure B-6. Workflow Properties - Variables Tab . . . 731
Figure B-7. Workflow Properties - Events Tab . . . 732
Figure B-8. Workflow Properties - Metadata Extensions Tab . . . 733
Figure C-1. Server Manager General Tab . . . 737
Figure C-2. Server Manager Source Options Dialog Box for File Sources . . . 739
Figure C-3. Server Manager Fixed-Width Properties Dialog Box . . . 740
Figure C-4. Server Manager Delimited File Properties Dialog Box . . . 741
Figure C-5. Server Manager Source Options Dialog Box (XML Sources) . . . 741
Figure C-6. Server Manager FTP Properties Dialog Box . . . 742
Figure C-7. Server Manager Targets Dialog Box . . . 744
Figure C-8. Server Manager Output Files Dialog Box . . . 745
Figure C-9. Server Manager External Loader Properties . . . 747
Figure C-10. Server Manager Fixed-Width Dialog Box (Output Files) . . . 747
Figure C-11. Server Manager Delimited File Properties Dialog Box (Output Files) . . . 748
Figure C-12. Server Manager XML Target Dialog Box . . . 748
Figure C-13. Server Manager Reject File Dialog Box . . . 749
Figure C-14. Server Manager Pre-Session Commands Dialog Box . . . 750
Figure C-15. Server Manager Post-Session Commands and Email . . . 751
Figure C-16. Server Manager Configuration Parameter Dialog Box . . . 752
Figure C-17. Server Manager Source Location Tab . . . 754
Figure C-18. Server Manager Time tab . . . 755
Figure C-19. Server Manager Repeat Dialog Box . . . 756
Figure C-20. Server Manager Log and Error Handling Tab . . . 758
Figure C-21. Server Manager Transformations Tab . . . 761

List of Tables
Table 1-1. PowerCenter Server Connectivity Requirements . . . 6
Table 1-2. Processing Threads . . . 14
Table 2-1. Workflow Manager General Options . . . 40
Table 2-2. Workflow Manager Format Options . . . 42
Table 2-3. Workflow Manager Miscellaneous Options . . . 43
Table 2-4. Default Permissions for Connection Objects . . . 44
Table 2-5. Server Variables . . . 47
Table 2-6. TCP/IP Settings to Register a Server . . . 49
Table 2-7. Native Connect String Syntax . . . 54
Table 2-8. Source and Target Code Page Compatibility . . . 54
Table 2-9. Relational Database Connection Information . . . 57
Table 2-10. Relational Database Connection Attributes . . . 58
Table 3-1. Metadata Extension Attributes in the Workflow Manager . . . 83
Table 3-2. Workflow Manager Keyboard Shortcuts . . . 86
Table 3-3. Keyboard Shortcuts for Navigating the Workspace . . . 86
Table 4-1. Task-Specific Workflow Variables . . . 105
Table 4-2. Datatype Default Values for User-defined Workflow Variables . . . 110
Table 4-3. Schedule Tab Settings . . . 115
Table 4-4. Repeat Dialog Box Options . . . 117
Table 5-1. Workflow Tasks . . . 132
Table 5-2. Timer Task Attributes . . . 162
Table 7-1. Apply All Options . . . 179
Table 7-2. PowerCenter Server Behavior for Failed Sessions . . . 201
Table 8-1. Treat Source Rows As Options . . . 215
Table 8-2. Flat File Source Properties . . . 220
Table 8-3. Fixed-Width File Properties for File Sources . . . 222
Table 8-4. Delimited File Properties for File Sources . . . 224
Table 8-5. Support for ASCII and Unicode Data Movement Modes . . . 226
Table 8-6. Null Character Handling . . . 228
Table 9-1. Support for ASCII and Unicode Data Movement Modes . . . 234
Table 9-2. Relational Target Properties . . . 242
Table 9-3. Test Load Options . . . 244
Table 9-4. PowerCenter Server Commands on Supported Databases . . . 245
Table 9-5. Flat File Target Properties . . . 262
Table 9-6. Test Load Options . . . 264
Table 9-7. Writing to a Fixed-Width Target . . . 266
Table 9-8. Delimited File Properties . . . 267
Table 9-9. Datatype Modifications for File Target Columns . . . 269
Table 9-10. Field Length Measurements for Fixed-Width Flat File Targets . . . 270
Table 9-11. Characters to Include when Calculating Field Length for Fixed-Width Targets . . . 270
Table 10-1. Transformation Scope Property Values . . . 288
Table 10-2. Session Commit Properties . . . 292
Table 11-1. PM_RECOVERY Table Definition . . . 299
Table 11-2. PM_TGT_RUN_ID Table Definition . . . 299
Table 11-3. pmcmd Return Codes for Recovery . . . 300
Table 11-4. Transformations that Output Repeatable Data . . . 301
Table 12-1. Email Variables for Post-Session Email . . . 334
Table 12-2. Format Tags for Email Tasks . . . 334
Table 13-1. Default Partition Points . . . 347
Table 13-2. Options on Session Properties Partitions View on the Mapping Tab . . . 352
Table 13-3. Edit Partition Point Dialog Box Options . . . 353
Table 13-4. Valid Partition Types for Partition Points . . . 357
Table 13-5. File Properties Settings for File Sources . . . 376
Table 13-6. Configuring Source File Name for Single-Threaded Reading . . . 376
Table 13-7. Configuring Source File Name for Multi-Threaded Reading . . . 377
Table 13-8. Partitioning Relational Target Attributes . . . 379
Table 13-9. File Targets Connection Options . . . 381
Table 13-10. Target File Properties . . . 382
Table 13-11. Variable Value Calculations with Partitioned Sessions . . . 394
Table 13-12. Restrictions on the Number of Partitions for Transformations . . . 396
Table 13-13. Partitioning Guidelines for Informatica Application Products . . . 397
Table 14-1. Workflow Monitor General Options . . . 410
Table 14-2. Gantt Chart Options . . . 411
Table 14-3. Advanced Workflow Monitor Options . . . 413
Table 14-4. Workflow and Task Status . . . 421
Table 14-5. Session Details on the Transformation Statistics Tab . . . 434
Table 14-6. Performance Counters . . . 438
Table 15-1. Losing Connectivity in a Server Grid . . . 448
Table 15-2. Override Workflow Properties . . . 451
Table 15-3. Override Server Grid Properties . . . 451
Table 16-1. Log File Default Locations . . . 456
Table 16-2. Workflow Log Codes . . . 458
Table 16-3. Session Log Codes . . . 464
Table 16-4. Session Log Tracing Levels . . . 473
Table 16-5. Row Indicators in Reject File . . . 478
Table 16-6. Column Indicators in Reject File . . . 479
Table 17-1. PMERR_DATA Table Schema . . . 483
Table 17-2. PMERR_MSG Table Schema . . . 485
Table 17-3. PMERR_SESS Table Schema . . . 487
Table 17-4. PMERR_TRANS Table Schema . . . 487
Table 17-5. Error Log File Column Headers . . . 490
Table 17-6. Error Log Options . . . 494
Table 18-1. Naming Conventions for User-Defined Session Parameters . . . 496
Table 19-1. Parameters and Variables in Parameter File . . . 513
Table 19-2. Naming Conventions for User-Defined Session Parameters . . . 520
Table 20-1. Partitioning Guidelines for External Loaders . . . 527
Table 20-2. DB2 EE External Loader Attributes . . . 529
Table 20-3. DB2 EE External Loader Return Codes . . . 530
Table 20-4. DB2 EEE External Loader Attributes . . . 531
Table 20-5. Oracle External Loader Attributes . . . 534
Table 20-6. Sybase IQ External Loader Attributes . . . 536
Table 20-7. Teradata MultiLoad External Loader Attributes . . . 540
Table 20-8. Teradata MultiLoad External Loader Attributes Defined at the Session Level . . . 542
Table 20-9. Teradata TPump External Loader Attributes . . . 542
Table 20-10. Teradata TPump External Loader Attributes Defined at the Session Level . . . 544
Table 20-11. Teradata FastLoad External Loader Attributes . . . 545
Table 20-12. Teradata FastLoad External Loader Attributes Defined at the Session Level . . . 546
Table 20-13. Teradata Warehouse Builder Operators and Protocol . . . 547
Table 20-14. Teradata Warehouse Builder External Loader Attributes . . . 547
Table 20-15. Teradata Warehouse Builder External Loader Attributes Defined at the Session Level . . . 549
Table 20-16. Properties Settings . . . 555
Table 21-1. FTP Options . . . 563
Table 23-1. pmcmd Commands . . . 582
Table 23-2. Connection Information for the Command Line Mode . . . 590
Table 23-3. pmcmd Return Codes . . . 590
Table 23-4. Setting Defaults for the Interactive Mode . . . 593
Table 23-5. Command Parameters . . . 594
Table 23-6. pmcmd Syntax Notation . . . 595
Table 24-1. Caching Storage Overview . . . 614
Table 24-2. Cache File Names . . . 616
Table 24-3. Aggregate Cache Calculation . . . 617
Table 24-7. Column Sizes for Cache Calculations . . . 618
Table 24-4. Rank Cache Calculation . . . 618
Table 24-5. Joiner Cache Calculation . . . 618
Table 24-6. Lookup Cache Calculation . . . 618
Table 25-1. Session Tuning Parameters . . . 655
Table A-1. General Tab . . . 668
Table A-2. Properties Tab - General Options Settings . . . 671
Table A-3. Properties Tab - Performance Settings . . . 674
Table A-4. Config Object Tab - Advanced Settings . . . 676
Table A-5. Config Object Tab - Log Options Settings . . . 678
Table A-6. Config Object Tab - Error Handling Settings . . . 679
Table A-7. Mapping Tab - Connections Settings . . . 682
Table A-8. Mapping Tab - Sources Node - Connections Settings . . . 685
Table A-9. Mapping Tab - Sources Node - Properties Settings (Relational Sources) . . . 686
Table A-10. Mapping Tab - Sources Node - Properties Settings (File Sources) . . . 687

Table A-11. Fixed-Width Properties for File Sources . . . 689
Table A-12. Delimited Properties for File Sources . . . 691
Table A-13. Mapping Tab - Targets Node - Writers Settings . . . 693
Table A-14. Mapping Tab - Targets Node - Connections Settings . . . 695
Table A-15. Mapping Tab - Targets Node - Properties Settings (Relational) . . . 697
Table A-16. Mapping Tab - Targets Node - File Properties Settings . . . 699
Table A-17. Fixed-Width Properties for File Targets . . . 702
Table A-18. Delimited Properties for File Targets . . . 703
Table A-19. Mapping Tab - Partition Points Node . . . 707
Table A-20. Edit Partition Point Dialog Box Options . . . 708
Table A-21. Components Tab . . . 711
Table A-22. Components Tab Tasks . . . 711
Table A-23. Pre- or Post-Session Commands - General Tab . . . 713
Table A-24. Pre- or Post-Session Commands - Properties Tab . . . 714
Table A-25. Pre- or Post-Session Commands - Commands Tab . . . 714
Table A-26. On-Success or On-Failure Emails - General Tab . . . 716
Table A-27. On-Success or On-Failure Emails - Properties Tab . . . 717
Table A-28. Metadata Extensions Tab . . . 718
Table B-1. Workflow Properties - General Tab . . . 722
Table B-2. Workflow Properties - Properties Tab . . . 724
Table B-3. Workflow Properties - Scheduler Tab . . . 727
Table B-4. Workflow Properties - Scheduler Tab - Edit Scheduler Dialog Box . . . 728
Table B-5. Workflow Properties - Repeat Dialog Box Options . . . 729
Table B-6. Workflow Properties - Variables Tab . . . 731
Table B-7. Workflow Properties - Events Tab . . . 732
Table B-8. Workflow Properties - Metadata Extensions Tab . . . 733
Table C-1. General Session Options Comparison . . . 738
Table C-2. Source Options Comparison . . . 738
Table C-3. File Source Options Comparison . . . 739
Table C-4. XML Sources Options Comparison . . . 742
Table C-5. FTP Properties Comparison . . . 743
Table C-6. Target Options Comparison . . . 743
Table C-7. Relational Target Options Comparison . . . 744
Table C-8. File Target Output Options Comparison . . . 746
Table C-9. XML Target Options Comparison . . . 748
Table C-10. Reject Files Options Comparison . . . 749
Table C-11. Pre-Session Commands Comparison . . . 750
Table C-12. Post-Session Commands and Email Comparison . . . 751
Table C-13. Performance Options Comparison . . . 752
Table C-14. Configuration Parameters Comparison . . . 753
Table C-15. Log File Options Comparison . . . 759
Table C-16. Error Handling Options Comparison . . . 759
Table C-17. Transformations Tab Options Comparison . . . 761


Preface
Welcome to PowerCenter, Informatica's software product that delivers an open, scalable data
integration solution addressing the complete life cycle for all data integration projects
including data warehouses and data marts, data migration, data synchronization, and
information hubs. PowerCenter combines the latest technology enhancements for reliably
managing data repositories and delivering information resources in a timely, usable, and
efficient manner.
The PowerCenter metadata repository coordinates and drives a variety of core functions,
including extracting, transforming, loading, and managing data. The PowerCenter Server can
extract large volumes of data from multiple platforms, handle complex transformations on the
data, and support high-speed loads. PowerCenter can simplify and accelerate the process of
moving data warehouses from development to test to production.


New Features and Enhancements


This section describes new features and enhancements to PowerCenter 7.1.1, 7.1, and 7.0.

PowerCenter 7.1.1
This section describes new features and enhancements to PowerCenter 7.1.1.

Data Profiling

Data sampling. You can create a data profile for a sample of source data instead of the
entire source. You can view a profile from a random sample of data, a specified percentage
of data, or for a specified number of rows starting with the first row.

Verbose data enhancements. You can specify the type of verbose data you want the
PowerCenter Server to write to the Data Profiling warehouse. The PowerCenter Server can
write all rows, the rows that meet the business rule, or the rows that do not meet the
business rule.

Session enhancement. You can save sessions that you create from the Profile Manager to
the repository.

Domain Inference function tuning. You can configure the Data Profiling Wizard to filter
the Domain Inference function results. You can configure a maximum number of patterns
and a minimum pattern frequency. You may want to narrow the scope of patterns returned
to view only the primary domains, or you may want to widen the scope of patterns
returned to view exception data.

Row Uniqueness function. You can determine unique rows for a source based on a
selection of columns for the specified source.

Define mapping, session, and workflow prefixes. You can define default mapping,
session, and workflow prefixes for the mappings, sessions, and workflows generated when
you create a data profile.

Profile mapping display in the Designer. The Designer displays profile mappings under a
profile mappings node in the Navigator.

PowerCenter Server


Code page. PowerCenter supports additional Japanese language code pages, such as JIPSE-kana, JEF-kana, and MELCOM-kana.

Flat file partitioning. When you create multiple partitions for a flat file source session, you
can configure the session to create multiple threads to read the flat file source.

pmcmd. You can use parameter files that reside on a local machine with the Startworkflow
command in the pmcmd program. When you use a local parameter file, pmcmd passes
variables and values in the file to the PowerCenter Server.

SuSE Linux support. The PowerCenter Server runs on SuSE Linux. On SuSE Linux, you
can connect to IBM, DB2, Oracle, and Sybase sources, targets, and repositories using
native drivers. Use ODBC drivers to access other sources and targets.

Reserved word support. If any source, target, or lookup table name or column name
contains a database reserved word, you can create and maintain a file, reswords.txt,
containing reserved words. When the PowerCenter Server initializes a session, it searches
for reswords.txt in the PowerCenter Server installation directory. If the file exists, the
PowerCenter Server places quotes around matching reserved words when it executes SQL
against the database.
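
For illustration only, a reswords.txt file might look like the following sketch, with reserved words grouped under a heading for each database type. The section names and words shown here are placeholders rather than a complete or authoritative list; verify the reserved words against your database documentation:

[Teradata]
MONTH
DATE
[Oracle]
OPTION
START
[SQL Server]
CURRENT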

Teradata external loader. When you load to Teradata using an external loader, you can
now override the control file. Depending on the loader you use, you can also override the
error, log, and work table names by specifying different tables on the same or different
Teradata database.

Repository

Exchange metadata with other tools. You can exchange source and target metadata with
other BI or data modeling tools, such as Business Objects Designer. You can export or
import multiple objects at a time. When you export metadata, the PowerCenter Client
creates a file format recognized by the target tool.

Repository Server

pmrep. You can use pmrep to perform the following functions:

Remove repositories from the Repository Server cache entry list.

Enable enhanced security when you create a relational source or target connection in the
repository.

Update a connection attribute value when you update the connection.

SuSE Linux support. The Repository Server runs on SuSE Linux. On SuSE Linux, you
can connect to IBM, DB2, Oracle, and Sybase repositories.

Security

Oracle OS Authentication. You can now use Oracle OS Authentication to authenticate


database users. Oracle OS Authentication allows you to log on to an Oracle database if you
have a logon to the operating system. You do not need to know a database user name and
password. PowerCenter uses Oracle OS Authentication when the user name for an Oracle
connection is PmNullUser.

Web Services Provider

Attachment support. When you import web service definitions with attachment groups,
you can pass attachments through the requests or responses in a service session. The
document type you can attach is based on the mime content of the WSDL file. You can
attach document types such as XML, JPEG, GIF, or PDF.

Preface

xxxvii

Pipeline partitioning. You can create multiple partitions in a session containing web
service source and target definitions. The PowerCenter Server creates a connection to the
Web Services Hub based on the number of sources, targets, and partitions in the session.

XML

Multi-level pivoting. You can now pivot more than one multiple-occurring element in an
XML view. You can also pivot the view row.

PowerCenter 7.1
This section describes new features and enhancements to PowerCenter 7.1.

Data Profiling

Data Profiling for VSAM sources. You can now create a data profile for VSAM sources.

Support for verbose mode for source-level functions. You can now create data profiles
with source-level functions and write data to the Data Profiling warehouse in verbose
mode.

Aggregator function in auto profiles. Auto profiles now include the Aggregator function.

Creating auto profile enhancements. You can now select the columns or groups you want
to include in an auto profile and enable verbose mode for the Distinct Value Count
function.

Purging data from the Data Profiling warehouse. You can now purge data from the Data
Profiling warehouse.

Source View in the Profile Manager. You can now view data profiles by source definition
in the Profile Manager.

PowerCenter Data Profiling report enhancements. You can now view PowerCenter Data
Profiling reports in a separate browser window, resize columns in a report, and view
verbose data for Distinct Value Count functions.

Prepackaged domains. Informatica provides a set of prepackaged domains that you can
include in a Domain Validation function in a data profile.

Documentation

Web Services Provider Guide. This is a new book that describes the functionality of Real-time
Web Services. It also includes information from the version 7.0 Web Services Hub Guide.

XML User Guide. This book consolidates XML information previously documented in the
Designer Guide, Workflow Administration Guide, and Transformation Guide.

Licensing
Informatica provides licenses for each CPU and each repository rather than for each
installation. Informatica provides licenses for product, connectivity, and options. You store
the license keys in a license key file. You can manage the license files using the Repository
Server Administration Console, the PowerCenter Server Setup, and the command line
program, pmlic.

PowerCenter Server

64-bit support. You can now run 64-bit PowerCenter Servers on AIX and HP-UX
(Itanium).

Partitioning enhancements. If you have the Partitioning option, you can define up to 64
partitions at any partition point in a pipeline that supports multiple partitions.

PowerCenter Server processing enhancements. The PowerCenter Server now reads a block of rows at a time. This improves processing performance for most sessions.

CLOB/BLOB datatype support. You can now read and write CLOB/BLOB datatypes.

PowerCenter Metadata Reporter


PowerCenter Metadata Reporter modified some report names and uses the PowerCenter 7.1
MX views in its schema.

Repository Server

Updating repository statistics. PowerCenter now identifies and updates statistics for all
repository tables and indexes when you copy, upgrade, and restore repositories. This
improves performance when PowerCenter accesses the repository.

Increased repository performance. You can increase repository performance by skipping information when you copy, back up, or restore a repository. You can choose to skip MX data, workflow and session log history, and deploy group history.

pmrep. You can use pmrep to back up, disable, or enable a repository, delete a relational
connection from a repository, delete repository details, truncate log files, and run multiple
pmrep commands sequentially. You can also use pmrep to create, modify, and delete a
folder.

Repository

Exchange metadata with business intelligence tools. You can export metadata to and
import metadata from other business intelligence tools, such as Cognos Report Net and
Business Objects.

Object import and export enhancements. You can compare objects in an XML file to
objects in the target repository when you import objects.

MX views. MX views have been added to help you analyze metadata stored in the
repository. REP_SERVER_NET and REP_SERVER_NET_REF views allow you to see
information about server grids. REP_VERSION_PROPS allows you to see the version
history of all objects in a PowerCenter repository.


Transformations

Flat file lookup. You can now perform lookups on flat files. When you create a Lookup
transformation using a flat file as a lookup source, the Designer invokes the Flat File
Wizard. You can also use a lookup file parameter if you want to change the name or
location of a lookup between session runs.

Dynamic lookup cache enhancements. When you use a dynamic lookup cache, the
PowerCenter Server can ignore some ports when it compares values in lookup and input
ports before it updates a row in the cache. Also, you can choose whether the PowerCenter
Server outputs old or new values from the lookup/output ports when it updates a row. You
might want to output old values from lookup/output ports when you use the Lookup
transformation in a mapping that updates slowly changing dimension tables.

Union transformation. You can use the Union transformation to merge multiple sources
into a single pipeline. The Union transformation is similar to using the UNION ALL SQL
statement to combine the results from two or more SQL statements.

Custom transformation API enhancements. The Custom transformation API includes new array-based functions that allow you to create procedure code that receives and outputs a block of rows at a time. Use these functions to take advantage of the PowerCenter Server processing enhancements.

Midstream XML transformations. You can now create an XML Parser transformation or
an XML Generator transformation to parse or generate XML inside a pipeline. The XML
transformations enable you to extract XML data stored in relational tables, such as data
stored in a CLOB column. You can also extract data from messaging systems, such as
TIBCO or IBM MQSeries.

Usability

Viewing active folders. The Designer and the Workflow Manager highlight the active
folder in the Navigator.

Enhanced printing. The quality of printed workspace has improved.

Version Control
You can run object queries that return shortcut objects. You can also run object queries based
on the latest status of an object. The query can return local objects that are checked out, the
latest version of checked in objects, or a collection of all older versions of objects.

Web Services Provider


Real-time Web Services. Real-time Web Services allows you to create services using the
Workflow Manager and make them available to web service clients through the Web
Services Hub. The PowerCenter Server can perform parallel processing of both request-response and one-way services.

Web Services Hub. The Web Services Hub now hosts Real-time Web Services in addition
to Metadata Web Services and Batch Web Services. You can install the Web Services Hub
on a JBoss application server.

Note: PowerCenter Connect for Web Services allows you to create sources, targets, and transformations to call web services hosted by other providers. For more information, see the PowerCenter Connect for Web Services User and Administrator Guide.

Workflow Monitor
The Workflow Monitor includes the following performance and usability enhancements:

When you connect to the PowerCenter Server, you no longer distinguish between online
or offline mode.

You can open multiple instances of the Workflow Monitor on one machine.

You can simultaneously monitor multiple PowerCenter Servers registered to the same
repository.

The Workflow Monitor includes improved options for filtering tasks by start and end
time.

The Workflow Monitor displays workflow runs in Task view chronologically with the most
recent run at the top. It displays folders alphabetically.

You can remove the Navigator and Output window.

XML Support
PowerCenter XML support now includes the following features:

Enhanced datatype support. You can use XML schemas that contain simple and complex
datatypes.

Additional options for XML definitions. When you import XML definitions, you can
choose how you want the Designer to represent the metadata associated with the imported
files. You can choose to generate XML views using hierarchy or entity relationships. In a
view with hierarchy relationships, the Designer expands each element and reference under
its parent element. When you create views with entity relationships, the Designer creates
separate entities for references and multiple-occurring elements.

Synchronizing XML definitions. You can synchronize one or more XML definitions when
the underlying schema changes. You can synchronize an XML definition with any
repository definition or file used to create the XML definition, including relational sources
or targets, XML files, DTD files, or schema files.

XML workspace. You can edit XML views and relationships between views in the
workspace. You can create views, add or delete columns from views, and define
relationships between views.

Midstream XML transformations. You can now create an XML Parser transformation or
an XML Generator transformation to parse or generate XML inside a pipeline. The XML
transformations enable you to extract XML data stored in relational tables, such as data
stored in a CLOB column. You can also extract data from messaging systems, such as
TIBCO or IBM MQSeries.


Support for circular references. Circular references occur when an element is a direct or
indirect child of itself. PowerCenter now supports XML files, DTD files, and XML
schemas that use circular definitions.

Increased performance for large XML targets. You can create XML files of several
gigabytes in a PowerCenter 7.1 XML session by using the following enhancements:

Spill to disk. You can specify the size of the cache used to store the XML tree. If the size
of the tree exceeds the cache size, the XML data spills to disk in order to free up
memory.

User-defined commits. You can define commits to trigger flushes for XML target files.

Support for multiple XML output files. You can output XML data to multiple XML
targets. You can also define the file names for XML output files in the mapping.

PowerCenter 7.0
This section describes new features and enhancements to PowerCenter 7.0.

Data Profiling
If you have the Data Profiling option, you can profile source data to evaluate source data and
detect patterns and exceptions. For example, you can determine implicit data type, suggest
candidate keys, detect data patterns, and evaluate join criteria. After you create a profiling
warehouse, you can create profiling mappings and run sessions. Then you can view reports
based on the profile data in the profiling warehouse.
The PowerCenter Client provides a Profile Manager and a Profile Wizard to complete these
tasks.

Data Integration Web Services


You can use Data Integration Web Services to write applications to communicate with the
PowerCenter Server. Data Integration Web Services is a web-enabled version of the
PowerCenter Server functionality available through Load Manager and Metadata Exchange. It
is comprised of two services for communication with the PowerCenter Server, Load Manager
and Metadata Exchange Web Services running on the Web Services Hub.

Documentation


Glossary. The Installation and Configuration Guide contains a glossary of new PowerCenter
terms.

Installation and Configuration Guide. The connectivity information in the Installation and Configuration Guide is consolidated into two chapters. This book now contains chapters titled "Connecting to Databases from Windows" and "Connecting to Databases from UNIX."

Upgrading metadata. The Installation and Configuration Guide now contains a chapter titled "Upgrading Repository Metadata." This chapter describes changes to repository objects impacted by the upgrade process. The change in functionality for existing objects
depends on the version of the existing objects. Consult the upgrade information in this
chapter for each upgraded object to determine whether the upgrade applies to your current
version of PowerCenter.

Functions

Soundex. The Soundex function encodes a string value into a four-character string.
SOUNDEX works for characters in the English alphabet (A-Z). It uses the first character
of the input string as the first character in the return value and encodes the remaining
three unique consonants as numbers.

Metaphone. The Metaphone function encodes string values. You can specify the length of
the string that you want to encode. METAPHONE encodes characters of the English
language alphabet (A-Z). It encodes both uppercase and lowercase letters in uppercase.

Installation

Remote PowerCenter Client installation. You can create a control file containing
installation information, and distribute it to other users to install the PowerCenter Client.
You access the Informatica installation CD from the command line to create the control
file and install the product.

PowerCenter Metadata Reporter


PowerCenter Metadata Reporter replaces Runtime Metadata Reporter and Informatica
Metadata Reporter. PowerCenter Metadata Reporter includes the following features:

Metadata browsing. You can use PowerCenter Metadata Reporter to browse PowerCenter
7.0 metadata, such as workflows, worklets, mappings, source and target tables, and
transformations.

Metadata analysis. You can use PowerCenter Metadata Reporter to analyze operational
metadata, including session load time, server load, session completion status, session
errors, and warehouse growth.

PowerCenter Server

DB2 bulk loading. You can enable bulk loading when you load to IBM DB2 8.1.

Distributed processing. If you purchase the Server Grid option, you can group
PowerCenter Servers registered to the same repository into a server grid. In a server grid,
PowerCenter Servers balance the workload among all the servers in the grid.

Row error logging. The session configuration object has new properties that allow you to
define error logging. You can choose to log row errors in a central location to help
understand the cause and source of errors.

External loading enhancements. When using external loaders on Windows, you can now
choose to load from a named pipe. When using external loaders on UNIX, you can now
choose to load from staged files.

External loading using Teradata Warehouse Builder. You can use Teradata Warehouse
Builder to load to Teradata. You can choose to insert, update, upsert, or delete data.
Additionally, Teradata Warehouse Builder can simultaneously read from multiple sources
and load data into one or more tables.

Mixed mode processing for Teradata external loaders. You can now use data driven load
mode with Teradata external loaders. When you select data driven loading, the
PowerCenter Server flags rows for insert, delete, or update. It writes a column in the target
file or named pipe to indicate the update strategy. The control file uses these values to
determine how to load data to the target.

Concurrent processing. The PowerCenter Server now reads data concurrently from
sources within a target load order group. This enables more efficient joins with minimal
usage of memory and disk cache.

Real time processing enhancements. You can now use real-time processing in sessions that
also process active transformations, such as the Aggregator transformation. You can apply
the transformation logic to rows defined by transaction boundaries.

Repository Server

Object export and import enhancements. You can now export and import objects using
the Repository Manager and pmrep. You can export and import multiple objects and
objects types. You can export and import objects with or without their dependent objects.
You can also export objects from a query result or objects history.

pmrep commands. You can use pmrep to perform change management tasks, such as
maintaining deployment groups and labels, checking in, deploying, importing, exporting,
and listing objects. You can also use pmrep to run queries. The deployment and object
import commands require you to use a control file to define options and resolve conflicts.

Trusted connections. You can now use a Microsoft SQL Server trusted connection to
connect to the repository.

Security


LDAP user authentication. You can now use default repository user authentication or
Lightweight Directory Access Protocol (LDAP) to authenticate users. If you use LDAP, the
repository maintains an association between your repository user name and your external
login name. When you log in to the repository, the security module passes your login name
to the external directory for authentication. The repository maintains a status for each
user. You can now enable or disable users from accessing the repository by changing the
status. You do not have to delete user names from the repository.

Use Repository Manager privilege. The Use Repository Manager privilege allows you to
perform tasks in the Repository Manager, such as copy object, maintain labels, and change
object status. You can perform the same tasks in the Designer and Workflow Manager if
you have the Use Designer and Use Workflow Manager privileges.

Audit trail. You can track changes to repository users, groups, privileges, and permissions
through the Repository Server Administration Console. The Repository Agent logs
security changes to a log file stored in the Repository Server installation directory. The audit trail log contains information, such as changes to folder properties, adding or
removing a user or group, and adding or removing privileges.

Transformations

Custom transformation. Custom transformations operate in conjunction with procedures you create outside of the Designer interface to extend PowerCenter functionality. The
Custom transformation replaces the Advanced External Procedure transformation. You can
create Custom transformations with multiple input and output groups, and you can
compile the procedure with any C compiler.
You can create templates that customize the appearance and available properties of a
Custom transformation you develop. You can specify the icons used for transformation,
the colors, and the properties a mapping developer can modify. When you create a Custom
transformation template, distribute the template with the DLL or shared library you
develop.

Joiner transformation. You can use the Joiner transformation to join two data streams that
originate from the same source.

Version Control
The PowerCenter Client and repository introduce features that allow you to create and
manage multiple versions of objects in the repository. Version control allows you to maintain
multiple versions of an object, control development on the object, track changes, and use
deployment groups to copy specific groups of objects from one repository to another. Version
control in PowerCenter includes the following features:

Object versioning. Individual objects in the repository are now versioned. This allows you
to store multiple copies of a given object during the development cycle. Each version is a
separate object with unique properties.

Check out and check in versioned objects. You can check out and reserve an object you
want to edit, and check in the object when you are ready to create a new version of the
object in the repository.

Compare objects. The Repository Manager and Workflow Manager allow you to compare
two repository objects of the same type to identify differences between them. You can
compare Designer objects and Workflow Manager objects in the Repository Manager. You
can compare tasks, sessions, worklets, and workflows in the Workflow Manager. The
PowerCenter Client tools allow you to compare objects across open folders and
repositories. You can also compare different versions of the same object.

Delete or purge a version. You can delete an object from view and continue to store it in
the repository. You can recover or undelete deleted objects. If you want to permanently
remove an object version, you can purge it from the repository.

Deployment. Unlike copying a folder, copying a deployment group allows you to copy a
select number of objects from multiple folders in the source repository to multiple folders
in the target repository. This gives you greater control over the specific objects copied from
one repository to another.


Deployment groups. You can create a deployment group that contains references to
objects from multiple folders across the repository. You can create a static deployment
group that you manually add objects to, or create a dynamic deployment group that uses a
query to populate the group.

Labels. A label is an object that you can apply to versioned objects in the repository. This
allows you to associate multiple objects in groups defined by the label. You can use labels
to track versioned objects during development, improve query results, and organize groups
of objects for deployment or export and import.

Queries. You can create a query that specifies conditions to search for objects in the
repository. You can save queries for later use. You can make a private query, or you can
share it with all users in the repository.

Track changes to an object. You can view a history that includes all versions of an object
and compare any version of the object in the history to any other version. This allows you
to see the changes made to an object over time.

XML Support
PowerCenter contains XML features that allow you to validate an XML file against an XML
schema, declare multiple namespaces, use XPath to locate XML nodes, increase performance
for large XML files, format your XML file output for increased readability, and parse or
generate XML data from various sources. XML support in PowerCenter includes the
following features:

XML schema. You can use an XML schema to validate an XML file and to generate source
and target definitions. XML schemas allow you to declare multiple namespaces so you can
use prefixes for elements and attributes. XML schemas also allow you to define some
complex datatypes.

XPath support. The XML wizard allows you to view the structure of XML schema. You
can use XPath to locate XML nodes.

Increased performance for large XML files. When you process an XML file or stream, you
can set commits and periodically flush XML data to the target instead of writing all the
output at the end of the session. You can choose to append the data to the same target file
or create a new target file after each flush.

XML target enhancements. You can format the XML target file so that you can easily view
the XML file in a text editor. You can also configure the PowerCenter Server to not output
empty elements to the XML target.

Usability


Copying objects. You can now copy objects from all the PowerCenter Client tools using
the copy wizard to resolve conflicts. You can copy objects within folders, to other folders,
and to different repositories. Within the Designer, you can also copy segments of
mappings to a workspace in a new folder or repository.

Comparing objects. You can compare workflows and tasks from the Workflow Manager.
You can also compare all objects from within the Repository Manager.

Change propagation. When you edit a port in a mapping, you can choose to propagate
changed attributes throughout the mapping. The Designer propagates ports, expressions,
and conditions based on the direction that you propagate and the attributes you choose to
propagate.

Enhanced partitioning interface. The Session Wizard is enhanced to provide a graphical depiction of a mapping when you configure partitioning.

Revert to saved. You can now revert to the last saved version of an object in the Workflow
Manager. When you do this, the Workflow Manager accesses the repository to retrieve the
last-saved version of the object.

Enhanced validation messages. The PowerCenter Client writes messages in the Output
window that describe why it invalidates a mapping or workflow when you modify a
dependent object.

Validate multiple objects. You can validate multiple objects in the repository without
fetching them into the workspace. You can save and optionally check in objects that
change from invalid to valid status as a result of the validation. You can validate sessions,
mappings, mapplets, workflows, and worklets.

View dependencies. Before you edit or delete versioned objects, such as sources, targets,
mappings, or workflows, you can view dependencies to see the impact on other objects.
You can view parent and child dependencies and global shortcuts across repositories.
Viewing dependencies helps you modify objects and composite objects without breaking
dependencies.

Refresh session mappings. In the Workflow Manager, you can refresh a session mapping.


About Informatica Documentation


The complete set of documentation for PowerCenter includes the following books:


Data Profiling Guide. Provides information about how to profile PowerCenter sources to
evaluate source data and detect patterns and exceptions.

Designer Guide. Provides information needed to use the Designer. Includes information to
help you create mappings, mapplets, and transformations. Also includes a description of
the transformation datatypes used to process and transform source data.

Getting Started. Provides basic tutorials for getting started.

Installation and Configuration Guide. Provides information needed to install and configure the PowerCenter tools, including details on environment variables and database connections.

PowerCenter Connect for JMS User and Administrator Guide. Provides information
to install PowerCenter Connect for JMS, build mappings, extract data from JMS messages,
and load data into JMS messages.

Repository Guide. Provides information needed to administer the repository using the
Repository Manager or the pmrep command line program. Includes details on
functionality available in the Repository Manager and Administration Console, such as
creating and maintaining repositories, folders, users, groups, and permissions and
privileges.

Transformation Language Reference. Provides syntax descriptions and examples for each
transformation function provided with PowerCenter.

Transformation Guide. Provides information on how to create and configure each type of
transformation in the Designer.

Troubleshooting Guide. Lists error messages that you might encounter while using
PowerCenter. Each error message includes one or more possible causes and actions that
you can take to correct the condition.

Web Services Provider Guide. Provides information you need to install and configure the Web
Services Hub. This guide also provides information about how to use the web services that the
Web Services Hub hosts. The Web Services Hub hosts Real-time Web Services, Batch Web
Services, and Metadata Web Services.

Workflow Administration Guide. Provides information to help you create and run
workflows in the Workflow Manager, as well as monitor workflows in the Workflow
Monitor. Also contains information on administering the PowerCenter Server and
performance tuning.

XML User Guide. Provides information you need to create XML definitions from XML,
XSD, or DTD files, and relational or other XML definitions. Includes information on
running sessions with XML data. Also includes details on using the midstream XML
transformations to parse or generate XML data within a pipeline.

About this Book


The Workflow Administration Guide is written for developers and administrators who are
responsible for creating workflows and sessions, running workflows, and administering the
PowerCenter Server. This guide assumes you have knowledge of your operating systems,
relational database concepts, and the database engines, flat files or mainframe system in your
environment. This guide also assumes you are familiar with the interface requirements for
your supporting applications.
The material in this book is available for online use.

Document Conventions
This guide uses the following formatting conventions:
If you see                      It means
italicized text                 The word or set of words are especially emphasized.
boldfaced text                  Emphasized subjects.
italicized monospaced text      This is the variable name for a value you enter as part of an
                                operating system command. This is generic text that should be
                                replaced with user-supplied values.
Note:                           The following paragraph provides additional facts.
Tip:                            The following paragraph provides suggested uses.
Warning:                        The following paragraph notes situations where you can overwrite
                                or corrupt data, unless you follow the specified procedure.
monospaced text                 This is a code example.
bold monospaced text            This is an operating system command you enter from a prompt to
                                run a task.


Other Informatica Resources


In addition to the product manuals, Informatica provides these other resources:

Informatica Customer Portal

Informatica Webzine

Informatica web site

Informatica Developer Network

Informatica Technical Support

Visiting Informatica Customer Portal


As an Informatica customer, you can access the Informatica Customer Portal site at http://my.informatica.com. The site contains product information, user group information,
newsletters, access to the Informatica customer support case management system (ATLAS),
the Informatica Knowledgebase, Informatica Webzine, and access to the Informatica user
community.

Visiting the Informatica Webzine


The Informatica Documentation team delivers an online journal, the Informatica Webzine.
This journal provides solutions to common tasks, detailed descriptions of specific features,
and tips and tricks to help you develop data warehouses.
The Informatica Webzine is a password-protected site that you can access through the
Customer Portal. The Customer Portal has an online registration form for login accounts to
its webzine and web support. To register for an account, go to http://my.informatica.com.
If you have any questions, please email webzine@informatica.com.

Visiting the Informatica Web Site


You can access Informatica's corporate web site at http://www.informatica.com. The site
contains information about Informatica, its background, upcoming events, and locating your
closest sales office. You will also find product information, as well as literature and partner
information. The services area of the site includes important information on technical
support, training and education, and implementation services.

Visiting the Informatica Developer Network


The Informatica Developer Network is a web-based forum for third-party software
developers. You can access the Informatica Developer Network at the following URL:
http://devnet.informatica.com

The site contains information on how to create, market, and support customer-oriented add-on solutions based on Informatica's interoperability interfaces.

Obtaining Technical Support


There are many ways to access Informatica technical support. You can call or email your
nearest Technical Support Center listed below or you can use our WebSupport Service.
WebSupport requires a user name and password. You can request a user name and password at
http://my.informatica.com.
North America / South America

Africa / Asia / Australia / Europe

Informatica Corporation
2100 Seaport Blvd.
Redwood City, CA 94063
Phone: 866.563.6332 or 650.385.5800
Fax: 650.213.9489
Hours: 6 a.m. - 6 p.m. (PST/PDT)
email: support@informatica.com

Informatica Software Ltd.


6 Waltham Park
Waltham Road, White Waltham
Maidenhead, Berkshire
SL6 3TN
Phone: +44 870 606 1525
Fax: +44 1628 511 411
Hours: 9 a.m. - 5:30 p.m. (GMT)
email: support_eu@informatica.com
Belgium
Phone: +32 15 281 702
Hours: 9 a.m. - 5:30 p.m. (local time)
France
Phone: +33 1 41 38 92 26
Hours: 9 a.m. - 5:30 p.m. (local time)
Germany
Phone: +49 1805 702 702
Hours: 9 a.m. - 5:30 p.m. (local time)
Netherlands
Phone: +31 306 082 089
Hours: 9 a.m. - 5:30 p.m. (local time)
Singapore
Phone: +65 322 8589
Hours: 9 a.m. - 5 p.m. (local time)
Switzerland
Phone: +41 800 81 80 70
Hours: 8 a.m. - 5 p.m. (local time)


Chapter 1

Understanding the Server Architecture
This chapter covers the following subjects:

Overview, 2

PowerCenter Server Connectivity, 5

Running a Workflow, 7

Load Manager Process, 8

Data Transformation Manager (DTM) Process, 11

Understanding Processing Threads, 14

PowerCenter Server Processing, 22

System Resources, 24

Code Pages and Data Movement Modes, 27

Output Files and Caches, 28

Overview
You can register multiple PowerCenter Servers to a repository. The PowerCenter Server moves
data from sources to targets based on workflow and mapping metadata stored in a repository.
A workflow is a set of instructions that describes how and when to run tasks related to
extracting, transforming, and loading data. The PowerCenter Server runs workflow tasks
according to the conditional links connecting the tasks. You can run a task by placing it in a
workflow.
When you have multiple PowerCenter Servers, you can assign a server to start a workflow or a
session. This allows you to distribute the workload. You can increase performance by using a
server grid to balance the workload. A server grid is a server object that allows you to
automate the distribution of sessions across multiple servers. For more information about
server grids, see Working with Server Grids on page 446.
A session is a type of workflow task. A session is a set of instructions that describes how to
move data from sources to targets using a mapping. Other workflow tasks include commands,
decisions, timers, pre-session SQL commands, post-session SQL commands, and email
notification. For details on workflow tasks, see Working with Tasks on page 131.
Use the Designer to import source and target definitions into the repository and to build
mappings. A mapping is a set of source and target definitions linked by transformation
objects that define the rules for data transformation. Use the Workflow Manager to develop
and manage workflows. Use the Workflow Monitor to monitor workflows and stop the
PowerCenter Server.
When a workflow starts, the PowerCenter Server retrieves mapping, workflow, and session
metadata from the repository to extract data from the source, transform it, and load it into
the target. It also runs the tasks in the workflow. The PowerCenter Server uses Load Manager
and Data Transformation Manager (DTM) processes to run the workflow.
Figure 1-1 shows the processing path between the PowerCenter Server, repository, source, and
target:
Figure 1-1. PowerCenter Server and Data Movement (the PowerCenter Server reads instructions from metadata in the repository, extracts source data from the source, and loads transformed data to the target)

The PowerCenter Server can combine data from different platforms and source types. For
example, you can join data from a flat file and an Oracle source. The PowerCenter Server can
also load data to different platforms and target types. For example, you can load transformed
data to both a flat file target and a Microsoft SQL Server database in the same session.

Workflow Processes
The PowerCenter Server uses both process memory and system shared memory to perform
these tasks. It runs as a daemon on UNIX and a service on Windows. The PowerCenter Server
uses the following processes to run a workflow:

The Load Manager process. Starts and locks the workflow, runs workflow tasks, and starts
the DTM to run sessions.

The Data Transformation Manager (DTM) process. Performs session validations. Creates
threads to initialize the session, read, write, and transform data, and handle pre- and post-session operations.

Pipeline Partitioning
When running sessions, the PowerCenter Server can achieve high performance by
partitioning the pipeline and performing the extract, transformation, and load for each
partition in parallel. To accomplish this, use the following session and server configuration:

Configure the session with multiple partitions.

Install the PowerCenter Server on a machine with multiple CPUs.

You can configure the partition type at most transformations in the pipeline. The
PowerCenter Server can partition data using round-robin, hash, key-range, database
partitioning, or pass-through partitioning.
For relational sources, the PowerCenter Server creates multiple database connections to a
single source and extracts a separate range of data for each connection. For XML or file
sources, the PowerCenter Server reads multiple files concurrently. The files must have the
same structure or hierarchy.
When the PowerCenter Server transforms the partitions concurrently, it passes data between
the partitions as needed to perform operations such as aggregation. When the PowerCenter
Server loads relational data, it creates multiple database connections to the target and loads
partitions of data concurrently. When the PowerCenter Server loads data to file targets, it
creates a separate file for each partition. You can choose to merge the target files.
Figure 1-2 shows a mapping that contains two partitions:
Figure 1-2. Partitioned Mapping (a source, transformations, and a target, with each stage processed in two parallel partitions)

For more information about pipeline partitioning, see Pipeline Partitioning on page 345.


PowerCenter Server Connectivity


The PowerCenter Server connects to the following Informatica platform components:

PowerCenter Client

Other PowerCenter Servers

Repository Server

Repository Agent

Source and target databases

The PowerCenter Server is a repository client application. It connects to the Repository Server and Repository Agent to retrieve workflow and mapping metadata from the repository
database. When the PowerCenter Server requests a repository connection from the Repository
Server, the Repository Server starts and manages the Repository Agent. The Repository Server
then re-directs the PowerCenter Server to connect directly to the Repository Agent. For
details on repository connectivity, see Understanding the Repository in the Repository
Guide.
The Workflow Manager communicates directly with the PowerCenter Server over a TCP/IP
connection. The Workflow Manager communicates directly with the PowerCenter Server
each time you schedule or edit a workflow, display workflow details, and request workflow
and session logs. You create the connection by defining the port number in the Workflow
Manager and the PowerCenter Server configuration. Use the Workflow Manager to register
the PowerCenter Server in the repository.
In a server grid, the Workflow Manager communicates directly with multiple PowerCenter
Servers over TCP/IP connections. Each PowerCenter Server retrieves a server grid object from
the repository, which it uses to connect to the other PowerCenter Servers in the grid. When
the PowerCenter Servers connect to each other, they maintain a constant line of
communication with each other. For more information about creating and using server grids,
see Working with Server Grids on page 446.
The PowerCenter Server connects to the source or target database using ODBC or native
drivers. It uses TCP/IP to connect to the Repository Server. The PowerCenter Server
maintains a database connection pool for stored procedures or lookup databases in a
workflow. The PowerCenter Server allows an unlimited number of connections to lookup or
stored procedure databases. If a database user does not have permission for the number of
connections a session requires, the session fails. You can optionally set a parameter to limit the
database connections.
For a session, the PowerCenter Server holds the connection as long as it needs to read data
from source tables or write data to target tables.
To prevent loss of information during data transfer, the PowerCenter Server, PowerCenter
Client, Repository Server, Repository Agent, and repository database must have compatible
code pages.


Figure 1-3 shows the PowerCenter Server connectivity:


Figure 1-3. PowerCenter Connectivity (the PowerCenter Client connects to the PowerCenter Server over TCP/IP; the PowerCenter Server connects to sources and targets through native or ODBC drivers and to the Repository Server and Repository Agent over TCP/IP; the Repository Agent connects to the PowerCenter repository database)

Table 1-1 summarizes the software you need to connect the PowerCenter Server to the
platform components, source databases, and target databases:
Table 1-1. PowerCenter Server Connectivity Requirements

PowerCenter Server Connection    Connectivity Requirement
PowerCenter Client               TCP/IP
Other PowerCenter Servers        TCP/IP
Repository Server                TCP/IP
Repository Agent                 TCP/IP
Source and target databases      Native database drivers or ODBC

Note: Both the Windows and UNIX versions of the PowerCenter Server can use ODBC drivers to connect to
databases. However, Informatica recommends using native drivers when possible to improve performance.


Running a Workflow
The PowerCenter Server uses the Load Manager process and the Data Transformation
Manager Process (DTM) to run the workflow and carry out workflow tasks.
When the PowerCenter Server runs a workflow, the Load Manager performs the following
tasks:
1. Locks the workflow and reads workflow properties.
2. Reads the parameter file and expands workflow variables.
3. Creates the workflow log file.
4. Runs workflow tasks.
5. Distributes sessions to worker servers.
6. Starts the DTM to run sessions.
7. Runs sessions from master servers.
8. Sends post-session email if the DTM terminates abnormally.

For details on the Load Manager process, see Load Manager Process on page 8.
When the PowerCenter Server runs a session, the DTM performs the following tasks:
1. Fetches session and mapping metadata from the repository.
2. Creates and expands session variables.
3. Creates the session log file.
4. Validates session code pages if data code page validation is enabled. Checks query conversions if data code page validation is disabled.
5. Verifies connection object permissions.
6. Runs pre-session shell commands.
7. Runs pre-session stored procedures and SQL.
8. Creates and runs mapping, reader, writer, and transformation threads to extract, transform, and load data.
9. Runs post-session stored procedures and SQL.
10. Runs post-session shell commands.
11. Sends post-session email.
For details on the DTM process, see Data Transformation Manager (DTM) Process on
page 11.


Load Manager Process


The Load Manager is the primary PowerCenter Server process. It accepts requests from the
PowerCenter Client and from pmcmd. The Load Manager runs and monitors the workflow. It
performs the following tasks:

Manages workflow scheduling.

Locks and reads the workflow.

Reads the parameter file.

Creates the workflow log file.

Runs workflow tasks and evaluates the conditional links connecting tasks.

Starts the DTM, which runs the session.

Writes historical run information to the repository.

Sends post-session email in the event of DTM failure.

Managing Workflow Scheduling


The Load Manager manages workflow scheduling in the following situations:

When you start the PowerCenter Server. When you start the PowerCenter Server, the
Load Manager launches and queries the repository for a list of workflows configured to run
on the PowerCenter Server.

When you save a workflow. When you save a workflow assigned to a PowerCenter Server
to the repository, the Load Manager adds the workflow to or removes the workflow from
the schedule queue.

Locking and Reading the Workflow


When the PowerCenter Server starts a workflow, the Load Manager requests an execute lock
on the workflow from the repository. The execute lock allows the PowerCenter Server to run
the workflow and prevents you from starting the workflow again until it completes. If the
workflow is already locked, the PowerCenter Server cannot start the workflow. A workflow
may be locked if it is already running.
The Load Manager also reads the workflow from the repository at workflow run time. The
Load Manager reads all links and tasks in the workflow except sessions and worklet instances.
The Load Manager reads session instance information from the repository. The DTM
retrieves the session and mapping from the repository at session run time. The Load Manager
reads worklets from the repository when the worklet starts.
For more information on locking, see Repository Security in the Repository Guide.


Reading the Parameter File


When the workflow starts, the Load Manager checks the workflow properties for use of a
parameter file. If the workflow uses a parameter file, the Load Manager reads the parameter
file and expands the variable values for the workflow and any worklets invoked by the
workflow.
The parameter file can also contain mapping variables, mapping parameters, session
parameters, and session variables for sessions in the workflow. When starting the DTM, the
Load Manager passes the parameter file name to the DTM.
For more information on the parameter file, see Session Parameters on page 495.

Creating the Workflow Log File


The Load Manager creates a log file for the workflow. The workflow log file contains a history
of the workflow run, including initialization, workflow task status, and error messages. You
can use information in the workflow log file in conjunction with the PowerCenter Server log
and session log to troubleshoot system, workflow, or session problems.
You can view the workflow log file in the Workflow Manager or open it in a text editor. The
following sample shows the first few lines of a log file:
INFO : LM_36215 : (2076|2224) Starting execution of workflow
[w_OrdersBooked].
INFO : LM_36255 : (2076|2224) Link [StartWorkflow --> s_BOOKINGS]: empty
expression string, evaluated to TRUE.
INFO : LM_36224 : (2076|2224) Starting execution of session instance
[s_BOOKINGS].
INFO : LM_36302 : (2076|2224) Started DTM process [pid = 508] for session
instance [s_BOOKINGS].

For more information on workflow log files, see Log Files on page 455.

Running Workflow Tasks


The Load Manager runs workflow tasks according to the conditional links connecting the
tasks. Links define the order of execution for workflow tasks. When a task in the workflow
completes, the Load Manager evaluates the completed task according to specified conditions,
such as success or failure. Based on the result of the evaluation, the Load Manager runs
successive links and tasks.
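The following Python sketch illustrates this evaluation logic. It is an illustration only, not PowerCenter code; the evaluate() argument is a hypothetical stand-in for whatever evaluates a link condition expression. As the workflow log sample earlier shows, an empty link condition evaluates to TRUE.

def tasks_to_run(completed_task, outgoing_links, evaluate):
    # outgoing_links maps a task name to a list of (target_task, condition) pairs.
    runnable = []
    for target_task, condition in outgoing_links[completed_task]:
        # An empty expression string evaluates to TRUE (see message LM_36255 above).
        if condition == "" or evaluate(condition):
            runnable.append(target_task)
    return runnable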
For more information on workflows and workflow tasks, see Working with Workflows on
page 87.

Distributing Sessions to Worker Servers


When you run a workflow in a server grid, the master server distributes session tasks to the
worker servers in a round-robin fashion to balance the workload. When the master server
distributes a session to a worker server, the Load Manager on the worker server machine starts
a DTM process to run the session.
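For illustration only, the following Python sketch shows one way round-robin assignment can work. It is not the grid scheduler itself, and the session and worker server names are hypothetical.

from itertools import cycle

def distribute(sessions, worker_servers):
    # Assign each session to the next worker server in rotation.
    assignment = {}
    next_worker = cycle(worker_servers)
    for session in sessions:
        assignment[session] = next(next_worker)
    return assignment

# distribute(['s_BOOKINGS', 's_ORDERS', 's_ITEMS'], ['worker1', 'worker2'])
# assigns s_BOOKINGS and s_ITEMS to worker1 and s_ORDERS to worker2.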
For more information about creating and using server grids, see Working with Server Grids
on page 446.

Starting the DTM


When the workflow reaches a session, the Load Manager starts the DTM. The Load Manager
provides the DTM with session and parameter file information that allows the DTM to
retrieve the session and mapping metadata from the repository.
For more information on the DTM process, see Data Transformation Manager (DTM)
Process on page 11.

Running Sessions from Master Servers


If a PowerCenter Server is part of a server grid, it can run sessions assigned from other master
servers. The master server runs tasks in a workflow before it runs sessions assigned from other
master servers.
For more information about creating and using server grids, see Working with Server Grids
on page 446.

Writing Historical Information to the Repository


The Load Manager monitors the status of workflow tasks during the workflow run. When
workflow tasks start or finish, the Load Manager writes historical run information to the
repository. Historical run information for tasks includes start and completion times and
completion status. Historical run information for sessions also includes source read statistics,
target load statistics, and number of errors. You can view this information using the Workflow
Monitor.
For details on using the Workflow Monitor, see Monitoring Workflows on page 401.

Sending Post-Session Email


The Load Manager sends post-session email if the DTM terminates abnormally. The DTM
sends post-session email in all other cases. For details on post-session email, see Sending
Email on page 319.


Data Transformation Manager (DTM) Process


When the workflow reaches a session, the Load Manager starts the DTM process. The DTM
process is the process associated with the session task. The Load Manager creates one DTM
process for each session in the workflow. The DTM process performs the following tasks:

Reads session information from the repository.

Expands the server, session, and mapping variables and parameters.

Creates the session log file.

Validates source and target code pages.

Verifies connection object permissions.

Runs pre-session shell commands, stored procedures and SQL.

Creates and runs mapping, reader, writer, and transformation threads to extract,
transform, and load data.

Runs post-session stored procedures, SQL, and shell commands.

Sends post-session email.

Reading the Session Information


The Load Manager provides the DTM with session instance information when it starts the
DTM. The DTM retrieves the mapping and session metadata from the repository.

Expanding Variables and Parameters


If the workflow uses a parameter file, the Load Manager sends the parameter file to the DTM
when it starts the DTM. The DTM creates and expands session-level, server-level, and
mapping-level variables and parameters. For more information on the parameter file, see
Session Parameters on page 495.

Creating the Session Log File


The DTM creates a log file for the session. The log file contains a complete history of the
session run, including initialization, transformation, status, and error messages. You can use
information in the log file in conjunction with the PowerCenter Server log and the workflow
log file to troubleshoot system or session problems.
You can view the log file in the Workflow Monitor or open it in a text editor. The following
sample shows the first few lines of a log file:
MASTER> CMN_1010 System shared memory [2338661387] allocated for
[12000000] bytes.
MASTER> PETL_24000 Parallel Pipeline Engine initializing.
MASTER> PETL_24001 Parallel Pipeline Engine running.


MASTER> PETL_24003 Initializing session run.


MAPPING> TM_6014 Initializing session [s_Customers] at [Tue Nov 04
16:55:06 2003]

For more information on session log files, see Log Files on page 455.

Validating Code Pages


When the PowerCenter Server runs in Unicode mode with data code page validation enabled,
the DTM validates the following code pages:

Source code pages. Must be a subset of the PowerCenter Server code page.

Target code pages. Must be a superset of the PowerCenter Server code page.

Repository Agent code page. Must be compatible with the PowerCenter Server code page.

Repository Server code page. Must be compatible with the PowerCenter Server code page.

Lookup database code page. Must be compatible with the PowerCenter Server code page.

Stored procedure database code page. Must be compatible with the PowerCenter Server
code page.

PowerCenter Server code page. Must be registered with the Workflow Manager.

If the DTM cannot validate the code pages, it writes the error into the session log and fails the
session. If you disable data code page validation, the PowerCenter Server does not enforce
code page compatibility.
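The following Python sketch summarizes the subset and superset rules listed above. It is illustrative only; is_subset() is a hypothetical helper standing in for the code page compatibility rules documented in the Installation and Configuration Guide.

def validate_session_code_pages(server_cp, source_cp, target_cp, is_subset):
    errors = []
    # The source code page must be a subset of the PowerCenter Server code page.
    if not is_subset(source_cp, server_cp):
        errors.append('Source code page is not a subset of the server code page.')
    # The target code page must be a superset of the PowerCenter Server code page.
    if not is_subset(server_cp, target_cp):
        errors.append('Target code page is not a superset of the server code page.')
    return errors   # any error is written to the session log and fails the session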
The PowerCenter Server processes data internally using the UCS-2 character set. When you
disable data code page validation, the PowerCenter Server verifies that the source query, target
query, lookup database query, and stored procedure call text convert from the source, target,
lookup, or stored procedure data code page to the UCS-2 character set without loss of data in
conversion. If the PowerCenter Server encounters an error when converting data, it writes an
error message to the session log.
For more information about code pages, see Globalization Overview and Code Pages in
the Installation and Configuration Guide.

Verifying Connection Object Permissions


After validating the session code pages, the DTM verifies permissions for connection objects
used in the session. The DTM verifies that the user who started the PowerCenter Server and
the user who started or scheduled the workflow have execute permissions for connection
objects associated with the session.

Running Pre-Session Operations


After verifying connection object permissions, the DTM runs pre-session shell commands.
The DTM then runs pre-session stored procedures and SQL commands.


Running the Processing Threads


After initializing the session, the DTM uses reader, transformation, and writer threads to
extract, transform, and load data. The number of threads the DTM uses to run the session
depends on the number of partitions configured for the session. For a detailed discussion of
reader, transformation, and writer threads, see Understanding Processing Threads on
page 14.

Running Post-Session Operations


After the DTM runs the processing threads, it runs post-session SQL commands and stored
procedures. The DTM then runs post-session shell commands.

Sending Post-Session Email


When the session finishes, the DTM composes and sends email reporting session completion
or failure. If the DTM terminates abnormally, the Load Manager sends post-session email.
For details on post-session email, see Sending Email on page 319.


Understanding Processing Threads


The DTM allocates process memory for the session and divides it into buffers. This is also
known as buffer memory. The default memory allocation is 12,000,000 bytes. The DTM uses
multiple threads to process data. The main DTM thread is called the master thread.
The master thread creates and manages other threads. The master thread for a session can
create mapping, pre-session, post-session, reader, transformation, and writer threads. For
more information, see Thread Types on page 14.
For each target load order group in a mapping, the master thread can create several threads.
The types of threads depend on the session properties and the transformations in the
mapping. The number of threads depends on the partitioning information for each target
load order group in the mapping.
For more information on target load order groups, see Reading Source Data on page 22.

Thread Types
The master thread creates different types of threads for a session. The types of threads the
master thread creates depend on the following factors:

Pre- and post-session properties

Types of transformations in the mapping

Table 1-2 lists the types of threads that the master thread can create:

Table 1-2. Processing Threads

Mapping Thread. One thread for each session. Fetches session and mapping information. Compiles the mapping. Cleans up after session execution.

Pre- and Post-Session Threads. One thread each to perform pre- and post-session operations.

Reader Thread. One thread for each partition for each source pipeline. Reads from sources. Relational sources use relational reader threads, and file sources use file reader threads.

Transformation Thread. One or more transformation threads for each partition. Processes data according to the transformation logic in the mapping.

Writer Thread. One thread for each partition, if a target exists in the source pipeline. Writes to targets. Relational targets use relational writer threads, and file targets use file writer threads.


Figure 1-4 shows the threads the master thread creates for a simple mapping that contains one
target load order group:
Figure 1-4. Thread Creation for a Simple Mapping
[Figure: one reader thread, one transformation thread, and one writer thread process the single-partition pipeline.]

The mapping in Figure 1-4 contains a single partition. In this case, the master thread creates
one reader, one transformation, and one writer thread to process the data. The reader thread
controls how the PowerCenter Server extracts source data and passes it to the source qualifier,
the transformation thread controls how the PowerCenter Server processes the data, and the
writer thread controls how the PowerCenter Server loads data to the target.
When the pipeline contains only a source definition, source qualifier, and a target definition,
the data bypasses the transformation threads, proceeding directly from the reader buffers to
the writer. This type of pipeline is a pass-through pipeline.
Figure 1-5 shows the threads for a pass-through pipeline with one partition:
Figure 1-5. Thread Creation for a Pass-through Pipeline
[Figure: one reader thread passes data directly to one writer thread; the transformation thread is bypassed.]

Note: The previous examples assume that each session contains a single partition. For information on how partitions and partition points affect thread creation, see Threads and Partitioning on page 16.

Reader Threads
The master thread creates reader threads to extract source data. The number of reader threads
depends on the partitioning information for each pipeline. The number of reader threads
equals the number of partitions. For more information, see Threads and Partitioning on
page 16.
The PowerCenter Server creates an SQL statement for each reader thread to extract data from
a relational source. For file sources, the PowerCenter Server can create multiple threads to
read a single source.


Transformation Threads
The master thread creates transformation threads to transform data received in buffers by the
reader thread, move the data from transformation to transformation, and create memory
caches when necessary. The number of transformation threads depends on the partitioning
information for each pipeline. For more information, see Threads and Partitioning on
page 16.
The transformation threads store fully-transformed data in a buffer drawn from the memory
pool for subsequent access by the writer thread.
If the pipeline contains a Rank, Joiner, Aggregator, Sorter, or a cached Lookup
transformation, the transformation thread uses cache memory until it reaches the configured
cache size limits. If the transformation thread requires more space, it pages to local cache files
to hold additional data.
When the PowerCenter Server runs in ASCII mode, the transformation threads pass character
data in single bytes. When the PowerCenter Server runs in Unicode mode, the transformation
threads use double bytes to move character data.

Writer Threads
The master thread creates writer threads to load target data. The number of writer threads
depends on the partitioning information for each pipeline. If the pipeline contains one
partition, the master thread creates one writer thread. If it contains multiple partitions, the
master thread creates multiple writer threads. For more information, see Threads and
Partitioning on page 16.
Each writer thread creates connections to the target databases to load data. If the target is a
file, each writer thread creates a separate file. You can configure the session to merge these
files.
If the target is relational, the writer thread takes data from buffers and commits it to session
targets. When loading targets, the writer commits data based on the commit interval in the
session properties. You can configure a session to commit data based on the number of source
rows read, the number of rows written to the target, or the number of rows that pass through
a transformation that generates transactions, such as a Transaction Control transformation.
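As an illustration of target-based commits, the following Python sketch commits after every commit_interval rows written. This is a sketch of the idea only, not the writer thread's implementation, and the target object and its methods are hypothetical.

def load_target(rows, target, commit_interval):
    written = 0
    for row in rows:
        target.write(row)              # write the row to the relational target
        written += 1
        if written % commit_interval == 0:
            target.commit()            # commit based on rows written to the target
    target.commit()                    # final commit at the end of the load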

Threads and Partitioning


The master thread creates different numbers of threads for different mappings. The number
of threads depends on the partitioning information for each target load order group. This
includes the following factors:


The partition points. Controls the thread boundaries and pipeline stages.

The number of partitions. Controls the number of threads the master thread creates for
each pipeline stage.

The number of source pipelines. Controls the number of reader threads and the number
of transformation threads downstream from the sources.


Partition Points
By default, the Workflow Manager places partition points at certain transformations in each
source pipeline. Partition points mark the thread boundaries in a source pipeline and divide
the pipeline into stages. A pipeline stage is the section of a pipeline executed between any two
partition points. When you set a partition point at a transformation, the new pipeline stage
includes that transformation.
The PowerCenter Server can redistribute rows of data at partition points. For example, if you
place a partition point at a Sorter transformation and specify multiple partitions, the
PowerCenter Server redistributes rows among all partitions before the rows enter the Sorter
transformation. The rows stay in the same partitions until they reach the next partition point.
For more information, see Pipeline Partitioning on page 345.
By default, the Workflow Manager places a partition point at each of the following
transformations:

Source qualifier. Marks the reader stage. You cannot delete this partition point.

Rank and unsorted Aggregator transformation. Marks the transformation stage boundaries and creates a new transformation stage. This is necessary to ensure that rows are grouped properly before the Rank and Aggregator transformations process them. You can delete these partition points under certain circumstances. For more information, see Adding and Deleting Partition Points on page 353.

Target instance. Marks the writer stage. You cannot delete this partition point.

Figure 1-6 shows the pipeline stages for a mapping that contains an unsorted Aggregator
transformation:
Figure 1-6. Pipeline Stages in a Mapping With an Unsorted Aggregator Transformation
[Figure: default partition points at the source qualifier, the unsorted Aggregator transformation, and the target instance divide the mapping into four pipeline stages: first (reader), second and third (transformation), and fourth (writer).]

The mapping in Figure 1-6 contains four stages by default. The partition point at the source
qualifier marks the boundary between the first (reader) and second (transformation) stages.
The partition point at the Aggregator transformation marks the boundary between the second
and third (transformation) stages. The partition point at the target instance marks the
boundary between the third (transformation) and the fourth (writer) stages.
If you use PowerCenter, you can add and delete partition points at other transformations. For
information on valid partition points, see Pipeline Partitioning on page 345. When you add
a partition point, you increase the number of pipeline stages by one. When you remove a
partition point, you decrease the number of pipeline stages by one.


Figure 1-7 shows the pipeline stages if you add a partition point at the Filter transformation:
Figure 1-7. Pipeline Stages in a Mapping with an Additional Partition Point
[Figure: an additional partition point at the Filter transformation divides the same mapping into five pipeline stages.]

Number of Partitions
The number of threads that process each pipeline stage depends on the number of partitions.
A partition is a pipeline stage that executes in a single reader, transformation, or writer thread.
The number of partitions in any pipeline stage equals the number of threads in that stage. If
you do not specify otherwise, the PowerCenter Server creates one partition in every pipeline
stage. If you purchased the partitioning option, you can configure multiple partitions for a
single pipeline stage.
You can specify the number of partitions at any partition point. The number of partitions
must be consistent across a pipeline. Therefore, if you define two partitions at the source
qualifier, the Workflow Manager sets two partitions at all transformations that are partition
points, and two partitions at the target instances.
For example, suppose you need to use the mapping in Figure 1-6 on page 17 to read data from
three flat files. To do this, you need to specify three partitions at the source qualifier. When
you do this, the Workflow Manager sets three partitions at all other partition points in the
pipeline.
The master thread creates three sets of threads. Figure 1-8 shows thread creation for a
mapping with three partitions:
Figure 1-8. Thread Creation for a Mapping with Three Partitions
[Figure: at the default partition points, each of the three partitions has its own set of threads: three reader threads in the first stage, six transformation threads across the second and third stages, and three writer threads in the fourth stage.]

When you define three partitions across the mapping in Figure 1-8, the master thread creates
three threads at each pipeline stage, for a total of 12 threads. If you need to read data from
four file sources, you would specify four partitions at the source qualifier. The master thread
would create a fourth thread at each stage, for a total of 16 threads.
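The arithmetic behind these examples is simply the number of partitions multiplied by the number of pipeline stages, as the following Python sketch shows for a single source pipeline:

def total_threads(partitions, pipeline_stages):
    return partitions * pipeline_stages

# The mapping in Figure 1-6 has four pipeline stages:
# total_threads(3, 4) returns 12, and total_threads(4, 4) returns 16.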
The PowerCenter Server processes partitions concurrently. When you run a session with
multiple partitions, the threads run as follows:
1. The reader threads run concurrently to extract data from the source.
2. The transformation threads run concurrently in each transformation stage to process data. The PowerCenter Server redistributes data among the partitions at each partition point.
3. The writer threads run concurrently to write data to the target.

Note: Increasing the number of partitions or partition points increases the number of threads. Therefore, increasing the number of partitions or partition points also increases the load on the server machine. If the server machine contains ample CPU bandwidth, processing rows of data in a session concurrently can increase session performance. However, if you create a large number of partitions or partition points in a session that processes large amounts of data, you can overload the system.

Number of Source Pipelines


The master thread creates a reader and transformation thread for each source pipeline in the
target load order group. For more information on source pipelines and target load order
groups, see Reading Source Data on page 22.
When you connect multiple pipelines to a multiple input group transformation, such as a
Joiner or Custom transformation, the PowerCenter Server maintains the transformation
threads or creates a new transformation thread depending on the partitioning information:

You add a partition point at the multiple input group transformation. The PowerCenter
Server creates a new pipeline stage and creates one transformation thread downstream
from the partition point. The PowerCenter Server creates one transformation thread
regardless of the number of output groups the transformation contains.

You do not add a partition point at the multiple input group transformation. The
PowerCenter Server maintains the same number of transformation threads downstream
from the partition point until it reaches the next partition point. However, for each
partition at the multiple input group transformation and its downstream transformations,
only one thread actively processes a row of data at any given time.


Figure 1-9 shows the thread creation for a mapping that contains a Joiner transformation
configured for sorted input:
Figure 1-9. Thread Creation with Joiner Transformation
[Figure: each of the two source pipelines has one reader thread and one transformation thread; one writer thread loads the target. Partition points are at the two source qualifiers and the target instance.]

Each source pipeline in Figure 1-9 contains a transformation thread. The Joiner
transformation is not a partition point, so both transformation threads can process data at the
Joiner and Expression transformations. However, only one transformation thread processes a
row at any given time. The target load order group contains one target, so the master thread
creates only one writer thread.
Suppose you add a partition point at the Joiner transformation in Figure 1-9. Figure 1-10
shows the mapping in Figure 1-9 with a partition point at the Joiner transformation:
Figure 1-10. Thread Creation with a Partition Point at a Joiner Transformation
[Figure: each of the two source pipelines has one reader thread and one transformation thread that ends at the Joiner transformation. One transformation thread is created after the partition point at the Joiner transformation, and one writer thread loads the target.]

Each source pipeline in Figure 1-10 contains a transformation thread. However, the
transformation threads end at the Joiner transformation. The Joiner transformation is a
partition point, so the master thread creates a new transformation thread starting at the
partition point.
Note: If any source qualifier in either Figure 1-9 or Figure 1-10 feeds a target other than the target associated with the Joiner transformation, the master thread creates an additional writer thread.


PowerCenter Server Processing


When you run a session, the PowerCenter Server reads source data and passes it to the
transformations for processing. To help understand PowerCenter Server processing, consider
the following PowerCenter Server actions:

Reading source data. The PowerCenter Server reads the sources in a mapping at different
times depending on how you configure the sources, transformations, and targets in the
mapping. For more information on reading data, see Reading Source Data on page 22.

Blocking data. The PowerCenter Server sometimes blocks the flow of data at a
transformation in the mapping while it processes a row of data from a different source. For
more information on blocking data, see Blocking Data on page 23.

Block processing. The PowerCenter Server reads and processes a block of rows at a time.
For more information, see Block Processing on page 23.

Reading Source Data


You create a session based on a mapping. Mappings contain one or more target load order
groups. A target load order group is the collection of source qualifiers, transformations, and
targets linked together in a mapping. Each target load order group contains one or more
source pipelines. A source pipeline consists of a source qualifier and all of the transformations
and target instances that receive data from that source qualifier.
By default, the PowerCenter Server reads sources in a target load order group concurrently,
and it processes target load order groups sequentially. You can configure the order that the
PowerCenter Server processes target load order groups. For more information on setting the
target load order, see Mappings in the Designer Guide.
Figure 1-11 shows a mapping that contains two target load order groups and three source
pipelines:
Figure 1-11. Target Load Order Groups and Source Pipelines
[Figure: Target Load Order Group 1 contains Pipeline A and Pipeline B, which load targets T1 and T2. Target Load Order Group 2 contains Pipeline C, which loads target T3.]

In the mapping shown in Figure 1-11, the PowerCenter Server processes the target load order
groups sequentially. It first processes Target Load Order Group 1 by reading Source A and
Source B at the same time. When it finishes processing Target Load Order Group 1, the
PowerCenter Server begins to process Target Load Order Group 2 by reading Source C.

Blocking Data
You can include multiple input group transformations in a mapping. The PowerCenter Server
passes data to the input groups concurrently. However, sometimes the transformation logic of
a multiple input group transformation requires that the PowerCenter Server block data on
one input group while it waits for a row from a different input group.
Blocking is the suspension of the data flow into an input group of a multiple input group
transformation. When the PowerCenter Server blocks data, it reads data from the source
connected to the input group until it fills the reader and transformation buffers. Once the
PowerCenter Server fills the buffers, it does not read more source rows until the
transformation logic allows the PowerCenter Server to stop blocking the source. When the
PowerCenter Server stops blocking a source, it processes the data in the buffers and continues
to read from the source.
The PowerCenter Server blocks data at one input group when it needs a specific row from a
different input group to perform the transformation logic. Once the PowerCenter Server
reads and processes the row it needs, it stops blocking the source.

Block Processing
The PowerCenter Server reads and processes a block of rows at a time. The number of rows in
the block depends on the row size and the DTM buffer size. In the following circumstances,
the PowerCenter Server processes one row in a block:

Log row errors. When you log row errors, the PowerCenter Server processes one row in a
block.

Connect CURRVAL. When you connect the CURRVAL port in a Sequence Generator
transformation, the session processes one row in a block. For optimal performance,
Informatica recommends that you connect only the NEXTVAL port in mappings. For
more information, see Sequence Generator Transformation in the Transformation Guide.

Configure array-based mode for Custom transformation procedure. When you configure
the data access mode for a Custom transformation procedure to be row-based, the
PowerCenter Server processes one row in a block. By default, the data access mode is array-based, and the PowerCenter Server processes multiple rows in a block. For more
information, see Custom Transformation Functions in the Transformation Guide.


System Resources
To allocate system resources for read, transformation, and write processing, you should
understand how the PowerCenter Server allocates and uses system resources. The
PowerCenter Server uses the following system resources:

CPU

Load Manager shared memory

DTM buffer memory

Cache memory

CPU Usage
The PowerCenter Server performs read, transformation, and write processing for a pipeline in
parallel. It can process multiple partitions of a pipeline within a session, and it can process
multiple sessions in parallel.
If you have a symmetric multi-processing (SMP) platform, you can use multiple CPUs to
concurrently process session data or partitions of data. This provides increased performance,
as true parallelism is achieved. On a single processor platform, these tasks share the CPU, so
there is no parallelism.
The PowerCenter Server can use multiple CPUs to process a session that contains multiple
partitions. The number of CPUs used depends on factors such as the number of partitions,
the number of threads, the number of available CPUs, and the amount of resources required to
process the mapping.
For more information about partitioning, see Pipeline Partitioning on page 345.

Load Manager Shared Memory


The Load Manager uses both process and shared memory. The Load Manager keeps a list of
workflows and the schedule queue in process memory. The Load Manager shared memory is
organized as an array of session slots that store session instance and status information. The
DTM retrieves the session object and mapping object from the repository for processing.
Session instance information does not occupy the shared memory slot until session run time.
When you start a workflow, the Load Manager retrieves session instance information from the
repository with other workflow tasks. At session runtime, the Load Manager places the session
instance information into a shared memory slot and starts the DTM. The DTM connects to
the shared memory and uses the session instance information to retrieve the session and
mapping from the repository. When the session completes, the Load Manager releases the
session instance from the shared memory slot and writes session run information to the
repository.
If the PowerCenter Server shuts down, it releases all sessions from shared memory.


You can configure three parameters in the PowerCenter Server configuration that control how
the Load Manager allocates shared memory to sessions and the number of sessions the
PowerCenter Server runs simultaneously:

MaxSessions. The maximum sessions parameter indicates the maximum number of session
slots available to the Load Manager at one time for running or repeating sessions. For
example, if you select the default MaxSessions of 10, the Load Manager allocates 10
session slots. This parameter helps you control the number of sessions the PowerCenter
Server can run simultaneously.

LMSharedMemory. Set the Load Manager shared memory parameter in conjunction with
the Maximum Sessions parameter to ensure that the Load Manager has enough memory
for each session. The Load Manager requires approximately 200,000 bytes of shared
memory for each session slot. The default setting is 2,000,000 bytes. For each increase of
10 sessions in the MaxSessions setting, you need to increase LMSharedMemory by
2,000,000 bytes.

FailSessionIfMaxSessionsReached. The Fail Session If Max Sessions Reached option determines how the Load Manager handles a session when the number of sessions already running equals the number specified for maximum sessions. By default, this option is disabled, and the Load Manager holds sessions waiting to run in a ready queue until a session slot becomes available.
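The LMSharedMemory guideline above (approximately 200,000 bytes of shared memory per session slot) can be expressed as a simple sizing calculation. The following Python sketch is an illustration of that rule of thumb, not a configuration interface:

BYTES_PER_SESSION_SLOT = 200000

def lm_shared_memory(max_sessions):
    # Shared memory the Load Manager needs for the configured MaxSessions.
    return max_sessions * BYTES_PER_SESSION_SLOT

# lm_shared_memory(10) returns 2,000,000 bytes, the default setting.
# lm_shared_memory(20) returns 4,000,000 bytes.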

DTM Buffer Memory


The Load Manager launches the DTM. The DTM allocates buffer memory to the session
based on the DTM Buffer Size setting in the session properties. By default, it allocates
12,000,000 bytes of memory to the session.
The DTM divides the memory into buffer blocks as configured in the Buffer Block Size
setting in the session properties (64,000 bytes per block, by default). The reader,
transformation, and writer threads use buffer blocks to move data from sources to targets.
You can sometimes improve session performance by increasing buffer memory when you run
a session handling a large volume of character data and the PowerCenter Server runs in
Unicode mode. In Unicode mode, the PowerCenter Server uses double bytes to move
characters, so increasing buffer memory might improve session performance.
If the DTM cannot allocate the configured amount of buffer memory for the session, the
session cannot initialize. Informatica recommends you allocate no more than 1 GB for DTM
buffer memory.
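The defaults above imply roughly how many buffer blocks a session has to move data between the reader, transformation, and writer threads. The following Python sketch shows the arithmetic only; it does not describe the DTM's allocator:

DTM_BUFFER_SIZE = 12000000     # bytes, session default
BUFFER_BLOCK_SIZE = 64000      # bytes per block, session default

print(DTM_BUFFER_SIZE // BUFFER_BLOCK_SIZE)   # 187 buffer blocks at the defaults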


Cache Memory
The DTM process creates in-memory index and data caches to temporarily store data used by
the following transformations:

Aggregator transformation (without sorted input)

Rank transformation

Joiner transformation

Lookup transformation (with caching enabled)

You configure memory size for the index and data cache in the transformation properties. By
default, the PowerCenter Server allocates 1,000,000 bytes for the index cache and 2,000,000
bytes for the data cache.
By default, the DTM creates cache files in the directory configured for the $PMCacheDir
server variable. If the DTM requires more space than it allocates, it pages to local index and
data files.
The DTM process also creates an in-memory cache to store data used by a Sorter
transformation. You configure the memory size for the cache in the transformation properties.
By default, the PowerCenter Server allocates 8,388,608 bytes for the cache, and the DTM
creates cache files in the directory configured for the $PMTempDir server variable. If the
DTM requires more cache space than it allocates, it pages to local cache files.
When processing large amounts of data, the DTM may create multiple index and data files.
The session does not fail if it runs out of cache memory and pages to the cache files. It does
fail, however, if the local directory for cache files runs out of disk space.
After the session completes, the DTM releases memory used by the index and data caches and
deletes any index and data files. However, if the session is configured to perform incremental
aggregation or if a Lookup transformation is configured for a persistent lookup cache, the
DTM saves all index and data cache information to disk for the next session run.
For more information about caching, see Session Caches on page 613.


Code Pages and Data Movement Modes


You can configure PowerCenter to move multibyte data. The PowerCenter Server can move
data in either ASCII or Unicode data movement mode. These modes determine how the
PowerCenter Server handles character data. You choose the data movement mode in the
PowerCenter Server configuration settings. If you want to move multibyte data, choose
Unicode data movement mode.
To ensure that data is not lost during conversion from one machine to another, you must also
choose the appropriate code pages for your connections. In the Workflow Manager, you select
code pages for the PowerCenter Server and the database connections the PowerCenter Server
uses to connect to the source and target machines. The Workflow Manager validates code
page compatibility when you add or edit a session.
For more information, see Globalization Overview and Code Pages in the Installation and
Configuration Guide.

ASCII Mode
Use ASCII mode when all sources and targets are 7-bit ASCII or EBCDIC character sets. In
ASCII mode, the PowerCenter Server recognizes 7-bit ASCII and EBCDIC characters and
stores each character in a single byte. When the PowerCenter Server runs in ASCII mode, it
does not validate session code pages. It reads all character data as ASCII characters and does
not perform code page conversions. It also treats all numerics as U.S. Standard and all dates as
binary data.

Unicode Mode
Use Unicode mode when sources or targets use 8-bit or multibyte character sets and contain
character data. In Unicode mode, the PowerCenter Server recognizes multibyte character sets
as defined by supported code pages.
If you configure the PowerCenter Server to validate data code pages, the PowerCenter Server
validates source and target code page compatibility when you run a session. If you configure
the PowerCenter Server for relaxed data code page validation, the PowerCenter Server lifts
source and target compatibility restrictions.
When reading a source, the PowerCenter Server converts data from the source character set to
Unicode based on the source code page. The PowerCenter Server allots two bytes for each
character when moving data through a mapping. The PowerCenter Server converts data from
Unicode to the target character set based on the target code page when writing to the target. It
also treats all numerics as U.S. Standard and all dates as binary data.
The PowerCenter Server code page must be compatible with the code pages of the
PowerCenter Client.
For details on code page compatibility and validation, see Globalization Overview in the
Installation and Configuration Guide.

Output Files and Caches


Once launched, the PowerCenter Server logs status and error messages to a UNIX log file or
to the Windows Application log. During each workflow run, the PowerCenter Server creates a
workflow log file. During each session, the PowerCenter Server creates a session log file and
reject file. Depending on transformation cache settings and target types, the PowerCenter
Server may create additional files as well.
The PowerCenter Server uses the PowerCenter Server code page to generate log files. When
you directly access a log file generated by the PowerCenter Server, it appears in the character
set of the PowerCenter Server code page. When you use the Workflow Manager to access a file
generated by the PowerCenter Server, such as a session log, the Workflow Manager uses the
PowerCenter Client code page to translate and display the session log in the character set of
the PowerCenter Client code page.
The PowerCenter Server creates the following output files:

PowerCenter Server log

Workflow log file

Session log file

Session details file

Performance details file

Reject files

Row error logs

Recovery tables and files

Control file

Post-session email

Output file

Cache files

When the PowerCenter Server on UNIX creates any file other than a recovery file, it sets the
file permissions according to the umask of the shell that starts the PowerCenter Server. For
example, when the umask of the shell that starts the PowerCenter Server is 022, the
PowerCenter Server creates files with rw-r--r-- permissions. To change the file permissions,
you must change the umask of the shell that starts the PowerCenter Server and then restart it.
The PowerCenter Server on UNIX creates recovery files with rw------- permissions.
The PowerCenter Server on Windows creates files with read and write permissions.
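The following Python sketch shows how a umask of 022 yields rw-r--r-- permissions. Assuming the usual base mode of 666 for newly created data files, applying the umask clears the matching permission bits; this is standard UNIX behavior, not PowerCenter-specific code:

import stat

umask = 0o022
mode = 0o666 & ~umask                          # 0o644
print(stat.filemode(stat.S_IFREG | mode))      # prints -rw-r--r--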

PowerCenter Server Log


The PowerCenter Server creates a log for all status and error messages. You can troubleshoot
PowerCenter Server problems by examining error messages sent to this log.


On UNIX, the default name of the PowerCenter Server log file is pmserver.log. You configure
the PowerCenter Server log file name with the LogFileName option in the PowerCenter
Server setup program.
On Windows, the PowerCenter Server logs status and error messages in the event log. Use the
Event Viewer to access those messages. You can also configure the PowerCenter Server on
Windows to write status and error messages to a file.

PowerCenter Server Messages


The PowerCenter Server associates a message code with the text of every message. The code
uses a text prefix, such as LM, CMN, or RR, with a code number, such as CMN_1039. In
PowerCenter Server error logs, the codes appear before the text as follows:
LM_34003 Server initialization completed.
LM_36802 Workflow <workflow name> scheduled to run at <time>.

Some message codes are embedded within other codes, for example:
CMN_1050 [LM 2041 Received request to start session]

You can also configure the PowerCenter Server on Windows to write error messages to the
Application Log, which you can view with the Event Viewer. Messages sent from the
PowerCenter Server display PowerCenter in the Source column, the code prefix in the
Category column, and the code number in the Event column. However, since some message
codes are embedded within other codes, to ensure you are viewing the true message code, you
must view the text of the message.
Figure 1-12 shows a sample application log:
Figure 1-12. Event Viewer Application Log Message


Figure 1-13 shows how you can view the text of the message by selecting the message and
using the Enter key:
Figure 1-13. Application Log Message Detail

Error Messages
Using the listed error code, consult the Troubleshooting Guide for probable causes and actions
to correct the problem.

Workflow Log File


The PowerCenter Server creates a workflow log file for each workflow it runs. It writes
information in the workflow log such as initialization of processes, workflow task run
information, errors encountered, and workflow run summary. Workflow log error messages
are categorized into severity levels. You can configure the PowerCenter Server to suppress
writing messages to the workflow log file. You can also configure the workflow to write
workflow messages to the session log file.
As with PowerCenter Server logs and session logs, the PowerCenter Server enters a code
number into the workflow log file message along with message text. You can find information
on error messages in the Troubleshooting Guide.
By default, the PowerCenter Server saves workflow logs in a directory entered for the server
variable $PMWorkflowLogDir in the PowerCenter Server registration and names the
workflow log workflow_name.log.
By default, the PowerCenter Server saves only one workflow log for each workflow. If you
want to save multiple logs for different workflow runs, you can configure the workflow to save

a workflow log file in two different ways:

By timestamp, permitting an unlimited number of workflow logs.

By cycle, saving the configured number of workflow logs, replacing the older logs with new
logs. You can use the server variable $PMWorkflowLogCount to set the number of logs the
PowerCenter Server archives for the workflow.

For more information about the workflow log, see Log Files on page 455.
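For illustration, the following Python sketch shows one way cycle-based archiving can work: keep the most recent logs up to the configured count and discard the oldest. The archive file names here are hypothetical; the actual naming the PowerCenter Server uses is described in Log Files on page 455.

import os

def archive_by_cycle(log_path, keep):
    # Shift existing archives up by one, discarding the oldest.
    for i in range(keep - 1, 0, -1):
        older = '%s.%d' % (log_path, i)
        newer = '%s.%d' % (log_path, i - 1)
        if os.path.exists(newer):
            os.replace(newer, older)
    if os.path.exists(log_path):
        os.replace(log_path, log_path + '.0')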

Session Log File


The PowerCenter Server creates a session log file for each session it runs. It writes information
in the session log such as initialization of processes, session validation, creation of SQL
commands for reader and writer threads, errors encountered, and load summary. The amount
of detail in the session log depends on the tracing level that you set.
As with PowerCenter Server logs and workflow logs, the PowerCenter Server enters a code
number along with message text. You can find information on error messages in the
Troubleshooting Guide.
By default, the PowerCenter Server saves session logs in a directory entered for the server
variable $PMSessionLogDir in the PowerCenter Server registration and names the session log
session_name.log.
By default, the PowerCenter Server saves only one session log for each session. If you want to
save multiple logs for different session runs, you can configure the session to save a session log
file in two different ways:

By timestamp, permitting an unlimited number of session logs.

By cycle, saving the configured number of session logs, replacing the older logs with new
logs. You can use the server variable $PMSessionLogCount to set the number of logs the
PowerCenter Server archives for the session.

For more information about the session log, see Log Files on page 455.

Session Details
When you run a session, the Workflow Manager creates session details that provide load
statistics for each target in the mapping. You can monitor session details during the session or
after the session completes. Session details include information such as table name, number of
rows written or rejected, and read and write throughput. You can view this information by
double-clicking the session in the Workflow Monitor.
For more information on session details, see Monitoring Session Details on page 434.

Performance Detail File


The PowerCenter Server can create a set of information known as session performance details
to help determine where performance can be improved. Performance details provide

transformation-by-transformation information on the flow of data through the session. To generate this information for a session, select the performance detail option in the session properties.
You can view performance details in the Workflow Monitor, or open the text file that contains
the information in a text editor. The PowerCenter Server names the file session_name.perf,
and stores it in the same directory as the session log (in the PowerCenter Server variable
directory $PMSessionLogDir, by default).
For more information on performance details, see Creating and Viewing Performance
Details on page 436.

Reject Files
By default, the PowerCenter Server creates a reject file for each target in the session. The
reject file contains rows of data that the writer does not write to targets.
The writer may reject a row in the following circumstances:

It is flagged for reject by an Update Strategy or Custom transformation.

It violates a database constraint, such as a primary key constraint.

A field in the row was truncated or overflowed, and the target database is configured to
reject truncated or overflowed data.

By default, the PowerCenter Server saves the reject file in the directory entered for the server
variable $PMBadFileDir in the Workflow Manager, and names the reject file
target_table_name.bad.
Note: If you enable row error logging, the PowerCenter Server does not create a reject file.

For more information about the reject file, see Log Files on page 455.

Row Error Logs


When you configure a session, you can choose to log row errors in a central location. When a
row error occurs, the PowerCenter Server logs error information that allows you to determine
the cause and source of the error. The PowerCenter Server logs information such as source
name, row ID, current row data, transformation, timestamp, error code, error message,
repository name, folder name, session name, and mapping information.
For more information about row error logging, see Row Error Logging on page 481.

Recovery Tables and Files


You can recover failed sessions that write to relational targets. The PowerCenter Server creates
recovery tables on the target database system when it runs a session enabled for recovery.
When you run a session in recovery mode, the PowerCenter Server uses information in the
recovery tables to complete the session.
For more information about recovery, see Recovering Data on page 295.

Control File
When you run a session that uses an external loader, the PowerCenter Server creates a control
file and a target flat file. The control file contains information about the target flat file such as
data format and loading instructions for the external loader. The control file has an extension
of .ctl. You can view the control file and the target flat file in the target file directory (default:
$PMTargetFileDir).
For more information about external loading and control files, see External Loading on
page 523.

Email
You can compose and send email messages by creating an Email task in the Workflow
Designer or Task Developer. You can place the Email task in a workflow, or you can associate
it with a session. The Email task allows you to automatically communicate information about
a workflow or session run to designated recipients.
Email tasks in the workflow send email depending on the conditional links connected to the
task. For post-session email, you can create two different messages, one to be sent if the
session completes successfully, the other if the session fails. You can also use variables to
generate information about the session name, status, and total rows loaded.
For example, if your database administrator wants to track how long a session takes to
complete, you can configure the session to send an email containing the time and date the
session starts and completes. Or, if you want to notify your Informatica administrator when a
session fails, you can configure the session to send an email only if it fails and attach the
session log to the email.
For more information, see Sending Email on page 319.

Indicator File
If you use a flat file as a target, you can configure the PowerCenter Server to create an
indicator file for target row type information. For each target row, the indicator file contains a
number to indicate whether the row was marked for insert, update, delete, or reject. The
PowerCenter Server names this file target_name.ind and stores it in the same directory as the
target file. For more information about configuring the PowerCenter Server, see the
Installation and Configuration Guide.

Output File
If the session writes to a target file, the PowerCenter Server creates the target file based on a
file target definition. By default, the PowerCenter Server names the target file based on the
target definition name. If a mapping contains multiple instances of the same target, the
PowerCenter Server names the target files based on the target instance name.


The PowerCenter Server creates this file in the PowerCenter Server variable directory,
$PMTargetFileDir, by default. For more information about working with target files, see
Working with Targets on page 233.

Cache Files
When the PowerCenter Server creates a memory cache, it also creates cache files. The
PowerCenter Server creates index and data cache files for the following transformations in a
mapping:

Aggregator transformation

Joiner transformation

Rank transformation

Lookup transformation

Sorter transformation

By default, the DTM creates the index and data files for Aggregator, Rank, Joiner, and
Lookup transformations in the directory configured for the $PMCacheDir server variable.
The PowerCenter Server names the index file PM*.idx, and the data file PM*.dat. The
PowerCenter Server creates the index and data files for the Sorter transformation in the
$PMTempDir server variable directory.
The PowerCenter Server writes to the cache files during the session in the following cases:

The mapping contains one or more Aggregator transformations configured without sorted ports.

The session is configured for incremental aggregation.

The mapping contains a Lookup transformation that is configured to use a persistent lookup cache, and the PowerCenter Server runs the session for the first time.

The mapping contains a Lookup transformation that is configured to initialize the persistent lookup cache.

The DTM runs out of cache memory and pages to the local cache files. The DTM may create multiple files when processing large amounts of data. The session fails if the local directory runs out of disk space.

After the session completes, the DTM generally deletes the overflow index and data files. It
does not delete the cache files under the following circumstances:

The session is configured to perform incremental aggregation.

The session is configured with a persistent lookup cache.

Incremental Aggregation Files


If the session performs incremental aggregation, the PowerCenter Server saves index and data
cache information to disk when the session finishes. The next time the session runs, the
PowerCenter Server uses this historical information to perform the incremental aggregation.


The PowerCenter Server names these files PMAGG*.dat and PMAGG*.idx and saves them to
the cache directory.
For more information about incremental aggregation, see Using Incremental Aggregation
on page 573.

Persistent Lookup Cache


If a session uses a Lookup transformation, you can configure the transformation to use a
persistent lookup cache. With this option selected, the PowerCenter Server saves the lookup
cache to disk the first time it runs the session, then uses this lookup cache during subsequent
session runs. These files are saved in the cache directory. If you do not name the files in the
transformation properties, these files are named PMLKUP*.idx and PMLKUP*.dat.
For more information about lookup caching, see Session Caches on page 613 and Lookup
Transformation in the Transformation Guide.


Chapter 2

Configuring the Workflow Manager
This chapter covers the following topics:

Overview, 38

Customizing the Workflow Manager Options, 39

Registering the PowerCenter Server, 46

Configuring Connection Object Permissions, 51

Setting Up a Relational Database Connection, 53

Replacing a Relational Database Connection, 62


Overview
Before you can use the Workflow Manager to create workflows and sessions, you must
configure the Workflow Manager. You can configure display options and connection
information in the Workflow Manager. You must register a PowerCenter Server before you
can start it or create a workflow to run against it.
You can configure the following information in the Workflow Manager:

Configure Workflow Manager options. You can configure options such as grouping
sessions or docking and undocking windows. For details, see Customizing the Workflow
Manager Options on page 39.

Register PowerCenter Servers. Before you can start a PowerCenter Server, you must
register it with the repository. For details, see Registering the PowerCenter Server on
page 46.

Create a server grid. When you have multiple PowerCenter Servers registered to the same
repository you can create a server grid to balance workloads. For details, see Working with
Server Grids on page 446.

Create source and target database connections. Create connections to each source and
target database. You must create connections to a database before you can create a session
that accesses the database. For details, see Setting Up a Relational Database Connection
on page 53.

Create connection objects. Create connection objects in the repository when you define
database, FTP, and external loader connections. For details, see Configuring Connection
Object Permissions on page 51.

Setting the Date/Time Display Format


The Workflow Manager displays the date and time formats configured in the Windows Control Panel of the PowerCenter Client machine. To modify the date and time formats, display the Control Panel and open Regional Settings. Set the date and time formats on the Date and Time tabs.
Note: For the Timer task and schedule settings, the Workflow Manager displays the date in short date format and the time in 24-hour format (HH:mm).


Customizing the Workflow Manager Options


You can customize the Workflow Manager default options to control the behavior and look of
the Workflow Manager tools.
To configure Workflow Manager options, choose Tools-Options. You can configure the
following options:

General. You can configure workspace options, display options, and other general options
on the General tab. For more information about the General tab, see Configuring
General Options on page 39.

Format. You can configure font, color, and other format options on the Format tab. For
more information about the Format tab, see Configuring Format Options on page 42.

Miscellaneous. You can configure Copy Wizard and Versioning options on the
Miscellaneous tab. For more information about the Miscellaneous tab, see Configuring
Miscellaneous Options on page 43.

Advanced. You can configure enhanced security for connection objects in the Advanced
tab. For more information about the Advanced tab, see Enabling Enhanced Security on
page 44.

Configuring General Options


General options control tool behavior such as whether or not a tool retains its view when you
close it, how the Overview window behaves, and where the Workflow Manager stores
workspace files.


Figure 2-1 shows the Workflow Manager General Options:


Figure 2-1. Workflow Manager General Options

Table 2-1 describes general options you can configure in the Workflow Manager:
Table 2-1. Workflow Manager General Options

Reload Tasks/Workflows When Opening a Folder. Reloads the last view of a tool when you open it. For example, if you have a workflow open when you disconnect from a repository, select this option so that the same workflow displays the next time you open the folder and Workflow Designer. Enabled by default.

Ask Whether to Reload the Tasks/Workflows. Appears only when you select Reload Tasks/Workflows When Opening a Folder. Select this option if you want the Workflow Manager to prompt you to reload tasks, workflows, and worklets each time you open a folder. Disabled by default.

Overview Window Pans Delay. By default, when you drag the focus of the Overview window, the focus of the workbook moves concurrently. When you select this option, the focus of the workspace does not change until you release the mouse button. Disabled by default.

Arrange Workflows/Worklets Vertically By Default. Arranges tasks in workflows vertically by default. Disabled by default.

Allow Invoking In-Place Editing Using the Mouse. By default, you can press F2 to edit objects directly in the workspace instead of opening the Edit Task dialog box. Select this option so you can also click the object name in the workspace to edit the object. Disabled by default.

Open Editor When Task Is Created. Opens the Edit Task dialog box when you create a task. By default, the Workflow Manager creates the task in the workspace. If you do not enable this option, double-click the task to open the Edit Task dialog box. Disabled by default.

Workspace File Directory. The directory for workspace files created by the Workflow Manager. Workspace files maintain the last task or workflow you saved. This directory should be local to the PowerCenter Client to prevent file corruption or overwrites by multiple users. By default, the Workflow Manager creates files in the PowerCenter Client installation directory.

Display Tool Names On Views. Displays the name of the tool in the upper left corner of the workspace or workbook. Enabled by default.

Always Show the Full Name of Selected Task. Shows the full name of a task when you select it. By default, the Workflow Manager abbreviates the task name in the workspace. Enabled by default.

Show the Expression On a Link. Shows the link condition in the workspace. If you do not enable this option, the Workflow Manager abbreviates the link condition in the workspace. Enabled by default.

Launch Workflow Monitor when Workflow is Started. The Workflow Monitor launches when you start a workflow or a task. Enabled by default.

Receive Notifications from Server. Allows you to receive notification messages from the Repository Server. The Repository Server sends notification about actions performed on repository objects. Enabled by default. For details, see Understanding the Repository in the Repository Guide.

Configuring Format Options


Format options control colors and fonts. To configure format options, select the appropriate
Workflow Manager tool.
Figure 2-2 shows the Workflow Manager Format Options:
Figure 2-2. Workflow Manager Format Options

Table 2-2 describes the format options for the Workflow Manager:
Table 2-2. Workflow Manager Format Options

Show Solid Lines for Links. Displays links as solid lines. By default, the Workflow Manager displays links as dotted lines.

Workspace Colors. Displays all items that you can customize in the selected tool. Select an item to change its color.

Color. Choose the color of the selected item in Workspace Colors.

Font Categories. Select the Workflow Manager tool for which you want to customize the display font.

Change Font. Select to change the display font and language script for the Workflow Manager tool you choose from the Categories menu.

Reset All. Resets all format options to their original default values.


Configuring Miscellaneous Options


Copy Wizard options control the display settings and available functions for the Copy
Wizard. Versioning options control how the Workflow Manager displays checked out objects.
Target loading options control how the PowerCenter Server loads targets. To configure Copy
Wizard, Versioning, or Target Load Type options, choose Tools-Options and select the
Miscellaneous tab.
Figure 2-3 shows the Workflow Manager Miscellaneous Options:
Figure 2-3. Copy Wizard, Versioning, and Target Load Type Options

Table 2-3 describes the options for the Copy Wizard, Versioning, and Target Load Type:
Table 2-3. Workflow Manager Miscellaneous Options

Validate Copied Objects. Validates the copied object. Enabled by default.

Generate Unique Name When Resolved to Rename. Generates unique names for copied objects if you select the Rename option. For example, if the workflow wf_Sales has the same name as a workflow in the destination folder, the Rename option generates the unique name wf_Sales1. Enabled by default.

Get Default Object When Resolved to Choose. Uses the object with the same name in the destination folder if you select the Choose option.

Show Check Out Image in Navigator. Displays the Check Out icon when an object has been checked out. Enabled by default.

Reset All. Resets all Copy Wizard and Versioning options to their default values.

Target Load Type. Sets the default load type for sessions. You can choose normal or bulk loading. Any change you make takes effect after you restart the Workflow Manager. You can override this setting in the session properties. Default is Bulk. For more information on normal and bulk loading, see Table A-15 on page 697.

Enabling Enhanced Security


The Workflow Manager has an enhanced security option that allows you to specify a default
set of privileges that applies to restricted access controls for connection objects.
When you enable enhanced security, the Workflow Manager automatically assigns default
permissions for connection objects to the object owner, owner group, and all other users. You
can assign read, write, and execute permissions to an object, and specify permission for users
and groups you add in the Permissions dialog box when you edit a connection.
Table 2-4 lists the default permissions for a connection object:

Table 2-4. Default Permissions for Connection Objects

Owner: Read/Write/Execute
Owner Group: Read/Execute
World: No permissions

If you do not enable enhanced security, the Workflow Manager assigns Read, Write, and
Execute permissions to all users or groups for the connection.
Enabling enhanced security does not lock the restricted access settings for connection objects.
You can continue to change the permissions for connection objects after enabling enhanced
security.
If you delete the Owner from the repository, the Workflow Manager automatically assigns
ownership of the object to Administrator.
To enable enhanced security for connection objects:

1.  Choose Tools-Options.
2.  Click the Advanced tab.
3.  Select Enable Enhanced Security.
4.  Click OK.

Registering the PowerCenter Server


Before you can start the PowerCenter Server or create or run workflows, you need to register
the PowerCenter Server in the repository. Use the Workflow Manager to register the
PowerCenter Server.
To register, edit, or delete the PowerCenter Server, you must have Administer Server,
Administrator, or Super User privileges. In addition, to register a PowerCenter Server, you
need the following information:

PowerCenter Server name.

Host name.

TCP/IP address used to access the PowerCenter Server.


Use the IP address or host name of the machine on which the PowerCenter Server runs,
and the port number the PowerCenter Server uses on that machine.

Code page identifying the character set associated with the PowerCenter Server.

Default directories you want the PowerCenter Server to use for workflow files and caches.

You can perform the following registration tasks for a PowerCenter Server:

Register a PowerCenter Server. When you register a PowerCenter Server, specify


information such as the code page and directories for session output. This information is
stored in the repository.
When you register multiple PowerCenter Servers, you can choose the PowerCenter Server
to run a workflow or a session. You also can create a server grid to distribute workloads
across multiple servers.

Edit a PowerCenter Server. When you edit a PowerCenter Server, all workflows and
sessions using that PowerCenter Server use the updated server connection information,
including the updated code page settings. You do not need to restart the Workflow
Manager to use the updated information.

Delete a PowerCenter Server. When you delete a PowerCenter Server, you must assign
another PowerCenter Server for the workflows and sessions using the deleted server before
you can run the workflow. To assign a PowerCenter Server to a workflow or to a session,
choose Connections-Assign.

Server Variables
You can define server variables for each PowerCenter Server you register. Some server variables
define the path and directories for workflow output files and caches. By default, the
PowerCenter Server places output files in these directories when you run a workflow. Other
server variables define server attributes such as log file count. In a server grid, you must use
the same server variables for each server.
The installation process creates directories in the location where you install the PowerCenter
Server. To use these directories as the default location for the session output files, you must
first set the server variable $PMRootDir to define the path to the directories.
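
For example, you might set the server variables as follows. This is a minimal sketch that assumes a hypothetical UNIX installation directory; the subdirectory names shown are the defaults listed in Table 2-5:

    $PMRootDir        /opt/informatica/pcserver
    $PMSessionLogDir  $PMRootDir/SessLogs
    $PMBadFileDir     $PMRootDir/BadFiles
    $PMCacheDir       $PMRootDir/Cache
    $PMTargetFileDir  $PMRootDir/TgtFiles
    $PMTempDir        $PMRootDir/Temp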

By using server variables, you simplify the process of changing the PowerCenter Server that
runs a workflow. If each workflow in a folder uses server variables, then when you copy the
folder to a production repository, the PowerCenter Server in production can run the workflow
using the server variables defined with the PowerCenter server running against the test
repository. The PowerCenter Server reads and writes the files to the directories in the
$PMRootDir path. To ensure a workflow successfully completes, relocate any necessary file
source or incremental aggregation file to the default directories of the new PowerCenter
Server.
Table 2-5 lists the server variables you configure when you register a PowerCenter Server:
Table 2-5. Server Variables

$PMRootDir (Required). A root directory to be used by any or all other server variables. Informatica recommends you use the PowerCenter Server installation directory as the root directory.

$PMSessionLogDir (Required). Default directory for session logs. Defaults to $PMRootDir/SessLogs.

$PMBadFileDir (Required). Default directory for reject files. Defaults to $PMRootDir/BadFiles.

$PMCacheDir (Required). Default directory for the index and data cache files. Defaults to $PMRootDir/Cache. To avoid performance problems, always use a drive local to the PowerCenter Server for the cache directory. Do not use a mapped or mounted drive for cache files.

$PMTargetFileDir (Required). Default directory for target files. Defaults to $PMRootDir/TgtFiles.

$PMSourceFileDir (Required). Default directory for source files. Defaults to $PMRootDir/SrcFiles.

$PMExtProcDir (Required). Default directory for external procedures. Defaults to $PMRootDir/ExtProc.

$PMTempDir (Required). Default directory for temporary files. Defaults to $PMRootDir/Temp.

$PMSuccessEmailUser (Optional). Email address to receive post-session email when the session completes successfully. Use to address post-session email. The default value is an empty string. For details, see Sending Email on page 319.

$PMFailureEmailUser (Optional). Email address to receive post-session email when the session fails. Use to address post-session email. The default value is an empty string.

$PMSessionLogCount (Optional). Number of session logs the PowerCenter Server archives for the session. Use to archive session logs. For details, see Viewing Session Logs on page 474. Defaults to 0.

$PMSessionErrorThreshold (Optional). Number of non-fatal errors the PowerCenter Server allows before failing the session. Non-fatal errors include reader, writer, and DTM errors. If you want to stop the session on errors, enter the number of non-fatal errors you want to allow before stopping the session. The PowerCenter Server maintains an independent error count for each source, target, and transformation. Use to configure the Stop On option in the session properties. Defaults to 0. If you use the default setting, non-fatal errors do not cause the session to stop.

$PMWorkflowLogDir (Required). Default directory for workflow logs. Defaults to $PMRootDir/WorkflowLogs.

$PMWorkflowLogCount (Optional). Number of workflow logs the PowerCenter Server archives for the workflow. Defaults to 0.

$PMLookupFileDir (Optional). Default directory for lookup files. Defaults to $PMRootDir/LkpFiles.

Steps for Registering a PowerCenter Server


You can register one or more PowerCenter Servers with a PowerCenter repository, allowing
you to run workflows and sessions on different servers. In a multiple server environment, it is
important to enter descriptive server names for each registered server to help users
differentiate between servers. When you register multiple servers you must have a unique
server name and a unique combination of host name and port number for each server in the
repository. For more information on using multiple servers, see Using Multiple Servers on
page 443.
To register the PowerCenter Server:
1.  In the Workflow Manager, connect to the repository.
    Note: The first time you connect to the repository, use the database user name and password used to create the repository.
2.  Choose Server-Server Configuration.
    The Server Browser dialog box appears.
3.  Click New to register a new server.
    The Server dialog box appears.
4.  Enter a new server name.
5.  Configure the TCP/IP connectivity settings.
6.  If you do not know the IP address, enter the host name and use the Resolve Server button to resolve the IP address. You can also enter the IP address in the Host Name/IP Address field and use the Resolve Server button to resolve the host name.
    The Workflow Manager can only resolve the host name or IP address if you enter the information in the Host Name/IP Address field. The Workflow Manager also resolves the host name or IP address when you click OK.
Table 2-6 describes the settings required to register a PowerCenter Server using TCP/IP:
Table 2-6. TCP/IP Settings to Register a Server

Server Name (Required). The name of PowerCenter Server. This name must be unique to the repository.

Host Name or IP Address (Required). Server host name or IP address of the PowerCenter Server machine.

Resolved IP Address (n/a, read-only). The IP address resolved by the Workflow Manager. This is a read-only field.

Port Number (Required). Port number the PowerCenter Server uses. Must be the same port listed in the PowerCenter Server configuration parameters.

Timeout (Required). Number of seconds the Workflow Manager waits for a response from the PowerCenter Server.

Code Page (Required). Character set associated with the PowerCenter Server. Select the code page identical to the PowerCenter Server operating system code page. Must be identical to or compatible with the repository code page.

7.  For $PMRootDir, enter a valid root directory for the PowerCenter Server platform.
    Informatica recommends using the PowerCenter Server installation directory as the root directory because the PowerCenter Server installation creates the default server directories there. If you enter a different root directory, make sure to create the necessary directories.

8.  Enter the server variables, as desired.
    Do not use trailing delimiters. A trailing delimiter might invalidate the directory used by the PowerCenter Server. For example, enter c:\data\sessionlog, not c:\data\sessionlog\. See Table 2-5 on page 47 for a list of server variables.
9.  Click OK.
    The new PowerCenter Server appears in the Navigator below the repository.

Deleting a PowerCenter Server


When you delete a PowerCenter Server with associated workflows, assign another server to
the workflows. For details, see Assigning the PowerCenter Server to a Workflow on
page 122.
To delete a PowerCenter Server, you must have one of the following privileges:

Administer Server privilege

Super User privilege

To delete a server:
1.  In the Workflow Manager, choose Server-Server Configuration.
2.  Select the PowerCenter Server you want to delete.
3.  Click Delete.
4.  Click OK.

Configuring Connection Object Permissions


You create connection objects in the repository when you define the following connections:

Relational. Database connections for relational source or target databases. For more
information about relational database connections, see Setting Up a Relational Database
Connection on page 53.

Queue. Database connections for message queues. For more information about message
queues, see the PowerCenter Connect for IBM MQSeries User and Administrator Guide.

FTP. Connection to access source or target files using File Transfer Protocol (FTP). For
more information about using FTP, see Using FTP on page 559.

Application. Database connection to access databases such as SAP R/3 and PeopleSoft. For
more information, see your PowerCenter Connect documentation.

Loader. Connection to access target databases using external loaders. For more
information about using external loaders, see External Loading on page 523.

With correct permissions, you can access these objects from all folders in the repository and
use them in any session.

Connection Object Permissions


You can configure and manage permissions within each connection object. The Workflow
Manager assigns Owner permissions to the user who creates the connection. The Workflow
Manager grants Owner Group permissions to the first group in the Group Memberships list
of the owner.
The Workflow Manager automatically assigns default permissions for connection objects to
the object owner, owners group, and all other users if you enable enhanced security. For more
information about enhanced security, see Enabling Enhanced Security on page 44.
You can specify read, write, and execute permissions for each user and group in the list. You
can perform the following types of tasks with different connection object permissions, in
combination with user privileges and folder permissions:

Read. View the connection object in the Workflow Manager and Repository Manager.
When you have read permission, you can perform tasks in which you view, copy, or edit
repository objects associated with the connection object.

Write. Edit the connection object.

Execute. Run sessions that use the connection object.

For information on tasks you can perform with user privileges, folder permissions, and
connection object permissions, see Repository Security in the Repository Guide.
To manage connection permissions, you must have Super User privileges or be the owner of
the connection. If you do not have the privilege to manage connection permissions, the
Permissions dialog box is read-only. Otherwise, you can change the owner of the object, add or remove users and groups in the permissions list, and change the permissions for each user or group.


To view or delete a connection, you must have at least read permission for the connection. To
edit a connection, you must have read and write permissions for the connection.
You add permissions from the Connection Browser dialog box.
To configure permissions for connection objects:
1.  Open the Connection Browser dialog box for the connection object. For example, choose Connections-Relational to open the Connection Browser dialog box for a relational database connection.
2.  Select the connection object you want to configure in the Connection Browser dialog box.
3.  Click Permissions to open the Permissions dialog box.
4.  Select the owner and group for the connection object.
5.  Add the users or groups you want to assign permissions for the connection, and click OK.

Setting Up a Relational Database Connection


Before the PowerCenter Server can access a source or target database in a session, you must
configure the database connections in the Workflow Manager. When you create or modify a
session that reads from or writes to a relational database, you can select only configured source
and target databases. Database connections are saved in the repository.
When you create a connection, you must have the following information available:

Database name. Name for the connection.

Database type. Type of the source or target database.

Database username. Name of a user who has the appropriate database permissions to read
from and write to the database.

Password. Database password (7-bit ASCII only).

Connect string. Connect string used to communicate with the database.

Database code page. Code page associated with the database.

Some database drivers, such as ISG Navigator, do not allow user names and passwords. Since
the Workflow Manager requires a database user name and password, PowerCenter provides
two reserved words to register databases that do not allow user names and passwords:

PmNullUser

PmNullPasswd

Use the PmNullUser user name if you are using Oracle OS Authentication. Oracle OS
Authentication allows you to log on to an Oracle database if you have a logon to the operating
system. You do not need to know a database user name and password. PowerCenter uses
Oracle OS Authentication when the connection user name is PmNullUser and the connection
is for an Oracle database.
You can change connection information at any time. If you edit a Workflow Manager
connection used by a workflow, the PowerCenter Server uses the updated connection
information the next time the workflow runs. You might use this functionality when moving
from test to production.
Tip: If you edit a database connection, all sessions using the named connection then use the updated connection.
To create a database connection, you must have one of the following privileges:

Use Workflow Manager

Super User

Database Connect Strings


When you create a database connection, specify a connect string for that connection. The
PowerCenter Server uses connect strings to communicate with a database.


Table 2-7 lists the native connect string syntax for each supported database when you create
or update connections:
Table 2-7. Native Connect String Syntax

IBM DB2: dbname (example: mydatabase)
Informix: dbname@servername (example: mydatabase@informix)
Microsoft SQL Server: servername@dbname (example: sqlserver@mydatabase)
Oracle: dbname.world, the same as the TNSNAMES entry (example: oracle.world)
Sybase: servername@dbname (example: sambrown@mydatabase)
Teradata*: ODBC_data_source_name, ODBC_data_source_name@db_name, or ODBC_data_source_name@db_user_name (examples: TeradataODBC, TeradataODBC@mydatabase, TeradataODBC@jsmith)

*Use Teradata ODBC drivers to connect to source and target databases.
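
As an illustration of the Oracle entry, the connect string typically matches an alias defined in the tnsnames.ora file used by the Oracle client on the PowerCenter Server machine. The following is a minimal sketch of such an alias; the host, port, and service name are hypothetical and depend on your Oracle installation:

    oracle.world =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = dbhost.example.com)(PORT = 1521))
        (CONNECT_DATA = (SERVICE_NAME = orcl))
      )

In the Workflow Manager connect string field, you then enter only the alias, in this case oracle.world.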

Database Connection Code Pages


When you create a database connection, select a code page for that connection. Code pages
must be compatible for accurate data movement.
If you configure the PowerCenter Server and PowerCenter Client for data code page
validation, the PowerCenter Server enforces code page compatibility at session runtime. Use
the following guidelines to determine code page compatibility:

The target database code page must be a superset of the source database code page and the
PowerCenter Server code page.

The source database code page must be a subset of the target database code page and the
PowerCenter Server code page.

For example, if the source database code page is 7-bit ASCII and the PowerCenter Server code
page is Latin 1, the target database code page must be Latin 1, which is a superset of 7-bit
ASCII.
Table 2-8 summarizes code page compatibility between the source and target code pages when
you configure the PowerCenter Client and PowerCenter Server for data code page validation:
Table 2-8. Source and Target Code Page Compatibility

Source: Subset of target and PowerCenter Server.

Target: Superset of source and PowerCenter Server. The PowerCenter Server creates external loader data and control files using the target flat file code page.

When you change the code page in a database connection, you must choose one that is
compatible with the previous code page. If the code pages are incompatible, the Workflow
Manager invalidates all sessions using that database connection.
If you configure the PowerCenter Client and PowerCenter Server for relaxed data code page
validation, you can select any supported code page for source and target database connections.
If you are familiar with your data and are confident that it will convert safely from one code
page to another, you can run sessions with incompatible source and target data code pages. It
is your responsibility to ensure your data will convert properly.
For details, see Globalization Overview and Code Pages in the Installation and
Configuration Guide.

Configuring Environment SQL


For relational databases, you may need to execute some SQL commands in the database
environment when you connect to the database. For example, you might want to set isolation
levels on the source and target systems to avoid deadlocks.
You configure environment SQL in the database connection. You can use environment SQL
for source, target, lookup, and stored procedure connections. If the SQL syntax is not valid,
the PowerCenter Server does not connect to the database, and the session fails.
The PowerCenter Server executes the SQL each time it connects to the database. For example,
if you configure environment SQL in a target connection, and you configure three partitions
for the pipeline, the PowerCenter Server executes the SQL three times, once for each
connection to the target database.

Guidelines for Entering Environment SQL


Consider the following guidelines when creating the SQL statements:

You can enter any SQL command that is valid in the database associated with the
connection object. The PowerCenter Server does not allow nested comments, even though
the database might.

When you enter SQL in the SQL Editor, you manually type in the SQL statements.

Use a semi-colon (;) to separate multiple statements.

The PowerCenter Server ignores semi-colons within single quotes, double quotes, or
within /* ...*/.

If you need to use a semi-colon outside of quotes or comments, you can escape it with a
back slash (\).

You cannot use session or mapping variables in the environment SQL.

You can configure the table owner name using sqlid in the environment SQL for a DB2
connection. However, the table owner name in the target instance overrides the SET sqlid
statement in environment SQL. To use the table owner name specified in the SET sqlid
statement, do not enter a name in the target name prefix.
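
For example, for an Oracle connection you might enter environment SQL such as the following to set the session date format and isolation level before each load. This is only an illustrative sketch; the statements you need depend on your database and requirements:

    ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD HH24:MI:SS';
    ALTER SESSION SET ISOLATION_LEVEL = READ COMMITTED

The semi-colon separates the two statements, as described in the guidelines above.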


Configuring a Relational Database Connection


Use the following procedure to configure a relational database connection.
To create a relational database connection:
1.  In the Workflow Manager, connect to a repository.
2.  Choose Connections-Relational.
    A dialog box appears, listing all the registered source and target database connections.
3.  Select the type of database connection you want to create.
4.  Click New.
    The Connection Object Definition dialog box appears.
5.  For relational database connections, enter the connection information listed in Table 2-9:

Table 2-9. Relational Database Connection Information

Name (Required). Connection name used by the Workflow Manager. Connection name cannot contain spaces or other special characters, except for the underscore.

Type (Required). Type of database.

User Name (Required). Database user name with the appropriate read and write database permissions to access the database. If you are using Oracle OS Authentication, or you are using databases such as ISG Navigator that do not allow user names, enter PmNullUser. For Teradata connections, this overrides the default database user name in the ODBC entry.

Password (Required). Password for the database user name. For Oracle OS Authentication, or for databases such as ISG Navigator that do not allow passwords, enter PmNullPasswd. For Teradata connections, this overrides the database password in the ODBC entry. Passwords must be in 7-bit ASCII only.

Connect String (Required for all databases except Microsoft SQL Server and Sybase). Connect string used to communicate with the database. For syntax, see Database Connect Strings on page 53.

Code Page (Required). Specifies the code page the PowerCenter Server uses to read from a source database or write to a target database or file.

6.  For each type of relational database connection, enter the attributes listed in Table 2-10:

Table 2-10. Relational Database Connection Attributes

Rollback Segment (Oracle). The name of the rollback segment. A rollback segment records database transactions in the event that you want to undo the transaction.

Enable Parallel Mode (Oracle). Enables parallel processing when loading data into a table in bulk mode.

Environment SQL (all relational databases). Enter SQL commands to set the database environment when you connect to the database.

Database Name (Sybase, Microsoft SQL Server, and Teradata). The name of the database. For Teradata connections, this overrides the default database name in the ODBC entry. Also, if you do not enter a database name here for a Teradata connection, the PowerCenter Server uses the default database name in the ODBC entry.

Data Source Name (Teradata). The name of the Teradata ODBC data source.

Server Name (Sybase and Microsoft SQL Server). Database server name. Used to configure workflows.

Packet Size (Sybase and Microsoft SQL Server). Used to optimize the ODBC connection to Sybase and Microsoft SQL Server.

Domain Name (Microsoft SQL Server). The name of the domain. Used for Microsoft SQL Server on Windows.

Use Trusted Connection (Microsoft SQL Server). If selected, the PowerCenter Server uses Windows authentication to access the Microsoft SQL Server database. The user name that starts the PowerCenter Server must be a valid Windows user with access to the Microsoft SQL Server database.

7.  Click OK.
    The new database connection appears in the Connection Browser list.
8.  To add more database connections, repeat steps 3-7.
9.  Click OK to save all changes.

Deleting Connection Objects


When you delete relational, queue, FTP, Application, and external loader connections, the
Workflow Manager marks all sessions that use these connections invalid. To make the sessions
valid, you must edit them and replace the missing connections.

Copying a Relational Database Connection


After you set up a relational database connection, you can make a copy of it by clicking the
Copy As button. The Workflow Manager allows you to choose the relational database type
when you make a copy of a relational database connection.
When you make a copy of a relational database connection, the Workflow Manager retains
the connection properties that apply to the relational database type you select. The copy of
the connection is invalid if a required connection property is missing. Edit the connection
properties manually to validate the connection.
The Workflow Manager appends an underscore and the first three letters of the relational
database type to the name of the new database connection. For example, you make a copy of
the Microsoft SQL Server database connection called Dev_Target. You choose Oracle for the
type of the new database connection. The Workflow Manager names the new database
connection Dev_Target_Ora.
To copy a relational database connection:
1.  Choose Connections-Relational.
    The Relational Connection Browser appears.
2.  Choose the relational connection you want to copy.
    Tip: Hold the shift key to select more than one connection to copy.
3.  Click Copy As.
    The Select Subtype dialog box appears.
4.  Select a relational database type for the copy of the connection.
5.  Click OK.
6.  The Workflow Manager retains connection properties that apply to the relational database type.
    If a required connection property does not exist, the Workflow Manager displays a warning message.
7.  Click OK to close the warning dialog box.
8.  The copy of the connection appears in the Relational Connection Browser.
9.  If the copied connection is invalid, click the Edit button to enter required connection properties.
10. Click Close to close the Relational Connection Browser dialog box.

Replacing a Relational Database Connection


You can replace a relational database connection with another relational database connection.
For example, you might have several sessions that you want to write to another target
database. Instead of editing the properties for each session, you can replace the relational
database connection for all sessions in the repository that use the connection.
When you replace database connections, the Workflow Manager replaces the relational
database connections in the following locations for all sessions using the connection:

Source connection

Target connection

Connection Information property in Lookup and Stored Procedure transformations

$Source Connection Value session property

$Target Connection Value session property

If the repository contains both relational and application connections with the same name,
the Workflow Manager only replaces the relational connection when you specified the
connection type as relational in all locations in the repository.
For example, you have a relational and an application source, each called ITEMS. In one
session, you specified the name ITEMS for a source connection instead of Relational:ITEMS.
When you replace the relational connection ITEMS with another relational connection, the
Workflow Manager does not replace any relational connection in the repository because it
cannot determine the connection type for the source connection entered as ITEMS.
The PowerCenter Server uses the updated connection information the next time the workflow
runs.
To replace connections in the Workflow Manager, you must have Super User privilege.
You must first close all folders before replacing a relational database connection.
To replace a relational database connection:
1.  Close all folders in the repository.
2.  Choose Connections-Replace.
    The Replace Connections dialog box appears.
3.  Click the Add button to replace a connection.
4.  In the From list, choose a relational database connection you want to replace.
5.  In the To list, choose the replacement relational database connection.
6.  Click Replace.
    All sessions in the repository that use the From connection now use the connection you choose in the To list.

Chapter 3

Using the Workflow Manager
This chapter covers the following topics:

Overview, 66

Navigating the Workspace, 69

Working with Repository Objects, 73

Checking Out and In Versioned Repository Objects, 74

Searching For Versioned Objects, 76

Copying Repository Objects, 77

Comparing Repository Objects, 79

Working with Metadata Extensions, 82


Overview
In the Workflow Manager, you define a set of instructions called a workflow to execute
mappings you build in the Designer. Generally, a workflow contains a session and any other
task you may want to perform when you execute a session. Tasks can include a session, email
notification, or scheduling information. You connect each task with links in the workflow.
You can also create a worklet in the Workflow Manager. A worklet is an object that groups a
set of tasks. A worklet is similar to a workflow, but without scheduling information. You can
execute a batch of worklets inside a workflow.
After you create a workflow, you run the workflow in the Workflow Manager and monitor it
in the Workflow Monitor. For details on the Workflow Monitor, see Monitoring Workflows
on page 401.

Workflow Manager Tools


To create a workflow, you first create tasks such as a session, which contains the mapping you
build in the Designer. You then connect tasks with conditional links to specify the order of
execution for the tasks you created. The Workflow Manager consists of three tools to help you
develop a workflow:

Task Developer. Use the Task Developer to create tasks you want to execute in the
workflow.

Workflow Designer. Use the Workflow Designer to create a workflow by connecting tasks
with links. You can also create tasks in the Workflow Designer as you develop the
workflow.

Worklet Designer. Use the Worklet Designer to create a worklet.

Figure 3-1 shows what a workflow might look like if you want to run a session, perform a
shell command after the session completes, and then stop the workflow:
Figure 3-1. Sample Workflow

Workflow Tasks
You can create the following types of tasks in the Workflow Manager:


Assignment. Assigns a value to a workflow variable. For details, see Working with the
Assignment Task on page 140.

Command. Specifies a shell command to run during the workflow. For details, see Using
Workflow Variables on page 103.


Control. Stops or aborts the workflow. For details on the Control task, see Stopping or
Aborting the Workflow on page 129.

Decision. Specifies a condition to evaluate. For details, see Working with the Decision
Task on page 149.

Email. Sends email during the workflow. For details on the Email task, see Sending
Email on page 319.

Event-Raise. Notifies the Event-Wait task that an event has occurred. For details, see
Working with Event Tasks on page 153.

Event-Wait. Waits for an event to occur before executing the next task. For details, see
Working with Event Tasks on page 153.

Session. Runs a mapping you create in the Designer. For details on the Session task, see
Working with Sessions on page 173.

Timer. Waits for a timed event to trigger. For details, see Scheduling a Workflow on
page 112.

Workflow Manager Windows


The Workflow Manager displays the following windows to help you create and organize
workflows:

Navigator. Allows you to connect to and work in multiple repositories and folders. In the
Navigator, the Workflow Manager displays a red icon over invalid objects.

Workspace. Allows you to create, edit, and view tasks, workflows, and worklets.

Output. Contains tabs to display different types of output messages. The Output window
contains the following tabs:

Save. Displays messages when you save a workflow, worklet, or task. The Save tab
displays a validation summary when you save a workflow or a worklet.

Fetch Log. Displays messages when the Workflow Manager fetches objects from the
repository.

Validate. Displays messages when you validate a workflow, worklet, or task.

Copy. Displays messages when you copy repository objects.

Server. Displays messages from the PowerCenter Server.

Notifications. Displays messages from the Repository Server.

Overview. An optional window that allows you to easily view large workflows in the
workspace. Outlines the visible area in the workspace and highlights selected objects in
color. Choose View-Overview Window to display this window.

You can view a list of open windows and switch from one window to another in the Workflow
Manager. To view the list of open windows, choose Window-Windows.
The Workflow Manager also displays a status bar that shows the status of the operation you
perform.


Figure 3-2 shows the Workflow Manager windows:


Figure 3-2. Workflow Manager Windows


Navigating the Workspace


The Workflow Manager allows you to perform the following operations to navigate the
workspace:

Customize windows.

Customize toolbars.

Search for tasks, links, events and variables.

Arrange objects in the workspace.

Zoom and pan the workspace.

Customizing Workflow Manager Windows


You can customize the following options for the Workflow Manager windows:

Display a window. From the menu, choose View. Then select the window you want to
open.

Close a window. Click the small x in the upper right corner of the window.

Dock or undock a window. Double-click the title bar, or drag the title bar toward or away
from the workspace.

Using Toolbars
The Workflow Manager can display the following toolbars to help you select tools and
perform operations quickly:

Standard. Contains buttons to connect to and disconnect from repositories and folders,
toggle windows, zoom in and out, pan the workspace, and find objects.

Connections. Contains buttons to open connection browsers and to assign servers.

Repository. Contains buttons to connect to, disconnect from, and add repositories, open
folders, close tools, save changes to repositories, and print the workspace.

View. Contains buttons to customize toolbars, toggle the status bar and windows, toggle
full-screen view, create a new workbook, and view the properties of objects.

Layout. Contains buttons to arrange and restore objects in the workspace, find objects,
zoom in and out, and pan the workspace.

Tasks. Contains buttons to create tasks.

Workflow. Contains buttons to edit workflow properties.

Run. Contains buttons to schedule the workflow, start the workflow, or start a task.


You can perform the following operations with toolbars:

Display or hide a toolbar.

Create a new toolbar.

Add or remove buttons.

For details on how to perform these toolbar operations, see Using the Designer in the
Designer Guide.

Searching for Items


The Workflow Manager includes search features to help you find tasks, links, variables, and
events in the workspace as well as text in the Output window. You can search for items in any
Workflow Manager tool or Output window.
There are two ways to search for items in the workspace:

Find in Workspace. Searches multiple items at once and returns a list of all task names,
link conditions, event names, or variable names that contain the search string.

Find Next. Searches through items one at a time and highlights the first task, link, event,
variable, or text string that contains the search string. If you repeat the search, the
Workflow Manager highlights the next item that contains the search string.

To find a task, link, event, or variable in the workspace:
1.  In any Workflow Manager tool, click the Find in Workspace toolbar button or choose Edit-Find in Workspace.
    The Find in Workspace dialog box opens.
2.  Choose whether you want to search for tasks, links, variables, or events.
3.  Enter a search string, or select a string from the list.
    The Workflow Manager saves the last 10 search strings in the list.
4.  Specify whether or not to match whole words and whether or not to perform a case-sensitive search.
5.  Click Find Now.
    The Workflow Manager lists task names, link conditions, event names, or variable names that match the search string at the bottom of the dialog box.
6.  Click Close.

To find a single object:
1.  To search for a task, link, event, or variable, open the appropriate Workflow Manager tool and click a task, link, or event. To search for text in the Output window, click the appropriate tab in the Output window.
2.  Enter a search string in the Find field on the standard toolbar.
    The search is not case-sensitive.
3.  Choose Edit-Find Next, click the Find Next button on the toolbar, or press Enter or F3 to search for the string.
    The Workflow Manager highlights the first task name, link condition, event name, or variable name that contains the search string, or the first string in the Output window that matches the search string.
4.  To search for the next item, press Enter or F3 again.
    The Workflow Manager alerts you when you have searched through all items in the workspace or Output window before it highlights the same objects a second time.

Arranging Objects in the Workspace


The Workflow Manager can arrange objects in the workspace horizontally or vertically. In the
Task Developer, you can also arrange tasks evenly in the workspace by choosing Tile. To
arrange objects in the workspace, select Layout-Arrange and choose Horizontal, Vertical, or
Tile.

Zooming the Workspace


You can zoom in and out as well as pan the workspace to adjust the view.
Use the following toolbar or Layout menu options to set zoom levels:

Zoom Center In/Out by 10%. Increases or decreases the magnification by 10% increments while maintaining the center of the view.

Zoom Point In/Out by 10%. Uses a point you select as the center point and increases or decreases the magnification by 10% increments.

Zoom Rectangle. Increases the current magnification of a rectangular area you select. Degree of magnification depends upon the size of the area you select, workspace size, and current magnification.

Zoom Normal. Sets the zoom level to 100%.

Scale to Fit. Scales all workspace objects to fit the workspace.

Zoom Percent. Sets the zoom level to the percent you choose while maintaining the center of the view.

To maximize the size of the workspace window, choose View-Full Screen. To go back to
normal view, click the Close Full Screen button or press Esc.
To pan the workspace, click Layout-Pan or click the Pan button on the toolbar. Drag the
focus of the workspace window and release the mouse button when it is in the appropriate
position. Double-click the workspace to stop panning.


Working with Repository Objects


The Workflow Manager allows you to perform the following general operations with
repository objects:

View properties for each object.

Enter descriptions for each object.

Rename an object.

To edit any repository object, you must first add a repository in the Navigator so you can
access the repository object. To add a repository in the Navigator, choose Repository-Add or
click the Add Repository button on the Repository toolbar. Enter the repository name and
user name and click OK.

Viewing Object Properties


To view properties of a repository object, first select the repository object in the Navigator.
Choose View-Properties to view object properties. Or, right-click the repository object and
choose Properties.
You can view properties of a folder, task, worklet, or workflow. For folders, the Workflow Manager displays the folder name and whether the folder is shared. Object properties are read-only.
You can also view dependencies for repository objects. For more information about viewing object dependencies, see the Repository Guide.

Entering Descriptions for Repository Objects


When you edit an object in the Workflow Manager, you can enter descriptions and comments
for that object. The maximum number of characters you can enter is 2,000 bytes/K, where K
is the maximum number of bytes a character contains in the selected repository code page.
For example, if the repository code page is a Japanese code page where each character can
contain up to two bytes (K=2), each description and comment field allows you to enter up to
1,000 characters.

Renaming Repository Objects


You can rename repository objects by clicking the Rename button in the Edit Tasks dialog box
or the Edit Workflow dialog box. You can also rename repository objects by clicking the
object name in the workspace and typing in the new name.


Checking Out and In Versioned Repository Objects


When you work with versioned objects, you check out an object when you want to change it,
and check it in when you want to commit your changes to the repository. Checking in new
objects adds a new version to the object history.
For more information, see Working with Versioned Objects in the Repository Guide.

Checking Out Objects


When you open an object in the workspace, the repository checks out the object and locks the
object for your use. No other user can check out the object. If another user has checked out
the object, you can open the object as read-only.
You can view objects you and other users have checked out. You might want to view
checkouts to see if an object is available for you to work with, or if you need to check in all of
the objects you have worked with.
For more information on viewing object checkouts, see Working with Versioned Objects in
the Repository Guide.

Checking In Objects
You commit changes to the repository by checking in objects. When you check in an object,
the repository creates a new version of the object and assigns it a version number. The
repository increments the version number by one each time it creates a new version.
You can check in an object from the Workflow Manager workspace. To do this, select the
object and choose Versioning-Check in.
You can check in an object when you review the results of the following tasks:

View object history. You can check in an object from the View History window when you
view the history of an object.

View checkouts. You can check in an object from the View Checkouts window when you
search for checked out objects.

View query results. You can check in an object from the Query Results window when you
search for object dependencies or run an object query.

To check in an object, select the object or objects and choose Versioning-Check in.
Enter text into the comment field in the Check In dialog box.


Figure 3-3 shows the Check In dialog box:


Figure 3-3. Check In Workflow Manager Objects


When you check in an object, the repository creates a new version of the object and
increments the version number by one.


Searching For Versioned Objects


You can use an object query to search for versioned objects in the repository that meet
specified conditions. When you run a query, the repository returns results based on those
conditions. You may want to create an object query to perform the following tasks:

Track repository objects during development. You can add Label, User, Last saved, or
Comments parameters to queries to track objects during development. For more
information about creating object queries, see Grouping Versioned Objects in the
Repository Guide.

Associate a query with a deployment group. When you create a dynamic deployment
group, you can associate an object query with it. For more information about working
with deployment groups, see Copying Folders and Deployment Groups in the Repository
Guide.

To create an object query, choose Versioning-Queries to open the Query Browser.


Figure 3-4 shows the Query Browser:
Figure 3-4. Query Browser


From the Query Browser, you can create, edit, and delete queries. You can also configure
permissions for each query from the Query Browser. You can run any queries for which you
have read permissions from the Query Browser.
For information about working with object queries, see Grouping Versioned Objects in the
Repository Guide.


Copying Repository Objects


You can copy repository objects (such as workflows, worklets, or tasks) within the same folder,
to a different folder, or to a different repository. If you want to copy the object to another
folder, you must open the destination folder before you copy the object into the folder.
The Workflow Manager provides a Copy Wizard that allows you to copy objects. When you
copy a workflow or a worklet, the Copy Wizard copies all of the worklets, sessions, and tasks
in the workflow. You must resolve all conflicts that occur. Conflicts occur when the Copy
Wizard finds a workflow or worklet with the same name in the target folder, or when the
server connection does not exist in the target repository. If a server connection does not exist,
you can skip the conflict and choose a server connection after you copy the workflow. You
cannot copy server connections. Conflicts may also occur when you copy Session tasks.
For more details on the Copy Wizard, see Copying Objects in the Repository Guide.
You can configure display settings and functions of the Copy Wizard by choosing Tools-Options. For details, see Configuring Miscellaneous Options on page 43.
Note: The Workflow Manager provides an Import Wizard that allows you to import objects from an XML file. The Import Wizard provides the same options to resolve conflicts as the Copy Wizard. For details, see Exporting and Importing Objects in the Repository Guide.

Copying Sessions
When you copy a Session task, the Copy Wizard looks for the database connection and
associated mapping in the destination folder. If the mapping or connection does not exist in
the destination folder, you can select a new mapping or connection. If the destination folder
does not contain any mapping, you must first copy a mapping to the destination folder in the
Designer before you can copy the session.
When you copy a session that has mapping variable values saved in the repository, the
Workflow Manager either copies or retains the saved variable values.

Copying Workflow Segments


You can copy segments of workflows and worklets when you want to reuse a portion of
workflow or worklet logic. A segment consists of one or more tasks, the links between the
tasks, and any condition in the links. You can copy reusable and non-reusable objects when
copying and pasting segments. You can copy segments of workflows or worklets into
workflows and worklets within the same folder, within another folder, or within a folder in a
different repository. You can also paste segments of workflows or worklets into an empty
Workflow Designer or Worklet Designer workspace.


To copy a segment from a workflow or worklet:
1. Open the workflow or worklet.
2. Select a segment by highlighting each task you want to copy. You can select multiple reusable or non-reusable objects. You can also select segments by dragging the pointer in a rectangle around objects in the workspace.
3. Choose Edit-Copy or press Ctrl+C to copy the segment to the clipboard.
4. Open the workflow or worklet into which you want to paste the segment. You can also copy the object into the Workflow or Worklet Designer workspace.
5. Choose Edit-Paste or press Ctrl+V.
The Copy Wizard opens, and notifies you if it finds copy conflicts.
Note: You can copy individual non-reusable tasks by selecting the individual task and following the instructions for copying and pasting segments.


Comparing Repository Objects


The Workflow Manager allows you to compare two repository objects of the same type to
identify differences between the objects. For example, if you have two similar Email tasks in a
folder, you can compare them to see which one contains the attributes you need. When you
compare two objects, the Workflow Manager displays their attributes in detail.
You can compare objects across folders and repositories. To do this, you must have both
folders open. You can compare a reusable object with a non-reusable object. You can also
compare two versions of the same object. For more information about versioned objects, see
Working with Versioned Objects in the Repository Guide.
To compare objects, you must have read permission on each folder that contains the objects
you want to compare.
You can compare the following types of objects:

Tasks

Sessions

Worklets

Workflows

You can also compare instances of the same type. For example, if the workflows you compare
contain worklet instances with the same name, you can compare the instances to see if they
differ. The Workflow Manager also allows you to compare the following instances and
attributes:

Instances of sessions and tasks in a workflow or worklet comparison. For example, when
you compare workflows, you can compare task instances that have the same name.

Instances of mappings and transformations in a session comparison. For example, when you compare sessions, you can compare mapping instances.

The attributes of instances of the same type within a mapping comparison. For example,
when you compare flat file sources, you can compare attributes, such as file type (delimited
or fixed), delimiters, escape characters, and optional quotes.

You can compare schedulers and session configuration objects in the Repository Manager. You
cannot compare objects of different types. For example, you cannot compare an Email task
with a Session task.
When you compare objects, the Workflow Manager displays the results in the Diff Tool
window. The Diff Tool output contains different nodes for different types of objects.
When you import Workflow Manager objects, you can compare object conflicts. For more
information, see Exporting and Importing Objects in the Repository Guide.


Steps for Comparing Objects


Use the following procedure to compare objects.
To compare two objects:
1. Open the folders that contain the objects you want to compare.
2. Open the appropriate Workflow Manager tool.
3. Choose Tasks-Compare, Worklets-Compare, or Workflow-Compare.
A dialog box similar to the following one opens:
4. Click Browse to select an object.
5. Click Compare.
Tip: You can also compare objects from the Navigator or workspace. In the Navigator, select the objects, right-click and choose Compare Objects. In the workspace, select the objects, right-click and choose Compare Objects.


Figure 3-5 shows the result of comparing two objects:

Figure 3-5. Diff Tool Window

In the Diff Tool window, you can filter nodes that have the same attribute values, drill down to further compare objects, and display the properties of the node you select. Differences between the objects are highlighted and the nodes are flagged, and differences between object properties are marked.

You can further compare differences between object properties by clicking the Compare
Further icon or by right-clicking the differences.

6. If you want to save the comparison as a text or HTML file, choose File-Save to File.

Working with Metadata Extensions


You can extend the metadata stored in the repository by associating information with
individual repository objects. For example, you may wish to store your name with the
worklets you create. If you create a session, you can store your telephone extension with that
session. You associate information with repository objects using metadata extensions.
Repository objects can contain both vendor-defined and user-defined metadata extensions.
You can view and change the values of vendor-defined metadata extensions, but you cannot
create, delete, or redefine them. You can create, edit, delete, and view user-defined metadata
extensions, as well as change their values.
You can create metadata extensions for the following objects in the Workflow Manager:

Sessions

Workflows

Worklets

You can create both reusable and non-reusable metadata extensions. You associate reusable
metadata extensions with all repository objects of a certain type such as all sessions or all
worklets. You associate non-reusable metadata extensions with a single repository object such
as one workflow. For more information about metadata extensions, see Metadata Extensions
in the Repository Guide.
To create, edit, and delete user-defined metadata extensions in the Workflow Manager, you
must have read and write permissions on the folder.

Creating a Metadata Extension


You can create user-defined, reusable and non-reusable metadata extensions for repository
objects using the Workflow Manager. To create a metadata extension, you edit the object for
which you want to create the metadata extension, and then add the metadata extension to the
Metadata Extensions tab.
If you need to create multiple reusable metadata extensions, it is easier to create them using
the Repository Manager. For details, see Metadata Extensions in the Repository Guide.
To create a metadata extension:
1. Open the appropriate Workflow Manager tool.
2. Drag the appropriate object into the workspace.
3. Double-click the title bar of the object to edit it.
4. Click the Metadata Extensions tab:

This tab lists the existing user-defined and vendor-defined metadata extensions. User-defined metadata extensions appear in the User Defined Metadata Domain. If they exist, vendor-defined metadata extensions appear in their own domains.
5. Click the Add button.
A new row appears in the User Defined Metadata Extension Domain.
6. Enter the information in Table 3-1:


Table 3-1. Metadata Extension Attributes in the Workflow Manager

Extension Name (Required): Name of the metadata extension. Metadata extension names must be unique for each type of object in a domain. Metadata extension names cannot contain any special characters except underscores and cannot begin with numbers.
Datatype (Required): The datatype: numeric (integer), string, or boolean.
Precision (Required for string objects): The maximum length for string metadata extensions.
Value (Optional): An optional value. For a numeric metadata extension, the value must be an integer between -2,147,483,647 and 2,147,483,647. For a boolean metadata extension, choose true or false. For a string metadata extension, click the Open button in the Value field to enter a value of more than one line, up to 2,147,483,647 bytes.
Reusable (Required): Makes the metadata extension reusable or non-reusable. Check to apply the metadata extension to all objects of this type (reusable). Clear to make the metadata extension apply to this object only (non-reusable). Note: If you make a metadata extension reusable, you cannot change it back to non-reusable. The Workflow Manager makes the extension reusable as soon as you confirm the action.
UnOverride (Optional): Restores the default value of the metadata extension when you click Revert. This column appears only if the value of one of the metadata extensions was changed.
Description (Optional): Description of the metadata extension.

7. Click OK.

Editing a Metadata Extension


You can edit user-defined, reusable, and non-reusable metadata extensions for repository
objects using the Workflow Manager. To edit a metadata extension, you edit the repository
object, and then make changes to the Metadata Extensions tab.
What you can edit depends on whether the metadata extension is reusable or non-reusable.
You can promote a non-reusable metadata extension to reusable, but you cannot change a
reusable metadata extension to non-reusable.

Editing Reusable Metadata Extensions


If the metadata extension you want to edit is reusable and editable, you can change the value
of the metadata extension, but not any of its properties. However, if the vendor or user who
created the metadata extension did not make it editable, you cannot edit the metadata
extension or its value. For details, see Metadata Extensions in the Repository Guide.
To edit the value of a reusable metadata extension, click the Metadata Extensions tab and
modify the Value field. To restore the default value for a metadata extension, click Revert in
the UnOverride column.


Editing Non-Reusable Metadata Extensions


If the metadata extension you want to edit is non-reusable, you can change the value of the
metadata extension as well as its properties. You can also promote the metadata extension to a
reusable metadata extension.
To edit a non-reusable metadata extension, click the Metadata Extensions tab. You can update
the Datatype, Value, Precision, and Description fields. For a description of these fields, see
Table 3-1 on page 83.
If you wish to make the metadata extension reusable, check Reusable. If you make a metadata
extension reusable, you cannot change it back to non-reusable. The Workflow Manager makes
the extension reusable as soon as you confirm the action.
To restore the default value for a metadata extension, click Revert in the UnOverride column.

Deleting a Metadata Extension


You can delete metadata extensions for repository objects. You delete reusable metadata
extensions using the Repository Manager. You can delete non-reusable metadata extensions
using the Workflow Manager. To do this, edit the repository object, and then delete the
metadata extension from the Metadata Extensions tab.


Keyboard Shortcuts
When editing a repository object or maneuvering around the Workflow Manager, use the following keyboard shortcuts to help you complete different operations quickly.
Table 3-2 lists the Workflow Manager keyboard shortcuts for editing a repository object:
Table 3-2. Workflow Manager Keyboard Shortcuts

Cancel editing in a cell: Esc
Check and uncheck a check box: Space Bar
Copy text from a cell onto the clipboard: Ctrl+C
Cut text from a cell onto the clipboard: Ctrl+X
Edit the text of a cell: F2, then move the cursor to the desired location.
Find all combination and list boxes: Type the first letter on the list.
Find tables or fields in the workspace: Ctrl+F
Move around cells in a dialog box: Ctrl+directional arrows
Paste copied or cut text from the clipboard into a cell: Ctrl+V
Select the text of a cell: F2

Table 3-3 lists the Workflow Manager keyboard shortcuts for navigating in the workspace:
Table 3-3. Keyboard Shortcuts for Navigating the Workspace

Create links: Ctrl+F2. Press Ctrl+F2 to select the first task you want to link, press Tab to select the rest of the tasks you want to link, then press Ctrl+F2 again to link all the tasks you selected.
Edit task name in the workspace: F2
Expand selected node and all its children: Shift + * (use the asterisk on the numeric keypad)
Move across selected tasks in the workspace: Tab
Select multiple tasks: Ctrl+mouse click


Chapter 4

Working with Workflows


This chapter covers the following topics:

Overview, 88

Developing Workflows, 91

Using the Workflow Wizard, 99

Using Workflow Variables, 103

Scheduling a Workflow, 112

Validating a Workflow, 119

Running the Workflow, 122

Suspending the Workflow, 127

Stopping or Aborting the Workflow, 129


Overview
A workflow is a set of instructions that tells the PowerCenter Server how to execute tasks such
as sessions, email notifications, and shell commands. After you create tasks in the Task
Developer and Workflow Designer, you connect the tasks with links to create a workflow.
In the Workflow Designer, you can specify conditional links and use workflow variables to
create branches in the workflow. The Workflow Manager also provides Event-Wait and Event-Raise tasks so you can control the sequence of task execution in the workflow. You can also
create worklets and nest them inside the workflow.
Every workflow contains a Start task, which represents the beginning of the workflow.
Figure 4-1 shows a sample workflow in which a Start task, a Session task, an Assignment task, and a Command task are connected by links:

Figure 4-1. Sample Workflow

You can create workflows with branches to execute tasks concurrently.


Figure 4-2 shows a sample workflow with two branches:


Figure 4-2. Sample Workflow With Two Branches

After you create a workflow, select a PowerCenter Server to run the workflow. You can then
start the workflow using the Workflow Manager, Workflow Monitor, or pmcmd.
Use the Workflow Monitor to see the progress of a workflow during its run. The Workflow
Monitor can also show the history of a workflow. For more information about the Workflow
Monitor, see Monitoring Workflows on page 401.
Use the following guidelines when you develop a workflow:
1. Create a new workflow. Create a new workflow in the Workflow Designer. For details on creating a new workflow, see Creating a New Workflow on page 91.
2. Add tasks in the workflow. You might have already created tasks in the Task Developer. Or, you can add tasks to the workflow as you develop the workflow in the Workflow Designer. For details on workflow tasks, see Working with Tasks on page 131.
3. Connect tasks with links. After you add tasks in the workflow, connect them with links to specify the order of execution in the workflow. For details on links, see Working with Links on page 92.
4. Specify conditions for each link. You can specify conditions on the links to create branches and dependencies. For details, see Working with Links on page 92.
5. Validate workflow. Validate the workflow in the Workflow Designer to identify errors. For details on validation rules, see Validating a Workflow on page 119.
6. Save workflow. When you save the workflow, the Workflow Manager validates the workflow and updates the repository.
7. Run workflow. In the workflow properties, select a PowerCenter Server to run the workflow. Run the workflow from the Workflow Manager, Workflow Monitor, or pmcmd. You can monitor the workflow in the Workflow Monitor. For details on starting a workflow, see Running the Workflow on page 122.

For a complete list of workflow properties, see Workflow Properties Reference on page 721.

Workflow Privileges
You need one of the following privileges to create a workflow:
Use Workflow Manager privilege with read and write folder permissions
Super User privilege
You need one of the following privileges to run, schedule, and monitor the workflow:
Workflow Operator privilege
Super User privilege

Developing Workflows
The first step to develop a workflow is to create a new workflow in the Workflow Designer. A
workflow must contain a Start task. The Start task represents the beginning of a workflow.
When you create a workflow, the Workflow Designer creates a Start task and adds it to the
workflow. You cannot delete the Start task.
After you create a new workflow, the next step is to add tasks to the workflow. The Workflow
Manager includes tasks such as the Session task, the Command task, and the Email task so
you can design your workflow.
Finally, you connect workflow tasks with links to specify the order of execution in the
workflow. You can add conditions to links.

Creating a New Workflow


You must create a workflow before you can add tasks such as a Session, Command, or Email.
When adding a session, if the workspace in the Workflow Designer is empty, you can create a
workflow automatically.
To create a workflow manually:
1. Open the Workflow Designer.
2. Choose Workflows-Create.
3. Enter a name for the new workflow.
4. Click OK.
The Workflow Designer creates a Start task in the new workflow.
For information on using the Workflow Wizard, see Using the Workflow Wizard on page 99.


To create a workflow automatically:
1. Open the Workflow Designer. Close any open workflow.
2. Click the session button on the Tasks toolbar.
3. Click in the Workflow Designer workspace.
The Mappings dialog box displays.
4. Select a mapping to associate with the session and click OK.
The Create Workflow dialog box appears. The Workflow Designer names the workflow wf_MappingName by default. You can rename the workflow or change other workflow properties. For more information on workflow properties, see Workflow Properties Reference on page 721.
5. Click OK.
The Workflow Designer creates a workflow for the session.

Adding Tasks to Workflows


After you create a new workflow, you add tasks you want to execute in the workflow. You may
already have created tasks in the Task Developer. Or, you may want to create tasks in the
Workflow Designer as you develop the workflow.
If you have already created tasks in the Task Developer, add them to the workflow by dragging
the tasks from the Navigator window to the Workflow Designer workspace.
To create and add tasks as you develop the workflow, choose Tasks-Create in the Workflow
Designer. Or, you can also use the Tasks toolbar to create and add tasks to the workflow. Click
the button on the Tasks toolbar for the task you want to create. Click again in the Workflow
Designer workspace to create and add the task.
Tasks you create in the Workflow Designer are non-reusable. Tasks you create in the Task
Developer are reusable. For more information about reusable tasks, see Reusable Workflow
Tasks on page 135.

Working with Links


Use links to connect each workflow task. You can specify conditions with links to create
branches in the workflow. The Workflow Manager does not allow you to use links to create
loops in the workflow. Each link in the workflow can execute only once.
The workflow in Figure 4-3 is not a loop because each task runs at most once.


Figure 4-3 shows a valid workflow:


Figure 4-3. Valid Workflow

The Workflow Manager does not allow you to create a workflow that contains a loop, such as
the loop shown in Figure 4-4. Figure 4-4 shows a loop where the three sessions may be run
multiple times:
Figure 4-4. Example of a Loop

Use the following procedure to link tasks in the Workflow Designer or the Worklet Designer.
To link two tasks:
1. In the Tasks toolbar, click the link button.
2. In the workspace, click the first task you want to connect and drag it to the second task.
3. A link appears between the two tasks.

If you have a number of tasks that you want to link concurrently, you may not wish to connect each link manually. To quickly link tasks concurrently, use the following procedure.
To link several tasks concurrently:
1. In the workspace, click the first task you want to connect.
2. Ctrl-click all other tasks you want to connect.
Note: Do not use Ctrl+A or Edit-Select to choose tasks.
3. Choose Tasks-Link Concurrent.
4. A link appears between the first task you selected and each task you added. The first task you selected links to each task concurrently.

If you have a number of tasks that you want to link sequentially, you may not wish to connect each link manually. To quickly link tasks sequentially, use the following procedure.
To link several tasks sequentially:
1. In the workspace, click the first task you want to connect.
2. Ctrl-click the next task you want to connect. Continue to add tasks in the order you want them to run.
3. Choose Tasks-Link Sequential.
4. Links appear in sequential order between the first task and each subsequent task you added.

Specifying Link Conditions


Once you create links between tasks, you can specify conditions for each link to determine the
order of execution in the workflow. If you do not specify conditions for each link, the
PowerCenter Server executes the next task in the workflow by default.
You can use pre-defined or user-defined workflow variables in the link condition. If the link
condition evaluates to True, the PowerCenter Server executes the next task in the workflow. If
the link condition evaluates to False, the PowerCenter Server does not execute the next task in
the workflow.
You can view results of link evaluation during workflow runs in the workflow log file.

Example of Link Conditions


You can use link conditions to specify the order of execution in the workflow or to create
branches in the workflow. For example, you may have two Session tasks in the workflow,
s_STORES_CA and s_STORES_AZ. You want the PowerCenter Server to run the second
Session task only if the first Session task has no target failed rows.
To accomplish this, you can set the link condition between the two sessions so that the
s_STORES_AZ executes only if the number of failed target rows for S_STORES_CA is zero.
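One way to write this condition, shown here as a sketch that uses the pre-defined TgtFailedRows variable and the session names from this example, is:

   $s_STORES_CA.TgtFailedRows = 0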


Figure 4-5 shows how to set the link condition using the target failed rows variable for
S_STORES_CA:
Figure 4-5. Setting Link Condition

After you specify the link condition in the Expression Editor, the Workflow Manager validates
the link condition and displays it next to the link in the workflow.
Figure 4-6 shows the link condition displayed in the workspace:
Figure 4-6. Displaying Link Condition in the Workflow


To specify a condition for a link:
1. In the Workflow Designer workspace, double-click the link you want to specify. Or, right-click the link and choose Edit. The Expression Editor displays.
2. In the Expression Editor, enter the link condition.
The Expression Editor provides pre-defined workflow variables, user-defined workflow variables, variable functions, and boolean and arithmetic operators.
3. Validate the expression using the Validate button. The Workflow Manager displays error messages in the Output window.
Tip: Click and drag the end point of a link to move it from one task to another without losing the link condition.

Using the Expression Editor


The Workflow Manager provides an Expression Editor for any expressions in the workflow.
You can enter expressions using the Expression Editor for the following:

Link conditions

Decision task

Assignment task

Figure 4-7 shows the Expression Editor:


Figure 4-7. Expression Editor

The Expression Editor displays system variables, user-defined, and pre-defined workflow
variables such as $Session.status. For details on workflow variables, see Using Workflow
Variables on page 103.
The Expression Editor also displays a list of functions. PowerCenter uses a SQL-like language
that contains many functions designed to handle common expressions. For example, you can
use the ABS function to find the absolute value. For a complete list of functions, see the
Transformation Language Reference.
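For example, the following illustrative link condition (the session name is hypothetical) uses ABS with two pre-defined session variables to test whether any rows read from the source were not written to the target:

   ABS($s_LoadOrders.SrcSuccessRows - $s_LoadOrders.TgtSuccessRows) = 0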


Adding Comments
The Expression Editor also allows you to add comments using -- or // comment indicators.
You can use comments to give descriptive information about the expression, or you can
specify a valid URL to access business documentation about the expression.
For examples on adding comments to expressions, see The Transformation Language in the
Transformation Language Reference.
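For example, you might annotate a link condition with a trailing comment like the following sketch (the session name is illustrative):

   $s_LoadOrders.Status = SUCCEEDED -- run the next session only after a successful load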

Validating Expressions
You can use the Validate button to validate an expression. If you do not validate an
expression, the Workflow Manager validates it when you close the Expression Editor. You
cannot run a workflow with invalid expressions.
Expressions in link conditions and Decision task conditions must evaluate to a numerical
value. Workflow variables used in expressions must exist in the workflow.

Expression Editor Display


The Expression Editor can display syntax expressions in different colors for better readability.
If you have the latest Rich Edit control, riched20.dll, installed on your system, the Expression
Editor displays expression functions in blue, comments in grey, and quoted strings in green.
You can resize the Expression Editor. Expand the dialog box by dragging from the borders.
The Workflow Manager saves the new size for the dialog box as a client setting.

Deleting a Workflow
You may decide to delete a workflow that you no longer use. When you delete a workflow,
you delete all non-reusable tasks and reusable task instances associated with the workflow.
Reusable tasks used in the workflow remain in the folder when you delete the workflow.
If you delete a workflow that is running, the PowerCenter Server aborts the workflow. If you
delete a workflow that is scheduled to run, the PowerCenter Server removes the workflow
from the schedule.
You can delete a workflow in the Navigator window, or you can delete the workflow currently
displayed in the Workflow Designer workspace.

To delete a workflow from the Navigator window, open the folder, select the workflow and
press the Delete key.

To delete a workflow currently displayed in the Workflow Designer workspace, choose Workflows-Delete.


Editing a Workflow
When you edit a workflow, the repository updates the workflow information when you save
the workflow. If a workflow is running when you make edits, the PowerCenter Server uses the
updated information the next time you run the workflow.

Viewing Links in Workflow or Worklet


When you edit a workflow or worklet, you can view the forward or backward link paths to
other tasks. You can highlight paths to see links in the workflow branch from the Start task to
the last task in the branch.
Note: You can configure the color the Workflow Manager uses to display links. When you configure the format options, choose the Link Selection option.


To view link paths:
1. In the Worklet Designer or Workflow Designer, right-click a task and choose Highlight Path.
2. Choose Forward Path, Backward Path, or Both.
The Workflow Manager highlights all links in the branch you select.

Deleting Links in a Workflow or Worklet


When you edit a workflow or worklet, you can delete multiple links at once without deleting
the connected tasks.
To delete multiple links:
1. In the Worklet Designer or Workflow Designer, select all links you want to delete.
Tip: You can use the mouse to click and drag the selection, or you can Ctrl-click the tasks and links.
2. Choose Edit-Delete Links.
The Workflow Manager removes all selected links.


Using the Workflow Wizard


You can use the Workflow Wizard to automate the process of creating sessions, adding
sessions to a workflow, and linking sessions to create a workflow. The Workflow Wizard
creates sessions from mappings and adds them to the workflow. It also creates a Start task and
allows you to schedule the workflow. You can add tasks and edit other workflow properties
after the Workflow Wizard completes. If you want to create concurrent sessions, use the
Workflow Designer to manually build a workflow.
Before you create a workflow, verify that the folder contains a valid mapping for the Session
task.
Complete the following steps to build a workflow using the Workflow Wizard:
1. Assign a name and PowerCenter Server to the workflow.
2. Create a session.
3. Schedule the workflow.

Step 1. Assign a Name and PowerCenter Server to the Workflow


In the first step of the Workflow Wizard, you add the name and description of the workflow
and choose the PowerCenter Server to run the workflow.
To create the workflow:
1. In the Workflow Manager, open the folder containing the mapping you want to use in the workflow.
2. Open the Workflow Designer.
3. Choose Workflows-Wizard.
The Workflow Wizard appears.
4. Enter a name for the workflow.
The convention for naming workflows is wf_WorkflowName. For a complete list of naming conventions for repository objects, see Naming Conventions in Getting Started.
5. Enter a description for the workflow.
6. Choose the PowerCenter Server to run the workflow, and click Next.
The next step is to create a session.

Step 2. Create a Session


In the second step of the Workflow Wizard, you create a session based on a mapping. You can
add tasks later in the Workflow Designer workspace. For details on working with tasks, see
Working with Tasks on page 131.
To create a session:
1. In the second step of the Workflow Wizard, select a valid mapping and click the right arrow button.
The Workflow Wizard creates a Session task in the right pane using the selected mapping and names it s_MappingName by default.
The following figure shows a mapping selected for a session:
2. You can select additional mappings to create more Session tasks in the workflow.
When you add multiple mappings to the list, the Workflow Wizard creates sequential sessions in the order you add them.
3. Use the arrow buttons to change the session order.
4. Specify whether the session should be reusable.
When you create a reusable session, you can use the session in other workflows. For details on reusable sessions, see Working with Tasks on page 131.
5. Specify how you want the PowerCenter Server to run the workflow.
You can specify that the PowerCenter Server runs sessions only if previous sessions complete, or you can specify that the PowerCenter Server always runs each session. When you select this option, it applies to all sessions you create using the Workflow Wizard.

Step 3. Schedule a Workflow


In the third step of the Workflow Wizard, you can schedule a workflow to run continuously,
repeat at a given time or interval, or start manually. The PowerCenter Server runs a workflow
unless the prior workflow run fails. When a workflow fails, the PowerCenter Server removes
the workflow from the schedule, and you must reschedule it. You can do this in the Workflow
Manager or using pmcmd.


To schedule a workflow:
1. In the third step of the Workflow Wizard, configure the scheduling and run options. For more information about scheduling a workflow, see Scheduling a Workflow on page 112.
2. Click Next.
The Workflow Wizard displays the settings for the workflow:
3. Verify the workflow settings and click Finish. To edit settings, click Back.
The completed workflow opens in the Workflow Designer workspace. From the workspace, you can add tasks, create concurrent sessions, add conditions to links, or modify properties.
4. When you finish modifying the workflow, choose Repository-Save.


Using Workflow Variables


You can create and use variables in a workflow to reference values and record information. For
example, you can use a variable in a Decision task to determine whether the previous task ran
properly. If it did, you can run the next task. If not, you can stop the workflow.
You can use the following types of workflow variables:

Pre-defined workflow variables. The Workflow Manager provides pre-defined workflow


variables for tasks within a workflow. For more information, see Pre-Defined Workflow
Variables on page 105.

User-defined workflow variables. You create user-defined workflow variables when you
create a workflow. For more information, see User-Defined Workflow Variables on
page 108.

You can use workflow variables when you configure the following types of tasks:

Assignment tasks. You can use an Assignment task to assign a value to a user-defined
workflow variable. For example, you can increment a user-defined counter variable by
setting the variable to its current value plus 1. For information on using workflow variables
in Assignment tasks, see Working with the Assignment Task on page 140.

Decision tasks. Decision tasks determine how the PowerCenter Server executes a
workflow. For example, you can use the Status variable to run a second session only if the
first session completes successfully. For information on using workflow variables in
Decision tasks, see Working with the Decision Task on page 149.

Links. Links connect each workflow task. You can use workflow variables in links to create
branches in the workflow. For example, after a Decision task, you can create one link to
follow when the decision condition evaluates to true, and another link to follow when the
decision condition evaluates to false. For information on using workflow variables in Link
tasks, see Working with Links on page 92.

Timer tasks. Timer tasks specify when the PowerCenter Server begins to execute the next
task in the workflow. You can use a user-defined date/time variable to specify the exact
time the PowerCenter Server starts to execute the next task. For information on using
workflow variables in Timer tasks, see Working with the Timer Task on page 161.

You can use the Expression Editor to create an expression that uses variables.


Figure 4-8 shows the Expression Editor:


Figure 4-8. Expression Editor

When you build an expression, you can select pre-defined variables on the Pre-Defined tab.
You can select user-defined variables on the User-Defined tab. The Functions tab contains
functions that you can use with workflow variables.
Use the point-and-click method to enter an expression using a variable. For information on
using the Expression Editor, see Using the Expression Editor on page 96.
You can use the following keywords to write expressions for user-defined and pre-defined
workflow variables:


AND

OR

NOT

TRUE

FALSE

NULL

SYSDATE
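For example, the following link condition sketch (the task name is illustrative) combines task-specific variables with the AND keyword:

   $s_LoadOrders.Status = SUCCEEDED AND $s_LoadOrders.TgtFailedRows = 0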


Pre-Defined Workflow Variables


Each workflow contains a set of pre-defined variables that you can use to evaluate workflow
and task conditions. You can use the following types of pre-defined variables:

Task-specific variables. The Workflow Manager provides a set of task-specific variables for
each task in the workflow. You can use task-specific variables in a link condition to control
the path the PowerCenter Server takes when running the workflow. The Workflow
Manager lists task-specific variables under the task name in the Expression Editor.

System variables. You can use the SYSDATE and WORKFLOWSTARTTIME system
variables within a workflow. For more information on system variables, see Variables in
the Transformation Language Reference. The Workflow Manager lists system variables under
the Built-in node in the Expression Editor.

Table 4-1 lists the task-specific workflow variables available in the Workflow Manager:
Table 4-1. Task-Specific Workflow Variables

Variable | Task Types | Datatype | Description
Condition | Decision | Integer | Evaluation result of the decision condition expression. If the task fails, the Workflow Manager keeps the condition set to null.
EndTime | All tasks | Date/time | Date and time the associated task ended.
ErrorCode | All tasks | Integer | Last error code for the associated task. If there is no error, the PowerCenter Server sets ErrorCode to 0 when the task completes.
ErrorMsg | All tasks | Nstring* | Last error message for the associated task. If there is no error, the PowerCenter Server sets ErrorMsg to an empty string when the task completes.
FirstErrorCode | Session | Integer | Error code for the first error message in the session. If there is no error, the PowerCenter Server sets FirstErrorCode to 0 when the session completes.
FirstErrorMsg | Session | Nstring* | The first error message in the session. If there is no error, the PowerCenter Server sets FirstErrorMsg to an empty string when the task completes.
PrevTaskStatus | All tasks | Integer | Status of the previous task in the workflow that the PowerCenter Server ran. Statuses include ABORTED, FAILED, STOPPED, and SUCCEEDED. Use these keywords when writing expressions to evaluate the status of the previous task. For more information, see Evaluating Task Status in a Workflow on page 107.
SrcFailedRows | Session | Integer | Total number of rows the PowerCenter Server failed to read from the source.
SrcSuccessRows | Session | Integer | Total number of rows successfully read from the sources.
StartTime | All tasks | Date/time | Date and time the associated task started.
Status | All tasks | Integer | Status of the previous task in the workflow. Task statuses include ABORTED, DISABLED, FAILED, NOTSTARTED, STARTED, STOPPED, and SUCCEEDED. Use these keywords when writing expressions to evaluate the status of the current task. For more information, see Evaluating Task Status in a Workflow on page 107.
TgtFailedRows | Session | Integer | Total number of rows the PowerCenter Server failed to write to the target.
TgtSuccessRows | Session | Integer | Total number of rows successfully written to the targets.
TotalTransErrors | Session | Integer | Total number of transformation errors.

* Variables of type Nstring can have a maximum length of 600 characters.

All pre-defined workflow variables except Status have a default value of null. The
PowerCenter Server uses the default value of null when it encounters a pre-defined variable
from a task that has not yet run in the workflow. Therefore, expressions and link conditions
that depend upon tasks not yet run are valid. The default value of Status is NOTSTARTED.

Using Pre-Defined Workflow Variables in Expressions


When you use a workflow variable in an expression, the PowerCenter Server evaluates the
expression and returns True or False. If the condition evaluates to true, the PowerCenter
Server runs the next task. The PowerCenter Server writes an entry in the workflow log similar
to the following message:
INFO : LM_36506 : (1980|1040) Link [Session2 --> Session3]: condition is
TRUE for the expression [$Session2.PrevTaskStatus = SUCCEEDED].

The Expression Editor displays the pre-defined workflow variables on the Pre-defined tab.
The Workflow Manager groups task-specific variables by task and lists system variables under
the Built-in node. To use a variable in an expression, double-click the variable. The
Expression Editor displays task-specific variables in the Expression field in the following
format:
$<TaskName>.<Pre-definedVariable>


Figure 4-9 shows the Expression Editor with an expression using a task-specific workflow
variable and keyword:
Figure 4-9. Expression Using a Pre-Defined Workflow Variable

Evaluating Task Status in a Workflow


You can use Status and PrevTaskStatus in link conditions to test the status of tasks in a
workflow. Use Status to test the status of the previous task in the workflow. Use
PrevTaskStatus to test the status of the previous task in the workflow that the PowerCenter
Server ran.
Use PrevTaskStatus if you disable a task in the workflow. Status and PrevTaskStatus return the
same value unless the condition uses a disabled task.
Figure 4-10 shows a workflow with link conditions using Status:

Figure 4-10. Status Variable Example

In this example, the link after Session2 uses the condition $Session2.Status = SUCCEEDED. The PowerCenter Server returns the value based on the previous task in the workflow, Session2.

When you run the workflow, the PowerCenter Server evaluates the link condition and returns
the value based on the status of Session2.


Figure 4-11 shows a workflow with link conditions using PrevTaskStatus:

Figure 4-11. PrevTaskStatus Variable Example

In this example, Session2 is disabled and the link after it uses the condition $Session2.PrevTaskStatus = SUCCEEDED. The PowerCenter Server returns the value based on the previous task run, Session1.

When you run the workflow, the PowerCenter Server skips Session2 because the session is
disabled. When the PowerCenter Server evaluates the link condition, it returns the value
based on the status of Session1.
Tip: If you do not disable Session2, the PowerCenter Server returns the value based on the

status of Session2. You do not need to change the link condition when you enable and disable
Session2.

User-Defined Workflow Variables


You can create your own variables within a workflow. When you create a variable in a
workflow, it is valid only in that workflow. You can use the variable in tasks within that
workflow. You can edit and delete user-defined workflow variables.
You can use user-defined variables when you need to make a workflow decision based on
criteria you specify. For example, suppose you create a workflow to load data to an orders
database nightly. You also need to load a subset of this data to headquarters periodically,
perhaps every tenth time you update the local orders database. You create separate sessions to
update the local database and the one at headquarters. The workflow looks like Figure 4-12:
Figure 4-12. Sample Workflow Using Workflow Variable


You can use a user-defined variable to determine when to run the session that updates the
orders database at headquarters.
To do this, set up the workflow as follows:
1. Create a persistent workflow variable, $$WorkflowCount, to represent the number of times the workflow has run.
2. Add a Start task and both sessions to the workflow.
3. Place a Decision task after the session that updates the local orders database. Set up the decision condition to check to see if the number of workflow runs is evenly divisible by 10. You can use the modulus (MOD) function to do this, as shown in the example below.
4. Create an Assignment task to increment the $$WorkflowCount variable by one.
5. Link the Decision task to the session that updates the database at headquarters when the decision condition evaluates to true. Link it to the Assignment task when the decision condition evaluates to false.
When you do this, the session that updates the local database runs every time the workflow runs. The session that updates the database at headquarters runs every 10th time the workflow runs.
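The following sketch shows the two expressions this design relies on, assuming the variable is named $$WorkflowCount as described above. The first is the Decision task condition; the second is the expression the Assignment task assigns back to $$WorkflowCount:

   MOD($$WorkflowCount, 10) = 0
   $$WorkflowCount + 1

See the Transformation Language Reference for the exact syntax of the MOD function.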

Start and Current Values


Conceptually, the PowerCenter Server holds two different values for a workflow variable
during a workflow run:

Start value of a workflow variable

Current value of a workflow variable

The start value is the value of the variable at the start of the workflow. The start value could
be a value defined in the parameter file for the variable, a value saved in the repository from
the previous run of the workflow, a user-defined initial value for the variable, or the default
value based on the variable datatype.
The PowerCenter Server looks for the start value of a variable in the following order:
1. Value in parameter file
2. Value saved in the repository (if the variable is persistent)
3. User-specified default value
4. Datatype default value
For a list of datatype default values, see Table 4-2 on page 110.
For example, you create a workflow variable in a workflow and enter a default value, but you
do not define a value for the variable in a parameter file. The first time the PowerCenter
Server runs the workflow, it evaluates the start value of the variable to the user-defined default
value.


If you declare the variable as persistent, the PowerCenter Server saves the value of the variable
to the repository at the end of the workflow run. The next time the workflow runs, the
PowerCenter Server evaluates the start value of the variable as the value saved in the
repository.
If the variable is non-persistent, the PowerCenter Server does not save the value of the variable.
The next time the workflow runs, the PowerCenter Server evaluates the start value of the
variable as the user-specified default value.
If you want to override the value saved in the repository before running a workflow, you need
to define a value for the variable in a parameter file. When you define a workflow variable in
the parameter file, the PowerCenter Server uses this value instead of the value saved in the
repository or the configured initial value for the variable.
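For example, a parameter file entry that overrides the saved value of the counter variable from the previous example might look like the following sketch. The folder and workflow names are illustrative, and the heading format shown is an assumption; see the parameter file documentation for the exact syntax:

   [ProductionFolder.WF:wf_update_orders]
   $$WorkflowCount=0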
The current value is the value of the variable as the workflow progresses. When a workflow
starts, the current value of a variable is the same as the start value. The value of the variable
can change as the workflow progresses if you create an Assignment task that updates the value
of the variable.
If the variable is persistent, the PowerCenter Server saves the current value of the variable to
the repository at the end of a successful workflow run. If the workflow fails to complete, the
PowerCenter Server does not update the value of the variable in the repository.
The PowerCenter Server states the value saved to the repository for each workflow variable in
the workflow log.

Datatype Default Values


If the PowerCenter Server cannot determine the start value of a variable by any other means,
it uses a default value for the variable based on its datatype. For more information on how the
PowerCenter Server determines start values for a variable, see Start and Current Values on
page 109.
Table 4-2 lists the datatype default values for user-defined workflow variables:

Table 4-2. Datatype Default Values for User-Defined Workflow Variables

Datatype | Workflow Manager Default Value
Date/time | 1/1/1753 A.D.
Double | 0
Integer | 0
Nstring | Empty string

Creating User-Defined Workflow Variables


You can create workflow variables for a workflow in the workflow properties.


To create a workflow variable:
1. In the Workflow Designer, create a new workflow or edit an existing one.
2. Select the Variables tab.
3. Click Add and enter a name for the variable.
The correct format for a user-defined workflow variable is $$VariableName. Do not use a single $ for a user-defined workflow variable. The single $ is reserved for system variables and pre-defined workflow variables. Workflow variable names are not case-sensitive.
4. In the Datatype field, select the datatype for the new variable. You can select from the following datatypes: Date/time, Double, Integer, and Nstring. Variables of type Nstring can have a maximum length of 600 characters.
5. Enable the Persistent option if you want the value of the variable retained from one execution of the workflow to the next. For more information, see Start and Current Values on page 109.
6. Enter the default value for the variable in the Default field. If the default value is a null value, enable the Is Null option.
7. To validate the default value of the new workflow variable, click the Validate button.
8. Click Apply to save the new workflow variable.
9. Click OK to close the workflow properties.


Scheduling a Workflow
You can schedule a workflow to run continuously, repeat at a given time or interval, or you
can manually start a workflow. The PowerCenter Server runs a scheduled workflow as
configured.
By default, the workflow runs on demand. You can change the schedule settings by editing the
scheduler. If you change schedule settings, the PowerCenter Server reschedules the workflow
according to the new settings.
Each workflow has an associated scheduler. A scheduler is a repository object that contains a
set of schedule settings. You can create a non-reusable scheduler for the workflow. Or, you can
create a reusable scheduler so you can use the same set of schedule settings for workflows in
the folder.
The Workflow Manager marks a workflow invalid if you delete the scheduler associated with
the workflow.
If you choose a different PowerCenter Server for the workflow or restart the PowerCenter
Server, it reschedules all workflows. This includes workflows that are scheduled to run
continuously but whose start time has passed. You must manually reschedule workflows
whose start time has passed if they are not scheduled to run continuously.
The PowerCenter Server does not run the workflow if:

The prior workflow run fails. When a workflow fails, the PowerCenter Server removes the
workflow from the schedule, and you must manually reschedule it. You can reschedule the
workflow in the Workflow Manager or using pmcmd. In the Workflow Manager Navigator
window, right-click the workflow and select Schedule Workflow. For more information
about the pmcmd scheduleworkflow command, see Scheduleworkflow on page 604.

You remove the workflow from the schedule. You can remove the workflow from the
schedule in the Workflow Manager or using pmcmd. In the Workflow Manager Navigator
window, right-click the workflow and select Unschedule Workflow. For more information
about the pmcmd unscheduleworkflow command, see Unscheduleworkflow on page 610.
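For example, you can reschedule or unschedule a workflow from the command line with pmcmd. The following sketch uses illustrative server, user, folder, and workflow names, and the option flags shown are assumptions rather than confirmed syntax; see the pmcmd sections referenced above for the exact usage:

   pmcmd scheduleworkflow -s serverhost:4001 -u Administrator -p mypassword -f ProductionFolder wf_update_orders
   pmcmd unscheduleworkflow -s serverhost:4001 -u Administrator -p mypassword -f ProductionFolder wf_update_orders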

Note: The PowerCenter Server schedules the workflow in the time zone of the PowerCenter Server machine. For example, the PowerCenter Client is in your current time zone and the
PowerCenter Server is in a time zone two hours later. If you schedule the workflow to start at
9 a.m., it starts at 9 a.m. in the time zone of the PowerCenter Server machine and 7 a.m.
current time.
To schedule a workflow:
1. In the Workflow Designer, open the workflow.
2. Choose Workflows-Edit.
3. In the Scheduler tab, choose Non-reusable if you want to create a non-reusable set of schedule settings for the workflow. Choose Reusable if you want to select an existing reusable scheduler for the workflow.
Note: If you do not have a reusable scheduler in the folder, you must create one before you choose Reusable. The Workflow Manager displays a warning message if you do not have an existing reusable scheduler.
4. Click the right side of the Scheduler field to edit scheduling settings for the scheduler.
For a complete list of scheduler options, see Configuring Scheduler Settings on page 114.
5. If you select Reusable, choose a reusable scheduler from the Scheduler Browser dialog box.
6. Click OK.

To remove a workflow from its schedule, right-click the workflow in the Navigator window
and choose Unschedule Workflow.


To reschedule a workflow on its original schedule, right-click the workflow in the Navigator
window and choose Schedule Workflow.

Creating a Reusable Scheduler


For each folder, the Workflow Manager allows you to create reusable schedulers so you can
reuse the same set of scheduling settings for workflows in the folder. Use a reusable scheduler
so you do not need to configure the same set of scheduling settings in each workflow.
When you delete a reusable scheduler, all workflows that use the deleted scheduler become invalid. To make the workflows valid, you must edit them and replace the missing scheduler.
To create a reusable scheduler:
1. In the Workflow Designer, choose Workflows-Schedulers.
2. Click Add to add a new scheduler.
3. In the General tab, enter a name for the scheduler.
4. Configure the scheduler settings in the Scheduler tab. For a complete list of scheduler settings, see Table 4-3 on page 115.

Configuring Scheduler Settings


Configure the Schedule tab of the scheduler to set run options, schedule options, start
options, and end options for the schedule.


Figure 4-13 shows the Schedule tab:


Figure 4-13. Schedule tab

Table 4-3 describes the settings on the Schedule tab:


Table 4-3. Schedule Tab Settings

Run Options: Run On Server Initialization/Run On Demand/Run Continuously (Optional). Indicates the workflow schedule type. If you select Run On Server Initialization, the PowerCenter Server runs the workflow as soon as the server is initialized, and then starts the next run of the workflow according to the settings in Schedule Options. If you select Run On Demand, the PowerCenter Server runs the workflow when you start the workflow manually. If you select Run Continuously, the PowerCenter Server runs the workflow as soon as the server initializes, and then starts the next run of the workflow as soon as it finishes the previous run.

Schedule Options: Run Once/Run Every/Customized Repeat (Required if you select Run On Server Initialization, or if you do not choose any setting in Run Options). If you select Run Once, the PowerCenter Server runs the workflow once, as scheduled in the scheduler. If you select Run Every, the PowerCenter Server runs the workflow at regular intervals, as configured. If you select Customized Repeat, the PowerCenter Server runs the workflow on the dates and times specified in the Repeat dialog box. When you select Customized Repeat, click Edit to open the Repeat dialog box, which allows you to schedule specific dates and times for the workflow run. The selected scheduler appears at the bottom of the page.

Start Options: Start Date/Start Time (Optional). Start Date indicates the date on which the PowerCenter Server begins the workflow schedule. Start Time indicates the time at which the PowerCenter Server begins the workflow schedule.

End Options: End On/End After/Forever (Required if the workflow schedule is Run Every or Customized Repeat). If you select End On, the PowerCenter Server stops scheduling the workflow on the selected date. If you select End After, the PowerCenter Server stops scheduling the workflow after the set number of workflow runs. If you select Forever, the PowerCenter Server schedules the workflow as long as the workflow does not fail.

Customizing Repeat Option


You can schedule the workflow to run once, run at an interval, or customize your own repeat
option. Click the Edit button to open the Customized Repeat dialog box.
Figure 4-14 shows the Customized Repeat dialog box:
Figure 4-14. Customized Repeat Dialog Box


Table 4-4 describes options in the Customized Repeat dialog box:


Table 4-4. Repeat Dialog Box Options

Repeat Every
Required/Optional: Required
Description: Enter the numeric interval at which you would like the PowerCenter Server to schedule the workflow, and then select Days, Weeks, or Months, as appropriate.
If you select Days, select the appropriate Daily Frequency settings.
If you select Weeks, select the appropriate Weekly and Daily Frequency settings.
If you select Months, select the appropriate Monthly and Daily Frequency settings.

Weekly
Required/Optional: Required/Optional
Description: Required to enter a weekly schedule. Select the day or days of the week on which you would like the PowerCenter Server to run the workflow.

Monthly
Required/Optional: Required/Optional
Description: Required to enter a monthly schedule.
If you select Run On Day, select the dates on which you want the workflow scheduled on a monthly basis. The PowerCenter Server schedules the workflow to run on the selected dates. If you select a numeric date exceeding the number of days within a given month, the PowerCenter Server schedules the workflow for the last day of the month, including leap years. For example, if you schedule the workflow to run on the 31st of every month, the PowerCenter Server schedules the workflow on the 30th of the following months: April, June, September, and November.
If you select Run On The, select the week(s) of the month, then the day of the week on which you want the workflow to run. For example, if you select Second and Last, then select Wednesday, the PowerCenter Server schedules the workflow to run on the second and last Wednesday of every month.

Daily
Required/Optional: Optional
Description: Enter the number of times you would like the PowerCenter Server to run the workflow on any day the workflow is scheduled.
If you select Run Once, the PowerCenter Server schedules the workflow once on the selected day, at the time entered on the Start Time setting on the Time tab.
If you select Run Every, enter Hours and Minutes to define the interval at which the PowerCenter Server runs the workflow. The PowerCenter Server then schedules the workflow at regular intervals on the selected day. The PowerCenter Server uses the Start Time setting for the first scheduled workflow of the day.

Editing Scheduler Settings


You can edit scheduler settings for both non-reusable and reusable schedulers.

- Non-reusable schedulers. When you configure or edit a non-reusable scheduler, check in the workflow to allow the schedule to automatically take effect.
You can update the schedule manually with the workflow checked out. Right-click the workflow in the Navigator, and select Schedule Workflow. Note that the changes are applied only to the latest checked-in version of the workflow.


- Reusable schedulers. When you edit settings for a reusable scheduler, the repository creates a new version of the scheduler and increments the version number by one. To update a workflow with the latest schedule, check in the scheduler after you edit it.
When you configure a reusable scheduler for a new workflow, you must check in both the workflow and the scheduler to enable the schedule to take effect. Thereafter, when you check in the scheduler after revising it, the workflow schedule is updated automatically, even if the workflow is checked out.
You need to update the workflow schedule manually if you do not check in the scheduler. To update a workflow schedule manually, right-click the workflow in the Navigator, and select Schedule Workflow. Note that the new schedule is implemented only for the latest checked-in version of the workflow. Workflows that are checked out are not updated with the new schedule.

Disabling Workflows
You may want to disable the workflow while you edit it. This prevents the PowerCenter Server
from running the workflow on its schedule. Select the Disable Workflows option on the
General tab of the workflow properties. The PowerCenter Server does not run disabled
workflows until you clear the Disable Workflows option. Once you clear the Disable
Workflows option, the PowerCenter Server reschedules the workflow.


Validating a Workflow
Before you can run a workflow, you must validate it. When you validate the workflow, you
validate all task instances in the workflow, including nested worklets.
The Workflow Manager validates the following properties:

- Expressions. Expressions in the workflow must be valid.
- Tasks. Non-reusable task instances and reusable task instances in the workflow must follow validation rules.
- Scheduler. If the workflow uses a reusable scheduler, the Workflow Manager verifies that the scheduler exists.

The Workflow Manager also verifies that you linked each task properly. For example, you
must link the Start task to at least one task in the workflow.
Note: The Workflow Manager validates Session tasks separately. If a session is invalid, the

workflow may still be valid. For more information about session validation, see Validating a
Session on page 195.

Expression Validation
The Workflow Manager validates all expressions in the workflow. You can enter expressions in
the Assignment task, Decision task, and link conditions. The Workflow Manager writes any
error message to the Output window.
Expressions in link conditions and Decision task conditions must evaluate to a numerical
value. Workflow variables used in expressions must exist in the workflow.
The Workflow Manager marks the workflow invalid if a link condition is invalid.
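For example, a link condition such as the following evaluates to a numerical value because the comparison returns 1 (True) or 0 (False). This is an illustration only; the session name s_LoadOrders is hypothetical:
$s_LoadOrders.status = SUCCEEDED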

Task Validation
The Workflow Manager validates each task in the workflow as you create it. When you save or
validate the workflow, the Workflow Manager validates all tasks in the workflow except
Session tasks. It marks the workflow invalid if it detects any invalid task in the workflow.
The Workflow Manager verifies that attributes in the tasks follow validation rules. For
example, the user-defined event you specify in an Event task must exist in the workflow. The
Workflow Manager also verifies that you linked each task properly. For example, you must
link the Start task to at least one task in the workflow. For details on task validation rules, see
Validating Tasks on page 139.
When you delete a reusable task, the Workflow Manager removes the instance of the deleted
task from workflows. The Workflow Manager also marks the workflow invalid when you
delete a reusable task used in a workflow.
The Workflow Manager verifies that there are no duplicate task names in a folder, and that
there are no duplicate task instances in the workflow.

Workflow Properties Validation


The Workflow Manager marks the workflow invalid if the scheduler you specify for the
workflow does not exist in the folder.

Running Validation
When you validate a workflow, you validate worklet instances, worklet objects, and all other
nested worklets in the workflow. You validate task instances and worklets, regardless of
whether you have edited them.
The Workflow Manager validates the worklet object using the same validation rules for
workflows. The Workflow Manager validates the worklet instance by verifying attributes in
the Parameter tab of the worklet instance. For details on validating worklets, see Validating
Worklets on page 171.
If the workflow contains nested worklets, you can select a worklet to validate the worklet and
all other worklets nested under it. To validate a worklet and its nested worklets, right-click the
worklet and choose Validate.

Example
For example, you have a workflow that contains a non-reusable worklet called Worklet_1.
Worklet_1 contains a nested worklet called Worklet_a. The workflow also contains a reusable
worklet instance called Worklet_2. Worklet_2 contains a nested worklet called Worklet_b.
In the example workflow in Figure 4-15, the Workflow Manager validates links, conditions,
and tasks in the workflow. The Workflow Manager validates all tasks in the workflow,
including tasks in Worklet_1, Worklet_2, Worklet_a, and Worklet_b.
You can validate a part of the workflow. Right-click Worklet_1 and choose Validate. The
Workflow Manager validates all tasks in Worklet_1 and Worklet_a.
Figure 4-15 shows the example workflow:
Figure 4-15. Example Workflow - Validation
Worklet_1: Non-reusable worklet. Contains a nested worklet called Worklet_a.
Worklet_2: Reusable worklet. Contains a nested worklet called Worklet_b.

Validating Multiple Workflows


You can validate multiple workflows or worklets without fetching them into the workspace.
To validate multiple workflows, you must select and validate the workflows from a query
results view or a view dependencies list. When you validate multiple workflows, the validation
does not include sessions, nested worklets, or reusable worklet objects in the workflows.
Note: If you are using the Repository Manager, you can select and validate multiple workflows

from the Repository Navigator.


You can save and optionally check in workflows that change from invalid to valid status. For
more information about validating multiple objects, see Validating Multiple Objects in the
Repository Guide.
To validate multiple workflows:
1. Select workflows from either a query list or a view dependencies list.
2. Right-click one of the selected workflows and choose Validate.
The Validate Objects dialog box appears.
3. Choose whether to save objects and check in objects that you validate.


Running the Workflow


Before you can run a workflow, you must save changes in the folder and select a PowerCenter
Server to run the workflow. You can manually start a workflow configured to run on demand
or to run on a schedule. Use the Workflow Manager, Workflow Monitor, or pmcmd to run a
workflow. You can choose to run the entire workflow, part of a workflow, or a task in the
workflow.

Selecting a Server to Run the Workflow


You must choose a server to run the workflow. If you only register one server, the Workflow
Manager lists the single registered PowerCenter Server that runs the workflow. For
PowerCenter repositories with multiple servers, the Workflow Manager lists all servers.
To select a server to run a workflow:
1. In the Workflow Designer, open the workflow.
2. Choose Workflows-Edit. The Edit Workflow dialog box appears.
3. Click the Select Server button on the General tab. A list of registered servers appears.
4. Select the server on which you want to run the workflow.
5. Click OK twice to select the server for the workflow.

Assigning the PowerCenter Server to a Workflow


After you register the PowerCenter Server, you can assign it to workflows you want to run on
that server. This allows you to assign the PowerCenter Server to multiple workflows without


editing each workflow property individually. To assign the PowerCenter Server to multiple
workflows, you must first close all folders in the repository.
You can also choose a PowerCenter Server to run a specific workflow by editing the workflow
property. For details, see Running a Workflow on page 124.
To assign the PowerCenter Server to workflows, you must have Super User privilege.
To assign the PowerCenter Server:
1. Close all folders in the repository.
2. Choose Server-Assign Server.
or
Right-click the server name in the Navigator and choose Assign Server. The Assign Server dialog box opens.
3. From the Choose Server list, select the server you want to assign.
4. From the Show Folder list, select the folder you want to view. Or, choose All to view workflows in all folders in the repository.
5. Select the Select check box for each workflow you want to run on the PowerCenter Server.
6. Click Assign.

Removing an Assigned Server from a Workflow


You can remove an assigned server from a workflow in the Assign Server dialog box. Perform
the following steps to remove an assigned server from a workflow.


To remove an assigned server:


1. Close all folders in the repository.
2. Choose Server-Assign Server.
3. From the Choose Server list, select None.
4. From the Show Folder list, select the folder you want to view. Or, choose All to view workflows in all folders in the repository.
5. Select the workflows from which you want to remove the assigned server.
6. Click Assign.

Running a Workflow
When you choose Workflows-Start, the PowerCenter Server runs the entire workflow.
To run a workflow from pmcmd, use the startworkflow command. For details on using
pmcmd, see Using pmcmd on page 581.
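For example, you might start a workflow from the command line with a command like the following. This is a minimal sketch only: the connection options (-s, -u, -p, -f) and the server, folder, and workflow names are assumptions for illustration, and the authoritative syntax appears in the pmcmd chapter.
pmcmd startworkflow -s myserver:4001 -u Administrator -p MyPassword -f SalesFolder wf_LoadSales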
To start a workflow with the Workflow Manager:
1. Connect to a repository and open the folder containing the workflow.
2. From the Navigator, select the workflow that you want to start.
3. Right-click the workflow in the Navigator and choose Start Workflow.
The PowerCenter Server starts running the entire workflow.

When you choose Start Workflow, the workflow runs on the PowerCenter Server you selected
in the workflow properties. You can also use the Choose Server toolbar button to run the
workflow on a different server.
After the Workflow Manager sends a request to the PowerCenter Server, the Output window
displays the PowerCenter Server response. If an error displays, check the workflow log or
session log for error messages.
You can also manually start a workflow by right-clicking in the Workflow Designer workspace
and choosing Start Workflow.

Running a Part of a Workflow


You can choose to run only part of the workflow. To run part of the workflow, right-click the
task that you want the PowerCenter Server to begin running and choose Start Workflow From
Task. The PowerCenter Server runs the workflow from the selected task to the end of the
workflow.
When you run a workflow from a selected task, the PowerCenter Server runs the workflow on
the registered server you choose in the workflow properties. The PowerCenter Server logs
messages in the workflow log when you start a workflow from a task.


To run a part of a workflow from pmcmd, use the startfrom flag of the startworkflow
command. For details on using pmcmd, see Using pmcmd on page 581.
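For example, the command below sketches how the startfrom flag might be used to start a workflow from a particular task. The connection options and the names shown, including the task name t_ArchiveFiles, are assumptions for illustration; see the pmcmd chapter for the exact syntax.
pmcmd startworkflow -s myserver:4001 -u Administrator -p MyPassword -f SalesFolder -startfrom t_ArchiveFiles wf_LoadSales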
To run a part of a workflow:
1. Connect to the folder containing the workflow.
2. In the Navigator window, drill down the Workflow folder to show the tasks in the workflow.
or
In the Workflow Designer workspace, select the task from which you want the PowerCenter Server to begin running.
3. Right-click the task from which you want the PowerCenter Server to begin running.
4. Choose Start Workflow From Task.

For example, you have a workflow with multiple tasks. The example workflow in Figure 4-16
contains two branches. If you want to run the tasks commandtask2, e_email2, and
command3, you start the workflow from commandtask2. All subsequent tasks in the branch
will run.
Figure 4-16. Running Part of a Workflow - Example
When you start the workflow from commandtask2, the PowerCenter Server runs this portion of the workflow.

Running a Task in the Workflow


When you start a task in the workflow, the Workflow Manager locks the entire workflow so
another user cannot start the workflow. The PowerCenter Server runs the selected task. It does
not run the rest of the workflow.
To run a task using the Workflow Manager, select the task in the Workflow Designer
workspace. Right-click the task and choose Start Task.
You can select a task to start using menu commands in the Workflow Manager. In the
Navigator window, drill down the Workflow folder to show the tasks in the workflow you
want to start. Right-click the task you want to start and choose Start Task.


To start a task in a workflow from pmcmd, use the starttask command. For details on using
pmcmd, see Using pmcmd on page 581.
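For example, a single task might be started with a command like the following. This is a sketch only: the option names other than starttask, and the server, folder, workflow, and task names, are assumptions for illustration.
pmcmd starttask -s myserver:4001 -u Administrator -p MyPassword -f SalesFolder -w wf_LoadSales t_ArchiveFiles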


Suspending the Workflow


When a task in the workflow fails, you might want to suspend the workflow, fix the error, and
resume or recover the workflow. The PowerCenter Server suspends the workflow if you enable
the Suspend On Error option in the workflow properties. You can optionally set a suspension
email so the PowerCenter Server sends an email when it suspends a workflow.
When you enable the Suspend On Error option, the PowerCenter Server suspends the
workflow when one of the following fails:

- Session
- Command
- Worklet
- Email

When a task fails in the workflow, the PowerCenter Server stops running tasks in its path.
The PowerCenter Server does not evaluate the output link of the failed task. If no other task is
running in the workflow, the Workflow Monitor displays the status of the workflow as
Suspended.
If one or more tasks are still running in the workflow when a task fails, the PowerCenter
Server stops running the failed task and continues running tasks in other paths. The
Workflow Monitor displays the status of the workflow as Suspending.
When the status of the workflow is Suspended or Suspending, you can fix the error, such
as a target database error, and resume or recover the workflow in the Workflow Monitor.
When you resume or recover a workflow, the PowerCenter Server restarts the failed tasks and
continues evaluating the rest of the tasks in the workflow. The PowerCenter Server does not
run any task that already completed successfully.
Note: Do not edit a workflow or the tasks inside a workflow when the PowerCenter Server

suspends a workflow.
For details about resuming the workflow, see Resuming a Workflow or Worklet on
page 417. For details about recovering the workflow, see Recovering a Workflow or Worklet
on page 417.
To suspend a workflow:
1. In the Workflow Designer, open the workflow.
2. Choose Workflows-Edit.
3. In the General tab, enable Suspend On Error.
4. Click OK.

Configuring Suspension Email


You can configure the workflow so that the PowerCenter Server sends an email when it
suspends a workflow. Select an existing reusable email task for the suspension email. When a
task fails, the PowerCenter Server starts suspending the workflow and sends the suspension
email. If another task fails while the PowerCenter Server is suspending the workflow, you do
not get the suspension email again.
The PowerCenter Server sends out a suspension email if another task fails after you resume
the workflow.
For details on configuring suspension emails, see Working with Suspension Email on
page 339.


Stopping or Aborting the Workflow


You can specify when and how you want the PowerCenter Server to stop or abort a workflow
by using the Control task in the workflow. After you start a workflow, you can stop or abort it
through the Workflow Monitor or pmcmd. You can issue the stop or abort command at any
time during the execution of a workflow.
You can stop or abort a workflow by performing one of the following actions:
- Use a Control task in the workflow. For details, see Working with the Control Task on page 147.
- Issue a stop or abort command in the Workflow Monitor. For details, see Monitoring Workflows on page 401.
- Issue a stop or abort command in pmcmd. For details, see pmcmd Reference on page 594.
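For example, pmcmd provides stopworkflow and abortworkflow commands. The commands below are a sketch only; the connection options and the server, folder, and workflow names are assumptions for illustration.
pmcmd stopworkflow -s myserver:4001 -u Administrator -p MyPassword -f SalesFolder wf_LoadSales
pmcmd abortworkflow -s myserver:4001 -u Administrator -p MyPassword -f SalesFolder wf_LoadSales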

You can also stop or abort a task within a workflow. For details on stopping the Session task,
see Stopping and Aborting a Session on page 200.

Server Handling of Stop and Abort


When you stop a workflow, the PowerCenter Server tries to stop all the tasks that are
currently running in the workflow. If the workflow contains a worklet, the PowerCenter
Server also tries to stop all the tasks that are currently running in the worklet. If it cannot stop
the workflow, you need to abort the workflow.
The PowerCenter Server can stop the following tasks completely:

- Session
- Command
- Timer
- Event-Wait
- Worklet

When you stop a Command task that contains multiple commands, the PowerCenter Server
finishes executing the current command and does not execute the rest of the commands. The
PowerCenter Server cannot stop tasks such as the Email task. For example, if the PowerCenter
Server has already started sending an email when you issue the stop command, the
PowerCenter Server finishes sending the email before it stops running the workflow.
The PowerCenter Server aborts the workflow if the Repository Server process shuts down.

Stopping or Aborting a Task


You can stop or abort a task within a workflow from the Workflow Monitor. When you stop
or abort a task, the PowerCenter Server stops processing the task. The PowerCenter Server
does not process other tasks in the path of the stopped or aborted task. The PowerCenter
Server continues processing concurrent tasks in the workflow. If the PowerCenter Server
cannot stop the task, you can abort the task.
When you abort a task, the PowerCenter Server kills the process on the task. The
PowerCenter Server continues processing concurrent tasks in the workflow when you abort a
task.
You can also stop or abort a worklet. The PowerCenter Server stops and aborts a worklet
similar to stopping and aborting a task. The PowerCenter Server stops the worklet while
executing concurrent tasks in the workflow. You can also stop or abort tasks within a worklet.

Stopping or Aborting a Session Task


If the PowerCenter Server is executing a Session task when you issue the stop command, the
PowerCenter Server stops reading data. It continues processing and writing data and
committing data to targets. If the PowerCenter Server cannot finish processing and
committing data, you can issue the abort command.
The PowerCenter Server handles the abort command for the Session task like the stop
command, except it has a timeout period of 60 seconds. If the PowerCenter Server cannot
finish processing and committing data within the timeout period, it kills the DTM process
and terminates the session. For details on stopping or aborting a session, see Stopping and
Aborting a Session on page 200.


Chapter 5

Working with Tasks


This chapter covers the following topics:

Overview, 132

Creating a Task, 133

Configuring Tasks, 135

Validating Tasks, 139

Working with the Assignment Task, 140

Working with the Command Task, 143

Working with the Control Task, 147

Working with the Decision Task, 149

Working with Event Tasks, 153

Working with the Timer Task, 161


Overview
The Workflow Manager contains many types of tasks to help you build workflows and
worklets. You can create reusable tasks in the Task Developer. Or, create and add tasks in the
Workflow or Worklet Designer as you develop the workflow.
Table 5-1 summarizes the workflow tasks available in the Workflow Manager:

Table 5-1. Workflow Tasks

Assignment
Tool: Workflow Designer, Worklet Designer
Reusable: No
Description: Assigns a value to a workflow variable. For details, see Working with the Assignment Task on page 140.

Command
Tool: Task Developer, Workflow Designer, Worklet Designer
Reusable: Yes
Description: Specifies shell commands to run during the workflow. You can choose to run the Command task only if the previous task in the workflow completes. For details, see Working with the Command Task on page 143.

Control
Tool: Workflow Designer, Worklet Designer
Reusable: No
Description: Stops or aborts the workflow. For details, see Working with the Control Task on page 147.

Decision
Tool: Workflow Designer, Worklet Designer
Reusable: No
Description: Specifies a condition to evaluate in the workflow. Use the Decision task to create branches in a workflow. For details, see Working with the Decision Task on page 149.

Email
Tool: Task Developer, Workflow Designer, Worklet Designer
Reusable: Yes
Description: Sends email during the workflow. For details, see Sending Email on page 319.

Event-Raise
Tool: Workflow Designer, Worklet Designer
Reusable: No
Description: Represents the location of a user-defined event. The Event-Raise task triggers the user-defined event when the PowerCenter Server runs the Event-Raise task. For details, see Working with Event Tasks on page 153.

Event-Wait
Tool: Workflow Designer, Worklet Designer
Reusable: No
Description: Waits for a user-defined or a pre-defined event to occur. Once the event occurs, the PowerCenter Server completes the rest of the workflow. For details, see Working with Event Tasks on page 153.

Session
Tool: Task Developer, Workflow Designer, Worklet Designer
Reusable: Yes
Description: Set of instructions to run a mapping. For details, see Working with Sessions on page 173.

Timer
Tool: Workflow Designer, Worklet Designer
Reusable: No
Description: Waits for a specified period of time to run the next task. For details, see Working with the Timer Task on page 161.

The Workflow Manager validates task attributes and links. If a task is invalid, the workflow becomes invalid. Workflows containing invalid sessions may still be valid. For details on validating tasks, see Validating Tasks on page 139.


Creating a Task
You can create tasks in the Task Developer, or you can create them in the Workflow Designer
or the Worklet Designer as you develop the workflow or worklet. Tasks you create in the Task
Developer are reusable. Tasks you create in the Workflow Designer and Worklet Designer are
non-reusable by default.
For details on reusable tasks, see Reusable Workflow Tasks on page 135.

Creating a Task in the Task Developer


You can create the following three types of tasks in the Task Developer:

- Command
- Session
- Email

Perform the following steps to create tasks in the Task Developer.


To create a task in the Task Developer:
1. In the Task Developer, choose Tasks-Create. The Create Task dialog box appears.
2. Select the task type you want to create, Command, Session, or Email.
3. Enter a name for the task.
4. For session tasks, select the mapping you want to associate with the session.
5. Click Create.
The Task Developer creates the workflow task.
6. Click Done to close the Create Task dialog box.

Creating a Task in the Workflow or Worklet Designer


You can create and add tasks in the Workflow Designer or Worklet Designer as you develop
the workflow or worklet. You can create any type of task in the Workflow Designer or Worklet
Designer. Tasks you create in the Workflow Designer or Worklet Designer are non-reusable.
Edit the General tab of the task properties to promote a non-reusable task to a reusable task.


Perform the following steps to create tasks in the Workflow Designer or Worklet Designer.
To create tasks in the Workflow Designer or Worklet Designer:
1. In the Workflow Designer or Worklet Designer, open a workflow or worklet.
2. Choose Tasks-Create.
3. Select the type of task you want to create.
4. Enter a name for the task.
5. Click Create.
The Workflow Designer or Worklet Designer creates the task and adds it to the workspace.
6. Click Done.

You can also use the Tasks toolbar to create and add tasks to the workflow. Click the button
on the Tasks toolbar for the task you want to create. Click again in the Workflow Designer or
Worklet Designer workspace to create and add the task. The Workflow Designer or Worklet
Designer creates the task with a default task name when you use the Tasks toolbar.


Configuring Tasks
After you create the task, you can configure general task options on the General tab. For each task instance in the workflow, you can configure how the PowerCenter Server runs the task and the other objects associated with the selected task. You can also disable the task so you can run the rest of the workflow without the selected task.
Figure 5-1 displays the General tab in the Edit Tasks dialog box:
Figure 5-1. General Tab - Edit Tasks Dialog Box

When you use a task in the workflow, you can edit the task in the Workflow Designer and
configure the following task options in the General tab:

- Treat input link as AND or OR. Choose to have the PowerCenter Server run the task when all or one of the input link conditions evaluates to True.
- Disable this task. Choose to disable the task so you can run the rest of the workflow without the task.
- Fail parent if this task fails. Choose to fail the workflow or worklet containing the task if the task fails.
- Fail parent if this task does not run. Choose to fail the workflow or worklet containing the task if the task does not run.

Reusable Workflow Tasks


Workflows can contain reusable task instances and non-reusable tasks. Non-reusable tasks
exist within a single workflow. Reusable tasks can be used in multiple workflows in the same
folder.


You have the option to create any task as non-reusable or reusable. Tasks you create in the
Task Developer are reusable. Tasks you create in the Workflow Designer are non-reusable by
default. However, you can edit the general properties of a task to promote it to a reusable task.
The Workflow Manager stores each reusable task separate from the workflows that use the
task. You can view a list of reusable tasks in the Tasks node in the Navigator window. You can
see a list of all reusable Session tasks in the Sessions node in the Navigator window.
To promote a non-reusable workflow task:
1. In the Workflow Designer, double-click the task you want to make reusable.
2. In the General tab of the Edit Task dialog box, check the Make Reusable option.
3. When prompted whether you are sure you want to promote the task, click Yes.
4. Click OK to return to the workflow.
5. Choose Repository-Save.

The newly promoted task appears in the list of reusable tasks in the Tasks node in the
Navigator window.

Instances and Inherited Changes


When you add a reusable task to a workflow, you add an instance of the task. The definition
of the task exists outside the workflow, while an instance of the task exists in the workflow.
You can edit the task instance in the Workflow Designer. Changes you make in the task
instance exist only in the workflow. The task definition remains unchanged in the Task
Developer.
When you make changes to a reusable task definition in the Task Developer, the changes are reflected in the instance of the task in the workflow only if you have not edited the instance.

Reverting Changes in Reusable Tasks Instances


When you edit an instance of a reusable task in the workflow, you can revert to the settings in the task definition. The Revert button appears after you override a task property in the task instance. You cannot use the Revert button for settings that are read-only or locked by another user.


Figure 5-2 displays the Revert button in the Mapping tab of a Session task:
Figure 5-2. Revert Button in Session Properties

AND or OR Input Links


For each task, you can choose to treat the input link as an AND link or an OR link. When a task has one input link, the PowerCenter Server processes the task when the previous object completes and the link condition evaluates to True. If you have multiple links going into one task, you can choose to have an AND input link so that the PowerCenter Server runs the task when all the link conditions evaluate to True. Or, you can choose to have an OR input link so that the PowerCenter Server runs the task as soon as any link condition evaluates to True.
To set the type of input links, double-click the task to open the Edit Tasks dialog box. Select
AND or OR for the input link type. For details on working with links and link conditions,
see Working with Links on page 92.

Disabling Tasks
In the Workflow Designer, you can disable a workflow task so that the PowerCenter Server
runs the workflow without the disabled task. The status of a disabled task is DISABLED.
Disable a task in the workflow by selecting the Disable This Task option in the Edit Tasks
dialog box.


Failing Parent Workflow or Worklet


You can choose to fail the workflow or worklet if a task fails or does not run. The workflow or
worklet that contains the task instance is called the parent. A task might not run when the
input condition for the task evaluates to False.
To fail the parent workflow or worklet if the task fails, double-click the task and select the Fail
Parent If This Task Fails option in the General tab. When you select this option and a task
fails, it does not prevent the other tasks in the workflow or worklet from running. Instead, the
PowerCenter Server marks the status of the workflow or worklet as failed. If you have a session
nested within multiple worklets, you must select the Fail Parent If This Task Fails option for
each worklet instance to see the failure at the workflow level.
To fail the parent workflow or worklet if the task does not run, double-click the task and
select the Fail Parent If This Task Does Not Run option in the General tab. When you choose
this option, the PowerCenter Server fails the parent workflow if a task did not run.
Note: The PowerCenter Server does not fail the parent workflow if you disable a task.


Validating Tasks
You can validate reusable tasks in the Task Developer. Or, you can validate task instances in the Workflow Designer. When you validate a task, the Workflow Manager validates task attributes and links. For example, the user-defined event you specify in an Event task must exist in the workflow.
The Workflow Manager uses the following rules to validate tasks:

- Assignment. The Workflow Manager validates the expression you enter for the Assignment task. For example, the Workflow Manager verifies that you assigned a matching datatype value to the workflow variable in the assignment expression.
- Command. The Workflow Manager does not validate the shell command you enter for the Command task.
- Event-Wait. If you choose to wait for a pre-defined event, the Workflow Manager verifies that you specified a file to watch. If you choose to use the Event-Wait task to wait for a user-defined event, the Workflow Manager verifies that you specified an event.
- Event-Raise. The Workflow Manager verifies that you specified a user-defined event for the Event-Raise task.
- Timer. The Workflow Manager verifies that the variable you specified for the Absolute Time setting has the Date/Time datatype.
- Start. The Workflow Manager verifies that you linked the Start task to at least one task in the workflow.

When a task instance is invalid, the workflow using the task instance becomes invalid. When
a reusable task is invalid, it does not affect the validity of the task instance used in the
workflow. However, if a Session task instance is invalid, the workflow may still be valid. The
Workflow Manager validates sessions differently. For details, see Validating a Session on
page 195.
To validate a task, select the task in the workspace and choose Tasks-Validate. Or, right-click
the task in the workspace and choose Validate.


Working with the Assignment Task


The Assignment task allows you to assign a value to a user-defined workflow variable. To use
an Assignment task in the workflow, first create and add the Assignment task to the workflow.
Then configure the Assignment task to assign values or expressions to user-defined variables.
After you assign a value to a variable using the Assignment task, the PowerCenter Server uses
the assigned value for the variable during the remainder of the workflow.
You must create a variable before you can assign values to it. You cannot assign values to pre-defined workflow variables.
To create an Assignment task:
1. In the Workflow Designer, click the Assignment icon on the Tasks toolbar.
or
Choose Tasks-Create. Select Assignment Task for the task type.
2. Enter a name for the Assignment task. Click Create. Then click Done.
The Workflow Designer creates and adds the Assignment task to the workflow.
3. Double-click the Assignment task to open the Edit Task dialog box.
4. On the Expressions tab, click Add to add an assignment.
5. Click the Open button in the User Defined Variables field.
The Select Variable dialog box appears.
6. Select the variable for which you want to assign a value. Click OK.
7. Click the Edit button in the Expression field to open the Expression Editor.
The Expression Editor shows pre-defined workflow variables, user-defined workflow variables, variable functions, and boolean and arithmetic operators.
8. Enter the value or expression you want to assign. For example, if you want to assign the value 500 to the user-defined variable $$custno1, enter the number 500 in the Expression Editor.
Validate the expression before you close the Expression Editor.
9. Repeat steps 5-7 to add more variable assignments as necessary. Use the up and down arrows in the Expressions tab to change the order of the variable assignments.
10. Click OK.

Working with the Command Task


The Command task allows you to specify one or more shell commands to run during the
workflow. For example, you can specify shell commands in the Command task to delete reject
files, copy a file, or archive target files.
You can use a Command task in the following ways:

- Standalone Command task. You can use a Command task anywhere in the workflow or worklet to run shell commands.
- Pre- and post-session shell command. You can call a Command task as the pre- or post-session shell command for a Session task. For more information about specifying pre-session and post-session shell commands, see Using Pre- or Post-Session Shell Commands on page 188.

Note: You can use server variables or session variables in pre- and post-session shell commands.

You cannot use server variables or session variables in standalone Command tasks. The
PowerCenter Server does not expand server variables or session variables in standalone
Command tasks.
Use any valid UNIX command or shell script for UNIX servers, or any valid DOS or batch
file for Windows servers.
For example, you might use a shell command to copy a file from one directory to another. For a Windows server you would use the following shell command to copy the SALES_ADJ file from the source directory, L, to the target, H:
copy L:\sales\sales_adj H:\marketing\

For a UNIX server, you would use the following command to perform a similar operation:
cp sales/sales_adj marketing/

Each shell command runs in the same environment (UNIX or Windows) as the PowerCenter
Server. Environment settings in one shell command script do not carry over to other scripts.
To run all shell commands in the same environment, call a single shell script that invokes
other scripts.
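For example, on UNIX you might create one wrapper script that sets the environment once and then calls the other scripts. This is a minimal sketch; the script names and paths are hypothetical:
#!/bin/sh
# run_all.sh - hypothetical wrapper that runs every step in a single environment
. /home/etl/setenv.sh
/home/etl/scripts/archive_targets.sh
/home/etl/scripts/cleanup_rejects.sh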

Using Session Parameters


You can use session parameters in pre- or post-session shell commands. For example, you
might use an input file parameter instead of hard-coding the name of a source file.
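For instance, a post-session shell command might combine a server variable with a session parameter instead of a literal file name, as in the sketch below. The parameter name $InputFile1 and the archive directory are hypothetical, and $PMSourceFileDir is assumed to be defined for the PowerCenter Server:
cp $PMSourceFileDir/$InputFile1 /data/archive/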

Creating a Command Task


Perform the following steps to create a Command task.


To create a Command task:
1. In the Workflow Designer or the Task Developer, click the Command Task icon on the Tasks toolbar.
or
Choose Tasks-Create. Select Command Task for the task type.
2. Enter a name for the Command task. Click Create. Then click Done.
3. Double-click the Command task in the workspace to open the Edit Tasks dialog box.
4. In the Commands tab, click the Add button to add a command.
5. In the Name field, enter a name for the new command.
6. In the Command field, click the Edit button to open the Command Editor.
7. Enter the command you want to perform. Enter only one command in the Command Editor.
8. Click OK to close the Command Editor.
9. Repeat steps 3-8 to add more commands in the task.
10. Click OK.

If you specify non-reusable shell commands for a session, you can promote the non-reusable
shell commands to a reusable Command task. For details, see Creating a Reusable Command
Task from Pre- or Post-Session Commands on page 191.

Executing Commands in the Command Task


The PowerCenter Server processes the shell commands in the order you specify them. You can choose to run a command only if the previous command completed successfully. Or, you can choose to run all commands in the Command task, regardless of the result of the previous command. If you configure multiple commands in a Command task to run on UNIX, each command runs in a separate shell.
To run the next command only if the previous command completes successfully, select the
Run If Previous Completed option in the Properties tab of the Command task.
If you select the Run If Previous Completed option, when one of the commands in the
Command task fails, the PowerCenter Server stops running the rest of the commands and
fails the task. If you do not select the Run If Previous Completed option, the PowerCenter
Server runs all the commands in the Command task and treats the task as completed, even if a
command fails.


Figure 5-3 shows the Run If Previous Completed option:


Figure 5-3. Run If Previous Completed Option


Working with the Control Task


You can use the Control task to stop, abort, or fail the top-level workflow or the parent
workflow based on an input link condition. A parent workflow or worklet is the workflow or
worklet that contains the Control task.
To create a Control task:
1. In the Workflow Designer, click the Control Task icon on the Tasks toolbar.
or
Choose Tasks-Create. Select Control Task for the task type.
2. Enter a name for the Control task. Click Create. Then click Done.
The Workflow Manager creates and adds the Control task to the workflow.
3. Double-click the Control task in the workspace to open it.
4. Configure control options on the Properties tab.

You can choose from the following control options:

Fail Me: Marks the Control task as Failed. The PowerCenter Server fails the Control task if you choose this option. If you choose Fail Me in the Properties tab and choose Fail Parent If This Task Fails in the General tab, the PowerCenter Server fails the parent workflow.
Fail Parent: Marks the status of the workflow or worklet that contains the Control task as failed after the workflow or worklet completes.
Stop Parent: Stops the workflow or worklet that contains the Control task.
Abort Parent: Aborts the workflow or worklet that contains the Control task.
Fail Top-Level Workflow: Fails the workflow that is running.
Stop Top-Level Workflow: Stops the workflow that is running.
Abort Top-Level Workflow: Aborts the workflow that is running.


Working with the Decision Task


The Decision task allows you to enter a condition that determines the execution of the
workflow, similar to a link condition. The Decision task has a pre-defined variable called
$Decision_task_name.condition that represents the result of the decision condition. The
PowerCenter Server evaluates the condition in the Decision task and sets the pre-defined
condition variable to True (1) or False (0).
You can specify one decision condition per Decision task.
After the PowerCenter Server evaluates the Decision task, you can use the pre-defined
condition variable in other expressions in the workflow to help you develop the workflow.
Depending on the workflow, you might use link conditions instead of a Decision task.
However, the Decision task simplifies the workflow. For details on link conditions, see
Working with Links on page 92.
If you do not specify a condition in the Decision task, the PowerCenter Server evaluates the
Decision task to True.

Using the Decision Task


You can use the Decision task instead of multiple link conditions in a workflow. Instead of
specifying multiple link conditions, use the pre-defined Condition variable in a Decision task
to simplify link conditions.

Example
For example, you have a Command task that depends on the status of the three sessions in the
workflow. You want the PowerCenter Server to run the Command task when any of the three
sessions fails. To accomplish this, use a Decision task with the following decision condition:
$Q1_session.status = FAILED OR $Q2_session.status = FAILED OR
$Q3_session.status = FAILED

You can then use the pre-defined condition variable in the input link condition of the
Command task. Configure the input link with the following link condition:
$Decision.condition = True


Figure 5-4 shows the example workflow using a Decision task:


Figure 5-4. Example Workflow Using a Decision Task

You can configure the same logic in the workflow without the Decision task. Without the
Decision task, you need to use three link conditions and treat the input links to the
Command task as OR links.
Figure 5-5 shows the example workflow without the Decision task:
Figure 5-5. Example Workflow without a Decision Task

You can further expand the example workflow in Figure 5-4. In Figure 5-4, the PowerCenter
Server runs the Command task if any of the three Session tasks fails. Suppose now you want
the PowerCenter Server to also run an Email task if all three Session tasks succeed.


To do this, add an Email task and use the decision condition variable in the link condition.
Figure 5-6 shows the expanded example workflow using a Decision task:
Figure 5-6. Expanded Example Workflow Using a Decision Task

$Decision.condition = True

$Decision.condition = False

Creating a Decision Task


Perform the following steps to create a Decision task.
To create a Decision task:
1. In the Workflow Designer, click the Decision Task icon on the Tasks toolbar.
or
Choose Tasks-Create. Select Decision Task for the task type.
2. Enter a name for the Decision task. Click Create. Then click Done.
The Workflow Designer creates and adds the Decision task to the workspace.
3. Double-click the Decision task to open it.
4. Click the Open button in the Value field to open the Expression Editor.
5. In the Expression Editor, enter the condition you want the PowerCenter Server to evaluate.
Validate the expression before you close the Expression Editor.
6. Click OK.


Working with Event Tasks


You can define events in the workflow to specify the sequence of task execution. The event is
triggered based on the completion of the sequence of tasks. Use the following tasks to help
you use events in the workflow:

- Event-Raise task. The Event-Raise task represents a user-defined event. When the PowerCenter Server runs the Event-Raise task, the Event-Raise task triggers the event. Use the Event-Raise task with the Event-Wait task to define events.
- Event-Wait task. The Event-Wait task waits for an event to occur. Once the event triggers, the PowerCenter Server continues executing the rest of the workflow.

To coordinate the execution of the workflow, you may specify the following types of events for
the Event-Wait and Event-Raise tasks:

- Pre-defined event. A pre-defined event is a file-watch event. For pre-defined events, use an Event-Wait task to instruct the PowerCenter Server to wait for the specified indicator file to appear before continuing with the rest of the workflow. When the PowerCenter Server locates the indicator file, it starts the next task in the workflow.
- User-defined event. A user-defined event is a sequence of tasks in the workflow. Use an Event-Raise task to specify the location of the user-defined event in the workflow. A user-defined event is a sequence of tasks in the branch from the Start task leading to the Event-Raise task.
When all the tasks in the branch from the Start task to the Event-Raise task complete, the Event-Raise task triggers the event. The Event-Wait task waits for the Event-Raise task to trigger the event before continuing with the rest of the tasks in its branch.

Example of User-Defined Events


Say you have four sessions you want to run in a workflow. You want Q1_session and
Q2_session to run concurrently to save time. You also want to run Q3_session after
Q1_session completes. You want to run Q4_session only when Q1_session, Q2_session, and
Q3_session complete.
Figure 5-7 shows how to accomplish this using the Event-Raise and Event-Wait tasks:
Figure 5-7. Example of User-Defined Event
User-defined event: Q1Q3_Complete


Perform the following steps to configure the workflow shown in Figure 5-7:
1. Link Q1_session and Q2_session concurrently.
2. Add Q3_session after Q1_session.
3. Declare an event called Q1Q3_Complete in the Events tab of the workflow properties.
4. In the workspace, add an Event-Raise task after Q3_session.
5. Specify the Q1Q3_Complete event in the Event-Raise task properties. This allows the Event-Raise task to trigger the event when Q1_session and Q3_session complete.
6. Add an Event-Wait task after Q2_session.
7. Specify the Q1Q3_Complete event for the Event-Wait task.
8. Add Q4_session after the Event-Wait task. When the PowerCenter Server processes the Event-Wait task, it waits until the Event-Raise task triggers Q1Q3_Complete before it runs Q4_session.

The PowerCenter Server runs the workflow shown in Figure 5-7 in the following order:
1. The PowerCenter Server runs Q1_session and Q2_session concurrently.
2. When Q1_session completes, the PowerCenter Server runs Q3_session.
3. The PowerCenter Server finishes executing Q2_session.
4. The Event-Wait task waits for the Event-Raise task to trigger the event.
5. The PowerCenter Server completes Q3_session.
6. The Event-Raise task triggers the event, Q1Q3_Complete.
7. The PowerCenter Server runs Q4_session because the event, Q1Q3_Complete, has been triggered.
8. The PowerCenter Server runs the Email task.

Working with Event-Raise Tasks


The Event-Raise task represents the location of a user-defined event. A user-defined event is
the sequence of tasks in the branch from the Start task to the Event-Raise task. When the
PowerCenter Server runs the Event-Raise task, the Event-Raise task triggers the user-defined
event.
To use an Event-Raise task, you must first declare the user-defined event. Then, create an
Event-Raise task in the workflow to represent the location of the user-defined event you just
declared. In the Event-Raise task properties, specify the name of a user-defined event.


Declaring a User-Defined Event


Perform the following steps to declare a name for a user-defined event.
To declare a user-defined event:
1. In the Workflow Designer, select Workflows-Edit to open the workflow properties.
2. Select the Events tab in the Edit Workflow dialog box.
3. Click Add to add an event name. Event names are not case-sensitive.
4. Click OK.

Using the Event-Raise Task For a User-Defined Event


After you declare a user-defined event, use the Event-Raise task to represent the location of the event and to trigger the event.
Perform the following steps to use an Event-Raise task.
To use an Event-Raise task:
1. In the Workflow Designer workspace, create an Event-Raise task and place it in the workflow to represent the user-defined event you want to trigger. A user-defined event is the sequence of tasks in the branch from the Start task to the Event-Raise task.
2. Double-click the Event-Raise task to open it.
3. Click the Open button in the Value field on the Properties tab to open the Events Browser for user-defined events.
4. Choose an event in the Events Browser.
5. Click OK twice to return to the workspace.

Working With Event-Wait Tasks


The Event-Wait task waits for a pre-defined event or a user-defined event. A pre-defined event is a file-watch event. When you use the Event-Wait task to wait for a pre-defined event, you specify an indicator file for the PowerCenter Server to watch. The PowerCenter Server waits
for the indicator file to appear. Once the indicator file appears, the PowerCenter Server
continues executing tasks after the Event-Wait task.
Do not use the Event-Raise task to trigger the event when you wait for a pre-defined event.
You can also use the Event-Wait task to wait for a user-defined event. To use the Event-Wait task for a user-defined event, you specify the name of the user-defined event in the Event-Wait task properties. The PowerCenter Server waits for the Event-Raise task to trigger the user-defined event. Once the user-defined event is triggered, the PowerCenter Server continues running tasks after the Event-Wait task.

Waiting for User-Defined Events


You can use the Event-Wait task to wait for a user-defined event. A user-defined event is
triggered by the Event-Raise task. To wait for a user-defined event, you must first use an
Event-Raise task to trigger the user-defined event.
To wait for a user-defined event:
1. In the workflow, create an Event-Wait task and double-click the Event-Wait task to open the Edit Task dialog box.
2. In the Events tab of the Edit Tasks dialog box, select User-Defined.
3. Click the Event button to open the Events Browser dialog box.
4. Select the user-defined event you want the PowerCenter Server to wait for.
5. Click OK twice.

Waiting for Pre-Defined Events


To use a pre-defined event, you need a shell command, script, or batch file to create an indicator file. The file must be created or sent to a directory local to the PowerCenter Server. The file can be any format recognized by the PowerCenter Server operating system. You can choose to have the PowerCenter Server delete the indicator file after it detects the file, or you can manually delete the indicator file. The PowerCenter Server marks the status of the Event-Wait task as failed if it cannot delete the indicator file.
When you specify the indicator file in the Event-Wait task, enter the directory in which the
file will appear and the name of the indicator file. You must provide the absolute path for the
file. The directory must be local to the PowerCenter Server. If you only specify the file name
and not the directory, the PowerCenter Server looks for the indicator file in the system
directory. For example, on Windows 2000, the system directory is c:\winnt\system32.
You can enter the actual name of the file or use server variables to specify the location of the
files. For more information on server variables, see Server Variables on page 46.
The PowerCenter Server writes the time the file appears in the workflow log.
Note: Do not use a source or target file name as the indicator file name.
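For example, an upstream process on a UNIX machine might create the indicator file with a command like the following. The directory and file name shown here are hypothetical:
touch /data/indicators/orders_loaded.ind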

Perform the following steps to wait for a pre-defined event in the workflow.
To wait for a pre-defined event:
1. Create an Event-Wait task and double-click the Event-Wait task to open it.
2. In the Events tab of the Edit Task dialog box, select Pre-defined.
3. Enter the path of the indicator file.
4. If you want the PowerCenter Server to delete the indicator file after it detects the file, select the Delete Filewatch File option in the Properties tab.
5. Click OK.

Enabling Past Events


By default, the Event-Wait task waits for the Event-Raise task to trigger the event and does
not check whether the event has already occurred. You can select the Enable Past Events
option so that the PowerCenter Server checks whether the event has already occurred.

When you select Enable Past Events, the PowerCenter Server continues executing the next
tasks if the event already occurred.
Select the Enable Past Events option in the Properties tab of the Event-Wait task.


Working with the Timer Task


The Timer task allows you to specify the period of time to wait before the PowerCenter Server
runs the next task in the workflow. You can choose to start the next task in the workflow at an
exact time and date. You can also choose to wait a period of time after the start time of
another task, workflow, or worklet before starting the next task.
The Timer task has two types of settings:

Absolute time. You specify the exact time that the PowerCenter Server starts running the
next task in the workflow. You may specify the exact date and time, or you can choose a
user-defined workflow variable to specify the exact time.

Relative time. You instruct the PowerCenter Server to wait for a specified period of time
after the Timer task, the parent workflow, or the top-level workflow starts.

For example, you may have two sessions in the workflow. You want the PowerCenter Server to
wait ten minutes after the first session completes before it runs the second session. Use a
Timer task after the first session. In the Relative Time setting of the Timer task, specify ten
minutes from the start time of the Timer task.
Figure 5-8 shows the example workflow using the Timer task:
Figure 5-8. Example Workflow Using the Timer Task

You can use a Timer task anywhere in the workflow after the Start task.
To create a Timer task:
1. In the Workflow Designer, click the Timer task icon on the Tasks toolbar.
   or
   Choose Tasks-Create. Select Timer Task for the task type.
2. Double-click the Timer task to open it.
3. On the General tab, enter a name for the Timer task.
4. Click the Timer tab to specify when the PowerCenter Server starts the next task in the
   workflow.

Specify attributes for Absolute Time or Relative Time described in Table 5-2:
Table 5-2. Timer Task Attributes

Absolute Time: Specify the exact time to start
    The PowerCenter Server starts the next task in the workflow at the exact date and time
    you specify.

Absolute Time: Use this workflow date-time variable to calculate the wait
    Specify a user-defined date-time workflow variable. The PowerCenter Server starts the
    next task in the workflow at the time you choose. The Workflow Manager verifies that the
    variable you specify has the Date/Time datatype. The Timer task fails if the date-time
    workflow variable evaluates to NULL.

Relative time: Start after
    Specify the period of time the PowerCenter Server waits to start executing the next task
    in the workflow.

Relative time: from the start time of this task
    Choose this option to wait a specified period of time after the start time of the Timer
    task to run the next task.

Relative time: from the start time of the parent workflow/worklet
    Choose this option to wait a specified period of time after the start time of the parent
    workflow/worklet to run the next task.

Relative time: from the start time of the top-level workflow
    Choose this option to wait a specified period of time after the start time of the
    top-level workflow to run the next task.


Chapter 6

Working with Worklets


This chapter covers the following topics:

Overview, 164

Developing a Worklet, 165

Using Worklet Variables, 169

Validating Worklets, 171


Overview
A worklet is an object that represents a set of tasks. It can contain any task available in the
Workflow Manager. You can run worklets inside a workflow. The workflow that contains the
worklet is called the parent workflow. You can also nest a worklet in another worklet.
Create a worklet when you want to reuse a set of workflow logic in several workflows. Use the
Worklet Designer to create and edit worklets.
When the PowerCenter Server runs a worklet, it expands the worklet. The PowerCenter
Server then runs the worklet as it would any other workflow, executing tasks and evaluating
links in the worklet.
The worklet does not contain any scheduling or server information. To run a worklet, include
the worklet in a workflow. The worklet runs on the PowerCenter Server you choose for the
workflow. The Workflow Manager does not provide a parameter file or log file for worklets.
The PowerCenter Server writes information about worklet execution in the workflow log.

Suspending Worklets
When you choose Suspend On Error for the parent workflow, the PowerCenter Server also
suspends the worklet if a task in the worklet fails. When a task in the worklet fails, the
PowerCenter Server stops executing the failed task and other tasks in its path. If no other task
is running in the worklet, the worklet status is Suspended. If one or more tasks are still
running in the worklet, the worklet status is Suspending. The PowerCenter Server suspends
the parent workflow when the status of the worklet is Suspended or Suspending.
For details on suspending workflows, see Suspending the Workflow on page 127.


Developing a Worklet
To develop a worklet, first create the worklet, then configure the worklet properties and add
tasks to it. You can create reusable worklets in the
Worklet Designer. You can also create non-reusable worklets in the Workflow Designer as you
develop the workflow.

Creating a Reusable Worklet


Create reusable worklets in the Worklet Designer. You can view a list of reusable worklets in
the Navigator Worklets node.
To create a reusable worklet:
1. In the Worklet Designer, choose Worklets-Create. The Create Worklet dialog box
   appears.
2. Enter a name for the worklet.
3. Click OK.
   The Worklet Designer creates a Start task in the worklet.

Creating a Non-Reusable Worklet


You can create non-reusable worklets in the Workflow Designer as you develop the workflow.
Non-reusable worklets only exist in the workflow. You cannot use a non-reusable worklet in
another workflow. After you create the worklet in the Workflow Designer, open the worklet to
edit it in the Worklet Designer.


You can promote non-reusable worklets to reusable worklets by selecting the Reusable option
in the worklet properties. To rename non-reusable worklets, open the worklet properties in
the Workflow Designer.
To create a non-reusable worklet:
1. In the Workflow Designer, open a workflow.
2. Choose Tasks-Create.
3. Select Worklet for the Task type.
4. Enter a name for the worklet.
5. Click Create.
   The Workflow Designer creates the worklet and adds it to the workspace.
6. Click Done.

Configuring Worklet Properties


When you use a worklet in a workflow, you can configure the same set of general task settings
on the General tab as any other task. For example, you can make a worklet reusable, disable a
worklet, configure the input link to the worklet, or fail the parent workflow based on the
worklet. For details on these task settings, see Configuring Tasks on page 135.
In addition to general task settings, you can configure the following worklet properties:

Worklet variables. Use worklet variables to reference values and record information. You
use worklet variables the same way you use workflow variables. You can assign a workflow
variable to a worklet variable to override its initial value.
For details on worklet variables, see Using Worklet Variables on page 169.

Events. To use the Event-Wait and Event-Raise tasks in the worklet, you must first declare
an event in the worklet properties.

Metadata extension. Extend the metadata stored in the repository by associating


information with repository objects. For details, see Working with Metadata Extensions
on page 82.

Adding Tasks in Worklets


After you create a new worklet, add tasks by opening the worklet in the Worklet Designer. A
worklet must contain a Start task. The Start task represents the beginning of a worklet. When
you create a worklet, the Worklet Designer automatically creates a Start task for you.
To add tasks to a non-reusable worklet:

1. Create a non-reusable worklet in the Workflow Designer workspace.
2. Right-click the worklet and choose Open Worklet.
3. The Worklet Designer opens so you can add tasks in the worklet.
4. Add tasks in the worklet by using the Tasks toolbar or choose Tasks-Create in the
   Worklet Designer.
5. Connect tasks with links.

Declaring Events in Worklets


Similar to workflows, you can use Event-Wait and Event-Raise tasks in a worklet. To use the
Event-Raise task, you first declare a user-defined event in the worklet. Events in one instance
of a worklet do not affect events in other instances of the worklet. You cannot specify worklet
events in the Event tasks in the parent workflow.
For more information about using event tasks, see Working with Event Tasks on page 153.

Viewing Links in a Worklet


When you edit a workflow or worklet, you can view the forward or backward link paths to
other tasks. You can highlight paths to see links in the workflow branch from the Start task to
the last task in the branch. For details, see Developing Workflows on page 91.

Nesting Worklets
You can nest a worklet within another worklet. When you run a workflow containing nested
worklets, the PowerCenter Server runs the nested worklet from within the parent worklet. You
can group several worklets together by function or simplify the design of a complex workflow
when you nest worklets.
You might choose to nest worklets to load data to fact and dimension tables. Create a nested
worklet to load fact and dimension data into a staging area. Then, create a nested worklet to
load the fact and dimension data from the staging area to the data warehouse.
You might choose to nest worklets to simplify the design of a complex workflow. Nest
worklets that can be grouped together within one worklet. In the workflow in Figure 6-1, two
worklets relate to regional sales and two worklets relate to quarterly sales.
Figure 6-1 shows a workflow that uses multiple worklets:
Figure 6-1. Workflow with Multiple Worklets


The workflow in Figure 6-2 shows the same workflow with the worklets grouped and nested
in parent worklets.
Figure 6-2 shows a workflow that uses nested worklets:
Figure 6-2. Workflow with Nested Worklets

Creating Nested Worklets


From the Worklet Designer, open the parent worklet. To nest an existing reusable worklet,
choose Tasks-Insert Worklet. To create a non-reusable nested worklet, choose Tasks-Create,
and select Worklet.


Using Worklet Variables


Worklet variables are similar to workflow variables. A worklet has the same set of pre-defined
variables as any task. You can also create user-defined worklet variables. Like user-defined
workflow variables, user-defined worklet variables can be persistent or non-persistent. For
details on workflow variables, see Using Workflow Variables on page 103.
You cannot use variables from the parent workflow in the worklet. Similarly, you cannot use
user-defined worklet variables in the parent workflow. However, you can use pre-defined
worklet variables in the parent workflow, just as you can use pre-defined variables for other
tasks in the workflow.

Persistent Worklet Variables


User-defined worklet variables can be persistent or non-persistent. To create a persistent
worklet variable, select Persistent when you create the variable. When you create a persistent
worklet variable, the worklet variable retains its value the next time the PowerCenter Server
executes the worklet instance in the parent workflow.
For example, you might have a worklet with a persistent variable. Use two instances of the
worklet in a workflow to run the worklet twice. You name the first instance of the worklet
Worklet1 and the second instance Worklet2.
Figure 6-3 shows the example workflow:
Figure 6-3. Example of Persistent Worklet Variable

When you run the example workflow shown in Figure 6-3, the persistent worklet variable
retains its value from Worklet1 and becomes the initial value in Worklet2. After the
PowerCenter Server executes Worklet2, it retains the value of the persistent variable in the
repository and uses the value the next time you run the workflow.
Worklet variables only persist when you run the same workflow. A worklet variable does not
retain its value when you use instances of the worklet in different workflows.

Overriding Initial Value


For each worklet instance, you can override the initial value of the worklet variable by
assigning a workflow variable to it.
To override the initial value of a worklet variable:
1. Double-click the worklet instance in the Workflow Designer workspace.
2. On the Parameters tab, click the Add button.
3. Click the Open button in the User-Defined Worklet Variables field to select a worklet
   variable.
4. Click the Open button in the Parent Workflow Variable field to select a workflow
   variable to assign to the worklet variable.
5. Click Apply.
   The worklet variable in this worklet instance now has the selected workflow variable as its
   initial value.


Validating Worklets
The Workflow Manager validates worklets when you save the worklet in the Worklet
Designer. In addition, when you use worklets in a workflow, the PowerCenter Server validates
the workflow according to the following validation rules at runtime:

You cannot run two instances of the same worklet concurrently in the same workflow.

You cannot run two instances of the same worklet concurrently across two different
workflows.

Each worklet instance in the workflow can run only once.

When a worklet instance is invalid, the workflow using the worklet instance remains valid.
For details on workflow validation rules, see Validating a Workflow on page 119.
The Workflow Manager displays a red invalid icon if the worklet object is invalid. The
Workflow Manager validates the worklet object using the same validation rules for workflows.
The Workflow Manager displays a blue invalid icon if the worklet instance in the workflow is
invalid. The worklet instance may be invalid when any of the following conditions occurs:

The parent workflow or worklet variable you assign to the user-defined worklet variable
does not have a matching datatype.

The user-defined worklet variable you used in the worklet properties does not exist.

You do not specify the parent workflow or worklet variable you want to assign.

For non-reusable worklets, you may see both red and blue invalid icons displayed over the
worklet icon in the Navigator.


Chapter 7

Working with Sessions


This chapter covers the following topics:

Overview, 174

Creating a Session Task, 175

Editing a Session, 177

Creating a Session Configuration Object, 183

Using Pre- and Post-Session SQL Commands, 186

Using Pre- or Post-Session Shell Commands, 188

Using Post-Session Email, 194

Validating a Session, 195

Running the Session, 197

Stopping and Aborting a Session, 200

Mapping Parameters and Variables in Sessions, 203

Handling High Precision Data, 204


Overview
A session is a set of instructions that tells the PowerCenter Server how and when to move data
from sources to targets. A session is a type of task, similar to other tasks available in the
Workflow Manager. In the Workflow Manager, you configure a session by creating a Session
task. To run a session, you must first create a workflow to contain the Session task.
When you create a Session task, you enter general information such as the session name,
session schedule, and the PowerCenter Server to run the session. You can also select options to
execute pre-session shell commands, send On-Success or On-Failure email, and use FTP to
transfer source and target files.
Using session properties, you can also override parameters established in the mapping, such as
source and target location, source and target type, error tracing levels, and transformation
attributes. When you assign a server in a server grid to a session, the server you specify at the
session level overrides the server you specify at the workflow level.
You can run as many sessions in a workflow as you need. You can run the Session tasks
sequentially or concurrently, depending on your needs.
The PowerCenter Server creates several files and in-memory caches depending on the
transformations and options used in the session. For more details on session output files and
caches, see Output Files and Caches on page 28.


Creating a Session Task


You create a Session task for each mapping you want the PowerCenter Server to run. The
PowerCenter Server uses the instructions configured in the session to move data from sources
to targets.
You can create a reusable Session task in the Task Developer. You can also create non-reusable
Session tasks in the Workflow Designer as you develop the workflow. After you create the
session, you can edit the session properties at any time.
Note: Before you create a Session task, you must configure the Workflow Manager to

communicate with databases and the PowerCenter Server. You must assign appropriate
permissions for any database, FTP, or external loader connections you configure. For details
on configuring the Workflow Manager, see Configuring the Workflow Manager on page 37.

Session Privileges
To create sessions, you must have one of the following sets of privileges and permissions:

Use Workflow Manager privilege with read, write, and execute permissions

Super User privilege

You must have read permission for connection objects associated with the session in addition
to the above privileges and permissions.
PowerCenter allows you to set a read-only privilege for sessions. The Workflow Operator
privilege allows a user to view, start, stop, and monitor sessions without being able to edit
session properties.

Steps to Create a Session Task


Create the Session task in the Task Developer or the Workflow Designer. Session tasks created
in the Task Developer are reusable. For more information about reusable tasks and other
general information about workflow tasks, see Reusable Workflow Tasks on page 135.
To create a Session task:
1. In the Workflow Designer, click the Session Task icon on the Tasks toolbar.
   or
   Choose Tasks-Create. Select Session Task for the task type.
2. Enter a name for the Session task.
3. Click Create. The Mappings dialog box appears.
4. Select the mapping you want to use in the Session task and click OK.
5. Click Done. The Session task appears in the workspace.


Editing a Session
After you create a session, you can edit it. For example, you might need to adjust the buffer
and cache sizes, modify the update strategy, or clear a variable value saved in the repository.
Double-click the Session task to open the session properties. The session has the following
tabs, and each of those tabs has multiple settings:

General tab. Enter session name, mapping name, description for the Session task, specify a
PowerCenter Server override, and configure additional task options.

Properties tab. Enter session log information, test load settings, and performance
configuration.

Config Object tab. Enter advanced settings, log options, and error handling
configuration.

Mapping tab. Enter source and target information, override transformation properties,
and configure the session for partitioning.

Components tab. Configure pre- or post-session shell commands and emails.

Metadata Extension tab. Configure metadata extension options.

For a detailed description of the session properties tabs and associated options, see Session
Properties Reference on page 667.
Figure 7-1 shows the session properties:
Figure 7-1. Session Properties


You can edit session properties at any time. The repository updates the session properties
immediately.
If the session is running when you edit the session, the repository updates the session when
the session completes. If the mapping changes, the Workflow Manager might issue a warning
that the session is invalid. The Workflow Manager then allows you to continue editing the
session properties. After you edit the session properties, the PowerCenter Server validates the
session and reschedules the session as necessary. For details on session validation, see
Validating a Session on page 195.

Edit Session Privilege


To edit a session, you must have one of the following sets of privileges and permissions:

Use Workflow Manager privilege with read and write permissions on the folder

Super User privilege

Applying Attributes to All Instances


When you edit the session properties, you can apply source, target, and transformation
settings to all instances of the same type in the session. You can also apply settings to all
partitions in a pipeline. You can apply reader or writer settings, connection settings, and
properties settings.
For example, you might need to change a relational connection from a test to a production
database for all the target instances in a session. You can change the connection value for one
target in a session and apply the connection to the other relational target objects.


Figure 7-2 shows the writers, connections, and properties settings for a target instance in a
session:
Figure 7-2. Session Target Object Settings

For a target instance, you can change writers, connections, and properties settings.

Table 7-1 shows the options you can use to apply attributes to objects in a session. You can
apply different options depending on whether the setting is a reader or writer, connection, or
an object property.
Table 7-1. Apply All Options

Reader, Writer - Apply Type to All Instances
    Applies a reader or writer type to all instances of the same object type in the session.
    For example, you can apply a relational reader type to all the other readers in your
    session.

Reader, Writer - Apply Type to All Partitions
    Applies a reader or writer type to all the partitions in a pipeline. For example, if you
    have four partitions, you can change the writer type in one partition for a target
    instance. Then you can use this option to apply the change to the other three partitions.

Connections - Apply Connection Type
    Applies the same type of connection to all instances. Connection types are relational,
    FTP, queue, application, or external loader.

Connections - Apply Connection Value
    Applies a connection value to all instances or partitions. The connection value defines a
    specific connection that you can view in the connection browser. You can only apply a
    connection value that is valid for the existing connection type.

Connections - Apply Connection Attributes
    Applies only the connection attribute values to all instances or partitions. Each type of
    connection has different attributes. You can apply connection attributes separately from
    connection values. To view sample connection attributes, see Figure 7-3 on page 181.

Connections - Apply Connection Data
    Applies the connection value and its connection attributes to all the other instances
    that have the same connection type. This option combines the connection value option and
    the connection attribute option.

Connections - Apply All Connection Information
    Applies the connection value and its attributes to all the other instances even if they
    do not have the same connection type. This option is similar to Apply Connection Data,
    but it allows you to change the connection type.

Properties - Apply Attribute to All Instances
    Applies an attribute value to all instances of the same object type in the session. For
    example, if you have a relational target, you can choose to truncate a table before you
    load data. You can apply the attribute value to all the relational targets in your
    session.

Properties - Apply Attribute to All Partitions
    Applies an attribute value to all partitions in a pipeline. For example, you can change
    the reject file name in one partition for a target instance, then apply the file name
    change to the other partitions.

Applying Connection Settings


When you apply connection settings you can apply the connection type, connection value,
and connection attributes. You can only apply a connection value that is valid for a
connection type unless you choose the Apply All Connection Information option. For
example, if a target instance uses an FTP connection, you can only choose an FTP connection
value to apply to it. The Apply All Connection Information option enables you to apply a
new connection type, connection value, and connection attributes.


Figure 7-3 illustrates the connection options by showing where they display on a connection
browser:
Figure 7-3. Connection Options

The connection type can be relational, FTP, queue, application, or external loader.
The connection value defines a specific connection.
Connection attributes are different for each connection type.

Applying Attributes to Partitions or Instances


When you apply attributes to all instances or partitions in a session, you must open the
session and edit one of the session objects. You apply attributes or properties to other
instances by choosing an attribute in that object and selecting to apply its value to the other
instances or partitions.
To apply attributes to all instances or partitions:
1. Open a session in the workspace.
2. Click the Mappings tab.
3. Choose a source, target, or transformation instance from the Navigator. Settings for
   properties, connections, and readers or writers might display, depending on the object
   you choose.
4. Right-click a reader, writer, property, or connection value. A list of options displays.
5. Select an option from the list and choose to apply it to all instances or all partitions.
6. Click OK to apply the attribute or property.


Creating a Session Configuration Object


The Config Object tab in the session properties includes commit and load settings, log
options, and error handling settings. The Workflow Manager allows you to create a reusable
set of attributes for the Config Object tab. When you configure attributes in the Config
Object tab, you can specify a session configuration object you already created. Or, you can
specify the default session configuration object called default_session_config. Override the
attributes in the session configuration object in the Config Object tab.
Figure 7-4 shows the Config Object tab of the session properties:
Figure 7-4. Config Object Tab

Select a session configuration object.

Click the Browse button in the Config Name field to choose a session configuration. Select a
user-defined or default session configuration object from the browser.
To create a session configuration object:
1. In the Workflow Manager, click Tasks-Session Configuration.
   The Session Configuration Browser appears.
   Figure 7-5 shows the Session Configuration Browser:
   Figure 7-5. Session Configuration Browser
2. Click New to create a new session configuration object.
3. Enter a name for the session configuration object.
4. In the Properties tab, configure advanced settings, log options, and error handling
   options.
5. Click OK.

For session configuration object settings descriptions, see Config Object Tab on page 675.


Using Pre- and Post-Session SQL Commands


You can specify pre- and post-session SQL in the Source Qualifier transformation and the
target instance when you create a mapping. When you create a Session task in the Workflow
Manager you can override the SQL commands on the Mapping tab. You might want to use
these commands to drop indexes on the target before the session runs, and then recreate them
when the session completes.
The PowerCenter Server executes pre-session SQL commands before it reads the source. It
executes post-session SQL commands after it writes to the target.

Guidelines for Entering Pre- and Post-Session SQL Commands


Remember the following guidelines when creating the SQL statements:

You can use any command that is valid for the database type. However, the PowerCenter
Server does not allow nested comments, even though the database might.

You can use mapping parameters and variables in SQL executed against the source, but not
the target.

Use a semi-colon (;) to separate multiple statements.

The PowerCenter Server ignores semi-colons within single quotes, double quotes, or
within /* ...*/.

If you need to use a semi-colon outside of quotes or comments, you can escape it with a
back slash (\).

The Workflow Manager does not validate the SQL.
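
For example, to drop target indexes before the load and rebuild them afterward, you might
enter SQL like the following on the Mapping tab. The table and index names are hypothetical,
the exact index syntax depends on your target database type, and multiple statements are
separated with semi-colons:

    Pre-session SQL:
        DROP INDEX idx_t_orders_cust;
        DROP INDEX idx_t_orders_date

    Post-session SQL:
        CREATE INDEX idx_t_orders_cust ON t_orders (cust_id);
        CREATE INDEX idx_t_orders_date ON t_orders (order_date)

Because the Workflow Manager does not validate these statements, test them against the target
database before you run the session.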

Error Handling
You can configure error handling on the Config Object tab. You can choose to stop or
continue the session if the PowerCenter Server encounters an error issuing the pre- or post-session SQL command.


Figure 7-6 shows how to configure error handling for pre- or post-session SQL commands:
Figure 7-6. Stop or Continue the Session on Pre- or Post-Session SQL Errors

Stop or continue the session on pre- or post-session SQL error.


Using Pre- or Post-Session Shell Commands


The PowerCenter Server can perform shell commands at the beginning of the session or at the
end of the session. Shell commands are operating system commands. You can use pre- or post-session shell commands, for example, to delete a reject file or session log, or to archive target
files before the session begins.
The Workflow Manager provides the following types of shell commands for each Session task:

Pre-session command. The PowerCenter Server performs pre-session shell commands at


the beginning of a session. You can configure a session to stop or continue if a pre-session
shell command fails.

Post-session success command. The PowerCenter Server performs post-session success


commands only if the session completed successfully.

Post-session failure command. The PowerCenter Server performs post-session failure


commands only if the session failed to complete.
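
For example, a session that loads a flat file target might use commands like the following on
a UNIX PowerCenter Server. The directories and file names are hypothetical; on a Windows
server you would use equivalent DOS or batch commands:

    Pre-session command:           rm -f /data/work/t_orders.bad
    Post-session success command:  cp /data/out/t_orders.out /data/archive/t_orders.out
    Post-session failure command:  mv /data/out/t_orders.out /data/out/t_orders.out.failed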

Use the following guidelines to call a shell command:

Use any valid UNIX command or shell script for UNIX servers, or any valid DOS or batch
file for Windows servers.

Configure the session to execute the pre- or post-session shell commands.

The Workflow Manager provides a task called the Command task that allows you to specify
shell commands anywhere in the workflow. You can choose a reusable Command task for the
pre- or post-session shell command. Or, you can create non-reusable shell commands for the
pre- or post-session shell commands. For details on the Command task, see Working with
the Command Task on page 143.
If you create a non-reusable pre- or post-session shell command, you can make it into a
reusable Command task.
The Workflow Manager allows you to choose from the following options when you configure
shell commands:

Create non-reusable shell commands. Create a non-reusable set of shell commands for the
session. Other sessions in the folder cannot use this set of shell commands.

Use an existing reusable Command task. Select an existing Command task to run as the
pre- or post-session shell command.

Configure pre- and post-session shell commands in the Components tab of the session
properties.

Using Server and Session Variables


You can include any server variable, such as $PMTargetFileDir, or session variables in
pre-session and post-session commands. When you use a server variable instead
of entering a specific directory, you can run the same workflow on different PowerCenter
Servers without changing session properties. You cannot use server variables or session
variables in standalone Command tasks in the workflow. The PowerCenter Server does not
expand server variables or session variables used in standalone Command tasks.
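
For example, instead of hard-coding a directory, a pre-session command can reference a server
variable such as $PMBadFileDir. The file name is hypothetical; the PowerCenter Server expands
the variable to the directory configured for the server that runs the session:

    rm -f $PMBadFileDir/t_orders.bad

Because the command contains no literal path, the same session can run on any PowerCenter
Server without editing the session properties.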

Configuring Non-Reusable Shell Commands


When you create non-reusable pre- or post-session shell commands, the commands are only
visible in session properties. The Workflow Manager does not create Command tasks from
these non-reusable commands. You can make non-reusable shell commands into a reusable
Command task.
Figure 7-7 shows the Make Reusable option for a pre-session shell command:
Figure 7-7. Make Reusable Option for Pre-Session Shell Commands

Make this shell command reusable.

Perform the following steps to create pre- or post-session shell commands for a specific
session.


To create non-reusable shell commands:


1. In the Components tab of the session properties, select Non-reusable for the pre- or
   post-session shell command.
2. Click the Edit button in the Value field to open the Edit Pre- or Post-Session Command
   dialog box.
3. Enter a name for the command in the General tab.
4. If you want the PowerCenter Server to perform the next command only if the previous
   command completed successfully, select Run If Previous Completed in the Properties tab.
5. In the Commands tab, click the Add button to add shell commands.
   Enter one command for each line.
6. Click OK.

Creating a Reusable Command Task from Pre- or Post-Session Commands

If you create non-reusable pre- or post-session shell commands, you can make them into a
reusable Command task. Once you make the pre- or post-session shell commands into a
reusable Command task, you cannot revert back.


To create a Command Task from non-reusable pre- or post-session shell commands, click the
Edit button to open the Edit dialog box for the shell commands. In the General tab, select the
Make Reusable checkbox.
After you check the Make Reusable checkbox and click OK, a new Command task appears in
the Tasks folder in the Navigator window. You can use this Command task in other
workflows, just as you do with any other reusable workflow tasks.

Configuring Reusable Shell Commands


Perform the following steps to call an existing reusable Command task as the pre- or post-session shell command for the Session task.
To select an existing Command task as the pre-session shell command:
1. In the Components tab of the session properties, click Reusable for the pre- or
   post-session shell command.
2. Click the Edit button in the Value field to open the Task Browser dialog box.
3. Select the Command task you want to run as the pre- or post-session shell command.
4. Click the Override button in the Task Browser dialog box if you want to change the order
   of the commands, or if you want to specify whether to run the next command when the
   previous command fails.
   Changes you make to the Command task from the session properties only apply to the
   session. In the session properties, you cannot edit the commands in the Command task.
5. Click OK to select the Command task for the pre- or post-session shell command.
   The name of the Command task you select appears in the Value field for the shell
   command.


Using Server Variables


You can include any server variable, such as $PMTargetFileDir, in pre- or post-session shell
commands. When you use a server variable instead of entering a specific directory, you can
run the same workflow on different PowerCenter Servers without changing session properties.

Pre-Session Shell Command Errors


You can configure the session to stop or continue if a pre-session shell command fails. If you
select stop, the PowerCenter Server stops the session, but continues with the rest of the
workflow. If you select Continue, the PowerCenter Server ignores the errors and continues the
session. By default, the PowerCenter Server stops the session upon shell command errors.
Configure the session to stop or continue if a pre-session shell command fails in the Error
Handling settings on the Config Object tab.
Figure 7-8 shows how to configure the session to stop or continue when a pre-session shell
command fails:
Figure 7-8. Stop or Continue the Session on Pre-Session Shell Command Error

Stop or continue the session on pre-session shell command error.


Using Post-Session Email


The PowerCenter Server can send emails after the session completes. You can send an email
when the session completes successfully. Or, you can send an email when the session fails. The
PowerCenter Server can send the following types of emails for each Session task:

On-Success Email. The PowerCenter Server sends the email when the session completes
successfully.

On-Failure Email. The PowerCenter Server sends the email when the session fails.

You can also use an Email task to send email anywhere in the workflow. If you already created
a reusable Email task, you can select it as the On-Success or On-Failure email for the session.
Or, you can create non-reusable emails that exist only within the Session task.
For more information about sending post-session emails, see Sending Email on page 319.


Validating a Session
The Workflow Manager validates a Session task when you save it. You can also manually
validate Session tasks and session instances. Validate reusable Session tasks in the Task
Developer. Validate non-reusable sessions and reusable session instances in the Workflow
Designer.
The Workflow Manager marks a reusable session or session instance invalid if you perform
one of the following tasks:

Edit the mapping in a way that might invalidate the session. You can edit the mapping
used by a session at any time. When you edit and save a mapping, the repository might
invalidate sessions that already use the mapping. The PowerCenter Server does not execute
invalid sessions.
You must reconnect to the folder to see the effect of mapping changes on Session tasks. For
details on validating mappings, see Mappings in the Designer Guide.
When you edit a session based on an invalid mapping, the Workflow Manager displays a
warning message:
The mapping [mapping_name] associated with the session [session_name] is
invalid.

Delete a database, FTP, or external loader connection used by the session.

Leave session attributes blank. For example, the session is invalid if you do not specify the
source file name.

Change the code page of a session database connection to an incompatible code page.

If you delete objects associated with a Session task such as session configuration object, Email,
or Command task, the Workflow Manager marks a reusable session invalid. However, the
Workflow Manager does not mark a non-reusable session invalid if you delete an object
associated with the session.
If you delete a shortcut to a source or target from the mapping, the Workflow Manager does
not mark the session invalid.
The Workflow Manager does not validate SQL overrides or filter conditions entered in the
session properties when you validate a session. You must validate SQL override and filter
conditions in the SQL Editor.
If a reusable session task is invalid, the Workflow Manager displays an invalid icon over the
session task in the Navigator and in the Task Developer workspace. This does not affect the
validity of the session instance and the workflows using the session instance.
If a reusable or non-reusable session instance is invalid, the Workflow Manager marks it
invalid in the Navigator and in the Workflow Designer workspace. Workflows using the
session instance remain valid.
To validate a session, select the session in the workspace and choose Tasks-Validate. Or, right-click the session instance in the workspace and choose Validate.


Validating Multiple Sessions


You can validate multiple sessions without fetching them into the workspace. You must select
and validate the sessions from a query results view or a view dependencies list. You can save
and optionally check in sessions that change from invalid to valid status. For more
information about validating multiple objects, see Validating Multiple Objects in the
Repository Guide.
Note: If you are using the Repository Manager, you can select and validate multiple sessions
from the Navigator.


To validate multiple sessions:
1. Select sessions from either a query list or a view dependencies list.
2. Right-click one of the selected sessions and choose Validate.
   The Validate Objects dialog box displays.
3. Choose whether to save objects and check in objects that you validate.


Running the Session


By default, the PowerCenter Server you assign to a workflow runs all tasks. If you register
multiple servers to a repository, you can override the PowerCenter Server at the session level.
In a server grid, the master server distributes the sessions to available worker servers. You can
assign a PowerCenter Server to a session. The session always runs on the server you assigned
to it. For more information about how a server grid distributes sessions, see Distributing
Sessions on page 446.

Selecting a Server to Run the Session


You can choose a server to run the session. If you only register one server, the Workflow
Manager lists the single registered PowerCenter Server that runs the workflow and session.
For PowerCenter repositories with multiple servers, the Workflow Manager lists all servers.
To select a server to run a session:
1. Open a session in a workflow.
2. Double-click the session in the workflow. The Edit Tasks dialog box appears.
3. Click the Select Server button on the General tab. A list of registered servers appears.
4. Select a server to run the session.
5. Click OK twice to select the server for the session.

Instead of choosing a server for each session in the folder, you can assign multiple sessions to
a server.

Assigning the PowerCenter Server to a Session


After you register the PowerCenter Server, you can assign it to sessions you want to run on
that server. This allows you to assign the PowerCenter Server to multiple sessions without
editing each session property individually. To assign the PowerCenter Server to multiple
sessions, you must first close all folders in the repository.
To assign the PowerCenter Server to sessions, you must have the Super User privilege.
Figure 7-9 shows the Assign Server dialog box:
Figure 7-9. Assign Server Dialog Box

Select a server to assign.
Select a folder.
Show sessions.
Assign a server to a session.

To assign the PowerCenter Server:


1. Close all folders in the repository.
2. Choose Server-Assign Server.
   or
   Right-click the server name in the Navigator and choose Assign Server. The Assign Server
   dialog box opens.
3. From the Choose Server list, select the server you want to assign.
4. From the Show Folder list, select the folder you want to view. Or, choose All to view
   workflows in all folders in the repository.
5. Select the Show Sessions check box.
6. Select each session you want to run on the PowerCenter Server.
7. Click Assign.

You can remove an assigned server from a session in the Assign Server dialog box. Perform the
following steps to remove an assigned server from a session.
To remove an assigned server:
1. Close all folders in the repository.
2. Choose Server-Assign Server.
3. From the Choose Server list, select None.
4. From the Show Folder list, select the folder you want to view. Or, choose All to view
   workflows in all folders in the repository.
5. Select the sessions from which you want to remove the assigned server.
6. Click Assign.


Stopping and Aborting a Session


You can stop or abort a session just as you can stop or abort any task. You can also abort a
session by using the ABORT() function in the mapping logic. Session errors can cause the
PowerCenter Server to stop a session early. You can control the stopping point by setting an
error threshold in a session, using the ABORT function in mappings, or requesting the
PowerCenter Server to stop the session. You cannot control the stopping point when the
PowerCenter Server encounters fatal errors, such as loss of connection to the target database.
If a session fails as a result of error, you can consider performing session recovery. For more
information on recovery, see Recovering a Session Task on page 311. For more information
on row error logging, see Overview on page 482.

Threshold Errors
You can choose to stop a session on a designated number of non-fatal errors. A non-fatal error
is an error that does not force the session to stop on its first occurrence. Establish the error
threshold in the session properties with the Stop On option. When you enable this option,
the PowerCenter Server counts non-fatal errors that occur in the reader, writer, and
transformation threads.
The PowerCenter Server maintains an independent error count when reading sources,
transforming data, and writing to targets. The PowerCenter Server counts the following non-fatal errors when you set the Stop On option in the session properties:

Reader errors. Errors encountered by the PowerCenter Server while reading the source
database or source files. Reader threshold errors can include alignment errors while
running a session in Unicode mode.

Writer errors. Errors encountered by the PowerCenter Server while writing to the target
database or target files. Writer threshold errors can include key constraint violations,
loading nulls into a not null field, and database trigger responses.

Transformation errors. Errors encountered by the PowerCenter Server while transforming


data. Transformation threshold errors can include conversion errors, and any condition set
up as an ERROR, such as null input.

When you create multiple partitions in a pipeline, the PowerCenter Server maintains a
separate error threshold for each partition. When the PowerCenter Server reaches the error
threshold for any partition, it stops the session. The writer may continue writing data from
one or more partitions, but it does not affect your ability to perform a successful recovery.
Note: If alignment errors occur in a non line-sequential VSAM file, the PowerCenter Server

sets the error threshold to 1 and stops the session.

Fatal Error
A fatal error occurs when the PowerCenter Server cannot access the source, target, or
repository. This can include loss of connection or target database errors, such as lack of
database space to load data. If the session uses a Normalizer or Sequence Generator
transformation, the PowerCenter Server cannot update the sequence values in the repository,
and a fatal error occurs.
If the session does not use a Normalizer or Sequence Generator transformation, and the
PowerCenter Server loses connection to the repository, the PowerCenter Server does not stop
the session. The session completes, but the PowerCenter Server cannot log session statistics
into the repository.

ABORT Function
Use the ABORT function in the mapping logic to abort a session when the PowerCenter
Server encounters a designated transformation error.
For more information about ABORT, see Functions in the Transformation Language
Reference.
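
For example, an Expression transformation port might call ABORT when a required value is
missing. The port name and message below are illustrative only:

    IIF(ISNULL(CUSTOMER_ID), ABORT('CUSTOMER_ID is null, aborting the session'), CUSTOMER_ID)

When the condition is true, the PowerCenter Server aborts the session and writes the message
to the session log.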

User Command
You can stop or abort the session from the Workflow Manager. You can also stop the session
using pmcmd.

PowerCenter Server Handling for Session Failure


The PowerCenter Server handles session errors in different ways, depending on the error or
event that causes the session to fail.
Table 7-2 describes the PowerCenter Server behavior when a session fails:
Table 7-2. PowerCenter Server Behavior for Failed Sessions

Cause: Error threshold met due to reader errors, or Stop command using Workflow Manager or
pmcmd
    The PowerCenter Server performs the following tasks:
    - Stops reading.
    - Continues processing data.
    - Continues writing and committing data to targets.
    If the PowerCenter Server cannot finish processing and committing data, you need to issue
    the Abort command to stop the session.

Cause: Abort command using Workflow Manager
    The PowerCenter Server performs the following tasks:
    - Stops reading.
    - Continues processing data.
    - Continues writing and committing data to targets.
    If the PowerCenter Server cannot finish processing and committing data within 60 seconds,
    it kills the PowerCenter Server process.

Cause: Fatal error from database, or error threshold met due to writer errors
    The PowerCenter Server performs the following tasks:
    - Stops reading and writing.
    - Rolls back all data not committed to the target database.
    If the session stops due to a fatal error, the commit or rollback may or may not be
    successful.

Cause: Error threshold met due to transformation errors, ABORT( ), or invalid evaluation of
transaction control expression
    The PowerCenter Server performs the following tasks:
    - Stops reading.
    - Flags the row as an abort row and continues processing data.
    - Continues to write to the target database until it hits the abort row.
    - Issues commits based on commit intervals.
    - Rolls back all data not committed to the target database.


Mapping Parameters and Variables in Sessions


You can use mapping parameters in the session properties to alter certain mapping attributes.
For example, you can use a mapping parameter in a transformation override to override a
filter or user-defined join in a Source Qualifier transformation.
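
For example, if the mapping declares a mapping parameter named $$ExtractDate, you might enter
a source filter override like the following in the session properties. The parameter name and
column are hypothetical:

    ORDERS.ORDER_DATE >= '$$ExtractDate'

The PowerCenter Server replaces $$ExtractDate with the value from the parameter file, or with
the initial value defined in the mapping, when it generates the query for the session.
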
If you use mapping variables in a session, you can clear any of the variable values saved in the
repository by editing the session. When you clear the variable values, the PowerCenter Server
uses the values in the parameter file the next time you run a session. If the session does not use
a parameter file, the PowerCenter Server uses the initial values defined in the mapping. For
more information on mapping variables, see Mapping Parameters and Variables in the
Designer Guide.
To view or delete values for mapping variables saved in the repository:
1. In the Navigator window of the Workflow Manager, right-click the Session task and
   select View Persistent Values.
2. Click Delete Values to delete existing variable values.
3. To save changes, click OK.


Handling High Precision Data


The PowerCenter Server processes decimal values as Doubles or Decimals. When you create a
session, you choose to enable the Decimal datatype or let the PowerCenter Server process the
data as a Double (precision of 15).
To enable high precision data handling:

Use the Decimal datatype with a precision of 16 to 28 in the mapping.

Select Enable High Precision in the session properties.

The precision attributed to a number also includes the scale of the number. For example, the
value 11.47 has a precision of 4 and a scale of 2.
For example, you might have a mapping with Decimal (20,0) that passes the number
40012030304957666903. If you enable high precision, the PowerCenter Server passes the
number as is. If you do not enable high precision, the PowerCenter Server passes
4.00120303049577 x 10^19.
If you want to process a Decimal value with a precision greater than 28 digits, the
PowerCenter Server automatically treats it as a Double value. For example, if you want to
process the number 2345678904598383902092.1927658, which has a precision of 29 digits,
the PowerCenter Server automatically treats this number as a Double value of
2.34567890459838 x 10^21.
To use high precision data handling in a session:
1. In the Workflow Manager, open the session properties.
2. On the Properties tab, select Enable High Precision.
3. Click OK twice to save changes.


Chapter 8

Working with Sources


This chapter covers the following topics:

Overview, 208

Configuring Sources in a Session, 210

Working with Relational Sources, 214

Working with File Sources, 218

Server Handling for File Sources, 226

Using a File List, 230


Overview
In the Workflow Manager, you can create sessions with the following sources:

Relational. You can extract data from any relational database that the PowerCenter Server
can connect to. When extracting data from relational sources and Application sources, you
must configure the database connection to the data source prior to configuring the session.

File. You can create a session to extract data from a flat file, COBOL, or XML source. The
PowerCenter Server can extract data from any local directory or FTP connection for the
source file. If the file source requires an FTP connection, you need to configure the FTP
connection to the host machine before you create the session.

Heterogeneous. You can extract data from multiple sources in the same session. You can
extract from multiple relational sources, such as Oracle and SQL Server. Or, you can
extract from multiple source types, such as relational and flat file. When you configure a
session with heterogeneous sources, configure each source instance separately.

Globalization Features
You can choose a code page that you want the PowerCenter Server to use for relational sources
and flat files. You specify code pages for relational sources when you configure database
connections in the Workflow Manager. You can set the code page for file sources in the session
properties. For more information about code pages, see Globalization Overview in the
Installation and Configuration Guide.

Source Connections
Before you can extract data from a source, you must configure the connection properties the
PowerCenter Server uses to connect to the source file or database. You can configure source
database and FTP connections in the Workflow Manager.
For more information on creating database connections, see Configuring the Workflow
Manager on page 37. For more information on creating FTP connections, see Using FTP
on page 559.

Permissions and Privileges


You must have read permissions for the connections you use in the session. For example, if the
source requires database connections or FTP connections, you must have permission to read
those connections in the session.

Allocating Buffer Memory


When the PowerCenter Server initializes a session, it allocates blocks of memory to hold
source and target data. The PowerCenter Server allocates at least two blocks for each source
and target partition. Sessions that use a large number of sources or targets might require
additional memory blocks. If the PowerCenter Server cannot allocate enough memory blocks
to hold the data, it fails the session.
For more information on allocating buffer memory, see Optimizing the Session on
page 655.

Partitioning Sources
You can create multiple partitions for relational, Application, and file sources. For relational
or Application sources, the PowerCenter Server creates a separate connection to the source
database for each partition you set in the session properties. For file sources, you can
configure the session to read the source with one thread or multiple threads.
For more information on partitioning data, see Pipeline Partitioning on page 345.


Configuring Sources in a Session


Configure source properties for sessions in the Sources node of the Mapping tab of the session
properties. When you configure source properties for a session, you define properties for each
source instance in the mapping.
Figure 8-1 shows the Sources node on the Mapping tab:
Figure 8-1. Sources Node of the Session Properties

The Sources node lists the sources used in the session and displays their settings. To view and
configure settings for a source, select the source from the list. You can configure the following
settings for a source:

Readers

Connections

Properties

Configuring Readers
You can click the Readers settings on the Sources node to view the reader the PowerCenter
Server uses with each source instance. The Workflow Manager specifies the necessary reader
for each source instance in the Readers settings on the Sources node.


Figure 8-2 shows the Readers settings in the Sources node of the Mapping tab:
Figure 8-2. Readers Settings in the Sources Node of the Mapping Tab

Configuring Connections
Click the Connections settings on the Sources node to define source connection information.


Figure 8-3 shows the Connections settings in the Sources node of the Mapping tab:
Figure 8-3. Connections Settings in the Sources Node


For relational sources, choose a configured database connection in the Value column for each
relational source instance. By default, the Workflow Manager displays the source type for
relational sources. For details on configuring database connections, see Selecting the Source
Database Connection on page 214.
For flat file and XML sources, choose one of the following source connection types in the
Type column for each source instance:

FTP. If you want to read data from a flat file or XML source using FTP, you must specify
an FTP connection when you configure source options. You must define the FTP
connection in the Workflow Manager prior to configuring the session.
You must have read permission for any FTP connection you want to associate with the
session. The user starting the session must have execute permission for any FTP
connection associated with the session. For details on using FTP, see Using FTP on
page 559.

None. Choose None when you want to read from a local flat file or XML file.

Configuring Properties
Click the Properties settings in the Sources node to define source property information. The
Workflow Manager displays properties, such as source file name and location for flat file,
COBOL, and XML source file types. You do not need to define any properties on the
Properties settings for relational sources.
Figure 8-4 shows the Properties settings in the Sources node of the Mapping tab:
Figure 8-4. Properties Settings in the Sources Node of the Mapping Tab

For more information on configuring sessions with relational sources, see Working with
Relational Sources on page 214. For more information on configuring sessions with flat file
sources, see Working with File Sources on page 218. For more information on configuring
sessions with XML sources, see the XML User Guide.


Working with Relational Sources


When you configure a session to read data from a relational source, you can configure the
following properties for sources:

Source database connection. Select the database connection for each relational source. For
more information, see Selecting the Source Database Connection on page 214.

Treat source rows as. Define how the PowerCenter Server treats each source row as it reads
it from the source table. For more information, see Defining the Treat Source Rows As
Property on page 214.

Table owner name. Define the table owner name for each relational source. For more
information, see Configuring the Table Owner Name on page 216.

Override SQL query. You can override the default SQL query to extract source data. For
more information, see Overriding the SQL Query on page 216.

Selecting the Source Database Connection


Before you can run a session to read data from a source database, the PowerCenter Server
must connect to the source database. Database connections must exist in the repository to
appear on the source database list. You must define them prior to configuring a session. For
details on configuring a database connection, see Setting Up a Relational Database
Connection on page 53.
On the Connections settings in the Sources node, select the database connection from the list.
You must have read permission for the source database connection to configure the session to
use it. The user starting the configured session must have execute permission for source
database connections.

Defining the Treat Source Rows As Property


When the PowerCenter Server reads a source, it marks each row with an indicator to specify
which operation to perform when the row reaches the target. You can define how the
PowerCenter Server marks each row using the Treat Source Rows As property in the General
Options settings on the Properties tab.


Figure 8-5 shows the Treat Source Rows As property on the General Options settings:
Figure 8-5. Treat Source Rows As Property


Table 8-1 describes the options you can choose for the Treat Source Rows As property:
Table 8-1. Treat Source Rows As Options

Insert. The PowerCenter Server marks all rows to insert into the target.

Delete. The PowerCenter Server marks all rows to delete from the target.

Update. The PowerCenter Server marks all rows to update the target. You can further define the update operation in the target options. For more information, see Target Properties on page 241.

Data Driven. The PowerCenter Server uses the Update Strategy transformations in the mapping to determine the operation on a row-by-row basis. You define the update operation in the target options. If the mapping contains an Update Strategy transformation, this option defaults to Data Driven. You can also use this option when the mapping contains Custom transformations configured to set the update strategy.

Once you determine how to treat all rows in the session, you also need to set update strategy
options for individual targets. For more information on setting the target update strategy
options, see Target Properties on page 241.
For more information on setting the update strategy for a session, see Update Strategy
Transformation in the Transformation Guide.

Configuring the Table Owner Name


You can define the owner name of the source table in the session properties. For some
databases such as DB2, tables can have different owners. If the database user specified in the
database connection is not the owner of the source tables in a session, specify the table owner
for each source instance. A session can fail if the database user is not the owner and you do
not specify the table owner name.
Specify the table owner name in the Owner Name field in the Properties settings in the
Sources node.
Figure 8-6 shows the Properties settings where you define the table owner name for relational
sources:
Figure 8-6. Source Table Owner Name Property

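For example, if you enter a hypothetical owner name HR for a source instance that reads an EMPLOYEES table, the PowerCenter Server qualifies the table name with the owner in the SQL it issues, along the following lines. This is an illustrative sketch only; the exact query the server generates depends on the mapping and the source database:
SELECT EMPLOYEES.EMPLOYEE_ID, EMPLOYEES.LAST_NAME
FROM HR.EMPLOYEES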

Overriding the SQL Query


You can alter or override the default query in the mapping by entering SQL override in the
Properties settings in the Sources node. You can enter any SQL statement supported by the
source database.
The Workflow Manager does not validate the SQL override. The following errors could cause
the session to fail, and possibly cause data errors:


Fields with incompatible datatypes or unknown fields

Typing mistakes or other errors


Figure 8-7 shows the Properties settings in the Sources node where you can override the SQL
query:
Figure 8-7. SQL Query Override Property in the Session Properties


To override the default query for a relational source:

1. In the Workflow Manager, open the session properties.

2. Click the Mapping tab and open the Transformations view.

3. Click the Sources node and open the Properties settings.

4. Click the Open button in the SQL Query field to open the SQL Editor.

5. Enter the SQL override.

6. Click OK to return to the session properties.
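For example, to read only a subset of rows from a hypothetical CUSTOMERS source table, you might enter an override such as the following. The table, column, and filter values here are illustrative only, and the query you enter must return the columns that the Source Qualifier transformation expects:
SELECT CUSTOMER_ID, LAST_NAME, FIRST_NAME
FROM CUSTOMERS
WHERE REGION = 'WEST'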


Working with File Sources


You can create a session to extract data from flat file or COBOL sources. When you create a
session to read data from a flat file or COBOL file, you can configure the following
information in the session properties:

Source properties. You can define source properties on the Properties settings in the
Sources node, such as source file options. For more information, see Configuring Source
Properties on page 218.

Flat file properties. You can edit fixed-width and delimited source file properties. For
more information, see Configuring Fixed-Width File Properties on page 220 and
Configuring Delimited File Properties on page 222.

Line sequential buffer length. You can change the buffer length for flat files on the
Advanced settings on the Config Object tab. For more information, see Configuring Line
Sequential Buffer Length on page 225.

Treat source rows as. Define how the PowerCenter Server treats each source row as it reads
it from the source. For more information, see Defining the Treat Source Rows As
Property on page 214.

Configuring Source Properties


You can define session source properties on the Properties settings in the Sources node.


Figure 8-8 shows the flat file source properties you define in the Properties settings of the
Sources node on the Mapping tab:
Figure 8-8. Properties Settings in the Sources Node for a Flat File Source


Table 8-2 describes the properties you define on the Properties settings for flat file source
definitions:
Table 8-2. Flat File Source Properties

Source File Directory (Optional). Enter the directory name in this field. By default, the PowerCenter Server looks in the server variable directory, $PMSourceFileDir, for file sources. If you specify both the directory and file name in the Source Filename field, clear this field. The PowerCenter Server concatenates this field with the Source Filename field when it runs the session. You can also use the $InputFileName session parameter to specify the file directory. For details on session parameters, see Session Parameters on page 495.

Source Filename (Required). Enter the file name, or file name and path. Optionally use the $InputFileName session parameter for the file name. The PowerCenter Server concatenates this field with the Source File Directory field when it runs the session. For example, if you have C:\data\ in the Source File Directory field, then enter filename.dat in the Source Filename field. When the PowerCenter Server begins the session, it looks for C:\data\filename.dat. By default, the Workflow Manager enters the file name configured in the source definition. For details on session parameters, see Session Parameters on page 495.

Source Filetype (Required). Allows you to configure multiple file sources using a file list. Indicates whether the source file contains the source data, or whether it contains a list of files with the exact same file properties. Choose Direct if the source file contains the source data. Choose Indirect if the source file contains a list of files. When you select Indirect, the PowerCenter Server finds the file list and reads each listed file when it runs the session. For details on file lists, see Using a File List on page 230.

Set File Properties link (Optional). Opens a dialog box that allows you to override source file properties. By default, the Workflow Manager displays file properties as configured in the source definition. For more information, see Configuring Fixed-Width File Properties on page 220 and Configuring Delimited File Properties on page 222.

Configuring Fixed-Width File Properties


When you read data from a fixed-width file, you can edit file properties in the session, such as
the null character or code page. You can configure fixed-width properties for non-reusable
sessions in the Workflow Designer and for reusable sessions in the Task Developer. You
cannot configure fixed-width properties for instances of reusable sessions in the Workflow
Designer.
Click Set File Properties to open the Flat Files dialog box.


Figure 8-9 shows the Flat Files dialog box:


Figure 8-9. Flat Files Dialog Box

To edit the fixed-width properties, select Fixed Width and click Advanced. The Fixed-Width
Properties dialog box appears. By default, the Workflow Manager displays file properties as
configured in the mapping. Edit these settings to override those configured in the source
definition.
Figure 8-10 shows the Fixed-Width Properties dialog box:
Figure 8-10. Fixed-Width File Properties Dialog Box


Table 8-3 describes options you can define in the Fixed Width Properties dialog box for file
sources:
Table 8-3. Fixed-Width File Properties for File Sources

Text/Binary (Required). Indicates the character representing a null value in the file. This can be any valid character in the file code page, or any binary value from 0 to 255. For more information about specifying null characters, see Null Character Handling on page 227.

Repeat Null Character (Optional). If selected, the PowerCenter Server reads repeat null characters in a single field as a single null value. If you do not select this option, the PowerCenter Server reads a single null character at the beginning of a field as a null field. Important: For multibyte code pages, Informatica recommends that you specify a single-byte null character if you are using repeating non-binary null characters. This ensures that repeating null characters fit into the column exactly. For more information about specifying null characters, see Null Character Handling on page 227.

Code Page (Required). Select the code page of the fixed-width file. The default setting is the client code page.

Number of Initial Rows to Skip (Optional). The PowerCenter Server skips the specified number of rows before reading the file. Use this to skip header rows. One row may contain multiple records. If you select the Line Sequential File Format option, the PowerCenter Server ignores this option.

Number of Bytes to Skip Between Records (Optional). The PowerCenter Server skips the specified number of bytes between records. For example, you have an ASCII file on Windows with one record on each line, and a carriage return and line feed appear at the end of each line. If you want the PowerCenter Server to skip these two single-byte characters, enter 2. If you have an ASCII file on UNIX with one record for each line, ending in a carriage return, skip the single character by entering 1.

Strip Trailing Blanks (Optional). If selected, the PowerCenter Server strips trailing blank spaces from records before passing them to the Source Qualifier transformation.

Line Sequential File Format (Optional). Select this option if the file uses a carriage return at the end of each record, shortening the final column.

Configuring Delimited File Properties


When you read data from a delimited file, you can edit file properties in the session, such as
the delimiter or code page. You can configure delimited properties for non-reusable sessions
in the Workflow Designer and for reusable sessions in the Task Developer. You cannot
configure delimited properties for instances of reusable sessions in the Workflow Designer.
Click Set File Properties to open the Flat Files dialog box.


Figure 8-11 shows the Flat Files dialog box:


Figure 8-11. Flat Files Dialog Box

To edit the delimited properties, select Delimited and click Advanced. The Delimited File
Properties dialog box appears. By default, the Workflow Manager displays file properties as
configured in the mapping. Edit these settings to override those configured in the source
definition.
Figure 8-12 shows the Delimited File Properties dialog box:
Figure 8-12. Delimited File Properties Dialog Box


Table 8-4 describes options you can define in the Delimited File Properties dialog box for file
sources:
Table 8-4. Delimited File Properties for File Sources

Delimiters (Required). Character used to separate columns of data in the source file. Use the button to the right of this field to enter a different delimiter. Delimiters can be either printable or single-byte unprintable characters, and must be different from the escape character and the quote character (if selected). You cannot select unprintable multibyte characters as delimiters. The delimiter must be in the same code page as the flat file code page.

Treat Consecutive Delimiters as One (Optional). By default, the PowerCenter Server reads pairs of delimiters as a null value. If selected, the PowerCenter Server reads any number of consecutive delimiter characters as one. For example, a source file uses a comma as the delimiter character and contains the following record: 56, , , Jane Doe. By default, the PowerCenter Server reads that record as four columns separated by three delimiters: 56, NULL, NULL, Jane Doe. If you select this option, the PowerCenter Server reads the record as two columns separated by one delimiter: 56, Jane Doe.

Optional Quotes (Required). Select No Quotes, Single Quote, or Double Quotes. If you select a quote character, the PowerCenter Server ignores delimiter characters within the quote characters. Therefore, the PowerCenter Server uses quote characters to escape the delimiter. For example, a source file uses a comma as a delimiter and contains the following row: 342-3849, 'Smith, Jenna', 'Rockville, MD', 6. If you select the optional single quote character, the PowerCenter Server ignores the commas within the quotes and reads the row as four fields. If you do not select the optional single quote, the PowerCenter Server reads six separate fields. When the PowerCenter Server reads two optional quote characters within a quoted string, it treats them as one quote character. For example, the PowerCenter Server reads the following quoted string as I'm going tomorrow: 2353, 'I''m going tomorrow.', MD. Additionally, if you select an optional quote character, the PowerCenter Server only reads a string as a quoted string if the quote character is the first character of the field. Note: You can improve session performance if the source file does not contain quotes or escape characters.

Code Page (Required). Select the code page of the delimited file. The default setting is the client code page.

Escape Character (Optional). Character immediately preceding a delimiter character embedded in an unquoted string, or immediately preceding the quote character in a quoted string. When you specify an escape character, the PowerCenter Server reads the delimiter character as a regular character (called escaping the delimiter or quote character). Note: You can improve session performance for mappings containing Sequence Generator transformations if the source file does not contain quotes or escape characters.

Remove Escape Character From Data (Optional). This option is selected by default. Clear this option to include the escape character in the output string.

Number of Initial Rows to Skip (Optional). The PowerCenter Server skips the specified number of rows before reading the file. Use this to skip title or header rows in the file.

Configuring Line Sequential Buffer Length


You can configure the line buffer length for file sources. By default, the PowerCenter Server
reads a file record into a buffer that holds 1024 bytes. If the source file records are larger than
1024 bytes, increase the Line Sequential Buffer Length property in the session properties
accordingly.
Figure 8-13 shows the Advanced settings on the Config Object tab in the session properties
where you define the line buffer length:
Figure 8-13. Line Sequential Buffer Length Property for File Sources



Server Handling for File Sources


When you configure a session with file sources, take the following additional features into
account when creating mappings with file sources:

Character set

Multibyte character error handling

Null character handling

Row length handling for fixed-width flat files

Numeric data handling

Tab handling

Character Set
You can configure the PowerCenter Server to run sessions in either ASCII or Unicode data
movement mode.
Table 8-5 describes source file formats supported by each data movement mode in
PowerCenter:
Table 8-5. Support for ASCII and Unicode Data Movement Modes

7-bit ASCII. Unicode mode: supported. ASCII mode: supported.

US-EBCDIC (COBOL sources only). Unicode mode: supported. ASCII mode: supported.

8-bit ASCII. Unicode mode: supported. ASCII mode: supported.

8-bit EBCDIC (COBOL sources only). Unicode mode: supported. ASCII mode: supported.

ASCII-based MBCS. Unicode mode: supported. ASCII mode: the PowerCenter Server generates a warning message.

EBCDIC-based MBCS. Unicode mode: supported. ASCII mode: not supported; the PowerCenter Server terminates the session.

If you configure a session to run in ASCII data movement mode, delimiters, escape
characters, and null characters must be valid in the ISO Western European Latin 1 code page.
Any 8-bit characters you specified in previous versions of PowerCenter are still valid. In
Unicode data movement mode, delimiters, escape characters, and null characters must be
valid in the specified code page of the flat file.
For more information about configuring and working with data movement modes, see
Globalization Overview in the Installation and Configuration Guide.


Multibyte Character Error Handling


Misalignment of multibyte data in a file causes session errors. Data becomes misaligned when
you place column breaks incorrectly in a file, resulting in multibyte characters that extend
beyond the last byte in a column.
When you import a fixed-width flat file, you can create, move, or delete column breaks using
the Flat File Wizard. Incorrect positioning of column breaks can create alignment errors when
you run a session containing multibyte characters.
The PowerCenter Server handles alignment errors in fixed-width flat files according to the
following guidelines:

Non-line sequential file. The PowerCenter Server skips rows containing misaligned data
and resumes reading the next row. The skipped row appears in the session log with a
corresponding error message. If an alignment error occurs at the end of a row, the
PowerCenter Server skips both the current row and the next row, and writes them to the
session log.

Line sequential file. The PowerCenter Server skips rows containing misaligned data and
resumes reading the next row. The skipped row appears in the session log with a
corresponding error message.

Reader error threshold. You can configure a session to stop after a specified number of
non-fatal errors. A row containing an alignment error increases the error count by 1. The
session stops if the number of rows containing errors reaches the threshold set in the
session properties. Errors and corresponding error messages appear in the session log file.

Fixed-width COBOL sources are always byte-oriented and can be line sequential. The
PowerCenter Server handles COBOL files according to the following guidelines:

Line sequential files. The PowerCenter Server skips rows containing misaligned data and
writes the skipped rows to the session log. The session stops if the number of error rows
reaches the error threshold.

Non-line sequential files. The session stops at the first row containing misaligned data.

Null Character Handling


You can specify single-byte or multibyte null characters for fixed-width flat files. The
PowerCenter Server uses these characters to determine if a column is null.


Table 8-6 describes how the PowerCenter Server uses the Null Character and Repeat Null
Character properties to determine if a column is null:
Table 8-6. Null Character Handling

Binary null character, Repeat Null Character disabled. A column is null if the first byte in the column is the binary null character. The PowerCenter Server reads the rest of the column as text data only to determine the column alignment and track the shift state for shift sensitive code pages. If data in the column is misaligned, the PowerCenter Server skips the row and writes the skipped row and a corresponding error message to the session log.

Non-binary null character, Repeat Null Character disabled. A column is null if the first character in the column is the null character. The PowerCenter Server reads the rest of the column only to determine the column alignment and track the shift state for shift sensitive code pages. If data in the column is misaligned, the PowerCenter Server skips the row and writes the skipped row and a corresponding error message to the session log.

Binary null character, Repeat Null Character enabled. A column is null if it contains only the specified binary null character. The next column inherits the initial shift state of the code page.

Non-binary null character, Repeat Null Character enabled. A column is null if the repeating null character fits into the column exactly, with no bytes leftover. For example, a five-byte column is not null if you specify a two-byte repeating null character. In shift-sensitive code pages, shift bytes do not affect the null value of a column. A column is still null if it contains a shift byte at the beginning or end of the column. Informatica recommends you specify a single-byte null character if you use repeating non-binary null characters. This ensures that repeating null characters fit into a column exactly.

Row Length Handling for Fixed-Width Flat Files


For fixed-width flat files, data in a row can be shorter than the row length in the following
situations:

The file is fixed-width line-sequential with a carriage return or line feed that appears
sooner than expected.

The file is fixed-width non-line sequential, and the last line in the file is shorter than
expected.

In these cases, the PowerCenter Server reads the data but does not append any blanks to fill
the remaining bytes. The PowerCenter Server reads subsequent fields as NULL. Fields
containing repeating null characters that do not fill the entire field length are not considered
NULL.


Numeric Data Handling


Sometimes, file sources contain non-numeric data in numeric columns. When the
PowerCenter Server reads non-numeric data, it treats the row differently, depending on the
source type. When the PowerCenter Server reads non-numeric data from numeric columns in
a flat file source or an XML source, it drops the row and writes the row to the session log.
When the PowerCenter Server reads non-numeric data for numeric columns in a COBOL
source, it reads a null value for the column.


Using a File List


You can create a session to run multiple source files for one source instance in the mapping.
You might use this feature if, for example, your company collects data at several locations
which you then want to move through the same session. When you create a mapping to use
multiple source files for one source instance, the properties of all files must exactly match the
source definition.
To use multiple source files, you create a file containing the names and directories of each
source file you want the PowerCenter Server to use. This file is referred to as a file list.
When you configure the session properties, enter the file name of the file list in the Source
Filename field and enter the location of the file list in the Source File Directory field. When
the session starts, the PowerCenter Server reads the file list, then locates and reads the first file
source in the list. After the PowerCenter Server reads the first file, it locates and reads the next
file in the list.
The PowerCenter Server writes the path and name of the file list to the session log. If the
PowerCenter Server encounters an error while accessing a source file, it logs the error in the
session log and stops the session.
Note: When you use a file list and the session performs incremental aggregation, the

PowerCenter Server performs incremental aggregation across all listed source files.

Creating the File List


The file list contains the names of all the source files you want the PowerCenter Server to use
for the source instance in the session. Create the file list in an editor appropriate to the
PowerCenter Server platform and save it as a text file. For example, you can create a file list
for a PowerCenter Server on Windows with any text editor then save it as ASCII.
The PowerCenter Server interprets the file list using the PowerCenter Server code page. Each
file in the list must use the user-defined code page configured in the source definition. This
code page must be a subset of the repository code page.
Each file in the file list must share the same file properties as configured in the source
definition or as entered for the source instance in the session property sheet. You can enter
different paths for each file in the list, but for the session to complete successfully, the paths
must be local to the PowerCenter Server machine. Map the drives on a PowerCenter Server on
Windows or mount the drives on a PowerCenter Server on UNIX, as necessary. If you do not
specify a path for a file, the PowerCenter Server assumes the file is in the same directory as the
file list.
The file list must meet the following format guidelines:

Text file

One file name, or path and file name, for each line

The PowerCenter Server skips blank lines and ignores leading blank spaces. Any characters
indicating a new line, such as \n in ASCII files, must be valid in the code page of the
PowerCenter Server.
The following example shows a valid file list created for a PowerCenter Server on Windows.
Each of the drives listed is mapped on the server machine. The western_trans.dat file is
located in the same directory as the file list.
western_trans.dat
d:\data\eastern_trans.dat
e:\data\midwest_trans.dat
f:\data\canada_trans.dat

Once you create the file list, place it in a directory local to the PowerCenter Server.

Configuring a Session to Use a File List


After you create a file list for multiple source files, you can configure the session to access
those files.
To use multiple source files for one source instance in a session:

1. In the Workflow Manager, open the session properties.

2. Click the Mapping tab and open the Transformations view.

3. Click the Properties settings in the Sources node.

4. In the Source Filetype field, choose Indirect.

5. In the Source Filename field, replace the file name with the name of the file list.
If necessary, also enter the path in the Source File Directory field.
If you enter only a file name in the Source Filename field, and you have specified a path in the Source File Directory field, the PowerCenter Server looks for the named file in the listed directory.
If you enter only a file name in the Source Filename field, and you do not specify a path in the Source File Directory field, the PowerCenter Server looks for the named file in the directory where the PowerCenter Server is installed on UNIX or in the system directory on Windows.

6. Click OK.


Chapter 9

Working with Targets


This chapter covers the following topics:

Overview, 234

Configuring Targets in a Session, 236

Working with Relational Targets, 240

Working with Target Connection Groups, 257

Working with Active Sources, 259

Working with File Targets, 261

Server Handling for File Targets, 268

Working with Heterogeneous Targets, 274


Overview
In the Workflow Manager, you can create sessions with the following targets:

Relational. You can load data to any relational database that the PowerCenter Server can
connect to. When loading data to relational targets, you must configure the database
connection to the target before you configure the session.

File. You can load data to a flat file or XML target. The PowerCenter Server can load data
to any local directory or FTP connection for the target file. If the file target requires an
FTP connection, you need to configure the FTP connection to the host machine before
you create the session.

Heterogeneous. You can output data to multiple targets in the same session. You can
output to multiple relational targets, such as Oracle and Microsoft SQL Server. Or, you
can output to multiple target types, such as relational and flat file. For more information,
see Working with Heterogeneous Targets on page 274.

Globalization Features
You can configure the PowerCenter Server to run sessions in either ASCII or Unicode data
movement mode.
Table 9-1 describes target character sets supported by each data movement mode in
PowerCenter:
Table 9-1. Support for ASCII and Unicode Data Movement Modes

7-bit ASCII. Unicode mode: supported. ASCII mode: supported.

8-bit ASCII. Unicode mode: supported. ASCII mode: supported.

ASCII-based MBCS. Unicode mode: supported. ASCII mode: the PowerCenter Server generates a warning message, but does not terminate the session.

UTF-8. Unicode mode: supported (targets only). ASCII mode: the PowerCenter Server generates a warning message, but does not terminate the session.

PowerCenter allows you to work with targets that use multibyte character sets. You can choose
a code page that you want the PowerCenter Server to use for relational objects and flat files.
You specify code pages for relational objects when you configure database connections in the
Workflow Manager. The code page for a database connection used as a target must be a
superset of the repository code page.
When you change the database connection code page to one that is not two-way compatible
with the old code page, the Workflow Manager generates a warning and invalidates all
sessions that use that database connection.


Code pages you select for a file represent the code page of the data contained in these files. If
you are working with flat files, you can also specify delimiters and null characters supported
by the code page you have specified for the file.
Target code pages must be a superset of the repository code page. They must also be a superset
of the source code page and the PowerCenter Server code page.
However, if you configure the PowerCenter Server and Client for relaxed code page
validation, you can select any code page supported by PowerCenter for the target database
connection. When using relaxed code page validation, select compatible code pages for the
source and target data to prevent data inconsistencies. For more information about code page
compatibility, see Globalization Overview in the Installation and Configuration Guide.
If the target contains multibyte character data, configure the PowerCenter Server to run in
Unicode mode. When the PowerCenter Server runs a session in Unicode mode, it uses the
database code page to translate data.
If the target contains only single-byte characters, configure the PowerCenter Server to run in
ASCII mode. When the PowerCenter Server runs a session in ASCII mode, it does not
validate code pages.

Target Connections
Before you can load data to a target, you must configure the connection properties the
PowerCenter Server uses to connect to the target file or database. You can configure target
database and FTP connections in the Workflow Manager.
For details on creating database connections, see Setting Up a Relational Database
Connection on page 53. For details on creating FTP connections, see Using FTP on
page 559.

Partitioning Targets
When you create multiple partitions in a session with a relational target, the PowerCenter
Server creates multiple connections to the target database to write target data concurrently.
When you create multiple partitions in a session with a file target, the PowerCenter Server
creates one target file for each partition. You can configure the session properties to merge
these target files.
For details on configuring a session for pipeline partitioning, see Pipeline Partitioning on
page 345.

Permissions and Privileges


You must have execute permissions for connection objects associated with the session. For
example, if the target requires database connections or FTP connections, you must have read
permission on the connections to configure the session, and execute permission to run the
session.


Configuring Targets in a Session


Configure target properties for sessions in the Transformations view on the Mapping tab of the
session properties. Click the Targets node to view the target properties. When you configure
target properties for a session, you define properties for each target instance in the mapping.
Figure 9-1 shows where you define target properties in a session:
Figure 9-1. Defining Target Properties in the Session Properties


The Targets node contains the following settings where you define properties:

Writers

Connections

Properties

Configuring Writers
Click the Writers settings in the Transformations view to define the writer to use with each
target instance.


Figure 9-2 shows where you define the writer to use with each target instance:
Figure 9-2. Writers Settings on the Mapping Tab of the Session Properties


When the mapping target is a flat file, an XML file, an SAP BW target, or an IBM MQSeries
target, the Workflow Manager specifies the necessary writer in the session properties.
However, when the target in the mapping is relational, you can change the writer type to File
Writer if you plan to use an external loader.
Note: You can change the writer type for non-reusable sessions in the Workflow Designer and

for reusable sessions in the Task Developer. You cannot change the writer type for instances of
reusable sessions in the Workflow Designer.
When you override a relational target to use the file writer, the Workflow Manager changes
the properties for that target instance on the Properties settings. It also changes the
connection options you can define in the Connections settings.
After you override a relational target to use a file writer, define the file properties for the
target. Click Set File Properties and choose the target to define. For more information, see
Configuring Fixed-Width Properties on page 265 and Configuring Delimited Properties
on page 266.

Configuring Connections
View the Connections settings on the Mapping tab to define target connection information.


Figure 9-3 shows the Connections settings on the Mapping tab of the session properties:
Figure 9-3. Connections Settings on the Mapping Tab of the Session Properties


For relational targets, the Workflow Manager displays Relational as the target type by default.
In the Value column, choose a configured database connection for each relational target
instance. For details on configuring database connections, see Target Database Connection
on page 241.
For flat file and XML targets, choose one of the following target connection types in the Type
column for each target instance:

FTP. If you want to load data to a flat file or XML target using FTP, you must specify an
FTP connection when you configure target options. FTP connections must be defined in
the Workflow Manager prior to configuring sessions.
You must have read permission for any FTP connection you want to associate with the
session. The user starting the session must have execute permission for any FTP
connection associated with the session. For details on using FTP, see Using FTP on
page 559.

Loader. You can use the external loader option to improve the load speed to Oracle, DB2,
Sybase IQ, or Teradata target databases.
To use this option, you must use a mapping with a relational target definition and choose
File as the writer type on the Writers settings for the relational target instance. The
PowerCenter Server uses an external loader to load target files to the Oracle, DB2, Sybase
IQ, or Teradata database. You cannot choose external loader if the target is defined in the
mapping as a flat file, XML, MQ, or SAP BW target.
For details on using the external loader feature, see External Loading on page 523.

Queue. Choose Queue when you want to output to an IBM MQSeries message queue. For
details, see the PowerCenter Connect for IBM MQSeries User and Administrator Guide.

None. Choose None when you want to write to a local flat file or XML file.

Configuring Properties
View the Properties settings on the Mapping tab to define target property information. The
Workflow Manager displays different properties for the different target types: relational, flat
file, and XML.
Figure 9-4 shows the Properties settings on the Mapping tab:
Figure 9-4. Properties Settings on the Mapping Tab of the Session Properties


For more information on relational target properties, see Working with Relational Targets
on page 240. For more information on flat file target properties, see Working with File
Targets on page 261. For more information on XML target properties, see Working with
Heterogeneous Targets on page 274.
For more information on configuring sessions with multiple target types, see Working with
Heterogeneous Targets on page 274.


Working with Relational Targets


When you configure a session to load data to a relational target, you define most properties in
the Transformations view on the Mapping tab. You also define some properties on the
Properties tab and the Config Object tab.
You can configure the following properties for relational targets:

Target database connection. Define database connection information. For more


information, see Target Database Connection on page 241.

Target properties. You can define target properties such as target load type, target update
options, and reject options. For more information, see Target Properties on page 241.

Truncate target tables. The PowerCenter Server can truncate target tables before loading
data. For more information, see Truncating Target Tables on page 245.

Deadlock retry. You can configure the session to retry deadlocks when writing to targets.
For more information, see Deadlock Retry on page 246.

Drop and recreate indexes. Use pre- and post-session SQL to drop and recreate an index
on a relational target table to optimize query speed. For more information, see Dropping
and Recreating Indexes on page 248.

Constraint-based loading. The PowerCenter Server can load data to targets based on
primary key-foreign key constraints and active sources in the session mapping. For more
information, see Constraint-Based Loading on page 248.

Bulk loading. You can specify bulk mode when loading to DB2, Microsoft SQL Server,
Oracle, and Sybase databases. For more information, see Bulk Loading on page 252.

You can define the following properties in the session and override the properties you define
in the mapping:

Table name prefix. You can specify the target owner name or prefix in the session
properties to override the table name prefix in the mapping. For more information, see
Table Name Prefix on page 254.

Pre-session SQL. You can create SQL commands and execute them in the target database
before loading data to the target. For example, you might want to drop the index for the
target table before loading data into it. For more information, see Using Pre- and Post-Session SQL Commands on page 186.

Post-session SQL. You can create SQL commands and execute them in the target database
after loading data to the target. For example, you might want to recreate the index for the
target table after loading data into it. For more information, see Using Pre- and Post-Session SQL Commands on page 186.

If any target table or column name contains a database reserved word, you can create and
maintain a reserved words file containing database reserved words. When the PowerCenter
Server executes SQL against the database, it places quotes around the reserved words. For
more information, see Reserved Words on page 255.
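For example, if a target table were named ORDER and contained a column named DATE, both reserved words on many databases, the PowerCenter Server would place quotes around those names in the SQL it issues, roughly as follows. The table and column names here are hypothetical, and the quoting character depends on the target database:
INSERT INTO "ORDER" ("DATE", AMOUNT)
VALUES (?, ?)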
When the PowerCenter Server runs a session with at least one relational target, it performs
database transactions per target connection group. For example, it commits all data to targets
in a target connection group at the same time. For more information, see Working with
Target Connection Groups on page 257.

Target Database Connection


Before you can run a session to load data to a target database, the PowerCenter Server must
connect to the target database. Database connections must exist in the repository to appear on
the target database list. You must define them prior to configuring a session. For details on
configuring a database connection, see Configuring the Workflow Manager on page 37.
You can choose the target connections in the Transformations view of the Mapping tab. Click
either the Targets or Connections node and select the database connection from the list for
each target instance. You must have read permission for the target database connection to
configure the session to use it. The user starting the configured session must have execute
permission for target database connections.

Target Properties
You can configure session properties for relational targets in the Transformations view on the
Mapping tab, and in the General Options settings on the Properties tab. Define the properties
for each target instance in the session.
When you click the Transformations view on the Mapping tab, you can view and configure
the settings of a specific target. Select the target under the Targets node.


Figure 9-5 shows the relational target properties you define in the Properties settings on the
Mapping tab:
Figure 9-5. Properties Settings on the Mapping Tab for a Relational Target


Table 9-2 describes the properties available in the Properties settings on the Mapping tab of
the session properties:
Table 9-2. Relational Target Properties

Target Load Type (Required). You can choose Normal or Bulk. If you select Normal, the PowerCenter Server loads targets normally. You can only choose Bulk when you load to Sybase, Oracle, or Microsoft SQL Server. If you specify Bulk for other database types, the PowerCenter Server reverts to a normal load. Note: Choose Normal mode if the mapping contains an Update Strategy transformation. For more information, see Bulk Loading on page 252.

Insert* (Optional). If selected, the PowerCenter Server inserts all rows flagged for insert. By default, this option is selected.

Update (as Update)* (Optional). If selected, the PowerCenter Server updates all rows flagged for update. By default, this option is selected.

Update (as Insert)* (Optional). If selected, the PowerCenter Server inserts all rows flagged for update. By default, this option is not selected.

Update (else Insert)* (Optional). If selected, the PowerCenter Server updates rows flagged for update if they exist in the target, then inserts any remaining rows marked for insert. By default, this option is not selected.

Delete* (Optional). If selected, the PowerCenter Server deletes all rows flagged for delete. By default, this option is selected.

Truncate Table (Optional). If selected, the PowerCenter Server truncates the target before loading. By default, this option is not selected. For details on this feature, see Truncating Target Tables on page 245.

Reject File Directory (Optional). Enter the directory name in this field. By default, the PowerCenter Server writes all reject files to the server variable directory, $PMBadFileDir. If you specify both the directory and file name in the Reject Filename field, clear this field. The PowerCenter Server concatenates this field with the Reject Filename field when it runs the session. You can also use the $BadFileName session parameter to specify the file directory. For details on session parameters, see Session Parameters on page 495.

Reject Filename (Required). Enter the file name, or file name and path. By default, the PowerCenter Server names the reject file after the target instance name: target_name.bad. Optionally use the $BadFileName session parameter for the file name. The PowerCenter Server concatenates this field with the Reject File Directory field when it runs the session. For example, if you have C:\reject_file\ in the Reject File Directory field, and enter filename.bad in the Reject Filename field, the PowerCenter Server writes rejected rows to C:\reject_file\filename.bad. For details on session parameters, see Session Parameters on page 495.

*For details on target update strategies, see Update Strategy Transformation in the Transformation Guide.


Figure 9-6 shows the test load options in the General Options settings on the Properties tab:
Figure 9-6. Test Load Options


Table 9-3 describes the test load options on the General Options settings on the Properties
tab:
Table 9-3. Test Load Options

Enable Test Load (Optional). You can configure the PowerCenter Server to perform a test load. With a test load, the PowerCenter Server reads and transforms data without writing to targets. The PowerCenter Server generates all session files, and performs all pre- and post-session functions, as if running the full session. The PowerCenter Server writes data to relational targets, but rolls back the data when the session completes. For all other target types, such as flat file and SAP BW, the PowerCenter Server does not write data to the targets. Enter the number of source rows you want to test in the Number of Rows to Test field. You cannot perform a test load on sessions using XML sources. Note: You can perform a test load for relational targets when you configure a session for normal mode. If you configure the session for bulk mode, the session fails.

Number of Rows to Test (Optional). Enter the number of source rows you want the PowerCenter Server to test load. The PowerCenter Server reads the exact number you configure for the test load.

Truncating Target Tables


The PowerCenter Server can truncate target tables before running a session. You can choose
to truncate tables on a target-by-target basis. If you have more than one target instance, you
only have to select the truncate target table option for one target instance.
Depending on the target database and primary key-foreign key relationships in the session
target, the PowerCenter Server might issue a delete or truncate command.
Table 9-4 lists the commands that the PowerCenter Server issues for each database:
Table 9-4. PowerCenter Server Commands on Supported Databases
(The first command applies when the table contains a primary key referenced by a foreign key; the second applies when it does not.)

DB2: truncate table <table_name>* / truncate table <table_name>*
Informix: delete from <table_name> / delete from <table_name>
ODBC: delete from <table_name> / delete from <table_name>
Oracle: delete from <table_name> unrecoverable / truncate table <table_name>
Microsoft SQL Server: delete from <table_name> / truncate table <table_name>**
Sybase 11.x: truncate table <table_name> / truncate table <table_name>

*If you use a DB2 database on AS/400, the PowerCenter Server issues a clrpfm command.
**If you use the Microsoft SQL Server ODBC driver, the PowerCenter Server issues a delete statement.

If the PowerCenter Server issues a truncate target table command and the target table instance
specifies a table name prefix, the PowerCenter Server verifies the database user privileges for
the target table by issuing a truncate command. If the database user is not specified as the
target owner name or does not have the database privilege to truncate the target table, the
PowerCenter Server automatically issues a delete command instead and writes the following
error message to the session log:
WRT_8208 Error truncating target table <target table name> trying DELETE
FROM query.

If the PowerCenter Server issues a delete command and the database has logging enabled, the
database saves all deleted records to the log for rollback. If you do not want to save deleted
records for rollback, you can disable logging to improve the speed of the delete.
For all databases, if the PowerCenter Server fails to truncate or delete any selected table
because the user lacks the necessary privileges, the session fails.
If you use truncate target tables with one of the following functions, the PowerCenter Server
fails to successfully truncate target tables for the session:

Incremental aggregation. When you enable both truncate target tables and incremental
aggregation in the session properties, the Workflow Manager issues a warning that you
cannot enable truncate target tables and incremental aggregation in the same session.


Test load. When you enable both truncate target tables and test load, the PowerCenter
Server disables the truncate table function, runs a test load session, and writes the
following message to the session log:
WRT_8105 Truncate target tables option turned off for test load session.

To truncate a target table:

1. In the Workflow Manager, open the session properties.

2. Click the Mapping tab, and then click the Transformations view.

3. Click the Targets node.

4. In the Properties settings, select Truncate Target Table Option for each target table you want the PowerCenter Server to truncate before it runs the session.

5. Click OK.

Deadlock Retry
Select the Session Retry on Deadlock option in the session properties if you want the
PowerCenter Server to retry target writes on a deadlock. A deadlock might occur when the
PowerCenter Server attempts to take control of the same lock for a row when loading
partitioned targets or when running two sessions simultaneously to the same target.


If the PowerCenter Server encounters a deadlock when it tries to write to a target, the
deadlock only affects targets in the same target connection group. The PowerCenter Server
still writes to targets in other target connection groups.
Encountering deadlocks can slow session performance. To improve session performance, you
can increase the number of target connection groups the PowerCenter Server uses to write to
the targets in a session. To use a different target connection group for each target in a session,
use a different database connection name for each target instance. If you want, you can specify
the same connection information for each connection name. For more information, see
Working with Target Connection Groups on page 257.
You can only retry sessions on deadlock for targets configured for normal load. If you select
this option and configure a target for bulk mode, the PowerCenter Server does not retry target
writes on a deadlock for that target. You can also configure the PowerCenter Server to set the
number of deadlock retries and the deadlock sleep time period. For more information on
configuring the PowerCenter Server, see the Installation and Configuration Guide.
To retry a session on deadlock, click the Properties tab in the session properties and then
scroll down to the Performance settings.
Figure 9-7 shows how to retry sessions on deadlock:
Figure 9-7. Session Retry on Deadlock



Dropping and Recreating Indexes


After you insert significant amounts of data into a target, you normally need to drop and
recreate indexes on that table to optimize query speed. You can drop and recreate indexes by:

Using pre- and post-session SQL. The preferred method for dropping and re-creating
indexes is to define a SQL statement in the Pre SQL property that drops indexes before
loading data to the target. You can use the Post SQL property to recreate the indexes after
loading data to the target. Define the Pre SQL and Post SQL properties for relational
targets in the Transformations view on the Mapping tab in the session properties. For more
information, see Using Pre- and Post-Session SQL Commands on page 186. An example follows this list.

Using the Designer. The same dialog box you use to generate and execute DDL code for
table creation can drop and recreate indexes. However, this process is not automatic. Every
time you run a session that modifies the target table, you need to launch the Designer and
use this feature.
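For example, a session that loads a hypothetical table T_ORDERS with an index IDX_ORDERS_DATE might define statements such as the following. The table, index, and column names here are illustrative only, and the exact DDL syntax depends on your target database:

Pre SQL:  DROP INDEX IDX_ORDERS_DATE
Post SQL: CREATE INDEX IDX_ORDERS_DATE ON T_ORDERS (ORDER_DATE)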

Constraint-Based Loading
In the Workflow Manager, you can specify constraint-based loading for a session. When you
select this option, the PowerCenter Server orders the target load on a row-by-row basis. For
every row generated by an active source, the PowerCenter Server loads the corresponding
transformed row first to the primary key table, then to any foreign key tables. Constraint-based loading depends on the following requirements:

Active source. Related target tables must have the same active source.

Key relationships. Target tables must have key relationships.

Target connection groups. Targets must be in one target connection group.

Treat rows as insert. Use this option when you insert into the target. You cannot use
updates with constraint-based loading.

Active Source
When target tables receive rows from different active sources, the PowerCenter Server reverts
to normal loading for those tables, but loads all other targets in the session using constraint-based loading when possible. For example, a mapping contains three distinct pipelines. The
first two contain a source, source qualifier, and target. Since these two targets receive data
from different active sources, the PowerCenter Server reverts to normal loading for both
targets. The third pipeline contains a source, Normalizer, and two targets. Since these two
targets share a single active source (the Normalizer), the PowerCenter Server performs
constraint-based loading: loading the primary key table first, then the foreign key table.
For more information on active sources, see Working with Active Sources on page 259.

Key Relationships
When target tables have no key relationships, the PowerCenter Server does not perform
constraint-based loading. Similarly, when target tables have circular key relationships, the
PowerCenter Server reverts to a normal load. For example, you have one target containing a
primary key and a foreign key related to the primary key in a second target. The second target
also contains a foreign key that references the primary key in the first target. The
PowerCenter Server cannot enforce constraint-based loading for these tables. It reverts to a
normal load.

Target Connection Groups


The PowerCenter Server enforces constraint-based loading for targets in the same target
connection group. If you want to specify constraint-based loading for multiple targets that
receive data from the same active source, you must verify the tables are in the same target
connection group. If the tables with the primary key-foreign key relationship are in different
target connection groups, the PowerCenter Server cannot enforce constraint-based loading
when you run the workflow.
To verify that all targets are in the same target connection group, perform the following tasks:

Verify all targets are in the same target load order group and receive data from the same
active source.

Use the default partition properties and do not add partitions or partition points.

Define the same target type for all targets in the session properties.

Define the same database connection name for all targets in the session properties.

Choose normal mode for the target load type for all targets in the session properties.

For more information, see Working with Target Connection Groups on page 257.

Treat Rows as Insert


Use constraint-based loading only when the session option Treat Source Rows As is set to
Insert. You might get inconsistent data if you select a different Treat Source Rows As option
and you configure the session for constraint-based loading.
When the mapping contains Update Strategy transformations and you need to load data to a
primary key table first, split the mapping using one of the following options:

Load primary key table in one mapping and dependent tables in another mapping. You
can use constraint-based loading to load the primary table.

Perform inserts in one mapping and updates in another mapping.

For more information about update strategies, see Update Strategy Transformation in the
Transformation Guide.
Constraint-based loading does not affect the target load ordering of the mapping. Target load
ordering defines the order the PowerCenter Server reads the sources in each target load order
group in the mapping. A target load order group is a collection of source qualifiers,
transformations, and targets linked together in a mapping. Constraint-based loading
establishes the order in which the PowerCenter Server loads individual targets within a set of
targets receiving data from a single source qualifier.


Example
The session for the mapping in Figure 9-8 is configured to perform constraint-based loading.
In the first pipeline, target T_1 has a primary key; T_2 and T_3 contain foreign keys
referencing the T_1 primary key. T_3 has a primary key that T_4 references as a foreign key.
Since these four tables receive records from a single active source, SQ_A, the PowerCenter
Server loads rows to the target in the following order:

T_1

T_2 and T_3 (in no particular order)

T_4

The PowerCenter Server loads T_1 first because it has no foreign key dependencies and
contains a primary key referenced by T_2 and T_3. The PowerCenter Server then loads T_2
and T_3, but since T_2 and T_3 have no dependencies, they are not loaded in any particular
order. The PowerCenter Server loads T_4 last, because it has a foreign key that references a
primary key in T_3.
Figure 9-8. Mapping Using Constraint-Based Loading

After loading the first set of targets, the PowerCenter Server begins reading source B. If there
are no key relationships between T_5 and T_6, the PowerCenter Server reverts to a normal
load for both targets.
If T_6 has a foreign key that references a primary key in T_5, since T_5 and T_6 receive data
from a single active source, the Aggregator AGGTRANS, the PowerCenter Server loads rows
to the tables in the following order:

T_5

T_6

T_1, T_2, T_3, and T_4 are in one target connection group if you use the same database
connection for each target, and you use the default partition properties. T_5 and T_6 are in
another target connection group together if you use the same database connection for each
target and you use the default partition properties. The PowerCenter Server includes T_5 and
T_6 in a different target connection group because they are in a different target load order
group from the first four targets.
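As a sketch of the key relationships described above for T_1 through T_4, the tables might be defined with DDL along the following lines (the column names are hypothetical and the constraint syntax varies by database):

CREATE TABLE T_1 (PK1 INTEGER NOT NULL PRIMARY KEY);
CREATE TABLE T_2 (PK2 INTEGER NOT NULL PRIMARY KEY, FK1 INTEGER REFERENCES T_1 (PK1));
CREATE TABLE T_3 (PK3 INTEGER NOT NULL PRIMARY KEY, FK1 INTEGER REFERENCES T_1 (PK1));
CREATE TABLE T_4 (PK4 INTEGER NOT NULL PRIMARY KEY, FK3 INTEGER REFERENCES T_3 (PK3));

With these constraints, T_2, T_3, and T_4 depend on keys loaded earlier, which is why the PowerCenter Server loads T_1 first and T_4 last.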
To enable constraint-based loading:
1. In the General Options settings of the Properties tab, choose Insert for the Treat Source Rows As property.
2. Click the Config Object tab. In the Advanced settings, select Constraint Based Load Ordering.
3. Click OK.

Bulk Loading
You can enable bulk loading when you load to DB2, Sybase, Oracle, or Microsoft SQL Server.
If you enable bulk loading for other database types, the PowerCenter Server reverts to a
normal load. Bulk loading improves the performance of a session that inserts a large amount
of data to the target database. Configure bulk loading on the Mapping tab.
When bulk loading, the PowerCenter Server invokes the database bulk utility and bypasses
the database log, which speeds performance. Without writing to the database log, however,
the target database cannot perform rollback. As a result, you may not be able to perform
recovery. Therefore, you must weigh the importance of improved session performance against
the ability to recover an incomplete session.
For more information on increasing session performance when bulk loading, see Bulk
Loading on page 642.
Note: When loading to DB2, Microsoft SQL Server, and Oracle targets, you must specify a normal load for data driven sessions. When you specify bulk mode and data driven, the PowerCenter Server reverts to normal load.


Committing Data
When bulk loading to Sybase and DB2 targets, the PowerCenter Server ignores the commit
interval you define in the session properties and commits data when the writer block is full.
When bulk loading to Microsoft SQL Server and Oracle targets, the PowerCenter Server
commits data at each commit interval. Also, Microsoft SQL Server and Oracle start a new
bulk load transaction after each commit.
Tip: When bulk loading to Microsoft SQL Server or Oracle targets, define a large commit interval to reduce the number of bulk load transactions and increase performance.

Oracle Guidelines
Oracle allows bulk loading for the following software versions:

Oracle server version 8.1.5 or higher

Oracle client version 8.1.7.2 or higher

You can use the Oracle client 8.1.7 if you install the Oracle Threaded Bulk Mode patch.
Use the following guidelines when bulk loading to Oracle:

Do not define CHECK constraints in the database.

Do not define primary and foreign keys in the database. However, you can define primary
and foreign keys for the target definitions in the Designer.

To bulk load into indexed tables, choose non-parallel mode. To do this, you must disable
the Enable Parallel Mode option. For more information, see Configuring a Relational
Database Connection on page 56.
Note that when you disable parallel mode, you cannot load multiple target instances,
partitions, or sessions into the same table.
To bulk load in parallel mode, you must drop indexes and constraints in the target tables
before running a bulk load session. After the session completes, you can rebuild them. If
you use bulk loading with the session on a regular basis, you can use pre- and post-session
SQL to drop and rebuild indexes and key constraints.

When you use the LONG datatype, verify it is the last column in the table.

Specify the Table Name Prefix for the target when you use Oracle client 9i. If you do not
specify the table name prefix, the PowerCenter Server uses the database login as the prefix.

For more information, see your Oracle documentation.

DB2 Guidelines
Use the following guidelines when bulk loading to DB2:

You must drop indexes and constraints in the target tables before running a bulk load
session. After the session completes, you can rebuild them. If you use bulk loading with
the session on a regular basis, you can use pre- and post-session SQL to drop and rebuild
indexes and key constraints.


You cannot use source-based or user-defined commit when you run bulk load sessions on
DB2.

If you create multiple partitions for a DB2 bulk load session, you must use database
partitioning for the target partition type. If you choose any other partition type, the
PowerCenter Server reverts to normal load and writes the following message to the session
log:
ODL_26097 Only database partitioning is support for DB2 bulk load.
Changing target load type variable to Normal.

When you bulk load to DB2, the DB2 database writes non-fatal errors and warnings to a
message log file in the session log directory. The message log file name is
<session_log_name>.<target_instance_name>.<partition_index>.log. You can check both
the message log file and the session log when you troubleshoot a DB2 bulk load session.

For more information, see your DB2 documentation.

Table Name Prefix


The table name prefix is the owner of the target table. For some databases, such as DB2,
tables can have different owners. If the database user specified in the database connection is
not the owner of the target tables in a session, specify the table owner for each target instance.
A session can fail if the database user is not the owner and you do not specify the table owner
name.
You can specify the table owner name in the target instance or in the session properties. When
you specify the table owner name in the session properties, you override table owner name in
the transformation properties. For more information about specifying table owner name in
the mapping properties, see Mappings in the Designer Guide.
Note: When you specify the table owner name and you set the sqlid for a DB2 database in the environment SQL, the PowerCenter Server uses the table owner name in the target instance. To use the table owner name specified in the SET sqlid statement, do not enter a name in the target name prefix.
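For example, if you enter SALES_OWNER as the table name prefix for a target instance (SALES_OWNER and T_ORDERS are hypothetical names used only for illustration), the SQL the PowerCenter Server generates for that target is qualified with the owner name, roughly as follows:

INSERT INTO SALES_OWNER.T_ORDERS (ORDER_ID, AMOUNT) VALUES (?, ?)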
To specify the target owner name or prefix at the session level:
1. In the Workflow Manager, open the session properties and click the Transformations view on the Mapping tab.
2. Select the target instance under the Targets node.
3. In the Properties settings, enter the table owner name or prefix in the Table Name Prefix field, and click OK.

Reserved Words
If any table name or column name contains a database reserved word, such as MONTH or
YEAR, the session fails with database errors when the PowerCenter Server executes SQL
against the database. You can create and maintain a reserved words file, reswords.txt, in the
PowerCenter Server installation directory. When the PowerCenter Server initializes a session,
it searches for reswords.txt. If the file exists, the PowerCenter Server places quotes around
matching reserved words when it executes SQL against the database.
Use the following rules and guidelines when working with reserved words.

The PowerCenter Server searches the reserved words file when it generates SQL to connect
to source, target, and lookup databases.

If you override the SQL for a source, target, or lookup, you must enclose any reserved
word in quotes. An example follows this list.

You may need to enable some databases, such as Microsoft SQL Server and Sybase, to use
SQL-92 standards regarding quoted identifiers. You can use environment SQL to issue the
command. For example, with Microsoft SQL Server, you can use the following command:
SET QUOTED_IDENTIFIER ON
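For example, if a source table named SALES_BY_PERIOD has columns named MONTH and YEAR (hypothetical names used only for illustration), an overridden source qualifier SQL statement might quote the reserved words as follows; the identifier quote character the database accepts can vary:

SELECT "MONTH", "YEAR", TOTAL_SALES FROM SALES_BY_PERIOD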


Sample reswords.txt File


To use a reserved words file, create a file named reswords.txt and place it in the PowerCenter
Server installation directory. Create a section for each database that you need to store reserved
words for. Add reserved words used in any table or column name. You do not need to store all
reserved words for a database in this file. Database names and reserved words in reswords.txt
are not case sensitive.
Following is a sample reswords.txt file:
[Teradata]
MONTH
DATE
INTERVAL
[Oracle]
OPTION
START
[DB2]
[SQL Server]
CURRENT
[Informix]
[ODBC]
MONTH
[Sybase]


Working with Target Connection Groups


When you create a session with at least one relational target, SAP BW target, or dynamic
MQSeries target, you need to consider target connection groups. A target connection group is
a group of targets that the PowerCenter Server uses to determine commits and loading. When
the PowerCenter Server performs a database transaction, such as a commit, it performs the
transaction to all targets in a target connection group.
The PowerCenter Server performs the following database transactions per target connection
group:

Deadlock retry. If the PowerCenter Server encounters a deadlock when it writes to a
target, the deadlock only affects targets in the same target connection group. The
PowerCenter Server still writes to targets in other target connection groups. For more
information, see Deadlock Retry on page 246.

Constraint-based loading. The PowerCenter Server enforces constraint-based loading for
targets in a target connection group. If you want to specify constraint-based loading, you
must verify the primary table and foreign table are in the same target connection group.
For more information, see Constraint-Based Loading on page 248.

Targets in the same target connection group meet the following criteria:

Belong to the same partition.

Belong to the same target load order group.

Have the same target type in the session.

Have the same database connection name for relational targets, and Application
connection name for SAP BW targets. For more information, see the PowerCenter
Connect for SAP BW User and Administrator Guide.

Have the same target load type, either normal or bulk mode.

For example, suppose you create a session based on a mapping that reads data from one source
and writes to two Oracle target tables. In the Workflow Manager, you do not create multiple
partitions in the session. You use the same Oracle database connection for both target tables
in the session properties. You specify normal mode for the target load type for both target
tables in the session properties. The targets in the session belong to the same target
connection group.
Suppose you create a session based on the same mapping. In the Workflow Manager, you do
not create multiple partitions. However, you use one Oracle database connection name for
one target, and you use a different Oracle database connection name for the other target. You
specify normal mode for the target load type for both target tables. The targets in the session
belong to different target connection groups.
Note: When you define the target database connections for multiple targets in a session using

session parameters, the targets may or may not belong to the same target connection group.
The targets belong to the same target connection group if all session parameters resolve to the
same target connection name. For example, you create a session with two targets and specify
the session parameter $DBConnection1 for one target, and $DBConnection2 for the other
target. In the parameter file, you define $DBConnection1 as Sales1 and you define
$DBConnection2 as Sales1 and run the workflow. Both targets in the session belong to the
same target connection group.
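A minimal parameter file entry for this scenario might look like the following sketch. The folder, workflow, and session names shown here are hypothetical; use the heading format for parameter files described in the session parameter documentation:

[SalesFolder.WF:wf_load_sales.ST:s_load_sales]
$DBConnection1=Sales1
$DBConnection2=Sales1

Because both parameters resolve to Sales1, the two targets fall into the same target connection group.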


Working with Active Sources


An active source is an active transformation the PowerCenter Server uses to generate rows. An
active source can be any of the following transformations:

Aggregator

Application Source Qualifier

Custom, configured as an active transformation

Joiner

MQ Source Qualifier

Normalizer (VSAM or pipeline)

Rank

Sorter

Source Qualifier

XML Source Qualifier

Mapplet, if it contains any of the above transformations

Note: Although the Filter, Router, Transaction Control, and Update Strategy transformations

are active transformations, the PowerCenter Server does not use them as active sources in a
pipeline.
Active sources affect how the PowerCenter Server processes a session when you use any of the
following transformations or session properties:

XML targets. The PowerCenter Server can load data from different active sources to an
XML target when each input group receives data from one active source. For more
information on XML targets, see Working with XML Targets in the XML User Guide.

Transaction generators. Transaction generators, such as Transaction Control


transformations, become ineffective for downstream transformations or targets if you put a
transaction control point after them. Transaction control points are transaction generators and
active sources that generate commits. For more information on effective and ineffective
transaction generators, see Transaction Control Transformation in the Transformation
Guide. For a list of transaction control points, see Transformation Scope on page 287.

Mapplets. An Input transformation must receive data from a single active source. For
more information on connecting mapplets to active sources in mappings, see Mapplets
in the Designer Guide.

Source-based commit. Some active sources generate commits. When you run a source-based commit session, the PowerCenter Server generates a commit from these active
sources at every commit interval. For more information on source-based commit sessions,
see Source-Based Commits on page 278.


Constraint-based loading. To use constraint-based loading, you must connect all related
targets to the same active source. The PowerCenter Server orders the target load on a row-by-row basis based on rows generated by an active source. For more information on
constraint-based loading, see Constraint-Based Loading on page 248.

Row error logging. If an error occurs downstream from an active source that is not a
source qualifier, the PowerCenter Server cannot identify the source row information for
the logged error row. For more information on logging errors, see Overview on
page 482.


Working with File Targets


You can output data to a flat file in either of the following ways:

Use a flat file target definition. Create a mapping with a flat file target definition. Create
a session using the flat file target definition. When the PowerCenter Server runs the
session, it creates the target flat file based on the flat file target definition.

Use a relational target definition. Use a relational definition to write to a flat file when
you want to use an external loader to load the target. Create a mapping with a relational
target definition. Create a session using the relational target definition. Configure the
session to output to a flat file by specifying the File Writer in the Writers settings on the
Mapping tab. For details on using the external loader feature, see External Loading on
page 523.

You can configure the following properties for flat file targets:

Target properties. You can define target properties such as partitioning options, output
file options, and reject options. For more information, see Configuring Target Properties
on page 261.

Flat file properties. You can choose to create delimited or fixed-width files, and define
their properties. For more information, see Configuring Fixed-Width Properties on
page 265 and Configuring Delimited Properties on page 266.

Configuring Target Properties


You can configure session properties for flat file targets in the Properties settings on the
Mapping tab, and in the General Options settings on the Properties tab. Define the properties
for each target instance in the session.


Figure 9-9 shows the flat file target properties you define in the Properties settings on the
Mapping tab in the session properties:
Figure 9-9. Properties Settings on the Mapping Tab for a Flat File Target


Table 9-5 describes the properties you define in the Properties settings for flat file target
definitions:
Table 9-5. Flat File Target Properties

Merge Partitioned Files (Optional). When selected, the PowerCenter Server merges the partitioned target files into one file when the session completes, and then deletes the individual output files. If the PowerCenter Server fails to create the merged file, it does not delete the individual output files. You cannot merge files if the session uses FTP, an external loader, or a message queue. For details on configuring a session for partitioning, see Pipeline Partitioning on page 345.

Merge File Directory (Optional). Enter the directory name in this field. By default, the PowerCenter Server writes the merged file in the server variable directory, $PMTargetFileDir. If you enter a full directory and file name in the Merge File Name field, clear this field.

Merge File Name (Optional). Name of the merge file. Default is target_name.out. This property is required if you select Merge Partitioned Files.

Output File Directory (Optional). Enter the directory name in this field. By default, the PowerCenter Server writes output files in the server variable directory, $PMTargetFileDir. If you specify both the directory and file name in the Output Filename field, clear this field. The PowerCenter Server concatenates this field with the Output Filename field when it runs the session. You can also use the $OutputFileName session parameter to specify the file directory. For details on session parameters, see Session Parameters on page 495.

Output Filename (Required). Enter the file name, or file name and path. By default, the Workflow Manager names the target file based on the target definition used in the mapping: target_name.out. If the target definition contains a slash character, the Workflow Manager replaces the slash character with an underscore. When you use an external loader to load to an Oracle database, you must specify a file extension. If you do not specify a file extension, the Oracle loader cannot find the flat file and the PowerCenter Server fails the session. For more information about external loading, see Loading to Oracle on page 533. Optionally use the $OutputFileName session parameter for the file name. The PowerCenter Server concatenates this field with the Output File Directory field when it runs the session. For details on session parameters, see Session Parameters on page 495.
Note: If you specify an absolute path file name when using FTP, the PowerCenter Server ignores the Default Remote Directory specified in the FTP connection. When you specify an absolute path file name, do not use single or double quotes.

Reject File Directory (Optional). Enter the directory name in this field. By default, the PowerCenter Server writes all reject files to the server variable directory, $PMBadFileDir. If you specify both the directory and file name in the Reject Filename field, clear this field. The PowerCenter Server concatenates this field with the Reject Filename field when it runs the session. You can also use the $BadFileName session parameter to specify the file directory. For details on session parameters, see Session Parameters on page 495.

Reject Filename (Required). Enter the file name, or file name and path. By default, the PowerCenter Server names the reject file after the target instance name: target_name.bad. Optionally use the $BadFileName session parameter for the file name. The PowerCenter Server concatenates this field with the Reject File Directory field when it runs the session. For example, if you have C:\reject_file\ in the Reject File Directory field, and enter filename.bad in the Reject Filename field, the PowerCenter Server writes rejected rows to C:\reject_file\filename.bad. For details on session parameters, see Session Parameters on page 495.

Set File Properties Link (Optional). Opens a dialog box that allows you to define flat file properties. For more information, see Configuring Fixed-Width Properties on page 265 and Configuring Delimited Properties on page 266. When you output to a flat file using a relational target definition in the mapping, make sure you define the flat file properties by clicking the Set File Properties link.


Figure 9-10 shows the test load options in the General Options settings on the Properties tab:
Figure 9-10. Test Load Options


Table 9-6 describes the test load options in the General Options settings on the Properties
tab:
Table 9-6. Test Load Options

Enable Test Load (Optional). You can configure the PowerCenter Server to perform a test load. With a test load, the PowerCenter Server reads and transforms data without writing to targets. The PowerCenter Server generates all session files and performs all pre- and post-session functions, as if running the full session. The PowerCenter Server writes data to relational targets, but rolls back the data when the session completes. For all other target types, such as flat file and SAP BW, the PowerCenter Server does not write data to the targets. Enter the number of source rows you want to test in the Number of Rows to Test field. You cannot perform a test load on sessions using XML sources.
Note: You can perform a test load for relational targets when you configure a session for normal mode. If you configure the session for bulk mode, the session fails.

Number of Rows to Test (Optional). Enter the number of source rows you want the PowerCenter Server to test load. The PowerCenter Server reads the number you configure for the test load.

Configuring Fixed-Width Properties


When you output data to a fixed-width file, you can edit file properties in the session
properties, such as the null character or code page. You can configure fixed-width properties
for non-reusable sessions in the Workflow Designer and for reusable sessions in the Task
Developer. You cannot configure fixed-width properties for instances of reusable sessions in
the Workflow Designer.
In the Transformations view on the Mapping tab, click the Targets node and then click Set
File Properties to open the Flat Files dialog box.
Figure 9-11 shows the Flat Files dialog box:
Figure 9-11. Flat Files Dialog Box

To edit the fixed-width properties, select Fixed Width and click Advanced.
Figure 9-12 shows the Fixed Width Properties dialog box:
Figure 9-12. Fixed Width Properties Dialog Box


Table 9-7 describes the options you define in the Fixed Width Properties dialog box:
Table 9-7. Writing to a Fixed-Width Target

Null Character (Required). Enter the character you want the PowerCenter Server to use to represent null values. You can enter any valid character in the file code page. For more information about using null characters for target files, see Null Characters in Fixed-Width Files on page 272.

Repeat Null Character (Optional). Select this option to indicate a null value by repeating the null character to fill the field. If you do not select this option, the PowerCenter Server enters a single null character at the beginning of the field to represent a null value. For more information about specifying null characters for target files, see Null Characters in Fixed-Width Files on page 272.

Code Page (Required). Select the code page of the fixed-width file. The default setting is the client code page.

Configuring Delimited Properties


When you output data to a delimited file, you can edit file properties in the session
properties, such as the delimiter or code page. You can configure delimited properties for
non-reusable sessions in the Workflow Designer and for reusable sessions in the Task
Developer. You cannot configure delimited properties for instances of reusable sessions in the
Workflow Designer.
In the Transformations view on the Mapping tab, click the Targets node and then click Set
File Properties to open the Flat Files dialog box.
Figure 9-13 shows the Flat Files dialog box:
Figure 9-13. Flat Files Dialog Box

To edit the delimited properties, select Delimited and click Advanced.


Figure 9-14 shows the Delimited File Properties dialog box:


Figure 9-14. Delimited File Properties Dialog Box

Table 9-8 describes the options you can define in the Delimited File Properties dialog box:
Table 9-8. Delimited File Properties

Delimiters (Required). Character used to separate columns of data. Use the button to the right of this field to enter a non-printable delimiter. Delimiters can be either printable or single-byte unprintable characters, and must be different from the escape character and the quote character (if selected). You cannot select unprintable multibyte characters as delimiters.

Optional Quotes (Required). Select None, Single, or Double. If you select a quote character, the PowerCenter Server does not treat delimiter characters within the quote characters as a delimiter. For example, suppose an output file uses a comma as a delimiter and the PowerCenter Server receives the following row: 3423849, 'Smith, Jenna', 'Rockville, MD', 6. If you select the optional single quote character, the PowerCenter Server ignores the commas within the quotes and writes the row as four fields. If you do not select the optional single quote, the PowerCenter Server writes six separate fields.

Code Page (Required). Select the code page of the delimited file. The default setting is the client code page.


Server Handling for File Targets


When you configure a session to write to file targets, you need to know how the PowerCenter
Server loads data. In the mapping, you must correctly configure your flat file target
definitions and the relational target definitions you use to write to flat files. The PowerCenter
Server loads data to flat files based on the following criteria:

Writing to fixed-width flat files from relational target definitions. The PowerCenter
Server adds spaces to target columns based on transformation datatype.

Writing to fixed-width flat files from flat file target definitions. You must configure the
precision and field width for flat file target definitions to accommodate the total length of
the target field.

Writing multibyte data to fixed-width files. You must configure the precision of string
columns to accommodate character data. When writing shift-sensitive data to a fixed-width flat file target, the PowerCenter Server adds shift characters and spaces to meet file
requirements.

Null characters in fixed-width files. The PowerCenter Server writes repeating or non-repeating null characters to fixed-width target file columns differently depending on
whether the characters are single- or multibyte.

Character set. You can write ASCII or Unicode data to a flat file target.

Writing metadata to flat file targets. You can configure the PowerCenter Server to write
the column header information when you write to flat file targets.

Writing to Fixed-Width Flat Files with Relational Target Definitions


When you want to output to a fixed-width file based on a relational target definition in the
mapping, consider how the PowerCenter Server handles spacing in the target file.
When the PowerCenter Server writes to a fixed-width flat file based on a relational target
definition in the mapping, it adds spaces to columns based on the transformation datatype
connected to the target. This allows the PowerCenter Server to write optional symbols
necessary for the datatype, such as a negative sign or decimal point, without sending the row
to the reject file.
For example, you connect a transformation Integer(10) port to a Number(10) column in a
relational target definition. In the session properties, you override the relational target
definition to use the File Writer and you specify to output a fixed-width flat file. In the target
flat file, the PowerCenter Server appends an additional byte to the Number(10) column to
allow for negative signs that might be associated with Integer data.


Table 9-9 describes the number of bytes the PowerCenter Server adds to the target column
and optional characters it uses for each datatype:
Table 9-9. Datatype Modifications for File Target Columns

Decimal (2 bytes added): negative sign (-) for the mantissa, and decimal point (.).
Double (7 bytes added): negative sign for the mantissa; decimal point; and negative sign, e, and three digits for the exponent (for example, -4.2-e123).
Float (7 bytes added): negative sign for the mantissa; decimal point; and negative sign, e, and three digits for the exponent.
Integer (1 byte added): negative sign for the mantissa.
Money (2 bytes added): negative sign for the mantissa, and decimal point.
Numeric (2 bytes added): negative sign for the mantissa, and decimal point.
Real (7 bytes added): negative sign for the mantissa; decimal point; and negative sign, e, and three digits for the exponent.

Writing to Fixed-Width Files with Flat File Target Definitions


When you want to output to a fixed-width flat file based on a flat file target definition, you
must configure precision and field width for the target field to accommodate the total length
of the target field. If the data for a target field is too long for the total length of the field, the
PowerCenter Server performs one of the following actions:

Truncates the row for string columns

Writes the row to the reject file for numeric and datetime columns

Note: When the PowerCenter Server writes a row to the reject file, it writes a message in the

session log.
When a session writes to a fixed-width flat file based on a fixed-width flat file target definition
in the mapping, the PowerCenter Server defines the total length of a field by the precision or
field width defined in the target.
Fixed-width files are byte-oriented, which means the total length of a field is measured in
bytes.


Table 9-10 describes how the PowerCenter Server measures the total field length for fields in a
fixed-width flat file target definition:
Table 9-10. Field Length Measurements for Fixed-Width Flat File Targets

Datatype    Target Field Property That Determines Total Field Length
Number      Field width
String      Precision
Datetime    Field width

Table 9-11 lists the characters you must accommodate when you configure the precision or
field width for flat file target definitions to accommodate the total length of the target field:
Table 9-11. Characters to Include when Calculating Field Length for Fixed-Width Targets

Number: decimal separator, thousands separators, and negative sign (-) for the mantissa.
String: multibyte data, and shift-in and shift-out characters. For more information, see Writing Multibyte Data to Fixed-Width Flat Files on page 270.
Datetime: date and time separators, such as slashes (/), dashes (-), and colons (:). For example, the format MM/DD/YYYY HH24:MI:SS has a total length of 19 bytes.

When you edit the flat file target definition in the mapping, define the precision or field
width great enough to accommodate both the target data and the characters in Table 9-11.
For example, suppose you have a mapping with a fixed-width flat file target definition. The
target definition contains a number column with a precision of 10 and a scale of 2. You use a
comma as the decimal separator and a period as the thousands separator. You know some rows
of data might have a negative value. Based on this information, you know the longest possible
number is formatted with the following format:
-NN.NNN.NNN,NN

Open the flat file target definition in the mapping and define the field width for this number
column as a minimum of 14 bytes.
For more information on formatting numeric and datetime values, see Working with Flat
Files in the Designer Guide.

Writing Multibyte Data to Fixed-Width Flat Files


If you plan to load multibyte data into a fixed-width flat file, configure the precision to
accommodate the multibyte data. Fixed-width files are byte-oriented, not character-oriented.
So, when you configure the precision for a fixed-width target, you need to consider the
number of bytes you load into the target, rather than the number of characters.

For string columns, the PowerCenter Server truncates the data if the precision is not large
enough to accommodate the multibyte data.
You might work with the following types of multibyte data:

Non shift-sensitive multibyte data. The file contains all multibyte data. Configure the
precision in the target definition to allow for the additional bytes.
For example, you know that the target data contains four double-byte characters, so you
define the target definition with a precision of 8 bytes.
If you configure the target definition with a precision of 4, the PowerCenter Server
truncates the data before writing to the target.

Shift-sensitive multibyte data. The file contains single-byte and multibyte data. When
writing to a shift-sensitive flat file target, the PowerCenter Server adds shift characters and
spaces to meet file requirements. You must configure the precision in the target definition
to allow for the additional bytes and the shift characters. For more information, see
Writing Shift-Sensitive Multibyte Data on page 271.

Note: Delimited files are character-oriented, and you do not need to allow for additional

precision for multibyte data.

Writing Shift-Sensitive Multibyte Data


When writing to a shift-sensitive flat file target, the PowerCenter Server adds shift characters
and spaces if the data going into the target does not meet file requirements. You need to allow
at least two extra bytes in each data column containing multibyte data so the output data
precision matches the byte width of the target column.
The PowerCenter Server writes shift characters and spaces in the following ways:

If a column begins or ends with a double-byte character, the PowerCenter Server adds shift
characters so the column begins and ends with a single-byte shift character.

If the data is shorter than the column width, the PowerCenter Server pads the rest of the
column with spaces.

If the data is longer than the column width, the PowerCenter Server truncates the data so
the column ends with a single-byte shift character.

To illustrate how the PowerCenter Server handles a fixed-width file containing shift-sensitive
data, say you want to output the following data to the target:
SourceCol1    SourceCol2
AAAA          aaaa

A is a double-byte character, a is a single-byte character.

The first target column contains eight bytes and the second target column contains four
bytes.


The PowerCenter Server must add shift characters to handle shift-sensitive data. Since the
first target column can only handle eight bytes, the PowerCenter Server truncates the data
before it can add the shift characters.
TargetCol1    TargetCol2
-oAAA-i       aaaa

The following table describes the notation used in this example:


Notation    Description
A           Double-byte character
-o          Shift-out character
-i          Shift-in character

For the first target column, the PowerCenter Server writes only three of the double-byte
characters to the target. It cannot write any additional double-byte characters to the output
column because the column must end in a single-byte character. If you add two more bytes to
the first target column definition, then the PowerCenter Server can add shift characters and
write all the data without truncation.
For the second target column, the PowerCenter Server writes all four single-byte characters to
the target. It does not add shift characters to the column because the column begins and
ends with single-byte characters.

Null Characters in Fixed-Width Files


You can specify any valid single-byte or multibyte character as a null character for a fixed-width target. You can also use a space as a null character.
The null character can be repeating or non-repeating. If the null character is repeating, the
PowerCenter Server writes as many null characters as possible into a target column. If you
specify a multibyte null character and there are extra bytes left after writing null characters,
the PowerCenter Server pads the column with single-byte spaces. If a column is smaller than
the multibyte character specified as the null character, the session fails at initialization.
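As a worked example (the column and character sizes are assumed for illustration only): for a 5-byte target column with a double-byte null character and Repeat Null Character selected, the PowerCenter Server writes two copies of the null character (4 bytes) and pads the remaining byte with a single-byte space. If the column were only 1 byte wide, it could not hold the double-byte null character, and the session would fail at initialization.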

Character Set
You can configure the PowerCenter Server to run sessions with flat file targets in either ASCII
or Unicode data movement mode.
If you configure a session with a flat file target to run in Unicode data movement mode, the
target file code page must be a superset of the PowerCenter Server code page and the source
code page. Delimiters, escape, and null characters must be valid in the specified code page of
the flat file.
If you configure a session to run in ASCII data movement mode, delimiters, escape, and null
characters must be valid in the ISO Western European Latin1 code page. Any 8-bit character
you specified in previous versions of PowerCenter is still valid.

For more information about configuring and working with data movement modes and code
pages, see Globalization Overview in the Installation and Configuration Guide.

Writing Metadata to Flat File Targets


When you write to flat file targets, you can configure the PowerCenter Server to write the
column header information. When you enable the Output Metadata For Flat File Target
option, the PowerCenter Server writes column headers to flat file targets. It writes the target
definition port names to the flat file target in the first line, starting with the # symbol. By
default, this option is disabled.
When writing to fixed-width files, the PowerCenter Server truncates the target definition port
name if it is longer than the column width.
For example, you have the following fixed-width flat file target definition:

The column width for ITEM_ID is six. When you enable the Output Metadata For Flat File
Target option, the PowerCenter Server writes the following text to a flat file:
#ITEM_ITEM_NAME      PRICE
100001Screwdriver    9.50
100002Hammer         12.90
100003Small nails    3.00

For information about configuring the PowerCenter Server to output flat file metadata, see
the Installation and Configuration Guide.


Working with Heterogeneous Targets


You can output data to multiple targets in the same session. When the target types or database
types of those targets differ from each other, you have a session with heterogeneous targets.
To create a session with heterogeneous targets, you can create a session based on a mapping
with heterogeneous targets. Or, you can create a session based on a mapping with
homogeneous targets and select different database connections.
A heterogeneous target has one of the following characteristics:

Multiple target types. You can create a session that writes to both relational and flat file
targets.

Multiple target connection types. You can create a session that writes to a target on an
Oracle database and to a target on a DB2 database. Or, you can create a session that writes
to multiple targets of the same type, but you specify different target connections for each
target in the session.

All database connections you define in the Workflow Manager are unique to the PowerCenter
Server, even if you define the same connection information. For example, you define two
database connections, Sales1 and Sales2. You define the same user name, password, connect
string, code page, and attributes for both Sales1 and Sales2. Even though both Sales1 and
Sales2 define the same connection information, the PowerCenter Server treats them as
different database connections. When you create a session with two relational targets and
specify Sales1 for one target and Sales2 for the other target, you create a session with
heterogeneous targets.
You can create a session with heterogeneous targets in one of the following ways:

Create a session based on a mapping with targets of different types or different database
types. In the session properties, keep the default target types and database types.

Create a session based on a mapping with the same target types. However, in the session
properties, specify different target connections for the different target instances, or
override the target type to a different type.

You can override the target type in the session properties. However, you can only perform
certain overrides. You can specify the following target type overrides in a session:

Relational target to flat file.

Relational target to any other relational database type. Verify the datatypes used in the
target definition are compatible with both databases.

SAP BW target to a flat file target type.

Note: When the PowerCenter Server runs a session with at least one relational target, it

performs database transactions per target connection group. For example, it orders the target
load for targets in a target connection group when you enable constraint-based loading. For
more information, see Working with Target Connection Groups on page 257.


Chapter 10

Understanding Commit Points
This chapter covers the following topics:

Overview, 276

Target-Based Commits, 277

Source-Based Commits, 278

User-Defined Commits, 283

Understanding Transaction Control, 287

Setting Commit Properties, 292


Overview
A commit interval is the interval at which the PowerCenter Server commits data to targets
during a session. The commit point can be a factor of the commit interval, the commit
interval type, and the size of the buffer blocks. The commit interval is the number of rows
you want to use as a basis for the commit point. The commit interval type is the type of rows
that you want to use as a basis for the commit point. You can choose between the following
commit types:

Target-based commit. The PowerCenter Server commits data based on the number of
target rows and the key constraints on the target table. The commit point also depends on
the buffer block size, the commit interval, and the PowerCenter Server configuration for
writer timeout.

Source-based commit. The PowerCenter Server commits data based on the number of
source rows. The commit point is the commit interval you configure in the session
properties.

User-defined commit. The PowerCenter Server commits data based on transactions
defined in the mapping properties. You can also configure some commit and rollback
options in the session properties.

Source-based and user-defined commit sessions have partitioning restrictions. If you
configure a session with multiple partitions to use source-based or user-defined commit, you
can only choose pass-through partitioning at certain partition points in a pipeline. For more
information, see Specifying Partition Types on page 356.


Target-Based Commits
During a target-based commit session, the PowerCenter Server commits rows based on the
number of target rows and the key constraints on the target table. The commit point depends
on the following factors:

Commit interval. The number of rows you want to use as a basis for commits. Configure
the target commit interval in the session properties.

Writer wait timeout. The amount of time the writer waits before it issues a commit.
Configure the writer wait timeout in the PowerCenter Server setup.

Buffer blocks. Blocks of memory that hold rows of data during a session. You can
configure the buffer block size in the session properties, but you cannot configure the
number of rows the block holds.

When you run a target-based commit session, the PowerCenter Server may issue a commit before, on, or after the configured commit interval. The PowerCenter Server uses the following process to issue commits (a worked example follows this list):

When the PowerCenter Server reaches a commit interval, it continues to fill the writer
buffer block. When the writer buffer block fills, the PowerCenter Server issues a commit.

If the writer buffer fills before the commit interval, the PowerCenter Server writes to the
target, but waits to issue a commit. It issues a commit when one of the following
conditions is true:

The writer is idle for the amount of time specified by the PowerCenter Server writer wait
timeout option.

The PowerCenter Server reaches the commit interval and fills another writer buffer.
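As a worked example (the numbers are assumed for illustration only): suppose the commit interval is 10,000 rows and each writer buffer block holds roughly 7,500 rows. The first block fills at 7,500 rows, before the commit interval, so the PowerCenter Server writes the rows to the target but does not commit. The commit interval is reached at 10,000 rows while the second block is filling, so the server keeps filling that block and issues the commit when it fills at about 15,000 rows, or sooner if the writer stays idle for the writer wait timeout.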

For more information about configuring the writer wait timeout, see Installing and
Configuring the PowerCenter Server on Windows or Installing and Configuring the
PowerCenter Server on UNIX in the Installation and Configuration Guide.
Note: When you choose target-based commit for a session containing an XML target, the

Workflow Manager disables the On Commit session property on the Transformations view of
the Mapping tab.


Source-Based Commits
During a source-based commit session, the PowerCenter Server commits data to the target
based on the number of rows from some active sources in a target load order group. These
rows are referred to as source rows.
When the PowerCenter Server runs a source-based commit session, it identifies the commit
source for each pipeline in the mapping. The PowerCenter Server generates a commit row
from these active sources at every commit interval. The PowerCenter Server writes the name
of the transformation used for source-based commit intervals into the session log:
Source-based commit interval based on... TRANSFORMATION_NAME

The PowerCenter Server might commit fewer rows to the target than the number of rows
produced by the active source. For example, you have a source-based commit session that
passes 10,000 rows through an active source, and 3,000 rows are dropped due to
transformation logic. The PowerCenter Server issues a commit to the target when the 7,000
remaining rows reach the target.
The number of rows held in the writer buffers does not affect the commit point for a source-based commit session. For example, you have a source-based commit session that passes
10,000 rows through an active source. When those 10,000 rows reach the targets, the
PowerCenter Server issues a commit. If the session completes successfully, the PowerCenter
Server issues commits after 10,000, 20,000, 30,000, and 40,000 source rows.
If the targets are in the same transaction control unit, the PowerCenter Server commits data
to the targets at the same time. If the session fails or aborts, the PowerCenter Server rolls back
all uncommitted data in a transaction control unit to the same source row.
If the targets are in different transaction control units, the PowerCenter Server performs the
commit when each target receives the commit row. If the session fails or aborts, the
PowerCenter Server rolls back each target to the last commit point. It might not roll back to
the same source row for targets in separate transaction control units. For more information on
transaction control units, see Understanding Transaction Control Units on page 289.
Note: Source-based commit may slow session performance if the session uses a one-to-one mapping. A one-to-one mapping is a mapping that moves data from a Source Qualifier, XML
Source Qualifier, or Application Source Qualifier transformation directly to a target. For
more information about performance, see Performance Tuning on page 635.

Determining the Commit Source


When you run a source-based commit session, the PowerCenter Server generates commits at
all source qualifiers and transformations that do not propagate transaction boundaries. This
includes the following active sources:

Source Qualifier

Application Source Qualifier

MQ Source Qualifier

XML Source Qualifier when you only connect ports from one output group

Normalizer (VSAM)

Aggregator with the All Input transformation scope

Joiner with the All Input transformation scope

Rank with the All Input transformation scope

Sorter with the All Input transformation scope

Custom with one output group and with the All Input transformation scope

A multiple input group transformation with one output group connected to multiple
upstream transaction control points

Mapplet, if it contains one of the above transformations

For more information on transformation scope and transaction control, see Understanding
Transaction Control on page 287. For more information on active sources, see Working
with Active Sources on page 259.
A mapping can have one or more target load order groups, and a target load order group can
have one or more active sources that generate commits. The PowerCenter Server uses the
commits generated by the active source that is closest to the target definition. This is known
as the commit source.
For example, you have the mapping in Figure 10-1:
Figure 10-1. Mapping with a Single Commit Source

The mapping contains a Source Qualifier transformation and an Aggregator transformation
with the All Input transformation scope. The Aggregator transformation is closer to the
targets than the Source Qualifier transformation and is therefore used as the commit source
for the source-based commit session.

Source-Based Commits

279

Also, suppose you have the mapping in Figure 10-2:

Figure 10-2. Mapping with Multiple Commit Sources

The mapping contains a target load order group with one source pipeline that branches from the Source Qualifier transformation to two targets. One pipeline branch contains an Aggregator transformation with the All Input transformation scope, and the other contains an Expression transformation. The PowerCenter Server identifies the Source Qualifier transformation as the commit source for t_monthly_sales and the Aggregator transformation as the commit source for T_COMPANY_ALL. It performs a source-based commit for both targets, but uses a different commit source for each.

Switching from Source-Based to Target-Based Commit
If the PowerCenter Server identifies a target in the target load order group that does not
receive commits from an active source that generates commits, it reverts to target-based
commit for that target only.
The PowerCenter Server writes the name of the transformation used for source-based commit
intervals into the session log. When the PowerCenter Server switches to target-based commit,
it writes a message in the session log.
A target might not receive commits from a commit source in the following circumstances:

- The target receives data from an XML Source Qualifier transformation, and you connect multiple output groups from the XML Source Qualifier transformation to downstream transformations. An XML Source Qualifier transformation does not generate commits when you connect multiple output groups downstream.
- The target receives data from an active source with multiple output groups other than an XML Source Qualifier transformation. For example, the target receives data from a Custom transformation that you do not configure to generate transactions. Multiple output group active sources neither generate nor propagate commits.

Connecting XML Sources in a Mapping

An XML Source Qualifier transformation does not generate commits when you connect multiple output groups downstream. When you use an XML Source Qualifier transformation in a mapping, the PowerCenter Server can use different commit types for targets in the session, depending on the transformations used in the mapping:

- You put a commit source between the XML Source Qualifier transformation and the target. The PowerCenter Server uses source-based commit for the target because it receives commits from the commit source. The active source is the commit source for the target.
- You do not put a commit source between the XML Source Qualifier transformation and the target. The PowerCenter Server uses target-based commit for the target because it receives no commits.

Suppose you have the mapping in Figure 10-3:

Figure 10-3. Mapping with Targets Connected to a Commit Source

This mapping contains an XML Source Qualifier transformation with multiple output groups connected downstream. Because you connect multiple output groups downstream, the XML Source Qualifier transformation does not generate commits. You connect the XML Source Qualifier transformation to two relational targets, T_STORE and T_PRODUCT. Therefore, these targets do not receive any commits generated by an active source. The PowerCenter Server uses target-based commit when loading to these targets.
However, the mapping includes an active source that generates commits, AGG_Sales, between the XML Source Qualifier transformation and T_YTD_SALES. The PowerCenter Server uses source-based commit when loading to T_YTD_SALES.

Connecting Multiple Output Group Custom Transformations in a Mapping

Multiple output group Custom transformations that you do not configure to generate transactions neither generate nor propagate commits. Therefore, the PowerCenter Server can use different commit types for targets in the session, depending on the transformations used in the mapping:

- You put a commit source between the Custom transformation and the target. The PowerCenter Server uses source-based commit for the target because it receives commits from the active source. The active source is the commit source for the target.
- You do not put a commit source between the Custom transformation and the target. The PowerCenter Server uses target-based commit for the target because it receives no commits.

Suppose you have the mapping in Figure 10-4:

Figure 10-4. Mapping a Custom Transformation with a Commit Source

The mapping contains a multiple output group Custom transformation, CT_XML_Parser, which drops the commits generated by the Source Qualifier transformation. Therefore, targets T_store_name and T_store_addr do not receive any commits generated by an active source. The PowerCenter Server uses target-based commit when loading to these targets.
However, the mapping includes an active source that generates commits, AGG_store_orders, between the Custom transformation and T_store_orders. The PowerCenter Server uses source-based commit when loading to T_store_orders.
Note: You can configure a Custom transformation to generate transactions when the Custom transformation procedure outputs transactions. When you do this, configure the session for user-defined commit. For more information on user-defined commit sessions, see "User-Defined Commits" on page 283.

User-Defined Commits
During a user-defined commit session, the PowerCenter Server commits and rolls back
transactions based on a row or set of rows that pass through a Transaction Control
transformation. The PowerCenter Server evaluates the transaction control expression for each
row that enters the transformation. The return value of the transaction control expression
defines the commit or rollback point.
You can also create a user-defined commit session when the mapping contains a Custom transformation configured to generate transactions. When you do this, the procedure associated with the Custom transformation defines the transaction boundaries.
When the PowerCenter Server evaluates a commit row, it commits all rows in the transaction
to the target or targets. When it evaluates a rollback row, it rolls back all rows in the
transaction from the target or targets. The PowerCenter Server writes a message to the session
log at each commit and rollback point. The session details are cumulative. The following
message is a sample commit message from the session log:
WRITER_1_1_1> WRT_8317 USER-DEFINED COMMIT POINT  Wed Oct 15 08:15:29 2003
===================================================
WRT_8036 Target: TCustOrders (Instance Name: [TCustOrders])
WRT_8038 Inserted rows - Requested: 1003  Applied: 1003  Rejected: 0  Affected: 1023

When the PowerCenter Server writes all rows in a transaction to all targets, it issues commits
sequentially for each target.
The PowerCenter Server rolls back data based on the return value of the transaction control
expression or error handling configuration. If the transaction control expression returns a
rollback value, the PowerCenter Server rolls back the transaction. If an error occurs, you can
choose to roll back or commit at the next commit point.
If the transaction control expression evaluates to a value other than commit, rollback, or
continue, the PowerCenter Server fails the session. For more information about valid values,
see Transaction Control Transformation in the Transformation Guide.
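For illustration, a transaction control expression typically takes the following form. This sketch assumes hypothetical input ports named NEW_INVOICE and ERROR_FLAG; the built-in variables shown (TC_COMMIT_BEFORE, TC_ROLLBACK_BEFORE, TC_CONTINUE_TRANSACTION) and the full expression syntax are described in the Transformation Guide.

    -- Commit the open transaction before each row that starts a new invoice;
    -- roll it back if the row carries an error flag; otherwise keep the transaction open.
    IIF(ERROR_FLAG = 1, TC_ROLLBACK_BEFORE,
        IIF(NEW_INVOICE = 1, TC_COMMIT_BEFORE, TC_CONTINUE_TRANSACTION))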
When the session completes, the PowerCenter Server may write data to the target that was not
bound by commit rows. You can choose to commit at end of file or to roll back that open
transaction.
Note: If you use bulk loading with a user-defined commit session, the target may not recognize

the transaction boundaries. If the target connection group does not support transactions, the
PowerCenter Server writes the following message to the session log:
WRT_8234 Warning: Target Connection Group's connection doesn't support
transactions. Targets may not be loaded according to specified transaction
boundaries rules.

Rolling Back Transactions

The PowerCenter Server rolls back transactions in the following circumstances:

- Rollback evaluation. The transaction control expression returns a rollback value.
- Open transaction. You choose to roll back at the end of file.
- Roll back on error. You choose to roll back commit transactions if the PowerCenter Server encounters a non-fatal error.
- Roll back on failed commit. If any target connection group in a transaction control unit fails to commit, the PowerCenter Server rolls back all uncommitted data to the last successful commit point.

For more information on transaction control units, see "Understanding Transaction Control Units" on page 289.

Rollback Evaluation
If the transaction control expression returns a rollback value, the PowerCenter Server rolls
back the transaction and writes a message to the session log indicating that the transaction
was rolled back. It also indicates how many rows were rolled back.
The following message is a sample message that the PowerCenter Server writes to the session
log when the transaction control expression returns a rollback value:
WRITER_1_1_1> WRT_8326 User-defined rollback processed
WRITER_1_1_1> WRT_8331 Rollback statistics
WRT_8162 ===================================================
WRT_8330 Rolled back [333] inserted, [0] deleted, [0] updated rows for the
target [TCustOrders]

Roll Back Open Transaction

If the last row in the transaction control expression evaluates to TC_CONTINUE_TRANSACTION, the session completes with an open transaction. If you choose to roll back that open transaction, the PowerCenter Server rolls back the transaction and writes a message to the session log indicating that the transaction was rolled back.
The following message is a sample message indicating that Commit on End of File is disabled in the session properties:

WRITER_1_1_1> WRT_8168 End loading table [TCustOrders] at: Wed Nov 05 10:21:56 2003
WRITER_1_1_1> WRT_8325 Final rollback executed for the target [TCustOrders] at end of load

The following message is a sample message indicating that Commit on End of File is enabled in the session properties:

WRITER_1_1_1> WRT_8143 Commit at end of Load Order Group  Wed Nov 05 08:15:29 2003

Roll Back on Error


You can choose to roll back a transaction at the next commit point if the PowerCenter Server
encounters a non-fatal error. When the PowerCenter Server encounters a non-fatal error, it
processes the error row and continues processing the transaction. If the transaction boundary
is a commit row, the PowerCenter Server rolls back the entire transaction and writes it to the
reject file.
The reject file uses the following row indicators for rolled-back transactions:

- Rolled-back insert
- Rolled-back update
- Rolled-back delete

Note: The PowerCenter Server does not roll back a transaction if it encounters an error before it processes any row through the Transaction Control transformation.

Roll Back on Failed Commit

When the PowerCenter Server reaches the commit point for all targets in a transaction control unit, it issues commits sequentially for each target. If the commit fails for any target connection group within a transaction control unit, the PowerCenter Server rolls back all data to the last successful commit point. The PowerCenter Server cannot roll back committed transactions, but it does write the transactions to the reject file.
For example, use the mapping in Figure 10-5 on page 286 to read through the following scenario. This mapping has one transaction control unit and three target connection groups. The target names contain information about the target connection group. For example, TCG1_T1 represents the first target connection group and the first target.

1. The PowerCenter Server reaches the third commit point for all targets.
2. It begins to issue commits sequentially for each target.
3. The PowerCenter Server successfully commits to TCG1_T1 and TCG1_T2.
4. The commit fails for TCG2_T3.
5. The PowerCenter Server does not issue a commit for TCG3_T4.
6. The PowerCenter Server rolls back TCG2_T3 and TCG3_T4 to the second commit point, but it cannot roll back TCG1_T1 and TCG1_T2 to the second commit point because it successfully committed at the third commit point.
7. The PowerCenter Server writes the rows from TCG2_T3 and TCG3_T4 to the reject file. These are the rollback rows associated with the third commit point.
8. The PowerCenter Server writes the rows from TCG1_T1 and TCG1_T2 to the reject file. These are the commit rows associated with the third commit point.

Figure 10-5 illustrates PowerCenter Server behavior when it rolls back on a failed commit:

Figure 10-5. Roll Back on Failed Commit Example

The reject file uses the following row indicators for committed transactions in a failed transaction control unit:

- Committed insert
- Committed update
- Committed delete

Understanding Transaction Control

PowerCenter allows you to define transactions that the PowerCenter Server uses when it processes transformations and when it commits and rolls back data at a target. You can define a transaction based on a varying number of input rows. A transaction is a set of rows bound by commit or rollback rows, the transaction boundaries. Some rows may not be bound by transaction boundaries. This set of rows is called an open transaction. You can choose to commit at end of file or to roll back open transactions when you configure the session. For more information on the Commit On End of File session property, see "Setting Commit Properties" on page 292.
The PowerCenter Server can process a transformation one row at a time, for all rows in a transaction, or for all source rows together. Processing a transformation for all rows in a transaction allows you to include transformations, such as an Aggregator, in a real-time session. For more information on configuring how the PowerCenter Server processes a transformation, see "Transformation Scope" on page 287.
Transaction boundaries originate from transaction control points. A transaction control point is a transformation that defines or redefines the transaction boundary in the following ways:

- Generates transaction boundaries. The transformations that define transaction boundaries differ, depending on the session commit type:
  - Target-based and user-defined commit. Transaction generators generate transaction boundaries. A transaction generator is a transformation that generates both commit and rollback rows. The Transaction Control and Custom transformations are transaction generators.
  - Source-based commit. Some active sources generate commits. They do not generate rollback rows. Also, transaction generators generate commit and rollback rows. For a list of active sources that generate commits, see "Determining the Commit Source" on page 278.
- Drops incoming transaction boundaries. When a transformation drops incoming transaction boundaries and does not generate commits, the PowerCenter Server outputs all rows into an open transaction. All active sources that generate commits and all transaction generators drop incoming transaction boundaries.

For a list of transaction control points, see Table 10-1 on page 288.

Transformation Scope

You can configure how the PowerCenter Server applies the transformation logic to incoming data with the Transformation Scope transformation property. When the PowerCenter Server processes a transformation, it either drops transaction boundaries or preserves transaction boundaries, depending on the transformation scope and the mapping configuration.
You can choose one of the following values for the transformation scope:

- Row. Applies the transformation logic to one row of data at a time. Choose Row when a row of data does not depend on any other row. When you choose Row for a transformation connected to multiple upstream transaction control points, the PowerCenter Server drops transaction boundaries and outputs all rows from the transformation as an open transaction. When you choose Row for a transformation connected to a single upstream transaction control point, the PowerCenter Server preserves transaction boundaries.
- Transaction. Applies the transformation logic to all rows in a transaction. Choose Transaction when a row of data depends on all rows in the same transaction, but does not depend on rows in other transactions. When you choose Transaction, the PowerCenter Server preserves incoming transaction boundaries. It resets any cache, such as an aggregator or lookup cache, when it receives a new transaction. When you choose Transaction for a multiple input group transformation, you must connect all input groups to the same upstream transaction control point.
- All Input. Applies the transformation logic to all incoming data. When you choose All Input, the PowerCenter Server drops incoming transaction boundaries and outputs all rows from the transformation as an open transaction. Choose All Input when a row of data depends on all rows in the source.

Table 10-1 lists the transformation scope values available for each transformation:
Table 10-1. Transformation Scope Property Values

- Aggregator: Transaction - optional. All Input - default; transaction control point.
- Application Source Qualifier: Row - n/a; transaction control point.
- Custom*: Row - optional; transaction control point when configured to generate commits. Transaction - optional; transaction control point when configured to generate commits. All Input - default; transaction control point when it has one output group or when configured to generate commits.
- Expression: Row - default; the property does not display.
- External Procedure: Row - default; the property does not display.
- Filter: Row - default; the property does not display.
- Joiner: Transaction - optional. All Input - default; transaction control point.
- Lookup: Row - default; the property does not display.
- MQ Source Qualifier: Row - n/a; transaction control point.
- Normalizer (VSAM): Row - n/a; transaction control point.
- Normalizer (relational): Row - default; the property does not display.
- Rank: Transaction - optional. All Input - default; transaction control point.
- Router: Row - default; the property does not display.
- Sorter: Transaction - optional. All Input - default; transaction control point.
- Sequence Generator: Row - default; the property does not display.
- Source Qualifier: Row - n/a; transaction control point.
- Stored Procedure: Row - default; the property does not display.
- Transaction Control: Row - default; the property does not display. Transaction control point.
- Union: Row - default; the property does not display.
- Update Strategy: Row - default; the property does not display.
- XML Generator: Transaction - optional; Transaction when the flush on commit is set to create a new document. All Input - default; the property does not display.
- XML Parser: Row - default; the property does not display.
- XML Source Qualifier: Row - n/a; transaction control point.

*For more information on how the Transformation Scope property affects the Custom transformation, see "Custom Transformation" in the Transformation Guide.

Understanding Transaction Control Units

A transaction control unit is the group of targets connected to an active source that generates commits or an effective transaction generator. A transaction control unit may contain multiple target connection groups. For more information on target connection groups, see "Working with Target Connection Groups" on page 257.
When the PowerCenter Server reaches the commit point for all targets in a transaction control unit, it issues commits sequentially for each target.
Figure 10-6 illustrates transaction control units with a Transaction Control transformation:

Figure 10-6. Transaction Control Units

Note that T5_ora1 uses the same connection name as T1_ora1 and T2_ora1. Because T5_ora1 is connected to a separate Transaction Control transformation, it is in a separate transaction control unit and target connection group. If you connect T5_ora1 to tc_TransactionControlUnit1, it will be in the same transaction control unit as all targets, and in the same target connection group as T1_ora1 and T2_ora1.

Rules and Guidelines

Consider the following rules and guidelines when you work with transaction control:

- Transformations with Transaction transformation scope must receive data from a single transaction control point.
- The PowerCenter Server uses the transaction boundaries defined by the first upstream transaction control point for transformations with Transaction transformation scope.
- Transaction generators can be effective or ineffective for a target. The PowerCenter Server uses the transaction generated by an effective transaction generator when it loads data to a target. For more information on effective and ineffective transaction generators, see "Transaction Control Transformation" in the Transformation Guide.
- The Workflow Manager prevents you from using incremental aggregation in a session with an Aggregator transformation with Transaction transformation scope.
- Transformations with All Input transformation scope cause a transaction generator to become ineffective for a target in a user-defined commit session. For more information on using transaction generators in mappings, see "Transaction Control Transformation" in the Transformation Guide.
- The PowerCenter Server resets any cache at the beginning of each transaction for Aggregator, Joiner, Rank, and Sorter transformations with Transaction transformation scope.
- You can only choose the Transaction transformation scope for Joiner transformations when you use sorted input.
- When you add a partition point at a transformation with Transaction transformation scope, the Workflow Manager uses the pass-through partition type by default. You cannot change the partition type.

Setting Commit Properties

When you create a session, you can configure commit properties. The properties you set depend on the type of mapping and the type of commit you want the PowerCenter Server to perform.
Figure 10-7 shows the session commit properties that you set in the General Options settings of the Properties tab:

Figure 10-7. Session Commit Properties

Table 10-2 describes the session commit properties that you set in the General Options settings of the Properties tab:
Table 10-2. Session Commit Properties

- Commit Type. Target-based: selected by default if no transaction generator, or only ineffective transaction generators, are in the mapping. Source-based: choose for source-based commit if no transaction generator, or only ineffective transaction generators, are in the mapping. User-defined: selected by default if effective transaction generators are in the mapping.
- Commit Interval*. Target-based: default is 10,000. Source-based: default is 10,000. User-defined: n/a.
- Commit on End of File. Target-based: commits data at the end of the file; enabled by default, and you cannot disable this option. Source-based: commits data at the end of the file; clear this option if you want the PowerCenter Server to roll back open transactions. User-defined: commits data at the end of the file; clear this option if you want the PowerCenter Server to roll back open transactions.
- Roll Back Transactions on Errors. Target-based: n/a. Source-based and user-defined: if the PowerCenter Server encounters a non-fatal error, you can choose to roll back the transaction at the next commit point; when the PowerCenter Server encounters a transformation error, it only rolls back the transaction if the error occurs after the effective transaction generator for the target.

*Tip: When you bulk load to Microsoft SQL Server or Oracle targets, define a large commit interval. Microsoft SQL Server and Oracle start a new bulk load transaction after each commit. Increasing the commit interval reduces the number of bulk load transactions and increases performance.


Chapter 11

Recovering Data

This chapter covers the following topics:

- Overview, 296
- Preparing for Recovery, 297
- Recovering a Suspended Workflow, 305
- Recovering a Failed Workflow, 308
- Recovering a Session Task, 311
- Server Handling for Recovery, 314
- Completing Unrecoverable Sessions, 316

Overview
If you stop a session or if an error causes a session to stop unexpectedly, refer to the session
logs to determine the cause of the failure. Correct the errors, and then complete the session.
The method you use to complete the session depends on the configuration of the mapping
and the session, the specific failure, and how much progress the session made before it failed.
If the PowerCenter Server did not commit any data, run the session again. If the session
issued at least one commit and is recoverable, consider running the session in recovery mode.
Recovery allows you to restart a failed session and complete it as if the session had run
without pause. When the PowerCenter Server runs in recovery mode, it continues to commit
data from the point of the last successful commit. For more information on PowerCenter
Server processing during recovery, see Server Handling for Recovery on page 314.
All recovery sessions run as part of a workflow. When you recover a session, you also have the
option to run part of the workflow. Consider the configuration and design of the workflow
and the status of other tasks in the workflow before you choose a method of recovery.
Depending on the configuration and status of the workflow and session, you can choose one or more of the following recovery methods:

- Recover a suspended workflow. If the workflow suspends due to session failure, you can recover the failed session and resume the workflow. For details, see "Recovering a Suspended Workflow" on page 305.
- Recover a failed workflow. If the workflow fails as a result of session failure, you can recover the session and run the rest of the workflow. For details, see "Recovering a Failed Workflow" on page 308.
- Recover a session task. If the workflow completes, but a session fails, you can recover the session alone without running the rest of the workflow. You can also use this method to recover multiple failed sessions in a branched workflow. For details, see "Recovering a Session Task" on page 311.

For more information on session failure, see "Stopping and Aborting a Session" on page 200.

Preparing for Recovery


Before you perform recovery, you must configure the mapping, session, workflow, and target
database to ensure that the recovery session will consistently read, transform, and write data as
though the session had not failed.
Under certain circumstances, you cannot recover the session and must run it again. For more
information on completing unrecoverable sessions, see Completing Unrecoverable Sessions
on page 316.

Configuring the Mapping


When you design a mapping, consider requirements for session recovery. Configure the
mapping so that the PowerCenter Server can extract, transform, and load data with the same
results each time it runs the session.
Use the following guidelines when you configure the mapping:

- Sort the data from the source. This guarantees that the PowerCenter Server always receives source rows in the same order. You can do this by configuring the Sorted Ports option in the Source Qualifier or Application Source Qualifier transformation, or by adding a Sorter transformation configured for distinct output rows to the mapping after the source qualifier.
- Verify all targets receive data from transformations that produce repeatable data. Some transformations produce repeatable data. You can enable a session for recovery in the Workflow Manager when all targets in the mapping receive data from transformations that produce repeatable data. For more information on repeatable data, see "Working with Repeatable Data" on page 301.

Also, to perform consistent data recovery, the source, target, and transformation properties for
the recovery session must be the same as those for the failed session. Do not change the
properties of objects in the mapping before you run the recovery session.

Configuring the Session

To perform recovery on a failed session, the session must meet the following criteria:

- The session is enabled for recovery.
- The previous session run failed and the recovery information is accessible.

To enable recovery, select the Enable Recovery option in the Error Handling settings of the Configuration tab in the session properties.
If you enable recovery and also choose to truncate the target for a relational normal load session, the PowerCenter Server does not truncate the target when you run the session in recovery mode.
Use the following guidelines when you enable recovery for a partitioned session:

- The Workflow Manager configures all partition points to use the default partitioning scheme for each transformation when you enable recovery.
- The Workflow Manager sets the partition type to pass-through unless the transformation receiving the data is an Aggregator transformation, a Rank transformation, or a sorted Joiner transformation.
- You can only enable recovery for unsorted Joiner transformations with one partition.
- For Custom transformations, you can enable recovery only for transformations with one input group.
- The PowerCenter Server disables test load when you enable the session for recovery.

To perform consistent data recovery, the session properties for the recovery session must be the same as the session properties for the failed session. This includes the partitioning configuration and the session sort order.

Configuring the Workflow


The recovery method you choose for the workflow depends on the design and configuration
of the workflow. As with sessions, you can configure a workflow so that you can correct errors
and complete the workflow as though it ran without error.
If other tasks or workflows in your environment depend on the successful completion of a
session, configure the workflow containing the session to suspend on error. This is useful for
sequential and concurrent sessions because it prevents the PowerCenter Server from
continuing the workflow after the session fails. This is also useful if multiple concurrent
sessions fail or if other workflows depend on the successful completion of the workflow. For
details on recovering a suspended workflow, see Recovering a Suspended Workflow on
page 305.
If you do not want to configure the workflow to suspend on error, you can configure
recoverable sessions to fail the workflow if the session fails. This prevents the PowerCenter
Server from continuing to run the workflow after the session fails. In this case, you may want
to perform recovery by running the part of the workflow that did not yet run. For more
information, see Recovering a Failed Workflow on page 308.
You can also allow the workflow to complete even if sessions or other tasks fail. You can then
choose to recover only the failed session tasks. This allows you to recover the sessions without
running previously successful tasks. For more information, see Recovering a Session Task on
page 311.

Configuring the Target Database


When the PowerCenter Server runs a session in recovery mode, it uses information in
recovery tables that it creates on the target database system. The PowerCenter Server creates
the recovery tables when it runs a session enabled for recovery. If the tables already exist, the
PowerCenter Server writes information to them.


The PowerCenter Server creates the following recovery tables in the target database:

- PM_RECOVERY. This table records target load information during the session run. The PowerCenter Server removes the information from this table after each successful session and initializes the information at the beginning of subsequent sessions.
- PM_TGT_RUN_ID. This table records information the PowerCenter Server uses to identify each target on the database. The information remains in the table between session runs.

If you want the PowerCenter Server to create the recovery tables, you must grant table creation privileges to the database user name for the target database connection. If you do not want the PowerCenter Server to create the recovery tables, you must create the recovery tables manually.
Do not edit or drop the recovery tables while recovery is enabled. If you disable recovery, the PowerCenter Server does not remove the recovery tables from the target database. You must manually remove the recovery tables.
Table 11-1 describes the format of PM_RECOVERY:

Table 11-1. PM_RECOVERY Table Definition

Column Name        Datatype
REP_GID            VARCHAR(240)
WFLOW_ID           NUMBER
SUBJ_ID            NUMBER
TASK_INST_ID       NUMBER
TGT_INST_ID        NUMBER
PARTITION_ID       NUMBER
TGT_RUN_ID         NUMBER
RECOVERY_VER       NUMBER
CHECK_POINT        NUMBER
ROW_COUNT          NUMBER

Table 11-2 describes the format of PM_TGT_RUN_ID:

Table 11-2. PM_TGT_RUN_ID Table Definition

Column Name        Datatype
LAST_TGT_RUN_ID    NUMBER

Note: If you manually create the PM_TGT_RUN_ID table, you must specify a value other than zero in the LAST_TGT_RUN_ID column to ensure that the session runs successfully in recovery mode.
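If you create the recovery tables manually, the exact SQL depends on the target database. The following is a minimal sketch that simply mirrors the column definitions in Table 11-1 and Table 11-2 for a database that accepts these datatypes, such as Oracle; verify the definitions for your target database and PowerCenter Server version before running them.

    CREATE TABLE PM_RECOVERY (
        REP_GID       VARCHAR(240),
        WFLOW_ID      NUMBER,
        SUBJ_ID       NUMBER,
        TASK_INST_ID  NUMBER,
        TGT_INST_ID   NUMBER,
        PARTITION_ID  NUMBER,
        TGT_RUN_ID    NUMBER,
        RECOVERY_VER  NUMBER,
        CHECK_POINT   NUMBER,
        ROW_COUNT     NUMBER
    );

    CREATE TABLE PM_TGT_RUN_ID (
        LAST_TGT_RUN_ID NUMBER
    );

    -- Seed LAST_TGT_RUN_ID with a value other than zero, as the note above requires.
    INSERT INTO PM_TGT_RUN_ID (LAST_TGT_RUN_ID) VALUES (1);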


Creating pmcmd Scripts

You can use pmcmd to perform recovery from the command line or in a script. When you use pmcmd commands in a script, pmcmd indicates the success or failure of the command with a return code. The following return codes apply to recovery sessions.
Table 11-3 describes the return codes for pmcmd that relate to recovery:

Table 11-3. pmcmd Return Codes for Recovery

Code  Description
12    The PowerCenter Server cannot start recovery because the session or workflow is scheduled, suspending, waiting for an event, waiting, initializing, aborting, stopping, disabled, or running.
19    The PowerCenter Server cannot start the session in recovery mode because the workflow is configured to run continuously.

For details on additional pmcmd return codes, see "pmcmd Return Codes" on page 590.
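As an illustration, a recovery script can branch on these return codes. The following sketch is a generic Bourne shell fragment that assumes a variable PMCMD_RECOVER holding the complete pmcmd command for your environment; the server address, user, folder, and workflow or task names are site-specific, and the exact command syntax and options are described in "Using pmcmd" on page 581.

    # Run the pmcmd recovery command assembled for this environment (hypothetical variable).
    $PMCMD_RECOVER
    rc=$?

    case $rc in
        0)  echo "Recovery request completed successfully." ;;
        12) echo "Cannot start recovery: the session or workflow is scheduled, suspending, waiting, initializing, aborting, stopping, disabled, or running." ;;
        19) echo "Cannot start the session in recovery mode: the workflow is configured to run continuously." ;;
        *)  echo "pmcmd returned code $rc; check the pmcmd return code list." ;;
    esac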


Working with Repeatable Data

You can enable a session for recovery in the Workflow Manager when all targets in the mapping receive data from transformations that produce repeatable data. All transformations have a property that determines when the transformation produces repeatable data. For most transformations, this property is hidden. However, you can write the Custom transformation procedure to output repeatable data, and then configure the Custom transformation Output Is Repeatable property to match the procedure behavior.
Transformations can produce repeatable data under the following circumstances:

- Never. The order of the output data is inconsistent between session runs. This is the default for active Custom transformations.
- Based on input order. The output order is consistent between session runs when the input data order for all input groups is consistent between session runs. This is the default for passive Custom transformations.
- Always. The order of the output data is consistent between session runs even if the order of the input data is inconsistent between session runs.
- Based on transformation configuration. The transformation produces repeatable data depending on how you configure the transformation. You can always enable the session for recovery, but you may get inconsistent results depending on how you configure the transformation.
Table 11-4 lists which transformations produce repeatable data:

Table 11-4. Transformations that Output Repeatable Data

- Source Qualifier (relational): Based on transformation configuration. Use sorted ports to produce repeatable data, or add a transformation that produces repeatable data immediately after the Source Qualifier transformation. If you do not do either of these, you might get inconsistent results.
- Source Qualifier (flat file): Always.
- Application Source Qualifier: Based on transformation configuration. Use sorted ports for relational sources, such as Siebel sources, to produce repeatable data, or add a transformation that produces repeatable data immediately after the Application Source Qualifier transformation. If you do not do either of these, you might get inconsistent results.
- MQ Source Qualifier: Always.
- XML Source Qualifier: Always.
- Aggregator: Always.
- Custom: Based on transformation configuration. Configure the Output is Repeatable property according to the Custom transformation procedure behavior.
- Expression: Based on input order.
- External Procedure: Based on input order.
- Filter: Based on input order.
- Joiner: Based on input order.
- Lookup: Based on input order.
- Normalizer (VSAM): Always. You can enable the session for recovery; however, you might get inconsistent results if you run the session in recovery mode. The Normalizer transformation generates source data in the form of primary keys. Recovering a session might generate different values than if the session completed successfully. However, the PowerCenter Server continues to produce unique key values.
- Normalizer (pipeline): Based on input order.
- Rank: Always.
- Router: Based on input order.
- Sequence Generator: Based on transformation configuration. You must reset the sequence value to the value set in the failed session run. If you do not, you might get inconsistent results.
- Sorter, configured for distinct output rows: Always.
- Sorter, not configured for distinct output rows: Based on input order.
- Stored Procedure: Based on input order.
- Transaction Control: Based on input order.
- Union: Never.
- Update Strategy: Based on input order.
- XML Generator: Always.
- XML Parser: Always.

To run a session in recovery mode, you must first enable the failed session for recovery. To enable a session for recovery, the Workflow Manager verifies that all targets in the mapping receive data from transformations that produce repeatable data. The Workflow Manager uses the values in Table 11-4 to determine whether or not you can enable a session for recovery. However, the Workflow Manager cannot verify whether you configure some transformations, such as the Sequence Generator transformation, correctly, and it always allows you to enable these sessions for recovery. You may get inconsistent results if you do not configure these transformations correctly.

You cannot enable a session for recovery in the Workflow Manager under the following circumstances:

- You connect a transformation that never produces repeatable data directly to a target. To enable this session for recovery, you can add a transformation that always produces repeatable data between the transformation that never produces repeatable data and the target.
- You connect a transformation that never produces repeatable data directly to a transformation that produces repeatable data based on input order. To enable this session for recovery, you can add a transformation that always produces repeatable data immediately after the transformation that never produces repeatable data.

When a mapping contains a transformation that never produces repeatable data, you can add a transformation that always produces repeatable data immediately after it.
Note: In some cases, you might get inconsistent data if you run some sessions in recovery mode. For a description of circumstances that might lead to inconsistent data, see "Completing Unrecoverable Sessions" on page 316.
Figure 11-1 illustrates a mapping you can enable for recovery:

Figure 11-1. Mapping You Can Enable for Recovery

The mapping contains an Aggregator transformation that always produces repeatable data. The Aggregator transformation provides data for the Lookup and Expression transformations. Lookup and Expression transformations produce repeatable data if they receive repeatable data. Therefore, the target receives repeatable data, and you can enable this session for recovery.

Figure 11-2 illustrates a mapping you cannot enable for recovery:

Figure 11-2. Mapping You Cannot Enable for Recovery

The mapping contains two Source Qualifier transformations that produce repeatable data. However, the mapping contains a Union transformation and a Custom transformation downstream that never produce repeatable data. The Lookup transformation only produces repeatable data if it receives repeatable data. Therefore, the target does not receive repeatable data, and you cannot enable this session for recovery.
You can modify this mapping to enable the session for recovery by adding a Sorter transformation configured for distinct output rows immediately after transformations that never output repeatable data. Since the Union transformation is connected directly to another transformation that never produces repeatable data, you only need to add a Sorter transformation after the Custom transformation, as shown in the mapping in Figure 11-3:

Figure 11-3. Modified Mapping You Can Enable for Recovery

Recovering a Suspended Workflow


You can configure the workflow to suspend if a task fails. If a session that is enabled for
recovery fails, you can correct the error that caused the session to fail and resume the
suspended workflow in recovery mode. When the PowerCenter Server resumes the workflow,
it runs the failed session in recovery mode. If the recovery session succeeds, the PowerCenter
Server runs the rest of the workflow.
You can recover a suspended workflow with sequential or concurrent sessions. For workflows
with either sequential or concurrent sessions, suspending the workflow on error is useful if
successive tasks in the workflow depend on the success of the previous sessions. For a
workflow with concurrent sessions, resuming a suspended workflow in recovery mode also
allows you to simultaneously recover concurrent failed sessions.
You can only resume a suspended workflow in recovery mode if a session that is enabled for
recovery fails. If a session fails that is not enabled for recovery, you can resume the workflow
normally. When you resume the workflow, the PowerCenter Server restarts the session. If the
session succeeds, the PowerCenter Server runs the rest of the workflow.
To configure the workflow to suspend on error, enable the Suspend On Error option on the
General tab of the workflow properties. For more information about suspending the
workflow, see Suspending the Workflow on page 127.
For steps on recovering a suspended workflow, see Steps for Recovering a Suspended
Workflow on page 307.

Recovering a Suspended Workflow with Sequential Sessions


When a sequential session enabled for recovery fails, the PowerCenter Server places the
workflow in a suspended state. While the workflow is suspended, you can correct the error
that caused the session to fail.
After you correct the error, you can resume the workflow in recovery mode. When it resumes
the workflow, the PowerCenter Server starts the failed session in recovery mode.
If the recovery session succeeds, the PowerCenter Server runs the rest of the workflow. If the
recovery session fails, the PowerCenter Server suspends the workflow again.

Example
Suppose the workflow w_ItemOrders contains two sequential sessions. In this workflow,
s_ItemSales is enabled for recovery, and the workflow is configured to suspend on error.


Figure 11-4 illustrates w_ItemOrders:

Figure 11-4. Resuming a Suspended Workflow with Sequential Sessions

Suppose s_ItemSales fails, and the PowerCenter Server suspends the workflow. You correct the error and resume the workflow in recovery mode. The PowerCenter Server recovers the session successfully, and then runs s_UpdateOrders.
If s_UpdateOrders also fails, the PowerCenter Server suspends the workflow again. You correct the error, but you cannot resume the workflow in recovery mode because you did not enable the session for recovery. Instead, you resume the workflow. The PowerCenter Server starts s_UpdateOrders from the beginning, completes the session successfully, and then runs the StopWorkflow control task.

Recovering a Suspended Workflow with Concurrent Sessions


When a concurrent session enabled for recovery fails, the PowerCenter Server places the
workflow in a suspending state while it completes any other concurrently running tasks. After
concurrent tasks succeed or fail, the PowerCenter Server places the workflow in a suspended
state. While the workflow is suspended, you can correct the error that caused the session to
fail. If concurrent tasks failed, you can also correct those errors.
After you correct the error, you can resume the workflow in recovery mode. The PowerCenter
Server runs the failed session in recovery mode. If multiple concurrent sessions failed, the
PowerCenter Server starts all failed sessions enabled for recovery in recovery mode, and
restarts other concurrent tasks or sessions not enabled for recovery.
After successful recovery or completion of all failed sessions and tasks, the PowerCenter Server
completes the rest of the workflow. If a recovery session or task fails again, the PowerCenter
Server suspends the workflow.

Example
Suppose you have the workflow w_ItemsDaily, containing three concurrent sessions,
s_SupplierInfo, s_PromoItems, and s_ItemSales. In this workflow, s_SupplierInfo and
s_PromoItems are enabled for recovery, and the workflow is configured to suspend on error.


Figure 11-5 illustrates w_ItemsDaily:

Figure 11-5. Resuming a Suspended Workflow with Concurrent Sessions

Suppose s_SupplierInfo fails while the PowerCenter Server is running the three sessions. The PowerCenter Server places the workflow in a suspending state and continues running the other two sessions. s_PromoItems and s_ItemSales also fail, and the PowerCenter Server then places the workflow in a suspended state.
You correct the errors that caused each session to fail and then resume the workflow in recovery mode. The PowerCenter Server starts s_SupplierInfo and s_PromoItems in recovery mode. Since s_ItemSales is not enabled for recovery, the PowerCenter Server restarts that session from the beginning. The PowerCenter Server runs the three sessions concurrently.
After all sessions succeed, the PowerCenter Server runs the Command task.

Steps for Recovering a Suspended Workflow

You can use the Workflow Monitor to resume a workflow in recovery mode. If the workflow or session is currently scheduled, waiting, or disabled, the PowerCenter Server cannot run the session in recovery mode. You must stop or unschedule the workflow, or stop the session.

To resume a workflow or worklet in recovery mode:

1. In the Navigator, select the suspended workflow you want to resume.
2. Choose Task-Resume/Recover.
   The PowerCenter Server resumes the workflow.

You can also use pmcmd to resume a workflow in recovery mode. For more information, see "Using pmcmd" on page 581.

Recovering a Failed Workflow


You can configure a session to fail the workflow if the session fails. If the session is also
enabled for recovery, you can correct the error that caused the session to fail and recover the
workflow from the failed session. When the PowerCenter Server recovers the workflow from
the failed session, it runs the failed session in recovery mode. If the recovery session succeeds,
the PowerCenter Server runs the rest of the workflow.
You can recover a workflow from a failed sequential or concurrent session. You might want to
fail a workflow as a result of session failure if successive tasks in the workflow depend on the
success of the previous sessions.
To configure a session to fail the workflow if the session fails, enable the Fail Parent If This
Task Fails option on the General tab of the session properties. For more information, see
Working with Tasks on page 131.
For steps on recovering a failed workflow, see Steps for Recovering a Failed Workflow on
page 310.

Recovering a Failed Workflow with Sequential Sessions


When a sequential session fails that is enabled for recovery and configured to fail the
workflow, the PowerCenter Server fails the workflow. You can correct the error that caused the
session to fail and recover the workflow from the failed session. When the PowerCenter Server
recovers the workflow from the session, it runs the session in recovery mode.
If the recovery session succeeds, the PowerCenter Server runs the rest of the workflow. If the
recovery session fails, the PowerCenter Server fails the workflow again.

Example
Suppose the workflow w_ItemOrders contains two sequential sessions. s_ItemSales is enabled
for recovery and also configured to fail the parent workflow if it fails.
Figure 11-6 illustrates w_ItemOrders:

Figure 11-6. Recovering Part of a Workflow With Sequential Sessions

In the figure, s_ItemSales is enabled for recovery, and both sessions are configured to fail the workflow if either session fails.

Suppose s_ItemSales fails, and the PowerCenter Server fails the workflow. You correct the
error and recover the workflow from s_ItemSales. The PowerCenter Server successfully
recovers the session, and then runs the next task in the workflow, s_UpdateOrders.
Suppose s_UpdateOrders also fails, and the PowerCenter Server fails the workflow again. You
correct the error, but you cannot recover the workflow from the session. Instead, you start the
workflow from the session. The PowerCenter Server starts s_UpdateOrders from the
beginning, completes the session successfully, and then runs the StopWorkflow control task.

Recovering a Failed Workflow with Concurrent Sessions


When a concurrent session fails that is enabled for recovery and configured to fail the
workflow, the PowerCenter Server fails the workflow. You can then correct the error that
caused the session to fail and recover the workflow from the failed session. When the
PowerCenter Server recovers the workflow, it runs the session in recovery mode. If the
recovery session succeeds, the PowerCenter Server runs successive tasks in the workflow in the
same path as the session. The PowerCenter Server does not recover or restart concurrent tasks
when you recover a workflow from a failed session.
If multiple concurrent sessions fail that are enabled for recovery and configured to fail the workflow, the PowerCenter Server fails the workflow when the first session fails. Concurrent sessions continue to run until they succeed or fail. After all concurrent sessions complete, you can correct the errors that caused the failures.
After you correct the errors, you can recover the workflow. If multiple sessions enabled for recovery fail, individually recover all but one failed session. You can then recover the workflow from the remaining failed session. This ensures that the PowerCenter Server recovers all concurrent failed sessions before it runs the rest of the workflow. For details on recovering a session individually, see "Recovering a Session Task" on page 311.

Example
Suppose the workflow w_ItemsDaily contains three concurrent sessions, s_SupplierInfo, s_PromoItems, and s_ItemSales. In this workflow, s_SupplierInfo and s_PromoItems are enabled for recovery, and each session is configured to fail the parent workflow if it fails.
Figure 11-7 illustrates w_ItemsDaily:

Figure 11-7. Recovering Part of a Workflow with Concurrent Sessions

Suppose s_SupplierInfo fails while the three concurrent sessions are running, and the
PowerCenter Server fails the workflow. s_PromoItems and s_ItemSales also fail. You correct
the errors that caused each session to fail.
In this case, you must combine two recovery methods to run all sessions before completing
the workflow. You recover s_PromoItems individually. You cannot recover s_ItemSales
because it is not enabled for recovery, but you start the session from the beginning. After the
PowerCenter Server successfully completes s_PromoItems and s_ItemSales, you recover the
workflow from s_SupplierInfo. The PowerCenter Server runs the session in recovery mode,
and then runs the Command task.

Steps for Recovering a Failed Workflow

You can use the Workflow Manager or Workflow Monitor to recover a failed workflow. If the workflow or session is currently scheduled, waiting, or disabled, the PowerCenter Server cannot run the session in recovery mode. You must stop or unschedule the workflow, or stop the session.

To recover a failed workflow using the Workflow Manager:

1. Select the failed session in the Navigator or in the Workflow Designer workspace.
2. Right-click the failed session and choose Recover Workflow from Task.
   The PowerCenter Server runs the failed session in recovery mode, and then runs the rest of the workflow.

To recover a failed workflow using the Workflow Monitor:

1. Select the failed session in the Navigator.
2. Right-click the session and choose Recover Workflow From Task, or choose Task-Recover Workflow From Task.
   The PowerCenter Server runs the session in recovery mode.

You can also use pmcmd to recover a failed workflow. For more information, see "Using pmcmd" on page 581.

Recovering a Session Task


If you do not configure the workflow to suspend on error, and you do not configure the
workflow to fail if sessions or tasks fail, the PowerCenter Server completes the workflow even
if it encounters errors. If a session fails, but other tasks in the workflow complete successfully,
you may want to recover only the failed session. When the PowerCenter Server recovers a
session, it runs the session in recovery mode.
You can recover sequential or concurrent sessions. For workflows with sequential sessions,
individually recovering a session is useful if the rest of the workflow succeeded and you need
to recover the failed session. This allows you to recover the session without restarting
successful tasks.
For workflows with concurrent sessions, this method is useful if multiple concurrent sessions
fail and also cause the workflow to fail. You can individually recover concurrent sessions and
individually start subsequent tasks in the workflow paths until the paths converge at a single
task.
In other complex, branched workflows, individually recovering multiple failed sessions allows
you to specify the order in which the sessions run.

Recovering Sequential Sessions


When a sequential session enabled for recovery fails, and the workflow is not configured to
suspend or fail on error, the PowerCenter Server continues to run the workflow. You can
correct the error that caused the session to fail.
After you correct the error, you can individually recover the failed session. When the
PowerCenter Server individually recovers a session, it runs the session in recovery mode. It
does not run other tasks in the workflow.

Recovering Concurrent Sessions


When a concurrent session enabled for recovery fails, the PowerCenter Server continues to
run the workflow. Other tasks and the workflow may succeed. You can correct the error that
caused the session to fail. If concurrent tasks failed, you can also correct those errors. After
you correct the errors, you can individually recover each session without running the rest of
the workflow.
If multiple concurrent sessions fail that are enabled for recovery and configured to fail the
workflow on session failure, the PowerCenter Server fails the workflow. You can correct the
errors that caused the sessions to fail. After you correct the errors, you can individually recover
each session. Once all concurrent tasks are recovered or complete, you can start the workflow
from the task where the concurrent paths converge.


Example
Suppose the workflow w_ItemsDaily contains three concurrently running sessions. Each
session is enabled for recovery and configured to fail the workflow if the session fails.
Figure 11-8 illustrates w_ItemsDaily:
Figure 11-8. Recovering Concurrent Sessions Individually (the figure marks the sessions enabled for recovery and the sessions configured to fail the parent workflow if the session fails)

Suppose s_ItemSales fails and the PowerCenter Server fails the workflow. s_PromoItems and
s_SupplierInfo also fail. You correct the errors that caused the sessions to fail.
After you correct the errors, you individually recover each failed session. The PowerCenter
Server successfully recovers the sessions. The workflow paths after the sessions converge at the
Command task, allowing you to start the workflow from the Command task and complete
the workflow.
Alternatively, after you correct the errors, you could also individually recover two of the three
failed sessions. After the PowerCenter Server successfully recovers the sessions, you can
recover the workflow from the third session. The PowerCenter Server then recovers the third
session and, on successful recovery, runs the rest of the workflow.

Steps for Recovering a Session Task


You can use the Workflow Manager or Workflow Monitor to recover a failed session in a
workflow. If the workflow or session is currently scheduled, waiting, or disabled, the
PowerCenter Server cannot run the session in recovery mode. You must stop or unschedule
the workflow or stop the session.
To recover a failed session using the Workflow Manager:
1. Select the failed session in the Navigator or in the Workflow Designer workspace.
2. Right-click the failed session and choose Recover Task.
   The PowerCenter Server runs the session in recovery mode.

To recover a failed session using the Workflow Monitor:
1. Select the failed session in the Navigator.
2. Right-click the session and choose Recover Task.
   Or, choose Task-Recover Task.
   The PowerCenter Server runs the session in recovery mode.

You can also use pmcmd to recover a failed session. For more information, see Using pmcmd
on page 581.


Server Handling for Recovery


The PowerCenter Server writes recovery data to relational target databases when you run a
session enabled for recovery. If the session fails, the PowerCenter Server uses the recovery data
to determine the point at which it continues to commit data during the recovery session.

Verifying Recovery Tables


The PowerCenter Server creates recovery information in cache files for all sessions enabled for
recovery. It also creates recovery tables on the target database for relational targets during the
initial session run.
If the session is enabled for recovery, the PowerCenter Server creates recovery information in
cache files during the normal session run. The PowerCenter Server stores the cache files in the
directory specified for $PMCacheDir. The PowerCenter Server generates file names in the
format PMGMD_METADATA_*.dat. Do not alter these files or remove them from the
PowerCenter Server cache directory. The PowerCenter Server cannot run the recovery session
if you delete the recovery cache files.
If the session writes to a relational database and is enabled for recovery, the PowerCenter
Server also verifies the recovery tables on the target database for all relational targets at the
beginning of a normal session run. If the tables do not exist, the PowerCenter Server creates
them. If the database user name the PowerCenter Server uses to connect to the target database
does not have permission to create the recovery tables, you must manually create them. For
information about recovery table structure, see Configuring the Target Database on
page 298.
During the session run, the PowerCenter Server writes target load information for normal
load targets into the recovery tables. If the session fails, the PowerCenter Server uses this
information to complete the session in recovery mode. If the session is configured to write to
relational targets in bulk mode, the PowerCenter Server does not write recovery information
to the recovery tables.
If the session completes successfully, the PowerCenter Server deletes all recovery cache files
and removes recovery table entries that are related to the session. The PowerCenter Server
initializes the information in the recovery tables at the beginning of the next session run.
The PowerCenter Server also uses the recovery cache files to store messages from real-time
sources. For more information, see your PowerCenter Connect documentation.

Running Recovery
If a session enabled for recovery fails, you can run the session in recovery mode. The
PowerCenter Server moves a recovery session through the states of a normal session:
scheduled, waiting, running, succeeded, and failed. When the PowerCenter Server starts the
recovery session, it runs all pre-session tasks.


For relational normal load targets, the PowerCenter Server performs incremental load
recovery. It uses the recovery information created during the normal session run to determine
the point at which the session stopped committing data to the target. It then continues
writing data to the target. On successful recovery, the PowerCenter Server removes the
recovery information from the tables.
For example, if the PowerCenter Server commits 10,000 rows before the session fails, when
you run the session in recovery mode, the PowerCenter Server bypasses the rows up to 10,000
and starts loading with row 10,001.
If the session writes to a relational target in bulk mode, the PowerCenter Server performs the
entire writer run. If the Truncate Target Table option is enabled in the session properties, the
PowerCenter Server truncates the target before loading data.
If the session writes to a flat file or XML file, the PowerCenter Server performs full load
recovery. It overwrites the existing output file and performs the entire writer run. If the
session writes to heterogeneous targets, the PowerCenter Server performs incremental load
recovery for all relational normal load targets and full load recovery for all other target types.
On successful recovery, the PowerCenter Server deletes recovery cache files associated with the
session. It also performs all post-session tasks.


Completing Unrecoverable Sessions


In some cases, you cannot perform recovery for a session. There may also be circumstances
that cause a recovery session to fail or produce inconsistent data. If you cannot recover a
session, you can run the session again.
You cannot run sessions in recovery mode under the following circumstances:
- You change the number of partitions. If you change the number of partitions after the session fails, the recovery session fails.
- Recovery table is empty or missing from the target database. The PowerCenter Server fails the recovery session under the following circumstances:
  - You deleted the table after the PowerCenter Server created it.
  - The session enabled for recovery succeeded, and the PowerCenter Server removed the recovery information from the table.
- Recovery cache file is missing. The PowerCenter Server fails the recovery session if the recovery cache file is missing from the PowerCenter Server cache directory.
- The PowerCenter Server performing recovery is on a different operating system. The operating system of the PowerCenter Server that runs the recovery session must be the same as the operating system of the PowerCenter Server that ran the failed session.

You might get inconsistent data if you perform recovery under the following circumstances:
- You change the partitioning configuration. If you change any partitioning options after the session fails, you may get inconsistent data.
- Source data is not sorted. To perform a successful recovery, the PowerCenter Server must process source rows during recovery in the same order it processes them during the initial session. Use the Sorted Ports option in the Source Qualifier transformation or add a Sorter transformation directly after the Source Qualifier transformation.
- The sources or targets change after the initial session failure. If you drop or create indexes, or edit data in the source or target tables before recovering a session, the PowerCenter Server may return missing or repeat rows.
- The session writes to a relational target in bulk mode, but the session is not configured to truncate the target table. The PowerCenter Server may load duplicate rows to the target during the recovery session.
- The mapping uses a Normalizer transformation. The Normalizer transformation generates source data in the form of primary keys. Recovering a session might generate different values than if the session completed successfully. However, the PowerCenter Server will continue to produce unique key values.
- The mapping uses a Sequence Generator transformation. The Sequence Generator transformation generates source data in the form of sequence values. Recovering a session might generate different values than if the session completed successfully.
  If you want to ensure the same sequence data is generated during the recovery session, you can reset the value specified as the Current Value in the Sequence Generator transformation properties to the same value used when you ran the failed session. If you do not reset the Current Value, the PowerCenter Server will continue to generate unique Sequence values.
- The session performs incremental aggregation and the PowerCenter Server stops unexpectedly. If the PowerCenter Server stops unexpectedly while running an incremental aggregation session, the recovery session cannot use the incremental aggregation cache files. Rename the backup cache files for the session from PMAGG*.idx.bak and PMAGG*.dat.bak to PMAGG*.idx and PMAGG*.dat before you perform recovery (see the example after this list).
- The PowerCenter Server data movement mode changes after the initial session failure. If you change the data movement mode before recovering the session, the PowerCenter Server might return incorrect data.
- The PowerCenter Server code page or source and target code pages change after the initial session failure. If you change the source, target, or PowerCenter Server code pages, the PowerCenter Server might return incorrect data. You can perform recovery if the new code pages are two-way compatible with the original code pages.
- The PowerCenter Server runs in Unicode mode and you change the session sort order. When the PowerCenter Server runs in Unicode mode, it sorts character data based on the sort order selected for the session. Do not perform recovery if you change the session sort order after the session fails.
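For example, on UNIX you might restore the backup incremental aggregation cache files with commands like the following before you start the recovery session. This is only a sketch: the directory path and cache file names shown are hypothetical, so substitute the actual PMAGG file names found in the directory specified for $PMCacheDir:

   cd /informatica/pmserver/cache        # the directory specified for $PMCacheDir (path is hypothetical)
   mv PMAGGexample.idx.bak PMAGGexample.idx
   mv PMAGGexample.dat.bak PMAGGexample.dat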


Chapter 12

Sending Email
This chapter covers the following topics:

Overview, 320

Configuring Email on UNIX, 321

Configuring Email on Windows, 322

Working with Email Tasks, 328

Working with Post-Session Email, 332

Working with Suspension Email, 339

Using Email Tasks in a Workflow or Worklet, 341

Tips, 342


Overview
You can send email to designated recipients when the PowerCenter Server runs a workflow.
For example, if you want to track how long a session takes to complete, you can configure the
session to send an email containing the time and date the session starts and completes. Or, if
you want the PowerCenter Server to notify you when a workflow suspends, you can configure
the workflow to send email when it suspends.
When you create a workflow or worklet, you can include the following types of email:

Email task. You can include reusable and non-reusable Email tasks anywhere in the
workflow or worklet. For more information, see Using Email Tasks in a Workflow or
Worklet on page 341.

Post-session email. You can configure the session so the PowerCenter Server sends an
email when the session completes or fails. You create an Email task and use it for post-session email. For more information, see Working with Post-Session Email on page 332.
When you configure the subject and body of post-session email, you can use email
variables to include information about the session run, such as session name, status, and
the total number of records loaded. You can also use email variables to attach the session
log or other files to email messages. For more information, see Email Variables and
Format Tags on page 333.

Suspension email. You can configure the workflow so the PowerCenter Server sends an
email when the workflow suspends. You create an Email task and use it for suspension
email. For more information, see Working with Suspension Email on page 339.

Before you can configure a session or workflow to send email, you need to create an Email
task. For more information, see Working with Email Tasks on page 328.
The PowerCenter Server on Windows sends email in MIME format. This allows you to
include characters in the subject and body that are not in 7-bit ASCII. For more information
on the MIME format or the MIME decoding process, see your email documentation.
Before creating Email tasks, configure the PowerCenter Server to send email. For more
information, see Configuring Email on UNIX on page 321 and Configuring Email on
Windows on page 322.


Configuring Email on UNIX


The PowerCenter Server on UNIX uses rmail to send email. To send email, the repository
user who starts the PowerCenter Server must have the rmail tool installed in the path.
If you want to send email to more than one person, separate the email address entries with a
comma. Do not put spaces between addresses.
To verify the rmail tool is accessible on AIX:
1. Log on to the UNIX system as the Informatica user who starts the PowerCenter Server.
2. Type the following lines at the prompt and press Enter:
   rmail <your fully qualified email address>,<second fully qualified email address>
   From <your_user_name>
3. To indicate the end of the message, type ^D.
   You should receive a blank email from the email account of the user you specify in the From line. If not, locate the directory where rmail resides and add that directory to the path.

To verify the rmail tool is accessible on all other UNIX machines:


1. Log on to the UNIX system as the Informatica user who starts the PowerCenter Server.
2. Type the following line at the prompt and press Enter:
   rmail <your fully qualified email address>,<second fully qualified email address>
3. To indicate the end of the message, type . on a line of its own and press Enter. Or, type ^D.
   You should receive a blank email from the email account of the Informatica user. If not, locate the directory where rmail resides and add that directory to the path.

Once you verify that rmail is installed correctly, you can send email. For more information on
configuring email, see Working with Email Tasks on page 328.
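For example, following the AIX procedure above, you might type the following to send a test message to two recipients. The addresses and user name are placeholders; note that the two addresses are separated by a comma with no spaces between them:

   rmail jsmith@example.com,pc_admin@example.com
   From informatica_user
   ^D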


Configuring Email on Windows


The PowerCenter Server on Windows uses Microsoft Outlook to send email using the MAPI
interface. You must meet the following requirements to send email on a PowerCenter Server
on Windows:

- Install the Microsoft Outlook mail client on the PowerCenter Server machine.
- Run Microsoft Outlook on a Microsoft Exchange Server.
- Create a Windows user account that has Log on as a service rights and a Microsoft Outlook profile.

To configure the PowerCenter Server on Windows to send email, you must perform the
following steps:
1. Verify the Informatica Service startup account.
2. Configure a Microsoft Outlook profile for the Informatica Service startup account.
3. Configure Logon network security.
4. Create distribution lists in the Personal Address Book in Microsoft Outlook.
5. Configure the PowerCenter Server to send email using the Microsoft Outlook profile you created in step 2.

Step 1. Verify the Informatica Service Startup Account


You must have an Informatica Service startup account, which grants a user the Log on as a
service right to start the Informatica Service. Verify the Informatica Service startup account so
that you can create a Microsoft Outlook profile for the user who has Log on as a service right
for the Informatica Service Start Account.
For details on verifying service rights, see the Troubleshooting section of Installing and
Configuring the PowerCenter Server on Windows in the Installation and Configuration
Guide.

Step 2. Configure a Microsoft Outlook User


You must set up a Microsoft Outlook user for the Informatica Service startup account before
configuring the PowerCenter Server to send email. The user profile must contain the
following services:

- Microsoft Exchange Server
- Personal Address Book

Use the same log on name for both the Microsoft Outlook account you create and the user
you grant Log on as a service rights in the Informatica Service startup account.
Note: If you do not already have a Microsoft Outlook mailbox for the Informatica Service

startup account user, ask your network administrator to create one.



To configure a Microsoft Outlook user:


1. Open the Control Panel on the machine running the PowerCenter Server.
2. Double-click the Mail (or Mail and Fax) icon.
3. On the Services tab of the user Properties dialog box, click Show Profiles.
   The Mail dialog box displays the list of profiles configured for the computer.
4. If you have a Microsoft Outlook profile set up for the Informatica Service startup account, skip to Step 3. Configure Logon Network Security on page 325. If you do not already have a Microsoft Outlook profile set up for the Informatica Service startup account, continue to the next step.
5. Click Add in the mail properties window.
   The Microsoft Outlook Setup Wizard appears.
6. Select Use The Following Information Services and then select Microsoft Exchange Server. Click Next.
7. Enter a profile name. You can enter any name, but Informatica recommends that you enter a text string that matches the Informatica Service startup account. Click Next.
8. Enter the name of the Microsoft Exchange Server. Enter your mailbox name. Click Next.
9. Indicate whether you travel with your computer. Click Next.
10. Enter the path to your personal address book. Click Next.
11. Indicate whether you want to run Outlook when you start Windows. Click Next.
12. The Setup Wizard indicates that you have successfully configured an Outlook profile.
13. Click Finish.

Step 3. Configure Logon Network Security


You must configure the Logon Network Security before you run the Microsoft Exchange
Server Service.
To configure Logon Network Security for the Microsoft Exchange Server:
1. Open the Control Panel on the machine running the PowerCenter Server.
2. Double-click the Mail (or Mail and Fax) icon. The User Properties sheet appears.
3. On the Services tab, select Microsoft Exchange Server and click Properties.
4. Click the Advanced tab. Set the Logon network security option to NT Password Authentication.
5. Click OK.

Step 4. Create Distribution Lists


When the PowerCenter Server runs on Windows, you can enter only one email address in the
Workflow Manager. If you want to send email to multiple recipients, create a distribution list
containing these addresses in the Personal Address Book in Microsoft Outlook. Enter the
distribution list name as the recipient when configuring email.
For more information about working with your Personal Address Book, refer to Microsoft
Outlook documentation.


Step 5. Configure the PowerCenter Server Setup


After you create the Microsoft Outlook profile, configure the PowerCenter Server to send
email as that Microsoft Outlook user.
To configure the PowerCenter Server as a Microsoft Outlook user:
1. From the PowerCenter Server Setup, click the Configuration tab.
2. In the MS Exchange Profile field, enter the name of the Microsoft Outlook profile you created for the Informatica Service startup account.

Working with Email Tasks


The Workflow Manager provides an Email task that allows you to send email during a
workflow. You can create reusable Email tasks in the Task Developer for any type of email. Or,
you can create non-reusable Email tasks in the Workflow and Worklet Designer.
You can use Email tasks in any of the following locations:

- Session properties. You can configure the session to send email when the session completes or fails. For more information, see Working with Post-Session Email on page 332.
- Workflow properties. You can configure the workflow to send email when the workflow suspends. For more information, see Working with Suspension Email on page 339.
- Workflow or worklet. You can include an Email task anywhere in the workflow or worklet to send email based on a condition you define. For more information, see Using Email Tasks in a Workflow or Worklet on page 341.

Figure 12-1 shows the Edit Tasks dialog box for an Email task in the Task Developer:
Figure 12-1. Email Task

Email Address Tips and Guidelines


Consider the following tips and guidelines when you enter the email address in an Email task:
- Enter the email address using 7-bit ASCII characters only.
- You can enter either the $PMSuccessEmailUser or $PMFailureEmailUser server variable for post-session email. For more information, see Using Server Variables on page 333.
- If the PowerCenter Server runs on Windows, you can enter a Microsoft Exchange Profile name. The mail recipient must have an entry in the Global Address book of the Microsoft Outlook profile.
- If the PowerCenter Server runs on Windows, you can send email to multiple recipients by creating a distribution list in your Personal Address book. All recipients must also be in the Global Address book. You cannot enter multiple addresses separated by commas or semicolons.
- If the PowerCenter Server runs on UNIX, you can enter multiple email addresses separated by a comma. Do not include spaces between email addresses.

Steps to Create an Email Task


You can create Email tasks in the Task Developer, Worklet Designer, and Workflow Designer.
Use the following steps to create an Email task.
To create an Email task in the Task Developer:
1. In the Task Developer, choose Tasks-Create. The Create Task dialog box appears.
2. Select an Email task and enter a name for the task. Click Create.
   The Workflow Manager creates an Email task in the workspace.
3. Click Done.
4. Double-click the Email task in the workspace. The Edit Tasks dialog box appears.
5. Click Rename to enter a name for the task.
6. You can optionally enter a description for the task in the Description field.
7. Click the Properties tab.
8. Enter the fully qualified email address of the mail recipient in the Email User Name field. For more information on entering the email address, see Email Address Tips and Guidelines on page 328.
9. Enter the subject of the email in the Email Subject field. Or, you can leave this field blank.
10. Click the Open button in the Email Text field to open the Email Editor.
11. Enter the text of the email message in the Email Editor.
    When you use the Email task, you can incorporate format tags in your message. For more information, see Email Variables and Format Tags on page 333.
    You can leave the Email Text field blank.
12. Click OK twice to save your changes.


Working with Post-Session Email


You can configure a session so the PowerCenter Server sends email to someone when it fails or
completes a session. You can create two Email tasks, one the PowerCenter Server sends if it
completes the session, and the other if it fails the session.
The PowerCenter Server sends post-session email at the end of a session, after executing post-session shell commands or stored procedures. When the PowerCenter Server encounters an
error sending the email, it writes a message to the server or event log. It does not fail the
session.
The Workflow Manager includes the following session properties to send post-session email:

On-Success Email

On-Failure Email

Figure 12-2 shows the On-Success and On-Failure email properties on the Components tab of
the session properties:
Figure 12-2. Post-Session Email Properties

(The callouts in the figure show where you choose to use a reusable Email task, select the reusable Email task, use a non-reusable Email task, or edit the non-reusable Email task.)

You can specify a reusable Email task you create in the Task Developer for either success email
or failure email. Or, you can create a non-reusable Email task for each session property. When
you create a non-reusable Email task for the session property, you create the Email task for
that session only. You cannot use the Email task in the workflow or worklet.


You cannot specify a non-reusable Email task you create in the Workflow or Worklet Designer
for post-session email.
Tip: When you configure an Email task for post-session email, use the email server variables, $PMSuccessEmailUser or $PMFailureEmailUser, for the email recipient. Verify you specify the values of the server variables for the PowerCenter Server that runs the session.

Using Server Variables


You can use server variables to address post-session email. When you register the PowerCenter
Server, you can configure its server variables. You can use the following server variables for
sending post-session email:

- $PMSuccessEmailUser. Email address of the user to receive email when the session completes successfully. Use this variable for the Email User Name for success email only. The PowerCenter Server does not expand this variable when you use it for any other email type.
- $PMFailureEmailUser. Email address of the user to receive email when the session fails to complete. Use this variable for the Email User Name for failure email only. The PowerCenter Server does not expand this variable when you use it for any other email type.

When you use one of these server variables, the PowerCenter Server sends email to the address
configured for the server variable.
You might use this functionality when you have an administrator who troubleshoots all failed
sessions. Instead of entering the administrator email address for each session, you can use the
email variable $PMFailureEmailUser. If the administrator changes, you can correct all sessions
by editing the $PMFailureEmailUser server variable, instead of editing the email address in
each session.
You might also use this functionality when you have different administrators for different
PowerCenter Servers. If you deploy a folder from one repository to another or otherwise
change the PowerCenter Server that runs the session, the new server automatically sends email
to users associated with the new server when you use server variables instead of hard-coded
email addresses.
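For example, in the Email task used for failure email, you might enter the server variable in the Email User Name field instead of a hard-coded address. The address shown for comparison is a placeholder:

   Email User Name: $PMFailureEmailUser        (instead of, for example, admin@example.com)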
Note: $PMSuccessEmailUser and $PMFailureEmailUser are optional server variables. Verify you define a variable before using it to address email.

Email Variables and Format Tags


You can use email variables and format tags in an email message for post-session emails. You
can use some email variables in the subject of the email. With email variables, you can include
important session information in the email, such as the number of rows loaded, the session
completion time, or read and write statistics. You can also attach the session log or other
relevant files to the email. Use format tags in the body of the message to make the message
easier to read.


Note: The PowerCenter Server does not limit the type or size of attached files. However, since

large attachments can cause problems with your email system, avoid attaching excessively
large files, such as session logs generated using verbose tracing. The PowerCenter Server
generates an error message in the email if an error occurs attaching the file.
Table 12-1 describes the email variables you can use in a post-session email:

Table 12-1. Email Variables for Post-Session Email

Email Variable   Description
%s               Session name.
%e               Session status.
%b               Session start time.
%c               Session completion time.
%i               Session elapsed time (session completion time-session start time).
%l               Total rows loaded.
%r               Total rows rejected.
%t               Source and target table details, including read throughput in bytes per second and
                 write throughput in rows per second. The PowerCenter Server includes all information
                 displayed in the session detail dialog box.
%m               Name of the mapping used in the session.
%n               Name of the folder containing the session.
%d               Name of the repository containing the session.
%g               Attach the session log to the message.
%a<filename>     Attach the named file. The file must be local to the PowerCenter Server. The following
                 are valid file names: %a<c:\data\sales.txt> or %a</users/john/data/sales.txt>.
                 Note: The file name cannot include the greater than character (>) or a line break.

Note: The PowerCenter Server ignores %a, %g, or %t when you include them in the email subject. Include these variables in the email message only.

Table 12-2 lists the format tags you can use in an Email task:
Table 12-2. Format Tags for Email Tasks
Formatting

Format Tag

tab

\t

new line

\n
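For example, a post-session email body might combine these format tags with the email variables described in Table 12-1; the layout below is only an illustration:

   Session %s finished with status: %e\n\tRows loaded:\t%l\n\tRows rejected:\t%r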

Configuring Post-Session Email


You can configure post-session email to use a reusable or non-reusable Email task.


Using a Reusable Email Task


Use the following steps to configure post-session email to use a reusable Email task.
To configure post-session email to use a reusable Email task:
1. Open the session properties and click the Components tab.
2. Select Reusable in the Type column for the success email or failure email field.
3. Click the Open button in the Value column to select the reusable Email task.
4. Select the Email task in the Object Browser dialog box and click OK.
5. You can optionally edit the Email task for this session property by clicking the Edit button in the Value column.
   If you edit the Email task for either success email or failure email, the edits only apply to this session.
6. Click OK to close the session properties.

Using a Non-Reusable Email Task


Follow these steps to configure success email or failure email to use a non-reusable Email task.
To configure success email or failure email to use a non-reusable Email task:

1. Open the session properties and click the Components tab.
2. Select Non-Reusable in the Type column for the success email or failure email field.
3. Open the email editor using the Open button.
4. Edit the Email task and click OK. For more information on editing Email tasks, see Working with Email Tasks on page 328.
5. Click OK to close the session properties.

Sample Email
The following is user-entered text from a sample post-session email configuration using
variables:
Session complete.
Session name: %s
%l
%r
%e
%b
%c
%i
%g

The following is sample output from the configuration above:


Session complete.
Session name: sInstrTest
Total Rows Loaded = 1
Total Rows Rejected = 0
Completed
Start Time: Tue Nov 17 12:26:31 2003
Completion Time: Tue Nov 17 12:26:41 2003
Elapsed time: 0:00:10 (h:m:s)


Working with Suspension Email


You can configure a workflow to send email when the PowerCenter Server suspends the
workflow. For example, when a task fails, the PowerCenter Server suspends the workflow and
sends the suspension email. You can fix the error and resume the workflow.
If another task fails while the PowerCenter Server is suspending the workflow, you do not get
the suspension email again. However, the PowerCenter Server sends another suspension email
if another task fails after you resume the workflow.
For more information, see Suspending the Workflow on page 127.
Configure suspension email on the General tab of the workflow properties.
Figure 12-3 shows the Suspension Email workflow options:
Figure 12-3. Suspension Email (the General tab includes options to select Suspend On Error, select a reusable Email task, and remove the reusable Email task)

To configure suspension email:


1. In the Workflow Designer, open the workflow.
2. Choose Workflows-Edit to open the workflow properties.
3. On the General tab, select Suspend on Error.
4. Click the Browse Emails button to select a reusable Email task.
   Note: The Workflow Manager returns an error message if you do not have any reusable Email tasks in the folder. Create a reusable Email task in the folder before you configure suspension email.
5. Choose a reusable Email task and click OK.
6. Click OK to close the workflow properties.


Using Email Tasks in a Workflow or Worklet


You can use Email tasks anywhere in a workflow or worklet. For example, you can include an
Email task in a workflow after a Command task that executes a shell script. You can configure
the links in the workflow or worklet so the PowerCenter Server sends you email if the
Command task fails.
You might want the PowerCenter Server to generate a report during a workflow and email the
report to you after generating it.
Note: When you use an Email task outside of a Session task, the PowerCenter Server reads variables related to the session as text. For example, if you use the variable %s in an Email task in the workflow, the PowerCenter Server cannot provide a session name, as it is not within a session.
Figure 12-4 shows a workflow that performs this operation:
Figure 12-4. Email Task in a Workflow

Configure the gen_report Command task to execute a shell script that generates the report.
Verify the shell script saves the report to a directory local to the PowerCenter Server.
Configure the em_report Email task to attach the file generated from the shell script.


Tips
The following suggestions can extend the capabilities of Email tasks.
Create generic user for sending email.
Often there are multiple users who can start sessions on a PowerCenter Server. If you want to
avoid entering the Microsoft Outlook profile each time the PowerCenter user changes, create
a generic Microsoft Outlook profile, such as PowerCenter, then grant each PowerCenter
user rights to send mail through this profile.
Use server variables to address post-session emails.
When the server variables $PMSuccessEmailUser and $PMFailureEmailUser are configured
for the PowerCenter Server, use them to address post-session emails. This allows you to
change the recipient of post-session emails for all sessions the server runs by editing the server
variables. It can also make deploying sessions into production easier when the variables are
defined for both development and production servers.
Generate and send post-session reports.
You can use a post-session success command to generate a report file and attach that file to a
success email. For example, you create a batch file called Q3rpt.bat that generates a sales
report, and you are running Microsoft Outlook on Windows.
Figure 12-5 shows how you can configure the post-session success command to generate a
report:
Figure 12-5. Using Post-Session Commands to Generate Reports

342

Chapter 12: Sending Email

Figure 12-6 shows how you can configure success email to attach a report file:
Figure 12-6. Using Email Variables to Attach Reports

(In the figure, the email variable %a attaches the report file to the success email.)
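For example, the body of the success email might attach the report file that Q3rpt.bat generates; the file path below is hypothetical:

   Quarterly sales report attached.
   %a<c:\reports\q3_sales.txt>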

Use other mail programs.


If you do not have Microsoft Outlook, you can use a post-session success command to invoke
a command line email program, such as WindMail. In this case, you do not have to enter the
email user name or subject, since your recipients, email subject, and body text will be
contained in the batch file, sendmail.bat.
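For example, the post-session success command might simply invoke the batch file (the path shown is a placeholder):

   c:\scripts\sendmail.bat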
Figure 12-7 shows how you can configure the post-session success command to invoke a
command line email program:
Figure 12-7. Sending Email without Microsoft Outlook


Chapter 13

Pipeline Partitioning
This chapter covers the following subjects:

Overview, 346

Configuring Partitioning Information, 351

Cache Partitioning, 359

Round-Robin Partition Type, 360

Hash Keys Partition Types, 361

Key Range Partition Type, 363

Pass-Through Partition Type, 367

Database Partitioning Partition Type, 369

Partitioning Relational Sources, 371

Partitioning File Sources, 374

Partitioning Relational Targets, 378

Partitioning File Targets, 380

Partitioning Joiner Transformations, 384

Partitioning Lookup Transformations, 391

Partitioning Sorter Transformations, 392

Mapping Variables in Partitioned Pipelines, 394

Partitioning Rules, 395



Overview
You create a session for each mapping you want the PowerCenter Server to run. Every
mapping contains one or more source pipelines. A source pipeline consists of a source
qualifier and all the transformations and targets that receive data from that source qualifier.
If you purchase the Partitioning option, you can specify partitioning information for each
source pipeline in a mapping. The partitioning information for a pipeline controls the
following factors:

The number of reader, transformation, and writer threads that the master thread creates
for the pipeline. For more information, see Understanding Processing Threads on
page 14.

How the PowerCenter Server reads data from the source, including the number of
connections to the source.

How the PowerCenter Server distributes rows of data to each transformation as it processes
the pipeline.

How the PowerCenter Server writes data to the target, including the number of
connections to each target in the pipeline.

You can specify partitioning information for a pipeline by setting the following attributes:

Location of partition points. Partition points mark the thread boundaries in a pipeline
and divide the pipeline into stages. The PowerCenter Server sets partition points at several
transformations in a pipeline by default. If you have the Partitioning option, you can
define other partition points. When you add partition points, you increase the number of
transformation threads, which can improve session performance. The PowerCenter Server
can redistribute rows of data at partition points, which can also improve session
performance. For more information on partition points, see Partition Points on
page 346.

Number of partitions. A partition is a pipeline stage that executes in a single thread. If you
purchase the Partitioning option, you can set the number of partitions at any partition
point. When you add partitions, you increase the number of processing threads, which can
improve session performance. For more information, see Number of Partitions on
page 348.

Partition types. The PowerCenter Server specifies a default partition type at each partition
point. If you purchase the Partitioning option, you can change the partition type. The
partition type controls how the PowerCenter Server redistributes data among partitions at
partition points. For more information, see Partition Types on page 348.

Partition Points
By default, the PowerCenter Server sets partition points at various transformations in the
pipeline. Partition points mark thread boundaries as well as divide the pipeline into stages. A
stage is a section of a pipeline between any two partition points. When you set a partition
point at a transformation, the new pipeline stage includes that transformation.

Table 13-1 lists the partition points that the Workflow Manager creates by default:

Table 13-1. Default Partition Points

Transformation (Partition Point)                Default Partition Type   Description
Source Qualifier or Normalizer transformation   Pass-through             Controls how the PowerCenter Server reads data from the
                                                                         source and passes data into the source qualifier.
Rank and unsorted Aggregator transformations    Hash auto-keys           Ensures that the PowerCenter Server groups rows properly
                                                                         before it sends them to the transformation.
Target instances                                Pass-through             Controls how the target instances pass data to the targets.

If you purchase the Partitioning option, you can add partition points at other transformations
and delete some partition points.
Figure 13-1 shows the default partition points and pipeline stages for a simple mapping with
one source pipeline:
Figure 13-1. Default Partition Points and Stages in a Sample Mapping

(The figure labels the default partition points and the first, second, third, and fourth pipeline stages.)

The mapping in Figure 13-1 contains four stages. The partition point at the source qualifier
marks the boundary between the first (reader) and second (transformation) stages. The
partition point at the Aggregator transformation marks the boundary between the second and
third (transformation) stages. The partition point at the target instance marks the boundary
between the third (transformation) and fourth (writer) stage.
When you add a partition point, you increase the number of pipeline stages by one. Similarly,
when you delete a partition point, you reduce the number of stages by one. For more
information, see Understanding Processing Threads on page 14.
Besides marking stage boundaries, partition points also mark the points in the pipeline where
the PowerCenter Server can redistribute data across partitions. For example, if you place a
partition point at a Filter transformation and define multiple partitions, the PowerCenter
Server can redistribute rows of data among the partitions before the Filter transformation
processes the data. The partition type you set at this partition point controls the way in which
the PowerCenter Server passes rows of data to each partition. For more information, see
Partition Types on page 348.
For more information on adding and deleting partition points, see Adding and Deleting
Partition Points on page 353.


Number of Partitions
A partition is a pipeline stage that executes in a single reader, transformation, or writer thread.
By default, the PowerCenter Server defines a single partition in the source pipeline. If you
purchase the Partitioning option, you can increase the number of partitions. This increases
the number of processing threads, which can improve session performance.
For example, you need to use the mapping in Figure 13-1 to extract data from three flat files
of various sizes. To do this, you define three partitions at the source qualifier to read the data
simultaneously. When you do this, the Workflow Manager defines three partitions in the
pipeline.
Figure 13-2 shows the threads that the master thread creates for this mapping:
Figure 13-2. Threads Created for a Sample Mapping with Three Partitions (the figure shows the default partition points and, across the three partitions, 3 reader threads in the first stage, 6 transformation threads in the second and third stages, and 3 writer threads in the fourth stage)
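As a quick check, the total number of threads for this pipeline equals the number of partitions multiplied by the number of pipeline stages:

   3 partitions x 4 stages = 12 threads (3 reader + 6 transformation + 3 writer)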

By default, the PowerCenter Server sets the number of partitions to one. You can generally
define up to 64 partitions at any partition point. However, there are situations in which you
can define only one partition in the pipeline. For more information, see Restrictions on the
Number of Partitions on page 395.
Note: Increasing the number of partitions or partition points increases the number of threads.

Therefore, increasing the number of partitions or partition points also increases the load on
the server machine. If the server machine contains ample CPU bandwidth, processing rows of
data in a session concurrently can increase session performance. However, if you create a large
number of partitions or partition points in a session that processes large amounts of data, you
can overload the system.
For more information on adding and deleting partitions, see Adding and Deleting Partitions
on page 356.

Partition Types
When you configure the partitioning information for a pipeline, you must specify a partition
type at each partition point in the pipeline. The partition type determines how the
PowerCenter Server redistributes data across partition points.


The Workflow Manager allows you to specify the following partition types:

- Round-robin. The PowerCenter Server distributes data evenly among all partitions. Use round-robin partitioning where you want each partition to process approximately the same number of rows. For more information, see Round-Robin Partition Type on page 360.
- Hash. The PowerCenter Server applies a hash function to a partition key to group data among partitions. If you select hash auto-keys, the PowerCenter Server uses all grouped or sorted ports as the partition key. If you select hash user keys, you specify a number of ports to form the partition key. Use hash partitioning where you want to ensure that the PowerCenter Server processes groups of rows with the same partition key in the same partition. For more information, see Hash Keys Partition Types on page 361.
- Key range. You specify one or more ports to form a compound partition key. The PowerCenter Server passes data to each partition depending on the ranges you specify for each port. Use key range partitioning where the sources or targets in the pipeline are partitioned by key range. For more information, see Key Range Partition Type on page 363.
- Pass-through. The PowerCenter Server passes all rows at one partition point to the next partition point without redistributing them. Choose pass-through partitioning where you want to create an additional pipeline stage to improve performance, but do not want to change the distribution of data across partitions. For more information, see Pass-Through Partition Type on page 367.
- Database partitioning. The PowerCenter Server queries the IBM DB2 system for table partition information and loads partitioned data to the corresponding nodes in the target database. Use database partitioning with IBM DB2 targets stored on a multi-node tablespace. For more information, see Database Partitioning Partition Type on page 369.

You can specify different partition types at different points in the pipeline.
Figure 13-3 shows a mapping where you can specify different partition types to increase
session performance:
Figure 13-3. Sample Mapping

The mapping in Figure 13-3 reads data about items and calculates average wholesale costs and
prices. The mapping must read item information from three flat files of various sizes, and
then filter out discontinued items. It sorts the active items by description, calculates the
average prices and wholesale costs, and writes the results to a relational database in which the
target tables are partitioned by key range.
When you use this mapping in a session, you can increase session performance by specifying
different partition types at the following partition points in the pipeline:

- Source qualifier. To read data from the three flat files concurrently, you must specify three partitions at the source qualifier. Accept the default partition type, pass-through.
- Filter transformation. Since the source files vary in size, each partition processes a different amount of data. Set a partition point at the Filter transformation, and choose round-robin partitioning to balance the load going into the Filter transformation.
- Sorter transformation. To eliminate overlapping groups in the Sorter and Aggregator transformations, use hash auto-keys partitioning at the Sorter transformation. This causes the PowerCenter Server to group all items with the same description into the same partition before the Sorter and Aggregator transformations process the rows. You can delete the default partition point at the Aggregator transformation.
- Target. Since the target tables are partitioned by key range, specify key range partitioning at the target to optimize writing data to the target.

For more information on specifying partition types, see Specifying Partition Types on
page 356.


Configuring Partitioning Information


When you create or edit a session, you can change the partitioning information for each
pipeline in a mapping. If the mapping contains multiple pipelines, you can specify multiple
partitions in some pipelines and single partitions in others. You update partitioning
information using the Partitions view on the Mapping tab in the session properties.
You can configure the following information in the Partitions view on the Mapping tab:

Add and delete partition points.

Enter a description for each partition.

Specify the partition type at each partition point.

Add a partition key and key ranges for certain partition types.

Figure 13-4 shows the configuration options on the Partitions view on the Mapping tab:
Figure 13-4. Session Properties Partitions View on the Mapping Tab (the figure labels the buttons for adding, deleting, and editing the selected partition point, the partitioning workspace, the Edit Keys button for specifying key ranges, and the control that displays the Partitions view)


Table 13-2 describes the configuration options for the Partitions view on the Mapping tab:

Table 13-2. Options on Session Properties Partitions View on the Mapping Tab

Partitions View Option   Description
Add Partition Point      Click to add a new partition point in the mapping. When you add a partition point, the
                         transformation name appears under the Partition Points node.
Delete Partition Point   Click to delete the selected partition point. You cannot delete certain partition points.
                         For details, see Adding and Deleting Partition Points on page 353.
Edit Partition Point     Click to edit the selected partition point. This opens the Edit Partition Point dialog box.
                         For more information on the options in this dialog box, see Table 13-3 on page 353.
Key Range                Displays the key and key ranges for the partition point, depending on the partition type.
                         For key range partitioning, you specify the key ranges. For hash user keys partitioning,
                         this field displays the partition key. The Workflow Manager does not display this area
                         for other partition types.
Edit Keys                Click to add or remove the partition key for key range or hash user keys partitioning.
                         You cannot create a partition key for hash auto-keys, round-robin, or pass-through
                         partitioning.

You can configure the following information when you edit or add a partition point:

Specify the partition type at the partition point.

Add and delete partitions.

Enter a description for each partition.

Figure 13-5 shows the configuration options in the Edit Partition Point dialog box:
Figure 13-5. Edit Partition Point Dialog Box (the dialog box shows the selected partition point and includes controls to add a partition, delete a partition, select a partition, enter the partition description, and specify the partition type)


Table 13-3 describes the configuration options in the Edit Partition Point dialog box:

Table 13-3. Edit Partition Point Dialog Box Options

Partition Options       Description
Select Partition Type   Changes the partition type.
Partition Names         Selects individual partitions from this dialog box to configure.
Add a Partition         Adds a partition. You can add up to 64 partitions at any partition point. The number of
                        partitions must be consistent across the pipeline. Therefore, if you define three partitions
                        at one partition point, the Workflow Manager defines three partitions at all partition
                        points in the pipeline.
Delete a Partition      Deletes the selected partition. Each partition point must contain at least one partition.
Description             Enter an optional description for the current partition.

Adding and Deleting Partition Points


When you create a session, the Workflow Manager creates one partition point at the following
transformations in the pipeline:

- Source Qualifier or Normalizer. This partition point controls how the PowerCenter Server extracts data from the source and passes it to the source qualifier. You cannot delete this partition point.
- Rank and unsorted Aggregator transformations. These partition points ensure that the PowerCenter Server groups rows properly before it sends them to the transformation. You can delete these partition points if the pipeline contains only one partition or if the PowerCenter Server passes all rows in a group to a single partition before they enter the transformation.
  For example, in the mapping in Figure 13-3 on page 349, you can delete the default partition point at the Aggregator transformation because hash auto-keys partitioning at the Sorter transformation sends all rows that contain items with the same description to the same partition. Therefore, the Aggregator transformation receives data for all items with the same description in one partition and can calculate the average costs and prices for this item correctly.
- Target instances. This partition point controls how the writer passes data to the targets. You cannot delete this partition point.

Rules for Adding and Deleting Partition Points


You can add and delete partition points at other transformations in the pipeline according to
the following rules:

- You cannot create partition points at source instances.
- You cannot create partition points at Sequence Generator transformations or unconnected transformations.
- You can add partition points at any other transformation provided that no partition point receives input from more than one pipeline stage.

Figure 13-6 shows the valid partition points in a mapping:


Figure 13-6. Sample Mapping Showing Valid Partition Points (asterisks in the figure mark the valid partition points)

In this mapping, the Workflow Manager creates partition points at the source qualifier and
target instance by default. You can place an additional partition point at Expression
transformation EXP_3.
If you place a partition point at EXP_3 and define one partition, the master thread creates the
following threads:

(The figure shows the partition points, the reader thread for the first stage, the transformation threads for the second and third stages, and the writer thread for the fourth stage.)

In this case, each partition point receives data from only one pipeline stage, so EXP_3 is a
valid partition point.
The following transformations are not valid partition points:

Transformation    Reason
Source            This is a source instance.
SG_1              This is a Sequence Generator transformation.
EXP_1 and EXP_2   If you could place a partition point at EXP_1 or EXP_2, you would create an additional
                  pipeline stage that processes data from the source qualifier to EXP_1 or EXP_2. In this
                  case, EXP_3 would receive data from two pipeline stages, which is not allowed.

For more information about processing threads, see Understanding Processing Threads on
page 14.

Steps for Adding Partition Points


You add partition points from the Mapping tab of the session properties.
To add a partition point:
1.

On the Partitions view of the Mapping tab, select a transformation that is not already a
partition point, and click the Add a Partition Point button.
Tip: You can select a transformation from the Non-Partition Points node.

2.

Select the partition type for the partition point or accept the default value. For
information on specifying a valid partition type, see Specifying Partition Types on
page 356.

3.

Click OK.
The transformation appears in the Partition Points node in the Partitions view on the
Mapping tab of the session properties.


Adding and Deleting Partitions


In general, you can define up to 64 partitions at any partition point in a source pipeline. In
certain circumstances, the number of partitions in the pipeline must be set to one. For more
information, see Restrictions on the Number of Partitions on page 395.
The number of partitions you specify equals the number of connections to the source or
target. If the pipeline contains a relational source or target, the number of partitions at the
source qualifier or target instance equals the number of connections to the database. If the
pipeline contains file sources, you can configure the session to read the source with one thread
or with multiple threads. For more information on connecting to relational sources and
targets, see Partitioning Relational Sources on page 371 and Partitioning Relational
Targets on page 378. For more information on connecting to file sources and targets, see
Partitioning File Sources on page 374 and Partitioning File Targets on page 380.
The number of partitions you specify remains consistent throughout the pipeline. So if you
specify three partitions at any partition point, the PowerCenter Server creates three partitions
at all other partition points in the pipeline.

Entering Partition Descriptions


You can enter a description for each partition you create. To enter a description, select the
partition in the Edit Partition Point dialog box, and then enter the description in the
Description field.

Specifying Partition Types


The Workflow Manager sets a default partition type for each partition point in the pipeline.
At the source qualifier and target instance, the Workflow Manager specifies pass-through
partitioning. For Rank and unsorted Aggregator transformations, for example, the Workflow
Manager specifies hash auto-keys partitioning when the transformation scope is All Input.
When you create a new partition point, the Workflow Manager sets the partition type to the
default partition type for that transformation. You can change the default type.
You must specify pass-through partitioning for all transformations that are downstream from
a transaction generator or an active source that generates commits, and upstream from a target
or a transformation with Transaction transformation scope. Also, if you configure the session
to use constraint-based loading, you must specify pass-through partitioning for all
transformations that are downstream from the last active source.


Table 13-4 lists the default partition type for the different partition points in the pipeline:

Table 13-4. Valid Partition Types for Partition Points

Transformation (Partition Point)           Default Partition Type
Source definition                          Not a valid partition point
Source Qualifier (relational sources)      Pass-through
Source Qualifier (flat file sources)       Pass-through
XML Source Qualifier                       Pass-through
Normalizer (COBOL sources)                 Pass-through
Normalizer (relational)                    Pass-through
Aggregator (sorted)                        Pass-through
Aggregator (unsorted)                      Based on transformation scope*
Custom                                     Pass-through
Expression                                 Pass-through
External Procedure                         Pass-through
Filter                                     Pass-through
Joiner                                     Based on transformation scope*
Lookup                                     Pass-through
Rank                                       Based on transformation scope*
Router                                     Pass-through
Sequence Generator                         Not a valid partition point
Sorter                                     Based on transformation scope*
Stored Procedure                           Pass-through
Transaction Control                        Pass-through
Union                                      Pass-through
Update Strategy                            Pass-through
Unconnected transformation                 Not a valid partition point
Relational target definition               Pass-through. The default for DB2 targets is database partitioning.
Flat file target definition                Pass-through
XML target definition                      Not a valid partition point

* The default partition type is pass-through when the transformation scope is Transaction, and hash auto-keys when the transformation scope is All Input.

Adding Keys and Key Ranges


If you select key range or hash user keys partitioning at any partition point, you need to
specify a partition key. The PowerCenter Server uses the key to pass rows to the appropriate
partition.
For example, if you specify key range partitioning at a Source Qualifier transformation, the
PowerCenter Server uses the key and ranges to create the WHERE clause when it selects data
from the source. Therefore, you can have the PowerCenter Server pass all rows that contain
customer IDs less than 135000 to one partition and all rows that contain customer IDs
greater than or equal to 135000 to another partition. For more information, see Key Range
Partition Type on page 363.
If you specify hash user keys partitioning at a transformation, the PowerCenter Server uses the
key to group data based on the ports you select as the key. For example, if you specify
ITEM_DESC as the hash key, the PowerCenter Server distributes data so that all rows that
contain items with the same description go to the same partition. For more information, see
Hash Keys Partition Types on page 361.


Cache Partitioning
When you create a session with multiple partitions, the PowerCenter Server can partition
caches for the Aggregator, Joiner, Lookup, and Rank transformations. It creates a separate
cache for each partition, and each partition works with only the rows needed by that
partition. As a result, the PowerCenter Server requires only a portion of total cache memory
for each partition. When you run a session, the PowerCenter Server accesses the cache in
parallel for each partition.
After you configure the session for partitioning, you can configure memory requirements and
cache directories for each transformation in the Transformations view on the Mapping tab of
the session properties. To configure the memory requirements, calculate the total
requirements for a transformation, and divide by the number of partitions. To further
improve performance, you can configure separate directories for each partition.
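For example, the arithmetic is straightforward. The sketch below is only an illustration; the 24 MB total and the three-partition session are hypothetical numbers, not values recommended by this guide:

# Hypothetical sizing example: split a transformation's total cache
# requirement evenly across the partitions in the session.
total_cache_bytes = 24 * 1024 * 1024    # assume the transformation needs 24 MB in total
number_of_partitions = 3

cache_per_partition = total_cache_bytes // number_of_partitions
print(cache_per_partition)              # 8388608 bytes, or 8 MB for each partition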
The guidelines for cache partitioning are different for each cached transformation:

Aggregator transformation. The PowerCenter Server uses cache partitioning for any
multi-partitioned session with an Aggregator transformation. You do not have to set a
partition point at the Aggregator transformation.

Joiner transformation. The PowerCenter Server uses cache partitioning when you create a
partition point at the Joiner transformation. For more information about partitioning with
Joiner transformations, see Partitioning Joiner Transformations on page 384.

Lookup transformation. The PowerCenter Server uses cache partitioning when you create
a hash auto-keys partition point at the Lookup transformation. For more information
about partitioning with Lookup transformations, see Partitioning Lookup
Transformations on page 391.

Rank transformation. The PowerCenter Server uses cache partitioning for any multi-partitioned session with a Rank transformation. You do not have to set a partition point at the Rank transformation.

For more caching information, see Session Caches on page 613.


Round-Robin Partition Type


In round-robin partitioning, the PowerCenter Server distributes rows of data evenly to all
partitions. Each partition processes approximately the same number of rows.
Table 13-4 on page 357 lists the partition points where you can specify round-robin
partitioning.
Use round-robin partitioning when you need to distribute rows evenly and do not need to
group data among partitions. In a pipeline that reads data from file sources of different sizes,
you can use round-robin partitioning to ensure that each partition receives approximately the
same number of rows.
Figure 13-7 shows a mapping where round-robin partitioning helps distribute rows before
they enter a Filter transformation:
Figure 13-7. Mapping where Round-robin Partitioning Can Increase Performance (round-robin partitioning distributes data evenly at the Filter transformation)

The session based on this mapping reads item information from three flat files of different
sizes:

Source file 1: 80,000 rows

Source file 2: 5,000 rows

Source file 3: 15,000 rows

When the PowerCenter Server reads the source data, the first partition begins processing 80%
of the data, the second partition processes 5% of the data, and the third partition processes
15% of the data.
To distribute the workload more evenly, set a partition point at the Filter transformation and
set the partition type to round-robin. The PowerCenter Server distributes the data so that
each partition processes approximately one third of the data.
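The following Python sketch illustrates the round-robin idea only; it is not PowerCenter code. It shows why each partition ends up with roughly the same number of rows even though the source files differ in size:

# Conceptual illustration of round-robin distribution among partitions.
from itertools import cycle

def round_robin(rows, num_partitions):
    # Deal rows to partitions one at a time, like dealing cards.
    partitions = [[] for _ in range(num_partitions)]
    dealer = cycle(range(num_partitions))
    for row in rows:
        partitions[next(dealer)].append(row)
    return partitions

# 80,000 + 5,000 + 15,000 rows read from the three source files
rows = range(80000 + 5000 + 15000)
sizes = [len(p) for p in round_robin(rows, 3)]
print(sizes)    # approximately 33,334 / 33,333 / 33,333 rows per partition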


Hash Keys Partition Types


In hash partitioning, the PowerCenter Server uses a hash function to group rows of data
among partitions. The PowerCenter Server groups the data based on a partition key.
Use hash partitioning when you want the PowerCenter Server to distribute rows to the
partitions by group. For example, you need to sort items by item ID, but you do not know
how many items have a particular ID number.
There are two types of hash partitioning:

Hash auto-keys. The PowerCenter Server uses all grouped or sorted ports as a compound
partition key. You may need to use hash auto-keys partitioning at Rank, Sorter, and
unsorted Aggregator transformations.

Hash user keys. You specify a number of ports to generate the partition key.

Table 13-4 on page 357 lists the partition points where you can specify hash partitioning.

Hash Auto-Keys
You can use hash auto-keys partitioning at or before Rank, Sorter, Joiner, and unsorted
Aggregator transformations to ensure that rows are grouped properly before they enter these
transformations.
Figure 13-8 shows a mapping where hash auto-keys partitioning causes the PowerCenter
Server to distribute rows to each partition according to group before they enter the Sorter and
Aggregator transformations:
Figure 13-8. Mapping where Hash Partitioning Can Increase Performance

Hash auto-keys partitioning groups data at the Sorter.

In this mapping, the Sorter transformation sorts items by item description. If items with the same description exist in more than one source file, rows with the same description can end up in different partitions. Without hash auto-keys partitioning, the Aggregator transformation might calculate average costs and prices for each item incorrectly.
To prevent errors in the cost and prices calculations, set a partition point at the Sorter
transformation and set the partition type to hash auto-keys. When you do this, the
PowerCenter Server redistributes the data so that all items with the same description reach the
Sorter and Aggregator transformations in a single partition.
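The following Python sketch shows the general idea behind hash partitioning: each row is routed by a hash of its group key, so every row with the same ITEM_DESC lands in the same partition. The hash function and row layout are illustrative assumptions, not the PowerCenter Server's internal implementation:

# Conceptual illustration of hash partitioning on a group key.
from zlib import crc32

def hash_partition(rows, key, num_partitions):
    # Route each row to a partition chosen from a hash of its key value.
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        index = crc32(str(row[key]).encode()) % num_partitions
        partitions[index].append(row)
    return partitions

rows = [
    {"ITEM_DESC": "flashlight", "PRICE": 10.0},
    {"ITEM_DESC": "tent", "PRICE": 120.0},
    {"ITEM_DESC": "flashlight", "PRICE": 12.0},   # same key, so same partition as the first row
]
partitions = hash_partition(rows, "ITEM_DESC", 3)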


Hash User Keys


In hash user keys partitioning, the PowerCenter Server uses a hash function to group rows of
data among partitions based on a user-defined partition key. You choose the ports that define
the partition key.
In the mapping in Figure 13-8 on page 361, if you specify hash auto-keys partitioning, the
Sorter transformation receives rows of data grouped by the sort key, such as ITEM_DESC. If
the item descriptions are long, and you know that each item has a unique ID number, you can
specify hash user keys partitioning at the Sorter transformation and select ITEM_ID as the
hash key. This may improve the performance of the session since the hash function usually
processes numerical data more quickly than string data.

Adding a Hash Key


If you select hash user keys partitioning at any partition point, you must specify a hash key.
The PowerCenter Server uses the hash key to distribute rows to the appropriate partition
according to group.
To specify the hash key, select the partition point on the Partitions view of the Mapping tab,
and click Edit Keys. This displays the Edit Partition Key dialog box. The Available Ports list
displays the connected input and input/output ports in the transformation. To specify the
hash key, select one or more ports from this list, and then click Add.
Figure 13-9 shows one port selected as the hash key for a Filter transformation:
Figure 13-9. Edit Partition Key Dialog Box


To rearrange the order of the ports that make up the key, select a port in the Selected Ports list
and click the up or down arrow.


Key Range Partition Type


With key range partitioning, the PowerCenter Server distributes rows of data based on a port
or set of ports that you specify as the partition key. For each port, you define a range of values.
The PowerCenter Server uses the key and ranges to send rows to the appropriate partition.
Table 13-4 on page 357 lists the partition points where you can specify key range
partitioning.
Use key range partitioning in mappings where the source and target tables are partitioned by
key range.
Figure 13-10 shows a mapping where key range partitioning can optimize writing to the
target table:
Figure 13-10. Mapping where Key Range Partitioning Can Increase Performance (key range partitioning at the target optimizes writing to the target tables)

The target table in the database is partitioned by ITEM_ID as follows:

Partition 1: 0001-2999
Partition 2: 3000-5999
Partition 3: 6000-9999

To optimize writing to the target table, perform the following tasks:


1. Set the partition type at the target instance to key range.
2. Create three partitions.
3. Choose ITEM_ID as the partition key.
   The PowerCenter Server uses this key to pass data to the appropriate partition.
4. Set the key ranges as follows:


ITEM_ID         Start Range     End Range
Partition #1                    3000
Partition #2    3000            6000
Partition #3    6000

When you do this, the PowerCenter Server sends all items with IDs less than 3000 to the first
partition. It sends all items with IDs between 3000 and 5999 to the second partition. Items
with IDs greater than or equal to 6000 go to the third partition. For more information on key
ranges, see Adding Key Ranges on page 365.
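The routing logic for this example can be sketched as follows. This is only an illustration of how the start and end ranges on ITEM_ID map a row to a partition; it is not the PowerCenter Server's implementation:

# Conceptual illustration of key range routing on ITEM_ID.
# A blank start range means no lower bound; a blank end range means no upper bound.
KEY_RANGES = [
    (None, 3000),    # Partition #1: ITEM_ID < 3000
    (3000, 6000),    # Partition #2: 3000 <= ITEM_ID < 6000
    (6000, None),    # Partition #3: ITEM_ID >= 6000
]

def key_range_partition(item_id):
    for index, (start, end) in enumerate(KEY_RANGES):
        if (start is None or item_id >= start) and (end is None or item_id < end):
            return index
    return None    # the value falls outside every range

print(key_range_partition(2999))    # 0: first partition
print(key_range_partition(3000))    # 1: second partition
print(key_range_partition(6000))    # 2: third partition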

Adding a Partition Key


To specify the partition key for key range partitioning, select the partition point on the
Partitions view of the Mapping tab, and click Edit Keys. This displays the Edit Partition Key
dialog box. The Available Ports list displays the connected input and input/output ports in
the transformation. To specify the partition key, select one or more ports from this list, and
then click Add.
Figure 13-11 shows one port selected as the partition key for the target table
T_ITEM_PRICES:
Figure 13-11. Edit Partition Key Dialog Box


To rearrange the order of the ports that make up the partition key, select a port in the Selected
Ports list and click the up or down arrow.
In key range partitioning, the order of the ports does not affect how the PowerCenter Server
redistributes rows among partitions, but it can affect session performance. For example, you
might configure the following compound partition key:
Selected Ports
ITEMS.DESCRIPTION
ITEMS.DISCONTINUED_FLAG

Since boolean comparisons are usually faster than string comparisons, the session may run
faster if you arrange the ports in the following order:
Selected Ports
ITEMS.DISCONTINUED_FLAG
ITEMS.DESCRIPTION


Adding Key Ranges


After you identify the ports that make up the partition key, you must enter the ranges for each
port on the Partitions view of the Mapping tab.
Figure 13-12 shows where you enter key ranges on the Partitions view of the Mapping tab:
Figure 13-12. Adding Key Ranges


You can leave the start or end range blank for a partition. When you leave the start range
blank, the PowerCenter Server uses the minimum data value as the start range. When you
leave the end range blank, the PowerCenter Server uses the maximum data value as the end
range.
For example, you can add the following ranges for a key based on CUSTOMER_ID in a
pipeline that contains two partitions:
CUSTOMER_ID     Start Range     End Range
Partition #1                    135000
Partition #2    135000

When the PowerCenter Server reads the Customers table, it sends all rows that contain
customer IDs less than 135000 to the first partition, and all rows that contain customer IDs
equal to or greater than 135000 to the second partition. The PowerCenter Server eliminates
rows that contain null values or values that fall outside the key ranges.

Key Range Partition Type

365

When you configure a pipeline to load data to a relational target, if a row contains null values in any column that makes up the partition key or if a row contains a value that falls outside all of the key ranges, the PowerCenter Server sends that row to the first partition.
When you configure a pipeline to read data from a relational source, the PowerCenter Server
reads rows that fall within the key ranges. It does not read rows with null values in any
partition key column.
If you want to read rows with null values in the partition key, use pass-through partitioning
and create a SQL override.
Consider the following guidelines when you create key ranges:

The partition key must contain at least one port.

You must specify a range for each port.

Use the standard PowerCenter date format to enter dates in key ranges.

The Workflow Manager does not validate overlapping string or numeric ranges.

The Workflow Manager does not validate gaps or missing ranges.

Adding Filter Conditions


If you specify key range partitioning for a relational source, you can specify optional filter
conditions or override the SQL query. For details, see Partitioning Relational Sources on
page 371.


Pass-Through Partition Type


In pass-through partitioning, the PowerCenter Server processes data without redistributing
rows among partitions. Therefore, all rows in a single partition stay in that partition after
crossing a pass-through partition point.
When you add a partition point to a pipeline, the master thread creates an additional pipeline
stage. Use pass-through partitioning when you want to increase data throughput, but you
cannot or do not want to increase the number of partitions.
You can specify pass-through partitioning at any valid partition point in a pipeline.
Figure 13-13 shows a mapping where pass-through partitioning can increase data throughput:
Figure 13-13. Mapping where Pass-through Partitioning Can Increase Performance

Reader Thread (First Stage)
Transformation Thread (Second Stage)
Writer Thread (Third Stage)

By default, this mapping contains partition points only at the source qualifier and target
instance. Since this mapping contains an XML target, you can configure only one partition at
any partition point.
In this case, the master thread creates one reader thread to read data from the source, one
transformation thread to process the data, and one writer thread to write data to the target.
Each pipeline stage processes the rows as follows:
Time    Source Qualifier    Transformations    Target Instance
        (First Stage)       (Second Stage)     (Third Stage)
        Row Set 1           -                  -
        Row Set 2           Row Set 1          -
        Row Set 3           Row Set 2          Row Set 1
        Row Set 4           Row Set 3          Row Set 2
        ...                 ...                ...
        Row Set n           Row Set n-1        Row Set n-2

Because the pipeline contains three stages, the PowerCenter Server can process three sets of
rows concurrently.
If the Expression transformations are very complicated, processing the second
(transformation) stage can take a long time and cause low data throughput. To improve
performance, set a partition point at Expression transformation EXP_2 and set the partition


type to pass-through. This creates an additional pipeline stage. The master thread creates an
additional transformation thread:

Reader Thread (First Stage)
Transformation Thread (Second Stage)
Transformation Thread (Third Stage)
Writer Thread (Fourth Stage)

The PowerCenter Server can now process four sets of rows concurrently as follows:

Time    Source Qualifier    FIL_1 & EXP_1       EXP_2 & LKP_1       Target Instance
        (First Stage)       Transformations     Transformations     (Fourth Stage)
                            (Second Stage)      (Third Stage)
        Row Set 1           -                   -                   -
        Row Set 2           Row Set 1           -                   -
        Row Set 3           Row Set 2           Row Set 1           -
        Row Set 4           Row Set 3           Row Set 2           Row Set 1
        ...                 ...                 ...                 ...
        Row Set n           Row Set n-1         Row Set n-2         Row Set n-3

By adding an additional partition point at Expression transformation EXP_2, you replace one
long running transformation stage with two shorter running transformation stages. Data
throughput depends on the longest running stage. So in this case, data throughput increases.
For more information about processing threads, see Understanding Processing Threads on
page 14.
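The throughput argument can be made concrete with a small sketch. The per-stage times below are made-up numbers used only to illustrate that a pipeline is paced by its slowest stage, so splitting one long transformation stage into two shorter stages raises throughput:

# Conceptual illustration: once the pipeline is full, throughput is
# limited by the slowest stage.
def row_sets_per_second(stage_times):
    return 1.0 / max(stage_times)

three_stages = [0.2, 1.0, 0.3]         # reader, one long transformation stage, writer
four_stages = [0.2, 0.5, 0.5, 0.3]     # the long stage split at a pass-through partition point

print(row_sets_per_second(three_stages))    # 1.0 row set per second
print(row_sets_per_second(four_stages))     # 2.0 row sets per second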


Database Partitioning Partition Type


When you load to an IBM DB2 table stored on a multi-node tablespace, you can optimize
session performance by using the database partitioning partition type instead of the pass-through partition type for IBM DB2 targets.
When you use database partitioning, the PowerCenter Server queries the DB2 system for
table partition information and loads partitioned data to the corresponding nodes in the
target database.
You can only specify database partitioning for relational targets.
You can specify database partitioning for the target partition type with any number of
pipeline partitions and any number of database nodes. However, you can improve load
performance further when the number of pipeline partitions equals the number of database
nodes.
Use the following rules and guidelines when you use database partitioning:

By default, the PowerCenter Server fails the session when you use database partitioning for
non-DB2 targets. However, you can configure the PowerCenter Server to default to pass-through partitioning when you use database partitioning for non-DB2 relational targets:

On Windows. Select the Treat Database Partitioning as Pass-Through option on the


Configuration tab of the PowerCenter Server setup. By default, this option is disabled.

On UNIX. Add the following entry to the file pmserver.cfg:


TreatDBPartitionAsPassThrough=Yes

You cannot use database partitioning when you configure the session to use source-based
or user-defined commit, constraint-based loading, or session recovery.

The target table must contain a partition key. Also, you must link all not-null partition key
columns in the target instance to a transformation in the mapping.

You must use high precision mode when the IBM DB2 table partitioning key uses a Bigint
field. The PowerCenter Server fails the session when the IBM DB2 table partitioning key
uses a Bigint field and you use low precision mode.

If you create multiple partitions for a DB2 bulk load session, you must use database
partitioning for the target partition type. If you choose any other partition type, the
PowerCenter Server reverts to normal load and writes the following message to the session
log:
ODL_26097 Only database partitioning is support for DB2 bulk load.
Changing target load type variable to Normal.

If you configure a session for database partitioning, the PowerCenter Server reverts to pass-through partitioning under the following circumstances:

The DB2 target table is stored on one node.

You run the session in debug mode using the Debugger.

You configure the PowerCenter Server to treat the database partitioning partition type as
pass-through partitioning and you used database partitioning for a non-DB2 relational
target.


Partitioning Relational Sources


When you run a session that partitions relational or Application sources, the PowerCenter
Server creates a separate connection to the source database for each partition. It then creates
an SQL query for each partition. You can customize the query for each source partition by
entering filter conditions in the Transformations view on the Mapping tab. You can also
override the SQL query for each source partition using the Transformations view on the
Mapping tab.
Figure 13-14 shows where you can override the SQL query for each source partition:
Figure 13-14. Overriding the SQL Query and Entering a Filter Condition


For more information about partitioning Application sources, refer to the PowerCenter
Connect documentation.

Entering an SQL Query


You can enter an SQL override if you want to customize the SELECT statement in the SQL
query. The SQL statement you enter on the Transformations view of the Mapping tab
overrides any customized SQL query that you set in the Designer when you configure the
Source Qualifier transformation. For more information, see Source Qualifier
Transformation in the Transformation Guide.


The SQL query also overrides any key range and filter condition that you enter for a source
partition. So, if you also enter a key range and source filter, the PowerCenter Server uses the
SQL query override to extract source data.
If you create a key that contains null values, you can extract the nulls by creating another
partition and entering an SQL query or filter to extract null values.
To enter an SQL query for each partition, click the Browse button in the SQL Query field.
Enter the query in the SQL Editor dialog box, and then click OK.
If you entered an SQL query in the Designer when you configured the Source Qualifier
transformation, that query appears in the SQL Query field for each partition. To override this
query, click the Browse button in the SQL Query field, revise the query in the SQL Editor
dialog box, and then click OK.

Entering a Filter Condition


If you specify key range partitioning at a relational source qualifier, you can enter an
additional filter condition. When you do this, the PowerCenter Server generates a WHERE
clause that includes the filter condition you enter in the session properties.
The filter condition you enter on the Transformations view of the Mapping tab overrides any
filter condition that you set in the Designer when you configure the Source Qualifier
transformation. For more information, see Source Qualifier Transformation in the
Transformation Guide.
If you use key range partitioning, the filter condition works in conjunction with the key
ranges. For example, you want to select data based on customer ID, but you do not want to
extract information for customers outside the USA. Define the following key ranges:
CUSTOMER_ID     Start Range     End Range
Partition #1                    135000
Partition #2    135000

If you know that the IDs for customers outside the USA fall within the range for a particular
partition, you can enter a filter in that partition to exclude them. Therefore, you enter the
following filter condition for the second partition:
CUSTOMERS.COUNTRY = 'USA'

When the session runs, the following queries for the two partitions appear in the session log:
READER_1_1_1> RR_4010 SQ instance [SQ_CUSTOMERS] SQL Query [SELECT
CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.LAST_NAME FROM
CUSTOMERS WHERE CUSTOMERS.CUSTOMER_ID < 135000]
[...]
READER_1_1_2> RR_4010 SQ instance [SQ_CUSTOMERS] SQL Query [SELECT
CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.LAST_NAME FROM
CUSTOMERS WHERE CUSTOMERS.COUNTRY = 'USA' AND 135000 <=
CUSTOMERS.CUSTOMER_ID]
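As an illustration of how the key range and the filter condition combine, the per-partition WHERE clause can be thought of as the conjunction of the two. The sketch below is a simplified reconstruction for this example only, not the actual SQL generation logic:

# Conceptual sketch: combine a partition's key range and optional filter
# condition into the WHERE clause of the generated SELECT statement.
def build_where(key_column, start_range, end_range, filter_condition=None):
    clauses = []
    if filter_condition:
        clauses.append(filter_condition)
    if start_range is not None:
        clauses.append(f"{start_range} <= {key_column}")
    if end_range is not None:
        clauses.append(f"{key_column} < {end_range}")
    return " AND ".join(clauses)

# Partition #1: no start range, end range 135000, no filter
print(build_where("CUSTOMERS.CUSTOMER_ID", None, 135000))
# Partition #2: start range 135000, no end range, plus the filter condition
print(build_where("CUSTOMERS.CUSTOMER_ID", 135000, None, "CUSTOMERS.COUNTRY = 'USA'"))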


To enter a filter condition, click the Browse button in the Source Filter field. Enter the filter
condition in the SQL Editor dialog box, and then click OK.
If you entered a filter condition in the Designer when you configured the Source Qualifier
transformation, that query appears in the Source Filter field for each partition. To override
this filter, click the Browse button in the Source Filter field, change the filter condition in the
SQL Editor dialog box, and then click OK.


Partitioning File Sources


When a session uses a file source, you can configure it to read the source with one thread or
with multiple threads. The PowerCenter Server creates one connection to the file source when
you configure the session to read with one thread, and it creates multiple concurrent
connections to the file source when you configure the session to read with multiple threads.
Configure the source file name property for partitions 2-n to specify single- or multi-threaded
reading. To configure for single-threaded reading, pass empty data through partitions 2-n. To
configure for multi-threaded reading, leave the source file name blank for partitions 2-n. For
more information about configuring file properties with multiple partitions, see Configuring
for File Partitioning on page 375.

Guidelines for Partitioning File Sources


Use the following guidelines when you configure a file source session with multiple partitions:


You can use pass-through partitioning at the source qualifier.

You can use single- or multi-threaded reading with flat file or COBOL sources.

You can use single-threaded reading with XML sources.

You cannot use multi-threaded reading if the source files are non-disk files, such as FTP
files or IBM MQSeries sources.

If you use a shift-sensitive code page, you can use multi-threaded reading only if the
following conditions are true:

The file is fixed-width.

The file is not line sequential.

You did not enable user-defined shift state in the source definition.

If you configure a session for multi-threaded reading, and the PowerCenter Server cannot
create multiple threads to a file source, it writes a message to the session log and reads the
source with one thread.

When the PowerCenter Server uses multiple threads to read a source file, it may not read
the rows in the file sequentially. If sort order is important, configure the session to read the
file with a single thread. For example, sort order may be important if the mapping contains
a sorted Joiner transformation and the file source is the sort origin.

You can also use a combination of direct and indirect files to balance the load.

Session performance for multi-threaded reading is optimal with large source files.
Although the PowerCenter Server can create multiple connections to small source files,
performance may not be optimal.


Using One Thread to Read a File Source


When the PowerCenter Server uses one thread to read a file source, it creates one connection
to the source. The PowerCenter Server reads the rows in the file or file list sequentially. You
can configure single-threaded reading for direct or indirect file sources in a session:

Reading direct files. You can configure the PowerCenter Server to read from one or more
direct files. If you configure the session with more than one direct file, the PowerCenter
Server creates a concurrent connection to each file. It does not create multiple connections
to a file.

Reading indirect files. When the PowerCenter Server reads an indirect file, it reads the file
list and reads the files in the list sequentially. If the session has more than one file list, the
PowerCenter Server reads the file lists concurrently, and it reads the files in the list
sequentially.

Using Multiple Threads to Read a File Source


When the PowerCenter Server uses multiple threads to read a source file, it creates multiple
concurrent connections to the source. The PowerCenter Server may or may not read the rows
in a file sequentially. You can configure multi-threaded reading for direct or indirect file
sources in a session:

Reading direct files. When the PowerCenter Server reads a direct file, it creates multiple
reader threads to read the file concurrently. You can configure the PowerCenter Server to
read from one or more direct files. For example, if a session reads from two files and you
create five partitions, the PowerCenter Server may distribute one file among two partitions
and one file among three partitions.

Reading indirect files. When the PowerCenter Server reads an indirect file, it creates
multiple threads to read the file list concurrently. It also creates multiple threads to read
the files in the list concurrently. The PowerCenter Server may use more than one thread to
read a single file.
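The following Python sketch shows one common way several threads can read one file concurrently, each over its own byte range. It is meant only to illustrate why row order is not guaranteed when more than one thread reads a file; it does not reproduce the PowerCenter Server's reader implementation:

# Conceptual illustration: several threads each read a different byte
# range of the same file at the same time.
import os
import threading

def read_chunk(path, offset, length, chunks, index):
    with open(path, "rb") as f:
        f.seek(offset)
        chunks[index] = f.read(length)

def threaded_read(path, num_threads):
    size = os.path.getsize(path)
    chunk_size = (size + num_threads - 1) // num_threads
    chunks = [b""] * num_threads
    threads = [
        threading.Thread(target=read_chunk,
                         args=(path, i * chunk_size, chunk_size, chunks, i))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The threads run concurrently, so downstream processing of the chunks
    # does not necessarily follow the order of the rows in the file.
    return chunks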

Configuring for File Partitioning


After you create partition points and configure partitioning information, you can configure
source connection settings and file properties on the Transformations view of the Mapping
tab. Click the source instance name you want to configure under the Sources node. When you
click the source instance name for a file source, the Workflow Manager displays connection
and file properties in the session properties.
You can configure the source file names and directories for each source partition. The
Workflow Manager generates a file name and location for each partition.


Table 13-5 describes the file properties settings for file sources in a mapping:
Table 13-5. File Properties Settings for File Sources

Source File Directory. Enter the local source file directory. The default location is $PMSourceFileDir.

Source File Name. Enter the local source file name. You can also use the session variable, $InputFileName, as defined in the parameter file. If you use a file list, enter the name of the list. By default, the Workflow Manager uses the source file name for each partition. Edit the file name property for partitions 2-n based on how you want the PowerCenter Server to read the files.

Source File Type. Choose Direct to use source files or Indirect to use a file list.

Configuring Sessions to Use a Single Thread


To configure a session to read a file with a single thread, pass empty data through partitions 2-n. To pass empty data, create a file with no data, such as empty.txt, and put it in the source file directory. Then, use empty.txt as the source file name.
Table 13-6 describes the session configuration and the PowerCenter Server behavior when it
uses a single thread to read source files:
Table 13-6. Configuring Source File Name for Single-Threaded Reading

Partition #1: ProductsA.txt
Partition #2: empty.txt
Partition #3: empty.txt
PowerCenter Server behavior: The PowerCenter Server creates one thread to read ProductsA.txt. It reads rows in the file sequentially. After it reads the file, it passes the data to three partitions in the transformation pipeline.

Partition #1: ProductsA.txt
Partition #2: empty.txt
Partition #3: ProductsB.txt
PowerCenter Server behavior: The PowerCenter Server creates two threads. It creates one thread to read ProductsA.txt, and it creates one thread to read ProductsB.txt. It reads the files concurrently, and it reads rows in the files sequentially.

If you use FTP to access source files, you can choose a different connection for each direct
file. For more information about using FTP to access source files, see Using FTP on
page 559.

Configuring Sessions to Use Multiple Threads


To configure a session to read a file with multiple threads, leave the source file name blank for
partitions 2-n. The PowerCenter Server uses partitions 2-n to read a portion of the previous
partition file or file list. The PowerCenter Server ignores the directory field of that partition.


Table 13-7 describes the session configuration and the PowerCenter Server behavior when it
uses multiple threads to read source files:
Table 13-7. Configuring Source File Name for Multi-Threaded Reading

Partition #1: ProductsA.txt
Partition #2: <blank>
Partition #3: <blank>
PowerCenter Server behavior: The PowerCenter Server creates three threads to concurrently read ProductsA.txt.

Partition #1: ProductsA.txt
Partition #2: <blank>
Partition #3: ProductsB.txt
PowerCenter Server behavior: The PowerCenter Server creates three threads to read ProductsA.txt and ProductsB.txt concurrently. Two threads read ProductsA.txt and one thread reads ProductsB.txt.


Partitioning Relational Targets


When you configure a pipeline to load data to a relational target, the PowerCenter Server
creates a separate connection to the target database for each partition at the target instance. It
concurrently loads data for each partition into the target database.
Configure partition attributes for targets in the pipeline on the Transformations view of the
Mapping tab in the session properties. For relational targets, you configure the reject file
names and directories. The PowerCenter Server creates one reject file for each target partition.
Figure 13-15 shows the Properties settings for relational targets:
Figure 13-15. Properties Settings for Relational Targets in the Session Properties



Table 13-8 describes the partitioning attributes for relational targets in a pipeline:
Table 13-8. Partitioning Relational Target Attributes

Reject File Directory. Location for the target reject files. Default is $PMBadFileDir.

Reject File Name. Name of reject file. Default is target name partition number.bad. You can also use the session variable, $BadFileName, as defined in the parameter file.

Database Compatibility
When you configure a session with multiple partitions at the target instance, the PowerCenter
Server creates one connection to the target for each partition. If you configure multiple target
partitions in a session that loads to a database or ODBC target that does not support multiple
concurrent connections to tables, the session fails.
When you create multiple target partitions in a session that loads data to an Informix
database, you must create the target table with row-level locking. If you insert data from a
session with multiple partitions into an Informix target configured for page-level locking, the
session fails and returns the following message:
WRT_8206 Error: The target table has been created with page level locking.
The session can only run with multi partitions when the target table is
created with row level locking.

Sybase IQ does not allow multiple concurrent connections to tables. If you create multiple
target partitions in a session that loads to Sybase IQ, the PowerCenter Server loads all of the
data in one partition.


Partitioning File Targets


When you configure a session to write to a file target, the PowerCenter Server writes the
output to a separate file for each partition at the target instance. When you run the session,
the PowerCenter Server writes to the files concurrently.
You can configure connection settings and file properties for each target partition. You
configure these settings in the Transformations view on the Mapping tab.

Configuring Connection Settings


The Connections settings in the Transformations view on the Mapping tab allow you to
configure the connection type for all target partitions. You can choose different connection
objects for each partition, but they must all be of the same type.
You can use one of the following connection types with target files:

Local. Write the partitioned target files to the local machine.

FTP. Transfer the partitioned target files to another machine. You can transfer the files to
any machine to which the PowerCenter Server can connect. For more information about
using FTP to load to target files, see Using FTP on page 559.

Loader. Use an external loader that can load from multiple output files. This option
appears if the pipeline loads data to a relational target and you choose a file writer in the
Writers settings on the Mapping tab. If you choose a loader that cannot load from multiple
output files, the PowerCenter Server fails the session. For more information about
configuring external loaders for partitioning, see Partitioning Sessions with External
Loaders on page 526.

Message Queue. Transfer the partitioned target files to an IBM MQSeries message queue.
For more information about loading to message queues, refer to the PowerCenter Connect
for IBM MQSeries User and Administrator Guide.

You can merge target files only if you choose local connections for all target partitions.


Figure 13-16 shows the Connections settings for file targets:


Figure 13-16. Connections Settings for File Targets in the Session Properties


Table 13-9 describes the connection options for file targets in a mapping:
Table 13-9. File Targets Connection Options

Connection Type. Choose a local, FTP, external loader, or message queue connection. Select None for a local connection. The connection type is the same for all partitions.

Value. For an FTP, external loader, or message queue connection, click the button in this field to select the connection object. You can specify a different connection object for each partition.

Configuring File Properties


The Properties settings in the Transformations view on the Mapping tab allow you to
configure file properties such as the reject file names and directories, the output file names
and directories, and whether to merge the target files.


Figure 13-17 shows the Properties settings for file targets:


Figure 13-17. Properties Settings for File Targets in the Session Properties


Table 13-10 describes the file properties for file targets in a mapping:
Table 13-10. Target File Properties

Merge Partitioned Files. If you select this option, the PowerCenter Server merges the partitioned target files into one file when the session completes, and then deletes the individual output files. It does not delete the individual files if it fails to create the merged file. You cannot merge files if the session uses FTP, an external loader, or an MQSeries message queue.

Merge File Directory. Location for the merge file. Default is $PMTargetFileDir.

Merge File Name. Name of the merge file. Default is target name.out.

Output File Directory. Location for the target file. Default is $PMTargetFileDir.

Output File Name. Name of target file. Default is target name partition number.out. You can also use the session variable, $OutputFileName, as defined in the parameter file.

Reject File Directory. Location for the target reject files. Default is $PMBadFileDir.

Reject File Name. Name of reject file. Default is target name partition number.bad.


Partitioning Joiner Transformations


When you create a partition point at the Joiner transformation, the Workflow Manager sets
the partition type to hash auto-keys when the transformation scope is All Input. The
Workflow Manager sets the partition type to pass-through when the transformation scope is
Transaction.
You must create the same number of partitions for the master and detail source. If you
configure the Joiner transformation for sorted input, you can change the partition type to
pass-through. See the Transformation Guide for more information about configuring the
Joiner transformation for sorted input.
To use cache partitioning with a Joiner transformation, you must create a partition point at
the Joiner transformation. This allows you to create multiple partitions for both the master
and detail source of a Joiner transformation. For more information about cache partitioning,
see Cache Partitioning on page 359.
Note: If you do not create a partition point at the Joiner transformation, you can create n

partitions for the detail source, but only one partition for the master source (1:n).

Partitioning Sorted Joiner Transformations


When you include a Joiner transformation that uses sorted input in the mapping, you must
verify the Joiner transformation receives sorted data. If your sources contain large amounts of
data, you may want to configure partitioning to improve performance. However, partitions
that redistribute rows can rearrange the order of sorted data, so it is important to configure
partitions to maintain sorted data.
For example, when you use a hash auto-keys partition point, the PowerCenter Server uses a
hash function to determine the best way to distribute the data among the partitions. However,
it does not maintain the sort order, so you must follow specific partitioning guidelines to use
this type of partition point.
When you join data, you can partition data for the master and detail pipelines in the
following ways:

1:n. Use one partition for the master source and multiple partitions for the detail source.
The PowerCenter Server maintains the sort order because it does not redistribute master
data among partitions.

n:n. Use an equal number of partitions for the master and detail sources. When you use
n:n partitions, the PowerCenter Server processes multiple partitions concurrently. You may
need to configure the partitions to maintain the sort order depending on the type of
partition you use at the Joiner transformation.

Note: When you use 1:n partitions, do not add a partition point at the Joiner transformation.

If you add a partition point at the Joiner transformation, the Workflow Manager adds an
equal number of partitions to both master and detail pipelines.
Use different partitioning guidelines, depending on where you sort the data:


Using sorted flat files. Use one of the following partitioning configurations:

Use 1:n partitions when you have one flat file in the master pipeline and multiple flat
files in the detail pipeline. Configure the session to use one reader-thread for each file.

Use n:n partitions when you have one large flat file in the master and detail pipelines.
Configure partitions to pass all sorted data in the first partition, and pass empty file data
in the other partitions.

Using sorted relational data. Use one of the following partitioning configurations:

Use 1:n partitions for the master and detail pipeline.

Use n:n partitions. If you use a hash auto-keys partition, configure partitions to pass all
sorted data in the first partition.

Using the Sorter transformation. Use n:n partitions. If you use a hash auto-keys partition
at the Joiner transformation, configure each Sorter transformation to use hash auto-keys
partition points as well.

Note: Add only pass-through partition points between the sort origin and the Joiner

transformation.

Using Sorted Flat Files


Use 1:n partitions when you have one flat file in the master pipeline and multiple flat files in
the detail pipeline. When you use 1:n partitions, the PowerCenter Server maintains the sort
order because it does not redistribute data among partitions. When you have one large flat file
in each master and detail pipeline, you can use n:n partitions and add a pass-through or hash
auto-keys partition at the Joiner transformation. When you add a hash auto-keys partition
point, you must configure partitions to pass all sorted data in the first partition to maintain
the sort order.

Using 1:n Partitions


If the session uses one flat file in the master pipeline and multiple flat files in the detail
pipeline, you can use one partition for the master source and n partitions for the detail file
sources (1:n). Add a pass-through partition point at the detail Source Qualifier
transformation. Do not add a partition point at the Joiner transformation. The PowerCenter
Server maintains the sort order when you create one partition for the master source because it
does not redistribute sorted data among partitions.
When you have multiple files in the detail pipeline that have the same structure, pass the files
to the Joiner transformation using the following guidelines:

Configure the mapping with one source and one Source Qualifier transformation in each
pipeline.

Specify the path and file name for each flat file in the Properties settings of the
Transformations view on the Mapping tab of the session properties.

Each file must use the same file properties as configured in the source definition.


The range of sorted data in the flat files can overlap. You do not need to use a unique range
of data for each file.

Figure 13-18 shows sorted file data joined using 1:n partitioning:
Figure 13-18. Sorted File Data with 1:n Partitions


The Joiner transformation may output unsorted data depending on the join type. If you use a
full outer or detail outer join, the PowerCenter Server processes unmatched master rows last,
which can result in unsorted data.

Using n:n Partitions


If the session uses sorted flat file data, you can use n:n partitions for the master and detail
pipelines. You can add a pass-through partition or hash auto-keys partition at the Joiner
transformation. If you add a pass-through partition at the Joiner transformation, follow
instructions in the Transformation Guide for maintaining the sort order in mappings.
If you add a hash auto-keys partition point at the Joiner transformation, you can maintain the
sort order by passing all sorted data to the Joiner transformation in a single partition. When
you pass sorted data in one partition, the PowerCenter Server maintains the sort order when it
redistributes data using a hash function.
To allow the PowerCenter Server to pass all sorted data in one partition, configure the session
to use the sorted file for the first partition and empty files for the remaining partitions.
The PowerCenter Server redistributes the rows among multiple partitions and joins the sorted
data.


Figure 13-19 shows sorted file data passed through a single partition to maintain sort order:
Figure 13-19. Sorted File Data Passed Through a Single Partition

The example in Figure 13-19 shows sorted data passed in a single partition to maintain the
sort order. The first partition contains sorted file data while all other partitions pass empty file
data. At the Joiner transformation, the PowerCenter Server distributes the data among all
partitions while maintaining the order of the sorted data.

Using Sorted Relational Data


When you join relational data, you can use 1:n partitions for the master and detail pipeline.
When you use 1:n partitions, you cannot add a partition point at the Joiner transformation. If
you use n:n partitions, you can add a pass-through or hash auto-keys partition at the Joiner
transformation. If you use a hash auto-keys partition point, you must configure partitions to
pass all sorted data in the first partition to maintain sort order.

Using 1:n Partitions


If the session uses sorted relational data, you can use one partition for the master source and n
partitions for the detail source (1:n). Add a key-range or pass-through partition point at the
Source Qualifier transformation. Do not add a partition point at the Joiner transformation.
The PowerCenter Server maintains the sort order when you create one partition for the
master source because it does not redistribute data among partitions.


Figure 13-20 shows sorted relational data with 1:n partitioning:


Figure 13-20. Sorted Relational Data with 1:n Partitioning


The Joiner transformation may output unsorted data depending on the join type. If you use a
full outer or detail outer join, the PowerCenter Server processes unmatched master rows last,
which can result in unsorted data.

Using n:n Partitions


If the session uses sorted relational data, you can use n:n partitions for the master and detail
pipelines and add a pass-through or hash auto-keys partition point at the Joiner
transformation. When you use a pass-through partition at the Joiner transformation, follow
instructions in the Transformation Guide for maintaining sorted data in mappings.
When you use a hash auto-keys partition point, you maintain the sort order by passing all
sorted data to the Joiner transformation in a single partition. Add a key-range partition point
at the Source Qualifier transformation that contains all source data in the first partition.
When you pass sorted data in one partition, the PowerCenter Server redistributes data among
multiple partitions using a hash function and joins the sorted data.


Figure 13-21 shows sorted relational data passed through a single partition to maintain the
sort order:
Figure 13-21. Sorted Relational Data Passed Through a Single Partition


The example in Figure 13-21 shows sorted relational data passed in a single partition to
maintain the sort order. The first partition contains sorted relational data while all other
partitions pass empty data. After the PowerCenter Server joins the sorted data, it redistributes
data among multiple partitions.

Using Sorter Transformations


If the session uses the Sorter transformations to sort data, you can use n:n partitions for the
master and detail pipelines. Use a hash auto-keys partition point at the Sorter transformation
to group the data. You can add a pass-through or hash auto-keys partition point at the Joiner
transformation.
The PowerCenter Server groups data into partitions of the same hash values, and the Sorter
transformation sorts the data before passing it to the Joiner transformation. When the
PowerCenter Server processes the Joiner transformation configured with a hash auto-keys
partition, it maintains the sort order by processing the sorted data using the same partitions it
uses to route the data from each Sorter transformation.


Figure 13-22 shows Sorter transformations used with hash auto-keys to maintain sort order:
Figure 13-22. Using Sorter Transformations with Hash Auto-Keys to Maintain Sort Order

Note: For best performance, use sorted flat files or sorted relational data. You may want to

calculate the processing overhead for adding Sorter transformations to your mapping.

Optimizing Sorted Joiner Transformations with Partitions


When you use partitions with a sorted Joiner transformation, you may optimize performance
by grouping data and using n:n partitions.

Add a Hash Auto-keys Partition Upstream of the Sort Origin


To obtain expected results and get best performance when partitioning a sorted Joiner
transformation, you must group and sort data. To group data, ensure that rows with the same
key value are routed to the same partition. The best way to ensure that data is grouped and
distributed evenly among partitions is to add a hash auto-keys or key-range partition point
before the sort origin. Placing the partition point before you sort the data ensures that you
maintain grouping and sort the data within each group.
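To illustrate the idea, the following minimal Python sketch (illustration only, not PowerCenter code) models hash auto-keys routing ahead of the sort origin: both the master and detail pipelines route each row by a hash of the join key, so rows with the same key always meet in the same partition and can be sorted and joined there without reference to other partitions.

# Illustration only: a simple model of hash auto-keys routing, not the actual
# PowerCenter Server implementation. Partition assignments vary between runs
# because Python randomizes string hashes, but they are consistent within a run.
from collections import defaultdict

NUM_PARTITIONS = 3

def route(rows, key_index):
    """Route each row to a partition based on a hash of its key column."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[hash(row[key_index]) % NUM_PARTITIONS].append(row)
    return partitions

master = [("K1", "m1"), ("K2", "m2"), ("K3", "m3")]
detail = [("K1", "d1"), ("K1", "d2"), ("K2", "d3"), ("K3", "d4")]

master_parts = route(master, 0)
detail_parts = route(detail, 0)

# Because both pipelines hash the same join key, matching keys always land in
# the same partition, so each partition can sort and join its own rows only.
for p in range(NUM_PARTITIONS):
    m = sorted(master_parts.get(p, []))
    d = sorted(detail_parts.get(p, []))
    joined = [(mk, mv, dv) for mk, mv in m for dk, dv in d if mk == dk]
    print(p, joined)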

Use n:n Partitions


You may be able to improve performance for a sorted Joiner transformation by using n:n
partitions. When you use n:n partitions, the Joiner transformation reads master and detail
rows concurrently and does not need to cache all of the master data. This reduces memory
usage and speeds processing. When you use 1:n partitions, the Joiner transformation caches
all the data from the master pipeline and writes the cache to disk if the memory cache fills.
When the Joiner transformation receives the data from the detail pipeline, it must then read
the data from disk to compare the master and detail pipelines.


Partitioning Lookup Transformations


You can use cache partitioning for static and dynamic caches, and named and unnamed
caches. When you create a partition point at a connected Lookup transformation, you can use
cache partitioning under the following conditions:

You use the hash auto-keys partition type for the Lookup transformation.

The lookup condition contains only equality operators.

The database is configured for case-sensitive comparison.


For example, if the lookup condition contains a string port and the database is not
configured for case-sensitive comparison, the PowerCenter Server does not perform cache
partitioning and writes the following message to the session log:
CMN_1799 Cache partitioning requires case sensitive string comparisons.
Lookup will not use partitioned cache as the database is configured for
case insensitive string comparisons.
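The reasoning behind this restriction can be sketched as follows (illustration only, not PowerCenter code): a hash of the raw lookup key routes rows to cache partitions, so two key values that a case-insensitive database treats as equal can land in different partitions and fail to match.

# Illustration only: why case-insensitive comparison conflicts with a
# hash-partitioned lookup cache.
NUM_PARTITIONS = 4

def partition_of(key):
    return hash(key) % NUM_PARTITIONS

# A case-insensitive database treats these two keys as equal, but hashing the
# raw strings can route them to different cache partitions, so a partitioned
# cache could miss the match.
print(partition_of("Informatica"), partition_of("INFORMATICA"))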

For more information about cache partitioning, see Cache Partitioning on page 359.


Partitioning Sorter Transformations


If you configure multiple partitions in a session that uses a Sorter transformation, the
PowerCenter Server sorts data in each partition separately. The Workflow Manager allows you
to choose hash auto-keys, key-range, or pass-through partitioning when you add a partition
point at the Sorter transformation.
Use hash auto-keys partitioning when you place the Sorter transformation before an
Aggregator transformation configured to use sorted input. Hash auto-keys partitioning
groups rows with the same values into the same partition based on the partition key. After
grouping the rows, the PowerCenter Server passes the rows through the Sorter
transformation. The PowerCenter Server processes the data in each partition separately, but
hash auto-keys partitioning accurately sorts all of the source data because rows with matching
values are processed in the same partition.
Use key-range partitioning when you want to send all rows in a partitioned session from
multiple partitions into a single partition for sorting. When you merge all rows into a single
partition for sorting, the PowerCenter Server can process all of your data together.
Use pass-through partitioning if you already used hash partitioning in the pipeline. This
ensures that the data passing into the Sorter transformation is correctly grouped among the
partitions. Pass-through partitioning increases session performance without increasing the
number of partitions in the pipeline.
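The following minimal Python sketch (illustration only, not PowerCenter code) shows why hash auto-keys grouping followed by a per-partition sort is enough for an Aggregator transformation configured to use sorted input: every group key's rows stay together in one partition and arrive in sorted order.

# Illustration only: hash grouping plus per-partition sorting keeps each group
# key's rows together and ordered, which sorted-input aggregation requires.
from collections import defaultdict

NUM_PARTITIONS = 2
rows = [("B", 5), ("A", 1), ("B", 2), ("C", 7), ("A", 3)]

partitions = defaultdict(list)
for key, amount in rows:
    partitions[hash(key) % NUM_PARTITIONS].append((key, amount))

for p, part_rows in sorted(partitions.items()):
    part_rows.sort()                      # the Sorter sorts each partition separately
    totals = {}
    for key, amount in part_rows:         # sorted-input style aggregation per partition
        totals[key] = totals.get(key, 0) + amount
    print(p, part_rows, totals)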
For more information on Sorter transformations, see Sorter Transformation in the
Transformation Guide.

Configuring Sorter Transformation Work Directories


The PowerCenter Server creates temporary files for each Sorter transformation in a pipeline.
It reads and writes data to these files while it performs the sort. The PowerCenter Server stores
these files in the Sorter transformation work directories.
By default, the Workflow Manager sets the work directories for all partitions at Sorter
transformations to $PMTempDir. You can specify a different work directory for each
partition in the session properties.


Figure 13-23 shows where you specify the work directories in the session properties:
Figure 13-23. Session Properties - Configuring Sorter Transformations
(The figure shows the selected Sorter transformation and the fields where you enter the Sorter
transformation work directories.)


Mapping Variables in Partitioned Pipelines


When you specify multiple partitions in a target load order group that uses mapping variables,
the PowerCenter Server evaluates the value of a mapping variable in each partition separately.
The PowerCenter Server uses the following process to evaluate variable values:
1. It updates the current value of the variable separately in each partition according to the
variable function used in the mapping.
2. After loading all the targets in a target load order group, the PowerCenter Server
combines the current values from each partition into a single final value based on the
aggregation type of the variable.
3. If there is more than one target load order group in the session, the final current value of
a mapping variable in a target load order group becomes the current value in the next
target load order group.
4. When the PowerCenter Server completes loading the last target load order group, the
final current value of the variable is saved into the repository.
For more information about mapping variables, see Mapping Parameters and Variables
in the Designer Guide. For more information about target load order groups, see Reading
Source Data on page 22.

Use one of the following variable functions in the mapping to set the variable value:

SetCountVariable

SetMaxVariable

SetMinVariable

For more information about the variable functions, see Functions in the Transformation
Language Reference.
Table 13-11 describes how the PowerCenter Server calculates variable values across partitions:
Table 13-11. Variable Value Calculations with Partitioned Sessions
SetCountVariable: The PowerCenter Server calculates the final count values from all partitions.
SetMaxVariable: The PowerCenter Server compares the final variable value for each partition and saves the highest value.
SetMinVariable: The PowerCenter Server compares the final variable value for each partition and saves the lowest value.
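As a worked illustration (the per-partition numbers are made up), the final per-partition values combine as follows, depending on the aggregation type:

# Illustration only: how final per-partition variable values collapse into a
# single saved value for each aggregation type.
per_partition_final = {"p1": 120, "p2": 95, "p3": 140}

count_result = sum(per_partition_final.values())   # SetCountVariable: totals the counts
max_result = max(per_partition_final.values())     # SetMaxVariable: keeps the highest value
min_result = min(per_partition_final.values())     # SetMinVariable: keeps the lowest value

print(count_result, max_result, min_result)        # 355 140 95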

Note: You should use the SetVariable function only once for each mapping variable in a
pipeline. When you create multiple partitions in a pipeline, the PowerCenter Server uses
multiple threads to process that pipeline. If you use this function more than once for the same
variable, the current value of the mapping variable may be nondeterministic.


Partitioning Rules
You can create multiple partitions in a pipeline if the PowerCenter Server can maintain data
consistency when it processes the partitioned data. When you create a session, the Workflow
Manager validates each pipeline for partitioning. You can change the partitioning information
for a pipeline as long as it conforms to the rules and restrictions listed in this section.
There are several types of partitioning rules and restrictions. These include restrictions on the
number of partitions, partitioning restrictions when you change a mapping, restrictions that
apply to other Informatica products, and general guidelines.

Restrictions on the Number of Partitions


In general, you can create up to 64 partitions at any partition point in each pipeline in a
mapping. Under certain circumstances, however, the number of partitions should or must be
limited.

Restrictions for Numerical Functions


The numerical functions CUME, MOVINGSUM, and MOVINGAVG calculate running
totals and averages on a row-by-row basis. Depending on how you partition a pipeline, the
order in which rows of data pass through a transformation containing one of these functions can
change. Therefore, a session with multiple partitions that uses CUME, MOVINGSUM, or
MOVINGAVG functions may not always return the same calculated result.
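The order dependence is easy to see with a small Python sketch (illustration only; the values are made up and this is not how the PowerCenter Server computes the functions):

# Illustration only: a running total depends on the order rows arrive, so
# splitting the same rows across partitions can change the per-row results.
def running_totals(values):
    total, out = 0, []
    for v in values:
        total += v
        out.append(total)
    return out

rows = [10, 20, 30, 40]
print(running_totals(rows))            # one partition: [10, 30, 60, 100]
print(running_totals([10, 30]),        # two partitions see different row sequences,
      running_totals([20, 40]))        # so the row-by-row results differ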

Restrictions for Relational Targets


When you configure a session to load data to relational targets, the PowerCenter Server can
create one or more connections to each target. If you configure multiple target partitions in a
session that writes to a database or ODBC target that does not support multiple connections,
the session fails.
When you create multiple target partitions in a session that loads data to an Informix
database, you must create the target table with row-level locking.
For more information, see Database Compatibility on page 379.
Sybase IQ does not allow multiple concurrent connections to tables. If you create multiple
target partitions in a session that loads to Sybase IQ, the PowerCenter Server loads all of the
data in one partition.

Restrictions for Transformations


Some restrictions on the number of partitions depend on the types of transformations in the
pipeline. These restrictions apply to all transformations, including reusable transformations,
transformations created in mappings and mapplets, and transformations, mapplets, and
mappings referenced by shortcuts.


Table 13-12 describes the restrictions on the number of partitions for transformations:
Table 13-12. Restrictions on the Number of Partitions for Transformations
Custom transformation: By default, you can only specify one partition if the pipeline contains a Custom transformation. However, this transformation contains an option on the Properties tab to allow multiple partitions. If you enable this option, you can specify multiple partitions at this transformation. Do not select Is Partitionable if the Custom transformation procedure performs the procedure based on all the input data together, such as data cleansing.
External Procedure transformation: By default, you can only specify one partition if the pipeline contains an External Procedure transformation. This transformation contains an option on the Properties tab to allow multiple partitions. If this option is enabled, you can specify multiple partitions at this transformation.
Joiner transformation: You can specify only one partition if the pipeline contains the master source for a Joiner transformation and you do not add a partition point at the Joiner transformation.
XML target instance: You can specify only one partition if the pipeline contains XML targets.
Sequence numbers generated by Normalizer and Sequence Generator transformations might not be sequential for a partitioned source, but they are unique.

Restrictions when Running the Debugger


You can run the Debugger on a session if all pipelines in the mapping contain one partition.

Partition Restrictions for Editing Objects


When you edit object properties, you can affect your ability to create multiple partitions in
a session or to run an existing session with multiple partitions.

Before You Create a Session


When you create a session, the Workflow Manager checks the mapping properties. Mappings
dynamically pick up changes to shortcuts, but not to reusable objects, such as reusable
transformations and mapplets. Therefore, if you edit a reusable object in the Designer after
you save a mapping and before you create a session, you must open and resave the mapping for
the Workflow Manager to recognize the changes to the object.

After You Create a Session with Multiple Partitions


When you edit a mapping after you create a session with multiple partitions, the Workflow
Manager does not invalidate the session even if the changes violate partitioning rules. The
PowerCenter Server fails the session the next time it runs unless you edit the session so that it
no longer violates partitioning rules.


The following changes to mappings can cause session failure:

You delete a transformation that was a partition point.

You add a transformation that is a default partition point.

You move a transformation that is a partition point to a different pipeline.

You change a transformation that is a partition point in any of the following ways:

The existing partition type is invalid.

The transformation can no longer support multiple partitions.

The transformation is no longer a valid partition point.

You disable partitioning in an External Procedure transformation after you create a
pipeline with multiple partitions.

You switch the master and detail source for the Joiner transformation after you create a
pipeline with multiple partitions.

Partition Restrictions for Informatica Application Products


You can specify multiple partitions in Informatica Application products, but there are some
additional restrictions with these products.
Table 13-13 describes the partitioning restrictions that apply to Informatica Application
products:
Table 13-13. Partitioning Guidelines for Informatica Application Products
PowerCenter Connect for PeopleSoft: If the pipeline contains an Application Source Qualifier transformation for PeopleSoft when it is connected to or associated with a PeopleSoft tree, then you can specify only one partition and the partition type must be pass-through.
PowerCenter Connect for IBM MQSeries: For MQSeries sources, you can specify multiple partitions only if there is no associated source qualifier in the pipeline. You cannot merge output files from sessions with multiple partitions if you use an MQSeries message queue as the target connection type.
PowerCenter Connect for SAP R/3: If the mapping contains hierarchies or IDOCs, then you can specify only one partition and the partition type must be pass-through. If you generate the ABAP program using exec SQL, then you can specify only one partition and the partition type must be pass-through. You must use the Informatica default date format to enter dates in key ranges.
PowerCenter Connect for SAP BW: You can specify only one partition when the target load order group contains an SAP BW target.
PowerCenter Connect for Siebel: When you use a source filter in a join override, always use the following syntax for Siebel business components: SiebelBusinessComponentName.SiebelFieldName. When you create a source filter for a Siebel business component, always use the following syntax: SiebelBusinessComponentName.SiebelFieldName.
PowerCenter Connect SDK: If the mapping contains a multi-group target that receives data from more than one pipeline, then you can specify only one partition. If the mapping contains a multi-group target that receives data from multiple groups, then the partition type must be pass-through.

For more information about these products, see the product documentation.

Partitioning Guidelines
This section summarizes the other guidelines that appear throughout this chapter.

Guidelines for Adding and Deleting Partition Points


The following guidelines apply to adding and deleting partition points:
You cannot delete a partition point at a Source Qualifier transformation, a Normalizer
transformation for COBOL sources, or a target instance.
You cannot create a partition point at a source instance.
You cannot create a partition point at a Sequence Generator transformation or an
unconnected transformation.
You can add a partition point at any other transformation provided that no partition point
receives input from more than one pipeline stage.
For more information, see Adding and Deleting Partition Points on page 353.

Guidelines for Specifying the Partition Type


You must choose pass-through partitioning at certain partition points in a pipeline if the
session uses a source-based commit or constraint-based loading, or if the mapping contains a
transaction generator, such as a Transaction Control transformation. For more information,
see Table 13-4 on page 357.
If recovery is enabled, the Workflow Manager sets pass-through as the partition type unless
the partition point is either an Aggregator transformation or a Rank transformation.

Guidelines for Adding and Deleting Partition Keys


The following guidelines apply to creating and deleting partition keys:
A partition key must contain at least one port.
If you choose key range partitioning at any partition point, you must specify a range for
each port in the partition key.
If you choose key range partitioning and need to enter a date range for any port, use the
standard PowerCenter date format. For details on the default date format, see Dates in
the Transformation Language Reference.
The Workflow Manager does not validate overlapping string ranges, overlapping numeric
ranges, gaps, or missing ranges.
If a row contains a null value in any column that makes up the partition key, or if a row
contains values that fall outside all of the key ranges, the PowerCenter Server sends that
row to the first partition, as illustrated in the sketch after this list.
For more information, see Adding Key Ranges on page 365.
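The following minimal Python sketch (illustration only, with made-up ranges) models the fallback behavior described above for key range partitioning:

# Illustration only: key-range routing with the documented fallback. Rows with
# a null key, or a key outside every configured range, go to the first partition.
ranges = [(0, 1000), (1000, 2000), (2000, 3000)]   # [start, end) per partition

def partition_for(key):
    if key is None:
        return 0                                    # null key: first partition
    for i, (start, end) in enumerate(ranges):
        if start <= key < end:
            return i
    return 0                                        # outside all ranges: first partition

for key in (250, 1500, None, 9999):
    print(key, partition_for(key))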

Guidelines for Partitioning File Sources and Targets


The following guidelines apply to partitioning file sources and targets:

When connecting to file sources or targets, you must choose the same connection type for
all partitions. You may choose different connection objects as long as each object is of the
same type. For more information, see Partitioning File Sources on page 374 and
Partitioning File Targets on page 380.

You cannot merge output files from sessions with multiple partitions if you use FTP, an
external loader, or an MQSeries message queue as the target connection type. For more
information, see Partitioning File Targets on page 380.


Chapter 14

Monitoring Workflows
This chapter covers the following topics:

Overview, 402

Using the Workflow Monitor, 404

Customizing Workflow Monitor Options, 409

Using Workflow Monitor Toolbars, 415

Working with Tasks and Workflows, 416

Workflow and Task Status, 421

Using the Gantt Chart View, 423

Using the Task View, 430

Monitoring Session Details, 434

Creating and Viewing Performance Details, 436

Tips, 441


Overview
You can monitor workflows and tasks in the Workflow Monitor. View details about a
workflow or task in Gantt Chart view or Task view. You can run, stop, abort, and resume
workflows from the Workflow Monitor.
The Workflow Monitor displays workflows that have run at least once. The Workflow
Monitor continuously receives information from the PowerCenter Server and Repository
Server. It also fetches information from the repository to display historic information.
The Workflow Monitor consists of the following windows:

Navigator window. Displays monitored repositories, servers, and repository objects.

Output window. Displays messages from the PowerCenter Server and the Repository
Server.

Time window. Displays progress of workflow runs.

Gantt Chart view. Displays details about workflow runs in chronological (Gantt Chart)
format.

Task view. Displays details about workflow runs in a report format, organized by workflow
run.

The Workflow Monitor displays time relative to the time configured on the PowerCenter
Server machine. For example, a folder contains two workflows. One workflow runs on a
PowerCenter Server in your local time zone, and the other runs on a PowerCenter Server in a
time zone two hours later. If you start both workflows at 9 a.m. local time, the Workflow
Monitor displays the start time as 9 a.m. for one workflow and as 11 a.m. for the other
workflow.


Figure 14-1 shows the Workflow Monitor in Gantt Chart view:
Figure 14-1. Workflow Monitor
(The figure labels the Navigator window, the Gantt Chart view, the Task view, the Output
window, and the Time window.)

Toggle between Gantt Chart view and Task view by clicking the tabs on the bottom of the
Workflow Monitor.
Note: You can view and hide the Output window in the Workflow Monitor. To toggle back

and forth, choose View-Output.

Permissions and Privileges


To use the Workflow Monitor, you must have one of the following sets of permissions and
privileges:

Use Workflow Manager privilege with the execute permission on the folder

Workflow Operator privilege with the read permission on the folder

Super User privilege

You must also have execute permission for connection objects to restart, resume, stop, or
abort a workflow containing a session.
For more information on permissions and privileges necessary to use the Workflow Monitor,
see Permissions and Privileges by Task in the Repository Guide.


Using the Workflow Monitor


The Workflow Monitor provides options to view information about workflow runs. After you
open the Workflow Monitor and connect to a repository, you can view dynamic information
about workflow runs by connecting to a PowerCenter Server.
You can customize the Workflow Monitor display by configuring the maximum days or
workflow runs the Workflow Monitor shows. You can also filter tasks and servers in both
Gantt Chart and Task view.
Complete the following steps to monitor workflows:
1. Open the Workflow Monitor.
2. Connect to the repository containing the workflow.
3. Connect to the PowerCenter Server.
4. Select the workflow you want to monitor.
5. Choose from Gantt Chart view or Task view.

Opening the Workflow Monitor


You can open the Workflow Monitor in the following ways:

From the Windows Start menu

From the Workflow Manager Navigator

Configure the Workflow Manager to open the Workflow Monitor when you run a
workflow from the Workflow Manager.

You can open multiple instances of the Workflow Monitor on one machine using the
Windows Start menu.
To open the Workflow Monitor when you start a workflow:
1. In the Workflow Manager, choose Tools-Options.
2. In the General tab, select Launch Workflow Monitor When Workflow Is Started.
To open the Workflow Monitor from the Workflow Manager:
1. In the Workflow Manager, connect to a repository.
2. In the Navigator, right-click a server or a repository and choose Run Monitor.
The Workflow Monitor appears.


Connecting to Repositories
When you open the Workflow Monitor, you must connect to a repository to monitor the
objects in it. Connect to repositories by choosing Repository-Connect. Enter the repository
name and connection information.
Once you connect to a repository, the Workflow Monitor displays a list of servers available for
the repository. The Workflow Monitor can monitor multiple repositories, PowerCenter
Servers, and workflows at the same time.
Note: If you are not connected to a repository, you can remove the repository from the

Navigator. Select the repository in the Navigator and choose Edit-Delete. The Workflow
Monitor displays a message verifying that you want to remove the repository from the
Navigator list. Click Yes to remove the repository. You can connect to the repository again at
any time.

Connecting to PowerCenter Servers


When you connect to a repository, the Workflow Monitor displays all registered PowerCenter
Servers and deleted PowerCenter Servers. To monitor tasks and workflows that run on a
server, you must connect to the server. In the Navigator, the Workflow Monitor displays a red
icon over deleted servers.
To connect to a server, right-click it and choose Connect. When you connect to a server, you
can view all folders that you have read permission on. You can disconnect from a server by
right-clicking it and selecting Disconnect. When you disconnect from a server, or when the
Workflow Monitor cannot connect to a server, the Workflow Monitor displays disconnected
for the server status.
You can also verify whether a PowerCenter Server is running by pinging it. Right-click the
server in the Navigator and select Ping Server. You can view the ping response time in the
Output window.
Note: You can also open a PowerCenter Server node in the Navigator without connecting to it.

When you open a PowerCenter Server, the Workflow Monitor gets workflow run information
stored in the repository. It does not get dynamic workflow run information from currently
running workflows.

Filtering Tasks and Servers


You can filter tasks and servers in both Gantt Chart view and Task view. Use the Filters menu
to hide tasks and servers you do not want to view in the Workflow Monitor.

Filtering Tasks
You can view all or some workflow tasks. You can filter out tasks to view only tasks you want.
For example, if you want to view only Session tasks, you can hide all other tasks. You can view
all tasks at any time.


You can also filter deleted tasks. To filter deleted tasks, choose Filters-Deleted Tasks.
To filter tasks:
1. Choose Filters-Tasks.
The Filter Tasks dialog box appears.
2. Clear the tasks you want to hide, and select the tasks you want to view.
3. Click OK.
Note: When you filter a task, the Gantt Chart view displays a red link between tasks to
indicate a filtered task. You can double-click the link to view the tasks you hid.

Filtering Servers
When you connect to a repository, the Workflow Monitor displays a list of registered servers
and deleted servers. When you register multiple servers, you can filter out servers to view only
servers you want to monitor.
When you hide a server, the Workflow Monitor hides the server from the Navigator for both
Gantt Chart and Task view. You can show the server at any time.
You can hide unconnected servers. When you hide a connected server, the Workflow Monitor
asks if you want to disconnect from the server and then filter it. You must disconnect from a
server before hiding it.
To filter servers:
1. In the Navigator, right-click a repository and select Filter Servers.
or
Choose Filters-Servers.
The Filter Servers dialog box appears.
2. Select the servers you want to view, and clear the servers you want to filter. Click OK.
If you are connected to a server that you clear, the Workflow Monitor prompts you to
disconnect from the server before filtering.
3. Click Yes to disconnect from the server and filter it.
The Workflow Monitor hides the server from the Navigator.
Click No to remain connected to the server. If you click No, you cannot filter the server.

Tip: You can also filter a server in the Navigator by right-clicking it and selecting Filter Server.

Opening and Closing Folders


You can choose which folders to open and close in the Workflow Monitor. When you open a
folder, the Workflow Monitor displays the number of workflow runs that you configured in
the Workflow Monitor options. For more information, see Configuring General Options on
page 409.
You can open and close folders in both Gantt Chart and Task view. When you open a folder,
it opens in both views. To open a folder, right-click it in the Navigator and select Open. Or,
you can double-click the folder.
To view folder contents in the Workflow Monitor, you must have one of the following sets of
permissions and privileges:

Workflow Operator privilege with read permission on the folder

Super User privilege


Viewing Statistics
You can view statistics about the objects you monitor in the Workflow Monitor by choosing
View-Statistics. The Statistics dialog box displays the following information:
Number of opened repositories. Number of repositories you are connected to in the
Workflow Monitor.
Number of connected servers. Number of servers you connected to since you opened the
Workflow Monitor.
Number of fetched tasks. Number of tasks the Workflow Monitor fetched from the
repository during the period specified in the Time window.
Figure 14-2 shows the Statistics dialog box:


Figure 14-2. Workflow Monitor Statistics Dialog Box

Viewing Properties
You can view properties for the following items:

Tasks. You can view properties such as task name, start time, and status.

Sessions. You can view properties about the Session task and session run, such as mapping
name and number of rows successfully loaded. You can also view load statistics about the
session run. For more information on session details, see Monitoring Session Details on
page 434. You can also view performance details about the session run. For more
information, see Creating and Viewing Performance Details on page 436.

Workflows. You can view properties such as start time, status, and run type.

Links. When you double-click a link between tasks in Gantt Chart view, you can view
tasks you hide.

Servers. You can view properties such as server version and startup time. You can also view
the sessions and workflows running on the PowerCenter Server.

Folders. You can view properties such as the number of workflow runs displayed in the
Time window.

To view properties for all objects, right-click the object and select Properties. You can right-click
items in the Navigator or the Time window in either Gantt Chart view or Task view.
To view link properties, double-click the link in the Time window of Gantt Chart view.
When you view link properties, you can double-click a task in the Link Properties dialog box
to view the properties for the filtered task.


Customizing Workflow Monitor Options


You can configure how the Workflow Monitor displays general information, workflows, and
tasks. You can configure general tasks such as the maximum number of days or runs that the
Workflow Monitor displays. You can also configure options specific to Gantt Chart and Task
view.
Choose Tools-Options to configure Workflow Monitor options.
You can configure the following options in the Workflow Monitor:
General. Customize general options such as the maximum number of workflow runs to
display and whether to receive messages from the Workflow Manager. See Configuring
General Options on page 409.
Gantt Chart view. Configure Gantt Chart view options such as workspace color, status
colors, and time format. See Configuring Gantt Chart View Options on page 411.
Task view. Configure which columns to display in Task view. See Configuring Task View
Options on page 412.
Advanced. Configure advanced options such as the number of workflow runs the
Workflow Monitor holds in memory for each server. See Configuring Advanced Options on
page 412.

Configuring General Options


You can customize general options such as the maximum number of days to display and
which text editor to use for viewing session and workflow logs.


Figure 14-3 shows the General Options tab:


Figure 14-3. General Tab for Workflow Monitor Options

Table 14-1 describes the options you can configure on the General tab:
Table 14-1. Workflow Monitor General Options
Maximum Days: Specifies the number of tasks the Workflow Monitor displays, up to a maximum number of days. The default is 5.
Maximum Workflow Runs per Folder: Specifies the maximum number of workflow runs the Workflow Monitor displays for each folder. The default is 200.
Receive Messages from Workflow Manager: Select this option to receive messages from the Workflow Manager. The Workflow Manager sends messages when you start or schedule a workflow in the Workflow Manager. The Workflow Monitor displays these messages in the Output window.
Receive Notifications from Repository Server: Select this option to receive notifications from the Repository Server. Notifications from the Repository Server display in the Output window Notifications tab.
Log File Editor: Enter the path and file name of the text editor to view and edit workflow and session logs. You can browse to select an editor. By default, the Workflow Monitor uses WordPad.
Location: The location where the Workflow Monitor stores temporary versions of log files when you open session or workflow logs from the Workflow Monitor.


Configuring Gantt Chart View Options


You can configure Gantt Chart view options such as workspace color, status colors, and time
format.
Figure 14-4 shows the Gantt Chart Options tab:
Figure 14-4. Gantt Chart Options

Table 14-2 describes the options you can configure on the Gantt Chart Options tab:
Table 14-2. Gantt Chart Options
Status Color: Choose a status and configure the color for the status. The Workflow Monitor displays tasks with the selected status in the colors you choose. You can choose two colors to display a gradient.
Recovery Color: Configure the color for recovery sessions. The Workflow Monitor uses the status color for the body of the status bar, and it uses the recovery color as a gradient in the status bar.
Workspace Color: Choose a color for each workspace component.
Time Format: Select a display format for the Time window.


Configuring Task View Options


You can choose the columns you want to display in Task view. You can also reorder the
columns and specify a default column width.
Figure 14-5 shows the Task View Options tab:
Figure 14-5. Task View Options

Configuring Advanced Options


You can configure advanced options such as the number of workflow runs the Workflow
Monitor holds in memory for each server.


Figure 14-6 shows the Advanced Options tab:


Figure 14-6. Advanced Tab for Workflow Monitor Options

Table 14-3 describes the options you can configure on the Advanced tab:
Table 14-3. Advanced Workflow Monitor Options
Expand Running Workflows Automatically: Expands running workflows in the Navigator.
Hide Folders/Workflows That Do Not Contain Any Runs When Filtering By Running/Schedule Runs: Hides folders or workflows under the Workflow Run column in the Time window when you filter running or scheduled tasks.
Highlight the Entire Row When an Item Is Selected: Highlights the entire row in the Time window for selected items. When you disable this option, the Workflow Monitor highlights the item in the Workflow Run column in the Time window.
Open Latest 20 Runs At a Time: Allows you to open the number of workflow runs of your choice. The number of runs to open is set to 20 by default.
Minimum Number of Workflow Runs (Per Server) the Workflow Monitor Will Accumulate in Memory: Specifies the minimum number of workflow runs per server that the Workflow Monitor holds in memory before it starts releasing older runs from memory. When you connect to a server, the Workflow Monitor fetches the number of workflow runs specified on the General tab for each folder you connect to. When the number of runs is less than the number specified in this option, the Workflow Monitor stores new runs in memory until it reaches this number. Then it releases the oldest run from memory when it fetches a new run. When the number of workflow runs the Workflow Monitor initially fetches exceeds the number specified in this option, the Workflow Monitor stores all of those runs and then releases the oldest run from memory when it fetches a new run.
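The retention behavior described for the last option can be modeled with a minimal Python sketch (illustration only; the threshold comes from the option above, the rest is a simplified model, not the actual Workflow Monitor implementation):

# Illustration only: once the number of runs held in memory reaches the
# configured minimum, fetching a new run releases the oldest one.
MINIMUM_RUNS_IN_MEMORY = 3

runs_in_memory = []

def fetch_run(run_id):
    if len(runs_in_memory) >= MINIMUM_RUNS_IN_MEMORY:
        runs_in_memory.pop(0)            # release the oldest run from memory
    runs_in_memory.append(run_id)

for run in ("wf_run_1", "wf_run_2", "wf_run_3", "wf_run_4"):
    fetch_run(run)
print(runs_in_memory)                    # ['wf_run_2', 'wf_run_3', 'wf_run_4']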


Using Workflow Monitor Toolbars


The Workflow Monitor toolbars allow you to select tools and tasks quickly. You can perform
the following toolbar operations:

Display or hide a toolbar.

Create a new toolbar.

Add or remove buttons.

For details on how to perform these toolbar operations, see Using the Designer in the
Designer Guide.
By default, the Workflow Monitor displays the following toolbars:

Standard. Contains buttons to connect to and disconnect from repositories, and to zoom
and print the workspace.
Figure 14-7 displays the Standard toolbar:
Figure 14-7. Standard Toolbar

Server. Contains buttons to connect to and disconnect from PowerCenter Servers, to ping
the server, and to start and stop workflows, worklets, and tasks.
Figure 14-8 displays the Server toolbar:
Figure 14-8. Server Toolbar

View. Contains buttons to refresh the view and to open workflow and session logs.
Figure 14-9 displays the View toolbar:
Figure 14-9. View Toolbar

Filter. Contains buttons to display most recent runs, and to filter tasks, servers, and
folders.
Figure 14-10 displays the Filter toolbar:
Figure 14-10. Filter Toolbar


Working with Tasks and Workflows


You can perform the following tasks with objects in the Workflow Monitor:

Run a task or workflow.

Resume a suspended workflow.

Stop or abort a task or workflow.

Schedule and unschedule a workflow.

View session logs and workflow logs.

View history names.

Running a Task, Workflow, or Worklet


The Workflow Monitor displays workflows that have run at least once. In the Workflow
Monitor, you can run a workflow or any task or worklet in the workflow. To run a workflow
or part of a workflow, right-click the workflow or task and choose a restart option. When you
choose restart, the task, workflow, or worklet runs on the PowerCenter Server you specify in
the workflow properties.
You can also run part of a workflow. When you run part of a workflow, the PowerCenter
Server runs the workflow from the selected task to the end of the workflow.
For details on running workflows and tasks in the Workflow Manager, see Running the
Workflow on page 122.
To run a workflow from the Workflow Monitor:
1. In the Navigator, select the workflow you want to run.
2. Right-click the workflow in the Navigator and choose Restart.
or
Choose Task-Restart.
The PowerCenter Server runs the workflow you specify.
To run a task from the Workflow Monitor:
1. In the Navigator, select the task or worklet you want to run.
2. Right-click the task or worklet in the Navigator and choose Restart Task.
The PowerCenter Server runs the task or worklet you specify. It does not run the rest of
the workflow.
To run a part of a workflow from the Workflow Monitor:
1. In the Navigator, select the task from which you want to run the workflow.
2. Right-click the task and choose Restart Workflow from Task.
or
Choose Task-Restart.
The PowerCenter Server runs the workflow starting with the task you specify.

Resuming a Workflow or Worklet


In the workflow properties, you can choose to suspend the workflow or worklet if a task fails.
After you fix the failed task, resume the workflow in the Workflow Monitor. When you
resume a workflow, the PowerCenter Server finds the failed task, runs the task again, and
continues running the rest of the tasks in the workflow path.
For details on suspending a workflow, see Suspending the Workflow on page 127.
To resume a workflow or worklet:
1. In the Navigator, select the workflow or worklet you want to resume.
2. Choose Tasks-Resume.
or
Right-click the workflow or worklet in the Navigator and choose Resume.
The Workflow Monitor displays server messages about the resume command in the
Output window.

Recovering a Workflow or Worklet


In the workflow properties, you can choose to suspend the workflow or worklet if a session
fails. After you fix the errors that caused the session to fail, recover the workflow in the
Workflow Monitor. When you recover a workflow, the PowerCenter Server recovers the failed
session, and continues running the rest of the tasks in the workflow path.
For details on suspending a workflow, see Suspending the Workflow on page 127.
To recover a workflow or worklet:
1. In the Navigator, select the workflow or worklet you want to recover.
2. Choose Tasks-Resume/Recover.
or
Right-click the workflow or worklet in the Navigator and choose Resume/Recover.
The Workflow Monitor displays server messages about the recover command in the
Output window.


Stopping or Aborting Tasks and Workflows


You can stop or abort a task, workflow, or worklet in the Workflow Monitor at any time.
When you stop a task in the workflow, the PowerCenter Server stops processing the task and
all other tasks in its path. The PowerCenter Server continues running concurrent tasks. If the
PowerCenter Server cannot stop processing the task, you need to abort the task. When the
PowerCenter Server aborts a task, it kills the DTM process and terminates the task.
For details on server handling of stop and abort, see Server Handling of Stop and Abort on
page 129.
To stop or abort workflows, tasks, or worklets in the Workflow Monitor:
1. In the Navigator, select the task, workflow, or worklet you want to stop or abort.
2. Choose Tasks-Stop or Tasks-Abort.
or
Right-click the task, workflow, or worklet in the Navigator and choose Stop or Abort.
3. The Workflow Monitor displays the status of the stop or abort command in the Output
window.

Scheduling and Unscheduling Workflows


You can schedule and unschedule workflows in the Workflow Monitor. You can schedule any
workflow that is not configured to run on demand. When you try to schedule a run on
demand workflow, the Workflow Monitor displays an error message in the Output window.
When you schedule an unscheduled workflow, the workflow uses its original schedule
specified in the workflow properties. If you want to specify a different schedule for the
workflow, you must edit the scheduler in the Workflow Manager.
To schedule an unscheduled workflow in the Workflow Monitor:
Right-click the workflow and choose Schedule.
The Workflow Monitor displays the workflow status as Scheduled, and displays a message
in the Output window.
To unschedule a scheduled workflow in the Workflow Monitor:
Right-click the workflow and choose Unschedule.
The Workflow Monitor displays the workflow status as Unscheduled, and displays a
message in the Output window.

For details on scheduling workflows, see Scheduling a Workflow on page 112.


Viewing Session Logs and Workflow Logs


You can open and edit session and workflow log files from the Workflow Monitor. To view
workflow or session logs, connect to the server. You can view the most recent session or
workflow log. Or, select a particular workflow run and view the log for that run. If a past
session or workflow log is not available, the Workflow Monitor opens the most recent log file.
You can view log files in any text editor on the PowerCenter Client. To change the log file
editor, choose Tools-Options. Enter the path and file name of the text editor in the Log File
Editor field on the General tab.
When you open a session or workflow log, the Workflow Monitor copies the log file from the
PowerCenter Server machine to the directory specified on the General tab of the Options
dialog box. The Workflow Monitor opens the file from the temporary directory on the client
machine. When you open a session or workflow log, you can cancel the operation at any time.
Note: To view past session or workflow log files, you must configure the session or workflow to

save logs by timestamp. For more information on workflow and session logs, see Log Files
on page 455.

Viewing Dynamic Log Files


When you open a session or workflow log, the Workflow Monitor opens the most recent
version of the log file, even if the PowerCenter Server is currently writing to the log file. Each
time you choose Get Session Log or Get Workflow Log, the Workflow Monitor opens a new
text file with the most recent version of the log file. If you choose to open the log file after the
session completes, the Workflow Monitor opens the entire log in a new text file.

Steps to View Log Files


Perform the following steps to view a session or workflow log.
To view a session or workflow log file:
1. Right-click a Session task or workflow in the Navigator or Time window.
2. Choose Get Session Log, or choose Get Workflow Log.
The most recent session or workflow log file opens in the log file editor you specify for
the Workflow Monitor.
Tip: When the Workflow Monitor retrieves the session or workflow log, you can press the
Esc key to cancel the process.

Viewing History Names


If you rename a task, workflow, or worklet, the Workflow Monitor can show a history of
names. When you start a renamed task, workflow, or worklet, the Workflow Monitor displays
the current name. To view a list of historical names, select the task, workflow, or worklet in
the Navigator. Right-click and choose Show History Names.


Figure 14-11 shows the History Names dialog box:


Figure 14-11. History Names Dialog Box


Workflow and Task Status


The Workflow Monitor displays the status of workflows and tasks.
Table 14-4 describes the different statuses for workflows and tasks:
Table 14-4. Workflow and Task Status
Aborted (Workflows, Tasks): The PowerCenter Server aborted the workflow or task. The PowerCenter Server kills the DTM process when you abort a workflow or task.
Aborting (Workflows, Tasks): The PowerCenter Server is in the process of aborting the workflow or task.
Disabled (Workflows, Tasks): You select the Disabled option in the workflow or task properties. The PowerCenter Server does not run the disabled workflow or task until you clear the Disabled option.
Failed (Workflows, Tasks): The PowerCenter Server failed the workflow or task due to errors.
Running (Workflows, Tasks): The PowerCenter Server is running the workflow or task.
Scheduled (Workflows): You schedule the workflow to run at a future date. The PowerCenter Server runs the workflow for the duration of the schedule.
Stopped (Workflows, Tasks): You choose to stop the workflow or task in the Workflow Monitor. The PowerCenter Server stopped the workflow or task.
Stopping (Workflows, Tasks): The PowerCenter Server is in the process of stopping a workflow or task.
Succeeded (Workflows, Tasks): The PowerCenter Server successfully completed the workflow or task.
Suspended (Workflows, Worklets): The PowerCenter Server suspends the workflow because a task fails and no other tasks are running in the workflow. This status is available only when you choose the Suspend on Error option.
Suspending (Workflows, Worklets): A task fails in the workflow while other tasks are still running. The PowerCenter Server stops executing the failed task and continues executing tasks in other paths. This status is available only when you choose the Suspend on Error option.
Terminated (Workflows): The PowerCenter Server terminated unexpectedly when it was running this workflow or task.
Unscheduled (Workflows): You removed a workflow from the schedule. Or, the workflow is scheduled and the PowerCenter Server is about to run the scheduled workflow.
Waiting (Workflows, Tasks): The PowerCenter Server is waiting for available resources so it can execute the workflow or task. For example, you may set the maximum number of concurrent sessions to 10. If the PowerCenter Server is already executing 10 concurrent sessions, all other workflows and tasks have the Waiting status until the PowerCenter Server is free to execute more tasks.


To see a list of tasks by status, view the workflow in Task view and sort by status. Or, choose
Edit-List Tasks in Gantt Chart view. For details, see Listing Tasks and Workflows on
page 424.


Using the Gantt Chart View


The Gantt Chart view allows you to view chronological details of workflow runs. The Gantt
Chart view displays the following information:

Task name. Name of the task in the workflow.

Duration. The length of time the PowerCenter Server spends running the most recent task
or workflow.

Status. The status of the most recent task or workflow. For more information about status,
see Workflow and Task Status on page 421.

Connection between objects. The Workflow Monitor shows links between objects in the
Time window.

Figure 14-12 displays the Gantt Chart view:


Figure 14-12. Gantt Chart View

Organizing Tasks
In Gantt Chart view, you can organize tasks in the Navigator. You can drag and drop tasks
within a workflow to change the order they appear in the Navigator.

For example, the Workflow Monitor usually displays the Decision task as the first task in the
following workflow:

Decision task displays first.

You can drag and drop the Decision task within the Navigator so the Decision task is in the
middle or at the bottom of the list of tasks for that workflow:

Decision task displays between other tasks.

Listing Tasks and Workflows


The Workflow Monitor lists tasks and workflows in all repositories you connect to. You can
view tasks and workflows by status, such as failed or succeeded. You can highlight the task in
Gantt Chart view by double-clicking the task in the list.


To view a list of tasks and workflows by status:
1. Open the Gantt Chart view and choose Edit-List Tasks. The List Tasks dialog box
appears.
2. In the List What field, select the type of task status you want to list.
For example, select Failed to view a list of failed tasks and workflows.
3. Click List to view the list.
Tip: Double-click the task name in the List Tasks dialog box to highlight the task in Gantt
Chart view.

Navigating the Time Window in Gantt Chart View


You can scroll through the Time window in Gantt Chart view to monitor the workflow runs.
To scroll the Time window, you can use any of the following methods:

Use the scroll bars.

Right-click the task or workflow and choose Go To Next Run, or choose Go To Previous
Run.

Choose View-Organize to select the date you want to display.

When you choose View-Organize, the Go To field appears above the Time window. Click the
Go To field to view a calendar and select the date you want to display. When you choose a
date, the Workflow Monitor displays that date beginning at 12:00 a.m.


Figure 14-13 shows the Go To field:


Figure 14-13. Organizing Gantt Chart

Zooming the Gantt Chart View


You can change the zoom settings in Gantt Chart view. By default, the Workflow Monitor
shows the Time window in increments of one hour. You can change the time increments to
zoom the Time window.


Figure 14-14 shows the Time window in 30 minute increments:
Figure 14-14. Zooming the Gantt Chart View
(The figure shows the Zoom control, 30-minute increments, a solid line marking hour
increments, and a dotted line marking half-hour increments.)

To zoom the Time window in Gantt Chart view, choose View-Zoom and then choose the
desired time increment.
You can also choose the time increment in the Zoom button on the toolbar.

Performing a Search
Use the search tool in the Gantt Chart view to search for tasks, workflows, and worklets in all
repositories you connect to. The Workflow Monitor searches for the word you specify in task
names, workflow names, and worklet names. You can highlight the task in Gantt Chart view
by double-clicking the task after searching.


To perform a search:
1. Open the Gantt Chart view and choose Edit-Find. The Find Object dialog box appears.
2. In the Find What field, enter the keyword you want to find.
3. Click Find Now.
The Workflow Monitor displays a list of tasks, workflows, and worklets that match the
keyword.
Tip: Double-click the task name in the Find Object dialog box to highlight the task in
Gantt Chart view.


Opening All Folders


You can open all folders that you have read permission on in a Repository. To open all the
folders in the Gantt Chart view, right-click the server you want to view, and then choose
Open All Folders. The Workflow Monitor displays workflows and tasks in the folders.


Using the Task View


The Task view displays information about workflow runs in a report format. The Task view
provides a convenient way to compare and filter details of workflow runs. Task view displays
the following information:

Workflow run list. The list of workflow runs. The workflow run list contains folder,
workflow, worklet, and task names. The Workflow Monitor displays workflow runs
chronologically with the most recent run at the top. It displays folders and servers
alphabetically.

Status. The status of the task or workflow.

Start time. The time that the PowerCenter Server starts executing the task or workflow.

Completion time. The time that the PowerCenter Server finishes executing the task or
workflow.

Status message. Message from the PowerCenter Server regarding the status of the task or
workflow.

Run type. The method you used to start the workflow. You might manually start the
workflow or schedule the workflow to start.

Worker server. The PowerCenter Server that ran the task.

You can perform the following tasks in Task view:

Filter tasks. Use the Filter menu to select the tasks you want to display or hide. For more
information on filtering tasks in Task view, see Filtering in Task View on page 431.

Hide and view columns. Hide or view an entire column in Task view. For details on
hiding and viewing columns in Task view, see Configuring Task View Options on
page 412.

Hide and view the Navigator. You can hide the Navigator in Task view. Choose View-Navigator to hide or view the Navigator.

To view the tasks in Task view, select the server you want to monitor in the Navigator.


Figure 14-15 displays the Task view:
Figure 14-15. Task View
(The figure labels the Navigator window, the workflow run list, the Time window, the Task
view tab, and the Output window.)

Filtering in Task View


In Task view, you can view all or some workflow tasks. You can filter tasks in the following
ways:

By task type. You can filter out tasks to view only tasks you want. For example, if you want
to view only session task types, you can filter out all other tasks. For more information on
filtering task types and servers, see Filtering Tasks and Servers on page 405.

By nodes in the Navigator. You can filter the workflow runs the Workflow Monitor
displays in the Time window by selecting different nodes in the Navigator. For example,
when you select a repository name in the Navigator, the Time window displays all
workflow runs that ran on the PowerCenter Servers registered to that repository. When
you select a folder name in the Navigator, the Time window displays all workflow runs in
that folder.

By the most recent runs. To display by the most recent runs, choose Filters-Most Recent
Runs and choose the number of runs you want to display.

By Time window columns. You can choose Filters-Auto Filter and filter by properties you
specify in the Time window columns.


To filter by Time window columns:
1. Choose Filters-Auto Filter.
The Filter button appears in some columns of the Time window in Task view.
2. Click the Filter button in a column in the Time window.
3. Choose the properties you want to filter.
Tip: To view all tasks, select All.
When you click the Filter button in either the Start Time or Completion Time column,
you can choose a custom time to filter.
4. Select Custom for either Start Time or Completion Time. The Filter Start Time or
Custom Completion Time dialog box appears.
5. Choose to show tasks before, after, or between the time you specify. Select the date and
time. Click OK.


Opening All Folders


You can open all folders that you have read permission on in a Repository. To open all folders
in the Task view, right-click the server with the folders you want to view, and then choose
Open All Folders. The Workflow Monitor displays workflows and tasks in the folders.


Monitoring Session Details


When the PowerCenter Server runs a Session task, the Workflow Monitor creates session
details that provide load statistics for each target in the mapping. You can view session details
when the session runs or after the session completes.
To view session details, right-click the session in the Workflow Monitor and choose
Properties. Click the Transformation Statistics tab in the Properties dialog box.
Figure 14-16 shows the session details on the Transformation Statistics tab:
Figure 14-16. Session Properties Transformation Statistics

When you create multiple partitions in a session, the PowerCenter Server provides session
details for each partition. You can use these details to determine if the data is evenly
distributed among the partitions. For example, if the PowerCenter Server moves more rows
through one target partition than another, or if the throughput is not evenly distributed, you
might want to adjust the data range for the partitions.
When you load data to a target with multiple groups, such as an XML target, the
PowerCenter Server provides session details for each group.
Table 14-5 lists the information on the Transformation Statistics tab:
Table 14-5. Session Details on the Transformation Statistics Tab

Session Detail - Description

Instance Name - Name of the source qualifier instance or the target instance in the mapping. If you create multiple partitions in the source or target, the Instance Name displays the partition number. If the source or target contains multiple groups, the Instance Name displays the group name.

Transformation Name - Name of the source qualifier or target.

Applied Rows - For targets, shows the number of rows the PowerCenter Server successfully applied to the target (that is, the target returned no errors). For sources, shows the number of rows the PowerCenter Server successfully read from the source. Note: The number of applied rows equals the number of affected rows for sources.

Affected Rows - For targets, shows the number of rows affected by the specified operation. For example, you have a table with one column called SALES_ID and five rows containing the values 1, 2, 3, 2, and 2. You mark rows for update where SALES_ID is 2. The writer affects three rows, even though there was only one update request. Or, if you mark rows for update where SALES_ID is 4, the writer affects 0 rows. For sources, shows the number of rows the PowerCenter Server successfully read from the source. Note: The number of applied rows equals the number of affected rows for sources.

Rejected Rows - Number of rows the PowerCenter Server dropped when reading from the source, or the number of rows the PowerCenter Server rejected when writing to the target.

Throughput (Rows/Sec) - Rate at which the PowerCenter Server read rows from the source or wrote rows to the target, in rows per second.

Last Error Message - The most recent error message written to the session log. If you view details after the session completes, this field displays the last error message.

Last Error Code - The error message code of the most recent error message written to the session log. If you view details after the session completes, this field displays the last error code.

Start Time - The time the PowerCenter Server started to read from the source or write to the target. The Workflow Monitor displays time relative to the PowerCenter Server.

End Time - The time the PowerCenter Server finished reading from the source or writing to the target. The Workflow Monitor displays time relative to the PowerCenter Server.


Creating and Viewing Performance Details


The performance details provide counters that help you understand the session and mapping
efficiency. Each source qualifier, target definition, and individual transformation appears in
the performance details, along with counters that display performance information about
each transformation.
You can view performance details through the Workflow Monitor as the session runs, or you
can open the resulting file in a text editor.
You create performance details by selecting Collect Performance Data in the session properties
before running the session. By evaluating the final performance details, you can determine
where session performance slows down. Monitoring also provides session-specific details that
can help tune the following:

- Buffer block size
- Index and data cache size for Aggregator, Rank, Lookup, and Joiner transformations
- Lookup transformations

Before using performance details to improve session performance, you must do the following:

- Enable monitoring
- Increase Load Manager shared memory
- Understand performance counters

Enabling Monitoring
To view performance details, you must enable monitoring in the session properties before
running the session.
To enable monitoring:
1. In the Workflow Manager, open the selected session properties.
2. In the Performance settings of the Properties tab, select Collect Performance Data, and click OK.
3. Run the session.

Viewing Session Performance Details


You can view session performance details in the Workflow Monitor or by locating and
opening the performance details file.
In the Workflow Monitor, you can watch performance details during the session run.


To view performance details in the Workflow Monitor:


1. While the session is running, right-click the session in the Workflow Monitor and choose Properties.
2. Click the Performance tab in the Properties dialog box.
3. Click OK.

To view the performance details file:


1. Locate the performance details file.
   The PowerCenter Server names the file session_name.perf, and stores it in the same directory as the session log. If there is no session-specific directory for the session log, the PowerCenter Server saves the file in the default log files directory.
2. Open the file in any text editor.
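If you prefer to inspect the file from a script rather than a text editor, the following sketch shows one way to locate and print it. The directory and session name are assumptions for illustration; substitute the session log directory and session name configured for your PowerCenter Server.

import os

session_log_dir = "/opt/pcserver/SessLogs"   # assumed value of $PMSessionLogDir
session_name = "s_PhoneList"                 # assumed session name

# The PowerCenter Server names the file session_name.perf and stores it in the
# same directory as the session log.
perf_file = os.path.join(session_log_dir, session_name + ".perf")

if os.path.isfile(perf_file):
    with open(perf_file) as f:
        print(f.read())
else:
    print("No performance details file found at", perf_file)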

Memory Requirement for Performance Details


When you enable monitoring, you must increase the size of the Load Manager shared memory. For each session you configure to create performance details, the Load Manager requires 200,000 bytes of additional shared memory.
If you create performance details for all sessions, multiply the MaxSessions parameter by 200,000 bytes to calculate the additional shared memory requirement.
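For example, if MaxSessions is set to 10 and every session is configured to collect performance data, the Load Manager requires an additional 10 x 200,000 = 2,000,000 bytes (roughly 2 MB) of shared memory.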

Understanding Performance Counters


All transformations have some basic counters that indicate the number of input rows, output
rows, and error rows.
Source Qualifiers, Normalizers, and targets have additional counters that indicate the
efficiency of data moving into and out of buffers. You can use these counters to locate
performance bottlenecks.

Some transformations have counters specific to their functionality. For example, each Lookup
transformation has a counter that indicates the number of rows stored in the lookup cache.
When you read performance details, the first column displays the transformation name as it
appears in the mapping, the second column contains the counter name, and the third column
holds the resulting number or efficiency percentage.
When you create multiple partitions in a pipeline, the PowerCenter Server generates one set
of counters for each partition. The following performance counters illustrate two partitions
for an Expression transformation:
Transformation - Counter - Value
EXPTRANS [1] - Expression_input rows - 16
EXPTRANS [1] - Expression_output rows - 16
EXPTRANS [2] - Expression_input rows - 16
EXPTRANS [2] - Expression_output rows - 16

Note: When you increase the number of partitions, the number of aggregate or rank input

rows may be different from the number of output rows from the previous transformation.
Table 14-6 lists the counters that may appear in the Session Performance Details dialog box or
in the performance details file:
Table 14-6. Performance Counters
Aggregator and Rank Transformations:
Aggregator/Rank_inputrows - Number of rows passed into the transformation.
Aggregator/Rank_outputrows - Number of rows sent out of the transformation.
Aggregator/Rank_errorrows - Number of rows in which the PowerCenter Server encountered an error.
Aggregator/Rank_readfromcache - Number of times the PowerCenter Server read from the index or data cache.
Aggregator/Rank_writetocache - Number of times the PowerCenter Server wrote to the index or data cache.
Aggregator/Rank_readfromdisk - Number of times the PowerCenter Server read from the index or data file on the local disk, instead of using cached data.
Aggregator/Rank_writetodisk - Number of times the PowerCenter Server wrote to the index or data file on the local disk, instead of using cached data.
Aggregator/Rank_newgroupkey - Number of new groups the PowerCenter Server created.
Aggregator/Rank_oldgroupkey - Number of times the PowerCenter Server used existing groups.

Lookup Transformation:
Lookup_inputrows - Number of rows passed into the transformation.
Lookup_outputrows - Number of rows sent out of the transformation.
Lookup_errorrows - Number of rows in which the PowerCenter Server encountered an error.
Lookup_rowsinlookupcache - Number of rows stored in the lookup cache.

Joiner Transformation:
Joiner_inputMasterRows - Number of rows the master source passed into the transformation.
Joiner_inputDetailRows - Number of rows the detail source passed into the transformation.
Joiner_outputrows - Number of rows sent out of the transformation.
Joiner_errorrows - Number of rows in which the PowerCenter Server encountered an error.
Joiner_readfromcache - Number of times the PowerCenter Server read from the index or data cache.
Joiner_writetocache - Number of times the PowerCenter Server wrote to the index or data cache.
Joiner_readfromdisk* - Number of times the PowerCenter Server read from the index or data files on the local disk, instead of using cached data.
Joiner_writetodisk* - Number of times the PowerCenter Server wrote to the index or data files on the local disk, instead of using cached data.
Joiner_readBlockFromDisk** - Number of times the PowerCenter Server read from the index or data files on the local disk, instead of using cached data.
Joiner_writeBlockToDisk** - Number of times the PowerCenter Server wrote to the index or data files on the local disk, instead of using cached data.
Joiner_seekToBlockInDisk** - Number of times the PowerCenter Server accessed the index or data files on the local disk.
Joiner_insertInDetailCache* - Number of times the PowerCenter Server wrote to the detail cache. The PowerCenter Server generates this counter only if you join data from a single source.
Joiner_duplicaterows - Number of duplicate rows the PowerCenter Server found in the master relation.
Joiner_duplicaterowsused - Number of times the PowerCenter Server used the duplicate rows in the master relation.

All Other Transformations:
Transformation_inputrows - Number of rows passed into the transformation.
Transformation_outputrows - Number of rows sent out of the transformation.
Transformation_errorrows - Number of rows in which the PowerCenter Server encountered an error.

*The PowerCenter Server generates this counter when you use sorted input for the Joiner transformation.
**The PowerCenter Server generates this counter when you do not use sorted input for the Joiner transformation.

If you have multiple source qualifiers and targets, evaluate them as a whole. For source
qualifiers and targets, a high value is considered 80-100 percent. Low is considered 0-20
percent.


Tips
Reduce the size of the Time window.
When you reduce the size of the Time window, the Workflow Monitor refreshes the screen
faster, reducing flicker.
Use the Repository Manager to truncate the list of workflow logs.
If the Workflow Monitor takes a long time to refresh from the repository or to open folders,
truncate the list of workflow logs. When you configure a session or workflow to archive
session logs or workflow logs, the PowerCenter Server saves those logs in local directories. The
repository also creates an entry for each saved workflow log and session log. If you move or
delete a session log or workflow log from the workflow log directory or session log directory,
truncate the lists of workflow and session logs to remove the entries from the repository. The
repository always retains the most recent workflow log entry for each workflow.


Chapter 15

Using Multiple Servers


This chapter covers the following topics:

- Overview, 444
- Using Server Variables, 445
- Working with Server Grids, 446
- Configuring Server Grids, 450

Overview
You can register and run multiple PowerCenter Servers against a local or global repository.
When you register multiple PowerCenter Servers to the same repository, you can distribute
the workload across the servers to increase performance.
You have the following options to run workflows and sessions using multiple servers:

Use a server grid to run workflows. You can use a server grid to automate the distribution
of sessions. A server grid is a server object that distributes sessions in a workflow to servers
based on server availability. The grid maintains connections to multiple servers in the grid.
For more information about using server grids, see Working with Server Grids on
page 446.

Change the assigned server for a workflow. When you configure a workflow, you assign a
server to run that workflow. Each time the scheduled workflow runs, it runs on the
assigned server. You can change the assigned server for a workflow in the workflow
properties.

Change the assigned server for a session. When you configure a session, by default it runs
on the server assigned to the workflow. You can change the assigned server for a session in
the session properties.

Start a workflow on a non-assigned server. By default, each workflow runs on its assigned
PowerCenter Server. You can run a workflow on a non-assigned server if the workflow is
not currently running. Use the Start Workflow button on the Standard toolbar, and choose
a PowerCenter Server.

You can use the Workflow Monitor to monitor workflows running on multiple servers. For
server grids, the Workflow Monitor shows the individual status of each server in a grid. You
can identify the server grid that a server is assigned to by right-clicking the server in the
Workflow Monitor and selecting Properties. For more information about using the Workflow
Monitor, see Monitoring Workflows on page 401.
Tip: You might want to place the most CPU-intensive sessions on the more powerful servers.


Using Server Variables


In a multiple server environment, each server must have access to input files and directories
used by the session it runs. You can use server variables to simplify the process of changing the
server that runs a session or workflow. Server variables set the paths for files and caches
created during a session.
If you override a server variable in a workflow or session, you may need to manually edit the
session or workflow properties. If the new PowerCenter Server cannot locate the override
directory, it cannot run the session.

Using a File Server


Consider setting up a central location or using a file server accessible to all the PowerCenter
Servers. This allows you to run sessions on different servers without moving cache files and
input files.

- Configure $PMRootDir for each server to point to the central location.
- Use the same variables on each machine.

If you do not use a central file server, you need to relocate input files to the default directories
of the new PowerCenter Server. Input files can include parameter files, cache files, external
procedures, and flat file sources.

Running Sessions with Cache Files


In a multiple server environment, each PowerCenter Server needs access to the index and data
cache files created during previous sessions. This can include incremental aggregation files
and persistent lookup cache files. If the PowerCenter Server cannot locate the cache files, it
rebuilds them.
When the PowerCenter Server rebuilds incremental aggregation files, it loses aggregate
history. Use one of the following methods to save aggregate history in a multiple server
environment:

Use consistent server variables. Use the same variable for $PMCacheDir for each
PowerCenter Server running incremental aggregation sessions.

Run incremental aggregation sessions on the same machine. When you run large
incremental aggregation sessions, you might want to consider assigning a server to a
session and overriding the server variable to write to a drive local to the assigned
PowerCenter Server.

Move incremental aggregation files. If you cannot make files accessible to each
PowerCenter Server, or if the files are very large, you must move them to the server
running the session.

Note: Since aggregate files can become very large, make sure the directory can accommodate

the necessary files.


Working with Server Grids


You can increase workflow performance by using a server grid to balance the server workload.
When you create a server grid, you can add PowerCenter Servers to the grid. When you run a
workflow against a PowerCenter Server in the grid, that server becomes the master server for
the workflow. The master server runs all non-session tasks and assigns session tasks to run on
other servers in the grid. The other servers become worker servers for that workflow run.
You can specify server grid distribution options at the server level, workflow level, and session
level. PowerCenter Servers specified at the session level override both server level and
workflow level properties. For more information about these overrides, see Configuring
Server Grids on page 450.
Note: You cannot run a single session on multiple servers.

Distributing Sessions
In a server grid, the master server starts the workflow and then distributes sessions to worker
servers. The master server is the server that starts a workflow. A worker server is a server that
runs sessions assigned to it by a master server. By default, each PowerCenter Server in a server
grid is both a master server and a worker server. This means that a server in a grid can
distribute sessions to and receive sessions from every server in the grid. The master server
distributes sessions that are ready to run to available worker servers in a round-robin fashion
based on server availability. The starting point for the session assignment is random.
If a worker server is running the maximum number of concurrent sessions, the master server
assigns another worker server to run the session. If all worker servers are running the
maximum number of concurrent sessions, the master server places the session in its own ready
queue.
For information about configuring the maximum number of concurrent sessions, see
Installing and Configuring the PowerCenter Server on Windows and Installing and
Configuring the PowerCenter Server on UNIX in the Installation and Configuration Guide.
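The following sketch outlines the distribution behavior described above. It is an illustration only, not PowerCenter code; the server names, the per-server session limit, and the ready queue are assumptions made for the example, and the actual starting point for the assignment is random.

from collections import deque

class Server:
    def __init__(self, name, max_sessions=2):
        self.name = name
        self.max_sessions = max_sessions   # assumed maximum number of concurrent sessions
        self.running = 0
        self.ready_queue = deque()         # used only when this server acts as the master

def distribute(master, workers, sessions, start=0):
    # Assign each ready session to the next available worker, round-robin.
    # If every worker is running its maximum number of concurrent sessions,
    # the master places the session in its own ready queue.
    index = start
    for session in sessions:
        for attempt in range(len(workers)):
            worker = workers[(index + attempt) % len(workers)]
            if worker.running < worker.max_sessions:
                worker.running += 1
                index = (index + attempt + 1) % len(workers)
                print(session, "->", worker.name)
                break
        else:
            master.ready_queue.append(session)
            print(session, "-> held in the ready queue on", master.name)

# Example: Server A is the master and distributes three sessions to Servers B and C.
master = Server("ServerA")
workers = [Server("ServerB"), Server("ServerC")]
distribute(master, workers, ["s_Session1", "s_Session2", "s_Session3"])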
Figure 15-1 shows how a master server distributes the sessions in Workflow1 among the
servers in a grid. The server grid contains Server A, Server B, and Server C. Server A is the
master server, and Server B and Server C are worker servers.
Figure 15-1. Distributing Sessions in a Server Grid
In Workflow1, Server A is the master server.


Figure 15-2 shows how a master server distributes sessions in a workflow where a non-session
task exists. Server C is the master server, and Server A and Server B are worker servers. Server
C runs all non-session tasks it encounters and assigns sessions in a round-robin fashion.
Figure 15-2. Running a Non-session Task on the Master Server
Server C is the master server.


Server Grid Connectivity


PowerCenter Servers in a server grid create and maintain a connection to each other. A server
grid contains information about other servers in the grid. When you start a PowerCenter
Server, it fetches the server grid object and creates a TCP/IP connection to the other servers in
the grid.
Each server in the grid monitors the other servers to check connectivity status. As a result, the
grid notifies each server when you add, edit, or delete any server in the grid.
You can add servers to a server grid at any time. When a server starts up, it connects to the
grid and can run sessions from master servers and distribute sessions to worker servers in the
grid. The Workflow Monitor communicates with the master server to monitor progress of
workflows, get session statistics, retrieve performance details, and stop or abort the workflow
or task instances.
If a PowerCenter Server loses its connection to the grid, it tries to reestablish a connection.
You do not need to restart the server for it to connect to the grid. If a PowerCenter Server is
not connected to the server grid, the other PowerCenter Servers in the server grid do not send
it tasks.
When a PowerCenter Server cannot reestablish a connection to the grid, session and workflow
completion depends on factors such as shut down mode and which server loses connectivity.


Table 15-1 lists scenarios where a server grid can lose connectivity:
Table 15-1. Losing Connectivity in a Server Grid
Connectivity Loss: Worker server shuts down unexpectedly or you shut it down before it receives a session.
Server Behavior: The worker server is not available to the master servers in the server grid. Master servers do not assign a session to the unavailable worker server and proceed with the round-robin distribution of sessions.

Connectivity Loss: Worker server shuts down unexpectedly while running a session.
Server Behavior: The master server marks the status of the session as terminated. The worker server stops running all sessions. The session settings you specify determine if the workflow fails. For more information about the Fail parent if this task fails option, Fail parent if this task does not run option, or Disable this task option, see Configuring Tasks on page 135.

Connectivity Loss: You shut down a worker server while it is running a session.
Server Behavior: The shut down mode you specify determines how the worker server handles sessions when it shuts down. When you shut down the worker server in complete mode, it continues to run the sessions it started until they complete, but does not accept sessions from master servers. For more information about shut down modes, see pmcmd Reference on page 594.

Connectivity Loss: Worker server loses its network connection and cannot connect to the server grid.
Server Behavior: The worker server continues to run the session and writes its status to the session log. However, the master server marks the status of the session as terminated. You must resume the workflow or resume from the failed task to continue running the workflow and update the session status. If you do not need the session status of the previous run, you can restart the workflow or restart the workflow from a task to start up a new workflow run. For more information, see Working with Tasks and Workflows on page 416.

Connectivity Loss: Master server shuts down unexpectedly.
Server Behavior: The workflow fails. You must restart the workflow on another server or wait for the master server to become available.

Connectivity Loss: You shut down the master server while running a workflow or session.
Server Behavior: The shut down mode you specify determines how the master server handles workflows and sessions when it shuts down. When you shut down the master server in complete mode, it continues to run the workflows and sessions it started until they complete, but does not accept tasks from other master servers. For more information about shut down modes, see pmcmd Reference on page 594.

Connectivity Loss: Master server loses its network connection and cannot connect to the server grid.
Server Behavior: The master server continues to run workflows as a standalone PowerCenter Server. If a worker server is assigned to a session, the session fails because the master server cannot distribute the session to the worker server. The session settings you specify determine if the workflow fails. For more information about the Fail parent if this task fails option, Fail parent if this task does not run option, or Disable this task option, see Configuring Tasks on page 135.

Server Grid Guidelines and Requirements


Informatica recommends that each PowerCenter Server in a server grid uses the same
operating system. While you can specify different session log directories, workflow log directories, and temp directories for the PowerCenter Servers, each PowerCenter Server in a server grid must meet the following requirements:

- Register each PowerCenter Server to the same repository.
- Use the same database connectivity for each PowerCenter Server.
- Use the same server variables for each server in a grid, except for the $PMTempDir, $PMSessionLogDir, and $PMWorkflowLogDir variables.
- Use the same cache directory.
- Configure the following PowerCenter Server parameters the same:
  - Fail session if maximum number of concurrent sessions is reached
  - PMServer 4.0 date handling compatibility
  - Aggregate treat null as zero
  - Aggregate treat rows as insert
  - Treat CHAR as CHAR on read
  - Data Movement Mode
  - Validate Data Code Pages
  - Output Session Log In UTF8
  - Export Session Log Lib Name
  - Treat Null in comparison operator as
  - Data Display Format
- PowerCenter Servers must be the same product version.
- DB2 EEE loader must be on the same machine as PowerCenter Server.


Configuring Server Grids


When you work with server grids, you can configure properties in the grid, workflow, and
session. When you run a session using a server grid, the server grid evaluates session properties
first, then workflow properties, and then grid properties.

Configuring Server Grid Properties


By default, each PowerCenter Server you add to the server grid can be both a master server
and a worker server. Each server accepts tasks from the grid. You can configure a server to be
only a master server by clearing Accept tasks from Server Grid. A PowerCenter Server that is
only a master server does not run sessions from other servers in the grid, but it can distribute
sessions to other servers in the grid.

Configuring Workflow Properties


When you configure a workflow, you can configure the following server properties:

You can assign a server to run the workflow. When you assign a server to a workflow, the
server becomes the master server for the workflow.

You can configure the entire workflow to run only on the master server. By default, the
master server distributes sessions to worker servers. You can configure the session to
override this workflow configuration.

Configuring Session Properties


You can assign a server to run a session. When you assign a server to a session, you override
workflow and grid server assignments. You might want to assign a server to sessions that use
the following features:

Caching. When you run sessions that access large cache files, such as incremental
aggregation files, you can increase performance by using a drive local to the PowerCenter
Server for the cache directory. Assign a server to a session and override the server variable
to write to a drive local to the PowerCenter Server.

External loader. Assign a server to run DB2 EEE external loader sessions. DB2 EEE
loaders require that the loader process runs on the PowerCenter Server running the session.

Note: If you assign a server to a session that is not in the grid, and the master server cannot

connect to the assigned server, the session fails.


Override Examples
Table 15-2 shows a configuration where the session properties override the workflow
properties. The session runs on Server B even though you select the workflow option to run
all tasks on Server A because the session is assigned to Server B.
Table 15-2. Override Workflow Properties
Grid: Server A accepts tasks from server grid. Server B accepts tasks from server grid.
Workflow: Run on Server A. Tasks must run on server.
Session: Run on Server B.

Table 15-3 shows a configuration where the session properties override the server grid
properties. The session runs on Server B, even though you configure Server B not to accept
tasks from the grid because you assigned the session to Server B.
Table 15-3. Override Server Grid Properties
Grid: Server A accepts tasks from server grid. Server B does not accept tasks from server grid.
Workflow: Run on Server A. Tasks can run on other servers in the grid.
Session: Run on Server B.

Steps for Creating a Server Grid


Use the Server Grid Browser to create and edit server grids. When you create or edit a server
grid, you can choose servers from the list of available servers. A server is available if it is
registered in the same repository and is not part of another server grid. You can add up to 64
PowerCenter Servers in a grid.
Use the following procedure to create a server grid.
To create a server grid:
1. Choose Server-Server Grid.
   The Server Grid Browser opens.
2. Click New.
   The Server Grid Editor opens with a list of available PowerCenter Servers.
3. Enter a server grid name and description.
4. Select the server you want to include in the server grid, and click Add.
   The selected server appears in the Selected Servers column.
5. Clear Accept tasks from Server Grid if you want the server to be only a master server.
6. Repeat steps 4 and 5 until you have chosen all the servers for the grid.
7. Click OK.
   The server grid name appears in the Server Grid Browser. Select Show servers in grid to view the servers in the grid.
8. Click Close.


Chapter 16

Log Files
This chapter covers the following topics:

- Overview, 456
- Workflow Logs, 457
- Session Logs, 463
- Reject Files, 476

Overview
The PowerCenter Server can create log files for each workflow it runs. These files contain
information about the tasks the PowerCenter Server performs, plus statistics about the
workflow and all sessions in the workflow. If the writer or target database rejects data during a
session run, the PowerCenter Server creates a file that contains the rejected rows.
The PowerCenter Server can create the following types of log files:

Workflow log. Contains information about the workflow run such as workflow name,
tasks executed, and workflow errors. By default, the PowerCenter Server writes this
information to the server log or Windows Event Log, depending on how you configure the
PowerCenter Server. If you wish to create a workflow log, enter a workflow log file name in the
workflow properties. For more information, see Workflow Logs on page 457.

Session log. Contains information about the tasks that the PowerCenter Server performs
during a session, plus load summary and transformation statistics. By default, the
PowerCenter Server creates one session log for each session it runs. If a workflow contains
multiple sessions, the PowerCenter Server creates a separate session log for each session in
the workflow. For more information, see Session Logs on page 463.

Reject file. Contains rows rejected by the writer or target file during a session run. If the
writer or target does not reject any data during a session, the PowerCenter Server does not
generate a reject file for that session. For more information, see Reject Files on page 476.

By default, the PowerCenter Server saves each type of log file in its own directory. The
PowerCenter Server represents these directories using server variables.
Table 16-1 shows the default location for each type of log file:
Table 16-1. Log File Default Locations
Log File Type - Default Directory (Server Variable) - Value
Workflow logs - $PMWorkflowLogDir - $PMRootDir/WorkflowLogs
Session logs - $PMSessionLogDir - $PMRootDir/SessLogs
Reject files - $PMBadFileDir - $PMRootDir/BadFiles

You can change the default directories at the server level by editing the server connection in
the Workflow Manager. You can also override these values for individual workflows or sessions
by updating the workflow or session properties.


Workflow Logs
You can configure a workflow to create a workflow log. When you do this, the PowerCenter
Server writes information such as process initialization, workflow task run information, errors
encountered, and workflow run summary to the workflow log.
In general, a workflow log contains the following information about the workflow:

- Workflow name
- Workflow status
- Status of tasks and worklets in the workflow
- Start and end times for tasks and worklets
- Results of link conditions
- Some session messages and errors
- Errors encountered during the workflow

The PowerCenter Server categorizes workflow log error messages into severity levels. The
PowerCenter Server either writes or does not write an error message to the log file based on
the error severity level. You can set the Error Severity Level for Log Files in the PowerCenter
Server setup program. For more information, see Installing and Configuring the
PowerCenter Server on Windows or Installing and Configuring the PowerCenter Server on
UNIX in the Installation and Configuration Guide. You can also configure the PowerCenter
Server to suppress writing messages to the workflow log file completely.
As with PowerCenter Server logs and session logs, the PowerCenter Server enters a code
number into the workflow log file message along with message text. You can find information
on error messages in the Troubleshooting Guide.
You configure a workflow to create a workflow log by entering a workflow log file name in the
workflow properties. If you choose to create a workflow log, the PowerCenter Server saves the
workflow log in a directory entered for the server variable $PMWorkflowLogDir in the
PowerCenter Server registration. You can override the workflow log directory at the server
level or at the workflow level.
By default, the PowerCenter Server saves one workflow log for each workflow. If you want to
save multiple logs for different workflow runs, you can configure the workflow to save a
workflow log file by timestamp, which permits an unlimited number of workflow logs, or by
run, which saves a specified number of logs. To view previous workflow logs, save log files by
timestamp.
If you choose not to create workflow logs, the PowerCenter Server writes the workflow log
messages to the server log or Windows Event Log, depending on how you configure the
PowerCenter Server. For more information on configuring the PowerCenter Server, see
Installing and Configuring the PowerCenter Server on Windows or Installing and
Configuring the PowerCenter Server on UNIX in the Installation and Configuration Guide.


Workflow Log Messages


The PowerCenter Server precedes each message in the log file with a code and number. It also
precedes some messages with a timestamp. The code defines a group of messages for a specific
process. The number defines a specific message. The message can provide general information
or it can be an error message.
You can configure the PowerCenter Server to append a time stamp to every message it writes
to the workflow log. To do this, enable the Time Stamp Workflow Log option in the
PowerCenter Server setup program. For more information, see Installing and Configuring
the PowerCenter Server on Windows or Installing and Configuring the PowerCenter Server
on UNIX in the Installation and Configuration Guide.

Workflow Log Codes


You can use the workflow log to determine the cause of workflow problems. To resolve
workflow problems, locate the relevant log file codes and text prefixes in the workflow log,
then see the Troubleshooting Guide for details. You can find workflow-related server messages
in the UNIX server log (default name: pmserver.log) or in the Windows Event Log (viewed
with the Event Viewer).
Table 16-2 describes the codes that can appear in workflow logs:
Table 16-2. Workflow Log Codes
CMN - Messages related to databases, memory allocation, Lookup and Joiner transformations, and internal errors.
LM - Messages related to the Load Manager.
REP - Messages related to repository functions.
TM - Messages related to Data Transformation Manager (DTM).
VAR - Messages related to mapping variables.

Workflow Log Sample


The following sample is a workflow log from a simple workflow that shows log file codes:
INFO : LM_36315 [Tue Nov 18 11:16:38 2003] : (270|305) Starting execution
of workflow [wf_PhoneList].
INFO : LM_36330 [Tue Nov 18 11:16:38 2003] : (270|305) Starting execution
of start instance [StartWorkflow].
INFO : LM_36333 [Tue Nov 18 11:16:38 2003] : (270|305) Execution of start
instance [StartWorkflow] succeeded.
INFO : LM_36505 : (270|305) Link [StartWorkflow --> s_PhoneList]: empty
expression string, evaluated to TRUE.
INFO : LM_36330 [Tue Nov 18 11:16:38 2003] : (270|305) Starting execution
of session instance [s_PhoneList].

458

Chapter 16: Log Files

INFO : LM_36522 : (270|305) Started DTM process [pid = 273] for session
instance [s_PhoneList].
INFO : CMN_1760 : (273|255) Message from session: LM_36033 [Connected to
repository [SALES] running on server:port [monster]:[5001] user
[Administrator]].
INFO : CMN_1760 : (273|255) Message from session: TM_6228 [Writing session
output to log file [d:\pcserver\SessLogs\s_PhoneList.log].].
INFO : LM_36333 [Tue Nov 18 11:16:43 2003] : (270|306) Execution of
session instance [s_PhoneList] succeeded.
INFO : LM_36318 [Tue Nov 18 11:16:43 2003] : (270|306) Execution of
workflow [wf_PhoneList] succeeded.

Configuring Workflow Logs


You can configure workflow log options in the workflow properties. You can configure the
following information for a workflow log:

Location. You can configure the directory where you want the workflow log created. By
default, the PowerCenter Server creates the workflow log in the directory configured for
the $PMWorkflowLogDir server variable. You can enter a different directory, but if the
directory does not exist or is not local to the PowerCenter Server that runs the workflow,
the workflow fails.

Name. If you wish to create a workflow log, you can enter a name for the workflow log
file. If you do not enter a filename, the PowerCenter Server does not create a workflow log.
Instead, the PowerCenter Server writes workflow log messages to the Windows Event Log
or UNIX server log.

Archive. You can configure the number of workflow logs you want the PowerCenter Server
to archive for each workflow. By default, the PowerCenter Server does not archive
workflow logs.

Archiving Workflow Logs


By default, the PowerCenter Server does not save multiple logs for a single workflow. It
creates one workflow log for each workflow and overwrites the existing log with the latest
workflow log.
If you wish to save multiple logs for a workflow, you can configure the PowerCenter Server to
do this. The PowerCenter Server can save workflow logs in two ways:

- Save a selected number of logs
- Save all logs by timestamp

If you configure the workflow to save a specific number of workflow logs, it names the most recent log filename.log. It then cycles through a closed naming sequence for historical logs as follows: filename.log.0, filename.log.1, filename.log.2, ..., filename.log.n-1, where n represents the number of workflow logs. Because the PowerCenter Server cycles through the numeric naming sequence, check the workflow log file timestamp to determine the chronological order of those files.
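For example, if you configure a workflow named wf_PhoneList to save three workflow logs, the PowerCenter Server keeps wf_PhoneList.log for the most recent run plus the historical logs wf_PhoneList.log.0, wf_PhoneList.log.1, and wf_PhoneList.log.2.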

Instead of entering a specific number of workflow logs to save, you can use the server variable
$PMWorkflowLogCount. When you use $PMWorkflowLogCount server variable, the
PowerCenter Server archives the number of workflow logs configured for the server variable.
If you use $PMWorkflowLogCount for all workflows, you can increase the number of
archived workflow logs for all workflows by changing the server variable.
Note: By default, $PMWorkflowLogCount is set to 0. To archive workflow logs using

$PMWorkflowLogCount, configure it for a larger number of workflow logs. For details on


configuring server variables, see Registering the PowerCenter Server on page 46.
You can also save all workflow logs by configuring a workflow to save logs by timestamp.
When timestamping workflow logs, the PowerCenter Server appends the year, month, day,
hour, and minute of the workflow completion to the log file. The resulting log file name is
filename.log.yyyymmddhhmi, where:

- yyyy = year
- mm = month, ranging from 1-12
- dd = day, ranging from 1-31
- hh = hour, ranging from 0-23
- mi = minute, ranging from 0-59
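For example, a workflow log named wf_PhoneList.log for a workflow run that completed at 11:30 a.m. on August 3, 2004 is saved as wf_PhoneList.log.200408031130.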

To prevent filling the workflow log directory, periodically delete or back up log files when using the timestamp option.
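The following sketch shows one way to script that cleanup. It is an illustration only; the log directory path and the 30-day retention period are assumptions for the example, and you may prefer to archive the files instead of deleting them.

import os
import time

workflow_log_dir = "/opt/pcserver/WorkflowLogs"   # assumed value of $PMWorkflowLogDir
retention_days = 30                               # assumed retention period

cutoff = time.time() - retention_days * 24 * 60 * 60
for name in os.listdir(workflow_log_dir):
    path = os.path.join(workflow_log_dir, name)
    # Only consider timestamped workflow logs, for example wf_PhoneList.log.200408031130.
    if ".log." in name and os.path.isfile(path) and os.path.getmtime(path) < cutoff:
        os.remove(path)   # or move the file to an archive location instead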
Note: You can also truncate workflow and session log entries from the repository. For more

information, see Using the Repository Manager in the Repository Guide.

Steps for Configuring Workflow Logs


You can configure workflow log information on the Properties tab of the workflow properties.
To configure workflow log information:
1. In the Workflow Manager, open the workflow properties.
2. Select the Properties tab.
3. Enter the following workflow log options:

   Parameter File Name - Designates the name and directory for the parameter file. Use the parameter file to define workflow parameters. For details on parameter files, see Parameter Files on page 511.

   Workflow Log File Name - Optionally enter a file name, or a file name and directory. If you leave this field blank, the PowerCenter Server does not create a workflow log. Instead, the PowerCenter Server writes workflow log messages to the server log or Windows Event Log, depending on how you configure the PowerCenter Server. If you fill in this field, the PowerCenter Server appends information in this field to that entered in the Workflow Log File Directory field. For example, if you enter C:\workflow_logs\ in the Workflow Log File Directory field and logname.txt in the Workflow Log File Name field, the PowerCenter Server writes logname.txt to the C:\workflow_logs\ directory.

   Workflow Log File Directory - Designates a location for the workflow log file. By default, the PowerCenter Server writes the log file in the server variable directory, $PMWorkflowLogDir. If you enter a full directory and file name in the Workflow Log File Name field, clear this field.

   Save Workflow Log By - If you select Save Workflow Log by Timestamp, the PowerCenter Server saves all workflow logs, appending a timestamp to each log. If you select Save Workflow Log by Runs, the PowerCenter Server saves a designated number of workflow logs. Configure the number of workflow logs in the Save Workflow Log for These Runs option. For details on these options, see Archiving Workflow Logs on page 459. You can also use the $PMWorkflowLogCount server variable to save the configured number of workflow logs for the PowerCenter Server.

   Save Workflow Log for These Runs - The number of historical workflow logs you want the PowerCenter Server to save. The PowerCenter Server saves the number of historical logs you specify, plus the most recent workflow log. Therefore, if you specify 5 runs, the PowerCenter Server saves the most recent workflow log, plus historical logs 0 to 4, for a total of 6 logs. You can specify up to 2,147,483,647 historical logs. If you specify 0 logs, the PowerCenter Server saves only the most recent workflow log.

4. Click OK to save the workflow.

Viewing Workflow Logs


Workflow logs are text files that you can open with any text editor. The PowerCenter Server
saves workflow logs in the directory you specify in the Workflow Log File Directory field in
the workflow properties.
You can also view workflow logs through the Workflow Monitor. When you do this, the
Workflow Manager creates a temporary file that stores the workflow log. You can view the
temporary file through the Workflow Monitor.
The PowerCenter Server generates the workflow log based on the PowerCenter Server code
page. You can specify the language in which you want to view the workflow log based on the
locale of the machine hosting the PowerCenter Server.
To use the Workflow Monitor to view the most recent workflow log:
1. In the Navigator window, connect to the server on which the workflow runs.
2. Open the folder that contains the workflow.
3. Right-click the workflow and choose Get Workflow Log.

If you save workflow logs by timestamp, you can also use the Workflow Monitor to view past
workflow logs. To do this, right-click the workflow in the Gantt chart view and choose Get
Workflow Log.
For more information about the Workflow Monitor, see Using the Workflow Monitor on
page 404.


Session Logs
The session log file contains information about all tasks the PowerCenter Server performs,
plus the load summary and transformation statistics. The amount of detail in the session log
depends on the tracing level that you set. You can define the tracing level for each
transformation or for the entire session. The session-level tracing overrides any
transformation-level tracing levels.
In general, the session log contains the following information about the session:

- Allocation of system shared memory
- Execution of pre-session commands
- Creation of SQL commands for reader and writer threads
- Start and end times for target loading
- Errors encountered during session
- Execution of post-session commands
- Load summary of reader, writer, and Data Transformation Manager (DTM) statistics

By default, the PowerCenter Server saves session logs in the directory for the PowerCenter
Server variable $PMSessionLogDir, which you define in the Workflow Manager. The default
name for the session log is s_mapping name.log. You can override the session log name and
location in the session properties.
The PowerCenter Server does not archive session logs by default. Instead, it creates one log for
each session and overwrites the existing log with the latest session log. However, you can
configure the session to archive session logs. For more information, see Archiving Session
Logs on page 471.
By default, the PowerCenter Server generates session log files based on the PowerCenter
Server code page. However, if you enable the Output Session Log in UTF-8 option on the
Configuration tab of the PowerCenter Server setup program, the PowerCenter Server writes
to the session log using the UTF-8 character set.
Note: By default, the PowerCenter Server writes row errors to the session log. However, if you

enable row error logging in the sessions properties, the PowerCenter Server does not write
dropped rows to the session log. When you enable row error logging, you can configure the
PowerCenter Server to write row errors to the session log in addition to the row error log by
enabling verbose data tracing.

Session Log Messages


The PowerCenter Server precedes each message in the log file with a thread identification and
then a code and number. The code defines a group of messages for a specific process. The
number defines a specific message. The message can provide general information or it can be
an error message.


You can configure the PowerCenter Server to write session log messages to an external library
as well as to the session log. To do this, you can set the Export Session Log Lib Name in the
PowerCenter Server setup program. For more information, see Installing and Configuring
the PowerCenter Server on Windows or Installing and Configuring the PowerCenter Server
on UNIX in the Installation and Configuration Guide.

Session Log Codes


You can use the session log to determine the cause of session problems. To resolve session
problems, locate the relevant log file codes and text prefixes in the session log, then see the
Troubleshooting Guide for details. You can find session-related server messages in the UNIX
server log (default name: pmserver.log) or in the Windows Event Log (viewed with the Event
Viewer).
Table 16-3 describes the codes that can appear in session logs:
Table 16-3. Session Log Codes

BLKR - Messages related to reader process, including Application, relational, or flat file.
CNX - Messages related to the Repository Agent connections.
CMN - Messages related to databases, memory allocation, Lookup and Joiner transformations, and internal errors.
DBG - Messages related to PowerCenter Server loading and debugging.
DBGR - Messages related to the Debugger.
EP - Messages related to external procedures.
ES - Messages related to the Repository Server.
FR - Messages related to file sources.
FTP - Messages related to File Transfer Protocol operations.
HIER - Messages related to reading XML sources.
LM - Messages related to the Load Manager.
NTSERV - Messages related to Windows server operations.
OBJM - Messages related to the Repository Agent.
ODL - Messages related to database functions.
PETL - Messages related to pipeline partitioning.
PMF - Messages related to caching Aggregator, Rank, Joiner, or Lookup transformations.
RAPP - Messages related to the Repository Agent.
REP - Messages related to repository functions.
RR - Messages related to relational sources.
SF - Messages related to server framework, used by Load Manager and Repository Server.
SORT - Messages related to the Sorter transformation.
TE - Messages related to transformations.
TM - Messages related to Data Transformation Manager (DTM).
TT - Messages related to transformations.
VAR - Messages related to mapping variables.
WRT - Messages related to the Writer.
XMLR - Messages related to the XML Reader.
XMLW - Messages related to the XML Writer.

Thread Identification
The thread identification consists of the thread type and a series of numbers separated by
underscores. The numbers following a thread name indicate the following information:

- Target load order group number
- Partition point number
- Partition number

Note: The PowerCenter Server writes an asterisk (*) as the partition point number for writer

threads.
The PowerCenter Server prints the thread identification before the log file code and the
message text in the session log. The following example illustrates a reader thread from target
load order group one, concurrent source set one, source pipeline one, and partition one:
READER_1_1_1> DBG_21438 Reader: Source is [p152636], user [jennie]

For more information on partitioning, see Pipeline Partitioning on page 345.
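As an illustration of that layout, the following sketch pulls the thread type and the trailing numbers out of a session log line. It is not part of PowerCenter; the regular expression simply reflects the format described above, including the asterisk that writer threads use for the partition point number.

import re

# Thread identifications look like READER_1_1_1> or WRITER_1_*_1>.
thread_pattern = re.compile(r"^([A-Z]+)((?:_(?:\d+|\*))+)> ")

sample_lines = [
    "READER_1_1_1> DBG_21438 Reader: Source is [p152636], user [jennie]",
    "WRITER_1_*_1> WRT_8167 Start loading table [Emp_target] at: Tue Aug 03 11:30:00 2004",
]

for line in sample_lines:
    match = thread_pattern.match(line)
    if match:
        thread_type = match.group(1)                    # READER, TRANSF, WRITER, ...
        numbers = match.group(2).strip("_").split("_")  # for example ['1', '1', '1'] or ['1', '*', '1']
        print(thread_type, numbers)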


When you configure the PowerCenter Server to read Joiner transformation sources
sequentially, the PowerCenter Server writes numbers with the following information after the
thread name:

- Target load order group number
- Concurrent source set number
- Partition point number
- Partition number

A concurrent source set is the group of sources in a target load order group the PowerCenter
Server reads concurrently. A target load order group might contain multiple concurrent
source sets if it contains a Joiner transformation and you configure the PowerCenter Server to
read Joiner transformation sources sequentially.


To configure the PowerCenter Server to read Joiner transformation sources sequentially, enable the PMServer 6.X Joiner source order compatibility option.

Session Log Sample


The following sample is an excerpt from a session log file that illustrates log file codes and
thread identifications:
TM_6703 Session [s_m_SampleSessionLog] is run by PowerCenter Server
[sarao].
MASTER> CMN_1688 Allocated [12000000] bytes from process memory for [DTM
Buffer Pool].
MASTER> PETL_24000 Parallel Pipeline Engine initializing.
MASTER> PETL_24001 Parallel Pipeline Engine running.
MASTER> PETL_24003 Initializing session run.
MAPPING> TM_6014 Initializing session [s_m_SampleSessionLog] at [Tue Aug
03 11:29:57 2004]
.
.
.
*****START LOAD SESSION*****

Load Start Time: Tue Aug 03 11:30:00 2004

Target tables:

Emp_target

READER_1_1_1> BLKR_16019 Read [1] rows, read [0] error rows for source
table [EMP_SRC] instance name [EMP_SRC]
READER_1_1_1> BLKR_16008 Reader run completed.
TRANSF_1_1_1> DBG_21216 Finished transformations for Source Qualifier
[SQ_EMP_SRC]. Total errors [0]
WRITER_1_*_1> WRT_8167 Start loading table [Emp_target] at: Tue Aug 03
11:30:00 2004
.
MASTER> PETL_24002 Parallel Pipeline Engine finished.
MASTER> PETL_24012 Session run completed successfully.


Some messages are embedded within other messages. For example, a code CMN_1039
contains informational messages from the Microsoft SQL Server as it changes to the source
database to be used in the session.
Note: If you configure the PowerCenter Server to run in ASCII mode, the session log file

reports the sort order as Binary, even if you select a different sort order in the session
properties.

Load Summary
The session log includes a load summary that reports the number of rows inserted, updated,
deleted, and rejected for each target as of the last commit point. The PowerCenter Server
reports the load summary for each session by default. However, you can set tracing level to
Verbose Initialization or Verbose Data to report the load summary for each transformation.
The following sample is an excerpt from a load summary:
*****START LOAD SESSION*****

Load Start Time: Tue Aug 03 11:30:00 2004

Target tables:

Emp_target
Commit on end-of-data

Aug 03 11:30:07 2004

===================================================

WRT_8036 Target: Emp_target (Instance Name: [Emp_target])


WRT_8038 Inserted rows - Requested: 1
Rejected: 0
Affected: 1

Applied: 1

WRITER_1_*_1> WRT_8035 Load complete time: Tue Aug 03 11:30:07 2004

LOAD SUMMARY
============


WRT_8036 Target: Emp_target (Instance Name: [Emp_target])


WRT_8038 Inserted rows - Requested: 1
Rejected: 0
Affected: 1

Applied: 1

.
.
,
WRITER_1_*_1> WRT_8043 *****END LOAD SESSION*****

The PowerCenter Server reports statistics for each of the following operations performed on
the target:

Inserted. Shows the number of rows the PowerCenter Server marked for insert into the
target. The number of affected rows cannot be larger than requested for this operation.

Updated. Shows the number of rows the PowerCenter Server marked for update in the
target. The number of affected rows can be different from the number of requested rows.
For example, you have a table with one column called SALES_ID and five rows containing
the values: 1, 2, 3, 2, and 2. You mark rows for update where SALES_ID is 2. The writer
affects three rows, even though there was only one update request. Or, if you mark rows for
update where SALES_ID is 4, the writer affects 0 rows.

Deleted. Shows the number of rows the PowerCenter Server marked to remove from the
target. The number of affected rows can be different from the number of requested rows.

Rejected. Shows the number of rows the PowerCenter Server rejected during the writing
process. These rows cannot be applied to the target. For the Rejected rows category, the
number of affected and applied rows is always zero since these rows are not written to the
target.

The load summary provides the following statistics:


Requested rows. Shows the number of rows the writer actually received for the specified
operation.

Applied rows. Shows the number of rows the writer successfully applied to the target (that
is, the target returned no errors).

Affected rows. Shows the number of rows affected by the specified operation. Depending
on the operation, the number of affected rows can be different from the number of
requested rows. For example, you have a table with one column called SALES_ID and five
rows containing the values: 1, 2, 3, 2, and 2. You mark rows for update where SALES_ID
is 2. The writer affects three rows, even though there was only one update request. Or, if
you mark rows for update where SALES_ID is 4, the writer affects 0 rows.

Rejected rows. Shows the number of rows the writer could not apply to the target. For
example, the target database rejects a row if the PowerCenter Server attempts to insert
NULL into a not-null field. The PowerCenter Server writes all rejected rows to the session
reject file, or to the row error log, depending on how you configure the session.

Chapter 16: Log Files

Mutated from update. Shows the number of rows originally flagged for update that are
instead inserted into the target when the session is configured Update Else Insert.

If the number of rows requested, applied, rejected, and affected are all zero for any of these
four operations, the operation does not appear as a line in the load summary. If no data is
passed to the target, the writer reports the following message:
No data loaded for this target.
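If you track these counts across many runs, you can pull the per-target statistics out of a session log with a small script. The following Python sketch is not part of PowerCenter: it assumes the WRT_8036 and WRT_8038 line layout shown in the excerpt above and a hypothetical log named s_m_example.log, so adjust the patterns if your logs differ.

import re

# Per-target statistics lines as shown in the load summary excerpt above.
TARGET_RE = re.compile(r"WRT_8036 Target:\s*(\S+)")
STATS_RE = re.compile(
    r"WRT_8038 (\w+) rows - Requested:\s*(\d+)\s+Applied:\s*(\d+)"
    r"\s+Rejected:\s*(\d+)\s+Affected:\s*(\d+)"
)

def load_summary_stats(session_log_path):
    """Return {target: {operation: (requested, applied, rejected, affected)}}."""
    stats, current_target = {}, None
    with open(session_log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            target_match = TARGET_RE.search(line)
            if target_match:
                current_target = target_match.group(1)
                stats.setdefault(current_target, {})
                continue
            stats_match = STATS_RE.search(line)
            if stats_match and current_target:
                operation, *counts = stats_match.groups()
                stats[current_target][operation] = tuple(int(c) for c in counts)
    return stats

# Example: print the counts for every target and operation in one log.
for target, operations in load_summary_stats("s_m_example.log").items():
    print(target, operations)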

Detailed Transformation Statistics


The DTM enables transformation statistics in the session log for two levels of tracing,
Verbose Initialization and Verbose Data. Transformation statistics appear after the load
summary in the log file.
The PowerCenter Server reports the following details for each transformation in the mapping:

The name of the transformation

The number of input rows and the name of the input source

The number of output rows and the name of the output transformation or target

The number of rows dropped

The following sample is an excerpt from the transformation statistics in a session log file:
DETAILED TRANSFORMATION ROW STATISTICS for DSQ [SQ_EMPLOYEES], Partition[1]
--------------------------------
MAPPING>
MAPPING> TT_11031 Transformation [SQ_EMPLOYEES]:
MAPPING> TT_11035 Input - 12 (__READER__)
MAPPING> TT_11037 [T_EMPLOYEES]: Output - 12, Dropped - 0
MAPPING>
.
.
.

Configuring Session Logs


Configure session log options in the session properties. You can configure the following
information for a session log:

Location. You can configure the directory where you want the session log created. By
default, the PowerCenter Server creates the session log in the directory configured for the
$PMSessionLogDir server variable. You can enter a different directory, but if the directory
does not exist or is not local to the PowerCenter Server that runs the session, the session
fails.

Name. You can name the session log or accept the default name. The default name for the
session log is s_mapping name.log.

Archive. You can configure the number of session logs you want the PowerCenter Server to
archive for each session. By default, the PowerCenter Server does not archive session logs.

Tracing levels. You can control the type of information the PowerCenter Server includes in
the session log by setting a tracing level for the session. By default, the PowerCenter Server
uses tracing levels configured in the mapping.

Configuring Session Log Locations and Filenames


You can configure the name and location of the session log on the Properties tab of the session
properties.
To configure session log information:

1. In the Workflow Manager, open the session properties.

2. Select the General Options settings on the Properties tab. The session log file name and directory fields appear in these settings.

3. Enter the following session log options:


Session Log File Name - By default, the PowerCenter Server uses the session name for the log file name: s_mapping name.log. For a debug session, it uses DebugSession_mapping name.log. Optionally, enter a file name, a file name and directory, or use the $PMSessionLogFile session parameter. The PowerCenter Server appends information in this field to that entered in the Session Log File Directory field. For example, if you have C:\session_logs\ in the Session Log File Directory field and enter logname.txt in the Session Log File field, the PowerCenter Server writes logname.txt to the C:\session_logs\ directory. You can also use the $PMSessionLogFile session parameter to represent the name of the session log or the name and location of the session log. For details on session parameters, see Session Parameters on page 495.

Session Log File Directory - Location of the log file. Enter a valid directory local to the PowerCenter Server. By default, the PowerCenter Server creates session logs in the directory configured for the $PMSessionLogDir server variable.

4. Click OK to save the session.

Archiving Session Logs


You can archive session logs on a session-by-session basis. The PowerCenter Server can save
session logs in the following ways:

Save a selected number of logs

Save all logs by timestamp

By default, the PowerCenter Server does not archive session logs. It creates one session log for
each session and overwrites the existing log with the latest session log.
If you configure the session to save a specific number of session logs, the PowerCenter Server names the most recent log s_mapping name.log. It then cycles through a closed naming sequence for historical logs as follows: s_mapping name.log.0, s_mapping name.log.1, s_mapping name.log.2, ..., s_mapping name.log.n-1, where n is the number of session logs. Because the PowerCenter Server cycles through the numeric naming sequence, check the session log file timestamps to determine the chronological order of those files.
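Because the numeric suffix cycles, the file timestamp rather than the suffix indicates which archived log is oldest. A quick way to list the archives in chronological order is a small script such as the following sketch; it is not an Informatica utility, and the session log name and directory are hypothetical.

import glob
import os

def archived_logs_in_order(log_dir, log_name="s_m_example.log"):
    """Return archived session logs sorted oldest to newest by file timestamp."""
    # The .0, .1, ... suffixes wrap around, so sort on modification time instead.
    archives = glob.glob(os.path.join(log_dir, log_name + ".*"))
    return sorted(archives, key=os.path.getmtime)

for path in archived_logs_in_order("/opt/informatica/SessLogs"):
    print(path)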
Instead of entering a specific number of session logs to save, you can use the server variable
$PMSessionLogCount. When you use $PMSessionLogCount server variable, the
PowerCenter Server archives the number of session logs configured for the server variable. If
you use $PMSessionLogCount for all sessions, you can increase the number of archived
session logs for all sessions by changing the server variable.
Note: By default, $PMSessionLogCount is set to 0. To archive session logs using $PMSessionLogCount, configure it for a larger number of session logs. For details on configuring server variables, see Registering the PowerCenter Server in the Installation and Configuration Guide.


You can also save all session logs by configuring a session to save logs by timestamp. When
timestamping session logs, the PowerCenter Server appends the year, month, day, hour, and minute
of the session completion to the log file name. The resulting log file name is s_mapping
name.log.yyyymmddhhmi, where:

yyyy = year

mm = month, ranging from 1-12

dd = day, ranging from 1-31

hh = hour, ranging from 0-23

mi = minute, ranging from 0-59

To prevent the session log directory from filling up, periodically delete or back up log files when using
the timestamp option.
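One way to keep the directory under control is a scheduled cleanup script. The sketch below is not part of PowerCenter: it assumes the .log.yyyymmddhhmi suffix described above and a hypothetical log directory, and it deletes matching files older than a cutoff, so test it carefully (or copy files to an archive location instead of deleting them).

import os
import re
from datetime import datetime, timedelta

# Matches the timestamp suffix described above: .log.yyyymmddhhmi
TIMESTAMPED_LOG = re.compile(r"\.log\.(\d{12})$")

def purge_timestamped_logs(log_dir, keep_days=30):
    """Delete timestamped session logs older than keep_days."""
    cutoff = datetime.now() - timedelta(days=keep_days)
    for name in os.listdir(log_dir):
        match = TIMESTAMPED_LOG.search(name)
        if not match:
            continue  # leaves the current log and numbered archives alone
        stamp = datetime.strptime(match.group(1), "%Y%m%d%H%M")
        if stamp < cutoff:
            os.remove(os.path.join(log_dir, name))

# Example: keep roughly one month of timestamped logs.
purge_timestamped_logs("/opt/informatica/SessLogs", keep_days=30)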
Note: You can also truncate workflow and session log entries from the repository. For more information, see Using the Repository Manager in the Repository Guide.


To specify archiving information:

1. In the Workflow Manager, open the session properties.

2. Select the Log Options settings on the Config Object tab.

3. Enter the following session log options:


Save Session Log By - If you select Save Session Log by Timestamp, the PowerCenter Server saves all session logs, appending a timestamp to each log. If you select Save Session Log by Runs, the PowerCenter Server saves a designated number of session logs. Configure the number of sessions in the Save Session Log for These Runs option. You can also use the $PMSessionLogCount server variable to save the configured number of session logs for the PowerCenter Server.

Save Session Log for These Runs - The number of historical session logs you want the PowerCenter Server to save. The PowerCenter Server saves the number of historical logs you specify, plus the most recent session log. Therefore, if you specify 5 runs, the PowerCenter Server saves the most recent session log, plus historical logs 0 to 4, for a total of 6 logs. You can specify up to 2,147,483,647 historical logs. If you specify 0 logs, the PowerCenter Server saves only the most recent session log.

4. Click OK to save the session.

Setting Tracing Levels


The amount of detail in the session log depends on the tracing level that you set. You can
define tracing levels for each transformation or for the entire session. By default, the
PowerCenter Server uses tracing levels configured in the mapping.
Setting a tracing level for the session overrides the tracing levels configured for each
transformation in the mapping. If you select a normal tracing level or higher, the
PowerCenter Server writes row errors into the session log, including the transformation in
which the error occurred and complete row data. If you configure the session for row error
logging, the PowerCenter Server writes row errors to the error log instead of the session log. If
you want the PowerCenter Server to write dropped rows to the session log as well, configure
the session with Verbose Data tracing level.
Table 16-4 describes the session log tracing levels:
Table 16-4. Session Log Tracing Levels

None - The PowerCenter Server uses the tracing level set in the mapping.

Terse - The PowerCenter Server logs initialization information as well as error messages and notification of rejected data.

Normal - The PowerCenter Server logs initialization and status information, errors encountered, and skipped rows due to transformation row errors. Summarizes session results, but not at the level of individual rows.

Verbose Initialization - In addition to normal tracing, the PowerCenter Server logs additional initialization details, names of index and data files used, and detailed transformation statistics.

Verbose Data - In addition to verbose initialization tracing, the PowerCenter Server logs each row that passes into the mapping. Also notes where the PowerCenter Server truncates string data to fit the precision of a column and provides detailed transformation statistics. When you configure the tracing level to verbose data, the PowerCenter Server writes row data for all rows in a block when it processes a transformation.

You can also enter tracing levels for individual transformations in the mapping. When you
enter a tracing level in the session properties, you override tracing levels configured for
transformations in the mapping.
To set the tracing level:

1. Select the Error Handling settings on the Config Object tab. The Override Tracing option appears in these settings.

2. Select a tracing level from the Override Tracing list. Table 16-4 on page 473 describes the session log tracing levels.

3. Click OK to save the session.

Viewing Session Logs


Session logs are text files that you can open with any text editor. The PowerCenter Server
saves session logs in the directory you specify in the Session Log File Directory field in the
session properties.


You can also view session logs through the Workflow Monitor. When you do this, the
Workflow Monitor creates a temporary file that stores the session log. You can view the
temporary file through the Workflow Monitor.
If a session fails, you can still view the session log file.
The PowerCenter Server generates the session log based on the PowerCenter Server code page.
You can specify the language in which you want to view the session log based on the locale of
the machine hosting the PowerCenter Server.
To use the Workflow Monitor to view the most recent session log:

1. In the Navigator window, connect to the server on which the workflow runs.

2. Open the folder that contains the workflow.

3. Open the workflow that contains the session whose log you wish to view.

4. Right-click the session and choose Get Session Log.

If you save session logs by timestamp, you can also use the Workflow Monitor to view past
session logs. To do this, right-click the session in the Gantt chart view and choose Get Session
Log.
For more information about the Workflow Monitor, see Using the Workflow Monitor on
page 404.


Reject Files
During a session, the PowerCenter Server creates a reject file for each target instance in the
mapping. If the writer or the target rejects data, the PowerCenter Server writes the rejected
row into the reject file. The reject file and session log contain information that helps you
determine the cause of the reject.
Each time you run a session, the PowerCenter Server appends rejected data to the reject file.
Depending on the source of the problem, you can correct the mapping and target database to
prevent rejects in subsequent sessions.
Note: If you enable row error logging in the session properties, the PowerCenter Server does not create a reject file. It writes the reject rows to the row error tables or file.

Locating Reject Files


The PowerCenter Server creates reject files for each target instance in the mapping. It creates
reject files in the session reject file directory, as configured on the Properties settings of the
Targets node on the Mapping tab (Transformation view). By default, the PowerCenter Server
creates reject files in the $PMBadFileDir server variable directory.
The PowerCenter Server names reject files after the target instance name. The default name
for reject files is target instance partition number.bad. You can view or edit reject file names in
the session properties. The Workflow Manager replaces slash characters in the target instance
name with underscore characters.
To find the location and name of the reject files, view the properties settings of the Targets
node on the Mapping tab (Transformation view).


Figure 16-1 shows the properties settings on the Mapping tab:

Figure 16-1. Properties Settings on the Mapping Tab (the callouts identify the reject file directory and filename fields)

When you run a session that contains multiple partitions, the PowerCenter Server creates a
separate reject file for each partition.

Reading Reject Files


After you locate a reject file, you can read it using a text editor that supports the reject file
code page. Reject files contain rows of data rejected by the writer or the target database.
Though the PowerCenter Server writes the entire row in the reject file, the problem generally
centers on one column within the row. To help you determine which column caused the row
to be rejected, the PowerCenter Server adds row and column indicators to give you more
information about each column:

Row indicator. The first column in each row of the reject file is the row indicator. The
numeric indicator tells whether the row was marked for insert, update, delete, or reject.
If the session is a user-defined commit session, the row indicator might tell whether the
transaction was rolled back due to a non-fatal error or if the committed transaction was in
a failed target connection group. For more information about user-defined commit
sessions and rejected rows, see User-Defined Commits on page 283.

Column indicator. Column indicators appear after every column of data. The alphabetical
character indicators tell whether the data was valid, overflow, null, or truncated.

The following sample reject file shows the row and column indicators:
0,D,1921,D,Nelson,D,William,D,415-541-5145,D
0,D,1922,D,Page,D,Ian,D,415-541-5145,D


0,D,1923,D,Osborne,D,Lyle,D,415-541-5145,D
0,D,1928,D,De Souza,D,Leo,D,415-541-5145,D
0,D,2001,D,S. MacDonald,D,Ira,D,415-541-5145,D

Row Indicators
The first column in the reject file is the row indicator. The number listed as the row indicator
tells the writer what to do with the row of data.
Table 16-5 describes the row indicators in a reject file:
Table 16-5. Row Indicators in Reject File

Row Indicator    Meaning               Rejected By
0                Insert                Writer or target
1                Update                Writer or target
2                Delete                Writer or target
3                Reject                Writer
4                Rolled-back insert    Writer
5                Rolled-back update    Writer
6                Rolled-back delete    Writer
7                Committed insert      Writer
8                Committed update      Writer
9                Committed delete      Writer

If a row indicator is 3, the writer rejected the row because an update strategy expression
marked it for reject.
If a row indicator is 0, 1, or 2, either the writer or the target database rejected the row. To
narrow down the reason why rows marked 0, 1, or 2 were rejected, review the column
indicators and consult the session log.

Column Indicators
After the row indicator is a column indicator, followed by the first column of data, and
another column indicator. Column indicators appear after every column of data and define
the type of the data preceding it.


Table 16-6 describes the column indicators in a reject file:


Table 16-6. Column Indicators in Reject File

D - Valid data. The writer treats it as good data and passes it to the target database. The target accepts it unless a database error occurs, such as finding a duplicate key.

O - Overflow. Numeric data exceeded the specified precision or scale for the column. The writer treats it as bad data if you configured the mapping target to reject overflow or truncated data.

N - Null. The column contains a null value. The writer treats it as good data and passes it to the target, which rejects it if the target database does not accept null values.

T - Truncated. String data exceeded a specified precision for the column, so the PowerCenter Server truncated it. The writer treats it as bad data if you configured the mapping target to reject overflow or truncated data.

Null columns appear in the reject file with commas marking their column. An example of a
null column surrounded by good data appears as follows:
5,D,,N,5,D

Because either the writer or target database can reject a row, and because they can reject the
row for a number of reasons, you need to evaluate the row carefully and consult the session
log to determine the cause for reject.
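If you inspect reject files often, the indicator layout described above is straightforward to split apart in a script. The following Python sketch is not an Informatica utility: it assumes the default comma delimiter, reads the second field as the column indicator that follows the row indicator (as described above), decodes only the indicator values listed in Tables 16-5 and 16-6, and does not handle commas embedded in the column data itself.

ROW_INDICATORS = {"0": "Insert", "1": "Update", "2": "Delete", "3": "Reject"}
COLUMN_INDICATORS = {"D": "Valid", "O": "Overflow", "N": "Null", "T": "Truncated"}

def parse_reject_row(line):
    """Split one reject file row into its row indicator and (value, indicator) pairs."""
    fields = line.rstrip("\n").split(",")
    row_indicator = ROW_INDICATORS.get(fields[0], fields[0])
    # fields[1] is the column indicator that follows the row indicator; after that,
    # column data and column indicators alternate.
    columns = [
        (fields[i], COLUMN_INDICATORS.get(fields[i + 1], fields[i + 1]))
        for i in range(2, len(fields) - 1, 2)
    ]
    return row_indicator, columns

# Example row from the sample reject file above.
print(parse_reject_row("0,D,1921,D,Nelson,D,William,D,415-541-5145,D"))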


Chapter 17

Row Error Logging


This chapter includes the following topics:

Overview, 482

Understanding the Error Log Tables, 483

Understanding the Error Log File, 489

Configuring Error Log Options, 493


Overview
When you configure a session, you can choose to log row errors in a central location. When a
row error occurs, the PowerCenter Server logs error information that allows you to determine
the cause and source of the error. The PowerCenter Server logs information such as source
name, row ID, current row data, transformation, timestamp, error code, error message,
repository name, folder name, session name, and mapping information.
You can log row errors into relational tables or flat files. When you enable error logging, the
PowerCenter Server creates the error tables or an error log file the first time it runs the session.
Error logs are cumulative. If the error logs exist, the PowerCenter Server appends error data to
the existing error logs.
You can choose to log source row data. Source row data includes row data, source row ID, and
source row type from the source qualifier where an error occurs. The PowerCenter Server
cannot identify the row in the source qualifier that contains an error if the error occurs after a
non pass-through partition point with more than one partition or one of the following active
sources:

Aggregator

Custom, configured as an active transformation

Joiner

Normalizer (pipeline)

Rank

Sorter

By default, the PowerCenter Server logs transformation errors in the session log and reject
rows in the reject file. When you enable error logging, the PowerCenter Server does not
generate a reject file or write dropped rows to the session log. Without a reject file, the
PowerCenter Server does not log Transaction Control transformation rollback or commit
errors. If you want to write rows to the session log in addition to the row error log, you can
enable verbose data tracing.
Note: When you log row errors, session performance may decrease because the PowerCenter
Server processes one row at a time instead of a block of rows at once.

Error Log Code Pages


The code page for the error log must match the code page for the session log. By default, the
error log code page matches the server code page, and you can set the server configuration
parameter to use UTF-8. The code page for the relational database where the error tables exist
needs to be one-way compatible with the server code page. For more information about code
pages, see Globalization Overview in the Installation and Configuration Guide.


Understanding the Error Log Tables


When you choose relational database error logging, the PowerCenter Server creates four error
tables the first time you run a session. You specify the database connection to the database
where the PowerCenter Server creates these tables. If the error tables exist for a session, the
PowerCenter Server appends row errors to these tables.
Relational database error logging allows you to collect row errors from multiple sessions in
one set of error tables. To do this, you specify the same error log table name prefix for all
sessions. You can issue select statements on the generated error tables to retrieve error data for
a particular session.
You can specify a prefix for the error tables. The error table names can have up to eleven
characters. Do not specify a prefix that exceeds 19 characters when naming Oracle, Sybase, or
Teradata error log tables, as these databases have a maximum length of 30 characters for table
names.
The PowerCenter Server creates the error tables without specifying primary and foreign keys.
However, you can specify key columns.
The PowerCenter Server generates the following tables to help you track row errors:

PMERR_DATA. Stores data and metadata about a transformation row error and its
corresponding source row.

PMERR_MSG. Stores metadata about an error and the error message.

PMERR_SESS. Stores metadata about the session.

PMERR_TRANS. Stores metadata about the source and transformation ports, such as
name and datatype, when a transformation error occurs.

PMERR_DATA
When the PowerCenter Server encounters a row error, it inserts an entry into the
PMERR_DATA table. This table stores data and metadata about a transformation row error
and its corresponding source row.
Table 17-1 describes the structure of the PMERR_DATA table:
Table 17-1. PMERR_DATA Table Schema

REPOSITORY_GID (Varchar) - A unique identifier for the repository.

WORKFLOW_RUN_ID (Integer) - A unique identifier for the workflow.

WORKLET_RUN_ID (Integer) - A unique identifier for the worklet. If a session is not part of a worklet, this value is 0.

SESS_INST_ID (Integer) - A unique identifier for the session.

TRANS_MAPPLET_INST (Varchar) - Name of the mapplet where an error occurred.

TRANS_NAME (Varchar) - Name of the transformation where an error occurred.

TRANS_GROUP (Varchar) - Name of the input group or output group where an error occurred. Defaults to either input or output if the transformation does not have a group.

TRANS_PART_INDEX (Integer) - Specifies the partition number of the transformation where an error occurred.

TRANS_ROW_ID (Integer) - Specifies the row ID generated by the last active source.

TRANS_ROW_DATA (Long Varchar) - Delimited string containing all column data, including the column indicator. Column indicators are: D - valid, O - overflow, N - null, T - truncated, B - binary, U - data unavailable. The fixed delimiter between column data and column indicator is a colon ( : ). The delimiter between the columns is a pipe ( | ). You can override the column delimiter in the error handling settings. The PowerCenter Server converts all column data to text string in the error table. For binary data, the PowerCenter Server uses only the column indicator. This value can span multiple rows. When the data exceeds 2000 bytes, the PowerCenter Server creates a new row. The line number for each row error entry is stored in the LINE_NO column.

SOURCE_ROW_ID (Integer) - Value that the source qualifier assigns to each row it reads. If the PowerCenter Server cannot identify the row, the value is -1.

SOURCE_ROW_TYPE (Integer) - The row indicator that tells whether the row was marked for insert, update, delete, or reject. 0 - Insert, 1 - Update, 2 - Delete, 3 - Reject.

SOURCE_ROW_DATA (Long Varchar) - Delimited string containing all column data, including the column indicator. Column indicators are: D - valid, O - overflow, N - null, T - truncated, B - binary, U - data unavailable. The fixed delimiter between column data and column indicator is a colon ( : ). The delimiter between the columns is a pipe ( | ). You can override the column delimiter in the error handling settings. The PowerCenter Server converts all column data to text string in the error table or error file. For binary data, the PowerCenter Server uses only the column indicator. This value can span multiple rows. When the data exceeds 2000 bytes, the PowerCenter Server creates a new row. The line number for each row error entry is stored in the LINE_NO column.

LINE_NO (Integer) - Specifies the line number for each row error entry in SOURCE_ROW_DATA and TRANS_ROW_DATA that spans multiple rows.

Informatica recommends using the key fields, such as REPOSITORY_GID, WORKFLOW_RUN_ID, WORKLET_RUN_ID, and SESS_INST_ID, to join tables.

PMERR_MSG
When the PowerCenter Server encounters a row error, it inserts an entry into the
PMERR_MSG table. This table stores metadata about the error and the error message.
Table 17-2 describes the structure of the PMERR_MSG table:
Table 17-2. PMERR_MSG Table Schema

REPOSITORY_GID (Varchar) - A unique identifier for the repository.

WORKFLOW_RUN_ID (Integer) - A unique identifier for the workflow.

WORKLET_RUN_ID (Integer) - A unique identifier for the worklet. If a session is not part of a worklet, this value is 0.

SESS_INST_ID (Integer) - A unique identifier for the session.

MAPPLET_INST_NAME (Varchar) - Mapplet to which the transformation belongs. If the transformation is not part of a mapplet, this value is N/A.

TRANS_NAME (Varchar) - Name of the transformation where an error occurred.

TRANS_GROUP (Varchar) - Name of the input group or output group where an error occurred. Defaults to either input or output if the transformation does not have a group.

TRANS_PART_INDEX (Integer) - Specifies the partition number of the transformation where an error occurred.

TRANS_ROW_ID (Integer) - Specifies the row ID generated by the last active source.

ERROR_SEQ_NUM (Integer) - Counter for the number of errors per row in each transformation group. If a session has multiple partitions, the PowerCenter Server maintains this counter for each partition. For example, if a transformation generates three errors in partition 1 and two errors in partition 2, ERROR_SEQ_NUM generates the values 1, 2, and 3 for partition 1, and values 1 and 2 for partition 2.

ERROR_TIMESTAMP (Date/Time) - Timestamp of the PowerCenter Server when the error occurred.

ERROR_UTC_TIME (Integer) - The Coordinated Universal Time, also known as Greenwich Mean Time, of when an error occurred.

ERROR_CODE (Integer) - The error code that the error generates.

ERROR_MSG (Long Varchar) - Error message, which can span multiple rows. When the data exceeds 2000 bytes, the PowerCenter Server creates a new row. The line number for each row error entry is stored in the LINE_NO column.

ERROR_TYPE (Integer) - The type of error that occurred. The PowerCenter Server uses the following values: 1 - Reader error, 2 - Writer error, 3 - Transformation error.

LINE_NO (Integer) - Specifies the line number for each row error entry in ERROR_MSG that spans multiple rows.

Informatica recommends using the key fields, such as REPOSITORY_GID, WORKFLOW_RUN_ID, WORKLET_RUN_ID, and SESS_INST_ID, to join tables.

PMERR_SESS
When you choose relational database error logging, the PowerCenter Server inserts entries
into the PMERR_SESS table. This table stores metadata about the session where an error
occurred.


Table 17-3 describes the structure of the PMERR_SESS table:


Table 17-3. PMERR_SESS Table Schema

REPOSITORY_GID (Varchar) - A unique identifier for the repository.

WORKFLOW_RUN_ID (Integer) - A unique identifier for the workflow.

WORKLET_RUN_ID (Integer) - A unique identifier for the worklet. If a session is not part of a worklet, this value is 0.

SESS_INST_ID (Integer) - A unique identifier for the session.

SESS_START_TIME (Date/Time) - Timestamp of the PowerCenter Server when a session starts.

SESS_START_UTC_TIME (Integer) - The Coordinated Universal Time, also known as Greenwich Mean Time, of when the session starts.

REPOSITORY_NAME (Varchar) - The repository name where sessions are stored.

FOLDER_NAME (Varchar) - Specifies the folder where the mapping and session are located.

WORKFLOW_NAME (Varchar) - Specifies the workflow that runs the session being logged.

TASK_INST_PATH (Varchar) - Fully qualified session name that can span multiple rows. The PowerCenter Server creates a new line for the session name. The PowerCenter Server also creates a new line for each worklet in the qualified session name. For example, you have a session named WL1.WL2.S1. Each component of the name (WL1, WL2, S1) appears on a new line. The PowerCenter Server writes the line number in the LINE_NO column.

MAPPING_NAME (Varchar) - Specifies the mapping that the session uses.

LINE_NO (Integer) - Specifies the line number for each row error entry in TASK_INST_PATH that spans multiple rows.

Informatica recommends using the key fields, such as REPOSITORY_GID, WORKFLOW_RUN_ID, WORKLET_RUN_ID, and SESS_INST_ID, to join tables.

PMERR_TRANS
When the PowerCenter Server encounters a transformation error, it inserts an entry into the
PMERR_TRANS table. This table stores metadata, such as the name and datatype of the
source and transformation ports.
Table 17-4 describes the structure of the PMERR_TRANS table:
Table 17-4. PMERR_TRANS Table Schema

REPOSITORY_GID (Varchar) - A unique identifier for the repository.

WORKFLOW_RUN_ID (Integer) - A unique identifier for the workflow.

WORKLET_RUN_ID (Integer) - A unique identifier for the worklet. If a session is not part of a worklet, this value is 0.

SESS_INST_ID (Integer) - A unique identifier for the session.

TRANS_MAPPLET_INST (Varchar) - Specifies the instance of a mapplet.

TRANS_NAME (Varchar) - Name of the transformation where an error occurred.

TRANS_GROUP (Varchar) - Name of the input group or output group where an error occurred. Defaults to either input or output if the transformation does not have a group.

TRANS_ATTR (Varchar) - Lists the port names and datatypes of the input or output group where the error occurred. Port name and datatype pairs are separated by commas, for example: portname1:datatype, portname2:datatype. This value can span multiple rows. When the data exceeds 2000 bytes, the PowerCenter Server creates a new row for the transformation attributes and writes the line number in the LINE_NO column.

SOURCE_MAPPLET_INST (Varchar) - Name of the mapplet in which the source resides.

SOURCE_NAME (Varchar) - Name of the source qualifier. N/A appears when a row error occurs downstream of an active source that is not a source qualifier or a non pass-through partition point with more than one partition. For a list of active sources that can affect row error logging, see Overview on page 482.

SOURCE_ATTR (Varchar) - Lists the connected field(s) in the source qualifier where an error occurred. When an error occurs in multiple fields, each field name is entered on a new line. Writes the line number in the LINE_NO column.

LINE_NO (Integer) - Specifies the line number for each row error entry in TRANS_ATTR and SOURCE_ATTR that spans multiple rows.

Informatica recommends using the key fields, such as REPOSITORY_GID, WORKFLOW_RUN_ID, WORKLET_RUN_ID, and SESS_INST_ID, to join tables.
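Because the error tables are ordinary relational tables, you can retrieve the errors for one run with a simple query. The following Python sketch is an illustration only, not something shipped with PowerCenter: it assumes the shared key columns (REPOSITORY_GID, WORKFLOW_RUN_ID, WORKLET_RUN_ID, and SESS_INST_ID) are the join fields, that no error log table name prefix was configured, and that connection is any Python DB-API connection to the database that holds the error tables.

# Errors and row data for one workflow run. Add your error log table name
# prefix to the table names if you configured one. The '?' placeholder style
# varies by database driver; some drivers expect '%s' instead.
ERRORS_FOR_RUN = """
SELECT s.WORKFLOW_NAME, m.TRANS_NAME, m.ERROR_TIMESTAMP, m.ERROR_CODE,
       m.ERROR_MSG, d.TRANS_ROW_DATA
  FROM PMERR_MSG m
  JOIN PMERR_SESS s
    ON s.REPOSITORY_GID = m.REPOSITORY_GID
   AND s.WORKFLOW_RUN_ID = m.WORKFLOW_RUN_ID
   AND s.WORKLET_RUN_ID = m.WORKLET_RUN_ID
   AND s.SESS_INST_ID = m.SESS_INST_ID
  JOIN PMERR_DATA d
    ON d.REPOSITORY_GID = m.REPOSITORY_GID
   AND d.WORKFLOW_RUN_ID = m.WORKFLOW_RUN_ID
   AND d.WORKLET_RUN_ID = m.WORKLET_RUN_ID
   AND d.SESS_INST_ID = m.SESS_INST_ID
   AND d.TRANS_NAME = m.TRANS_NAME
   AND d.TRANS_ROW_ID = m.TRANS_ROW_ID
 WHERE m.WORKFLOW_RUN_ID = ?
"""

def errors_for_run(connection, workflow_run_id):
    """Fetch the row errors logged for one workflow run over a DB-API connection."""
    cursor = connection.cursor()
    cursor.execute(ERRORS_FOR_RUN, (workflow_run_id,))
    return cursor.fetchall()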


Understanding the Error Log File


You can create an error log file to collect all errors that occur in a session. This error log file is
a column delimited line sequential file. By specifying a unique error log file name, you can
create a separate log file for each session in a workflow. When you want to analyze the row
errors for only one session, use an error log file.
In an error log file, double pipes ( || ) delimit the error logging columns. By default, a pipe ( | )
delimits row data. You can change this row data delimiter by setting the Data Column
Delimiter error log option.
The code page for the error file is the same as the code page for the session log file. If the
session log uses a UTF-8 code page, the error file also uses a UTF-8 code page. For more
information about code pages, see Globalization Overview in the Installation and
Configuration Guide.
Error log files have the following structure:
[Session Header]
[Column Header]
[Column Data]

Session header. Contains session run information. Information in the session header is like
the information stored in the PMERR_SESS table.

Column header. Contains data column names.

Column data. Contains actual row data and error message information.

The following sample error log file contains a session header, column header, and column
data:
**********************************************************************
Repository GID: fe4817ab-7d87-465f-9110-354222424df0
Repository: CustomerInfo
Folder: Row_Error_Logging
Workflow: wf_basic_REL_errors_AGG_case
Session: s_m_basic_REL_errors_AGG_case
Mapping: m_basic_REL_errors_AGG_case
Workflow Run ID: 1310
Worklet Run ID: 0
Session Instance ID: 19
Session Start Time: 08/03/2004 16:57:01
Session Start Time (UTC): 1067126221
**********************************************************************


Transformation||Transformation Mapplet Name||Transformation


Group||Partition Index||Transformation Row ID||Error Sequence||Error
Timestamp||Error UTC Time||Error Code||Error Message||Error
Type||Transformation Data||Source Mapplet Name||Source Name||Source Row
ID||Source Row Type||Source Data
agg_REL_basic||N/A||Input||1||1||1||08/03/2004
16:57:03||1067126223||11019||Port [CUST_ID_NULL]: Default value is:
ERROR(<<Expression Error>> [ERROR]: [AGG] CUST_ID - NULL detected on
input.\n... nl:ERROR(s:'[AGG] CUST_ID - NULL detected on
input.')).||3||D:1221|N:|N:|N:|D:Kauai Dive Shoppe|D:4-976 Sugarloaf
Hwy|D:Kapaa Kauai|D:HI|D:94766|D:[AGG] DEFAULT SID VALUE.|D:01/01/2001
00:00:00||mplt_add_NULLs_to_QACUST3||SQ_QACUST3||1||0||D:1221|D:Kauai
Dive Shoppe|D:4-976 Sugarloaf Hwy|D:Kapaa Kauai|D:HI|D:94766
agg_REL_basic||N/A||Input||1||4||1||08/03/2004
16:57:03||1067126223||11019||Port [CITY_IN]: Default value is:
ERROR(<<Expression Error>> [ERROR]: [AGG] Null detected for City_IN.\n...
nl:ERROR(s:'[AGG] Null detected for
City_IN.')).||3||D:1354|N:|N:|D:1354|T:Cayman Divers World|D:PO Box
541|N:|D:Gr|N:|D:[AGG] DEFAULT SID VALUE.|D:01/01/2001
00:00:00||mplt_add_NULLs_to_QACUST3||SQ_QACUST3||4||0||D:1354|D:Cayman
Divers World Unlim|D:PO Box 541|N:|D:Gr|N:
agg_REL_basic||N/A||Input||1||5||1||08/03/2004
16:57:03||1067126223||11131||Transformation [agg_REL_basic] had an error
evaluating variable column [Var_Divide_by_Price]. Error message is
[<<Expression Error>> [/]: divisor is zero\n... f:(f:2 / f:(f:1 f:TO_FLOAT(i:1)))].||3||D:1356|N:|N:|D:1356|T:Tom Sawyer Diving C|T:632-1
Third Frydenh|D:Christiansted|D:St|D:00820|D:[AGG] DEFAULT SID
VALUE.|D:01/01/2001
00:00:00||mplt_add_NULLs_to_QACUST3||SQ_QACUST3||5||0||D:1356|D:Tom
Sawyer Diving Centre|D:632-1 Third Frydenho|D:Christiansted|D:St|D:00820
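Because the error log file is delimited text, the layout shown above can be parsed with a few lines of code. The following Python sketch is not an Informatica tool and is deliberately simplified: it assumes the default || and | delimiters, assumes the column header and each error record occupy a single physical line in the file (newlines inside error messages appear as literal \n characters, as in the sample), and does not handle delimiters embedded in the data.

def parse_error_log(path):
    """Yield one dict per error record from a flat file error log (simplified)."""
    with open(path, encoding="utf-8", errors="replace") as log:
        lines = [line.rstrip("\n") for line in log if line.strip()]

    # The session header is enclosed between two lines of asterisks; skip past it.
    stars = [i for i, line in enumerate(lines) if line.startswith("****")]
    body = lines[stars[-1] + 1:] if stars else lines

    headers = body[0].split("||")   # the column header line
    for record in body[1:]:
        fields = record.split("||")
        if len(fields) != len(headers):
            continue                # wrapped or partial record; skip it here
        row = dict(zip(headers, fields))
        # Row data columns hold pipe-delimited "indicator:value" pairs.
        for key in ("Transformation Data", "Source Data"):
            if key in row:
                row[key] = [column.split(":", 1) for column in row[key].split("|")]
        yield row

# Example: list the transformation and error code for each logged error.
for error in parse_error_log("PMError.log"):
    print(error["Transformation"], error["Error Code"])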

Table 17-5 describes the columns in an error log file:


Table 17-5. Error Log File Column Headers

Transformation - The name of the transformation used by a mapping where an error occurred.

Transformation Mapplet Name - Name of the mapplet that contains the transformation. N/A appears when this information is not available.

Transformation Group - Name of the input or output group where an error occurred. Defaults to either input or output if the transformation does not have a group.

Partition Index - Specifies the partition number of the transformation partition where an error occurred.

Transformation Row ID - Specifies the row ID for the error row.

Error Sequence - Counter for the number of errors per row in each transformation group. If a session has multiple partitions, the PowerCenter Server maintains this counter for each partition. For example, if a transformation generates three errors in partition 1 and two errors in partition 2, ERROR_SEQ_NUM generates the values 1, 2, and 3 for partition 1, and values 1 and 2 for partition 2.

Error Timestamp - Timestamp of the PowerCenter Server when the error occurred.

Error UTC Time - The Coordinated Universal Time, also known as Greenwich Mean Time, when the error occurred.

Error Code - The error code that corresponds to the error message.

Error Message - Error message.

Error Type - The type of error that occurred. The PowerCenter Server uses the following values: 1 - Reader error, 2 - Writer error, 3 - Transformation error.

Transformation Data - Delimited string containing all column data, including the column indicator. Column indicators are: D - valid, O - overflow, N - null, T - truncated, B - binary, U - data unavailable. The fixed delimiter between column data and column indicator is a colon ( : ). The delimiter between the columns is a pipe ( | ). You can override the column delimiter in the error handling settings. The PowerCenter Server converts all column data to text string in the error file. For binary data, the PowerCenter Server uses only the column indicator.

Source Name - Name of the source qualifier. N/A appears when a row error occurs downstream of an active source that is not a source qualifier or a non pass-through partition point with more than one partition. For a list of active sources that can affect row error logging, see Overview on page 482.

Source Row ID - Value that the source qualifier assigns to each row it reads. If the PowerCenter Server cannot identify the row, the value is -1.

Source Row Type - The row indicator that tells whether the row was marked for insert, update, delete, or reject. 0 - Insert, 1 - Update, 2 - Delete, 3 - Reject.

Source Data - Delimited string containing all column data, including the column indicator. Column indicators are: D - valid, O - overflow, N - null, T - truncated, B - binary, U - data unavailable. The fixed delimiter between column data and column indicator is a colon ( : ). The delimiter between the columns is a pipe ( | ). You can override the column delimiter in the error handling settings. The PowerCenter Server converts all column data to text string in the error table or error file. For binary data, the PowerCenter Server uses only the column indicator.


Configuring Error Log Options


You configure error logging for each session in a workflow. You can find the error handling options on the Config Object tab of the session properties.
Tip: You can use the Workflow Manager to create a reusable set of attributes for the Config Object tab. For more information on creating a session configuration object, see Creating a Session Configuration Object on page 183.
To configure error logging options:

1. Double-click the Session task to open the session properties.

2. Select the Config Object tab.

3. Choose error handling options. The error log options described below appear in these settings.


Table 17-6 describes the error logging settings of the Config Object tab:
Table 17-6. Error Log Options

Error Log Type (Required) - Specifies the type of error log to create. You can specify relational database, flat file, or no log. By default, the PowerCenter Server does not create an error log.

Error Log DB Connection (Required/Optional) - Specifies the database connection for a relational log. This option is required when you enable relational database logging.

Error Log Table Name Prefix (Optional) - Specifies the table name prefix for relational logs. The PowerCenter Server appends 11 characters to the prefix name. Oracle and Sybase have a 30 character limit for table names. If a table name exceeds 30 characters, the session fails.

Error Log File Directory (Required/Optional) - Specifies the directory where errors are logged. By default, the error log file directory is $PMBadFilesDir\. This option is required when you enable flat file logging.

Error Log File Name (Required/Optional) - Specifies the error log file name. The character limit for the error log file name is 255. By default, the error log file name is PMError.log. This option is required when you enable flat file logging.

Log Row Data (Optional) - Specifies whether or not to log transformation row data. By default, the PowerCenter Server logs transformation row data. If you disable this property, N/A or -1 appears in transformation row data fields.

Log Source Row Data (Optional) - Specifies whether or not to log source row data. If you choose not to log source row data, or if source row data is unavailable, the PowerCenter Server writes an indicator such as N/A or -1, depending on the column datatype. If you do not need to capture source row data, consider disabling this option to increase PowerCenter Server performance.

Data Column Delimiter (Required) - Delimiter for string type source row data and transformation group row data. By default, the PowerCenter Server uses a pipe ( | ) delimiter. Verify that you do not use the same delimiter for the row data as the error logging columns. If you use the same delimiter, you may find it difficult to read the error log file.

4. Click OK.

Chapter 18

Session Parameters
This chapter contains information on the following topics:

Overview, 496

Session Log Parameter, 497

Database Connection Parameters, 499

Source File Parameters, 502

Target File Parameters, 504

Lookup File Parameters, 506

Reject File Parameters, 508

Tips, 510


Overview
Session parameters, like mapping parameters, represent values you might want to change
between sessions, such as a database connection or source file. Use session parameters in the
session properties, and then define the parameters in a parameter file. You can specify the
parameter file for the session to use in the session properties. You can also specify it when you
use pmcmd to start the session.
The Workflow Manager provides one built-in session parameter, $PMSessionLogFile. With
$PMSessionLogFile, you can change the name of the session log generated for the session.
The Workflow Manager also allows you to create user-defined session parameters.
Table 18-1 describes required naming conventions for the session parameters you can define:
Table 18-1. Naming Conventions for User-Defined Session Parameters

Parameter Type         Naming Convention
Database Connection    $DBConnectionName
Source File            $InputFileName
Target File            $OutputFileName
Lookup File            $LookupFileName
Reject File            $BadFileName

Use session parameters to make sessions more flexible. For example, you have the same type of
transactional data written to two different databases, and you use the database connections
TransDB1 and TransDB2 to connect to the databases. You want to use the same mapping for
both tables. Instead of creating two sessions for the same mapping, you can create a database
connection parameter, $DBConnectionSource, and use it as the source database connection
for the session. When you create a parameter file for the session, you set
$DBConnectionSource to TransDB1 and run the session. After the session completes, you set
$DBConnectionSource to TransDB2 and run the session again.
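Switching the connection between runs then amounts to rewriting one entry in the parameter file before each run, by hand or with a small script. The sketch below is illustrative only: the folder, session, and path names are made up, and the bracketed heading and name=value entry follow the parameter file syntax covered in Parameter Files on page 511, so verify the exact format there before relying on it.

# Hypothetical folder, session, and file names; check the Parameter Files
# chapter for the exact heading syntax your configuration expects.
PARAM_FILE_TEMPLATE = """[TransFolder.s_m_load_trans]
$DBConnectionSource={connection}
"""

def write_param_file(path, connection):
    """Rewrite the parameter file so the next session run uses the given connection."""
    with open(path, "w", encoding="utf-8") as param_file:
        param_file.write(PARAM_FILE_TEMPLATE.format(connection=connection))

# First run reads from TransDB1; after it completes, point the next run at TransDB2.
write_param_file("/opt/informatica/params/trans_load.txt", "TransDB1")
# ... run the session, then:
write_param_file("/opt/informatica/params/trans_load.txt", "TransDB2")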
You might use several session parameters together to make session management easier. For
example, you might use source file and database connection parameters to configure a session
to read data from different source files and write the results to different target databases. You
can then use reject file parameters to write the session reject files to the target machine. You
can use the session log parameter, $PMSessionLogFile, to write to different session logs in the
target machine, as well.
When you use session parameters, you must define the parameters in the parameter file.
Session parameters do not have default values. When the PowerCenter Server cannot find a
value for a session parameter, it fails to initialize the session.


Session Log Parameter


The Workflow Manager provides a built-in session parameter named $PMSessionLogFile. Use
$PMSessionLogFile in the session properties to change the name or location of the session log
between runs. When you use $PMSessionLogFile in the session properties, define the
parameter in the parameter file.

Changing the Session Log Name


You can use $PMSessionLogFile to change the session log name between sessions. In the
General Options settings of the Properties tab, enter $PMSessionLogFile in the Session Log
Filename field. Then define $PMSessionLogFile in the parameter file. When the PowerCenter
Server runs the session, it creates a session log in the directory listed in the Session Log File
Directory field and names the session log as instructed by the parameter file. If a session log
with the same name already exists, the PowerCenter Server overwrites the existing file.
Figure 18-1 illustrates how to use the session log parameter with a directory:

Figure 18-1. Using $PMSessionLogFile as the Name of the Session Log (the callouts identify the session log directory, the session log parameter, and the parameter filename fields)

For example, in a session, you leave Session Log File Directory set to its default value, the $PMSessionLogDir server variable. For Session Log File Name, you enter the session parameter $PMSessionLogFile. In the parameter file, you set $PMSessionLogFile to TestRun.txt. When you registered the PowerCenter Server, you defined $PMSessionLogDir as C:/Program Files/Informatica/PowerCenter Server/SessLogs. When the PowerCenter Server runs the session, it creates a session log named TestRun.txt in the C:/Program Files/Informatica/PowerCenter Server/SessLogs directory.

Changing the Session Log Name and Location


You can also use $PMSessionLogFile to change both the directory and the session log name
between sessions. If you do this, you also need to clear the Session Log File Directory field.
The PowerCenter Server concatenates both fields to determine where and how to name the
session log.
For example, you have one session writing target files to different systems. You want each
session log written to the target machine so the local administrator can review the file. In the
session, you configure a target file session parameter $PMOutputFile1. You then use
$PMSessionLogFile to define the session log file name and clear the Session Log File
Directory. In the parameter file, you configure both the target file and session log file
parameter to write to the same machine. Set $PMOutputFile1 to E:/target files/
Marketing.out, and $PMSessionLogFile to E:/session logs/Marketing.txt. After you run the
session, you can edit the parameter file to change the directory and file names for both the
target file and session log parameters.
Alternatively, you can create a different parameter file for each target. You can then use
pmcmd to specify which parameter file to use when you start the session.

Steps for Using $PMSessionLogFile


Use $PMSessionLogFile when you want to change the name and/or location of a session log
between session runs.
To use the session log parameter:

1. In the session properties, click the General Options settings of the Properties tab.

2. Enter $PMSessionLogFile in the Session Log File field.

3. If you want $PMSessionLogFile to represent both the session log name and directory, clear the Session Log File Directory field.

4. Enter a parameter file and directory in the Parameter File Name field.

5. Click OK.

Before you run the session, create the parameter file in the specified directory and define
$PMSessionLogFile. For details, see Parameter Files on page 511.


Database Connection Parameters


You can create user-defined database connection session parameters to reuse sessions for
different relational sources, targets, or lookups. You can create a database connection
parameter in the session properties of any session that uses a relational source, target, or
lookup. Name all database connection session parameters with the prefix $DBConnection,
followed by any alphanumeric and underscore characters. When you define the parameter in
the parameter file, you can reference any database connection in the repository.
For example, you have a session you want to use with two relational sources. You access the
first source with a database connection named Marketing and the second with a connection
named Sales. In the session, you create a source database connection parameter named
$DBConnection_Source. In the parameter file, you define $DBConnection_Source as
Marketing and run the session. After the session completes, you set $DBConnection_Source
to Sales in the parameter file, and then run the session.
Alternatively, you can create two different parameter files, one for each source database
connection. You can then use pmcmd to specify which parameter file to use when you start the
session.
If you want to use the same database connection for more than one connection, such as the source and target, you can enter the same $DBConnection parameter for both the source and target database connections. In the parameter file, enter one value for the $DBConnection parameter. The PowerCenter Server then uses the same database connection when accessing the source and target. Similarly, heterogeneous sources may also use the same $DBConnection parameter.
To configure a database connection parameter:

1. In the session properties, click the Mapping tab (Transformation view) and click the Connections settings for the sources or targets node.

2. Click the Open button in the Value field.

3. In the Relational Connection Browser, select Use Connection Variable.

4. Enter a name for the database connection parameter. Name the connection parameter $DBConnectionName.

5. In the General Options settings of the Properties tab, enter a parameter file and directory in the Parameter Filename field. The directory must be local to the PowerCenter Server.

6. Click OK.

Before you run the session, create the parameter file in the specified directory and define the
database connection parameter. For details, see Parameter Files on page 511.


Source File Parameters


You can create user-defined source file session parameters. Use a source file parameter when
you want to change the name or location of a session source file between session runs. Name
all source file session parameters with the prefix $InputFile, followed by any alphanumeric
and underscore characters. All source file session parameters within a session must have
distinct names. You can create a source file parameter in any session that reads from file
sources. When you define the parameter in the parameter file, you can reference any source
file local to the PowerCenter Server.
You can use a user-defined source file session parameter in either the Source File Directory or
Source Filename session property.

Changing the Source File


You can use a source file parameter to change the name of the source file a session uses. In the
Properties settings of the Mapping tab, enter the source file parameter in the Source Filename
field. Then define the parameter in a parameter file. When the PowerCenter Server runs the
session, it connects to the directory listed in the Source File Directory field and reads the
source file listed in the parameter file.
Figure 18-2 shows how to use a source file parameter with a source directory:

Figure 18-2. Using Parameters to Change the Session Source File (the callouts identify the source file directory field and the source filename set in the parameter file)

For example, in a session, you leave Source File Directory set to its default, the $PMSourceFileDir server variable. For the source file name, you create a session parameter named $Inputfile_products. In the parameter file, you set $Inputfile_products to products.txt. When you registered the PowerCenter Server, you set $PMSourceFileDir to C:/Program Files/Informatica/PowerCenter Server/SrcFiles. When the PowerCenter Server runs the session, it reads the products.txt file in the C:/Program Files/Informatica/PowerCenter Server/SrcFiles directory.

Changing the Source File and Directory


You can use a source file parameter to change both the source file and directory used by a
session. When you specify both the source file and directory in the Source Filename field, you
need to clear the Source File Directory field. The PowerCenter Server concatenates both fields
to determine where to find the indicated source file.

Steps for Using a Source File Parameter


Use a source file parameter when you want to change the source file and/or location between
session runs.
To use a source file parameter:

1. Select a source under the Sources node on the Mapping tab.

2. Go to the Properties settings.

3. In the Source Filename field, enter the source file parameter name. Name all source file parameters $InputFileName.

4. If you want the parameter to represent both the source file name and location, clear the Source Directory field.

5. In the General Options settings of the Properties tab, enter a parameter file and directory in the Parameter Filename field.

6. Click OK.

Before you run the session, create the parameter file in the specified directory and define the
source file parameter. For details, see Parameter Files on page 511.


Target File Parameters


You can create user-defined target file session parameters. Use a target file parameter when
you want to change the name or location of a session target file between session runs. Name
all target file session parameters with the prefix $OutputFile, followed by any alphanumeric
and underscore characters. All target file session parameters within a session need to have
distinct names. You can create a target file parameter in any session that writes to file targets.
When you define the parameter in a parameter file, you can write the target file to any
directory local to the PowerCenter Server.
You can use a user-defined target file session parameter in either the Output File Directory or
Output Filename session property.

Changing the Target File


You can use a target file parameter to change the name of the target file the PowerCenter
Server creates when it runs a session. In the Properties settings of the Mapping tab, enter the
target file parameter in the Output File Name field. Then define the parameter in a parameter
file. When the PowerCenter Server runs the session, it connects to the directory listed in the
Output File Directory field and creates the target file listed in the parameter file. If the target
file exists, the PowerCenter Server overwrites the existing target file.
Figure 18-3 shows how to use a target file parameter with a target file directory:

Figure 18-3. Using Parameters to Change the Session Target File (the callouts identify the target file directory field and the target file name set in the parameter file)

For example, you want to name the target file based on the month in which the session runs. In the session, you leave the target directory set to its default, the $PMTargetFileDir server variable. For the target file name, you create a session parameter named $OutputFileName. In the parameter file, you set $OutputFileName to Nov2000.out. When you registered the PowerCenter Server, you set $PMTargetFileDir to C:/Program Files/Informatica/PowerCenter Server/TgtFiles. When the PowerCenter Server runs the session, it creates Nov2000.out in the C:/Program Files/Informatica/PowerCenter Server/TgtFiles directory.

Changing the Target File and Directory


You can use a target file parameter to change both the target file and directory used by a
session. When you specify both the target file and directory in the Output Filename field, you
need to clear the Output File Directory field. The PowerCenter Server concatenates both
fields to determine where to create the target file.
For example, a session uses a source file parameter to read both internal and external weblogs
on different session runs. You want to write the results of the internal weblog session to one
system and the external weblog session to another. In the session, you name the target file
$OutputFileName and clear the Output File Directory field. In the parameter file, you set
$OutputFileName to E:/internal_weblogs/November_int.txt to create a target file for the
internal weblog session. After the session completes, you change $OutputFileName to F:/
external_weblogs/November_ex.txt for the external weblog session.
Alternatively, you can create a different parameter file for each target. You can then use
pmcmd to specify which parameter file to use when you start the session.
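For example, the parameter file entry for the internal weblog run might look like the following (the folder and session names are illustrative). Before the external weblog run, you would edit the value, or switch to a second parameter file, as described above:

[Weblogs.s_weblog_results]
$OutputFileName=E:/internal_weblogs/November_int.txt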

Steps for Using a Target File Parameter

Use a target file parameter when you want to change the name and/or location of a target file between session runs.
To use a target file parameter:
1. Select a target under the Targets node on the Mapping tab.
2. Go to the Properties settings.
3. In the Output Filename field, enter the target file parameter name.
   Name all target file parameters $OutputFileName.
4. If you want the parameter to represent both the target file name and location, clear the Output File Directory field.
5. In the General Options settings of the Properties tab, enter a parameter file and directory in the Parameter Filename field.
6. Click OK.

Before you run the session, create the parameter file in the specified directory and define the target file parameter you created. For details, see "Parameter Files" on page 511.


Lookup File Parameters


You can create user-defined lookup file session parameters. Use a lookup file parameter when
you want to change the name or location of a session lookup file between session runs. Name
all lookup file session parameters with the prefix $LookupFile, followed by any alphanumeric
and underscore characters. All lookup file session parameters within a session must have
distinct names. You can create a lookup file parameter in any session that performs lookups
on flat files. When you define the parameter in the parameter file, you can reference any
lookup file local to the PowerCenter Server.
You can use a user-defined lookup file session parameter in either the Lookup Source File
Directory or Lookup Source Filename session property.

Changing the Lookup File


You can use a lookup file parameter to change the name of the lookup file a session uses. In
the Properties settings of the Mapping tab, enter the lookup file parameter in the Lookup
Filename field. Then define the parameter in a parameter file. When the PowerCenter Server
runs the session, it connects to the directory listed in the Lookup File Directory field and
reads the lookup source file listed in the parameter file.
Figure 18-4 shows how to use a lookup file parameter with a lookup directory:
Figure 18-4. Using Parameters to Change the Session Lookup File


For example, in a session, you leave Lookup File Directory set to its default, the
$PMLookupFileDir server variable. For the lookup file name, you create a session parameter
named $LookupFile_orders. In the parameter file, you set $LookupFile_orders to orders.txt.
When you registered the PowerCenter Server, you set $PMLookupFileDir to C:/Program Files/Informatica/PowerCenter Server/LkpFiles. When the PowerCenter Server runs the session, it reads the orders.txt file in the C:/Program Files/Informatica/PowerCenter Server/LkpFiles directory.
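The corresponding parameter file entry for this example might look like the following (the folder and session names are illustrative):

[Production.s_OrderLookups]
$LookupFile_orders=orders.txt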

Changing the Lookup File and Directory


You can use a lookup file parameter to change both the lookup file and directory used by a
session. When you specify both the lookup file and directory in the Lookup Source Filename
field, you need to clear the Lookup Source File Directory field. The PowerCenter Server
concatenates both fields to determine where to find the indicated lookup file.

Steps for Using a Lookup File Parameter

Use a lookup file parameter when you want to change the lookup file and/or location between session runs.
To use a lookup file parameter:
1. Select a Lookup transformation on the Mapping tab.
2. Go to the Properties settings.
3. In the Lookup Source Filename field, enter the lookup file parameter name.
   Name all lookup file parameters $LookupFileName.
4. If you want the parameter to represent both the lookup file name and location, clear the Lookup Source File Directory field.
5. In the General Options settings of the Properties tab, enter a parameter file and directory in the Parameter Filename field.
6. Click OK.

Before you run the session, create the parameter file in the specified directory and define the lookup file parameter. For details, see "Parameter Files" on page 511.


Reject File Parameters


You can create user-defined reject file session parameters. Use a reject file parameter when you
want to change the name or location of session reject files between session runs. Name all
reject file session parameters with the prefix $BadFile, followed by any alphanumeric and
underscore characters. All reject file parameters within a session need to have distinct names.
You can create a reject file parameter for any target in a session. When you define the
parameter in a parameter file, you can reference any directory local to the PowerCenter Server.
You can use a user-defined reject file session parameter in either the Reject File Directory or
Reject Filename session property.

Changing the Reject File Name


You can use a reject file parameter to change the name of a reject file a session uses. In the
Properties settings of the Mapping tab, enter the reject file parameter in the Reject Filename
field. Then define the parameter in the parameter file. When the PowerCenter Server runs the
session, it locates the directory listed in the Reject File Directory field and creates the reject
file listed in the parameter file. If the reject file already exists, it appends rejected data to the
existing reject file.
Figure 18-5 shows how to use a reject file parameter with a reject file directory:
Figure 18-5. Using Parameters to Change the Reject File Name


For example, you want to rename reject files between sessions to keep rejected data from different session runs in different files. In the session, you leave Reject File Directory set to its default, the $PMBadFileDir server variable. For the reject file name, you create a session parameter named $BadFileName. In the parameter file, you set $BadFileName to FirstRun.bad. When you registered the PowerCenter Server, you set $PMBadFileDir to C:/Program Files/Informatica/PowerCenter Server/BadFiles. When the PowerCenter Server runs the session, it creates the FirstRun.bad file in the C:/Program Files/Informatica/PowerCenter Server/BadFiles directory.
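The corresponding parameter file entry for this example might look like the following (the folder and session names are illustrative):

[Production.s_MonthlyCalculations]
$BadFileName=FirstRun.bad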

Changing the Reject File and Directory


You can use a reject file parameter to change both the directory and name for session reject
files. When you specify both the reject file and directory in the Reject Filename field, you
need to clear the Reject File Directory field. The PowerCenter Server concatenates both fields
to determine where to find the indicated reject file.
For example, you use a database connection parameter to configure a session to write to
different target databases. Instead of having the PowerCenter Server append rejected data
from all sessions to the same reject file, you want to have a reject file for each target system. In
the session, you name the reject file $BadFileName and clear the Reject File Directory field.
In the parameter file, you set $BadFileName to the reject filename and directory for the target
database used in the session. When you change the database connection parameter to a
different database, you can also change the reject filename and directory.
Alternatively, you can create a different parameter file for each target system. You can then use
pmcmd to specify which parameter file to use when you start the session.
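For example, the parameter file for the run against one target system might contain entries such as the following (the folder, session, connection, and path names are illustrative):

[Sales.s_SalesLoad]
$DBConnection_Target=sales_west
$BadFileName=F:/rejects/sales_west.bad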

Steps for Using a Reject File Parameter

Use a reject file parameter when you want to change the reject file and/or location between session runs.
To use a reject file parameter:
1. Go to the Properties settings of the Mapping tab.
2. In the Reject Filename field, enter the reject file parameter name.
   Name all reject file parameters $BadFileName.
3. If you want the parameter to represent both the reject file name and location, clear the Reject File Directory field.
4. In the General Options settings of the Properties tab, enter a parameter file and directory in the Parameter Filename field.
5. Click OK.

Before you run the session, create the parameter file in the specified directory and define the reject file parameter. For details, see "Parameter Files" on page 511.


Tips
Use reject file and session log parameters in conjunction with target file or target database
connection parameters.
When you use a target file or target database connection parameter with a session, you can
keep track of reject files by using a reject file parameter to write the reject file to the target
machine. You can also use the session log parameter to write the session log to the target
machine.


Chapter 19

Parameter Files
This chapter covers the following topics:

Overview, 512

Parameter File Format, 513

Guidelines for Creating Parameter Files, 515

Sample Parameter File, 517

Configuring the Parameter File Location, 518

Troubleshooting, 520

Tips, 521


Overview
You can use a parameter file to define the values for parameters and variables used in a
workflow, worklet, or session. You can create a parameter file using a text editor such as
WordPad or Notepad. You list the parameters or variables and their values in the parameter
file. Parameter files can contain the following types of parameters and variables:

Workflow variables

Worklet variables

Session parameters

Mapping parameters and variables

When you use parameters or variables in a workflow, worklet, or session, the PowerCenter
Server checks the parameter file to determine the start value of the parameter or variable. You
can use a parameter file to initialize workflow variables, worklet variables, mapping
parameters, and mapping variables. If you do not define start values for these parameters and
variables, the PowerCenter Server checks for the start value of the parameter or variable in
other places. For more information, see Using Workflow Variables on page 103 and
Mapping Parameters and Variables in the Designer Guide.
You can place parameter files on the PowerCenter Server machine or on a local machine. Use
a local parameter file if you do not have access to parameter files on the PowerCenter Server
machine. When you use a local parameter file, pmcmd passes variables and values in the file to
the PowerCenter Server. Local parameter files are used with the startworkflow pmcmd
command. For more information, see pmcmd Reference on page 594.
You must define session parameters in a parameter file. Since session parameters do not have
default values, when the PowerCenter Server cannot locate the value of a session parameter in
the parameter file, it fails to initialize the session.
You can include parameter or variable information for more than one workflow, worklet, or
session in a single parameter file by creating separate sections for each object within the
parameter file.
You can also create multiple parameter files for a single workflow, worklet, or session and
change the file that these tasks use as needed. To specify the parameter file the PowerCenter
Server uses with a workflow, worklet, or session, you can do either of the following:

Enter the parameter file name and directory in the workflow, worklet, or session
properties.

Start the workflow, worklet, or session using pmcmd and enter the parameter filename and
directory in the command line. For details, see Using pmcmd on page 581.

If you enter a parameter file name and directory in both the workflow, worklet, or session
properties and in the pmcmd command line, the PowerCenter Server uses the information
you enter in the pmcmd command line.


Parameter File Format


When you enter values in a parameter file, you must precede the entries with a heading that
identifies the workflow, worklet, or session whose parameters and variables you want to
assign. You assign individual parameters and variables directly below this heading, entering
each parameter or variable on a new line. You can list parameters and variables in any order
for each task.
You can define the following heading formats:

Workflow variables:
[folder name.WF:workflow name]

Worklet variables:
[folder name.WF:workflow name.WT:worklet name]

Worklet variables in nested worklets:
[folder name.WF:workflow name.WT:worklet name.WT:worklet name...]

Session parameters, plus mapping parameters and variables:
[folder name.WF:workflow name.ST:session name]
or
[folder name.session name]
or
[session name]

Below each heading, you define parameter and variable values as follows:
parameter name=value
parameter2 name=value
variable name=value
variable2 name=value

For example, you have a session, s_MonthlyCalculations, in the Production folder. The session uses a string mapping parameter, $$State, that you want to set to MA, and a datetime mapping variable, $$Time. $$Time already has an initial value of 9/30/2000 00:00:00 saved in the repository, but you want to override this value to 10/1/2000 00:00:00. The session also uses session parameters to connect to source files and target databases, as well as to write the session log to the appropriate session log file.
Table 19-1 shows the parameters and variables that you define in the parameter file:

Table 19-1. Parameters and Variables in Parameter File

Parameter and Variable Type                Parameter and Variable Name    Desired Definition
String Mapping Parameter                   $$State                        MA
Datetime Mapping Variable                  $$Time                         10/1/2000 00:00:00
Source File (Session Parameter)            $InputFile1                    Sales.txt
Database Connection (Session Parameter)    $DBConnection_Target           Sales (database connection)
Session Log File (Session Parameter)       $PMSessionLogFile              d:/session logs/firstrun.txt

The parameter file for the session includes the folder and session name, as well as each
parameter and variable:
[Production.s_MonthlyCalculations]
$$State=MA
$$Time=10/1/2000 00:00:00
$InputFile1=sales.txt
$DBConnection_target=sales
$PMSessionLogFile=D:/session logs/firstrun.txt

The next time you run the session, you might edit the parameter file to change the state to
MD and delete the $$Time variable. This allows the PowerCenter Server to use the value for
the variable that was set in the previous session run.
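After that edit, the parameter file might read as follows:

[Production.s_MonthlyCalculations]
$$State=MD
$InputFile1=sales.txt
$DBConnection_target=sales
$PMSessionLogFile=D:/session logs/firstrun.txt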


Guidelines for Creating Parameter Files


Use the following guidelines when creating parameter files:

Capitalize folder and session names as necessary. Folder and session names are case-sensitive in the parameter file.

Enter folder names for non-unique session names. When a session name exists more than
once in a repository, enter the folder name to indicate the location of the session.

Create one or more parameter files. You assign parameter files to workflows, worklets, and
sessions individually. You can specify the same parameter file for all of these tasks or create
several parameter files.

When you want to include parameter and variable information for more than one
session in the file, create a new section for each session as follows. The folder name is
optional.
[folder_name.session_name]
parameter_name=value
variable_name=value
mapplet_name.parameter_name=value
[folder2_name.session_name]
parameter_name=value
variable_name=value
mapplet_name.parameter_name=value

Specify headings in any order. You can place headings in any order in the parameter file.
However, if you define the same parameter or variable more than once in the file, the
PowerCenter Server assigns the parameter or variable value using the first instance of the
parameter or variable.

Specify parameters and variables in any order. Below each heading, you can specify the
parameters and variables in any order.

When defining parameter values, do not use unnecessary line breaks or spaces. The
PowerCenter Server might interpret additional spaces as part of the value.

List all necessary mapping parameters and variables. Values entered for mapping
parameters and variables become the start value for parameters and variables in a mapping.
Mapping parameter and variable names are not case sensitive.

List all session parameters. Session parameters do not have default values. An undefined
session parameter can cause the session to fail. Session parameter names are not case-sensitive.

Use correct date formats for datetime values. When entering datetime values, use the following date formats:
- MM/DD/RR
- MM/DD/RR HH24:MI:SS
- MM/DD/YYYY
- MM/DD/YYYY HH24:MI:SS

Do not enclose parameters or variables in quotes. The PowerCenter Server interprets everything after the equal sign as part of the value.

Precede parameters and variables created in mapplets with the mapplet name as follows:
mapplet_name.parameter_name=value
mapplet2_name.variable_name=value


Sample Parameter File


The following text is an excerpt from a parameter file:
[HET_TGTS.WF:wf_TCOMMIT_INST_ALIAS]
$$platform=unix
[HET_TGTS.WF:wf_TGTS_ASC_ORDR.ST:s_TGTS_ASC_ORDR]
$$platform=unix
$DBConnection_ora=qasrvrk2_hp817
[ORDERS.WF:wf_PARAM_FILE.WT:WL_PARAM_Lvl_1]
$$DT_WL_lvl_1=02/01/2000 00:00:00
$$Double_WL_lvl_1=2.2
[ORDERS.WF:wf_PARAM_FILE.WT:WL_PARAM_Lvl_1.WT:NWL_PARAM_Lvl_2]
$$DT_WL_lvl_2=03/01/2000 00:00:00
$$Int_WL_lvl_2=3
$$String_WL_lvl_2=ccccc


Configuring the Parameter File Location


You can specify the parameter filename and directory in the workflow or session properties.
To enter a parameter file in the workflow properties:
1. Select Workflows-Edit.
2. Click the Properties tab.
3. Enter the parameter directory and name in the Parameter Filename field.
   You can enter either a direct path or a server variable directory. Use the appropriate delimiter for the PowerCenter Server operating system.
4. Click OK.

To enter a parameter file in the session properties:
1. Click the Properties tab and open the General Options settings.
2. Enter the parameter directory and name in the Parameter Filename field.
   You can enter either a direct path or a server variable directory. Use the appropriate delimiter for the PowerCenter Server operating system.
3. Click OK.
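For example, on a Windows PowerCenter Server you might enter a direct path such as the following (the directory and file names are illustrative):

C:\Informatica\ParmFiles\monthly.txt

On a UNIX PowerCenter Server, the equivalent entry uses forward slashes, for example /data/parmfiles/monthly.txt.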


Troubleshooting
I have a section in a parameter file for a session, but the PowerCenter Server does not seem
to read it.
In the parameter file, folder and session names are case-sensitive. Make sure to enter folder
and session names exactly as they appear in the Workflow Manager. Also, use the appropriate
prefix for all user-defined session parameters.
Table 19-2 describes required naming conventions for user-defined session parameters:

Table 19-2. Naming Conventions for User-Defined Session Parameters

Parameter Type         Naming Convention
Database Connection    $DBConnectionName
Reject File            $BadFileName
Source File            $InputFileName
Target File            $OutputFileName
Lookup File            $LookupFileName

I am trying to use a source file parameter to specify a source file and location, but the
PowerCenter Server cannot find the source file.
Make sure to clear the source file directory in the session properties. The PowerCenter Server
concatenates the source file directory with the source file name to locate the source file.
Also, make sure to enter a directory local to the PowerCenter Server and to use the
appropriate delimiter for the operating system.
I am trying to run a workflow with a parameter file and one of the sessions keeps failing.
The session might contain a parameter that is not listed in the parameter file. The
PowerCenter Server uses the parameter file to start all sessions in the workflow. Check the
session properties, then verify that all session parameters are defined correctly in the
parameter file.


Tips
Use a single parameter file to group parameter information for related sessions.
When sessions are likely to use the same database connection or directory, you might want to
include them in the same parameter file. When existing systems are upgraded, you can update
information for all sessions by editing one parameter file.
Use pmcmd and multiple parameter files for sessions with regular cycles.
When a session runs on a regular cycle, you often reuse the same sets of parameter values on a regular basis. For example, if you run a session against both the sales and marketing databases once a week, you might want to create separate parameter files for each regular session run. Then, instead of
changing the parameter file in the session properties each time you run the session, use pmcmd
to specify the parameter file to use when you start the session.
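For example, assuming the pmcmd command-line syntax described in "Using pmcmd" on page 581 (the server, user, folder, file, and workflow names here are illustrative, and the exact option names may vary by platform and release), you might start the same workflow with a different parameter file for each run:

pmcmd startworkflow -s sales_server:4001 -u Administrator -p password -f Production -paramfile /data/parmfiles/sales.txt wf_WeeklyLoad
pmcmd startworkflow -s sales_server:4001 -u Administrator -p password -f Production -paramfile /data/parmfiles/marketing.txt wf_WeeklyLoad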



Chapter 20

External Loading
This chapter covers the following topics:

Overview, 524

External Loader Permissions, 525

External Loader Behavior, 526

Loading to DB2, 528

Loading to Oracle, 533

Loading to Sybase IQ, 535

Loading to Teradata, 538

Creating an External Loader Connection, 551

Configuring External Loading in a Session, 553

Troubleshooting, 557


Overview
You can configure a session to use DB2, Oracle, Sybase IQ, and Teradata external loaders to
load session target files into the respective databases. External Loaders can increase session
performance since these databases can load information directly from files faster than they can
run the SQL commands to insert the same data into the database.
To use an external loader for a session, you must perform the following tasks:
1. Create an external loader connection in the Workflow Manager and configure the external loader attributes. For details on creating external loader connections, see "Creating an External Loader Connection" on page 551.
2. Configure the session to write to a flat file instead of to a relational database. For more information, see "Configuring a Session to Write to a File" on page 553.
3. Choose an external loader connection for each target file in the session properties. For more information, see "Selecting an External Loader Connection" on page 555.

When you run a session that uses an external loader, the PowerCenter Server creates a control
file and a target flat file. The control file contains information about the target flat file such as
data format and loading instructions for the external loader. The control file has an extension
of .ctl. You can view the control file and the target flat file in the target file directory (default:
$PMTargetFileDir).
The PowerCenter Server waits for all external loading to complete before performing post-session commands, running external procedures, and sending post-session email.
Before you run external loaders, consider the following issues:

Disable constraints. Normally, you disable constraints built into the tables receiving the
data before performing the load. Consult your database documentation for instructions on
how to disable constraints.

Performance issues. To preserve high performance, you can increase commit intervals and
turn off database logging. However, to perform database recovery on failed sessions, you
must have database logging turned on.

Code page requirements. DB2, Oracle, Sybase IQ, and Teradata database servers must run
in the same code page as the target flat file code page. The external loaders start in the
target flat file code page. The PowerCenter Server creates the control and target flat files
using the target flat file code page. If you are using a code page other than 7-bit ASCII for
the target flat file, run the PowerCenter Server in Unicode data movement mode.

The PowerCenter Server can use multiple external loaders within one session. For example, if
the mapping contains two targets, you can create a session that uses different connection
types: one uses an Oracle external loader connection and the other uses a Sybase IQ external
loader connection.


External Loader Permissions


You can set external loader connection permissions in the connection object in the Workflow
Manager. The Workflow Manager assigns Owner permissions to the user who registers the
connection. The Workflow Manager grants Owner Group permissions to the first group in
the Group Memberships list of the owner. You can manage External Loader permissions if you
are the owner of the external loader connection or if you have Super User privileges.
If you want to edit an external loader connection, you must have read and write permissions
for the connection. If you want to run sessions that use a target external loader connection,
you must have at least execute permission for the connection.

Permissions and Privileges


To create an external loader connection, you must have one of the following privileges:

Use Workflow Manager

Super User

To configure a session to use an external loader, you must have one of the following sets of
privileges and permissions:

Use Workflow Manager privilege and folder read and write permissions

Super User

If you enabled enhanced security, you must also have read permission for external loader
connections associated with the session.


External Loader Behavior


The behavior of the external loader depends on how you choose to load the data. You can load
data in the following ways:

Loading to named pipes. When you load data to named pipes, the external loader starts to
load data to the target database as soon as the data appears in the named pipe.

Staging data using flat files. When you stage data in flat files, the external loader starts to
load data to the target databases only after the PowerCenter Server completes writing to
the target flat files.

Loading Data Using Named Pipes


On UNIX, the PowerCenter Server writes to a named pipe, which is named after the
configured target file name. The external loader starts to load data to the database as soon as
the data appears in the named pipe. When you use external loaders on UNIX, the loader
deletes the named pipe as soon as it completes the load.
On Windows, when you load data using named pipes, the PowerCenter Server writes data to
a named pipe using the specified format: \\.\pipe\<pipename> where the pipename is the same
as the configured target name. If the PowerCenter Server finds a file or named pipe that uses
the same name as the target flat file, it deletes the file or named pipe and recreates it.
If the PowerCenter Server on UNIX finds a file or named pipe (with the same name as the
session target flat file) in the target directory, it deletes the file or named pipe and recreates the
named pipe.
Tip: You may not be able to create a named pipe or file if another file exists that uses the same

name. You can rename the output file in the session that uses the external loader.

Staging Data to Flat Files


When you stage data using flat files, the external loader starts loading data to target databases
only after the PowerCenter Server completes writing to the target flat files. The external
loader does not delete the target flat files after loading them to the database. Make sure the
target file directory can accommodate the size of the target flat files.
If the session contains fatal errors, the PowerCenter Server does not finish writing data to the
target files, and the external loader does not start.

Partitioning Sessions with External Loaders


When you configure multiple partitions in a session with a flat file target, the PowerCenter
Server creates a separate flat file for each partition. Some external loaders cannot load data
from multiple files into the target. When you use an external loader in a session with multiple
partitions, you must configure partitioning according to the external loader you use.


When you use an external loader that can load data from multiple files, you can create
multiple partitions in the session. You choose an external loader connection for each
partition. The PowerCenter Server creates an output file for each partition, and the external
loader loads the output from each target file to the database.
If you use a loader that cannot load from multiple files, the session fails.
Table 20-1 lists the external loaders and loader behavior:

Table 20-1. Partitioning Guidelines for External Loaders

External Loader               Load Behavior
DB2 EE db2load                Cannot load from multiple output files.
DB2 EEE autoloader            Cannot load from multiple output files.*
Oracle                        Behavior based on parallel load configuration:
                              - Disabled. Cannot load from multiple output files.
                              - Enabled. Can load from multiple output files.
Sybase IQ                     Cannot load from multiple output files.
Teradata MultiLoad            Cannot load from multiple output files.
Teradata TPump                Can load from multiple output files.
Teradata FastLoad             Cannot load from multiple output files.
Teradata Warehouse Builder    Can load from multiple output files.

*The PowerCenter Server cannot pass multiple output files to the DB2 EEE autoloader.

Errors and Error Messages


The PowerCenter Server writes external loader initialization and completion messages in the
session log. For details on external loader performance, check the external loader log. The
loader saves the log in the same directory as the target flat files (default location:
$PMTargetFileDir). The default extension for external loader logs is .ldrlog.


Loading to DB2
The DB2 EE external loader and DB2 EEE external loader can perform insert and replace
operations on targets. The external loaders can also restart or terminate load operations.
The DB2 EE external loader invokes the db2load executable located in the PowerCenter
Server installation directory. The DB2 EE external loader can load data to a DB2 server on a
machine that is remote to the PowerCenter Server.
The DB2 EEE external loader invokes the IBM DB2 Autoloader program to load data. The
Autoloader program uses the db2atld executable. The DB2 EEE external loader can partition
data and load the partitioned data simultaneously to the corresponding database partitions.
When you use the DB2 EEE external loader, the PowerCenter Server and the DB2 EEE server
must be on the same machine.
The DB2 external loaders load from a delimited flat file. Verify that the target table columns
are wide enough to store all of the data.
If you select a DB2 loader in a session with multiple partitions, the session fails. For more
information about partitioning sessions with external loaders, see Partitioning Sessions with
External Loaders on page 526.
If you configure multiple targets in the same pipeline to use DB2 external loaders, each loader
must load to a different tablespace on the target database. For information on selecting
external loaders, see Configuring External Loading in a Session on page 553.
When you load data to a DB2 database using the DB2 EE or DB2 EEE external loader, you
must have the correct authority levels and privileges to load data to the database tables.

Setting DB2 External Loader Operation Modes


DB2 operation modes specify the type of load the external loader runs. You can configure the
DB2 EE or DB2 EEE external loader to run in one of the following operation modes:

Insert. Adds loaded data to the table without changing existing table data.

Replace. Deletes all existing data from the table, and inserts the loaded data. The table and
index definitions do not change.

Restart. Restarts a previously interrupted load operation.

Terminate. Terminates a previously interrupted load operation and rolls back the
operation to the starting point, even if consistency points were passed. The tablespaces
return to normal state, and all table objects are made consistent.

Configuring Authorities, Privileges, and Permissions


When you load data to a DB2 database using the DB2 EE or DB2 EEE external loader, you
must have the correct authority levels and privileges to load data to the database tables.


DB2 privileges allow you to create or access database resources. Authority levels provide a
method of grouping privileges and higher-level database manager maintenance and utility
operations. Together, these act to control access to the database manager and its database
objects. You can access objects for which you have the required privilege or authority.
To load data into a table, you must have one of the following authorities:

SYSADM authority

DBADM authority

LOAD authority on the database, and one of the following privileges:

INSERT privilege on the table when the load utility is invoked in INSERT mode,
TERMINATE mode (to terminate a previous load insert operation), or RESTART mode
(to restart a previous load insert operation)

INSERT and DELETE privilege on the table when the load utility is invoked in
REPLACE mode, TERMINATE mode (to terminate a previous load replace operation),
or RESTART mode (to restart a previous load replace operation)

In addition, you must have proper read access and read/write permissions:

The database instance owner must have read access to the external loader input files.

If you run DB2 as a service on Windows, you must configure the service start account with
a user account that has read/write permissions to use LAN resources, including drives,
directories, and files.

If you load to DB2 EEE, the database instance owner must have write access to the load
dump file and the load temporary file.

For more information, consult your IBM DB2 database documentation.

Configuring DB2 EE External Loader Attributes


Table 20-2 describes attributes for DB2 EE external loader connections:

Table 20-2. DB2 EE External Loader Attributes

Opmode (default: Insert)
    The DB2 external loader operation mode. Choose one of the following operation modes: Insert, Replace, Restart, or Terminate. For more information about DB2 operation modes, see "Setting DB2 External Loader Operation Modes" on page 528.

External Loader Executable (default: db2load)
    The name of the DB2 EE external loader executable file.

DB2 Server Location (default: Remote)
    The location of the DB2 EE database server relative to the PowerCenter Server. Select Local if the DB2 EE database server resides on the PowerCenter Server machine. Select Remote if the DB2 EE server resides on another machine.

Is Staged (default: Disabled)
    The method of loading data. Select Is Staged to load data to a flat file staging area before loading to the database. Otherwise, the data is loaded to the database using a named pipe. For more information, see "Loading Data Using Named Pipes" on page 526 or "Staging Data to Flat Files" on page 526.

Recoverable (default: Enabled)
    Sets tablespaces in backup pending state if forward recovery is enabled. If you disable forward recovery, the DB2 tablespace is not set to backup pending state. If the DB2 tablespace is in backup pending state, you must fully back up the database before you perform any other operation on the tablespace.

DB2 EE External Loader Return Codes


The DB2 EE external loader indicates the success or failure of a load operation with a return
code. The PowerCenter Server writes the external loader return code to the session log.
Return code (0) indicates that the load operation succeeded. The PowerCenter Server writes
the following message to the session log if the external loader successfully completes the load
operation:
WRT_8029 External loader process <external loader name> exited
successfully.

Any other return code indicates that the load operation failed. The PowerCenter Server writes
the following error message to the session log:
WRT_8047 Error: External loader process <external loader name> exited with
error <return code>.

Table 20-3 describes the return codes for the DB2 EE external loader:

Table 20-3. DB2 EE External Loader Return Codes

0 - The external loader operation completed successfully.
Nonzero return codes indicate one of the following failures:
- The external loader cannot locate the control file.
- The external loader could not open the external loader log file.
- The external loader could not access the control file because the control file is locked by another process.
- The DB2 database returned an error.

Configuring DB2 EEE External Loader Attributes


You can configure the DB2 EEE external loader to use different loading modes when loading
to the database. Loading modes determine how the DB2 EEE external loader loads data across partitions in the database. You can configure the DB2 EEE external loader to use the
following loading modes:

Split and load. The DB2 EEE external loader partitions the data and loads it
simultaneously on the corresponding database partitions.

Split only. The DB2 EEE external loader partitions the data and writes the output to files
in the specified split file directory.

Load only. The DB2 EEE external loader does not partition the data. It loads data in
existing split files on the corresponding database partitions.

Analyze. The DB2 EEE external loader generates an optimal partitioning map with even
distribution across all database partitions. If you run the external loader in split and load
mode after you run it in analyze mode, the external loader uses the optimal partitioning
map to partition the data.

For more information about DB2 loading modes, consult your DB2 database documentation.
The DB2 EEE external loader also writes multiple external loader logs. The number of
external loader logs depends on the number of database partitions to which the external
loader loads data. For each partition, the external loader appends a number corresponding to
the partition number to the external loader log file name. The DB2 EEE external loader log
file format is file_name.ldrlog.partition_number.
The PowerCenter Server does not archive or overwrite DB2 EEE external loader logs. If an
external loader log of the same name exists when the external loader runs, the external loader
appends new external loader log messages to the end of the existing external loader log file.
You must manually archive or delete the external loader log files. For details on log files
generated by DB2 Autoload, consult your DB2 documentation.
For information on DB2 EEE external loader return codes, consult your DB2
documentation.
Table 20-4 describes attributes for DB2 EEE external loader connections:

Table 20-4. DB2 EEE External Loader Attributes

Opmode (default: Insert)
    The DB2 external loader operation mode. Choose one of the following operation modes: Insert, Replace, Restart, or Terminate. For more information about DB2 operation modes, see "Setting DB2 External Loader Operation Modes" on page 528.

External Loader Executable (default: db2atld)
    The name of the DB2 EEE external loader executable file.

Split File Location (default: n/a)
    The location of the split files. The external loader creates split files if you configure SPLIT_ONLY loading mode.

Output Nodes (default: n/a)
    The database partitions on which the load operation is to be performed.

Split Nodes (default: n/a)
    The database partitions that determine how to split the data. If you do not specify this attribute, the external loader automatically determines an optimal splitting method.

Mode (default: Split and load)
    The loading mode the external loader uses to load the data. Choose one of the following loading modes: Split and load, Split only, Load only, or Analyze.

Max Num Splitters (default: 25)
    Maximum number of splitter processes.

Force (default: No)
    Forces the external loader operation to continue even if it determines at startup time that some target partitions or tablespaces are offline.

Status Interval (default: 100)
    Number of megabytes of data the external loader loads before writing a progress message to the external loader log. You can specify a value between 1 and 4,000 MB.

Ports (default: 6000-6063)
    The range of TCP ports the external loader uses to create sockets for internal communications with the DB2 server.

Check Level (default: Nocheck)
    Specifies whether the external loader should check for record truncation during input or output.

Map File Input (default: n/a)
    The name of the file that specifies the partitioning map. If you want to use a customized partitioning map, you must specify this attribute. You can generate a customized partitioning map when you run the external loader in Analyze loading mode.

Map File Output (default: n/a)
    The name of the partitioning map when you run the external loader in Analyze loading mode. You must specify this attribute if you want to run the external loader in Analyze loading mode.

Trace
    The number of rows the external loader traces when you need to review a dump of the data conversion process and output of hashing values.

Is Staged (default: Disabled)
    The method of loading data. Select Is Staged to load data to a flat file staging area before loading to the database. Otherwise, the data is loaded to the database using a named pipe. For more information, see "Loading Data Using Named Pipes" on page 526 or "Staging Data to Flat Files" on page 526.

Date Format (default: mm/dd/yyyy)
    The date format. The date format in the Connection Object definition must match the date format you define in the target definition. DB2 supports the following date formats: mm/dd/yyyy, yyyy-mm-dd, and dd.mm.yyyy.

Loading to Oracle
The Oracle SQL loader can perform insert, update, and delete operations on targets. The
target flat file for an Oracle external loader can be fixed-width or delimited.

Loading Multibyte Data to Oracle


When you load multibyte data to Oracle, data precision is measured in bytes for fixed-width
files and in characters for delimited files. Make sure the target table columns are wide enough
to store all the data without risking data truncation. To widen the columns, increase the
column size in the target table definition.
Oracle supports character-oriented datatypes, such as Nchar, where the precision is measured
in characters. If you use the Nchar datatype, multiply the maximum number of characters by
K, where K is the maximum number of bytes a character contains in the selected target code
page. This ensures that the PowerCenter Server does not truncate data before loading the
target file.
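For example, suppose an Nchar column holds a maximum of 20 characters and a character in the selected target code page can contain up to 3 bytes. Multiplying 20 characters by 3 bytes per character gives 60, so you would set the target column precision to 60 to ensure that the PowerCenter Server does not truncate data before loading the target file.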
Note: If you configure a session to write to an Oracle 8 table in bulk mode with NOT NULL

constraints on any columns, the session may write null data into a NOT NULL column.

Oracle External Loader Attributes


Use the following guidelines when you enter attributes for the Oracle external loader
connection:

If you select an Oracle external loader, the default external loader executable name is
SQLLOAD. This is accurate for most UNIX platforms, but if you use Windows, check
your Oracle documentation to find the name of the external loader executable.

Select Do Not Enable Parallel Load to write to a non-partitioned Oracle target table.

To write to a partitioned Oracle target using Direct Path, you must select Enable Parallel
Load and Append load mode.

To write to a partitioned Oracle target using Conventional Path, select Enable Parallel
Load for best performance.

Tip: For optimal performance, select Direct Path when writing to a partitioned Oracle target.

For details, see your Oracle documentation.


Table 20-5 describes the attributes for Oracle external loader connections:

Table 20-5. Oracle External Loader Attributes

Error Limit
    Number of errors to allow before the external loader stops the load operation.

Load Mode (default: Append)
    The loading mode the external loader uses to load data. Choose from one of the following loading modes: Append, Insert, Replace, or Truncate.

Load Method (default: Use Conventional Path)
    The method the external loader uses to load data. Choose from one of the following load methods: Use Conventional Path, Use Direct Path (Recoverable), or Use Direct Path (Unrecoverable).

Enable Parallel Load (default: Enable Parallel Load)
    Determines whether the Oracle external loader loads data in parallel to a partitioned Oracle target table. Choose either Enable Parallel Load or Do Not Enable Parallel Load. You can create multiple partitions in a session if you use a loader configured to enable parallel load. Sessions with multiple partitions fail if you use a loader configured not to enable parallel load. For more information, see "Partitioning Sessions with External Loaders" on page 526.

Rows Per Commit (default: 10000)
    For the Conventional Path load method, this attribute specifies the number of rows in the bind array for load operations. For Direct Path load methods, this attribute specifies the number of rows the external loader reads from the target flat file before it saves the data to the database.

External Loader Executable (default: sqlload)
    The name of the external loader executable file.

Log File Name (default: n/a)
    The path and name of the external loader log file.

Is Staged (default: Disabled)
    The method of loading data. Select Is Staged to load data to a flat file staging area before loading to the database. Otherwise, the data is loaded to the database using a named pipe. For more information, see "Loading Data Using Named Pipes" on page 526 or "Staging Data to Flat Files" on page 526.

Reject File
The Oracle external loader creates a reject file for data rejected by the database. The reject file
has an extension of .ldrreject. The loader saves the reject file in the target files directory
(default location: $PMTargetFileDir).


Loading to Sybase IQ
The Sybase external loader can perform insert operations on Sybase IQ targets. It cannot
perform update or delete operations on targets.
Use the following rules and guidelines when you work with a Sybase IQ external loader:

Ensure that target tables do not violate primary key constraints.

Configure a Sybase IQ user with read/write access before you use a Sybase IQ external
loader.

Target flat files for a Sybase IQ external loader can be fixed-width or delimited.

The PowerCenter Server can load multibyte data to Sybase IQ targets.

If you select a Sybase IQ external loader in a session with multiple partitions, the session
fails. For more information about partitioning sessions with external loaders, see
Partitioning Sessions with External Loaders on page 526.

If the PowerCenter Server and Sybase IQ Server are on different machines, map a drive
from the machine hosting the PowerCenter Server to the machine hosting the Sybase IQ
Server. In a UNIX environment, mount the drive.

Using Sybase IQ External Loader on UNIX


For Sybase IQ external loaders, the PowerCenter Server can write to a named pipe if the
PowerCenter Server is local to the Sybase IQ database. Use pmconfig to enable the
SybaseIQLocaltoPMServer option. If the PowerCenter Server is not local to the Sybase IQ
database server or if you do not enable the option, the PowerCenter Server writes to a flat file.

Loading Multibyte Data to Sybase IQ


When you load multibyte data to Sybase IQ targets, consider the following issues involving
data precision and delimiters.

Fixed-Width Flat File Targets


If you plan to load multibyte data into a fixed-width flat file target, configure the precision to
accommodate the multibyte data. Fixed-width files are byte-oriented, not character-oriented.
So when you configure the precision for a fixed-width target, you need to consider the
number of bytes you load into the target, rather than the number of characters. The
PowerCenter Server writes the row to the reject file if the precision is not large enough to
accommodate the multibyte data.
For more information about writing to flat files, see Working with File Targets on page 261.


Delimited Flat File Targets


For delimited flat files, data precision is measured in characters. When you insert multibyte
character data in the target, you do not need to allow for additional precision for multibyte
data. Sybase IQ does not allow optional quotes. You must choose None for Optional Quotes
if you have a delimited target flat file.
When you load multibyte data to Sybase IQ targets, null characters and delimiters can be up
to four bytes each. To avoid reading the delimiters as regular characters, each byte of the
delimiter must have an ASCII value of less than 0x40. For details on loading multibyte data to
targets, see Working with File Targets on page 261.

Sybase IQ External Loader Attributes


Use the following guidelines when you enter attributes for the Sybase IQ external loader
connection:

The connect string must contain the following attributes:


uid=user ID; pwd=password; eng=Sybase IQ database server name;
links=tcpip; (host=host name; port=port number)

The server datafile directory is relative to the database server.


If the directory is in a Windows system, use backslashes (\) in the directory path:
D:\mydirectory\inputfile.out

If the directory is in a UNIX system, use a forward slash (/):


/mydirectory/inputfile.out

When you create a Sybase IQ external loader connection, the Workflow Manager sets the
name of the external loader executable file to dbisql by default. If you use an executable file
with a different name, for example, dbisqlc, you must update the External Loader
Executable field. If the external loader executable file directory is not in the system path,
you must enter the file path and file name in this field.

Table 20-6 describes the attributes for Sybase IQ external loader connections:

Table 20-6. Sybase IQ External Loader Attributes

Block Factor (default: 10000)
    The number of records per block in the target Sybase table. The external loader applies the Block Factor attribute to load operations for fixed-width flat file targets only.

Block Size (default: 50000)
    The size of blocks used in Sybase database operations. The external loader applies the Block Size attribute to load operations for delimited flat file targets only.

Checkpoint (default: Enabled)
    If enabled, the Sybase IQ database issues a checkpoint after successfully loading the table. If disabled, the database issues no checkpoints.

Notify Interval (default: 1000)
    The number of rows the Sybase IQ external loader loads before it writes a status message to the external loader log.

Server Datafile Directory (default: n/a)
    The location of the flat file target. You must specify this attribute relative to the database server installation directory. Enter the target file directory path using the syntax for the machine hosting the database server installation. For example, if the PowerCenter Server is on a Windows machine and the Sybase IQ Server is on a UNIX machine, use UNIX syntax.

External Loader Executable (default: dbisql)
    The name of the Sybase IQ external loader executable.

Is Staged (default: Enabled)
    The method of loading data. Select Is Staged to load data to a flat file staging area before loading to the database. Otherwise, the data is loaded to the database using a named pipe. For more information, see "Loading Data Using Named Pipes" on page 526 or "Staging Data to Flat Files" on page 526.


Loading to Teradata
When you load to Teradata, you can use the following external loaders:

MultiLoad. Performs insert, update, delete, and upsert operations for large volume
incremental loads. You can use this loader when you run a session with a single partition.
MultiLoad acquires table-level locks, making it appropriate for offline loading. For more
information about configuring the MultiLoad external loader connection object, see
"Teradata MultiLoad External Loader Attributes" on page 540.

TPump. Performs insert, update, delete, and upsert operations for relatively low volume
updates. You can use this loader when you run a session with multiple partitions. TPump
acquires row-hash locks on the table, allowing other users to access the table as TPump
loads to it. For more information about configuring the TPump external loader
connection object, see "Teradata TPump External Loader Attributes" on page 542.

FastLoad. Performs insert operations for high volume initial loads, or for high volume
truncate and reload operations. You can use this loader when you run a session with a
single partition. You can only use this loader on empty tables with no secondary indexes.
For more information about configuring the FastLoad external loader connection object,
see Teradata FastLoad External Loader Attributes on page 545.

Warehouse Builder. Performs insert, update, upsert, and delete operations on targets. You
can use this loader when you run a session with multiple partitions. You can achieve the
functionality of the other loaders based on the operator you use. For more information
about configuring the Warehouse Builder external loader connection object, see Teradata
Warehouse Builder External Loader Attributes on page 547.

If you use a Teradata external loader to perform update or upsert, you can use the Target
Update Override option in the Mapping Designer to override the UPDATE statement in the
external loader control file. For upsert, the INSERT statement in the external loader control
file remains unchanged. For details on using the Target Update Override option, see
Mappings in the Designer Guide.
Use the following guidelines when you use the Teradata external loaders:

The PowerCenter Server can use Teradata external loaders to load fixed-width flat files to a
Teradata database.

The target output file name, including the file extension, must not exceed 27 characters. If
the session contains multiple partitions, the target output file name, including the file
extension, must not exceed 25 characters.

You cannot use spaces as null characters.

You can use the Teradata external loaders to load multibyte data.

You cannot use the Teradata external loaders to load binary data.

When you load to Teradata using named pipes, set the checkpoint value to 0 to prevent
external loaders from performing checkpoint operations.

When you edit a session, you can specify error, log, or work table names, depending on the
loader you use. You can also specify error, log, or work database names.


When you edit a session, you can override the control file in the loader connection
properties.

You can view the Teradata control file in the target directory.
See the Teradata documentation for more information about the loaders.

Overriding the Control File


When you edit the loader connection in a session, you can override the control file. You might
want to override the control file to change some loader properties that you cannot edit in the
loader connection. For example, you can specify the tracing option in the control file.
When you override the control file, the Workflow Manager saves the control file to the
repository. The PowerCenter Server uses the saved control file when you run the session
again. If you do not override the control file, the PowerCenter Server generates a control file
based on the session and loader properties by default. It saves the control file in the output file
directory by default, but it does not use the control file the next time it runs the session.
To override the control file, override the loader connection for the target in the session. Click
the Edit button in the Control File Content Override loader property.
Figure 20-1 shows the Control File Editor dialog box where you override the Teradata control
file:
Figure 20-1. Control File Editor Dialog Box for Teradata

In the Control File Editor dialog box, click Generate to create the default control file. The
Workflow Manager creates the default control file based on the session and loader properties.
Edit the generated control file, and click OK to save your changes.
Note that if you change a target or loader connection setting after you edit the control file,
the control file does not include those changes. If you want to include those changes, you
must generate the control file again and edit it.
Note: The Workflow Manager does not validate the control file syntax. Teradata verifies the

control file syntax when you run a session. If the control file is invalid, the session fails.

Teradata MultiLoad External Loader Attributes


You can configure the external loader connection object in the Workflow Manager. You can
also override the external loader connection object attributes when you edit a reusable or non-reusable session.
Use the following guidelines when you work with the MultiLoad external loader:

- You can perform insert, update, delete, and upsert operations on targets. You can also use data driven mode to perform insert, update, or delete operations based on instructions coded in an Update Strategy or Custom transformation within a mapping (see the sketch after this list).
- The MultiLoad external loader cannot load from multiple output files. If you run a session with multiple partitions, the session fails. For more information about partitioning sessions with external loaders, see Partitioning Sessions with External Loaders on page 526.
- If you invoke a greater number of sessions than the maximum number of concurrent sessions the database allows, the session may hang. You can set the minimum value for Tenacity and Sleep to ensure that sessions fail rather than hang.
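As a sketch of how data driven mode is usually set up in the mapping (the port name lkp_CUST_ID is hypothetical), an Update Strategy transformation flags each row with an expression such as the following, and the PowerCenter Server then writes the matching 0, 1, or 2 indicator column into the target file or named pipe for the loader:

   IIF(ISNULL(lkp_CUST_ID), DD_INSERT, DD_UPDATE)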

Table 20-7 shows the attributes that you configure for the Teradata MultiLoad external
loader:
Table 20-7. Teradata MultiLoad External Loader Attributes

- TDPID (default n/a). The Teradata database ID.
- Database Name (default n/a). Optional database name.
- Date Format (default n/a). The date format. The date format in the Connection Object definition must match the date format you define in the target definition. The PowerCenter Server supports the following date formats: dd/mm/yyyy, mm/dd/yyyy, yyyy/dd/mm, and yyyy/mm/dd.
- Error Limit. The total number of rejected records that MultiLoad can write to the MultiLoad error tables. Uniqueness violations do not count as rejected records. An error limit of 0 means that there is no limit on the number of rejected rows.
- Checkpoint (default 10,000). The interval between checkpoints. You can set the interval to the following values: 60 or more, MultiLoad performs a checkpoint operation after it processes each multiple of that number of records; 1-59, MultiLoad performs a checkpoint operation at the specified interval, in minutes; 0, MultiLoad does not perform any checkpoint operations during the import task.
- Tenacity (default 10,000). Specifies how long, in hours, MultiLoad tries to log on to the required sessions. If a logon fails, MultiLoad delays for the number of minutes specified in the Sleep attribute, and then retries the logon. MultiLoad keeps trying until the logon succeeds or the number of hours specified in the Tenacity attribute elapses.
- Load Mode (default Upsert). The mode to generate SQL commands: Insert, Delete, Update, Upsert, or Data Driven. When you select Data Driven loading, the PowerCenter Server follows instructions coded in an Update Strategy or Custom transformation within the mapping to determine how to flag rows for insert, delete, or update. The PowerCenter Server writes a column in the target file or named pipe to indicate the update strategy. The control file uses these values to determine how to load data to the target. The PowerCenter Server uses the following values to indicate the update strategy: 0 for Insert, 1 for Update, and 2 for Delete.
- Drop Error Tables (default Enabled). Specifies whether to drop the MultiLoad error tables before beginning the next session. Select this option to drop the tables, or clear it to keep them.
- External Loader Executable (default mload). The name and optional file path of the Teradata external loader executable. If the external loader executable directory is not in the system path, you must enter the file path and filename.
- Max Sessions. The maximum number of MultiLoad sessions per MultiLoad job. Max Sessions must be between 1 and 32,767. Running multiple MultiLoad sessions causes the client and database to use more resources. Therefore, setting this value to a small number may improve performance.
- Sleep. The number of minutes MultiLoad waits before retrying a logon. MultiLoad tries until the logon succeeds or the number of hours specified in the Tenacity attribute elapses. Sleep must be greater than 0. If you specify 0, MultiLoad issues an error message and uses the default value, 6 minutes.
- Is Staged (default Disabled). The method of loading data. Select Is Staged to load data to a flat file staging area before loading to the database. Otherwise, the data is loaded to the database using a named pipe. For more information, see Loading Data Using Named Pipes on page 526 or Staging Data to Flat Files on page 526.
- Error Database (default n/a). The error database name. You can use this attribute to override the default error database name. If you do not specify a database name, the PowerCenter Server uses the target table database.
- Work Table Database (default n/a). The work table database name. You can use this attribute to override the default work table database name. If you do not specify a database name, the PowerCenter Server uses the target table database.
- Log Table Database (default n/a). The log table database name. You can use this attribute to override the default log table database name. If you do not specify a database name, the PowerCenter Server uses the target table database.

Table 20-8 shows the attributes that you configure when you edit a session and override the
Teradata MultiLoad external loader connection object:
Table 20-8. Teradata MultiLoad External Loader Attributes Defined at the Session Level

- Error Table 1 (default n/a). The table name for the first error table. You can use this attribute to override the default error table name. If you do not specify an error table name, the PowerCenter Server uses ET_<target_table_name>.
- Error Table 2 (default n/a). The table name for the second error table. You can use this attribute to override the default error table name. If you do not specify an error table name, the PowerCenter Server uses UV_<target_table_name>.
- Work Table (default n/a). The work table name. You can use this attribute to override the default work table name. If you do not specify a work table name, the PowerCenter Server uses WT_<target_table_name>.
- Log Table (default n/a). The log table name. You can use this attribute to override the default log table name. If you do not specify a log table name, the PowerCenter Server uses ML_<target_table_name>.
- Control File Content Override (default n/a). The control file text. You can use this attribute to override the control file the PowerCenter Server uses when it loads to Teradata. For more information, see Overriding the Control File on page 539.

For more information about these attributes, consult your Teradata documentation.

Teradata TPump External Loader Attributes


You can configure the external loader connection object in the Workflow Manager. You can
also override the external loader connection object attributes when you edit a reusable or non-reusable session.
You can perform insert, update, delete, and upsert operations on targets. You can also use data
driven mode to perform insert, update, or delete operations based on instructions coded in an
Update Strategy or Custom transformation within a mapping.
If you run a session with multiple partitions, you can use a TPump external loader to load the
output files to a Teradata database. You must select a Teradata TPump external loader for each
partition. For information on selecting external loaders, see Configuring External Loading in
a Session on page 553.
Table 20-9 shows the attributes that you configure for the Teradata TPump external loader:
Table 20-9. Teradata TPump External Loader Attributes

- TDPID (default n/a). The Teradata database ID.
- Database Name (default n/a). Optional database name.
- Error Limit. Limits the number of rows rejected for errors. When the error limit is exceeded, TPump rolls back the transaction that causes the last error. An error limit of 0 causes TPump to stop processing after any error.
- Checkpoint (default 15). The number of minutes between checkpoints. You must set the checkpoint to a value between 0 and 60.
- Tenacity. Specifies how long, in hours, TPump tries to log on to the required sessions. If a logon fails, TPump delays for the number of minutes specified in the Sleep attribute, and then retries the logon. TPump keeps trying until the logon succeeds or the number of hours specified in the Tenacity attribute elapses. To disable Tenacity, set the value to 0.
- Load Mode (default Upsert). The mode to generate SQL commands: Insert, Delete, Update, Upsert, or Data Driven. When you select Data Driven loading, the PowerCenter Server follows instructions coded in an Update Strategy or Custom transformation within the session mapping to determine how to flag rows for insert, delete, or update. The PowerCenter Server writes a column in the target file or named pipe to indicate the update strategy. The control file uses these values to determine how to load data to the database. The PowerCenter Server uses the following values to indicate the update strategy: 0 for Insert, 1 for Update, and 2 for Delete.
- Drop Error Tables (default Enabled). Specifies whether to drop the TPump error tables before beginning the next session. Select this option to drop the tables, or clear it to keep them.
- External Loader Executable (default tpump). The name and optional file path of the Teradata external loader executable. If the external loader executable directory is not in the system path, you must enter the file path and filename.
- Max Sessions. The maximum number of TPump sessions per TPump job. Each partition in a session starts its own TPump job. Running multiple TPump sessions causes the client and database to use more resources. Therefore, setting this value to a small number may improve performance.
- Sleep. The number of minutes TPump waits before retrying a logon. TPump tries until the logon succeeds or the number of hours specified in the Tenacity attribute elapses.
- Packing Factor (default 20). The number of rows that each session buffer holds. Packing improves network/channel efficiency by reducing the number of sends and receives between the target flat file and the Teradata database.
- Statement Rate. The initial maximum rate, per minute, at which the TPump executable sends statements to the Teradata database. If you set this attribute to 0, the statement rate is unspecified.
- Serialize (default Disabled). Determines whether or not operations on a given key combination (row) occur serially. You may want to check this option if the TPump job contains multiple changes to one row. Sessions that contain multiple partitions with the same key range but different filter conditions may cause multiple changes to a single row. In this case, you may want to enable Serialize to prevent locking conflicts in the Teradata database, especially if you set the Pack attribute to a value greater than 1. If you select this option, the PowerCenter Server uses the primary key specified in the target table as the Key column. If no primary key exists in the target table, you must either clear this checkbox or indicate the Key column in the data layout section of the control file.
- Robust (default Disabled). When Robust is not selected, it signals TPump to use simple restart logic. In this case, restarts cause TPump to begin at the last checkpoint. TPump reloads any data that was loaded after the checkpoint. This method does not have the extra overhead of the additional database writes in the robust logic.
- No Monitor (default Enabled). When selected, this attribute prevents TPump from checking for statement rate changes from, or update status information for, the TPump monitor application.
- Is Staged (default Disabled). The method of loading data. Select Is Staged to load data to a flat file staging area before loading to the database. Otherwise, the data is loaded to the database using a named pipe. For more information, see Loading Data Using Named Pipes on page 526 or Staging Data to Flat Files on page 526.
- Error Database (default n/a). The error database name. You can use this attribute to override the default error database name. If you do not specify a database name, the PowerCenter Server uses the target table database.
- Log Table Database (default n/a). The log table database name. You can use this attribute to override the default log table database name. If you do not specify a database name, the PowerCenter Server uses the target table database.

Table 20-10 shows the attributes that you configure when you edit a session and override the
Teradata TPump external loader connection object:
Table 20-10. Teradata TPump External Loader Attributes Defined at the Session Level

- Error Table (default n/a). The error table name. You can use this attribute to override the default error table name. If you do not specify an error table name, the PowerCenter Server uses ET_<target_table_name><partition_number>.
- Log Table (default n/a). The log table name. You can use this attribute to override the default log table name. If you do not specify a log table name, the PowerCenter Server uses LT_<target_table_name><partition_number>.
- Control File Content Override (default n/a). The control file text. You can use this attribute to override the control file the PowerCenter Server uses when it loads to Teradata. For more information, see Overriding the Control File on page 539.

For more information about these attributes, consult your Teradata documentation.

Teradata FastLoad External Loader Attributes


You can configure the external loader connection object in the Workflow Manager. You can
also override the external loader connection object attributes when you edit a reusable or non-reusable session.
Use the following guidelines with the FastLoad external loader:

- Each FastLoad job loads data to one Teradata database table. If you want to load data to multiple tables using FastLoad, you must create multiple FastLoad jobs.
- The FastLoad external loader cannot load from multiple output files. If you run a session with multiple partitions, the session fails. For more information about partitioning sessions with external loaders, see Partitioning Sessions with External Loaders on page 526.
- The target table must be empty with no defined secondary indexes.
- FastLoad does not load duplicate rows from the output file to the target table in the Teradata database if the target table has a primary key.
- If you load date values to the target table, you must configure the date format for the column in the target table in the format YYYY-MM-DD.
- You cannot use FastLoad to load binary data.
- You can view the Teradata FastLoad control file in the target directory.
Table 20-11 shows the attributes that you configure for the Teradata FastLoad external loader:
Table 20-11. Teradata FastLoad External Loader Attributes

- TDPID (default n/a). The Teradata database ID.
- Database Name (default n/a). The database name.
- Error Limit (default 1,000,000). The maximum number of rows that FastLoad rejects before it stops loading data to the database table.
- Checkpoint. The number of rows transmitted to the Teradata database between checkpoints. If processing stops while a FastLoad job is running, you can restart the job at the most recent checkpoint. If you enter 0, FastLoad does not perform checkpoint operations.
- Tenacity. The number of hours FastLoad tries to log on to the required FastLoad sessions when the maximum number of load jobs are already running on the Teradata database. When FastLoad tries to log on for a new session, and the Teradata database indicates that the maximum number of load sessions is already running, FastLoad logs off all new sessions that were logged on, delays for the number of minutes specified in the Sleep attribute, and then retries the logon. FastLoad keeps trying until it logs on for the required number of sessions or exceeds the number of hours specified in the Tenacity attribute.
- Drop Error Tables (default Enabled). Specifies whether to drop the FastLoad error tables before beginning the next session. FastLoad will not run if non-empty error tables exist from a prior job. Select this option to drop the tables, or clear it to keep them.
- External Loader Executable (default fastload). The name and optional file path of the Teradata external loader executable. If the external loader executable directory is not in the system path, you must enter the file path and file name.
- Max Sessions. The maximum number of FastLoad sessions per FastLoad job. Max Sessions must be between 1 and the total number of access module processes (AMPs) on your system.
- Sleep. The number of minutes FastLoad pauses before retrying a logon. FastLoad tries until the logon succeeds or the number of hours specified in the Tenacity attribute elapses.
- Truncate Target Table (default Disabled). Specifies whether to truncate the target database table before beginning the FastLoad job. FastLoad cannot load data to non-empty tables.
- Is Staged (default Disabled). The method of loading data. Select Is Staged to load data to a flat file staging area before loading to the database. Otherwise, the data is loaded to the database using a named pipe. For more information, see Loading Data Using Named Pipes on page 526 or Staging Data to Flat Files on page 526.
- Error Database (default n/a). The error database name. You can use this attribute to override the default error database name. If you do not specify a database name, the PowerCenter Server uses the target table database.

Table 20-12 shows the attributes that you configure when you edit a session and override the
Teradata FastLoad external loader connection object:
Table 20-12. Teradata FastLoad External Loader Attributes Defined at the Session Level

- Error Table 1 (default n/a). The table name for the first error table. You can use this attribute to override the default error table name. If you do not specify an error table name, the PowerCenter Server uses ET_<target_table_name>.
- Error Table 2 (default n/a). The table name for the second error table. You can use this attribute to override the default error table name. If you do not specify an error table name, the PowerCenter Server uses UV_<target_table_name>.
- Control File Content Override (default n/a). The control file text. You can use this attribute to override the control file the PowerCenter Server uses when it loads to Teradata. For more information, see Overriding the Control File on page 539.

For more information about these attributes, consult your Teradata documentation.


Teradata Warehouse Builder External Loader Attributes


You can configure the external loader connection object in the Workflow Manager. You can
also override the external loader connection object attributes when you edit a reusable or non-reusable session.
If you run a session with multiple partitions, you can use a Warehouse Builder external loader
to load the output files to a Teradata database. You must select a Teradata Warehouse Builder
external loader for each partition. For information on selecting external loaders, see
Configuring External Loading in a Session on page 553.
Teradata Warehouse Builder uses operators to load data. Operators allow the Teradata
Warehouse Builder to achieve the functionality of FastLoad, MultiLoad, or TPump. When
you use Teradata Warehouse Builder, each operator uses the protocol for a Teradata external
loader.
Table 20-13 shows the operators and protocol for each Teradata Warehouse Builder operator:
Table 20-13. Teradata Warehouse Builder Operators and Protocol

- Load. Uses FastLoad protocol. Load attributes are described in Table 20-14. For more information about how FastLoad works, see Teradata FastLoad External Loader Attributes on page 545.
- Update. Uses MultiLoad protocol. Update attributes are described in Table 20-14. For more information about how MultiLoad works, see Teradata MultiLoad External Loader Attributes on page 540.
- Stream. Uses TPump protocol. Stream attributes are described in Table 20-14. For more information about how TPump works, see Teradata TPump External Loader Attributes on page 542.

Each Teradata Warehouse Builder operator has associated attributes. Not all attributes
available for FastLoad, MultiLoad, and TPump external loaders are available for Teradata
Warehouse Builder.
Table 20-14 shows the attributes that you configure for Teradata Warehouse Builder:
Table 20-14. Teradata Warehouse Builder External Loader Attributes

- TDPID (default n/a). The Teradata database ID.
- Database Name (default n/a). The database name.
- Error Database Name (default n/a). The name of the error database.
- Operator (default Update). The Warehouse Builder operator used to load the data. Choose Load, Update, or Stream.
- Max Instances. The maximum number of parallel instances for the defined operator.
- Error Limit. The maximum number of rows that Warehouse Builder rejects before it stops loading data to the database table.
- Checkpoint. The number of rows transmitted to the Teradata database between checkpoints. If processing stops while a Warehouse Builder job is running, you can restart the job at the most recent checkpoint. If you enter 0, Warehouse Builder does not perform checkpoint operations.
- Tenacity. The number of hours Warehouse Builder tries to log on to the Warehouse Builder sessions when the maximum number of load jobs are already running on the Teradata database. When Warehouse Builder tries to log on for a new session, and the Teradata database indicates that the maximum number of load sessions is already running, Warehouse Builder logs off all new sessions that were logged on, delays for the number of minutes specified in the Sleep attribute, and then retries the logon. Warehouse Builder keeps trying until it logs on for the required number of sessions or exceeds the number of hours specified in the Tenacity attribute. To disable Tenacity, set the value to 0.
- Load Mode (default Upsert). The mode to generate SQL commands. Choose Insert, Update, Upsert, Delete, or Data Driven. When you use the Update or Stream operators, you can choose Data Driven load mode. When you select data driven loading, the PowerCenter Server follows instructions coded in Update Strategy or Custom transformations within the mapping to determine how to flag rows for insert, delete, or update. The PowerCenter Server writes a column in the target file or named pipe to indicate the update strategy. The control file uses these values to determine how to load data to the database. The PowerCenter Server uses the following values to indicate the update strategy: 0 for Insert, 1 for Update, and 2 for Delete.
- Drop Error Tables (default Enabled). Specifies whether to drop the Warehouse Builder error tables before beginning the next session. Warehouse Builder will not run if error tables containing data exist from a prior job. Clear the option to keep error tables.
- Truncate Target Table (default Disabled). Specifies whether to truncate target tables. Enable this option to truncate the target database table before beginning the Warehouse Builder job.
- External Loader Executable (default tbuild). The name and optional file path of the Teradata external loader executable file. If the external loader directory is not in the system path, enter the file path and file name.
- Max Sessions. The maximum number of Warehouse Builder sessions per Warehouse Builder job. Max Sessions must be between 1 and the total number of access module processes (AMPs) on your system.
- Sleep. The number of minutes Warehouse Builder pauses before retrying a logon. Warehouse Builder tries until the logon succeeds or the number of hours specified in the Tenacity attribute elapses.
- Serialize (default Disabled). Specifies whether operations on a column occur serially. Enabled with Update and Stream operators only.
- Packing Factor (default 20). The number of rows that each session buffer holds. Packing improves network/channel efficiency by reducing the number of sends and receives between the target file and the Teradata database. Enabled with Stream operator only.
- Robust (default Disabled). The recovery or restart mode. When you disable Robust, the Stream operator uses simple restart logic. The Stream operator reloads any data that was loaded after the last checkpoint. When you enable Robust, Warehouse Builder uses robust restart logic. In robust mode, the Stream operator determines how many rows were processed since the last checkpoint. The Stream operator processes all the rows that were not processed after the last checkpoint. Enabled with Stream operator only.
- Is Staged (default Disabled). The method of loading data. Select Is Staged to load data to a flat file staging area before loading to the database. Otherwise, the data is loaded to the database using a named pipe. For more information, see Loading Data Using Named Pipes on page 526 or Staging Data to Flat Files on page 526.
- Error Database (default n/a). The error database name. You can use this attribute to override the default error database name. If you do not specify a database name, the PowerCenter Server uses the target table database.
- Work Table Database (default n/a). The work table database name. You can use this attribute to override the default work table database name. If you do not specify a database name, the PowerCenter Server uses the target table database.
- Log Table Database (default n/a). The log table database name. You can use this attribute to override the default log table database name. If you do not specify a database name, the PowerCenter Server uses the target table database.

Note: Valid attributes depend upon the operator you select.

Table 20-15 shows the attributes that you configure when you edit a session and override the Teradata Warehouse Builder external loader connection object:
Table 20-15. Teradata Warehouse Builder External Loader Attributes Defined at the Session Level

- Error Table 1 (default n/a). The table name for the first error table. You can use this attribute to override the default error table name. If you do not specify an error table name, the PowerCenter Server uses ET_<target_table_name>.
- Error Table 2 (default n/a). The table name for the second error table. You can use this attribute to override the default error table name. If you do not specify an error table name, the PowerCenter Server uses UV_<target_table_name>.
- Work Table (default n/a). The work table name. You can use this attribute to override the default work table name. If you do not specify a work table name, the PowerCenter Server uses WT_<target_table_name>.
- Log Table (default n/a). The log table name. You can use this attribute to override the default log table name. If you do not specify a log table name, the PowerCenter Server uses RL_<target_table_name>.
- Control File Content Override (default n/a). The control file text. You can use this attribute to override the control file the PowerCenter Server uses when it loads to Teradata. For more information, see Overriding the Control File on page 539.

Note: Valid attributes depend upon the operator you select.

For more information about these attributes, consult your Teradata documentation.


Creating an External Loader Connection


The PowerCenter Server uses external loader attributes to create an external loader
connection. You enter external loader attributes in the Workflow Manager when you create an
external loader connection.
When you configure external loader settings, you may need to consult your DB2, Oracle SQL
Loader, Sybase IQ, or Teradata documentation for details.
Tip: If you edit an external loader connection, all sessions using the connection use the updated connection.
To create an external loader connection:
1. In the Workflow Manager, choose Connections-Loader. The Loader Connection Browser dialog box appears.
2. Click New.
3. Select an external loader type, and then click OK.
4. Enter a name for the external loader connection.
5. Enter the database user name, password, and connect string. Enter the PmNullUser user name and PmNullPasswd if you use Oracle OS Authentication. PowerCenter uses Oracle OS Authentication when the connection user name is PmNullUser and the connection is with an Oracle database.
   Note: When you use Teradata, you can enter PmNullPasswd as the database password to prevent the password from appearing in the control file. When you do this, the PowerCenter Server writes an empty string for the password in the control file.
6. Enter the necessary loader attributes.
7. Click OK.
8. To create additional connections, repeat steps 3-7, and then click Close to save your changes.


Configuring External Loading in a Session


Before using an external loader in a session, you must first configure the necessary
connections. For more details, see Creating an External Loader Connection on page 551.
To use an external loader during a session, perform the following steps:
1. Configure the session to write to a file.
2. Configure the file properties.
3. Select the external loader connection.

Configuring a Session to Write to a File


When you want to use an external loader to write to a database, create the target definition in
the mapping according to the target database type. The session configures a relational target
type by default. To select an external loader connection, you must configure the session to
write to a file instead of a relational target. To do this, you must change the writer type from
Relational Writer to File Writer. You change the writer type using the Writers settings on the
Mappings tab.
Figure 20-2 shows the Writers settings on the Mapping tab:
Figure 20-2. Writers Settings on the Mapping Tab



To change the writer type for the target, select the target instance in the Instances list. Change
the writer type from Relational Writer to File Writer.

Configuring File Properties


After you configure the session to write to a file, you can set the file properties. You need to
specify the output file name and directory, as well as the reject file name and directory. You
configure these properties using the Properties settings on the Mapping tab.
Figure 20-3 shows the Properties settings on the Mapping tab:
Figure 20-3. Properties Settings on the Mapping Tab


To set the file properties, select the target instance in the Instances list.


Table 20-16 shows the attributes in Properties settings:


Table 20-16. Properties Settings

- Output File Directory. Enter the directory name in this field. By default, the PowerCenter Server writes output files to the directory $PMTargetFileDir. If you enter a full directory and file name in the Output Filename field, clear this field. External loader sessions may fail if you use double spaces in the path for the output file.
- Output Filename. Enter the file name, or file name and path. By default, the Workflow Manager names the target file based on the target definition used in the mapping: target_name.out. External loader sessions may fail if you use double spaces in the path for the output file.
- Reject File Directory. By default, the PowerCenter Server writes all reject files to the directory $PMBadFileDir. If you enter a full directory and file name in the Reject Filename field, clear this field.
- Reject Filename. Enter the file name, or file name and directory. The PowerCenter Server appends information in this field to that entered in the Reject File Directory field. For example, if you have C:/reject_file/ in the Reject File Directory field, and enter filename.bad in the Reject Filename field, the PowerCenter Server writes rejected rows to C:/reject_file/filename.bad. By default, the PowerCenter Server names the reject file after the target instance name: target_name.bad. You can also enter a reject file session parameter to represent the reject file or the reject file and directory. Name all reject file parameters $BadFileName. For details on session parameters, see Session Parameters on page 495.
- Set File Properties. Opens a dialog box that allows you to define flat file properties. When you use an external loader, you must define the flat file properties by clicking the Set File Properties button. For Oracle external loaders, the target flat file can be fixed-width or delimited. For Sybase IQ external loaders, the target flat file can be fixed-width or delimited. For Teradata external loaders, the target flat file must be fixed-width. For DB2 external loaders, the target flat file must be delimited. For more information, see Configuring Fixed-Width Properties on page 265 and Configuring Delimited Properties on page 266.

Note: Do not select Merge Partitioned Files or enter a merge file name. You cannot merge partitioned output files when you use an external loader.

Selecting an External Loader Connection


After you configure file properties, you are ready to select the external loader connection. To
do this, you must choose the connection type and the connection object. You configure
connection options using the Connections settings on the Mappings tab.


Figure 20-4 shows the Connections settings on the Mapping tab:


Figure 20-4. Connections Settings on the Mapping Tab


To select an external loader connection:


1. On the Mapping tab, select the target instance in the Navigator.
2. Select the Loader connection type.
3. Click the Open button in the Value field to select the correct external loader connection object.
4. Choose an external loader connection object, and then click OK.
5. Click OK to save your changes.

If the session contains multiple partitions, and you choose a loader that can load from
multiple output files, you can select a different connection for each partition, but each
connection must be of the same type. For example, you can select different Teradata TPump
external loader connections for each partition, but you cannot select a Teradata TPump
connection for one partition and an Oracle connection for another partition.
If the session contains multiple partitions, and you choose a loader that can load from only
one output file, the session fails. For more information about running external loader sessions
with multiple partitions, see Partitioning Sessions with External Loaders on page 526.


Troubleshooting
I am trying to set up a session to load data to an external loader, but I cannot select an
external loader connection in the session properties.
Check your mapping to make sure you did not configure it to load to a flat file target. In
order to use an external loader, you must configure the mapping with a DB2, Oracle, Sybase
IQ, or Teradata relational target. When you create the session, select a file writer in the
Writers settings of the Mapping tab in the session properties. Then open the Connections
settings and select an external loader connection.
I am trying to run a session that uses TPump, but the session fails. The session log displays
an error saying that the Teradata output file name is too long.
The PowerCenter Server uses the Teradata output file name to generate names for the TPump
error and log files, as well as the log table name. To do this, the PowerCenter Server adds a
prefix of several characters to the output file name. It adds three characters for sessions with
one partition and five characters for sessions with multiple partitions.
Teradata allows log table names of up to 30 characters. Because the PowerCenter Server adds a
prefix, if you are running a session with a single partition, specify a target output file name
with a maximum of 27 characters, including the file extension. If you are running a session
with multiple partitions, specify a target output file name with a maximum of 25 characters,
including the file extension.
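To make the arithmetic concrete: a target output file named t_customer_summary.out is 22 characters long, so the generated names stay within Teradata's 30-character limit whether the session has one partition (22 + 3 = 25 characters) or several (22 + 5 = 27 characters). A 27-character name, by contrast, works only for a single-partition session (27 + 3 = 30) and exceeds the limit once the five-character prefix for multiple partitions is added (27 + 5 = 32).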
I tried to load data to Teradata using TPump, but the session failed. I corrected the error,
but the session still fails.
Occasionally, Teradata does not drop the log table when you rerun the session. Check the
Teradata database, and manually drop the log table if it exists. Then rerun the session.
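If you need to drop the leftover table by hand, a statement along the following lines works from any Teradata SQL client. The database and table names here are hypothetical; confirm the actual log table name (by default LT_<target_table_name><partition_number>) in the session log before dropping it:

   DROP TABLE mydb.LT_t_customer_summary1;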



Chapter 21

Using FTP
This chapter covers the following topics:

- Overview, 560
- Creating an FTP Connection, 561
- Creating an FTP Session, 565

Overview
The PowerCenter Server can use File Transfer Protocol (FTP) to access source and target files.
With both source and target files, you can use FTP to transfer the files directly to the
PowerCenter Server or stage them on a local directory.
You can also stage files by creating a pre-session shell command to move the files local to the
PowerCenter Server. Accessing files directly with FTP generally provides better session
performance than using FTP to stage the files. However, you may want to stage FTP files to
keep a local archive.
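For example, one way to stage a source file with a pre-session shell command is to call a small script such as the following sketch; the host, login, and paths are hypothetical, and the local directory must match the source file directory configured in the session:

   #!/bin/sh
   # Hypothetical pre-session staging script: pull the source file from the FTP host
   # into the local source file directory before the session reads it.
   ftp -n ftp.mycompany.com <<END_FTP
   user pc_loader pc_password
   cd /outbound
   get orders.dat /informatica/SrcFiles/orders.dat
   quit
   END_FTP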
Before creating an FTP session, you must configure the FTP connection in the Workflow
Manager. For details, see Creating an FTP Connection on page 561.
When using FTP file sources and targets in a session, you should know the following
information:

- FTP connection name
- Remote file name and exact path
- Whether you want to stage the files

Mainframe Notes
Due to mainframe restrictions, the following constraints apply when using FTP with
mainframe machines:

- You cannot execute sessions concurrently if the sessions use the same FTP source file or target file located on a mainframe.
- If you abort a workflow containing a session with a staged FTP source or target from a mainframe, you may need to wait for the connection to time out before you can run the workflow again.


Creating an FTP Connection


The PowerCenter Server can access source and target files on remote machines using FTP. The
PowerCenter Server can use FTP to access any machine to which the PowerCenter Server can
connect.
Before you create a session using FTP, you must configure the FTP connection in the
Workflow Manager.
You must know the following information when you create an FTP connection:

- Connection name. The connection name used by the Workflow Manager.
- Host name. The name or IP address of the remote machine. Optionally, you can specify a port number between 1 and 65535, inclusive. If you do not specify a port number, the PowerCenter Server uses port 21 by default. Use the following syntax for specifying a host name: hostname:port-number or IP address:port-number (see the example after this list). When you specify a port number, enable that port number for FTP on the host machine.
- Default remote directory. The directory you want the PowerCenter Server to use by default. In the session, when you enter a file name without a directory, the PowerCenter Server appends the file name to this directory. Therefore, this path must be exact and contain the appropriate trailing delimiters. For example, if you enter c:/data/ and in the session specify the file FILENAME, the PowerCenter Server reads the path and file name as c:\data\FILENAME. If you enter the wrong delimiter for an FTP directory, the Workflow Manager does not correct it. If the FTP host is a mainframe machine, the directory must begin with a single quote and end with the period delimiter, such as: 'defaultdir. You can override this option in the session properties.
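For example, a Host name entry of ftp.mycompany.com:2121 (a hypothetical host) directs the PowerCenter Server to connect on port 2121 instead of the default port 21.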

Depending on the remote machine you access, you might also need to enter the user name
and password. The password must be in 7-bit ASCII only. As with database connections, if
you edit an FTP connection, all sessions using the FTP connection use the updated
connection.

FTP Permissions
If you enable enhanced security, you can set FTP connection permissions in the Workflow
Manager. The Workflow Manager assigns Owner permissions to the user who registers the
connection. The Workflow Manager grants Owner Group permissions to the first group in
the Group Memberships list of the owner. You can manage FTP connection permissions if
you are the owner of the connection or if you have Super User privileges.
A registered FTP connection does not appear in the list of FTP connections if you do not
have at least read permission for the connection. If you want to edit a connection, you must have read and write permissions for the connection. If you want to run sessions that use a source or target FTP connection, you must have execute permission for the connection.
To create an FTP connection, you must have one of the following privileges:

- Use Workflow Manager
- Super User

Steps for Creating an FTP Connection


Perform the following steps to create an FTP connection.
To create an FTP connection:

1. In the Workflow Manager, connect to a repository.
2. Choose Connections-FTP. The FTP Object Browser appears.
3. Click New.
4. Enter the connection information in Table 21-1:


Table 21-1. FTP Options

- Name (Required). Connection name used by the Workflow Manager.
- User Name (Optional). User name necessary to access the host machine.
- Password (Optional). Password for the user name. Must be in 7-bit ASCII only.
- Host Name (Required). Host name or dotted IP address of the FTP connection. Optionally, you can specify a port number between 1 and 65535, inclusive. If you do not specify a port number, the PowerCenter Server uses 21 by default. Use the following syntax for specifying the host name: hostname:port-number or IP address:port-number. When you specify a port number, enable that port number for FTP on the host machine.
- Default Remote Directory (Required). Enter a valid FTP directory on the host machine. Do not enclose the default remote directory in quotation marks. The default directory name must be exact and include a trailing delimiter. Note: Depending on the FTP server you use, you may have limited options for entering FTP directories. Please see your FTP server documentation for details.


5. Click OK.
6. Repeat steps 3-5 for any other necessary FTP connection, then click Close.


Creating an FTP Session


After defining FTP connections in the Workflow Manager, you can create sessions using FTP
file sources and targets. You can use any mapping with the flat file sources or targets.
The steps to create FTP sessions vary for source and target files. You can use FTP to access
both source and target files in a session.
To create a session using FTP sources and targets, you must have one of the following sets of
privileges and permissions:

- Use Workflow Manager privilege with folder read and write permissions
- Super User privilege

You must have read permission for FTP connections you want to associate with the session in
addition to the privileges and permissions listed above.

FTP File Sources


Use FTP to access source files from any machine on your network, including mainframes.
To create a session using FTP source files:
1. In the Workflow Manager, open the session properties.
2. In the Connections settings on the Mapping tab, select FTP for Type.


3. Click the Open button in the Value field to select an FTP connection.
4. Click Override and enter the remote file name.
   If you enter a file name without a leading slash or drive letter, the PowerCenter Server appends the file name to the Default Remote Directory path entered in the FTP Connection dialog box. For example, if your default remote directory is c:/data/, and you enter a remote file name of FILENAME, the PowerCenter Server connects to the FTP host and looks for c:/data/FILENAME.
   If you enter a fully qualified file name in the Remote Filename field, the PowerCenter Server uses the named path rather than the path entered in the Default Remote Directory.


   If you enter a mainframe file name for a source file in the default directory, make sure you enter the closing quote. For example, if your default remote directory is:
   'defaultdir.
   To access the file, FILENAME, from the default mainframe directory, enter the following in the Remote Filename field:
   filename'
   When the PowerCenter Server begins the session, it connects to the mainframe host and looks for:
   'defaultdir.filename'
   In contrast, if you want to use a file in a different directory, you must enter that directory and file name in the Remote Filename field, like this:
   'overridedir.filename'
   Note: Depending on the FTP server you use, you may have limited options for entering FTP directories. Please see your FTP server documentation for details.
5. To store the file in a directory local to the PowerCenter Server, select Is Staged.
   When you select this option for a source file, the PowerCenter Server moves the source file from the FTP host to a local directory before the session begins, then uses the local file during the session. If the staged file exists, the PowerCenter Server truncates the staged file before running the session.
   The location of the local file differs depending on the information entered in the Properties settings of the Sources tab:


   If you have an individual path and file name listed in the Source Filename field, the PowerCenter Server uses that path as the local directory, and names the staged local file after the listed file. For example, if the Source Filename field contains the path c:/data/sales_info, the PowerCenter Server connects to the FTP host, then moves the file to c:/data, and names the file sales_info.
   If the Source Filename field contains only a file name (and no path), the PowerCenter Server names the file as defined in the Source Filename field, and places the file in the directory listed in the Source file directory field. If the directory is not specified, the PowerCenter Server stages the file in the directory where the PowerCenter Server runs on UNIX or in the Windows system directory.
   If you do not stage the source file, the PowerCenter Server accesses the data directly from the FTP host.
6. Repeat steps 3-5 for each FTP source and target in the session, then click OK.
7. Configure the rest of the session, then click OK.

FTP File Targets


You can use FTP to transfer target files to any machine to which the PowerCenter Server can
connect.
To create a session using FTP target files:
1. In the Workflow Manager, open the session properties.
2. In the Connections settings on the Mapping tab, select FTP for Type.

3. Click the Open button in the Value field to select an FTP connection.
4. Click Override and enter the remote file name.


   If you enter a file name without a leading slash or drive letter, the PowerCenter Server appends the file name to the Default Remote Directory path entered in the FTP Connection dialog box. For example, if your default remote directory is c:/data/, and you enter a remote file name of FILENAME, the PowerCenter Server connects to the FTP host and looks for c:/data/FILENAME.
   If you enter a fully qualified file name, the PowerCenter Server uses the named path rather than the path entered in the Default Remote Directory. Do not enclose the fully qualified file name in single or double quotation marks. The session may fail if you enclose the fully qualified file name in quotation marks.
   When you transfer a target file to a mainframe host, make sure you enter the opening quote. For example, if your default remote directory is 'defaultdir., you enter the following in the default remote directory field:
   'defaultdir.
   Note: Depending on the FTP server you use, you may have limited options for entering FTP directories. Please see your FTP server documentation for details.
5. To store the target file in a directory on the machine where the PowerCenter Server runs, select Is Staged.
   When you select this option, the PowerCenter Server writes to the local target file during the session, then moves the file to the FTP host after the session is complete. The location of the local file differs depending on the information entered in the Properties settings of the Mapping tab:


   If you have an individual path and file name listed in the Output Filename field, the PowerCenter Server uses that path as the local directory, and names the staged local file after the listed file. For example, if the Output Filename field contains the path c:/data/t_company_all.out, the PowerCenter Server connects to the FTP host, then moves the file to c:/data, and names the file t_company_all.out.
   If the Output Filename field contains only a file name (and no path), the PowerCenter Server names the file as defined in the Output Filename field, and places the file in the directory listed in the Output file directory field. If the directory is not specified, the PowerCenter Server stages the file in the directory where the PowerCenter Server runs on UNIX or in the system directory on Windows.
   If you do not stage the file, the PowerCenter Server accesses the data directly from the FTP host. The local file and directory are not used.
   Select the Merge Partitioned Files option and specify the merge file name and directory when you partition your target. For more information, see Partitioning File Targets on page 380.
6. Repeat steps 3-5 for each FTP target in the session, and then click OK.
7. Configure the rest of the session, and then click OK.



Chapter 22

Using Incremental Aggregation
This chapter covers the following topics:

- Overview, 574
- PowerCenter Server Processing for Incremental Aggregation, 575
- Reinitializing the Aggregate Files, 576
- Moving or Deleting the Aggregate Files, 577
- Partitioning Guidelines with Incremental Aggregation, 578
- Preparing for Incremental Aggregation, 579

Overview
When using incremental aggregation, you apply captured changes in the source to aggregate
calculations in a session. If the source changes only incrementally and you can capture
changes, you can configure the session to process only those changes. This allows the
PowerCenter Server to update your target incrementally, rather than forcing it to process the
entire source and recalculate the same data each time you run the session.
For example, you might have a session using a source that receives new data every day. You
can capture those incremental changes because you have added a filter condition to the
mapping that removes pre-existing data from the flow of data. You then enable incremental
aggregation.
When the session runs with incremental aggregation enabled for the first time on March 1,
you use the entire source. This allows the PowerCenter Server to read and store the necessary
aggregate data. On March 2, when you run the session again, you filter out all the records
except those time-stamped March 2. The PowerCenter Server then processes only the new
data and updates the target accordingly.
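As one hedged sketch of such a filter (the column name and the mapping variable are hypothetical), the Filter transformation condition can compare a timestamp column against the session start time, or against a mapping variable that records the last run:

   DATE_ENTERED >= TRUNC(SESSSTARTTIME)

or, using a mapping variable:

   DATE_ENTERED > $$LastRunDate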
Consider using incremental aggregation in the following circumstances:

- You can capture new source data. Use incremental aggregation when you can capture new source data each time you run the session. Use a Stored Procedure or Filter transformation to process only new data.
- Incremental changes do not significantly change the target. Use incremental aggregation when the changes do not significantly change the target. If processing the incrementally changed source alters more than half the existing target, the session may not benefit from using incremental aggregation. In this case, drop the table and re-create the target with complete source data.

Note: Do not use incremental aggregation if your mapping contains percentile or median functions. The PowerCenter Server uses system memory to process Percentile and Median functions in addition to the cache memory you configure in the session property sheet. As a result, the PowerCenter Server does not store incremental aggregation values for Percentile and Median functions in disk caches.


PowerCenter Server Processing for Incremental Aggregation
The first time you run an incremental aggregation session, the PowerCenter Server processes
the entire source. At the end of the session, the PowerCenter Server stores aggregate data from
that session run in two files, the index file and the data file. The PowerCenter Server creates
the files in a local directory.
Each subsequent time you run the session with incremental aggregation, you use only the
incremental source changes in the session.
For each input record, the PowerCenter Server checks historical information in the index file
for a corresponding group. If it finds a corresponding group, the PowerCenter Server
performs the aggregate operation incrementally, using the aggregate data for that group, and
saves the incremental change. If it does not find a corresponding group, the PowerCenter
Server creates a new group and saves the record data.
When writing to the target, the PowerCenter Server applies the changes to the existing target.
It saves modified aggregate data in the index and data files to be used as historical data the
next time you run the session.
If the source changes significantly, and you want the PowerCenter Server to continue saving
aggregate data for future incremental changes, configure the PowerCenter Server to overwrite
existing aggregate data with new aggregate data. For details, see Reinitializing the Aggregate
Files on page 576.
When you partition a session that uses incremental aggregation, the PowerCenter Server
creates one set of cache files for each partition.
The PowerCenter Server creates new aggregate data, instead of using historical data, when you
perform one of the following tasks:

- Save a new version of the mapping.
- Configure the session to reinitialize the aggregate cache.
- Move the aggregate files without correcting the configured path or directory for the files in the session property sheet.
- Change the configured path or directory for the aggregate files without moving the files to the new location.
- Delete cache files.
- Decrease the number of partitions.
Note: When the PowerCenter Server rebuilds incremental aggregation files, the data in the previous files is lost.


Reinitializing the Aggregate Files


If the source tables change significantly, you might want to run the session with the entire
source data. To do this, you can configure the session to reinitialize the aggregate cache.
For example, you can reinitialize the aggregate cache if the source for a session changes
incrementally every day and completely changes once a month. When you receive the new
monthly source, you might configure the session to reinitialize the aggregate cache, truncate
the existing target, and use the new source table during the session.
After you run a session that reinitializes the aggregate cache, edit the session properties to
disable the Reinitialize Aggregate Cache option. If you do not clear Reinitialize Aggregate
Cache, the PowerCenter Server overwrites the aggregate cache each time you run the session.
Note: When you move from Windows to UNIX, you must reinitialize the cache. Therefore, you cannot change from a Latin1 code page to an MSLatin1 code page, even though these code pages are compatible.


Moving or Deleting the Aggregate Files


Once you run an incremental aggregation session, avoid moving or modifying the index and
data files that store historical aggregate information.
If you do move the files into a different directory, and you want the PowerCenter Server to
use the aggregate files, you must also change the path to those files in the session properties.
Likewise, if you change the path to the files but do not move the files, the PowerCenter Server rebuilds the files the next time you run the session.
If you change certain session or server properties, the PowerCenter Server cannot use the
incremental aggregation files, and it fails the session. To avoid session failure, delete existing
incremental aggregation files when you perform any of the following tasks:

- Change the PowerCenter Server data movement mode from ASCII to Unicode or from Unicode to ASCII.
- Change the PowerCenter Server code page to an incompatible code page.
- Change the session sort order when the PowerCenter Server runs in Unicode mode.
- Change the Enable High Precision session option.

Finding Index and Data Files


By default, the PowerCenter Server stores the index and data files in the directory entered in
the server variable, $PMCacheDir, in the Workflow Manager. The PowerCenter Server names
the index file PMAGG*.idx. The PowerCenter Server names the data file PMAGG*.dat.
If you run the session using Verbose Init mode, the PowerCenter Server writes the file names
in the session log. To locate the files, look in the previous session log for the TE_7034 and
TE_7035 messages that indicate the cache file name and location. The following messages
show sample entries in the session log:
MAPPING> TE_7034 Aggregate Information: Index file is
[D:\Informatica\InformaticaServer\Cache\PMAGG8_4_2.idx]
MAPPING> TE_7035 Aggregate Information: Data file is
[D:\Informatica\InformaticaServer\Cache\PMAGG8_4_2.dat]

If you do not run the session using Verbose Init mode or use an identifiable transformation
naming convention, you may have difficulty determining which files belong to each session.
For more information about cache file storage and naming conventions, see Cache Files on
page 615.
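If you prefer to look for the files directly on the file system, you can also list them in the cache directory from a shell. The following is a minimal sketch for UNIX; it assumes that a shell variable PMCacheDir points to the same directory configured for the $PMCacheDir server variable in the Workflow Manager, which may not be true on your system.

# List incremental aggregation index and data files in the cache directory
ls -l $PMCacheDir/PMAGG*.idx $PMCacheDir/PMAGG*.dat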

Partitioning Guidelines with Incremental Aggregation


When you use incremental aggregation in a session with multiple partitions, the PowerCenter
Server creates one set of cache files for each partition.
Use the following guidelines when you change the number of partitions or the cache
directory:

- Change the cache directory for a partition. If you change the directory for a partition and you want the PowerCenter Server to reuse the cache files, you must move the cache files for the partition associated with the changed directory.
  - If you change the directory for the first partition, and you do not move the cache files, the PowerCenter Server rebuilds the cache files for all partitions.
  - If you change the directory for partitions 2-n, and you do not move the cache files, the PowerCenter Server rebuilds the cache files that it cannot locate.
- Decrease the number of partitions. If you delete a partition, and you want the PowerCenter Server to reuse the cache files, you must move the cache files for the deleted partition to the directory configured for the first partition. If you do not move the files to the directory of the first partition, the PowerCenter Server rebuilds the cache files that it cannot locate.
  Note: If you increase the number of partitions, the PowerCenter Server realigns the index and data cache files the next time you run a session. It does not need to rebuild the files.
- Move cache files. If you move cache files for a partition and you want the PowerCenter Server to reuse the files, you must also change the partition directory. If you do not change the directory, the PowerCenter Server rebuilds the files the next time you run a session.
- Delete cache files. If you delete cache files, the PowerCenter Server rebuilds them the next time you run a session.

If you change the number of partitions and the cache directory, you may need to move cache
files for both. For example, if you change the cache directory for the first partition, and you
decrease the number of partitions, you need to move the cache files for the deleted partition as
well as the cache files for the partition associated with the changed directory.

Preparing for Incremental Aggregation


When you use incremental aggregation, you need to configure both mapping and session properties:
- Implement mapping logic or filter to remove pre-existing data.
- Configure the session for incremental aggregation and verify that the file directory has enough disk space for the aggregate files.

Configuring the Mapping


Before enabling incremental aggregation, you must capture changes in source data. You might do this by:
- Using a filter in the mapping. You may be able to remove pre-existing source data during a session with a filter.
- Using a stored procedure. You may be able to remove pre-existing source data at the source database with a pre-load stored procedure.

Configuring the Session


Use the following guidelines when you configure the session for incremental aggregation:
- Verify the location where you want to store the aggregate files. The index and data files grow in proportion to the source data. When denoting the directory for those files, be sure the directory has enough disk space to store historical data for the session.
  When you run multiple sessions with incremental aggregation, decide where you want the files stored. Then enter the appropriate directory for the server variable, $PMCacheDir, in the Workflow Manager. You can enter session-specific directories for the index and data files. However, by using the server variable for all sessions using incremental aggregation, you can easily change the cache directory when necessary by changing $PMCacheDir.
  Changing the cache directory without moving the files causes the PowerCenter Server to reinitialize the aggregate cache and gather new aggregate data.
  In a server grid, PowerCenter Servers rebuild incremental aggregation files they cannot find. When a PowerCenter Server rebuilds incremental aggregation files, it loses aggregate history. For more information about methods to save aggregate history in a server grid, see Running Sessions with Cache Files on page 445.
- Configure the session to write file names in the session log. If you want the PowerCenter Server to write the incremental aggregation cache file names in the session log, configure the session with Verbose Init tracing. You can override tracing in the Error Handling settings on the Config Object tab.
- Verify the incremental aggregation settings in the session properties. You can configure the session for incremental aggregation in the Performance settings on the Properties tab.

You can also configure the session to reinitialize the aggregate cache. If you choose to reinitialize the cache, the Workflow Manager displays a warning indicating that the PowerCenter Server overwrites the existing cache, and a reminder to clear this option after running the session.
Figure 22-1 shows the Performance settings on the Properties tab where you configure incremental aggregation options:
Figure 22-1. Incremental Aggregation Session Properties

Note: You cannot use incremental aggregation when the mapping includes an Aggregator transformation with Transaction transformation scope. The Workflow Manager marks the session invalid.

Chapter 23

Using pmcmd
This chapter covers the following topics:

- Overview, 582
- Configuring Environment Variables, 585
- Using the Command Line Mode, 589
- Using the Interactive Mode, 592
- pmcmd Reference, 594

Overview
pmcmd is a program that you can use to communicate with the PowerCenter Server. You can perform some of the tasks that you can also perform in the Workflow Manager, such as starting and stopping workflows and tasks.
You can use pmcmd in the following modes:
- Command line mode. The command line syntax allows you to write scripts for scheduling workflows. Each command you write in the command line mode must include connection information to the PowerCenter Server.
- Interactive mode. You establish and maintain an active connection to the PowerCenter Server. This allows you to issue a series of commands.

You can use repository user names and passwords as environment variables with pmcmd. You
can also customize the way pmcmd displays the date and time on the machine running the
PowerCenter Server. Before you use pmcmd, configure these variables on the PowerCenter
Server. For more information, see Configuring Environment Variables on page 585.
Note: To issue the shutdownserver command, you must have the Super User privilege or Administer Server privilege.


Table 23-1 provides a description for the pmcmd commands. For details on command syntax
and usage, see pmcmd Reference on page 594.
Table 23-1. pmcmd Commands

aborttask (Command line, Interactive) - Aborts a task. Issue this command only after the PowerCenter Server fails to stop the task when you issue the stoptask command. For more information, see Aborttask on page 596.

abortworkflow (Command line, Interactive) - Aborts a workflow. Issue this command only after the PowerCenter Server fails to stop the workflow when you issue the stopworkflow command. For more information, see Abortworkflow on page 597.

connect (Interactive) - Connects to the PowerCenter Server in the interactive mode. Use this command in conjunction with connection information. For more information, see Connect on page 597.

disconnect (Interactive) - Disconnects from the PowerCenter Server in the interactive mode. For more information, see Disconnect on page 598.

exit (Interactive) - Exits from pmcmd in the interactive mode. For more information, see Exit on page 598.

getrunningsessionsdetails (Command line, Interactive) - Displays details for sessions currently running on a PowerCenter Server, including information for the folder, workflow, and session instance. Displays session status and statistics on each target table and source qualifier. For more information, see Getrunningsessionsdetails on page 598.

getserverdetails (Command line, Interactive) - Displays details for the PowerCenter Server, including server status, information on active workflows, and timestamp information. In a server grid, this command displays the PowerCenter Server that runs each task instance. For more information, see Getserverdetails on page 599.

getserverproperties (Command line, Interactive) - Displays the PowerCenter Server name, type, and version. It returns the timestamp on the PowerCenter Server and the name of the repository. It also indicates the data movement mode and whether the PowerCenter Server can debug mappings. For more information, see Getserverproperties on page 599.

getsessionstatistics (Command line, Interactive) - Displays session details, including information for the folder, workflow, and task instance. Displays session status and statistics on each target table and source qualifier. In a server grid, this command displays the PowerCenter Server that runs each task instance. For more information, see Getsessionstatistics on page 600.

gettaskdetails (Command line, Interactive) - Displays details for a task, including folder and workflow name. Also displays the task status and run mode. In a server grid, this command displays the PowerCenter Server that runs each task instance. For more information, see Gettaskdetails on page 601.

getworkflowdetails (Command line, Interactive) - Displays details for a workflow, including workflow name, status, and run mode. Also displays information on when the workflow was last executed. For more information, see Getworkflowdetails on page 601.

help (Command line, Interactive) - Displays a list of pmcmd commands and syntax. For more information, see Help on page 602.

pingserver (Command line, Interactive) - Determines whether the PowerCenter Server is running. For more information, see Pingserver on page 602.

quit (Interactive) - Quits from pmcmd in the interactive mode. For more information, see Quit on page 602.

resumeworkflow (Command line, Interactive) - Resumes a suspended workflow. For more information, see Resumeworkflow on page 603.

resumeworklet (Command line, Interactive) - Resumes a suspended worklet. For more information, see Resumeworklet on page 603.

scheduleworkflow (Command line, Interactive) - Instructs the PowerCenter Server to schedule a workflow. Use this command to manually reschedule a workflow that has been removed from the schedule. For more information, see Scheduleworkflow on page 604.

setfolder (Interactive) - Designates a folder as the default folder in which to execute all subsequent commands. For more information, see Setfolder on page 604.

setnowait (Interactive) - Instructs the PowerCenter Server to execute subsequent commands in the nowait mode. In the nowait mode, you can enter a new pmcmd command after the PowerCenter Server receives the previous command. For more information, see Setnowait on page 605.

setwait (Interactive) - Instructs the PowerCenter Server to execute subsequent commands in the wait mode. In the wait mode, you can enter a new pmcmd command only after the PowerCenter Server completes the previous command. For more information, see Setwait on page 605.

showsettings (Interactive) - Displays the settings for the interactive mode, including PowerCenter Server and repository name, username, wait mode, and default folder. For more information, see Showsettings on page 605.

shutdownserver (Command line, Interactive) - Shuts down the PowerCenter Server. Use this command in conjunction with a shutdown mode option. For more information, see Shutdownserver on page 605.

starttask (Command line, Interactive) - Starts a task. Use this command in conjunction with a task name. For more information, see Starttask on page 606.

startworkflow (Command line, Interactive) - Starts a workflow. Use this command in conjunction with a workflow name. For more information, see Startworkflow on page 607.

stoptask (Command line, Interactive) - Stops a task. Use this command in conjunction with a task name. For more information, see Stoptask on page 609.

stopworkflow (Command line, Interactive) - Stops a workflow. Use this command in conjunction with a workflow name. For more information, see Stopworkflow on page 609.

unscheduleworkflow (Command line, Interactive) - Instructs the PowerCenter Server to remove the workflow from the schedule. For more information, see Unscheduleworkflow on page 610.

unsetfolder (Interactive) - Designates no folder as the default folder. For more information, see Unsetfolder on page 610.

version (Command line, Interactive) - Displays the PowerCenter version number. For more information, see Version on page 611.

waittask (Command line, Interactive) - Instructs the PowerCenter Server to wait for the completion of a running task before starting another command. Use this command in conjunction with a task name. For more information, see Waittask on page 611.

waitworkflow (Command line, Interactive) - Notifies you of the status of a workflow. Use this command in conjunction with a workflow name. For more information, see Waitworkflow on page 611.

Configuring Environment Variables


Before you use pmcmd, you can set environment variables that are applied each time you run
pmcmd. You can configure the following environment variables to use with pmcmd:

- PM_CODEPAGENAME
- PMTOOL_DATEFORMAT
- Repository USERNAME and PASSWORD
- PM_HOME

Configuring PM_CODEPAGENAME
pmcmd uses the code page of the machine hosting pmcmd unless you specify the code page
environment variable, PM_CODEPAGENAME, to override it. The code page must be
compatible with the PowerCenter Server code page. pmcmd sends commands in Unicode. If
the code pages are not compatible, the PowerCenter Server might not find the workflow,
session, or task in the repository. For more information about code page compatibility, see
Globalization Overview and Code Pages in the Installation and Configuration Guide.
To configure a code page environment variable in a UNIX environment:
1. If you are in a UNIX C shell environment, type:
   setenv PM_CODEPAGENAME <code page name>
   If you are in a UNIX Bourne shell environment, type:
   PM_CODEPAGENAME=<code page name>
   export PM_CODEPAGENAME

To configure a code page as an environment variable on Windows:
1. Enter environment variables in the Windows System Properties.
   For information about setting environment variables for your Windows operating system, consult your Windows documentation.
2. Enter a system variable named PM_CODEPAGENAME and set the value to the code page name.

Configuring PMTOOL_DATEFORMAT
Use this environment variable to customize the way pmcmd displays the date and time. The
pmcmd program verifies that the string you specify is a valid format. If the format string is not
valid, the PowerCenter Server generates a warning message and displays the date in the format
DY MON DD HH24:MI:SS YYYY.

To configure a date display format as an environment variable on UNIX:
1. If you are in a UNIX C shell environment, type:
   setenv PMTOOL_DATEFORMAT <date/time format string>
   If you are in a UNIX Bourne shell environment, type:
   PMTOOL_DATEFORMAT=<date/time format string>
   export PMTOOL_DATEFORMAT

To configure a date display format as an environment variable on Windows:
1. Enter environment variables in the Windows System Properties.
   For information about setting environment variables for your Windows operating system, consult your Windows documentation.
2. Enter a system or user variable named PMTOOL_DATEFORMAT and set the value to the display format string.
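For example, on UNIX you might reorder the fields of the default format. This is a sketch only; it assumes the format string accepts the same tokens that appear in the default string shown above (DY, MON, DD, HH24, MI, SS, YYYY), which you should verify against your installation:
setenv PMTOOL_DATEFORMAT "YYYY MON DD HH24:MI:SS"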

Configuring Repository Username and Password


You can enter your repository user name and password at the command line as environment
variables. The password is an encrypted value.
To configure a username as an environment variable on UNIX:
1. If you are in a UNIX C shell environment, type:
   setenv USERNAME YourUsername
   If you are in a UNIX Bourne shell environment, type:
   USERNAME=YourUsername
   export USERNAME

You can assign the environment variable any valid UNIX name.
To configure a password as an environment variable on UNIX:
1. In a UNIX session, navigate to the directory where the PowerCenter Server is installed.
2. At the shell prompt, type:
   pmpasswd YourPassword
   This command runs the encryption utility pmpasswd located in the directory where the PowerCenter Server is installed. The encryption utility generates and displays your encrypted password. The following is sample output. In this example, the password entered was monday.
   Encrypted string -->bX34dqq<--
   Will decrypt to -->monday<--
   Your encrypted password is bX34dqq.
3. If you are in a UNIX C shell environment, type:
   setenv PASSWORD YourEncryptedPassword
   If you are in a UNIX Bourne shell environment, type:
   PASSWORD=YourEncryptedPassword
   export PASSWORD

You can assign the environment variable any valid UNIX name.
To configure a username as an environment variable on Windows:
1. Enter environment variables in the Windows System Properties.
   For information about setting environment variables for your Windows operating system, consult your Windows documentation.
2. Enter the name of the user environment variable in the Variable field. Enter your repository username in the Value field.
   You can set these up as either a user or system variable. User variables take precedence over system variables.

To configure a password as an environment variable on Windows:
1. In Windows DOS, navigate to the directory where the PowerCenter Server is installed.
2. At the command line, type:
   pmpasswd YourPassword
   The encryption utility generates and displays your encrypted password. The following is sample output. In this example, the password entered was monday.
   Encrypted string -->bX34dqq<--
   Will decrypt to -->monday<--
   Your encrypted password is bX34dqq.
3. Enter environment variables in the Windows System Properties.
   For information about setting environment variables for your Windows operating system, consult your Windows documentation.
4. Enter the name of your password environment variable in the Variable field. Enter your encrypted password in the Value field.
   You can set these up as either a user or system variable. User variables take precedence over system variables.

Configuring PM_HOME
Use the PM_HOME variable to start pmcmd from a directory other than the install directory.
On UNIX, point the PM_HOME and PATH environment variables to the PowerCenter Server installation directory. On Windows, include the PowerCenter Server install directory in the environment path.
Warning: If you specify an incorrect directory path for the PM_HOME environment variable, the PowerCenter Server cannot start.
To start pmcmd from any directory on UNIX:
1. Point the PM_HOME environment variable to the installation directory.
   If you are in a UNIX C shell environment, type the following to set the PM_HOME variable:
   setenv PM_HOME <install directory>
   If you are in a UNIX Bourne shell environment, type:
   PM_HOME=<install directory>
   export PM_HOME
2. Add the installation directory to the PATH environment variable.
   If you are in a UNIX C shell environment, type the following to set the PATH variable:
   setenv PATH <install directory>:$PATH
   If you are in a UNIX Bourne shell environment, type:
   PATH=<install directory>:$PATH
   export PATH
To start pmcmd from any directory on Windows:

In the system properties, add the installation directory to the path variable. For example, on
Windows 2000, configure the path variable in System settings. Click the Environment tab to
select the path variable and add the installation directory to the variable value.


Using the Command Line Mode


You can use pmcmd commands with operating system scheduling tools like cron or embed
pmcmd commands into shell scripts or Perl programs.
Each command must include the connection information to the PowerCenter Server and the
PowerCenter repository. For example, to start a workflow named wFlow4 in the command
line mode, use the following syntax:
pmcmd startworkflow -s serveraddress:portno -u YourUsername -p
YourPassword wFlow4

The following command immediately starts the workflow wSalesAvg, located in the east
folder, on the remote PowerCenter Server with host name Sales listening at port 6258:
pmcmd startworkflow -u seller3 -p jackson -s SALES:6258 -f east -wait
wSalesAvg

The user, seller3, with the password jackson sends the request to start the workflow. When
you use the wait option, pmcmd returns to the shell or command prompt when the workflow
completes.
For a list of commands you can use in the command line mode, see Table 23-1 on page 582.
For details on each command see pmcmd Reference on page 594.
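As an illustration of script-based scheduling, the sketch below wraps a command line invocation in a small shell script and runs it from cron. The installation path, server address, folder, and workflow names are placeholders, and the sketch assumes the USERNAME and PASSWORD environment variables described later in this chapter are defined in the environment in which cron runs the script:

#!/bin/sh
# run_wSalesAvg.sh -- start a workflow and wait for it to complete (hypothetical names)
PM_HOME=/opt/informatica/pmserver; export PM_HOME
PATH=$PM_HOME:$PATH; export PATH
pmcmd startworkflow -s SALES:6258 -uv USERNAME -pv PASSWORD -f east -wait wSalesAvg

A crontab entry such as the following could then run the script nightly at 2:00 a.m.:

0 2 * * * /opt/informatica/scripts/run_wSalesAvg.sh >> /var/tmp/wSalesAvg.log 2>&1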

Connecting to the PowerCenter Server in the Command Line Mode


When you run pmcmd in the command line mode, you enter connection parameters such as
username, password, and server information for each command. If you incorrectly enter or
omit one of the required parameters, the command fails and pmcmd returns a non-zero return
code. For a description of all the return codes, see pmcmd Return Codes on page 590.
There are several options to enter the user and password information. You can enter a
username. Or, if you previously defined a username environment variable, you can enter that
instead. You can also enter a previously defined password environment variable instead of a
password. The following command uses both user and password variables:
pmcmd startworkflow -s serveraddress:portno -uv USERNAME -pv PASSWORD
wFlow4

For information on defining username and password environment variables, see Configuring
Repository Username and Password on page 586.


Table 23-2 describes the connection information you enter each time you write a command in
the command line mode:
Table 23-2. Connection Information for the Command Line Mode

username (-user, -u) - Required. Your repository username. Required if userEnvVar is not used.
userEnvVar (-uservar, -uv) - Required. Specifies the username environment variable. Required if username is not used. If you do not encrypt your password, you can use -u $username, and run the command from a shell script.
password (-password, -p) - Required. Your repository password. Required if passwordEnvVar is not used.
passwordEnvVar (-passwordvar, -pv) - Required. Specifies the password environment variable. Required if password is not used. If you do not encrypt your password, you can use -p $password, and run the command from a shell script.
serveraddr (-serveraddr, -s) - Required. Server address of the machine hosting the PowerCenter Server.
host (no flag) - Optional. Name of the machine hosting the PowerCenter Server. If you do not specify a host name, pmcmd assumes the PowerCenter Server runs on the machine executing pmcmd.
portno (no flag) - Required. Port number at which the PowerCenter Server listens.

pmcmd Return Codes


When you work in the command line mode, pmcmd indicates the success or failure of a
command with a return code. Return code (0) indicates that the command succeeded. Any
other return code indicates that the command failed.
Table 23-3 describes the return codes for command line pmcmd.
Table 23-3. pmcmd Return Codes

0 - For all commands, a return value of zero indicates that the command ran successfully. You can issue these commands in the wait or nowait mode: starttask, startworkflow, resumeworklet, resumeworkflow, aborttask, and abortworkflow. If you issue a command in the wait mode, a return value of zero indicates the command ran successfully. If you issue a command in the nowait mode, a return value of zero indicates that the request was successfully transmitted to the PowerCenter Server, and it acknowledged the request.
1 - The PowerCenter Server is down, or pmcmd cannot connect to the PowerCenter Server. The TCP/IP host name or port number might be incorrect, or a network problem occurred.
2 - The specified task name, workflow name, or folder name does not exist.
3 - An error occurred in starting or running the workflow or task.
4 - Usage error. You passed the wrong parameters to pmcmd.
5 - An internal pmcmd error occurred. Contact Informatica Technical Support.
6 - An error occurred while stopping the PowerCenter Server. Contact Informatica Technical Support.
7 - You used an invalid username or password.
8 - You do not have the appropriate permissions or privileges to perform this task.
9 - The connection to the PowerCenter Server timed out while sending the request.
12 - The PowerCenter Server cannot start recovery because the session or workflow is scheduled, suspending, waiting for an event, waiting, initializing, aborting, stopping, disabled, or running.
13 - The username environment variable is not defined.
14 - The password environment variable is not defined.
15 - The username environment variable is missing.
16 - The password environment variable is missing.
17 - Parameter file does not exist.
18 - The PowerCenter Server found the parameter file, but it did not have the initial values for the session parameters, such as $input or $output.
19 - The PowerCenter Server cannot start the session in recovery mode because the workflow is configured to run continuously.
20 - A repository error has occurred. Make sure that the Repository Server and the database are running and the number of connections to the database is not exceeded.
21 - The PowerCenter Server is shutting down and is not accepting new requests.
22 - The PowerCenter Server cannot find a unique instance of the workflow or session you specified. Enter the command again with the folder name and workflow name.
23 - There is no data available for your request.
24 - Out of memory.
25 - Command is cancelled.
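Because the return code is the only status a script receives in the command line mode, it is common to test it after each call. The following is a minimal sketch that reuses the hypothetical connection values from the earlier examples:

pmcmd startworkflow -s SALES:6258 -uv USERNAME -pv PASSWORD -f east -wait wSalesAvg
rc=$?
if [ $rc -ne 0 ]; then
    echo "pmcmd startworkflow failed with return code $rc" >&2
    exit $rc
fi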


Using the Interactive Mode


Use pmcmd in the interactive mode to start and stop workflows and tasks without writing a
script. Once you establish a dedicated connection to the PowerCenter Server, you can issue
commands without specifying the connection information. For example, to start the
workflow wFlow4 in the interactive mode, type the following at the pmcmd prompt:
pmcmd> startworkflow wFlow4

The following commands immediately start the workflow wSalesAvg, located in the east
folder:
pmcmd> connect -user seller3 -password jackson -serveraddr SALES:6258
pmcmd> setwait
pmcmd> setfolder east
pmcmd> startworkflow wSalesAvg

The setwait command means that for all subsequent commands, pmcmd returns the command
prompt when the workflow completes. The setfolder command means that for all subsequent
commands dealing with workflows or tasks, pmcmd uses the specified workflow or task from
the east folder.
For a list of commands you can use in the interactive mode, see Table 23-1 on page 582. For
details on each command see pmcmd Reference on page 594.

Connecting to the PowerCenter Server in the Interactive Mode


To use pmcmd in the interactive mode, first establish a dedicated connection to the
PowerCenter Server.
To start in the interactive mode:
1. In either a Windows DOS session or a UNIX session, navigate to the directory where the PowerCenter Server is installed.
2. At the shell or command prompt, type:
   pmcmd
   This command returns the PowerCenter version number and the pmcmd prompt.
3. From the pmcmd prompt, type:
   connect -u YourUserName -p YourPassword -s ServerName:PortNo
   Or, if you use username and password environment variables, type the following at the pmcmd prompt:
   connect -uv USERNAME -pv PASSWORD -serveraddr ServerName:PortNo
   For information on defining user name and password environment variables, see Configuring Repository Username and Password on page 586.


If you omit connection information, pmcmd prompts you to enter the correct information.
Once pmcmd successfully connects, you receive the pmcmd prompt. At the pmcmd prompt,
you can issue commands without specifying the connection information.

Setting Defaults in the Interactive Mode


Once you connect to a PowerCenter Server using pmcmd interactive mode, you can designate
default folders or conditions to use each time the PowerCenter Server executes a command.
For example, if you want to issue a series of commands on tasks in the same folder, specify the
name of the folder with the setfolder command. All subsequent commands use that folder as
the default.
Table 23-4 describes the commands that you can use to set defaults for subsequent
commands.
Table 23-4. Setting Defaults for the Interactive Mode

setfolder - Designates a folder as the default folder in which to execute all subsequent commands.
setnowait - Instructs the PowerCenter Server to execute subsequent commands in the nowait mode. The pmcmd prompt is available after the PowerCenter Server receives the previous command. The nowait mode is the default mode.
setwait - Instructs the PowerCenter Server to execute subsequent commands in the wait mode. The pmcmd prompt is available only after the PowerCenter Server completes the previous command.
showsettings - Displays the following settings for the interactive mode:
  - name of the PowerCenter Server and repository to which pmcmd is connected
  - username
  - wait mode
  - default folder
unsetfolder - Reverses the setfolder command.

For a list of all the commands that you can use in the interactive mode, see Table 23-1 on
page 582.


pmcmd Reference
pmcmd provides multiple ways to enter some of the parameters. For example, to enter a
repository password, use the following syntax:
<<-password|-p> password|<-passwordvar|-pv> passwordEnvVar>

You can use -password or -p before entering a password. Or, use -passwordvar or -pv before a
password environment variable.
To enter a password, precede the password with either the -password or the -p flag.
-password YourPassword
or
-p YourPassword

If you use a password environment variable, precede the variable name with either the -pv flag
or the -passwordvar flag.
-passwordvar PASSWORD
or
-pv PASSWORD

For a list of all the parameters you can use with pmcmd, see Table 23-5 on page 594.

Command Parameters
When you use most parameters, you precede the parameter with a flag. For ease of use, you
can use a shortened version for most flags. For example, you can either use -serveraddr or its
shortened equivalent, -s.
Table 23-5 describes the parameters used in pmcmd commands and lists the associated flags:
Table 23-5. Command Parameters

folder (-folder, -f) - Name of the folder containing the workflow or task. Required if the workflow or task name is not unique in the repository.
host (no flag) - The name of the machine hosting the PowerCenter Server. If you do not specify a host name, pmcmd assumes the PowerCenter Server runs on the machine executing pmcmd.
localparamfile (-localparamfile, -lpf) - The localparamfile is a parameter file on a local machine that pmcmd uses when you start a workflow. Use in conjunction with the startworkflow command.
paramfile (-paramfile) - The paramfile parameter determines which parameter file is used when a task or workflow runs. It overrides the configured parameter file for the workflow or task. Use in conjunction with the starttask or startworkflow commands.
password (-password, -p) - Your repository password. Required if passwordEnvVar is not used.
passwordEnvVar (-passwordvar, -pv) - Specifies the password environment variable. Required if password is not used.
portno (no flag) - Specifies the port number at which the PowerCenter Server listens.
recovery (-recovery) - Specifies you want to run the session in recovery mode.
serveraddr (-serveraddr, -s) - Server address of the machine hosting the PowerCenter Server.
startfrom (-startfrom) - Starts a workflow from a specified task, taskInstancePath. Use the startfrom parameter in conjunction with the startworkflow command. Write the taskInstancePath as a fully qualified string.
taskInstancePath (no flag) - Indicates a task and where it appears within the workflow. A task within a workflow is indicated by its task name alone. A task within a worklet is indicated by WorkletName.TaskName.
userEnvVar (-uservar, -uv) - Specifies the username environment variable. Required if username is not used.
username (-user, -u) - Your repository username. Required if userEnvVar is not used.
workflow (-workflow, -w) - Name of the workflow.

Using Quotation Marks


If a command parameter contains spaces, use single or double quotation marks to enclose the
parameter. For example, use single quotes in the following syntax to enclose the folder name:
abortworkflow -f 'quarterly sales' -wait Q3workflow

To denote an empty string, use two single quotes ('') or two double quotes (""). Be sure you match an opening quote with a closing quote.

Syntax Notation
Table 23-6 describes the notation used in pmcmd syntax:
Table 23-6. pmcmd Syntax Notation

-z - Flag placed before a parameter. This designates the parameter you enter. For example, to enter the username, type -u or -user followed by the username.
<x> - Required parameter. If you omit a required parameter, pmcmd returns an error message.
<x|y> - Select between required parameters. For the command to run, you must select from the listed parameters. If you omit a required parameter, pmcmd returns an error message.
[x] - Optional parameter. The command runs whether or not you enter optional parameters. For example, if you want to use the help command, the syntax is as follows:
  Help [Command]
  If you enter a command, pmcmd returns information on that command only. If you omit the command name, pmcmd returns a list of all commands.
[x|y] - Select between optional parameters. The command runs whether or not you enter optional parameters. For example, many commands run in either the wait or nowait mode:
  [-wait|-nowait]
  The command runs in the mode you specify. If you do not specify a mode, pmcmd runs the command in the default nowait mode.
<<x|y>|<a|b>> - When a set contains subsets, the superset is indicated with bold brackets < >. A bold pipe symbol (|) separates the subsets.

Tip: When you enter commands in pmcmd, type the command name first followed by the optional parameters in any order.

Aborttask
The aborttask command aborts a task. Issue this command only after the PowerCenter Server
fails to stop the task when you issue the stoptask command. For details on how the
PowerCenter Server aborts and stops tasks, see Server Handling of Stop and Abort on
page 129.
In the command line mode, use the following syntax to abort a task:
pmcmd aborttask
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
<<-password|-p> password|<-passwordvar|-pv> passwordEnvVar>
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-wait|-nowait]
taskInstancePath

In the interactive mode, enter the following syntax at the pmcmd prompt to abort a task:
aborttask
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-wait|-nowait]
taskInstancePath

Write the taskInstancePath as a fully qualified string. If the task is within a worklet, write the
string as WorkletName.TaskName. If the task is directly within a workflow, use the task name
alone.
For information on other parameters used in this command, see Table 23-5 on page 594.
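For example, the following hypothetical command line invocation aborts a session task named s_LoadCustomers that runs inside the worklet WL_LoadDims in the workflow wSalesAvg; the folder, workflow, worklet, and task names are placeholders:
pmcmd aborttask -s SALES:6258 -uv USERNAME -pv PASSWORD -f east -w wSalesAvg WL_LoadDims.s_LoadCustomers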

Abortworkflow
The abortworkflow command aborts a workflow. Issue this command only after the
PowerCenter Server fails to stop the workflow when you issue the stopworkflow command.
For details on how the PowerCenter Server aborts and stops workflows, see Server Handling
of Stop and Abort on page 129.
In the command line mode, use the following syntax to abort a workflow:
pmcmd abortworkflow
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
<<-password|-p> password|<-passwordvar|-pv> passwordEnvVar>
[<-folder|-f> folder]
[-wait|-nowait]
workflow

In the interactive mode, enter the following syntax at the pmcmd prompt to abort a workflow:
abortworkflow
[<-folder|-f> folder]
[-wait|-nowait]
workflow

For information on other parameters used in this command, see Table 23-5 on page 594.

Connect
The connect command connects the pmcmd program to the PowerCenter Server in the
interactive mode. If you omit connection information, pmcmd prompts you to enter the
correct information. Once pmcmd successfully connects, you receive the pmcmd prompt. At
the pmcmd prompt, you can issue commands without specifying the connection information.
connect
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
<<-password|-p> password|<-passwordvar|-pv> passwordEnvVar>


Note: You can use this command in the interactive mode only.

Disconnect
The disconnect command disconnects pmcmd from the PowerCenter Server. It does not close
the pmcmd program. Use this command when you want to disconnect from a PowerCenter
Server and connect to another in the interactive mode.
In the interactive mode, use the following syntax to disconnect pmcmd from a PowerCenter
Server:
disconnect

Note: You can use this command only in the pmcmd interactive mode.

Exit
The exit command disconnects pmcmd from the PowerCenter Server and closes the pmcmd
program.
In the interactive mode, use the following syntax to exit pmcmd:
exit

Note: You can use this command only in the pmcmd interactive mode.

Getrunningsessionsdetails
The getrunningsessionsdetails command returns the details for all sessions currently running
on the PowerCenter Server. Details include startup and current time, folder and workflow
names, session instance, master and execution servers, number of successful and failed rows in
sources and targets, number of transformation errors, and number of sessions running on the
PowerCenter Server.
In the command line mode, use the following syntax to get details about sessions running on
the PowerCenter Server:
pmcmd getrunningsessionsdetails
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
<<-password|-p> password|<-passwordvar|-pv> passwordEnvVar>

In the interactive mode, enter the following syntax at the pmcmd prompt to get details about sessions running on the PowerCenter Server:
getrunningsessionsdetails


Getserverdetails
The getserverdetails command returns details about workflows and tasks running on a
PowerCenter Server.

- Workflow details. Workflow details include the name of the PowerCenter Server, folder, workflow, workflow log file, and user that runs the workflow. It includes workflow run type, start time, run status, and run error code. It also includes the number of active workflows and the number of scheduled workflows.
- Task details. In addition to workflow details, task details include folder name, workflow name, task instance name, task type, task start time, task run status, task run error code, and task run mode. When the task is a session, the getserverdetails command also returns master server name, worker server name, server grid name, the number of active sessions, and the number of waiting sessions.

In the command line mode, use the following syntax to get details about the PowerCenter
Server:
pmcmd getserverdetails
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
<<-password|-p> password|<-passwordvar|-pv> passwordEnvVar>
[-all|-running|-scheduled]

In the interactive mode, enter the following syntax at the pmcmd prompt to get details about
the PowerCenter Server:
getserverdetails
[-all|-running|-scheduled]

Issue the getserverdetails command for all or some of the workflows. The -running option
returns status details on active workflows. Active workflows include running, suspending, and
suspended workflows. The -scheduled option returns status details on the scheduled
workflows. The default option is the -all option, and it returns status details on the scheduled
and running workflows.
For information on other parameters used in this command, see Table 23-5 on page 594.

Getserverproperties
The getserverproperties command returns the PowerCenter Server name, type, and version. It
returns the timestamp on the PowerCenter Server, the PowerCenter Server startup time, and
the name of the repository. It indicates the data movement mode, the PowerCenter Server
code page, and whether the PowerCenter Server can debug mappings. It also specifies the
server grid name.
In the command line mode, use the following syntax to see the PowerCenter Server
properties:
pmcmd getserverproperties


<-serveraddr|-s>[host:]portno

In the interactive mode, enter the following syntax at the pmcmd prompt to see PowerCenter
Server properties:
getserverproperties
<-serveraddr|-s>[host:]portno

Serveraddr is the server name and port number of the PowerCenter Server.

Getsessionstatistics
The getsessionstatistics command returns session details and statistics. The command returns
the following information for each partition:

- Session details. Session details include the name of the folder, workflow, task instance, and mapping. It includes the task run status, session log file name, first error code and message, the number of transformation errors, and the number of successful and failed rows for the sources and targets. It also includes the name of the master server, worker server, and server grid.
- Session statistics. Session statistics include the transformation name, transformation instance name, and the number of applied, affected, and rejected rows. It also includes the throughput, last error code and message, and start and end time for the session.

In the command line mode, use the following syntax to get session statistics:
pmcmd getsessionstatistics
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
<<-password|-p> password|<-passwordvar|-pv> passwordEnvVar>
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath

In the interactive mode, enter the following syntax at the pmcmd prompt to get session
statistics:
getsessionstatistics
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath

When using this command, specify the workflow name. Also, write the taskInstancePath as a
fully qualified string. If the task is within a worklet, write the string as
WorkletName.TaskName. If the task is directly within a workflow, enter only the task name.
For information on other parameters used in this command, see Table 23-5 on page 594.


Gettaskdetails
The gettaskdetails command returns the folder name, workflow name, task instance name,
task type, last execution start time, last execution complete time, task run status, and task run
mode. It also returns the run error code and message.
If you issue the gettaskdetails command for a Session task, the command also returns the
following additional information: mapping name, session log file name, first error code and
message, number of successful and failed rows from the source and target, the number of
transformation errors, master server name, worker server name, and server grid name.
In the command line mode, use the following syntax to get details on a task:
pmcmd gettaskdetails
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
<<-password|-p> password|<-passwordvar|-pv> passwordEnvVar>
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath

In the interactive mode, enter the following syntax at the pmcmd prompt to get details on a
task:
gettaskdetails
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath

When you use this command, specify the workflow name. Also, write the taskInstancePath as
a fully qualified string. If the task is within a worklet, write the string as
WorkletName.TaskName. If the task is directly within a workflow, enter only the task name.
For information on other parameters used in this command, see Table 23-5 on page 594.

Getworkflowdetails
The getworkflowdetails command returns the folder name, workflow name, last start time,
last completion time, workflow status, run mode, and the username that ran the last
workflow.
In the command line mode, use the following syntax to get details on a workflow:
pmcmd getworkflowdetails
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
<<-password|-p> password|<-passwordvar|-pv> passwordEnvVar>


[<-folder|-f> folder]
workflow

In the interactive mode, enter the following syntax at the pmcmd prompt to get details on a
workflow:
getworkflowdetails
[<-folder|-f> folder]
workflow

For information on other parameters used in this command, see Table 23-5 on page 594.

Help
The help command returns the syntax for the command you specify. If you omit the
command name, pmcmd lists each command and syntax.
In the command line mode, use the following command for help with command line
commands:
pmcmd help [command]

In the interactive mode, use the following command for help with interactive mode
commands:
help [command]

Pingserver
The pingserver command verifies that the PowerCenter Server is running.
In the command line mode, use the following syntax to ping the PowerCenter Server:
pmcmd pingserver
<-serveraddr|-s> [host:]portno

In the interactive mode, enter the following syntax at the pmcmd prompt to ping the
PowerCenter Server:
pingserver

Serveraddr is the host name and port number of the PowerCenter Server.

Quit
The quit command disconnects pmcmd from the PowerCenter Server and closes the pmcmd
program.
In the interactive mode, use the following syntax to quit pmcmd:
quit

Note: You can use this command in the pmcmd interactive mode only.

Resumeworkflow
The resumeworkflow command resumes suspended workflows. To resume a workflow, specify
the folder and workflow name. The PowerCenter Server resumes the workflow from all
suspended and failed worklets and all suspended and failed Command, Email, and Session
tasks.
In the command line mode, use the following syntax to resume a workflow:
pmcmd resumeworkflow
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
<<-password|-p> password|<-passwordvar|-pv> passwordEnvVar>
[<-folder|-f> folder]
[-wait|-nowait]
[-recovery]
workflow

In the interactive mode, enter the following syntax at the pmcmd prompt to resume a
workflow:
resumeworkflow
[<-folder|-f> folder]
[-wait|-nowait]
[-recovery]
workflow

For information on other parameters used in this command, see Table 23-5 on page 594.

Resumeworklet
The resumeworklet command resumes suspended worklets. To resume the workflow from a
specific worklet, specify the taskInstancePath as a fully qualified string. If you do not specify a
taskInstancePath, the workflow resumes from the suspended worklet.
In the command line mode, use the following syntax to resume a worklet:
pmcmd resumeworklet
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
<<-password|-p> password|<-passwordvar|-pv> passwordEnvVar>
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-wait|-nowait]
[-recovery]


taskInstancePath

In the interactive mode, enter the following syntax at the pmcmd prompt to resume a worklet:
resumeworklet
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-wait|-nowait]
[-recovery]
taskInstancePath

For information on other parameters used in this command, see Table 23-5 on page 594.

Scheduleworkflow
The scheduleworkflow command instructs the PowerCenter Server to schedule a workflow.
Use this command to reschedule a workflow that has been removed from the schedule.
In the command line mode, use the following syntax to schedule a workflow:
pmcmd scheduleworkflow <-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> user_env_var>
<<-password|-p> password|<-passwordvar|-pv> password_env_var>
[<-folder|-f> folder] workflow

In the interactive mode, enter the following syntax at the pmcmd prompt to schedule a
workflow:
scheduleworkflow [<-folder|-f> folder] workflow

For information on other parameters used in this command, see Table 23-5 on page 594.

Setfolder
The setfolder command designates a folder as the default folder in which to execute all
subsequent commands. After issuing this command, you do not need to enter a folder name
for workflow, task, and session commands. If you enter a folder name in a command after the
setfolder command, that folder name overrides the default folder name for that command
only.
In the interactive mode, enter the following syntax at the pmcmd prompt to designate a folder
as the default folder:
setfolder folder

Note: You can use this command in the pmcmd interactive mode only.


Setnowait
The setnowait command instructs the PowerCenter Server to execute subsequent commands
in the nowait mode. The nowait mode is the default mode.
In the interactive mode, enter the following syntax at the pmcmd prompt to instruct the
PowerCenter Server to execute subsequent commands in the nowait mode:
setnowait

When the nowait mode is set, the pmcmd prompt is available after the PowerCenter Server
receives the previous command. No parameters are required for this command.
Note: You can use this command in the pmcmd interactive mode only.

Setwait
The setwait command instructs the PowerCenter Server to execute subsequent commands in
the wait mode. The pmcmd prompt is available only after the PowerCenter Server completes
the previous command.
In the interactive mode, enter the following syntax at the pmcmd prompt to instruct the
PowerCenter Server to execute subsequent commands in the wait mode:
setwait

No parameters are required for this command.


Note: You can use this command in the pmcmd interactive mode only.

Showsettings
The showsettings command returns the name of the PowerCenter Server and repository to
which pmcmd is connected. It displays the username, wait mode, and default folder. No
parameters are required for this command.
In the interactive mode, enter the following syntax at the pmcmd prompt to display interactive
mode settings:
showsettings

Note: You can use this command in the pmcmd interactive mode only.

Shutdownserver
The shutdownserver command stops the PowerCenter Server. You must have the Super User
or Administer Server privilege to use this command.
You can shut down the PowerCenter Server in the complete, stop, or abort mode. In the
complete mode, pmcmd allows currently running workflows to complete before shutting
down the PowerCenter Server. In the stop mode, the PowerCenter Server stops the running
workflows. In the abort mode, the PowerCenter Server aborts the running workflows. For more information on the implications of stopping or aborting a workflow, see Stopping or Aborting the Workflow on page 129.
In the command line mode, use the following syntax to stop the PowerCenter Server:
pmcmd shutdownserver
<-serveraddr|-s>[host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
<<-password|-p> password|<-passwordvar|-pv> passwordEnvVar>
<-complete|-stop|-abort>

In the interactive mode, enter the following syntax at the pmcmd prompt to stop the
PowerCenter Server:
shutdownserver
<-complete|-stop|-abort>

For information on other parameters used in this command, see Table 23-5 on page 594.
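For example, the following hypothetical command line invocation shuts down the PowerCenter Server after allowing running workflows to complete; the server address and environment variable names are placeholders:
pmcmd shutdownserver -s SALES:6258 -uv USERNAME -pv PASSWORD -complete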

Starttask
The starttask command starts a task.
In the command line mode, use the following syntax to start a task:
pmcmd starttask
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
<<-password|-p> password|<-passwordvar|-pv> passwordEnvVar>
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-paramfile paramfile]
[-wait|-nowait]
[-recovery]
taskInstancePath

In the interactive mode, enter the following syntax at the pmcmd prompt to start a task:
starttask
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-paramfile paramfile]
[-wait|-nowait]
[-recovery]
taskInstancePath

606

Chapter 23: Using pmcmd

Write the taskInstancePath as a fully qualified string. If the task is within a worklet, write the
string as WorkletName.TaskName. If the task is directly within a workflow, enter only the
task.

Using Parameter Files with Starttask


When you start a task, you can optionally enter the directory and name of a parameter file.
The PowerCenter Server runs the task using the parameters in the file you specify.
For UNIX shell users, enclose the parameter file name in single quotes:
-paramfile '$PMRootDir/myfile.txt'

For Windows command prompt users, the parameter file name cannot have beginning or trailing spaces. If the name includes spaces, enclose the file name in double quotes:
-paramfile "$PMRootDir\my file.txt"

When you write a pmcmd command that includes a parameter file located on another
machine, use the backslash (\) with the dollar sign ($). This ensures that the machine where
the variable is defined expands the server variable.
pmcmd starttask -uv USERNAME -pv PASSWORD -s SALES:6258 -f east -w
wSalesAvg -paramfile \$PMRootDir/myfile.txt taskA

For information on other parameters used in this command, see Table 23-5 on page 594.

Startworkflow
The startworkflow command starts a workflow.
In the command line mode, use the following syntax to start a workflow:
pmcmd startworkflow
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
<<-password|-p> password|<-passwordvar|-pv> passwordEnvVar>
[<-folder|-f> folder]
[<-startfrom> taskInstancePath]
[-recovery]
[-paramfile paramfile]
[<-localparamfile|-lpf> localparamfile]
[-wait|-nowait]
workflow

In the interactive mode, enter the following syntax at the pmcmd prompt to start a workflow:
startworkflow
[<-folder|-f> folder]


[<-startfrom> taskInstancePath]
[-recovery]
[-paramfile paramfile]
[<-localparamfile|-lpf> localparamfile]
[-wait|-nowait]
workflow

Use the -startfrom flag to start the workflow at a designated taskInstancePath. Write the
taskInstancePath as a fully qualified string. If the task is within a worklet, write the string as
WorkletName.TaskName. If the task is directly within a workflow, enter only the task. If you
do not specify a starting point, the workflow starts at the Start task.
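
For example, a command of the following form starts the workflow wSalesAvg from a hypothetical task wl_Orders.s_LoadOrders rather than from the Start task. The connection options and names are illustrative only:
pmcmd startworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 -f east -startfrom wl_Orders.s_LoadOrders wSalesAvg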

Using Parameter Files with Startworkflow


When you start a workflow, you can optionally enter the directory and name of a parameter
file. The PowerCenter Server runs the workflow using the parameters in the file you specify.
For UNIX shell users, enclose the parameter file name in single quotes. For Windows
command prompt users, the parameter file name cannot have beginning or trailing spaces. If
the name includes spaces, enclose the file name in double quotes.
You can use parameter files on the following machines:

PowerCenter Server machine. When you use a parameter file located on the PowerCenter
Server machine, use the -paramfile option to indicate the location and name of the
parameter file.
On UNIX, use the following syntax:
-paramfile '$PMRootDir/myfile.txt'

On Windows, use the following syntax:


-paramfile "$PMRootDir\my file.txt"

Local machine. When you use a parameter file located on the machine where pmcmd is
invoked, pmcmd passes variables and values in the file to the PowerCenter Server. When
you list a local parameter file, specify the absolute path or relative path to the file. Use the
-localparamfile or -lpf option to indicate the location and name of the local parameter file.
On UNIX, use the following syntax:
-lpf param_file.txt
-localparamfile param_file.txt

On Windows, use the following syntax:
-lpf param_file.txt
-lpf "c:\Informatica\parameterfiles\param file.txt"
-localparamfile param_file.txt


Shared network drives. When you use a parameter file located on another machine, use
the backslash (\) with the dollar sign ($). This ensures that the machine where the variable
is defined expands the server variable.
-paramfile \$PMRootDir/myfile.txt

For information on other parameters used in this command, see Table 23-5 on page 594.

Stoptask
The stoptask command stops a task.
In the command line mode, use the following syntax to stop a task:
pmcmd stoptask
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
<<-password|-p> password|<-passwordvar|-pv> passwordEnvVar>
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-wait|-nowait]
taskInstancePath

In the interactive mode, enter the following syntax at the pmcmd prompt to stop a task:
stoptask
[<-folder|-f> folder]
<<-workflow|-w> workflow>
[-wait|-nowait] taskInstancePath

Write the taskInstancePath as a fully qualified string. If the task is within a worklet, write the
string as WorkletName.TaskName. If the task is directly within a workflow, use the task name
alone.
For information on other parameters used in this command, see Table 23-5 on page 594.

Stopworkflow
The stopworkflow command stops a workflow.
In the command line mode, use the following syntax to stop a workflow:
pmcmd stopworkflow
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
<<-password|-p> password|<-passwordvar|-pv> passwordEnvVar>
[<-folder|-f> folder]
[-wait|-nowait]
workflow

In the interactive mode, enter the following syntax at the pmcmd prompt to stop a workflow:
stopworkflow
[<-folder|-f> folder]
[-wait|-nowait]
workflow

For information on other parameters used in this command, see Table 23-5 on page 594.

Unscheduleworkflow
The unscheduleworkflow command instructs the PowerCenter Server to remove the workflow
from the schedule.
In the command line mode, enter the following syntax at the pmcmd prompt to remove the
workflow from the schedule:
pmcmd unscheduleworkflow <-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> user_env_var>
<<-password|-p> password|<-passwordvar|-pv> password_env_var>
[<-folder|-f> folder] workflow

In the interactive mode, enter the following syntax at the pmcmd prompt to remove the
workflow from the schedule:
unscheduleworkflow [<-folder|-f> folder] workflow

For information on other parameters used in this command, see Table 23-5 on page 594.

Unsetfolder
The unsetfolder command removes the default folder designation set by the setfolder command. After you issue this command, you must specify a folder name each time you enter a command for a session, workflow, or task.
In the interactive mode, enter the following syntax at the pmcmd prompt to clear the setfolder
command:
unsetfolder

No parameters are required for this command.


Note: You can use this command in the pmcmd interactive mode only.


Version
The version command displays the PowerCenter version and Informatica trademark and
copyright information.
In the command line mode, use the following command to verify the PowerCenter version:
pmcmd version

In the interactive mode, enter the following syntax at the pmcmd prompt to verify the
PowerCenter version:
version

Waittask
The waittask command instructs the PowerCenter Server to complete the task before
returning the pmcmd prompt to the command prompt or shell.
In the command line mode, use the following syntax to set a task in the wait mode:
pmcmd waittask
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
<<-password|-p> password|<-passwordvar|-pv> passwordEnvVar>
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath

In the interactive mode, enter the following syntax at the pmcmd prompt to set a task in the
wait mode:
waittask
[<-folder|-f> folder]
<<-workflow|-w> workflow>
taskInstancePath

Write the taskInstancePath as a fully qualified string. If the task is within a worklet, write the
string as WorkletName.TaskName. If the task is directly within a workflow, use the task name
alone.
For information on other parameters used in this command, see Table 23-5 on page 594.

Waitworkflow
The waitworkflow command notifies you whether the specified workflow completed successfully or is not running. If the workflow is running, pmcmd returns 0 after the workflow completes successfully. If the workflow is not running, pmcmd returns 3 to indicate that the workflow is not running. For more information on pmcmd return codes, see pmcmd Return Codes on page 590.
The waitworkflow command returns the pmcmd prompt to the command prompt or shell
when a workflow completes.
In the command line mode, use the following syntax to set a workflow to the wait mode:
pmcmd waitworkflow
<-serveraddr|-s> [host:]portno
<<-user|-u> username|<-uservar|-uv> userEnvVar>
<<-password|-p> password|<-passwordvar|-pv> passwordEnvVar>
[<-folder|-f> folder]
workflow

In the interactive mode, enter the following syntax at the pmcmd prompt to set a workflow to
the wait mode:
waitworkflow
[<-folder|-f> folder]
workflow

You can use waitworkflow in conjunction with the startworkflow command if you are running
scripts. For example, you may want to check the status of a critical workflow that was
previously started. You can use the waitworkflow command to wait for that workflow to
complete before you start the next workflow.
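
For example, the following UNIX shell sketch waits for a previously started workflow and starts a second workflow only if the first completes successfully. The workflow names, folder, and connection options are illustrative only; substitute the values for your environment:
# wait for the previously started workflow to complete
pmcmd waitworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 -f east wf_CriticalLoad
if [ $? -eq 0 ]
then
    # the critical workflow completed successfully; start the next workflow
    pmcmd startworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 -f east wf_NextLoad
fi
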
For information on other parameters used in this command, see Table 23-5 on page 594.


Chapter 24

Session Caches
This chapter includes the following topics:

Overview, 614

Determining Cache Requirements, 617

Cache Partitioning, 620

Aggregator Caches, 621

Joiner Caches, 624

Lookup Caches, 628

Rank Caches, 632


Overview
The PowerCenter Server creates index and data caches in memory for Aggregator, Rank,
Joiner, and Lookup transformations in a mapping. The PowerCenter Server stores key values
in the index cache and output values in the data cache. You configure memory parameters for
the index and data cache in the transformation or session properties.
If the PowerCenter Server requires more memory, it stores overflow values in cache files.
When the session completes, the PowerCenter Server releases cache memory, and in most
circumstances, it deletes the cache files.
The PowerCenter Server creates cache files based on the PowerCenter Server code page.
Table 24-1 gives an overview of the type of information that the PowerCenter Server stores in
the index and data caches:
Table 24-1. Caching Storage Overview

Aggregator
  Index cache: stores group values as configured in the group by ports.
  Data cache: stores calculations based on the group by ports.
Rank
  Index cache: stores group values as configured in the group by ports.
  Data cache: stores ranking information based on the group by ports.
Joiner
  Index cache: stores index values for the master source table as configured in the join condition.
  Data cache: stores master source rows.
Lookup
  Index cache: stores lookup condition information.
  Data cache: stores lookup data that is not stored in the index cache.

Memory Cache
The PowerCenter Server creates a memory cache based on the size configured in the session
properties. When you create a mapping, you specify the index and data cache size for each
transformation instance. When you create a session, you can override the index and data
cache size for each transformation instance in the session properties.
When you configure a session, you calculate the amount of memory the PowerCenter Server
needs to process the session. Calculate requirements based on factors such as processing
overhead and column size for key and output columns.
By default, the PowerCenter Server allocates 1,000,000 bytes to the index cache and
2,000,000 bytes to the data cache for each transformation instance. If the PowerCenter Server
cannot allocate the configured amount of cache memory, it cannot initialize the session and
the session fails.
If a server grid has 32-bit and 64-bit servers, and if a session exceeds 2 GB of memory, the
master server assigns it to a 64-bit server. For information on server grids, see Working with
Server Grids on page 446.


When you specify large cache sizes for transformations on 64-bit machines, the PowerCenter Server might run out of physical memory and perform more slowly. If the cache size forces the PowerCenter Server to swap virtual memory to disk, performance decreases.
Note: A PowerCenter Server running on a 32-bit machine cannot run a session if the total size of all the configured session caches is more than 2 GB.

Cache Files
If the PowerCenter Server requires more memory than the configured cache size, it stores
overflow values in the cache files. Since paging to disk can slow session performance, try to
configure the index and data cache sizes to store data in memory.
The PowerCenter Server creates the index and data cache files by default in the PowerCenter
Server variable directory, $PMCacheDir. If you do not define $PMCacheDir, the
PowerCenter Server saves the files in the PMCache directory specified in the UNIX
configuration file or the cache directory in the Windows registry. If the UNIX PowerCenter
Server does not find a directory there, it creates the index and data files in the installation
directory. If the PowerCenter Server on Windows does not find a directory there, it creates the
files in the system directory.
If a cache file handles more than 2 GB of data, the PowerCenter Server creates multiple index
and data files. When creating these files, the PowerCenter Server appends a number to the
end of the filename, such as PMAGG*.idx1 and PMAGG*.idx2. The number of index and
data files are limited only by the amount of disk space available in the cache directory.
When you run a session, the PowerCenter Server writes a message in the session log indicating
the cache file name and the transformation name. When a session completes, the
PowerCenter Server typically deletes index and data cache files. However, you may find index
and data files in the cache directory under the following circumstances:

The session performs incremental aggregation.

You configure the Lookup transformation to use a persistent cache.

The session does not complete successfully.

The PowerCenter Server uses the following naming convention when it creates cache files:
[<Name Prefix> | <Prefix> <session ID>_<transformation ID>]_[partition
index]<suffix>.[overflow index]


Table 24-2 describes the naming convention for cache files that the PowerCenter Server
creates:
Table 24-2. Cache File Names

Name Prefix: Cache file name prefix configured in the Lookup transformation.
Prefix: Describes the type of transformation:
  - Aggregator transformation is PMAGG.
  - Joiner transformation is PMJNR.
  - Lookup transformation is PMLKUP.
  - Rank transformation is PMAGG.
Session ID: Session instance ID number.
Transformation ID: Transformation instance ID number.
Partition Index: If the session contains more than one partition, this identifies the partition number. The partition index is zero-based, so the first partition has no partition index. Partition index 2 indicates a cache file created in the third partition.
Suffix: Identifies the type of file:
  - Index file is .idx.
  - Data file is .dat.
Overflow Index: If a cache file handles more than 2 GB of data, the PowerCenter Server creates multiple index and data files. When creating these files, the PowerCenter Server appends an overflow index to the filename, such as PMAGG*.idx.1 and PMAGG*.idx.2. The number of index and data files is limited by the amount of disk space available in the cache directory.

For example, in the file name, PMLKUP8_4_2.idx, PMLKUP identifies the transformation
type as Lookup, 8 is the session ID, 4 is the transformation ID, and 2 is the partition index.
The cache directory should be local to the PowerCenter Server. You might encounter
performance or reliability problems when you cache large quantities of data on a mapped or
mounted drive.
For details on tuning the caches, see Performance Tuning on page 635.


Determining Cache Requirements


When you configure a mapping that uses an Aggregator, Rank, Joiner, or Lookup
transformation, you configure memory cache on the Properties tab of the transformation. You
can override these memory requirements in the session properties. To calculate the index and
data cache, you need to consider column and row requirements as well as processing overhead.
The PowerCenter Server requires processing overhead to cache data and index information.
Column overhead includes a null indicator, and row overhead can include row ID and key
information.
Use the following steps to calculate and configure the cache size required to run a mapping:
1. Add the size requirements for the columns in the cache.
2. Add row or group processing overhead.
3. Multiply by the number of groups or rows.
4. Configure the index and data cache in the transformation properties. You configure cache sizes for each transformation on the Properties tab in the mapping.

The amount of memory you configure depends on the partition properties and how much
memory cache and disk cache you want to use. If you use cache partitioning, the PowerCenter
Server requires only a portion of total cache memory for each partition. For information on
cache partitioning, see Cache Partitioning on page 620.

Cache Calculations
To determine cache requirements for a session, first add the total column size in the cache to
the row overhead. Multiply the result by the number of groups or rows in the cache. This
gives the minimum caching requirements. To determine the maximum requirements for the
index cache, you multiply the minimum requirements by two.
The following tables provide the calculations for the minimum cache requirements for each
transformation:
Table 24-3. Aggregate Cache Calculation

Index cache
  Calculation: # groups * [(Σ column size) + 17]
  Columns in cache: group by columns.
Data cache
  Calculation: # groups * [(Σ column size) + 7]
  Columns in cache: non group by input ports used in non-aggregate output expression, non group by input/output ports, local variable ports, and the column containing the aggregate function (multiply by three).*

* Each aggregate function has different cache space requirements. As a general rule, you can multiply the column containing the aggregate function by three.


Table 24-4. Rank Cache Calculation

Index cache
  Calculation: # groups * [(Σ column size) + 17]
  Columns in cache: group by columns.
Data cache
  Calculation: # groups * [(# ranks * (Σ column size + 10)) + 20]
  Columns in cache: non group by input ports used in non-aggregate output expression, non group by input/output ports, local variable ports, and rank ports.

Table 24-5. Joiner Cache Calculation

Index cache
  Calculation: # master rows * [(Σ column size) + 16]
  Columns in cache: master column in join condition.
Data cache
  Calculation: # master rows * [(Σ column size) + 8]
  Columns in cache: master column not in join condition and used for output.

Table 24-6. Lookup Cache Calculation

Index cache (minimum)
  Calculation: 200 * [(Σ column size) + 16]
  Columns in cache: columns in lookup condition.
Index cache (maximum)
  Calculation: # rows in lookup table * [(Σ column size) + 16] * 2
  Columns in cache: columns in lookup condition.
Data cache
  Calculation: # rows in lookup table * [(Σ column size) + 8]
  Columns in cache: connected output ports not in the lookup condition. Return port (for unconnected Lookup transformations).

For more information about each cache, see the separate sections in this chapter.

Cache Column Sizes


When you calculate the column size for each cache, include the size of the data and additional
processing requirements.
Table 24-7 gives the column sizes for index and data cache calculations:
Table 24-7. Column Sizes for Cache Calculations

Datatype                                            Aggregator, Rank    Joiner, Lookup
Binary                                              precision + 2       precision + 8, rounded to the nearest multiple of 8
Date/Time                                           18                  24
Decimal, high precision off (all precision)         10                  16
Decimal, high precision on (precision <=18)         18                  24
Decimal, high precision on (precision >18, <=28)    22                  32
Decimal, high precision on (precision >28)          10                  16
Decimal, high precision on (negative scale)         10                  16
Double                                              10                  16
Real                                                10                  16
Integer                                             6                   16
Small integer                                       6                   16
NString, NText, String, Text (Unicode mode)         2*(precision + 2)   2*(precision + 5)
NString, NText, String, Text (ASCII mode)           precision + 3       precision + 9

The column sizes include the bytes required for a null indicator.
Additionally, to increase lookup and join performance, the PowerCenter Server aligns all data
for lookup and joiner caches on an eight byte boundary. So, each Lookup and Joiner column
includes rounding to the nearest multiple of eight.


Cache Partitioning
When you create a session with multiple partitions, the PowerCenter Server can partition
caches for the Aggregator, Joiner, Lookup, and Rank transformations. It creates a separate
cache for each partition, and each partition works with only the rows needed by that
partition. As a result, the PowerCenter Server requires only a portion of total cache memory
for each partition. When you run a session, the PowerCenter Server accesses the cache in
parallel for each partition. If you do not use cache partitioning, the PowerCenter Server
accesses the cache serially for each partition.
After you configure the session for partitioning, you can configure memory requirements and
cache directories for each transformation in the Transformations view on the Mapping tab of
the session properties. To configure the memory requirements, calculate the total
requirements for a transformation, and divide by the number of partitions. To further
improve performance, you can configure separate directories for each partition.
The guidelines for cache partitioning are different for each cached transformation:

Aggregator transformation. The PowerCenter Server uses cache partitioning for any
multi-partitioned session with an Aggregator transformation. You do not have to set a
partition point at the Aggregator transformation. For more caching information, see
Aggregator Caches on page 621.

Joiner transformation. The PowerCenter Server uses cache partitioning when you create a
partition point at the Joiner transformation. For more caching information, see Joiner
Caches on page 624.

Lookup transformation. The PowerCenter Server uses cache partitioning when you create
a hash auto-keys partition point at the Lookup transformation. For more caching
information, see Lookup Caches on page 628.

Rank transformation. The PowerCenter Server uses cache partitioning for any multi-partitioned session with a Rank transformation. You do not have to set a partition point at the Rank transformation. For more caching information, see Rank Caches on page 632.

For more partitioning information, see Pipeline Partitioning on page 345.


Aggregator Caches
When the PowerCenter Server runs a session with an Aggregator transformation, it stores data
in memory until it completes the aggregation. The PowerCenter Server uses cache
partitioning when you create multiple partitions in a pipeline that contains an Aggregator
transformation. It creates one memory cache and one disk cache for each partition and routes
data from one partition to another based on group key values of the transformation.
After you configure the partitions in the session, you can configure the memory requirements
and cache directories for the Aggregator transformation on the Mappings tab in session
properties. Allocate enough disk space to hold one row in each aggregate group.
If you use incremental aggregation, the PowerCenter Server saves the cache files in the cache
file directory. For information about caching with incremental aggregation, see Partitioning
Guidelines with Incremental Aggregation on page 578.
Note: The PowerCenter Server uses memory to process an Aggregator transformation with

sorted ports. It does not use cache memory. You do not need to configure cache memory for
Aggregator transformations that use sorted ports.
For more information about the Aggregator transformation, see Aggregator Transformation
in the Transformation Guide.

Calculating the Aggregator Index Cache


The index cache holds group information from the group by ports. Use the following
information to calculate the minimum aggregate index cache size:
Aggregate Index Cache Calculation        Columns in Cache
# groups * [(Σ column size) + 17]        Group by columns.


For example, the following Aggregator transformation, AGG_SalesPerRegionItem, groups by STORE_ID and ITEM.

Use the column sizes in Table 24-7 on page 618 to add the group by columns.
Column Name   Column Type   Datatype      Size
STORE_ID      Group by      Integer       6
ITEM          Group by      String (15)   18
TOTAL COLUMN SIZE = 24

You know that there are 36 stores and 2,000 items, so the total number of groups is 72,000.
Use the following calculation to determine the minimum index cache requirements:
72,000 * (24 + 17) = 2,952,000

Double the size to determine the maximum index cache requirements:


2,952,000 * 2 = 5,904,000

Therefore, this Aggregator transformation requires an index cache size between 2,952,000
and 5,904,000 bytes.

Calculating the Aggregator Data Cache


The data cache holds row data for variable ports and connected output ports. As a result, the data cache is generally larger than the index cache. To reduce the data cache size, connect only the necessary input/output ports to subsequent transformations. Use the following information to calculate the minimum aggregate data cache size:
Aggregate Data Cache Calculation         Columns in Cache
# groups * [(Σ column size) + 7]         - Non group by input ports used in non-aggregate output expression.
                                         - Non group by input/output ports.
                                         - Local variable ports.
                                         - Port containing aggregate function (multiply by three).*

*The cache space requirements for aggregate functions are different for each function. However, you can multiply the port containing the aggregate function by three for all aggregate functions.

The following figure shows the connected output ports of AGG_SalesPerRegionItem:

Use the column sizes in Table 24-7 on page 618 to add the columns in the data cache:
Column Name             Column Type                          Datatype          Size
ORDER_ID                Non group by input/output            Integer           6
SALES_PER_STORE_ITEMS   Port containing aggregate function   Decimal (12, 2)   30*
TOTAL COLUMN SIZE = 36

*Remember to multiply the port containing the aggregate function by three. For more information, see Table 24-3 on page 617.

Note that you do not use STORE_ID and ITEM in the data cache calculation. These
columns are connected to the target, but you do not use them in the cache calculation because
they are group by ports and are used in the index cache calculation.
The total number of groups as calculated for the index cache size is 72,000. Use the following
calculation to determine the minimum data cache requirements:
72,000 * (36 + 7) = 3,096,000

Therefore, this Aggregator transformation requires a data cache size of 3,096,000 bytes.


Joiner Caches
When the PowerCenter Server runs a session with a Joiner transformation, it reads rows from
the master and detail sources concurrently and builds index and data caches based on the
master rows. The PowerCenter Server then performs the join based on the detail source data
and the cache data.
The number of rows the PowerCenter Server stores in the cache depends on the partitioning
scheme, the data in the master source, and whether or not you use sorted input. For more
information on how many rows the PowerCenter Server stores, see Calculating the Number
of Master Rows on page 625.
When you create multiple partitions in a session, the PowerCenter Server processes the Joiner
transformation differently when you use n:n partitioning and when you use 1:n partitioning.

Processing master and detail data for outer joins. When you run a multi-partitioned
session with a partitioned Joiner transformation, the PowerCenter Server builds one cache
per partition. In a single-partitioned master pipeline (1:n), the PowerCenter Server
outputs unmatched master rows after it processes all detail partitions. In a multi-partitioned master pipeline (n:n), the PowerCenter Server outputs unmatched master rows
after it processes the partition for each detail cache.

Configuring memory requirements. When you run a session with a Joiner transformation,
the PowerCenter Server uses n times the memory you specify on the Transformation view
of the Mapping tab. The PowerCenter Server might page to disk if you do not specify
enough memory.
When you use 1:n partitioning, each partition requires as much memory as a 1:1 partition
session. When you configure the cache for the Joiner transformation, enter the total
transformation memory requirements for a single partition.
When you use n:n partitioning, each partition requires only a portion of the memory
required by a 1:1 partition session. When you configure the cache, divide the memory
requirements for a 1:1 partition session by the number of partitions. Enter that amount for
the cache requirements.
For example, you calculate the following cache requirements for a Joiner transformation
instance and determine that the transformation requires 2,000,000 bytes of memory for
the index cache and 4,000,000 bytes of memory for the data cache. You create four
partitions for the pipeline. If you use 1:n partitioning, you enter 2,000,000 bytes for the
index cache and 4,000,000 bytes for the data cache. If you use n:n partitioning, enter
500,000 bytes for the index cache and 1,000,000 bytes for the data cache.

To increase join performance, the PowerCenter Server aligns all data for joiner caches on an
eight byte boundary.
Note: To use n:n partitioning with a Joiner transformation, you must create a partition point

at the Joiner transformation. This allows you to create multiple partitions for both the master
and detail source of a Joiner transformation.
For more information about the Joiner transformation, see Joiner Transformation in the
Transformation Guide.

Calculating the Number of Master Rows


The number of rows the PowerCenter Server stores in the cache depends on the partitioning
scheme, the data in the master source, and whether or not you use sorted input.
The PowerCenter Server caches all master rows with a unique key in the index cache, and all
master rows in the data cache under any of the following circumstances:

You do not use sorted input.

You use sorted input and 1:n partitioning.

However, when you use sorted input and you use n:n partitioning, the PowerCenter Server
caches a different number of rows in the index and data cache:

Index cache. The PowerCenter Server caches 100 master rows with unique keys.

Data cache. The PowerCenter Server caches the master rows in the data cache that
correspond to the 100 rows in the index cache. The number of rows it stores in the data
cache depends on the data. For example, if every master row contains a unique key, the
PowerCenter Server stores 100 rows in the data cache. However, if the master data
contains multiple rows with the same key, the PowerCenter Server stores more than 100
rows in the data cache.

Calculating the Joiner Index Cache


The index cache holds rows from the master source that are in the join condition. Use the
following information to calculate the minimum joiner index cache size:
Joiner Index Cache Calculation           Columns in Cache
# master rows * [(Σ column size) + 16]   Master column in join condition.


For example, the Joiner transformation, JNR_ORDERS_PRODUCTS, does not use sorted
input, and it joins the sources ORDERS and PRODUCTS on ITEM_NO:

Use the column sizes in Table 24-7 on page 618 to add the columns in the index cache:
Column Name   Column Type                       Datatype       Size
ITEM_NO       Master column in join condition   Decimal (10)   16
TOTAL COLUMN SIZE = 16

PRODUCTS is the master source and has 90,000 rows. Use the following calculation to
determine the minimum index cache requirements:
90,000 * (16 + 16) = 2,880,000

Double the size to determine the maximum index cache requirements:


2,880,000 * 2 = 5,760,000

Therefore, this Joiner transformation requires an index cache size between 2,880,000 and
5,760,000 bytes.

Calculating the Joiner Data Cache


The data cache holds rows from the master source until the PowerCenter Server joins the
data. Use the following information to calculate the minimum joiner data cache size:

Joiner Data Cache Calculation            Columns in Cache
# master rows * [(Σ column size) + 8]    Master column not in join condition and used for output.

The following figure shows the connected output ports for JNR_ORDERS_PRODUCTS:

Use the column sizes in Table 24-7 on page 618 to add the columns for the data cache:
Column Name        Column Type                           Datatype       Size
ITEM_NAME          Master column not in join condition   String (23)    32
PRODUCT CATEGORY   Master column not in join condition   Decimal (21)   30
TOTAL COLUMN SIZE = 62

Note that you do not use ITEM_NO in the data cache calculation because it is part of the
join condition and is used in the index cache.
The master source has 90,000 rows.
Use the following calculation to determine the minimum data cache requirements:
90,000 * (62 + 8) = 6,300,000

This Joiner transformation requires a data cache size of 6,300,000 bytes.


Lookup Caches
The PowerCenter Server builds a lookup cache in memory when it processes the first row of data in a cached Lookup transformation. It then queries the cache for each row that enters the transformation.
Configure the index and data cache memory for each Lookup transformation. The
PowerCenter Server caches data differently for static and dynamic caches and also for sessions
that use cache partitioning.
When you run the session, the PowerCenter Server rebuilds a persistent cache if any cache file
is missing or invalid.
For more information about configuring the lookup cache and how the PowerCenter Server
processes lookup requests, see Lookup Caches in the Transformation Guide.

Static Cache
When you use a static lookup cache, the PowerCenter Server creates one memory cache for
each partition.
If you use cache partitioning, the PowerCenter Server requires only a portion of the total
memory to cache each partition. So, when you configure cache size, you can divide the total
memory requirements by the number of partitions.
If you do not use cache partitioning, the PowerCenter Server requires as much memory for
each partition as it does for a single partition pipeline. So, when you configure cache size, you
enter the total memory requirements for the transformation.
If two Lookup transformations in a mapping share the cache, the PowerCenter Server does not
allocate additional memory for shared transformations in the same pipeline stage. For shared
transformations in a different pipeline stage, the PowerCenter Server does allocate additional
memory.
Static Lookup transformations that use the same data or a subset of data to create a disk cache
can share the disk cache. However, the lookup keys may be different, so the transformations
must have separate memory caches.
For more information about caching the Lookup transformation, see Lookup Caches in the
Transformation Guide.

Dynamic Cache
When you use a dynamic lookup cache, the PowerCenter Server creates the memory cache
based on whether you use cache partitioning or not.
If you use cache partitioning, the PowerCenter Server creates one memory cache for each
partition. It requires only a portion of the total memory to cache each partition. So, when you
configure cache size, you can divide the total memory requirements by the number of
partitions.

628

Chapter 24: Session Caches

If you do not use cache partitioning, the PowerCenter Server creates one memory cache and
one disk cache for each transformation. All partitions share the memory and disk cache.
When you configure the cache size, enter the total memory requirements in the
transformation or on the Mapping tab in the session properties.
When Lookup transformations share a dynamic cache, the PowerCenter Server updates the
memory cache and disk cache. To keep the caches synchronized, the PowerCenter Server must
share the disk cache and the corresponding memory cache between the transformations.

Sharing Partitioned Caches


Use the following guidelines when you share partitioned Lookup caches:

Lookup transformations can share a partitioned cache if the transformations meet the
following conditions:

The cache structures are identical. The lookup/output ports for the first shared
transformation must match the lookup/output ports for the subsequent transformations.

The transformations have the same lookup conditions, and the lookup condition
columns are in the same order.

You cannot share a partitioned cache with a non-partitioned cache.

When you share Lookup caches across target load order groups, you must configure the
target load order groups with the same number of partitions.

Note: If the PowerCenter Server detects a mismatch between Lookup transformations sharing

an unnamed cache, it rebuilds the cache files. If the PowerCenter Server detects a mismatch
between Lookup transformations sharing a named cache, it fails the session.

Calculating the Lookup Index Cache


The lookup index cache holds data for the columns used in the lookup condition. The
formula for calculating the minimum lookup index cache size is different than calculating the
maximum size.
For best session performance, specify the maximum lookup index cache size. If you specify a
lookup index cache less than the minimum cache size, the PowerCenter Server fails the
session.

Calculating the Minimum Lookup Index Cache


The minimum size for a lookup index cache is independent of the number of source rows.
Use the following information to calculate the minimum lookup index cache for both
connected and unconnected Lookup transformations:
Lookup Index Cache Calculation           Columns in Cache
200 * [(Σ column size) + 16]             Columns in lookup condition.


Calculating the Maximum Lookup Index Cache


Use the following information to calculate the maximum lookup index cache for both
connected and unconnected Lookup transformations:
Lookup Index Cache Calculation                        Columns in Cache
# rows in lookup table * [(Σ column size) + 16] * 2   Columns in lookup condition.

Example
The Lookup transformation, LKP_PROMOS, looks up values based on the ITEM_ID. It
uses the following lookup condition:
ITEM_ID = IN_ITEM_ID1

Use the column sizes in Table 24-7 on page 618 to add the columns for the index cache:
Column Name   Column Type                  Datatype   Size
ITEM_ID       Column in lookup condition   Integer    16
TOTAL COLUMN SIZE = 16

The lookup condition uses one column, ITEM_ID, and the table contains 60,000 rows.
Use the following calculation to determine the minimum index cache requirements:
200 * (16 + 16) = 6,400

Use the following calculation to determine the maximum index cache requirements:
60,000 * (16 + 16) * 2 = 3,840,000


Therefore, this Lookup transformation requires an index cache size between 6,400 and
3,840,000 bytes.

Calculating the Lookup Data Cache


In a connected transformation, the data cache contains data for the connected output ports,
not including ports used in the lookup condition. In an unconnected transformation, the data
cache contains data from the return port.
Use the following information to calculate the minimum data cache requirements for both
connected and unconnected Lookup transformations:
Lookup Data Cache Calculation                    Columns in Cache
# rows in lookup table * [(Σ column size) + 8]   Connected output ports not in the lookup condition. Use return ports for unconnected transformations.

The following figure shows the connected output ports for LKP_PROMOS:

Use the column sizes in Table 24-7 on page 618 to add the columns for the data cache:
Column Name    Column Type                                     Datatype       Size
PROMOTION_ID   Connected output port not in lookup condition   Integer        16
DISCOUNT       Connected output port not in lookup condition   Decimal (10)   16
TOTAL COLUMN SIZE = 32

The lookup table has 60,000 rows.


Use the following calculation to determine the minimum data cache requirements:
60,000 * (32 + 8) = 2,400,000

This Lookup transformation requires a data cache size of 2,400,000 bytes.


Rank Caches
When the PowerCenter Server runs a session with a Rank transformation, it compares an
input row with rows in the data cache. If the input row out-ranks a stored row, the
PowerCenter Server replaces the stored row with the input row.
For example, you configure a Rank transformation to find the top three sales. The
PowerCenter Server reads the following input data:
SALES
10,000
12,210
5,000
2,455
6,324

The PowerCenter Server caches the first three rows (10,000, 12,210, and 5,000). When the
PowerCenter Server reads the next row (2,455) it compares it to the cache values. Since the
row is lower in rank than the cached rows, it discards the row with 2,455. The next row
(6,324), however, is higher in rank than one of the cached rows. Therefore, the PowerCenter
Server replaces the cached row with the higher-ranked input row.
If the Rank transformation is configured to rank across multiple groups, the PowerCenter
Server ranks incrementally for each group it finds.
The PowerCenter Server uses cache partitioning when you create multiple partitions in a
pipeline that contains a Rank transformation. It creates one memory cache and one disk cache
per partition and routes data from one partition to another based on group key values of the
transformation.
After you configure the partitions in the session, you can configure the memory requirements
and cache directories for the Rank transformation on the Mappings tab in session properties.
For more information about the Rank transformation, see Rank Transformation in the
Transformation Guide.

Calculating the Rank Index Cache


The index cache holds group information from the group by ports. Use the following
information to calculate the minimum rank index cache size:

Rank Index Cache Calculation             Columns in Cache
# groups * [(Σ column size) + 17]        Group by columns.

For example, the Rank transformation, RNK_TOPTEN, groups by product category:

Use the column sizes in Table 24-7 on page 618 to add the columns in the index cache:
Column Name        Column Type   Datatype      Size
PRODUCT_CATEGORY   Group by      String (21)   24
TOTAL COLUMN SIZE = 24

There are 10,000 product categories, so the total number of groups is 10,000. Use the
following calculation to determine the minimum index cache requirements:
10,000 * (24 + 17) = 410,000

Double the size to determine the maximum index cache requirements:


410,000 * 2 = 820,000

Therefore, this Rank transformation requires an index cache size between 410,000 and
820,000 bytes.

Calculating the Rank Data Cache


The data cache size is proportional to the number of ranks. It holds row data until the
PowerCenter Server completes the ranking and is generally larger than the index cache. To
reduce the data cache size, connect only the necessary input/output ports to subsequent transformations. Use the following information to calculate the minimum rank data cache size:
Rank Data Cache Calculation                          Columns in Cache
# groups * [(# ranks * (Σ column size + 10)) + 20]   - Non group by input ports used in non-aggregate output expression.
                                                     - Non group by input/output ports.
                                                     - Local variable ports.
                                                     - Rank ports.

The following figure shows the connected output ports of RNK_TOPTEN:

Use the column sizes in Table 24-7 on page 618 to add the columns in the data cache:
Column Name   Column Type                      Datatype       Size
ITEM_NO       Non group by input/output port   Decimal (10)   10
ITEM_NAME     Non group by input/output port   String (23)    26
PRICE         Rank port                        Decimal (14)   10
TOTAL COLUMN SIZE = 46

RNK_TOPTEN ranks by price, and the total number of ranks is 10. The number of groups is
10,000.
Use the following calculation to determine the minimum data cache requirements:
10,000[(10 * (46 + 10)) + 20] = 5,800,000

This Rank transformation requires a data cache size of 5,800,000 bytes.


Chapter 25

Performance Tuning
This chapter covers the following topics:

Overview, 636

Identifying the Performance Bottleneck, 637

Optimizing the Target Database, 642

Optimizing the Source Database, 645

Optimizing the Mapping, 647

Optimizing the Session, 655

Optimizing the System, 660

Pipeline Partitioning, 663


Overview
The goal of performance tuning is to optimize session performance by eliminating
performance bottlenecks. To tune the performance of a session, first you identify a
performance bottleneck, eliminate it, and then identify the next performance bottleneck until
you are satisfied with the session performance. You can use the test load option to run sessions
when you tune session performance.
The most common performance bottleneck occurs when the PowerCenter Server writes to a
target database. You can identify performance bottlenecks by the following methods:

Running test sessions. You can configure a test session to read from a flat file source or to
write to a flat file target to identify source and target bottlenecks.

Studying performance details. You can create a set of information called performance
details to identify session bottlenecks. Performance details provide information such as
buffer input and output efficiency. For details about performance details, see Creating
and Viewing Performance Details on page 436.

Monitoring system performance. You can use system monitoring tools to view percent
CPU usage, I/O waits, and paging to identify system bottlenecks.

Once you determine the location of a performance bottleneck, you can eliminate the
bottleneck by following these guidelines:

Eliminate source and target database bottlenecks. Have the database administrator
optimize database performance by optimizing the query, increasing the database network
packet size, or configuring index and key constraints.

Eliminate mapping bottlenecks. Fine tune the pipeline logic and transformation settings
and options in mappings to eliminate mapping bottlenecks.

Eliminate session bottlenecks. You can optimize the session strategy and use performance
details to help tune session configuration.

Eliminate system bottlenecks. Have the system administrator analyze information from
system monitoring tools and improve CPU and network performance.

If you tune all the bottlenecks above, you can further optimize session performance by
increasing the number of pipeline partitions in the session. Adding partitions can improve
performance by utilizing more of the system hardware while processing the session.
Because determining the best way to improve performance can be complex, change only one
variable at a time, and time the session both before and after the change. If session
performance does not improve, you might want to return to your original configurations.


Identifying the Performance Bottleneck


The first step in performance tuning is to identify the performance bottleneck. Performance
bottlenecks can occur in the source and target databases, the mapping, the session, and the
system. Generally, you should look for performance bottlenecks in the following order:
1. Target
2. Source
3. Mapping
4. Session
5. System

You can identify performance bottlenecks by running test sessions, viewing performance
details, and using system monitoring tools.

Identifying Target Bottlenecks


The most common performance bottleneck occurs when the PowerCenter Server writes to a
target database. You can identify target bottlenecks by configuring the session to write to a flat
file target. If the session performance increases significantly when you write to a flat file, you
have a target bottleneck.
If your session already writes to a flat file target, you probably do not have a target bottleneck.
You can optimize session performance by writing to a flat file target local to the PowerCenter
Server.
Causes for a target bottleneck may include small checkpoint intervals, small database
network packet size, or problems during heavy loading operations. For details about
eliminating a target bottleneck, see Optimizing the Target Database on page 642.

Identifying Source Bottlenecks


Performance bottlenecks can occur when the PowerCenter Server reads from a source
database. If your session reads from a flat file source, you probably do not have a source
bottleneck. You can improve session performance by setting the number of bytes the
PowerCenter Server reads per line if you read from a flat file source.
If the session reads from a relational source, you can use a Filter transformation, a read test
mapping, or a database query to identify source bottlenecks.

Using a Filter Transformation


You can use a filter transformation in the mapping to measure the time it takes to read source
data.


Add a filter transformation in the mapping after each source qualifier. Set the filter condition
to false so that no data is processed past the filter transformation. If the time it takes to run
the new session remains about the same, then you have a source bottleneck.

Using a Read Test Session


You can create a read test mapping to identify source bottlenecks. A read test mapping isolates
the read query by removing the transformation in the mapping. Use the following steps to
create a read test mapping:
1.

Make a copy of the original mapping.

2.

In the copied mapping, keep only the sources, source qualifiers, and any custom joins or
queries.

3.

Remove all transformations.

4.

Connect the source qualifiers to a file target.

Use the read test mapping in a test session. If the test session performance is similar to the
original session, you have a source bottleneck.

Using a Database Query


You can identify source bottlenecks by executing the read query directly against the source
database.
Copy the read query directly from the session log. Execute the query against the source
database with a query tool such as isql. On Windows, you can load the result of the query in a
file. On UNIX systems, you can load the result of the query in /dev/null.
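
For example, on UNIX you might time the read query with a tool such as isql and discard the result rows. The connection options and file name are illustrative only:
time isql -U username -P password -S servername < read_query.sql > /dev/null
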
Measure the query execution time and the time it takes for the query to return the first row. If
there is a long delay between the two time measurements, you can use an optimizer hint to
eliminate the source bottleneck.
Causes for a source bottleneck may include an inefficient query or small database network
packet sizes. For details about eliminating source bottlenecks, see Optimizing the Source
Database on page 645.

Identifying Mapping Bottlenecks


If you determine that you do not have a source or target bottleneck, you might have a
mapping bottleneck. You can identify mapping bottlenecks by using a Filter transformation
in the mapping.
If you determine that you do not have a source bottleneck, you can add a Filter
transformation in the mapping before each target definition. Set the filter condition to false
so that no data is loaded into the target tables. If the time it takes to run the new session is the
same as the original session, you have a mapping bottleneck.


You can also identify mapping bottlenecks by using performance details. High errorrows and
rowsinlookupcache counters indicate a mapping bottleneck. For details on eliminating
mapping bottlenecks, see Optimizing the Mapping on page 647.

High Rowsinlookupcache Counters


Multiple lookups can slow down the session. You might improve session performance by
locating the largest lookup tables and tuning those lookup expressions. For details, see
Optimizing Multiple Lookups on page 650.

High Errorrows Counters


Transformation errors impact session performance. If a session has large numbers in any of
the Transformation_errorrows counters, you might improve performance by eliminating the
errors. For details, see Eliminating Transformation Errors on page 648.

Identifying a Session Bottleneck


If you do not have a source, target, or mapping bottleneck, you may have a session bottleneck.
You can identify a session bottleneck by using the performance details. The PowerCenter
Server creates performance details when you enable Collect Performance Data in the
Performance settings on the Properties tab of the session properties.
Performance details display information about each Source Qualifier, target definition, and
individual transformation. All transformations have some basic counters that indicate the
number of input rows, output rows, and error rows.
For details about performance details, see Creating and Viewing Performance Details on
page 436.
Any value other than zero in the readfromdisk and writetodisk counters for Aggregator,
Joiner, or Rank transformations indicates a session bottleneck.
Small cache size, low buffer memory, and small commit intervals can cause session
bottlenecks. For details on eliminating session bottlenecks, see Optimizing the Session on
page 655.

Aggregator, Rank, and Joiner Readfromdisk and Writetodisk Counters


If a session contains Aggregator, Rank, or Joiner transformations, examine each
Transformation_readfromdisk and Transformation_writetodisk counter.
If these counters display any number other than zero, you can improve session performance
by increasing the index and data cache sizes. The PowerCenter Server uses the index cache to
store group information and the data cache to store transformed data, which is typically
larger. Therefore, although both the index cache and data cache sizes affect performance, you
will most likely need to increase the data cache size more than the index cache size. For
further information about configuring cache sizes, see Session Caches on page 613.


If the session performs incremental aggregation, the PowerCenter Server reads historical
aggregate data from the local disk during the session and writes to disk when saving historical
data. As a result, the Aggregator_readfromdisk and Aggregator_writetodisk counters display a number other than zero. However, since the PowerCenter Server writes the historical data to a file at the
end of the session, you can still evaluate the counters during the session. If the counters show
any number other than zero during the session run, you can increase performance by tuning
the index and data cache sizes.
To view the session performance details while the session runs, right-click the session in the
Workflow Monitor and choose Properties. Click the Properties tab in the details dialog box.

Source and Target BufferInput_efficiency and BufferOutput_efficiency Counters
If the BufferInput_efficiency and the BufferOutput_efficiency counters are low for all sources
and targets, increasing the session DTM buffer size may improve performance. For
information on when and how to tune this parameter, see Increasing DTM Buffer Size on
page 656.
Under certain circumstances, tuning the buffer block size may also improve session
performance. For details, see Optimizing the Buffer Block Size on page 657.

Identifying a System Bottleneck


After you tune the source, target, mapping, and session, you may consider tuning the system.
You can identify system bottlenecks by using system tools to monitor CPU usage, memory
usage, and paging.
The PowerCenter Server uses system resources to process transformation, session execution,
and reading and writing data. The PowerCenter Server also uses system memory for other
data such as aggregate, joiner, rank, and cached lookup tables. You can use system
performance monitoring tools to monitor the amount of system resources the PowerCenter
Server uses and identify system bottlenecks.
On Windows, you can use system tools in the Task Manager or Administrative Tools.
On UNIX systems you can use system tools such as vmstat and iostat to monitor system
performance.
For details on eliminating system bottlenecks, see Optimizing the System on page 660.

Identifying System Bottlenecks on Windows


On Windows, you can view the Performance and Processes tab in the Task Manager (use Ctrl-Alt-Del and choose Task Manager). The Performance tab in the Task Manager provides a quick look at CPU usage and total memory used. You can view more detailed performance information by using the Performance Monitor on Windows (use Start-Programs-Administrative Tools and choose Performance Monitor).


Use the Windows Performance Monitor to create a chart that provides the following
information:

Percent processor time. If you have several CPUs, monitor each CPU for percent
processor time. If the processors are utilized at more than 80%, you may consider adding
more processors.

Pages/second. If pages/second is greater than five, you may have excessive memory
pressure (thrashing). You may consider adding more physical memory.

Physical disks percent time. This is the percent time that the physical disk is busy
performing read or write requests. You may consider adding another disk device or
upgrading the disk device.

Physical disks queue length. This is the number of users waiting for access to the same
disk device. If physical disk queue length is greater than two, you may consider adding
another disk device or upgrading the disk device.

Server total bytes per second. This is the number of bytes the server has sent to and
received from the network. You can use this information to improve network bandwidth.

Identifying System Bottlenecks on UNIX


You can use UNIX tools to monitor user background process, system swapping actions, CPU
loading process, and I/O load operations. When you tune UNIX systems, tune the server for
a major database system. Use the following UNIX tools to identify system bottlenecks on the
UNIX system:

lsattr -E -I sys0. Use this tool to view current system settings. This tool shows maxuproc,
the maximum level of user background processes. You may consider reducing the amount
of background process on your system.

iostat. Use this tool to monitor loading operation for every disk attached to the database
server. Iostat displays the percentage of time that the disk was physically active. High disk
utilization suggests that you may need to add more disks.
If you use disk arrays, use utilities provided with the disk arrays instead of iostat.

vmstat or sar -w. Use this tool to monitor disk swapping actions. Swapping should not
occur during the session. If swapping does occur, you may consider increasing your
physical memory or reducing the number of memory-intensive applications running on the
machine. A sample monitoring sketch follows this list.

sar -u. Use this tool to monitor CPU loading. This tool provides percent usage on user,
system, idle time, and waiting time. If the percent time spent waiting on I/O (%wio) is
high, you may consider using other under-utilized disks. For example, if your source data,
target data, lookup, rank, and aggregate cache files are all on the same disk, consider
putting them on different disks.
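
The following sketch (Python, not part of PowerCenter) illustrates the kind of swap
monitoring described above. It assumes a Linux-style vmstat whose header contains si and so
columns; Solaris, HP-UX, and AIX report swapping differently, so adjust the parsing for your
platform.

# Minimal sketch: sample vmstat while a session runs and flag any swap activity.
# Assumes Linux-style vmstat output with "si" and "so" columns.
import subprocess

def swapping_detected(samples=5, interval=5):
    out = subprocess.run(
        ["vmstat", str(interval), str(samples + 1)],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    header = next(line.split() for line in out if " si " in line)
    si, so = header.index("si"), header.index("so")
    rows = [line.split() for line in out if line.split() and line.split()[0].isdigit()]
    # Skip the first data row: it reports averages since boot, not current activity.
    return any(int(row[si]) > 0 or int(row[so]) > 0 for row in rows[1:])

if __name__ == "__main__":
    print("swapping detected" if swapping_detected() else "no swapping observed")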


Optimizing the Target Database


If your session writes to a flat file target, you can optimize session performance by writing to a
flat file target that is local to the PowerCenter Server. If your session writes to a relational
target, consider performing the following tasks to increase performance:

Drop indexes and key constraints.

Increase checkpoint intervals.

Use bulk loading.

Use external loading.

Increase database network packet size.

Optimize Oracle target databases.

Dropping Indexes and Key Constraints


When you define key constraints or indexes in target tables, you slow the loading of data to
those tables. To improve performance, drop indexes and key constraints before running your
session. You can rebuild those indexes and key constraints after the session completes.
If you decide to drop and rebuild indexes and key constraints on a regular basis, you can
create pre- and post-load stored procedures to perform these operations each time you run the
session.
Note: To optimize performance, use constraint-based loading only if necessary.

Increasing Checkpoint Intervals


The PowerCenter Server performance slows each time it waits for the database to perform a
checkpoint. To increase performance, consider increasing the database checkpoint interval.
When you increase the database checkpoint interval, you increase the likelihood that the
database performs checkpoints as necessary, when the size of the database log file reaches its
limit.
For details on specific database checkpoints, checkpoint intervals, and log files, consult your
database documentation.

Bulk Loading
You can use bulk loading to improve the performance of a session that inserts a large amount
of data to a DB2, Sybase, Oracle, or Microsoft SQL Server database. Configure bulk loading
on the Mapping tab.
When bulk loading, the PowerCenter Server bypasses the database log, which speeds
performance. Without writing to the database log, however, the target database cannot
perform rollback. As a result, you may not be able to perform recovery. Therefore, you must


weigh the importance of improved session performance against the ability to recover an
incomplete session.
For more information on configuring bulk loading, see Bulk Loading on page 252.

External Loading
You can use the External Loader session option to integrate external loading with a session.
If you have a DB2 EE or DB2 EEE target database, you can use the DB2 EE or DB2 EEE
external loaders to bulk load target files. The DB2 EE external loader uses the PowerCenter
Server db2load utility to load data. The DB2 EEE external loader uses the DB2 Autoloader
utility.
If you have a Teradata target database, you can use the Teradata external loader utility to bulk
load target files.
If your target database runs on Oracle, you can use the Oracle SQL*Loader utility to bulk
load target files. When you load data to an Oracle database using a pipeline with multiple
partitions, you can increase performance if you create the Oracle target table with the same
number of partitions you use for the pipeline.
If your target database runs on Sybase IQ, you can use the Sybase IQ external loader utility to
bulk load target files. If your Sybase IQ database is local to the PowerCenter Server on your
UNIX system, you can increase performance by loading data to target tables directly from
named pipes.
For details on the External Loader option, see External Loading on page 523.

Increasing Database Network Packet Size


You can increase the network packet size in the Informatica Workflow Manager to reduce a
target bottleneck. For Sybase and Microsoft SQL Server, increase the network packet size to
8K - 16K. For Oracle, increase the network packet size in tnsnames.ora and listener.ora. If
you increase the network packet size in the PowerCenter Server configuration, you also need
to configure the database server network memory to accept larger packet sizes.
See your database documentation about optimizing database network packet size.

Optimizing Oracle Target Databases


If your target database is Oracle, you can optimize the target database by checking the storage
clause, space allocation, and rollback segments.
When you write to an Oracle database, check the storage clause for database objects. Make
sure that tables are using large initial and next values. The database should also store table and
index data in separate tablespaces, preferably on different disks.


When you write to Oracle target databases, the database uses rollback segments during loads.
Make sure that the database stores rollback segments in appropriate tablespaces, preferably on
different disks. The rollback segments should also have appropriate storage clauses.
You can optimize the Oracle target database by tuning the Oracle redo log. The Oracle
database uses the redo log to log loading operations. Make sure that redo log size and buffer
size are optimal. You can view redo log properties in the init.ora file.
If your Oracle instance is local to the PowerCenter Server, you can optimize performance by
using IPC protocol to connect to the Oracle database. You can set up Oracle database
connection in listener.ora and tnsnames.ora.
See your Oracle documentation for details on optimizing Oracle databases.


Optimizing the Source Database


If your session reads from a flat file source, you can improve session performance by setting
the number of bytes the PowerCenter Server reads per line. By default, the PowerCenter
Server reads 1024 bytes per line. If each line in the source file is less than the default setting,
you can decrease the Line Sequential Buffer Length setting in the session properties.
If your session reads from a relational source, review the following suggestions for improving
performance:

Optimize the query.

Create tempdb as in-memory database.

Use conditional filters.

Increase database network packet size.

Connect to Oracle databases using IPC protocol.

Optimizing the Query


If a session joins multiple source tables in one Source Qualifier, you might be able to improve
performance by optimizing the query with optimizer hints. Also, single-table SELECT
statements with an ORDER BY or GROUP BY clause may benefit from optimization such as
adding indexes.
Usually, the database optimizer determines the most efficient way to process the source data.
However, you might know properties about your source tables that the database optimizer
does not. The database administrator can create optimizer hints to tell the database how to
execute the query for a particular set of source tables.
The query the PowerCenter Server uses to read data appears in the session log. You can also
find the query in the Source Qualifier transformation. Have your database administrator
analyze the query, and then create optimizer hints and/or indexes for the source tables.
Use optimizer hints if there is a long delay between when the query begins executing and
when PowerCenter receives the first row of data. Configure optimizer hints to begin returning
rows as quickly as possible, rather than returning all rows at once. This allows the
PowerCenter Server to process rows in parallel with the query execution.
Queries that contain ORDER BY or GROUP BY clauses may benefit from creating an index
on the ORDER BY or GROUP BY columns. Once you optimize the query, use the SQL
override option to take full advantage of these modifications. For details on using SQL
override, see Source Qualifier Transformation in the Transformation Guide.
You can also configure the source database to run parallel queries to improve performance.
See your database documentation for configuring parallel query.


Using tempdb to Join Sybase and Microsoft SQL Server Tables


When joining large tables on a Sybase or Microsoft SQL Server database, you might improve
performance by creating the tempdb as an in-memory database to allocate sufficient memory.
Check your Sybase or Microsoft SQL Server manual for details.

Using Conditional Filters


A simple source filter on the source database can sometimes impact performance negatively
because of lack of indexes. You can use the PowerCenter conditional filter in the Source
Qualifier to improve performance.
Whether you should use the PowerCenter conditional filter to improve performance depends
on your session. For example, if multiple sessions read from the same source simultaneously,
the PowerCenter conditional filter may improve performance.
However, some sessions may perform faster if you filter the source data on the source
database. You can test your session with both the database filter and the PowerCenter filter to
determine which method improves performance.

Increasing Database Network Packet Sizes


You can improve the performance of a source database by increasing the network packet size,
allowing larger packets of data to cross the network at one time. To do this you must complete
the following tasks:

Increase the database server network packet size.

Change the packet size in the Workflow Manager database connection to reflect the
database server packet size.

For Oracle, increase the packet size in listener.ora and tnsnames.ora. For other databases,
check your database documentation for details on optimizing network packet size.

Connecting to Oracle Source Databases


If your Oracle instance is local to the PowerCenter Server, you can optimize performance by
using IPC protocol to connect to the Oracle database. You can set up Oracle database
connection in listener.ora and tnsnames.ora.


Optimizing the Mapping


Mapping-level optimization may take time to implement but can significantly boost session
performance. Focus on mapping-level optimization only after optimizing the target and
source databases.
Generally, you reduce the number of transformations in the mapping and delete unnecessary
links between transformations to optimize the mapping. You should configure the mapping
with the least number of transformations and expressions to do the most amount of work
possible. You should minimize the amount of data moved by deleting unnecessary links
between transformations.
For transformations that use data cache (such as Aggregator, Joiner, Rank, and Lookup
transformations), limit connected input/output or output ports. Limiting the number of
connected input/output or output ports reduces the amount of data the transformations store
in the data cache.
You can also perform the following tasks to optimize the mapping:

Configure single-pass reading.

Optimize datatype conversions.

Eliminate transformation errors.

Optimize transformations.

Optimize expressions.

Configuring Single-Pass Reading


Single-pass reading allows you to populate multiple targets with one source qualifier.
Consider using single-pass reading if you have several sessions that use the same sources. If
you join the separate mappings and use only one source qualifier for each source, the
PowerCenter Server then reads each source only once, then sends the data into separate data
flows. A particular row can be used by all the data flows, by any combination, or by none, as
the situation demands.
For example, you have the PURCHASING source table, and you use that source daily to
perform an aggregation and a ranking. If you place the Aggregator and Rank transformations
in separate mappings and sessions, you force the PowerCenter Server to read the same source
table twice. However, if you join the two mappings, using one source qualifier, the
PowerCenter Server reads PURCHASING only once, then sends the appropriate data to the
two separate data flows.
When changing mappings to take advantage of single-pass reading, you can optimize this
feature by factoring out any functions you do on both mappings. For example, if you need to
subtract a percentage from the PRICE ports for both the Aggregator and Rank
transformations, you can minimize work by subtracting the percentage before splitting the
pipeline as shown in Figure 25-1:
Figure 25-1. Single-Pass Reading

Optimizing Datatype Conversions


Forcing the PowerCenter Server to make unnecessary datatype conversions slows
performance. For example, if your mapping moves data from an Integer column to a Decimal
column, then back to an Integer column, the unnecessary datatype conversion slows
performance. Where possible, eliminate unnecessary datatype conversions from mappings.
Some datatype conversions can improve system performance. Use integer values in place of
other datatypes when performing comparisons using Lookup and Filter transformations.
For example, many databases store U.S. zip code information as a Char or Varchar datatype.
If you convert your zip code data to an Integer datatype, the lookup database stores the zip
code 94303-1234 as 943031234. This helps increase the speed of the lookup comparisons
based on zip code.

Eliminating Transformation Errors


In large numbers, transformation errors slow the performance of the PowerCenter Server.
With each transformation error, the PowerCenter Server pauses to determine the cause of the
error and to remove the row causing the error from the data flow. Then the PowerCenter
Server typically writes the row into the session log file.
Transformation errors occur when the PowerCenter Server encounters conversion errors,
conflicting mapping logic, and any condition set up as an error, such as null input. Check the
session log to see where the transformation errors occur. If the errors center around particular
transformations, evaluate those transformation constraints.
If you need to run a session that generates a large number of transformation errors, you
might improve performance by setting a lower tracing level. However, this is not a
recommended long-term response to transformation errors. For details on error tracing and
performance, see Reducing Error Tracing on page 659.


Optimizing Lookup Transformations


If a mapping contains a Lookup transformation, you can optimize the lookup. Some of the
things you can do to increase performance include caching the lookup table, optimizing the
lookup condition, or indexing the lookup table.
For more information on the Lookup transformation, see Lookup Transformation in the
Transformation Guide. For more information on lookup caching, see Lookup Caches in the
Transformation Guide and Session Caches on page 613.

Caching Lookups
If a mapping contains Lookup transformations, you might want to enable lookup caching. In
general, you want to cache lookup tables that need less than 300MB.
When you enable caching, the PowerCenter Server caches the lookup table and queries the
lookup cache during the session. When this option is not enabled, the PowerCenter Server
queries the lookup table on a row-by-row basis. You can increase performance using a shared
or persistent cache:

Shared cache. You can share the lookup cache between multiple transformations. You can
share an unnamed cache between transformations in the same mapping. You can share a
named cache between transformations in the same or different mappings.

Persistent cache. If you want to save and reuse the cache files, you can configure the
transformation to use a persistent cache. Use this feature when you know the lookup table
does not change between session runs. Using a persistent cache can improve performance
because the PowerCenter Server builds the memory cache from the cache files instead of
from the database.

For more information on lookup caching options, see Lookup Transformation in the
Transformation Guide.

Reducing the Number of Cached Rows


Use the Lookup SQL Override option to add a WHERE clause to the default SQL statement.
This allows you to reduce the number of rows included in the cache.

Optimizing the Lookup Condition


If you include more than one lookup condition, place the conditions with an equal sign first
to optimize lookup performance.

Indexing the Lookup Table


The PowerCenter Server needs to query, sort, and compare values in the lookup condition
columns. The index needs to include every column used in a lookup condition. You can
improve performance for both cached and uncached lookups:

Cached lookups. You can improve performance by indexing the columns in the lookup
ORDER BY. The session log contains the ORDER BY statement.

Uncached lookups. Because the PowerCenter Server issues a SELECT statement for each
row passing into the Lookup transformation, you can improve performance by indexing
the columns in the lookup condition.

Optimizing Multiple Lookups


If a mapping contains multiple lookups, even with caching enabled and enough heap
memory, the lookups can slow performance. By locating the Lookup transformations that
query the largest amounts of data, you can tune those lookups to improve overall
performance.
To see which Lookup transformations process the most data, examine the
Lookup_rowsinlookupcache counters for each Lookup transformation. The Lookup
transformations that have a large number in this counter might benefit from tuning their
lookup expressions. If those expressions can be optimized, session performance improves. For
hints on tuning expressions, see Optimizing Expressions on page 652.

Optimizing Filter Transformations


If you filter rows from the mapping, you can improve efficiency by filtering early in the data
flow. Instead of using a Filter transformation halfway through the mapping to remove a
sizable amount of data, use a source qualifier filter to remove those same rows at the source.
If you cannot move the filter into the source qualifier, move the Filter transformation as close
to the source qualifier as possible to remove unnecessary data early in the data flow.
In your filter condition, avoid using complex expressions. You can optimize Filter
transformations by using simple integer or true/false expressions in the filter condition.
Use a Filter or Router transformation to drop rejected rows from an Update Strategy
transformation if you do not need to keep rejected rows.

Optimizing Aggregator Transformations


Aggregator transformations often slow performance because they must group data before
processing it. Aggregator transformations need additional memory to hold intermediate
group results. You can optimize Aggregator transformations by performing the following
tasks:

Group by simple columns.

Use sorted input.

Use incremental aggregation.

Group By Simple Columns


You can optimize Aggregator transformations when you group by simple columns. When
possible, use numbers instead of strings and dates in the columns used for the GROUP BY.
You should also avoid complex expressions in the Aggregator expressions.

Use Sorted Input


You can increase session performance by sorting data and using the Aggregator Sorted Input
option.
The Sorted Input decreases the use of aggregate caches. When you use the Sorted Input
option, the PowerCenter Server assumes all data is sorted by group. As the PowerCenter
Server reads rows for a group, it performs aggregate calculations. When necessary, it stores
group information in memory.
The Sorted Input option reduces the amount of data cached during the session and improves
performance. Use this option with the Source Qualifier Number of Sorted Ports option to
pass sorted data to the Aggregator transformation.
You can benefit from better performance when you use the Sorted Input option in sessions
with multiple partitions.
For details about using Sorted Input in the Aggregator transformation, see Aggregator
Transformation in the Transformation Guide.

Use Incremental Aggregation


If you can capture changes from the source that change less than half the target, you can use
incremental aggregation to optimize the performance of Aggregator transformations.
When you use incremental aggregation, you apply captured changes in the source to aggregate
calculations in a session. The PowerCenter Server updates your target incrementally, rather
than processing the entire source and recalculating the same calculations every time you run the
session.
For details on using Incremental Aggregation, see Using Incremental Aggregation on
page 573.

Optimizing Joiner Transformations


Joiner transformations can slow performance because they need additional space at run time
to hold intermediate results. You can view Joiner performance counter information to
determine whether you need to optimize the Joiner transformations.
Joiner transformations need a data cache to hold the master table rows and an index cache to
hold the join columns from the master table. You need to make sure that you have enough
memory to hold the data and the index cache so the system does not page to disk. To
minimize memory requirements, you can also use the smaller table as the master table or join
on as few columns as possible.
The type of join you use can affect performance. Normal joins are faster than outer joins and
result in fewer rows. When possible, use database joins for homogenous sources.


Optimizing Sequence Generator Transformations


You can optimize Sequence Generator transformations by creating a reusable Sequence
Generator and using it in multiple mappings simultaneously. You can also optimize Sequence
Generator transformations by configuring the Number of Cached Values property.
The Number of Cached Values property determines the number of values the PowerCenter
Server caches at one time. Make sure that the Number of Cached Values is not too small. You
may consider configuring the Number of Cached Values to a value greater than 1,000.
For details on configuring Sequence Generator transformation, see Sequence Generator
Transformation in the Transformation Guide.

Optimizing Expressions
As a final step in tuning the mapping, you can focus on the expressions used in
transformations. When examining expressions, focus on complex expressions for possible
simplification. Remove expressions one-by-one to isolate the slow expressions.
Once you locate the slowest expressions, take a closer look at how you can optimize those
expressions.

Factoring Out Common Logic


If the mapping performs the same task in several places, reduce the number of times the
mapping performs the task by moving the task earlier in the mapping. For example, you have
a mapping with five target tables. Each target requires a Social Security number lookup.
Instead of performing the lookup five times, place the Lookup transformation in the mapping
before the data flow splits. Then pass lookup results to all five targets.

Minimizing Aggregate Function Calls


When writing expressions, factor out as many aggregate function calls as possible. Each time
you use an aggregate function call, the PowerCenter Server must search and group the data.
For example, in the following expression, the PowerCenter Server reads COLUMN_A, finds
the sum, then reads COLUMN_B, finds the sum, and finally finds the sum of the two sums:
SUM(COLUMN_A) + SUM(COLUMN_B)

If you factor out the aggregate function call, as below, the PowerCenter Server adds
COLUMN_A to COLUMN_B, then finds the sum of both.
SUM(COLUMN_A + COLUMN_B)

Replacing Common Sub-Expressions with Local Variables


If you use the same sub-expression several times in one transformation, you can make that
sub-expression a local variable. You can use a local variable only within the transformation,
but by calculating the variable only once, you can speed performance. For details, see
Transformations in the Designer Guide.


Choosing Numeric versus String Operations


The PowerCenter Server processes numeric operations faster than string operations. For
example, if you look up large amounts of data on two columns, EMPLOYEE_NAME and
EMPLOYEE_ID, configuring the lookup around EMPLOYEE_ID improves performance.

Optimizing Char-Char and Char-Varchar Comparisons


When the PowerCenter Server performs comparisons between CHAR and VARCHAR
columns, it slows each time it finds trailing blank spaces in the row. You can use the Treat
CHAR as CHAR On Read option in the PowerCenter Server setup so that the PowerCenter
Server does not trim trailing spaces from the end of Char source fields. For details, see the
Installation and Configuration Guide.

Choosing DECODE versus LOOKUP


When you use a LOOKUP function, the PowerCenter Server must look up a table in a
database. When you use a DECODE function, you incorporate the lookup values into the
expression itself, so the PowerCenter Server does not have to look up a separate table.
Therefore, when you want to look up a small set of unchanging values, using DECODE may
improve performance. For details on using a DECODE, see the Transformation Language
Reference.

Using Operators Instead of Functions


The PowerCenter Server reads expressions written with operators faster than expressions with
functions. Where possible, use operators to write your expressions. For example, if you have
an expression that involves nested CONCAT calls such as:
CONCAT( CONCAT( CUSTOMERS.FIRST_NAME, ' ' ), CUSTOMERS.LAST_NAME )

you can rewrite that expression with the || operator as follows:


CUSTOMERS.FIRST_NAME || ' ' || CUSTOMERS.LAST_NAME

Optimizing IIF Expressions


IIF expressions can return a value as well as an action, which allows for more compact
expressions. For example, say you have a source with three Y/N flags: FLG_A, FLG_B,
FLG_C, and you want to return values such that: If FLG_A = Y, then return = VAL_A. If
FLG_A = Y AND FLG_B = Y, then return = VAL_A + VAL_B, and so on for all the
permutations.
One way to write the expression is as follows:
IIF( FLG_A = 'Y' and FLG_B = 'Y' AND FLG_C = 'Y',
VAL_A + VAL_B + VAL_C,
IIF( FLG_A = 'Y' and FLG_B = 'Y' AND FLG_C = 'N',
VAL_A + VAL_B ,
IIF( FLG_A = 'Y' and FLG_B = 'N' AND FLG_C = 'Y',
VAL_A + VAL_C,
IIF( FLG_A = 'Y' and FLG_B = 'N' AND FLG_C = 'N',
VAL_A ,
IIF( FLG_A = 'N' and FLG_B = 'Y' AND FLG_C = 'Y',
VAL_B + VAL_C,
IIF( FLG_A = 'N' and FLG_B = 'Y' AND FLG_C = 'N',
VAL_B ,
IIF( FLG_A = 'N' and FLG_B = 'N' AND FLG_C = 'Y',
VAL_C,
IIF( FLG_A = 'N' and FLG_B = 'N' AND FLG_C = 'N',
0.0,
))))))))

This first expression requires 8 IIFs, 16 ANDs, and at least 24 comparisons.


But if you take advantage of the IIF function's ability to return a value, you can rewrite that
expression as:
IIF(FLG_A='Y', VAL_A, 0.0)+ IIF(FLG_B='Y', VAL_B, 0.0)+ IIF(FLG_C='Y',
VAL_C, 0.0)

This results in three IIFs, three comparisons, two additions, and a faster session.

Evaluating Expressions
If you are not sure which expressions slow performance, the following steps can help isolate
the problem.
To evaluate expression performance:

1. Time the session with the original expressions.
2. Copy the mapping and replace half of the complex expressions with a constant.
3. Run and time the edited session.
4. Make another copy of the mapping and replace the other half of the complex expressions with a constant.
5. Run and time the edited session.

Optimizing the Session


Once you optimize your source database, target database, and mapping, you can focus on
optimizing the session. You can perform the following tasks to improve overall performance:

Increase the number of partitions.

Reduce error tracing.

Remove staging areas.

Tune session parameters.

Table 25-1 lists the settings and values you can use to improve session performance:
Table 25-1. Session Tuning Parameters

Setting              Default Value       Suggested Minimum Value   Suggested Maximum Value
DTM Buffer Size      12,000,000 bytes    6,000,000 bytes           128,000,000 bytes
Buffer block size    64,000 bytes        4,000 bytes               128,000 bytes
Index cache size     1,000,000 bytes     1,000,000 bytes           12,000,000 bytes
Data cache size      2,000,000 bytes     2,000,000 bytes           24,000,000 bytes
Commit interval      10,000 rows         N/A                       N/A
High Precision       Disabled            N/A                       N/A
Tracing Level        Normal              Terse                     N/A

Pipeline Partitioning
If you purchased the partitioning option, you can increase the number of partitions in a
pipeline to improve session performance. Increasing the number of partitions allows the
PowerCenter Server to create multiple connections to sources and process partitions of source
data concurrently.
When you create a session, the Workflow Manager validates each pipeline in the mapping for
partitioning. You can specify multiple partitions in a pipeline if the PowerCenter Server can
maintain data consistency when it processes the partitioned data.
For details on partitioning sessions, see Pipeline Partitioning on page 663.

Allocating Buffer Memory


When the PowerCenter Server initializes a session, it allocates blocks of memory to hold
source and target data. The PowerCenter Server allocates at least two blocks for each source
and target partition. Sessions that use a large number of sources and targets might require
additional memory blocks. If the PowerCenter Server cannot allocate enough memory blocks
to hold the data, it fails the session.

By default, a session has enough buffer blocks for 83 sources and targets. If you run a session
that has more than 83 sources and targets, you can increase the number of available memory
blocks by adjusting the following session parameters:

DTM Buffer Size. Increase the DTM buffer size found in the Performance settings of the
Properties tab. The default setting is 12,000,000 bytes.

Default Buffer Block Size. Decrease the buffer block size found in the Advanced settings
of the Config Object tab. The default setting is 64,000 bytes.

To configure these settings, first determine the number of memory blocks the PowerCenter
Server requires to initialize the session. Then, based on default settings, you can calculate the
buffer size and/or the buffer block size to create the required number of session blocks.
If you have XML sources or targets in your mapping, use the number of groups in the XML
source or target in your calculation for the total number of sources and targets.
For example, you create a session that contains a single partition using a mapping that
contains 50 sources and 50 targets.
1. You determine that the session requires 200 memory blocks:
   [(total number of sources + total number of targets) * 2] = (session buffer blocks)
   (50 + 50) * 2 = 200
2. Next, based on default settings, you determine that you can change the DTM Buffer Size to 15,000,000, or you can change the Default Buffer Block Size to 54,000:
   (session buffer blocks) = (.9) * (DTM Buffer Size) / (Default Buffer Block Size) * (number of partitions)
   200 = .9 * 14222222 / 64000 * 1 (14,222,222 is the exact minimum; round it up to 15,000,000)
   or
   200 = .9 * 12000000 / 54000 * 1
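
The following sketch (Python, not part of PowerCenter) restates the arithmetic above so you
can plug in your own source, target, and partition counts; the 0.9 factor and the formula come
from the example in this section and are estimates, not an exact model of the server.

# Minimal sketch of the buffer-block arithmetic shown above.
def required_blocks(sources, targets):
    # The PowerCenter Server allocates at least two blocks per source and target partition.
    return (sources + targets) * 2

def dtm_buffer_size_for(blocks, block_size=64000, partitions=1):
    # blocks = 0.9 * dtm_size / block_size * partitions, solved for dtm_size
    return int(blocks * block_size / (0.9 * partitions))

def block_size_for(blocks, dtm_size=12000000, partitions=1):
    # blocks = 0.9 * dtm_size / block_size * partitions, solved for block_size
    return int(0.9 * dtm_size * partitions / blocks)

blocks = required_blocks(50, 50)
print(blocks, dtm_buffer_size_for(blocks), block_size_for(blocks))
# 200 14222222 54000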

Increasing DTM Buffer Size


The DTM Buffer Size setting specifies the amount of memory the PowerCenter Server uses as
DTM buffer memory. The PowerCenter Server uses DTM buffer memory to create the
internal data structures and buffer blocks used to bring data into and out of the PowerCenter
Server. When you increase the DTM buffer memory, the PowerCenter Server creates more
buffer blocks, which improves performance during momentary slowdowns.
Increasing DTM buffer memory allocation generally causes performance to improve initially
and then level off. When you increase the DTM buffer memory allocation, consider the total
memory available on the PowerCenter Server system.
If you do not see a significant increase in performance, DTM buffer memory allocation is not
a factor in session performance.
Note: Reducing the DTM buffer allocation can cause the session to fail early in the process

because the PowerCenter Server is unable to allocate memory to the required processes.


To increase DTM buffer size:


1. Go to the Performance settings of the Properties tab.
2. Increase the setting for DTM Buffer Size, and click OK.

The default for DTM Buffer Size is 12,000,000 bytes. Increase the setting in multiples of the
buffer block size, and then run and time the session after each increase.

Optimizing the Buffer Block Size


Depending on the session source data, you might need to increase or decrease the buffer block
size.
If the session mapping contains a large number of sources or targets, you might need to
decrease the buffer block size. For more information, see Allocating Buffer Memory on
page 655.
If you are manipulating unusually large rows of data, you can increase the buffer block size to
improve performance. If you do not know the approximate size of your rows, you can
determine the configured row size by following the steps below.
To evaluate needed buffer block size:
1. In the Mapping Designer, open the mapping for the session.
2. Open the target instance.
3. Click the Ports tab.
4. Add the precisions for all the columns in the target.
5. If you have more than one target in the mapping, repeat steps 2-4 for each additional target to calculate the precision for each target.
6. Repeat steps 2-5 for each source definition in your mapping.
7. Choose the largest precision of all the source and target precisions for the total precision in your buffer block size calculation.

The total precision represents the total bytes needed to move the largest row of data. For
example, if the total precision equals 33,000, then the PowerCenter Server requires 33,000
bytes in the buffers to move that row. If the buffer block size is 64,000 bytes, the PowerCenter
Server can move only one row at a time.
Ideally, a buffer should accommodate at least 20 rows at a time. So if the total precision is
greater than 32,000, increase the size of the buffers to improve performance.
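
The following sketch (Python, not part of PowerCenter) applies this guideline: it shows how
many of the largest rows fit in the default block and what block size would hold roughly 20 of
them.

# Minimal sketch: size the buffer block from the largest row precision.
def rows_per_block(block_size, total_precision):
    return block_size // total_precision

def suggested_block_size(total_precision, rows_per_block_target=20):
    return total_precision * rows_per_block_target

total_precision = 33000                          # largest source/target row, in bytes
print(rows_per_block(64000, total_precision))    # 1 row fits in the default block
print(suggested_block_size(total_precision))     # 660000 bytes for about 20 rows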
To increase buffer block size:
1. Go to the Advanced settings on the Config Object tab.
2. Increase the setting for Default Buffer Block Size, and click OK.

The default for this setting is 64,000 bytes. Increase this setting in relation to the size of the
rows. As with DTM buffer memory allocation, increasing buffer block size should improve
performance. If you do not see an increase, buffer block size is not a factor in session
performance.

Increasing the Cache Sizes


The PowerCenter Server uses the index and data caches for Aggregator, Rank, Lookup, and
Joiner transformations. The PowerCenter Server stores transformed data from Aggregator,
Rank, Lookup, and Joiner transformations in the data cache before returning it to the data
flow. It stores group information for those transformations in the index cache. If the allocated
data or index cache is not large enough to store the data, the PowerCenter Server stores the
data in a temporary disk file as it processes the session data. Each time the PowerCenter Server
pages to the temporary file, performance slows.
You can see when the PowerCenter Server pages to the temporary file by examining the
performance details. The Transformation_readfromdisk or Transformation_writetodisk
counters for any Aggregator, Rank, Lookup, or Joiner transformation indicate the number of
times the PowerCenter Server must page to disk to process the transformation. Since the data
cache is typically larger than the index cache, you should increase the data cache more than
the index cache.
For details on calculating the index and data cache size for Aggregator, Rank, Lookup, or
Joiner transformations, see Session Caches on page 613.

Increasing the Commit Interval


The Commit Interval setting determines the point at which the PowerCenter Server commits
data to the target tables. Each time the PowerCenter Server commits, performance slows.
Therefore, the smaller the commit interval, the more often the PowerCenter Server writes to
the target database, and the slower the overall performance.
If you increase the commit interval, the number of times the PowerCenter Server commits
decreases and performance improves.
When you increase the commit interval, consider the log file limits in the target database. If
the commit interval is too high, the PowerCenter Server may fill the database log file and
cause the session to fail.
Therefore, weigh the benefit of increasing the commit interval against the additional time you
would spend recovering a failed session.
Click the General Options settings of the Properties tab to review and adjust the commit
interval.

Disabling High Precision


If a session runs with high precision enabled, disabling high precision might improve session
performance.


The Decimal datatype is a numeric datatype with a maximum precision of 28. To use a high
precision Decimal datatype in a session, configure the PowerCenter Server to recognize this
datatype by selecting Enable High Precision in the session properties. However, since reading
and manipulating the high precision datatype slows the PowerCenter Server, you can improve
session performance by disabling high precision.
When you disable high precision, the PowerCenter Server converts data to a double. For
example, the PowerCenter Server reads the Decimal value 3900058411382035317455530282 as
390005841138203 x 10^13. For details on high precision, see Handling High Precision Data on
page 204.
Click the Performance settings on the Properties tab to enable high precision.

Reducing Error Tracing


If a session contains a large number of transformation errors that you have no time to correct,
you can improve performance by reducing the amount of data the PowerCenter Server writes
to the session log.
To reduce the amount of time spent writing to the session log file, set the tracing level to
Terse. You specify Terse tracing if your sessions run without problems and you don't need
session details. At this tracing level, the PowerCenter Server does not write error messages or
row-level information for reject data.
To debug your mapping, set the tracing level to Verbose. However, Verbose tracing can
significantly impact session performance, so do not use it when you tune performance.
The session tracing level overrides any transformation-specific tracing levels within the
mapping. Reducing error tracing at the session level is not recommended as a long-term
response to high levels of transformation errors.
For more information about tracing levels, see Setting Tracing Levels on page 473.

Removing Staging Areas


When you use a staging area, the PowerCenter Server performs multiple passes on your data.
Where possible, remove staging areas to improve performance. The PowerCenter Server can
read multiple sources with a single pass, which may alleviate your need for staging areas. For
details on single-pass reading, see Optimizing the Mapping on page 647.


Optimizing the System


Often performance slows because your session relies on inefficient connections or an
overloaded PowerCenter Server system. System delays can also be caused by routers, switches,
network protocols, and usage by many users. After you determine from the system monitoring
tools that you have a system bottleneck, you can make the following global changes to
improve the performance of all your sessions:

Improve network speed. Slow network connections can slow session performance. Have
your system administrator determine if your network runs at an optimal speed. Decrease
the number of network hops between the PowerCenter Server and databases.

Use multiple PowerCenter Servers. Using multiple PowerCenter Servers on separate


systems might double or triple session performance.

Use a server grid. Use a collection of PowerCenter Servers to distribute and process the
workload of a workflow. For information on server grids, see Working with Server Grids
on page 446.

Improve CPU performance. Run the PowerCenter Server and related machines on high
performance CPUs, or configure your system to use additional CPUs.

Configure the PowerCenter Server for ASCII data movement mode. When all character
data processed by the PowerCenter Server is 7-bit ASCII or EBCDIC, configure the
PowerCenter Server for ASCII data movement mode.

Check hard disks on related machines. Slow disk access on source and target databases,
source and target file systems, as well as the PowerCenter Server and repository machines
can slow session performance. Have your system administrator evaluate the hard disks on
your machines.

Reduce paging. When an operating system runs out of physical memory, it starts paging to
disk to free physical memory. Configure the physical memory for the PowerCenter Server
machine to minimize paging to disk.

Use processor binding. In a multi-processor UNIX environment, the PowerCenter Server


may use a large amount of system resources. Use processor binding to control processor
usage by the PowerCenter Server.

Improving Network Speed


The performance of the PowerCenter Server is related to network connections. A local disk
can move data five to twenty times faster than a network. Consider the following options to
minimize network activity and to improve PowerCenter Server performance.
If you use flat file as a source or target in your session, you can move the files onto the
PowerCenter Server system to improve performance. When you store flat files on a machine
other than the PowerCenter Server, session performance becomes dependent on the
performance of your network connections. Moving the files onto the PowerCenter Server
system and adding disk space might improve performance.


If you use relational source or target databases, try to minimize the number of network hops
between the source and target databases and the PowerCenter Server. Moving the target
database onto a server system might improve PowerCenter Server performance.
When you run sessions that contain multiple partitions, have your network administrator
analyze the network and make sure it has enough bandwidth to handle the data moving across
the network from all partitions.

Using Multiple PowerCenter Servers


You can run multiple PowerCenter Servers on separate systems against the same repository.
Distributing the session load to separate PowerCenter Server systems increases performance.
For details on using multiple PowerCenter Servers, see Using Multiple Servers on page 443.

Using Server Grids


A server grid allows you to use the combined processing power of multiple PowerCenter
Servers to balance the workload of workflows. For more information about creating a server
grid, see Working with Server Grids on page 446.
In a server grid, a PowerCenter Server distributes sessions across the network of available
PowerCenter Servers. You can further improve performance by assigning a more powerful
server to run a complicated mapping. For more information about assigning a server to a
session, see Assigning the PowerCenter Server to a Session on page 198.

Running the PowerCenter Server in ASCII Data Movement Mode


When all character data processed by the PowerCenter Server is 7-bit ASCII or EBCDIC,
configure the PowerCenter Server to run in the ASCII data movement mode. In ASCII mode,
the PowerCenter Server uses one byte to store each character. When you run the PowerCenter
Server in Unicode mode, it uses two bytes for each character, which can slow session
performance.

Using Additional CPUs


Configure your system to use additional CPUs to improve performance. Additional CPUs
allow the system to run multiple sessions in parallel as well as multiple pipeline partitions in
parallel.
However, additional CPUs might cause disk bottlenecks. To prevent disk bottlenecks,
minimize the number of processes accessing the disk. Processes that access the disk include
database functions and operating system functions. Parallel sessions or pipeline partitions also
require disk access.


Reducing Paging
Paging occurs when the PowerCenter Server operating system runs out of memory for a
particular operation and uses the local disk for memory. You can free up more memory or
increase physical memory to reduce paging and the slow performance that results from
paging. Monitor paging activity using system tools.
You might want to increase system memory in the following circumstances:

You run a session that uses large cached lookups.

You run a session with many partitions.

If you cannot free up memory, you might want to add memory to the system.

Using Processor Binding


In a multi-processor UNIX environment, the PowerCenter Server may use a large amount of
system resources if you run a large number of sessions. As a result, other applications on the
machine may not have enough system resources available. You can use processor binding to
control processor usage by the PowerCenter Server.
In a Sun Solaris environment, the system administrator can create and manage a processor set
using the psrset command. The system administrator can then use the pbind command to
bind the PowerCenter Server to a processor set so the processor set only runs the PowerCenter
Server. The Sun Solaris environment also provides the psrinfo command to display details
about each configured processor, and the psradm command to change the operational status
of processors. For details, see your system administrator and Sun Solaris documentation.
In an HP-UX environment, the system administrator can use the Process Resource Manager
utility to control CPU usage in the system. The Process Resource Manager allocates
minimum system resources and uses a maximum cap of resources. For details, see your system
administrator and HP-UX documentation.
In an AIX environment, system administrators can use the Workload Manager in AIX 5L to
manage system resources during peak demands. The Workload Manager can allocate resources
and manage CPU, memory, and disk I/O bandwidth. For details, see your system
administrator and AIX documentation.


Pipeline Partitioning
Once you have tuned the application, databases, and system for maximum single-partition
performance, you may find that your system is under-utilized. At this point, you can
reconfigure your session to have two or more partitions. Adding partitions may improve
performance by utilizing more of the hardware while processing the session.
Use the following tips when you add partitions to a session:

Add one partition at a time. To best monitor performance, add one partition at a time,
and note your session settings before you add each partition.

Set DTM Buffer Memory. For a session with n partitions, this value should be at least n
times the value for the session with one partition.

Set cached values for Sequence Generator. For a session with n partitions, there should be
no need to use the Number of Cached Values property of the Sequence Generator
transformation. If you must set this value to a value greater than zero, make sure it is at
least n times the original value for the session with one partition.

Partition the source data evenly. Configure each partition to extract the same number of
rows.

Monitor the system while running the session. If there are CPU cycles available (twenty
percent or more idle time) then this session might see a performance improvement by
adding a partition.

Monitor the system after adding a partition. If the CPU utilization does not go up, the
wait for I/O time goes up, or the total data transformation rate goes down, then there is
probably a hardware or software bottleneck. If the wait for I/O time goes up a significant
amount, then check the system for hardware bottlenecks. Otherwise, check the database
configuration.

Tune databases and system. Make sure that your databases are tuned properly for parallel
ETL and that your system has no bottlenecks.

For details on pipeline partitioning, see Pipeline Partitioning on page 345.

Optimizing the Source Database for Partitioning


Usually, each partition on the reader side represents a subset of the data to be processed. But if
the database is not tuned properly, the results may not make your session any quicker. This is
fairly easy to test. Create a pipeline with one partition. Measure the reader throughput in the
Workflow Manager. After you do this, add partitions. Is the throughput scaling linearly? In
other words, if you have two partitions, is your reader throughput twice as fast? If this is not
true, you probably need to tune your database.
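
The following sketch (Python, not part of PowerCenter) expresses that scaling check; the
throughput figures are hypothetical values you would measure for the one-partition and
two-partition runs.

# Minimal sketch: does reader throughput scale with the number of partitions?
def scaling_efficiency(single_partition_rate, partitioned_rate, partitions):
    # 1.0 means perfectly linear scaling; a value well below 1.0 suggests the
    # source database is serializing the partition queries and needs tuning.
    return partitioned_rate / (single_partition_rate * partitions)

print(round(scaling_efficiency(10000, 14000, partitions=2), 2))   # 0.7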
Some databases may have specific options that must be set to enable parallel queries. You
should check your individual database manual for these options. If these options are off, the
PowerCenter Server runs multiple partition SELECT statements serially.


You can also consider adding partitions to increase the speed of your query. Each database
provides an option to separate the data into different tablespaces. If your database allows it,
you can use the SQL override feature to provide a query that extracts data from a single
partition.
To maximize a single-sorted query on your database, you need to look at options that enable
parallelization. There are many options in each database that may increase the speed of your
query.
Here are some configuration options to look for in your source database:

Check for configuration parameters that perform automatic tuning. For example, Oracle
has a parameter called parallel_automatic_tuning.

Make sure intra-parallelism (the ability to run multiple threads on a single query) is
enabled. For example, on Oracle you should look at parallel_adaptive_multi_user. On
DB2, you should look at intra_parallel.

Maximum number of parallel processes that are available for parallel executions. For
example, on Oracle, you should look at parallel_max_servers. On DB2, you should look at
max_agents.

Size for various resources used in parallelization. For example, Oracle has parameters such
as large_pool_size, shared_pool_size, hash_area_size, parallel_execution_message_size,
and optimizer_percent_parallel. DB2 has configuration parameters such as dft_fetch_size,
fcm_num_buffers, and sort_heap.

Degrees of parallelism (may occur as either a database configuration parameter or an


option on the table or query). For example, Oracle has parameters
parallel_threads_per_cpu and optimizer_percent_parallel. DB2 has configuration
parameters such as dft_prefetch_size, dft_degree, and max_query_degree.

Turn off options that may affect your database scalability. For example, disable archive
logging and timed statistics on Oracle.

Note: The above examples are not a comprehensive list of all the tuning options available to

you on the databases. Check your individual database documentation for all performance
tuning configuration parameters available.

Optimizing the Target Database for Partitioning


If you have a mapping with multiple partitions, you want the throughput for each partition to
be the same as the throughput for a single partition session. If you do not see this correlation,
then your database is probably inserting rows into the database serially.
To make sure that your database inserts rows in parallel, check the following configuration
options in your target database:


Look for a configuration option that needs to be set explicitly to enable parallel inserts.
For example, Oracle has db_writer_processes, and DB2 has max_agents (some databases
may have this enabled by default).


Consider partitioning your target table. If it is possible, try to have each partition write to
a single database partition. You can use the Router transformation to do this. Also, look
into having the database partitions on separate disks to prevent I/O contention among the
pipeline partitions.

Turn off options that may affect your database scalability. For example, disable archive
logging and timed statistics on Oracle.


Appendix A

Session Properties Reference
This appendix contains a listing of settings in the session properties. These settings are
grouped by the following tabs:

General Tab, 668

Properties Tab, 670

Config Object Tab, 675

Mapping Tab (Transformations View), 681

Mapping Tab (Partitions View), 705

Components Tab, 710

Metadata Extensions Tab, 718


General Tab
By default, the General tab appears when you edit a session task.
Figure A-1 displays the General tab:
Figure A-1. General Tab

On the General tab you can rename the session task and enter a description for the session
task.
Table A-1 describes settings on the General tab:
Table A-1. General Tab

Rename (Optional). The Rename button allows you to enter a new name for the session task.
Description (Optional). You can enter a description for the session task in the Description field.
Mapping name (Required). The name of the mapping associated with the session task.
Server (Required). The name of the server associated with the session task.
Fail Parent if this task fails* (Optional). Fails the parent worklet or workflow if this task fails.
Fail parent if this task does not run* (Optional). Fails the parent worklet or workflow if this task does not run.
Disable this task* (Optional). Disables the task.
Treat the input links as AND or OR* (Required). Runs the task when all or one of the input link conditions evaluate to True.

*Appears only in the Workflow Designer.

Properties Tab
On the Properties tab you can configure the following settings:

General Options. General Options settings allow you to configure session log file name,
session log file directory, parameter filename and other general session settings. For more
information, see General Options Settings on page 670.

Performance. The Performance settings allow you to increase memory size, collect
performance details, and set configuration parameters. For more information, see
Performance Settings on page 673.

General Options Settings


You can configure General Options settings on the Properties tab. You can enter session log
file name, session log file directory, and other general session settings.
Figure A-2 displays the General Options settings on the Properties tab:
Figure A-2. Properties Tab - General Options Settings


Table A-2 describes the General Options settings on the Properties tab:
Table A-2. Properties Tab - General Options Settings

- Session Log File Name (Optional). By default, the PowerCenter Server uses the session name for the log file name: s_mapping name.log. For a debug session, it uses DebugSession_mapping name.log. Optionally enter a file name, a file name and directory, or use the $PMSessionLogFile session parameter. The PowerCenter Server appends information in this field to that entered in the Session Log File Directory field. For example, if you have C:\session_logs\ in the Session Log File Directory field and enter logname.txt in the Session Log File Name field, the PowerCenter Server writes logname.txt to the C:\session_logs\ directory. You can also use the $PMSessionLogFile session parameter to represent the name of the session log or the name and location of the session log. For details on session parameters, see Session Parameters on page 495.
- Session Log File Directory (Required). Designates a location for the session log file. By default, the PowerCenter Server writes the log file in the server variable directory, $PMSessionLogFileDir. If you enter a full directory and file name in the Session Log File Name field, clear this field.
- Parameter File Name (Optional). Designates the name and directory for the parameter file. Use the parameter file to define session parameters. You can also use it to override values of mapping parameters and variables. For details on session parameters, see Session Parameters on page 495. For details on mapping parameters and variables, see Mapping Parameters and Variables in the Designer Guide.
- Enable Test Load (Optional). You can configure the PowerCenter Server to perform a test load. With a test load, the PowerCenter Server reads and transforms data without writing to targets. The PowerCenter Server generates all session files, and performs all pre- and post-session functions, as if running the full session. The PowerCenter Server writes data to relational targets, but rolls back the data when the session completes. For all other target types, such as flat file and SAP BW, the PowerCenter Server does not write data to the targets. Enter the number of source rows you want to test in the Number of Rows to Test field. You cannot perform a test load on sessions using XML sources. Note: You can perform a test load when you configure a session for normal mode. If you configure the session for bulk mode, the session fails.
- Number of Rows to Test (Optional). Enter the number of source rows you want the PowerCenter Server to test load. The PowerCenter Server reads the exact number you configure for the test load. You cannot perform a test load when you run a session against a mapping that contains XML sources.
- $Source Connection Value (Optional). Enter the database connection you want the PowerCenter Server to use for the $Source variable. Choose a relational or application database connection. You can also choose a $DBConnection parameter. You can use the $Source variable in Lookup and Stored Procedure transformations to specify the database location for the lookup table or stored procedure. If you use $Source in a mapping, you can specify the database location in this field to ensure the PowerCenter Server uses the correct database connection to run the session. If you use $Source in a mapping, but do not specify a database connection in this field, the PowerCenter Server determines which database connection to use when it runs the session. If it cannot determine the database connection, it fails the session. For more information, see Lookup Transformation and Stored Procedure Transformation in the Transformation Guide.
- $Target Connection Value (Optional). Enter the database connection you want the PowerCenter Server to use for the $Target variable. Choose a relational or application database connection. You can also choose a $DBConnection parameter. You can use the $Target variable in Lookup and Stored Procedure transformations to specify the database location for the lookup table or stored procedure. If you use $Target in a mapping, you can specify the database location in this field to ensure the PowerCenter Server uses the correct database connection to run the session. If you use $Target in a mapping, but do not specify a database connection in this field, the PowerCenter Server determines which database connection to use when it runs the session. If it cannot determine the database connection, it fails the session. For more information, see Lookup Transformation and Stored Procedure Transformation in the Transformation Guide.
- Treat Source Rows As (Required). Indicates how the PowerCenter Server treats all source rows. If the mapping for the session contains an Update Strategy transformation or a Custom transformation configured to set the update strategy, the default option is Data Driven. When you select Data Driven and you load to either a Microsoft SQL Server or Oracle database, you must use a normal load. If you bulk load, the PowerCenter Server fails the session.
- Commit Type (Required). Determines whether the PowerCenter Server uses a source-based, target-based, or user-defined commit. You can choose source- or target-based commit if the mapping has no Transaction Control transformation or only ineffective Transaction Control transformations. By default, the PowerCenter Server performs a target-based commit. A user-defined commit is enabled by default if the mapping has effective Transaction Control transformations. For details on commit intervals, see Setting Commit Properties on page 292.
- Commit Interval (Required). In conjunction with the selected commit interval type, indicates the number of rows. By default, the PowerCenter Server uses a commit interval of 10,000 rows. This option is not available for user-defined commit.
- Commit On End Of File (Required). By default, this option is enabled and the PowerCenter Server performs a commit at the end of the file. Clear this option if you want to roll back open transactions. For a target-based commit, this option is enabled by default and cannot be disabled.
- Rollback Transactions on Errors (Optional). For source-based commit, the PowerCenter Server rolls back the transaction at the next commit point when it encounters a non-fatal writer error. For user-defined commit, the PowerCenter Server rolls back the transaction at the next commit point when it encounters a non-fatal error. This option is not available for target-based commit.

Tip: When you bulk load to Microsoft SQL Server or Oracle targets, define a large commit interval. Microsoft SQL Server and Oracle start a new bulk load transaction after each commit. Increasing the commit interval reduces the number of bulk load transactions and increases performance.
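
For reference, a parameter file is a plain text file of name=value assignments grouped under a heading that identifies the session. The entry below is only a sketch: the folder, workflow, session, connection, and mapping parameter names are hypothetical, and the exact heading formats and naming rules are described in Session Parameters on page 495 and in Mapping Parameters and Variables in the Designer Guide.

[ProductionFolder.WF:wf_load_orders.ST:s_m_load_orders]
$PMSessionLogFile=s_m_load_orders.log
$DBConnectionSource=oracle_dev_src
$InputFileName=orders_aug.dat
$BadFileName=orders_rejects.bad
$$LoadStartDate=08/01/2004

Enter the name (and, if it is not in the default location, the directory) of such a file in the Parameter File Name field so the PowerCenter Server can resolve the session parameters when it runs the session.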

Performance Settings
You can configure performance settings on the Properties tab. In Performance settings you
can increase memory size, collect performance details, and set configuration parameters.
Figure A-3 displays the Performance settings on the Properties tab:
Figure A-3. Properties Tab - Performance Settings


Table A-3 describes the Performance settings on the Properties tab:


Table A-3. Properties Tab - Performance Settings

- DTM Buffer Size (Required). The amount of memory allocated to the session from the DTM process. By default, the Workflow Manager allocates 12 MB for DTM buffer memory. If a session contains large amounts of character data and you configure it to run in Unicode mode, increase the DTM Buffer Size to 24 MB. Note: If a source contains a large binary object with a precision larger than the allocated DTM buffer size, increase the DTM buffer size; otherwise, the session fails. For information on improving session performance, see Performance Tuning on page 635.
- Collect Performance Data (Optional). When selected, the PowerCenter Server creates session performance details. Use this file to help determine how you can improve session performance. For more information, see Performance Tuning on page 635.
- Incremental Aggregation (Optional). Select the Incremental Aggregation option if you want the PowerCenter Server to perform incremental aggregation. For details, see Using Incremental Aggregation on page 573.
- Reinitialize Aggregate Cache (Optional). Select Reinitialize Aggregate Cache if the session is an incremental aggregation session and you want to overwrite existing aggregate files. After a single session run, to return to a normal incremental aggregation session run, you must clear this option. For details, see Using Incremental Aggregation on page 573.
- Enable High Precision (Optional). When selected, the PowerCenter Server processes the Decimal datatype to a precision of 28. If a session does not use the Decimal datatype, leave this setting clear. For details on using the Decimal datatype with high precision, see Handling High Precision Data on page 204.
- Session Retry On Deadlock (Optional). Select this option if you want the PowerCenter Server to retry target writes on deadlock. You can only use Session Retry on Deadlock for sessions configured for normal load; this option is disabled for bulk mode. You can configure the PowerCenter Server to set the number of deadlock retries and the deadlock sleep time period.
- Session Sort Order (Required). Specify a sort order for the session. The session properties display all sort orders associated with the PowerCenter Server code page. When the PowerCenter Server runs in Unicode mode, it sorts character data in the session using the selected sort order. When the PowerCenter Server runs in ASCII mode, it ignores this setting and uses a binary sort order to sort character data.

Config Object Tab


The Config Object tab displays settings such as session log settings, error handling settings,
and other advanced properties. You can override properties in the default session
configuration in the Config Object tab. Or, you can choose a session configuration object you
already created in the Workflow Manager and override its properties.
Click the Open button in the Config Name field to choose the session configuration object
you want to override.
You can configure the following settings in the Config Object tab:

- Advanced. Advanced settings allow you to configure constraint-based loading, lookup caches, and buffer sizes. For more information, see Advanced Settings on page 675.
- Log Options. Log options allow you to configure how you want to save the session log. By default, the PowerCenter Server saves only the current session log. For more information, see Log Options Settings on page 677.
- Error Handling. Error Handling settings allow you to determine if the session fails or continues when it encounters pre-session command errors, stored procedure errors, or a specified number of session errors. For more information, see Error Handling Settings on page 678.

Advanced Settings
Advanced settings allow you to configure constraint-based loading, lookup caches, and buffer
sizes.


Figure A-4 displays the Advanced settings on the Config Object tab:
Figure A-4. Config Object Tab - Advanced Settings

Table A-4 describes the Advanced settings of the Config Object tab:
Table A-4. Config Object Tab - Advanced Settings

- Constraint Based Load Ordering (Optional). The PowerCenter Server loads targets based on primary key-foreign key constraints where possible.
- Cache Lookup() Function (Optional). If selected, the PowerCenter Server caches PowerMart 3.5 LOOKUP functions in the mapping, overriding mapping-level LOOKUP configurations. If not selected, the PowerCenter Server performs lookups on a row-by-row basis, unless otherwise specified in the mapping.
- Default Buffer Block Size (Optional). This setting is performance related. For details on performance tuning, see Performance Tuning on page 635. Note: The session must have enough buffer blocks to initialize. The minimum number of buffer blocks must be greater than the total number of sources (Source Qualifiers, Normalizers for COBOL sources) and targets. The number of buffer blocks in a session = DTM Buffer Size / Buffer Block Size. Default settings create enough buffer blocks for 83 sources and targets. If the session contains more than 83, you might need to increase DTM Buffer Size or decrease Default Buffer Block Size.
- Line Sequential Buffer Length (Optional). Affects the way the PowerCenter Server reads flat files. Increase this setting from the default of 1024 bytes per line only if source flat file records are larger than 1024 bytes.
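
As a worked illustration of the buffer block arithmetic above (the figures are hypothetical, not defaults or recommendations): if DTM Buffer Size is 12,000,000 bytes and Default Buffer Block Size is 64,000 bytes, the session has 12,000,000 / 64,000 = 187 buffer blocks. If that is not enough for the number of sources and targets in the mapping, increase DTM Buffer Size on the Properties tab or decrease Default Buffer Block Size here.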

Log Options Settings


Log options allow you to configure how you want to save the session log. By default, the
PowerCenter Server saves only the current session log.
Figure A-5 displays the Log Options settings on the Config Object tab:
Figure A-5. Config Object Tab - Log Option Settings


Table A-5 displays the Log Options settings of the Config Object tab:
Table A-5. Config Object Tab - Log Options Settings

- Save Session Log By (Required). If you select Save Session Log by Timestamp, the PowerCenter Server saves all session logs, appending a timestamp to each log. If you select Save Session Log by Runs, the PowerCenter Server saves a designated number of session logs. Configure the number of sessions in the Save Session Log for These Runs option. You can also use the $PMSessionLogCount server variable to save the configured number of session logs for the PowerCenter Server. For details on these options, see Configuring Session Logs on page 469.
- Save Session Log for These Runs (Required). The number of historical session logs you want the PowerCenter Server to save. The PowerCenter Server saves the number of historical logs you specify, plus the most recent session log. Therefore, if you specify 5 runs, the PowerCenter Server saves the most recent session log, plus historical logs 0-4, for a total of 6 logs. You can specify up to 2,147,483,647 historical logs. If you specify 0 logs, the PowerCenter Server saves only the most recent session log.

Error Handling Settings


Error Handling settings allow you to determine if the session fails or continues when it
encounters pre-session command errors, stored procedure errors, or a specified number of
session errors.


Figure A-6 displays the Error Handling settings on the Config Object tab:
Figure A-6. Config Object Tab - Error Handling Settings

Table A-6 describes the Error handling settings of the Config Object tab:
Table A-6. Config Object Tab - Error Handling Settings

- Stop On Errors (Optional). Indicates how many non-fatal errors the PowerCenter Server can encounter before it stops the session. Non-fatal errors include reader, writer, and DTM errors. Enter the number of non-fatal errors you want to allow before stopping the session. The PowerCenter Server maintains an independent error count for each source, target, and transformation. If you specify 0, non-fatal errors do not cause the session to stop. Optionally use the $PMSessionErrorThreshold server variable to stop on the configured number of errors for the PowerCenter Server.
- Override Tracing (Optional). Overrides tracing levels set at the transformation level. Selecting this option enables a menu from which you choose a tracing level: None, Terse, Normal, Verbose Initialization, or Verbose Data. For details on tracing levels, see Configuring Session Logs on page 469.
- On Stored Procedure Error (Optional). Required if the session uses pre- or post-session stored procedures. If you select Stop Session, the PowerCenter Server stops the session on errors executing a pre-session or post-session stored procedure. If you select Continue Session, the PowerCenter Server continues the session regardless of errors executing pre-session or post-session stored procedures. By default, the PowerCenter Server stops the session on stored procedure error and marks the session failed.
- On Pre-Session Command Task Error (Optional). Required if the session has pre-session shell commands. If you select Stop Session, the PowerCenter Server stops the session on errors executing pre-session shell commands. If you select Continue Session, the PowerCenter Server continues the session regardless of errors executing pre-session shell commands. By default, the PowerCenter Server stops the session upon error.
- On Pre-Post SQL Error (Optional). Required if the session uses pre- or post-session SQL. If you select Stop Session, the PowerCenter Server stops the session on errors executing pre-session or post-session SQL. If you select Continue, the PowerCenter Server continues the session regardless of errors executing pre-session or post-session SQL. By default, the PowerCenter Server stops the session upon pre- or post-session SQL error and marks the session failed.
- Enable Recovery (Optional). Enables recovery for the session. For details on recovery, see Recovering Data on page 295.
- Error Log Type (Required). Specifies the type of error log to create. You can specify relational, file, or no log. By default, the Error Log Type is set to none.
- Error Log DB Connection (Optional). Specifies the database connection for a relational error log.
- Error Log Table Name Prefix (Optional). Specifies the table name prefix for a relational error log. Oracle and Sybase have a 30-character limit for table names. If a table name exceeds 30 characters, the session fails.
- Error Log File Directory (Optional). Specifies the directory where errors are logged. By default, the error log file directory is $PMBadFilesDir\.
- Error Log File Name (Optional). Specifies the error log file name. By default, the error log file name is PMError.log.
- Log Row Data (Optional). Specifies whether or not to log row data. By default, the check box is clear and row data is not logged.
- Log Source Row Data (Optional). Specifies whether or not to log source row data. By default, the check box is clear and source row data is not logged.
- Data Column Delimiter (Optional). Delimiter for string type source row data and transformation group row data. By default, the PowerCenter Server uses a pipe ( | ) delimiter. Verify that you do not use the same delimiter for the row data as the error logging columns. If you use the same delimiter, you may find it difficult to read the error log file.

Mapping Tab (Transformations View)


In the Transformations view of the Mapping tab, you can configure settings for connections,
sources, targets, and transformations.
You can configure the following nodes:

- Connections
- Sources
- Targets
- Transformations

Connections Node
The Connections node displays the source, target, lookup, stored procedure, FTP, external
loader, and queue connections. You can choose connection types and connection values. You
can also edit connection object values.
Figure A-7 displays the Connections settings on the Mapping tab:
Figure A-7. Mapping Tab - Connections Settings


Table A-7 describes the Connections settings on the Mapping tab:


Table A-7. Mapping Tab - Connections Settings

- Type (Required). Enter the connection type for relational and non-relational sources and targets. Specifies Relational for relational sources and targets. You can choose the following connection types for flat file, XML, and MQSeries sources and targets:
  - Queue. Select this connection type to access an MQSeries source if you are using MQ Source Qualifiers. For static MQSeries targets, set the connection type to FTP or Queue. For dynamic MQSeries targets, the connection type is set to Queue. MQSeries connections must be defined in the Workflow Manager prior to configuring sessions. For more information, see the PowerCenter Connect for IBM MQSeries User and Administrator Guide.
  - Loader. Select this connection type to use the External Loader to load output files to Teradata, Oracle, DB2, or Sybase IQ databases. If you select this option, select a configured loader connection in the Value column. To use this option, you must use a mapping with a relational target definition and choose File as the writer type on the Writers tab for the relational target instance. As the PowerCenter Server completes the session, it uses an external loader to load target files to the Oracle, Sybase IQ, DB2, or Teradata database. You cannot choose external loader for flat file or XML target definitions in the mapping. Note to Oracle 8 users: If you configure a session to write to an Oracle 8 external loader target table in bulk mode with NOT NULL constraints on any columns, the session may write the null character into a NOT NULL column if the mapping generates a NULL output. For details on using the external loader feature, see External Loading on page 523.
  - FTP. Select this connection type to use FTP to access the source or target directory for flat file and XML sources and targets. If you select this option, select a configured FTP connection in the Value column. FTP connections must be defined in the Workflow Manager prior to configuring sessions. For details on using FTP, see Using FTP on page 559.
  - None. Choose None when you want to read from a local flat file or XML file, or if you are using an associated source for an MQSeries session.
  The Type column also lists the connections in the mapping, such as the $Source connection value and $Target connection value. You can also configure connection information for Lookups and Stored Procedures.
- Partitions (N/A). Displays the partitions if the session is partitioned.
- Value (Required). Enter a source and target connection based on the value you choose in the Type column. You can also specify the $Source and $Target connection value:
  - $Source connection value. Enter the database connection you want the PowerCenter Server to use for the $Source variable. Choose a relational or application database connection. You can also choose a $DBConnection parameter. You can use the $Source variable in Lookup and Stored Procedure transformations to specify the database location for the lookup table or stored procedure. If you use $Source in a mapping, you can specify the database location in this field to ensure the PowerCenter Server uses the correct database connection to run the session. If you use $Source in a mapping, but do not specify a database connection in this field, the PowerCenter Server determines which database connection to use when it runs the session. If it cannot determine the database connection, it fails the session. For more information, see the Transformation Guide.
  - $Target connection value. Enter the database connection you want the PowerCenter Server to use for the $Target variable. Choose a relational or application database connection. You can also choose a $DBConnection parameter. You can use the $Target variable in Lookup and Stored Procedure transformations to specify the database location for the lookup table or stored procedure. If you use $Target in a mapping, you can specify the database location in this field to ensure the PowerCenter Server uses the correct database connection to run the session. If you use $Target in a mapping, but do not specify a database connection in this field, the PowerCenter Server determines which database connection to use when it runs the session. If it cannot determine the database connection, it fails the session. For more information, see the Transformation Guide.
  You can also specify the lookup and stored procedure location information value, if your mapping has lookups or stored procedures.

Sources Node
The Sources node lists the sources used in the session and displays their settings. If you want
to view and configure the settings of a specific source, select the source from the list.
You can configure the following settings:

- Readers. The Readers settings display the reader the PowerCenter Server uses with each source instance. For more information, see Readers Settings on page 684.
- Connections. The Connections settings allow you to configure connections for the sources. For more information, see Connections Settings on page 684.
- Properties. The Properties settings allow you to configure the source properties. For more information, see Properties Settings on page 686.


Readers Settings
You can view the reader the PowerCenter Server uses with each source instance. The Workflow Manager specifies the necessary reader for each source instance. For relational sources the reader is Relational Reader, and for file sources it is File Reader.
Figure A-8 displays the Readers settings on the Mapping tab (Sources node):
Figure A-8. Mapping Tab - Sources Node - Readers Settings

Connections Settings
You can configure the connections the PowerCenter Server uses with each source instance.


Figure A-9 displays the Connections settings on the Mapping tab (Sources node):
Figure A-9. Mapping Tab - Sources Node - Connections Settings

Table A-8 describes the Connections settings on the Mapping tab (Sources node):
Table A-8. Mapping Tab - Sources Node - Connections Settings

- Type (Required). Enter the connection type for relational and non-relational sources. Specifies Relational for relational sources. You can choose the following connection types for flat file, XML, and MQSeries sources:
  - Queue. Select this connection type to access an MQSeries source if you are using MQ Source Qualifiers. MQSeries connections must be defined in the Workflow Manager prior to configuring sessions. For more information, see the PowerCenter Connect for IBM MQSeries User and Administrator Guide.
  - FTP. Select this connection type to use FTP to access the source directory for flat file and XML sources. If you want to extract data from a flat file or XML source using FTP, you must specify an FTP connection when you configure source options. If you select this option, select a configured FTP connection in the Value column. FTP connections must be defined in the Workflow Manager prior to configuring sessions. For details on using FTP, see Using FTP on page 559.
  - None. Choose None when you want to read from a local flat file or XML file, or if you are using an associated source for an MQSeries session.
- Value (Required). Enter a source connection based on the value you choose in the Type column.

Properties Settings
Click the Properties settings to define source property information. The Workflow Manager
displays properties for both relational and file sources.
Figure A-10 displays the Properties settings on the Mapping tab (Sources node):
Figure A-10. Mapping Tab - Sources Node - Properties Settings

Table A-9 describes Properties settings on the Mapping tab for relational sources:
Table A-9. Mapping Tab - Sources Node - Properties Settings (Relational Sources)

- Owner Name (Optional). Specifies the table owner name.
- User Defined Join (Optional). Specifies the condition used to join data from multiple sources represented in the same Source Qualifier transformation. For more information about user defined joins, see Source Qualifier Transformation in the Transformation Guide.
- Tracing Level (N/A). Specifies the amount of detail included in the session log when you run a session containing this transformation. You can view the value of this attribute when you click Show all properties. For more information about tracing levels, see Setting Tracing Levels on page 473.
- Select Distinct (Optional). Selects unique rows.
- Pre SQL (Optional). Pre-session SQL commands to run against the source database before the PowerCenter Server reads the source. For more information about pre-session SQL, see Using Pre- and Post-Session SQL Commands on page 186.
- Post SQL (Optional). Post-session SQL commands to run against the source database after the PowerCenter Server writes to the target. For more information about post-session SQL, see Using Pre- and Post-Session SQL Commands on page 186.
- Sql Query (Optional). Defines a custom query that replaces the default query the PowerCenter Server uses to read data from sources represented in this Source Qualifier. A custom query overrides entries for a custom join or a source filter. For more information, see Overriding the SQL Query on page 216.
- Source Filter (Optional). Specifies the filter condition the PowerCenter Server applies when querying records. For more information, see Source Qualifier Transformation in the Transformation Guide.

Table A-10 describes the Properties settings on the Mapping tab for file sources:
Table A-10. Mapping Tab - Sources Node - Properties Settings (File Sources)

- Source File Directory (Optional). Enter the directory name in this field. By default, the PowerCenter Server looks in the server variable directory, $PMSourceFileDir, for file sources. If you specify both the directory and file name in the Source Filename field, clear this field. The PowerCenter Server concatenates this field with the Source Filename field when it runs the session. You can also use the $InputFileName session parameter to specify the file directory. For details on session parameters, see Session Parameters on page 495.
- Source Filename (Required). Enter the file name, or file name and path. Optionally use the $InputFileName session parameter for the file name. The PowerCenter Server concatenates this field with the Source File Directory field when it runs the session. For example, if you have C:\data\ in the Source File Directory field, then enter filename.dat in the Source Filename field. When the PowerCenter Server begins the session, it looks for C:\data\filename.dat. By default, the Workflow Manager enters the file name configured in the source definition. For details on session parameters, see Session Parameters on page 495.
- Source Filetype (Required). Allows you to configure multiple file sources using a file list. Indicates whether the source file contains the source data, or a list of files with the same file properties. Choose Direct if the source file contains the source data. Choose Indirect if the source file contains a list of files. When you select Indirect, the PowerCenter Server finds the file list, then reads each listed file when it executes the session. For details on file lists, see Using a File List on page 230.
- Set File Properties (Optional). Allows you to configure the file properties. For more information, see Setting File Properties for Sources on page 688.
- Datetime Format* (N/A). Displays the datetime format for datetime fields.
- Thousand Separator* (N/A). Displays the thousand separator for numeric fields.
- Decimal Separator* (N/A). Displays the decimal separator for numeric fields.

*You can view the value of this attribute when you click Show all properties. This attribute is read-only. For more information, see the Designer Guide.
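
To illustrate the Indirect option of the Source Filetype property: the file named in the Source Filename field is then a file list, a plain text file naming the source files to read, one per line. The paths below are hypothetical; see Using a File List on page 230 for the exact rules on file names and paths.

C:\data\orders_east.dat
C:\data\orders_west.dat
C:\data\orders_intl.dat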

Setting File Properties for Sources


Configure flat file properties by clicking the Set File Properties link in the Sources node. You
can define properties for both fixed-width and delimited flat file sources.
You can configure flat file properties for non-reusable sessions in the Workflow Designer and
for reusable sessions in the Task Developer.
Figure A-11 shows the Flat Files dialog box that appears when you click Set File Properties:
Figure A-11. Flat Files Dialog Box for Sources

Select the file type (fixed-width or delimited) you want to configure and click Advanced.

Configuring Fixed-Width Properties for Sources


To edit the fixed-width properties, select Fixed Width in the Flat Files dialog box and click
the Advanced button. The Fixed Width Properties dialog box appears.


Note: Edit these settings only if you need to override those configured in the source definition.

Figure A-12 displays the Fixed Width Properties dialog box for flat file sources:
Figure A-12. Fixed Width Properties

Table A-11 describes the options you define in the Fixed Width Properties dialog box for
sources:
Table A-11. Fixed-Width Properties for File Sources

- Null Character: Text/Binary (Required). Indicates the character representing a null value in the file. This can be any valid character in the file code page, or any binary value from 0 to 255. For more information about specifying null characters, see Null Character Handling on page 227.
- Repeat Null Character (Optional). If selected, the PowerCenter Server reads repeat null characters in a single field as a single null value. If you do not select this option, the PowerCenter Server reads a single null character at the beginning of a field as a null field. Important: For multibyte code pages, Informatica recommends that you specify a single-byte null character if you are using repeating non-binary null characters. This ensures that repeating null characters fit into the column exactly. For more information about specifying null characters, see Null Character Handling on page 227.
- Code Page (Required). Select the code page of the fixed-width file. The default setting is the client code page.
- Number of Initial Rows to Skip (Optional). The PowerCenter Server skips the specified number of rows before reading the file. Use this to skip header rows. One row may contain multiple records. If you select the Line Sequential File Format option, the PowerCenter Server ignores this option. You can enter any integer from zero to 2147483647.
- Number of Bytes to Skip Between Records (Optional). The PowerCenter Server skips the specified number of bytes between records. For example, you have an ASCII file on Windows with one record on each line, and a carriage return and line feed appear at the end of each line. If you want the PowerCenter Server to skip these two single-byte characters, enter 2. If you have an ASCII file on UNIX with one record for each line, ending in a carriage return, skip the single character by entering 1.
- Strip Trailing Blanks (Optional). If selected, the PowerCenter Server strips trailing blank spaces from records before passing them to the Source Qualifier transformation.
- Line Sequential File Format (Optional). Select this option if the file uses a carriage return at the end of each record, shortening the final column.

Configuring Delimited File Properties for Sources


To edit the delimited properties, select Delimited in the Flat Files dialog box and click the
Advanced button. The Delimited File Properties dialog box appears.
Note: Edit these settings only if you need to override those configured in the source definition.

Figure A-13 displays the Delimited File Properties dialog box for flat file sources:
Figure A-13. Delimited Properties for File Sources


Table A-12 describes the options you can define in the Delimited File Properties dialog box
for flat file sources:
Table A-12. Delimited Properties for File Sources

- Delimiters (Required). Character used to separate columns of data in the source file. Use the Browse button to the right of this field to enter a different delimiter. Delimiters can be either printable or single-byte unprintable characters, and must be different from the escape character and the quote character (if selected). You cannot select unprintable multibyte characters as delimiters. The delimiter must be in the same code page as the flat file code page.
- Optional Quotes (Required). Select None, Single, or Double. If you select a quote character, the PowerCenter Server ignores delimiter characters within the quote characters. Therefore, the PowerCenter Server uses quote characters to escape the delimiter. For example, a source file uses a comma as a delimiter and contains the following row: 342-3849, 'Smith, Jenna', 'Rockville, MD', 6. If you select the optional single quote character, the PowerCenter Server ignores the commas within the quotes and reads the row as four fields. If you do not select the optional single quote, the PowerCenter Server reads six separate fields. When the PowerCenter Server reads two optional quote characters within a quoted string, it treats them as one quote character. For example, the PowerCenter Server reads the following quoted string as I'm going tomorrow: 2353, 'I''m going tomorrow.', MD. Additionally, if you select an optional quote character, the PowerCenter Server only reads a string as a quoted string if the quote character is the first character of the field. Note: You can improve session performance if the source file does not contain quotes or escape characters.
- Code Page (Required). Select the code page of the delimited file. The default setting is the client code page.
- Escape Character (Optional). Character immediately preceding a delimiter character embedded in an unquoted string, or immediately preceding the quote character in a quoted string. When you specify an escape character, the PowerCenter Server reads the delimiter character as a regular character (called escaping the delimiter or quote character). Note: You can improve session performance for mappings containing Sequence Generator transformations if the source file does not contain quotes or escape characters.
- Remove Escape Character From Data (Optional). This option is selected by default. Clear this option to include the escape character in the output string.
- Treat Consecutive Delimiters as One (Optional). By default, the PowerCenter Server reads pairs of delimiters as a null value. If selected, the PowerCenter Server reads any number of consecutive delimiter characters as one. For example, a source file uses a comma as the delimiter character and contains the following record: 56, , , Jane Doe. By default, the PowerCenter Server reads that record as four columns separated by three delimiters: 56, NULL, NULL, Jane Doe. If you select this option, the PowerCenter Server reads the record as two columns separated by one delimiter: 56, Jane Doe.
- Number of Initial Rows to Skip (Optional). The PowerCenter Server skips the specified number of rows before reading the file. Use this to skip title or header rows in the file.
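
As a worked illustration of the escape character (the data is hypothetical): suppose the delimiter is a comma, no optional quote character is selected, and the escape character is a backslash. The source row

340\,129,Smith,MD

is read as three fields (340,129 | Smith | MD) rather than four, because the escaped comma is treated as a regular character. Since Remove Escape Character From Data is selected by default, the backslash itself does not appear in the first field; clear that option to keep it.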

Targets Node
The Targets node lists the targets used in the session and displays their settings. If you want to view and configure the settings of a specific target, select the target from the list.
You can configure the following settings:

- Writers. The Writers settings display the writer the PowerCenter Server uses with each target instance. For more information, see Writers Settings on page 692.
- Connections. The Connections settings allow you to configure connections for the targets. For more information, see Connections Settings on page 693.
- Properties. The Properties settings allow you to configure the target properties. For more information, see Properties Settings on page 695.

Writers Settings
You can view and configure the writer the PowerCenter Server uses with each target instance.
The Workflow Manager specifies the necessary writer for each target instance. For relational
targets the writer is Relational Writer and for file targets it is File Writer.


Figure A-14 displays the Writers settings on the Mapping tab (Targets node):
Figure A-14. Mapping Tab - Targets Node - Writers Settings

Table A-13 describes the Writers settings on the Mapping tab (Targets node):
Table A-13. Mapping Tab - Targets Node - Writers Settings

- Writers (Required). For relational targets, choose Relational Writer or File Writer. When the target in the mapping is a flat file, an XML file, a SAP BW target, or MQ target, the Workflow Manager specifies the necessary writer in the session properties. When you choose File Writer for a relational target you can use an external loader to load data to this target. For more information, see External Loading on page 523. When you override a relational target to use the file writer, the Workflow Manager changes the properties for that target instance on the Properties settings. It also changes the connection options you can define on the Connections settings. After you override a relational target to use a file writer, define the file properties for the target. Click Set File Properties and choose the target to define. For more information, see Configuring Fixed-Width Properties on page 265 and Configuring Delimited Properties on page 266.

Connections Settings
You can enter connection types and specific target database connections on the Targets node of the Mapping tab.

Figure A-15 displays the Connections settings on the Mapping tab (Targets node):
Figure A-15. Mapping Tab - Targets Node - Connections Settings


Table A-14 describes the Connections settings on the Mapping tab (Targets node):
Table A-14. Mapping Tab - Targets Node - Connections Settings

- Type (Required). Enter the connection type for non-relational targets. Specifies Relational for relational targets. You can choose the following connection types for flat file, XML, and MQ targets:
  - FTP. Select this connection type to use FTP to access the target directory for flat file and XML targets. If you want to load data to a flat file or XML target using FTP, you must specify an FTP connection when you configure target options. If you select this option, select a configured FTP connection in the Value column. FTP connections must be defined in the Workflow Manager prior to configuring sessions. For details on using FTP, see Using FTP on page 559.
  - External Loader. Select this connection type to use the External Loader to load output files to Teradata, Oracle, DB2, or Sybase IQ databases. If you select this option, select a configured loader connection in the Value column. To use this option, you must use a mapping with a relational target definition and choose File as the writer type on the Writers tab for the relational target instance. As the PowerCenter Server completes the session, it uses an external loader to load target files to the Oracle, Sybase IQ, DB2, or Teradata database. You cannot choose external loader for flat file or XML target definitions in the mapping. Note to Oracle 8 users: If you configure a session to write to an Oracle 8 external loader target table in bulk mode with NOT NULL constraints on any columns, the session may write the null character into a NOT NULL column if the mapping generates a NULL output. For details on using the external loader feature, see External Loading on page 523.
  - Queue. Choose Queue when you want to output to an MQSeries message queue. If you select this option, select a configured MQ connection in the Value column. For more information, see the PowerCenter Connect for IBM MQSeries User and Administrator Guide.
  - None. Choose None when you want to write to a local flat file or XML file.
- Partitions (N/A). Displays the partitions if the session is partitioned.
- Value (Required). Enter a target connection based on the value you choose in the Type column.

Properties Settings
Click the Properties settings to define target property information. The Workflow Manager
displays different properties for the different target types: relational, flat file, and XML.

Properties Settings for Relational Targets


You can configure the writer and object instance attributes for a relational target.


Figure A-16 displays the Properties settings on the Mapping tab for relational targets:
Figure A-16. Mapping Tab - Targets Node - Properties Settings (Relational)


Table A-15 describes the Properties settings on the Mapping tab for relational targets:
Table A-15. Mapping Tab - Targets Node - Properties Settings (Relational)

- Target Load Type (Required). You can choose Normal or Bulk. If you select Normal, the PowerCenter Server loads targets normally. You can only choose Bulk when you load to Sybase, Oracle, or Microsoft SQL Server. If you select Bulk for a Sybase, Oracle, or Microsoft SQL Server target, Informatica invokes the bulk API with default settings, bypassing database logging. If you select Bulk for other database types, the PowerCenter Server reverts to a normal load. Loading in bulk mode can improve session performance, but limits your ability to recover because no database logging occurs. Note: Choose Normal mode if the mapping contains an Update Strategy transformation. Tip: When you choose Bulk mode for Microsoft SQL Server or Oracle targets, define a large commit interval. Consider the following database limitations when you choose Bulk mode and load to Oracle:
  - Do not define CHECK constraints in the database.
  - Do not define primary-foreign keys in the database. However, you can define primary-foreign keys for the target definitions in the Designer.
  - Do not create indexes in the database.
  - When you use the LONG datatype, verify it is the last column in the table.
  For more information, see your Oracle documentation.
- Insert (Optional). If selected, the PowerCenter Server inserts all rows flagged for insert. By default, this option is selected. For details on target update strategies, see Update Strategy Transformation in the Transformation Guide.
- Update (as Update) (Optional). If selected, the PowerCenter Server updates all rows flagged for update. By default, this option is selected. For details on target update strategies, see Update Strategy Transformation in the Transformation Guide.
- Update (as Insert) (Optional). If selected, the PowerCenter Server inserts all rows flagged for update. By default, this option is not selected. For details on target update strategies, see Update Strategy Transformation in the Transformation Guide.
- Update (else Insert) (Optional). If selected, the PowerCenter Server updates rows flagged for update if they exist in the target, then inserts any remaining rows marked for insert. For details on target update strategies, see Update Strategy Transformation in the Transformation Guide.
- Delete (Optional). If selected, the PowerCenter Server deletes all rows flagged for delete. For details on target update strategies, see Update Strategy Transformation in the Transformation Guide.
- Truncate Table (Optional). If selected, the PowerCenter Server truncates the target before loading. For details on this feature, see Truncating Target Tables on page 245.
- Reject File Directory (Optional). Enter the directory name in this field. By default, the PowerCenter Server writes all reject files to the server variable directory, $PMBadFileDir. If you specify both the directory and file name in the Reject Filename field, clear this field. The PowerCenter Server concatenates this field with the Reject Filename field when it runs the session. You can also use the $BadFileName session parameter to specify the file directory. For details on session parameters, see Session Parameters on page 495.
- Reject Filename (Required). Enter the file name, or file name and path. By default, the PowerCenter Server names the reject file after the target instance name: target_name.bad. Optionally use the $BadFileName session parameter for the file name. The PowerCenter Server concatenates this field with the Reject File Directory field when it runs the session. For example, if you have C:\reject_file\ in the Reject File Directory field, and enter filename.bad in the Reject Filename field, the PowerCenter Server writes rejected rows to C:\reject_file\filename.bad. For details on session parameters, see Session Parameters on page 495.
- Rejected Truncated/Overflowed rows* (Optional). Instructs the PowerCenter Server to write the truncated and overflowed rows to the reject file.
- Update Override* (Optional). Overrides the default UPDATE statement.
- Table Name Prefix (Optional). Specifies the owner of the target tables.
- Pre SQL (Optional). You can enter pre-session SQL commands for a target instance in a mapping to execute commands against the target database before the PowerCenter Server reads the source.
- Post SQL (Optional). Enter post-session SQL commands to execute commands against the target database after the PowerCenter Server writes to the target.

*You can view the value of this attribute when you click Show all properties. This attribute is read-only. For more information, see the Designer Guide.
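
As an illustration of the Pre SQL and Post SQL fields, a target instance might use entries such as the following. The statements, table names, and the assumption of an Oracle target database are hypothetical.

Pre SQL:  DELETE FROM ORDERS_STG
Post SQL: UPDATE LOAD_AUDIT SET LOAD_DATE = SYSDATE WHERE TABLE_NAME = 'ORDERS_STG'

The PowerCenter Server runs the Pre SQL command against the target database before it reads the source and runs the Post SQL command after it writes to the target.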


Properties Settings for Flat File Targets


Figure A-17 displays the Properties settings on the Mapping tab for file targets:
Figure A-17. Mapping Tab - Targets Node - File Properties Settings

Table A-16 describes the Properties settings on the Mapping tab for file targets:
Table A-16. Mapping Tab - Targets Node - File Properties Settings

- Merge Partitioned Files (Optional). When selected, the PowerCenter Server merges the partitioned target files into one file when the session completes, and then deletes the individual output files. If the PowerCenter Server fails to create the merged file, it does not delete the individual output files. You cannot merge files if the session uses FTP, an external loader, or a message queue. For details on configuring a session for partitioning, see Pipeline Partitioning on page 345.
- Merge File Directory (Optional). Enter the directory name in this field. By default, the PowerCenter Server writes the merged file in the server variable directory, $PMTargetFileDir. If you enter a full directory and file name in the Merge File Name field, clear this field.
- Merge File Name (Optional). Name of the merge file. Default is target_name.out. This property is required if you select Merge Partitioned Files.
- Output File Directory (Optional). Enter the directory name in this field. By default, the PowerCenter Server writes output files in the server variable directory, $PMTargetFileDir. If you specify both the directory and file name in the Output Filename field, clear this field. The PowerCenter Server concatenates this field with the Output Filename field when it runs the session. You can also use the $OutputFileName session parameter to specify the file directory. For details on session parameters, see Session Parameters on page 495.
- Output Filename (Required). Enter the file name, or file name and path. By default, the Workflow Manager names the target file based on the target definition used in the mapping: target_name.out. If the target definition contains a slash character, the Workflow Manager replaces the slash character with an underscore. When you use an external loader to load to an Oracle database, you must specify a file extension. If you do not specify a file extension, the Oracle loader cannot find the flat file and the PowerCenter Server fails the session. For more information about external loading, see Loading to Oracle on page 533. Optionally use the $OutputFileName session parameter for the file name. The PowerCenter Server concatenates this field with the Output File Directory field when it runs the session. For details on session parameters, see Session Parameters on page 495. Note: If you specify an absolute path file name when using FTP, the PowerCenter Server ignores the Default Remote Directory specified in the FTP connection. When you specify an absolute path file name, do not use single or double quotes.
- Reject File Directory (Optional). Enter the directory name in this field. By default, the PowerCenter Server writes all reject files to the server variable directory, $PMBadFileDir. If you specify both the directory and file name in the Reject Filename field, clear this field. The PowerCenter Server concatenates this field with the Reject Filename field when it runs the session. You can also use the $BadFileName session parameter to specify the file directory. For details on session parameters, see Session Parameters on page 495.
- Reject Filename (Required). Enter the file name, or file name and path. By default, the PowerCenter Server names the reject file after the target instance name: target_name.bad. Optionally use the $BadFileName session parameter for the file name. The PowerCenter Server concatenates this field with the Reject File Directory field when it runs the session. For example, if you have C:\reject_file\ in the Reject File Directory field, and enter filename.bad in the Reject Filename field, the PowerCenter Server writes rejected rows to C:\reject_file\filename.bad. For details on session parameters, see Session Parameters on page 495.
- Set File Properties (Optional). Allows you to configure the file properties. For more information, see Setting File Properties for Targets on page 701.
- Datetime Format* (N/A). Displays the datetime format selected for datetime fields.
- Thousand Separator* (N/A). Displays the thousand separator for numeric fields.
- Decimal Separator* (N/A). Displays the decimal separator for numeric fields.

*You can view the value of this attribute when you click Show all properties. This attribute is read-only. For more information, see the Designer Guide.

Setting File Properties for Targets


Click the Set File Properties button on the Mapping tab to configure flat file properties. You
can define flat file properties for both fixed-width and delimited flat file targets.
You can configure flat file properties for non-reusable sessions in the Workflow Designer and
for reusable sessions in the Task Developer.
Figure A-18 shows the Flat Files dialog box that appears when you click Set File Properties:
Figure A-18. Flat Files Dialog Box for Targets

Select the file type (fixed-width or delimited) you want to configure and click Advanced.

Configuring Fixed-Width Properties for Targets


To edit the fixed-width properties, select Fixed Width in the Flat Files dialog box and click
the Advanced button. The Fixed Width Properties dialog box appears.


Figure A-19 displays the Fixed-Width Properties dialog box for flat file targets:
Figure A-19. Fixed-Width Properties for File Targets

Table A-17 describes the options you define in the Fixed Width Properties dialog box:
Table A-17. Fixed-Width Properties for File Targets

- Null Character (Required). Enter the character you want the PowerCenter Server to use to represent null values. You can enter any valid character in the file code page. For more information about specifying null characters for target files, see Null Characters in Fixed-Width Files on page 272.
- Repeat Null Character (Optional). Select this option to indicate a null value by repeating the null character to fill the field. If you do not select this option, the PowerCenter Server enters a single null character at the beginning of the field to represent a null value. For more information about specifying null characters for target files, see Null Characters in Fixed-Width Files on page 272.
- Code Page (Required). Select the code page of the fixed-width file. The default setting is the client code page.

Configuring Delimited Properties for Targets


To edit the delimited properties, select Delimited in the Flat Files dialog box and click the
Advanced button. The Delimited File Properties dialog box appears.
Figure A-20 displays the Delimited File Properties dialog box for flat file targets:
Figure A-20. Delimited Properties for File Targets


Table A-18 describes the options you can define in the Delimited File Properties dialog box for flat file targets:
Table A-18. Delimited Properties for File Targets

Delimiters (Required): Character used to separate columns of data. Use the Browse button to the right of this field to enter a non-printable delimiter. Delimiters can be either printable or single-byte unprintable characters, and must be different from the escape character and the quote character (if selected). You cannot select unprintable multibyte characters as delimiters.

Optional Quotes (Required): Select No Quotes, Single Quote, or Double Quotes. If you select a quote character, the PowerCenter Server does not treat delimiter characters within the quote characters as a delimiter. For example, suppose an output file uses a comma as a delimiter and the PowerCenter Server receives the following row: 342-3849, 'Smith, Jenna', 'Rockville, MD', 6. If you select the optional single quote character, the PowerCenter Server ignores the commas within the quotes and writes the row as four fields. If you do not select the optional single quote, the PowerCenter Server writes six separate fields.

Code Page (Required): Select the code page of the delimited file. The default setting is the client code page.

Transformations Node
On the Transformations node, you can override properties that you configure in transformation and target instances in a mapping. The attributes you can configure depend on the type of transformation you select.


Figure A-21 displays the Transformations node on the Mapping tab:


Figure A-21. Mapping Tab - Transformations Node


Mapping Tab (Partitions View)


In the Partitions view of the Mapping tab, you can configure partitions. You can configure
partitions for non-reusable sessions in the Workflow Designer and for reusable sessions in the
Task Developer.
The following nodes are available in the Partitions view:
- Partition Properties. For more information, see Partition Properties Node on page 705.
- KeyRange. For more information, see KeyRange Node on page 706.
- HashKeys. For more information, see HashKeys Node on page 706.
- Partition Points. For more information, see Partition Points Node on page 706.
- Non-Partition Points. For more information, see Non-Partition Points Node on page 709.

Partition Properties Node


The Partition Properties node allows you to configure partitions.
Figure A-22 displays the Mapping tab - Partitions Properties node:
Figure A-22. Mapping Tab - Partitions Properties Node


KeyRange Node
In the KeyRange node, you can configure the partition range for key-range partitioning.
Select Edit Keys to edit the partition key. For more information, see Edit Partition Key on
page 708.
Figure A-23 displays the KeyRange node on the Mapping tab:
Figure A-23. Mapping Tab - KeyRange Node

HashKeys Node
In the HashKeys node, you can configure hash key partitioning. Select Edit Keys to edit the partition key. For more information, see Edit Partition Key on page 708.

Partition Points Node


The Partition Points node displays the mapping with the transformation icons. The Partition
Points node lists the partition points in the tree. Select a partition point to configure its
attributes.
In the Partition Points node you can configure the following options for each pipeline in a mapping:
- Add and delete partition points.
- Specify the partition type at each partition point.
- Add and delete partitions.
- Enter a description for each partition.
- Add keys and key ranges for certain partition types.
For more information about partitioning a pipeline, see Pipeline Partitioning on page 345.
Figure A-24 displays Mapping tab - Partition Points node:
Figure A-24. Mapping Tab - Partition Points Node

Table A-19 describes the Partition Points node:
Table A-19. Mapping Tab - Partition Points Node

Add Partition Point: Click to add a new partition point to the Transformation list. For information on adding partition points, see Adding and Deleting Partition Points on page 353.

Delete Partition Point: Click to delete the current partition point. You cannot delete certain partition points. For details, see Adding and Deleting Partition Points on page 353.

Edit Partition Point: Click to edit the current partition point.

Edit Keys: Click to add, remove, or edit the key for key range or hash user keys partitioning. This button is not available for auto-hash, round-robin, or pass-through partitioning. For more information on adding keys and key ranges, see Adding Keys and Key Ranges on page 358.


Edit Partition Point


The Edit Partition Point dialog box allows you to add and delete partitions, and to select the
partition type.
Figure A-25 displays the Edit Partition Point dialog box:
Figure A-25. Edit Partition Point Dialog Box

Table A-20 describes the options in the Edit Partition Point dialog box:
Table A-20. Edit Partition Point Dialog Box Options

Add button: Click to add a partition. You can add up to 64 partitions. For more information on adding partitions, see Adding and Deleting Partitions on page 356.

Delete button: Click to delete the selected partition. For more information on deleting partitions, see Adding and Deleting Partitions on page 356.

Name: Partition number.

Description: Enter a description for the current partition.

Select Partition Type: Select a partition type from the list. For more information, see Specifying Partition Types on page 356.

Edit Partition Key


When you specify key range or hash user keys partitioning at any partition point, you must
specify one or more ports as the partition key. Click Edit Key to display the Edit Partition Key
dialog box.


Figure A-26 displays the Edit Partition Key dialog box:


Figure A-26. Edit Partition Key Dialog Box

You can specify one or more ports as the partition key. To rearrange the order of the ports that
make up the key, select a port in the Selected Ports list and click the up or down arrow.
For information on adding a key for key range partitioning, see Key Range Partition Type
on page 363. For information on adding a key for hash partitioning, see Hash Keys Partition
Types on page 361.

Non-Partition Points Node

The Non-Partition Points node displays the mapping objects in iconized view and lists the non-partition points in the tree. You can select a non-partition point and add partitions if you want.


Components Tab
In the Components tab, you can configure pre-session shell commands, post-session
commands, and email messages if the session succeeds or fails.
Figure A-27 displays the Components Tab:
Figure A-27. Components Tab


Table A-21 describes the Components tab options:
Table A-21. Components Tab

Task (n/a): Tasks you can perform in the Components tab. You can configure pre- or post-session shell commands and success or failure email messages in the Components tab.

Type (Required): Select None if you do not want to configure commands and emails in the Components tab. For pre- and post-session commands, select Reusable to call an existing reusable Command task as the pre- or post-session shell command. Select Non-Reusable to create pre- or post-session shell commands for this session task. For success or failure emails, select Reusable to call an existing Email task as the success or failure email. Select Non-Reusable to create email messages for this session task.

Value (Optional): Use to configure commands or emails.

Table A-22 describes the tasks available in the Components tab:
Table A-22. Components Tab Tasks

Pre-Session Command (Optional): Shell commands that the PowerCenter Server performs at the beginning of a session. For details on using pre-session shell commands, see Using Pre- or Post-Session Shell Commands on page 188.

Post-Session Success Command (Optional): Shell commands that the PowerCenter Server performs after the session completes successfully. For details on using post-session shell commands, see Using Pre- or Post-Session Shell Commands on page 188.

Post-Session Failure Command (Optional): Shell commands that the PowerCenter Server performs after the session if the session fails. For details on using post-session shell commands, see Using Pre- or Post-Session Shell Commands on page 188.

On Success Email (Optional): The PowerCenter Server sends an On Success email message if the session completes successfully.

On Failure Email (Optional): The PowerCenter Server sends an On Failure email message if the session fails.

Reusable Pre- or Post-Session Commands


Select Reusable in the Type field if you want to select an existing Command task as the pre- or
post-session shell command. The Command Object Browser appears when you click the
Open button in the Value field.


Figure A-28 displays the Task Browser:


Figure A-28. Task Browser

Click the Override button to override the Run If Previous Completed option in the
Command task. For details on the Run If Previous Completed option, see Table A-24 on
page 714.

Non-Reusable Pre- or Post-Session Commands


Select Non-Reusable in the Type field if you want to create pre- or post-session commands for
the session. Non-reusable pre- or post-session commands do not appear as Command tasks in
the folder.
Click the Open button in the Value field in the Components tab to edit pre- or post-session
shell commands. The Edit Pre-Session Command or Edit Post-Session Command dialog box
appears.


Figure A-29 displays the Edit Pre-Session Command dialog box:


Figure A-29. Edit Pre-Session Command Dialog Box

Table A-23 describes the General tab for editing pre- or post-session shell commands:
Table A-23. Pre- or Post-Session Commands - General Tab

Name (Required): Enter a name for the pre- or post-session shell command.

Make Reusable (Required): Select Make Reusable to create a reusable Command task from the pre- or post-session shell commands. Clear the Make Reusable option if you do not want the Workflow Manager to create a reusable Command task from the shell commands. For details on creating Command tasks from pre- or post-session shell commands, see Creating a Reusable Command Task from Pre- or Post-Session Commands on page 191.

Description (Optional): Enter a description for the pre- or post-session shell command.


Table A-24 describes the Properties tab for editing pre- or post-session commands:
Table A-24. Pre- or Post-Session Commands - Properties Tab

Name (Required): The name of the pre-session shell command.

Run If Previous Completed (Required): Select this option if you want the PowerCenter Server to perform the next command only if the previous command completed successfully.

Table A-25 describes the Commands tab for editing pre- or post-session commands:
Table A-25. Pre- or Post-Session Commands - Commands Tab

Name (Required): The name of the pre- or post-session shell command.

Command (Required): The shell command you want the PowerCenter Server to perform. Enter one command for each line. You can use session parameters or server variables in shell commands. If your command contains spaces, enclose the command in quotes. For example, if you want to call c:\program files\myprog.exe, you must enter "c:\program files\myprog.exe", including the quotes. Enter only one command on each line.
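For illustration, a non-reusable pre-session command list might look like the following sketch. The program path and directory layout are placeholders, and the use of the $PMRootDir and $PMTargetFileDir server variables is one possible choice rather than a requirement:

    "c:\program files\myprog.exe" -prepare
    copy $PMTargetFileDir\orders.out $PMRootDir\archive\orders_prev.out

Each line is a separate command; the first command is enclosed in quotes because its path contains a space.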

Reusable Email
Select Reusable in the Type field for the On-Success or On-Failure email if you want to select
an existing Email task as the On-Success or On-Failure email. The Email Object Browser
appears when you click the right side of the Values field.


Figure A-30 displays Email Object Browser:


Figure A-30. Email Object Browser

Select an Email task to use as On-Success or On-Failure email. Click the Override button to
override properties of the email. For more information about email properties, see Table A-27
on page 717.

Non-Reusable Email
Select Non-Reusable in the Type field to create a non-reusable email for the session. Non-reusable emails do not appear as Email tasks in the Task folder. Click the right side of the Values field to edit the properties for the non-reusable On-Success or On-Failure emails. For more information about email properties, see Table A-27 on page 717.

Email Properties
You configure email properties for On-Success or On-Failure Emails when you override an
existing Email task or when you create a non-reusable email for the session.


Figure A-31 displays the dialog box for editing the On-Success or On-Failure email
properties:
Figure A-31. On-Success or On-Failure Email - General Tab

Table A-26 describes general settings for editing On-Success or On-Failure emails:
Table A-26. On-Success or On-Failure Emails - General Tab

Name (Required): Enter a name for the email you want to configure.

Description (Required): Enter a description for the email you want to configure.


Figure A-32 displays the properties for On-Success or On-Failure emails:


Figure A-32. On-Success or On-Failure Email - Properties Tab

Table A-27 describes the email properties for On-Success or On-Failure emails:
Table A-27. On-Success or On-Failure Emails - Properties Tab

Email user name (Required): Required to send On-Success or On-Failure session email. Enter the email address of the person you want the PowerCenter Server to email after the session completes. The email address must be entered in 7-bit ASCII. For success email, you can enter $PMSuccessEmailUser to send email to the user configured for the server variable. For failure email, you can enter $PMFailureEmailUser to send email to the user configured for the server variable.

Email subject (Optional): Enter the text you want to appear in the subject header.

Email text (Optional): Enter the text of the email. You can use several variables when creating this text to convey meaningful information, such as the session name and session status. For details, see Sending Email on page 319.
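As an illustrative sketch only, an On Success email might combine fixed text with the post-session email variables documented in Sending Email on page 319. The variable codes shown here (%s for the session name, %e for the session status, %l for rows loaded, %r for rows rejected) are examples and should be verified against that chapter before you rely on them:

    Email subject: Load complete for session %s
    Email text:    Session %s completed with status %e.
                   Rows loaded: %l. Rows rejected: %r.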


Metadata Extensions Tab


The Metadata Extensions tab appears in the session property sheet after the Partitions tab.
Figure A-33 displays the Metadata Extensions tab:
Figure A-33. Metadata Extensions Tab

The Metadata Extensions tab allows you to create and promote metadata extensions. For
information on creating metadata extensions, see Metadata Extensions in the Repository
Guide.
Table A-28 describes the configuration options for the Metadata Extensions tab:
Table A-28. Metadata Extensions Tab

Extension Name (Required): Name of the metadata extension. Metadata extension names must be unique in a domain.

Datatype (Required): The datatype: numeric (integer), string, boolean, or XML.

Value (Optional): Value of the metadata extension. For a numeric metadata extension, the value must be an integer. For a boolean metadata extension, choose true or false. For a string or XML metadata extension, click the button in the Value field to enter a value of more than one line. The Workflow Manager does not validate XML syntax.

Precision (Required for string and XML objects): The maximum length for string or XML metadata extensions.

Reusable (Required): Select to make the metadata extension apply to all objects of this type (reusable). Clear to make the metadata extension apply to this object only (non-reusable).

Description (Optional): Description of the metadata extension.


Appendix B

Workflow Properties Reference

This appendix contains a listing of settings in the workflow properties. These settings are grouped by the following tabs:
- General Tab, 722
- Properties Tab, 724
- Scheduler Tab, 726
- Variables Tab, 731
- Events Tab, 732
- Metadata Extensions Tab, 733

General Tab
You can change the workflow name and enter a comment for the workflow on the General
tab. By default, the General tab appears when you open the workflow properties.
Figure B-1 displays the General tab of the workflow properties:
Figure B-1. Workflow Properties - General Tab


Table B-1 describes the settings found on the General tab:
Table B-1. Workflow Properties - General Tab

Name (Required): The name of the workflow.

Comments (Optional): Optional comment to describe the workflow.

Server (Required): Select a registered PowerCenter Server when configuring a workflow.

Tasks must run on Server (Optional): Requires all workflow tasks to run on the PowerCenter Server that you select.

Suspension Email (Optional): Select a reusable email task for the suspension email. When a task fails, the PowerCenter Server suspends the workflow and sends the suspension email. For details on suspending workflows, see Suspending the Workflow on page 127.

Disabled (Optional): Select to disable the workflow from the schedule. The PowerCenter Server stops running the workflow until you clear the Disabled option. For details on the Disabled option, see Disabling Workflows on page 118.

Suspend On Error (Optional): If selected, the PowerCenter Server suspends the workflow when a task in the workflow fails. For details on suspending workflows, see Suspending the Workflow on page 127.

Web Services (Optional): If selected, you create a service workflow. Click Config Service to configure service information. For more information on creating web services, see the Web Services Provider Guide.


Properties Tab
Configure parameter file name and workflow log options in the Properties tab.
Figure B-2 displays the Properties tab:
Figure B-2. Workflow Properties - Properties Tab

Table B-2 describes the settings found on the Properties tab:
Table B-2. Workflow Properties - Properties Tab

Parameter File Name (Optional): Designates the name and directory for the parameter file. Use the parameter file to define workflow parameters. For details on parameter files, see Parameter Files on page 511.

Workflow Log File Name (Optional): Optionally enter a file name, or a file name and directory. If you leave this field blank, the PowerCenter Server does not create a workflow log. Instead, the PowerCenter Server writes workflow log messages to the server log or Windows Event Log, depending on how you configure the PowerCenter Server. If you fill in this field, the PowerCenter Server appends information in this field to that entered in the Workflow Log File Directory field. For example, if you have "C:\workflow_logs\" in the Workflow Log File Directory field, then enter "logname.txt" in the Workflow Log File Name field, the PowerCenter Server writes logname.txt to the C:\workflow_logs\ directory.

Workflow Log File Directory (Required): Designates a location for the workflow log file. By default, the PowerCenter Server writes the log file in the server variable directory, $PMWorkflowLogDir. If you enter a full directory and file name in the Workflow Log File Name field, clear this field.

Save Workflow Log By (Required): If you select Save Workflow Log by Timestamp, the PowerCenter Server saves all workflow logs, appending a timestamp to each log. If you select Save Workflow Log by Runs, the PowerCenter Server saves a designated number of workflow logs. Configure the number of workflow logs in the Save Workflow Log for These Runs option. For details on these options, see Archiving Workflow Logs on page 459. You can also use the $PMWorkflowLogCount server variable to save the configured number of workflow logs for the PowerCenter Server.

Save Workflow Log For These Runs (Required): The number of historical workflow logs you want the PowerCenter Server to save. The PowerCenter Server saves the number of historical logs you specify, plus the most recent workflow log. Therefore, if you specify 5 runs, the PowerCenter Server saves the most recent workflow log, plus historical logs 0-4, for a total of 6 logs. You can specify up to 2,147,483,647 historical logs. If you specify 0 logs, the PowerCenter Server saves only the most recent workflow log.
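To illustrate the Parameter File Name option, the following is a minimal sketch of a workflow parameter file. The folder and workflow names are placeholders, $$LastRunDate is a hypothetical user-defined workflow variable, and the exact section-heading syntax is described in Parameter Files on page 511:

    [ProductionFolder.WF:wf_load_orders]
    $$LastRunDate=08/01/2004

Enter the path and file name of a file like this in the Parameter File Name field so that the PowerCenter Server reads it when the workflow starts.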


Scheduler Tab
The Scheduler Tab allows you to schedule a workflow to run continuously, run at a given
interval, or manually start a workflow. For details on scheduling workflows, see Scheduling a
Workflow on page 112.
Figure B-3 displays the Scheduler tab:
Figure B-3. Workflow Properties - Scheduler Tab


You can configure the following types of scheduler settings:
- Non-Reusable. Choose to create a non-reusable scheduler for the workflow.
- Reusable. Choose a reusable scheduler for the workflow.


Table B-3 describes the settings found on the Scheduler tab:
Table B-3. Workflow Properties - Scheduler Tab

Non-Reusable/Reusable (Required): Indicates the scheduler type. If you select Non-Reusable, the scheduler can only be used by the current workflow. If you select Reusable, choose a reusable scheduler. You can create reusable schedulers by selecting Schedulers.

Scheduler (Required): Choose a set of scheduler settings for the workflow.

Description (Optional): Enter a description for the scheduler.

Summary (N/A): Read-only summary of the selected scheduler settings.

Edit Scheduler Settings


Click the Edit Scheduler Settings button to configure the scheduler. The Edit Scheduler
dialog box appears.
Figure B-4 displays the Edit Scheduler dialog box:
Figure B-4. Workflow Properties - Scheduler Tab - Edit Scheduler Dialog Box


Table B-4 describes the settings on the Edit Scheduler dialog box:
Table B-4. Workflow Properties - Scheduler Tab - Edit Scheduler Dialog Box

Run Options - Run On Server Initialization/Run On Demand/Run Continuously (Optional): Indicates the workflow schedule type. If you select Run On Server Initialization, the PowerCenter Server runs the workflow as soon as the server is initialized. If you select Run On Demand, the PowerCenter Server only runs the workflow when you start the workflow. If you select Run Continuously, the PowerCenter Server starts the next run of the workflow as soon as it finishes the first run.

Schedule Options - Run Once/Run Every/Customized Repeat (Optional): Required if you select Run On Server Initialization in Run Options. Also required if you do not choose any setting in Run Options. If you select Run Once, the PowerCenter Server runs the workflow once, as scheduled in the scheduler. If you select Run Every, the PowerCenter Server runs the workflow at regular intervals, as configured. If you select Customized Repeat, the PowerCenter Server runs the workflow on the dates and times specified in the Repeat dialog box.

Edit (Optional): Required if you select Customized Repeat in Schedule Options. Opens the Repeat dialog box, allowing you to schedule specific dates and times for the workflow to run. The selected scheduler appears at the bottom of the page. For details about the Repeat dialog box, see Customizing Repeat Option on page 116.

Start Date (Optional): Required if you select Run On Server Initialization in Run Options. Also required if you do not choose any setting in Run Options. Indicates the date on which the PowerCenter Server begins scheduling the workflow.

Start Time (Optional): Required if you select Run On Server Initialization in Run Options. Also required if you do not choose any setting in Run Options. Indicates the time at which the PowerCenter Server begins scheduling the workflow.

End Options - End On/End After/Forever (Optional): Required if the workflow schedule is Run Every or Customized Repeat. If you select End On, the PowerCenter Server stops scheduling the workflow on the selected date. If you select End After, the PowerCenter Server stops scheduling the workflow after the set number of workflow runs. If you select Forever, the PowerCenter Server schedules the workflow as long as the workflow does not fail.

Customizing Repeat Option


You can schedule the workflow to run once, run at an interval, or customize your own repeat
option. Click the Edit button on the Edit Scheduler dialog box to configure Customized
Repeat options.
Figure B-5 shows the Customized Repeat dialog box:
Figure B-5. Workflow Properties - Customized Repeat Dialog Box

Table B-5 describes options in the Customized Repeat dialog box:
Table B-5. Workflow Properties - Repeat Dialog Box Options

Repeat Every (Required): Enter the numeric interval you want to schedule the workflow, then select Days, Weeks, or Months, as appropriate. If you select Days, select the appropriate Daily Frequency settings. If you select Weeks, select the appropriate Weekly and Daily Frequency settings. If you select Months, select the appropriate Monthly and Daily Frequency settings.

Weekly (Optional): Required to enter a weekly schedule. Select the day or days of the week on which you want to schedule the workflow.

Monthly (Optional): Required to enter a monthly schedule. If you select Run On Day, select the dates on which you want the workflow scheduled on a monthly basis. The PowerCenter Server schedules the workflow on the selected dates. If you select a numeric date exceeding the number of days within a given month, the PowerCenter Server schedules the workflow for the last day of the month, including leap years. For example, if you schedule the workflow to run on the 31st of every month, the PowerCenter Server schedules the session on the 30th of the following months: April, June, September, and November. If you select Run On The, select the week(s) of the month, then day of the week on which you want the workflow to run. For example, if you select Second and Last, then select Wednesday, the PowerCenter Server schedules the workflow on the second and last Wednesday of every month.

Daily (Required): Enter the number of times you would like the PowerCenter Server to run the workflow on any day the session is scheduled. If you select Run Once, the PowerCenter Server schedules the workflow once on the selected day, at the time entered on the Start Time setting on the Time tab. If you select Run Every, enter Hours and Minutes to define the interval at which the PowerCenter Server runs the workflow. The PowerCenter Server then schedules the workflow at regular intervals on the selected day. The PowerCenter Server uses the Start Time setting for the first scheduled workflow of the day.

Variables Tab
Before you can use workflow variables, you must declare them in the Variables tab.
Figure B-6 displays the settings on the Variables tab:
Figure B-6. Workflow Properties - Variables Tab

Table B-6 describes the settings found on the Variables tab:
Table B-6. Workflow Properties - Variables Tab

Name (Required): The name of the workflow variable.

Datatype (Required): The datatype of the workflow variable.

Persistent (Required): Indicates whether the PowerCenter Server maintains the value of the variable from the previous workflow run.

Is Null (Required): Indicates whether the workflow variable is null.

Default (Optional): Default value of the workflow variable.

Description (Optional): Optional details about the workflow variable.


Events Tab
Before you can use the Event-Raise task, declare a user-defined event in the Events tab.
Figure B-7 displays the Events Tab:
Figure B-7. Workflow Properties - Events Tab

Table B-7 describes the settings found on the Events tab:
Table B-7. Workflow Properties - Events Tab

Events (Required): The name of the event you declare.

Description (Optional): Optional details to describe the event.


Metadata Extensions Tab


Extend the metadata stored in the repository by associating information with individual
repository objects. Create metadata extensions for repository objects by editing the object and
then adding the metadata extension to the Metadata Extension tab.
Figure B-8 displays the Metadata Extensions tab:
Figure B-8. Workflow Properties - Metadata Extensions Tab

The Metadata Extensions tab allows you to create and promote metadata extensions. For
information on creating metadata extensions, see Metadata Extensions in the Repository
Guide.
Table B-8 describes the configuration options for the Metadata Extensions tab:
Table B-8. Workflow Properties - Metadata Extensions Tab

Extension Name (Required): Name of the metadata extension. Metadata extension names must be unique in a domain.

Datatype (Required): The datatype: numeric (integer), string, boolean, or XML.

Value (Optional): An optional value. For a numeric metadata extension, the value must be an integer. For a boolean metadata extension, choose true or false. For a string or XML metadata extension, click the Edit button on the right side of the Value field to enter a value of more than one line. The Workflow Manager does not validate XML syntax.

Precision (Required for string and XML objects): The maximum length for string or XML metadata extensions.

Reusable (Required): Select to make the metadata extension apply to all objects of this type (reusable). Clear to make the metadata extension apply to this object only (non-reusable).

UnOverride (Optional): This column appears only if the value of one of the metadata extensions was changed. To restore the default value, click Revert.

Description (Optional): Optional description of the metadata extension.


Appendix C

Session Properties Comparison Reference

This appendix covers the following topics:
- Overview, 736
- General Tab, 737
- Source Location Tab, 754
- Time Tab, 755
- Log and Error Handling Tab, 758
- Transformations Tab, 761
- Partitions Tab, 762

Overview
The Workflow Manager and Workflow Monitor replace the Server Manager in PowerCenter
5.x and PowerMart 5.x. This appendix compares session properties in the Server Manager
with session and workflow options in the Workflow Manager. It lists the session properties as
they appeared on the session properties in the Server Manager. It then gives the corresponding
options in the Workflow Manager.
The session properties for the Server Manager contain the following tabs:
- General tab
- Source Location tab
- Time tab
- Log and Error Handling tab
- Transformations tab
- Partitions tab


General Tab
In the Server Manager, the General tab appeared when you opened the session properties. In
the Workflow Manager, the General tab appears when you open the session properties in the
Task Developer or the Workflow Designer.
Figure C-1 shows the Server Manager General tab:
Figure C-1. Server Manager General Tab

In the Server Manager, you configured the following options from the General tab:
- General options
- Source options
- Target options
- Session commands
- Performance

General Options
In the Server Manager, you could configure the Session Name field, Server Name, and the
Session Enabled option on the General tab of the session properties.
In the Workflow Manager, these options are on either the General tab of the session
properties or in the workflow properties.


Table C-1 compares general session options for the Server Manager with the corresponding options for Workflow Manager:
Table C-1. General Session Options Comparison

Session Name: General tab-Rename button.

Server Name: General tab-Workflow or session properties.

Add Server Button: General tab-Workflow or session properties.

Session Enabled: General tab-Disable this task. You can only view this property when you edit the session instance from the Workflow Designer.

Source Options
In the Server Manager, Source options appeared under the Session Name field on the General
tab.
In the Workflow Manager, source options appear under the Sources node on the Mapping tab
(Transformations view). The Sources node contains connections, properties, and readers
settings.
Table C-2 compares Source options for the Server Manager with the corresponding properties for the Workflow Manager:
Table C-2. Source Options Comparison

Source Type: Mapping tab-Transformations view-Sources node-Connections settings.

Treat Rows As: Properties tab-General Options settings.

Source Options Button: Mapping tab-Transformations view-Sources node-Properties settings. Click Set File Properties.

Source Database: Mapping tab-Transformations view-Sources node-Connections settings. Click the Edit button in the Value field.

Source Options Dialog Box for Flat File Sources


In the Server Manager, the Source Options dialog box appeared when you clicked Source
Options on the General tab and the mapping used file sources.
In the Workflow Manager, most of the source options for file sources appear when you select
Properties from the Sources node on the Mapping tab.


Figure C-2 shows the Server Manager Source Options Dialog Box for File Sources:
Figure C-2. Server Manager Source Options Dialog Box for File Sources

Table C-3 compares source options for file sources for the Server Manager with the corresponding options for the Workflow Manager:
Table C-3. File Source Options Comparison

Source Directory: Mapping tab-Transformations view-Sources node-Properties settings.

File Name: Mapping tab-Transformations view-Sources node-Properties settings.

File Type: Mapping tab-Transformations view-Sources node-Properties settings. Click Set File Properties.

File List: Mapping tab-Transformations view-Sources node-Properties settings. Set Source Filetype property to direct or indirect.

FTP File: Mapping tab-Transformations view-Sources node-Connections settings. Choose FTP for Type.

Apply to All Files: N/A

Edit File Property Button: Mapping tab-Transformations view-Sources node-Properties settings. Click Set File Properties.

Edit FTP Property Button: Mapping tab-Transformations view-Sources node-Connections settings. Choose FTP for Type. Click the Edit button on the right side of the Value field to edit FTP properties.


Source Options for Fixed-Width File Sources


In the Server Manager, the Fixed-Width Properties dialog box appeared when you selected a
fixed-width file from the File Source Dialog box and then clicked Edit File Property.
In the Workflow Manager, the Fixed-Width Properties dialog box appears when you click the
Set File Properties from the Sources node on the Mapping tab, select Fixed-Width, and then
click Advanced.
Figure C-3 shows the Server Manager Fixed-Width Properties dialog box:
Figure C-3. Server Manager Fixed-Width Properties Dialog Box

Delimited File Properties


In the Server Manager, the Delimited File Properties dialog box appeared when you selected a
delimited file from the File Source Dialog box and then clicked Edit File Property.
In the Workflow Manager, the Delimited Properties dialog box appears when you click Set
File Properties from the Sources node on the Mapping tab, select Delimited, and click
Advanced.


Figure C-4 shows the Server Manager Delimited File Properties dialog box:
Figure C-4. Server Manager Delimited File Properties Dialog Box

Source Options for XML Sources


In the Server Manager, the Source Options for XML sources appeared when you clicked
Source Options on the General tab and the mapping used XML sources.
In the Workflow Manager, XML source options appear in the Sources node on the Mapping
tab when the mapping uses XML sources.
Figure C-5 shows the Server Manager Source Options dialog box for XML sources:
Figure C-5. Server Manager Source Options Dialog Box (XML Sources)


Table C-4 compares XML source options for the Server Manager with the corresponding options for the Workflow Manager:
Table C-4. XML Sources Options Comparison

Source Directory: Mapping tab-Transformations view-Sources node-Properties settings.

File Name: Mapping tab-Transformations view-Sources node-Properties settings.

Code Page: Mapping tab-Transformations view-Sources node-Properties settings. Click Set File Properties, and then click Advanced.

File List: Mapping tab-Transformations view-Sources node-Properties settings. Set Source Filetype property to direct or indirect.

FTP File: Mapping tab-Transformations view-Sources node-Properties settings. Click Set File Properties.

Edit FTP Property Button: Mapping tab-Transformations view-Sources node-Connections settings. Choose FTP for Type. Click the Edit button on the right side of the Value field to edit FTP properties.

FTP Properties
In the Server Manager, the FTP Properties dialog box appeared when you edited FTP
properties.
In the Workflow Manager, the FTP Connection Editor appears when you choose FTP as the
connection type from the Sources tab, click the Edit button on the right side of the Value
field, and then click Override to edit the FTP properties.
Figure C-6 shows the Server Manager FTP Properties dialog box:
Figure C-6. Server Manager FTP Properties Dialog Box


Table C-5 compares FTP properties for the Server Manager with the corresponding options for the Workflow Manager:
Table C-5. FTP Properties Comparison

Connection Name: Mapping tab-Transformations view-Sources node-Connections settings. Click the Edit button on the right side of the Value field. Choose FTP for Type. Click the Edit button on the right side of the Value field to edit FTP properties. Select an FTP connection.

Remote File Name: Mapping tab-Transformations view-Sources node-Connections settings. Click the Edit button on the right side of the Value field. Choose FTP for Type. Click the Edit button on the right side of the Value field to edit FTP properties. Click Override in the FTP Object Browser.

Stage the FTP Data: Mapping tab-Transformations view-Sources node-Connections settings. Click the Edit button on the right side of the Value field. Choose FTP for Type. Click the Edit button on the right side of the Value field to edit FTP properties. Click Override in the FTP Object Browser.

Source Options for Relational Sources


In the Server Manager, the Source options dialog box for relational sources appeared when
you clicked Source Options on the General tab and the mapping used relational sources.
In the Workflow Manager, enter a prefix for each source table in the Owner Name field on the
Mapping tab-Transformations view-Sources node-Properties settings.

Target Options
In the Server Manager, target options appeared on the General tab. In the target options, you could select the target type for the session, configure reject file names, and create database connection session parameters.
In the Workflow Manager, the Mapping tab-Transformations view-Targets node contains
connections, properties, and writers settings.
Table C-6 compares target options for the Server Manager with the corresponding options for Workflow Manager:
Table C-6. Target Options Comparison

Target Type: Mapping tab-Transformations view-Targets node-Writers settings.

Target Options Button: Properties in the Target Options dialog box are located on the Mapping tab-Transformations view-Targets node-Properties settings.

Reject Options Button: Properties in the Rejects Options dialog box are located on the Mapping tab-Transformations view-Targets node-Properties settings.

Target Database: Mapping tab-Transformations view-Targets node-Connections settings. Click the Edit button on the right side of the Value field to choose a target connection.

Relational Target Options


In the Server Manager, the Targets dialog box appeared when you selected a relational target
type and clicked Target Options on the General tab.
In the Workflow Manager, the target options for relational targets appear when you select the
Mapping tab.
Figure C-7 shows the Server Manager Targets dialog box:
Figure C-7. Server Manager Targets Dialog Box

Table C-7 compares relational target options for the Server Manager with the corresponding options for the Workflow Manager:
Table C-7. Relational Target Options Comparison

Insert: Mapping tab-Transformations view-Targets node-Properties settings.

Update (as update): Mapping tab-Transformations view-Targets node-Properties settings.

Update (as insert): Mapping tab-Transformations view-Targets node-Properties settings.

Update (else insert): Mapping tab-Transformations view-Targets node-Properties settings.

Delete: Mapping tab-Transformations view-Targets node-Properties settings.

Truncate Table: Mapping tab-Transformations view-Targets node-Properties settings.

Normal/Bulk: Mapping tab-Transformations view-Targets node-Properties settings. Choose Normal or Bulk for Target Load Type.

Test Load: Properties tab-General Options settings.

Number of Rows To Test: Properties tab-General Options settings.

Output Files
In the Server Manager, the Output Files dialog box appeared when you selected a file target type, then clicked Target Options on the General tab.
In the Workflow Manager, output file target options appear on the Mapping tab-Transformations view. The Targets node contains connections, properties, and writer settings.
Figure C-8 shows the Server Manager Output Files dialog box:
Figure C-8. Server Manager Output Files Dialog Box


Table C-8 compares output file options for the Server Manager with the corresponding options for the Workflow Manager:
Table C-8. File Target Output Options Comparison

Directory: Mapping tab-Transformations view-Targets node-Properties settings.

File Name: Mapping tab-Transformations view-Targets node-Properties settings.

FTP file: Mapping tab-Transformations view-Targets node-Connections settings.

Loader: Mapping tab-Transformations view-Targets node-Connections settings.

Edit Object Properties: Mapping tab-Transformations view-Targets node-Connections settings. Choose the connection type, and then click the Edit button on the right side of the Value field.

Fixed Width/Delimited: Mapping tab-Transformations view-Targets node-Connections settings. Click Set File Properties.

Edit Null Character Button: Mapping tab-Transformations view-Targets node-Connections settings. Click Set File Properties. Choose Fixed-Width and click the Advanced button.

Edit Delimiter Button: Mapping tab-Transformations view-Targets node-Connections settings. Click Set File Properties. Choose Delimited and click the Advanced button.

Number of Rows To Test: Properties tab-General Options settings.

Merge Targets For Partitioned Sessions: Mapping tab-Transformations view-Targets node-Properties settings.

External Loader Properties


In the Server Manager, the External Loader Properties dialog box appeared when you used the
Loader option on the Targets Options dialog box, and then clicked Edit Object Properties to
select the external loader you wanted the PowerCenter Server to use.
In the Workflow Manager, the External Loader Properties dialog box appears when you
choose External Loader from the Targets node Connections settings on the Mappings tab, and
then click the Edit button on the right side of the Value field.


Figure C-9 shows the Server Manager External Loader Properties dialog box:
Figure C-9. Server Manager External Loader Properties

Fixed-Width Properties
In the Server Manager, the Fixed-Width dialog box appeared when you configured a session
to write to a fixed-width target file, and then clicked Edit Null Character.
In the Workflow Manager, you can access the Fixed-Width Properties dialog box from the
Properties settings of the Mappings tab. Click Set File Properties, and select Fixed-Width.
Figure C-10 shows the Server Manager Fixed-Width dialog box:
Figure C-10. Server Manager Fixed-Width Dialog Box (Output Files)

Delimited File Properties


In the Server Manager, the Delimited File Properties dialog box appeared when you
configured a session to write to a delimited target file, then clicked Edit Delimiter.
In the Workflow Manager, you can access the Delimited Properties dialog box from the
Properties settings of the Mappings tab. Click Set File Properties, and select Delimited.


Figure C-11 shows the Server Manager Delimited File Properties dialog box:
Figure C-11. Server Manager Delimited File Properties Dialog Box (Output Files)

XML Targets
In the Server Manager, the XML Target dialog box appeared when you selected an XML file
target type, then clicked Target Options.
In the Workflow Manager, you can access the XML Target dialog box from the Properties
settings of the Mappings tab. Click Set File Properties.
Figure C-12 shows the Server Manager XML Target dialog box:
Figure C-12. Server Manager XML Target Dialog Box

Table C-9 compares XML target options for the Server Manager with the corresponding options for Workflow Manager:
Table C-9. XML Target Options Comparison

Directory: Mapping tab-Transformations view-Targets node-Properties settings.

File Name: Mapping tab-Transformations view-Targets node-Properties settings.

Code Page: Mapping tab-Transformations view-Targets node-Properties settings. Click Set File Properties, and then click Advanced.

FTP File: Mapping tab-Transformations view-Targets node-Properties settings. Click Set File Properties.

Edit Object Properties: Mapping tab-Transformations view-Targets node-Connections settings. Choose FTP for Type. Click the Edit button on the right side of the Value field to edit FTP properties.

Reject Files
In the Server Manager, the Reject Files dialog box appeared when you clicked Reject Options
on the General tab.
In the Workflow Manager, the reject file options appear in the Targets node Properties
settings on the Mapping tab.
Figure C-13 shows the Server Manager Reject File dialog box:
Figure C-13. Server Manager Reject File Dialog Box

Table C-10 compares Reject Files options for the Server Manager with the corresponding options for Workflow Manager:
Table C-10. Reject Files Options Comparison

Reject File Directory: Mapping tab-Transformations view-Targets node-Properties settings.

File Name: Mapping tab-Transformations view-Targets node-Properties settings.


Session Commands
In the Server Manager, session commands appeared under the Server Name field on the
General tab. You could enter pre-session shell commands, post-session commands and
separate email messages if the session succeeded or failed.
In the Workflow Manager, session commands appear on the Components tab.

Pre-Session Commands
In the Server Manager, the Pre-Session Commands dialog box appeared when you clicked Pre-Session on the General tab of the session properties.
In the Workflow Manager, pre-session command options appear on the Components tab.
Figure C-14 shows the Server Manager Pre-Session Commands dialog box:
Figure C-14. Server Manager Pre-Session Commands Dialog Box

Table C-11 compares session command options for the Server Manager with the corresponding options for the Workflow Manager:
Table C-11. Pre-Session Commands Comparison

Description: Components tab. Click the Edit button on the right side of the Value field for Pre-Session Commands. Enter the description in the General tab of the Edit Pre-Session Commands dialog box.

Command: Components tab. Click the Edit button on the right side of the Value field for Pre-Session Commands. Enter the command in the Command tab of the Edit Pre-Session Commands dialog box.

Post-Session Commands and Email


In the Server Manager, the Post-Session Commands and Email dialog box appears when you
click Post-session And Email on the General tab of the session properties.
In the Workflow Manager, post-session commands and email options appear on the
Components tab.


Figure C-15 shows the Server Manager Post-Session Commands and Email dialog box:
Figure C-15. Server Manager Post-Session Commands and Email

Table C-12 compares post-session command and email options for the Server Manager with the corresponding options for the Workflow Manager:
Table C-12. Post-Session Commands and Email Comparison

Description: Components tab. Click the Edit button on the right side of the Value field for Post-Session Commands. Enter the description in the General tab of the Edit Post-Session Commands dialog box.

Command: Components tab. Click the Edit button on the right side of the Value field for Post-Session Commands. Enter the command in the Command tab of the Edit Post-Session Commands dialog box.

Success: Components tab-On Success Email.

Failure: Components tab-On Failure Email.

Email User Name: Components tab. Click the Edit button on the right side of the Value field for On Success Email or On Failure Email. Enter the email user name in the Properties tab of the Edit Success Email or Edit Failure Email dialog box.

Email Subject: Components tab. Click the Edit button on the right side of the Value field for On Success Email or On Failure Email. Enter the email subject in the Properties tab of the Edit Success Email or Edit Failure Email dialog box.

Email Text: Components tab. Click the Edit button on the right side of the Value field for On Success Email or On Failure Email. Enter the email text in the Properties tab of the Edit Success Email or Edit Failure Email dialog box.


Performance Options
In the Server Manager, Performance options appeared under Session Commands on the General tab. In Performance options, you could increase memory size, select performance details, and set configuration parameters. In the Workflow Manager, Performance options appear on the Properties tab in the session properties.
Table C-13 compares performance options for the Server Manager with the corresponding options for the Workflow Manager:
Table C-13. Performance Options Comparison

DTM Buffer Pool Size: Properties tab-Performance settings.

Collect Performance Data: Properties tab-Performance settings.

Advanced Options button: Config Object tab, Mapping tab, and Properties tab.

Configuration Parameters
In the Server Manager, the Configuration Parameters dialog box appeared when you clicked Advanced Options on the General tab. In the Configuration Parameters dialog box, you could configure the DTM memory parameters, general parameters, reader parameters, and event-based scheduling.
In the Workflow Manager, the configuration parameters options appear on multiple tabs.
Figure C-16 shows the Server Manager Configuration Parameter dialog box:
Figure C-16. Server Manager Configuration Parameter Dialog Box


Table C-14 compares configuration parameters for the Server Manager with the corresponding options for the Workflow Manager:
Table C-14. Configuration Parameters Comparison

Default Buffer Block Size: Config Object tab-Advanced settings.

Index Cache Size: Mapping tab-Transformations view-Transformations node-Properties settings for Aggregator, Joiner, Lookup, Rank transformations.

Data Cache Size: Mapping tab-Transformations view-Transformations node-Properties settings for Aggregator, Joiner, Lookup, Rank transformations.

Line Sequential Buffer Length: Config Object tab-Advanced settings.

Source Based Commit Interval: Properties tab-General settings.

Target Based Commit Interval: Properties tab-General settings.

Commit Interval: Properties tab-General settings.

Enable Decimal Arithmetic: Properties tab-Performance settings. The option name is Enable High Precision.

Constraint Based Loading: Config Object tab-Advanced settings.

Cache LOOKUP( ) Function: Config Object tab-Advanced settings.

Event-Based Scheduling-Indicator File To Wait For: Event Wait Task-Events tab-Pre Defined Event. Enter the name of the file to watch.


Source Location Tab


In the Server Manager, the Source Location tab displayed when you created a heterogeneous session. In the Source Name field, you could optionally edit the source database listed for each relational source.
In the Workflow Manager, source database information displays in the Connections settings of the Sources node on the Mapping tab.
Figure C-17 shows the Server Manager Source Location tab:
Figure C-17. Server Manager Source Location Tab


Time Tab
In the Server Manager, the Time tab appeared after the General tab unless the session was
heterogeneous. If the session was heterogeneous, the Time tab appeared after the Source
Location tab.
In the Workflow Manager, the Scheduler tab contains workflow scheduling options. To configure reusable scheduler options, select Workflows-Schedulers from the menu. To configure non-reusable schedule options, select Edit-Workflow to open the workflow properties and click the Scheduler tab.
Figure C-18 shows the Server Manager Time tab:
Figure C-18. Server Manager Time tab

In the Server Manager, you configured the following options from the Time tab:

Schedule options

Start options

Duration options

Batch option

Schedule Options
In the Server Manager, you used the Schedule options on the Time tab of the session
properties to schedule the frequency of a session run.


In the Workflow Manager, you use the Run Options and Schedule Options on the Schedule
tab of the Scheduler properties to schedule the frequency of a workflow run.

Repeat Options
In the Server Manager, the Repeat dialog box appeared when you selected Customized
Repeat and then clicked Edit on the Time tab.
In the Workflow Manager, the Customized Repeat dialog box appears when you schedule a
workflow to run on server initialization, select Customized Repeat, and then click Edit.
Figure C-19 shows the Server Manager Repeat dialog box:
Figure C-19. Server Manager Repeat Dialog Box

Start Options
In the Server Manager, the Start options appeared below the Schedule options on the Time
tab. In the Start options, you could select the session start date and session start time.
In the Workflow Manager, the Start options appear on the Schedule tab of the workflow
properties.

Duration Options
In the Server Manager, Duration options appeared next to Start options on the Time tab. In
Duration options, you could set the end date of a session run, the number of session runs, or
schedule a session to run forever as long as it was successful.
In the Workflow Manager, End options appear next to Start options on the Scheduler tab of
the workflow properties.


Use Absolute Time Option


In the Server Manager, the Use Absolute Time option, also called the Batch option, appeared
below the Start and Duration options on the Time tab. You could select Use Absolute Time to
use the schedule set in the session.
In the Workflow Manager, the Use Absolute Time option appears on the Schedule tab of the
Timer task.


Log and Error Handling Tab


In the Server Manager, the Log and Error Handling tab appeared after the Time tab on the
session properties.
In the Workflow Manager, log and error handling options appear on the Properties and
Config Object tabs on the session properties.
Figure C-20 shows the Server Manager Log and Error Handling tab:
Figure C-20. Server Manager Log and Error Handling Tab

In the Server Manager, on the Log and Error Handling tab you could configure the following
options:

Log File options

Parameter File option

Batch Handling option

Error Handling options

Log File Options


In the Server Manager, Log File options appeared at the top of the Log and Error Handling
tab. You could enter a session log variable, enter a file name for the session log, or indicate
how session logs should be archived.
In the Workflow Manager, Log File options appear on the Properties and Config Object tabs.


Table C-15 compares the Log File options for Server Manager with the corresponding options
for the Workflow Manager:
Table C-15. Log File Options Comparison

Server Manager Log and Error Properties              Property Location in Workflow Manager
Server Path to Log Files                             Properties tab-General Options settings. Enter the
                                                     path in Session Log File Directory.
Session Log File                                     Properties tab-General Options settings. Enter the
                                                     log file name in Session Log File Name.
Save the Session Log From the Last <number>          Config Object tab-Log Options settings.
Session Runs
Save Session Log By Timestamp                        Config Object tab-Log Options settings.
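
For example, the session log options typically combine the $PMSessionLogDir server variable
with a per-session file name, so the same session can run on servers that keep their logs in
different locations. The session and file names below are invented for illustration, and the
option labels are paraphrased from the locations in Table C-15:

   Properties tab-General Options settings:
     Session Log File Name:       s_m_load_orders.log
     Session Log File Directory:  $PMSessionLogDir

   Config Object tab-Log Options settings:
     Save session log by:              Session runs
     Save session log for these runs:  5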

Parameter File Option


In the Server Manager, the Parameter File option appeared beneath the log file options on the
Log and Error Handling tab. You could use the Parameter File option to designate a name and
directory for a parameter file.
In the Workflow Manager, the Parameter File option appears on the Properties tab-General
Options settings.
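
For example, a parameter file can supply the session parameters and mapping parameters that
the session reads at run time. The folder, workflow, session, connection, file, and parameter
names below are invented for illustration; a minimal sketch of a parameter file:

   [ProjectFolder.WF:wf_load_orders.ST:s_m_load_orders]
   $DBConnection_Source=ORA_STAGE
   $InputFile1=/data/src/orders.dat
   $$CutoffDate=04/30/2004

The same file can also be supplied when you start the workflow with pmcmd startworkflow.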

Batch Handling Option


In the Server Manager, the Batch Handling option appeared under the Parameter File option
on the Log and Error Handling tab.
In the Workflow Manager, use link conditions in the Workflow Designer to run a task based
on the success or failure of the previous task, as shown in the sketch below.
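
For example, to run a second session only when the first one succeeds, the link between the
two Session tasks can test the predecessor's status. The task name is invented for
illustration; a minimal link condition:

   $s_extract_orders.Status = SUCCEEDED

If the condition evaluates to false, the PowerCenter Server does not run the task on the other
end of the link.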

Error Handling Options


In the Server Manager, Error Handling options appeared below the Parameter File option. In
the Workflow Manager, Error handling options appear on the Config Object tab.
Table C-16 compares the Error Handling options for Server Manager with the corresponding
options for the Workflow Manager:
Table C-16. Error Handling Options Comparison

Server Manager Log and Error Handling Properties     Property Location in Workflow Manager
Stop On                                              Config Object tab-Error handling settings.
Perform Recovery                                     Config Object tab-Error handling settings.
Override Tracing                                     Config Object tab-Error handling settings.
Log and Error Handling tab-On pre-session            Config Object tab-Error handling settings.
command errors-Stop session/Continue session
Log and Error Handling tab-On stored procedure       Config Object tab-Error handling settings.
errors-Stop session/Continue session
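
For example, the Server Manager Stop On threshold maps to the Stop on errors setting. The
value below is illustrative; the $PMSessionErrorThreshold server variable can be referenced
instead of a literal number:

   Config Object tab-Error handling settings:
     Stop on errors:  1

A value of 0 means the session does not stop, no matter how many non-fatal errors occur.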


Transformations Tab
In the Server Manager, the Transformations tab appeared on the session properties after the
Log and Error Handling tab.
In the Workflow Manager, the settings for transformations appear on the Mapping tab-Transformations view.
Figure C-21 shows the Server Manager Transformations tab:
Figure C-21. Server Manager Transformations Tab

Table C-17 compares the Transformations tab options for Server Manager with the
corresponding options for the Workflow Manager:
Table C-17. Transformations Tab Options Comparison

Server Manager Transformations Tab Properties        Property Location in Workflow Manager
Session Level Override Transformations               Mapping tab-Transformations view-Transformations node.
Aggregate Behavior                                   Properties tab-Performance settings.
Deadlock Behavior-Retry Session On Deadlock          Properties tab-Performance settings.
Sort Order                                           Properties tab-Performance settings.


Partitions Tab
In the Server Manager, the Partitions tab appeared in the session properties after the
Transformations tab.
In the Workflow Manager, the settings for partitioning appear on the Mapping tab-Partitions
view. For more information about partitioning, see "Configuring Partitioning Information"
on page 351.


Index

A
ABORT function
See also Transformation Language Reference
session failure 200
aborted status 421
aborting
Control tasks 147
server handling 129
sessions 130
status 421
tasks 129
tasks in Workflow Monitor 418
workflows 129
Aborttask
pmcmd syntax 596
Abortworkflow
pmcmd syntax 597
absolute time
specifying 162
Timer task 161
active sources
constraint-based loading 248
defined 259
generating commits 278
row error logging 260
source-based commit 278
transaction generators 259
XML targets 259

adding
tasks 92
advanced settings
session properties 675
aggregate caches
calculating the data cache 622
calculating the index cache 621
overview 621
reinitializing 576, 674
aggregate files
deleting 577
moving 577
aggregate function calls
minimizing 652
Aggregator transformation
cache options 621
cache partitioning 621
caches 26, 34
data cache 622
index cache 621
optimizing performance 650
optimizing with Sorted Input 651
partitioning guidelines 347
performance detail 639
allocating memory
XML sources 655
AND links 137
archiving
session logs 471

workflow logs 459


arrange
workflows vertically 40
workspace objects 71
ASCII mode
See also Installation and Configuration Guide
See also Unicode mode
overview 27
performance 661
session behavior 16
assigning
PowerCenter Servers 122, 198
Assignment tasks
creating 140
definition 140
description 132
using expression editor 96
variables in 103

B
$BadFile
definition 508
naming convention 496, 520
using 509
blocking
definition 23
blocking source data
PowerCenter Server handling 23
buffer block size
configuring 677
optimizing 655, 657
buffer memory
allocating 655
buffer blocks 25
DTM process 25
bulk loading
commit interval 253
data driven session 252
DB2 642
DB2 guidelines 253
Oracle 643
Oracle guidelines 253
session properties 252, 697
Sybase IQ 643
targets 642
test load 244
using user-defined commit 283

C
cache files
locating 577
naming convention 615
permissions 28
cache partitioning
Aggregator transformation 621
described 359
incremental aggregation 621
Joiner transformation 624
Lookup transformation 391
Rank transformation 620
caches
Aggregator transformation 621
calculating Aggregator data cache 622
calculating Aggregator index cache 621
calculating Joiner data cache 626
calculating Joiner index cache 625
calculating Lookup data cache 631
calculating Lookup index cache 629
calculating Rank data cache 633
calculating Rank index cache 632
default directory 34
files for index and data 614
files, overview 34
Joiner transformation 624
Lookup transformation 628
memory 26, 614
memory usage 26
optimizing 658
overview 28, 614
resetting with real-time sessions 288
session cache files 614
transformation 34
caching
lookup functions 676
Char datatypes
removing trailing blanks for optimization 653
check point interval
optimizing 642
checking in
versioned objects 74
checking out versioned objects 74
COBOL sources
error handling 227
numeric data handling 229
code page compatibility
See also Installation and Configuration Guide
multiple file sources 230
targets 235

code pages
See also Installation and Configuration Guide
data movement modes 27
database connections 54, 234
delimited source 224
delimited target 267, 703
external loader files 524
fixed-width sources 222
fixed-width target 266, 702
relaxed validation 55
validation 12
viewing the session log 475
color
setting 42
workspace 42
command line mode for pmcmd
connecting 589
return codes 590
using 589
command line program See pmcmd
Command task
multiple UNIX commands 145
Command tasks
creating 143
definition 143
description 132
executing commands 145
promoting to reusable 145
Run if Previous Completed 145
using server variables 188, 193
using session parameters 143
comments
adding in Expression Editor 97
commit interval
bulk loading 253
configuring 292
description 276
optimizing 655, 658
source- and target-based 276
commit source
source-based commit 278
commit type
configuring 672
committing data
target connect groups 278
transaction control 283
common logic
factoring 652
comparing objects
See also Designer Guide
See also Repository Guide

sessions 79
tasks 79
workflows 79
worklets 79
Components tab
properties 710
concurrent connections
in partitioned pipelines 379
Config Object tab
properties 675
configuring
error handling options 493
connect string
examples 54
syntax 54
connection objects
See also Repository Guide
assigning permissions 51
definition 51
deleting 59
connection settings
applying to all session instances 180
targets 695
connections
copy as 59, 60
copying a relational database connection 59
external loader 551
FTP 561
multiple targets 274
relational database 56
replacing a relational database connection 62
sources 211
targets 237
connectivity
See also Installation and Configuration Guide
connect string examples 54
overview 5
server grids 447
constraint-based loading
active sources 248
configuring 248
enabling 251
key relationships 248
session property 676
target connection groups 249
Update Strategy transformations 249
control file
overriding Teradata 539
overview 33
permissions 28

Control tasks
definition 147
description 132
options 148
stopping or aborting the workflow 129
copying
repository objects 77
counters
BufferInput_efficiency 640
BufferOutput_efficiency 640
overview 437
Rowsinlookupcache 639
Transformation_errorrows 639
Transformation_readfromdisk 639
Transformation_writetodisk 639
CPU usage
PowerCenter Server 24
creating
external loader connections 551
FTP sessions 565
server grids 451
sessions 175
workflows 91
CUME
partitioning restrictions 395
Custom transformation
partitioning guidelines 396
customized repeat
daily 117
editing 115
monthly 117
options 116
repeat every 117
weekly 117

D
data
capturing incremental source changes 574, 579
data caches
Aggregator transformation 622
description 614
for incremental aggregation 577
memory usage 26
optimizing 655, 658
Rank transformation 633
data driven
bulk loading 252
data files
creating directory 579

finding 577
data flow
See pipeline
data movement mode
See also ASCII mode
See also Installation and Configuration Guide
See also Unicode mode
affecting incremental aggregation 577
overview 27
database connections
See also Installation and Configuration Guide
configuring 56
copying a relational database connection 59
domain name 58
packet size 58
privileges required to create 53
replacing a relational database connection 62
rollback segment 58
session parameter 499
use trusted connection 58
using Oracle OS Authentication 53
databases
connection requirements 57
connectivity overview 46
environment SQL 55
optimizing sources 645
optimizing targets 642
selecting code pages 54
setting up connections 53
datatypes
See also Designer Guide
Char 653
Decimal 269
Double 269
Float 269
Integer 269
minimizing conversions 648
Money 269
Numeric 269
padding bytes for fixed-width targets 268
Real 269
Varchar 653
dates
configuring 38
formats 38
DB2
bulk loading 642
bulk loading guidelines 253
commit interval 253
See IBM DB2

$DBConnection
definition 499
naming convention 496, 520
using 499
deadlock
retry session 674
deadlock retry
See also Installation and Configuration Guide
configuring 246
target connection groups 257
Debugger
restrictions in partitioned pipelines 396
decimal arithmetic
See high precision
Decision tasks
creating 151
decision condition variable 149
definition 149
description 132
example 149
using Expression Editor 96
variables in 103
DECODE function
See also Transformation Language Reference
using for optimization 653
default remote directories
for FTP connections 561
deleting
connection objects 59
servers 50
workflows 97
delimited flat files
code page 691
code page, sources 224
code page, targets 267
consecutive delimiters 692
escape character 691
escape character, sources 224
numeric data handling 229
quote character 691
quote character, sources 224
quote character, targets 267
session properties, sources 222
session properties, targets 266
sources 691
delimited sources
number of rows to skip 692
delimited targets
session properties 703
delimiter
session properties, sources 222

session properties, targets 266


description
repository objects 73
directories
for historical aggregate data 579
server defaults 46
server variables 46
workspace file 41
disabled
status 421
disabling
tasks 137
workflows 118
displaying
customizing windows 69
date time format 38
Expression Editor 97
fonts 42
options 39
servers in Workflow Monitor 406
show solid lines for links 42
toolbars 69
workspace color 42
documentation
conventions xlix
description xlviii
online xlix
domain name 58
dropping
indexes 248
DTM (Data Transformation Manager)
buffer memory 25
overview 3
post-session email 10
process 7, 11
running sessions and workflows 7
transformation statistics example 469
DTM Buffer Pool Size
optimizing 655
session property 674
tuning 656

E
edit
delimiter 690
edit null characters
session properties 702
editing
delimiter 702

session privileges 178


sessions 177
email
attaching files 333, 342
configuring a user on Windows 322, 342
configuring the PowerCenter Server on UNIX 321
configuring the PowerCenter Server on Windows 322
distribution lists 326
email variables 333
format tags 333
logon network security on Windows 325
MIME format 320
multiple recipients 326
on failure 332
on success 332
overview 320
post-session 332
rmail 321
server variables 333
session properties 714
specifying a Microsoft Outlook profile 327
suspending workflows 339
text message 328
tips 342
user name 328
using other mail programs 343
using server variables 333
Windows service startup account 322
workflows 341
worklets 341
Email tasks
creating 329
description 132
overview 328
See also email 328
suspension email 128
email variables
overview 333
Enable Past Events option 159
enabling enhanced security 44
end of file
transaction control 284
end options
end after 116
end on 116
forever 116
enhanced security
enabling 44
enabling for connection objects 44
environment SQL
configuring 55

guidelines for entering 55


environment variables
PM_CODEPAGENAME 585
PM_HOME 587
PMTOOL_DATEFORMAT 585
repository username and password 586
error handling 186
COBOL sources 227
error log files 489
fixed-width file 227
options 493
overview 201
PMError_MSG table schema 485
PMError_ROWDATA table schema 483
PMError_Session table schema 486
pre- and post-session SQL 186
settings 679
transaction control 284
error log
options 494
session errors 201
error log files 489
error log tables
creating 483
overview 483
error logging
overview 482
error logs
messages 29
error messages
external loader 527
error threshold
$PMSessionErrorThreshold 47
pipeline partitioning 200
stop on errors 200
errors
See also Troubleshooting Guide
eliminating to improve performance 648
fatal 200
minimizing tracing level to improve performance 659
pre-session shell command 193
stopping on 679
threshold 200
validating in Expression Editor 97
Event-Raise tasks
configuring 155
declaring user-defined event 155
definition 153
description 132
in worklets 167

events
in worklets 167
pre-defined events 153
user-defined events 153
Event-Wait tasks
definition 153
description 132
for pre-defined events 158
for user-defined events 157
waiting for past events 159
working with 156
Expression Editor
adding comments 97
displaying 97
syntax colors 97
using 96
validating 119
validating expressions using 97
expressions
optimizing 652
validating 97
external loader
behavior 526
code page 524
connections 551
DB2 528
error messages 527
loading multibyte data 533, 535
on Windows systems 526
Oracle 533
overview 524
performance 643
permissions 525
PowerCenter Server support 524
privileges required to create connection 525
session properties 682, 695
setting up Workflow Manager 553
Sybase IQ 535
Teradata 538
using with partitioned pipeline 380
External Procedure transformation
See also Designer Guide
partitioning guidelines 396

F
fail parent workflow 138
failed status 421
failing workflows
failing parent workflows 148

using Control task 148


fatal errors
session failure 200
file list
creating for multiple sources 230
creating for partitioned sources 375
using for source file 230
file server
for multiple PowerCenter Servers 445
setting up for multiple servers 445
file sources
numeric data handling 229
partitioning 374
server handling 226, 229
session properties 218
file targets
partitioning 380
session properties 261
filter conditions
in partitioned pipelines 372
filtering
deleted tasks in Workflow Monitor 406
servers in Workflow Monitor 406
tasks in Gantt Chart view 405
tasks in Task View 431
filters
optimizing 650
finding objects
Workflow Manager 70
fixed-width files
code page 689
code page, sources 222
code page, targets 266
error handling 227
multibyte character handling 227
null character 689
null characters, sources 222
null characters, targets 266
numeric data handling 229
padded bytes in fixed-width targets 268
source session properties 220
target session properties 265
writing to 268, 269
fixed-width sources
session properties 689
fixed-width targets
session properties 702
flat file definitions
escape character, sources 224
PowerCenter Server handling, targets 268
quote character, sources 224

quote character, targets 267


session properties, sources 218
session properties, targets 261
flat files
See also Designer Guide
code page, sources 222
code page, targets 266
delimiter, sources 224
delimiter, targets 267
increasing performance 660
multibyte data 270
null characters, sources 222
null characters, targets 266
numeric data handling 229
output file session parameter 504
output files 33
precision 270
precision, targets 269
shift-sensitive target 271
source file session parameter 502
fonts
setting 42
format options
changing the font 42
color 42
configuring 42
date and time 38
reset all 42
schedule 38
show solid lines for links 42
Timer task 38
FTP (File Transfer Protocol)
accessing source files 565
accessing target files 568
connecting to file targets 380
connection names 561
connection options 563
creating a session 565
defining connections 561
defining default remote directory 561
defining host names 561
mainframe restrictions 560
overview 560
privileges required to create connections 562
session properties 682, 695
functions
See also Transformation Language Reference
minimizing for optimization 653

G
Gantt Chart
configuring 411
filtering 405
listing tasks and workflows 424
navigating 425
opening and closing folders 407
organizing 425
overview 402
searching 427
using 423
zooming 426
general options
arranging workflow vertically 40
configuring 39
in-place editing 40
launching Workflow Monitor 41
open editor 41
panning windows 40
receive notification from server 41
reload task or workflow 40
session properties 668
show expression on a link 41
show full name of task 41
General tab in session properties
FTP properties 742
in Server Manager 737
in Workflow Manager 668
session commands 750
source options 738
target options 743
General tab of session properties
general options 737
performance options 752
generating
commits with source-based commit 278
Getrunningsessionsdetails
pmcmd syntax 598
Getserverdetails
pmcmd syntax 599
Getserverproperties
pmcmd syntax 599
Getsessionstatistics
pmcmd syntax 600
Gettaskdetails
pmcmd syntax 601
Getworkflowdetails
pmcmd syntax 601
globalization
See also Installation and Configuration Guide

database connections 234


overview 234
targets 234

H
hash partitioning
adding hash keys 362
hash auto-keys partitioning 361
hash user keys partitioning 362
overview 348, 361
Help
pmcmd syntax 602
heterogeneous sources
defined 208
heterogeneous targets
overview 274
high precision
disabling 658
enabling 674
handling 204
optimizing 655
history names
in Workflow Monitor 419
host names
for FTP connections 561
registering the PowerCenter Server 49

I
IBM DB2
connect string example 54
icon
Workflow Monitor 404
worklet validation 171
IIF expressions
See also Transformation Language Reference
optimizing 653
incremental aggregation
See also Installation and Configuration Guide
cache partitioning 621
changing server code page 577
changing server data movement mode 577
changing session sort order 577
configuring 674
configuring the session 579
deleting files 577
files 34
moving files 577
overview 574

partitioning data 578


performance 651
preparing to enable 579
processing 575
reinitializing cache 576
incremental changes
capturing 579
index caches
Aggregator transformation 621
description 614
for incremental aggregation 577
memory usage 26
optimizing 655, 658
Rank transformation 632
indexes
creating directory 579
dropping for target tables 248
finding 577
optimizing by dropping 642
recreating for target tables 248
indicator files
description 33
pre-defined events 156
session output 33
Informatica
documentation xlviii
Webzine l
Informix
connect string syntax 54
row-level locking 379
in-place editing 40
$InputFile
definition 502
naming convention 496, 520
using 503, 507
interactive mode for pmcmd
connecting 592
setting defaults 592

J
joiner cache
overview 624
Joiner transformation
cache partitioning 624
caches 26, 34, 624
joining sorted flat files 385
joining sorted relational data 387
optimizing 651
optimizing performance 650

partitioning guidelines 396


performance detail 639
threads created 19

K
key constraints
optimizing by dropping 642
key range partitioning 348, 363
keys
constraint-based loading 248

L
launch
Workflow Monitor 41, 404
line sequential buffer length
configuring 677
sources 225
links
AND 137
condition 92
example link condition 94
linking tasks concurrently 93
linking tasks sequentially 94
loops 92
OR 137
show expression on a link 41
show solid lines 42
specifying condition 94
using Expression Editor 96
variables in 103
working with 92
List Tasks
in Workflow Monitor 424
Load Manager
creating log files 11
memory usage 24
overview 3
parameters 25
post-session email 10
process 7, 8
running sessions and workflows 7
scheduling workflows 8
validating code pages 12
load summary
sessions 467
local variables
replacing sub-expressions 652

Log and Error Handling tab


batch handling option 759
error handling option 759
log file options 758
parameter file option 759
Server Manager session properties 758
log files
See session logs, workflow logs
See also Installation and Configuration Guide
editor for Workflow Monitor 410
server variable for 46
session log 671
log options
settings 677
logs
server 28
session 31
workflow 30
lookup cache
calculating size 629, 631
overview 628
persistent 35
pipeline partitioning 628
ports included 628
session property 676
lookup caches
See also Designer Guide
enabling 649
query created 628
LOOKUP function
See also Transformation Language Reference
minimizing for optimization 653
Lookup SQL Override option
reducing cache size 649
Lookup transformation
See also Designer Guide
cache partitioning 391
caches 26, 34, 628
calculating cache size 628, 629, 631
enabling caching 649
optimizing 639, 649
optimizing lookup condition 649
optimizing multiple lookup expressions 650
optimizing with indexing 649
loops in workflow 92

M
mapping bottlenecks
identify 638

mapping parameters
See also Designer Guide
in session properties 203
overriding 203
mapping threads
description 14
mapping variables
See also Designer Guide
in partitioned pipelines 394
mappings
definition 2
factoring common logic 652
identify bottlenecks 638
increasing performance 636
single-pass reading 647
master servers 446
master thread
description 14
Maximum Days
Workflow Monitor 410
maximum sessions
See also Installation and Configuration Guide
parameter, description 25
Maximum Workflow Runs
Workflow Monitor 410
memory
caches 614
DTM buffer 25
increasing to avoid paging 662
merge target files
session properties 699
merging target files 380, 382
message queue
using with partitioned pipeline 380
metadata extensions
creating 82
deleting 85
editing 84
overview 82
session properties 718
Microsoft Access
pipeline partitioning 379
Microsoft Outlook
configuring an email user 322, 342
configuring the PowerCenter Server 322
Microsoft SQL Server
bulk loading 642
commit interval 253
connect string syntax 54
optimizing 646

MIME format
email 320
monitoring
data flow 639
session details 434
MOVINGAVG
See also Transformation Language Reference
partitioning restrictions 395
MOVINGSUM
See also Transformation Language Reference
partitioning restrictions 395
multibyte data
character handling 227
Oracle external loader 533
Sybase IQ external loader 535
writing to files 270
multiple servers
overview 444
multiple sessions 196

N
naming convention
See also Getting Started Guide
naming conventions
session parameters 496, 520
native connect string
See connect string
navigating
workspace 69
network packets
increasing 643, 646
non-persistent variables 110
non-reusable tasks
inherited changes 136
promoting to reusable 136
normal loading
session properties 697
Normal tracing levels
definition 473
Normalizer transformation
partitioning guidelines 347
notification
general option 41
null characters
editing 702
file targets 266
server handling 227
session properties, targets 265
targets 702

numeric operations
optimizing by using 653
numeric values
reading from sources 229

O
open transaction
defined 287
operators
using for optimization 653
optimizing
block size 657
buffer block size 655
choosing numeric vs. string operations 653
commit interval 655, 658
data cache 655
data caches 658
data flow 440, 637, 639
disabling high precision 658
dropping indexes and key constraints 642
DTM Buffer Pool Size 655
eliminating transformation errors 648
expressions 652
factoring out common logic 652
filters 650
high precision 655
IIF expressions 653
increasing checkpoint interval 642
increasing network packet size 646
index cache 655, 658
Joiner transformation 651
Lookup transformation 649, 650
mapping 647
minimizing aggregate function calls 652
minimizing datatype conversions 648
minimizing error tracing 659
pipeline partitioning 663
removing trailing blank spaces 653
replacing sub-expressions with local variables 652
sessions 655
single-pass reading 647
source database 645
system-level 660
target database 642
Tracing Level 655
using DECODE vs. LOOKUP expressions 653
using operators vs. functions 653
optimizing performance
Aggregator transformation 650

OR links 137
Oracle
bulk loading 642
bulk loading guidelines 253
commit intervals 253
connect string syntax 54
connection with OS Authentication 53
Oracle external loader
attributes 533
bulk loading 643
connecting with OS Authentication 552
data precision 533
delimited flat file target 533
external loader connections 551
external loader support 524, 533
fixed-width flat file target 533
multibyte data 533
null constraint 533
partitioned target files 533
reject file 534
output files
overview 28, 33
permissions 28
session parameter 504
session properties 700
targets 263
$OutputFile
definition 504
naming convention 496, 520
using 505
override
Teradata loader control file 539
tracing levels 473, 679
owner name
truncating target tables 245

P
packet size 58
paging
eliminating 662
parameter files
format 513
location 518
session 512
specifying in session 518
using with pmcmd starttask 607
using with pmcmd startworkflow 608
parameters
session 496

partition keys
adding 358, 362, 364
adding key ranges 365
partition points
adding and deleting 353
default 17
description 17, 346
Joiner transformation 384
partition types
description 348
partitioning
See pipeline partitioning
partitioning data
incremental aggregation 578
partitioning restrictions
Debugger 396
Informix 379
numerical functions 395
PowerCenter Connect for IBM MQSeries restrictions
397
PowerCenter Connect for PeopleSoft restrictions 397
PowerCenter Connect for SAP BW 397
PowerCenter Connect for SAP R/3 397
PowerCenter Connect for Siebel 398
relational targets 395
Sybase IQ 379, 395
transformations 395
unconnected transformations 353
XML targets 396
Partitioning tab
in the Server Manager 762
in the Workflow Manager 762
Partitions
properties 352
partitions
adding and deleting 356
description 18, 348
Partitions views
properties 351
pass-through pipeline
overview 15
performance
See also optimizing
commit interval 278
detail file 31
identifying bottlenecks 637
monitoring 436
server data movement mode 661
Sybase IQ 643
tuning, overview 636

performance data
collecting 674
performance detail files
creating 436
enabling session monitoring 436
permissions 28
understanding counters 437
viewing 436
performance settings
session properties 674
permissions
connection objects 51
creating a session 175
database 51
deleting a PowerCenter Server 50
editing sessions 177
external loader 525
FTP connections 561
FTP session 565
output and log files 28
recovery files 28
scheduling 90
Workflow Monitor tasks 403
persistent lookup cache
session output 35
persistent variables 110
in worklets 169
pinging
pmcmd syntax 602
PowerCenter Server in Workflow Monitor 405
Pingserver
pmcmd syntax 602
pipeline partitioning
adding and deleting partitions 356
adding hash keys 362
adding key ranges 365
adding partition points 353
caching Lookup transformations 628
concurrent connections 379
configuring a session 351
configuring for sorted data 384
configuring to optimize join performance 384
database compatibility 379
description 346
error threshold 200
example of use 349
external loaders 380, 526
file lists 375
file sources 374
file targets 380
filter conditions 372

hash auto-keys partitioning 361


hash partitioning 361
hash user keys partitioning 362
Joiner transformation 384
key range 363
loading to Informix 379
mapping variables 394
merge target files 699
merging target files 380, 382
message queues 380
multiple CPUs 3
multiple source pipelines 19
numerical functions restrictions 395
object validation 396
optimizing performance 663
optimizing source databases 663
optimizing target databases 664
overview 3
partition keys 358, 362, 364
partition types overview 356
partitioning indirect files 375
pass-through partitioning 367
recovery 200
reject file 476
relational sources 371
relational targets 378
round-robin partitioning 360
rules and restrictions 395, 398
session properties 705
sorted flat files 385
sorted relational data 387
Sorter transformation 389, 392
SQL queries 371
symmetric processing platform 24
threads and partitions 18
threads created 16
Transaction Control transformation 356
pipelines
See source pipelines
active sources 259
data flow monitoring 440, 637, 639
description 346
PM_CODEPAGENAME
using with pmcmd 585
PM_RECOVERY table
format 299
PM_TGT_RUN_ID table
format 299
pmcmd
aborttask 596
abortworkflow 597

command line mode 589


command parameters 594
commands, list 582
commands, reference 594
environment variables 585
getserverdetails 599
getserverproperties 599
getsessionstatistics 600
gettaskdetails 601
getworkflowdetails 601
help 602
interactive mode 592
overview 582
parameter files 607, 608
pingserver 602
resumeworkflow 603
return codes 300
setfolder 604
setnowait 605
setwait 605
showsettings 605
shutdownserver 605
starttask 606
startworkflow 607
stoptask 609
stopworkflow 609
syntax 595
unsetfolder 610
version 611
waittask 611
waitworkflow 611
writing scripts 589
PMError_MSG table schema 485
PMError_ROWDATA table schema 483
PMError_Session table schema 486
$PMFailureEmailUser
definition 333
tips 342
PmNullPasswd
reserved word 53
PmNullUser
reserved word 53
pmserver
process 11
$PMSessionLogCount
saving a number of logs 471
$PMSessionLogDir
configuring the session log 471
definition 469
$PMSessionLogFile
definition 497

using 498
$PMSuccessEmailUser
definition 333
tips 342
PMTOOL_DATEFORMAT
using with pmcmd 585
$PMWorkflowLogDir
definition 459
$PMWorkflowLogCount
saving a number of logs 460
post-session command
session properties 711
shell command properties 714
post-session email
overview 33, 332
See also email
session options 716
session properties 711
post-session shell command
configuring non-reusable 189
configuring reusable 192
using 188
post-session SQL commands 186
post-session threads
description 14
PowerCenter Connect for IBM MQSeries
partitioning restrictions 397
PowerCenter Connect for PeopleSoft
partitioning restrictions 397
PowerCenter Connect for SAP BW
partitioning restrictions 397
PowerCenter Connect for SAP R/3
partitioning restrictions 397
PowerCenter Connect for Siebel
partitioning restrictions 398
PowerCenter Server 22
architecture 2
assigning sessions 198
assigning workflows 122
blocking data 23
changing servers 445
commit interval overview 276
configuring for multiple servers 445
connecting in Workflow Monitor 405
connectivity overview 5, 46
creating server grids 451
data movement modes 27
deleting 50
external loader support 524
filtering in Workflow Monitor 406
handling file targets 268

logs 28
messages 29
monitoring 436
multiple servers overview 444
multiple source file list 230
online and offline mode 405
output files 33
performance detail file 31
permissions to delete 50
pinging in Workflow Monitor 405
privileges required to register 46
processing data 22
reading sources 22
registering 46, 48
removing assigned sessions 199
removing assigned workflows 123
reporting session statistics 468
server grids overview 446
system resources 24
tracing levels 473
truncating target tables 245
using FTP 561
using multiple to increase performance 661
using server grids to increase performance 661
variables for 46
pre- and post-session SQL
entering 186
guidelines 186
precision
flat files 270
writing to file targets 269
pre-defined events
waiting for 158
pre-defined variables
in Decision tasks 149
pre-session shell command
configuring non-reusable 189
configuring reusable 192
errors 193
session properties 711
using 188
pre-session SQL commands 186
pre-session threads
description 14
privileges
See also permissions
See also Repository Guide
scheduling 90
session 175
workflow 90
Workflow Monitor tasks 403

workflow operator 90
Properties tab in session properties
in Workflow Manager 670

Q
Quit
pmcmd syntax 602
quoted identifiers
reserved words 255

R
rank cache
calculating data cache 633
calculating index cache 632
location 632
overview 632
size 632
Rank transformation
See also Transformation Guide
cache partitioning 620
caches 26, 34, 632
partitioning guidelines 347
performance detail 639
reader threads
description 14, 15
reading
sources 22
real-time sessions
transformation scope 288
recovering
pipeline partitioning 200
recovery
completing unrecoverable sessions 316
configuring mappings 297
configuring the session 297
configuring the target database 298
configuring the workflow 298
files, permissions 28
overview 296
PM_RECOVERY table format 299
PM_TGT_RUN_ID table format 299
pmcmd return codes 300
recover from task 308
recover task 311
recovering a failed workflow 308
recovering a session task 311
recovering a suspended workflow 305
recovery table layout 314
resume/recover 305
server handling 314
recovery files
permissions 28
recreating
indexes 248
registering
PowerCenter Server 46, 48
registering server
See also Installation and Configuration Guide
reinitializing
aggregate cache 576
reject file
changing names 476
column indicators 478
locating 456, 476
Oracle external loader 534
overview 32
permissions 28
pipeline partitioning 476
reading 477
row indicators 478
session parameter 508
session properties 243, 263, 698, 700
transaction control 284
viewing 476
relational connections
See relational databases
relational databases
configuring a connection 56
copying a relational database connection 59
replacing a relational database connection 62
rollback segment 58
relational sources
partitioning 371
session properties 214
relational targets
partitioning 378
partitioning restrictions 395
session properties 240, 697
Relative time
specifying 162
Timer task 161
reload task or workflow
configuring 40
rename
repository objects 73
repositories
adding 73
connecting in Workflow Monitor 405
enter description 73

repository objects
configuring 73
rename 73
Repository Server
notification 41
notification in Workflow Monitor 410
requirements
server grids 448
reserved words
generating SQL with 255
resword.txt 255
reserved words file
creating 256
reset all 42
restarting
in Workflow Monitor 416
Resumeworkflow
pmcmd syntax 603
Resumeworklet
pmcmd syntax 603
reusable tasks
inherited changes 136
reverting changes 136
reverting changes
tasks 136
rmail
See also email
configuring 321
rollback segment 58
rolling back data
transaction control 283
round-robin partitioning 348, 360
row error log files
permissions 28
row error logging
active sources 260
row indicators
reject file 478
rows to skip
delimited files 692
Run if Previous Completed
in Command Tasks 145
session command 714
run options
run continuously 115
run on demand 115
server initialization 115
running status 421
running, sessions 197
running, workflows 122

S
saving
session logs 471
workflow logs 459
scheduled status 421
scheduling
configuring 114
creating reusable scheduler 114
disabling workflows 118
editing 117
end options 116
error message 113
permission 90
run every 115
run once 115
run options 115
schedule options 115
start date 116
start time 116
workflows 112
searching
for versioned objects in the Workflow Manager 76
Workflow Manager 70
Workflow Monitor 427
Sequence Generator transformation
partitioning guidelines 353, 396
server
See PowerCenter Server
See also database-specific server
selecting 122, 197
server code page
See also PowerCenter Server
affecting incremental aggregation 577
Server Grid Browser 453
Server Grid Editor 452
server grids
connectivity 447
creating 451
definition 444
distributing sessions 446
increasing performance 661
master servers 446
overview 446
requirements 448
worker servers 446
server handling
file targets 268
fixed-width targets 269, 270
multibyte data to file targets 271
shift-sensitive data, targets 271

server logs
messages 29
overview 28
Server Manager session properties
General tab 737
Log and Error Handling tab 758
Partitioning tab 762
Source Location tab 754
Time tab 755
Transformations tab 761
server variables
description 46
email 333
for multiple servers 445
in Command tasks 188, 193
list 47
log files 46
servers
assigned 444
non-associated 444
session command settings
session properties 711
session details
monitoring sessions 434
session errors 201
session logs
archiving 471
changing location 498
changing locations 471
changing name 497
changing names 471
code page 475
codes 463
creation 11
default name 470
editing 419
external loader error messages 527
generating using UTF-8 463
load summary 467
locating 456, 469
location 671
log file settings 469, 470, 472, 474
overview 31
parameter 497
permissions 28
reading 463
sample 466
saving 678
session details 31
session parameter 497
thread identification 465

timestamp 472
tracing levels 473
transformation statistics 469
viewing 474
viewing dynamically 419
viewing in Workflow Monitor 419
session output
cache files 34
control file 33
incremental aggregation files 34
indicator file 33
performance detail file 31
persistent lookup cache 35
post-session email 33
PowerCenter Server log 28
reject file 32
session logs 31
target output file 33
session parameters
database connection parameter 499
defining 512
in Command tasks 143
naming conventions 496, 520
overview 496
reject file parameter 508
session log parameter 497
session parameter file 512
source file parameter 502
target file parameter 504
session properties
Components tab 710
Config Object tab 675
constraint-based loading 251
delimited files, sources 222
delimited files, targets 266
edit delimiter 690, 702
edit null character 702
email 332, 714
external loader 682, 695
fixed-width files, sources 220
fixed-width files, targets 265
FTP files 682, 695
general settings 668
General tab 668
log files 469, 470, 472, 474
Metadata Extensions tab 718
null character, targets 265
on failure email 332
on success email 332
output files, flat file 700
partition attributes 351, 352

Partitions View 705


performance settings 674
post-session email 332
post-session shell command 714
Properties tab 670
reject file, flat file 263, 700
reject file, relational 243, 698
relational sources 214
relational targets 240
session command settings 711
session retry on deadlock 246
sort order 577
source connections 211
sources 210
table name prefix 254
target connection settings 682, 695
target connections 237
target load options 252, 697
target-based commit 292
targets 236
Transformation node 703
transformations 703
session properties comparison
overview 736
session retry on deadlock
See also Installation and Configuration Guide
overview 246
sessions
See also session logs
See also session properties
aborting 130, 200
apply attributes to all instances 178
assigning PowerCenter Servers 198
caches 28
configuring for multiple source files 231
configuring to optimize join performance 384
creating 175
creating a session configuration object 183
definition 2, 174
description 132
distributing in server grids 446
DTM buffer memory 25
editing 177
editing privileges 178
eliminating paging 621
email 320
enabling monitoring 436
external loading 524, 553
failure 200
high-precision data 204
identifying bottlenecks 639

metadata extensions in 82
monitoring counters 437
multiple source files 230
optimizing 636, 655
output files 28
overview 174
parameter file 512
parameters 496
performance detail file 31
performance tuning 636
properties reference 667
read-only 175
removing assigned PowerCenter Servers 199
running 197
runtime operations overview 7
session details file 31
starting 197
stopping 130, 200
test load 244, 264
truncating target tables 245
using FTP 565
validating 195
viewing performance details 436
Setfolder
pmcmd syntax 604
Setnowait
pmcmd syntax 605
Setwait
pmcmd syntax 605
shared memory
Load Manager 24
shell commands
executing in Command tasks 145
make reusable 191
post-session 188
post-session properties 714
pre-session 188
using Command tasks 143
using server variables 188, 193
using session parameters 143
Showsettings
pmcmd syntax 605
Shutdownserver
pmcmd syntax 605
single-pass reading
definition 647
sort order
See also session properties
affecting incremental aggregation 577
sorted flat files
partitioning for optimized join performance 385

sorted ports
caching requirements 621
sorted relational data
partitioning for optimized join performance 387
Sorter transformation
partitioning 392
partitioning for optimized join performance 389
$Source
session properties 672
source bottlenecks
using a database query to identify 638
using a read test session to identify 638
using filter transformation to identify 637
source data
capturing changes for aggregation 574
source databases
database connection session parameter 499
identifying bottlenecks 637
optimizing 645
optimizing by partitioning 663
optimizing the query 645
optimizing with conditional filters 646
source files
accessing through FTP 560, 565
configuring for multiple files 230, 231
delimited properties 691
fixed-width properties 689
session parameter 502
session properties 220, 687
using parameters 502, 506
source location
session properties 220, 687
Source Location tab
in the Workflow Manager 754
Server Manager session properties 754
source pipelines
description 346
pass-through 15
reading 22
stages 17
target load order groups 22
threads created 19
with Joiner transformations 19
Source Qualifier transformation
partitioning guidelines 347
source-based commit
active sources 278
description 278
sources
code page 224
code page, flat file 222

connections 211
delimiters 224
escape character 691
line sequential buffer length 225
multiple sources in a session 230
null character 689
null character handling 227
null characters 222
overriding SQL query, session 216
partitioning 371, 374
quote character 691
reading 22
session properties 210
specifying code page 689, 691
SQL
configuring environment SQL 55
guidelines for entering environment SQL 55
SQL queries
in partitioned pipelines 371
stages
description 17
staging areas
removing to improve performance 659
start date, scheduling 116
Start tasks, definition 88
start time, scheduling 116
starting
selecting a server 122, 197
sessions 197
start from task 124
starting a part of a workflow 124
starting tasks 125
starting workflows using Workflow Manager 124
Workflow Monitor 404
workflows 122
Starttask
pmcmd syntax 606
using a parameter file 607
Startworkflow
pmcmd syntax 607
using a parameter file 608
statistics
for Workflow Monitor 408
viewing 408
status
aborted 421
aborting 421
disabled 421
failed 421
in Workflow Monitor 421
running 421

scheduled 421
stopped 421
stopping 421
succeeded 421
suspended 127, 421
suspending 127, 421
tasks 421
terminated 421
unscheduled 421
waiting 421
workflows 421
stop on
$PMSessionErrorThreshold 47
error threshold 200
errors 679
pre- and post-session SQL errors 186
stopped status 421
stopping
PowerCenter Server See Installation and Configuration
Guide
in Workflow Monitor 418
server handling 129
sessions 130
tasks 129
using Control tasks 147
workflows 129
stopping status 421
Stoptask
pmcmd syntax 609
Stopworkflow
pmcmd syntax 609
string operations
minimizing for performance 653
sub-expressions
replacing with local variables 652
succeeded status 421
Suspend On Error option 127
suspended status 127, 421
suspending
behavior 127
email 128
resume in Workflow Monitor 417
status 127
workflows 127
worklets 164
suspending status 421
suspension email 339
Sybase
commit interval 253
Sybase IQ
partitioning restrictions 379, 395

Sybase IQ external loader


attributes 536
bulk loading 643
connections 551
data precision 535
delimited flat file targets 536
fixed-width flat file targets 535
multibyte data 535
optional quotes 535
overview 535
support 524
Sybase SQL Server
bulk loading 642
connect string example 54
optimizing 646
symmetric processing platform
pipeline partitioning 24
system bottlenecks
identifying 640
UNIX 641
Windows 640
system-level optimization
improving network speed 660
overview 660
using additional CPUs 661

T
table name prefix
target owner 254
table owner name
session properties 216
targets 254
$Target
session properties 672
target connect groups
committing data 278
target connection group
Transaction Control transformation 289
target connection groups
constraint-based loading 249
defined 257
target connection settings
session properties 682, 695
target databases
bulk loading 642
database connection session parameter 499
identifying bottlenecks 637
optimizing 642
optimizing by partitioning 664

optimizing Oracle target database 643


target files
delimited 703
fixed-width 702
target load order
constraint-based loading 249
groups 22
target load order groups
defined 22
target owner
table name prefix 254
target properties
bulk mode 241
test load 241
update strategy 241
target tables
truncating 245
target-based commit
WriterWaitTimeout 277
target-based commit interval
description 277
targets
accessing through FTP 560, 568
code page 267, 702, 703
code page compatibility 235
code page, flat file 266
connection settings 695
connections 237
database connections 234
delimiters 267
file writer 236
globalization features 234
heterogeneous 274
load, session properties 252, 697
merging output files 380, 382
multiple connections 274
multiple types 274
null characters 266
output files 263
output files for 33
partitioning 378, 380
relational settings 697
relational writer 236
session properties 236, 240
specifying null character 702
truncating tables 245
viewing session detail 31
writers 236
Task Developer
creating tasks 133
displaying and hiding tool name 41

Task view
configuring 412
customizing 412
displaying 430
filtering 431
hiding 412
opening and closing folders 407
overview 402
using 430
tasks
aborted 421
aborting 129, 421
adding in workflows 92
arranging 71
Assignment tasks 140
Command tasks 143
configuring 135
Control task 147
copying 77
creating 133
creating in Task Developer 133
creating in Workflow Designer 133
Decision tasks 149
disabled 421
disabling 137
email 328
Event-Raise tasks 153
Event-Wait tasks 153
failed 421
failing parent workflow 138
in worklets 166
inherited changes 136
instances 136
list of 132
non-reusable 92
overview 132
promoting to reusable 136
restarting in Workflow Monitor 416
reusable 92
reverting changes 136
running 421
show full name 41
starting 125
status 421
stopped 421
stopping 129, 421
stopping and aborting in Workflow Monitor 418
succeeded 421
Timer tasks 161
using Tasks toolbar 92
validating 119

Tasks toolbar
creating tasks 134
TCP/IP network protocol
server settings 49
Teradata
connect string example 54
Teradata external loader
code page 538
connections 551
date format 538
FastLoad attributes 545
MultiLoad attributes 540
overriding the control file 539
support 524
Teradata Warehouse Builder attributes 547
TPump attributes 542
Teradata Warehouse Builder
attributes 547
operators 547
terminated status 421
Terse tracing levels
See also Designer Guide
defined 473
test load
bulk loading 244
enabling 671
file targets 264
number of rows to test 671
relational targets 244
thread identification
session log file 465
threads
and partitions 18
creation 13, 14
mapping 14
master 14
post-session 14
pre-session 14
reader 14, 15
transformation 14, 16
types 14
writer 14, 16
time
configuring 38
formats 38
Time tab
duration options 756
schedule options 755
Server Manager session properties 755
start options 756
use absolute time option 757

Timer tasks
absolute time 161, 162
definition 161
description 132
example 161
relative time 161, 162
variables in 103
timestamps
session logs 472
workflow logs 460, 462
Workflow Monitor 402
tool names
displaying and hiding 41
toolbars 69
adding tasks 92
creating tasks 134
using 69
Workflow Monitor 415
Tracing Level
optimizing 655
tracing levels
See also Designer Guide
Normal 473
overriding 679
session 473
Terse 473
Verbose Data 474
Verbose Initialization 474
transaction
defined 287
transaction boundary
dropping 287
transaction control 287
transaction control
bulk loading 283
end of file 284
open transaction 287
overview 287
PowerCenter Server handling 283
real-time sessions 287
reject file 284
rules and guidelines 290
transaction control points 287
transformation error 284
transformation scope 287
user-defined commit 283
transaction control point
defined 287
Transaction Control transformation
partitioning guidelines 356
target connection group 289

transaction control unit


defined 289
transaction generator
active sources 259
effective and ineffective 259
transaction control points 287
transformation scope
defined 287
real-time processing 288
transformations 288
transformation threads
description 14, 16
transformations
as partition points 353
eliminating errors 648
optimizing 639
partitioning restrictions 395
session properties 703
statistics on 469
Transformations node
properties 703
Transformations tab
in the Server Manager 761
in the Workflow Manager 761
Transformations view
session properties 681
Treat Source Rows As
bulk loading 252
Treat Source Rows As property
overview 214
truncating
Table Name Prefix 245
target tables 245

U
unconnected transformations
partitioning restrictions 353
Unicode mode
See also Installation and Configuration Guide
code pages 27
session behavior 16
UNIX systems
email 321
external loader behavior 526
PowerCenter Server as daemon 3
unscheduled status 421
Unsetfolder
pmcmd syntax 610

update strategy
target properties 241
Update Strategy transformation
constraint-based loading 249
updating
incrementally 579
URL
adding through business documentation links 97
user-defined commit
see also transaction control
bulk loading 283
user-defined events
declaring 155
example 153
waiting for 157
using multiple servers 444

V
validating 196
expressions 97, 119
tasks 119
workflows 119, 120
worklets 171
Varchar datatypes
See also Designer Guide
removing trailing blanks for optimization 653
variables
email 333
server 46
workflow 103
Verbose Data tracing levels
configuring session log 474
See also Designer Guide
Verbose Initialization tracing levels
configuring session log 474
See also Designer Guide
Version
pmcmd syntax 611
versioned objects
See also Repository Guide
checking in 74
checking out 74
searching for in the Workflow Manager 76
viewing
reject file 476
session logs 474
workflow logs 462

W
waiting status 421
Waittask
pmcmd syntax 611
Waitworkflow
pmcmd syntax 611
web links
adding to expressions 97
webzine l
windows
customizing 69
displaying and closing 69
docking and undocking 69
Navigator 67
Output 67
overview 67
panning 40
reloading 40
Workflow Manager 67
Workflow Monitor 402
workspace 67
Windows System Tray
accessing Workflow Monitor 404
Windows systems
email 322
external loader behavior 526
Informatica service owner 322
logon network security 325
PowerCenter Server service 3
worker servers 446
Workflow Designer
creating tasks 133
displaying and hiding tool name 41
workflow logs
archiving 459
changing locations 461
changing name 461
codes 458
configuring 460
creation 9
editing 419
enabling and disabling 459, 461
locating 456, 459
log file settings 459, 460
overview 30
permissions 28
reading 458
sample 458
timestamp 460
viewing 462

viewing dynamically 419


viewing in Workflow Monitor 419
Workflow Manager
adding repositories 73
arrange 71
checking out and in versioned objects 74
configuring for multiple source files 231
copying 77
creating external loader connections 551
customizing options 39
date and time formats 38
defining FTP connections 561
display options 39
entering object descriptions 73
format options 42
general options 39
increasing network packet size 646
managing multiple servers 444
messages to Workflow Monitor 410
overview 38, 46, 66
registering the PowerCenter Server 46, 48
searching for items 70
searching for versioned objects 76
setting up database connections 53, 56
toolbars 69
tools 66
validating sessions 195
windows 67, 69
zooming the workspace 71
Workflow Monitor
closing folders 407
configuring 409
connecting to repositories 405
connecting to server 405
customizing columns 412
deleted servers 405
deleted tasks 406
disconnecting from server 405
displaying servers 406
dynamic logs 419
editing logs 419
filtering deleted tasks 406
filtering servers 406
filtering tasks in Task View 405, 431
Gantt Chart view 402
hiding columns 412
hiding servers 406
icon 404
launching 404
launching automatically 41
listing tasks and workflows 424
log file editor 410
Maximum Days 410
Maximum Workflow Runs 410
monitor modes 405
navigating the Time window 425
notification from Repository Server 410
opening folders 407
overview 402
performing tasks 416
permissions and privileges 403
pinging the PowerCenter Server 405
receive messages from Workflow Manager 410
restarting tasks, workflows, and worklets 416
resuming a workflow or worklet 417
searching 427
session details 434
starting 404
statistics 408
stopping or aborting tasks and workflows 418
switching views 403
System Tray 404
Task view 402
time 402
toolbars 415
viewing history names 419
viewing session logs 419
viewing workflow logs 419
workflow and task status 421
zooming 426
workflow output
email 33
workflow logs 30
workflow parameter file 110
workflow properties
log files 459, 460
suspension email 339
workflow variables
creating 110
datatypes 105, 110
default values 106, 109, 110
keywords 104
non-persistent variables 110
persistent variables 110
pre-defined 105
start and current values 109
SYSDATE 105
user-defined 108
using 103
using in expressions 106
WORKFLOWSTARTTIME 105

workflows
aborted 421
aborting 129, 421
adding tasks 92
assigning PowerCenter Servers 122
branches 88
copying 77
creating 91
definition 2, 88
deleting 97
developing 89, 91
disabled 421
disabling 118
editing 98
email 341
events 88
fail parent workflow 138
failed 421
guidelines 89
links 88
locking 8
metadata extensions in 82
monitor 89
overview 88
parameter file 9
privileges 90
properties reference 721
removing assigned PowerCenter Servers 123
restarting in Workflow Monitor 416
resuming in Workflow Monitor 417
running 7, 122, 421
runtime operations overview 7
scheduled 421
scheduling 112
selecting a server 89
starting 122
starting on non-associated server 444
status 127, 421
stopped 421
stopping 129, 421
stopping and aborting in Workflow Monitor 418
succeeded 421
suspended 421
suspending 127, 421
suspension email 339
terminated 421
unscheduled 421
using tasks 132
validating 119
variables 103
waiting 421

Worklet Designer
displaying and hiding tool name 41
worklets
adding tasks 166
configuring properties 166
create non-reusable worklets 165
create reusable worklets 165
declaring events 167
developing 165
email 341
fail parent worklet 138
metadata extensions in 82
overriding variable value 169
overview 164
parameters tab 169
persistent variable example 169
persistent variables 169
restarting in Workflow Monitor 416
resuming in Workflow Monitor 417
suspended 421
suspending 164, 421
unscheduled 421
validating 171
variables 169
waiting 421
workspace
color 42
navigating 69
setting colors 42
setting fonts 42
zooming 71
workspace file directory 41
writer threads
description 14, 16
writers
session properties 692
WriterWaitTimeout
target-based commit 277
writing
multibyte data to files 270
to fixed-width files 268, 269

X
XML sources
allocating memory 655
numeric data handling 229
XML targets
active sources 259
partitioning restrictions 396

Z
zooming
Workflow Manager 71
Workflow Monitor 426
