HOW TO CREATE BUILDOPS IN DATASTAGE

Buildops:

You can define a Build stage to enable you to provide a custom operator that can be executed from
a DataStage Parallel job stage.

When defining a Build stage you provide the following information:

- Description of the data that will be input to the stage.
- Whether records are transferred from input to output. A transfer copies the input record to
  the output buffer. If you specify auto transfer, the operator transfers the input record to the
  output record immediately after execution of the per record code. The code can still access
  data in the output buffer until it is actually written.
- Any definitions and header file information that needs to be included.
- Code that is executed at the beginning of the stage (before any records are processed).
- Code that is executed at the end of the stage (after all records have been processed).
- Code that is executed every time the stage processes a record.
- Compilation and build details for actually building the stage.
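
The pieces listed above fit together as a simple lifecycle. The following is a minimal stand-alone C++ sketch of that flow, not the code that DataStage generates. The names InRec, OutRec, BuildStageSketch and runStage are hypothetical, and the auto transfer is modeled, as an assumption, as a plain copy of the input record into the output buffer right after the per-record code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record layouts; a real Build stage takes these from table definitions.
struct InRec  { int fno; int sno; };
struct OutRec { InRec transferred; int sum; };

// Hypothetical stand-in for the user code entered in the Build stage dialog.
struct BuildStageSketch {
    int recordsSeen = 0;
    bool finished = false;

    void preLoop() { recordsSeen = 0; }              // runs before any record
    void perRecord(const InRec& in, OutRec& out) {   // runs once per record
        out.sum = in.fno + in.sno;
        ++recordsSeen;
    }
    void postLoop() { finished = true; }             // runs after all records
};

// Drives the lifecycle: pre-loop, then per-record code, optional auto
// transfer and write for each record, then post-loop.
inline std::vector<OutRec> runStage(BuildStageSketch& stage,
                                    const std::vector<InRec>& input,
                                    bool autoTransfer) {
    std::vector<OutRec> output;
    stage.preLoop();
    for (const InRec& in : input) {
        OutRec out{};                                // output buffer for this record
        stage.perRecord(in, out);
        if (autoTransfer) out.transferred = in;      // copy input into the output buffer
        output.push_back(out);                       // the record is "written" here
    }
    stage.postLoop();
    return output;
}
```

Calling runStage with autoTransfer set to false leaves the transferred part of each output record zero-initialized, which mirrors the case where the code would have to perform the transfer itself.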

When you have specified the information, and request that the stage is generated, DataStage
generates a number of files and then compiles these to build an operator which the stage executes.
The generated files include:

- Header files (ending in .h)
- Source files (ending in .c)
- Object files (ending in .o)

Advantages of using Buildops:

- Very efficient because they are native to the framework
- Can implement extremely complex algorithms
- Can be developed in a stand-alone mode without being connected to the DataStage server
- Highly reusable and portable
- Support RCP (runtime column propagation)
- Provide immediate access to the Orchestrate C++ classes

How to create Buildop in datastage 1 of 17 Last updated: 12/5/2017

Properties of Buildops:

- Parallel only
- Must have at least one input interface and one output interface
- Interfaces are static
- Partitioning type is Same

The tool consists of a GUI portion and the Unix command buildop.

The BuildOp GUI helps you create the operator definition file (.opd file) and calls the buildop
command to generate the operator executable file.

The buildop command syntax is:

buildop options .opd_file

Let's create a BuildOp (Calculator):

Source:

The input file (source) has the following file definition (metadata). Our sample source file
contains two integer columns:

(1) fno
(2) sno

Target:

The output file (target) has its own file definition (metadata).

Our sample target file contains 7 columns: ans_add, ans_mul, ans_div, ans_exp, org_a, org_b
and ans_max.

Business Rules for Transformations:

Let's assume that we want to apply the following transformation rules to the source file in order
to populate our target file.

Ans_add = fno + sno
Ans_mul = fno * sno
Ans_div = fno / sno
Ans_exp = fno % sno (the remainder of fno divided by sno)
Org_a = fno
Org_b = sno
Ans_max = maxof(fno, sno)
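
The rules above can be tried out as plain C++ before any buildop is involved. This is only a sketch: CalcRow and applyRules are hypothetical names that do not come from DataStage, and the columns are assumed to be plain int values.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical struct mirroring the seven target columns.
struct CalcRow {
    int ans_add, ans_mul, ans_div, ans_exp, org_a, org_b, ans_max;
};

// Applies the business rules to one source row.
// Assumes sno != 0: integer division and % by zero are undefined in C++.
inline CalcRow applyRules(int fno, int sno) {
    CalcRow r{};
    r.ans_add = fno + sno;
    r.ans_mul = fno * sno;
    r.ans_div = fno / sno;            // integer division truncates toward zero
    r.ans_exp = fno % sno;            // remainder, despite the column name
    r.org_a = fno;
    r.org_b = sno;
    r.ans_max = std::max(fno, sno);
    return r;
}
```

For example, applyRules(7, 3) fills the row with 10, 21, 2, 1, 7, 3 and 7, in the column order of the struct.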

Let's start creating our BuildOp calculator.

- Go to the palette (in DataStage Designer)

- Right-click on Stage Type

- Click on New Parallel Stage

- Select Build

This opens the buildop dialog box.

General Tab:

Type the stage type name (this is the name you will be giving to your buildop).

Supply a category (the folder where your buildop will reside within the stages on your palette).

Supply a short description (this will help others to know what this buildop does).

Execution mode is Parallel by default.

Class Name is populated automatically and is the same as your stage type name.

Creator tab:

Here you can supply the author name and version number.

You can also supply the bitmaps (icons) for your buildop. There are two bitmaps:

(1) 16 x 16: a small icon that appears on the palette
(2) 32 x 32: a large icon that appears in your job design

Properties Tab:

Here you can type in various properties for your buildop.

For example, if you are doing lookups using your buildop, you can specify the Key value
property here.

We don't need any properties for our sample calculator buildop.

Build tab (very important):

Build tab has three sub-tabs within:

(A) Interfaces:

(1) Input:

The Input tab is used to define the input links coming in to the buildop.

Port Name: you can supply your own port names here (compare port names with the link
names in existing stages, for example the Merge stage, which has two input links, master and
update).

Auto Read: if you set this to true, you don't need to use macros to read the input records
(discussed later). If you set it to false, you must use macros to read the input records.

Table Name: select the table definition (metadata) using the Table Name column. Our
calculator source file metadata is saved under the Saved\Tst_buildop\a directory, so we can
select that here.

RCP: you can enable RCP by selecting true, or leave it false.

(2) Output:

Output tab is used to define your output links coming out from the buildop.

Port Name: you can supply your own port names here (compare port names with the link
names in existing stages, for example the Merge stage, which has two output links, output and
reject).

Auto Write: if you set this to true, you don't need to use macros to write the output records.
If you set it to false, you must use macros (such as writeRecord) to write the output records.

Table Name: select the table definition (metadata) using the Table Name column. Our
calculator target file metadata is saved under the Saved\Tst_buildop\ans directory, so we can
select that here.

RCP: you can enable RCP by selecting true, or leave it false.

(3) Transfer:

The Transfer tab is used to specify how the data transfer will occur. In our sample buildop we
take one record from the source and write one record to the target, so our input will be ina
(source) and our output will be ans (target). Set the Auto Transfer property to true; otherwise
you have to use transfer macros while coding.

(B) Logic:

This is a very important tab in the buildop: the actual buildop logic is defined here.

The Logic tab has four sub-tabs:

(1) Definitions:

Remember that buildop code is native C++ code, so we need to define many variables, classes,
structures and arrays in order to do the task.

Any definitions related to the C++ code always go in the Definitions tab.

All #include and #define statements always go under the Definitions tab
(compare this with our C++ header files from training).
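
As an illustration of the kind of content that could live in the Definitions tab, here is a small sketch with one #define and two helper functions. All names are hypothetical and none of them is part of the DataStage API. The sketch also assumes that the Definitions content ends up at file scope in the generated source, so that the per-record code can call the helpers.

```cpp
#include <cassert>

// Hypothetical constant: the value returned when the divisor is zero.
#define CALC_DIV_BY_ZERO_RESULT 0

// Hypothetical helpers that per-record code could call instead of using
// / and % directly, so that a zero divisor does not crash the operator.
inline int safeDiv(int a, int b) { return b == 0 ? CALC_DIV_BY_ZERO_RESULT : a / b; }
inline int safeMod(int a, int b) { return b == 0 ? CALC_DIV_BY_ZERO_RESULT : a % b; }
```

Returning a fixed value for a zero divisor is only one possible design choice; sending such records to a reject link would be another.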

(2) Pre-Loop:

Any code specified in this tab is executed first when the buildop runs.

In our example we need to read the first record from our source ina. For this task we can use the
buildop macro readRecord(input name).

In our example, readRecord(ina.portid_) reads the first available record from our source file,
where ina is our source (input) name.
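
Reading the first record in Pre-Loop is the usual "priming read" pattern: read once up front, then process and read again until the input is exhausted. The sketch below imitates that pattern with an in-memory stand-in. MockInputPort and sumAll are hypothetical names, and the way the mock decides that the input is done is an assumption, not a description of how the real readRecord and inputDone macros are implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for an input port; not the DataStage API.
class MockInputPort {
public:
    explicit MockInputPort(std::vector<int> rows) : rows_(std::move(rows)) {}

    // Loads the next record, or marks the input as done when none is left.
    void readRecord() {
        if (next_ < rows_.size()) current_ = rows_[next_++];
        else done_ = true;
    }
    bool inputDone() const { return done_; }
    int current() const { return current_; }

private:
    std::vector<int> rows_;
    std::size_t next_ = 0;
    int current_ = 0;
    bool done_ = false;
};

// Priming-read loop: read once up front, then process and read again.
inline int sumAll(MockInputPort& port) {
    int total = 0;
    port.readRecord();               // Pre-Loop: fetch the first record
    while (!port.inputDone()) {      // Per-Record: loop until the input is exhausted
        total += port.current();
        port.readRecord();           // fetch the next record
    }
    return total;
}
```

With this structure an empty input is handled naturally: the first read marks the port as done and the loop body never runs.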

(3) Per-Record:

This is the tab where the actual transformation of a record occurs.

It contains pure C++ code with some buildop macros.

Actual code for our calculator (refer to the Explanation in the next section for details):

    // Process every input record until the input port is exhausted.
    while (!inputDone(ina.portid_))
    {
        // Note: the division and remainder below assume ina.sno is non-zero;
        // integer division by zero is undefined behavior in C++.
        ans.ans_add = ina.fno + ina.sno;
        ans.ans_mul = ina.fno * ina.sno;
        ans.ans_div = ina.fno / ina.sno;
        ans.ans_exp = ina.fno % ina.sno;
        ans.org_a = ina.fno;
        ans.org_b = ina.sno;

        // Store the larger of the two values, then write the output record.
        if (ina.sno < ina.fno)
        {
            ans.ans_max = ina.fno;
            writeRecord(ans.portid_);
        }
        else
        {
            ans.ans_max = ina.sno;
            writeRecord(ans.portid_);
        }

        // Fetch the next input record.
        readRecord(ina.portid_);
    }

    // Log-reporting fragment. It assumes that prefix, indent and logDetail
    // are declared elsewhere; they are not defined in this listing.
    prefix[indent] = '\0';
    cout << "event id " << prefix;
    if (logDetail->eventId < 0)
        cout << "unknown";
    else
        cout << logDetail->eventId << endl;

    // Print a label for the type of the log event.
    switch (logDetail->type)
    {
    case DSJ_LOGINFO:
        cout << "INFO";
        break;
    case DSJ_LOGWARNING:
        cout << "WARNING";
        break;
    case DSJ_LOGFATAL:
        cout << "FATAL";
        break;
    case DSJ_LOGREJECT:
        cout << "REJECT";
        break;
    case DSJ_LOGSTARTED:
        cout << "STARTED";
        break;
    case DSJ_LOGRESET:
        cout << "RESET";
        break;
    case DSJ_LOGBATCH:
        cout << "BATCH";
        break;
    case DSJ_LOGOTHER:
        cout << "OTHER";
        break;
    default:
        cout << "Successful execution of Buildop";
        break;
    }
    cout << endl;
    cout << "Message" << prefix;

Explanation:

while(!inputDone(ina.portid_))

The inputDone macro reports whether the end of the input has been reached, so this while
statement loops through all the input records in the source until there are no more records.

ans.ans_add = ina.fno + ina.sno;

This statement adds the two numbers (the values in columns fno and sno in our source) and
assigns the result to the output field ans_add.

ans.ans_mul = ina.fno * ina.sno;

This statement multiplies the two numbers (the values in columns fno and sno in our source) and
assigns the result to the output field ans_mul.

ans.ans_div = ina.fno / ina.sno;

This statement divides fno by sno (the values in columns fno and sno in our source) and assigns
the result to the output field ans_div. Because both columns are integers, this is an integer
division.

ans.ans_exp = ina.fno % ina.sno;

This statement computes the remainder (modulus) of fno divided by sno and assigns the result to
the output field ans_exp. Despite the column name, % is not an exponential function in C++.
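
Note that in C++ the % operator is the remainder operator; there is no built-in exponent operator, and exponentiation is normally done with std::pow. The short sketch below contrasts the two. The helper names remainderOf and powerOf are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cmath>

// Remainder of integer division; assumes b != 0.
inline int remainderOf(int a, int b) { return a % b; }

// True exponentiation for comparison, computed in floating point.
inline double powerOf(int base, int exponent) { return std::pow(base, exponent); }
```

For integers a and b with b non-zero, (a / b) * b + a % b equals a, which is a handy way to check results from the ans_div and ans_exp columns against each other.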

ans.org_a = ina.fno;

Straight move of source column fno to target column org_a.

ans.org_b = ina.sno;

Straight move of source column sno to target column org_b.

if ( ina.sno < ina.fno)
{
ans.ans_max = ina.fno;
writeRecord(ans.portid_);
}

Together with the matching else branch, these statements compare the values in the sno and fno
columns of the source and assign the greater of the two values to the output column ans_max.

The writeRecord macro writes the record to the target.
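
Both branches of the comparison end with the same write, so the logic can also be expressed with std::max and a single write. The sketch below shows that idea outside DataStage: InaRec, AnsRec, WriteCounter and emitMax are hypothetical names, and WriteCounter merely counts calls in place of the real writeRecord macro.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-ins for the input and output port buffers.
struct InaRec { int fno; int sno; };
struct AnsRec { int ans_max; };

// Hypothetical counter used in place of the real writeRecord macro.
struct WriteCounter {
    int writes = 0;
    void writeRecord(const AnsRec&) { ++writes; }
};

// Picks the larger value, then writes once instead of once per branch.
inline void emitMax(const InaRec& ina, AnsRec& ans, WriteCounter& out) {
    ans.ans_max = std::max(ina.fno, ina.sno);
    out.writeRecord(ans);
}
```

The result is the same as with the if/else version, including the case where both values are equal.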

The rest of the code is DataStage API code that writes the error messages to the DataStage
Director job log. (It requires advanced knowledge of DataStage API coding.)
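
The switch statement in that part of the code only turns an event type into a text label. A stand-alone sketch of the same mapping is shown below. LogType and logTypeLabel are hypothetical names, and the enum values are local stand-ins for the DSJ_LOG* constants: their numeric values are arbitrary and are not taken from the DataStage headers.

```cpp
#include <cassert>
#include <string>

// Local stand-ins for the DSJ_LOG* event types; the numeric values here
// are arbitrary and are NOT taken from the DataStage headers.
enum class LogType { Info, Warning, Fatal, Reject, Started, Reset, Batch, Other };

// Maps an event type to the label printed for it.
inline std::string logTypeLabel(LogType type) {
    switch (type) {
        case LogType::Info:    return "INFO";
        case LogType::Warning: return "WARNING";
        case LogType::Fatal:   return "FATAL";
        case LogType::Reject:  return "REJECT";
        case LogType::Started: return "STARTED";
        case LogType::Reset:   return "RESET";
        case LogType::Batch:   return "BATCH";
        case LogType::Other:   return "OTHER";
    }
    return "UNKNOWN";  // reached only for values outside the enum
}
```

Returning the label instead of printing it directly keeps the mapping easy to test on its own.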

That's it. Your buildop is ready. Now you need to generate the underlying code to make the
buildop ready to execute. To do this, click the Generate button on the GUI.

When you click the Generate button, the GUI creates the necessary files, including:

(1) Ingx_Calculator.opd
(2) Ingx_Calculator.h
(3) Ingx_Calculator.C
(4) Ingx_Calculator.O

These files are created under the applicable project directory on the Unix box (where the
DataStage engine resides).

Clicking the Generate button displays a confirmation dialog, which confirms that your buildop
has been created and is ready to use.

In your palette you will be able to see the buildop you've just created.

Now let's use the buildop in a PX job.

Below is the job design to test our buildop:

Inside the buildop stage you will see the following mappings that you can define:

Let's verify our results:

Consider the following input:

Let's run the job with our newly created buildop:

Log entries:

Output:
