Head Stage:
1. The Head Stage is a Development/Debug stage.
2. It can have a single input link and a single output link.
3. The Head Stage selects the first N rows from each partition of an input data set and
copies the selected rows to an output data set. You determine which rows are copied by
setting properties which allow you to specify:
The number of rows to copy
The partition from which the rows are copied
The location of the rows to copy
The number of rows to skip before the copying operation begins
4. This stage is helpful in testing and debugging applications with large data sets. For
example, the Partition property lets you see data from a single partition to determine if
the data is being partitioned as you want it to be. The Skip property lets you access a
certain portion of a data set.
The stage editor has three pages:
Stage Page. This is always present and is used to specify general information about
the stage.
Input Page. This is where you specify the details about the single input set from
which you are selecting records.
Output Page. This is where you specify details about the processed data being output
from the stage.
Head stage properties:
1. Rows
All Rows [After Skip] = True/False
When set to True, copies all rows to the output following any requested skip positioning.
When set to False, only the number of rows given in Number of Rows [Per Partition] is copied.
Number of Rows [Per Partition] = 10
Number of rows to copy from input to output per partition. Applies when All Rows [After Skip] = False.
Period [Per Partition] = N
Copies every Nth row per partition, starting with the first.
Skip [Per Partition] = 1
Number of rows to skip at the start of every partition.
2. Partitions
All Partitions = True/False
When set to True, copies rows from all partitions. When set to False, copies from specific
partition numbers, which must be specified.
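The property descriptions above can be modeled in a few lines of Python. This is a minimal sketch, not DataStage code: the function name `head`, its parameter names, and the order in which skip, period, and the row limit are applied are assumptions made for illustration, and the row values are made up.

```python
from typing import Dict, Iterable, List, Optional


def head(
    partitions: Dict[int, List[int]],
    nrows: int = 10,
    all_rows: bool = False,
    skip: int = 0,
    period: int = 1,
    only_partitions: Optional[Iterable[int]] = None,
) -> Dict[int, List[int]]:
    """Illustrative model of Head stage selection, applied per partition."""
    wanted = None if only_partitions is None else set(only_partitions)
    out: Dict[int, List[int]] = {}
    for pnum, rows in partitions.items():
        # All Partitions = False: copy only from the listed partition numbers.
        if wanted is not None and pnum not in wanted:
            continue
        kept = list(rows)[skip:]   # Skip: drop rows at the start of the partition
        kept = kept[::period]      # Period: every Nth row, starting with the first
        if not all_rows:           # All Rows = False: keep only the first nrows
            kept = kept[:nrows]
        out[pnum] = kept
    return out


# Hypothetical data set split across two partitions.
data = {0: list(range(1, 11)), 1: list(range(11, 21))}
first_five = head(data, nrows=5)
print(first_five)
```

With `nrows=5` and all partitions selected, each partition contributes its own first five rows, so a two-partition job returns ten rows in total rather than five.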
[Screenshots: output seqEmpdata, job design, input seqEmpData properties]
Case-1:
Head stage Properties:
All Rows = False
Number of Rows = 5
All Partitions = True
[Screenshots: target Output_Sequentialdata for Case-1; second job design with input SeqEmpData and target output sequential data]
Tail Stage:
1. The Tail Stage is a Development/Debug stage.
2. It can have a single input link and a single output link.
3. The Tail Stage selects the last N records from each partition of an input data set and
copies the selected records to an output data set. You determine which records are copied
by setting properties which allow you to specify:
The number of records to copy
The partition from which the records are copied
4. This stage is helpful in testing and debugging applications with large data sets. For
example, the Partition property lets you see data from a single partition to determine if
the data is being partitioned as you want it to be. The Skip property lets you access a
certain portion of a data set.
The stage editor has three pages:
Stage Page. This is always present and is used to specify general information about
the stage.
Input Page. This is where you specify the details about the single input set from
which you are selecting records.
Output Page. This is where you specify details about the processed data being output
from the stage.
Tail stage properties:
1. Rows
Number of Rows [Per Partition] = 10
Number of rows to copy from input to output per partition. The default is 10; change the
value to copy more or fewer rows.
2. Partitions
All Partitions = True/False
When set to True, copies rows from all partitions. When set to False, copies from
specific partition numbers, which must be specified.
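The Tail stage behavior described above can be sketched the same way. This is an illustrative model only: the function name `tail`, its parameters, and the sample rows are assumptions, not part of DataStage.

```python
from typing import Dict, Iterable, List, Optional


def tail(
    partitions: Dict[int, List[int]],
    nrows: int = 10,
    only_partitions: Optional[Iterable[int]] = None,
) -> Dict[int, List[int]]:
    """Illustrative model of Tail stage selection: last nrows of each partition."""
    wanted = None if only_partitions is None else set(only_partitions)
    out: Dict[int, List[int]] = {}
    for pnum, rows in partitions.items():
        # All Partitions = False: copy only from the listed partition numbers.
        if wanted is not None and pnum not in wanted:
            continue
        rows = list(rows)
        # Guard nrows == 0, because rows[-0:] would return the whole list.
        out[pnum] = rows[-nrows:] if nrows > 0 else []
    return out


# Hypothetical data set split across two partitions.
data = {0: list(range(1, 11)), 1: list(range(11, 21))}
last_three = tail(data, nrows=3)
print(last_three)
```

A partition holding fewer rows than the requested number simply contributes all of its rows.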
[Screenshots: job design, Tail stage properties, output columns, target Output_Sequentialdata]
Sample Stage:
1. The Sample stage is a Development/Debug stage.
2. It can have a single input link and any number of output links when operating in
Percent mode.
3. It can have a single input link and a single output link when operating in Period mode.
4. The Sample stage samples an input data set. It operates in two modes. In Percent mode,
it extracts rows, selecting them by means of a random number generator, and writes a
given percentage of these to each output data set. You specify the number of output data
sets, the percentage written to each, and a seed value to start the random number
generator. You can reproduce a given distribution by repeating the same number of
outputs, the percentage, and the seed value.
5. In Period mode, it extracts every Nth row from each partition, where N is the period,
which you supply. In this case all rows will be output to a single data set, so the stage
used in this mode can only have a single output link.
6. For both modes you can specify the maximum number of rows that you want to sample
from each partition.
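Percent mode can be illustrated with a seeded random number generator. The sketch below is an assumption-laden model, not the stage's actual algorithm: the function name `sample_percent` is hypothetical, it treats its input as a single partition, it assumes each row goes to at most one output, and it assumes the percentages total 100 or less.

```python
import random
from typing import Iterable, List, Optional


def sample_percent(
    rows: Iterable[int],
    percents: List[float],
    seed: int,
    max_rows: Optional[int] = None,
) -> List[List[int]]:
    """Illustrative Percent-mode sampling for one partition."""
    rng = random.Random(seed)          # same seed -> same sequence of draws
    outputs: List[List[int]] = [[] for _ in percents]
    sampled = 0
    for row in rows:
        if max_rows is not None and sampled >= max_rows:
            break                      # cap on rows sampled from this partition
        draw = rng.uniform(0, 100)
        upper = 0.0
        for i, pct in enumerate(percents):
            upper += pct
            if draw < upper:           # draw falls in the band for output i
                outputs[i].append(row)
                sampled += 1
                break
        # A draw above the sum of all percentages leaves the row unsampled.
    return outputs


run_a = sample_percent(range(1000), [10, 20], seed=42)
run_b = sample_percent(range(1000), [10, 20], seed=42)
print(run_a == run_b)  # repeating outputs, percentages, and seed repeats the sample
```

Each output receives roughly its percentage of the input rather than an exact count, because selection is random per row.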
The stage editor has three pages:
Stage Page. This is always present and is used to specify general information about
the stage.
Input Page. This is where you specify details about the data set being Sampled.
Output Page. This is where you specify details about the Sampled data being output
from the stage.
EXAMPLE JOB FOR SAMPLE STAGE:
Note: The Sample stage can operate in two modes: Period mode and Percent mode.
[Screenshots: input data, job design, output columns, output data]
Note: The output contains only 3 records because Period [Per Partition] is set to 3, so
the stage takes every 3rd record from the input file.
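The effect of the period setting can be checked with a small model. The input below is hypothetical (nine rows on a single partition, since the original input data is not reproduced here), the function name `sample_period` is made up, and picking the 3rd, 6th, and 9th rows rather than the 1st, 4th, and 7th is an assumption; either choice yields three rows for this input.

```python
from typing import Iterable, List, Optional


def sample_period(
    rows: Iterable[int], period: int, max_rows: Optional[int] = None
) -> List[int]:
    """Illustrative Period-mode sampling for one partition: every Nth row."""
    picked = list(rows)[period - 1::period]   # rows N, 2N, 3N, ... (assumed offset)
    if max_rows is not None:
        picked = picked[:max_rows]            # optional cap on sampled rows
    return picked


input_rows = list(range(1, 10))               # hypothetical 9-row input
every_third = sample_period(input_rows, period=3)
print(every_third)
```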
[Screenshots: job design]
Peek Stage:
1. The Peek stage is a Development/Debug stage.
2. It can have a single input link and any number of output links.
3. The Peek stage lets you print record column values either to the job log or to a separate
output link as the stage copies records from its input data set to one or more output data
sets.
4. Like the Head, Tail, and Sample stages, the Peek stage can be helpful for
monitoring the progress of your application or for diagnosing a bug in your application.
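The idea of printing column values while records pass through unchanged can be sketched as follows. This is a rough model under stated assumptions: the function name `peek`, the `name:value` line format, the `log` callback, and the employee columns `empno` and `ename` are all hypothetical and do not reproduce the stage's real log layout.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]


def peek(
    rows: List[Record],
    nrows: int = 10,
    columns: Optional[List[str]] = None,
    log: Callable[[str], None] = print,
) -> List[Record]:
    """Illustrative Peek: log column values of the first nrows, pass all rows on."""
    passed: List[Record] = []
    for i, row in enumerate(rows):
        if i < nrows:
            names = columns if columns is not None else list(row)
            log(" ".join(f"{name}:{row[name]}" for name in names))
        passed.append(row)   # every record is still copied to the output
    return passed


# Hypothetical employee records.
employees = [
    {"empno": 1, "ename": "A"},
    {"empno": 2, "ename": "B"},
    {"empno": 3, "ename": "C"},
]
copied = peek(employees, nrows=2)
```

Passing a different `log` callable, such as a list's `append` method, stands in for sending the peeked values somewhere other than the console.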
The stage editor has three pages:
Stage Page. This is always present and is used to specify general information about
the stage.
Input Page. This is where you specify the details about the single input set from
which you are selecting records.
Output Page. This is where you specify details about the processed data being output
from the stage.
[Screenshots: Peek stage job design]
Option: Peek output mode = Job log
[Screenshots: job design]
Here the Peek output mode option is set to Job log, so the data can be seen only in the job log.
Procedure for viewing the data in the log:
Go to Tools and choose Run Director, then click View Log. [Screenshot: log view]
In that screen, click the 8th row from the bottom to display the log details. [Screenshot: log details]
[Screenshots: job with three Peek outputs (peekoutput1, peekoutput2, peekoutput3), showing the column mappings, properties, and output data for each]