
3 ODI Overview

Oracle Data Integrator performs data transformation and validation using the steps below.

Extract
Load
Transform
Traditional tools perform Extract, Transform and Load (ETL), whereas ODI uses
ELT. The advantage of ELT is that data is moved from the source system on an as-is
basis, without adding extra load to the source system. Heavy-duty validation and
transformation happen in the ODI server. This enables ODI to handle large sets
of data, and the setup can be scaled as the data grows.

3.1 ODI Architecture and Components

3.2 What is ELT?

ELT means Extract data from the source, Load data into a staging area, and perform
Transformation as required.
3.2.1 Extract
In this step, data is extracted from one or more source systems running on different
operating systems and databases. The operating system can be Windows Server, UNIX,
Linux, etc.; the source can be SQL Server, Oracle, a flat file, an Excel spreadsheet, etc.

3.2.2 Load
In this step, the extracted data is loaded into a data warehouse. In general, it is
loaded into a staging area for further validation.
3.2.3 Transform
In this step, the loaded data is validated to ensure that the downstream reporting
will be accurate. During this step, surrogate keys are assigned to the records.
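
The three ELT steps can be illustrated with a minimal Python sketch. It is not ODI code: it uses an in-memory SQLite database as a stand-in for the warehouse, and the table names stg_parts and parts_dim, the column names and the sample rows are all assumptions made for illustration.

```python
import sqlite3

# Extract: hypothetical rows, standing in for data read from a source system.
SOURCE_ROWS = [("P-100", " widget "), ("P-200", "gadget"), ("P-300", None)]

def run_elt(conn, rows):
    cur = conn.cursor()
    # Load: copy the extracted rows into a staging table as-is.
    cur.execute("CREATE TABLE stg_parts (part_code TEXT, description TEXT)")
    cur.executemany("INSERT INTO stg_parts VALUES (?, ?)", rows)
    # Transform: validate and clean inside the database, after loading.
    # The INTEGER PRIMARY KEY column plays the role of the surrogate key.
    cur.execute(
        "CREATE TABLE parts_dim ("
        "part_sid INTEGER PRIMARY KEY, part_code TEXT, description TEXT)"
    )
    cur.execute(
        "INSERT INTO parts_dim (part_code, description) "
        "SELECT part_code, TRIM(description) FROM stg_parts "
        "WHERE description IS NOT NULL ORDER BY part_code"
    )
    conn.commit()
    return cur.execute(
        "SELECT part_sid, part_code, description FROM parts_dim ORDER BY part_sid"
    ).fetchall()

conn = sqlite3.connect(":memory:")
result = run_elt(conn, SOURCE_ROWS)
```

The source rows are copied unchanged, and the cleanup (trimming, rejecting rows without a description) happens only after the load, which is the point of the ELT ordering.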

4 ODI Studio Overview
In this section you will learn about the different components of Oracle Data
Integrator (ODI).

4.1 Designer
This is the component of Oracle Data Integrator most frequently used by developers.
Developers use this section to define Projects, Models, ETL Mappings, Variables, Knowledge
Modules, Load Plans, etc. The work repository stores the Designer metadata.

The Designer Navigator consists of the following sections:


Projects
Models
Load Plans and Scenarios
Global Objects
Solutions

4.1.1 Projects
This section of Oracle Data Integrator enables developers to create project-specific
folders. All Variables and Knowledge Modules defined within a Project
are private to that Project. As soon as a Project folder is created, ODI
creates the sub-sections below by default.

4.1.1.1 Packages
ODI Packages are used to group multiple ODI objects such as Variables, Mappings,
and Procedures in a specific sequence of execution. Using Packages, one can evaluate
a condition as True or False and take a different path based on the user's needs. A Package
is a diagrammatic representation of jobs. The workflow of jobs is defined by dragging and dropping scenarios.
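
The idea of a Package as a sequence of steps with a True/False branch can be sketched in plain Python. This is only an illustration of the control flow, not ODI's own execution model; the step names and the build_steps helper are hypothetical.

```python
def run_package(steps, start):
    """Run steps in order, following the true or false branch after each one."""
    executed = []
    current = start
    while current is not None:
        action, on_true, on_false = steps[current]
        executed.append(current)
        current = on_true if action() else on_false
    return executed

def build_steps(row_count):
    # Each entry is (action, next step if True, next step if False).
    # row_count stands in for a value a variable step would have refreshed.
    return {
        "refresh_count": (lambda: True, "check_count", None),
        "check_count": (lambda: row_count > 0, "run_mapping", "send_alert"),
        "run_mapping": (lambda: True, None, None),
        "send_alert": (lambda: True, None, None),
    }

empty_path = run_package(build_steps(0), "refresh_count")
loaded_path = run_package(build_steps(5), "refresh_count")
```

With no rows the package ends at the alert step; with rows present it continues to the mapping step.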

4.1.1.2 Mappings
A Mapping is an interface which consists of a Source and a Target with validation
logic built in. The core logic of ETL is defined in the mapping section.
4.1.1.3 Reusable Mappings
Reusable mappings are similar to regular mappings. In addition, a
reusable mapping can be used within another mapping. A reusable mapping has both
Input and Output parameters.
4.1.1.4 Procedures
A Procedure performs logic that is not suitable for an ETL Mapping. Even though a
Procedure can be used to validate and transform data, an ETL mapping should be the first
choice for data validation and transformation. A Procedure supports multiple
technologies such as operating system commands, FTP, JMS commands, etc. ODI
Procedures can be called within an ODI Package. An ODI Procedure can have more than one set
of commands. Multiple commands are processed sequentially. A Procedure can consist of
commands from multiple technologies.


4.1.1.5 Variables
A variable is similar to variables in any programming language. Its value can be set
using a Select statement at runtime. Oracle Data Integrator also provides a
feature to define static variables. The variable type can be Alphanumeric, Text, Numeric
or Date. Variables can be used in ETL mappings, Packages and Procedures.

4.1.1.6 Sequences
Sequences enable users to automatically generate sequence numbers with a
specific incremental value. A sequence can be an ODI sequence or it can be
based on a database sequence. Sequences can be used in ETL mappings.
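
What a sequence with a fixed increment does can be shown in a few lines of Python. This is a conceptual sketch only; make_sequence and the sample names are assumptions, and it does not model how ODI or a database stores the current value.

```python
import itertools

def make_sequence(start=1, increment=1):
    """Return a zero-argument function that yields the next sequence value."""
    counter = itertools.count(start, increment)
    return lambda: next(counter)

# Hypothetical use: assign a generated key to each incoming record.
next_sid = make_sequence(start=100, increment=10)
rows = [{"name": n, "sid": next_sid()} for n in ["Ann", "Bob", "Cy"]]
```

Each call hands out the next number, so the three records receive 100, 110 and 120.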
4.1.1.7 User Functions
User functions are used to create project- or system-specific custom functions. They
can be defined at a Project level or at a Global level. User functions can be
used in ODI Mappings and ODI Procedures. They make commonly used functions
easier to maintain.
4.1.1.8 Knowledge Modules
A Knowledge Module (KM) is a set of generic code that performs a specific task. A KM uses
ODI-specific syntax to reference variables. Oracle provides several out-of-the-box KMs to enable
faster development of ETL Mappings. Once defined, a KM can be used
in multiple ETL mappings. KMs can handle multiple technologies.
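
The "generic code reused by many mappings" idea can be sketched with Python's string.Template. This is an analogy only: real KMs use ODI's own substitution syntax, which is not reproduced here, and the template text, the generate_sql helper and the STG_ table names are assumptions.

```python
from string import Template

# A hypothetical, heavily simplified "integration" template.
INSERT_TEMPLATE = Template(
    "INSERT INTO $target ($columns) SELECT $expressions FROM $source"
)

def generate_sql(target, source, column_map):
    """Fill the generic template with the details of one mapping."""
    columns = ", ".join(column_map)
    expressions = ", ".join(column_map.values())
    return INSERT_TEMPLATE.substitute(
        target=target, columns=columns, expressions=expressions, source=source
    )

# The same template serves two different mappings.
sql_parts = generate_sql(
    "PARTS_DIM", "STG_PARTS",
    {"PART_CODE": "PART_CODE", "DESCRIPTION": "TRIM(DESCRIPTION)"},
)
sql_regions = generate_sql("REGIONS_DIM", "STG_REGIONS", {"REGION": "UPPER(REGION)"})
```

The template is written once, and only the mapping-specific details (target, source, column expressions) change between uses.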
4.1.1.9 Markers
A Marker is used to flag ODI objects. This enables ODI objects to be
grouped. Oracle provides 3 out-of-the-box markers as shown below:

Users can create custom Markers based on their needs.


4.1.2 Models

A Model stores the structure of Source and Target objects such as tables and files. Unless an
object is defined in a Model, it cannot be used in an ETL mapping. Database table
structures can be added to the Model using reverse engineering. A Model does not store any
data; it stores only the structure of the object. Data can, however, be queried through the Model.
A Model can also be used to add constraints to the objects beyond those defined in the database.
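
Reverse engineering, in the sense of reading table structure without reading data, can be sketched against SQLite's catalog. This is an assumption-laden stand-in: ODI reads metadata through its own mechanisms, and the reverse_engineer function, the mask handling and the sample columns below are illustrative only.

```python
import sqlite3

def reverse_engineer(conn, mask="%"):
    """Return {table_name: [(column, type), ...]} for tables matching mask."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name LIKE ? "
        "ORDER BY name",
        (mask,),
    ).fetchall()
    model = {}
    for (name,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        info = conn.execute(f'PRAGMA table_info("{name}")').fetchall()
        model[name] = [(col[1], col[2]) for col in info]
    return model

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PARTS_DIM (PART_SID INTEGER, DESCRIPTION TEXT)")
conn.execute("CREATE TABLE SALES_FACT (PART_SID INTEGER, AMOUNT REAL)")
conn.execute("INSERT INTO PARTS_DIM VALUES (1, 'bolt')")

all_tables = reverse_engineer(conn)            # mask % matches every table
parts_only = reverse_engineer(conn, "PARTS%")  # narrower mask
```

The returned model holds only column names and types; the row inserted into PARTS_DIM does not appear in it.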

4.1.3 Load Plans and Scenarios

A Load Plan is used to execute ODI Packages, ODI Procedures and ETL mappings
in a serial or parallel manner. Load Plans are used to populate the data warehouse by
running ODI jobs at specific intervals. A Load Plan stores the ODI job run times and
session IDs. Variables can be set during the execution of a Load Plan. A Load Plan also
handles exceptions and can restart ODI jobs in multiple
ways. Individual steps in a Load Plan can be enabled or disabled based on the
user's needs. Load Plan metadata is stored in the work repository.
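
Serial versus parallel execution can be sketched with Python's standard thread pool. This is a conceptual illustration, not how ODI agents run sessions; the run_serial, run_parallel and load helpers are hypothetical, and the choice to load dimensions before the fact table is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

def run_serial(steps):
    """Run steps one after another and collect their results in order."""
    return [step() for step in steps]

def run_parallel(steps):
    """Run steps concurrently; results are returned in submission order."""
    with ThreadPoolExecutor(max_workers=len(steps)) as pool:
        futures = [pool.submit(step) for step in steps]
        return [f.result() for f in futures]

def load(table):
    # Stand-in for running a mapping or scenario that loads one table.
    return lambda: f"loaded {table}"

dimensions = [load("PARTS_DIM"), load("CUSTOMERS_DIM"), load("REGIONS_DIM")]
results = run_serial([
    lambda: run_parallel(dimensions),  # dimension loads run side by side
    load("SALES_FACT"),                # the fact load starts only afterwards
])
```

Nesting a parallel group inside a serial one gives the familiar shape of a plan in which independent loads overlap while dependent loads wait.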

4.2 Operator
The Operator section is used to monitor ETL jobs. After executing a
scenario, a job can be monitored thoroughly in the Operator section. Operator has
the flexibility of locating a job by Date, Agent, Session, Status, Keyword and User.

The Operator Navigator consists of the following sections:


Session List
Hierarchical Sessions
Load Plans and Scenarios
Scheduling
Load Plan and Scenarios
Solutions

4.2.1 Session List

This section shows the progress of running jobs and the results of all completed
jobs. Once an ETL mapping or Load Plan is submitted for execution, its status can
be monitored. The status of all ETL mappings can be viewed by Date, Agent,
Session, Status, Keyword and User.

4.2.2 Hierarchical Sessions

Hierarchical Sessions is similar to Session List, but it also shows child sessions.

4.2.3 Load Plans and Scenarios

This section is specific to monitoring the execution of Load Plans.

4.2.4 Scheduling
This section shows all scheduled Load Plans by Agent and all scheduled jobs.

4.2.5 Load Plan and Scenarios

This section lists all available Load Plans and Scenarios.

4.3 Topology
Topology is part of the Master repository. Using the Topology Navigator, you can manage
server connections, database connections, physical and logical connections, etc.

Topology information is used by Designer during ETL mapping.


Topology has the sub-sections below:
Physical Architecture
Contexts
Logical Architecture
Languages
Repositories
Generic Action

4.3.1 Physical Architecture

In this section of ODI, the physical characteristics of the environment can be
defined. The server connection, using an IP address or server name, can also
be defined here. Provide a user name and password to log in to the servers. The physical
connection is also defined by the type of database used for the connection.
The physical connection to a server is stored in a Data Server. A Data Server can
connect to one technology only. One Data Server can have multiple physical schemas.

Physical Agents are also defined in this section.
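
The relationship between a Data Server, its single technology and its physical schemas can be pictured as a small data structure. This is a reader's sketch, not ODI's repository model; the class, the field names, the host name and the schema names are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class DataServer:
    name: str
    technology: str  # exactly one technology per data server
    host: str
    physical_schemas: list = field(default_factory=list)

    def add_physical_schema(self, schema_name):
        """Attach one more physical schema and return self for chaining."""
        self.physical_schemas.append(schema_name)
        return self

server = DataServer(name="DWH_SERVER", technology="Oracle", host="dwh.example.com")
server.add_physical_schema("STAGING").add_physical_schema("WAREHOUSE")
```

One server object carries one technology value but any number of schemas, mirroring the one-to-many relationship described above.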


The list of available technologies is:

1. PARTS_DIM - Part master table to store all parts
2. PART_TYPES_DIM - Part type table to store the types of parts
3. CUSTOMERS_DIM - Customer master to store customer data
4. REGIONS_DIM - Region list to report based on region
5. SALES_REPS_DIM - Sales person details
6. SALES_FACT - Sales information

Click on Reverse Engineer

To import all table structures, select Standard, check the Table option and enter %
for Mask as shown below

Click on Reverse Engineer to extract the table structure from the database.

After table structure extraction, ODI will create the table structure as follows:

Drag and drop the source column to the Target column as shown below:

For the SALE_REP_DIM.SALES_REP_SID column, map the sequence as shown below:
Click on the SALES_REP_SID column
Go to the Property window
Go to the expression editor

Drag and drop the SALES_REP_DIM_S sequence to the editor as shown below:

Click OK
Click on JOIN
Go to the Property window and click Connector Point
You can review the INPUT and OUTPUT connectors.

Save the ELT
Run the ELT

After the above 2 steps, the screen will look as shown below:

Test the connection

Click on Test Connection

Click Test

Click OK
The connection tested successfully. Now we are done with connecting to the desktop.

Let us look at the steps to define the directory.

Go to Topology -> Physical Architecture -> Technologies -> File -> LOCAL_DESKTOP

Right click
Pick New Physical Schema
Set the directory to D:\ as shown below:

Save
You get the following warning message. Click OK. We will attach the logical
schema later.

The Physical Architecture will look as shown below:

13.4 Logical Connection

Go to Topology -> Logical Architecture -> Technologies -> File

Right click on File

Pick New Logical Schema
Now associate the logical schema with the physical
schema. Define the logical schema as follows:

Save the schema

13.5 Model
Go to Designer -> Models

Click on

Pick New Model Folder

Create the model folder as shown below:

Click on

Right Click
Pick New Model
Define a Model as shown below:

Designer will look as shown below:

13.6 File Format

Click on M_FILE_UPLOAD_M

Right click
Pick New Datastore
Define a datastore as shown below:

In this example, we are trying to load a comma-separated file.

Click on Attributes.
We are going to define the fields in the file.

Save
The new datastore then appears in the Designer.
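
To make the file format concrete, the following Python sketch parses a comma-separated file against a list of field definitions, roughly what a file datastore's attributes describe. The field names, types, sample content and the read_delimited helper are assumptions; the actual file layout used in this example is not reproduced here.

```python
import csv
import io

# Hypothetical field definitions, mirroring the attributes of a file datastore.
FIELDS = [("PART_ID", int), ("DESCRIPTION", str), ("PRICE", float)]

def read_delimited(text, fields, delimiter=",", heading_rows=1):
    """Parse delimited text into dicts, converting each field to its type."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    rows = list(reader)[heading_rows:]  # skip the heading line(s)
    return [
        {name: cast(value) for (name, cast), value in zip(fields, row)}
        for row in rows
    ]

sample = 'PART_ID,DESCRIPTION,PRICE\n1,Bolt,0.25\n2,"Nut, hex",0.10\n'
records = read_delimited(sample, FIELDS)
```

The csv module handles quoted values, so a description containing a comma stays in one field.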

Go to Target -> Integration Type

Pick Incremental Update.

In this example we are going to update existing records.

Define a key at the Target

Go to PART_DIM_TARGET and click on PART_SID

Check Key
Uncheck Update

Click on Physical

Click on PART_DIM_TARGET
Set the Integration Knowledge Module to IKM Oracle Incremental Update.

Run the mapping.
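
The effect of an incremental update (update rows whose key already exists, insert the rest) can be sketched in Python with SQLite. This row-by-row version is a simplification for illustration and is not the code that IKM Oracle Incremental Update generates; the description column and the sample rows are assumptions.

```python
import sqlite3

def incremental_update(conn, rows):
    """Update rows whose key already exists in the target; insert the rest."""
    for part_sid, description in rows:
        # The key column is used only to match; it is never updated itself.
        updated = conn.execute(
            "UPDATE part_dim_target SET description = ? WHERE part_sid = ?",
            (description, part_sid),
        ).rowcount
        if updated == 0:
            conn.execute(
                "INSERT INTO part_dim_target (part_sid, description) VALUES (?, ?)",
                (part_sid, description),
            )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE part_dim_target (part_sid INTEGER PRIMARY KEY, description TEXT)"
)
conn.execute("INSERT INTO part_dim_target VALUES (1, 'Bolt'), (2, 'Nut')")
incremental_update(conn, [(2, "Hex nut"), (3, "Washer")])
target = conn.execute("SELECT * FROM part_dim_target ORDER BY part_sid").fetchall()
```

Row 1 is untouched, row 2 is updated in place and row 3 is inserted, which corresponds to marking PART_SID as the key and excluding it from the update.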

NOTE: The command can be a PL/SQL block with DECLARE, BEGIN and END.

Run the Procedure

18.3 Actual Result

As you can see below, DESCRIPTIONS are appended with -TEST-

19 ODI Variables
You can define variables and populate them using Select statements. These
variables can be used in Packages as runtime variables.
Go to Designer
Right Click on Variables

Pick New Variable
Define Variable

Name: FIRST_VARIABLE
Data Type: Alphanumeric
Keep History: All Values

Go to Refreshing
Select Schema

Write SQL code using ODI functions as shown below:

Save
Click the Refresh button to run the SQL

Check the Operator for successful completion

Click on the History tab to see the value
It returns the number of records that satisfy the SQL statement. The new variable
is created and available in ODI. It can be used in Mappings and Packages.
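
The refresh-and-history behaviour of a variable can be sketched in Python. This is a conceptual model only: the Variable class is hypothetical, the COUNT query is an assumed example rather than the SQL used in this tutorial, and the value is kept as a number here rather than as Alphanumeric text.

```python
import sqlite3

class Variable:
    """A named value that is refreshed by running a SELECT statement."""

    def __init__(self, name, refresh_sql):
        self.name = name
        self.refresh_sql = refresh_sql
        self.history = []  # keeps every refreshed value ("All Values")

    def refresh(self, conn):
        value = conn.execute(self.refresh_sql).fetchone()[0]
        self.history.append(value)
        return value

    @property
    def value(self):
        return self.history[-1] if self.history else None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts_dim (part_sid INTEGER, description TEXT)")
conn.executemany("INSERT INTO parts_dim VALUES (?, ?)", [(1, "Bolt"), (2, "Nut")])

first_variable = Variable("FIRST_VARIABLE", "SELECT COUNT(*) FROM parts_dim")
first_variable.refresh(conn)
conn.execute("INSERT INTO parts_dim VALUES (3, 'Washer')")
first_variable.refresh(conn)
```

Each refresh reruns the query, so the current value follows the data while the history retains the earlier results.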
