
What is the difference between rollup and scan?

Ans: Rollup produces one summary record per group, so it cannot generate cumulative
(running) summary records. For those we use Scan.
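The difference can be sketched in plain Python. This is only an analogy, not Ab Initio code: the function names and the (key, amount) record layout are assumptions made for the illustration.

```python
from itertools import groupby
from operator import itemgetter

def rollup(records):
    """Rollup-style summary: one output record per key, holding the total."""
    ordered = sorted(records, key=itemgetter(0))
    return [(key, sum(amount for _, amount in group))
            for key, group in groupby(ordered, key=itemgetter(0))]

def scan(records):
    """Scan-style summary: one output per input record, with a running total per key."""
    out = []
    ordered = sorted(records, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        running = 0
        for _, amount in group:
            running += amount
            out.append((key, amount, running))
    return out

records = [("a", 10), ("b", 5), ("a", 20), ("b", 7)]
totals = rollup(records)         # one record per key
running_totals = scan(records)   # one record per input, cumulative within each key
```

Rollup collapses each group to a single record, while Scan keeps every input record and attaches the cumulative value seen so far.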
What is the difference between partitioning with key and round robin?
Ans: PARTITION BY KEY:
Here we have to specify the key on which the partitioning will occur. All records with
the same key value go to the same partition, and the data is well balanced when the key
values are evenly distributed. It is useful for key-dependent parallelism.
PARTITION BY ROUND ROBIN: Here the records are partitioned in a sequential way,
distributing data evenly in blocksize chunks across the output partitions. It is not key
based and results in well-balanced data, especially with a blocksize of 1. It is useful for
record-independent parallelism.
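The two strategies can be sketched in Python as follows. This is an illustration only: the function names, the use of a CRC32 hash, and the dictionary record layout are assumptions, not how Ab Initio implements its partition components.

```python
import zlib

def partition_by_key(records, key, n):
    """Send every record with the same key value to the same partition."""
    parts = [[] for _ in range(n)]
    for rec in records:
        idx = zlib.crc32(str(rec[key]).encode()) % n
        parts[idx].append(rec)
    return parts

def partition_round_robin(records, n, block_size=1):
    """Deal records out in order, block_size records at a time per partition."""
    parts = [[] for _ in range(n)]
    for i, rec in enumerate(records):
        parts[(i // block_size) % n].append(rec)
    return parts

records = [{"cust": c, "amt": i} for i, c in enumerate("AABBBBCA")]
by_key = partition_by_key(records, "cust", 3)
by_rr = partition_round_robin(records, 3)
```

Round robin gives partitions whose sizes differ by at most one block, whatever the data looks like. Key partitioning keeps related records together, at the price of a balance that depends on the key distribution.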
How do you truncate a table?
Ans: There are many ways to do it.
1. Probably the easiest way is to use the Truncate Table component.
2. Run SQL or Update Table can be used to do the same thing.
3. Run Program.
What is the difference between a DB config and a CFG file?
Ans: A .dbc file has the information required for Ab Initio to connect to the database
to extract or load tables or views, while a .cfg file is the table configuration file created
by db_config when using components like Load DB Table.
Types of parallelism in detail
Ans: There are 3 types of parallelism in Ab Initio.
1) Data parallelism: Data is processed on different servers at the same time.
2) Pipeline parallelism: The records are processed in a pipeline, i.e. the
components do not have to wait for all the records to be processed. The records that
have been processed are passed to the next component in the pipeline.
3) Component parallelism: Two or more components process the records in
parallel.
Component parallelism: A graph with multiple processes running simultaneously on
separate data uses component parallelism.
Data parallelism: A graph that deals with data divided into segments and operates on
each segment simultaneously uses data parallelism. Nearly all commercial data
processing tasks can use data parallelism. To support this form of parallelism, Ab Initio
provides Partition components to segment data, and Departition components to merge
segmented data back together.
Pipeline parallelism: A graph with multiple components running simultaneously on
the same data uses pipeline parallelism. Each component in the pipeline continuously
reads from upstream components, processes data, and writes to downstream
components. Since a downstream component can process records previously written
by an upstream component, both components can operate in parallel. NOTE: To limit
the number of components running simultaneously, set phases in the graph.
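The record-at-a-time hand-off behind pipeline parallelism can be sketched with Python generators. This is a single-threaded analogy only: the stage names are hypothetical and nothing here runs truly in parallel. It just shows that the downstream stage starts working before the upstream stage has finished.

```python
def read_records(n):
    """Upstream stage: emits records one at a time."""
    for i in range(n):
        yield i

def transform(records, log):
    """Middle stage: processes each record as soon as it arrives."""
    for r in records:
        log.append(("transform", r))
        yield r * 2

def write(records, log):
    """Downstream stage: consumes each record as soon as it is produced."""
    out = []
    for r in records:
        log.append(("write", r))
        out.append(r)
    return out

log = []
result = write(transform(read_records(3), log), log)
```

The log interleaves transform and write events, which shows that the first record is written before the last one has been transformed.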
What is the function you would use to convert a string into a decimal?
Ans: To convert a string to a decimal we need to typecast it using the following
syntax:
out.decimal_field :: (decimal(size_of_decimal)) string_field;
The above statement converts the string to a decimal and populates the decimal
field in the output.
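For readers more familiar with general-purpose languages, the same idea looks like this in Python. It is an analogy to the DML cast rather than Ab Initio code, and the helper name to_decimal is an assumption.

```python
from decimal import Decimal, InvalidOperation

def to_decimal(text):
    """Convert a string to a Decimal, returning None when it is not numeric."""
    try:
        return Decimal(text.strip())
    except InvalidOperation:
        return None

amount = to_decimal(" 12.50 ")   # surrounding spaces are ignored
bad = to_decimal("abc")          # not a number, so None
```

As with the cast, decide up front what should happen to strings that are not valid numbers.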
How to execute the graph from start to end stages? Tell me and how to run
graph in non-Abinitio system?
Ans: There are many ways to do this. To give one example: you can run the components
phase by phase, according to how you defined the phases.
You can also run the graph by creating ksh or sh scripts.
What is data mapping and data modelling?
Ans: Data mapping deals with the transformation of the extracted data at FIELD level,
i.e. the transformation of the source field to the target field is specified by the mapping
defined on the target field. The data mapping is specified during the cleansing of the
data to be loaded.
For example:
source:
string(35) name = "Siva Krishna ";
target:
string("01") nm = NULL("");  /* maximum length is string(35) */
Then we can have a mapping like:
Straight move. Trim the leading or trailing spaces.
The above mapping specifies the transformation of the field nm.
What is the difference between sandbox and EME, can we perform checkin
and checkout through sandbox? Can anybody explain checkin and checkout?
Ans: Sandboxes are work areas used to develop, test or run code associated with a
given project. Only one version of the code can be held within the sandbox at any
time.
The EME Datastore contains all versions of the code that have been checked into it. A
particular sandbox is associated with only one project, whereas a project can be
checked out to a number of sandboxes.
Explain the environment variables with example?
Ans: Environment variables serve as global variables in the Unix environment. They
are used for passing values from one shell or process to another. They are inherited by
Ab Initio as sandbox variables or graph parameters, like
AI_SORT_MAX_CORE
AI_HOME
AI_SERIAL
AI_MFS etc.
To know which variables exist in your Unix shell, find out the naming convention
and type a command like "env | grep AI". This will give you a list of all the matching
variables set in the shell. You can refer to the graph parameters or components to see
how these variables are used inside Ab Initio.
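The same lookup can be sketched in Python. This is an illustration, not part of Ab Initio. The helper name vars_with_prefix is an assumption, and it matches on a name prefix, which is slightly stricter than "env | grep AI" (grep matches the text anywhere in the line).

```python
import os

def vars_with_prefix(prefix, environ=None):
    """Return the environment variables whose names start with prefix."""
    env = os.environ if environ is None else environ
    return {name: value for name, value in sorted(env.items())
            if name.startswith(prefix)}

# Uses the real process environment; empty if no AI_ variables are set.
ai_vars = vars_with_prefix("AI_")
```

Passing a dictionary as the second argument makes the helper easy to try without changing the real environment.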
What are the graph parameters?
Ans: There are 2 types of graph parameters in Ab Initio:
1. Local parameters.
2. Formal parameters (those parameters are supplied at runtime).
How to improve performance of graphs in Ab Initio? Give some
examples or tips?
Ans: There are many ways to improve the performance of graphs in
Ab Initio.
Here are a few points:
1. Use a multifile system (MFS) with Partition by Round-robin.
2. Where appropriate, use lookup local rather than lookup when the data is large.
3. Take out unnecessary components such as Filter by Expression; put the filter logic in
Reformat/Join/Rollup instead.
4. Use Gather instead of Concatenate.
5. Tune max-core for optimal performance.
6. Try to avoid unnecessary phases.
What are the most commonly used components in an Ab Initio
graph? Example of a transformation of data, say customer data in a credit card
company, into meaningful output based on business rules?
Ans: The most commonly used components in any Ab Initio project are:
input file/output file
input table/output table
lookup file
reformat, gather, join, run sql, join with db, compress components, sort, trash, partition by
expression, partition by key, concatenate
Difference between conventional loading and direct loading? When is it used
in real time?
Ans: Conventional load:
Before loading the data, all the table constraints are checked against the data.
Direct load (faster loading):
All the constraints are disabled and the data is loaded directly. Later the data is
checked against the table constraints, and the bad data won't be indexed.
API mode corresponds to conventional loading;
utility mode corresponds to direct loading.
How to find the number of arguments defined in graph?
Ans: $# - the number of positional parameters.
$? - the exit status of the last executed command.
What is the difference between dbc and cfg file?
Ans: The .cfg file is for the remote connection and the .dbc file is for connecting to the database.
.cfg contains:
1. The name of the remote machine.
2. The username/pwd to be used while connecting to the db.
3. The location of the operating system on the remote machine.
4. The connection method.
and the .dbc file contains the information:
1. The database name.
2. Database version.
3. Userid/pwd.
4. Database character set and some more...
How do we run sequences of jobs, like the output of A JOB is input
to B? How do we co-ordinate the jobs?
Ans: By writing wrapper scripts we can control the sequence of execution of more
than one job.
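A wrapper of this kind can be sketched in Python with the standard subprocess module. This is a minimal sketch: the function name run_in_sequence is an assumption, and the stand-in jobs are trivial Python commands where a real wrapper would call the deployed graph scripts.

```python
import subprocess
import sys

def run_in_sequence(commands):
    """Run each command in order; stop at the first non-zero exit status."""
    for command in commands:
        status = subprocess.run(command).returncode
        if status != 0:
            return status   # job B never starts if job A failed
    return 0

# Stand-in jobs (hypothetical): each one just exits successfully.
ok_job = [sys.executable, "-c", "pass"]
status = run_in_sequence([ok_job, ok_job])
```

Checking the exit status after each job is what guarantees that B only consumes A's output when A actually completed.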
How would you do performance tuning for an already built graph? Can you let
me know some examples?
Ans: 1) Suppose a Sort is used in front of a Merge component: there is no use in
that Sort, because sorting is built into Merge.
2) We use a lookup instead of the Join or Merge component.
3) Suppose we want to join the data coming from 2 files and we don't want duplicates: we
will use the union function instead of adding an additional component to remove duplicates.
What is semi-join?
Ans: In Ab Initio there are 3 types of join:
1. inner join, 2. outer join and 3. semi join.
For an inner join the 'record_requiredn' parameter is true for all in ports.
For an outer join it is false for all the in ports.
If you want a semi join, set 'record_requiredn' to true for the required port and
false for the other ports.
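The effect of the flags can be sketched in Python. This is an analogy, not Ab Initio code: join_keys and its required0/required1 arguments are assumed names standing in for record_required0 and record_required1, and only the surviving keys are returned to keep the example short.

```python
def join_keys(in0, in1, required0, required1):
    """Return the keys that survive a join of two key lists.

    When a required flag is True, a key must be present on that input to be kept.
    """
    keys0, keys1 = set(in0), set(in1)
    result = []
    for key in sorted(keys0 | keys1):
        if required0 and key not in keys0:
            continue
        if required1 and key not in keys1:
            continue
        result.append(key)
    return result

in0 = [1, 2]   # keys arriving on port in0
in1 = [2, 3]   # keys arriving on port in1

inner = join_keys(in0, in1, True, True)     # both inputs required
outer = join_keys(in0, in1, False, False)   # neither input required
semi = join_keys(in0, in1, True, False)     # only in0 required
```

With these inputs the inner join keeps only the shared key, the outer join keeps every key, and the semi join keeps every key that appears on in0.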
How to get DML using utilities in UNIX?
Ans: If your source is a COBOL copybook, there is a Unix command that
generates the required DML for Ab Initio. Here it is:
cobol-to-dml
What is local and formal parameter?
Ans: Both are graph-level parameters. A local parameter must be given a value at the
time of declaration, whereas a formal parameter need not be initialized: the graph will
prompt for its value at the time of running.
What is BROADCAST and REPLICATE?
Ans: Broadcast - Takes data from multiple inputs, combines it and sends it to all the
output ports.
E.g. you have 2 incoming flows (this can be data parallelism or component
parallelism) on a Broadcast component, one with 10 records and the other with 20 records.
Then each of the outgoing flows (there can be any number of flows) will have 10 + 20 = 30
records.
Replicate - It replicates the data for a particular partition and sends it out to multiple
out ports of the component, but maintains the partition integrity.
E.g. your incoming flow to Replicate has a data parallelism level of 2, with one partition
having 10 records and the other having 20 records. Now suppose you have 3 output flows from
Replicate. Then each flow will have 2 data partitions with 10 and 20 records respectively.
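The record counts in both examples can be reproduced with a small Python sketch. It is an illustration only: partitions are modelled as plain lists, and the function names are assumptions.

```python
def broadcast(in_partitions, n_out):
    """Combine all input partitions and send the full set to every output flow."""
    combined = [rec for part in in_partitions for rec in part]
    return [list(combined) for _ in range(n_out)]

def replicate(in_partitions, n_out):
    """Copy the input to every output flow, keeping each partition separate."""
    return [[list(part) for part in in_partitions] for _ in range(n_out)]

parts = [list(range(10)), list(range(20))]   # two partitions: 10 and 20 records

broadcast_sizes = [len(flow) for flow in broadcast(parts, 3)]
replicate_sizes = [[len(p) for p in flow] for flow in replicate(parts, 3)]
```

Broadcast merges the partitions before fanning out, so every flow carries all 30 records. Replicate fans out without merging, so every flow still has a 10-record partition and a 20-record partition.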
What is m_dump?
Ans: The m_dump command prints the data in a formatted way.
m_dump <dml> <file.dat>
An example of a real-time start script in the graph?
Ans: Here is a simple example of using a start script in a graph.
In the start script, put:
export DT=`date '+%m%d%y'`
Now the variable DT will hold today's date before the graph is run.
Somewhere in a graph transform we can then use this variable as:
out.process_dt :: $DT;
which provides the value from the shell.
How to run the graph without GDE?
Ans: In the GDE, choose Run ==> Deploy >> As Script. It creates a .bat file in your host
directory; then run the .bat file from the command prompt.
How does MAXCORE work?
Ans: Max-core is a value (it will be in KB). Whenever a component is executed it will
take as much memory as we specified for its execution.
What is $mpjret? Where is it used in Ab Initio?
Ans: You can use $mpjret in the end script, like:
if [ 0 -eq $mpjret ]; then
echo "success"
else
mailx -s "{graphname} failed" mailid
fi
How do you convert 4-way MFS to 8-way MFS?
Ans: To convert a 4-way to an 8-way partition we need to change the layout in the
partitioning component. There will be separate parameters for each type of
partitioning, e.g. AI_MFS_HOME, AI_MFS_MEDIUM_HOME, AI_MFS_WIDE_HOME
etc.
The appropriate parameter needs to be selected in the component layout for the type of
partitioning.
What is the AB_LOCAL expression and where do you use it in Ab Initio?
Ans: ablocal_expr is a parameter of the Input Table component of Ab Initio. ABLOCAL() is
replaced by the contents of ablocal_expr, which we can make use of in parallel
unloads. There are two forms of the ABLOCAL() construct: one with no arguments and one
with a single argument, a table name (the driving table).
The construct is needed because some complex SQL statements contain
grammar that is not recognized by the Ab Initio parser when unloading in parallel. You
can use the ABLOCAL() construct in this case to prevent the Input Table component
from parsing the SQL (it will get passed through to the database). It also specifies
which table to use for the parallel clause.
What is meant by Co-Operating System and why is it special for
Ab Initio?
Ans: It converts the Ab Initio specific code into a format which UNIX/Windows
can understand and feeds it to the native operating system, which carries out the task.
How will you test a dbc file from command prompt ?
Ans: Try "m_db test myfile.dbc".
Which one is faster for processing, fixed-length DMLs or delimited DMLs,
and why?
Ans: Fixed-length DMLs are faster because the data of that length is read directly,
without any comparisons, whereas with delimited DMLs every character has to be
compared against the delimiter, which causes delays.
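The difference in the work done per record can be sketched in Python. This is illustrative only: the function names and the sample layouts are assumptions, and real performance depends on the Co-Operating System rather than on this toy code.

```python
def parse_fixed(data, widths):
    """Fixed-length: jump straight to each field by offset, no character tests."""
    fields, pos = [], 0
    for width in widths:
        fields.append(data[pos:pos + width])
        pos += width
    return fields

def parse_delimited(data, delim):
    """Delimited: every character must be compared against the delimiter."""
    fields, current = [], []
    for ch in data:
        if ch == delim:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields

fixed = parse_fixed("Siva 0042NY", [5, 4, 2])
delimited = parse_delimited("Siva|42|NY", "|")
```

The fixed-length parser does one slice per field, while the delimited parser does one comparison per character, which is the extra cost the answer refers to.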
What are the continuous components in Ab Initio?
Ans: Continuous components are used to create graphs that produce useful output files
while running continuously.
E.g. Continuous Rollup, Continuous Update, Batch Subscribe.
How to retrieve data from a database to the source? Which component is
used for this?
Ans: To unload (retrieve) data from a database such as DB2, Informix, or Oracle we have
components like Input Table and Unload DB Table. Using these two components we
can unload data from the database.
What is the relation between EME, GDE and Co-Operating System?
Ans: EME stands for Enterprise Metadata Environment, GDE for Graphical Development
Environment, and the Co-Operating System can be described as the Ab Initio server. The
relation between the Co-Op, EME and GDE is as follows.
The Co-Operating System is the Ab Initio server. The Co-Op is installed on a particular OS
platform, which is called the NATIVE OS. The EME holds the
metadata, transformations, db config files, and source and target information. The
GDE is the end-user environment where we can develop the
graphs (mappings, just like in Informatica).
The designer uses the GDE to design the graphs and saves them to the EME or sandbox. The
GDE is on the user side, whereas the EME is on the server side.
What kinds of layouts does Ab Initio support?
Ans: Basically there are serial and parallel layouts supported by Ab Initio. A graph can
have both at the same time. The parallel one depends on the degree of data
parallelism. If the multifile system is 4-way parallel, then a component in a graph can
run 4-way parallel if its layout is defined to match that degree of
parallelism.
Do you know what a local lookup is?
Ans: A Lookup File consists of data records which can be held in main memory. This
lets the transform function retrieve the records much faster than retrieving them from
disk. It allows the transform component to process the data records of multiple files
quickly.
How many components in your most complicated graph?
Ans: This is a tricky question; the number of components in a graph has nothing to do with the
level of knowledge a person has. On the contrary, a properly standardized and
modular parametric approach will reduce the number of components to very few. In
a well-thought-out modular and parametric design, most graphs will have 3 or 4
components, which will do a particular task and will then call other sets of
graphs to do the next task, and so on. This way the total number of distinct graphs will
come down drastically, and support and maintenance will be much simpler.
The bottom line is that there are a lot of other things to plan rather than adding
components.
How to handle it if the DML changes dynamically in Ab Initio?
Ans: If the DML changes dynamically, then both the dml and the xfr have to be passed as
graph-level parameters at runtime.
Have you worked with packages?
Ans: Packages are nothing but reusable blocks of objects like transforms, user
defined functions, dmls etc. These packages are to be included in the transform where
you use them. For example, consider a user defined function like
/*string_trim.xfr*/
out::trim(input_string)=
begin
let string(35) trimmed_string = string_lrtrim(input_string);
out::trimmed_string;
end
Now, the above xfr can be included in the transform where you call the above function
as
include "~/xfr/string_trim.xfr";
But this should be included ABOVE your transform function.
For more details see the help file under "packages".
What are primary keys and foreign keys?
Ans: In an RDBMS the relationship between two tables is represented as a primary key
and foreign key relationship. The primary key table is the parent table and the
foreign key table is the child table. The criterion for both tables is that there should be a
matching column.
What are Cartesian joins?
Ans: A Cartesian join will get you a Cartesian product. A Cartesian join is when you join
every row of one table to every row of another table. You can also get one by joining
every row of a table to every row of itself.
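The definition above can be sketched in a few lines of Python using the standard itertools.product function. Tables are modelled as plain lists, and the function name cartesian_join is an assumption.

```python
from itertools import product

def cartesian_join(left, right):
    """Pair every row of left with every row of right."""
    return [(l, r) for l, r in product(left, right)]

customers = [1, 2, 3]
cards = ["a", "b"]

pairs = cartesian_join(customers, cards)   # 3 rows x 2 rows
self_pairs = cartesian_join([1, 2], [1, 2])   # a table joined to itself
```

The result always has len(left) * len(right) rows, which is why an accidental Cartesian join on large tables is so expensive.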
Explain the difference between the "truncate" and "delete"
commands?
Ans: Truncate: It is a DDL command, used to remove all the rows from a table or cluster.
Since it is a DDL command it is auto-committed and a rollback can't be performed. It is
faster than delete.
How can I run the 2 GUI merge files?
Ans: Do you mean merging GUI map files in WinRunner (WR)? If so, merging GUI map files in
the GUI map editor won't create a corresponding test script, and without a test script you
can't run a file. So it is impossible to run a file by merging 2 GUI map files.
