Scenario for the above: insert rows into table A based on one column (the primary key), and on the other side update the same table A using a different column as the key.
What is the difference between a star schema and a snowflake schema? When do we use each?
Star schema: when the dimension tables contain relatively few rows, we can go for a star schema. In this schema both the dimension and fact tables are kept in de-normalized form. Good for data marts with simple relationships …
Data warehouses typically use a surrogate key (also known as an artificial or identity key) for the dimension tables' primary keys. They can use an Informatica sequence generator, an Oracle sequence, or SQL Server identity values for the surrogate key.
It is useful because the natural primary key (i.e. Customer Number in the Customer table) can change, and this makes updates more difficult.
Some tables have columns such as AIRPORT_NAME or CITY_NAME which are stated as the primary keys
(according to the business users) but, not only can these change, indexing on a numerical value is probably
better and you could consider creating a surrogate key called, say, AIRPORT_ID. This would be internal to
the system and as far as the client is concerned you may display only the AIRPORT_NAME.
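As an illustration, a minimal Oracle sketch (the table and column names are hypothetical) that generates AIRPORT_ID surrogate values from a sequence:
CREATE SEQUENCE airport_seq START WITH 1 INCREMENT BY 1;
-- AIRPORT_ID is the internal surrogate key; AIRPORT_NAME stays the natural key shown to the client
INSERT INTO dim_airport (airport_id, airport_name, city_name)
VALUES (airport_seq.NEXTVAL, 'JFK International', 'New York');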
ODS (Operational Data Store) is the first point in the data warehouse. It stores the real-time data of daily transactions as the first instance of the data.
The staging area is the later part, which comes after the ODS. Here the data is cleansed and temporarily stored before being loaded into the data warehouse.
Transformations can be active or passive. An active transformation can change the number of rows that
pass through it, such as a Filter transformation that removes rows that do not meet the filter condition.
A passive transformation does not change the number of rows that pass through it, such as an Expression
transformation that performs a calculation on data and passes all rows through the transformation
Active transformations
Passive transformation
Lookup
Expression
Stored Procedure
Sequence generator
External Procedure
XML Source Qualifier
Mapplet Input
Mapplet Output
PowerCenter:
Unconnected lookup: 1) it receives its input value from a :LKP expression in another transformation; 2) it supports only a static cache (not a dynamic cache); 3) it returns only a single value; 4) it does not support user-defined default values.
Informatica Architecture:
Client tools:
Repository Manager -> PC Designer -> Workflow Manager -> Workflow Monitor
Repository Manager: create, modify, delete folders; manage privileges and access to the repository service (RS).
PC Designer: source and target definitions, mappings, mapplets.
Workflow Manager: create tasks and connect them into workflows.
Workflow Monitor: display results/output.
PowerCenter tools:
PC Service <----> Repository Database
PC Service <----> PC Server (the server in turn connects to the sources and targets)
Here PC = PowerCenter.
a) Unicode - the Integration Service allows 2 bytes for each character and uses an additional byte for each non-ASCII character (such as Japanese characters).
b) ASCII - the Integration Service holds all data in a single byte.
Which is better among incremental load, normal load, and bulk load?
Incremental load:
Incremental means: suppose today we processed 100 records; for tomorrow's run we need to extract only the records that were newly inserted or updated after the previous run, based on the last-updated timestamp of yesterday's run. This process is called incremental or delta loading.
Normal load:
In a normal load we process the entire source data into the target with constraint-based checking.
Bulk load:
In a bulk load we process the entire source data into the target without checking constraints on the target.
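A minimal SQL sketch of the incremental (delta) extraction, assuming a hypothetical source table SRC_ORDERS with a LAST_UPDATED_TS column and a mapping variable $$LAST_RUN_TS that holds the previous run's timestamp:
-- Pull only rows inserted or updated since the last run
SELECT order_id, customer_id, amount, last_updated_ts
FROM src_orders
WHERE last_updated_ts > TO_DATE('$$LAST_RUN_TS', 'YYYY-MM-DD HH24:MI:SS');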
Which transformation you need while using the cobol sources as source definitions?
Normalizer transformation which is used to normalize the data
b) For targets
To load any target table you have the option to create either a "fast load connection", a "multiload connection", or a "Tpump connection" from the Workflow Manager > Connections > Loader > New .... menu.
To configure these connections you may want help from someone who has knowledge of Teradata specifics such as the TDPID.
Advantages: In the event of failures, you can recover using the Teradata recovery process.
Disadvantages: Staged mode is slower than Piped mode, and you need more disk space, as it can create
large data files.
Piped Mode: The Informatica process reads from the source and simultaneously pipes that data to the
loader to start loading the target table.
Advantages: Quicker than Staged mode, and you do not require large amounts of disk space because no
data files are created.
Disadvantages: In the event of failures, you cannot recover using the Teradata recovery process (because
tpump does row commits unlike fastload and mload).
Fastload
You use the Fastload process on empty tables, such as loading staging tables and in initial loads where the
tables are empty.
When the Fastload process starts loading, it locks the target table, which means that processes (for
example, lookups) cannot access that table. One solution to this problem is to specify dummy SQL for the
look up overrides at the session level.
TIP: If a session fails during a Fastload process, use SQL Assistant to run a simple SQL command (for example, count(*)) to determine whether the table is locked by a Fastload process.
If a table is locked (for example, W_ORG_DS), use the following script to release the lock:
LOGON SDCNCR1/Siebel_qa1,sqa1;
BEGIN LOADING Siebel_qa1.W_ORG_DS
ERRORFILES Siebel_qa1.ET_W_ORG_DS,siebel_qa1.UV_W_ORG_DS;
END LOADING;
If you save the above text in a file called test.ctl, you can run this process by passing the file to the fastload utility at a command prompt, for example:
fastload < C:\fastload\test.ctl
TIP: To create a load script for a table, edit the test.ctl script above to change the login information, and
replace all occurrences of W_ORG_DS with the required target table name.
After a load process script runs successfully, you should be able to run the command 'select count(*)' on the
target table. If you are not able release the lock, you might need to drop and re-create the table to remove
the lock. If you do so, you must re-create the statistics.
TIP: Fastload is typically used in piped mode to load staging tables and initial loads. In the event of
errors, reload the entire data.
Mload
The Mload process is slower than Fastload but quicker than Tpump. The Mload process can work on both
empty tables as well as on tables with data. In the event of errors when running in piped mode, you cannot
recover the data.
Tpump
The Tpump process is slower than Mload but faster than ODBC. The Tpump process does row commits,
which enables you to recover processed operations, even if you use piping mode. In other words, if you re-
start the process, Tpump starts loading data from the last committed data.
Tpump can be used in the following modes:
Tpump_Insert : Use to do inserts.
Tpump_Update : Use to do updates (this mode requires you to define the primary key in the Informatica
target table definition).
Tpump_Upsert : Use to do update otherwise insert (this mode requires you to define the primary key in
the Informatica target table definition).
Tpump_Delete: Use to do deletes (this mode requires you to define the primary key in the Informatica
target table definition).
Informatica uses the actual target table name to generate the error table and log table names used as part of its control file generation. If you have two instances of Tpump loading into the same target table at the same time, you need to modify the session to use a different error table and log table name.
The Tpump load process in piped mode is useful for incremental loads, and where the table is not empty.
In the event of errors, restart the process and it starts re-loading from the last committed data.
Refer to Informatica documentation for information about configuring a session to use Teradata loaders.
In Informatica we use check-in and check-out as the versioning mechanism. Whenever we want to edit any mapping/mapplet/workflow/session in the repository, we need to check it out first. Once we check it out it becomes editable and we can implement our changes, and once all the changes are done we can save it.
Once we are done with the changes we check it in so that other users can see what changes we have made and can view them as well.
Note: it is good practice to enter comments for every check-in and check-out. This allows other users to know the purpose of your change.
A debugger is used to troubleshoot errors in an Informatica mapping, either before running a session or after saving the mapping and running the session. To debug a mapping, first configure the debugger and then run it within the Mapping Designer.
The Debugger makes use of the existing session or creates a debug session of its own to debug the
mapping.
Debugging can be done in either of the below situations.
Before running the session: Once you are done with the mapping you can do the debugging on the
mapping before making the session to check the initial results
After you run a session: When you encounter any errors while running the sessions then you can go to the
mapping and start debugger with the existing session
We can move objects from the Development folder to the Production folder in Informatica by using either of two methods. Before making any changes to the Production folder, make sure you take a backup of all the objects.
Export and import
Export the mappings that you want to move from the DEV folder and save those as XML in some folder.
Take the back up of the Production mappings before replacing
Import these XML into Production Folder
Save the mappings
While doing this, we need to check the following:
1.We need to check the Replace check box in case the source or target is already present in the folder
2. For other reusable transformations such as source shortcuts, target shortcuts or any other
transformations (Sequence Generator, Lookup etc) choose Reuse (not replace)
3. On Global copy options, in conflict resolution wizard, select (or mark) Retain Sequence Generator,
Normalizer or XML key current values.
Direct Migration:
If Development and Production are in separate repositories, go to the Repository Manager and connect to the Development repository. Then go to the Production repository and open that too. You can then drag and drop all the folders from Dev to Production.
Note:
Problems might come up when we export and import objects separately, i.e. mapping, workflow etc. The big problem is shortcuts. In this case:
1.Open Development folder from Repository Manger
2.Select only the workflows related to the mapping from the repository manager and export only the
workflow XML. This will take all the associated objects (mappings, sessions, etc.) with the workflows.
3.Import from this single file into your Prod Environment.
This will import and export everything regarding a mapping.
If it's a full load, check that the truncate option is enabled and working correctly. For Type 2 (SCD Type 2) the truncate option should be disabled.
3) Check the performance of the session while loading the data. This includes checking the threshold value once the data is loaded. Performance becomes important when the number of records is huge.
4) Set Stop On Error to 1 in error handling. This indicates how many non-fatal errors the Integration Service can encounter before it stops the session. Non-fatal errors include reader, writer, and DTM errors. Enter the number of non-fatal errors you want to allow before stopping the session. This will stop the workflow when Informatica encounters any error.
Check the option to fail the parent if the task fails. "Fail parent if this task fails" should be checked for all the sessions within a workflow.
Check that the logs are getting updated properly after the data load is completed. This includes the session log and the workflow log.
Check that the mapping parameters and workflow parameters used in the mapping are defined correctly.
Compare the stage and target table counts.
Select Count(*) From Product_D
Select Count(*) From Product_Ds
Compare the attributes from the stage tables to those of the target tables. They should match.
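A hedged SQL sketch of that attribute comparison, assuming the stage table PRODUCT_DS and the target table PRODUCT_D share the same columns (the column list shown is an assumption); any rows returned indicate mismatches:
SELECT product_id, product_name FROM product_ds
MINUS
SELECT product_id, product_name FROM product_d;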
In the Informatica Lookup transformation we have the option to cache the lookup table (a cached lookup). If we don't use the lookup cache it is called an uncached lookup.
In an uncached lookup we query the base table directly and return output values based on the lookup condition. If the lookup condition matches, it returns the value from the lookup table; if the lookup condition is not satisfied, it returns either NULL or the default value. This is how an uncached lookup works.
The lookup cache can be of different types, such as a dynamic cache and a static cache.
A static cache is the same as a cached lookup: once the cache is created, the Integration Service always queries the cache instead of the lookup table.
With a static cache, when the lookup condition is true it returns the value from the lookup table; otherwise it returns NULL or the default value.
The important thing about a static cache is that you cannot insert into or update the cache.
1. Connected or Unconnected:
They differ in the way input is received. A Connected Lookup receives input through the pipeline, whereas an Unconnected Lookup receives input values from the result of a :LKP expression in another transformation.
2. Lookup via Flat File or Relational:
After creating Lookup Transformation we can lookup either on a Flat file or on relational tables.
When we do Lookup on Relational tables we have to connect to the required table which will be there in
the source list of Lookup. If it’s not there then we need to import the table definition for the Lookup
Transformation
When we do Lookup on Flat files the Designer invokes the Flat File Wizard and connects the source.
3. Cached or Uncached :
Lookup Cache can be of two types:
1. Dynamic Cache: with a dynamic cache the Integration Service inserts or updates rows in the cache as rows pass through. This improves session performance and speeds up the activity.
2. Static Cache: by default the lookup cache is static and does not change during the session.
Normally a lookup is cached by default, which means that when we do a lookup on a table, Informatica reads the lookup table and stores the data in a cache file. This avoids re-reading the table when we need the data again: Informatica makes use of the cache file, which makes the lookup faster.
But now the question is why we need a persistent cache in a lookup. To use a persistent cache we need to check the Lookup Cache Persistent option in the lookup. When we do that, Informatica stores the cache file and does not delete it after the session or workflow run.
This becomes handy in situations where we use the same lookup in many mappings. Suppose we use the lookup LKP_GET_VALUE with the same lookup condition, return port, and output ports in 10 different mappings. If we don't use a persistent cache we have to build the lookup cache 10 times, and if the table is huge it takes some time to build the cache each time. This can be avoided by using a persistent cache.
Dynamic cache
In Dynamic Cache we can insert or update rows in the cache when we pass the rows. The Integration
Service dynamically inserts or updates data in the lookup cache and passes the data to the target. The
dynamic cache is synchronized with the target
Shared cache
When we use a shared cache, the Informatica server creates one cache for multiple Lookup transformations in the mapping; once the cache is built for the first lookup, the other Lookup transformations reuse the same cache instead of building their own.
We can share the lookup cache between multiple transformations. Unnamed cache is shared between
transformations in the same mapping and named cache between transformations in the same or different
mappings.
Persistent cache
If we use Persistent cache Informatica server processes a lookup transformation and saves the lookup cache
files and reuses them the next time. The Integration Service saves or deletes lookup cache files after a
successful session run based on whether the Lookup cache is checked as persistent or not
In order to make a Lookup Cache as Persistent cache you need to make the following changes
Lookup cache persistent: Needs to be checked
Cache File Name Prefix: Enter the Named Persistent cache file name
Re-cache from lookup source: Needs to be checked
The date and time shown in the Workflow Manager come from the Windows Control Panel settings of the PowerCenter Client machine. You can modify them using the steps below:
Go to Control Panel
Click on Regional Settings
Set the date and time
Syntax:
pmcmd startworkflow -sv <Integration Service Name> -d <Domain Name> -u <Integration Service
Username> -p <Password> -f <Folder Name> <Workflow>
Before that we need to configure the environment variables and make the necessary changes.
Go to Control Panel -> System -> Advanced -> Environment Variables -> System variables; there you can see the PATH variable. Just add a ";" at the end and append the path of the directory containing pmcmd.
Informatica PowerCenter allows us to control rollback and commit on transactions based on the set of rows that pass through a Transaction Control transformation. This lets you define whether your transaction should be committed or rolled back based on the rows that pass through, for example based on an entry date or some other column. We can control this transaction either at the mapping level or at the session level.
1) Mapping level:
Inside the mapping we use a Transaction Control transformation, and inside this transformation we have an expression. Based on the return value of this expression we decide whether to commit, roll back, or continue without any transaction changes. The transaction control expression uses the IIF function to test each row against the condition.
Use the following syntax for the expression:
IIF (condition, value1, value2)
Use the following built-in variables in the Expression Editor when you create a transaction control
expression:
TC_CONTINUE_TRANSACTION
The Integration Service does not perform any transaction change for this row. This is the default value of
the expression.
TC_COMMIT_BEFORE
The Integration Service commits the transaction, begins a new transaction, and writes the current row to
the target. The current row is in the new transaction.
TC_COMMIT_AFTER
The Integration Service writes the current row to the target, commits the transaction, and begins a new
transaction. The current row is in the committed transaction.
TC_ROLLBACK_BEFORE
The Integration Service rolls back the current transaction, begins a new transaction, and writes the current
row to the target. The current row is in the new transaction.
TC_ROLLBACK_AFTER
The Integration Service writes the current row to the target, rolls back the transaction, and begins a new
transaction. The current row is in the rolled back transaction
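For instance, a hypothetical transaction control expression (the column name NEW_INVOICE_FLAG is an assumption) that commits the open transaction before each row that starts a new invoice and otherwise continues:
IIF(NEW_INVOICE_FLAG = 1, TC_COMMIT_BEFORE, TC_CONTINUE_TRANSACTION)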
2)Session Level:
When we run a session the Integration Service checks the expression ,and when it finds a commit row then
it commits all rows in transaction to the target table. When the Integration Service evaluates a rollback row,
it rolls back all rows in the transaction from the target or targets.
Also we can do user defined commit here in case Integration services fails to do so.
A surrogate key is the primary key for a dimension table. A surrogate key is a substitute for the natural primary key. It is just a unique identifier or number for each row that can be used as the primary key of the table. The only requirement for a surrogate primary key is that it is unique for each row in the table.
Data warehouses typically use a surrogate key (also known as an artificial or identity key) for the dimension tables' primary keys. They can use a sequence generator, an Oracle sequence, or SQL Server identity values for the surrogate key. It is useful because the natural primary key (i.e. Customer Number in the Customer table) can change, and this makes updates more difficult.
Some tables have columns such as AIRPORT_NAME or CITY_NAME which are stated as the primary keys (according to the business users), but not only can these change, indexing on a numerical value is probably better, and you could consider creating a surrogate key called, say, AIRPORT_ID. This would be internal to the system and, as far as the client is concerned, you may display only the AIRPORT_NAME.
select ROWNUM as RANK, ename,sal from(select ename,sal from emp ORDER BY sal) WHERE
ROWNUM<=3;
RANK ENAME SAL
---------- ---------- ----------
1 SMITH 800
2 JAMES 950
3 ADAMS 1100
SCD Type 1,Slowly Changing Dimension Use,Example,Advantage,Disadvantage
In Type 1 Slowly Changing Dimension, the new information simply overwrites the original information. In
other words, no history is kept.
In our example, recall we originally have the following table:
Customer Key Name State
1001 Williams New York
After Williams moved from New York to Los Angeles, the new information replaces the original record, and we have the following table:
Customer Key Name State
1001 Williams Los Angeles
Advantages
This is the easiest way to handle the Slowly Changing Dimension problem, since there is no need to keep
track of the old information.
Disadvantages
All history is lost. By applying this methodology, it is not possible to trace back in history. For example, in
this case, the company would not be able to know that Williams lived in New York before.
Usage
About 50% of the time.
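A minimal SQL sketch of the Type 1 overwrite, assuming a hypothetical dimension table D_CUSTOMER with the columns shown above:
-- Type 1: overwrite in place, keeping no history
UPDATE d_customer
SET state = 'Los Angeles'
WHERE customer_key = 1001;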
In Type 3 Slowly Changing Dimension, there will be two columns to indicate the particular attribute of
interest, one indicating the original value, and one indicating the current value. There will also be a column
that indicates when the current value becomes active.
In our example, recall we originally have the following table:
Customer Key Name State
1001 Williams New York
To accommodate Type 3 Slowly Changing Dimension, we will now have the following columns:
Customer Key, Name, Original State, Current State, Effective Date
After Williams moved from New York to Los Angeles, the original information gets updated, and we have
the following table (assuming the effective date of change is February 20, 2010):
Customer Key Name Original State Current State Effective Date
1001 Williams New York Los Angeles 20-FEB-2010
Advantages
This does not increase the size of the table, since new information is updated.
This allows us to keep some part of history.
Disadvantages
Type 3 will not be able to keep all history where an attribute changes more than once. For example, if Williams later moves to Texas on December 15, 2013, the Los Angeles information will be lost.
Usage
Type 3 is rarely used in actual practice.
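A minimal SQL sketch of the Type 3 change, again assuming a hypothetical D_CUSTOMER table with the Type 3 columns listed above (Original State already holds 'New York'):
-- Type 3: keep the prior value in Original State, overwrite Current State
UPDATE d_customer
SET current_state  = 'Los Angeles',
    effective_date = DATE '2010-02-20'
WHERE customer_key = 1001;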
important points/enhancements in OBIEE 11g when compared to OBIEE 10g are listed below
OBIEE 11g uses WebLogic Server as the application server as compared to Oracle AS or OC4J in OBIEE
10g.
The clustering process is much easier and more automated in OBIEE 11g.
We can now model lookup tables in the repository.
The new UI called Unified Framework now combines Answers, Dashboards, and Delivers.
A new column type called the hierarchical column is introduced.
BI Publishers is fully and seamlessly integrated with OBIEE 11g.
New time series functions PERIOD ROLLING and AGGREGATE AT are introduced.
In OBIEE 11g we can create KPIs to represent business metrics.
The aggregate persistence wizard creates indexes automatically.
The session variables get initialized when they are actually used in OBIEE 11g unlike OBIEE 10g where
they were initialized as soon as a user logs in.
OBIEE 11g now supports Ragged (Unbalanced) and Skipped Hierarchy.
You can also define Parent-Child hierarchy in OBIEE 11g as well.
SELECT_PHYSICAL command is supported in OBIEE 11g.
In OBIEE 11g there are some changes in the terminology as well.
iBots are renamed as Agents.
Requests are renamed as Analyses.
Charts are renamed as Graphs.
Presentation Columns are renamed as Attribute Columns.
All dimensions in your warehouse need to be conformed to get the exact power of a Data warehouse.
Below are some of the commonly used Conformed Dimensions:
Customer
Product
Date/Time
Employee
Account
Region /Territory
Vendor
About the code page setting: it depends on the data you are trying to load. If it contains a lot of special characters we prefer to load via UTF-8, as some code pages do not support them. Your Informatica connection code page should match the source or target code page.
How to fail a session immediately when the first error occurs on log?
Click on session -->config object --> stop on error =1.
Stop on error means the running session will stop/fail whenever it encounters any error in the session. If
you specify '1' here, session will stop on the very first error without processing any data further, hence will
fail
The DD_DELETE property deletes the row from the table and hence decreases the number of rows; hence the Update Strategy transformation is active.
Q)What are the reasons why target bottleneck occurs? can any one give me at least 5 reasons?
1. Drop indexes and key constraints. We can drop and rebuild the indexes in the pre- and post-session SQL (see the sketch after this list).
4. Increase the database network packet size (at the Oracle level we do this in tnsnames.ora or listener.ora; at the Informatica level we need to increase it in the Informatica server configuration and also in the database server network memory).
6.Use of Partitioning concepts (pass through partitioning, database partitioning, Key partitioning,
Hash Key partitioning).
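As an illustration of point 1, a hedged sketch of pre- and post-session SQL (the index name IDX_PRODUCT_D_KEY and the indexed column are hypothetical):
-- Pre-session SQL: drop the index before the bulk write
DROP INDEX idx_product_d_key;
-- Post-session SQL: rebuild it after the load completes
CREATE INDEX idx_product_d_key ON product_d (product_key);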
2.
A) 47, 23, 59. By default the aggregator returns the last value from each group; hence 23 is returned.
Q) Difference between Informatica 8.x and 9.x?
1) A Lookup can now be configured as an active transformation. 2) We have an option to configure the size of the session log. 3) The new version of Informatica comes bundled with the Informatica Developer and Analyst tools. 4) Deadlock handling is enabled in this version; earlier, if there was a deadlock the session used to fail.
It is a concept we implement in data warehousing, where we use a staging area to land data from different kinds of sources before loading the DWH. Staging contains only the current cleansed, profiled data, whereas the DWH contains all historical cleansed, profiled data.
The main purpose is that we cannot compare source-format data with the DWH directly, as the source data is not properly cleansed and may not match the cleansed data stored in the DWH. So first we cleanse all the source data and load it temporarily into staging, from where we compare it with the DWH using SCDs.
However, in EDW systems there are multiple reasons why you may want to stage the data:
1. If you don't have staging and a load fails, you may get a different set of data extracted from the source on the next run.
2. A staged source is easier to work with in bulk mode when you want to merge multiple sources and load them into the EDW.
- Sometimes all sources are not available at the same time; they may become available in different slots.
3. SCD Type 2 and other complex operations may need a placeholder on the database in the form of staging, where it is easier to build complex SQL and then extract data instead of doing in-memory joins.
Whatever your source is, you will be implementing some logic to cleanse, profile, and reformat the source data before loading it into the target DWH.
Simple example: assume the source has a column ENAME which contains data with some spaces, like ' abc '. Before loading to the DWH you will have some standards to follow, such as spaces must be removed, data should be in upper case, or you may extract part of the string, etc.
Now, if you don't have a temporary staging area, when you implement SCD the comparison of ' abc ' will never match 'ABC'. But once you cleanse the data and load it to staging, the data will be cleansed to 'ABC', which can be compared with the DWH data (the SCD implementation), will match, and will load accordingly.
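A minimal SQL sketch of that cleansing step, assuming a hypothetical load from SRC_EMP into the staging table STG_EMP:
-- ' abc ' becomes 'ABC', matching the standard used in the DWH
INSERT INTO stg_emp (empno, ename)
SELECT empno, UPPER(TRIM(ename))
FROM src_emp;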
With staging: source -> ETL (cleansing, profiling, reformatting, business standards) -> staging -> ETL (SCD implementation) -> DWH
Without staging: source -> ETL (SCD implementation) -> DWH
Basically we don't perform business logic in the target area. Apart from that, there are often issues like data cleansing problems in the source system; in that case we need to maintain a staging area. If your data is already clean, you have already performed the transformations and handled cleansing issues through the ETL tool, or you have storage constraints, then you may not need a staging area. But as a standard a staging area should be used, because the source data is always prone to error.
Even if your source and target are database tables or files, as per the requirements you have to perform some data cleansing operations before loading into the ODS layer; for this we use staging. If you don't have any requirement to cleanse data and just pass data from the source systems to the ODS, then there is no need for a staging layer.
Also, this is done to avoid overhead on the OLTP system: if you directly run complex queries on the source system you might lock the tables for quite a period of time, so to avoid that congestion the data is brought into the staging area. But this is not mandatory.
Different Types of Tracing Levels In Informatica
The tracing levels can be configured at the transformation And/OR session level in informatica. There are 4
different types of tracing levels. The different types of tracing levels are listed below:
Tracing levels:
None: Applicable only at session level. The Integration Service uses the tracing levels configured in the
mapping.
Terse: logs initialization information, error messages, and notification of rejected data in the session log
file.
Normal: Integration Service logs initialization and status information, errors encountered and skipped
rows due to transformation row errors. Summarizes session results, but not at the level of
individual rows.
Verbose Initialization: In addition to normal tracing, the Integration Service logs additional
initialization details; names of index and data files used, and detailed transformation statistics.
Verbose Data:In addition to verbose initialization tracing, the Integration Service logs each row that
passes into the mapping. Also notes where the Integration Service truncates string data to fit the
precision of a column and provides detailed transformation statistics. When you configure the
tracing level to verbose data, the Integration Service writes row data for all rows in a block when it
processes a transformation.
At the transformation level, you can find the Tracing Level option in the Properties tab; select the required tracing level. At the session level, go to the Config Object tab and select the Override Tracing option.
The question is which tracing level to use. If you want to debug the session, use verbose data as it provides complete information about the session run. However, do not use this tracing level in production, because it causes a performance hit as the Integration Service writes the complete information into the session log file.
In production use the normal tracing level. The normal tracing level is enough to identify most of the errors when a session fails. As the Integration Service writes less data to the session log file, this tracing level won't cause any performance issue.
ANS) There are many ways to perform data validation. Once the ETL is completed the business team validates the data: they manually query the critical fields and check whether the implemented business logic has met the requirements. In the market you also have off-the-shelf tools with which you can do data validation, for example the Informatica Data Validation Option (DVO).
The business team will provide all the valid business dates. If a date is invalid then you will be defaulting the values. This helps us see whether any invalid dates are coming from the source system, and it also helps take the issue forward for discussion with the data governance team to fix it at the source system level.
Oracle Query: select name,no,max(case when subject='M' then marks end) Maths,max(case when
subject='S' then marks end) science,
max(case when subject='E' then marks end) English from students group by name,no order by name;
OR
select * from students pivot(max(marks) for subject in('M' as MATHS,'S' as Science,'E' as English));
MQ and FTP, I am sure, are available for target files. I am not sure about Loader and ERP; it should be one of those two. I need to check on that and will update you.
Both the Joiner and the Source Qualifier also need a common key or some relation between the source definitions being joined, else the mapping will be invalid.
How many return ports max I can use in unconnected lookup transformation?
If I want to use more than 1 return port in unconnected lookup what is the best way to follow?
Data cache and index cache using in joiner and lookup transformations?
If I change the data type in SQ of any mapping what kind of error I may get?
Have you used parameter file concept? If so where the parameter file will be saved and how it will be
accessed?
Can we use alias names in where condition in SQL query? If I use what will happen?
If I use an Aggregator transformation without any group-by condition, what is the output from the transformation and how many rows will it return?
The Qs :-
#1
a
b
a
b
c
d
tar1
a
b
c
d
tar2
a
b
#2
Can we update the target table without an Update Strategy transformation?
What needs to be done in the target properties, or will a session-level property alone do?
#3
What are the types of lookups? What is the difference between a static and a dynamic cache?
#4
Asked about my project: how you go through the SDLC, how many incidents you handled in my first support project, etc.
#5
What happens if we don't sort the data before an Aggregator? Is it mandatory, and will the session fail if it is not done?
SESSION Partitions
Pass Through
Round Robin
Key Range
Hash auto key
Hash User key
Key range: based on the port(s) specified, the Integration Service distributes the rows by value ranges.
Hash auto-keys: the Integration Service uses the sorted ports and group-by ports to generate the hash keys automatically.
Hash user keys: the user needs to specify the ports on which the hash function should be applied to group the rows.
29.Can you use the mapping parameters or variables created in one mapping into another mapping?
31.When we can join tables at the Source qualifier itself, why do we go for joiner transformation?
34.In a joiner transformation, you should specify the table with lesser rows as the master table. Why?
37.Explain what Load Manager does when you start a work flow?
38.In a Sequential batch how do i stop one particular session from running?
42.What are the different types of the caches available in Informatica? Explain in detail?
43.What is polling?
45.What is Mapplet?
50.How can you delete duplicate rows with out using Dynamic Lookup? Tell me any other ways using
lookup delete the duplicate rows?
51.Can u copy the session in to a different folder or repository?
52.What is tracing level and what are its types?
55.If your workflow is running slow, what is your approach towards performance tuning?
57.After dragging the ports of three sources (Sql server, oracle, Informix) to a single source qualifier, can we
map these three ports directly to target?
61.Explain how we set the update strategy transformation at the mapping level and at the session level?
62.What is exact use of 'Online' and 'Offline' server connect Options while defining Work flow in Work
flow monitor? The system hangs when 'Online' Server connect option. The Informatica is installed on a
Personal laptop.
64.Write a session parameter file which will change the source and targets for every session. i.e different
source and targets for each session run ?
68.What is Transformation?
70.How do you recognize whether the newly added rows got inserted or updated?
74.How do you handle the decimal places when you are importing the flat file?
75.What is the difference between $ & $$ in mapping or parameter file? In which case they are generally
used?
76.While importing the relational source definition from database, what are the meta data of source U
import?
77.Difference between Power mart & Power Center?
79.If a sequence generator (with increment of 1) is connected to (say) 3 targets and each target uses the
NEXTVAL port, what value will each target get?
87.How to delete duplicate records from source database/Flat Files? Can we use post sql to delete these
records. In case of flat file, how can you delete duplicates before it starts loading?
88.You are required to perform “bulk loading” using Informatica on Oracle, what action would perform at
Informatica + Oracle level for a successful load?
89.What precautions do you need take when you use reusable Sequence generator transformation for
concurrent sessions?
90.Is it possible negative increment in Sequence Generator? If yes, how would you accomplish it?
91.Which directory Informatica looks for parameter file and what happens if it is missing when start the
session? Does session stop after it starts?
92.Informatica is complaining about the server could not be reached? What steps would you take?
93.You have more five mappings use the same lookup. How can you manage the lookup?
94.What will happen if you copy the mapping from one repository to another repository and if there is no
identical source?
95.How can you limit number of running sessions in a workflow?
96.An Aggregate transformation has 4 ports (l sum (col 1), group by col 2, col3), which port should be the
output?
97.What is a dynamic lookup and what is the significance of NewLookupRow? How will use them for
rejecting duplicate records?
98.If you have more than one pipeline in your mapping how will change the order of load?
99.When you export a workflow from Repository Manager, what does this xml contain? Workflow only?
100. Your session failed and when you try to open a log file, it complains that the session details are not
available. How would do trace the error? What log file would you seek for?
101.You want to attach a file as an email attachment from a particular directory using ‘email task’ in
Informatica, How will you do it?
102. You have a requirement to alert you of any long running sessions in your workflow. How can you
create a workflow that will send you email for sessions running more than 30 minutes. You can use any
method, shell script, procedure or Informatica mapping or workflow control?
Scenario 1: How can we load first and last record from a flat file source to target?
Solution:
1st pipeline would capture the first record, and 2nd one for last record.
1st Pipeline:
src -> sq -> exp (take a variable port with numeric data type and pass it through an output port 'O_Test') -> filter (pass only if O_Test = 1) -> tgt
2nd pipeline:
Scenario 2: How to find out the nth row in a flat file? We used to do top-N analysis using rownum and some other functionality using rowid when the source is a table; my query is how to achieve the same functionality when my source is a flat file.
Here we have two things: parameters (constant values passed to the mapping) and variables, which are dynamic and can be stored as metadata for future runs. For example, suppose you want to do an incremental load into table B from table A. You can define a variable which holds the seqid from the source. Before you write the data to the target, create an expression, take the seqid from the source as input, and create a variable Max_seqid as output; update this value for each row. When the session finishes, Informatica saves the last-read seqid, and you can use this in your source qualifier the next time you run the mapping. Please see the Informatica documentation for SETMAXVARIABLE and SETMINVARIABLE.
In this case, we just have to make use of parameters to find the nth row.
Now you have to create a parameter file on the Unix box before you call the workflow.
echo '[<FOLDERNAME>.WF:<WorkflowName>.ST:<SessionName>]' > <workflowname>.par
count=`wc -l < filename`
echo "\$\$MappingVariable="$count >> <workflowname>.par
Name the parameter file as <workflowname>.par and copy the complete path of the file name and update
the "Parameter filename" field under Properties tab in workflow edit tasks.
You can then use this variable in your mapping wherever you want. Just proceed it with two $$.
10 c 200
20 d 300
Solution :
2. Create a variable in an expression transformation that would track the change in EID e.g. in your case if
the data is sorted based on EID then it would look like
10 a 100
10 c 200
20 b 100
20 d 300
this would create a new file whenever there is a change in the EID value.
4. Add a "filename" port in the target and then pass on a value as per your requirement so that the
filenames get generated dynamically as per your requirement.
1|A,1|B,1|C,1|D,2|A,2|B,3|A,3|B
1|A+B+C+D
2|A+B
3|A+B
Solution:
Follow the logic given below in the expression and you will get your output.
Please ensure that all the ports you mentioned below are variable ports and the incoming data should be
sorted by key,data
Example trace with variable ports (input sorted by key, data):
V_CURNT_KEY:  1  1  1  2
V_CURNT_DATA: a  b  c  d
v_OUT (variable port) =
IIF(ISNULL(v_PREV_DATA) OR LENGTH(v_PREV_DATA) = 5, v_CURNT_DATA,
    IIF(V_CURNT = V_PREV, V_PREV_DATA || '~' || V_CURNT_DATA, NULL))
which gives: a, a~b, a~b~c, d
All the above scenarios have been taken from the Informatica communities. In case anyone needs more information about the scenarios discussed, they may contact me for clarifications.
Scenario 1: My source file has 5,00,000 records. While fetching, it is skipping records due to data type and other issues, and finally it fetches only 1,00,000 records. Through the session properties we see 1,00,000 as the source count.
But actually we are losing 4,00,000 records. How can I find the number of records that were skipped?
Solution: in the repository table OPB_SESS_TASK_LOG there is a count column SRC_FAILED_ROWS.
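A hedged repository query sketch using the column named in the solution; how you filter down to the session run of interest depends on the repository version, so the filter is left as a comment:
-- Skipped-row counts recorded by the Integration Service; add a filter on the
-- columns identifying your session run (names vary by PowerCenter version)
SELECT src_failed_rows
FROM opb_sess_task_log;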
id date
101 2/4/2008
101 4/4/2008
102 6/4/2008
102 4/4/2008
103 4/4/2008
104 8/4/2008
Solution: You can use the rank transformation and select rank port for id and group by on date. In
properties tab select bottom and number of ranks as '1' .
Scenario 3: My scenario is that I am loading records on a daily basis and the target is not a truncate-and-load. Suppose from the source I am loading records like:
ID|Name
101|Apple
102|Orange
102|Banana
but in the target I already have records for ID 102 (10 records from yesterday). The scenario I need is: I have to delete only yesterday's records for ID 102 and load today's records (2 records).
Solution: You can achieve your goal by taking the Look up on your target table and match on the basis of
ID column. Then take an expression after your lookup and add a FLAG column. In that FLAG column
check for the NULL value return from Look up. After expression take 2 filters and in one filter pass the
records with NULL values and Insert those records into Target.
If the Value is not NULL then you can take a UPDATE strategy and Update the old row with the new one.
Scenario 4:
Id | Name
101 | ABC
102 | DEF
101 | AMERICA
103 | AFRICA
102 | JAPAN
103 | CHINA
SID | ID | NAME
1 |101 | ABC
2 |101 | AMERICA
1 |102 | DEF
2 |102 | JAPAN
1 |103 | AFRICA
2 |103 | CHINA
Solution:
1 sort on Id
10
10
10
20
20
30
O/P :
Solution: first import the source, then use a Sorter transformation and sort by your column, then use an Expression transformation, like this:
3.first_value=current_num.
name---backup.sh
1. What are the differences between joiner transformation and source qualifier transformation?
A Joiner transformation can join heterogeneous data sources, whereas a Source Qualifier can join only homogeneous sources: the Source Qualifier can join data only from relational sources and cannot join flat files.
2. What are the limitations of joiner transformation?
Both pipelines begin with the same original data source.
Both input pipelines originate from the same Source Qualifier transformation.
Both input pipelines originate from the same Normalizer transformation.
Both input pipelines originate from the same Joiner transformation.
Either input pipelines contains an Update Strategy transformation.
Either input pipelines contains a connected or unconnected Sequence Generator transformation.
3. What are the settings that you use to configure the Joiner transformation? The following settings are used to configure the Joiner transformation:
Master and detail source
Type of join
Condition of the join
4. What are the join types in joiner transformation?
The join types are
Normal (Default)
Master outer
Detail outer
Full outer
5. What are the joiner caches?
When a Joiner transformation occurs in a session, the Informatica Server reads all the records from the
master source and builds index and data caches based on the master rows. After building the caches, the
Joiner transformation reads records from the detail source and performs joins.
6. What is the look up transformation?
Lookup transformation is used to lookup data in a relational table, view and synonym. Informatica server
queries the look up table based on the lookup ports in the transformation. It compares the lookup
transformation port values to lookup table column values based on the look up condition.
7. Why use the lookup transformation?
Lookup transformation is used to perform the following tasks.
Get a related value.
Perform a calculation.
Update slowly changing dimension tables.
8. What are the types of lookup transformation?
The types of lookup transformation are Connected and unconnected.
9. What is meant by lookup caches?
The informatica server builds a cache in memory when it processes the first row of a data in a cached look
up transformation. It allocates memory for the cache based on the amount you configure in the
transformation or session properties. The informatica server stores condition values in the index cache and
output values in the data cache.
13. How the informatica server sorts the string values in Rank transformation?
When the informatica server runs in the ASCII data movement mode it sorts session data using Binary sort
order. If you configure the session to use a binary sort order, the informatica server calculates the binary
value of each string and returns the specified number of rows with the highest binary values for the string.
12. What are the basic requirements to join two sources in a source qualifier transformation using default
join?
The two sources should have primary key and foreign key relationship.
The two sources should have matching data types
SQL Queries Interview Questions
2. Write a query to display only friday dates from Jan, 2000 to till now?
Solution:
SELECT C_DATE,
TO_CHAR(C_DATE,'DY')
FROM
(
SELECT TO_DATE('01-JAN-2000','DD-MON-YYYY')+LEVEL-1 C_DATE
FROM DUAL
CONNECT BY LEVEL <=
(SYSDATE - TO_DATE('01-JAN-2000','DD-MON-YYYY')+1)
)
WHERE TO_CHAR(C_DATE,'DY') = 'FRI';
3. Write a query to duplicate each row based on the value in the repeat column? The input table data looks
like as below
Products, Repeat
----------------
A, 3
B, 5
C, 2
Now in the output data, the product A should be repeated 3 times, B should be repeated 5 times and C
should be repeated 2 times. The output will look like as below
Products, Repeat
----------------
A, 3
A, 3
A, 3
B, 5
B, 5
B, 5
B, 5
B, 5
C, 2
C, 2
Solution:
SELECT PRODUCTS,
REPEAT
FROM T,
( SELECT LEVEL L FROM DUAL
CONNECT BY LEVEL <= (SELECT MAX(REPEAT) FROM T)
)A
WHERE T.REPEAT >= A.L
ORDER BY T.PRODUCTS;
4. Write a query to display each letter of the word "SMILE" in a separate row?
S
M
I
L
E
Solution:
SELECT SUBSTR('SMILE',LEVEL,1) A
FROM DUAL
CONNECT BY LEVEL <=LENGTH('SMILE');
5. Convert the string "SMILE" to Ascii values? The output should look like as 83,77,73,76,69. Where 83 is the
ascii value of S and so on.
The ASCII function will give ascii value for only one character. If you pass a string to the ascii function, it
will give the ascii value of first letter in the string. Here i am providing two solutions to get the ascii values
of string.
Solution1:
SELECT SUBSTR(DUMP('SMILE'),15)
FROM DUAL;
Solution2:
SELECT WM_CONCAT(A)
FROM
(
SELECT ASCII(SUBSTR('SMILE',LEVEL,1)) A
FROM DUAL
CONNECT BY LEVEL <=LENGTH('SMILE')
);
SQL Queries Interview Questions - Oracle Part 1
To solve these interview questions on SQL queries you have to create the products, sales tables in your
oracle database. The "Create Table", "Insert" statements are provided below.
PRODUCT_ID PRODUCT_NAME
-----------------------
100 Nokia
200 IPhone
300 Samsung
I hope you have created the tables in your oracle database. Now try to solve the below SQL queries.
1. Write a SQL query to find the products which have continuous increase in sales every year?
Solution:
Here “Iphone” is the only product whose sales are increasing every year.
STEP1: First we will get the previous year sales for each product. The SQL query to do this is
SELECT P.PRODUCT_NAME,
S.YEAR,
S.QUANTITY,
LEAD(S.QUANTITY,1,0) OVER (
PARTITION BY P.PRODUCT_ID
ORDER BY S.YEAR DESC
) QUAN_PREV_YEAR
FROM PRODUCTS P,
SALES S
WHERE P.PRODUCT_ID = S.PRODUCT_ID;
Here the lead analytic function will get the quantity of a product in its previous year.
STEP2: We will find the difference between the quantities of a product with its previous year’s quantity. If
this difference is greater than or equal to zero for all the rows, then the product is a constantly increasing in
sales. The final query to get the required result is
SELECT PRODUCT_NAME
FROM
(
SELECT P.PRODUCT_NAME,
S.QUANTITY -
LEAD(S.QUANTITY,1,0) OVER (
PARTITION BY P.PRODUCT_ID
ORDER BY S.YEAR DESC
) QUAN_DIFF
FROM PRODUCTS P,
SALES S
WHERE P.PRODUCT_ID = S.PRODUCT_ID
)A
GROUP BY PRODUCT_NAME
HAVING MIN(QUAN_DIFF) >= 0;
PRODUCT_NAME
------------
IPhone
2. Write a SQL query to find the products which does not have sales at all?
Solution:
“LG” is the only product which does not have sales at all. This can be achieved in three ways.
SELECT P.PRODUCT_NAME
FROM PRODUCTS P
LEFT OUTER JOIN
SALES S
ON (P.PRODUCT_ID = S.PRODUCT_ID)
WHERE S.QUANTITY IS NULL;
PRODUCT_NAME
------------
LG
SELECT P.PRODUCT_NAME
FROM PRODUCTS P
WHERE P.PRODUCT_ID NOT IN
(SELECT DISTINCT PRODUCT_ID FROM SALES);
PRODUCT_NAME
------------
LG
SELECT P.PRODUCT_NAME
FROM PRODUCTS P
WHERE NOT EXISTS
(SELECT 1 FROM SALES S WHERE S.PRODUCT_ID = P.PRODUCT_ID);
PRODUCT_NAME
------------
LG
3. Write a SQL query to find the products whose sales decreased in 2012 compared to 2011?
Solution:
Here Nokia is the only product whose sales decreased in year 2012 when compared with the sales in the
year 2011. The SQL query to get the required output is
SELECT P.PRODUCT_NAME
FROM PRODUCTS P,
SALES S_2012,
SALES S_2011
WHERE P.PRODUCT_ID = S_2012.PRODUCT_ID
AND S_2012.YEAR = 2012
AND S_2011.YEAR = 2011
AND S_2012.PRODUCT_ID = S_2011.PRODUCT_ID
AND S_2012.QUANTITY < S_2011.QUANTITY;
PRODUCT_NAME
------------
Nokia
Solution:
Nokia is the top product sold in the year 2010. Similarly, Samsung in 2011 and IPhone, Samsung in 2012.
The query for this is
SELECT PRODUCT_NAME,
YEAR
FROM
(
SELECT P.PRODUCT_NAME,
S.YEAR,
RANK() OVER (
PARTITION BY S.YEAR
ORDER BY S.QUANTITY DESC
) RNK
FROM PRODUCTS P,
SALES S
WHERE P.PRODUCT_ID = S.PRODUCT_ID
)A
WHERE RNK = 1;
PRODUCT_NAME YEAR
--------------------
Nokia 2010
Samsung 2011
IPhone 2012
Samsung 2012
Solution:
This is a simple query. You just need to group by the data on PRODUCT_NAME and then find the sum of
sales.
SELECT P.PRODUCT_NAME,
NVL( SUM( S.QUANTITY*S.PRICE ), 0) TOTAL_SALES
FROM PRODUCTS P
LEFT OUTER JOIN
SALES S
ON (P.PRODUCT_ID = S.PRODUCT_ID)
GROUP BY P.PRODUCT_NAME;
PRODUCT_NAME TOTAL_SALES
---------------------------
LG 0
IPhone 405000
Samsung 406000
Nokia 245000
SQL Queries Interview Questions -
I have used the PRODUCTS and SALES tables as an example, and here also I am using the same tables. Just take a look at those tables in the earlier part and it will be easy for you to understand the questions mentioned here.
Solve the below examples by writing SQL queries.
1. Write a query to find the products whose quantity sold in a year should be greater than the average
quantity of the product sold across all the years?
Solution:
This can be solved with the help of correlated query. The SQL query for this is
SELECT P.PRODUCT_NAME,
S.YEAR,
S.QUANTITY
FROM PRODUCTS P,
SALES S
WHERE P.PRODUCT_ID = S.PRODUCT_ID
AND S.QUANTITY >
(SELECT AVG(QUANTITY)
FROM SALES S1
WHERE S1.PRODUCT_ID = S.PRODUCT_ID
);
2. Write a query to compare the products sales of "IPhone" and "Samsung" in each year? The output should
look like as
Solution:
By using self-join SQL query we can get the required result. The required SQL query is
SELECT S_I.YEAR,
S_I.QUANTITY IPHONE_QUANT,
S_S.QUANTITY SAM_QUANT,
S_I.PRICE IPHONE_PRICE,
S_S.PRICE SAM_PRICE
FROM PRODUCTS P_I,
SALES S_I,
PRODUCTS P_S,
SALES S_S
WHERE P_I.PRODUCT_ID = S_I.PRODUCT_ID
AND P_S.PRODUCT_ID = S_S.PRODUCT_ID
AND P_I.PRODUCT_NAME = 'IPhone'
AND P_S.PRODUCT_NAME = 'Samsung'
AND S_I.YEAR = S_S.YEAR;
Solution:
The ratio of a product is calculated as the total sales price in a particular year divide by the total sales price
across all years. Oracle provides RATIO_TO_REPORT analytical function for finding the ratios. The SQL
query is
SELECT P.PRODUCT_NAME,
S.YEAR,
RATIO_TO_REPORT(S.QUANTITY*S.PRICE)
OVER(PARTITION BY P.PRODUCT_NAME ) SALES_RATIO
FROM PRODUCTS P,
SALES S
WHERE (P.PRODUCT_ID = S.PRODUCT_ID);
4. In the SALES table quantity of each product is stored in rows for every year. Now write a query to
transpose the quantity for each product and display it in columns? The output should look like as
Oracle 11g provides a pivot function to transpose the row data into column data. The SQL query for this is
SELECT * FROM
(
SELECT P.PRODUCT_NAME,
S.QUANTITY,
S.YEAR
FROM PRODUCTS P,
SALES S
WHERE (P.PRODUCT_ID = S.PRODUCT_ID)
)A
PIVOT ( MAX(QUANTITY) AS QUAN FOR (YEAR) IN (2010,2011,2012));
If you are not running oracle 11g database, then use the below query for transposing the row data into
column data.
SELECT P.PRODUCT_NAME,
MAX(DECODE(S.YEAR,2010, S.QUANTITY)) QUAN_2010,
MAX(DECODE(S.YEAR,2011, S.QUANTITY)) QUAN_2011,
MAX(DECODE(S.YEAR,2012, S.QUANTITY)) QUAN_2012
FROM PRODUCTS P,
SALES S
WHERE (P.PRODUCT_ID = S.PRODUCT_ID)
GROUP BY P.PRODUCT_NAME;
Solution:
To get this result we have to group by on year and the find the count. The SQL query for this question is
SELECT YEAR,
COUNT(1) NUM_PRODUCTS
FROM SALES
GROUP BY YEAR;
YEAR NUM_PRODUCTS
------------------
2010 3
2011 3
2012 3
SQL Queries Interview Questions - Oracle Part 4
1. Consider the following friends table as the source
Name, Friend_Name
-----------------
sam, ram
sam, vamsi
vamsi, ram
vamsi, jhon
ram, vijay
ram, anand
Here ram and vamsi are friends of sam; ram and jhon are friends of vamsi and so on. Now write a query to
find friends of friends of sam. For sam; ram,jhon,vijay and anand are friends of friends. The output should
look as
Name, Friend_of_Firend
----------------------
sam, ram
sam, jhon
sam, vijay
sam, anand
Solution:
SELECT f1.name,
f2.friend_name as friend_of_friend
FROM friends f1,
friends f2
WHERE f1.name = 'sam'
AND f1.friend_name = f2.name;
2. This is an extension to the problem 1. In the output, you can see ram is displayed as friends of friends.
This is because, ram is mutual friend of sam and vamsi. Now extend the above query to exclude mutual
friends. The outuput should look as
Name, Friend_of_Friend
----------------------
sam, jhon
sam, vijay
sam, anand
Solution:
SELECT f1.name,
f2.friend_name as friend_of_friend
FROM friends f1,
friends f2
WHERE f1.name = 'sam'
AND f1.friend_name = f2.name
AND NOT EXISTS
(SELECT 1 FROM friends f3
WHERE f3.name = f1.name
AND f3.friend_name = f2.friend_name);
3. Write a query to get the top 5 products based on the quantity sold without using the row_number
analytical function? The source data looks as
Solution:
SELECT products,
quantity_sold,
year
FROM
(
SELECT products,
quantity_sold,
year
FROM t
ORDER BY quantity_sold DESC
)A
WHERE rownum <= 5;
4. This is an extension to the problem 3. Write a query to produce the same output using row_number
analytical function?
Solution:
SELECT products,
quantity_sold,
year
FROM
(
SELECT products,
quantity_sold,
year,
row_number() OVER(
ORDER BY quantity_sold DESC) r
from t
)A
WHERE r <= 5;
5. This is an extension to the problem 3. write a query to get the top 5 products in each year based on the
quantity sold?
Solution:
SELECT products,
quantity_sold,
year
FROM
(
SELECT products,
quantity_sold,
year,
row_number() OVER(
PARTITION BY year
ORDER BY quantity_sold DESC) r
from t
)A
WHERE r <= 5;
SQL Query Interview Questions - Part 5
Write SQL queries for the below interview questions:
PRODUCT_ID PRODUCT_NAME
-----------------------
100 Nokia
200 IPhone
300 Samsung
400 LG
500 BlackBerry
600 Motorola
The requirements for loading the target table are:
Do not select the products which are already loaded in the target table with in the last 30 days.
Target table should always contain the products loaded in 30 days. It should not contain the products
which are loaded prior to 30 days.
Solution:
First we will create a target table. The target table will have an additional column INSERT_DATE to know when a product is loaded into the target table.
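A hedged sketch of this load, assuming the target is named TGT_PRODUCTS and mirrors PRODUCTS plus the INSERT_DATE column (names and types are assumptions):
CREATE TABLE tgt_products
( product_id NUMBER,
product_name VARCHAR2(30),
insert_date DATE );
-- Keep only products loaded within the last 30 days
DELETE FROM tgt_products WHERE insert_date <= SYSDATE - 30;
-- Insert only products not already loaded within the last 30 days
INSERT INTO tgt_products (product_id, product_name, insert_date)
SELECT p.product_id, p.product_name, SYSDATE
FROM products p
WHERE NOT EXISTS
(SELECT 1 FROM tgt_products t
WHERE t.product_id = p.product_id);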
CONTENT_ID CONTENT_TYPE
-----------------------
1 MOVIE
2 MOVIE
3 AUDIO
4 AUDIO
5 MAGAZINE
6 MAGAZINE
The requirements to load the target table are:
Load only one content type at a time into the target table.
The target table should always contain only one contain type.
The loading of content types should follow round-robin style. First MOVIE, second AUDIO, Third
MAGAZINE and again fourth Movie.
Solution:
First we will create a lookup table where we mention the priorities for the content types. The lookup table
“Create Statement” and data is shown below.
The second step is to truncate the target table before loading the data
UPDATE CONTENTS_LKP
SET LOAD_FLAG = 0
WHERE LOAD_FLAG = 1;
UPDATE CONTENTS_LKP
SET LOAD_FLAG = 1
WHERE PRIORITY = (
SELECT DECODE( PRIORITY,(SELECT MAX(PRIORITY) FROM CONTENTS_LKP) ,1 , PRIORITY+1)
FROM CONTENTS_LKP
WHERE CONTENT_TYPE = (SELECT DISTINCT CONTENT_TYPE FROM TGT_CONTENTS)
);
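Building on the flag logic above, a hedged sketch of the insert that would follow (the source table name CONTENTS is an assumption); it loads only the content type whose LOAD_FLAG was just set to 1:
INSERT INTO tgt_contents (content_id, content_type)
SELECT c.content_id, c.content_type
FROM contents c
WHERE c.content_type =
(SELECT content_type FROM contents_lkp WHERE load_flag = 1);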
1. About your project, your role and responsibilities.?
2. Which all transformation use cache and which all use index and data cache.?
3. What are indexes and its use.?
4. What are views, and what is the difference between a view and a materialized view?
5. What is cartesian product.?
6. Difference between stop and abort.?
7. What is SCD, explain SCD type 2 in full detail.?
8. Difference between Bulk and Normal load.?
9. What is mapplet and its use.?
10. Difference between Reusable and Shortcut.?
11. List of all active and passive t/n.?
12. Project data flow.?
13. Mapping and Session level variable.?
14. How to get record count of source, target and rejected rows in one flat file.
15. How is SQL sort diff from Informatica Sorter.
PROJECT EXPLANATION
This will be the project implementation summary for any of the BI solution
Q) Does anyone have an idea how we can send an alert/mail when Informatica throughput falls below a certain threshold?
We need to write a shell script to get it from the session log and then act on it.
You can use %t in the post-session email. One more option is to create a new mapping which compares the throughput with the metadata throughput and checks the limit, then sets a flag accordingly and sends an email alert if the flag is set to Y.
SELECT SESSION_NAME,THRUPUT
FROM
OPB_SWIDGINST_LOG
,REP_SESS_LOG
WHERE
OPB_SWIDGINST_LOG.SESSION_ID=REP_SESS_LOG.SESSION_ID
AND REP_SESS_LOG.SESSION_NAME= 's_session_name';
Check if it helps; if you need more details anywhere, please feel free to ask.
%t : Source and target table details, including read throughput in bytes per second and write throughput in
rows per second. The Integration Service includes all information displayed in the session detail dialog
box.
1. Explain your Project?
2. What are your Daily routines?
3. How many mapping have you created all together in your project?
4. In which account does your Project Fall?
5. What is your Reporting Hierarchy?
6. How many Complex Mapping’s have you created?
7. Could you please describe the situation for which you developed that complex mapping?
8. What is your Involvement in Performance tuning of your Project?
9. What is the Schema of your Project?
10. And why did you opt for that particular schema?
11. What are your Roles in this project?
12. Can I have one situation which you have adopted by which performance has improved dramatically?
13. Were you involved in more than two projects simultaneously?
14. Do you have any experience in the Production support?
15. What kinds of Testing have you done on your Project (Unit or Integration or System or UAT)?
16. And what enhancements were done after testing?
17. How many Dimension Table are there in your Project and how are they linked to the fact table?
18. How do we do the Fact Load?
19. How did you implement CDC in your project?
20. What does your File-to-Load mapping look like?
21. What does your Load-to-Stage mapping look like?
22. What does your Stage-to-ODS mapping look like?
23. What is the size of your Data warehouse?
24. What is your Daily feed size and weekly feed size?
25. Which Approach (Top down or Bottom Up) was used in building your project?
26. How do you access your source’s (are they Flat files or Relational)?
27. Have you developed any Stored Procedure or triggers in this project?
28. How did you use them and in which situation?
29. Did your Project go live?
30. What are the issues that you have faced while moving your project from the Test Environment to the
Production Environment?
31. What is the biggest Challenge that you encountered in this project?
32. What is the scheduler tool you have used in this project?
33. How did you schedule jobs using it?