ETL Testing Interview Qs

Tell me about yourself
Professional Experience: My name is Vasu. I have more than four years of experience in data warehouse testing, including ETL development. I currently work at Sierra Atlantic Software Services Ltd (a Hitachi Consulting group company). I am strong at testing data warehouses built on different ETL platforms such as Informatica and SSIS. I have good knowledge of the complete SDLC and have been involved in three projects so far. I have extensive experience writing SQL queries to retrieve data for testing source and target databases such as Oracle and SQL Server, and in testing Informatica mappings, sessions and workflows as well as SSIS packages, including data flows and control flows. I also have very good knowledge of testing reports generated by SSRS and Business Objects.
How to find/delete duplicate rows

1) To find duplicate rows (every column combination that occurs more than once, with its count):
SQL> Select empno, ename, mgr, job, hiredate, sal, comm, deptno, count(*) from emp group by empno, ename, mgr, job, hiredate, sal, comm, deptno having count(*) > 1;
To list the extra copies themselves (every row except the first one of each group):
SQL> Select * from emp where rowid not in (select min(rowid) from emp group by empno, ename, mgr, job, hiredate, sal, comm, deptno);
2) To delete duplicate rows (keeping one row per group):
SQL> Delete from emp where rowid not in (select min(rowid) from emp group by empno, ename, mgr, job, hiredate, sal, comm, deptno);
3) To find the count of duplicate rows per name:
SQL> Select ename, count(*) from emp group by ename having count(*) > 1;
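The find-and-delete pattern above can be tried without an Oracle instance. The sketch below is an assumption-laden stand-in: it uses Python's built-in sqlite3 module (SQLite also exposes a rowid) and a cut-down, made-up emp table with three columns.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle emp table (sample data is invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, ename TEXT, sal INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, "SMITH", 800), (1, "SMITH", 800), (2, "ALLEN", 1600), (1, "SMITH", 800)],
)

# Find: column combinations that occur more than once, with their count.
duplicates = conn.execute(
    "SELECT empno, ename, sal, COUNT(*) FROM emp "
    "GROUP BY empno, ename, sal HAVING COUNT(*) > 1"
).fetchall()

# Delete: keep the row with the lowest rowid in each group, remove the rest.
conn.execute(
    "DELETE FROM emp WHERE rowid NOT IN "
    "(SELECT MIN(rowid) FROM emp GROUP BY empno, ename, sal)"
)
remaining = conn.execute(
    "SELECT empno, ename, sal FROM emp ORDER BY empno"
).fetchall()
```

Using NOT IN with MIN(rowid) matters here: a plain IN with MAX(rowid) would also delete rows that have no duplicate at all.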
What is TCL and DCL Commands

TCL (Transaction Control Language) -- commit, rollback, savepoint
DCL (Data Control Language) -- grant, revoke
What is a savepoint?
Savepoints subdivide a transaction into smaller parts, which makes it possible to roll back part of a transaction instead of all of it. Older Oracle releases allowed a maximum of five active savepoints per session by default; current releases have no such limit.
What is a cursor? Where do we use parameters? How do we declare it? What is %ROWTYPE and its syntax?
A cursor is a pointer to a memory location called the context area, which holds the information needed to process a statement: the number of rows processed, a pointer to the parsed representation of the statement, and the active set (the set of rows returned by the query).
A cursor has two parts, a header and a body. The header includes the cursor name, any parameters and the type of data being returned. The body is the select statement. Parameters let the same cursor be opened with a different value each time.
Ex: Cursor c(dno in number) return dept%rowtype is select * from dept where deptno = dno;
In the above:
Header: cursor c(dno in number) return dept%rowtype
Body: select * from dept where deptno = dno
%ROWTYPE declares a record with the same fields and datatypes as a row of a table, view or cursor. Ex: v_dept dept%rowtype;
What is Sub Query?

A query nested inside another query is called a subquery. The statement that contains the subquery is called the parent query. Subqueries are used when the rows to retrieve depend on values that must first be looked up, often from the same table.
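As a small illustration of a subquery whose value comes from the same table, the sketch below lists employees earning more than the average salary. It assumes Python's built-in sqlite3 module and an invented three-row emp table, not any project data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, sal INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("SMITH", 800), ("ALLEN", 1600), ("KING", 5000)],
)

# Parent query: select employees; subquery: compute the average salary once.
above_average = conn.execute(
    "SELECT ename FROM emp "
    "WHERE sal > (SELECT AVG(sal) FROM emp) "
    "ORDER BY ename"
).fetchall()
```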
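Where can you use the MINUS operation in your project?

MINUS returns the rows of the first query that are not present in the second query; both queries must return the same number of columns with compatible data types. In ETL testing it is typically used to compare a source table against its target. Ex: SQL> select * from student1 minus select * from student2;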
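The same comparison can be sketched outside Oracle. SQLite spells the operator EXCEPT rather than MINUS, so the example below is an approximation using Python's built-in sqlite3 module; the student1 and student2 contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student1 (no INTEGER, name TEXT)")
conn.execute("CREATE TABLE student2 (no INTEGER, name TEXT)")
conn.executemany("INSERT INTO student1 VALUES (?, ?)", [(1, "A"), (2, "B"), (3, "C")])
conn.executemany("INSERT INTO student2 VALUES (?, ?)", [(1, "A"), (3, "C")])

# Rows in student1 that are missing from student2 (Oracle: MINUS, SQLite: EXCEPT).
missing_in_target = conn.execute(
    "SELECT * FROM student1 EXCEPT SELECT * FROM student2"
).fetchall()

# The reverse direction: rows in student2 that student1 does not have.
extra_in_target = conn.execute(
    "SELECT * FROM student2 EXCEPT SELECT * FROM student1"
).fetchall()
```

Running the check in both directions is what shows that two tables match: an empty result one way only proves that one table is a subset of the other.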
What is normalization / normal form, and where and why do we use it in a database?
Normalization is the process of organizing data in a database efficiently. It reduces redundancy (for example, storing the same data in more than one table) and ensures that data dependencies make sense (only related data is stored in a table). The normal forms (abbreviated NF) of relational database theory provide criteria for determining how vulnerable a table is to logical inconsistencies and anomalies. The higher the normal form a table satisfies, the less vulnerable it is to inconsistencies and anomalies.
First Normal Form (1NF): First normal form sets the very basic rules for an organized database:
a) Eliminate duplicate (repeating) columns from the same table.
b) Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).
Second Normal Form (2NF):
a) Meet all the requirements of the first normal form.
b) Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
c) Create relationships between these new tables and their predecessors through the use of foreign keys.
d) If a table has a composite key, every non-key attribute must depend on the whole key, not on part of it.
Third Normal Form (3NF):
a) Meet all the requirements of the first and second normal forms.
b) All fields must depend directly on the primary key, and not on any other field in the table.
c) Any field that depends on a non-key field, which in turn depends on the primary key (i.e. a transitive dependency), is moved out to a separate table.
Fourth Normal Form (4NF): Boyce-Codd Normal Form (BCNF) is a stricter version of 3NF (sometimes called 3.5NF), not another name for 4NF. In 4NF and the higher normal forms, data is separated into more tables so that each table stays simple.
a) Meet all the requirements of the third normal form (and BCNF).
b) A relation is in 4NF if every non-trivial multi-valued dependency in it is determined by a superkey.
What are the differences between DELETE, TRUNCATE and DROP, and their syntaxes?
Delete: The DELETE command removes rows from a table. A WHERE clause can be used to remove only some rows; if no WHERE condition is specified, all rows are removed. After a DELETE you need to COMMIT to make the change permanent or ROLLBACK to undo it. This operation causes all DELETE triggers on the table to fire. Example: DELETE FROM emp WHERE job = 'CLERK';
Truncate: TRUNCATE removes all rows from a table. In Oracle the operation cannot be rolled back and no triggers are fired. As such, TRUNCATE is faster and uses less undo space than DELETE. Example: TRUNCATE TABLE emp;
Drop: The DROP command removes a table from the database. All of the table's rows, indexes and privileges are also removed. No DML triggers (insert, update and delete) are fired. The operation cannot be rolled back. Example: DROP TABLE emp;
Note: DROP and TRUNCATE are DDL commands, whereas DELETE is a DML command. In Oracle, DDL commits implicitly, so DELETE operations can be rolled back (undone), while DROP and TRUNCATE operations cannot.
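The DML side of this note, that a DELETE stays undoable until it is committed, can be sketched as follows. This is only an approximation using Python's built-in sqlite3 module and an invented emp table; SQLite has no TRUNCATE statement, so the DDL side is not shown.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so the transaction below is opened and closed explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE emp (ename TEXT, job TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("SMITH", "CLERK"), ("ALLEN", "SALESMAN"), ("ADAMS", "CLERK")],
)

def count_rows():
    """Return the current number of rows in emp."""
    return conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]

conn.execute("BEGIN")
conn.execute("DELETE FROM emp WHERE job = 'CLERK'")
count_inside_transaction = count_rows()   # clerks are gone inside the transaction

conn.execute("ROLLBACK")
count_after_rollback = count_rows()       # the rollback restores them
```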
What is the difference between a function and a procedure, and their syntaxes?
a) A FUNCTION always returns a value using the RETURN statement. A PROCEDURE may return one or more values through parameters, or may not return anything at all.
b) A function is mainly used where a value must be returned, whereas a procedure may return no value, one value, or more than one value using OUT parameters.
c) A function can be called from SQL statements, whereas a procedure cannot be called from SQL statements.
d) Functions are normally used for computations, whereas procedures are normally used for executing business logic.
e) You can have DML (insert, update, delete) statements in a function, but you cannot call such a function from a SQL query.
f) A function returns one value only. A procedure can return multiple values (the often-quoted maximum of 1024 parameters is version-specific).
g) In SQL Server, a stored procedure supports deferred name resolution: a procedure that references tables such as tabl1 and tabl2 that do not yet exist can still be created, and an error is thrown only at run time. A function does not support deferred name resolution.
h) In SQL Server, a stored procedure always returns an integer value (zero by default), whereas a function's return type can be scalar or table-valued.
i) A stored procedure has a precompiled execution plan, whereas functions do not.
j) A procedure may modify an object, whereas a function can only return a value.
k) The RETURN statement immediately completes the execution of a subprogram and returns control to the caller.
Syntaxes:
Function (SQL Server, CLR form):
CREATE FUNCTION [ schema_name. ] function_name
( { @parameter_name [AS] [ type_schema_name. ] parameter_data_type [ = default ] } [ ,...n ] )
RETURNS { return_data_type }
[ WITH <clr_function_option> [ ,...n ] ]
[ AS ] EXTERNAL NAME <method_specifier>
[;]
Procedure (Oracle PL/SQL):
CREATE [OR REPLACE] PROCEDURE procedure_name [ (parameter [,parameter]) ]
IS
[declaration_section]
BEGIN
executable_section
[EXCEPTION exception_section]
END [procedure_name];
How to write a stored procedure (SP), and its syntax?
Before creating a stored procedure, consider that:
a) CREATE PROCEDURE statements cannot be combined with other SQL statements in a single batch.
b) To create procedures, you must have CREATE PROCEDURE permission in the database and ALTER permission on the schema in which the procedure is being created. For CLR stored procedures, you must either own the assembly referenced in <method_specifier>, or have REFERENCES permission on that assembly.
c) Stored procedures are schema-scoped objects, and their names must follow the rules for identifiers.
d) You can create a stored procedure only in the current database.
When creating a stored procedure, you should specify:
a) Any input parameters and output parameters to the calling procedure or batch.
b) The programming statements that perform operations in the database, including calling other procedures.
c) The status value returned to the calling procedure or batch to indicate success or failure (and the reason for failure).
d) Any error handling statements needed to catch and handle potential errors. Error handling functions such as ERROR_LINE and ERROR_PROCEDURE can be specified in the stored procedure.
Procedure:
CREATE [OR REPLACE] PROCEDURE procedure_name [ (parameter [,parameter]) ]
IS
[declaration_section]
BEGIN
executable_section
[EXCEPTION exception_section]
END [procedure_name];
Explain the join types with syntaxes?

Type of joins: The JOIN keyword is used in an SQL statement to query data from two or more tables, based on a relationship between certain columns in these tables.
a) INNER JOIN: Returns rows when there is at least one match in both tables.
Syntax: SELECT column_name(s) FROM table_name1 INNER JOIN table_name2 ON table_name1.column_name = table_name2.column_name
b) LEFT JOIN: Returns all rows from the left table, even if there are no matches in the right table.
Syntax: SELECT column_name(s) FROM table_name1 LEFT JOIN table_name2 ON table_name1.column_name = table_name2.column_name
c) RIGHT JOIN: Returns all rows from the right table, even if there are no matches in the left table.
d) FULL JOIN: Returns all rows from both tables, matched where possible; unmatched rows from either side are returned with NULLs for the other side.
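To make the difference between the inner and left join concrete, here is a minimal sketch using Python's built-in sqlite3 module. The emp and dept rows are invented, and KING is deliberately given a department number that does not exist in dept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (deptno INTEGER, dname TEXT)")
conn.execute("CREATE TABLE emp (ename TEXT, deptno INTEGER)")
conn.executemany("INSERT INTO dept VALUES (?, ?)", [(20, "RESEARCH"), (30, "SALES")])
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("SMITH", 20), ("ALLEN", 30), ("KING", 10)],  # dept 10 has no row in dept
)

# INNER JOIN: only employees whose deptno matches a dept row.
inner_rows = conn.execute(
    "SELECT e.ename, d.dname FROM emp e "
    "INNER JOIN dept d ON e.deptno = d.deptno ORDER BY e.ename"
).fetchall()

# LEFT JOIN: every employee; dname is NULL (None) when there is no match.
left_rows = conn.execute(
    "SELECT e.ename, d.dname FROM emp e "
    "LEFT JOIN dept d ON e.deptno = d.deptno ORDER BY e.ename"
).fetchall()
```

RIGHT and FULL joins are left out of the sketch because older SQLite builds do not support them; swapping the table order in a LEFT JOIN gives the RIGHT JOIN result.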
What are UNION and UNION ALL? List the differences with syntaxes.
INTERSECT - returns all distinct rows selected by both queries.
MINUS - returns all distinct rows selected by the first query but not by the second.
UNION - returns all distinct rows selected by either query.
UNION ALL - returns all rows selected by either query, including all duplicates.
Syntax (Union): Select field1, field2, ... field_n from tables UNION select field1, field2, ... field_n from tables;
Syntax (Union All): Select field1, field2, ... field_n from tables UNION ALL select field1, field2, ... field_n from tables;
The difference: UNION removes duplicate rows from the combined result, whereas UNION ALL keeps them, so UNION ALL is usually faster because no duplicate elimination is needed.
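A minimal sketch of the difference, assuming Python's built-in sqlite3 module and two invented single-column tables t1 and t2 that share the value 2:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (n INTEGER)")
conn.execute("CREATE TABLE t2 (n INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO t2 VALUES (?)", [(2,), (3,)])

# UNION: the shared value 2 appears once.
union_rows = [r[0] for r in conn.execute(
    "SELECT n FROM t1 UNION SELECT n FROM t2 ORDER BY n"
)]

# UNION ALL: every row from both queries, duplicates included.
union_all_rows = [r[0] for r in conn.execute(
    "SELECT n FROM t1 UNION ALL SELECT n FROM t2 ORDER BY n"
)]
```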
Explain the star schema and snowflake schema?
Star Schema:
o Star schema is a relational database schema for representing multidimensional data.
o It is the simplest form of data warehouse schema and contains one or more dimension tables and fact tables.
o The center of the star consists of a large fact table that points towards the dimension tables.
o The advantages of a star schema are better slicing (query) performance and easy understanding of the data.
o Every dimension has a primary key.
o A dimension table does not have any parent table.
o Hierarchies for the dimensions are stored in the dimension table itself.
o The fact table is in normalized format and the dimension tables are in denormalized format.
o Star schema contains highly denormalized data, and query performance is very high.
Disadvantages: Occupies more space; highly denormalized.
Snowflake Schema:
o A snowflake schema is a star schema structure normalized through the use of outrigger tables, i.e. dimension table hierarchies are broken into simpler tables.
o The advantage of a snowflake schema is that it occupies less space.
o A dimension table has one or more parent tables.
o Hierarchies are broken into separate tables. These hierarchies help to drill down the data from the topmost level to the lowermost level.
o Both dimension and fact tables are in normalized format.
o It is also known as an extended star schema.
o It requires more dimension tables and more foreign keys, which reduces query performance, but it normalizes the records.
o Snowflake schema contains partially normalized data, and query performance is lower because more joins are needed.
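A typical star schema query joins the fact table to a dimension on its key and aggregates a measure. The sketch below assumes Python's built-in sqlite3 module; the table names fact_sales and dim_product and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Dimension table: one row per product, with its hierarchy (category) stored inline.
conn.execute(
    "CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT)"
)
# Fact table: measures plus a key pointing at the dimension.
conn.execute("CREATE TABLE fact_sales (product_key INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?)",
    [(1, "Books"), (2, "Books"), (3, "Toys")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?)",
    [(1, 10), (2, 15), (3, 7), (3, 3)],
)

# Total sales per category: a single join from fact to dimension.
sales_by_category = conn.execute(
    "SELECT d.category, SUM(f.amount) "
    "FROM fact_sales f JOIN dim_product d ON f.product_key = d.product_key "
    "GROUP BY d.category ORDER BY d.category"
).fetchall()
```

In a snowflake design, category would sit in its own table, so the same report would need one more join.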
What are the differences between control flow and data flow in SSIS?
Control Flow:
a) Process is the key: precedence constraints control the project flow based on task completion, success or failure.
b) Task 1 needs to complete before task 2 begins.
c) The smallest unit of the control flow is a task.
d) Control flow does not move data from task to task.
e) Tasks run in series if connected with precedence constraints, or in parallel otherwise.
f) Package control flow is made up of containers and tasks connected with precedence constraints to control package flow.
Data Flow:
a) Streaming.
b) Unlike control flow, multiple components can process data at the same time.
c) The smallest unit of the data flow is a component.
d) Data flows move data, but are also tasks in the control flow; as such, their success or failure affects how the control flow operates.
e) Data is moved and manipulated through transformations.
f) Data is passed between each component in the data flow.
g) Data flow is made up of source(s), transformations and destinations.
How to link a task using QC? How to call a test case using QC?
To link a defect to a test:
Step 1: Display the Test Plan module. Click the Test Plan button on the sidebar.
Step 2: Select the test (xxx). In the test plan tree, expand the folder that holds the test and select the test. Click the Linked Defects tab.
Step 3: Add a linked defect. In the Linked Defects tab, click the Link Existing Defect arrow and choose Select. The Defects to Link dialog box opens.
To call a test case using parameters:
When you run the test, the test steps include the steps from the called test as part of the test. The test that you call is a template test: a reusable test that can be called by other tests. A template test can include parameters.
Step 1: Display the Design Steps tab for the test. In the test plan tree, expand the Cruises and Cruise Reservation subject folders, and select the name of the test. Click the Design Steps tab.
Step 2: Select the test with parameters that you want to call. Click the Call to Test button. The Select a Test dialog box opens. In the Find box, type Connect, and click the Find button. The Connect And Sign-On test is highlighted. Click OK. The Parameters of Test dialog box opens and displays the parameters contained in the called test.
Step 3: Assign values to the parameters. Click OK. The Call Connect And Sign-On step is added.
Step 4: Reorder the steps. Position the mouse pointer on the gray sidebar to the left of the Call Connect And Sign-On step. The mouse pointer changes to an arrow. Click and drag the step to the top row.
How many transformations are there in Informatica and list out all? What is the difference between Active and Passive Transformations and list out the examples?
What is a view? What is an index? What are primary key and foreign key? Where can we use all of these? Explain with syntaxes.
View: A view is a database object that is a logical representation of a table. It is derived from a table but has no storage of its own and can often be used in the same manner as a table. A view takes the output of a query and treats it as a table; therefore a view can be thought of as a stored query or a virtual table.
Why views:
a) To provide an additional level of security by restricting access to a predetermined set of rows and/or columns of a table (for example, protecting some columns from other users).
b) To hide data complexity, such as a complex query or calculation.
c) To simplify commands for the user.
Ex:
SQL> Create view dept_v as select * from dept with read only;
SQL> Create view dept_v as select deptno, sum(sal) t_sal from emp group by deptno;
SQL> Create view stud as select rownum no, name, marks from student;
SQL> Create view student as select * from student1 union select * from student2;
SQL> Create view stud as select distinct no, name from student;
Index: An index is typically a listing of keywords accompanied by the location of information on a subject. We can create indexes explicitly to speed up SQL statement execution on a table. The index points directly to the location of the rows containing the value.
Why indexes? Indexes are most useful on larger tables, on columns that are likely to appear in WHERE clauses as simple equality conditions.
Ex:
To create: CREATE INDEX index_name ON table_name (column_name)
To drop: DROP INDEX index_name ON table_name (SQL Server syntax; Oracle uses DROP INDEX index_name)
Primary key:
a) This is used to avoid duplicates and nulls. It works as a combination of UNIQUE and NOT NULL.
b) A primary key is always attached to the parent table.
c) We can add this constraint at three levels (column level, table level and alter level).
Foreign key:
a) This references the primary key column of the parent table; the foreign key column itself allows duplicates.
b) A foreign key is always attached to the child table.
c) We can add this constraint at table and alter levels; Oracle also accepts it at column level, where the REFERENCES clause is written without the FOREIGN KEY keywords.
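The parent/child behaviour of the two constraints can be sketched with Python's built-in sqlite3 module. This is an assumption-based stand-in for Oracle: SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, and the dept/emp rows and the helper name rejected are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: FK checks are off by default

# Parent table with a column-level primary key.
conn.execute("CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT)")
# Child table with a column-level foreign key referencing the parent.
conn.execute(
    "CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, "
    "deptno INTEGER REFERENCES dept(deptno))"
)
conn.execute("INSERT INTO dept VALUES (10, 'ACCOUNTING')")
conn.execute("INSERT INTO emp VALUES (1, 'KING', 10)")
conn.execute("INSERT INTO emp VALUES (2, 'CLARK', 10)")  # duplicates are fine in the FK column

def rejected(sql):
    """Hypothetical helper: True if the statement violates a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

duplicate_pk_rejected = rejected("INSERT INTO dept VALUES (10, 'SALES')")
orphan_fk_rejected = rejected("INSERT INTO emp VALUES (3, 'FORD', 99)")
```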
SDLC, STLC (bug life cycle)?
SDLC (Software Development Life Cycle):
a) Requirement Analysis
b) Design Analysis
c) Coding
d) Test Analysis
e) Implementation
f) Maintenance
STLC:
1) Requirements Stage: Usually a Test Lead is involved in the 'Requirement Gathering' stage along with the BAs, PMs, etc.
2) Test Plan: Once the requirements are frozen, a test plan is made covering the testing approaches to be followed, testing methodologies, etc.
3) Test Design: The design identifies the various modules and the paths connecting the modules.
4) Test Cases Preparation: Based on the above test design, test cases are written for positive and negative scenarios.
5) Test Execution & Bug Reporting.
6) Release to Production.
What is PL/SQL? What is the difference between SQL and PL/SQL?
a) SQL is a data-oriented language for selecting and manipulating sets of data. PL/SQL is a procedural language for creating applications.
b) PL/SQL can be the application language, just as Java or PHP can.
c) The code that makes your program function is PL/SQL.
d) The code that manipulates the data is SQL DML.
e) PL/SQL may call SQL to perform data manipulation.
How many tables have you joined in your project so far? How do you join them?
Ans: 2-3 per project. How to join tables: Inner Join
How to back up your database using SQL? How to insert data from one table into another table?
There are two different ways to insert data from one table into another table.
Method 1: INSERT INTO SELECT
This method is used when the table already exists in the database and data is to be inserted into it from another table. If the columns listed in the insert clause and the select clause are the same, they do not have to be listed. I always list them for readability and scalability.
----Create TestTable
CREATE TABLE TestTable (FirstName VARCHAR(100), LastName VARCHAR(100))
----INSERT INTO TestTable using SELECT
INSERT INTO TestTable (FirstName, LastName)
SELECT FirstName, LastName
FROM Person.Contact
WHERE EmailPromotion = 2
----Verify that Data in TestTable
SELECT FirstName, LastName
FROM TestTable
----Clean Up Database
DROP TABLE TestTable
GO
Method 2: SELECT INTO
This method is used when the table does not exist yet and has to be created while the data from another table is inserted into it. The new table is created with the same data types as the selected columns.
----Create new table and insert into table using SELECT INTO
SELECT FirstName, LastName
INTO TestTable
FROM Person.Contact
WHERE EmailPromotion = 2
----Verify that Data in TestTable
SELECT FirstName, LastName
FROM TestTable
----Clean Up Database
DROP TABLE TestTable
GO
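Both methods can be tried in a portable form with Python's built-in sqlite3 module. The sketch rests on a few assumptions: SQLite has no SELECT ... INTO, so CREATE TABLE ... AS SELECT stands in for Method 2, and the contact table with its three rows is an invented substitute for Person.Contact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contact (FirstName TEXT, LastName TEXT, EmailPromotion INTEGER)"
)
conn.executemany(
    "INSERT INTO contact VALUES (?, ?, ?)",
    [("Ann", "Lee", 2), ("Bob", "Ray", 1), ("Cy", "Fox", 2)],
)

# Method 1: the target table already exists, so use INSERT INTO ... SELECT.
conn.execute("CREATE TABLE TestTable (FirstName TEXT, LastName TEXT)")
conn.execute(
    "INSERT INTO TestTable (FirstName, LastName) "
    "SELECT FirstName, LastName FROM contact WHERE EmailPromotion = 2"
)
method1_rows = conn.execute(
    "SELECT FirstName, LastName FROM TestTable ORDER BY FirstName"
).fetchall()

# Method 2: create the target from the query result
# (SQLite equivalent of SQL Server's SELECT ... INTO).
conn.execute(
    "CREATE TABLE TestTable2 AS "
    "SELECT FirstName, LastName FROM contact WHERE EmailPromotion = 2"
)
method2_rows = conn.execute(
    "SELECT FirstName, LastName FROM TestTable2 ORDER BY FirstName"
).fetchall()
```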
Which job scheduling tool have you used? Autosys or Control-M?
Autosys is job scheduling software like Control-M and cron. With Autosys we can define the run time (day, date, week) and which script or program needs to be run. The main advantage of Autosys over crontab is that it also has a Java front end, so a person does not need to be a Unix expert to create or change a job in Autosys.
The attributes of an Autosys job: job name, script, owner, machine it has to run on, day, date, week, error log, output log, alarm.
What are the testing techniques you have used but not testing methods?
What are the differences between the V-model, agile model and waterfall model?
Waterfall vs. V-model:
1. In the waterfall model the tester's role starts only in the test phase, but in the V-model it starts in the requirement phase itself.
2. The waterfall model is a fixed process: you cannot make changes to the requirements or to any phase, but in the V-model you can make changes to the requirements.
3. The V-model is a simultaneous process, which is not the case for the waterfall model.
4. The waterfall model is used only when the requirements are fixed, but the V-model can be used for any type of requirement (including uncertain requirements).
Agile vs. V-model:
a) Agile aims to deliver a working product / application at the end of each iteration, called a sprint.
b) The V-model does not contain any iterative approach.
c) An agile iteration (sprint) contains every phase of software development, i.e. requirement understanding, design, coding and testing.
d) In agile, the developer, tester and customer work together on a piece of code for the application.
e) The V-model does not have this concept. In the V-model, developers work on designing and coding, and testers work on writing test cases and testing the product.
f) There is no concept of "working together" in the V-model.
g) Agile is more suitable for projects where requirements change rapidly.
h) The V-model is suitable where requirement changes are almost none.
i) Agile mandates customer interaction on a regular basis; the V-model does not.
j) Tasks (requirements) are never weighted in the V-model, but they are in agile, hence tasks are broken down into smaller chunks.
k) In agile we can change direction at will, meaning backlog items can be postponed or brought forward; the V-model has no concept of looking back or ahead and changing direction.
What is a package and where do we use it?

A package is a container for related objects. It has a specification and a body, each stored separately in the data dictionary.
PACKAGE SYNTAX
Create or replace package <package_name> is
-- package specification includes subprogram signatures, cursors and global or public variables.
End <package_name>;
Create or replace package body <package_name> is
-- package body includes the body for all the subprograms declared in the spec, private variables and cursors.
Begin
-- initialization section
Exception
-- exception handling section
End <package_name>;
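Difference between ROWID and ROWNUM?
ROWID is a pseudo column attached to each row of a table. It is 18 characters long; the block number and row number are among its components. ROWNUM is a pseudo column that numbers the rows in the order a query returns them, so it belongs to the result set rather than to the stored row.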
