
Data warehousing

1) What is a data warehouse?
Ans. A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data in support of management's decision-making process.

Subject-oriented: A data warehouse can be used to analyze a particular subject area. For example, "sales" can be a particular subject.
Integrated: A data warehouse integrates data from multiple data sources (transactional systems, OLTP). For example, source A and source B may identify a product in different ways, but in the data warehouse there is only a single way of identifying a product.
Time-variant: Historical data is kept in a data warehouse. For example, one can retrieve data from 3 months, 6 months, or 12 months ago, or even older data.
Non-volatile: Once data is in the data warehouse, it does not change. Data is only inserted, never updated, so historical data in a data warehouse should never be altered.

2) What is a data mart?
Ans. A data mart is a subset of a data warehouse.

3) What is Rapid Mart?
Ans. Rapid Mart uses pre-packaged data marts for SAP, Oracle, PeopleSoft, and Siebel applications to accelerate the delivery of analytical data.

4) What is the difference between OLTP and OLAP?
Ans. OLTP stands for Online Transaction Processing. It deals with day-to-day transactions and stores current data in a database that is normalized because updates are very frequent. Each transaction touches a small amount of data.

OLAP stands for Online Analytical Processing. It stores historical data drawn from OLTP sources in a database that is denormalized because frequent updates do not happen. It handles bulk amounts of data to support trend analysis and future predictions.

5) What are the types of dimensions?
Ans. The types of dimensions are: conformed dimension, junk dimension, degenerate dimension, and role-playing dimension.

6) What is a conformed dimension?
Ans. A dimension that is shared by all fact tables, or shared across different data marts, is called a conformed dimension. Example: the date dimension table connected to the sales facts is identical to the date dimension connected to the inventory facts.

7) What is a junk dimension?
Ans. It is a dimension table consisting of attributes that do not belong in the fact table or in any of the existing dimension tables. These attributes are usually text values or flags with yes/no or true/false indicators.
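
As a rough illustration of the junk dimension described above, the sketch below gathers three unrelated low-cardinality flags into one small dimension table and lets the fact table carry a single key to it. The table and column names (`dim_order_flags`, `fact_orders`, `is_gift`, `is_rush`, `payment`) are hypothetical and chosen only for this example.

```python
import sqlite3
from itertools import product

conn = sqlite3.connect(":memory:")

# Hypothetical junk dimension: one row per combination of unrelated flags.
conn.execute("""
    CREATE TABLE dim_order_flags (
        flags_key INTEGER PRIMARY KEY,
        is_gift   TEXT NOT NULL,   -- 'Y' or 'N'
        is_rush   TEXT NOT NULL,   -- 'Y' or 'N'
        payment   TEXT NOT NULL    -- 'CARD' or 'CASH'
    )
""")
combos = list(product(["Y", "N"], ["Y", "N"], ["CARD", "CASH"]))
conn.executemany(
    "INSERT INTO dim_order_flags (is_gift, is_rush, payment) VALUES (?, ?, ?)",
    combos,
)

# The fact table stores one small key instead of three flag columns.
conn.execute("""
    CREATE TABLE fact_orders (
        order_id  INTEGER PRIMARY KEY,
        flags_key INTEGER NOT NULL REFERENCES dim_order_flags(flags_key),
        amount    REAL NOT NULL
    )
""")


def lookup_flags_key(is_gift, is_rush, payment):
    """Return the junk-dimension key for one combination of flags."""
    row = conn.execute(
        "SELECT flags_key FROM dim_order_flags "
        "WHERE is_gift = ? AND is_rush = ? AND payment = ?",
        (is_gift, is_rush, payment),
    ).fetchone()
    return row[0]


orders = [
    (1, lookup_flags_key("Y", "N", "CARD"), 25.0),
    (2, lookup_flags_key("N", "Y", "CASH"), 40.0),
    (3, lookup_flags_key("N", "Y", "CARD"), 10.0),
]
conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", orders)

junk_row_count = conn.execute(
    "SELECT COUNT(*) FROM dim_order_flags"
).fetchone()[0]

# Filtering on a flag goes through the junk dimension.
rush_total = conn.execute("""
    SELECT SUM(f.amount)
    FROM fact_orders AS f
    JOIN dim_order_flags AS d ON d.flags_key = f.flags_key
    WHERE d.is_rush = 'Y'
""").fetchone()[0]
```

Pre-populating every combination keeps the lookup simple here; with many flags, a design might insert only the combinations that actually occur.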

8) What is a degenerate dimension?
Ans. It is a dimension key stored in the fact table itself, often as part of the fact table's primary key, that represents the unique identifier of the parent transaction. It has no attributes and does not join to an actual dimension table. Example: invoice or transaction number.

9) What are degenerated objects?
Ans. Objects created using SQL queries or stored procedures are called degenerated objects.

10) What is a role-playing dimension?
Ans. A dimension that is used in multiple roles within the same database is called a role-playing dimension. For example, a "Date" dimension can be used for "Date of Sale" as well as "Date of Delivery" or "Date of Hire".
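
One common way to express a role-playing dimension in SQL is to join the same dimension table twice under different aliases. The sketch below assumes a hypothetical `fact_sales` table with a sale date key and a delivery date key, both pointing at a single `dim_date` table; none of these names come from the original notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- A single physical date dimension.
    CREATE TABLE dim_date (
        date_key  INTEGER PRIMARY KEY,   -- e.g. 20240105
        full_date TEXT NOT NULL
    );

    -- The fact table references dim_date twice, once per role.
    CREATE TABLE fact_sales (
        sale_id           INTEGER PRIMARY KEY,
        sale_date_key     INTEGER NOT NULL REFERENCES dim_date(date_key),
        delivery_date_key INTEGER NOT NULL REFERENCES dim_date(date_key),
        amount            REAL NOT NULL
    );

    INSERT INTO dim_date VALUES (20240105, '2024-01-05');
    INSERT INTO dim_date VALUES (20240109, '2024-01-09');
    INSERT INTO fact_sales VALUES (1, 20240105, 20240109, 120.0);
""")

# Each alias plays one role: date of sale and date of delivery.
rows = conn.execute("""
    SELECT f.sale_id, sale_d.full_date, deliv_d.full_date
    FROM fact_sales AS f
    JOIN dim_date AS sale_d  ON sale_d.date_key  = f.sale_date_key
    JOIN dim_date AS deliv_d ON deliv_d.date_key = f.delivery_date_key
""").fetchall()
```

Some designs create one view per role over the shared table instead of aliasing in every query; the underlying idea is the same.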

11) What is a causal dimension?
Ans. A dimension that does not change the fundamental grain of the fact table is called a causal dimension. Example: Gender (Male, Female).

12) What are slowly changing dimensions?
Ans. Slowly changing dimensions are dimensions whose key value remains static but whose descriptive attributes may change over time. For example, a company's product ID or product line might remain the same while the description changes from time to time.

13) What are the types of slowly changing dimension (SCD)?
Ans. There are 3 types of SCD.

Type 1: The history of the dimension is not stored in the dimension table. The old value is overwritten, and no trace of the old record exists.

Type 2: A new record is added to the dimension table (for example, a customer dimension) whenever an attribute of the dimension changes, so full history is maintained.

Type 3: The original record is modified to reflect the change of the dimension attribute, typically keeping the previous value in an additional column. Partial history is maintained.

14) What is MOLAP? (Multidimensional)
Ans. In MOLAP, data is stored in a multidimensional cube. Data can be retrieved quickly, slicing and dicing operations are optimal, and complex calculations can be performed, but only a limited amount of data can be handled.
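
For the three slowly changing dimension types described above, the following is a minimal Python sketch that models a dimension table as a list of dictionaries. The column names (`surrogate_key`, `valid_from`, `valid_to`, `is_current`, `previous_city`) are common conventions assumed for illustration, not something the original notes specify.

```python
from datetime import date


def apply_type1(rows, customer_id, new_city):
    """Type 1: overwrite the attribute in place; no history is kept."""
    for row in rows:
        if row["customer_id"] == customer_id:
            row["city"] = new_city


def apply_type2(rows, customer_id, new_city, change_date):
    """Type 2: expire the current row and append a new current row.

    Assumes the customer already has at least one row in the table.
    """
    for row in rows:
        if row["customer_id"] == customer_id and row["is_current"]:
            row["is_current"] = False
            row["valid_to"] = change_date
    rows.append({
        "surrogate_key": max(r["surrogate_key"] for r in rows) + 1,
        "customer_id": customer_id,
        "city": new_city,
        "valid_from": change_date,
        "valid_to": None,
        "is_current": True,
    })


def apply_type3(rows, customer_id, new_city):
    """Type 3: keep only the previous value in an extra column."""
    for row in rows:
        if row["customer_id"] == customer_id:
            row["previous_city"] = row["city"]
            row["city"] = new_city


# The same change (customer 7 moves from Pune to Delhi) under each type.
t1 = [{"customer_id": 7, "city": "Pune"}]
apply_type1(t1, 7, "Delhi")

t2 = [{
    "surrogate_key": 1, "customer_id": 7, "city": "Pune",
    "valid_from": date(2020, 1, 1), "valid_to": None, "is_current": True,
}]
apply_type2(t2, 7, "Delhi", date(2024, 3, 1))

t3 = [{"customer_id": 7, "city": "Pune", "previous_city": None}]
apply_type3(t3, 7, "Delhi")
```

After these calls, `t1` holds only the new city, `t2` holds two rows (the expired one and the current one), and `t3` holds one row with both the current and the previous city.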

15) What is ROLAP? (Relational)
Ans. In ROLAP, data is stored in a relational database. It can handle large amounts of data, but it is limited by SQL functionality, and performance can be slow.

16) What is HOLAP? (Hybrid)
Ans. It is a combination of MOLAP and ROLAP. For summary-type information, it uses cube technology for fast performance. When detailed information is needed, it can drill through from the cube into the underlying relational data.
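
The HOLAP idea above can be sketched very loosely in Python: a pre-aggregated dictionary stands in for the cube, and an in-memory SQLite table stands in for the underlying relational store. This is a toy model under those assumptions, not how any particular OLAP product is implemented; `sales_detail`, `summary`, and `drill_through` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Relational detail data (the ROLAP side).
    CREATE TABLE sales_detail (
        region  TEXT NOT NULL,
        year    INTEGER NOT NULL,
        product TEXT NOT NULL,
        amount  REAL NOT NULL
    );
    INSERT INTO sales_detail VALUES ('East', 2023, 'Widget', 100.0);
    INSERT INTO sales_detail VALUES ('East', 2023, 'Gadget', 50.0);
    INSERT INTO sales_detail VALUES ('West', 2023, 'Widget', 70.0);
""")

# Pre-aggregated totals by (region, year) stand in for the cube (the MOLAP side).
cube = {}
for region, year, amount in conn.execute(
    "SELECT region, year, amount FROM sales_detail"
):
    cube[(region, year)] = cube.get((region, year), 0.0) + amount


def summary(region, year):
    """Answer a summary question from the pre-aggregated cube."""
    return cube.get((region, year), 0.0)


def drill_through(region, year):
    """Fetch the detail rows behind one cube cell from the relational store."""
    return conn.execute(
        "SELECT product, amount FROM sales_detail "
        "WHERE region = ? AND year = ? ORDER BY product",
        (region, year),
    ).fetchall()


east_total = summary("East", 2023)
east_detail = drill_through("East", 2023)
```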

17) What is dimensional modeling?
Ans. Dimensional modeling is a logical design technique that presents data in a standard framework to allow high-performance access. It is inherently dimensional and uses the relational model with some restrictions. Every dimensional model is composed of one table with a multipart key, called the fact table, and a set of smaller tables called dimension tables. Each dimension table has a single-part primary key that corresponds exactly to one of the components of the multipart key in the fact table.

18) What is a fact table?
Ans. It is a table that contains two types of columns: columns that hold numeric facts (measurements) and columns that hold foreign keys to dimension tables. A fact table contains either detail-level facts or facts that have been aggregated (summary tables).

19) What is a dimension table?
Ans. It is a table that contains further information about an attribute in a fact table. A foreign key of a fact table references the primary key of a dimension table in a many-to-one relationship.
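
The fact and dimension table definitions above can be made concrete with a small SQLite sketch. The schema below is an assumed example (`dim_product`, `dim_store`, `fact_sales` are hypothetical names): the fact table's multipart key is built from the dimension keys, and each fact row points to exactly one row in each dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: single-part primary keys plus descriptive attributes.
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL,
        category     TEXT NOT NULL
    );
    CREATE TABLE dim_store (
        store_key INTEGER PRIMARY KEY,
        city      TEXT NOT NULL
    );

    -- Fact table: foreign keys to the dimensions plus numeric measures.
    CREATE TABLE fact_sales (
        product_key  INTEGER NOT NULL REFERENCES dim_product(product_key),
        store_key    INTEGER NOT NULL REFERENCES dim_store(store_key),
        quantity     INTEGER NOT NULL,
        sales_amount REAL NOT NULL,
        PRIMARY KEY (product_key, store_key)   -- multipart key
    );

    INSERT INTO dim_product VALUES (1, 'Pen', 'Stationery');
    INSERT INTO dim_product VALUES (2, 'Lamp', 'Home');
    INSERT INTO dim_store VALUES (10, 'Lisbon');
    INSERT INTO dim_store VALUES (11, 'Porto');

    -- Many fact rows can reference the same dimension row (many-to-one).
    INSERT INTO fact_sales VALUES (1, 10, 5, 10.0);
    INSERT INTO fact_sales VALUES (1, 11, 3, 6.0);
    INSERT INTO fact_sales VALUES (2, 10, 1, 30.0);
""")

# Measures are aggregated; dimension attributes supply the grouping labels.
sales_by_category = conn.execute("""
    SELECT p.category, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
```

A real sales fact table would usually also carry a date key; it is left out here to keep the sketch short.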

20) What are the different measure (fact) types?
Ans.
Additive: Measures that can be added across all dimensions. Example: sales.

Semi-additive: Measures that can be added across some dimensions but not across others. Example: inventory level, which can be summed across stores or products but not across time.

Non-additive: Measures that cannot be added across any dimension. Example: an average.

21) What is an ODS (Operational Data Store)?
Ans. An operational data store (ODS) is an integrated database whose sources include legacy systems. It contains current or near-term data, so its data is not static. An ODS may contain 30 to 60 days of information, while a data warehouse typically contains years of data, and that data is static.
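
Returning to the additive, semi-additive, and non-additive measure types described above, a short Python sketch may help show why the distinction matters. The data and column names are made up for illustration, and a margin ratio is used here as the non-additive measure; it behaves like the average mentioned in the notes in that it cannot be summed.

```python
# Hypothetical daily snapshot rows for two stores over two days.
rows = [
    {"day": 1, "store": "A", "sales": 100.0, "cost": 60.0, "stock": 20},
    {"day": 1, "store": "B", "sales": 50.0, "cost": 40.0, "stock": 10},
    {"day": 2, "store": "A", "sales": 80.0, "cost": 40.0, "stock": 15},
    {"day": 2, "store": "B", "sales": 70.0, "cost": 35.0, "stock": 12},
]

# Additive: sales can be summed across every dimension (stores and days).
total_sales = sum(r["sales"] for r in rows)

# Semi-additive: stock can be summed across stores for a single day,
# but summing it across days would count the same items repeatedly.
last_day = max(r["day"] for r in rows)
closing_stock = sum(r["stock"] for r in rows if r["day"] == last_day)
stock_summed_over_time = sum(r["stock"] for r in rows)  # not a meaningful figure

# Non-additive: a ratio must be recomputed from its additive components.
total_cost = sum(r["cost"] for r in rows)
overall_margin = (total_sales - total_cost) / total_sales
summed_margins = sum((r["sales"] - r["cost"]) / r["sales"] for r in rows)  # meaningless
```

The usual practice that follows from this is to store the additive components (sales, cost) in the fact table and derive ratios at query time.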

22) What is a star schema?
Ans. A typical star schema has completely denormalized dimension tables and a fact table, and its Entity-Relationship (ER) diagram looks like a star. Dimension tables have primary keys, and the fact table has foreign keys referencing those primary keys. A star schema can have any number of dimension tables. In the ER diagram, the crow's feet at the end of the links connecting the tables indicate a many-to-one relationship between the fact table and each dimension table.

23) What is a snowflake schema?
Ans. In a snowflake schema, one or more dimension tables are partially or completely normalized. A snowflake schema can have any number of dimensions, and each dimension can have any number of levels.
[Figure: snowflake schema]
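
To illustrate the normalization that distinguishes a snowflake schema, the sketch below splits a product dimension into a product table and a separate category table, so a query by category needs one more join than it would against a single denormalized dimension. All table and column names are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized dimension: category attributes live in their own table.
    CREATE TABLE dim_category (
        category_key  INTEGER PRIMARY KEY,
        category_name TEXT NOT NULL
    );
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL,
        category_key INTEGER NOT NULL REFERENCES dim_category(category_key)
    );
    CREATE TABLE fact_sales (
        sale_id      INTEGER PRIMARY KEY,
        product_key  INTEGER NOT NULL REFERENCES dim_product(product_key),
        sales_amount REAL NOT NULL
    );

    INSERT INTO dim_category VALUES (1, 'Home');
    INSERT INTO dim_category VALUES (2, 'Stationery');
    INSERT INTO dim_product VALUES (1, 'Pen', 2);
    INSERT INTO dim_product VALUES (2, 'Notebook', 2);
    INSERT INTO dim_product VALUES (3, 'Lamp', 1);
    INSERT INTO fact_sales VALUES (1, 1, 10.0);
    INSERT INTO fact_sales VALUES (2, 2, 6.0);
    INSERT INTO fact_sales VALUES (3, 3, 30.0);
""")

# Fact -> product -> category: two joins to reach the category name.
sales_by_category = conn.execute("""
    SELECT c.category_name, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_product  AS p ON p.product_key  = f.product_key
    JOIN dim_category AS c ON c.category_key = p.category_key
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()
```

In the snowflake form the category name 'Stationery' is stored once; a star schema would repeat it on every stationery product row in exchange for a simpler join.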

24) What is the difference between a star schema and a snowflake schema?
Ans.
Star schema: denormalized data structure; a single dimension table per category; more data dependency and redundancy; no need for complicated joins; faster query results; no parent tables; simple database structure.

Snowflake schema: normalized data structure; dimension tables split into many pieces; less data dependency and no redundancy; complicated joins; some delay in query processing; may contain parent tables; complicated database structure.
