UNIVERSITY
Homework 3
The snowflake schema is similar to the star schema. However, in the snowflake
schema, dimensions are normalized into multiple related tables, whereas the star
schema's dimensions are denormalized, with each dimension represented by a
single table.
The following example query is the snowflake schema equivalent of the star schema
example code, which returns the total number of units sold by brand and by country
for 1997. The benefit of using the snowflake schema in this example is that the
storage requirements are lower, since the snowflake schema eliminates many
duplicate values from the dimensions themselves.
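A minimal sketch of such a query, run against an in-memory SQLite database from Python. Every table name, column name, and data row below is a made-up assumption for illustration, not the original example. Brand and country sit in their own normalized tables, so reaching them takes one extra join each.

```python
import sqlite3

# Hypothetical snowflake schema: all names and rows are illustrative
# assumptions, not taken from the original example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_brand   (brand_id INTEGER PRIMARY KEY, brand TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY,
                          brand_id INTEGER REFERENCES dim_brand(brand_id));
CREATE TABLE dim_country (country_id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE dim_store   (store_id INTEGER PRIMARY KEY,
                          country_id INTEGER REFERENCES dim_country(country_id));
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE fact_sales  (date_id INTEGER, store_id INTEGER,
                          product_id INTEGER, units_sold INTEGER);

INSERT INTO dim_brand   VALUES (1, 'Acme'), (2, 'Zenith');
INSERT INTO dim_product VALUES (10, 1), (11, 1), (12, 2);
INSERT INTO dim_country VALUES (1, 'Brazil'), (2, 'India');
INSERT INTO dim_store   VALUES (100, 1), (101, 2);
INSERT INTO dim_date    VALUES (1, 1997), (2, 1998);
INSERT INTO fact_sales  VALUES
    (1, 100, 10, 5), (1, 100, 11, 3), (1, 101, 12, 7),
    (1, 101, 10, 2), (2, 100, 10, 9);
""")

# Brand is reached through dim_product, country through dim_store:
# the normalized (snowflaked) hierarchies cost one extra join each.
query = """
SELECT b.brand, c.country, SUM(f.units_sold) AS units
FROM fact_sales f
JOIN dim_date    d ON f.date_id    = d.date_id
JOIN dim_store   s ON f.store_id   = s.store_id
JOIN dim_country c ON s.country_id = c.country_id
JOIN dim_product p ON f.product_id = p.product_id
JOIN dim_brand   b ON p.brand_id   = b.brand_id
WHERE d.year = 1997
GROUP BY b.brand, c.country
ORDER BY b.brand, c.country
"""
rows = conn.execute(query).fetchall()
for brand, country, units in rows:
    print(brand, country, units)
```

In a star schema, brand would be a column of the product dimension and country a column of the store dimension, so the same question would need three joins instead of five.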
Q2. Suppose that a data warehouse consists of the three dimensions time, doctor, and patient,
and the two measures count and charge, where charge is the fee that a doctor charges a
patient for a visit.
(a) Enumerate three classes of schemas that are popularly used for modelling data
warehouses.
(b) Draw a schema diagram for the above data warehouse using one of the schema classes
listed in (a).
(a) The three classes of schemas that are popularly used for modelling data
warehouses are:
STAR SCHEMA: a fact table in the middle connected to a set
of dimension tables.
SNOWFLAKE SCHEMA: a refinement of the star schema where
some dimensional hierarchy is normalized into a set of smaller
dimension tables, forming a shape similar to a snowflake.
FACT CONSTELLATION: multiple fact tables share dimension
tables; viewed as a collection of stars, it is also called a galaxy schema.
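For part (b), one possible star schema for the warehouse in Q2 can be sketched as table definitions instead of a drawing. The fact table and the three dimensions (time, doctor, patient) come from the question; the attribute columns inside each dimension (day, month, specialty, city, and so on) and the name `visit_count` for the count measure are assumptions made only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one denormalized table per dimension (star schema).
-- The attribute columns are illustrative assumptions.
CREATE TABLE dim_time    (time_key INTEGER PRIMARY KEY,
                          day INTEGER, month INTEGER, year INTEGER);
CREATE TABLE dim_doctor  (doctor_key INTEGER PRIMARY KEY,
                          name TEXT, specialty TEXT);
CREATE TABLE dim_patient (patient_key INTEGER PRIMARY KEY,
                          name TEXT, city TEXT);

-- Fact table in the middle: one foreign key per dimension plus the
-- two measures named in the question (count and charge).
CREATE TABLE fact_visit (
    time_key    INTEGER REFERENCES dim_time(time_key),
    doctor_key  INTEGER REFERENCES dim_doctor(doctor_key),
    patient_key INTEGER REFERENCES dim_patient(patient_key),
    visit_count INTEGER,
    charge      REAL
);
""")

# Read the schema back to confirm the star shape.
fact_columns = [row[1] for row in conn.execute("PRAGMA table_info(fact_visit)")]
referenced = sorted(row[2] for row in conn.execute("PRAGMA foreign_key_list(fact_visit)"))
print(fact_columns)
print(referenced)
```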
Q3. A data warehouse can be modelled by either a star schema or a snowflake schema.
Briefly describe the similarities and the differences of the two models, and then analyze their
advantages and disadvantages with regard to one another. Give your opinion of which might
be more empirically useful and state the reasons behind your answer.
Star Schema:
Definition: The star schema is the simplest data warehouse schema. It is
called a star schema because the diagram resembles a star with points
radiating from a center.
In a star schema, a dimension table does not have any parent table,
whereas in a snowflake schema a dimension table has one or more
parent tables.
In a star schema, the hierarchies for the dimensions are stored in the dimension
table itself, whereas in a snowflake schema the hierarchies are broken into
separate tables.
These hierarchies help to drill down the data from the topmost level to the
lowermost level.
Star schema: The fact table is in normalized form and the
dimension tables are in denormalized form. It is also known as the basic star
schema.
Snowflake schema:
In the snowflake schema, both the dimension tables and the fact table are in
normalized form. It is also known as the extended star schema.
The snowflake schema requires more dimension tables and more foreign keys,
which reduces query performance because of the extra joins, but it normalizes
the records.
Advantages of the star schema:
• Simplest DW schema
• Easy to understand
• Easy to navigate between the tables because fewer joins are needed.
• Most suitable for Query processing
Disadvantages:
• Occupies more space
• Highly Denormalized
Set 2
Q1. What is OLAP?
ANS:
OLAP stands for Online Analytical Processing, a class of technologies used mainly for
business reporting. Using OLAP, businesses can analyze data in many
different ways, including budgeting, planning, simulation, data warehouse reporting,
and trend analysis. A main component of OLAP is its ability to make
multidimensional calculations, which supports a wide range of fast analyses.
In addition, the bigger the business, the bigger its business reporting needs.
Multidimensional calculations enable a large business to obtain in seconds results
that would otherwise take minutes to produce.
OLAP is an approach to swiftly answering multi-dimensional analytical queries. OLAP
is part of the broader category of business intelligence, which also
encompasses relational reporting and data mining. Typical applications of OLAP
include business reporting for sales, marketing, management reporting, business
process management (BPM), budgeting and forecasting, financial reporting and
similar areas, with new applications emerging, such as agriculture. The term
OLAP was created as a slight modification of the traditional database
term OLTP (Online Transaction Processing).
Databases configured for OLAP use a multidimensional data model, allowing for
complex analytical and ad-hoc queries with a rapid execution time. They borrow
aspects of navigational databases and hierarchical databases, which can be faster
than relational databases for this kind of query.
The output of an OLAP query is typically displayed in a matrix (or pivot) format. The
dimensions form the rows and columns of the matrix; the measures form the values.
For example:
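A minimal sketch of such a matrix in plain Python, using made-up sales records (the countries, years, and unit counts are assumptions for illustration). The country dimension forms the rows, the year dimension forms the columns, and the summed units measure fills the cells.

```python
from collections import defaultdict

# Made-up records: (country, year, units). Purely illustrative data.
records = [
    ("Brazil", 1997, 5), ("Brazil", 1998, 9),
    ("India", 1997, 7), ("India", 1997, 2),
]

# Aggregate the measure for each (row dimension, column dimension) pair.
cells = defaultdict(int)
for country, year, units in records:
    cells[(country, year)] += units

countries = sorted({r[0] for r in records})  # row labels
years = sorted({r[1] for r in records})      # column labels

def render_pivot():
    """Return the matrix as text: one header line, one line per country."""
    lines = ["country " + " ".join(f"{y:>6}" for y in years)]
    for c in countries:
        # Combinations with no records are shown as 0.
        values = " ".join(f"{cells.get((c, y), 0):>6}" for y in years)
        lines.append(f"{c:<8}" + values)
    return "\n".join(lines)

print(render_pivot())
```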
TYPES OF OLAP
MOLAP
MOLAP is the 'classic' form of OLAP and is sometimes referred to as just OLAP.
MOLAP stores its data in optimized multi-dimensional array storage, rather than
in a relational database. It therefore requires the pre-computation and storage of
information in the cube, an operation known as processing.
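The processing step can be sketched as follows, under simplifying assumptions: the fact rows are made up, the only measure is a sum of units, and a Python dictionary stands in for the optimized array storage a real MOLAP engine would use. The function pre-computes the total for every combination of dimensions, so later lookups need no aggregation.

```python
from itertools import combinations

# Made-up fact rows: (brand, country, year, units). Illustrative only.
DIMENSIONS = ("brand", "country", "year")
facts = [
    ("Acme", "Brazil", 1997, 5),
    ("Acme", "India", 1997, 2),
    ("Zenith", "India", 1997, 7),
    ("Acme", "Brazil", 1998, 9),
]

def process_cube(rows):
    """Pre-compute SUM(units) for every subset of the dimensions."""
    cube = {}
    n = len(DIMENSIONS)
    for size in range(n + 1):
        for kept in combinations(range(n), size):
            for row in rows:
                # "*" marks a dimension that has been aggregated away.
                key = tuple(row[i] if i in kept else "*" for i in range(n))
                cube[key] = cube.get(key, 0) + row[n]
    return cube

cube = process_cube(facts)

# Answering a query is now a single dictionary lookup.
print(cube[("*", "*", "*")])        # grand total
print(cube[("Acme", "*", "*")])     # total for one brand
print(cube[("*", "India", 1997)])   # total for one country and year
```

The trade-off is the one the text describes: lookups are fast, but the cube has to be recomputed and stored whenever the base data changes.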
ROLAP
ROLAP works directly with relational databases. The base data and the dimension
tables are stored as relational tables, and new tables are created to hold the
aggregated information. ROLAP depends on a specialized schema design. This
methodology relies on manipulating the data stored in the relational database to give
the appearance of traditional OLAP's slicing and dicing functionality. In essence,
each action of slicing and dicing is equivalent to adding a "WHERE" clause to the
SQL statement.
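A small sketch of this idea using SQLite from Python. The `sales` table, its columns, its rows, and the helper `run` are hypothetical and exist only for this illustration. The same aggregate query is run three times, and only the WHERE clause changes between the full view, a slice, and a dice.

```python
import sqlite3

# Hypothetical relational table holding the base data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (brand TEXT, country TEXT, year INTEGER, units INTEGER);
INSERT INTO sales VALUES
    ('Acme', 'Brazil', 1997, 5), ('Acme', 'India', 1997, 2),
    ('Zenith', 'India', 1997, 7), ('Acme', 'Brazil', 1998, 9);
""")

BASE = "SELECT brand, country, SUM(units) FROM sales"
TAIL = " GROUP BY brand, country ORDER BY brand, country"

def run(where="", params=()):
    """Run the aggregate query, optionally restricted by a WHERE clause."""
    sql = BASE + (" WHERE " + where if where else "") + TAIL
    return conn.execute(sql, params).fetchall()

full = run()                                            # no restriction
slice_1997 = run("year = ?", (1997,))                   # slice: fix one dimension
dice = run("year = ? AND brand = ?", (1997, "Acme"))    # dice: restrict several

print(full)
print(slice_1997)
print(dice)
```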
HOLAP
There is no clear agreement across the industry as to what constitutes "Hybrid
OLAP", except that a database will divide data between relational and specialized
storage. For example, for some vendors, a HOLAP database will use relational
tables to hold the larger quantities of detailed data, and use specialized storage for
at least some aspects of the smaller quantities of more-aggregate or less-detailed
data.
Q3. What are metadata? Why are metadata so important to a data warehouse?
Metadata is data about data. Metadata has been around as long as there have been
programs and data that the programs operate on. Figure shows metadata in a simple
form.
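The distinction between data and metadata can also be illustrated with a small sketch that uses SQLite from Python. The `patient` table and its single row are made up for this example. The query result is the data; the column names, types, and constraints that the database reports about the table are the metadata.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient ("
    "patient_id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)
conn.execute("INSERT INTO patient VALUES (1, 'A. Silva', 'Recife')")

# The data: the rows stored in the table.
data = conn.execute("SELECT * FROM patient").fetchall()

# The metadata: what the database knows about the table itself.
# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk).
metadata = [
    (name, col_type, bool(notnull), bool(pk))
    for _cid, name, col_type, notnull, _default, pk
    in conn.execute("PRAGMA table_info(patient)")
]

print(data)
print(metadata)
```

A data warehouse keeps far richer metadata than this (sources, transformation rules, load history), but the principle is the same.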
While metadata is not new, the role of metadata and its importance in the face of the
data warehouse certainly is new. For years the information technology professional has
worked in the same environment as metadata, but in many ways has paid little attention
to metadata. The information professional has spent a life dedicated to process and
functional analysis, user requirements, maintenance, architectures, and the like.