1
Agenda
OLTP Vs OLAP
Modeling Techniques
User Profile
Top down approach
Bottom up approach
2
Traditional OLTP systems
OLTP systems are highly structured sets of information that
support the ongoing, day-to-day operations of an
organization.
3
OLTP (Contd)
4
What is OLAP ?
5
Why not OLTP for OLAP?
6
In other words...
7
Data warehouse
A Data Warehouse is a copy of the
enterprise operational data, suitably
modified to support the needs of
analytical processes and stored
outside the operational database.
According to Bill Inmon, known as the
father of Data Warehousing, a data
warehouse is a subject-oriented,
integrated, time-variant, nonvolatile
collection of data in support of
management decisions.
8
OLAP Vs OLTP
Data warehouse database:
Designed for analysis of business measures by categories and attributes
Optimized for bulk loads and large, complex, unpredictable queries that access many rows per table
Loaded with consistent, valid data; requires no real-time validation
Supports few concurrent users relative to OLTP

OLTP database:
Designed for real-time business operations
Optimized for a common set of transactions, usually adding or retrieving a single row at a time per table
Optimized for validation of incoming data during transactions; uses validation data tables
Supports thousands of concurrent users
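The difference in access patterns can be sketched with a small in-memory SQLite database. This is only an illustration: the `calls` table, its columns, and the sample rows are assumptions made for the example, not part of the original slides.

```python
import sqlite3

# Hypothetical table of telephone calls, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE calls ("
    "call_id INTEGER PRIMARY KEY, customer_id INTEGER, region TEXT, minutes REAL)"
)
conn.executemany(
    "INSERT INTO calls (customer_id, region, minutes) VALUES (?, ?, ?)",
    [(1, "North", 5.0), (2, "North", 12.5), (3, "South", 3.0), (1, "North", 7.5)],
)

# OLTP-style access: retrieve a single row by its key.
oltp_row = conn.execute(
    "SELECT customer_id, minutes FROM calls WHERE call_id = ?", (2,)
).fetchone()

# OLAP-style access: scan many rows and aggregate a measure by a category.
olap_rows = conn.execute(
    "SELECT region, SUM(minutes) FROM calls GROUP BY region ORDER BY region"
).fetchall()
```

The first query touches one row, which is the workload an OLTP database is tuned for; the second reads every row to summarize a measure, which is the workload a data warehouse is tuned for.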
9
Data warehouse architecture
[Figure: data warehouse architecture. Labels: Operational DBs; extract, transform, load, refresh, etc.; serve; Data Marts; Query/Reporting; e.g., ROLAP; Data Mining.]
10
D/W Architecture Goals
11
Characteristics of a D/W
Are based on a dimensional model
Contain historical data
Include both detailed and summarized data
Consolidate disparate data from multiple sources while
retaining consistency
Focus on a single subject, such as sales, inventory, or
finance
12
User Profile
Statisticians (2%)
Knowledge workers (15%)
Information Consumers (83%)
13
Steps in implementing D/W
14
Identify and gather requirements
Identify the Sponsor
Meet the Business Users
Meet Data experts
Communicate with users often and thoroughly
15
Identify The Business Areas
For Telecom D/W
Customer Behavior
Corporate Customer
Customer Service
Accounts
Settlements
Partner
Supplier
Competitor
Marketing
16
Sources and Targets
Sources
Telephone call detail recording
Customer Service such as ordering service
and disconnecting lines
Customer payment processing
Targets
Studies of minutes of call use by customer
group
Segmentation of customers by minutes of
call use
Product bundling analysis
Customer Payment analysis
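As a rough sketch of how one source (call detail records) could feed one target (segmentation of customers by minutes of call use), the example below totals minutes per customer and assigns a segment. The record layout, the segment names, and the thresholds are hypothetical choices made for illustration.

```python
from collections import defaultdict

# Hypothetical call detail records: (customer_id, minutes).
call_detail_records = [
    ("C1", 30.0), ("C2", 400.0), ("C1", 45.0), ("C3", 1200.0), ("C2", 150.0),
]

def minutes_by_customer(records):
    """Sum call minutes per customer."""
    totals = defaultdict(float)
    for customer_id, minutes in records:
        totals[customer_id] += minutes
    return dict(totals)

def segment(total_minutes, light_max=100.0, medium_max=600.0):
    """Assign a usage segment; the thresholds are illustrative assumptions."""
    if total_minutes <= light_max:
        return "light"
    if total_minutes <= medium_max:
        return "medium"
    return "heavy"

totals = minutes_by_customer(call_detail_records)
segments = {customer: segment(minutes) for customer, minutes in totals.items()}
```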
17
Design the dimensional model
Identify the dimensions
Should match with Business needs
Identify the grain of the detail
Decide on
Star Schema
Snowflake Schema
Starflake Schema
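A minimal star schema might look like the sketch below, built in an in-memory SQLite database. The table and column names (`dim_customer`, `dim_date`, `fact_call`) and the sample data are assumptions for illustration; the grain chosen here is one fact row per call.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the "by" categories of the analysis.
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY, customer_name TEXT, customer_group TEXT);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT);
-- The fact table holds the measure at the chosen grain: one row per call.
CREATE TABLE fact_call (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    minutes REAL);
INSERT INTO dim_customer VALUES (1, 'Acme', 'Corporate'), (2, 'J. Doe', 'Residential');
INSERT INTO dim_date VALUES
    (20240105, '2024-01-05', '2024-01'), (20240210, '2024-02-10', '2024-02');
INSERT INTO fact_call VALUES
    (1, 20240105, 10.0), (1, 20240210, 20.0), (2, 20240105, 5.0), (1, 20240105, 2.5);
""")

# Typical star join: aggregate the fact measure by dimension attributes.
minutes_by_group_and_month = conn.execute("""
    SELECT c.customer_group, d.month, SUM(f.minutes)
    FROM fact_call AS f
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    JOIN dim_date AS d ON d.date_key = f.date_key
    GROUP BY c.customer_group, d.month
    ORDER BY c.customer_group, d.month
""").fetchall()
```

In a snowflake variant, an attribute such as `customer_group` would move out of `dim_customer` into its own table referenced by key, at the cost of an extra join.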
18
Star Schema
19
Star Schema
20
Snowflake Schema
21
Snowflake Schema
22
23
Design considerations for
Dimension Tables
Levels of hierarchies
Surrogate Key
Star or Snowflake
Date and Time
24
Slowly changing Dimension
Type 1: Overwrite the dimension record.
Type 2: Add a new dimension record.
Type 3: Create new fields in the dimension record.
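The three types can be sketched on a dimension held as a list of dictionaries. The field names (`customer_key`, `customer_id`, `city`, `valid_from`, `valid_to`, `previous_city`) and the sample values are hypothetical; a real warehouse would apply the same ideas in SQL or in an ETL tool.

```python
from datetime import date

def initial_rows():
    """One current row for a hypothetical customer dimension."""
    return [{"customer_key": 1, "customer_id": "C1", "city": "Pune",
             "valid_from": date(2020, 1, 1), "valid_to": None}]

def apply_type1(rows, customer_id, new_city):
    """Type 1: overwrite the attribute in place; history is lost."""
    for row in rows:
        if row["customer_id"] == customer_id:
            row["city"] = new_city

def apply_type2(rows, customer_id, new_city, change_date):
    """Type 2: expire the current row and add a new row with a new surrogate key."""
    next_key = max(row["customer_key"] for row in rows) + 1
    for row in rows:
        if row["customer_id"] == customer_id and row["valid_to"] is None:
            row["valid_to"] = change_date
            rows.append(dict(row, customer_key=next_key, city=new_city,
                             valid_from=change_date, valid_to=None))
            break  # stop before iterating over the row just appended

def apply_type3(rows, customer_id, new_city):
    """Type 3: keep the prior value in an extra field on the same row."""
    for row in rows:
        if row["customer_id"] == customer_id:
            row["previous_city"] = row["city"]
            row["city"] = new_city
```

Type 2 is the only variant here that preserves full history, which is why it depends on a surrogate key distinct from the business key `customer_id`.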
25
Rapidly changing Dimensions
Breaking out offending dimension attributes
Factless fact tables
Conformed dimensions
26
Fact tables
Multiple Fact tables
Additive measures
Non-additive/Semi additive measures
Calculated Measures
Granularity
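The distinction between additive and semi-additive measures can be illustrated with a small sketch. The fact rows below are invented for the example: a sales amount can be summed across every dimension, while an account balance can be summed across accounts but not across months.

```python
from collections import defaultdict

# Hypothetical fact rows: (month, account, sales_amount, closing_balance).
fact_rows = [
    ("2024-01", "A", 100.0, 500.0),
    ("2024-01", "B", 50.0, 200.0),
    ("2024-02", "A", 70.0, 450.0),
    ("2024-02", "B", 30.0, 260.0),
]

def total_sales(rows):
    """Additive measure: summing across all rows is meaningful."""
    return sum(row[2] for row in rows)

def balance_by_month(rows):
    """Semi-additive measure: sum across accounts within each month only."""
    totals = defaultdict(float)
    for month, _account, _sales, balance in rows:
        totals[month] += balance
    return dict(totals)

def closing_balance(rows):
    """Across time, take the latest month's total instead of summing months."""
    per_month = balance_by_month(rows)
    return per_month[max(per_month)]
```

Summing the balances of both months would give a figure with no business meaning, so the time dimension is handled by picking the last period.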
27
ETL
28
Extraction
Push strategy
Pull strategy
29
Transformation
Transformation involves applying
complex filters, removing
inconsistencies between data from
different sources, applying conditional
transforms, and performing complex
calculations to create derived data.
Data cleansing can be an important
part of the transformation process.
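A transformation step of this kind could be sketched as a single function applied to each extracted record. The record fields, the gender code mapping, and the per-minute rates are all hypothetical values chosen for illustration.

```python
# Hypothetical mapping that reconciles codes used by different source systems.
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}

def transform(record):
    """Return a cleaned record, or None if the record is filtered out."""
    # Filter: reject records with missing or negative minutes.
    if record.get("minutes") is None or record["minutes"] < 0:
        return None
    # Remove inconsistency: map source-specific codes to one standard code.
    gender = GENDER_CODES.get(str(record.get("gender", "")).strip().lower(), "U")
    # Cleansing: collapse stray whitespace and normalize capitalization.
    name = " ".join(record["name"].split()).title()
    minutes = float(record["minutes"])
    # Conditional transform: the rate depends on the call type (assumed rates).
    rate = 0.10 if record.get("call_type") == "local" else 0.25
    # Derived data: the charge is calculated, not present in the source.
    return {"name": name, "gender": gender, "minutes": minutes,
            "charge": round(minutes * rate, 2)}
```

For instance, `transform({"name": "  john   SMITH ", "gender": "Male", "minutes": 10, "call_type": "local"})` yields a record with the name `John Smith`, gender `M`, and a derived charge.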
30
Loading
31
Loading approach
32
Issues in Loading
33
Data Marts
A data mart is a repository of data
gathered from operational data and other
sources that is designed to serve a
particular community of knowledge
workers. In scope, the data may derive
from an enterprise-wide database or data
warehouse or be more specialized. The
emphasis of a data mart is on meeting
the specific demands of a particular group
of knowledge users in terms of analysis,
content, presentation, and ease of use.
34
OLAP
ROLAP
MOLAP
HOLAP
35
A Few Popular Tools
ETL
DataStage
Data Junction
Microsoft DTS (available with SQL Server 7.0 and above)
Oracle Warehouse Builder
Informatica PowerCenter
IBM Data Warehouse Manager
Ab Initio
36
A Few Popular Tools
OLAP
Cognos
Business Objects
Power Analyzer
Microsoft Analysis Services
MicroStrategy
DB2 OLAP Server
Hyperion OLAP Server
37
A Few Popular Tools
Data Mining
Intelligent Miner
DARWIN
SAS
38
Thank You
40