What is Data?
Terminologies
Data consistency summarizes the validity, accuracy, usability and integrity of related data between applications and across an IT enterprise.
Data integrity refers to the validity of data, meaning that the data is consistent and correct. In the data warehousing field, we frequently hear the term "Garbage In, Garbage Out." If there is no data integrity in the data warehouse, any resulting reports and analysis will not be useful.
Data integrity rules are business rules that dictate the standards for acceptable data. These rules are applied to a database by using integrity constraints and triggers to prevent invalid data entry. Consistency means that only valid data will be written to the database.
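As a minimal sketch of how integrity constraints and triggers can reject invalid data, the example below uses Python's built-in sqlite3 module. The `sales` table, its columns, the trigger name and the two rules (quantity must be positive, price must not be negative) are hypothetical and chosen only for illustration.

```python
import sqlite3


def build_db():
    """Create an in-memory database with one constraint and one trigger."""
    conn = sqlite3.connect(":memory:")
    # Integrity constraint: a CHECK clause rejects non-positive quantities.
    conn.execute(
        """
        CREATE TABLE sales (
            sale_id  INTEGER PRIMARY KEY,
            product  TEXT    NOT NULL,
            quantity INTEGER NOT NULL CHECK (quantity > 0),
            price    REAL    NOT NULL
        )
        """
    )
    # Trigger: abort any insert that carries a negative price.
    conn.execute(
        """
        CREATE TRIGGER reject_negative_price
        BEFORE INSERT ON sales
        WHEN NEW.price < 0
        BEGIN
            SELECT RAISE(ABORT, 'price must not be negative');
        END
        """
    )
    return conn


def try_insert(conn, product, quantity, price):
    """Return True if the row was accepted, False if a rule rejected it."""
    try:
        conn.execute(
            "INSERT INTO sales (product, quantity, price) VALUES (?, ?, ?)",
            (product, quantity, price),
        )
        return True
    except sqlite3.DatabaseError:
        # Raised for both the CHECK violation and the trigger's RAISE(ABORT).
        return False
```

Calling `try_insert(conn, "soap", 0, 30.0)` on a database from `build_db()` is rejected by the constraint, so only valid rows reach the table.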
Data integrity involves three levels in a DW:
-Database level
-ETL process level
-Access level
Size of Data
Sources of data: sales, trade, Walmart, product descriptions, customer feedback, company profiles, ...
According to a survey, the volume of data almost doubles every 18 months.
Units of size: B, KB, MB, GB, TB, PB, EB, ZB, YB
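The doubling claim above can be turned into a rough projection. The sketch below assumes steady exponential growth with an 18-month doubling period and assumes each unit is 1024 times the previous one; the function names are hypothetical.

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def projected_size(size_bytes, months, doubling_months=18):
    """Size after `months`, assuming data doubles every `doubling_months`."""
    return size_bytes * 2 ** (months / doubling_months)


def human_readable(size_bytes):
    """Express a byte count in the largest unit that keeps the value below 1024."""
    value = float(size_bytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024
```

Under these assumptions, a store of 100 bytes grows to 400 bytes after 36 months, that is, two doubling periods.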
Example
Dmart across India
Data Warehouse
WHAT IS DW?
HOW DID DW COME INTO EXISTENCE?
WHAT PURPOSE DOES IT SOLVE?
HOW DO WE CREATE DW?
Ralph Kimball's paradigm: The data warehouse is the conglomerate of all data marts within the enterprise. Information is always stored in the dimensional model.
-In reality, the data warehouses in most enterprises are closer to Ralph Kimball's idea.
-This is because most data warehouses started out as a departmental effort, and hence they originated as a data mart.
-Only when more data marts are built later do they evolve into a data warehouse.
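To make the dimensional model mentioned above more concrete, here is a minimal sketch of a star schema: one fact table surrounded by dimension tables. It uses Python's built-in sqlite3 module; the table names, columns and sample rows are hypothetical and are not taken from any real warehouse.

```python
import sqlite3


def build_star_schema():
    """Create a tiny star schema: one fact table and two dimension tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            store_id   INTEGER REFERENCES dim_store(store_id),
            amount     REAL
        );
        INSERT INTO dim_product VALUES (1, 'soap'), (2, 'rice');
        INSERT INTO dim_store   VALUES (1, 'Mumbai'), (2, 'Pune');
        INSERT INTO fact_sales  VALUES (1, 1, 30.0), (2, 1, 120.0), (1, 2, 45.0);
        """
    )
    return conn


def sales_by_city(conn):
    """Summarize the fact table along the store dimension."""
    rows = conn.execute(
        """
        SELECT s.city, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_store AS s ON s.store_id = f.store_id
        GROUP BY s.city
        ORDER BY s.city
        """
    ).fetchall()
    return dict(rows)
```

A data mart in this sense would hold one such fact table with its dimensions for a single department; joining the fact table to a dimension yields the summarized view that reports are built from.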
Evolution
60s: Batch reports
-hard to find and analyze information
-inflexible and expensive; every new request requires reprogramming
[Figure: Data warehouse size, initial vs. projected (2Q96), shown as percentage of respondents for size ranges from 5GB to 500GB-1TB. Source: META Group, Inc.]
DW Creation
Construction of a DW requires:
-data cleaning
-data integration
-data consolidation (transformation)
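The three construction tasks above can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions: each source is a list of dictionaries with hypothetical `product` and `amount` fields, cleaning means dropping incomplete rows and normalizing names, and consolidation means summing amounts per product.

```python
def clean(records):
    """Data cleaning: drop incomplete rows and normalize product names."""
    cleaned = []
    for rec in records:
        name = (rec.get("product") or "").strip().lower()
        amount = rec.get("amount")
        if not name or amount is None:
            continue  # skip rows with a missing product or amount
        cleaned.append({"product": name, "amount": float(amount)})
    return cleaned


def integrate(*sources):
    """Data integration: merge cleaned rows from several sources into one list."""
    merged = []
    for source in sources:
        merged.extend(clean(source))
    return merged


def consolidate(records):
    """Data consolidation: transform detail rows into totals per product."""
    totals = {}
    for rec in records:
        totals[rec["product"]] = totals.get(rec["product"], 0.0) + rec["amount"]
    return totals
```

For example, `consolidate(integrate(store_a, store_b))` returns one total per product even when the two stores spell the product name with different casing or padding.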
Steps involved in the data warehousing project cycle:
-Requirement Gathering
-Physical Environment Setup
-Data Modeling
-ETL
-OLAP Cube Design
-Front End Development
-Report Development
-Performance Tuning
-Query Optimization
-Quality Assurance
-Rolling out to Production
-Production Maintenance
-Incremental Enhancements
Users
-Knowledge workers (managers, analysts, executives, administrators, ...) use the warehouse to get summarized data quickly in order to make strategic decisions.