
Data Warehouse

The Datawarehouse4u.info portal delivers information about Data Warehouse technology.

In one place you can find descriptions of ETL and BI tools, the most popular Data Warehouse architectures, solutions, engines and much more. Our pages give an overview of the whole Business Intelligence market along with useful general information about it. The News section covers Data Warehouse and BI events and seminars. If you want to gain optimum value from the information assets in your organization, Datawarehouse4u.info is the first step you should take.

Data Warehouse Architecture


Data Warehouse definition by William H. Inmon: a data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management's decision-making process.

A data warehouse is a centralized repository that stores data from multiple information sources and transforms them into a common, multidimensional data model for efficient querying and analysis.

OLTP vs. OLAP

We can divide IT systems into transactional (OLTP) and analytical (OLAP). In general we can assume that OLTP systems provide source data to data warehouses, whereas OLAP systems help to analyze it.

- OLTP (On-line Transaction Processing) is characterized by a large number of short on-line transactions (INSERT, UPDATE, DELETE). The main emphasis for OLTP systems is on very fast query processing, on maintaining data integrity in multi-access environments, and on effectiveness measured by the number of transactions per second. An OLTP database holds detailed, current data, and the schema used to store transactional data is the entity model (usually 3NF).
- OLAP (On-line Analytical Processing) is characterized by a relatively low volume of transactions. Queries are often very complex and involve aggregations. For OLAP systems, response time is the effectiveness measure. OLAP applications are widely used by Data Mining techniques. An OLAP database holds aggregated, historical data, stored in multi-dimensional schemas (usually a star schema).

The following table summarizes the major differences between OLTP and OLAP system design.

OLTP System: Online Transaction Processing (Operational System)
OLAP System: Online Analytical Processing (Data Warehouse)

Source of data
- OLTP: Operational data; OLTPs are the original source of the data.
- OLAP: Consolidation data; OLAP data comes from the various OLTP databases.

Purpose of data
- OLTP: To control and run fundamental business tasks.
- OLAP: To help with planning, problem solving, and decision support.

What the data reveals
- OLTP: A snapshot of ongoing business processes.
- OLAP: Multi-dimensional views of various kinds of business activities.

Inserts and Updates
- OLTP: Short and fast inserts and updates initiated by end users.
- OLAP: Periodic long-running batch jobs refresh the data.

Queries
- OLTP: Relatively standardized and simple queries returning relatively few records.
- OLAP: Often complex queries involving aggregations.

Processing Speed
- OLTP: Typically very fast.
- OLAP: Depends on the amount of data involved; batch data refreshes and complex queries may take many hours; query speed can be improved by creating indexes.

Space Requirements
- OLTP: Can be relatively small if historical data is archived.
- OLAP: Larger due to the existence of aggregation structures and history data; requires more indexes than OLTP.

Database Design
- OLTP: Highly normalized with many tables.
- OLAP: Typically de-normalized with fewer tables; use of star and/or snowflake schemas.

Backup and Recovery
- OLTP: Backup religiously; operational data is critical to run the business, and data loss is likely to entail significant monetary loss and legal liability.
- OLAP: Instead of regular backups, some environments may consider simply reloading the OLTP data as a recovery method.

source: www.rainmakerworks.com
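To make the contrast concrete, here is a minimal sketch in Python using the standard sqlite3 module as a stand-in for both kinds of system. The table names (orders, dim_product, dim_date, fact_sales) and the sample rows are hypothetical and chosen only for illustration; a real OLTP system and a real data warehouse would normally run on separate database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# --- OLTP side: detailed, current data in a transactional table ---
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " product TEXT, order_date TEXT, amount REAL)"
)

# Typical OLTP workload: short transactions that touch a few rows each.
with conn:  # commits on success, rolls back on error
    conn.execute("INSERT INTO orders VALUES (1, 'laptop', '2012-03-01', 900.0)")
    conn.execute("INSERT INTO orders VALUES (2, 'mouse', '2012-03-01', 20.0)")
    conn.execute("INSERT INTO orders VALUES (3, 'laptop', '2013-01-15', 950.0)")
with conn:
    conn.execute("UPDATE orders SET amount = 25.0 WHERE order_id = 2")

# Simple, standardized OLTP query returning a single record.
oltp_row = conn.execute(
    "SELECT product, amount FROM orders WHERE order_id = 2"
).fetchone()

# --- OLAP side: a small star schema (one fact table, two dimensions) ---
conn.execute(
    "CREATE TABLE dim_product ("
    " product_key INTEGER PRIMARY KEY, name TEXT, category TEXT)"
)
conn.execute(
    "CREATE TABLE dim_date ("
    " date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER)"
)
conn.execute(
    "CREATE TABLE fact_sales ("
    " product_key INTEGER REFERENCES dim_product(product_key),"
    " date_key INTEGER REFERENCES dim_date(date_key),"
    " amount REAL)"
)

# In practice a periodic batch job would refresh these tables.
with conn:
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "laptop", "computers"), (2, "mouse", "accessories")],
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20120301, 2012, 3), (20130115, 2013, 1)],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [(1, 20120301, 900.0), (2, 20120301, 25.0), (1, 20130115, 950.0)],
    )

# Typical OLAP query: joins the fact table to its dimensions and aggregates.
olap_rows = conn.execute(
    "SELECT p.category, d.year, SUM(f.amount) "
    "FROM fact_sales f "
    "JOIN dim_product p ON p.product_key = f.product_key "
    "JOIN dim_date d ON d.date_key = f.date_key "
    "GROUP BY p.category, d.year "
    "ORDER BY p.category, d.year"
).fetchall()

print(oltp_row)
print(olap_rows)
```

The OLTP part reads and writes individual rows, while the OLAP query scans the whole fact table and groups it by dimension attributes, which is why its cost grows with the amount of historical data.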

What is Business Intelligence?


Business Intelligence (BI) is a technology infrastructure for gaining maximum information from available data for the purpose of improving business processes. Typical BI infrastructure components are software solutions for gathering, cleansing, integrating, analyzing and sharing data. Business Intelligence produces analysis and provides credible information to help make effective, high-quality business decisions.

The most common kinds of Business Intelligence systems are:

- EIS - Executive Information Systems
- DSS - Decision Support Systems
- MIS - Management Information Systems
- GIS - Geographic Information Systems
- OLAP - Online Analytical Processing and multidimensional analysis
- CRM - Customer Relationship Management

Business Intelligence systems are based on Data Warehouse technology. A Data Warehouse (DW) gathers information from a wide range of a company's operational systems, and Business Intelligence systems are built on it. Data loaded into a DW is usually well integrated and cleaned, which makes it possible to produce credible information reflecting the so-called 'one version of the truth'.

Business Intelligence tools


The most popular BI tools on the market are:

- Oracle - Siebel Business Analytics Applications
- SAS - Business Intelligence
- SAP - BusinessObjects XI
- IBM - Cognos 8 BI
- Oracle - Hyperion System 9 BI+
- Microsoft - Analysis Services
- MicroStrategy - Dynamic Enterprise Dashboards
- Pentaho - Open BI Suite
- Information Builders - WebFOCUS Business Intelligence
- QlikTech - QlikView
- TIBCO Spotfire - Enterprise Analytics
- Sybase - InfoMaker
- KXEN - IOLAP
- SPSS ShowCase

ETL tools

List of the most popular ETL tools:

- Informatica - Power Center
- IBM - WebSphere DataStage (formerly known as Ascential DataStage)
- SAP - BusinessObjects Data Integrator
- IBM - Cognos Data Manager (formerly known as Cognos DecisionStream)
- Microsoft - SQL Server Integration Services
- Oracle - Data Integrator (formerly known as Sunopsis Data Conductor)
- SAS - Data Integration Studio
- Oracle - Warehouse Builder
- Ab Initio
- Information Builders - Data Migrator
- Pentaho - Pentaho Data Integration
- Embarcadero Technologies - DT/Studio
- IKAN - ETL4ALL
- IBM - DB2 Warehouse Edition
- Pervasive - Data Integrator
- ETL Solutions Ltd. - Transformation Manager
- Group 1 Software (Sagent) - DataFlow
- Sybase - Data Integrated Suite ETL
- Talend - Talend Open Studio
- Expressor Software - Expressor Semantic Data Integration System
- Elixir - Elixir Repertoire
- OpenSys - CloverETL

ETL process

ETL (Extract, Transform and Load) is a process in data warehousing responsible for pulling data out of the source systems and placing it into a data warehouse. ETL involves the following tasks:

- extracting the data from source systems (SAP, ERP, other operational systems); data from different source systems is converted into one consolidated data warehouse format which is ready for transformation processing.
- transforming the data, which may involve the following tasks:
  - applying business rules (so-called derivations, e.g., calculating new measures and dimensions),
  - cleaning (e.g., mapping NULL to 0 or "Male" to "M" and "Female" to "F"),
  - filtering (e.g., selecting only certain columns to load),
  - splitting a column into multiple columns and vice versa,
  - joining together data from multiple sources (e.g., lookup, merge),
  - transposing rows and columns,
  - applying any kind of simple or complex data validation (e.g., if the first 3 columns in a row are empty then reject the row from processing).
- loading the data into a data warehouse, a data repository, or other reporting applications.
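The three ETL tasks can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source rows, the column layout (customer, gender, quantity, unit_price, note) and the target table name sales are all hypothetical, and an in-memory SQLite database stands in for the data warehouse. The transformations are the ones named above: validation (reject a row whose first 3 columns are empty), cleaning (NULL to 0, "Male"/"Female" to "M"/"F"), a derivation (a new total measure) and filtering (the note column is not loaded).

```python
import sqlite3

# Hypothetical rows as they might arrive from a source system.
# Column order: customer, gender, quantity, unit_price, note
SOURCE_ROWS = [
    ("Ann", "Female", 2, 10.0, "priority"),
    ("Bob", "Male", None, 5.0, ""),
    (None, None, None, 3.0, "orphan record"),
]

GENDER_CODES = {"Male": "M", "Female": "F"}


def extract():
    """Extract: pull the rows out of the (simulated) source system."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: validate, clean, derive and filter."""
    result = []
    for customer, gender, quantity, unit_price, _note in rows:
        # Validation: reject the row if the first 3 columns are all empty.
        if all(value in (None, "") for value in (customer, gender, quantity)):
            continue
        # Cleaning: map NULL to 0 and long gender labels to short codes.
        quantity = 0 if quantity is None else quantity
        gender = GENDER_CODES.get(gender, gender)
        # Derivation: calculate a new measure from existing columns.
        total = quantity * unit_price
        # Filtering: only selected columns go on to the load step.
        result.append((customer, gender, quantity, total))
    return result


def load(conn, rows):
    """Load: write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales ("
        " customer TEXT, gender TEXT, quantity INTEGER, total REAL)"
    )
    with conn:  # one transaction for the whole batch
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract()))

loaded = warehouse.execute(
    "SELECT customer, gender, quantity, total FROM sales ORDER BY customer"
).fetchall()
print(loaded)
```

Keeping extract, transform and load as separate functions mirrors the way the ETL tools listed earlier separate source connectors, transformation steps and target loaders, although each tool has its own way of expressing them.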

Magic Quadrant for Business Intelligence


The picture below shows the current global view of the market of the main Business Intelligence software vendors, prepared by Gartner, Inc., the world's leading information technology research and advisory company.

source: Gartner (February 2013)

Data Warehouse Database Management Systems

The data warehouse database management system (DBMS) market leaders are:

- IBM DB2 Warehouse 9.5
- Microsoft SQL Server 2008
- Oracle Database 11g
- Teradata Enterprise Data Warehouse 12.0
- Sybase IQ
- Netezza Performance Server

The table below lists Worldwide Vendor Revenue Estimates from RDBMS Software, Based on Total Software Revenue, 2006 (Millions of Dollars).
Company          2006        2006 Market Share (%)   2005        2005 Market Share (%)   2005-2006 Growth (%)
Oracle           7,168.0     47.1                    6,238.2     46.8                    14.9
IBM              3,204.1     21.1                    2,945.7     22.1                    8.8
Microsoft        2,654.4     17.4                    2,073.2     15.6                    28.0
Teradata         494.2       3.2                     467.6       3.5                     5.7
Sybase           486.7       3.2                     449.9       3.4                     8.2
Other Vendors    1,206.3     7.9                     1,149.0     8.6                     5.0
Total            15,213.7    100.0                   13,323.5    100.0                   14.2

source: Gartner Dataquest (June 2007)
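The market share and growth columns follow directly from the revenue columns: share is a vendor's revenue divided by the year's total, and growth is the change from 2005 to 2006 divided by the 2005 revenue. The short Python sketch below recomputes them from the revenue figures in the table. The variable and function names are illustrative, and the stated totals from the table are used as denominators (the rounded 2005 vendor figures sum to 13,323.6 rather than the stated 13,323.5).

```python
# Revenue in millions of dollars, copied from the table: (2006, 2005).
REVENUE = {
    "Oracle": (7168.0, 6238.2),
    "IBM": (3204.1, 2945.7),
    "Microsoft": (2654.4, 2073.2),
    "Teradata": (494.2, 467.6),
    "Sybase": (486.7, 449.9),
    "Other Vendors": (1206.3, 1149.0),
}

# Totals as stated in the table.
TOTAL_2006 = 15213.7
TOTAL_2005 = 13323.5


def share(revenue, total):
    """Market share in percent, rounded to one decimal place."""
    return round(100.0 * revenue / total, 1)


def growth(current, previous):
    """Year-over-year growth in percent, rounded to one decimal place."""
    return round(100.0 * (current - previous) / previous, 1)


# vendor -> (2006 share, 2005 share, 2005-2006 growth)
report = {
    vendor: (share(r2006, TOTAL_2006), share(r2005, TOTAL_2005), growth(r2006, r2005))
    for vendor, (r2006, r2005) in REVENUE.items()
}

for vendor, (s2006, s2005, g) in report.items():
    print(f"{vendor:<14} {s2006:5.1f} {s2005:5.1f} {g:5.1f}")
print(f"{'Total':<14} {100.0:5.1f} {100.0:5.1f} {growth(TOTAL_2006, TOTAL_2005):5.1f}")
```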


Comparing the table with the list, we can see that the same database vendors offer software for relational database systems as well as for data warehouses.
