Data Warehousing and Business Intelligence

By

KATE LITONDO
E-mail: klitondo@yahoo.com
UNIVERSITY OF NAIROBI - SCHOOL OF BUSINESS
DEPARTMENT OF MANAGEMENT SCIENCE

OCTOBER 2008

Abstract
Computers were first developed to process business transactions; the trend has been to have
computers also support management in decision making. This led to the evolution of
Management Information Systems (MIS). The concept of MIS can be traced to four major
areas, namely managerial accounting, management science or operations research,
management theory and computer science. MIS is supported by a comprehensive set of data for
business operations referred to as a database. There are several databases in any given
organization. Organizations have realized that valuable information is hidden in separate
databases, which might sometimes contain overlapping and contradictory information, and are
responding by building data warehouses. A data warehouse provides a platform for advanced,
complex and efficient data analysis, using On-Line Analytical Processing (OLAP) and data mining,
also known as Knowledge Data Discovery (KDD), to extract previously unknown strategic business
information or business intelligence (BI). There are many similarities between data warehousing
and operations research (OR): both require analytical processing to support executive decision
making. Yet OR, MIS and IT exist as separate communities, from the education level through to
organizations. Data warehousing faces many challenges, with organizations viewing
it as a purely IT project. The objective of the paper was to establish how some of the challenges
of data warehousing could be addressed so as to reduce the failure rate of BI projects. It was
concluded that data warehousing would be more successful if its development was a joint effort
of both the OR and MIS community and the IT community. It is recommended that OR and MIS
be taught as one discipline whose graduates will work very closely with IT specialists.

Background to Study
Computers were first developed to process business transactions; the trend has been to also have
them support management in decision making. This led to the concept of management
information systems (MIS). The evolution of MIS can be traced from four major areas: a)
managerial accounting; b) management science/operations research; c) management theory; d)
computer science (Davis, 1974). MIS is supported by a comprehensive set of data for business
operations referred to as a database.
A database is a collection of data organized to serve many applications efficiently by centralizing
the data rather than having data in separate files as was the case before the introduction of
databases (Laudon and Laudon, 2007). A database management system (DBMS) is the software
that assists in the creation of a database; it centralizes the data, manages the efficiency, and
allows users to communicate with the database. There are many databases within any given
organization. Businesses have realized that the numerous databases they possess contain
valuable hidden information, some of it overlapping and sometimes contradictory. This
makes it difficult for management to get concise, reliable information about current operations,
trends, and changes across the entire company, which can lead to decisions being made with
incomplete information. The data warehouse emerged to address some of
these problems.
The purpose of building a data warehouse is to integrate multiple heterogeneous, autonomous
and distributed data sources within an enterprise, and also to provide a platform for advanced,
complex and efficient data analysis (Kozielski and Wrembel, 2008). Data warehouses
consolidate and standardize data from both internal and external databases. They integrate
subject-oriented, time-variant, nonvolatile data that provide support for decision making. The
rapid progress of data acquisition and storage technology has led to tremendous amounts of
multimedia data, which is unstructured or semi-structured, being stored. Powerful tools such as
data mining and Online Analytical Processing (OLAP) are required for organizations to
extract strategic business information, known as business intelligence (BI), from the data
warehouse (Laudon and Laudon, 2007). Data mining, also known as knowledge data discovery
(KDD), discovers patterns of behavior such as associations, sequences, clusters and trends. It
also looks for anomalies and dependencies within the data. OLAP provides multidimensional
data analysis that supports decision making, business modeling and operations research
(Lalitrojwong, 2005).
Data warehousing is an advancement of MIS, which was created specifically to provide managers
with information for decision making. Operations research (OR) mainly concerns itself with the
development of mathematical and statistical tools and techniques that can greatly enhance
decision making, while operations management is concerned with the processes and systems that an
organization needs to transform inputs into outputs. MIS, OR and operations management belong
to the quantitative school of management, which is critical for data warehousing and BI.

Langseth (2008) lists some of the challenges facing data warehousing as: a) building the
warehouse: extracting, transforming, cleansing and loading data from source systems; b) modeling
real-time data into existing databases: OLAP queries are designed for historical data, on the
assumption that the data does not change; c) separation of the data warehouse from transaction
systems: warehouses running complex analytical queries cannot handle large numbers of simultaneous
inserts, updates or deletes the way transaction systems can; d) real-time alerting: warehouse alert
technology works on a schedule or event basis, while real-time alerting needs to continuously
monitor the incoming data and trigger events when appropriate. Langseth goes on to
conclude that, with the right tools, designs, advice and approaches, real-time data warehousing is
possible using today's technology. Wang (2008) states that the use of data mining techniques
becomes difficult due to the lack of a comprehensive and systematic methodology.
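The contrast drawn in challenge d) between scheduled and real-time alerting can be illustrated with a small Python sketch. The readings, the threshold and the batch size are hypothetical, and positions in a list stand in for arrival times; this is a toy illustration under those assumptions, not a description of any particular warehouse product.

```python
def scheduled_alerts(readings, threshold, batch_size):
    """Check readings only when each batch closes, as a scheduled job would."""
    alerts = []
    for start in range(0, len(readings), batch_size):
        batch = readings[start:start + batch_size]
        checked_at = start + len(batch) - 1  # position at which the batch check runs
        for offset, value in enumerate(batch):
            if value > threshold:
                alerts.append((start + offset, checked_at))
    return alerts


def continuous_alerts(readings, threshold):
    """Check every reading on arrival, so an alert fires at the reading itself."""
    return [(i, i) for i, value in enumerate(readings) if value > threshold]


# Hypothetical stream of incoming values; list positions stand in for time.
readings = [10, 12, 95, 11, 9, 14, 13, 99]
batch = scheduled_alerts(readings, threshold=90, batch_size=4)
live = continuous_alerts(readings, threshold=90)

for arrived, noticed in batch:
    print(f"scheduled: reading {arrived} noticed at {noticed}, delay {noticed - arrived}")
for arrived, noticed in live:
    print(f"continuous: reading {arrived} noticed at {noticed}, delay {noticed - arrived}")
```

Each pair records when a reading arrived and when it was noticed. The scheduled check can notice a breach only when its batch closes, whereas the continuous check notices it on arrival, at the cost of inspecting every incoming record.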
According to Moore (1999), end users have problems accepting data warehousing because they
do not understand how it applies to their business and everyday jobs. They are not getting what
they want or what they need. Moore further states that the problem lies in the lack of an adequate
Customer Adoption Process, which means understanding the end users and their real needs,
providing adequate resources to support their selection process and offering the follow-through
implementation necessary to provide them with what they need, when they need it. Singh (2007)
comments that technologists are needed to move, transform, store, and query massive amounts of
data in the data warehouse environment. The problem is that technologists build exquisitely
crafted, soaring structures, whereas a data warehouse is more like a marketplace. Singh further
states that there is a need for structure, but the structure should not be allowed to hinder the
flexibility to change. Flexibility is ensured by keeping the technical design simple, and
simplicity is ensured by ignoring performance optimizations and component reusability as
much as possible. Singh concludes that a good balance needs to be achieved in these
matters, but that simplicity and flexibility are paramount.
There are also problems in the application of OR in organizations, as cited by Adam and Ebert
(1986): a) benefits of OR are not understood by managers; b) managers lack awareness and
understanding of OR techniques; c) managers are not exposed to OR early in training; d)
required data are difficult to quantify or do not exist; e) OR methods are difficult to sell to
management; f) the time available to apply OR before a decision deadline is inadequate.
Problem Statement
More than half of all business intelligence (BI) projects fail or are never completed, meaning that
OLAP and data mining tools have failed to extract strategic information from data warehouses.
Companies treat BI projects as IT projects, yet according to Moore (1999) IT has never been able
to deliver on the promise of data warehousing. BI is neither a system nor a project; it is a
constantly evolving strategy, vision and architecture that continuously seeks to align an
organization's operations and direction with its strategic goals. According to Grimes (2007), OR
and BI seem to exist as separate communities, yet both think analytically and support
executive decisions. OR starts with the decision and works backwards to figure out what math
and data will help in devising a better solution, while BI tends to start with data and see what can
be done with it. The BI community, which this study took to be synonymous with the MIS
community, needs the OR community to assist in building complex analytical tool kits for data
warehousing. The problem is that the two are taught as separate disciplines, and this separation
is carried forward into organizations.
The research questions the study addressed were:
1. Would the challenges facing data warehousing and OR be minimized if OR and MIS were taught
as one discipline?
2. Would OR be better understood by organizations if it was associated with MIS at the time of
training?
Objective of the Study
The objective of the study was to review literature and establish whether integrated teaching of
OR and MIS would lead to better data warehousing systems.
Justification of the Study
According to Wang (2008), very little has been written to explain the challenges organizations
face as they try to make data mining a part of business intelligence operations. This study will
analyze the challenges and offer an explanation of their occurrence. The study will also
assist managers of organizations and graduates of management science to understand the position
of the OR/MIS graduate in organizations.

Literature Review
Evolution of MIS Concept
The idea of information to support management and decision making existed before the use of
computers, but the use of computers has extended the organizational capability of implementing
such a system (Davis, 1974). Davis further states that many of the ideas which are part of MIS
evolved as part of other disciplines, with four major areas being the most outstanding, namely:
Managerial Accounting is concerned with cost behavior and other analysis useful for managerial
decisions. The rise of large organizations created a need for more complex systems that would
assist management in cost analysis and improved reporting methods. Cost analysis is used in
managerial accounting to determine the most relevant cost for decision making. Managerial
accounting is oriented towards internal management and control and is therefore closely
identified with MIS.
Management Science (Operations Research) is the application of the scientific method and
quantitative analysis techniques to management problems. It emphasizes a systematic approach to
problem solving and the application of scientific methods to investigation, using mathematical
models and mathematical and statistical procedures in analysis; its goal is to seek an optimal
decision or optimal policy. The systematic approach to problem solving, use of models,
management science techniques, and computer-based solution algorithms are generally
incorporated in the MIS design.
Management Theory emphasizes reaching a satisfactory solution by looking at the behavioral and
motivational consequences of organizational structures and systems within organizations. These are
important to MIS because they aid in understanding the role of man/machine systems and in the
creation of decision models.
Computer Science has been a major factor in inducing MIS development, but computer
technology can also be a major inhibitor of progress. Without the capability of computers, the
concept of MIS could not be implemented. The capabilities of hardware and software induce the
design of information systems, but their development must occur uniformly. For example, one
could have appropriate hardware but very poor software, which greatly affects its performance.
Most of the time, software is the weakest link in the development of MIS.
Data Warehousing
Management information systems store data in, and retrieve information from, databases. A
database is a collection of data organized to serve many applications efficiently by centralizing
the data which was initially scattered in separate computer files. A database management system
(DBMS) is software that is used to create and maintain the database. There are several databases
in any given organization, and this poses a problem when strategic decision making that would
affect the whole organization is required, especially in organizations with large databases or large
systems for separate functions. The data warehouse evolved to address these problems and support
business management as a means of gaining competitive advantage. A data warehouse stores
data that is current and historical, and is of potential interest to decision makers (Laudon and
Laudon, 2007). Data from several databases, both internal and external, is
extracted and cleaned, that is, made uniform, before being stored in a warehouse. Data
warehouses integrate subject-oriented, time-variant, and nonvolatile data (Lalitrojwong, 2005).
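The extract-and-clean step described above can be illustrated with a minimal Python sketch. The two source extracts, their field names, the date formats and the customer records are all hypothetical, chosen only to show how records from differently structured databases might be made uniform and how contradictory entries might be flagged before loading.

```python
from datetime import datetime

# Hypothetical extracts from two operational databases that describe
# customers with different field names, ID casing and date formats.
sales_db = [
    {"cust_id": "C001", "name": "A. Mwangi", "joined": "2007-03-15", "city": "Nairobi"},
    {"cust_id": "C002", "name": "B. Otieno", "joined": "2008-01-02", "city": "Kisumu"},
]
billing_db = [
    {"CustomerNo": "c001", "FullName": "A. Mwangi", "Since": "15/03/2007", "Town": "Mombasa"},
    {"CustomerNo": "c003", "FullName": "C. Wanjiru", "Since": "20/06/2008", "Town": "Nakuru"},
]


def from_sales(row):
    """Map a sales record onto the uniform warehouse schema."""
    return {
        "customer_id": row["cust_id"].upper(),
        "name": row["name"],
        "joined": datetime.strptime(row["joined"], "%Y-%m-%d").date(),
        "city": row["city"],
    }


def from_billing(row):
    """Map a billing record onto the same uniform schema."""
    return {
        "customer_id": row["CustomerNo"].upper(),
        "name": row["FullName"],
        "joined": datetime.strptime(row["Since"], "%d/%m/%Y").date(),
        "city": row["Town"],
    }


def load(records):
    """Keep the first record seen per customer and log contradictory fields."""
    warehouse, conflicts = {}, []
    for rec in records:
        key = rec["customer_id"]
        if key not in warehouse:
            warehouse[key] = rec
            continue
        for field in ("name", "joined", "city"):
            if warehouse[key][field] != rec[field]:
                conflicts.append((key, field, warehouse[key][field], rec[field]))
    return warehouse, conflicts


uniform = [from_sales(r) for r in sales_db] + [from_billing(r) for r in billing_db]
warehouse, conflicts = load(uniform)
print(sorted(warehouse))
print(conflicts)
```

Keeping the first value and logging the disagreement is only one possible policy; deciding which source to trust is a business question rather than a technical one, which is where knowledge of the organization's operations comes in.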


Many enterprises, institutions and organizations rely on knowledge-based management, which
has data warehouses as its core component. The purpose of building a data warehouse is to
integrate multiple heterogeneous, autonomous and distributed data sources within an enterprise,
and create a platform for advanced, complex and efficient data analysis. The complexity of
integrating massive amounts of data, including multimedia information, makes the integration
and processing of data a challenge (Chaudhuri, 1997). Once data has been captured,
standardized and organized in a data warehouse, further analysis is required to enable managers
to extract knowledge or business intelligence (BI) for better decision making. Principal tools for
BI include software for queries and reporting, tools for multidimensional data analysis, known as
online analytical processing (OLAP), and data mining (Laudon and Laudon, 2007). With OLAP
and query-oriented analysis, users need to understand what they are looking for.
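A minimal Python sketch may help make multidimensional analysis concrete. The fact rows, the three dimensions (product, region, quarter) and the function names are hypothetical; the sketch only shows the roll-up and slice operations that OLAP tools typically provide, and does not represent any specific OLAP product.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, quarter, units_sold).
facts = [
    ("soda", "Nairobi", "Q1", 120),
    ("soda", "Nairobi", "Q2", 150),
    ("soda", "Mombasa", "Q1", 80),
    ("chips", "Nairobi", "Q1", 60),
    ("chips", "Mombasa", "Q2", 90),
]
DIMENSIONS = ("product", "region", "quarter")


def roll_up(rows, keep):
    """Sum units over every dimension that is not listed in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def slice_rows(rows, dimension, value):
    """Keep only the rows where one dimension has a fixed value."""
    i = DIMENSIONS.index(dimension)
    return [row for row in rows if row[i] == value]


by_product = roll_up(facts, ["product"])
by_region_quarter = roll_up(facts, ["region", "quarter"])
nairobi_by_product = roll_up(slice_rows(facts, "region", "Nairobi"), ["product"])

print(by_product)
print(by_region_quarter)
print(nairobi_by_product)
```

In each call the user chooses which dimensions to keep and which value to slice on. This reflects the point above: query-oriented analysis answers the questions the user already knows how to ask.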
Data mining finds patterns and relationships in a data warehouse and infers rules that predict
future behavior, so as to guide decision making and predict the effect of these decisions (Laudon
and Laudon, 2007). The types of information obtained include: a) associations, which are
occurrences linked to a single event; for example, purchasing patterns in supermarkets might reveal
that when potato chips are purchased soda is purchased 65 percent of the time, but when there is a
promotion of soda it is purchased 85 percent of the time. This information shows managers
the profitability of a promotion and therefore enhances decision making; b) sequences, for example,
if a house is purchased, a fridge will be purchased within two weeks and an oven within one
month of the house purchase; c) classifications, for example, telephone companies
can characterize customers who are likely to be non-loyal and therefore devise special
campaigns to retain them; d) clustering, for example, supermarkets can discover customer
preferences for a particular product and stock accordingly. Data warehousing is summarized
below.
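The association example (soda bought with potato chips 65 percent of the time, rising to 85 percent during a promotion) corresponds to what data mining texts call the confidence of a rule. The Python sketch below computes it. The baskets are hypothetical and were constructed to reproduce the two percentages quoted in the text; they are not real sales data.

```python
def confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)


# Hypothetical baskets built to match the percentages quoted in the text.
normal_week = [{"chips", "soda"}] * 13 + [{"chips"}] * 7 + [{"bread"}] * 10
promo_week = [{"chips", "soda"}] * 17 + [{"chips"}] * 3 + [{"bread"}] * 10

normal = confidence(normal_week, "chips", "soda")
promo = confidence(promo_week, "chips", "soda")
print(f"chips -> soda: normal week {normal:.0%}, promotion week {promo:.0%}")
```

Comparing the two figures is what lets a manager judge the effect of the promotion. A real data mining tool would search for such rules across all product pairs rather than test a single one chosen in advance.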
[Figure: Data Warehousing. Source databases (operational, customer, manufacturing, historical and
external), grouped into internal and external data sources, pass through data extraction and
transformation (extract, filter, transform, classify, aggregate, summarize) into data warehouses
holding integrated, subject-oriented, time-variant, non-volatile data, with customer, product and
sales data shown as examples. Data access and analysis (OLAP, data mining, querying, reporting)
then produces business intelligence.]

Source: Constructed from Laudon and Laudon, 2007

Discussions
One can clearly see that a data warehousing system is a comprehensive management information
system that supports management in decision making. Development of such a complex system
requires the joint efforts of personnel who understand the operations of the organization and the
type of information that management would require from the system. IT specialists are known
not to clearly interpret business needs, but are good at providing technical solutions to those
needs once they are explained in a technical way. Leaving the development of a data warehouse
entirely to the IT department, as seems to be the case in several organizations, would lead to a
system that does not provide users with the right information. The concept of management
information systems, as explained, originated from several disciplines including operations
research and computer science.
Computer technology, which includes the infrastructure and software, is supposed to implement
the ideas leading to information systems. These ideas must be manipulated analytically for any
meaningful information to be obtained. OR specialists should be the ones to originate the ideas
and the methods for manipulating them, and IT specialists working closely with OR
specialists should implement the ideas as management solutions. This means that OR personnel
must understand the operations of the organization and the capabilities of computer systems for
them to be fully involved in the design of a data warehouse. In other words, a management
information system specialist is an OR person. This is the person who understands the data in
the operational databases and the tools required for the extraction of relevant data from the
warehouses. He or she should also be responsible for the tools that transform the data for its
storage in the data warehouse and the tools that analyze data for business intelligence.
When the above tasks are left to IT specialists who think technically rather than analytically, the
challenges that managers experience with data warehouses occur. These include errors in the
source databases, extraction of wrong information from source databases, delays in getting
information for decision making and even wastage of computer resources, whereas the goal of OR
is optimal usage of resources.

Conclusion and Recommendations


From the above discussion, one can clearly see that the specialists in MIS are OR personnel,
meaning that having two thematic areas of OR and MIS does not make sense. The two should
exist as one discipline, preferably under the name MIS since it seems to be better understood by
managers. If the two are taken as one discipline, the training of OR specialists would
concentrate on ensuring that the graduates are well trained in computer science. The training
should be in OR techniques with an emphasis on developing these techniques using computers.
Separating the two areas makes students and employers feel that only those with the MIS option
understand computers, and those with an OR orientation are left with no position in the
organization. Treating OR and MIS separately also means that one has MIS graduates
with very little understanding of OR techniques and OR graduates with little understanding of
computer systems. Many employers also expect MIS specialists to be IT specialists, which
should not be the case.
The recommendation is that OR and MIS should be taken to be one thematic area and that, when
employed in organizations, the specialists should work very closely with the IT specialists in
order to minimize the problems of data warehousing, so that the failure rate of business
intelligence projects can be reduced. IT specialists would lean more on the infrastructure, while
MIS/OR personnel would be involved in software development for data warehousing and
optimal usage of the infrastructure. The original concept of MIS, where OR was supposed to
come up with mathematical models and, with the power of computer hardware and software,
assist in solving management problems, should not be lost.
From the literature review, it can be hypothesized that improvement of data warehousing systems
through integrated teaching of OR and MIS would enhance business intelligence.

References
Adam E. and Ebert R. (1986) Production and Operations Management: Concepts, Models and
Behaviour, Prentice Hall, Englewood Cliffs, New Jersey
Davis G. (1974) Management Information Systems: Conceptual Foundations, Structure and
Development, McGraw Hill, Inc., New York
Grimes S. (2008) What BI Practitioners can learn from operations research, Business
Intelligent Enterprise Applications.
www.intelligententerprise.com/movabletype/blog/sgrimes.html
Kozielski S. and Wrembel R. (2008) New Trends in Data Warehousing and Data Analysis,
Annals of Information Systems, Vol. 3.
Lalitrojwong P. (2006) Data Warehouse www.it.KMIH.
Laudon K. and Laudon J. (2007) Management Information Systems: Managing the Digital
Firm, Pearson Education, Inc., New Jersey
Langseth J. (2008) Real-Time Data Warehousing: Challenges and Solutions,
http://www.DSSResources.COM/papers/dssarticles.html
Roberta M., The Challenges with Data Warehousing, Retrieved on September 27, 2008,
from: http://www.dmreview.com/issues/19990101/232-1.html
Singh J. The challenges of data warehousing. Retrieved on October 8, 2008,
http://www.dwoptimize.com/2007/09/challenges-of-data-warehousing.html
Chaudhuri S. (1997) An Overview of Data Warehousing and OLAP Technology, ACM SIGMOD
Record, Vol. 26, pp. 65-74
Wang J. (2008) Encyclopedia of Data Warehousing and Mining, Idea Group Inc
(IGI), ISBN 1605660108, 9781605660103
