Just as with any other software application, whatever the underlying technology (such as the mainframe, Java or Microsoft), testing is a
crucial phase in the software development lifecycle for data warehouse/business intelligence (DW/BI) projects. Testing for
DW/BI carries unique challenges and requires specialized approaches. However, the testing function for this highly dynamic
technology area is still at a nascent stage of maturity. This article discusses the various aspects associated with testing for DW/BI.
Testing for DW/BI
Why and how is testing for DW/BI different from testing for other technologies? Part of the answer lies in the definition of what
constitutes DW/BI.
BI may be defined as "the result of in-depth analysis of detailed business data; includes database and application technologies as
well as analysis practices."1 BI is a broad category of application programs and technologies for gathering, storing, analyzing and
providing access to data to help enterprise users make better business decisions.
A DW is a collection of data designed to support management decision-making. According to Bill Inmon, a DW is a "subject-oriented, integrated, time-variant, nonvolatile collection of data in support of decision-making." DWs tend to have these
distinguishing features:
It is a common best practice for any DW/BI initiative to define the level of data accuracy expected from the DW (also known as the
tolerance level); this varies from application to application. For example, in a DW for sales analysis, accuracy of
approximately 95 percent may be acceptable, whereas in a DW for fraud analysis in a stock exchange, the accuracy levels
expected could be higher than 99 percent.
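A tolerance level of this kind can be turned into an automated check. The following is a minimal sketch, assuming accuracy is measured as the share of source records whose value arrives unchanged in the DW; the function names, the record keys and the sample figures are hypothetical and not taken from any particular tool.

```python
def accuracy(source_values, warehouse_values):
    """Return the fraction of source keys whose DW value equals the source value."""
    if not source_values:
        raise ValueError("source_values must not be empty")
    matches = sum(
        1
        for key, value in source_values.items()
        if key in warehouse_values and warehouse_values[key] == value
    )
    return matches / len(source_values)


def meets_tolerance(source_values, warehouse_values, tolerance):
    """True when the measured accuracy reaches the agreed tolerance level."""
    return accuracy(source_values, warehouse_values) >= tolerance


# Hypothetical sample: one of four order amounts was altered on its way to the DW.
source = {"order-1": 100.0, "order-2": 250.0, "order-3": 75.5, "order-4": 10.0}
warehouse = {"order-1": 100.0, "order-2": 250.0, "order-3": 75.5, "order-4": 12.0}

print(accuracy(source, warehouse))               # 0.75
print(meets_tolerance(source, warehouse, 0.95))  # False
```

The tolerance is passed in as a parameter, so the same check can serve a sales DW and a fraud-analysis DW with different thresholds.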
Any testing activity has to focus on the program's key objective and on ensuring that the application meets that objective.
For a DW, the critical success factors include data accuracy and consistency in reporting/analysis; to achieve them,
data in a typical DW architecture passes through several steps of consolidation.
Large volumes of duplicate or incomplete data extracted from source systems
Incorrect cleansing (e.g., use of incorrect codes)
Incorrect aggregation techniques resulting either in a Cartesian product or in data being dropped due to incorrect
join conditions
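Two of the issues listed above, duplicate source data and joins that multiply or drop rows, lend themselves to simple automated checks. The sketch below assumes rows are available as plain dictionaries and that the join under test is expected to keep exactly one output row per input row; the function names and sample rows are hypothetical.

```python
from collections import Counter


def duplicate_keys(rows, key):
    """Return the key values that occur more than once in an extract."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def join_row_check(fact_rows, joined_rows, key):
    """Compare per-key row counts before and after a join.

    Returns (dropped, multiplied): keys that vanished in the join and
    keys that appear more often after the join than before it.
    """
    before = Counter(row[key] for row in fact_rows)
    after = Counter(row[key] for row in joined_rows)
    dropped = sorted(k for k in before if after[k] == 0)
    multiplied = sorted(k for k in before if after[k] > before[k])
    return dropped, multiplied


# Hypothetical extract containing a duplicated customer record.
source_rows = [{"customer_id": "C1"}, {"customer_id": "C2"}, {"customer_id": "C1"}]
print(duplicate_keys(source_rows, "customer_id"))  # ['C1']

# Hypothetical join result: sale 1 was multiplied, sale 3 was dropped.
fact_rows = [{"sale_id": 1}, {"sale_id": 2}, {"sale_id": 3}]
joined_rows = [{"sale_id": 1}, {"sale_id": 1}, {"sale_id": 2}]
print(join_row_check(fact_rows, joined_rows, "sale_id"))  # ([3], [1])
```

A multiplied key points to a fan-out of the kind a Cartesian product produces, while a dropped key points to a join condition that is too restrictive.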
Unwillingness on the part of DW developers. Any IT professional planning to build a career in this exciting
field aims to be an expert ETL developer, OLAP specialist, dimensional data modeler or DW architect; DW
tester doesn't even make the list of desirable roles. This is true, by and large, of professionals across the
globe and is not a local factor applicable only to certain countries. The general perception that only
such roles carry premium rates in the job/consulting market (and worse, sometimes a perception that only
such roles get to face the technical challenges associated with a DW/BI project) has left the DW/BI project
team with very few takers for the challenging and critical role of tester.
Lack of awareness. As a general practice, testers plan their careers so that they specialize in and
equip themselves with technical skills for the tools involved in test execution (e.g., WinRunner, SilkTest)
and/or test management (e.g., Quality Center), with very little effort to develop skills in the underlying
technology. A good understanding of ETL/OLAP tools and technologies is an essential skill for DW/BI testing
and, so far, testers have not developed a keen interest in this skill.
Absence of tools. The DW/BI marketplace is flooded with tools and vendors, each attempting to replace
the other in the three layers of DW/BI: database, ETL and OLAP. There are no popular ETL/OLAP testing
tools in the market that offer features for automated testing or functional testing. In the absence of such
tools, it is highly impractical to achieve tool-based requirements traceability or impact analysis throughout
the lifecycle. Of course, some of the advanced ETL tools offer add-on products that provide insights into their
metadata (e.g., MetaStage) and provide impact analysis within the ETL function, but their usability for
traceability is yet to be explored.
Lack of standard approach/methodology. While standard methodologies exist for testing as a whole, there
seems to be no industry-wide view on the suggested approach and/or methodology for DW/BI testing. An
ideal methodology should include a test strategy, a test plan and test cases that cover thorough testing of the
various phases of data movement (see Figure 2). Creating test cases and test data that provide adequate
coverage of each of the phases is critical for ensuring comprehensive quality assurance (QA) of the
DW. A sample/suggested template to track data as it progresses through the phases is given in Figure 3.
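The idea of tracking data as it progresses through the phases can be illustrated with a small reconciliation routine. This is only a sketch and does not reproduce the template in Figure 3: the phase names, the `PhaseCount` fields and the sample counts are hypothetical, and it assumes row-preserving phases (a phase that aggregates rows would need control totals rather than row counts).

```python
from dataclasses import dataclass


@dataclass
class PhaseCount:
    phase: str
    rows_in: int
    rows_rejected: int
    rows_out: int


def reconcile(phases):
    """Return a list of discrepancies found in the per-phase row counts."""
    problems = []
    previous = None
    for current in phases:
        # Within a phase, every incoming row is either passed on or rejected.
        if current.rows_in != current.rows_out + current.rows_rejected:
            problems.append(
                f"{current.phase}: rows in do not equal rows out plus rejected"
            )
        # Between phases, nothing should be lost or gained in the handoff.
        if previous is not None and previous.rows_out != current.rows_in:
            problems.append(
                f"{previous.phase} -> {current.phase}: row count changed in handoff"
            )
        previous = current
    return problems


# Hypothetical run: 10 rows go missing between cleansing and loading.
run = [
    PhaseCount("extract", 1000, 0, 1000),
    PhaseCount("cleanse", 1000, 40, 960),
    PhaseCount("load", 950, 0, 950),
]
print(reconcile(run))  # ['cleanse -> load: row count changed in handoff']
```

Recording rejected rows separately keeps legitimate cleansing rejections from being reported as data loss.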
Promote awareness within the DW community that DW/BI testing is a challenging proposition requiring highly
valued skills, thereby encouraging ETL and BI developers to assume these roles. Moreover, leading IT
players with extensive experience in the DW/BI area should promote well-defined career options and career
progression plans to the ETL/OLAP developers and conventional testers.
Invest in research to formalize methodologies covering the entire spectrum of DW/BI testing in full detail.
Invest in building assets, tools and job aids to strengthen this function and provide productivity gains.
Develop training courses and course content to cross-train ETL/OLAP developers in testing nuances and
testers in DW and ETL/OLAP tools and technology concepts.
Productivity enhancement
Absence of any independent tool vendor venturing to build and offer DW testing tools.
Lack of standards (e.g., for metadata exchange), which discourages leading testing tool vendors (e.g., Mercury
Interactive) from venturing into this space.
Even the existing pure ETL/BI tool vendors have focused only on consolidation and/or on entering either of
these areas, without addressing the need to build testing tools.
However, over the next couple of years the market is expected to witness a large consolidation exercise that is likely to leave a handful of
large technology players offering end-to-end technical solutions. Such a consolidation is expected to facilitate adoption of metadata
standards and also bring about the much-needed focus on developing complementary tools and technologies, the most critical of
them being tools for DW/BI testing that are independent of the vendor/ETL/OLAP tools. Such development is also likely to spur parallel
thinking among IT services providers; large IT services firms are expected to focus their innovation on evolving
DW/BI testing methodologies and best practices that leverage these tools.
In summary, the importance of DW/BI testing cannot be overemphasized. Testing for DW/BI is a niche skill that
demands a good blend of ETL/OLAP technical skills (or, at the least, a good understanding of them) and thorough testing skills. Unlike
other technologies, DW/BI currently has no popular tools available for testing. In the absence of such tools, it is
essential to define and develop a framework for DW/BI testing that comprehensively covers the various layers and stages of data
transformation. IT services firms need to encourage their workforce to adopt this as a preferred skill and promote ways to advance
these skills. Consolidation of ETL/OLAP tool vendors could prove to be the starting point for the development of DW/BI testing tools.