
Survivorship

As the name suggests, survivorship is about what becomes of the data in these blocks of
potential duplicates. The idea is to get the “best of breed” data out of each block, based
on built-in or custom rules such as “most frequently occurring non-missing value”,
“longest string”, “most recently updated”, and so on.
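
The three rules named above can be illustrated with a minimal Python sketch. This is not how Quality Stage implements survivorship; the sample block, the field names, and the helper functions (`most_frequent`, `longest_string`, `most_recent`) are all hypothetical and only show the idea of choosing one surviving value per field.

```python
from collections import Counter
from datetime import date

# Hypothetical block of potential duplicates (field names are assumptions).
block = [
    {"name": "J. Smith", "city": "Boston", "phone": None, "updated": date(2011, 3, 1)},
    {"name": "John Smith", "city": "Boston", "phone": "555-0100", "updated": date(2012, 6, 15)},
    {"name": "John A. Smith", "city": "Bostn", "phone": "555-0199", "updated": date(2010, 1, 9)},
]


def is_missing(value):
    """Treat None and blank strings as missing."""
    return value is None or str(value).strip() == ""


def most_frequent(records, field):
    """Rule: most frequently occurring non-missing value."""
    values = [r[field] for r in records if not is_missing(r[field])]
    return Counter(values).most_common(1)[0][0] if values else None


def longest_string(records, field):
    """Rule: longest non-missing string."""
    values = [r[field] for r in records if not is_missing(r[field])]
    return max(values, key=len) if values else None


def most_recent(records, field, timestamp="updated"):
    """Rule: non-missing value from the most recently updated record."""
    candidates = [r for r in records if not is_missing(r[field])]
    return max(candidates, key=lambda r: r[timestamp])[field] if candidates else None


# One rule per field; the pairing of rule and field is an illustrative choice.
survivor = {
    "name": longest_string(block, "name"),
    "city": most_frequent(block, "city"),
    "phone": most_recent(block, "phone"),
}
```

Here the misspelled city loses to the value that occurs twice, and the phone number comes from the most recently updated record that actually has one.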
The data that fulfill the requirements of these rules can then be handled in a couple of
ways. One technique is to create a “master record” (a “single version of the truth”)
that will become the standard for the organization. Another possibility is to populate
the improved data back into the source systems from which they were derived; for
example, if one source were missing a date of birth, it could be filled in with the date
of birth obtained from one or more other sources. If this is not the requirement
(perhaps for legal reasons), then a table containing the linkage between the source
records and the “master record” keys can be created, so that the original source systems
can also refer to the “single source of truth” and vice versa.
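
The two ways of handling survived data can be sketched as follows. This is only an illustration in Python under stated assumptions: the source systems (`CRM`, `BILLING`), the keys, the master key `M-1`, and the helper names are hypothetical and do not reflect Quality Stage's own data model.

```python
# Hypothetical source records that matching has already placed in one block.
sources = [
    {"system": "CRM", "key": "C-17", "name": "John Smith", "dob": None},
    {"system": "BILLING", "key": "B-904", "name": "John Smith", "dob": "1970-04-02"},
]


def first_non_missing(records, field):
    """Simplest possible survivorship rule: take the first value that is present."""
    for record in records:
        if record[field] not in (None, ""):
            return record[field]
    return None


# The master record, identified by a hypothetical key.
master = {
    "master_key": "M-1",
    "name": first_non_missing(sources, "name"),
    "dob": first_non_missing(sources, "dob"),
}


def back_populate(records, master_record, fields):
    """Option 1: return copies of the source records with missing fields filled in."""
    filled = []
    for record in records:
        updates = {f: master_record[f] for f in fields if record[f] in (None, "")}
        filled.append({**record, **updates})
    return filled


# Option 2: leave the sources untouched and keep only a linkage table
# from each source key to the master key.
linkage = [
    {"master_key": master["master_key"], "system": r["system"], "source_key": r["key"]}
    for r in sources
]

updated_sources = back_populate(sources, master, ["dob"])
```

With the linkage table, a source system can look up its own key to find the master record, and the master record can be traced back to every contributing source row, without any source data being overwritten.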

Address Verification and Certification

Quality Stage can do more than simple matching. It can also perform address
verification; that is, it can report whether or not an address is in a valid format. Out
of the box, address verification can be performed down to city level for most countries.
For an extra charge, an additional module for worldwide address verification (WAVES) can
be purchased, which provides address verification down to street level for most
countries.
For some countries, where the postal systems provide appropriate data (for example
CASS in the USA, SERP in Canada, DPID in Australia), address certification can be
performed: in this case, an address is given to Quality Stage and looked up against a
database to report whether or not that particular address actually exists. These
modules carry an additional price, but that includes IBM obtaining regular updates to
the data from the postal authorities and providing them to the Quality Stage licensee.
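
The difference between verification (is the format valid?) and certification (does this address exist in postal reference data?) can be shown with a small Python sketch. Everything in it is an assumption made for illustration: the tiny `REFERENCE_ADDRESSES` set stands in for a postal authority file, the pattern only checks the shape of a US ZIP code, and the function names are hypothetical rather than part of any Quality Stage module.

```python
import re

# Hypothetical stand-in for reference data supplied by a postal authority.
REFERENCE_ADDRESSES = {
    ("1 MAIN ST", "SPRINGFIELD", "12345"),
}

# Shape of a US ZIP or ZIP+4 code; other countries would need other patterns.
POSTCODE_PATTERN = re.compile(r"^\d{5}(-\d{4})?$")


def normalize(address):
    """Upper-case each part and collapse runs of whitespace."""
    return tuple(" ".join(part.upper().split()) for part in address)


def verify(address):
    """Verification: every part is present and the postcode has a valid shape."""
    street, city, postcode = normalize(address)
    return bool(street) and bool(city) and POSTCODE_PATTERN.match(postcode) is not None


def certify(address):
    """Certification: the address is found in the reference data."""
    return normalize(address) in REFERENCE_ADDRESSES


known = ("1 Main St", "Springfield", "12345")
unknown = ("99 Nowhere Rd", "Springfield", "12345")
```

In this sketch both `known` and `unknown` pass `verify`, because both are well formed, but only `known` passes `certify`, because only it appears in the reference set.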

New Address Verification Module


Summary

• IBM is planning to release the next version of its Info Sphere Quality Stage Worldwide
  Address Verification module (AVI v10)
• Release time frame is Q4 2012
• AVI v10 will have superior functionality and coverage compared with our current AVI
  v8.x module (see slide 4)
• AVI v10 will leverage new address/geocoding reference data
• AVI v10 will have broad support for various Information Server versions
  (see slide 5)
• For current AVI v8.x customers only:
    • AVI v8.x will continue to be supported until the end of Dec. 2013
    • Address reference data for AVI v8.x has been discontinued by the
      vendor and ends in Dec. 2013
    • AVI v10 will include a migration utility for automated migration from AVI
      v8.x to AVI v10
    • For comparison, AVI v10 and AVI v8 can run side by side (for development)

Information Server / Operating System support matrix for AVI v10


Address Verification Configuration: as easy as 1, 2, 3

Stage Icon and Location


Quality Stage Benefits

Quality Stage provides the most powerful, accurate matching available. It is based on
probabilistic matching technology, is easy to set up and maintain, and provides the
highest match rates available in the market.
An easy-to-use graphical user interface (GUI) with an intuitive, point-and-click design
for specifying automated data quality processes (data investigation, standardization,
matching, and survivorship) reduces the time needed to deploy data cleansing
applications.

Quality Stage offers a thorough data investigation and analysis process for any kind of
free-form data. Through its tight integration with Data Stage and other Information
Server products, it also offers fully integrated management of the metadata associated
with those data.

There exists rigorous scientific justification for the probabilistic algorithms used in
Quality Stage; results are easy to audit and validate.
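
Probabilistic record matching is commonly described in terms of the Fellegi-Sunter model, in which each compared field contributes an agreement or disagreement weight derived from two probabilities: m (the fields agree given the records are a true match) and u (the fields agree given they are not). The Python sketch below illustrates that general idea only; the field names and the m and u values are hypothetical, and it is not presented as Quality Stage's actual algorithm or configuration.

```python
import math

# Hypothetical (m, u) probabilities per field; real values would be estimated from data.
FIELDS = {
    "surname": (0.95, 0.01),
    "birth_date": (0.90, 0.003),
    "city": (0.85, 0.10),
}


def agreement_weight(m, u):
    """Weight added when a field agrees: log2(m / u)."""
    return math.log2(m / u)


def disagreement_weight(m, u):
    """Weight added (usually negative) when a field disagrees: log2((1 - m) / (1 - u))."""
    return math.log2((1 - m) / (1 - u))


def composite_weight(record_a, record_b):
    """Sum the per-field weights for a pair of records using exact comparison."""
    total = 0.0
    for field, (m, u) in FIELDS.items():
        if record_a[field] == record_b[field]:
            total += agreement_weight(m, u)
        else:
            total += disagreement_weight(m, u)
    return total


a = {"surname": "SMITH", "birth_date": "1970-04-02", "city": "BOSTON"}
b = {"surname": "SMITH", "birth_date": "1970-04-02", "city": "CAMBRIDGE"}
c = {"surname": "JONES", "birth_date": "1981-11-30", "city": "CAMBRIDGE"}

score_ab = composite_weight(a, b)  # two agreements, one disagreement
score_ac = composite_weight(a, c)  # disagreement on every field
```

Because each pair's score is a plain sum of per-field weights, it is straightforward to inspect which fields pushed a pair toward or away from a match, which is one reason results of this kind are easy to audit.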
Worldwide address standardization, verification, and enrichment capabilities, including
certification modules for the United States, Canada, and Australia, add to the value of
cleansed address data.
Domain-agnostic data cleansing capabilities handle product data, phone numbers,
email addresses, birth dates, events, and other comment and descriptive fields. Common
data quality anomalies, such as data in the wrong field or data spilling over into the
next field, can be identified and addressed.
Extensive reporting provides metrics that yield business intelligence about the data and
help tune the application for quality assurance.

Service-oriented architecture (SOA) enablement with Info Sphere Information Services
Director allows you to leverage data quality logic built using the IBM Info Sphere
Information Server and publish it as an "always on, available everywhere" service in a
SOA, in minutes.

The bottom line is that Quality Stage helps to ensure that systems deliver accurate,
complete, trusted inform
