
Teradata Warehouse Miner Express

Introduction
Release 5.4.2
B035-2305-106A
October 2016
The product or products described in this book are licensed products of Teradata Corporation or its affiliates.

Teradata, BYNET, DBC/1012, DecisionCast, DecisionFlow, DecisionPoint, Eye logo design, InfoWise, Meta Warehouse, MyCommerce,
SeeChain, SeeCommerce, SeeRisk, Teradata Warehouse Miner, Teradata Source Experts, WebAnalyst, and You've Never Seen Your Business
Like This Before are trademarks or registered trademarks of Teradata Corporation or its affiliates.
Adaptec and SCSISelect are trademarks or registered trademarks of Adaptec, Inc.
AMD Opteron and Opteron are trademarks of Advanced Micro Devices, Inc.
BakBone and NetVault are trademarks or registered trademarks of BakBone Software, Inc.
Cloudera and the Cloudera logo are trademarks of Cloudera, Inc.
This software contains material under license from DUNDAS SOFTWARE LTD., which is © 1994-1999 DUNDAS SOFTWARE LTD., all
rights reserved.
EMC, PowerPath, SRDF, and Symmetrix are registered trademarks of EMC Corporation.
GoldenGate is a trademark of GoldenGate Software, Inc.
Hewlett-Packard and HP are registered trademarks of Hewlett-Packard Company.
Hortonworks, the Hortonworks logo and other Hortonworks trademarks are trademarks of Hortonworks Inc. in the United States and other
countries.
Intel, Pentium, and XEON are registered trademarks of Intel Corporation.
IBM, CICS, DB2, MVS, RACF, Tivoli, and VM are registered trademarks of International Business Machines Corporation.
Linux is a registered trademark of Linus Torvalds.
LSI and Engenio are registered trademarks of LSI Corporation.
MapR, MapR Heatmap, Direct Access NFS, Distributed NameNode HA, Direct Shuffle and Lockless Storage Services are all trademarks of
MapR Technologies, Inc.
Microsoft, Active Directory, Windows, Windows NT, Windows Server, Windows Vista, Visual Studio and Excel are either registered trademarks
or trademarks of Microsoft Corporation in the United States or other countries.
MongoDB, Mongo, and the leaf logo are registered trademarks of MongoDB, Inc.
Novell and SUSE are registered trademarks of Novell, Inc., in the United States and other countries.
QLogic and SANbox are trademarks or registered trademarks of QLogic Corporation.
SAS, SAS/C and Enterprise Miner are trademarks or registered trademarks of SAS Institute Inc.
SPSS is a registered trademark of SPSS Inc.
STATISTICA and StatSoft are trademarks or registered trademarks of StatSoft, Inc.
SPARC is a registered trademark of SPARC International, Inc.
Sun Microsystems, Solaris, Sun, and Sun Java are trademarks or registered trademarks of Sun Microsystems, Inc., in the United States and
other countries.
Symantec, NetBackup, and VERITAS are trademarks or registered trademarks of Symantec Corporation or its affiliates in the United States
and other countries.
Unicode is a collective membership mark and a service mark of Unicode, Inc.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Other product and company names mentioned herein may be the trademarks of their respective owners.

THE INFORMATION CONTAINED IN THIS DOCUMENT IS PROVIDED ON AN AS-IS BASIS, WITHOUT
WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING THE IMPLIED WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. SOME JURISDICTIONS
DO NOT ALLOW THE EXCLUSION OF IMPLIED WARRANTIES, SO THE ABOVE EXCLUSION MAY NOT APPLY
TO YOU. IN NO EVENT WILL TERADATA CORPORATION BE LIABLE FOR ANY INDIRECT, DIRECT, SPECIAL,
INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS OR LOST SAVINGS, EVEN IF
EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

The information contained in this document may contain references or cross-references to features, functions, products, or services that are
not announced or available in your country. Such references do not imply that Teradata Corporation intends to announce such features, functions,
products, or services in your country. Please consult your local Teradata Corporation representative for those features, functions, products, or
services available in your country.
Information contained in this document may contain technical inaccuracies or typographical errors. Information may be changed or updated
without notice. Teradata Corporation may also make improvements or changes in the products or services described in this information at any
time without notice.
To maintain the quality of our products and services, we would like your comments on the accuracy, clarity, organization, and value of this
document. Please email: teradata-books@lists.teradata.com
Any comments or materials (collectively referred to as Feedback) sent to Teradata Corporation will be deemed non-confidential. Teradata
Corporation will have no obligation of any kind with respect to Feedback and will be free to use, reproduce, disclose, exhibit, display, transform,
create derivative works of, and distribute the Feedback and derivative works thereof without limitation on a royalty-free basis. Further, Teradata
Corporation will be free to use any ideas, concepts, know-how, or techniques contained in such Feedback for any purpose whatsoever, including
developing, manufacturing, or marketing products or services incorporating Feedback.
Copyright © 1999-2016 by Teradata Corporation. All Rights Reserved.

Preface

Purpose
This introductory document provides the following information:
Limitations of the Express Edition
Overview of the TWM family of products
General installation and configuration instructions
Configuring the tutorial environment
Examples of using the product

Audience
Professionals interested in evaluating Teradata Warehouse Miner through the use of the
Teradata Warehouse Miner Express product.

Revision Record
The following table lists a history of releases where this guide has been revised:

Release      Date        Description

TWM 5.4.2    10/31/16    Maintenance Release
TWM 5.4.1    01/08/16    Maintenance Release
TWM 5.4.0    07/31/15    Feature Release
TWM 5.3.5    06/16/14    Maintenance Release
TWM 5.3.4    08/31/13    Maintenance Release
TWM 5.3.3    06/30/12    Maintenance Release
TWM 5.3.2    06/01/11    Maintenance Release
TWM 5.3.1    06/30/10    Maintenance Release
TWM 5.3.0    10/30/09    Feature Release
TWM 5.2.2    02/05/09    Maintenance Release
TWM 5.2.1    12/15/08    Maintenance Release
TWM 5.2.0    05/31/08    Feature Release
TWM 5.1.1    01/23/08    Maintenance Release
TWM 5.1.0    07/12/07    Feature Release
TWM 5.0.1    11/16/06    Maintenance Release
TWM 5.0.0    09/22/06    Major Release

How This Manual Is Organized


This manual is organized and presents information as follows:
Chapter 1: Introduction provides an introduction to the product.
Chapter 2: Installation and Configuration describes how to install the product.
Chapter 3: Analysis Examples describes the features of the product and how to use
them.

Conventions Used In This Manual


The following typographical conventions are used in this guide:

Convention    Description

Italic        Titles (esp. screen names/titles)
              New terms for emphasis
Monospace     Code sample
              Output
ALL CAPS      Acronyms
Bold          Important term or concept
GUI Item      Screen item and/or esp. something you will click on or highlight in
              following a procedure.

Related Documents
Related Teradata documentation and other sources of information are available from:
http://www.info.teradata.com
Additional technical information on data warehousing and other topics is available from:
http://www.teradata.com/t/resources


Support Information
Services, support and training information is available from:
http://www.teradata.com/services-support


Table of Contents

Preface
    Purpose
    Audience
    Revision Record
    How This Manual Is Organized
        Conventions Used In This Manual
    Related Documents
        Support Information

Chapter 1: Introduction
    About Teradata Warehouse Miner
    About The Express Edition
    About the TWM Family of Products
        Teradata Profiler
        Teradata ADS Generator/Teradata Data Set Builder for SAS
        Teradata Warehouse Miner

Chapter 2: Installation and Configuration
    Software Dependencies
        Client Operating System
        Supporting Client Software
        Additional Client Software
        Teradata Database Support
    Installation Instructions
    Configuration Instructions
        Creating Tutorial User/Databases
        Configuring a Data Source
        Configuring Connection Properties
        Creating Metadata Tables
        Installing Tutorial Tables and Functions
        Installing the Tutorial Projects

Chapter 3: Analysis Examples
    Getting Started with Teradata Warehouse Miner
    Exploring Data with a Data Explorer Analysis
    Creating an Analytic Data Set
    Creating and Scoring a Decision Tree model
        Building a Decision Tree Model
        Scoring a Decision Tree Model

Appendix A: References


List of Figures

Figure 1: Connection Properties
Figure 2: Teradata Warehouse Miner Express: Main Screen
Figure 3: Data Explorer Tutorial #1
Figure 4: Graph snapshot
Figure 5: Graph
Figure 6: City Name Thumbnail Graph
Figure 7: Variable Creation Tutorial #1
Figure 8: Add (Arithmetic)
Figure 9: Absolute Value (Arithmetic)
Figure 10: Average (Aggregation)
Figure 11: Coalesce (Case)
Figure 12: avg_cc_tran_amt
Figure 13: INPUT > Anchor Table: Select TWM_CUSTOMER
Figure 14: Join Path Wizard
Figure 15: Tree Browser
Figure 16: Text Tree tab
Figure 17: Lift Chart tab


List of Tables

Table 1: Decision Tree Report
Table 2: Dependent Variables
Table 3: Independent Variables
Table 4: Confusion Matrix
Table 5: Cumulative Lift Table
Table 6: Decision Tree Model Scoring Report
Table 7: Confusion Matrix
Table 8: Data


CHAPTER 1 Introduction

What's In This Chapter

1 About Teradata Warehouse Miner
2 About The Express Edition
3 About the TWM Family of Products

About Teradata Warehouse Miner


Teradata Warehouse Miner is software that allows users to perform data mining entirely
within a Teradata data warehouse or Aster database. In contrast to earlier data mining
architectures that operate outside the warehouse, Teradata Warehouse Miner users perform
data mining without the additional hardware, software, and associated data management
processes those architectures require. Additionally, the product is separated into distinct
offerings, giving different types of Teradata users the functionality they need to perform
data profiling, analytic data set creation, and model building, scoring, and evaluation.

About The Express Edition


The Express Edition of Teradata Warehouse Miner is functionally equivalent to the standard
edition of Teradata Warehouse Miner with the exception that the sum of the utilized space in all
of the configured source databases must be less than or equal to 40 Gigabytes. (The source
databases for a particular data source are specified using the Connection Properties dialog,
and these source databases are the only ones that may be used in the Express Edition's input
selectors.)
This limitation is validated whenever the user connects to a data source or an automatic
connection is made based on startup properties or batch execution parameters. It is also
validated whenever the Connection Properties dialog is displayed and the OK button is
selected.
(Note that the validation of the space limitation for the Express Edition requires SELECT
access to the dbc.allspace view in Teradata or EXECUTE permission on the nc_relationstats
SQL-MR function in Aster for the database user. If for any reason it is not possible to validate
the size limitation, the product will not function fully.)
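
The size check described above amounts to summing the utilized space of every configured source database and comparing the total with the 40-gigabyte ceiling. The following Python sketch illustrates only that arithmetic; it is not the product's implementation. The function name, the dictionary of per-database byte counts, and the choice of 1 gigabyte = 1024**3 bytes are assumptions made for illustration (this guide does not state which definition of gigabyte is used).

```python
# Illustrative sketch only; names and the gigabyte definition are assumptions.
GIGABYTE = 1024 ** 3                 # assumed: binary gigabyte
EXPRESS_LIMIT_BYTES = 40 * GIGABYTE  # 40 GB ceiling stated in this guide


def within_express_limit(used_bytes_by_database, limit_bytes=EXPRESS_LIMIT_BYTES):
    """Return True when the combined utilized space of all configured
    source databases is less than or equal to the limit."""
    total_used = sum(used_bytes_by_database.values())
    return total_used <= limit_bytes


if __name__ == "__main__":
    # Hypothetical utilized-space figures for two source databases.
    usage = {"twm_source": 30 * GIGABYTE, "sales_source": 10 * GIGABYTE}
    print(within_express_limit(usage))
```

Because the limit applies to the sum across all source databases, adding one more source database to the Connection Properties can push an otherwise valid configuration over the limit.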


About the TWM Family of Products


The Express Edition of Teradata Warehouse Miner may be used to evaluate the following
members of the Teradata Warehouse Miner family of products:
Teradata Profiler
Teradata ADS Generator / Teradata Data Set Builder for SAS
Teradata Warehouse Miner
(Note that the Express Edition does not include the Teradata Model Manager web-based
application that is part of the second and third packages above.)

Teradata Profiler
The first of the products that may be evaluated using the Express Edition of Teradata
Warehouse Miner is the Teradata Profiler. The components available in this offering were
developed to provide a comprehensive data profiling and exploration tool that analyzes data
directly within the data warehouse through the use of generated SQL. A wide variety of
descriptive statistics functions are available to generate reports and graphs with drill down
capabilities, pointing out potential issues with data quality.
The highlight of this offering is the Data Explorer analysis that can perform the Values,
Frequency, Histogram and Statistics analyses on selected tables or columns, using multiple
threads of SQL operation and providing thumbnail graphs and drill-down capabilities.

Teradata ADS Generator/Teradata Data Set Builder for SAS


The second of the products that may be evaluated using the Express Edition of Teradata
Warehouse Miner is the Teradata ADS Generator, also known as the Teradata Data Set
Builder for SAS. This product includes all of the components of Teradata Profiler in
addition to analyses that aid in the generation of Analytic Data Sets, analyses that can build
and export a Correlation Matrix (and related matrix types), an analysis to score models that
are described using the Predictive Model Markup Language (PMML) and an analysis to publish
analytic data sets and/or models for deployment through the Teradata Model Manager web-
based application.
The need to build Analytic Data Sets derives from the fact that the data associated with the
highly normalized data models found in modern data warehouses are not directly suited for
predictive modeling. A precursor to the creation of analytic models is therefore the creation of
an Analytic Data Set (ADS), presenting the data in a flat structure in which all of the attributes
or variables associated with the item of interest (customer, household, account, etc.) are
present in a single table (data set). Components are provided that aid in the creation of these
variables, dimensioning or de-normalizing them, as well as statistically transforming them.
Additional components allow tables to be sampled, partitioned, de-normalized or joined
together.


Teradata Warehouse Miner


The third of the products that may be evaluated using the Express Edition of Teradata
Warehouse Miner is the Teradata Warehouse Miner product. In addition to the features of
the Teradata Profiler and Teradata ADS Generator, it provides analytic algorithms that make
predictions via Logistic Regression, Decision Tree or Linear Regression algorithms.
Dimensionality reduction is offered with several flavors of Factor Analysis, while the Clustering
algorithm provides a mechanism for customer segmentation and for solving various business
problems where similar grouping is required. The Association Rules algorithm with optional
Sequence Analysis provides a solution for problems such as market basket analysis and analysis
of channel usage.
With the exception of Association Rules, all of the models produced by the algorithms can be
scored directly in the database using generated SQL.
Finally, the Teradata Warehouse Miner product includes a collection of 17 Statistical Tests,
including various Binomial, Kolmogorov-Smirnov, Parametric, Rank and Contingency Table
tests.
Note: This third set of functionality is not available when connected to an Aster database.



CHAPTER 2 Installation and Configuration

What's In This Chapter

1 Software Dependencies
2 Installation Instructions
3 Configuration Instructions
This chapter outlines the installation and configuration instructions for Teradata Warehouse
Miner Express. See more detailed instructions in the Teradata Warehouse Miner User Guide
(Volume 1).

Software Dependencies

Client Operating System


One of the following operating systems is required:
Microsoft Windows 7 Professional
Microsoft Windows 8 Professional
Microsoft Windows 8.1 Professional
Microsoft Windows 10 Professional

Supporting Client Software


Required supporting client software includes:
Microsoft .NET Runtime 4.0
(If this is missing at installation time, the installer will provide a link to download it. It
may co-exist with other versions of the .NET Runtime.)
Teradata ODBC Driver (32-bit) for Windows Version 14.10.00.00 or later.
Aster ODBC Driver (32-bit) Versions 6.0, 6.10 or 6.20, using the latest fix release
This and other versions are available at the following link:
http://downloads.teradata.com/download/connectivity/odbc-driver/windows
Note: If desired, the latest version of the ODBC driver consistent with the version of the
Teradata database and utilities may be used (for example, 15.00.00.00 may be used with
Teradata 15.0.)


Additional Client Software


When installing the Teradata Warehouse Miner tutorial environment, additional client
software is required, including the following components of the Teradata Tools and Utilities
for Windows. (For use with the Express Edition of Teradata on a system with a limited
amount of data, a download is available from the Teradata Developer Exchange at Teradata
Express Tools and Utilities
(http://downloads.teradata.com/download/tools/teradata-tools-and-utilities-windows-installation-package).)
Teradata FastLoad Utility (consistent with installed version of ODBC).
This is required if installing the TWM tutorial tables (to execute the tutorial projects) or
Statistical Test metadata tables.
Teradata BTEQ Utility (consistent with installed version of ODBC).
This is required if installing User Defined Functions (UDFs) for PMML or Matrix
building. (Although the UDFs are optional for Matrix building, most PMML tutorial
cases require these UDFs.)
Aster Cluster Terminal (ACT) (consistent with installed version of ODBC).
This is required if installing the TWM tutorial tables (to execute the tutorial projects) on
an Aster Database.

Teradata Database Support


Teradata Warehouse Miner supports the following releases of Teradata:
Teradata Version 14.10
Teradata Version 15.0
Teradata Version 15.10
Teradata Warehouse Miner supports the following releases of Aster:
Aster Version 6.0
Aster Version 6.10
Aster Version 6.20

Installation Instructions
Teradata Warehouse Miner Express is installed by opening the supplied TWM_Express.msi file
and following the prompts from the installation dialog. The following points should be
considered before installing the software.
Prior to installing Teradata Warehouse Miner Express, any previous versions of TWM,
Profiler or ADS Generator must first be removed.
When installing Teradata Warehouse Miner or any Teradata Tools and Utilities (TTU)
component, be sure to reboot if/when the individual installation programs tell you to. Do
not delay your reboot until all the software has been installed.


Configuration Instructions
There are two ways to configure Teradata Warehouse Miner Express. One is to configure it to
access the tutorial environment, and the other is to configure it to access other data in your
Teradata system. You can of course configure both ways, but configuring for the tutorial
environment is what is described here.
Configuring Teradata Warehouse Miner Express for the tutorial environment makes it
possible to use the tutorial projects that come with the product and that match the examples
in the online help system. Importing these projects is described later in this section.

Creating Tutorial User/Databases


The tutorial projects reference particular databases which can be created using the following
suggested SQL. Note that database sizes and passwords can be adjusted to suit specific needs.
(The sizes shown add up to 200 Megabytes.)
-- Tutorial user (100 MB of permanent space)
create user twm from dbc as perm=100000000, default database=twm_source,
password = twm;
grant all on twm to twm with grant option;
grant all on twm to dbc with grant option;
grant sel on dbc to twm; -- optional
grant select on td_sysfnlib to twm;
grant execute on td_sysfnlib to twm;
grant select on sysudtlib to twm;
grant udttype on sysudtlib to twm;
grant select on sysspatial to twm;

-- Source database for the tutorial tables (50 MB)
create database twm_source from dbc as perm=50000000;
grant all on twm_source to twm_source;
grant all on twm_source to twm with grant option;
grant all on twm_source to dbc with grant option;

-- Results database (50 MB)
create database twm_results from dbc as perm=50000000;
grant all on twm_results to twm_results;
grant all on twm_results to twm with grant option;
grant all on twm_results to dbc with grant option;

-- Give twm_results access to the source data
grant all on twm_source to twm_results with grant option;
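
As a quick check of the sizes quoted above, the three perm values can be added up to confirm the stated 200-megabyte total. The Python sketch below does only that arithmetic; the dictionary and function names are illustrative, and it assumes decimal megabytes (1 MB = 1,000,000 bytes), which is the reading under which the figures add up to exactly 200.

```python
# Perm values copied from the suggested SQL; names here are illustrative.
PERM_BYTES = {
    "twm": 100_000_000,
    "twm_source": 50_000_000,
    "twm_results": 50_000_000,
}


def total_megabytes(perm_bytes):
    """Sum the permanent space allocations and express them in decimal
    megabytes (assumption: 1 MB = 1,000,000 bytes)."""
    return sum(perm_bytes.values()) / 1_000_000


if __name__ == "__main__":
    print(total_megabytes(PERM_BYTES))
```

If the perm values are adjusted to suit a particular system, the same sum gives the total permanent space the tutorial environment will reserve from dbc.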

Configuring a Data Source


An ODBC Data Source must be made available for accessing source tables, creating result
tables and creating and accessing metadata tables.
To create a data source called twm tutorial, execute the ODBC Data Source Administrator
from the TWM Tools menu or from Control Panel > Administrative Tools > Data Sources
(ODBC). Add an entry as follows:
Category: System DSN
Driver: Teradata
Name: twm tutorial
Server: <name or IP address of Teradata server>
Username: twm
Password: twm
Default Database: twm_source
Session Character Set: ASCII
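
Once the data source exists, other ODBC clients can refer to it by name as well. The Python sketch below only builds a connection string from the DSN name and tutorial credentials listed above, using the standard ODBC keywords DSN, UID and PWD. The function name is hypothetical, and actually opening a connection would need an ODBC library (for example pyodbc) that this guide does not cover.

```python
def build_connection_string(dsn, username, password):
    """Assemble an ODBC connection string from its standard keywords.
    Illustrative helper; not part of Teradata Warehouse Miner."""
    parts = [
        ("DSN", dsn),       # name of the data source defined above
        ("UID", username),  # database user
        ("PWD", password),  # database password
    ]
    return ";".join(f"{key}={value}" for key, value in parts)


if __name__ == "__main__":
    # Values taken from the tutorial data source entry.
    print(build_connection_string("twm tutorial", "twm", "twm"))
```

Note that the ODBC drivers listed under Supporting Client Software are 32-bit, so any client process that loads this data source would also have to be 32-bit.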

Configuring Connection Properties


Teradata Warehouse Miner Connection Properties must be specified for each ODBC Data
Source to be used, utilizing the supplied Connection Properties dialog accessed from the Tools
menu, and specifying Source Databases and various metadata databases.
To specify the Connection Properties for the tutorial environment, first connect to the tutorial
data source created above using either the toolbar icon for connecting or the Tools menu item
to Change Connection. The Connection Properties for the tutorial environment should look
like the following:


Figure 1: Connection Properties

Creating Metadata Tables


Metadata tables must be created prior to saving created projects. The TWM Tools menu
contains items for Metadata Creation, Publish Tables Creation and Advertise Tables Creation.
These should only be executed once, since they will remove any previously created metadata
tables (after several warning prompts).

Installing Tutorial Tables and Functions


Program items for Teradata Warehouse Miner Express are supplied to:
Install or Uninstall UDFs
Load Demonstration Data
Load Statistical Test Metadata


Performing these operations requires that the Teradata FastLoad and BTEQ utilities be
installed on the workstation where Teradata Warehouse Miner Express is installed (refer
to Additional Client Software for details).
Performing these operations also requires an entry in the hosts file on the client machine. (To
add an entry, you must have administrative privileges.) The hosts file is located in the folder
C:\WINDOWS\system32\drivers\etc. For example, if the IP address is 127.0.0.1 and the system
name is dbc, the hosts entry should be:
127.0.0.1 dbc dbccop1

(The three items in the line above should be separated by a tab character.)
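
Because the tab separators are easy to lose when typing the entry by hand, the line can also be generated. The Python sketch below is a minimal illustration: the function name is hypothetical, and the rule that the third item is the system name followed by "cop1" is inferred from the single example above rather than stated by this guide.

```python
def hosts_entry(ip_address, system_name):
    """Build a tab-separated hosts line: IP address, system name, and the
    system name with a 'cop1' suffix (pattern inferred from the example)."""
    return "\t".join([ip_address, system_name, system_name + "cop1"])


if __name__ == "__main__":
    # Reproduces the example entry for system dbc at 127.0.0.1.
    print(hosts_entry("127.0.0.1", "dbc"))
```

The resulting line still has to be added to the hosts file by a user with administrative privileges.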
To install the tutorial tables in the twm_source database on host dbc, perform the following.
(Note that this operation requires the Teradata FastLoad utility.)
Execute the program item Start > Programs > Teradata Warehouse Miner 5.4.2 > Load
Demonstration Data.
Hostname: dbc
Userid: twm
Password: twm
Account:
Database: twm_source
Char Set:
To install the Statistical Test tables in the twm database on host dbc, perform the following.
(Note that this operation requires the Teradata FastLoad utility.)
Execute the program item Start > Programs > Teradata Warehouse Miner 5.4.2 > Load Statistical
Test Metadata.
Hostname: dbc
Userid: twm
Password: twm
Account:
Database: twm
Char Set:
To install the PMML User Defined Functions (UDFs) in the twm database on host dbc,
perform the following. (Note that this operation requires the Teradata Bteq utility and that
the target Teradata system has a suitable C compiler.)
Execute the program item Start > Programs > Teradata Warehouse Miner 5.4.2 > PMML UDF
Creation.
Hostname: dbc
Userid: twm
Password: twm


Database: twm
Account:
Char Set:

Installing the Tutorial Projects


The tutorial projects that are delivered with Teradata Warehouse Miner can be accessed from
the Help menu using the Import Tutorial Projects menu item. Each .bin file contains one or
more tutorial projects, each of which contains one or more analyses. One or more of these
.bin files can be imported into the working environment by selecting them on the Open dialog
and by clicking the Open button.
Clicking the Open button leads to the Import Wizard dialog and the database matching screen.
If the user is connected to the tutorial environment, it should be possible to just click on the
Import button to load the projects without changing (mapping) database names.
The tutorial projects may be imported from the Help menu and executed directly in the
project work area. If desired, they may also be saved to metadata for easier access at a later
time by right-clicking on a project in the project work area and selecting any of the Save
options.



CHAPTER 3 Analysis Examples

What's In This Chapter

This chapter provides examples of using Teradata Warehouse Miner functions. The examples
include:
1 Getting Started with Teradata Warehouse Miner on page 13
2 Exploring Data with a Data Explorer Analysis on page 15
3 Creating an Analytic Data Set on page 17
4 Creating and Scoring a Decision Tree model on page 21
Additional information about these and many other features can be found in the applicable
user guide and in the help system. Note that context-sensitive help is also available by pressing
the F1 key.

Getting Started with Teradata Warehouse Miner

The first thing to do after starting Teradata Warehouse Miner is to get familiar with the main
screen of the application, as seen below.


Figure 2: Teradata Warehouse Miner Express: Main Screen

There are three windows on the main screen, the largest of which is for viewing and editing
analysis forms. On the right is the Project Explorer window where open projects and the
analyses they contain are displayed in a tree view. Underneath both of these areas is the
Execution Status window. Directly over the analysis work area is a toolbar with icons for
primary functions (the names of which can be seen by hovering over them), and over that is a
series of menu topics, including File, View, Project, Tools, Window and Help.
In the sample screen above, the Open Connection icon has been selected to connect to data
source dbc twm, and the Add New Analysis icon has been selected to select Data Explorer from
the Descriptive Statistics category.
The Data Explorer input form covers most of the main screen. On the left side of the form are
selectors for choosing databases, tables and columns, and on the right is an area into which
selected columns can be dragged. (The arrow buttons in the middle can also be used to select
and de-select columns.)
Over the selectors are tabs for INPUT, OUTPUT and RESULTS, with sub-tabs that depend on
the type of analysis. After the parameters for an analysis have been specified, the analysis can
be executed by clicking the run button above, by right-clicking on the project or analysis in the
project work area and selecting run, or by pressing the F5 key on the keyboard. The status of
the execution is displayed in the Execution Status window below. When execution is
complete, the RESULTS tab is enabled, and upon selection, the resulting data, graphs and
generated SQL (depending on analysis type) can be viewed.


Exploring Data with a Data Explorer Analysis


Parameterize a Data Explorer analysis as follows:
Input Source: MultiTable
Available Databases: The database where the demonstration data was installed.
Available Tables:
TWM_CHECKING_ACCT
TWM_CREDIT_ACCT
TWM_CUSTOMER
TWM_SAVINGS_ACCT
Analyses to Perform
Values: Enabled
Compute Unique Values: Enabled
Statistics: Enabled
Frequency: Enabled
Histogram: Enabled
Output
Values Analysis Output Table: twm_values
Statistics Analysis Output Table: twm_stats
Frequency Analysis Output Table: twm_freq
Histogram Analysis Output Table: twm_hist
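To make the four enabled analyses more concrete, the following Python sketch computes comparable summaries for a single toy column. It is a conceptual illustration only: the function names, the toy data and the exact measures shown are assumptions, and the output tables that Teradata Warehouse Miner actually produces contain more detail and are generated by SQL running in the database.

```python
from collections import Counter
from statistics import mean


def values_summary(column):
    """Values analysis: row count, NULL count and number of unique values."""
    non_null = [v for v in column if v is not None]
    return {"rows": len(column),
            "nulls": len(column) - len(non_null),
            "unique": len(set(non_null))}


def statistics_summary(column):
    """Statistics analysis: a few basic descriptive statistics."""
    non_null = [v for v in column if v is not None]
    return {"min": min(non_null), "max": max(non_null), "mean": mean(non_null)}


def frequency(column):
    """Frequency analysis: occurrences of each non-NULL value."""
    return Counter(v for v in column if v is not None)


def histogram(column, bins):
    """Histogram analysis: counts per equal-width bin over non-NULL values."""
    non_null = [v for v in column if v is not None]
    low, high = min(non_null), max(non_null)
    width = (high - low) / bins
    counts = [0] * bins
    for v in non_null:
        # The maximum value falls into the last bin; a zero width means one bin.
        index = 0 if width == 0 else min(int((v - low) / width), bins - 1)
        counts[index] += 1
    return counts


# Hypothetical "age" column with one NULL (None) value.
ages = [25, 31, 31, 47, None, 62, 62, 62]

if __name__ == "__main__":
    print(values_summary(ages))
    print(statistics_summary(ages))
    print(frequency(ages).most_common(1))
    print(histogram(ages, bins=2))
```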
Run the analysis, and when it completes, click on the RESULTS tab.
Data
Click Data and then Load to view the four tables produced; choose the desired table in
the pull-down selector.
Figure 3: Data Explorer Tutorial #1

Graph


The following is a snapshot of the icon displayed when the graph tab is selected.
Figure 4: Graph snapshot

Clicking anywhere in this picture displays the actual graph object.
Figure 5: Graph

Clicking on the city_name thumbnail graph (6th from the left in the second row) leads
to the following display, while clicking on the bar for San Diego adds the drill down
box to the display. By clicking on the drill down button the customers in San Diego can
be displayed.


Figure 6: City Name Thumbnail Graph

Creating an Analytic Data Set


The following depicts a tutorial example of creating an Analytic Data Set using the Variable
Creation analysis. Following this depiction are step-by-step instructions for defining the
variables created in this tutorial.


Figure 7: Variable Creation Tutorial #1

Parameterize the above Variable Creation analysis as follows:


1 Select TWM_CUSTOMER as the Available Table.
2 Create seven variables by double-clicking on the following columns. (Note that the
variable name will default to the column name.)
TWM_CUSTOMER.cust_id
TWM_CUSTOMER.income
TWM_CUSTOMER.age
TWM_CUSTOMER.years_with_bank
TWM_CUSTOMER.nbr_children
TWM_CUSTOMER.gender
TWM_CUSTOMER.marital_status
3 Select TWM_CREDIT_TRAN as the Available Table.
4 Create a variable by clicking on the New button and build up an expression as follows.
5 Drag an Add (Arithmetic) SQL Element over the Variable, and then drag the following two
columns over the empty arguments:
TWM_CREDIT_TRAN.interest_amt
TWM_CREDIT_TRAN.principal_amt


Figure 8: Add (Arithmetic)

6 Because there may be negative values, drag and drop an Absolute Value (Arithmetic) SQL
Element over both interest_amt and principal_amt:
Figure 9: Absolute Value (Arithmetic)

7 Take the average of this expression, by dragging and dropping an Average (Aggregation)
on top of the Add:
Figure 10: Average (Aggregation)

8 Because this analysis may generate many NULL values by joining TWM_CUSTOMER to
TWM_CREDIT_TRAN, drag a Coalesce (Case) on top of the Average:
Figure 11: Coalesce (Case)


9 Drag and drop a Number (Literal) 0 into the expressions folder and rename it from
Variable1 to avg_cc_tran_amt to complete the variable:
Figure 12: avg_cc_tran_amt

10 Go to INPUT-anchor table and select TWM_CUSTOMER as the anchor table, as seen below.
Figure 13: INPUT > Anchor Table: Select TWM_CUSTOMER

11 Specify the Join Path from TWM_CUSTOMER to TWM_CREDIT_TRAN by clicking on
the Wizard button and specifying that they be joined on the column "cust_id".


Figure 14: Join Path Wizard

12 Go to OUTPUT-storage, and select Store the tabular output of this analysis in the database.
Specify that a Table should be created named twm_tutorials_vc1.
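The expression built in steps 4 through 9 nests the SQL elements as Coalesce(Average(Add(Absolute Value, Absolute Value)), 0). The sketch below shows a query of that shape running against two tiny in-memory tables, using Python's sqlite3 module in place of Teradata. The toy rows, the function name build_toy_ads and the use of a left outer join from the anchor table are assumptions made for illustration; the SQL that Teradata Warehouse Miner actually generates may differ, and only cust_id and avg_cc_tran_amt are included here.

```python
import sqlite3

# Query shaped like the tutorial variable: COALESCE(AVG(ABS(a) + ABS(b)), 0).
ADS_QUERY = """
SELECT c.cust_id,
       COALESCE(AVG(ABS(t.interest_amt) + ABS(t.principal_amt)), 0)
           AS avg_cc_tran_amt
FROM twm_customer AS c
LEFT OUTER JOIN twm_credit_tran AS t
    ON c.cust_id = t.cust_id
GROUP BY c.cust_id
ORDER BY c.cust_id
"""


def build_toy_ads():
    """Run the sketch query on hypothetical rows and return the result set."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE twm_customer (cust_id INTEGER PRIMARY KEY);
        CREATE TABLE twm_credit_tran (
            cust_id INTEGER, interest_amt REAL, principal_amt REAL);
        -- Customer 1 has two transactions; customer 2 has none.
        INSERT INTO twm_customer VALUES (1), (2);
        INSERT INTO twm_credit_tran VALUES (1, -2.0, 10.0), (1, 4.0, -6.0);
    """)
    rows = conn.execute(ADS_QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for cust_id, avg_cc_tran_amt in build_toy_ads():
        print(cust_id, avg_cc_tran_amt)
```

In this toy data, customer 2 has no credit transactions, so the average is NULL after the join and the Coalesce replaces it with 0, which is the situation step 8 anticipates.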

Creating and Scoring a Decision Tree model


Building a Decision Tree Model
The following depicts a tutorial example of creating a Decision Tree model. In this example a
standard Gain Ratio tree was built to predict credit card ownership, ccacct, based on 20
numeric and categorical input variables. Notice that the tree initially built contained 100
nodes but was pruned back to only 11, counting the root node. This yielded not only a
relatively simple tree structure, but also Model Accuracy of 95.72% on this training data.
Parameterize a Decision Tree as follows:
Available Tables: twm_customer_analysis
Dependent Variable: ccacct
Independent Variables:
income, age
years_with_bank, nbr_children
gender, marital_status
city_name, state_code
female, single
married, separated
ckacct, svacct
avg_ck_bal, avg_sv_bal
avg_ck_tran_amt, avg_ck_tran_cnt
avg_sv_tran_amt, avg_sv_tran_cnt
Tree Splitting: Gain Ratio
Minimum Split Count: 2
Maximum Nodes: 1000
Maximum Depth: 10
Bin Numeric Variables: Disabled
Pruning Method: Gain Ratio
Include Lift Table: Enabled
Response Value: 1
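For readers unfamiliar with the Gain Ratio splitting criterion, the following Python sketch shows the textbook definition: the information gain of a candidate split divided by the split information. This is a generic illustration under that assumption; the function names are hypothetical and the source does not describe how Teradata Warehouse Miner computes the measure internally.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(labels, groups):
    """Gain ratio of splitting `labels` into the non-empty `groups`.

    `groups` is a list of label lists that together partition `labels`.
    """
    total = len(labels)
    groups = [g for g in groups if g]
    # Weighted entropy remaining after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups)
    gain = entropy(labels) - remainder
    # Split information penalizes splits into many small groups.
    split_info = -sum(len(g) / total * log2(len(g) / total) for g in groups)
    return gain / split_info if split_info > 0 else 0.0


if __name__ == "__main__":
    labels = [1, 1, 0, 0]
    print(gain_ratio(labels, [[1, 1], [0, 0]]))  # split separates the classes
    print(gain_ratio(labels, [[1, 0], [1, 0]]))  # split tells us nothing
```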
Run the analysis and click on Results when it completes. For this example, the Decision Tree
analysis generated the following pages.

Decision Tree Report

Table 1: Decision Tree Report

Total observations      747
Nodes before pruning    33
Nodes after pruning     11
Model Accuracy          95.72%

Variables

Table 2: Dependent Variables

Dependent Variable

ccacct

Table 3: Independent Variables

Independent Variable
income
ckacct
avg_sv_bal
avg_sv_tran_cnt

Confusion Matrix

Table 4: Confusion Matrix

             Actual Non-Response  Actual Response  Correct       Incorrect
Predicted 0  340 / 45.52%         0 / 0.00%        340 / 45.52%  0 / 0.00%
Predicted 1  32 / 4.28%           375 / 50.20%     375 / 50.20%  32 / 4.28%
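The Model Accuracy figure in the report can be reproduced from the counts in the confusion matrix: the correct predictions (340 + 375) divided by all 747 observations. A minimal Python sketch of that arithmetic follows; the function and key names are hypothetical.

```python
def confusion_summary(true_neg, false_neg, false_pos, true_pos):
    """Summarize a 2x2 confusion matrix as percentages of all observations."""
    total = true_neg + false_neg + false_pos + true_pos

    def pct(n):
        return round(100 * n / total, 2)

    return {
        "total": total,
        "accuracy_pct": pct(true_neg + true_pos),
        "error_pct": pct(false_neg + false_pos),
    }


# Counts taken from Table 4: predicted 0 row, then predicted 1 row.
summary = confusion_summary(true_neg=340, false_neg=0, false_pos=32, true_pos=375)

if __name__ == "__main__":
    print(summary)
```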

Cumulative Lift Table

Table 5: Cumulative Lift Table

                                    Captured                      Cumulative  Cumulative
                          Response  Response        Cumulative  Response    Captured      Cumulative
Decile  Count   Response  (%)       (%)       Lift  Response    (%)         Response (%)  Lift
1       5.00    5.00      100.00    1.33      1.99  5.00        100.00      1.33          1.99
2       0.00    0.00      0.00      0.00      0.00  5.00        100.00      1.33          1.99
3       0.00    0.00      0.00      0.00      0.00  5.00        100.00      1.33          1.99
4       0.00    0.00      0.00      0.00      0.00  5.00        100.00      1.33          1.99
5       0.00    0.00      0.00      0.00      0.00  5.00        100.00      1.33          1.99
6       402.00  370.00    92.04     98.67     1.83  375.00      92.14       100.00        1.84
7       0.00    0.00      0.00      0.00      0.00  375.00      92.14       100.00        1.84
8       0.00    0.00      0.00      0.00      0.00  375.00      92.14       100.00        1.84
9       0.00    0.00      0.00      0.00      0.00  375.00      92.14       100.00        1.84
10      340.00  0.00      0.00      0.00      0.00  375.00      50.20       100.00        1.00
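The derived columns of the lift table can be recomputed from the Count and Response columns alone. The Python sketch below does so using definitions inferred from the numbers in Table 5 (lift as the decile response rate divided by the overall response rate of 375/747); the function and key names are hypothetical, and the product documentation should be consulted for the authoritative definitions.

```python
def lift_table(deciles):
    """Compute lift measures from a list of (count, responses) per decile."""
    total_count = sum(count for count, _ in deciles)
    total_resp = sum(resp for _, resp in deciles)
    overall_rate = total_resp / total_count
    rows = []
    cum_count = cum_resp = 0
    for number, (count, resp) in enumerate(deciles, start=1):
        cum_count += count
        cum_resp += resp
        # Empty deciles contribute a response rate of zero.
        rate = resp / count if count else 0.0
        cum_rate = cum_resp / cum_count if cum_count else 0.0
        rows.append({
            "decile": number,
            "response_pct": round(100 * rate, 2),
            "captured_pct": round(100 * resp / total_resp, 2),
            "lift": round(rate / overall_rate, 2),
            "cum_response_pct": round(100 * cum_rate, 2),
            "cum_captured_pct": round(100 * cum_resp / total_resp, 2),
            "cum_lift": round(cum_rate / overall_rate, 2),
        })
    return rows


# Count and Response columns from Table 5, deciles 1 through 10.
TABLE_5_COUNTS = [(5, 5), (0, 0), (0, 0), (0, 0), (0, 0),
                  (402, 370), (0, 0), (0, 0), (0, 0), (340, 0)]

if __name__ == "__main__":
    for row in lift_table(TABLE_5_COUNTS):
        print(row)
```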

Graphs
By default the Tree Browser is displayed as follows:


Figure 15: Tree Browser

Select the Text Tree tab to view the rules in textual format:
Figure 16: Text Tree tab

Additionally, you can click on Lift Chart to view the Lift Table graphically.
Figure 17: Lift Chart tab


Scoring a Decision Tree Model


In this example, the table that is scored is the same table that was used to build the
decision tree model, as a matter of convenience. Typically, this would not be done unless the
contents of the table had changed since the model was built.
Parameterize a Decision Tree Scoring analysis as follows:
Selected Tables: twm_customer_analysis
Scoring Method: Evaluate and Score
Use the name of the dependent variable as the predicted value column name: Enabled
Targeted Confidence(s) - For binary outcome only: Enabled
Targeted Value: 1
Result Table Name: twm_score_tree_1
Primary Index Columns: cust_id
Run the analysis, and click on Results when it completes. For this example, the Decision
Tree Scoring analysis generated the following pages.

Decision Tree Model Scoring Report

Table 6: Decision Tree Model Scoring Report

Resulting Scored Table Name      score_tree_1
Number of Rows in Scored File    747

Confusion Matrix

Table 7: Confusion Matrix

             Actual Non-Response  Actual Response  Correct       Incorrect
Predicted 0  340 / 45.52%         0 / 0.00%        340 / 45.52%  0 / 0.00%
Predicted 1  32 / 4.28%           375 / 50.20%     375 / 50.20%  32 / 4.28%

Cumulative Lift Table


In this case, the Cumulative Lift Table contains the same values as when the Decision Tree
model was built. (Note that the Cumulative Lift Table is available only when the Evaluate or
Evaluate and Score option is selected.)

Data

Table 8: Data

cust_id  cc_acct  _tm_target
1362480  1        0.92
1362481  0        0
1362484  1        0.92
1362485  0        0
1362486  1        0.92

Lift Graph
In this case, the Lift Graph is the same as when the Decision Tree model was built. (Note that
the Lift Graph is available only when the Evaluate or Evaluate and Score option is selected.)



APPENDIX A References

1 Teradata Warehouse Miner Model Manager User Guide, B035-2303-106A, October 2016
2 Teradata Warehouse Miner Release Definition, B035-2494-106C, October 2016
3 Teradata Warehouse Miner User Guide, Volume 1, Introduction and Profiling,
B035-2300-106A, October 2016
4 Teradata Warehouse Miner User Guide, Volume 2, ADS Generation, B035-2301-106A,
October 2016
5 Teradata Warehouse Miner User Guide, Volume 3, Analytic Functions, B035-2302-106A,
October 2016
