3/17/12

Data Mining
Submitted by: Paulami Roy (A-07), Tanaya Bag (A-10), Nirdeep Singh Sodhi (A-), Gautam Sharma (A-48), Ayush Mahajan (A-)

Meaning

Definition
Data mining is the process of discovering new patterns in large data sets using methods at the intersection of artificial intelligence, machine learning, statistics and database systems. It is also described as the extraction of hidden predictive information from large databases.

Foundations of Data Mining


| Evolutionary Step | Business Question | Enabling Technologies | Product Providers | Characteristics |
|---|---|---|---|---|
| Data Collection (1960s) | "What was my total revenue in the last five years?" | Computers, tapes, disks | IBM, CDC | Retrospective, static data delivery |
| Data Access (1980s) | "What were unit sales in Delhi last February?" | RDBMS, SQL, ODBC | Oracle, Sybase, Informix, IBM, Microsoft | Retrospective, dynamic data delivery at record level |
| Data Warehousing & Decision Support (1990s) | "What were unit sales in Delhi last February? Drill down to South Delhi." | OLAP, multidimensional databases, data warehouses | Pilot, Comshare, Arbor, Cognos, Microstrategy | Retrospective, dynamic data delivery at multiple levels |
| Data Mining (Today) | "What's likely to happen to South Delhi unit sales next month? Why?" | Advanced algorithms, multiprocessor computers, massive databases | Pilot, Lockheed, IBM, SGI, numerous startups | Prospective, proactive information delivery |

Scope of Data Mining

Techniques used in data mining


1) Artificial neural networks: Non-linear predictive models that learn through training and resemble biological neural networks in structure.

2) Decision trees: Tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset. Specific decision tree methods include Classification and Regression Trees (CART) and Chi-Square Automatic Interaction Detection (CHAID).

3) Genetic algorithms: Optimization techniques that use processes such as genetic combination, mutation and natural selection.

4) Nearest neighbour method: A technique that classifies each record in a dataset based on a combination of the classes of the k record(s) most similar to it in a historical dataset. Sometimes called the k-nearest neighbour technique.

5) Rule induction: The extraction of useful if-then rules from data based on statistical significance.
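To make the nearest neighbour method in item 4 concrete, here is a minimal sketch in Python. It assumes numeric features, Euclidean distance and a simple majority vote, none of which the slides specify. The function name `knn_classify` and the records in `HISTORY` are hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def knn_classify(query, history, k=3):
    """Return the majority class among the k historical records closest to `query`.

    `history` is a list of (features, label) pairs; `features` and `query`
    are equal-length tuples of numbers. Euclidean distance is an assumption.
    """
    # Sort historical records by distance to the query and keep the k closest.
    nearest = sorted(history, key=lambda rec: math.dist(rec[0], query))[:k]
    # Combine the classes of those k records by majority vote.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical historical dataset: two numeric features per record and a class.
HISTORY = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((2.0, 1.5), "low"),
    ((8.0, 9.0), "high"),
    ((9.0, 8.5), "high"),
    ((8.5, 9.5), "high"),
]

if __name__ == "__main__":
    print(knn_classify((2.0, 2.0), HISTORY, k=3))
    print(knn_classify((7.5, 8.0), HISTORY, k=3))
```

The choice of k trades noise sensitivity against blurring class boundaries: k = 1 follows the single closest record, while a larger k averages over more of the historical data.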

Architecture for Data Mining

APPLICATIONS

A pharmaceutical company can analyze its recent sales force activity and the results to improve its targeting of high-value physicians and determine which marketing activities will have the greatest impact in the next few months.

A credit card company can use its vast warehouse of customer transaction data to identify the customers most likely to be interested in a new credit product.

A diversified transportation company with a large direct sales force can apply data mining to identify the best prospects for its services.

A large consumer packaged goods company can apply data mining to improve its sales process to retailers.
