Data Mining:
Intelligent methods are applied to extract
useful information or patterns from data
Data Mining: A KDD Process:
Data mining is the core of the knowledge discovery
process.
Steps of a KDD Process
Data Cleaning
Handles noisy, inconsistent, and incomplete data
Missing Values
Noisy data
Binning, Clustering etc.
Inconsistencies
Tools, functional dependencies
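The missing-value and binning steps above can be sketched as follows. This is a minimal illustration, not a prescribed method: it assumes missing values are filled with the attribute mean and that noise is smoothed by equal-frequency bin means. The function names and the sample values are hypothetical.

```python
from statistics import mean


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed values (one assumed strategy)."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def smooth_by_bin_means(values, bin_size):
    """Sort the values, cut them into equal-frequency bins, and
    replace every value with the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        smoothed.extend([mean(bin_values)] * len(bin_values))
    return smoothed


# Hypothetical attribute values, used only for illustration.
filled = fill_missing_with_mean([10, None, 30])
smoothed = smooth_by_bin_means([4, 8, 15, 21, 21, 24, 25, 28, 34], bin_size=3)
print(filled)
print(smoothed)
```

Other fill strategies (a global constant, the most probable value) and other smoothing schemes (bin medians, bin boundaries) fit the same outline.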
Data Integration
Schema Integration
Data Selection
Select only the task relevant data
Data Transformation
Transform or consolidate data
Smoothing, Normalization, Feature Construction
Data Reduction, Compression
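Normalization, one of the transformations listed above, can be sketched in two common forms: min-max scaling and z-score scaling. The function names and the choice of the population standard deviation are assumptions made for this illustration.

```python
from statistics import mean, pstdev


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values into the range [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so map everything to the lower bound.
        return [new_min for _ in values]
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def z_score_normalize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical attribute values, used only for illustration.
incomes = [10, 20, 30]
print(min_max_normalize(incomes))
print(z_score_normalize(incomes))
```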
Pattern Evaluation
Interestingness Measures
Knowledge Presentation
Visualization
Descriptive
Characterize general properties of the data
Predictive
Performs inference on the current data to make predictions
Mine multiple kinds of patterns in parallel
Discover patterns at various granularities (levels of abstraction)
Concept/class description
Association Analysis
Classification and Prediction
Cluster Analysis
Outlier Analysis
Evolution Analysis
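Data Characterization: summarizes the general features of a target class
Data Discrimination: compares a target class with one or more contrasting classes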
buys(X, Laptop)
Single-dimensional (only one predicate, buys, is involved)
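A single-dimensional rule over the buys predicate is usually judged by its support and confidence. The sketch below computes both for a rule of the form buys(X, A) => buys(X, B). The transactions and the consequent item "mouse" are hypothetical; the notes only show the buys(X, Laptop) fragment.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transactions."""
    ante = support(transactions, antecedent)
    if ante == 0:
        return 0.0
    both = support(transactions, set(antecedent) | set(consequent))
    return both / ante


# Hypothetical market-basket data, used only for illustration.
transactions = [
    {"laptop", "mouse"},
    {"laptop", "mouse", "bag"},
    {"laptop"},
    {"mouse"},
]

# Rule: buys(X, "laptop") => buys(X, "mouse")
print(support(transactions, {"laptop", "mouse"}))
print(confidence(transactions, {"laptop"}, {"mouse"}))
```

A rule is typically reported only when both measures clear user-set minimum thresholds.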
Classification
Finds models that describe and differentiate
classes or concepts
Predicts the class of objects whose label is unknown
Model is derived from training data
Model forms: rules, decision trees, neural networks (NN), formulae
Preceded by relevance analysis (to eliminate
irrelevant attributes)
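As a rough illustration of deriving a rule-based model from labelled training data, the sketch below learns one rule per value of a single attribute (the majority class for that value) and uses it to classify a new record. The attribute names, class labels, and records are hypothetical, and this one-attribute learner is only a stand-in for the richer models listed above.

```python
from collections import Counter, defaultdict


def learn_rules(training_data, attribute):
    """For each value of `attribute`, keep the majority class label seen in training."""
    counts = defaultdict(Counter)
    for record, label in training_data:
        counts[record[attribute]][label] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}


def classify(rules, record, attribute, default=None):
    """Predict the class of an unlabelled record from the learned rules."""
    return rules.get(record[attribute], default)


# Hypothetical labelled training data: (record, class label).
training_data = [
    ({"age": "youth", "student": "yes"}, "buys"),
    ({"age": "youth", "student": "no"}, "does_not_buy"),
    ({"age": "senior", "student": "no"}, "does_not_buy"),
    ({"age": "middle", "student": "yes"}, "buys"),
]

rules = learn_rules(training_data, "student")
print(rules)
print(classify(rules, {"age": "senior", "student": "yes"}, "student"))
```

Relevance analysis would decide beforehand which attribute (here "student") is worth building rules on.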
Prediction
Derived model is used for prediction
Data value prediction
Class label prediction (Classification)
Trend identification
Cluster Analysis
Unsupervised
Class labels are missing in the training set
Maximize Intra-class similarity
Minimize Inter-class similarity
Hierarchy of classes
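One common way to group unlabelled objects so that members of a cluster are close to each other and far from other clusters is k-means. The sketch below is a minimal version under stated assumptions: squared Euclidean distance, a fixed number of iterations, and the first k points as initial centroids (chosen only to keep the example deterministic). The points are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, k, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its cluster."""
    centroids = [points[i] for i in range(k)]  # assumed deterministic start
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean_point(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical 2-D objects without class labels.
points = [(1, 1), (1.5, 2), (8, 8), (9, 9), (1, 0.5), (8.5, 9.5)]
centroids, clusters = k_means(points, k=2)
print(centroids)
print(clusters)
```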
Outlier Analysis
Evolution Analysis:
Trend detection
Time series data
May involve other functionalities (characterization, association, classification, clustering) applied to time-related data
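Trend detection on a time series can be sketched with two simple tools: a moving average to smooth short-term fluctuations and a least-squares slope to summarize the overall direction. Both function names and the sample series are hypothetical, and the slope assumes equally spaced observations.

```python
def moving_average(series, window):
    """Mean of each run of `window` consecutive observations."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]


def linear_trend_slope(series):
    """Least-squares slope of the series against time steps 0, 1, 2, ...
    Requires at least two observations."""
    n = len(series)
    x_mean = (n - 1) / 2
    y_mean = sum(series) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(series))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator


# Hypothetical equally spaced observations.
sales = [2, 4, 6, 8]
print(moving_average([1, 2, 3, 4, 5], window=3))
print(linear_trend_slope(sales))
```

A positive slope suggests an upward trend and a negative slope a downward one; seasonal or cyclic behaviour needs more than this sketch.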