Data
Mining,
I. INTRODUCTION
Data mining, or knowledge discovery in databases, aims to find
new knowledge in a database. However, the
dimensionality, complexity, or amount of data is often
prohibitively large for manual analysis. With huge
amounts of data stored in databases, it is increasingly
important to develop powerful tools for mining interesting
knowledge from them. In recent years, the computational
efficiency of modern computer technology has made
mining fast and precise. One of the most interesting
developments in this area is the application of neural
computation.
The Self-Organizing Map (SOM) (Kohonen, 1990) is a type of
neural network that uses the principles of competitive, or
unsupervised, learning. In unsupervised learning, unlike
supervised learning, no information about a desired output
is available. This unsupervised approach
forms abstractions through a topology-preserving mapping of
high-dimensional input patterns onto a lower-dimensional
set of output clusters (Sestito & Dillon, 1994). These
clusters correspond to frequently occurring patterns of
features in the input data.
VCON10
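$$\mathrm{confidence}(A \Rightarrow B) = P(B \mid A) = \frac{\mathrm{support\_count}(A \cup B)}{\mathrm{support\_count}(A)} \qquad (2.4)$$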
Equation (2.4) shows that the confidence of the rule
A ⇒ B can be easily derived from the support counts of A
and A ∪ B. That is, once the support counts of A, B, and
A ∪ B are found, it is straightforward to derive the
corresponding association rules A ⇒ B and B ⇒ A and check
whether they are strong. Thus the problem of mining
association rules can be reduced to that of mining frequent
itemsets. In general, association rule mining can be viewed
as a two-step process:
1. Find all frequent itemsets: By definition, each of these
itemsets occurs at least as frequently as a predetermined
minimum support count, min_sup.
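
As a minimal sketch of how confidence follows from support counts, and of what "frequent" means under a minimum support count, the Python below uses a small made-up transaction list. The item names, the helper names `support_count`, `confidence`, and `frequent_itemsets`, and the brute-force enumeration are illustrative assumptions rather than the method used in this paper; the enumeration is only practical for a handful of items.

```python
from itertools import combinations

# Hypothetical toy transactions; the item names are illustrative only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)

def confidence(a, b, transactions):
    """confidence(A => B) = support_count(A u B) / support_count(A)."""
    count_a = support_count(a, transactions)
    if count_a == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    return support_count(set(a) | set(b), transactions) / count_a

def frequent_itemsets(transactions, min_sup):
    """Brute-force search: every itemset whose support count >= min_sup."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            count = support_count(candidate, transactions)
            if count >= min_sup:
                result[frozenset(candidate)] = count
    return result

print(confidence({"bread"}, {"milk"}, transactions))  # 2 / 4 = 0.5
print(confidence({"milk"}, {"bread"}, transactions))  # 2 / 3
print(frequent_itemsets(transactions, min_sup=2))
```

Both rules share the same numerator, the support count of {bread, milk}; only the antecedent's count differs, which is why the two confidences are not equal.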
2nd Vaagdevi International Conference On Information Technology For Real World Problems
C. SOM Clustering
Transactions in a database, such as those in Table 1, can be modeled as
data vectors: each transaction is converted to an input
vector. If the transaction contains item i, the ith component
of the vector is 1; otherwise it is 0. This data modeling can
be done when the transactions are extracted from the
database. To train on the input vectors, a SOM or GHSOM [3]
neural network can be initialized to generate map units.
Unlike the neural networks typically used for other
patterns, in this particular training the network has only one
vector in each row, so each neuron has neighbors only in
different rows. From the map units, one can easily find the
relationships between different items.
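
The encoding of transactions as binary input vectors can be sketched as follows. The transactions and item names here are made up for illustration, and `encode_transactions` is a hypothetical helper rather than anything defined in this paper.

```python
def encode_transactions(transactions, items):
    """Turn each transaction into a 0/1 vector over a fixed item order.

    Component i of a vector is 1 if the transaction contains items[i],
    otherwise 0.
    """
    index = {item: i for i, item in enumerate(items)}
    vectors = []
    for transaction in transactions:
        vector = [0] * len(items)
        for item in transaction:
            vector[index[item]] = 1
        vectors.append(vector)
    return vectors

# Hypothetical example data (not taken from the paper's Table 1).
items = ["bread", "butter", "milk"]
transactions = [
    {"bread", "milk"},
    {"butter"},
    {"bread", "butter", "milk"},
]

vectors = encode_transactions(transactions, items)
print(vectors)  # [[1, 0, 1], [0, 1, 0], [1, 1, 1]]
```

Fixing the item order once, before encoding, keeps component i tied to the same item in every vector, which the map units rely on when relationships between items are read off them.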
The SOM algorithm has two phases [5]. The first is the
search phase, during which each node i computes the
Euclidean distance between its weight vector wi(t) and the
input vector T(t) as follows:
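
A minimal sketch of the search phase, assuming the standard Euclidean form d_i(t) = ||T(t) - w_i(t)||. The symbol d_i, the function name `search_phase`, and the example weights are illustrative assumptions. Selecting the node with the smallest distance as the winner is the usual SOM convention and is assumed here rather than quoted from this text.

```python
import math

def search_phase(weights, x):
    """Compute each node's Euclidean distance to input x and pick the winner.

    weights: list of weight vectors, one per node (w_i in the text).
    x:       the current input vector (T(t) in the text).
    Returns (distances, winner_index).
    """
    distances = [math.dist(w, x) for w in weights]  # requires Python 3.8+
    winner = min(range(len(weights)), key=distances.__getitem__)
    return distances, winner

# Hypothetical weights for three nodes and one binary transaction vector.
weights = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.5, 0.5, 0.5],
]
x = [1, 0, 1]

distances, winner = search_phase(weights, x)
print(distances)
print(winner)  # 1, since node 1's weights equal the input exactly
```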
REFERENCES