Ijettcs 2013 07 08 009

International Journal of Emerging Trends & Technology in Computer Science (IJETTCS)
Web Site: www.ijettcs.org Email: editor@ijettcs.org, editorijettcs@gmail.com Volume 2, Issue 4, July August 2013 ISSN 2278-6856
Clustering and Visualization of Web Usage Data using SOM and XML
T. Vijaya Kumar1, Dr. H. S. Guruprasad2
1 Research Scholar, Department of IS & E, BMS College of Engineering, Bull Temple Road, Bangalore - 560019, Karnataka, India
2 Professor and Head, Department of IS & E, BMS College of Engineering, Bull Temple Road, Bangalore - 560019, Karnataka, India
Abstract: Clustering is the process of grouping objects together in such a way that objects belonging to the same group are similar and those belonging to different groups are dissimilar. Clustering Web usage data is one of the important tasks of Web Usage Mining; it helps to find Web user clusters and Web page clusters. Web user clusters establish groups of users exhibiting similar browsing patterns, and Web page clusters provide useful knowledge for personalized Web services. Recent studies have supported the use of neural networks such as Adaptive Resonance Theory (ART) and Self-Organizing Maps (SOM) for real-world data mining problems. Among the architectures and algorithms suggested for neural networks, the SOM has the special property of effectively creating spatially organized internal representations of various features of the input data and their abstractions. SOM is useful for visualizing low-dimensional views of high-dimensional data. In this paper, we propose an approach to visualize Web usage data using SOM and an XML-based application language. First, we construct the sessions using concept hierarchy and link information. Then, these sessions are transformed into Extensible Markup Language (XML) format. Finally, the clusters depicting the sessions with similar patterns are obtained using SOM. We have considered the server log files of the Website www.enggresources.com for the overall study and analysis.
Keywords: Web Usage mining, Concept hierarchy, Concept based Website graph, Self-Organizing Maps, Scalable Vector Graphics.
1. INTRODUCTION
With the rapid growth of the World Wide Web, the study of modeling users' navigational behavior in a Web site has become very important. With a large number of companies using the Internet to distribute and collect information, knowledge discovery on the Web has become an important research area. The purpose of Web Usage mining is to apply statistical and data mining techniques to the preprocessed Web log data in order to discover useful Web usage patterns. More advanced data mining methods and algorithms, such as association rules, sequential pattern mining, classification and clustering, are adapted appropriately to find suitable patterns from Web usage data. In addition to these methods and algorithms, Artificial Neural Network and fuzzy classification methods are also used to find valuable information from the Web usage data.
Web Usage mining contains three main tasks, namely Data preprocessing, Cluster discovery and Cluster analysis. Data preprocessing consists of data cleaning, data transformation, and data reduction. Cluster discovery deals with the formation of groups of users exhibiting similar browsing patterns and with obtaining groups of pages that are accessed together. Cluster analysis filters out uninteresting patterns from the user clusters and page clusters found in the Cluster discovery phase. Clustering is a data mining technique that groups together a set of items having similar characteristics. In the Web usage domain, two kinds of interesting clusters can be discovered: user clusters and page clusters. SOM is considered one of the most common neural network methods for cluster analysis. In SOM, data in the input space are projected onto prototype vectors on a grid, so that two vectors projected onto the same or neighboring prototypes are more likely to belong to the same cluster.
In this paper we visualize Web usage data using SOM and XML-based application languages. The visualization of the result is implemented using Scalable Vector Graphics (SVG), which is an XML application language. First, in the preprocessing phase, the sessions are constructed using concept hierarchy and link information. Then, in the Cluster discovery phase, the session clusters are obtained from the sessions by using SOM. Later, these clusters are visualized using Scalable Vector Graphics.
The rest of the paper is organized as follows. Section 2 gives a brief description of the related work. Section 3 depicts the overall architecture of clustering and visualization using SOM and XML used in our work. Section 4 covers details of Data Preprocessing using concept hierarchy and Web site topology. The details of Self-Organizing Maps and Scalable Vector Graphics are discussed in Section 5. The experimental design and results are discussed in Section 6. Finally, we give our conclusion in Section 7.
2. LITERATURE SURVEY
Various data mining methods have been used to generate models of usage patterns. Models based on association rules, clustering algorithms, sequential analysis and Markov models have been used for discovering knowledge from Web usage data. All these models are predominantly based on usage information from Web usage data alone. Significant improvement can be achieved by making use of domain knowledge. In [1, 2], Cooley et al. covered the Web usage mining process and the various steps involved in it. These works serve as primary references for understanding the fundamentals of Web usage mining. Along with the server log file, other sources of knowledge such as site content or structure and semantic domain knowledge can be used in Web usage mining [3]. In [4], Natheer Khasawneh et al. have presented new techniques for preprocessing Web log data, including identifying unique users and sessions by making use of Website ontology. In [5], Kobra Etminani et al. proposed an idea for how a Website's link structure can be used for tracking Web users' activities. In [6], Sebastian A. Rios et al. have shown the use of a concept-based approach using semantics in Web usage mining. In [7], Murat Ali Bayir et al. have proposed a novel framework for the Web usage mining problem, called Smart-Miner, which uses link information for producing accurate user sessions and frequent navigation patterns. Norwati Mustapha et al. [8] have proposed a model for mining users' navigation patterns based on the Expectation Maximization algorithm and used it for finding maximum likelihood estimates of parameters in probabilistic models.
In [9], Jiyang Chen et al. have proposed a visualization tool to visualize Web graphs, which are representations of Web structure overlaid with information and patterns. They also proposed a Web graph algebra to manipulate and combine Web graphs and their layers in order to discover new patterns in an ad hoc manner. In [10], Esin Saka et al. have proposed a hybrid approach which combines the strengths of the Spherical K-Means algorithm, for fast clustering of high-dimensional datasets in the original feature domain, and the flock-based algorithm, which iteratively adjusts the position and speed of dynamic flock agents on a visualization plane. The hybrid algorithm decreases the complexity of FClust from quadratic to linear, with further improvements in the cluster quality.
In [11], S. Park et al. have investigated the use of a fuzzy ART neural network to enhance the performance of the K-Means algorithm. A major limitation of fuzzy ART is the unrestricted growth of clusters: fuzzy ART networks sometimes produce numerous clusters, each with only a small number of members. The authors have used fuzzy ART as an initial seed generator and K-Means as the final clustering algorithm. In [12], Santosh K. et al. have developed a clustering algorithm that groups users according to their Web access patterns. The algorithm is based on the ART1 version of adaptive resonance theory. ART1 offers an unsupervised clustering approach that adapts to changes in users' access patterns over time without losing earlier information. It applies specifically to binary vectors. They have compared their algorithm's performance with the traditional K-Means clustering algorithm and showed that the ART1-based technique performed better in terms of intra-cluster distances.
In [13], Antonio S. et al. have developed a variant of the classical SOM called Growing Hierarchical SOM (GHSOM). They have suggested a new visualization technique for the patterns in the hierarchical structure. In [14, 15], T. Kohonen et al. have used Kohonen's Self-Organizing Map to organize Web documents into a two-dimensional map according to their content. Documents which are similar in content are located in similar regions on the map. In [16], Kate A. Smith et al. have developed LOGSOM, a system that utilizes Kohonen's Self-Organizing Map to organize Web pages into a two-dimensional map. The organization of the Web pages is based on the users' navigation behavior rather than the content of the Web pages. In [17], T. Vijaya Kumar et al. have proposed a framework for finding useful information from Web usage data that uses Self-Organizing Maps (SOM). Sessions are constructed using the concept hierarchy and the link information, and SOM is then used to form the clusters.
In [18], Hannah et al. have proposed an approach to obtain user profiles based on intelligent rough clustering techniques. The proposed method provides efficient algorithms for finding hidden patterns in Web log data and is able to learn the number of clusters automatically from the given data. They have given a two-fold approach for clustering user access patterns and retrieving effective user profiles from Web logs using Gaussian Rough (GR) clustering, Gaussian Rough Fuzzy (GRF) clustering, rough clustering and rough fuzzy clustering. In [19], Rajhans Mishra et al. have adopted similarity upper approximation based clustering of Web logs using various similarity metrics. In [20], Paola et al. have compared SOM and K-Means algorithms to obtain clusters from user page access data. In [21], Ketki Muzumdar et al. have shown that SOM has better performance than the K-Means algorithm for Web usage data. In [22], Ajith Abraham et al. have proposed an ant clustering algorithm to discover Web usage patterns and a linear genetic programming approach to analyze user trends. In [23], Prakash S. Raghavendra et al. have modeled user behavior as a vector of the time the user spends at each URL. They have implemented and compared the clustering and classification methods of K-Means with a non-Euclidean similarity measure, Bayesian classifiers and artificial neural networks with standardized fuzzy inputs.
3. SYSTEM DESIGN
The main goal of the proposed system is to cluster and visualize Web usage sessions obtained from raw server log files. Fig. 1 depicts the overall architecture of the proposed model. Clustering and visualization in our approach are partitioned into two modules. In the first module, data cleaning, user identification and session construction are carried out as the data preprocessing phase. Sessions are identified using Web site ontology and concept hierarchy. These identified sessions are transformed into XML format. In the second module, the transformed XML file is given as input to the SOM algorithm, which groups the sessions into clusters of sessions with similar navigation behavior.
Figure 1 Overall architecture
4. DATA PREPROCESSING
Data preprocessing comprises merging of log files from different Web servers, data cleaning, identification of users, sessions and visits, data formatting and summarization. Data cleaning consists of removing superfluous data from the log file. We have considered the server log files of the Website www.enggresources.com. The server log file is in Extended Combined Log Format, which is an extension of the Common Log Format with two extra fields, referrer and user agent. User identification deals with identifying unique clients of the Web server.
Our model incorporates Website knowledge in Web usage mining techniques. Website knowledge is represented via a concept based Website graph, which is a combination of the Website graph and the concept hierarchy of the Website concerned. The Website graph is like any other graph consisting of vertices and edges, except that the vertices represent the Web pages and the edges represent the hyperlinks between the Web pages. The concept hierarchy represents the organization of the content of the Website. Medium and large Websites are usually organized and structured hierarchically to reflect functional characteristics. This hierarchy is a collection of domain concepts which are organized using IS-A and HAS-A relationships. Each concept represents a formal abstraction of the user's thought, that is, the browsing intent.
The concept based Website graph (CBWG) is an extension of the Website graph in which each node contains information about the concept to which the Web page belongs. This mapping of all Web pages to their corresponding concepts is done carefully with the help of expert knowledge of the Website, such as what functionality a Web page provides and what kind of information it depicts. CBWG is an ordered pair G = (V, E) comprising a set V of vertices or nodes, which represent the Web pages of the Website. Every node carries, as additional information, the concept to which it is mapped. The set E of edges, which are 2-element subsets of V, contains the link information of the Website graph. Hence, CBWG contains information about the conceptual classification of Web pages and the link information of the Website.
A combination of IP address and user agent is used to identify users uniquely. User identification can also be done using client-side cookies. However, cookies can be disabled by users for privacy reasons, and not every Website employs cookies. Session identification is considered as the next step. Session identification divides all pages accessed by a user into sessions. A session is a sequence of requests made by a single user with a unique IP address on a particular Web domain during a specified period of time. The simplest method of achieving this is through a timeout: if the time between page requests exceeds a certain limit, it is assumed that the user is starting a new session. We have used the concept switching and navigation approach, with timeout as a criterion, for creating user sessions.
5. SELF-ORGANIZING MAP
The Self-Organizing Map (SOM) is an excellent tool for experimental data mining. SOMs are based on competitive learning, where the output neurons of the network compete among themselves to be activated or fired, with the result that only one output neuron, or one neuron per group, is on at any one time. An output neuron that wins the competition is called a winner-takes-all neuron or simply a winning neuron. There are three essential processes involved in the formation of the SOM, namely Competition, Cooperation and Synaptic adaptation [24]. The overall training process is depicted in Figure 2.
Figure 2 Overall Training Process
SOM projects input vectors onto prototypes of a low-dimensional regular grid that can be effectively utilized to explore properties of the data. A brief outline of the SOM concept is given below. Let $m$ denote the dimension of the input space. Let an input vector selected at random from the input space be denoted by $\mathbf{x} = [x_1, x_2, \ldots, x_m]^T$. The synaptic weight vector of each neuron in the network has the same dimension as the input space. Let the synaptic weight vector of neuron $j$ be denoted by $\mathbf{w}_j = [w_{j1}, w_{j2}, \ldots, w_{jm}]^T$, $j = 1, 2, \ldots, l$, where $l$ is the total number of neurons in the network. To find the best match of the input vector $\mathbf{x}$ with the synaptic weight vectors $\mathbf{w}_j$, compute the inner products $\mathbf{w}_j^T \mathbf{x}$ for $j = 1, 2, \ldots, l$ and select the largest. This assumes that the same threshold is applied to all the neurons. Thus, by selecting the neuron with the largest inner product, the location where the topological neighbourhood of excited neurons is to be centered can be determined. For weight vectors of equal norm, this is mathematically equivalent to minimizing the Euclidean distance between the vectors $\mathbf{x}$ and $\mathbf{w}_j$. Let $i(\mathbf{x})$ denote the neuron that best matches the input vector $\mathbf{x}$. Then
$$i(\mathbf{x}) = \arg\min_{j} \lVert \mathbf{x} - \mathbf{w}_j \rVert, \quad j = 1, 2, \ldots, l.$$
This sums up the essence of the competition process among the neurons. The particular neuron $i$ that satisfies this condition is called the best matching or winning neuron for the input vector $\mathbf{x}$. A continuous input space of activation patterns is mapped onto a discrete output space of neurons by a process of competition among the neurons in the network. Depending on the application of interest, the response of the network could be either the index of the winning neuron or the synaptic weight vector that is closest to the input vector in a Euclidean sense.
The winning neuron locates the center of a topological neighborhood of cooperating neurons. A neuron that is firing tends to excite the neurons in its immediate neighborhood more than those farther away from it. The topological neighborhood around the winning neuron $i$ decays smoothly with lateral distance. A typical choice of the topological neighborhood $h_{j,i}$, centered on winning neuron $i$ and encompassing a set of cooperating neurons $j$, is the Gaussian function. For the network to be self-organizing, the synaptic weight vector $\mathbf{w}_j$ of neuron $j$ in the network is required to change in relation to the input vector $\mathbf{x}$. A modified Hebbian hypothesis can be used to change the synaptic weight vector $\mathbf{w}_j$ of neuron $j$.
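The competition and cooperation processes described above can be illustrated with a minimal sketch. It assumes plain Python lists for the vectors and a Gaussian neighborhood with a width parameter `sigma`; the function names, the toy weights and the grid coordinates are hypothetical and are not taken from the JavaSOM package.

```python
import math

def best_matching_unit(x, weights):
    """Return the index i(x) of the weight vector closest to x (Euclidean)."""
    def sq_dist(w):
        return sum((xi - wi) ** 2 for xi, wi in zip(x, w))
    return min(range(len(weights)), key=lambda j: sq_dist(weights[j]))

def gaussian_neighborhood(pos_j, pos_win, sigma):
    """h_{j,i(x)}: decays smoothly with lateral grid distance from the winner."""
    d2 = sum((a - b) ** 2 for a, b in zip(pos_j, pos_win))
    return math.exp(-d2 / (2.0 * sigma ** 2))

# Toy example (assumed data): four neurons on a 2 x 2 lattice.
weights = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
grid = [(0, 0), (0, 1), (1, 0), (1, 1)]
x = [0.9, 0.2]

winner = best_matching_unit(x, weights)
for j, pos in enumerate(grid):
    print(j, round(gaussian_neighborhood(pos, grid[winner], sigma=1.0), 3))
```

The winner receives a neighborhood value of 1, and the value shrinks for neurons that lie farther away on the lattice, which is the cooperation behavior the text describes.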
Scalable Vector Graphics (SVG) is an XML application language for describing two-dimensional vector and mixed vector/raster graphics. SVG is a language for rich graphical content. SVG allows for three types of graphic objects: vector graphic shapes (e.g., paths consisting of straight lines and curves), images and text. Graphical objects can be grouped, styled, transformed and composited into previously rendered objects. Text can be in any XML namespace suitable to the application, which enhances the searchability and accessibility of the SVG graphics. The feature set includes nested transformations, clipping paths, alpha masks, filter effects, template objects and extensibility. SVG drawings can be dynamic and interactive.
An outline of the revised SOM algorithm for our system is summarized below.
Input: Sessions constructed using concept hierarchy and link information, where each URL is assigned a unique index.
Output: Clusters depicting the sessions with similar navigation behavior.
Step 1. Initialization: Choose random values for the initial weight vectors $\mathbf{w}_j$. Here we have selected the synaptic weight vectors from the available set of session vectors based on the concept hierarchy and Web site topology.
Step 2. Sampling: Draw an input vector sample $\mathbf{x}$ with a certain probability. The dimension of the vector $\mathbf{x}$ is equal to $m$. We assume that there is a set of Web pages and a set of user transactions.
Step 3. Winning neuron: For each input vector, calculate the Euclidean distance between the input vector and each weight vector. Select the index of the weight vector having the minimum Euclidean distance to the input vector as the winning neuron: $i(\mathbf{x}) = \arg\min_{j} \lVert \mathbf{x}(n) - \mathbf{w}_j \rVert$, where $j = 1, 2, \ldots, l$.
Step 4. Updating: Adjust the synaptic weight vectors of all neurons by using the update formula $\mathbf{w}_j(n+1) = \mathbf{w}_j(n) + \eta(n)\, h_{j,i(\mathbf{x})}(n)\, (\mathbf{x}(n) - \mathbf{w}_j(n))$, where $\eta(n)$ is the learning rate parameter and $h_{j,i(\mathbf{x})}(n)$ is the neighborhood function centered around the winning neuron $i(\mathbf{x})$. Both $\eta(n)$ and $h_{j,i(\mathbf{x})}(n)$ are varied dynamically during learning for best results.
Step 5. Continuation: Continue with Step 2 until no noticeable changes in the feature map are observed.
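The five steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the JavaSOM implementation: the function name `train_som`, the exponential decay schedules, the starting neighborhood width, and the use of a fixed iteration count in place of the "no noticeable changes" stopping rule of Step 5 are all choices made for the sketch.

```python
import math
import random

def train_som(samples, rows, cols, n_iter=2000, eta0=0.1, eta_min=0.01, seed=0):
    """Train a rows x cols SOM on a list of equal-length numeric vectors."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # Step 1: initialise each weight vector from the available session vectors.
    weights = [list(rng.choice(samples)) for _ in grid]
    sigma0 = max(rows, cols) / 2.0          # assumed initial neighborhood width
    tau = n_iter / math.log(sigma0) if sigma0 > 1 else n_iter
    for n in range(n_iter):                 # Step 5 replaced by a fixed budget
        # Step 2: draw one input vector at random.
        x = rng.choice(samples)
        # Step 3: winning neuron = minimum Euclidean distance.
        win = min(range(len(grid)),
                  key=lambda j: sum((a - b) ** 2 for a, b in zip(x, weights[j])))
        # Learning rate and neighborhood width shrink as training proceeds.
        eta = max(eta_min, eta0 * math.exp(-n / n_iter))
        sigma = max(0.5, sigma0 * math.exp(-n / tau))
        # Step 4: w_j(n+1) = w_j(n) + eta * h * (x - w_j(n)) for every neuron.
        for j, pos in enumerate(grid):
            d2 = (pos[0] - grid[win][0]) ** 2 + (pos[1] - grid[win][1]) ** 2
            h = math.exp(-d2 / (2.0 * sigma * sigma))
            weights[j] = [w + eta * h * (xi - w) for w, xi in zip(weights[j], x)]
    return grid, weights

# Toy binary session vectors (assumed data): two navigation patterns.
samples = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
grid, weights = train_som(samples, 2, 2, n_iter=200)
```

Because each update is a convex combination of the old weight and the input, the weights stay within the range of the training data; a convergence check on the change in weights could replace the fixed iteration count.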
6. EXPERIMENTS AND RESULTS

For experimentation, data from the server log files of the Web site www.enggresources.com, collected from 19-07-2009 to 11-08-2009, are considered for the overall study and analysis. The concept based Website graph is constructed as an additional input using concept hierarchy and link information. Error records and requests for images and multimedia files are removed from the server log file by using a tool called Web log filter. IP address, timestamp, user agent, request and referrer are retained for further processing. User identification is considered as the next step; a combination of IP address and user agent is used to identify a unique user. In session construction, we have combined two trivial approaches, the time-oriented approach and the navigation-oriented approach, along with the concept name match approach for identifying user sessions. The page stay time threshold and the session timeout threshold are set to 10 and 30 minutes respectively. Each Web page is assigned a unique index, and every unique session is also given a unique index. Two output files are produced from this step. The first is the session file, in which one record consists of the sequence of Web pages requested in a session. The second is the inter-session navigation file, in which one record consists of the session indexes of one user identified uniquely by IP address and user agent. From preprocessing, 10217 users and 25814 sessions were discovered.
In the pattern discovery phase we have used the JavaSOM package, a Java implementation of SOM that uses XML-based application languages for information exchange and visualization [25]. Here we form the clusters from the sessions obtained in the preprocessing phase. First the sessions are transformed into XML format. Then the sessions in XML format are given as the input to the SOM algorithm to form the clusters. The SOM algorithm gradually leads to an organized representation of activation patterns from the input space, starting from an initial state of complete disorder. The parameters have to be selected properly to achieve this objective. The adaptation of the synaptic weights in the network is carried out in two phases: a self-organizing phase followed by a convergence phase. In the self-organizing phase of the adaptive process the topological ordering of the weight vectors takes place. The ordering phase may take 1000 or more iterations of the SOM algorithm. The learning-rate parameter value and the neighborhood function type must be selected properly. The learning-rate parameter should begin with a value of 0.1 and gradually decrease, but remain above 0.01. The neighborhood function should initially include almost all neurons in the network, centered on the winning neuron. The convergence phase is the second phase of the adaptive process, which provides an accurate statistical quantification of the input space. As a general rule, the number of iterations constituting the convergence phase must be at least 500 times the number of neurons in the lattice. For good statistical accuracy the learning-rate parameter should be maintained at a small value, on the order of 0.01. The neighborhood function should contain only the nearest neighbors of the winning neuron.
Sessions obtained using the Concept Based Web site Graph are shown in Figure 3. The sessions are then transformed into XML format, which is shown in Figure 4. The clustering and visualization of the sessions are shown in Figure 5. From the map we can observe that similar sessions are grouped together.
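The user identification and time-oriented part of the session construction described in this section can be sketched as follows. This is a simplified illustration: it applies only the 10-minute page stay and 30-minute session timeout thresholds, treats the page stay time as the gap between successive requests (an interpretation), and omits the navigation-oriented and concept name match checks. The function name, the tuple layout and the sample requests are hypothetical.

```python
from datetime import datetime, timedelta

PAGE_STAY = timedelta(minutes=10)        # maximum gap between two requests
SESSION_TIMEOUT = timedelta(minutes=30)  # maximum total session duration

def build_sessions(requests):
    """requests: iterable of (ip, user_agent, timestamp, url) tuples."""
    by_user = {}
    # A user is identified by the combination of IP address and user agent.
    for ip, agent, ts, url in sorted(requests, key=lambda r: r[2]):
        by_user.setdefault((ip, agent), []).append((ts, url))
    sessions = []
    for user, hits in by_user.items():
        current, start, last = [], None, None
        for ts, url in hits:
            # Close the running session when either threshold is exceeded.
            if current and (ts - last > PAGE_STAY or ts - start > SESSION_TIMEOUT):
                sessions.append((user, current))
                current = []
            if not current:
                start = ts
            current.append(url)
            last = ts
        if current:
            sessions.append((user, current))
    return sessions

# Assumed sample data for illustration only.
t0 = datetime(2009, 7, 19, 10, 0)
requests = [
    ("192.0.2.1", "UA-1", t0, "/a"),
    ("192.0.2.1", "UA-1", t0 + timedelta(minutes=5), "/b"),
    ("192.0.2.1", "UA-1", t0 + timedelta(minutes=20), "/c"),
    ("192.0.2.2", "UA-1", t0 + timedelta(minutes=1), "/a"),
]
sessions = build_sessions(requests)
```

In the sample, the 15-minute gap before `/c` exceeds the page stay threshold, so the first user's requests split into two sessions.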
Figure 4 Sessions in XML format
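As a rough illustration of transforming sessions into XML, the sketch below serializes sessions of page indexes with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`sessions`, `session`, `page`, `id`, `index`) are hypothetical; the actual schema expected by JavaSOM is not given in this paper and may differ.

```python
import xml.etree.ElementTree as ET

def sessions_to_xml(sessions):
    """Serialize a list of sessions (lists of page indexes) to an XML string."""
    root = ET.Element("sessions")
    for idx, pages in enumerate(sessions, start=1):
        # Each session gets its own unique index, as in the preprocessing step.
        node = ET.SubElement(root, "session", id=str(idx))
        for page_index in pages:
            ET.SubElement(node, "page", index=str(page_index))
    return ET.tostring(root, encoding="unicode")

# Assumed sample sessions: sequences of unique page indexes.
xml_text = sessions_to_xml([[3, 7, 7, 12], [1, 3]])
print(xml_text)
```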
Figure 5 Clustering and Visualization of Sessions
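A map like the one in Figure 5 could, in principle, be drawn with a few SVG elements. The sketch below is only an assumption about how such a rendering might look and does not reproduce the JavaSOM output: it takes a hypothetical grid of per-neuron session counts and emits one shaded `rect` and one `text` label per lattice cell.

```python
def som_to_svg(hit_counts, cell=40):
    """Render a grid of per-neuron session counts as an SVG string."""
    rows, cols = len(hit_counts), len(hit_counts[0])
    peak = max(max(row) for row in hit_counts) or 1  # avoid division by zero
    parts = ['<svg xmlns="http://www.w3.org/2000/svg" width="%d" height="%d">'
             % (cols * cell, rows * cell)]
    for r, row in enumerate(hit_counts):
        for c, count in enumerate(row):
            # Darker cells hold more sessions (illustrative colour scale).
            shade = 255 - int(200 * count / peak)
            x, y = c * cell, r * cell
            parts.append('<rect x="%d" y="%d" width="%d" height="%d" '
                         'fill="rgb(%d,%d,255)" stroke="black"/>'
                         % (x, y, cell, cell, shade, shade))
            parts.append('<text x="%d" y="%d" font-size="12" '
                         'text-anchor="middle">%d</text>'
                         % (x + cell // 2, y + cell // 2, count))
    parts.append("</svg>")
    return "\n".join(parts)

# Assumed counts of sessions mapped to each neuron of a 2 x 2 lattice.
svg = som_to_svg([[12, 0], [3, 7]])
```

Writing the returned string to a `.svg` file lets any SVG-capable browser display the grid.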
7. CONCLUSION
SOMs make it possible not only to cluster but also to visualize multidimensional data. However, the simple SOM table alone is not enough to reveal the characteristics of the data; it is necessary to supplement the SOM table with extra information. In this paper, we proposed a clustering and visualization approach for Web usage data based on SOM and an XML-based language. The sessions are obtained by using concept hierarchy and the Web site graph. These sessions are then transformed into XML format and clustered and visualized using JavaSOM. From the experimental results we can see that sessions with similar patterns are clustered together. The similarities between sessions are calculated by using the Euclidean distance measure; this can be further improved by using other similarity measures such as rough set similarity, weighted session similarity and semantic similarity.
Figure 3 Sessions using Concept Based Web site Graph

REFERENCES
[1] R. Cooley, B. Mobasher, and J. Srivastava, Web mining: information and pattern discovery on the World Wide Web, Ninth IEEE International Conference on Tools with Artificial Intelligence, Newport Beach, CA, USA, 1997, Pages 558-567.
[2] J. Srivastava, R. Cooley, M. Deshpande, and P. N. Tan, Web usage mining: discovery and applications of usage patterns from Web data, ACM SIGKDD Explorations Newsletter, Volume 1, Pages 12-23, 2000.
[3] Bamshad Mobasher, Chapter 12, Web Usage Mining in Data Collection and Pre-Processing, ACM SIGKKD 2007, Pages 450-483.
[4] Natheer Khasawneh and Hien Chung Chan, Active User-Based and Ontology-Based Weblog data preprocessing for Web Usage Mining, IEEE/WIC/ACM International Conference 2006.
[5] Kobra Etminani, Amin, and Noorali Rouhani, Web usage Mining: Discovery of the users navigational patterns using SOM, IEEE 2009.
[6] Sebastian A. Rios and Juan D. Velasquez, Semantic Web Usage Mining by a Concept-based approach for Off-line Web Site Enhancements, IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology 2008.
[7] Murat Ali Bayir, Ismail Hakki Toroslu, Guven Fidan, and Ahmet Cosar, Smart Miner: A New Framework for Mining Large Scale Web Usage Data, ACM 2009.
[8] Norwati Mustapha, Manijeh Jalali, and Mehrdad Jalali, Expectation Maximization Clustering Algorithm for User Modeling in Web Usage Mining Systems, European Journal of Scientific Research, ISSN 1450-216X, Volume 32, Number 4 (2009), Pages 467-476.
[9] Jiyang Chen, Lisheng Sun, Osmar R. Zaiane, and Randy Goebel, Visualizing and Discovering Web Navigational Patterns, Seventh International Workshop on the Web and Databases (WebDB 2004), June 17-18, 2004, Paris, France.
[10] Esin Saka and Olfa Nasraoui, Simultaneous Clustering and Visualization of Web Usage Data using Swarm-based Intelligence, 20th IEEE International Conference on Tools with Artificial Intelligence.
[11] Sungjune Park, Nallan C. Suresh, and Bong Keun Jeong, Sequence based clustering for Web usage mining: A new experimental framework and ANN-enhanced K-Means algorithm, Elsevier Data and Knowledge Engineering 65 (2008) 512-543.
[12] Santosh K. Rangarajan, Vir V. Phoha, Kiran S. Balagani, Rastko R. Selmic and S. S. Iyengar, Adaptive Neural Network Clustering of Web Users, IEEE 2004, 0018-9162/04.
[13] Antonio S, Jose D. Martin, Emilio S, Alberto P, Rafael M and Antonio, Web mining based on growing hierarchical Self Organizing Maps: Analysis of a real citizen Web portal, Expert Systems with Applications 34 (2008) 2998-2994, www.elsevier.com.
[14] T. Kohonen, S. Kaski, K. Lagus, J. Salojarvi, J. Honkela, V. Paatero, and A. Saarela, Self organization of a massive document collection, IEEE Transactions on Neural Networks 11 (3) (May 2000) 574-585.
[15] Samuel Kaski, Timo Honkela, Krista Lagus, and Teuvo Kohonen, WEBSOM - Self organizing maps of document collections, Neurocomputing 21 (1998) 101-117, Elsevier.
[16] Kate A. Smith, Alan Ng, Web page clustering using a self-organizing map of user navigation patterns, Elsevier Decision Support Systems 35 (2003) 245-256.
[17] T. Vijaya Kumar, Dr. H. S. Guruprasad, Clustering Web Usage Data using Concept hierarchy and Self Organizing Maps, International Journal of Computer Applications (0975-8887), Volume 56, No. 18, October 2012, www.ijcaonline.org.
[18] H. Hannah Inbarani, K. Thangavel, Rough set based User profiling for Web Personalization, International Journal of Recent Trends in Engineering, Vol. 2, No. 1, November 2009.
[19] Rajhans Mishra and Pradeep Kumar, Clustering Web Logs Using Similarity Upper Approximation with Different Similarity Measures, International Journal of Machine Learning and Computing, Vol. 2, No. 3, June 2012.
[20] Paola Britos, Damian Martinelli, Hernan Merlino, Ramon Garcia-Martinez, Web Usage mining using Self Organizing Map, IJCSNS, Vol. 7, No. 6, June 2007.
[21] Ketki Muzumdar, Ravi Mante, Prashant Chatur, Neural Network Approach for Web Usage Mining, IJRTE, ISSN: 2277-3878, Volume 2, Issue 2, May 2013.
[22] Ajith Abraham, Ramos V, WUM using artificial ant colony clustering and linear genetic programming, Evolutionary Computation, 2003, Vol. 2, Pages 1384-1391, ISBN: 0-7803-7804-0.
[23] Prakash S Raghavendra, Shreya Roy Chowdhury, Srilekha Vedula Kameshwari, Web Usage Mining using Statistical classifiers and Fuzzy artificial Neural Networks, International Journal of Multimedia and Image Processing, Volume 1, Issue 1, March 2011.
[24] Simon Haykin, Neural Networks: A Comprehensive Foundation, Prentice-Hall, Inc., 1999.
[25] JavaSOM package, javasom.sourceforge.net.

AUTHOR
T. Vijaya Kumar is a research scholar in the Information Science and Engineering department, BMS College of Engineering, VTU, Belgaum, Karnataka, India. His research interests include Data Mining, Knowledge Discovery from Web usage data, Theoretical Foundations of Computer Science, Artificial Neural Networks and Cloud Computing.
Dr. H. S. Guruprasad is currently working as Professor and Head of the Information Science and Engineering department, BMS College of Engineering, Bangalore, India. He received his PhD in the area of Network Communication. He has more than twenty-two years of teaching experience. He has been awarded the Rashtriya Gaurav award and the Best Citizen of India award. His research interests include Multimedia Communications, Sensor Networks, Cloud Computing, Algorithms, Android Operating System, Data Mining and Knowledge Discovery from Web usage data.