International Journal of Computer Science Research & Technology (IJCSRT)
ISSN: 2321-8827
Vol. 1 Issue 5, October - 2013
Analysis of Evolving User Behaviour Profile For Predicting Masquerades

Devendra B. Dandekar, Prof. Vinay S. Kapse
Department of Computer Science and Engineering
Tulsiramji Gaikwad-Patil
College of Engineering, RTMNU, Nagpur
Abstract

Knowledge about computer users is very beneficial for assisting them and for predicting their future actions in order to detect masquerades. We propose an approach for automatically creating and recognizing the behaviour profile of a computer user. The sequence of commands typed by a user is transformed into a distribution of relevant subsequences of commands in order to find the behaviour profile. The self-evolving system uses a recursive formula to find the potential of a new data sample to form a new prototype or to replace an existing prototype. The data samples of a masquerading user are compared with the trained prototypes, and the behaviour is detected based on standard deviation. Most existing techniques assume a handcrafted user profile that encodes the repertoire of the observed user. A comparative study shows the challenges that arise when creating an evolving system and predicting with it.

Keywords - Evolving Fuzzy Systems, Fuzzy-Rule Based (FRB) Classifier, User Modelling.

1. Introduction
Knowledge about computer users is very beneficial for assisting them, predicting their actions, and creating and recognizing their behaviour profiles. Recognizing the behaviour profile of others in real time supports tasks such as predicting their future actions. Specifically, computer user modelling learns about an ordinary user by observing that user. However, the construction of an effective user profile is problematic, because human behaviour is often erratic and sometimes changes as the goals of the user change. Several definitions of a user profile exist [1]. It can be defined as a description of the interests, characteristics, behaviours and preferences of the user. In recent years, significant work has been carried out on profiles that adapt to the environment and to new goals of the user. An example of such a profile was proposed in previous work [2]. Our approach, Evolving Agent behaviour Classification based on Distributions of relevant events (EVABCD), is based on representing the observed behaviour of a user. The UNIX operating system environment is used in this research for explaining and evaluating EVABCD. A user behaviour is represented as a sequence of UNIX commands typed in a command-line interface. Previous research studies in this environment [3] focus on detecting masqueraders, that is, individuals who impersonate other users on computer networks and systems. EVABCD, however, creates evolving user profiles and classifies new users into one of the previously created profiles. Thus the goal of EVABCD in the UNIX environment can be divided into two phases:

1.1 Creating and updating user profiles from the commands the users typed in UNIX shells.
1.2 Classifying a new sequence of commands into the predefined profiles.

In summary, our contributions are:
• We identify the limitations, and their root causes, of creating user behaviour profiles by classifying relevant sequences of events.
• We generalize previous work on knowledge about computer users to user behaviour of increased complexity.
• We extend the algorithm to environments in which the segmented subsequences of relevant events are evaluated using a frequency-based method.
• We present a comparative study showing that it is more efficient to revise an existing hypothesis than to generate a new hypothesis each time a new instance is observed.
• We detect masquerades (unauthorized work) using knowledge about the computer user.

2. Motivation & Preliminary
Various approaches in the literature take the view that a user profile usually changes, and aim to recognize the behaviour of others in real time. The human brain has the capacity to predict, coordinate with and recognize the future actions of others. Different methods have been used to
find out relevant information about computer user behaviour in different computer areas:

Figure 2. High-level system architectural framework for Evolving Behaviour Model Library Profile

We show the limits of existing approaches to constructing user behaviour profiles, which are based on an evolving profile library combined with statistical classification methods. Figure 2 shows the high-level system architecture framework for the Evolving Behaviour Model Library Profile, which extracts the significant pieces of a sequence of commands. For this purpose, a trie data structure [4] is used to obtain a representative set of subsequences from the acquired sequence. The trie data structure has also been used to learn a team behaviour and to classify the behaviour patterns of a RoboCup soccer simulation team.

2.1 Discovery of navigation patterns
Spiliopoulou and Faulstich present the Web Utilization Miner (WUM), a mining system for discovering interesting navigation patterns in a website. WUM prepares the web log data for mining, and the language MINT mines the aggregated data according to the directives of the human expert [5].

2.2 Computer security
Pepyne et al. [6] describe a method using queuing theory and logistic regression modelling for profiling computer users based on simple temporal aspects of their behaviour.

3. Existing Effective Classification Techniques in Observed Classifier
The following are several effective incremental classifiers related to evolving fuzzy-rule-based systems, which learn automatically from observed behaviour and adapt to the distribution of relevant events. These classifiers are implemented using different frameworks.

3.1 Prototype-based supervised algorithm
Learning Vector Quantization (LVQ) is a nearest-prototype learning algorithm [7]. LVQ can be considered a supervised clustering algorithm in which each weight vector is interpreted as a cluster centre. With this algorithm, the number of reference vectors has to be set by the user. Poirier and Ferrieux proposed Dynamic Vector Quantization (DVQ), a method that generates new prototypes dynamically. However, this approach has difficulty generating prototypes for applications with noisy data.

3.2 Bayesian classifier
This is an effective methodology for solving classification problems when all features are considered simultaneously. However, when features are added one by one with the forward selection method, a huge amount of computation is involved. Agrawal and Bala [8] proposed an incremental version of the Bayesian classifier.

3.3 Support Vector Machine (SVM)
An SVM performs classification by constructing an N-dimensional hyperplane that optimally separates the data into two categories. Training an SVM "incrementally" by discarding all previous data except the support vectors gives only approximate results. An exact online method to construct the solution has also been considered. Xiao et al. [9] propose an incremental algorithm that utilizes the properties of the support vector (SV) set and accumulates the distribution knowledge of the sample space through an adjustable parameter.

However, because this research focuses on a command-line interface, the approach must be able to process streaming data in real time. Capturing sudden and abrupt changes in streaming data requires not only tuning parameters but also changing the structure of the classifier. Taking these aspects into account, we propose an approach based on an evolving fuzzy-rule-based system. This approach has important advantages that make it very useful in real environments:

• It can cope with huge amounts of data.
• Its evolving structure can capture sudden and abrupt changes in the stream of data.
• Its structure meaning is very clear, as we propose a rule-based classifier.
• It is computed in a single pass, which makes it efficient and fast.
• Its classifier structure is simple and interpretable.

4. Proposed Methodology
The aim is to improve the performance of the user behaviour profile system by using EVABCD as an agent for predicting masquerades. The prediction is carried out with a standard platform such as Java in an environment such as Linux. The primary objectives of the proposed system
can be summarized as follows: creating user behaviour profiles by classifying relevant sequences of events; acquiring knowledge about computer users whose behaviour is of increased complexity; and segmenting sequences into subsequences of relevant events that are evaluated using a frequency-based method.

4.1 Evolving ABCD
It is more efficient to revise an existing hypothesis than to generate a new hypothesis each time a new instance is observed. The aim is to detect masquerades (unauthorized work) using knowledge about the computer user. The system performs filtering of the input information, feature extraction, and forming of the input vectors. It consists of modules [10] that receive inputs from the representation part and also feedback from the environment, and of modules that take input values from the decision part and pass output information to the environment.

4.2 Automatic Clustering
The Bayesian classifier is an effective and fundamental methodology for solving classification problems. It is computationally efficient when all features are considered simultaneously. A block diagram of the EVABCD framework is given in Fig.4.2. Noisy attributes may also decrease the accuracy of the classifier, so feature selection is used as a pre-processing step before classification. However, when the features are added one by one to a Bayesian classifier in batch mode with the forward selection method, a huge amount of computation is involved. In this paper, an incremental Bayesian classifier for multivariate normal distribution datasets is proposed.

Fig.4.2 Architecture of Classifier Design

4.3 Classifier Designing
The proposed incremental Bayesian classifier is computationally more efficient than the batch Bayesian classifier in terms of time. Its effectiveness has been demonstrated through experiments on different datasets. The experiments show that the incremental Bayesian classifier has classification accuracy equivalent to that of the batch Bayesian classifier, while being much faster.

Fig.4.3 Architecture of Classifier Design

EVABCD receives observations in real time from the environment to analyze. In our case, these observations are UNIX commands, and they are converted into the corresponding distribution of subsequences online. This would mean that all the different subsequences of the environment should be known a priori. However, this value is unknown, and creating this data space from the beginning is not efficient. For this reason, in EVABCD the dimension of the data space also evolves: it grows incrementally according to the different subsequences that are represented in it.

4.4 Classification of User Behavior
In order to classify a UNIX user behaviour, these distributions must be represented in a data space. For this reason, each distribution is considered as a data vector that defines a point in the data space.

4.4.1 Segmentation of the sequence of commands
First, the sequence is segmented into subsequences of equal length from the first to the last element. Thus, the sequence A1, A2, ..., An, where n is the number of commands in the sequence, is segmented into the subsequences that start at each command Ai and contain length consecutive commands, where length is the size of the subsequences created. In the remainder of the paper, we use the term subsequence length to denote this value. It determines how many commands are considered as dependent. In the sample sequence {ls-date-ls-date-cat}, with a subsequence length of 3, we obtain {ls-date-ls}, {date-ls-date} and {ls-date-cat}.
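The segmentation step of Section 4.4.1 can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the implementation used in the paper: the function name `segment` and the representation of a command sequence as a list of strings are hypothetical.

```python
def segment(commands, length):
    """Return every run of `length` consecutive commands, in order.

    Assumption: windows overlap and advance one command at a time,
    which is what the {ls-date-ls-date-cat} example in the text implies.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    return [tuple(commands[i:i + length])
            for i in range(len(commands) - length + 1)]


sample = ["ls", "date", "ls", "date", "cat"]
subsequences = segment(sample, 3)
print(subsequences)
```

For the sample sequence this yields the three subsequences listed in the text. A sequence shorter than the subsequence length produces an empty list.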
4.4.2 Storage of the subsequences in a trie
The subsequences of commands are stored in a trie data structure. When a new model needs to be constructed, we create an empty trie and insert each subsequence of events into it, such that all possible subsequences are accessible and explicitly represented. Every trie node represents a command of a stored subsequence.

The construction of a user profile from a single sequence of commands is done by a three-step process: segmentation of the sequence, storage of the subsequences in a trie, and creation of the user profile. When a new subsequence is inserted into a trie, the existing nodes are modified and new nodes are created as needed. As the dependencies between the commands are relevant in the user profile, the suffixes of the subsequence (the subsequences that extend to the end of the given subsequence) are also inserted. Considering the previous example, the first subsequence {ls-date-ls} is added as the first branch of the empty trie (Fig.4.4.2.A). Each node is labelled with the number 1, enclosed in square brackets, which indicates that the command has been inserted in the node once. Then, the suffixes of the subsequence ({date-ls} and {ls}) are also inserted (Fig.4.4.2.B). Finally, after inserting the three subsequences and their corresponding suffixes, the completed trie is obtained (Fig.4.4.2.C).

Fig.4.4.2 Steps of Creating an example "Trie"

4.4.3 Creation of the user profile
Once the trie is created, the subsequences that characterize the user profile and their relevance are calculated by traversing the trie. For this purpose, frequency-based methods are used. In particular, in EVABCD, the relevance of a subsequence is evaluated by its relative frequency, or support. In this case, the support of a subsequence is defined as the ratio of the number of times the subsequence has been inserted into the trie to the total number of subsequences of equal size inserted. In this step, the trie is transformed into a set of subsequences labelled by their support values. In EVABCD, this set of subsequences is represented as a distribution of relevant subsequences. Thus, we assume that user profiles are n-dimensional matrices, where each dimension of the matrix represents a particular subsequence of commands. In the previous example, the trie consists of nine nodes; therefore, the corresponding profile consists of nine different subsequences, each labelled with its support. It shows the distribution of these subsequences. Once a user behaviour profile has been created, it is classified and used to update the Evolving-Profile-Library (EPLib), as explained in the next section.

5. Mathematical steps in implementation of each block
A prototype is a data sample (a behaviour represented by a distribution of subsequences of commands) that represents several samples which represent a certain class. The classifier is initialized with the first data sample, which is stored in EPLib. Then, each data sample is classified into one of the prototypes (classes) defined in the classifier. Finally, based on the potential of the new data sample to become a prototype, it could form a new prototype or replace an existing one.

5.1 Calculate the Potential of Data Sample
The potential (P) of the kth data sample is calculated by (1), which represents a function of the accumulated distance between a sample and all the other k-1 samples in the data space [11]. The result of this function represents the density of the data that surrounds a certain data sample, where dist represents the distance between two samples in the data space.

$$P(x_k) = \frac{1}{1 + \dfrac{\sum_{i=1}^{k-1} \mathrm{dist}^2(x_k, x_i)}{k-1}} \qquad (1)$$

In some previous work the potential is calculated using the Euclidean distance, and in other work it is calculated using the cosine distance. Cosine distance has the advantage that it tolerates samples with different numbers of attributes. An attribute is the support value of a subsequence of commands. Also, cosine distance tolerates that the value of several subsequences in a sample can be null (null is different from zero). Therefore, EVABCD uses the cosine distance (cosDist) to measure the similarity between two samples, as described below:

$$\mathrm{cosDist}(x_k, x_p) = 1 - \frac{\sum_{j=1}^{n} x_{kj}\, x_{pj}}{\sqrt{\sum_{j=1}^{n} x_{kj}^{2} \, \sum_{j=1}^{n} x_{pj}^{2}}} \qquad (2)$$

where x_k and x_p represent the two samples whose distance is measured and n represents the number of different attributes in both samples. Once the corresponding distribution has been created from the stream, it is processed by the classifier.
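The trie storage of Section 4.4.2, the support calculation of Section 4.4.3 and the cosine distance (2) can be illustrated together with the following Python sketch. It rests on assumptions of ours rather than on the paper's code: the trie is flattened into a dictionary keyed by the path from the root, a profile is a dictionary mapping each path to its support, and an attribute missing from one profile is treated as 0 in the dot product. The names `build_trie`, `profile` and `cos_dist` are hypothetical.

```python
import math
from collections import defaultdict


def build_trie(subsequences):
    """Insert each subsequence and its suffixes; count visits per node.

    A node is identified by its path from the root (a tuple of commands).
    """
    counts = defaultdict(int)
    for seq in subsequences:
        for start in range(len(seq)):            # the subsequence, then its suffixes
            suffix = seq[start:]
            for depth in range(1, len(suffix) + 1):
                counts[tuple(suffix[:depth])] += 1
    return dict(counts)


def profile(counts):
    """Support = node count / total count of all nodes at the same depth."""
    totals = defaultdict(int)
    for path, count in counts.items():
        totals[len(path)] += count
    return {path: count / totals[len(path)] for path, count in counts.items()}


def cos_dist(p, q):
    """Cosine distance (2); attributes absent from a profile count as 0."""
    dot = sum(value * q.get(key, 0.0) for key, value in p.items())
    norm = math.sqrt(sum(v * v for v in p.values()) * sum(v * v for v in q.values()))
    return 1.0 - dot / norm if norm else 1.0


subs = [("ls", "date", "ls"), ("date", "ls", "date"), ("ls", "date", "cat")]
counts = build_trie(subs)
prof = profile(counts)
print(len(counts), prof[("ls",)])
```

For the running example the sketch produces nine nodes, which agrees with the nine-node trie described in the text. Under this reading of the support definition, the single command ls receives a support of 4/9, because it is reached four times out of nine insertions at depth one.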
The structure of this classifier includes the following steps:

1. Classify the new sample in a class represented by a prototype.
2. Calculate the potential of the new data sample to be a prototype.
3. Update all the prototypes considering the new data sample. This is done because the density of the data space surrounding a certain data sample changes with the insertion of each new data sample. Insert the new data sample as a new prototype if needed.
4. Remove any prototype if needed.

5.2 Calculating the Potential Recursively
As described in Sections 5 and 5.1, the potential (P) of the kth data sample [12] is calculated by (1), which represents a function of the accumulated distance between a sample and all the other samples in the data space. The distance used in EVABCD is the cosine distance, for the reasons given in Section 5.1.

5.3 Simplifying the Potential Expression
The expression in (1) requires all the accumulated data samples to be available, which contradicts the requirement for real-time and online application in the proposed approach. For this reason, we need to develop a recursive expression of the potential in which it is not necessary to store the history of all the data. In (3), the potential is calculated in the input-output joint data space [13], where the kth data sample (x_k) with its corresponding label is represented as z_k. Each sample is represented by a set of values.

$$P_k(z_k) = \frac{1}{1 + \dfrac{\sum_{i=1}^{k-1} \mathrm{cosDist}(x_k, x_i)}{k-1}} \qquad (3)$$

Using (4), we develop a recursive expression similar to the recursive expressions developed using Euclidean distance and using cosine distance. This formula is as follows:

$$B_k = \sum_{j=1}^{n} z_k^{j}\, b_k^{j}; \qquad b_k^{j} = b_{(k-1)}^{j} + \sqrt{\frac{(z_k^{j})^{2}}{\sum_{l=1}^{n} (z_k^{l})^{2}}}; \qquad b_1^{j} = \sqrt{\frac{(z_1^{j})^{2}}{\sum_{l=1}^{n} (z_1^{l})^{2}}}, \quad j = [1, n+1] \qquad (4)$$

Using this expression, it is only necessary to calculate (n+1) values, where n is the number of different subsequences obtained; this value is represented by b, where b_k^j, j = [1, n], represents the accumulated value for the kth data sample. However, since the number of needed accumulated values is very large, we simplify this expression [14] in order to calculate it faster and with less memory.

5.4 Creating New Prototypes
In our particular application, the data are represented by a set of support values and are thus positive. To simplify the recursive calculation of expression (1), we can simply use the distance instead of the square of the distance. For this reason, we use in this case (4) instead of (1). The condition for creating a new prototype is:

$$\exists i,\ i = [1, \mathit{NumPrototypes}] : P(z_k) > P(\mathit{Prot}_i) \qquad (5)$$

Thus, if the new data sample is not relevant, the overall structure of the classifier is not changed. Otherwise, if the new data sample has high descriptive power and generalization potential, the classifier evolves by adding a new prototype which represents a part of the observed data samples.

Fig.5.4 Approach Example: Distributions of subsequences of events in an evolving system

5.5 Removing Existing Prototypes
After adding a new prototype, we check whether any of the already existing prototypes are described well by the newly added prototype.
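To make the idea behind the recursive potential concrete, the following Python sketch compares a direct evaluation over the whole history with a recursive evaluation that only keeps one accumulator per attribute. It is an illustration under stated assumptions, not the authors' implementation: samples are fixed-length lists of positive support values (the paper lets the dimension grow), the plain cosine distance is used rather than its square, and the accumulator here sums over the previously seen samples only. The names `potential_direct` and `RecursivePotential` are hypothetical.

```python
import math


def cos_dist(a, b):
    """Cosine distance between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def potential_direct(samples, k):
    """Potential of samples[k] (0-based) from its mean distance to all earlier samples."""
    if k == 0:
        return 1.0
    mean_dist = sum(cos_dist(samples[k], samples[i]) for i in range(k)) / k
    return 1.0 / (1.0 + mean_dist)


class RecursivePotential:
    """Same value as potential_direct, without storing the sample history."""

    def __init__(self, n):
        self.b = [0.0] * n   # b[j] accumulates z_i[j] / ||z_i|| over samples seen so far
        self.k = 0           # number of samples seen so far

    def update(self, z):
        norm = math.sqrt(sum(x * x for x in z))
        if self.k == 0:
            p = 1.0
        else:
            # mean cosine similarity to all earlier samples, from the accumulators
            mean_sim = sum(x * bj for x, bj in zip(z, self.b)) / (norm * self.k)
            p = 1.0 / (2.0 - mean_sim)   # 1 + mean distance = 2 - mean similarity
        for j, x in enumerate(z):
            self.b[j] += x / norm
        self.k += 1
        return p


samples = [[0.4, 0.3, 0.0], [0.1, 0.5, 0.2], [0.3, 0.3, 0.3], [0.0, 0.2, 0.6]]
rp = RecursivePotential(3)
recursive = [rp.update(z) for z in samples]
direct = [potential_direct(samples, k) for k in range(len(samples))]
print(recursive)
```

Both computations agree up to floating-point error, while the recursive version needs memory proportional to the number of attributes instead of the number of samples. A new prototype would then be created whenever the potential of the new sample exceeds that of an existing prototype, as in condition (5).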
$$\exists i,\ i = [1, \mathit{NumPrototypes}] : \mu_i(z_k) > e^{-1} \qquad (6)$$

For this reason, we calculate the membership function (μ_i) between a data sample and a prototype. It is defined in terms of the cosine distance between a data sample z_k and the ith prototype (Prot_i) and of σ_i, the spread of the membership function, which also symbolizes the radius of the zone of influence of the prototype. This spread is determined based on the scatter of the data. The equation to get the spread of the kth data sample is defined as:

$$\sigma_i(k) = \sqrt{\frac{1}{k} \sum_{j=1}^{k} \mathrm{cosDist}(\mathit{Prot}_i, z_j)}; \qquad \sigma_i(0) = 1 \qquad (7)$$

5.6 Classification Method
In order to classify a new data sample, we compare it with all the prototypes stored in EPLib. This comparison is done using cosine distance, and the smallest distance determines the closest similarity. This comparison is shown as:

$$\mathit{Class}(x_z) = \mathit{Class}(\mathit{Prot}^{*})$$

where Prot* is the prototype at the smallest cosine distance from the sample. The time consumed for classifying a new sample depends on the number of prototypes and its number of attributes.

Fig.5.6 Approach Example: Distributions of subsequences of events in an evolving system

However, we can consider in general terms that both the time consumed and the computational complexity are reduced and acceptable for real-time applications [15], on the order of milliseconds per data sample. The self-evolving system uses a recursive formula to find the potential of a new data sample to form a new prototype or to replace an existing prototype. The data samples of a masquerading user are compared with the trained prototypes, and the behaviour is detected based on standard deviation.

TABLE 5.6 Total Number of Different Subsequences Obtained

No. of Commands per User    Sub-Sequence Length    No. of Different Sub-Sequences
100                         3                      799
100                         4                      799
100                         5                      799
100                         6                      799
500                         3                      1416
500                         4                      1416
500                         5                      1417
500                         6                      1416

Incremental Learning Algorithm (ILA): it is more efficient to revise an existing hypothesis than to generate a new one. An incremental classifier adapts to dynamic environments and can be used with complex knowledge streams.

5.7 Masquerades Prototypes Attributes Values Vs Attributes
In our proposed approach, classification in EVABCD is done using cosine distance [16], and the smallest distance determines the closest similarity. The learning agent combines features of knowledge-based systems, logic systems, case-based reasoning [17] and connectionist-based systems.
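The nearest-prototype rule of Section 5.6 can be sketched as follows. This is a minimal Python illustration under our own assumptions: EPLib is modelled as a list of (label, profile) pairs, a profile is a dictionary mapping a subsequence to its support, and the labels and support values in the example are invented purely for demonstration. The names `classify` and `eplib` are hypothetical.

```python
import math


def cos_dist(p, q):
    """Cosine distance between two profiles stored as {subsequence: support}."""
    dot = sum(value * q.get(key, 0.0) for key, value in p.items())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(sum(v * v for v in q.values()))
    return 1.0 - dot / norm if norm else 1.0


def classify(sample, prototypes):
    """Return the label of the prototype at the smallest cosine distance."""
    label, _ = min(prototypes, key=lambda item: cos_dist(sample, item[1]))
    return label


# Invented example library: two prototypes with made-up support values.
eplib = [
    ("user_a", {("ls",): 0.5, ("date",): 0.4, ("cat",): 0.1}),
    ("user_b", {("gcc",): 0.6, ("make",): 0.3, ("ls",): 0.1}),
]
new_sample = {("ls",): 0.6, ("date",): 0.3, ("vi",): 0.1}
print(classify(new_sample, eplib))
```

Because every prototype is visited once and each distance touches every attribute of the sample, the cost grows with the number of prototypes and the number of attributes, which matches the remark in the text about classification time.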
Fig.5.7 Evolution of the Classification Rate during Online Learning with a Subset of UNIX User Data Set

A suitable classifier algorithm [18] interfaces well with the command-line environment. It is suitable for detecting masquerades provided that it does not ignore the fact that user behaviour changes and evolves.

5.8 Classification Rate of Different Classifiers in the UNIX Users Environments
The Evolving Profile Library (EPLib) classifier contains the created and evolving classifier with the different expected behaviours. The evolving technique uses the moment at which the information is obtained.

To get an idea of how this number increases, Table.7.1 shows the number of different subsequences obtained using different numbers of commands for training (100, 200 and 300 commands per user) and subsequence lengths of 3, 4, 5 and 6. Using EVABCD, the number of prototypes per class is not fixed; it varies automatically depending on the heterogeneity of the data. To get an idea about it, fig.7.3 tabulates the number of prototypes created per group in each of the 10 runs [20] using 1000 commands per user as the training data set and a subsequence length of 3. A widely acknowledged challenge in ABCD is how to accurately profile a user while his or her behaviour changes constantly. Thus, a user profile should be frequently revised to keep it up to date. To solve this problem, we use a technique proposed by Angelov and Zhou for analyzing the quality of the rule base in an online fuzzy system.

Fig.5.8 Data Sample Entity for Different User Prototypes

Applying this technique in ABCD, the subsequences typed by a user are indexed [19] with a number that indicates the moment they were read. This value can be considered as an integer from 1 (the first subsequence read) to the number of subsequences read. Using this value, the age of a subsequence can be calculated. This age value indicates how old a subsequence stored in a user profile is; the formula for calculating this value is shown in fig.5.8.

7. Conclusion
EVABCD is a realization of one of the main modules in a framework of evolving connectionist systems. It is a model to classify user behaviours from a sequence of events. The underlying assumption in this approach is that the data collected from the corresponding environment can be transformed into a sequence of events. This sequence is segmented and stored in a trie, and the relevant subsequences are evaluated by using a frequency-based method. Then, a distribution of relevant subsequences is created. However, as a user behaviour is not fixed but rather changes and evolves, the proposed classifier is able to keep the created profiles up to date using an Evolving Systems approach. EVABCD is one-pass, noniterative and recursive, and it has the potential to be used in an interactive mode; therefore, it is computationally very efficient and fast. In addition, its structure is simple and interpretable.

EVABCD combines all of the features of knowledge-based systems, logic systems, case-based reasoning systems, and adaptive connectionist-based systems. Through self-organization and self-adaptation during the learning process, such systems allow for solving difficult
engineering tasks as well as for simulation of emerging and evolving biological and cognitive processes. The self-evolving system uses a recursive formula to find the potential of a new data sample to form a new prototype or to replace an existing prototype. The data samples of a masquerading user are compared with the trained prototypes, and the behaviour is detected based on standard deviation.

8. References
[1] D. Godoy and A. Amandi, "User Profiling in Personal Information Agents: A Survey," Knowledge Eng. Rev., vol. 20, no. 4, pp. 329-361, 2005.
[2] J.A. Iglesias, A. Ledezma, and A. Sanchis, "Creating User Profiles from a Command Line Interface: A Statistical Approach," Proc. Int'l Conf. User Modeling, Adaptation, and Personalization (UMAP), pp. 90-101, 2009.
[3] M. Schonlau, W. Dumouchel, W.H. Ju, A.F. Karr, and Theus, "Computer Intrusion: Detecting Masquerades," Statistical Science, vol. 16, pp. 58-74, 2001.
[4] E. Fredkin, "Trie Memory," Comm. ACM, vol. 3, no. 9, pp. 490-499, 1960.
[5] A. Wexelblat, "An Environment for Aiding Information-Browsing Tasks," Proc. AAAI Spring Symp. Acquisition, Learning and Demonstration: Automating Tasks for Users, AAAI Press, Menlo Park, 1996.
[6] D.L. Pepyne, J. Hu, and W. Gong, "User Profiling for Computer Security," Proc. American Control Conf., pp. 982-987, 2004.
[7] T. Kohonen, J. Kangas, J. Laaksonen, and K. Torkkola, "Lvq pak: A Program Package for the Correct Application of Learning Vector Quantization Algorithms," Proc. IEEE Int'l Conf. Neural Networks, pp. 725-730, 1992.
[8] R.K. Agrawal and R. Bala, "Incremental Bayesian Classification for Multivariate Normal Distribution Data," Pattern Recognition Letters, vol. 29, no. 13, pp. 1873-1876, http://dx.doi.org/10.1016/j.patrec.2008.06.010, 2008.
[9] R. Xiao, J. Wang, and F. Zhang, "An Approach to Incremental SVM Learning Algorithm," Proc. IEEE Int'l Conf. Tools with Artificial Intelligence, pp. 268-278, 2000.
[10] T. Seipone and J.A. Bullinaria, "Evolving Improved Incremental Learning Schemes for Neural Network Systems," Proc. IEEE Congress on Evolutionary Computation, pp. 2002-2009, 2005.
[11] T. Lane and C.E. Brodley, "Temporal Sequence Learning and Data Reduction for Anomaly Detection," Proc. ACM Conf. Computer and Comm. Security (CCS), pp. 150-158, 1998.
[12] A. Cufoglu, M. Lohi, and K. Madani, "A Comparative Study of Selected Classifiers with Classification Accuracy in User Profiling," Proc. WRI World Congress on Computer Science and Information Eng. (CSIE), pp. 708-712, 2009.
[13] P. Riley and M.M. Veloso, "On Behavior Classification in Adversarial Environments," Proc. Int'l Symp. Distributed Autonomous Robotic Systems (DARS), pp. 371-380, 2000.
[14] R. Polikar, L. Udpa, S.S. Udpa, and V. Honavar, "Learn++: An Incremental Learning Algorithm for Supervised Neural Networks," IEEE Trans. Systems, Man and Cybernetics, Part C (Applications and Rev.), vol. 31, no. 4, pp. 497-508, http://dx.doi.org/10.1109/5326.983933, Nov. 2001.
[15] G. Widmer and M. Kubat, "Learning in the Presence of Concept Drift and Hidden Contexts," Machine Learning, vol. 23, pp. 69-101, 1996.
[16] D. Kalles and T. Morris, "Efficient Incremental Induction of Decision Trees," Machine Learning, vol. 24, no. 3, pp. 231-242, 1996.
[17] F.J. Ferrer Troyano, J.S. Aguilar Ruiz, and J.C.R. Santos, "Data Streams Classification by Incremental Rule Learning with Parameterized Generalization," Proc. ACM Symp. Applied Computing (SAC), pp. 657-661, 2006.
[18] N. Kasabov, "Evolving Fuzzy Neural Networks for Supervised/Unsupervised Online Knowledge Based Learning," IEEE Trans. Systems, Man and Cybernetics Part B: Cybernetics, vol. 31, no. 6, pp. 902-918, Dec. 2001.
[19] F. Poirier and A. Ferrieux, "Dvq: Dynamic Vector Quantization An Incremental Lvq," Proc. Int'l Conf. Artificial Neural Networks, pp. 1333-1336, 1991.
[20] P. Angelov and X. Zhou, "Evolving Fuzzy Rule Based Classifiers from Data Streams," IEEE Trans. Fuzzy Systems: Special Issue on Evolving Fuzzy Systems, vol. 16, no. 6, pp. 1462-1475, Dec. 2008.
